Alexander Stoyakin 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							630bb3c9b6 
							
						 
					 
					
						
						
							
							Merge pull request  #2  from KonduitAI/asto_ops_wrapper  
						
						
						
						
						[WIP] New ops wrapper 
						
						
					 
					
						2019-10-16 20:21:50 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3662657d5c 
							
						 
					 
					
						
						
							
							Merge pull request  #1  from KonduitAI/shugeo_gamma  
						
						
						
						
						Shugeo gamma 
						
						
					 
					
						2019-10-16 18:49:33 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							24a2b2933f 
							
						 
					 
					
						
						
							
							Added gamma and lgamma functions.  
						
						
						
						
					 
					
						2019-10-16 18:22:18 +03:00 
						 
				 
			
				
					
						
							
							
								Alexander Stoyakin 
							
						 
					 
					
						
						
						
						
							
						
						
							96a9a1a733 
							
						 
					 
					
						
						
							
							Fixed output from operation.  
						
						
						
						
					 
					
						2019-10-16 18:07:52 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							7617682a46 
							
						 
					 
					
						
						
							
							Added declarations for igamma and igammac ops.  
						
						
						
						
					 
					
						2019-10-16 14:45:10 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							478a0c1f97 
							
						 
					 
					
						
						
							
							Added igamma and igammac broadcastable ops implementations and tests.  
						
						
						
						
					 
					
						2019-10-16 14:02:53 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							7103aca8c5 
							
						 
					 
					
						
						
							
							Added broadcastable IGamma and IGammac ops.  
						
						
						
						
					 
					
						2019-10-16 13:58:32 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							f90e6da97e 
							
						 
					 
					
						
						
							
							Added nd4j_gamma, nd4j_igamma and nd4j_igammac functions.  
						
						
						
						
					 
					
						2019-10-16 13:53:31 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							df2448613e 
							
						 
					 
					
						
						
							
							Added gamma distribution functions.  
						
						
						
						
					 
					
						2019-10-15 20:00:07 +03:00 
						 
				 
			
				
					
						
							
							
								AlexDBlack 
							
						 
					 
					
						
						
						
						
							
						
						
							2d750b69e5 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'konduit/master'  
						
						
						
						
					 
					
						2019-10-14 17:21:23 +11:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							ace65355c5 
							
						 
					 
					
						
						
							
							Added doc for fake_quant_with_min_max* op helpers cuda implementations.  
						
						
						
						
					 
					
						2019-10-10 18:35:28 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							c890de5a7b 
							
						 
					 
					
						
						
							
							Added doc for fake_quant_with_min_max* op helpers implementations.  
						
						
						
						
					 
					
						2019-10-10 18:31:17 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							c3f755d975 
							
						 
					 
					
						
						
							
							Refactored helpers both for cuda and cpu platforms.  
						
						
						
						
					 
					
						2019-10-10 18:02:49 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							a09cb5e2be 
							
						 
					 
					
						
						
							
							Added doc for fake_quant_with_min_max_per_channel op declaration.  
						
						
						
						
					 
					
						2019-10-10 17:13:33 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							92636b0b86 
							
						 
					 
					
						
						
							
							Eliminated waste operator.  
						
						
						
						
					 
					
						2019-10-10 17:08:59 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							d5b352273d 
							
						 
					 
					
						
						
							
							Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op. Final revision.  
						
						
						
						
					 
					
						2019-10-10 16:51:29 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							02d8616692 
							
						 
					 
					
						
						
							
							Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op.  
						
						
						
						
					 
					
						2019-10-10 16:40:56 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							3504b0cda9 
							
						 
					 
					
						
						
							
							Implemented fake_quant_with_min_max_vars_per_channel op cuda helper. The first working revision.  
						
						
						
						
					 
					
						2019-10-10 15:44:50 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							753565145c 
							
						 
					 
					
						
						
							
							Refactored fake_quant_with_min_max_vars op cuda implementation.  
						
						
						
						
					 
					
						2019-10-10 14:00:49 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							c13e945a96 
							
						 
					 
					
						
						
							
							Fixed fake_quant_with_min_max_vars op and tests.  
						
						
						
						
					 
					
						2019-10-10 13:23:11 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							3c0c59ab88 
							
						 
					 
					
						
						
							
							Refactored fake_quant_with_min_max_vars op.  
						
						
						
						
					 
					
						2019-10-09 22:09:33 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							352f1eee80 
							
						 
					 
					
						
						
							
							Implemented fake_quant_with_min_max_per_channel helper for cpu platform. The first approach.  
						
						
						
						
					 
					
						2019-10-09 21:39:59 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							d0cbd33b0e 
							
						 
					 
					
						
						
							
							Added input checks for op.  
						
						
						
						
					 
					
						2019-10-09 15:52:13 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							3a89e51811 
							
						 
					 
					
						
						
							
							Added tests for fake_quant_with_min_max_vars_per_channel op.  
						
						
						
						
					 
					
						2019-10-09 13:38:18 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							cb56b0b06a 
							
						 
					 
					
						
						
							
							The first approach for fake_quant_with_min_max_vars_per_channel op implementation.  
						
						
						
						
					 
					
						2019-10-08 19:00:41 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							8fe5a1fa96 
							
						 
					 
					
						
						
							
							The working implementation of draw_bounding_boxes op.  
						
						
						
						
					 
					
						2019-10-08 15:42:27 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							30a8af566c 
							
						 
					 
					
						
						
							
							The first working implementation of cuda kernel for draw_bounding_boxes op helper.  
						
						
						
						
					 
					
						2019-10-08 13:45:18 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							ae09cfee32 
							
						 
					 
					
						
						
							
							Next approach of cuda implementation for draw_bounding_boxes op helper.  
						
						
						
						
					 
					
						2019-10-08 00:09:46 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							6cf3a8fa9c 
							
						 
					 
					
						
						
							
							Refactored cpu implementation and added cuda approach.  
						
						
						
						
					 
					
						2019-10-07 17:51:07 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							78443ffebf 
							
						 
					 
					
						
						
							
							Working implementation of draw_bounding_boxes op for cpu.  
						
						
						
						
					 
					
						2019-10-07 15:04:44 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							16a66a65e3 
							
						 
					 
					
						
						
							
							Added helper declaration for draw_bounding_boxes op.  
						
						
						
						
					 
					
						2019-10-04 21:16:34 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							53a2ebddbe 
							
						 
					 
					
						
						
							
							Added test and helpers for draw_bounding_boxes op both cpu and cuda related.  
						
						
						
						
					 
					
						2019-10-04 20:46:26 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							8f70b4441f 
							
						 
					 
					
						
						
							
							draw_bounding_boxes op implementation. Initial revision.  
						
						
						
						
					 
					
						2019-10-04 18:32:21 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							908e4c4912 
							
						 
					 
					
						
						
							
							Added implementation for divide_no_nan op and tests.  
						
						
						
						
					 
					
						2019-10-04 10:29:15 +03:00 
						 
				 
			
				
					
						
							
							
								raver119 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cff26f13c5 
							
						 
					 
					
						
						
							
							Revert "Implement divide_no_nan op."  
						
						
						
						
					 
					
						2019-10-03 20:25:52 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							6eaca179d6 
							
						 
					 
					
						
						
							
							Implement divide_no_nan op.  
						
						
						
						
					 
					
						2019-10-03 18:22:17 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							130ee25682 
							
						 
					 
					
						
						
							
							Implemented compare_and_bitpack op.  
						
						
						
						
					 
					
						2019-10-03 10:57:48 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							75ad3c8153 
							
						 
					 
					
						
						
							
							Fixed test names.  
						
						
						
						
					 
					
						2019-10-02 19:05:26 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							f3e42173ef 
							
						 
					 
					
						
						
							
							Refactored buffer copying to avoid wrong usage of buffers.  
						
						
						
						
					 
					
						2019-10-02 16:51:09 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							1c6173d218 
							
						 
					 
					
						
						
							
							Added implementation of bitcast op.  
						
						
						
						
					 
					
						2019-10-02 15:04:59 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							a27e61553a 
							
						 
					 
					
						
						
							
							Added tests and fixed op name.  
						
						
						
						
					 
					
						2019-10-02 15:04:28 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							863ff76878 
							
						 
					 
					
						
						
							
							Added declaration for bincast op.  
						
						
						
						
					 
					
						2019-10-02 12:17:00 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							afeb524238 
							
						 
					 
					
						
						
							
							Refactored implementation for adjust_contrast ops.  
						
						
						
						
					 
					
						2019-10-01 14:13:09 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							1575c704ae 
							
						 
					 
					
						
						
							
							Added implementation for adjust_contrast_v2 op and tests.  
						
						
						
						
					 
					
						2019-10-01 11:44:27 +03:00 
						 
				 
			
				
					
						
							
							
								raver119 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							44a8d19ac6 
							
						 
					 
					
						
						
							
							[WIP] Broadcast changes ( #8257 )  
						
						
						
						
						* - provide correct call NDArray::applyBroadcast inside of NDArray::applyTrueBroadcast
Signed-off-by: Yurii <yurii@skymind.io>
* - provide new trueBroadcast helper
Signed-off-by: Yurii <yurii@skymind.io>
* example for yurii
Signed-off-by: raver119 <raver119@gmail.com>
* - provide new trueBroadcast helper for cpu
Signed-off-by: Yurii <yurii@skymind.io>
* - start working on new trueBroadcast helper for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* - further work on trueBroadcast for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* - fix bugs in cuda helper trueBroadcast
Signed-off-by: Yurii <yurii@skymind.io> 
						
						
					 
					
						2019-10-01 09:10:19 +03:00 
						 
				 
			
				
					
						
							
							
								shugeo 
							
						 
					 
					
						
						
						
						
							
						
						
							e06dfb5dcc 
							
						 
					 
					
						
						
							
							Implementation of adjust_contrast op.  
						
						
						
						
					 
					
						2019-09-30 18:24:12 +03:00 
						 
				 
			
				
					
						
							
							
								raver119 
							
						 
					 
					
						
						
						
						
							
						
						
							78bca543a8 
							
						 
					 
					
						
						
							
							missed include for MklDnnTests run without mkldnn  
						
						
						
						
						Signed-off-by: raver119 <raver119@gmail.com> 
						
						
					 
					
						2019-09-12 10:49:01 +03:00 
						 
				 
			
				
					
						
							
							
								AlexDBlack 
							
						 
					 
					
						
						
						
						
							
						
						
							a66e03355e 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'fork/master'  
						
						
						
						
					 
					
						2019-09-12 12:20:57 +10:00 
						 
				 
			
				
					
						
							
							
								raver119 
							
						 
					 
					
						
						
						
						
							
						
						
							07901ceb69 
							
						 
					 
					
						
						
							
							few more mkldnn dependencies removed  
						
						
						
						
						Signed-off-by: raver119 <raver119@gmail.com> 
						
						
					 
					
						2019-09-12 04:55:59 +03:00 
						 
				 
			
				
					
						
							
							
								raver119 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							98e2814879 
							
						 
					 
					
						
						
							
							Platform helpers ( #8216 )  
						
						
						
						
						* platform helpers draft
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* disable platform cmake
Signed-off-by: raver119 <raver119@gmail.com>
* another draft
Signed-off-by: raver119 <raver119@gmail.com>
* mkldnn convolution refactored
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* one more safety check
Signed-off-by: raver119 <raver119@gmail.com>
* prototype works
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* force static library mode for mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* - ismax fix
- experimental arg fix
- don't enforce openblas on Apple hardware
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of small fixes
Signed-off-by: raver119@gmail.com  <raver119@gmail.com>
* declare concurrent
Signed-off-by: raver119@gmail.com  <raver119@gmail.com>
* - MKLDNN version upgrade to 1.0.2
- avgpool2d/maxpool2d APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - avgpool2d_bp/maxpool2d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - conv2d/batchnorm APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - lrn/conv2d_bp/conv3d/conv3d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* all ops converted to MKLDNN 1.x
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* namespace for platform helpers
Signed-off-by: raver119 <raver119@gmail.com>
* make sure platform helpers aren't optimized out
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* more of cpu_features
Signed-off-by: raver119 <raver119@gmail.com>
* - mkldnn removed from java
- cpu_features checks in CpuNDArrayFactory
Signed-off-by: raver119 <raver119@gmail.com>
* F16C definition renamed
Signed-off-by: raver119 <raver119@gmail.com>
* some mkldnn rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* check supported instructions before doing anything
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* missed impl
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC option
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool2d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* maxpool bp leaks fixed
Signed-off-by: raver119 <raver119@gmail.com>
* printf removed
Signed-off-by: raver119 <raver119@gmail.com>
* batchnorm fix
Signed-off-by: raver119 <raver119@gmail.com>
* AVX warning/error polishing
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* remove previous MKL-DNN support layer
Signed-off-by: raver119 <raver119@gmail.com>
* avx2 tweak
Signed-off-by: raver119 <raver119@gmail.com>
* allow static for apple
Signed-off-by: raver119@gmail.com  <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* restore OPENBLAS_PATH use
Signed-off-by: raver119 <raver119@gmail.com>
* add runtime check for avx/avx2 support
Signed-off-by: raver119 <raver119@gmail.com>
* convolution_auto
Signed-off-by: raver119 <raver119@gmail.com>
* Add logic for helper argument
* minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* skip OpTracker props for non-x86 builds
Signed-off-by: raver119 <raver119@gmail.com>
* linux arm isn't x86 :)
Signed-off-by: raver119 <raver119@gmail.com>
* avx-512
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA presets fix
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC
Signed-off-by: raver119 <raver119@gmail.com>
* prefetchw for avx2
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC again
Signed-off-by: raver119 <raver119@gmail.com> 
						
						
					 
					
						2019-09-11 21:50:28 +03:00