cavis

Author	SHA1	Message	Date
Yurii Shyrma	d19eeaec52	Shyrma causal conv1d (#90 ) * - add causal mode of padding to convolutions Signed-off-by: Yurii <iuriish@yahoo.com> * - add additional tests for causal conv1d Signed-off-by: Yurii <iuriish@yahoo.com> * - add causal mode for cuda conv kernels Signed-off-by: Yurii <iuriish@yahoo.com> * Java side of Conv1D changes Signed-off-by: raver119 <raver119@gmail.com> * Add Conv1DDerivative op Signed-off-by: Alex Black <blacka101@gmail.com> * Causal Conv1D gradient checks Signed-off-by: Alex Black <blacka101@gmail.com> * Tweaks Signed-off-by: Alex Black <blacka101@gmail.com> * - add causal padding mode to conv2d_bp Signed-off-by: Yurii <iuriish@yahoo.com> * More thorough causal conv1d tests Signed-off-by: Alex Black <blacka101@gmail.com>	2019-11-29 14:14:30 +03:00
Samuel Audet	5e07998e59	Add support for CUDA 10.2 (#89 )	2019-11-29 16:31:03 +11:00
shugeo	009007120b	Shugeo_release_fixes3 (#81 ) * Implementation for non_max_suppression_v3 was added. Initial version * Added check for overcome threshold. * Added definition for V3 method. * java remapping for NonMaxSuppressionV3 Signed-off-by: raver119 <raver119@gmail.com> * Fixed proper processing of an empty output and test. * Refactored op to less threshold data to float. * Implemented cuda-based helper for non_max_suppression_v3 op. * Fixed fake_quant_with_min_max_vars op. * Fixed tests with float numbers. * - assert now stops execution - sortByKey/sortByValue now have input validation Signed-off-by: raver119 <raver119@gmail.com> * missing var Signed-off-by: raver119 <raver119@gmail.com> * Fixed proper processing for zero max_size inputs. * Refactored kernel callers. * Fixed return statement for logdet op helper. * Refactored unsorted segment SqrtN op. * get back 8 tail bytes on CUDA Signed-off-by: raver119 <raver119@gmail.com> * Refactored segment prod ops and helpers for cuda and tests. * Additional test. * CudaWorkspace tests updated for 8 tail bytes Signed-off-by: raver119 <raver119@gmail.com> * special atomic test Signed-off-by: raver119 <raver119@gmail.com> * atomicMul/atomicDiv fix for 16bit values Signed-off-by: raver119 <raver119@gmail.com> * Eliminated waste prints.	2019-11-28 21:08:51 +03:00
Yurii Shyrma	a8dd6713aa	Shyrma scatter (#84 ) * - improve performance of scatter (no lock) ops for 1D case Signed-off-by: Yurii <iuriish@yahoo.com> * - improve scatter lock op performance for 1D case Signed-off-by: Yurii <iuriish@yahoo.com> * - add kernel for verification of input indices-array elements in scatter and scatter_nd ops Signed-off-by: Yurii <iuriish@yahoo.com> * - provide fast indices checking on cpu side for scatter and gather ops Signed-off-by: Yurii <iuriish@yahoo.com> * - apply corrections requested by pr reviewer Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-26 20:29:09 +03:00
raver119	7f90930e7a	bring back cuda cc 30 Signed-off-by: raver119 <raver119@gmail.com>	2019-11-25 09:17:35 +03:00
shugeo	4187190609	Shugeo release fix2 (#70 ) * Corrected input checking and tests for bitcast op. * Fixed an issue with non_max_suppression form generation and processing with score threshold given. * Fixed bilinear resize kernel and tests. * push for Serhii Signed-off-by: raver119 <raver119@gmail.com> * Added test for nearest_neighbor resize with int input. * Added data type check for input/output match. * Eliminate error in macros. * Improved output message for type checking. * Fixed input/output types for op. * Eliminated waste logging. * Refactored resize_bilinear helper for multithreading for cpu platform. * Cosmetic changes only. * Fixed error for string substitution. * Skip test for cbow_batch with cuda. * fix for resizeNearestNeighbor output dtype Signed-off-by: raver119 <raver119@gmail.com> * Refactored non_max_suppression helper. * Refactored shape generation and input handling. * Added additional test.	2019-11-22 22:42:44 +03:00
Yurii Shyrma	7a90a31cfb	Shyrma deconv3 (#69 ) * - profiling cuda kernels for vol2col and im2col Signed-off-by: Yurii <iuriish@yahoo.com> * - correct addBias helper Signed-off-by: Yurii <iuriish@yahoo.com> * - correct mkl dilation formula and switch off mkl api for dilation deconvolutions Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-21 21:17:30 +02:00
raver119	064a56ccf1	Few fixes (#66 ) * skip legacy transforms execution in case of empty input arrays Signed-off-by: raver119 <raver119@gmail.com> * - BroadcastBool ops now accept extraParams to make MatchCondition possible - TrueBroadcastHelper now uses samediff::threads Signed-off-by: raver119 <raver119@gmail.com> * java side Signed-off-by: raver119 <raver119@gmail.com> * trigger jenkins Signed-off-by: raver119 <raver119@gmail.com> * update LessThanOrEqual opNum mapping Signed-off-by: raver119 <raver119@gmail.com> * update LessThanOrEqual opNum mapping Signed-off-by: raver119 <raver119@gmail.com>	2019-11-21 15:43:03 +03:00
raver119	83cb0d9329	[WIP] Create and small fix (#67 ) * - create op - skip exec for empty inputs for non_max_suppression - EmptyHandling idea Signed-off-by: raver119 <raver119@gmail.com> * Create op and mapping for it Signed-off-by: raver119 <raver119@gmail.com>	2019-11-21 13:31:20 +03:00
shugeo	dc0036f2c6	Shugeo image resize bicubic (#56 ) * Added implementation files for image_resize and resize_bicubic ops. * Image resize and image.resize_bicubic ops implementation. Initial revision. * Finished with infrastructure development for image.resize_bilinear op and image_resizo op implementation. * Refactored resize methods. * Added processing for Mitchelcubic algorithm. * Added check for input/output sizes. * Added int and float types for crop_and_resize op. * Refactored crop_and_resize output type check. * Added helper for bicubic interpolation as TF v.1 does. * Added TF v.1 bicubic helper for cuda platform. * Added cached class for bicubic algorithm. * Refactored cuda implementation for crop_and_resize helper to use proper output type. * Added facilities for bicubic interpolation. * Portion bicubic interpolation from TF. * Added tests for resize_bilinear testing. * Working implementation of bicubic interpolation and tests. * Refactored routines with image_resize bicubic op helper. * Refactored code with coding standards. * Refactored cpu helpers for resize_bicubic op. * Refactored bicubic helpers. * Added bicubic resize facilities. * Implementing cuda kernels for bicubic interpolation. Implementation step. * Cuda implementation of resize_bicubic op helper. * Refactor image.resize_bicubic op helpers. * Refactored helpers for resize_bicubic. Added error checking with cuda implementation. * Refactored cuda implementation of resize_bicubic op helper. The first working revision. * Cuda arch implementation for resize_bicubic op helper. Full working single-threaded revision. * Intermediate bicubic interpolation helper for cuda. * Refactored cpu helper for resize_bicubic. * Multithreaded cuda implementation for resize_bicubic. * Fixed merge issues. * Refactored nlp helpers. * Replicated resize_bicubic for 3D also. * Eliminated waste comments of unused code. * Eliminated waste comments with unused code. * Eliminated waste template definitions. 
* Eliminated waste debug code. * Eliminated waste comments. * Fixed multithreading with helpers. * Fixed test suites for float and double in float point input lists. * Fixed usage of reshape with 3D/4D on resizes. * Final fixes. * Fixed resize_neighbor op problem.	2019-11-20 21:11:04 +02:00
shugeo	13e5c0a280	Shugeo release fix1 (#61 ) * Added a pair of tests for failed ops. * Fixed cpu helper for draw_bounding_boxes op. * Refactored implementation of draw_bounding_boxes op to full conform with TF. * Improved multithreading with draw_bounding_boxes op cuda helper. * Eliminated log messages. * Changed logging with draw_bounding_boxes op helper and tests. * Resize_bilinear with 3D input allowed. * Refactored 3D input acceptance with resize_bilinear op. * And another improvement. * Refactored reshape of input/output for resize_bilinear. * Improvements final. * Finished with 3D replication for image.resize_bilinear/_nearest_neighbor. * Added copyrights for TF code. * Using new form of multithreading for cpu implementation. * Fixed shape error. * Added multithreaded with batches on crop_and_resize functor. * Refactored multithreading with crop_and_resize and draw_bounding_boxes.	2019-11-20 13:37:48 +02:00
raver119	59e955cedc	- MKL-DNN version upgrade to 1.1.x (#62 ) - MKL-DNN namespace changes to match DNNL rename Signed-off-by: raver119 <raver119@gmail.com>	2019-11-20 13:23:08 +03:00
raver119	7898f3c0cc	fix for is_increasing/non_decreasing ops for empty input case (#63 ) Signed-off-by: raver119 <raver119@gmail.com>	2019-11-20 11:12:15 +03:00
Yurii Shyrma	66b84b38cf	Shyrma mmul (#58 ) * - get rid of some copy procedures in mmulHelper ops Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on embedding cuda api for batched gemm (cublasGemmBatchedEx) in our mmulHelper class Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on cuda batched gemm api Signed-off-by: Yurii <iuriish@yahoo.com> * - write own cuda kernel performing batched gemm Signed-off-by: Yurii <iuriish@yahoo.com> * missing include in MmulHelper Signed-off-by: raver119 <raver119@gmail.com> * - forgot to keep in code previous correct kernels for mmulNxN, since it may happen that new one will fail for some reason in future Signed-off-by: Yurii <iuriish@yahoo.com> * disable old tensordot Signed-off-by: raver119 <raver119@gmail.com> * - rewrite cuda kernels for usualGemm and usualGemv Signed-off-by: Yurii <iuriish@yahoo.com> * - profiling mmul helpers Signed-off-by: Yurii <iuriish@yahoo.com> * - prints to check shapes were added Signed-off-by: Yurii <iuriish@yahoo.com> * - correct type of output array C in mmulNxN Signed-off-by: Yurii <iuriish@yahoo.com> * - take into account possible nans in C array Signed-off-by: Yurii <iuriish@yahoo.com> * slightly change numThreads message Signed-off-by: raver119 <raver119@gmail.com> * - make corrections in accordance to given notes in pr review Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-19 15:39:36 +02:00
Alex Black	da1944e8e1	SameDiff TF import (#49 ) * Added implementation files for image_resize and resize_bicubic ops. * Image resize and image.resize_bicubic ops implementation. Initial revision. * Minor fix * Some TF imports disabled. * Finished with infrastructure development for image.resize_bilinear op and image_resizo op implementation. * Refactored resize methods. * Added processing for Mitchelcubic algorithm. * adjust_contrast * Small fix for TF import expected value loading when variable name starts with the test name Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tests * Tests added. * Removed tf names absent in mapping. * Some fixes. * Small fixes * Minor change * Some failing tests. * Disable failed test * Ignore some tests * Fix import class mapping Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix float property mapping (flatbuffers) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Override equality function for model 'dropout' Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fail tests * Failed tests ignored temporarily. * Minor fixes * Small fix * Conflict resolved * Default implementations of tensorflowName and onnxName	2019-11-19 22:44:29 +11:00
raver119	bbd59a3537	fake quant dtype validation fix (#60 ) Signed-off-by: raver119 <raver119@gmail.com>	2019-11-19 12:53:52 +03:00
raver119	db7ca956c5	[WIP] Mish (#55 ) * Mish activation function and its derivative Signed-off-by: raver119 <raver119@gmail.com> * signature fix Signed-off-by: raver119 <raver119@gmail.com> * mish as activation for dl4j Signed-off-by: raver119 <raver119@gmail.com> * javadoc Signed-off-by: raver119 <raver119@gmail.com> * minor optimization Signed-off-by: raver119 <raver119@gmail.com>	2019-11-18 13:21:26 +03:00
raver119@gmail.com	9101a0ee15	build fix for clang Signed-off-by: raver119@gmail.com <raver119@gmail.com>	2019-11-16 22:18:50 +03:00
Alex Black	09a827fb6d	Fixes and pre-release QA (#51 ) * #8395 Keras import - support scaled identity weight init Signed-off-by: AlexDBlack <blacka101@gmail.com> * More Keras scaled weight init fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8352 Deprecate duplicate SamplingDataSetIterator class Signed-off-by: AlexDBlack <blacka101@gmail.com> * Remove /O2 optimization for faster CUDA build Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tweak regression test precision for CUDA Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix edge cases for buffer creation Signed-off-by: AlexDBlack <blacka101@gmail.com> * Update MKLDNN validation tests to new helper enable/disable settings Signed-off-by: AlexDBlack <blacka101@gmail.com> * Delete debugging class Signed-off-by: AlexDBlack <blacka101@gmail.com> * MKLDNN test - add proper skip for CUDA backend Signed-off-by: AlexDBlack <blacka101@gmail.com> * Align WeightInitUtil with weight init classes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix for SameDiff test layers weight init when using IWeightInit classes Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-16 17:04:29 +11:00
raver119	1780dcc883	[WIP] Small fixes here and there (#50 ) * one range test Signed-off-by: raver119 <raver119@gmail.com> * few Context convenience signatures Signed-off-by: raver119 <raver119@gmail.com> * one more range test Signed-off-by: raver119 <raver119@gmail.com> * "range" "fix" Signed-off-by: raver119 <raver119@gmail.com> * adjust_contrast_v2 now allows scale factor to be provided via input_variable Signed-off-by: raver119 <raver119@gmail.com> * adjust_contrast now allows scale factor as variable too Signed-off-by: raver119 <raver119@gmail.com> * bitcast shape tests Signed-off-by: raver119 <raver119@gmail.com> * BitCast import dtype added Signed-off-by: raver119 <raver119@gmail.com> * few more BitCast signatures Signed-off-by: raver119 <raver119@gmail.com>	2019-11-15 17:04:29 +03:00
Yurii Shyrma	62d8e0d409	- make agreement between our and mkl api dilation/padding formulas (#47 ) Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-14 20:21:22 +03:00
raver119	1eb3de90d7	[WIP] Platform helpers switches (#44 ) * - platform helpers can be disabled on per-op basis now via Context::allowHelpers - java has access to it as well Signed-off-by: raver119 <raver119@gmail.com> * global platform-helpers trigger Signed-off-by: raver119 <raver119@gmail.com> * few signatures renamed Signed-off-by: raver119 <raver119@gmail.com> * - few new env variables to follow - maxThreads/masterThreads differentiation Signed-off-by: raver119 <raver119@gmail.com> * Javadoc update Signed-off-by: raver119 <raver119@gmail.com>	2019-11-14 14:35:02 +03:00
Alex Black	47d19908f4	Various fixes (#43 ) * #8172 Enable DL4J MKLDNN batch norm backward pass Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8382 INDArray.toString() rank 1 brackets / ambiguity fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8308 Fix handful of broken links (inc. some in errors) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 1 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 2 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 3 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Uniform distribution TF import fix Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-14 19:38:20 +11:00
raver119	48df1acdfb	[WIP] ThreadPool (#8 ) This PR removes OpenMP use in 95% of cases	2019-11-13 17:04:59 +03:00
raver119	f05c6ee139	INLINE_LOOPS for windows Signed-off-by: raver119 <raver119@gmail.com>	2019-11-12 15:12:31 +03:00
Alex Black	18c01f5bdc	Add SameDiff memory reuse memory manager (array cache) (#39 ) * Attention op comments Signed-off-by: AlexDBlack <blacka101@gmail.com> * ArrayCacheMemoryMgr - first pass Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tweak array cache for use with SameDiff identity arrays Signed-off-by: AlexDBlack <blacka101@gmail.com> * ArrayCacheMemoryMgr javadoc and properly get max memory Signed-off-by: AlexDBlack <blacka101@gmail.com> * LRU cache policy + add tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Resize arrays internally if required for ArrayCacheMemoryMgr Signed-off-by: AlexDBlack <blacka101@gmail.com> * Test improvement Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small polish Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-12 21:15:44 +11:00
Yurii Shyrma	0eda1e733e	Shyrma bnorm bp (#41 ) Batchnorm backprop mkldnn	2019-11-12 11:58:48 +03:00
raver119	cd961727bb	[WIP] perf tests (#40 ) * special maxpool test Signed-off-by: raver119 <raver119@gmail.com> * special maxpool test Signed-off-by: raver119 <raver119@gmail.com>	2019-11-11 17:45:59 +03:00
raver119	929c1dc5c7	- new NDArrayFactory scalar constructor - minor tweak in randomuniform - one more test Signed-off-by: raver119 <raver119@gmail.com>	2019-11-08 08:49:41 +03:00
raver119	51f3a1371d	[WIP] Random Uniform (#36 ) * args Signed-off-by: raver119@gmail.com <raver119@gmail.com> * T args Signed-off-by: raver119 <raver119@gmail.com>	2019-11-07 17:09:47 +03:00
shugeo	679e42199a	Shugeo strided slice bp fix2 (#33 ) * Fixed crash and restored broken functionality for strided slice. * Added comments for strided_slice_bp main step.	2019-11-07 13:44:02 +03:00
raver119	4276e63054	one more test Signed-off-by: raver119 <raver119@gmail.com>	2019-11-07 08:49:27 +03:00
shugeo	08853c7829	Shugeo random uniform int (#30 ) * Corrected randomuniform declaration. * Refactored uniform distribution for both cuda and cpu platforms. * Refactored uniform distribution and tests. * Fixed type usage with indices. * Refactored uniform distribution implementation and tests to full conform with TF implementation. * Refactored gamma function to use type util method. * Copyright changes and fixes with ConstantHelper. * Added error checking on allocate cuda device memory and operations.	2019-11-06 12:49:27 +02:00
AlexDBlack	7583ccfa15	Merge Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-06 13:28:03 +11:00
Yurii Shyrma	871f3bb3e6	- add additional condition in svd helper to take into account rounding errors (#31 ) Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-05 17:16:17 +02:00
shugeo	9124974e3b	Fixed crash with strided_slice_bp op and tests. (#29 )	2019-11-05 12:49:15 +02:00
shugeo	7b14a9f603	Gamma and Poisson distributions (#27 ) * Added implementation for random_gamma op. * Added implementation for random_poisson op and support classes. * Added helpers for random_poisson and random_gamma ops. * Implementation of random_poisson. The first working edition. * Implementation of random_poisson. Parallelized working edition. * Implementation of random_gamma. Parallelized working edition with alpha only. * Added cuda implementation for helper of poisson distribution. * Corrected shape calculation with random_gamma and tests. * Finished cpu implementation for gamma distribution. * Finished cuda implementation for random_gamma op. * Refactored cpu helpers for random_gamma and random_poisson ops. * Refactored cuda helpers for gamma and poisson distribution. * Refactored cuda helper for gamma distribution. * Refactored cpu helper for random_poisson op. * Refactored cpu helper for random_gamma op.	2019-11-04 15:42:28 +02:00
Alex Black	948ebef41c	Op Fixes (#28 ) * #8280 biasadd_bp nchw arg fixes (java side) + test Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8285 Concat op Java side fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Concat op cpp fix - allow dynamic axis to be negative, same as static axis Signed-off-by: AlexDBlack <blacka101@gmail.com> * ignores for deconv3d import tests until deconv3d_tf op is implemented Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-05 00:05:04 +11:00
Yurii Shyrma	0cdb5750e0	Shyrma concat (#24 ) * - provide possibility to pass axis as last input array in concat op - correct summation in bias_add_bp op for NHWC case Signed-off-by: Yurii <iuriish@yahoo.com> * - write code for deconv2d op based on mkl dnn api * no unsafe math Signed-off-by: raver119 <raver119@gmail.com> * no unsafe math Signed-off-by: raver119 <raver119@gmail.com> * - get rid of e<> and p<> methods in svd helper Signed-off-by: Yurii <iuriish@yahoo.com> * - provide mkl api support for deconvolution 3d Signed-off-by: Yurii <iuriish@yahoo.com> * - write deconv2d_bp based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - write deconv3d_bp based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing deconv based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - remove dilation from conv2d/3d mkl Signed-off-by: Yurii <iuriish@yahoo.com> * - minor changes Signed-off-by: Yurii <iuriish@yahoo.com> * - further corrections of deconv ops based on mkl dnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - provide deconv2d_tf based on mkl dnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - add minor corrections required by reviewer Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-03 12:37:19 +02:00
raver119	c94013f0a1	cc 52 -> 50 Signed-off-by: raver119 <raver119@gmail.com>	2019-11-03 09:54:35 +03:00
raver119	879a06c913	few typos fixed Signed-off-by: raver119 <raver119@gmail.com>	2019-11-01 09:13:15 +03:00
Alexander Stoyakin	45a40c8a89	DL4J/ND4J: Do pass on integer casts (#15 ) * Int cast fixes. * Revert "Int cast fixes." This reverts commit aa36e8ca * Int casts * Int cast * Int casts * Get rid of int casts. Dropping deprecated aggregate ops. * java scatterUpdate changes Signed-off-by: raver119 <raver119@gmail.com> * c++ scatterUpdate changes Signed-off-by: raver119 <raver119@gmail.com> * Remove aggregated ops. * Restored test * Tests restored. * Minor fixes	2019-10-31 11:23:09 +02:00
shugeo	95f7ad7b94	Shugeo suppression overlaps (#9 ) * Added non_max_suppression_overlaps op and tests. * Refactored implementation of non_max_suppression_overlaps. * Refactoring of implementation of non_max_suppression_overlaps op. * Refactoring of implementation of non_max_suppression op. * Fixed portion error. * Added cuda frontends for image suppression ops. * Eliminated crash with cuda arch on image.non_max_suppression_overlaps op. * Improved implementation of image_suppression helper for cpu platform. * The generic approach of non_max_suppression_overlaps op helper with cuda platform. * Working cuda implementation of helper non_max_suppression_overlaps op. * Eliminated waste comments. * Improved implementations for both platforms * Refactored cuda implementation of image.non_max_suppression_overlaps op helper. * Improved cuda implementation of non_max_suppression op helper. * Refactored cuda implementation of image.non_max_suppression_overlaps op helper. * Improved cuda implementation of image.non_max_suppression_overlaps op helper. * Added modifications into cuda implementation for image suppression overlaps op. * Correct queue emulation with cuda implementation of non_max_suppression_overlaps op. * Prefinal stage of cuda implementation of non_max_suppression_overlaps. * Worked cuda implementation of non_max_suppresion_overlaps helper. * Fixed return to proper thread. * Improvements for cuda implementation of image.non_max_suppression_overlaps op helper. * Fixed implementation issues with non_max_suppression_overlaps on cuda platform. * Fixed skip for non_max_suppression_overlaps on cuda platform. * Finalize implementation of image_suppression helper and tests. * Cosmetic changes only.	2019-10-30 13:43:45 +02:00
Yurii Shyrma	029a69a835	Shyrma bn mkl bp (#14 ) * - write code for new batchnorm backprop Signed-off-by: Yurii <iuriish@yahoo.com> * - testing batchnorm backprop Signed-off-by: Yurii <iuriish@yahoo.com> * - write code for batchnorm backprop based on mkl dnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in batchnorm_bp mkl dnn Signed-off-by: Yurii <iuriish@yahoo.com> * - made corrections required by reviewer Signed-off-by: Yurii <iuriish@yahoo.com> * - change name in java wrapper for batchnorm op Signed-off-by: Yurii <iuriish@yahoo.com>	2019-10-26 14:14:21 +03:00
Alex Black	d333d29099	SameDiff cleanup and fixes (#12 ) * #8160 Remove resolvePrepertiesFromSameDiffBeforeExecution Signed-off-by: AlexDBlack <blacka101@gmail.com> * SameDiff API cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * More SameDiff cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8248 Switch SameDiff variable init from lazy to creation time for more predictable behaviour Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8252 TanhDerivative javadoc Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8225 Deconvolution2D input validation Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8265 Switch SameDiff.outputs() to user settable, instead of unreliable 'best guess' Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8224 SameDiff.zero and .one create constants, not variables Signed-off-by: AlexDBlack <blacka101@gmail.com> * More cleanup and fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small test fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * DL4J SameDiff fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Re-add hack for Deconvolution2DLayer until #8315 is resolved Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8270 Move CUDA device/version logging to Java; can be disabled via existing org.nd4j.log.initialization system property Signed-off-by: AlexDBlack <blacka101@gmail.com> * All ND4J init logging checks system property Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small tweak Signed-off-by: AlexDBlack <blacka101@gmail.com> * Remove redundant device logging Signed-off-by: AlexDBlack <blacka101@gmail.com> * One more fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * UX improvements Signed-off-by: AlexDBlack <blacka101@gmail.com> * Deconv fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add deconv tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup Signed-off-by: 
AlexDBlack <blacka101@gmail.com> * Remove debug code Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-10-26 12:38:08 +11:00
Alex Black	3f0b4a2d4c	SameDiff execution, TF and memory management overhaul (#10 ) * SameDiff execution memory management improvements, round 1 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Round 2 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Round 3 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Clear node outputs closed array references; Slight change to OpValidation internals to not rely on cached op outputs Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next step Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * More polish Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add WeakIdentityHashmap Signed-off-by: AlexDBlack <blacka101@gmail.com> * Session fixes for control ops and next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * More fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * First steps for training session + in-line updating Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix losses and history during training Signed-off-by: AlexDBlack <blacka101@gmail.com> * More fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * BiasAdd and other fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Don't use SDVariable.getArr() in TFGraphTestAllHelper (import tests) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * First steps for new dependency tracking approach Signed-off-by: AlexDBlack <blacka101@gmail.com> * Start integrating dependency tracking for memory management Signed-off-by: AlexDBlack <blacka101@gmail.com> * Non-control op dependency tracking works/passes Signed-off-by: AlexDBlack <blacka101@gmail.com> * 
Switch/merge Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup and next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix issue dependency tracking for initial variables/constants Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add check for aliases when determining if safe to close array Signed-off-by: AlexDBlack <blacka101@gmail.com> * First pass on new TF graph import class Signed-off-by: AlexDBlack <blacka101@gmail.com> * Import fixes, op fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup and fixes for new TF import mapper Signed-off-by: AlexDBlack <blacka101@gmail.com> * More cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Partial implementation of new dependency tracker Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * AbstractDependencyTracker for shared code Signed-off-by: AlexDBlack <blacka101@gmail.com> * Overhaul SameDiff graph execution (dependency tracking) Signed-off-by: AlexDBlack <blacka101@gmail.com> * More fixes, cleanup, next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Ad no-op memory manager, cleanup, fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix switch dependency tracking Signed-off-by: AlexDBlack <blacka101@gmail.com> * INDArray.toString: no exception on closed arrays, just note closed Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix enter and exit dependency tracking Signed-off-by: AlexDBlack <blacka101@gmail.com> * TensorArray memory management fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add unique ID for INDArray instances Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix memory management for NextIteration outputs in multi-iteration loops Signed-off-by: AlexDBlack <blacka101@gmail.com> * Remove (now unnecessary) special case handling for nested enters Signed-off-by: 
AlexDBlack <blacka101@gmail.com> * Cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Handle control dependencies during execution; javadoc for memory managers Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup, polish, code comments, javadoc Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup and more javadoc Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add memory validation for all TF import tests - ensure all arrays (except outputs) are released Signed-off-by: AlexDBlack <blacka101@gmail.com> * Clean up arrays waiting on unexecuted ops at the end of execution Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fixes for enter op memory managent in the context of multiple non-nested loops/frames Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix order of operation issues for dependency tracker Signed-off-by: AlexDBlack <blacka101@gmail.com> * Always clear op fields after execution to avoid leaks or unintended array reuse Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Re-implement dtype conversion Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix for control dependencies execution (dependency tracking) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix TF import overrides and filtering Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix for constant enter array dependency tracking Signed-off-by: AlexDBlack <blacka101@gmail.com> * DL4J Fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * More DL4J fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Cleanup and polish Signed-off-by: AlexDBlack <blacka101@gmail.com> * More polish and javadoc Signed-off-by: AlexDBlack <blacka101@gmail.com> * More logging level tweaks, small DL4J fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix to DL4J SameDiffLayer Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix empty array deserialization, add extra deserialization checks Signed-off-by: AlexDBlack 
<blacka101@gmail.com> * FlatBuffers control dep serialization fixes; test serialization as part of all TF import tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * Variable control dependencies serialization fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix issue with removing inputs for ops Signed-off-by: AlexDBlack <blacka101@gmail.com> * FlatBuffers NDArray deserialization fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * FlatBuffers NDArray deserialization fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Final cleanup/polish Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-10-23 21:19:50 +11:00
Alexander Stoyakin	f31661e13b	Merge pull request #7 from KonduitAI/asto_nd4s_10172019 KDTree optimization	2019-10-23 12:11:25 +03:00
Yurii	8f3eaebda5	- replace condition isScalar() by condition length ==1 in some NDArray methods Signed-off-by: Yurii <iuriish@yahoo.com>	2019-10-21 16:25:13 +03:00
Yurii	99be467f76	- minor change in recurrent.h Signed-off-by: Yurii <iuriish@yahoo.com>	2019-10-17 20:46:51 +03:00
Yurii	70bd925abd	- write 2 versions of new lstmLayer: one is based on own code, second uses mkl dnn api	2019-10-17 20:44:52 +03:00


288 Commits