Alex Black
47d19908f4
Various fixes ( #43 )
...
* #8172 Enable DL4J MKLDNN batch norm backward pass
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8382 INDArray.toString() rank 1 brackets / ambiguity fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8308 Fix handful of broken links (inc. some in errors)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 1
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 2
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 3
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Uniform distribution TF import fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-11-14 19:38:20 +11:00
raver119
48df1acdfb
[WIP] ThreadPool ( #8 )
...
This PR removes OpenMP use in 95% of cases
2019-11-13 17:04:59 +03:00
raver119
f05c6ee139
INLINE_LOOPS for windows
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-12 15:12:31 +03:00
Alex Black
18c01f5bdc
Add SameDiff memory reuse memory manager (array cache) ( #39 )
...
* Attention op comments
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ArrayCacheMemoryMgr - first pass
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tweak array cache for use with SameDiff identity arrays
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ArrayCacheMemoryMgr javadoc and properly get max memory
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* LRU cache policy + add tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Resize arrays internally if required for ArrayCacheMemoryMgr
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test improvement
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-11-12 21:15:44 +11:00
Yurii Shyrma
0eda1e733e
Shyrma bnorm bp ( #41 )
...
Batchnorm backprop mkldnn
2019-11-12 11:58:48 +03:00
raver119
929c1dc5c7
- new NDArrayFactory scalar constructor
...
- minor tweak in randomuniform
- one more test
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-08 08:49:41 +03:00
raver119
51f3a1371d
[WIP] Random Uniform ( #36 )
...
* args
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* T args
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-07 17:09:47 +03:00
shugeo
679e42199a
Shugeo strided slice bp fix2 ( #33 )
...
* Fixed crash and restored broken functionality for strided slice.
* Added comments for strided_slice_bp main step.
2019-11-07 13:44:02 +03:00
shugeo
08853c7829
Shugeo random uniform int ( #30 )
...
* Corrected randomuniform declaration.
* Refactored uniform distribution for both cuda and cpu platforms.
* Refactored uniform distribution and tests.
* Fixed type usage with indices.
* Refactored uniform distribution implementation and tests to fully conform with TF implementation.
* Refactored gamma function to use type util method.
* Copyright changes and fixes with ConstantHelper.
* Added error checking on allocate cuda device memory and operations.
2019-11-06 12:49:27 +02:00
Yurii Shyrma
871f3bb3e6
- add additional condition in svd helper to take into account rounding errors ( #31 )
...
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-11-05 17:16:17 +02:00
shugeo
9124974e3b
Fixed crash with strided_slice_bp op and tests. ( #29 )
2019-11-05 12:49:15 +02:00
shugeo
7b14a9f603
Gamma and Poisson distributions ( #27 )
...
* Added implementation for random_gamma op.
* Added implementation for random_poisson op and support classes.
* Added helpers for random_poisson and random_gamma ops.
* Implementation of random_poisson. The first working edition.
* Implementation of random_poisson. Parallelized working edition.
* Implementation of random_gamma. Parallelized working edition with alpha only.
* Added cuda implementation for helper of poisson distribution.
* Corrected shape calculation with random_gamma and tests.
* Finished cpu implementation for gamma distribution.
* Finished cuda implementation for random_gamma op.
* Refactored cpu helpers for random_gamma and random_poisson ops.
* Refactored cuda helpers for gamma and poisson distribution.
* Refactored cuda helper for gamma distribution.
* Refactored cpu helper for random_poisson op.
* Refactored cpu helper for random_gamma op.
2019-11-04 15:42:28 +02:00
Alex Black
948ebef41c
Op Fixes ( #28 )
...
* #8280 biasadd_bp nchw arg fixes (java side) + test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8285 Concat op Java side fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Concat op cpp fix - allow dynamic axis to be negative, same as static axis
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ignores for deconv3d import tests until deconv3d_tf op is implemented
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-11-05 00:05:04 +11:00
Yurii Shyrma
0cdb5750e0
Shyrma concat ( #24 )
...
* - provide possibility to pass axis as last input array in concat op
- correct summation in bias_add_bp op for NHWC case
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write code for deconv2d op based on mkl dnn api
* no unsafe math
Signed-off-by: raver119 <raver119@gmail.com>
* no unsafe math
Signed-off-by: raver119 <raver119@gmail.com>
* - get rid of e<> and p<> methods in svd helper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide mkl api support for deconvolution 3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write deconv2d_bp based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write deconv3d_bp based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing deconv based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove dilation from conv2d/3d mkl
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor changes
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further corrections of deconv ops based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide deconv2d_tf based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor corrections required by reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-11-03 12:37:19 +02:00
shugeo
95f7ad7b94
Shugeo suppression overlaps ( #9 )
...
* Added non_max_suppression_overlaps op and tests.
* Refactored implementation of non_max_suppression_overlaps.
* Refactoring of implementation of non_max_suppression_overlaps op.
* Refactoring of implementation of non_max_suppression op.
* Fixed portion error.
* Added cuda frontends for image suppression ops.
* Eliminated crash with cuda arch on image.non_max_suppression_overlaps op.
* Improved implementation of image_suppression helper for cpu platform.
* The generic approach of non_max_suppression_overlaps op helper with cuda platform.
* Working cuda implementation of helper non_max_suppression_overlaps op.
* Eliminated waste comments.
* Improved implementations for both platforms
* Refactored cuda implementation of image.non_max_suppression_overlaps op helper.
* Improved cuda implementation of non_max_suppression op helper.
* Refactored cuda implementation of image.non_max_suppression_overlaps op helper.
* Improved cuda implementation of image.non_max_suppression_overlaps op helper.
* Added modifications into cuda implementation for image suppression overlaps op.
* Correct queue emulation with cuda implementation of non_max_suppression_overlaps op.
* Prefinal stage of cuda implementation of non_max_suppression_overlaps.
* Working cuda implementation of non_max_suppression_overlaps helper.
* Fixed return to proper thread.
* Improvements for cuda implementation of image.non_max_suppression_overlaps op helper.
* Fixed implementation issues with non_max_suppression_overlaps on cuda platform.
* Fixed skip for non_max_suppression_overlaps on cuda platform.
* Finalize implementation of image_suppression helper and tests.
* Cosmetic changes only.
2019-10-30 13:43:45 +02:00
Yurii Shyrma
029a69a835
Shyrma bn mkl bp ( #14 )
...
* - write code for new batchnorm backprop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing batchnorm backprop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write code for batchnorm backprop based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in batchnorm_bp mkl dnn
Signed-off-by: Yurii <iuriish@yahoo.com>
* - made corrections required by reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change name in java wrapper for batchnorm op
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-10-26 14:14:21 +03:00
Alex Black
d333d29099
SameDiff cleanup and fixes ( #12 )
...
* #8160 Remove resolvePrepertiesFromSameDiffBeforeExecution
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* SameDiff API cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More SameDiff cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8248 Switch SameDiff variable init from lazy to creation time for more predictable behaviour
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8252 TanhDerivative javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8225 Deconvolution2D input validation
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8265 Switch SameDiff.outputs() to user settable, instead of unreliable 'best guess'
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8224 SameDiff.zero and .one create constants, not variables
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More cleanup and fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small test fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J SameDiff fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Re-add hack for Deconvolution2DLayer until #8315 is resolved
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8270 Move CUDA device/version logging to Java; can be disabled via existing org.nd4j.log.initialization system property
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* All ND4J init logging checks system property
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove redundant device logging
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* One more fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* UX improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Deconv fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add deconv tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove debug code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-10-26 12:38:08 +11:00
Alex Black
3f0b4a2d4c
SameDiff execution, TF and memory management overhaul ( #10 )
...
* SameDiff execution memory management improvements, round 1
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 2
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 3
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clear node outputs closed array references; Slight change to OpValidation internals to not rely on cached op outputs
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next step
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add WeakIdentityHashmap
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Session fixes for control ops and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for training session + in-line updating
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix losses and history during training
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* BiasAdd and other fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't use SDVariable.getArr() in TFGraphTestAllHelper (import tests)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for new dependency tracking approach
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Start integrating dependency tracking for memory management
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Non-control op dependency tracking works/passes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch/merge
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix issue dependency tracking for initial variables/constants
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add check for aliases when determining if safe to close array
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First pass on new TF graph import class
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Import fixes, op fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and fixes for new TF import mapper
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Partial implementation of new dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* AbstractDependencyTracker for shared code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Overhaul SameDiff graph execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes, cleanup, next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add no-op memory manager, cleanup, fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix switch dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* INDArray.toString: no exception on closed arrays, just note closed
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix enter and exit dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* TensorArray memory management fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add unique ID for INDArray instances
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix memory management for NextIteration outputs in multi-iteration loops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove (now unnecessary) special case handling for nested enters
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Handle control dependencies during execution; javadoc for memory managers
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup, polish, code comments, javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and more javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add memory validation for all TF import tests - ensure all arrays (except outputs) are released
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clean up arrays waiting on unexecuted ops at the end of execution
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes for enter op memory management in the context of multiple non-nested loops/frames
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix order of operation issues for dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Always clear op fields after execution to avoid leaks or unintended array reuse
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Re-implement dtype conversion
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for control dependencies execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix TF import overrides and filtering
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for constant enter array dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More DL4J fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish and javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More logging level tweaks, small DL4J fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix to DL4J SameDiffLayer
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix empty array deserialization, add extra deserialization checks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers control dep serialization fixes; test serialization as part of all TF import tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Variable control dependencies serialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix issue with removing inputs for ops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Final cleanup/polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-10-23 21:19:50 +11:00
Alexander Stoyakin
f31661e13b
Merge pull request #7 from KonduitAI/asto_nd4s_10172019
...
KDTree optimization
2019-10-23 12:11:25 +03:00
Yurii
99be467f76
- minor change in recurrent.h
...
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-10-17 20:46:51 +03:00
Yurii
70bd925abd
- write 2 versions of new lstmLayer: one is based on own code, second uses mkl dnn api
2019-10-17 20:44:52 +03:00
Alexander Stoyakin
630bb3c9b6
Merge pull request #2 from KonduitAI/asto_ops_wrapper
...
[WIP] New ops wrapper
2019-10-16 20:21:50 +03:00
shugeo
3662657d5c
Merge pull request #1 from KonduitAI/shugeo_gamma
...
Shugeo gamma
2019-10-16 18:49:33 +03:00
shugeo
24a2b2933f
Added gamma and lgamma functions.
2019-10-16 18:22:18 +03:00
Alexander Stoyakin
96a9a1a733
Fixed output from operation.
2019-10-16 18:07:52 +03:00
shugeo
7617682a46
Added declarations for igamma and igammac ops.
2019-10-16 14:45:10 +03:00
shugeo
478a0c1f97
Added igamma and igammac broadcastable ops implementations and tests.
2019-10-16 14:02:53 +03:00
shugeo
7103aca8c5
Added broadcastable IGamma and IGammac ops.
2019-10-16 13:58:32 +03:00
shugeo
f90e6da97e
Added nd4j_gamma, nd4j_igamma and nd4j_igammac functions.
2019-10-16 13:53:31 +03:00
shugeo
df2448613e
Added gamma distribution functions.
2019-10-15 20:00:07 +03:00
AlexDBlack
2d750b69e5
Merge remote-tracking branch 'konduit/master'
2019-10-14 17:21:23 +11:00
shugeo
ace65355c5
Added doc for fake_quant_with_min_max* op helpers cuda implementations.
2019-10-10 18:35:28 +03:00
shugeo
c890de5a7b
Added doc for fake_quant_with_min_max* op helpers implementations.
2019-10-10 18:31:17 +03:00
shugeo
c3f755d975
Refactored helpers both for cuda and cpu platforms.
2019-10-10 18:02:49 +03:00
shugeo
a09cb5e2be
Added doc for fake_quant_with_min_max_per_channel op declaration.
2019-10-10 17:13:33 +03:00
shugeo
92636b0b86
Eliminated waste operator.
2019-10-10 17:08:59 +03:00
shugeo
d5b352273d
Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op. Final revision.
2019-10-10 16:51:29 +03:00
shugeo
02d8616692
Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op.
2019-10-10 16:40:56 +03:00
shugeo
3504b0cda9
Implemented fake_quant_with_min_max_vars_per_channel op cuda helper. The first working revision.
2019-10-10 15:44:50 +03:00
shugeo
753565145c
Refactored fake_quant_with_min_max_vars op cuda implementation.
2019-10-10 14:00:49 +03:00
shugeo
c13e945a96
Fixed fake_quant_with_min_max_vars op and tests.
2019-10-10 13:23:11 +03:00
shugeo
3c0c59ab88
Refactored fake_quant_with_min_max_vars op.
2019-10-09 22:09:33 +03:00
shugeo
352f1eee80
Implemented fake_quant_with_min_max_per_channel helper for cpu platform. The first approach.
2019-10-09 21:39:59 +03:00
shugeo
d0cbd33b0e
Added input checks for op.
2019-10-09 15:52:13 +03:00
shugeo
cb56b0b06a
The first approach for fake_quant_with_min_max_vars_per_channel op implementation.
2019-10-08 19:00:41 +03:00
shugeo
8fe5a1fa96
The working implementation of draw_bounding_boxes op.
2019-10-08 15:42:27 +03:00
shugeo
30a8af566c
The first working implementation of cuda kernel for draw_bounding_boxes op helper.
2019-10-08 13:45:18 +03:00
shugeo
ae09cfee32
Next approach of cuda implementation for draw_bounding_boxes op helper.
2019-10-08 00:09:46 +03:00
shugeo
6cf3a8fa9c
Refactored cpu implementation and added cuda approach.
2019-10-07 17:51:07 +03:00
shugeo
78443ffebf
Working implementation of draw_bounding_boxes op for cpu.
2019-10-07 15:04:44 +03:00