Alex Black
|
3f0b4a2d4c
|
SameDiff execution, TF and memory management overhaul (#10)
* SameDiff execution memory management improvements, round 1
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 2
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 3
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clear node outputs closed array references; Slight change to OpValidation internals to not rely on cached op outputs
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next step
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add WeakIdentityHashmap
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Session fixes for control ops and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for training session + in-line updating
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix losses and history during training
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* BiasAdd and other fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't use SDVariable.getArr() in TFGraphTestAllHelper (import tests)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for new dependency tracking approach
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Start integrating dependency tracking for memory management
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Non-control op dependency tracking works/passes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch/merge
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix dependency tracking issue for initial variables/constants
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add check for aliases when determining if safe to close array
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First pass on new TF graph import class
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Import fixes, op fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and fixes for new TF import mapper
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Partial implementation of new dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* AbstractDependencyTracker for shared code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Overhaul SameDiff graph execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes, cleanup, next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add no-op memory manager, cleanup, fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix switch dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* INDArray.toString: no exception on closed arrays, just note closed
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix enter and exit dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* TensorArray memory management fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add unique ID for INDArray instances
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix memory management for NextIteration outputs in multi-iteration loops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove (now unnecessary) special case handling for nested enters
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Handle control dependencies during execution; javadoc for memory managers
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup, polish, code comments, javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and more javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add memory validation for all TF import tests - ensure all arrays (except outputs) are released
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clean up arrays waiting on unexecuted ops at the end of execution
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes for enter op memory management in the context of multiple non-nested loops/frames
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix order of operation issues for dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Always clear op fields after execution to avoid leaks or unintended array reuse
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Re-implement dtype conversion
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for control dependencies execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix TF import overrides and filtering
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for constant enter array dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More DL4J fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish and javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More logging level tweaks, small DL4J fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix to DL4J SameDiffLayer
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix empty array deserialization, add extra deserialization checks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers control dep serialization fixes; test serialization as part of all TF import tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Variable control dependencies serialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix issue with removing inputs for ops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Final cleanup/polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
|
2019-10-23 21:19:50 +11:00 |
Alexander Stoyakin
|
f31661e13b
|
Merge pull request #7 from KonduitAI/asto_nd4s_10172019
KDTree optimization
|
2019-10-23 12:11:25 +03:00 |
raver119
|
35e6ffede4
|
Merge pull request #6 from KonduitAI/shyrma_broadcast
- replace condition isScalar() by condition length ==1 in some NDArra…
|
2019-10-22 07:56:04 +03:00 |
Yurii
|
8f3eaebda5
|
- replace condition isScalar() by condition length ==1 in some NDArray methods
Signed-off-by: Yurii <iuriish@yahoo.com>
|
2019-10-21 16:25:13 +03:00 |
Alexander Stoyakin
|
775402c934
|
Merge pull request #5 from KonduitAI/asto_nd4s_build_10192019
Fix nd4s compilation
|
2019-10-19 21:57:02 +03:00 |
Alexander Stoyakin
|
b556519e15
|
Fix compilation.
|
2019-10-19 21:48:43 +03:00 |
raver119
|
a4984d52d7
|
Merge pull request #4 from KonduitAI/shyrma_lstm
Shyrma lstm
|
2019-10-17 20:58:06 +03:00 |
Yurii
|
99be467f76
|
- minor change in recurrent.h
Signed-off-by: Yurii <iuriish@yahoo.com>
|
2019-10-17 20:46:51 +03:00 |
Yurii
|
70bd925abd
|
- write 2 versions of new lstmLayer: one is based on own code, second uses mkl dnn api
|
2019-10-17 20:44:52 +03:00 |
Alexander Stoyakin
|
630bb3c9b6
|
Merge pull request #2 from KonduitAI/asto_ops_wrapper
[WIP] New ops wrapper
|
2019-10-16 20:21:50 +03:00 |
Alexander Stoyakin
|
9e5799847a
|
TF names fixed.
|
2019-10-16 19:50:18 +03:00 |
Alexander Stoyakin
|
502bedf5d5
|
Register ops for TF import.
|
2019-10-16 19:39:04 +03:00 |
Alexander Stoyakin
|
ec722b20ee
|
TF names added
|
2019-10-16 19:29:19 +03:00 |
Alexander Stoyakin
|
99d77e1384
|
Ops exported for sameDiff
|
2019-10-16 19:16:47 +03:00 |
shugeo
|
3662657d5c
|
Merge pull request #1 from KonduitAI/shugeo_gamma
Shugeo gamma
|
2019-10-16 18:49:33 +03:00 |
shugeo
|
24a2b2933f
|
Added gamma and lgamma functions.
|
2019-10-16 18:22:18 +03:00 |
Alexander Stoyakin
|
96a9a1a733
|
Fixed output from operation.
|
2019-10-16 18:07:52 +03:00 |
shugeo
|
7617682a46
|
Added declarations for igamma and igammac ops.
|
2019-10-16 14:45:10 +03:00 |
shugeo
|
478a0c1f97
|
Added igamma and igammac broadcastable ops implementations and tests.
|
2019-10-16 14:02:53 +03:00 |
shugeo
|
7103aca8c5
|
Added broadcastable IGamma and IGammac ops.
|
2019-10-16 13:58:32 +03:00 |
shugeo
|
f90e6da97e
|
Added nd4j_gamma, nd4j_igamma and nd4j_igammac functions.
|
2019-10-16 13:53:31 +03:00 |
Alexander Stoyakin
|
c4307384f3
|
Fixed shape for muli
|
2019-10-16 12:59:25 +03:00 |
Alexander Stoyakin
|
d5002b14c7
|
New ops wrappers
|
2019-10-16 12:59:08 +03:00 |
shugeo
|
df2448613e
|
Added gamma distribution functions.
|
2019-10-15 20:00:07 +03:00 |
AlexDBlack
|
2d750b69e5
|
Merge remote-tracking branch 'konduit/master'
|
2019-10-14 17:21:23 +11:00 |
raver119
|
d5568df7b3
|
Merge pull request #10 from KonduitAI/shugeo_fake_quant2
Shugeo fake quant2
|
2019-10-10 18:40:02 +03:00 |
shugeo
|
ace65355c5
|
Added doc for fake_quant_with_min_max* op helpers cuda implementations.
|
2019-10-10 18:35:28 +03:00 |
shugeo
|
c890de5a7b
|
Added doc for fake_quant_with_min_max* op helpers implementations.
|
2019-10-10 18:31:17 +03:00 |
shugeo
|
c3f755d975
|
Refactored helpers both for cuda and cpu platforms.
|
2019-10-10 18:02:49 +03:00 |
shugeo
|
a09cb5e2be
|
Added doc for fake_quant_with_min_max_per_channel op declaration.
|
2019-10-10 17:13:33 +03:00 |
shugeo
|
92636b0b86
|
Eliminated waste operator.
|
2019-10-10 17:08:59 +03:00 |
shugeo
|
d5b352273d
|
Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op. Final revision.
|
2019-10-10 16:51:29 +03:00 |
shugeo
|
02d8616692
|
Implementation of cuda kernel for fake_quant_with_min_max_vars_per_channels op.
|
2019-10-10 16:40:56 +03:00 |
shugeo
|
3504b0cda9
|
Implemented fake_quant_with_min_max_vars_per_channel op cuda helper. The first working revision.
|
2019-10-10 15:44:50 +03:00 |
shugeo
|
753565145c
|
Refactored fake_quant_with_min_max_vars op cuda implementation.
|
2019-10-10 14:00:49 +03:00 |
shugeo
|
c13e945a96
|
Fixed fake_quant_with_min_max_vars op and tests.
|
2019-10-10 13:23:11 +03:00 |
Alexander Stoyakin
|
fa8105bf0f
|
Merge pull request #9 from KonduitAI/asto_nd4s_indexing
[WIP] Indexing for INDArray in nd4s
|
2019-10-10 11:03:06 +03:00 |
Alexander Stoyakin
|
156fba4f77
|
Minor fix
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-10 10:49:07 +03:00 |
Alexander Stoyakin
|
be70bea359
|
Operator :: for INDArray
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-10 10:43:32 +03:00 |
shugeo
|
3c0c59ab88
|
Refactored fake_quant_with_min_max_vars op.
|
2019-10-09 22:09:33 +03:00 |
shugeo
|
352f1eee80
|
Implemented fake_quant_with_min_max_per_channel helper for cpu platform. The first approach.
|
2019-10-09 21:39:59 +03:00 |
Alexander Stoyakin
|
0de8523ebd
|
Indexing syntax changed
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-09 16:07:20 +03:00 |
shugeo
|
d0cbd33b0e
|
Added input checks for op.
|
2019-10-09 15:52:13 +03:00 |
shugeo
|
3a89e51811
|
Added tests for fake_quant_with_min_max_vars_per_channel op.
|
2019-10-09 13:38:18 +03:00 |
Alexander Stoyakin
|
f515151f5f
|
Merge pull request #8 from KonduitAI/asto_nd4s
[WIP] Indexing improvements in nd4s
|
2019-10-09 11:39:51 +03:00 |
Alexandre Boulanger
|
3aa51e210a
|
RL4J: Extract TD Target calculations (StandardDQN and DoubleDQN) (#8267)
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
|
2019-10-09 09:14:47 +09:00 |
Alexander Stoyakin
|
8472782d7f
|
More tests
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-08 19:08:43 +03:00 |
shugeo
|
cb56b0b06a
|
The first approach for fake_quant_with_min_max_vars_per_channel op implementation.
|
2019-10-08 19:00:41 +03:00 |
Alexander Stoyakin
|
4956048a11
|
Tests added
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-08 17:38:43 +03:00 |
Alexander Stoyakin
|
b36b3fa1aa
|
Some sugar
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
|
2019-10-08 16:41:12 +03:00 |