Alex Black
ce02b6fae7
Small fixes ( #140 )
...
* Allow scalar op result array auto allocation
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't swallow underlying exception for calculateOutputShape execution failures
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ignore for known keras failure
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-12-21 17:00:46 +11:00
raver119
a5f5ac72b1
reduce bool changes ( #118 )
...
* reduce bool changes
Signed-off-by: raver119 <raver119@gmail.com>
* reduce bool tweaks
Signed-off-by: raver119 <raver119@gmail.com>
2019-12-09 20:08:59 +03:00
shugeo
e09a785232
Shugeo resize fix5 ( #102 )
...
* Refactored resize images ops to use TF-like bool args as input.
* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.
* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.
* Refactored nearest_neighbor resize op.
* Added a pair of tests for special case of resize_bilinear algorithm.
* Fixed issue with resize_bilinear op.
* Refactored cpu implementation for helpers with resize_nearest_neighbor op.
* Final fixed for resize ops to conform TF v.1.5
* Refactored cuda helpers for resize_neares_neighbor op.
* Fixed resize_bilinear to accept proper data.
* Fixed issue with non-float input for resize_bilinear op.
* Refactored cuda helper for resize_bilinear to proper process non-float inputs.
* Added tests for resize_bilinear to int inputs.
* Fixed ResizeBilinear wrapper
* Tests fixed
* Fixed float and bool constant to avoid overflow for some kind of compilers.
* Corrected float constants with float data type.
* Added f suffix for float constants.
* Corrected float constant to avoid overflow with initializing lists.
* Corrected float initializing list with float input.
* Corrected bool constant with initalizing list.
* Corrected float and bool values with initializing lists.
* Fixed wrong constant.
* Fixed issue with 1x1 input picture for resize.
* ResizeBilinear default values on import fix
Signed-off-by: raver119 <raver119@gmail.com>
2019-12-05 22:05:33 +03:00
Yurii Shyrma
d19eeaec52
Shyrma casual conv1d ( #90 )
...
* - add causal mode of padding to convolutions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add additional tests for causal conv1d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add causal mode for cuda conv kernels
Signed-off-by: Yurii <iuriish@yahoo.com>
* Java side of Conv1D changes
Signed-off-by: raver119 <raver119@gmail.com>
* Add Conv1DDerivative op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Causal Conv1D gradient checks
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks
Signed-off-by: Alex Black <blacka101@gmail.com>
* - add causal padding mode to conv2d_bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* More thorough causal conv1d tests
Signed-off-by: Alex Black <blacka101@gmail.com>
2019-11-29 14:14:30 +03:00
Samuel Audet
5e07998e59
Add support for CUDA 10.2 ( #89 )
2019-11-29 16:31:03 +11:00
shugeo
009007120b
Shugeo_release_fixes3 ( #81 )
...
* Implementation for non_max_suppression_v3 was added. Initial version
* Added check for overcome threshold.
* Added definition for V3 method.
* java remapping for NonMaxSuppressionV3
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proporly processing of an empty output and test.
* Refactored op to less threshold data to float.
* Implemented cuda-based helper for non_max_suppression_v3 op.
* Fixed fake_quant_with_min_max_vars op.
* Fixed tests with float numbers.
* - assert now stops execution
- sortByKey/sortByValue now have input validation
Signed-off-by: raver119 <raver119@gmail.com>
* missing var
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proper processing for zero max_size inputs.
* Refactored kernel callers.
* Fixed return statement for logdet op helper.
* Refactored unsorted segment SqrtN op.
* get back 8 tail bytes on CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored segment prod ops and helpers for cuda and tests.
* Additional test.
* CudaWorkspace tests updated for 8 tail bytes
Signed-off-by: raver119 <raver119@gmail.com>
* special atomic test
Signed-off-by: raver119 <raver119@gmail.com>
* atomicMul/atomicDiv fix for 16bit values
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated waste prints.
2019-11-28 21:08:51 +03:00
Alex Black
e910ce75ec
Various Fixes ( #75 )
...
* #8431 Cast loss function weights array automatically
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add 'regex verbose mode' printing (ExecDebugListener) for TFGraphTestAllSameDiff'
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Class import mapping fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Reshape fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't swallow first exception in NativeOpExecutioner.exec(CustomOp)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-11-23 20:06:12 +11:00
shugeo
4187190609
Shugeo release fix2 ( #70 )
...
* Corrected input checking and tests for bitcast op.
* Fixed an issue with non_max_suppression form generation and processing with score threshold given.
* Fixed bilinear resize kernel and tests.
* push for Serhii
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for nearest_neighbor resize with int input.
* Added data type check for input/output match.
* Eliminate error in macros.
* Improved output message for type checking.
* Fixed input/output types for op.
* Eliminated waste logging.
* Refactored resize_bilinear helper for multithreading for cpu platform.
* Cosmetic changes only.
* Fixed error for string substitution.
* Skip test for cbow_batch with cuda.
* fix for resizeNearestNeighbor output dtype
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored non_max_suppression helper.
* Refactored shape generation and input handling.
* Added additional test.
2019-11-22 22:42:44 +03:00
Samuel Audet
ff73e6da3f
ND4J: Fix OpenBLAS loading for nd4j-native ( #64 )
...
* ND4J: Fix OpenBLAS loading for nd4j-native and remove bundling of OpenMP
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Bundle back libgomp.so.1 for Linux
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Readd preload directories for ARM
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Add back preloads for GCC on Windows
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Add explicit preloadpaths for ARM and POWER to bundle correct library
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
2019-11-21 15:54:41 +03:00
raver119
064a56ccf1
Few fixes ( #66 )
...
* skip legacy transforms execution in case of empty input arrays
Signed-off-by: raver119 <raver119@gmail.com>
* - BroadcastBool ops now accept extraParams to make MatchCondition possible
- TrueBroadcastHelper now uses samediff::threads
Signed-off-by: raver119 <raver119@gmail.com>
* java side
Signed-off-by: raver119 <raver119@gmail.com>
* trigger jenkins
Signed-off-by: raver119 <raver119@gmail.com>
* update LessThanOrEqual opNum mapping
Signed-off-by: raver119 <raver119@gmail.com>
* update LessThanOrEqual opNum mapping
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-21 15:43:03 +03:00
raver119
83cb0d9329
[WIP] Create and small fix ( #67 )
...
* - create op
- skip exec for empty inputs for non_max_suppression
- EmptyHandling idea
Signed-off-by: raver119 <raver119@gmail.com>
* Create op and mapping for it
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-21 13:31:20 +03:00
Yurii Shyrma
66b84b38cf
Shyrma mmul ( #58 )
...
* - get rid of some copy procedures in mmulHelper ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on embedding cuda api for batched gemm (cublasGemmBatchedEx) in our mmulHelper class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on cuda batched gamm api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write own cuda kernel performing batched gemm
Signed-off-by: Yurii <iuriish@yahoo.com>
* missing include in MmulHelper
Signed-off-by: raver119 <raver119@gmail.com>
* - forgot to keep in code previous correct kernels for mmulNxN, since it may happen that new onw will fail for some reason in future
Signed-off-by: Yurii <iuriish@yahoo.com>
* disable old tensordot
Signed-off-by: raver119 <raver119@gmail.com>
* - rewrite cuda kernels for usualGemm and usualGemv
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling mmul helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - prints to check shapes were added
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct type of output array Cin mmulNxN
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account possible nans in C array
Signed-off-by: Yurii <iuriish@yahoo.com>
* slightly change numThreads message
Signed-off-by: raver119 <raver119@gmail.com>
* - make corrections in accordance to given notes in pr review
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-11-19 15:39:36 +02:00
raver119
c5cbdcd8f4
[WIP] clang for jcpp ( #53 )
...
* clang as compiler for jcpp
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* we don't need macos profile
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-17 09:45:30 +03:00
raver119
1780dcc883
[WIP] Small fixes here and there ( #50 )
...
* one range test
Signed-off-by: raver119 <raver119@gmail.com>
* few Context convenience singatures
Signed-off-by: raver119 <raver119@gmail.com>
* one more range test
Signed-off-by: raver119 <raver119@gmail.com>
* "range" "fix"
Signed-off-by: raver119 <raver119@gmail.com>
* adjuct_contrast_v2 now allows scale factor to be provided via input_variable
Signed-off-by: raver119 <raver119@gmail.com>
* adjust_contrast now allows scale factor as variable too
Signed-off-by: raver119 <raver119@gmail.com>
* bitcast shape tests
Signed-off-by: raver119 <raver119@gmail.com>
* BitCast import dtype added
Signed-off-by: raver119 <raver119@gmail.com>
* few more BitCast signatures
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-15 17:04:29 +03:00
raver119
c5b912bddf
few changes for openblas and jcpp preloads (on macos) ( #46 )
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-14 19:50:24 +03:00
raver119
1eb3de90d7
[WIP] Platform helpers switches ( #44 )
...
* - platform helpers can be disabled on per-op basis now via Context::allowHelpers
- java has access to it as well
Signed-off-by: raver119 <raver119@gmail.com>
* global platform-helpers trigger
Signed-off-by: raver119 <raver119@gmail.com>
* few signatures renamed
Signed-off-by: raver119 <raver119@gmail.com>
* - few new env variables to follow
- maxThreads/masterThreads differentiation
Signed-off-by: raver119 <raver119@gmail.com>
* Javadoc update
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-14 14:35:02 +03:00
Alex Black
47d19908f4
Various fixes ( #43 )
...
* #8172 Enable DL4J MKLDNN batch norm backward pass
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8382 INDArray.toString() rank 1 brackets / ambiguity fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8308 Fix handful of broken links (inc. some in errors)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 1
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 2
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Unused dependencies, round 3
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Uniform distribution TF import fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-11-14 19:38:20 +11:00
raver119
48df1acdfb
[WIP] ThreadPool ( #8 )
...
This PR removes OpenMP use in 95% of cases
2019-11-13 17:04:59 +03:00
Alexander Stoyakin
b816845797
Fixing nd4j-cuda build ( #20 )
...
* Roll back recent fix to restore build.
* Fix compilation.
* presets updated
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-01 15:59:29 +02:00
Alexander Stoyakin
45a40c8a89
DL4J/ND4J: Do pass on integer casts ( #15 )
...
* Int cast fixes.
* Revert "Int cast fixes."
This reverts commit aa36e8ca
* Int casts
* Int cast
* Int casts
* Get rid of int casts. Dropping deprecated aggregate ops.
* java scatterUpdate changes
Signed-off-by: raver119 <raver119@gmail.com>
* c++ scatterUpdate changes
Signed-off-by: raver119 <raver119@gmail.com>
* Remove aggregated ops.
* Restored test
* Tests restored.
* Minor fixes
2019-10-31 11:23:09 +02:00
raver119
5a4d2e8b31
[WIP] SVD ( #16 )
...
* - new SVD constructor
- OrthogonalDistribution now uses SVD custom op
Signed-off-by: raver119 <raver119@gmail.com>
* shapes fixed
Signed-off-by: raver119 <raver119@gmail.com>
2019-10-28 12:31:01 +03:00
Alex Black
d333d29099
SameDiff cleanup and fixes ( #12 )
...
* #8160 Remove resolvePrepertiesFromSameDiffBeforeExecution
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* SameDiff API cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More SameDiff cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8248 Switch SameDiff variable init from lazy to creation time for more predictable behaviour
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8252 TanhDerivative javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8225 Deconvolution2D input validation
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8265 Switch SameDiff.outputs() to user settable, instead of unreliable 'best guess'
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8224 SameDiff.zero and .one create constants, not variables
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More cleanup and fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small test fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J SameDiff fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Re-add hack for Deconvolution2DLayer until #8315 is resolved
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8270 Move CUDA device/version logging to Java; can be disabled via existing org.nd4j.log.initialization system property
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* All ND4J init logging checks system property
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove redundant device logging
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* One more fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* UX improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Deconv fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add deconv tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove debug code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-10-26 12:38:08 +11:00
Alex Black
3f0b4a2d4c
SameDiff execution, TF and memory management overhaul ( #10 )
...
* SameDiff execution memory management improvements, round 1
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 2
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Round 3
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clear node outputs closed array references; Slight change to OpValidation internals to not rely on cached op outputs
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next step
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add WeakIdentityHashmap
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Session fixes for control ops and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for training session + in-line updating
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix losses and history during training
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* BiasAdd and other fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't use SDVariable.getArr() in TFGraphTestAllHelper (import tests)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First steps for new dependency tracking approach
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Start integrating dependency tracking for memory management
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Non-control op dependency tracking works/passes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch/merge
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix issue dependency tracking for initial variables/constants
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add check for aliases when determining if safe to close array
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* First pass on new TF graph import class
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Import fixes, op fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and fixes for new TF import mapper
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Partial implementation of new dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* AbstractDependencyTracker for shared code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Overhaul SameDiff graph execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More fixes, cleanup, next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ad no-op memory manager, cleanup, fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix switch dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* INDArray.toString: no exception on closed arrays, just note closed
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix enter and exit dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* TensorArray memory management fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add unique ID for INDArray instances
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix memory management for NextIteration outputs in multi-iteration loops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove (now unnecessary) special case handling for nested enters
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Handle control dependencies during execution; javadoc for memory managers
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup, polish, code comments, javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and more javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add memory validation for all TF import tests - ensure all arrays (except outputs) are released
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Clean up arrays waiting on unexecuted ops at the end of execution
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes for enter op memory managent in the context of multiple non-nested loops/frames
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix order of operation issues for dependency tracker
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Always clear op fields after execution to avoid leaks or unintended array reuse
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Re-implement dtype conversion
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for control dependencies execution (dependency tracking)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix TF import overrides and filtering
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for constant enter array dependency tracking
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More DL4J fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Cleanup and polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish and javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More logging level tweaks, small DL4J fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix to DL4J SameDiffLayer
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix empty array deserialization, add extra deserialization checks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers control dep serialization fixes; test serialization as part of all TF import tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Variable control dependencies serialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix issue with removing inputs for ops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FlatBuffers NDArray deserialization fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Final cleanup/polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-10-23 21:19:50 +11:00
Alexander Stoyakin
f31661e13b
Merge pull request #7 from KonduitAI/asto_nd4s_10172019
...
KDTree optimization
2019-10-23 12:11:25 +03:00
Robert Altena
83d958d536
Sparse matrix refactoring. ( #8238 )
...
* remove sparse method from INDArray.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove gemm
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove useage of n4j.sparseFactory
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Nd4j.sparseFactory removed.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* sparseNDArray deleted.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* iremove more sparse calls and constants.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove SparseBlasWrapper.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* delete BaseSparseBlaswrapper.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove 3 sparse factory classes.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* delete SparseCPULevel.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* deletes JcusparseLevel, CUDASparselevel.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* delete nativeCPU sparse classes.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* removes sparse methods from NDArrayFactory.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* more deletes.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* delete (ignored) tests. BaseSparseNDArray.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* deletes ISparseNDArray.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove sparse methods from indArray.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* deletes sparse classes.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-09-17 22:56:29 +03:00
AlexDBlack
a66e03355e
Merge remote-tracking branch 'fork/master'
2019-09-12 12:20:57 +10:00
raver119
98e2814879
Platform helpers ( #8216 )
...
* platform helpers draft
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* disable platform cmake
Signed-off-by: raver119 <raver119@gmail.com>
* another draft
Signed-off-by: raver119 <raver119@gmail.com>
* mkldnn convolution refactored
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* one more safety check
Signed-off-by: raver119 <raver119@gmail.com>
* prototype works
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* force static library mode for mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* - ismax fix
- experimental arg fix
- don't enforce openblas on Apple hardware
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of small fixes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* declare concurrent
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - MKLDNN version upgrade to 1.0.2
- avgpool2d/maxpool2d APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - avgpool2d_bp/maxpool2d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - conv2d/batchnorm APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - lrn/conv2d_bp/conv3d/conv3d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* all ops converted to MKLDNN 1.x
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* namespace for platform helpers
Signed-off-by: raver119 <raver119@gmail.com>
* make sure platform helpers aren't opimized out
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* more of cpu_features
Signed-off-by: raver119 <raver119@gmail.com>
* - mkldnn removed from java
- cpu_features checks in CpuNDArrayFactory
Signed-off-by: raver119 <raver119@gmail.com>
* F16C definition renamed
Signed-off-by: raver119 <raver119@gmail.com>
* some mkldnn rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* check supported instructions before doing anything
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* missied impl
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC option
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool2d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* maxpool bp leaks fixed
Signed-off-by: raver119 <raver119@gmail.com>
* printf removed
Signed-off-by: raver119 <raver119@gmail.com>
* batchnorm fix
Signed-off-by: raver119 <raver119@gmail.com>
* AVX warning/error polishing
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* remove previous MKL-DNN support layer
Signed-off-by: raver119 <raver119@gmail.com>
* avx2 tweak
Signed-off-by: raver119 <raver119@gmail.com>
* allow static for apple
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* restore OPENBLAS_PATH use
Signed-off-by: raver119 <raver119@gmail.com>
* add runtime check for avx/avx2 support
Signed-off-by: raver119 <raver119@gmail.com>
* convolution_auto
Signed-off-by: raver119 <raver119@gmail.com>
* Add logic for helper argument
* minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* skip OpTracker props for non-x86 builds
Signed-off-by: raver119 <raver119@gmail.com>
* linux arm isn't x86 :)
Signed-off-by: raver119 <raver119@gmail.com>
* avx-512
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA presets fix
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC
Signed-off-by: raver119 <raver119@gmail.com>
* prefetchw for avx2
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC again
Signed-off-by: raver119 <raver119@gmail.com>
2019-09-11 21:50:28 +03:00
Robert Altena
c99f980513
INDArray javadoc ( #246 )
...
* javadoc
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* javadoc
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* javadoc
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* review fixes.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-09-09 13:09:31 +10:00
Robert Altena
f25e3e71e5
remove lengthLong ( #236 )
...
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-09-05 11:19:38 +10:00
raver119
a90c7dd995
[WIP] Last set of changes ( #234 )
...
* mmul op instead of cublasSgemm
Signed-off-by: raver119 <raver119@gmail.com>
* transB
Signed-off-by: raver119 <raver119@gmail.com>
* jcpp handles
Signed-off-by: raver119 <raver119@gmail.com>
* bitwise and/or/xor
Signed-off-by: raver119 <raver119@gmail.com>
* bitwise and/or/xor mapping
Signed-off-by: raver119 <raver119@gmail.com>
* cuda/cublas version check
Signed-off-by: raver119 <raver119@gmail.com>
* add expected version
Signed-off-by: raver119 <raver119@gmail.com>
* cuda/cublas version check in java
Signed-off-by: raver119 <raver119@gmail.com>
* one more error check
Signed-off-by: raver119 <raver119@gmail.com>
* build fix
Signed-off-by: raver119 <raver119@gmail.com>
* build fix
Signed-off-by: raver119 <raver119@gmail.com>
* build fix
Signed-off-by: raver119 <raver119@gmail.com>
* one more fix
Signed-off-by: raver119 <raver119@gmail.com>
* skip CUDA version check for now
Signed-off-by: raver119 <raver119@gmail.com>
* better wording
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
2019-09-04 14:41:08 +03:00
raver119
d3253aff3f
dedicated lock for getCudaCublasHandle
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-09-02 20:01:13 +03:00
raver119
18828f9725
cublasHandle sharing + lock
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-09-02 16:52:10 +03:00
Alex Black
82c9dc5743
ELU fix ( #219 )
...
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-09-02 18:37:05 +10:00
Robert Altena
6d04d30c94
INDArray.java javadoc ( #215 )
...
* javadoc
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* javadoc
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-09-02 16:06:20 +10:00
raver119
1003428a18
[WIP] Int broadcastables ( #195 )
...
* Removed invalid resource and fixed tests
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* legacy scalar/pairwise/broadcast int ops
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray int broadcastables
Signed-off-by: raver119 <raver119@gmail.com>
* few more bitwise tests
Signed-off-by: raver119 <raver119@gmail.com>
* java side update
Signed-off-by: raver119 <raver119@gmail.com>
* Argument type changed for shift ops
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* legacy scalar/pairwise/broadcast int ops
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray int broadcastables
Signed-off-by: raver119 <raver119@gmail.com>
* few more bitwise tests
Signed-off-by: raver119 <raver119@gmail.com>
* java side update
Signed-off-by: raver119 <raver119@gmail.com>
* Argument type changed for shift ops
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2019-08-30 10:12:40 +03:00
raver119
25e5c23eae
[WIP] Error handling ( #169 )
...
* CUDA reverse rewrite + couple of tests
Signed-off-by: raver119 <raver119@gmail.com>
* don't throw exception on invalid pointer
Signed-off-by: raver119 <raver119@gmail.com>
* data types validation for fastpath exec mode + 2 tests
Signed-off-by: raver119 <raver119@gmail.com>
* data types validation for fastpath exec mode + 2 tests
Signed-off-by: raver119 <raver119@gmail.com>
* ismax allowed dtypes tweak
Signed-off-by: raver119 <raver119@gmail.com>
* lastErrorCode + lastErrorMessage for native exceptions handling
Signed-off-by: raver119 <raver119@gmail.com>
* exportable ErrorReference
Signed-off-by: raver119 <raver119@gmail.com>
* check error codes in java
Signed-off-by: raver119 <raver119@gmail.com>
* - consume lastErrorCode
- fast_in dtype validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* - sg/cb allowed output type change
- minor logging fix for data type validation
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-26 19:57:51 +03:00
raver119
f03b0ee78f
[WIP] more fixes ( #159 )
...
* Added test for MatrixInverse with double input. Fixed matrixDeterminantKernel.
* Fixed kernels to avoid waste templating.
* Fixed logDeterminant kernel.
* Refactored type check for lup'
* - decrease blockDim value for zeta op
Signed-off-by: Yurii <yurii@skymind.io>
* Added print for compound matrix with CUDA.
* Refactored upper matrix invertion kernels.
* - provide move constructor and move assignment operator for OpArgsHoder class
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored usage of launch context.
* - add test for mergemax
Signed-off-by: Yurii <yurii@skymind.io>
* get rid of AveragingArrayProxy
Signed-off-by: raver119 <raver119@gmail.com>
* Refactoring of LUP inversion.
* Added prints for invertion.
* - add OpArgsHolder copy constructor and assignment operator
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for lower inversion
* - fix bug in upsampling2d/3d_bp op
Signed-off-by: Yurii <yurii@skymind.io>
* Added expensive printfs to kernel.
* Refactored expensive kernel prints.
* Refactored expensive printfs
* - remove nullify
Signed-off-by: Yurii <yurii@skymind.io>
* Eliminated waste prints with tests.
* upsampling2d_bp test
Signed-off-by: raver119 <raver119@gmail.com>
* test updated
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-23 19:20:50 +03:00
raver119
729dc5e879
[WIP] size etc ( #155 )
...
* one test for size
Signed-off-by: raver119 <raver119@gmail.com>
* - few tests for size op
- size/rank/size_at ops now use p instead of assign
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-23 12:31:12 +03:00
Samuel Audet
c4e7d032cb
ND4J: Fix incorrectly bundled libraries on Linux ARM
...
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
2019-08-23 16:54:54 +09:00
raver119
243bf866c4
[WIP] Few fixes ( #153 )
...
* throw exception if op execution failed
Signed-off-by: raver119 <raver119@gmail.com>
* expected for test
Signed-off-by: raver119 <raver119@gmail.com>
* one more ismax test
Signed-off-by: raver119 <raver119@gmail.com>
* ismax view fix
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-23 09:00:10 +03:00
raver119
930b49e87f
[WIP] DeviceLocalNDArray updates ( #149 )
...
* ContextBuffers are released upon device change
Signed-off-by: raver119 <raver119@gmail.com>
* DeviceLocalNDArray updates + tests
Signed-off-by: raver119 <raver119@gmail.com>
* special array for delayed mode
Signed-off-by: raver119 <raver119@gmail.com>
* additional detach()
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-22 20:01:29 +03:00
raver119
3f4379927a
scalar constructor fix
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-21 08:50:59 +03:00
raver119
4310e87860
include path fix for java
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-21 07:32:21 +03:00
raver119
269d508ba5
[WIP] cross-device migrations ( #134 )
...
* two more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA device afinity tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* prepareAction/registerAction for CustomOps
Signed-off-by: raver119 <raver119@gmail.com>
* lazy allocate host bufer before relocation
Signed-off-by: raver119 <raver119@gmail.com>
* one special test for migration in cpp
Signed-off-by: raver119 <raver119@gmail.com>
* tests update for msvc
Signed-off-by: raver119 <raver119@gmail.com>
* logging
Signed-off-by: raver119 <raver119@gmail.com>
* stick to old col2im impl
Signed-off-by: raver119 <raver119@gmail.com>
* cudaStreams reorganization
Signed-off-by: raver119 <raver119@gmail.com>
* buffer size fix
Signed-off-by: raver119 <raver119@gmail.com>
* c++ data migration
Signed-off-by: raver119 <raver119@gmail.com>
* fix CropAndResize test
Signed-off-by: raver119 <raver119@gmail.com>
* - minor improvment
Signed-off-by: Yurii <yurii@skymind.io>
2019-08-20 18:52:41 +03:00
raver119
23c8738d4a
syncthreads ( #136 )
...
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-20 18:28:43 +03:00
Robert Altena
38310777ee
fix for eclipse#8087 ( #129 )
...
* fix for #8087
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove commented code.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* removing trueScalar.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove tueScalar.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-08-20 15:20:40 +09:00
raver119
3ba616a161
[WIP] Java wrappers ( #126 )
...
* shift/rshift/rotl/rotr java/sd wrappers
Signed-off-by: raver119 <raver119@gmail.com>
* few additional wrappers
Signed-off-by: raver119 <raver119@gmail.com>
* minor naming tweak
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-16 13:14:26 +03:00
raver119
2f3d7330ce
[WIP] build fix ( #124 )
...
* AffinityManager changes
Signed-off-by: raver119 <raver119@gmail.com>
* build fixes
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-16 08:14:18 +03:00
raver119
987bb80c46
[WIP] right shift ops ( #118 )
...
* right shift ops
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* rotr test
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-15 20:35:15 +03:00
raver119
53ca9a76e8
[WIP] multi-device support ( #80 )
...
* fix pad javadoc and @see links. (#72 )
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* [WIP] More fixes (#73 )
* special tests for ConstantTadHelper/ConstantShapeHelper
Signed-off-by: raver119 <raver119@gmail.com>
* release methods for data buffers
Signed-off-by: raver119 <raver119@gmail.com>
* delete temporary buffer Java side
Signed-off-by: raver119 <raver119@gmail.com>
* delete temporary buffer Java side
Signed-off-by: raver119 <raver119@gmail.com>
* delete temporary TadPack C++/Java side (#74 )
Signed-off-by: raver119 <raver119@gmail.com>
* Zoo model TF import test updates (#75 )
* argLine fix, update compression_gru comment
* updated comment for xception
* undid but commented argLine change
* updated xlnet comment
* copyright headers
* - new NDArray methods like()/ulike() (#77 )
- fix for depthwise_conv2d_bp + special test
Signed-off-by: raver119 <raver119@gmail.com>
* upsampling2d fix CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* DL4J trace logging (#79 )
* MLN/CG trace logging for debugging
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tiny tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* strided_slice_bp shape fn leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* SameDiff fixes and naming (#78 )
* remove SDVariable inplace methods
* import methods
* npe fix in OpVal
* removed SameDiff inplace ops from tests
* Naming updates, moved to centralized methods in SameDiff, should use op_#:# for everything
* quick fixes
* javadoc
* SDVariable eval with placeholders
* use regex match
* better matching
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* fix javadoc. (#76 )
* fix javadoc.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* replace most @see with @link s.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* 4 additional tests
Signed-off-by: raver119 <raver119@gmail.com>
* launch context reorganization
Signed-off-by: raver119 <raver119@gmail.com>
* LaunchContext reorganization
Signed-off-by: raver119 <raver119@gmail.com>
* per-device LaunchContext
Signed-off-by: raver119 <raver119@gmail.com>
* Various DL4J/ND4J fixes (#81 )
* #7954 Force refresh of UI when switching tabs on overview page
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8017 Concurrent modification exception (synchronize) fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8033 Don't initialize updater in middle of writing memory crash dump
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8208 Fix shape checks for ND4J int[] creator methods
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #6385 #7992 Keras import naming fixes + cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8016 Upsampling3D - add NDHWC format support
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ContextBuffers as separate entity
Signed-off-by: raver119 <raver119@gmail.com>
* Refactor NativeOps.h to export C functions
* Actually export functions from NativeOps.h
* Adapt the Java wrappers in ND4J generated with JavaCPP
* Create C wrappers for some of the C++ classes currently used by ND4J
* ContextBuffers as separate entity
Signed-off-by: raver119 <raver119@gmail.com>
* remove duplicate code in createBufferDetached. (#83 )
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Keras model import - updater lr fix (#84 )
* Keras model import - updater lr fix
Signed-off-by: eraly <susan.eraly@gmail.com>
* Keras model import - updater lr fix, cleanup
Signed-off-by: eraly <susan.eraly@gmail.com>
* ContextBuffers as separate entity
Signed-off-by: raver119 <raver119@gmail.com>
* ContextBuffers as separate entity
Signed-off-by: raver119 <raver119@gmail.com>
* Fix functions of OpaqueVariablesSet
* thread-local buffers/affinity
Signed-off-by: raver119 <raver119@gmail.com>
* thread safety for LaunchContext
Signed-off-by: raver119 <raver119@gmail.com>
* more of thread safety
Signed-off-by: raver119 <raver119@gmail.com>
* one more multi threaded test
Signed-off-by: raver119 <raver119@gmail.com>
* SameDiff Convolution Config validation, better output methods (#82 )
* Conv Config validation & tests
Signed-off-by: Ryan Nett <rnett@skymind.io>
* stackOutputs utility method
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use constructor for validation, support negative kernel sizes (infered from weights)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* better output methods
Signed-off-by: Ryan Nett <rnett@skymind.io>
* move output to be with fit and evaluate
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* refactor duplicate code from pad methods. (#86 )
* refactor duplicate code from pad methods.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* replace switch with if.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Various ND4J/DL4J fixes and improvements (#87 )
* Reshape and reallocate - small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Reshape and reallocate - small fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #6488 ElementWiseVertex broadcast support
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Constructors and broadcast supported it Transforms.max/min
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8054 ElementWiseVertex now supports broadcast inputs
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8057 Nd4j.create overload dtype fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7551 ND4J Shape validation fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* [WIP] Numpy boolean import (#91 )
* numpy bool type
Signed-off-by: raver119 <raver119@gmail.com>
* numpy bool java side
Signed-off-by: raver119 <raver119@gmail.com>
* remove create method with unused parameter. (#89 )
* remove create method with unused parameter.
* removed more unused methods.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* removing more unused code.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* last removal of unused code.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove createSparse methods. (#92 )
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Various ND4J/DL4J fixes (#90 )
* Deprecate Old*Op instances
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8063 #8054 Broadcast exceptions + cleanup inplace ops
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Remove bad test condition
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7993 Fix shape function issue in crop_and_resize op
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J SameDiff lambda layer fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8029 Fix for pnorm backprop math
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8038 Fix Op profiler NaN/Inf triggering + add tests (#93 )
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* createUninitializedDetached refactoring. (#94 )
* wip
* update interface, add null implementations.
* Breaking one test in a weird way.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* createUninitializedDetached refactored.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* cuda build fix for issues introduced by recent refactoring
Signed-off-by: raver119 <raver119@gmail.com>
* [WIP] More of CUDA (#95 )
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Implementation of hashcode cuda helper. Working edition.
* Fixed parallel test input arangements.
* Fixed tests for hashcode op.
* Fixed shape calculation for image:crop_and_resize op and test.
* NativeOps tests. Initial test suite.
* Added tests for indexReduce methods.
* Added test on execBroadcast with NDArray as dimensions.
* Added test on execBroadcastBool with NDArray as dimensions.
* Added tests on execPairwiseTransform and execPairwiseTransofrmBool.
* Added tests for execReduce with scalar results.
* Added reduce tests for non-empty dims array.
* Added tests for reduce3.
* Added tests for execScalar.
* Added tests for execSummaryStats.
* - provide cpu/cuda code for batch_to_space
- testing it
Signed-off-by: Yurii <yurii@skymind.io>
* - remove old test for batch_to_space (had wrong format and numbers were not checked)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed complilation errors with test.
* Added test for execTransformFloat.
* Added test for execTransformSame.
* Added test for execTransformBool.
* Added test for execTransformStrict.
* Added tests for execScalar/execScalarBool with TADs.
* Added test for flatten.
* - provide cpu/cuda code for space_to_Batch operaion
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for concat.
* comment unnecessary stuff in s_t_b
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for specialConcat.
* Added tests for memcpy/set routines.
* Fixed pullRow cuda test.
* Added pullRow test.
* Added average test.
* - correct typo in NDArray::applyPairwiseTransform(nd4j::pairwise::BoolOps op...)
Signed-off-by: Yurii <yurii@skymind.io>
* - debugging and fixing cuda tests in JavaInteropTests file
Signed-off-by: Yurii <yurii@skymind.io>
* - correct some tests
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for shuffle.
* Fixed ops declarations.
* Restored omp and added shuffle test.
* Added convertTypes test.
* Added tests for execRandom. Eliminated usage of RandomBuffer with NativeOps.
* Added sort tests.
* Added tests for execCustomOp.
* - further debuging and fixing tests terminated with crash
Signed-off-by: Yurii <yurii@skymind.io>
* Added tests for calculateOutputShapes.
* Addded Benchmarks test.
* Commented benchmark tests.
* change assertion
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for apply_sgd op. Added cpu helper for that op.
* Implement cuda helper for aplly_sgd op. Fixed tests for NativeOps.
* Added test for assign broadcastable.
* Added tests for assign_bp op.
* Added tests for axpy op.
* - assign/execScalar/execTransformAny signature change
- minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed axpy op.
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - fix tests for nativeOps::concat
Signed-off-by: Yurii <yurii@skymind.io>
* sequential transform/scalar
Signed-off-by: raver119 <raver119@gmail.com>
* allow nested parallelism
Signed-off-by: raver119 <raver119@gmail.com>
* assign_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* block setRNG fix
Signed-off-by: raver119 <raver119@gmail.com>
* enable parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* enable nested parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* Added cuda implementation for row_count helper.
* Added implementation for tnse gains op helper.
* - take into account possible situations when input arrays are empty in reduce_ cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implemented tsne/edge_forces op cuda-based helper. Parallelized cpu-based helper for edge_forces.
* Added kernel for tsne/symmetrized op heleper.
* Implementation of tsne/symmetrized op cuda helper. Working edition.
* Eliminated waste printfs.
* Added test for broadcastgradientargs op.
* host-only fallback for empty reduce float
Signed-off-by: raver119 <raver119@gmail.com>
* - some tests fixes
Signed-off-by: Yurii <yurii@skymind.io>
* - correct the rest of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* - further correction of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for Cbow op. Also added cuda implementation for cbow helpers.
* - improve code of stack operation for scalar case
Signed-off-by: Yurii <yurii@skymind.io>
* - provide cuda kernel for gatherND operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow helpers with cuda kernels.
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - further correction of cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implementatation of cbow op helper with cuda kernels. Working edition.
* Skip random testing for cudablas case.
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for ELU and ELU_BP ops.
* Added tests for eq_scalar, gt_scalar, gte_scalar and lte_scalar ops.
* Added tests for neq_scalar.
* Added test for noop.
* - further work on clipbynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - get rid of concat op call, use instead direct concat helper call
Signed-off-by: Yurii <yurii@skymind.io>
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for lrelu and lrelu_bp.
* Added tests for selu and selu_bp.
* Fixed lrelu derivative helpers.
* - some corrections in lstm
Signed-off-by: Yurii <yurii@skymind.io>
* operator * result shape fix
Signed-off-by: raver119 <raver119@gmail.com>
* - correct typo in lstmCell
Signed-off-by: Yurii <yurii@skymind.io>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA inverse broadcast bool fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable MMAP test for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* BooleanOp syncToDevice
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* additional data types for im2col/col2im
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for firas_sparse op.
* one more RandomBuffer test excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for flatten op.
* Added test for Floor op.
* bunch of tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* mmulDot tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Implemented floordiv_bp op and tests.
* Fixed scalar case with cuda implementation for bds.
* - work on cuda kernel for clip_by_norm backprop op is completed
Signed-off-by: Yurii <yurii@skymind.io>
* Eliminate cbow crach.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated abortion with batched nlp test.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed shared flag initializing.
* disabled bunch of cpu workspaces tests
Signed-off-by: raver119 <raver119@gmail.com>
* scalar operators fix: missing registerSpecialUse call
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed logdet for cuda and tests.
* - correct clipBynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed crop_and_resize shape datatype.
* - correct some mmul tests
Signed-off-by: Yurii <yurii@skymind.io>
* build fix
Signed-off-by: raver119 <raver119@gmail.com>
* exclude two methods for JNI
Signed-off-by: raver119 <raver119@gmail.com>
* exclude two methods for JNI
Signed-off-by: raver119 <raver119@gmail.com>
* exclude two methods for JNI (#97 )
Signed-off-by: raver119 <raver119@gmail.com>
* temporary stack fix
Signed-off-by: raver119 <raver119@gmail.com>
* round robin affinity test
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of legacy CudaContext methods
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of legacy ContextPool classes/methods
Signed-off-by: raver119 <raver119@gmail.com>
* one legacy test removed
Signed-off-by: raver119 <raver119@gmail.com>
* few more fields rearranged
Signed-off-by: raver119 <raver119@gmail.com>
* OpaqueLaunchContext
Signed-off-by: raver119 <raver119@gmail.com>
* OpaqueLaunchContext++
Signed-off-by: raver119 <raver119@gmail.com>
* more of OpaqueLaunchContext methods
Signed-off-by: raver119 <raver119@gmail.com>
* LaunchContext -> CudaContext
Signed-off-by: raver119 <raver119@gmail.com>
* AffinityManger changes
Signed-off-by: raver119 <raver119@gmail.com>
* AffinityManger changes
Signed-off-by: raver119 <raver119@gmail.com>
* cusolver handles
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* cusolver method
Signed-off-by: raver119 <raver119@gmail.com>
* cusolver handle propagated
Signed-off-by: raver119 <raver119@gmail.com>
* blas/solver handles
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* legacy concat implementations replaced with new CustomOp
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* concat now uses way more blocks
Signed-off-by: raver119 <raver119@gmail.com>
* print
Signed-off-by: raver119 <raver119@gmail.com>
* no more triple template mmul
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of kernels have dtypes reconsidered
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of kernels have dtypes reconsidered
Signed-off-by: raver119 <raver119@gmail.com>
* bitonic sort reorganized
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of cpu stuff removed from cuda scope
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of cpu stuff removed from cuda scope
Signed-off-by: raver119 <raver119@gmail.com>
* type conversions moved to generic impl
Signed-off-by: raver119 <raver119@gmail.com>
* cpu data types pass
Signed-off-by: raver119 <raver119@gmail.com>
* non_max_suppression
Signed-off-by: raver119 <raver119@gmail.com>
* sortByValue fix
Signed-off-by: raver119 <raver119@gmail.com>
* ignore all mixed datatype tests for mmul
Signed-off-by: raver119 <raver119@gmail.com>
* special handling of OpProfiler exceptions
Signed-off-by: raver119 <raver119@gmail.com>
* - one failing concat test in cpp
- Nd4j.tile now uses op internally
Signed-off-by: raver119 <raver119@gmail.com>
* get back dtype exception for legacy arrays deserialization
Signed-off-by: raver119 <raver119@gmail.com>
2019-08-14 16:52:34 +03:00