* Fixing issues from Sonar report
* Proper logging of exceptions
* Coding style fixes
* Use dup parameter
* Cleanup, minor issues
* Cuda compilation fixed and some minor fixes
* Remove old nd4j-jackson dependencies
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix use of old/deprecated JSON serializer
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix deserialization
Signed-off-by: Alex Black <blacka101@gmail.com>
* Delete test using deleted ser/de classes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Delete another copy of old test
Signed-off-by: Alex Black <blacka101@gmail.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* up to assign operation.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fix IMax, IMin.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* concat.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* dynamicPartition
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* new ops up to gte.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* updated review items.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* up to matchCondition.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* up to OneHot.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip. up to permute.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip. up to rank.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip. up to scatterMul.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* resolving code review issues.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip. includes UnsortedSegment ops.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip. up to stridedSlice.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fix stridedSlice.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* first pass of SDBaseops.kt complete.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fix review items.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* put branch in compilable state.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* add NDBaseTest. fix dynamicPartition signature. failed fix of assign.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* make tests public.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* adds tests up to invertedPermutation.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fix ScalarEquals, Assign.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* updates NDBaseTest.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* updates 'check' comments based on test pass/fail.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fix scalar ops. Update tests.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* dev-tools review items. wip.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* dev-tools code review items.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* complete review items.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Comment for logged issue; fix test case
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* undo changes to Nd4jCpu.java
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* update tests.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Fixes and regenerate
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small test fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* small fixes to tests.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Cleanup
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small CudaExecutioner fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small CudaExecutioner fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Another small CudaExecutioner fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Another small CudaExecutioner fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Robert Altena <Rob@Ra-ai.com>
* - 1D indexing fix
- couple of new tests for 1D indexing
Signed-off-by: raver119 <raver119@gmail.com>
* percentile fix + test
Signed-off-by: raver119 <raver119@gmail.com>
* wrong signature used in test
Signed-off-by: raver119 <raver119@gmail.com>
* init in this branch
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LeNet MNIST workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fix for calculations
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* for Alex to check placeholder null pointer issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* CNN3D workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* state for launching on dxg to regenerate dl4j examples
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* SD RNN test case workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* checkpoint at lstmBlock: Input array 1 (x) rank must be got input with rank 2 issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix LSTMLayer inputs order
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* lstm mismatch with c++ op issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft v2
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* have doubt I had to do this
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* NDRNN generated by codegen
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayerTestCases draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* minor fixes again
* added LSTMLayer testcases to nd4j-tests + set Preconditions in LSTMLayer constructors
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* added lost SDCNNtestcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* overrode getNumOutputs from DynamicCustomOp in LSTMLayer and reorganized LSTMLayerOutputs according to cpp op
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished with LSTMLayerOutputs
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix MKLDNN platform checks (i.e., when MKLDNN can be used vs. not)
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix LSTMLayerWeights input order
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* fixed LSTMLayer testcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished all testcases + minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Multiple generation-related fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* LSTM fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Regenerate ND4J namespaces and fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* changed SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Small fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* added Nd4j.getRandom().setSeed(12345) where needed
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweak to weight init for SameDiff CNN test case
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks for test cases
Signed-off-by: Alex Black <blacka101@gmail.com>
* Ignore failing tests until fixed
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* libnd4j raw implementation of sgd updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and simple test added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections after discussion
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate applyScalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j raw implementation of rmsPropUpdater on cpu
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fix operations declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j rmsPropUpdater added, test cases for sgd, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some fixes and improvements for rmsPropUpdater based on Java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed cuda implementation, updated tests and corrected behavior according to Java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaGrad updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix for ada grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more fixes for ada_grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nesterovs updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed nesterovs updater behavior, several typos and rename file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor typo
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j ada max updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMax updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMaxUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaMax, added Adam Updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaDeltaUpdater added, minor fixes for adamUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaDeltaUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nadamUpdater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more correction for nadam updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for nadam updater and added amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several typos fixed in amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and added f order support for rmsProp updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added support of f order for all updaters and modify tests for testing in place
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues for updates when in-place mode is not used, added tests for f order
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added input shape checks
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections for different cases handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some code cleanup and optimization per request
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j updaters refactoring after review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* SgdUpdater wrapper
Signed-off-by: raver119 <raver119@gmail.com>
* first test
Signed-off-by: raver119 <raver119@gmail.com>
* RmsPropUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* NadamUpdater + NesterovsUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AmsGradUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdamUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater + AdaDeltaUpdater + AdaMaxUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater test added
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j remove input parameters parsing through NDArray, split implementation of helpers into separate files, some renaming, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j next step to split operations implementation into separate files
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and minor corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert some changes of split implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j forgot to add header file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* public default constructors
Signed-off-by: raver119 <raver119@gmail.com>
* ImportClassMapping updated
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* #8682 Don't log openmp BLAS threads for CUDA
Signed-off-by: Alex Black <blacka101@gmail.com>
* #8654 Add SameDiff multi-threaded tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* Switching to op context for SameDiff exec
Signed-off-by: Alex Black <blacka101@gmail.com>
* Next steps
Signed-off-by: Alex Black <blacka101@gmail.com>
* Most back to passing
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Better tests, test refactoring
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* Code duplication reduction
Signed-off-by: Alex Black <blacka101@gmail.com>
* More code deduplication
Signed-off-by: Alex Black <blacka101@gmail.com>
* CUDA fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More CUDA fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* ND4S small fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix cmake detection in msys
* Fix toolchain file on windows
* Make android 64 bit work
* Fix libnd4j build script on msys
* Update build script for windows/linux
* Encoding issue for ci
* Update pom.xml
* Update pom.xml
* Update pom.xml
* Remove mingw
* Ensure android x86 builds are in line with arm builds
* Update toolchains and env variables for x86
* Move profile for build program up to parent
* Fix blas vendor and add comment
* Update cuda presets version
* Set default value and move properties back to child pom
* Change program from hard coded to use the script as the program
* Update pom.xml
* Update pom.xml
* Static lib fix
* Update static lib output
* Get rid of old comments
* Update static for building
* Adding more datatypes support in datavec-python
* Using numpy C API for creating numpy arrays
* Adding parameterized tests
* Adding support for BFLOAT16 (by converting it to FLOAT)
* Cleanup
* Using casting instead of creating an array
* Giving out a warning while casting array from BFLOAT16 to FLOAT
* Add syncToPrimary and syncToSpecial methods to BaseDataBuffer
Signed-off-by: Alex Black <blacka101@gmail.com>
* Python exec: sync to host before passing pointers
Signed-off-by: Alex Black <blacka101@gmail.com>
* Added copyright header
* use np api (#267)
* python exec / numpy - check object type before cast (#268)
* use np api
* verify object before cast
* fix cong
* cuda fix
* inplace test + tiny fix
* more test
* fix double alloc
* rem tags
* fix cuda check
* Fix implicit CUDA dependency in datavec-python tests; remove new method, add test
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Fariz Rahman <farizrahman4u@gmail.com>
* Revive and start updating DL4J integration tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* Add SameDiff support - first pass
Signed-off-by: Alex Black <blacka101@gmail.com>
* SameDiff test case generation
Signed-off-by: Alex Black <blacka101@gmail.com>
* SameDiff integration tests polishing
Signed-off-by: Alex Black <blacka101@gmail.com>
* More SameDiff integration test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Final polish
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small test tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling of concat op (both cuda and cpu)
Signed-off-by: Yurii <iuriish@yahoo.com>
* better comparison for large concat
Signed-off-by: raver119 <raver119@gmail.com>
* - further improving of concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* - add possibility to verify presence of trailing unities in shape and set strides/ews correspondingly
- restrict second simple case in concat op to c order only
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move concat op to specials_single.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of second concat op declaration in transforms.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* special workaround methods for DataBuffer.write
Signed-off-by: raver119 <raver119@gmail.com>
* one test removed
Signed-off-by: raver119 <raver119@gmail.com>
* more of unsynced
Signed-off-by: raver119 <raver119@gmail.com>
* missing asLong for BaseCudaDataBuffer
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* - one more test for OneHot with dtype
- one more signature in Nd4j
Signed-off-by: raver119 <raver119@gmail.com>
* ones_as/zeros_as now accept dtype
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* - more updates for configurable data types
- ones_as/zeros_as java side + tests
Signed-off-by: raver119 <raver119@gmail.com>
* few c++ tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes around DArgs
Signed-off-by: raver119 <raver119@gmail.com>
* missing alloc validation in RandomGenerator for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* set error message if rng alloc failed
Signed-off-by: raver119 <raver119@gmail.com>
* check for error code during RNG creation in java
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* kryo profile
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* Add maven profile + base tests methods for integration tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch from system property to environment variable; seems more reliable in intellij
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add nd4j-common-tests module, and common base test; cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ensure all ND4J tests extend BaseND4JTest
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test spam reduction, import fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add test logging to nd4j-aeron
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix unintended change
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Reduce sprint test log spam
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test spam cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Significantly speed up TSNE tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* W2V iterator test unit/integration split
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More NLP test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Avoid debug/verbose mode leaking between tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* test tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter extends base DL4J test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* nlp-uima test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix ND4J base test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Few small ND4J test speed improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J tests speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Even more test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Various test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Add ability to specify number of threads for C++ ops in BaseDL4JTest and BaseND4JTest
Signed-off-by: Alex Black <blacka101@gmail.com>
* nd4j-aeron test profile fix for CUDA
Signed-off-by: Alex Black <blacka101@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after we started using the cudnn lib
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporarily
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* Allow scalar op result array auto allocation
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't swallow underlying exception for calculateOutputShape execution failures
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ignore for known keras failure
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Refactored resize images ops to use TF-like bool args as input.
* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.
* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.
* Refactored nearest_neighbor resize op.
* Added a pair of tests for special case of resize_bilinear algorithm.
* Fixed issue with resize_bilinear op.
* Refactored cpu implementation for helpers with resize_nearest_neighbor op.
* Final fixes for resize ops to conform to TF v.1.5
* Refactored cuda helpers for resize_nearest_neighbor op.
* Fixed resize_bilinear to accept proper data.
* Fixed issue with non-float input for resize_bilinear op.
* Refactored cuda helper for resize_bilinear to properly process non-float inputs.
* Added tests for resize_bilinear to int inputs.
* Fixed ResizeBilinear wrapper
* Tests fixed
* Fixed float and bool constants to avoid overflow with some compilers.
* Corrected float constants with float data type.
* Added f suffix for float constants.
* Corrected float constant to avoid overflow with initializing lists.
* Corrected float initializing list with float input.
* Corrected bool constant with initializing list.
* Corrected float and bool values with initializing lists.
* Fixed wrong constant.
* Fixed issue with 1x1 input picture for resize.
* ResizeBilinear default values on import fix
Signed-off-by: raver119 <raver119@gmail.com>
* Corrected input checking and tests for bitcast op.
* Fixed an issue with non_max_suppression form generation and processing with score threshold given.
* Fixed bilinear resize kernel and tests.
* push for Serhii
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for nearest_neighbor resize with int input.
* Added data type check for input/output match.
* Eliminate error in macros.
* Improved output message for type checking.
* Fixed input/output types for op.
* Eliminated unnecessary logging.
* Refactored resize_bilinear helper for multithreading for cpu platform.
* Cosmetic changes only.
* Fixed error for string substitution.
* Skip test for cbow_batch with cuda.
* fix for resizeNearestNeighbor output dtype
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored non_max_suppression helper.
* Refactored shape generation and input handling.
* Added additional test.
* ND4J: Fix OpenBLAS loading for nd4j-native and remove bundling of OpenMP
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Bundle back libgomp.so.1 for Linux
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Readd preload directories for ARM
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Add back preloads for GCC on Windows
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Add explicit preloadpaths for ARM and POWER to bundle correct library
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* - create op
- skip exec for empty inputs for non_max_suppression
- EmptyHandling idea
Signed-off-by: raver119 <raver119@gmail.com>
* Create op and mapping for it
Signed-off-by: raver119 <raver119@gmail.com>
* - get rid of some copy procedures in mmulHelper ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on embedding cuda api for batched gemm (cublasGemmBatchedEx) in our mmulHelper class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on cuda batched gemm api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write own cuda kernel performing batched gemm
Signed-off-by: Yurii <iuriish@yahoo.com>
* missing include in MmulHelper
Signed-off-by: raver119 <raver119@gmail.com>
* - forgot to keep the previous correct kernels for mmulNxN in the code, since the new one may fail for some reason in future
Signed-off-by: Yurii <iuriish@yahoo.com>
* disable old tensordot
Signed-off-by: raver119 <raver119@gmail.com>
* - rewrite cuda kernels for usualGemm and usualGemv
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling mmul helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - prints to check shapes were added
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct type of output array C in mmulNxN
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account possible nans in C array
Signed-off-by: Yurii <iuriish@yahoo.com>
* slightly change numThreads message
Signed-off-by: raver119 <raver119@gmail.com>
* - make corrections in accordance with the notes given in PR review
Signed-off-by: Yurii <iuriish@yahoo.com>