* RL4J: Add generic update rule (#502)
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* Shyrma reduce (#481)
* - start working on improving of cpu legacy code for reduce ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on improving legacy loops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - still working on improving reduce ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on improving reduce ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing speed run of new reduce op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - working on improvement of default loop for reduce op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - update signatures of stuff which calls reduce ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - make corrections in cuda reduce kernels
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change loop for default case in broadcast legacy ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - comment some shape stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - comment unnecessary prints in RNGtests
Signed-off-by: Yurii <iuriish@yahoo.com>
* - finish to resolve conflicts after master has been merged
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of some compilation mistakes of cuda stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor changes
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further search for bug causing crash on java test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add scalar case in reduce_ ... exec stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor corrections in NAtiveOps.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add switch to scalar case execReduceXD functions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct cuda mirrorPad
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Add support for CUDA 11.0 (#492)
* Add support for CUDA 11.0
* libnd4j tweaks for CUDA 11
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* bindings update, again?
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy
* update API to match CUDA 8
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* * Update version of JavaCPP Presets for CPython
* C++ updated for cuDNN 8.0
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* 128-bit alignment for workspaces
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* change seed in 1 test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Fix dependecy duplication in python4j-parent pom
* Fix group id for in python4j-numpy
* few tests tweaked
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow
* few minor tweaks for IndexReduce
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one test removed
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
* - provide faster index2coords function for cpu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - new faster index2coords function is introduced into cpu code
Signed-off-by: Yurii <iuriish@yahoo.com>
* - replace long long coordinates with int coordinates
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add missed reload of coords2index function
Signed-off-by: Yurii <iuriish@yahoo.com>
* - reststart jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rollback changes in convolutions.cu and addBias.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling TrueBroadcastHelper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further improving of TrueBroadcastHelper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further profiling of broadcast op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of broadcastShapeHelper which inserts unities in shapes of arrays to be broadcasted
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional method in ConstantShapeHelper class for deducing broadcast shapes with unities
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new NativeOps helpers for usual and true broadcast methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* enable bert profiler
Signed-off-by: raver119 <raver119@gmail.com>
* - delete unnessesary tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* = namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling bias_add op
- add some docementation
Signed-off-by: Yurii <yurii@skymind.io>
* - minor change
Signed-off-by: Yurii <yurii@skymind.io>
* - provide addBias cuda kernel
Signed-off-by: Yurii <yurii@skymind.io>
* - improve shape::getIndexOfffset and change its signature
Signed-off-by: Yurii <yurii@skymind.io>
* - same as previous
Signed-off-by: Yurii <yurii@skymind.io>
* - improve and change signature in some shape:: stuff which has to do with calculation of offsets for array elements
Signed-off-by: Yurii <yurii@skymind.io>
* - minor changes in flatten
Signed-off-by: Yurii <shyrma@skymind.io>
* - add function shape::getIndexOffsetOrdered
Signed-off-by: Yurii <shyrma@skymind.io>
* - correct shape::getIndexOffsetOrdered()
Signed-off-by: Yurii <shyrma@skymind.io>
* - move getIndexOffsetOrdered to flatten.h header in order to isolate this function
Signed-off-by: Yurii <shyrma@skymind.io>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Implementation of hashcode cuda helper. Working edition.
* Fixed parallel test input arangements.
* Fixed tests for hashcode op.
* Fixed shape calculation for image:crop_and_resize op and test.
* NativeOps tests. Initial test suite.
* Added tests for indexReduce methods.
* Added test on execBroadcast with NDArray as dimensions.
* Added test on execBroadcastBool with NDArray as dimensions.
* Added tests on execPairwiseTransform and execPairwiseTransofrmBool.
* Added tests for execReduce with scalar results.
* Added reduce tests for non-empty dims array.
* Added tests for reduce3.
* Added tests for execScalar.
* Added tests for execSummaryStats.
* - provide cpu/cuda code for batch_to_space
- testing it
Signed-off-by: Yurii <yurii@skymind.io>
* - remove old test for batch_to_space (had wrong format and numbers were not checked)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed complilation errors with test.
* Added test for execTransformFloat.
* Added test for execTransformSame.
* Added test for execTransformBool.
* Added test for execTransformStrict.
* Added tests for execScalar/execScalarBool with TADs.
* Added test for flatten.
* - provide cpu/cuda code for space_to_Batch operaion
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for concat.
* comment unnecessary stuff in s_t_b
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for specialConcat.
* Added tests for memcpy/set routines.
* Fixed pullRow cuda test.
* Added pullRow test.
* Added average test.
* - correct typo in NDArray::applyPairwiseTransform(nd4j::pairwise::BoolOps op...)
Signed-off-by: Yurii <yurii@skymind.io>
* - debugging and fixing cuda tests in JavaInteropTests file
Signed-off-by: Yurii <yurii@skymind.io>
* - correct some tests
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for shuffle.
* Fixed ops declarations.
* Restored omp and added shuffle test.
* Added convertTypes test.
* Added tests for execRandom. Eliminated usage of RandomBuffer with NativeOps.
* Added sort tests.
* Added tests for execCustomOp.
* - further debuging and fixing tests terminated with crash
Signed-off-by: Yurii <yurii@skymind.io>
* Added tests for calculateOutputShapes.
* Addded Benchmarks test.
* Commented benchmark tests.
* change assertion
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for apply_sgd op. Added cpu helper for that op.
* Implement cuda helper for aplly_sgd op. Fixed tests for NativeOps.
* Added test for assign broadcastable.
* Added tests for assign_bp op.
* Added tests for axpy op.
* - assign/execScalar/execTransformAny signature change
- minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed axpy op.
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - fix tests for nativeOps::concat
Signed-off-by: Yurii <yurii@skymind.io>
* sequential transform/scalar
Signed-off-by: raver119 <raver119@gmail.com>
* allow nested parallelism
Signed-off-by: raver119 <raver119@gmail.com>
* assign_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* block setRNG fix
Signed-off-by: raver119 <raver119@gmail.com>
* enable parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* enable nested parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* Added cuda implementation for row_count helper.
* Added implementation for tnse gains op helper.
* - take into account possible situations when input arrays are empty in reduce_ cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implemented tsne/edge_forces op cuda-based helper. Parallelized cpu-based helper for edge_forces.
* Added kernel for tsne/symmetrized op heleper.
* Implementation of tsne/symmetrized op cuda helper. Working edition.
* Eliminated waste printfs.
* Added test for broadcastgradientargs op.
* host-only fallback for empty reduce float
Signed-off-by: raver119 <raver119@gmail.com>
* - some tests fixes
Signed-off-by: Yurii <yurii@skymind.io>
* - correct the rest of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* - further correction of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for Cbow op. Also added cuda implementation for cbow helpers.
* - improve code of stack operation for scalar case
Signed-off-by: Yurii <yurii@skymind.io>
* - provide cuda kernel for gatherND operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow helpers with cuda kernels.
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - further correction of cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implementatation of cbow op helper with cuda kernels. Working edition.
* Skip random testing for cudablas case.
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for ELU and ELU_BP ops.
* Added tests for eq_scalar, gt_scalar, gte_scalar and lte_scalar ops.
* Added tests for neq_scalar.
* Added test for noop.
* - further work on clipbynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - get rid of concat op call, use instead direct concat helper call
Signed-off-by: Yurii <yurii@skymind.io>
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for lrelu and lrelu_bp.
* Added tests for selu and selu_bp.
* Fixed lrelu derivative helpers.
* - some corrections in lstm
Signed-off-by: Yurii <yurii@skymind.io>
* operator * result shape fix
Signed-off-by: raver119 <raver119@gmail.com>
* - correct typo in lstmCell
Signed-off-by: Yurii <yurii@skymind.io>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA inverse broadcast bool fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable MMAP test for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* BooleanOp syncToDevice
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* additional data types for im2col/col2im
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for firas_sparse op.
* one more RandomBuffer test excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for flatten op.
* Added test for Floor op.
* bunch of tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* mmulDot tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Implemented floordiv_bp op and tests.
* Fixed scalar case with cuda implementation for bds.
* - work on cuda kernel for clip_by_norm backprop op is completed
Signed-off-by: Yurii <yurii@skymind.io>
* Eliminate cbow crach.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated abortion with batched nlp test.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed shared flag initializing.
* disabled bunch of cpu workspaces tests
Signed-off-by: raver119 <raver119@gmail.com>
* scalar operators fix: missing registerSpecialUse call
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed logdet for cuda and tests.
* - correct clipBynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed crop_and_resize shape datatype.
* - correct some mmul tests
Signed-off-by: Yurii <yurii@skymind.io>