* - start working on implementation of sqrtm op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - improving householder procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further polishing householder stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing hh pivoting qr procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing BiDiagonalUp procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing householder sequence class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing jacobi svd class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing svd stuff 1
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing svd stuff 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing class which performs Hessenberg decomposition of square matrix
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add static method to JacobiSVD class which makes the continuous Givens rotation generation algorithm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing auxiliary methods of Schur decomp class
Signed-off-by: Yurii <iuriish@yahoo.com>
* some references here and there
Signed-off-by: raver119 <raver119@gmail.com>
* - trying to figure out the difference between Eigen and our Schur algorithm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing fixing bugs in Schur decomposition op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start to implement class which performs calculation of eigen values and vectors
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add to EigenValsAndVecs method which calculates complex eigen vectors
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in EigenValsAndVecs class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing triangularSolver class
Signed-off-by: Yurii <iuriish@yahoo.com>
* Added a 2D routine for triangular systems solve.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored triangularSolve2D routine and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored another test for triangularSolve2D.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored test for triangularSolve for vector-bar case.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored triangularSolve2D routine and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - implementation of FullPivLU class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bugs in FullPivLU::solve method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct permutation vector in FullPivLU::solve
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct include headers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - include sqrtm classes to cuda folder, investigate in what places synchronization doesn't work
Signed-off-by: Yurii <iuriish@yahoo.com>
* Added implementation for cuda triangularSolve2D and also refactored triangularSolve2D for cpu.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated redundant implementations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - make offset calculation faster in t<> methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rename reference T& NDArray::t<> method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on cuda sqrtm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide correct synchronization to device in Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add tests for sqrtm op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix failures that appeared while testing on Jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - trying to find out mistake in svd::deflation method
Signed-off-by: Yurii <iuriish@yahoo.com>
* Revert "- trying to find out mistake in svd::deflation method"
This reverts commit 19d37baddbc509028e4bc67bc932fe7449becdb6.
* Revert "- trying to find out mistake in svd::deflation method"
This reverts commit 19d37baddbc509028e4bc67bc932fe7449becdb6.
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change call semantic of r<> and t<> methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build 3
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts with master
Signed-off-by: Yurii <iuriish@yahoo.com>
* cmakelists updated
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - minor fix in merge cpu helper - make use of reference getter
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* - one more test for OneHot with dtype
- one more signature in Nd4j
Signed-off-by: raver119 <raver119@gmail.com>
* ones_as/zeros_as now accept dtype
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* - more updates for configurable data types
- ones_as/zeros_as java side + tests
Signed-off-by: raver119 <raver119@gmail.com>
* few c++ tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes around DArgs
Signed-off-by: raver119 <raver119@gmail.com>
* fix narrowing down cast
Signed-off-by: raver119 <raver119@gmail.com>
* trigger jenkins
Signed-off-by: raver119 <raver119@gmail.com>
* few more fixes for MSVC and Windows
Signed-off-by: raver119 <raver119@gmail.com>
* few more fixes for MSVC and Windows
Signed-off-by: raver119 <raver119@gmail.com>
* few more fixes for MSVC and Windows
Signed-off-by: raver119 <raver119@gmail.com>
* few more fixes for MSVC and Windows
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
- tensormmul dtype validation
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
- batched gemm dtype validation
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Implementation of hashcode cuda helper. Working edition.
* Fixed parallel test input arrangements.
* Fixed tests for hashcode op.
* Fixed shape calculation for image:crop_and_resize op and test.
* NativeOps tests. Initial test suite.
* Added tests for indexReduce methods.
* Added test on execBroadcast with NDArray as dimensions.
* Added test on execBroadcastBool with NDArray as dimensions.
* Added tests on execPairwiseTransform and execPairwiseTransformBool.
* Added tests for execReduce with scalar results.
* Added reduce tests for non-empty dims array.
* Added tests for reduce3.
* Added tests for execScalar.
* Added tests for execSummaryStats.
* - provide cpu/cuda code for batch_to_space
- testing it
Signed-off-by: Yurii <yurii@skymind.io>
* - remove old test for batch_to_space (had wrong format and numbers were not checked)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed compilation errors with test.
* Added test for execTransformFloat.
* Added test for execTransformSame.
* Added test for execTransformBool.
* Added test for execTransformStrict.
* Added tests for execScalar/execScalarBool with TADs.
* Added test for flatten.
* - provide cpu/cuda code for space_to_batch operation
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for concat.
* comment unnecessary stuff in s_t_b
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for specialConcat.
* Added tests for memcpy/set routines.
* Fixed pullRow cuda test.
* Added pullRow test.
* Added average test.
* - correct typo in NDArray::applyPairwiseTransform(nd4j::pairwise::BoolOps op...)
Signed-off-by: Yurii <yurii@skymind.io>
* - debugging and fixing cuda tests in JavaInteropTests file
Signed-off-by: Yurii <yurii@skymind.io>
* - correct some tests
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for shuffle.
* Fixed ops declarations.
* Restored omp and added shuffle test.
* Added convertTypes test.
* Added tests for execRandom. Eliminated usage of RandomBuffer with NativeOps.
* Added sort tests.
* Added tests for execCustomOp.
* - further debugging and fixing tests terminated with crash
Signed-off-by: Yurii <yurii@skymind.io>
* Added tests for calculateOutputShapes.
* Added Benchmarks test.
* Commented benchmark tests.
* change assertion
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for apply_sgd op. Added cpu helper for that op.
* Implement cuda helper for apply_sgd op. Fixed tests for NativeOps.
* Added test for assign broadcastable.
* Added tests for assign_bp op.
* Added tests for axpy op.
* - assign/execScalar/execTransformAny signature change
- minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed axpy op.
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - fix tests for nativeOps::concat
Signed-off-by: Yurii <yurii@skymind.io>
* sequential transform/scalar
Signed-off-by: raver119 <raver119@gmail.com>
* allow nested parallelism
Signed-off-by: raver119 <raver119@gmail.com>
* assign_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* block setRNG fix
Signed-off-by: raver119 <raver119@gmail.com>
* enable parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* enable nested parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* Added cuda implementation for row_count helper.
* Added implementation for tsne gains op helper.
* - take into account possible situations when input arrays are empty in reduce_ cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implemented tsne/edge_forces op cuda-based helper. Parallelized cpu-based helper for edge_forces.
* Added kernel for tsne/symmetrized op helper.
* Implementation of tsne/symmetrized op cuda helper. Working edition.
* Eliminated redundant printfs.
* Added test for broadcastgradientargs op.
* host-only fallback for empty reduce float
Signed-off-by: raver119 <raver119@gmail.com>
* - some tests fixes
Signed-off-by: Yurii <yurii@skymind.io>
* - correct the rest of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* - further correction of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for Cbow op. Also added cuda implementation for cbow helpers.
* - improve code of stack operation for scalar case
Signed-off-by: Yurii <yurii@skymind.io>
* - provide cuda kernel for gatherND operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow helpers with cuda kernels.
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - further correction of cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow op helper with cuda kernels. Working edition.
* Skip random testing for cudablas case.
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for ELU and ELU_BP ops.
* Added tests for eq_scalar, gt_scalar, gte_scalar and lte_scalar ops.
* Added tests for neq_scalar.
* Added test for noop.
* - further work on clipbynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - get rid of concat op call, use instead direct concat helper call
Signed-off-by: Yurii <yurii@skymind.io>
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for lrelu and lrelu_bp.
* Added tests for selu and selu_bp.
* Fixed lrelu derivative helpers.
* - some corrections in lstm
Signed-off-by: Yurii <yurii@skymind.io>
* operator * result shape fix
Signed-off-by: raver119 <raver119@gmail.com>
* - correct typo in lstmCell
Signed-off-by: Yurii <yurii@skymind.io>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA inverse broadcast bool fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable MMAP test for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* BooleanOp syncToDevice
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* additional data types for im2col/col2im
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for firas_sparse op.
* one more RandomBuffer test excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for flatten op.
* Added test for Floor op.
* bunch of tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* mmulDot tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Implemented floordiv_bp op and tests.
* Fixed scalar case with cuda implementation for bds.
* - work on cuda kernel for clip_by_norm backprop op is completed
Signed-off-by: Yurii <yurii@skymind.io>
* Eliminate cbow crash.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated abort in batched nlp test.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed shared flag initializing.
* disabled bunch of cpu workspaces tests
Signed-off-by: raver119 <raver119@gmail.com>
* scalar operators fix: missing registerSpecialUse call
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed logdet for cuda and tests.
* - correct clipBynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed crop_and_resize shape datatype.
* - correct some mmul tests
Signed-off-by: Yurii <yurii@skymind.io>