* - start working on implementation of sqrtm op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - improving householder procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further polishing householder stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing hh pivoting qr procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing BiDiagonalUp procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing householder sequence class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing jacobi svd class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing svd stuff 1
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing svd stuff 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing class which performs Hessenberg decomposition of square matrix
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add static method to JacobiSVD class which makes the continuous Givens rotation generation algorithm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing auxiliary methods of Schur decomp class
Signed-off-by: Yurii <iuriish@yahoo.com>
* some references here and there
Signed-off-by: raver119 <raver119@gmail.com>
* - trying to figure out the difference between Eigen's Schur algorithm and ours
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing fixing bugs in Schur decomposition op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start to implement class which performs calculation of eigen values and vectors
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add to EigenValsAndVecs method which calculates complex eigen vectors
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in EigenValsAndVecs class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing triangularSolver class
Signed-off-by: Yurii <iuriish@yahoo.com>
* Added a 2D routine for triangular systems solve.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored triangularSolve2D routine and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored another test for triangularSolve2D.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored test for triangularSolve for vector-bar case.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored triangularSolve2D routine and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - implementation of FullPivLU class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bugs in FullPivLU::solve method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct permutation vector in FullPivLU::solve
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct include headers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - include sqrtm classes to cuda folder, investigate in what places synchronization doesn't work
Signed-off-by: Yurii <iuriish@yahoo.com>
* Added implementation for cuda triangularSolve2D and also refactored triangularSolve2D for cpu.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated redundant implementations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - make offset calculation faster in t<> methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rename reference T& NDArray::t<> method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on cuda sqrtm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide correct synchronization to device in Sqrtm class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add tests for sqrtm op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix failures which appeared while testing on jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - trying to find out mistake in svd::deflation method
Signed-off-by: Yurii <iuriish@yahoo.com>
* Revert "- trying to find out mistake in svd::deflation method"
This reverts commit 19d37baddbc509028e4bc67bc932fe7449becdb6.
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change call semantic of r<> and t<> methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of ambiguity in * operator overloads for windows build 3
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts with master
Signed-off-by: Yurii <iuriish@yahoo.com>
* cmakelists updated
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - minor fix in merge cpu helper - make use of reference getter
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
* - provide correct possible output types in mergeMaxIndex op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - cleaning up the unneeded backprop arg in reverse_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - improve clipByNorm both ff and bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing clipByAvgNorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - pass biases in any way in dnnl lstm op, they are zeros when user doesn't provide them to us
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start working on mkldnn concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on mkldnn concat
Signed-off-by: Yurii <iuriish@yahoo.com>
* missing declaration fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - polishing mkl ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in mkl concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix linkage error for windows cuda build
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further conflicts resolving with master
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix format tags in mkldnn matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional type cast in clip.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - finally bug in mkldnn tanh_bp was caught
Co-authored-by: raver119@gmail.com <raver119@gmail.com>
* MergeMaxIndex, ReverseBp, Tri, Triu and TriuBp added
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Upsampling3d draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* minor fix (upsampling3dBp inputDatatype.size=2)
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* polished testcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* matching of Upsampling3d input format according to cpp iArg
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* ops generated from codegen
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* requested changes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* added super() for Triu
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* everything passes except TriuOp
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Tri op dtype arg (output datatype config support) + default float32
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* temporary commit with manually edited sd/nd ops
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Cannot use 'val' here because initializer expression does not have a representable type: Type cannot be resolved
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* all tests passed
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* few requested changes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Small fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Ignore reverse_bp test due to logged issue
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix reverse op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix MergeMaxIndex dtype -> iarg
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* Fixed bound problem with Exponential distribution implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for Exponential distribution to avoid infinities.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a test for exponential distribution with 1M elements.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cosmetic changes only and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Modified test and implementation for exponential_distribution op.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Added test and fixed error message for unsorted_segment_sqrt_n op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed error message for unsorted_segment_* ops when 1 segment is given.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - start working on bp for lstm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further working on bp for lstmLayer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor change
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 3
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 4
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 5
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 6
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 7
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 8
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 9
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide lstmLayerCell and lstmLayerCellBp as separate CUSTOM_OPs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing lstmLayerCellBp helper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implement lstmLayerCellBp as separate op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implement lstmLayerBp as separate op (not tested)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing calculations of dLdWp and dLdb in lstmLayerCellBp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 10
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing typo in lstmLayerTimeLoop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to perform clipping of c array and calculate corresponding derivative in lstmLayerCellBp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 10
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in lstmLayer_bp op 1
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in lstmLayer_bp op 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - turn off heavy tests for cuda for lstmLayer_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to nullify gradients at eliminated time steps (when sequence length array is present)
Signed-off-by: Yurii <iuriish@yahoo.com>
* libnd4j added optional alpha and beta support to matmul
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j typos fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j add optional alpha and beta to matmul_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more typo fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional alpha and beta to mkl implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* MatMul alpha/beta on java side
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in libnd4j
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in matmul_bp
Signed-off-by: raver119 <raver119@gmail.com>
* restored view validation
Signed-off-by: raver119 <raver119@gmail.com>
* gemv/gemm now use MatMul op
Signed-off-by: raver119 <raver119@gmail.com>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* additional INDArray.mmul signature
Signed-off-by: raver119 <raver119@gmail.com>
* make C order default for INDArray.mmul, unless both A/B have F order
Signed-off-by: raver119 <raver119@gmail.com>
* Nd4j.gemm validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable mkldnn matmul for xxf with beta != 0 case
Signed-off-by: raver119 <raver119@gmail.com>
* SimpleRnn workspace fix + timeouts
Signed-off-by: Alex Black <blacka101@gmail.com>
* two more tests + minor fix in matmul platform check
Signed-off-by: raver119 <raver119@gmail.com>
* Flaky test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* propagate testresources profile
Signed-off-by: raver119 <raver119@gmail.com>
* Resources fix + flaky test fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* - correct reshape op for empty shape in case of -1 at the end
Signed-off-by: Yurii <iuriish@yahoo.com>
* Fix test + new reshape op constructor
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* libnd4j first step of mkldnn for xw_plus_b and test of aurora crash in imageHelper
* libnd4j sync folders with master
* libnd4j merge master, raw implementation of xw_plus_b on mkldnn, clean up, need testing and adding checks for corresponded input shapes
* libnd4j corrections and checks added to xw_plus_b mkl
* libnd4j corrected dataType description based on mkl operation description, need more investigation
* libnd4j fixed xw_plus_b mkl implementation, need testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j two unit tests added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed check input dimensions bug
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more test added to cover different order handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional int arg support to define weights format, if arg == 1, mkldnn (do not need transpose in mkldnn implementation), else mmul weights format, corrected check points, added unit test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some improvements to avoid NDArray transpose in xw_plus_b operation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues connected with weights rank, also added support of one case based on tf (for mkldnn, cpu, cuda), test case added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added proper handling of empty inputs (all implementations)
* libnd4j fixed compilation error
* libnd4j several more corrections after conflict solve and fixed typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j removed unsupported data types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and fixed issues
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added propagation implementation for xw_plus_b, fixed issue connected with mkl weights data format, avoided data copy in transpose mode, test cases added, manually tested with gradCheck
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix of double operation declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor tests fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed build problem, integrate helpers changes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - start working on reshape op which operates with empty shapes
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct reshaping for empty arrays
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove unnecessary check in Loopkind
Signed-off-by: Yurii <iuriish@yahoo.com>
* libnd4j raw implementation of sgd updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and simple test added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections after discussion
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate applyScalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j raw implementation of rmsPropUpdater on cpu
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fix operations declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j rmsPropUpdater added, test cases for sgd, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some fixes and improvements for rmsPropUpdater based on Java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed cuda implementation, update tests and corrected behavior according java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaGrad updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix for ada grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more fixes for ada_grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nesterovs updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed nesterovs updater behavior, several typos and rename file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor typo
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j ada max updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMax updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMaxUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaMax, added Adam Updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaDeltaUpdater added, minor fixes for adamUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaDeltaUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nadamUpdater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more correction for nadam updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for nadam updater and added amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several typos fixed in amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and added f order support rmsProp updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added support of f order for all updaters and modify tests for testing in place
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues for updates when not in place mode used, added tests for f order
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added input shape checks
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections for different cases handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some code clean up and optimize per request
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j updaters refactoring after review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* SgdUpdater wrapper
Signed-off-by: raver119 <raver119@gmail.com>
* first test
Signed-off-by: raver119 <raver119@gmail.com>
* RmsPropUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* NadamUpdater + NesterovsUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AmsGradUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdamUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater + AdaDeltaUpdater + AdaMaxUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater test added
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j remove input parameters parsing through NDArray, split implementation of helpers to separate files, added some rename, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j next step to split operations implementation into separate files
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and minor corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert some changes of split implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j forgot to add header file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* public default constructors
Signed-off-by: raver119 <raver119@gmail.com>
* ImportClassMapping updated
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - start to introduce additional weights formats into conv2d ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide weights format variety in backprop conv2d and deconv2d ops, testing and fixing bugs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to recover kernels sizes in deconv2d_bp test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - built in weights format in depthwise conv 2d op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in mkl dnn conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cuda conv helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - working with new weights format in cudnn conv api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order of arrays in cudnn tensor descriptions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu conv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu deconv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts 2
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - provide faster index2coords function for cpu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - new faster index2coords function is introduced into cpu code
Signed-off-by: Yurii <iuriish@yahoo.com>
* - replace long long coordinates with int coordinates
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add missed reload of coords2index function
Signed-off-by: Yurii <iuriish@yahoo.com>
* - restart jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rollback changes in convolutions.cu and addBias.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling of stack and unstack ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in cpu concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correction of cuda stack and unstack
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change shape.h method which operates with unity dimensions strides
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rearrange stack tests
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct evaluation of smallest stride for moving through contiguous axis
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to update signature of function strideOverContigAxis in cuda concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove ShapeUtils::shapeAsString method applied before input arrays validations
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further removing of ShapeUtils::shapeAsString
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take sub-array shapeInfo/offset calculation out of NDArray class
- add possibility of contiguous memory copy in execTransformAny op if opNum == assign
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct test_empty_scatter_2 in EmptyTests.cpp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling of slice op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of contiguous memcpy for some cases in concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to declare void nd4j::SpecialMethods<T>::splitCpuGeneric
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct typo in calculation of threads in cuda split op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to correct another set of threads variables in split cuda ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further conflicts resolving
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j cast loop types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more type casting added to loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j sync casting types of iterated variable in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more loops reviewed for vectorization problem fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more files reviewed to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and reviewed more files to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several type casting added in broadcasting that were missed, fixed mac builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j double check all files and fix several more places in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert changes for lup.cpp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j moved split operation implementation to helpers before special case adding
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor fixes for general split operation move, merge master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j split cpu implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - provide cuda helper for split op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor correction
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor correction 2
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - profiling of concat op (both cuda and cpu)
Signed-off-by: Yurii <iuriish@yahoo.com>
* better comparison for large concat
Signed-off-by: raver119 <raver119@gmail.com>
* - further improving of concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* - add possibility to verify presence of trailing unities in shape and set strides/ews correspondingly
- restrict second simple case in concat op to c order only
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move concat op to specials_single.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of second concat op declaration in transforms.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - provide matmul code based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct typo in mkl matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account empty arrays in mkl matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in mkl matmul and group all matmul tests in one file
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide contiguous strides for output in transpose op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide contiguous strides for output in permute op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account empty shapes properly in transpose/permute op
Signed-off-by: Yurii <iuriish@yahoo.com>
* Libnd4j: TensorMMul backprop op #8174, raw implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 merge master and some corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 algorithm update, need testing, sync with master
* Libnd4j: TensorMMul backprop op #8174 fixed incorrect B axes calculation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 optimize axes identification and fix bug of indices overlapping, added first test. need testing with different shapes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 some fixes and improvements need more testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed order of matrix multiply
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed issue of incorrect axes definition, add tests based on TF, need additional testing for case dLdC not equal 1
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed scalar case add test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed bp algorithm, axes definition, need some more testing with different orders combination f,c; c,f f,f and add some checks for inputs
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 some checks and corrections added tests, exists the problem with different input orders support A-f B-c and A-f B-f
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 sync master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - correct bug in MmulHelper::tensorDot(a, b, c, axes_a, axes_b,permutForC)
Signed-off-by: Yurii <iuriish@yahoo.com>
* Libnd4j: TensorMMul backprop op #8174 code clean up and refactoring
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - add check for linspace ordered permutations in ShapeUtils::evalShapeForTensorDot
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional code in shape::reshape stuff in order to reduce amount of allocation/copy operations during reshaping procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on problem of wrong shape evaluation during permute/reshape procedures
Signed-off-by: Yurii <iuriish@yahoo.com>
* - still looking for bug reason in reshape/permute stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in transform cuda native ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in NDArray::assign
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove old shape::reshape stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add possibility to disable copy of old buffer to new buffer during reshape operation in NDArray class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in tensorDot which had to do with wrong pointers assignments
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Oleh <oleg.semeniv@gmail.com>
* Fixed a couple of issues with resize_area op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added additional test for alternate params for resize_area testing.
Signed-off-by: shugeo <sgazeos@gmail.com>
* - provide nhwc format in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl conv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl batchnorm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl maxpooling2d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add format format_tag::any to outputs in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - complete corrections in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add test for comparison of execution speeds of mkl conv2d op with different weights format
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order f in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* Fixed sequence_mask op and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda fix for sequence_mask op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed sequence_mask op for both platforms and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed solve and triangular_solve for more than 2D for adjoint cases.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added adjoint solve test again.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a set of tests for triangular_solve and generic solve ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a pair of tests for triangular_solve
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added tests for triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* linear equations systems solve op. Initial commit.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed compiling issues.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Linear equations systems solve. The next stage commit.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for linear equations systems solve operation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added additional test and fixed lower matrix retrieval.
* Implementation for solve of the systems of linear equations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored permutation generation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added restore for permutations batched with cuda helper for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished cuda implementation for solve op helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored cpu helpers for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fix gtest output on Windows
* Fixed issue with permutation matrix for cuda implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed issue with permutation matrix for cpu implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated unneeded comments.
Signed-off-by: shugeo <sgazeos@gmail.com>
* LinearSolve added
* Mapping added
* Javadoc added
* Refactored implementation of triangular_solve helpers and tests for solve matrix equations generally.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a test for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Solve test added
* Fix for TF import
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>