* - start working on bp for lstm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further working on bp for lstmLayer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor change
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 3
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 4
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 5
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 6
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 7
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 8
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 9
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide lstmLayerCell and lstmLayerCellBp as separate CUSTOM_OPs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing lstmLayerCellBp helper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implement lstmLayerCellBp as separate op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implement lstmLayerBp as separate op (not tested)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing calculations of dLdWp and dLdb in lstmLayerCellBp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 10
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing typo in lstmLayerTimeLoop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to perform clipping of c array and calculate corresponding derivative in lstmLayerCellBp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on bp for lstmLayer 10
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in lstmLayer_bp op 1
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in lstmLayer_bp op 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - turn off heavy tests for cuda for lstmLayer_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to nullify gradients at eliminated time steps (when sequnce length array is present )
Signed-off-by: Yurii <iuriish@yahoo.com>
* libnd4j added optional alpha and beta support to matmul
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j typos fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j add optional alpha and beta to matmul_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more typo fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional alpha and beta to mkl implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* MatMul alpha/beta on java side
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in libnd4j
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in matmul_bp
Signed-off-by: raver119 <raver119@gmail.com>
* restored view validation
Signed-off-by: raver119 <raver119@gmail.com>
* gemv/gemm now use MatMul op
Signed-off-by: raver119 <raver119@gmail.com>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* additional INDArray.mmul signature
Signed-off-by: raver119 <raver119@gmail.com>
* make C order default for INDArray.mmul, unless both A/B have F order
Signed-off-by: raver119 <raver119@gmail.com>
* Nd4j.gemm validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable mkldnn matmul for xxf with beta != 0 case
Signed-off-by: raver119 <raver119@gmail.com>
* SimpleRnn workspace fix + timeouts
Signed-off-by: Alex Black <blacka101@gmail.com>
* two more tests + minor fix in matmul platform check
Signed-off-by: raver119 <raver119@gmail.com>
* Flaky test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* propagate testresources profile
Signed-off-by: raver119 <raver119@gmail.com>
* Resources fix + flaky test fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* init in this branch
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Lenetet Mnist workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fix for calculations
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* for Alex to check placeholder null pointer issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* CNN3D workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* state for launching on dxg to regenterate dl4j examples
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* SD RNN test case workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* checkpoint at lstmBlock: Input array 1 (x) rank must be got input with rank 2 issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix LSTMLayer inputs order
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* lstm mismatch with c++ op issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft v2
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* have doubt I had to do this
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* NDRNN generated by codegen
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayerTestCases draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* minor fixes again
* added LSTMLayer testcases to nd4j-tests + setted Preconditions in LSTMLayer constructors
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* added lost SDCNNtestcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* overrided getNumOutputs from DynamicCustomOp in LSTMLayer and reorganized LSTMLayerOutputs according to cpp op
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished with LSTMLayerOutputs
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix MKLDNN platform checks (i.e., when MKLDNN can be used vs. not)
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix LSTMLayerWeights input order
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* fixed LSTMLayer testcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished all testcases + minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Multiple generation-related fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* LSTM fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Regenerate ND4J namespaces and fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* changed SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Small fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* added Nd4j.getRandom().setSeed(12345) where needed
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweak to weight init for SameDiff CNN test case
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks for test cases
Signed-off-by: Alex Black <blacka101@gmail.com>
* Ignore failing tests until fixed
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* - correct reshape op for empty shape in case of -1 at the end
Signed-off-by: Yurii <iuriish@yahoo.com>
* Fix test + new reshape op constructor
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* libnd4j first step of mkldnn for xw_plus_b and test of aurora crash in imageHelper
* libnd4j sync folders with master
* libnd4j merge master, raw implementation of xw_plus_b on mkldnn, clean up, need testing and adding checks for corresponded input shapes
* libnd4j corrections and checks added to xw_plus_b mkl
* libnd4j corrected dataType description based on mkl operation description, need more investigation
* libnd4j fixe xw_blus_b mkl implementation, need testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j two unit tests added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed check input dimensions bug
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libndj4 one more test added to cover different order handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional int arg support to define weights format, if arg == 1, mkldnn (do not need transpose in mkldnn implementation), else mmul weights format, corrected check points, added unit test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some improvements to avoid NDArray transpose in xw_plus_b operation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues connected with weights rank, also added support of one case based on tf (for mkldnn, cpu, cuda), test case added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added proper handling of empty inputs (all implementations)
* libnd4j fixed compilation error
* libnd4j several more corrections after conflict solve and fixed typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j removed unsupported data types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and fixed issues
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added propagation implementation for xw_plus_b, fixed issue connected with mkl weights data format, avoided data copy in transpose mode, test cases added, manually tested with gradCheck
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix of double operation declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor tests fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed build problem, integrate helpers changes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - start working on reshape op which operates with empty shapes
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct reshaping for empty arrays
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove unnecessary check in Loopkind
Signed-off-by: Yurii <iuriish@yahoo.com>
* libnd4j raw implementation of sgd upader
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and simple test added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections after discussion
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate applyScalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j raw implementation of rmsPropUpdater on cpu
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fix operations declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j rmsPropUpdater added, test cases for sgd, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some fixes and improvements for rmsPropUpdater based on Java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed cuda implementation, update tests and corrected behavior according java tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaGrad updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix for ada grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more fixes for ada_grad
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nesterovs updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed nesterovs updater behavior, several typos and rename file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor typo
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j ada max updater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMax updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos in adaMaxUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaMax, added Adam Updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j adaDeltaUpdater added, minor fixes for adamUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for adaDeltaUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j nadamUpdater added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more correction for nadam updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for nadam updater and added amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several typos fixed in amsGradUpdater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections and added f order support rmsProp updater
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added support of f order for all updaters and modify tests for testing in place
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues for updates when not in place mode used, added tests for f order
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added input shape checks
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections for different cases handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some code clean up and optimize per request
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j updaters refactoring after review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* SgdUpdater wrapper
Signed-off-by: raver119 <raver119@gmail.com>
* first test
Signed-off-by: raver119 <raver119@gmail.com>
* RmsPropUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* NadamUpdater + NesterovsUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AmsGradUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdamUpdater added
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater + AdaDeltaUpdater + AdaMaxUpdater
Signed-off-by: raver119 <raver119@gmail.com>
* AdaGradUpdater test added
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j remove input parameters parsing through NDArray, split implementation of helpers to separate files, added some rename, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j next step to split operations implementation into separate files
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and minor corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert some changes of split implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j forgot to add header file
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* public default constructors
Signed-off-by: raver119 <raver119@gmail.com>
* ImportClassMapping updated
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - start to introduce additional weights formats into conv2d ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide weights format variety in backprop conv2d and deconv2d ops, testing and fixing bugs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to recover kernels sizes in deconv2d_bp test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - built in weights format in depthwise conv 2d op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in mkl dnn conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cuda conv helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - working with new weights format in cudnn conv api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order of arrays in cudnn tensor descriptions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu conv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu deconv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts 2
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Refactored exponential distribution implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored exponential distribution and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored test to new result sets.
Signed-off-by: shugeo <sgazeos@gmail.com>
* bunch of small fixes
Signed-off-by: raver119 <raver119@gmail.com>
* validation for legacy random op
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of test
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j first step of tanh_bp operation implementation on mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j optimize several places and added test case for tanh_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor corrections and renaming, added one more test case
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j missed mkldnn data format definition
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j mkldnn softmax_bp operation implementation and integration, 2 tests added, need some refactoring and code clean up and more testing with different input shapes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j softmax_bp update, code refactoring, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master, fixed typos, minor tweaks, code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate mkldnnUtils helpers in other mkldnn operations
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - provide faster index2coords function for cpu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - new faster index2coords function is introduced into cpu code
Signed-off-by: Yurii <iuriish@yahoo.com>
* - replace long long coordinates with int coordinates
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add missed reload of coords2index function
Signed-off-by: Yurii <iuriish@yahoo.com>
* - reststart jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rollback changes in convolutions.cu and addBias.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling TrueBroadcastHelper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further improving of TrueBroadcastHelper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further profiling of broadcast op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of broadcastShapeHelper which inserts unities in shapes of arrays to be broadcasted
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional method in ConstantShapeHelper class for deducing broadcast shapes with unities
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new NativeOps helpers for usual and true broadcast methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* enable bert profiler
Signed-off-by: raver119 <raver119@gmail.com>
* - delete unnessesary tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* libnd4j first step of softmax mkldnn implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j raw implementation of mkldnn softmax
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and added softmax to MklDnnTests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections for softmax mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge branch, fixed problem with negative axis, fixed dnnl::memory::format_tag selection, test cases added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor corrections to avoid risk connected with negative axis usage
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed windows builds, added switcher to use mkldnn sofmax version only for 3D, 4D, 5D, 6D arrays
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed dataType selection per request
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fix for mac and windows builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j builds fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j first spet of elementwize tanh implementation on mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed typo in error message for softmax MKLDNN, test case added, implementation of tanh on MKLDNN, need supported DataType testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for tanh and temporary performance test added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed mkldnn platform loader for tanh
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j MklDnn tanh removed unsupported data types, removed performance test case, added more appropriate equivalence test case, code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed problem with empty input case for MklDnn tanh and softmax
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - profiling of stack and unstack ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in cpu concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correction of cuda stack and unstack
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change shape.h method which operates with unity dimensions strides
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rearrange stack tests
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct evaluation of smallest stride for moving through contiguous axis
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to update signature of function strideOverContigAxis in cuda concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove ShapeUtils::shapeAsString method applied before input arrays validations
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further removing of ShapeUtils::shapeAsString
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take sub-array shapeIndo/offset calculation out of NDArray class
- add possibility of contiguous memory copy in execTransformAny op if opNum == assign
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct test_empty_scatter_2 in EmptyTests.cpp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling of slice op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of contiguous memcpy for some cases in concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to declare oid nd4j::SpecialMethods<T>::splitCpuGeneric
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct typo in calculation of threads in cuda split op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to correct another set of threads variables in split cuda ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further conflicts resolving
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* = namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j cast loop types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more type castination added to loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j sync casting types of iterated variable in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more loops reviewed for vectorization problem fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more files reviewed to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and reviewed more files to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several type casting added in broadcasting that were missed, fixed mac builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j double check all files and fix several more places in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert changes for lup.cpp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more files reviewed for auto-vectorization problem fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j cast loop types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more type castination added to loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j sync casting types of iterated variable in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more loops reviewed for vectorization problem fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed several typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several more files reviewed to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and reviewed more files to fix auto-vectorization problem in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several type casting added in broadcasting that were missed, fixed mac builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j double check all files and fix several more places in loops
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j revert changes for lup.cpp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>