* - disable mkldnn concat when number of input arrays > 3072
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of loop in calculating of input arrays number
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide correct possible output types in mergeMaxIndex op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - cleaning up the unneeded backprop arg in reverse_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - improve clipByNorm both ff and bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing clipByAvgNorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - pass biases in any way in dnnl lstm op, they are zeros when user doesn't provide them to us
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start working on mkldnn concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on mkldnn concat
Signed-off-by: Yurii <iuriish@yahoo.com>
* missing declaration fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - polishing mkl ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in mkl concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix linkage error for windows cuda build
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further conflicts resolving with master
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix format tags in mkldnn matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional type cast in clip.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - finally bug in mkldnn tanh_bp was caught
Co-authored-by: raver119@gmail.com <raver119@gmail.com>
* one simple test
Signed-off-by: raver119 <raver119@gmail.com>
* fix
Signed-off-by: raver119 <raver119@gmail.com>
* hmmmm...
Signed-off-by: raver119 <raver119@gmail.com>
* mkl matmul skip tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix for MemoryTracker
* long shapes in matmul
* - 2 new tests for mkldnn tanh
- mkldnn isn't used for scalar tanh
* libnd4j added optional alpha and beta support to matmul
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j typos fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j add optional alpha and beta to matmul_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more typo fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional alpha and beta to mkl implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* MatMul alpha/beta on java side
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in libnd4j
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in matmul_bp
Signed-off-by: raver119 <raver119@gmail.com>
* restored view validation
Signed-off-by: raver119 <raver119@gmail.com>
* gemv/gemm now use MatMul op
Signed-off-by: raver119 <raver119@gmail.com>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* additional INDArray.mmul signature
Signed-off-by: raver119 <raver119@gmail.com>
* make C order default for INDArray.mmul, unless both A/B have F order
Signed-off-by: raver119 <raver119@gmail.com>
* Nd4j.gemm validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable mkldnn matmul for xxf with beta != 0 case
Signed-off-by: raver119 <raver119@gmail.com>
* SimpleRnn workspace fix + timeouts
Signed-off-by: Alex Black <blacka101@gmail.com>
* two more tests + minor fix in matmul platform check
Signed-off-by: raver119 <raver119@gmail.com>
* Flaky test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* propagate testresources profile
Signed-off-by: raver119 <raver119@gmail.com>
* Resources fix + flaky test fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* init in this branch
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Lenetet Mnist workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fix for calculations
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* for Alex to check placeholder null pointer issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* CNN3D workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* state for launching on dxg to regenterate dl4j examples
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* SD RNN test case workflow
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* small fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* checkpoint at lstmBlock: Input array 1 (x) rank must be got input with rank 2 issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix LSTMLayer inputs order
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* lstm mismatch with c++ op issue
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayer config draft v2
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* have doubt I had to do this
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* NDRNN generated by codegen
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* LSTMLayerTestCases draft
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* minor fixes again
* added LSTMLayer testcases to nd4j-tests + setted Preconditions in LSTMLayer constructors
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* added lost SDCNNtestcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* overrided getNumOutputs from DynamicCustomOp in LSTMLayer and reorganized LSTMLayerOutputs according to cpp op
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished with LSTMLayerOutputs
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Fix MKLDNN platform checks (i.e., when MKLDNN can be used vs. not)
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix LSTMLayerWeights input order
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* fixed LSTMLayer testcases
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* finished all testcases + minor fixes
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Multiple generation-related fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* More fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* LSTM fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Regenerate ND4J namespaces and fix multiple issues
Signed-off-by: Alex Black <blacka101@gmail.com>
* changed SameDiffRNNTestCase
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* Small fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* added Nd4j.getRandom().setSeed(12345) where needed
Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* #8828 Fix ND4J profiler NaN/Inf checks when using OpContext
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweak to weight init for SameDiff CNN test case
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks for test cases
Signed-off-by: Alex Black <blacka101@gmail.com>
* Ignore failing tests until fixed
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* libnd4j first step of mkldnn for xw_plus_b and test of aurora crash in imageHelper
* libnd4j sync folders with master
* libnd4j merge master, raw implementation of xw_plus_b on mkldnn, clean up, need testing and adding checks for corresponded input shapes
* libnd4j corrections and checks added to xw_plus_b mkl
* libnd4j corrected dataType description based on mkl operation description, need more investigation
* libnd4j fixe xw_blus_b mkl implementation, need testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j two unit tests added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed check input dimensions bug
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libndj4 one more test added to cover different order handling
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional int arg support to define weights format, if arg == 1, mkldnn (do not need transpose in mkldnn implementation), else mmul weights format, corrected check points, added unit test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some improvements to avoid NDArray transpose in xw_plus_b operation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed issues connected with weights rank, also added support of one case based on tf (for mkldnn, cpu, cuda), test case added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added proper handling of empty inputs (all implementations)
* libnd4j fixed compilation error
* libnd4j several more corrections after conflict solve and fixed typos
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j removed unsupported data types
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and fixed issues
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added propagation implementation for xw_plus_b, fixed issue connected with mkl weights data format, avoided data copy in transpose mode, test cases added, manually tested with gradCheck
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one minor fix of double operation declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor tests fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed build problem, integrate helpers changes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - start to introduce additional weights formats into conv2d ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide weights format variety in backprop conv2d and deconv2d ops, testing and fixing bugs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to recover kernels sizes in deconv2d_bp test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - built in weights format in depthwise conv 2d op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in mkl dnn conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cuda conv helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - working with new weights format in cudnn conv api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order of arrays in cudnn tensor descriptions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu conv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu deconv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts 2
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* libnd4j first step of tanh_bp operation implementation on mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j optimize several places and added test case for tanh_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor corrections and renaming, added one more test case
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j missed mkldnn data format definition
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j mkldnn softmax_bp operation implementation and integration, 2 tests added, need some refactoring and code clean up and more testing with different input shapes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j softmax_bp update, code refactoring, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master, fixed typos, minor tweaks, code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate mkldnnUtils helpers in other mkldnn operations
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j first step of softmax mkldnn implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j raw implementation of mkldnn softmax
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master and added softmax to MklDnnTests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections for softmax mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge branch, fixed problem with negative axis, fixed dnnl::memory::format_tag selection, test cases added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j minor corrections to avoid risk connected with negative axis usage
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed windows builds, added switcher to use mkldnn sofmax version only for 3D, 4D, 5D, 6D arrays
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed dataType selection per request
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fix for mac and windows builds
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j builds fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j first spet of elementwize tanh implementation on mkldnn
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed typo in error message for softmax MKLDNN, test case added, implementation of tanh on MKLDNN, need supported DataType testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j several fixes for tanh and temporary performance test added
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed mkldnn platform loader for tanh
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j MklDnn tanh removed unsupported data types, removed performance test case, added more appropriate equivalence test case, code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j fixed problem with empty input case for MklDnn tanh and softmax
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* = namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling gather op for aurora
Signed-off-by: Yurii <iuriish@yahoo.com>
* - include contiguous memcpy in gather op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide matmul code based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct typo in mkl matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account empty arrays in mkl matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in mkl matmul and group all matmul tests in one file
Signed-off-by: Yurii <iuriish@yahoo.com>
* Libnd4j: TensorMMul backprop op #8174, raw implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 merge master and some corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 algorithm update, need testing, sync with master
* Libnd4j: TensorMMul backprop op #8174 fixed incorrect B axes calculation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 optimize axes identification and fix bug of indeces overlapping, added first test. need testing with different shapes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 some fixes and improvements need more testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed order of matrix multiply
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed issue of incorrect axes definition, add tests based on TF, need additional testing for case dLdC not equal 1
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed scalar case add test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 fixed bp algorithm, axes definition, need some mode testing with different orders combination f,c; c,f f,f and add some checks for inputs
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 some checks and corrections added tests, exists the problem with different input orders support A-f B-c and A-f B-f
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: TensorMMul backprop op #8174 sync master
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - correct bug in MmulHelper::tensorDot(a, b, c, axes_a, axes_b,permutForC)
Signed-off-by: Yurii <iuriish@yahoo.com>
* Libnd4j: TensorMMul backprop op #8174 code clean up and refactoring
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - add check for linspase ordered permutations in ShapeUtils::evalShapeForTensorDot
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional code in shape::reshape stuff in order to reduce amount of allocation/copy operations during reshaping procedure
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on problem of wrong shape evaluation during permute/reshape procedures
Signed-off-by: Yurii <iuriish@yahoo.com>
* - still looking for bug reason in reshape/permute stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in transform cuda native ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in NDArray::assign
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove old shape::reshape stuff
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add possibility to disable copy of old buffer to new buffer during reshape operation in NDArray class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct bug in tensorDot which had to do with wrong pointers assigments
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Oleh <oleg.semeniv@gmail.com>
* - provide nhwc format in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl conv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl batchnorm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl maxpooling2d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add format format_tag::any to outputs in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - complete corrections in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add test for comparison of execution speeds of mkl conv2d op with different weights format
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order f in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of cudnn batchnorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in batchnorm_bp based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move pooling mkl code and delete some unnecessary files
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling2d ops (avg/max, ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling 3d (ff/bp) ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide ff step in case of cudnn maxpool3d_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove half type from set of supported types in mkl dpethwise conv op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bring back cudaStreamSynchronize in batchnorm and pooling cudnn ops
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after cudnn lib had been started to use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementaion of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporary
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* - implementation of depthwise_conv2d (both ff/bp) based on mkl dnn api
* - minor corrections in deconv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove unnecessary time test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - update mkl dnn version in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account several notes given by pr reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in depthwise conv2d op based on mkl
Signed-off-by: Yurii <iuriish@yahoo.com>
* - specifying template instantiation for certain types in float16 and bloat16
Signed-off-by: Yurii <iuriish@yahoo.com>
* - polishing bfloat16 and float16 member functions template specialization
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rewrite and overload array +-*/ scalar and scalar +-*/ arr in NDAray class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - make corrections which have to do with and rvalue lvalue conversions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide move semantic in NDArray operators array +-/* array
Signed-off-by: Yurii <iuriish@yahoo.com>
* float16/bfloat16 tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* - make float16 and bfloat16 to compile successfully on cuda
Signed-off-by: Yurii <iuriish@yahoo.com>
* - do not use resources of view-like arrays when move semantics is applied
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of pointers in signatures NDArray methods 1
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correction of signature of NDArray::dup method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correction of signature of NDArray::reduceAlongDimension method
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyIndexReduce and applyTrueBroadcast methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyReduce3 and varianceAlongDimension methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::tensorsAlongDimension and diagonal methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::allTensorsAlongDimension
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::reduceAlongDimension 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyTransform 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyPairwiseTransform 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyBroadcast 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyTrueBroadcast 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::applyScalar and applyScalarArr
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::lambda methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::reduce3 methods 2
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of following NDArray methods: add/sub/mul/div row/column and fillAsTriangular
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::tileToShape methods
Signed-off-by: Yurii <iuriish@yahoo.com>
* - signature correction of NDArray::isShapeSameStrict method
Signed-off-by: Yurii <iuriish@yahoo.com>
* minor corrections in tests
Signed-off-by: Yurii <iuriish@yahoo.com>
* - replace reduce op in batchnorm mkldnn
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add explicit templates instantiations for operator+(NDArray&&. const scalar)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections of casts in float16/bfloat16
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide move semantics in following NDArray methods: transform, applyTrueBroadcast, transpose, reshape, permute
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of input array A duplicate in svd cuda op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - avoid available bug in svd cuda API
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add temporary global memory buffer in svd cuda when calcUV = false and m != n
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove test with blfoat16 type for betainC
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts after master has been merged in
Signed-off-by: Yurii <iuriish@yahoo.com>
* - changed type of affected input array in fused_batch_norm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add several explicit type castings
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add ND4J_EXPORT to operators
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add explicit template types in instantiations of template arithm operators of NDArray class
Signed-off-by: Yurii <iuriish@yahoo.com>
* - one more test fix
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* - add padding calculation in same mode in causal conv1d op for right mkl paddings
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct causal condition in mkldnnUtils.cpp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct some code which caused additional round errors is betainc op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - put float in place of template parameter in nan assign in betainc op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add possibility of passing scalar-array as input parameter for scale factor in adjust hue/contrast/saturation ops
- correct typo in function which calculates regularized incomplete beta integral
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in betainc cuda kernel
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start working on implementation of digamma function
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on digamma function (cpu)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in digamma op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - make correction n cuda kernel for polyGamma
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove unnecessary stuff from betaInc cuda kernel
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts in DeclarableOpsTests3.cpp after master branch has been merged
Signed-off-by: Yurii <iuriish@yahoo.com>
* - restore id number of Not opertion in legacy_ops.h
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct padding calculation in mkl dnn conv1d causal
Signed-off-by: Yurii <iuriish@yahoo.com>
* restore empty check in adjust_contrast_v2
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling cuda kernels for vol2col and im2col
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct addBias helper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct mkl dilation formula and switch off mkl api for dilation deconvolutions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide possibility to pass axis as last input array in concat op
- corrcect sumation in bias_add_bp op for NHWC case
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write code for deconv2d op based on mkl dnn api
* no unsafe math
Signed-off-by: raver119 <raver119@gmail.com>
* no unsafe math
Signed-off-by: raver119 <raver119@gmail.com>
* - get rid of e<> and p<> methods in svd helper
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide mkl api support for deconvolution 3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write deconv2d_bp based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write deconv3d_bp based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing deconv based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove dilation form conv2d/3d mkl
Signed-off-by: Yurii <iuriish@yahoo.com>
* - minor changes
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further corrections of deconv ops based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide deconv2d_tf based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor corrections required by reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write code for new batchnorm backprop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing batchnorm backprop
Signed-off-by: Yurii <iuriish@yahoo.com>
* - write code for batchnorm backprop based on mkl dnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in batchnorm_bp mkl dnn
Signed-off-by: Yurii <iuriish@yahoo.com>
* - made corrections required by reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change name in java wrapper for batchnorm op
Signed-off-by: Yurii <iuriish@yahoo.com>