* - provide correct possible output types in mergeMaxIndex op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - cleaning up the unneeded backprop arg in reverse_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - improve clipByNorm both ff and bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing clipByAvgNorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - always pass biases in dnnl lstm op; they are zeros when the user doesn't provide them
Signed-off-by: Yurii <iuriish@yahoo.com>
* - start working on mkldnn concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on mkldnn concat
Signed-off-by: Yurii <iuriish@yahoo.com>
* missing declaration fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - polishing mkl ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in mkl concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix linkage error for windows cuda build
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further resolving of conflicts with master
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix format tags in mkldnn matmul op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide additional type cast in clip.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - finally caught the bug in mkldnn tanh_bp
Co-authored-by: raver119@gmail.com <raver119@gmail.com>
* - start to introduce additional weights formats into conv2d ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide weights format variety in backprop conv2d and deconv2d ops, testing and fixing bugs
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to recover kernel sizes in deconv2d_bp test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - built in weights format in depthwise conv 2d op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in mkl dnn conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cuda conv helpers
Signed-off-by: Yurii <iuriish@yahoo.com>
* - working with new weights format in cudnn conv api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order of arrays in cudnn tensor descriptions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu conv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in cpu deconv3d (ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on mkl api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide new weights formats in conv3d ops (ff/bp) based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - resolve conflicts 2
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* libnd4j: implement and integrate mkldnn softmax_bp operation; 2 tests added; still needs refactoring, code cleanup, and more testing with different input shapes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j softmax_bp update, code refactoring, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j merge master, fixed typos, minor tweaks, code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j integrate mkldnnUtils helpers in other mkldnn operations
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - provide nhwc format in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl conv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl batchnorm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl maxpooling2d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add format_tag::any format to outputs in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - complete corrections in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add test for comparison of execution speeds of mkl conv2d op with different weights format
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order f in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of cudnn batchnorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in batchnorm_bp based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move pooling mkl code and delete some unnecessary files
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling2d ops (avg/max, ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling 3d (ff/bp) ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide ff step in case of cudnn maxpool3d_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove half type from set of supported types in mkl depthwise conv op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bring back cudaStreamSynchronize in batchnorm and pooling cudnn ops
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's no cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs that appeared once the cudnn lib came into use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bug fixes in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporarily
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* - implementation of depthwise_conv2d (both ff/bp) based on mkl dnn api
* - minor corrections in deconv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove unnecessary time test
Signed-off-by: Yurii <iuriish@yahoo.com>
* - update mkl dnn version in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account several notes given by pr reviewer
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in depthwise conv2d op based on mkl
Signed-off-by: Yurii <iuriish@yahoo.com>