* Fix cmake detection in msys
* Fix toolchain file on windows
* Make android 64 bit work
* Fix libnd4j build script on msys
* Update build script for windows/linux
* Encoding issue for ci
* Update pom.xml
* Update pom.xml
* Update pom.xml
* Remove mingw
* Ensure android x86 builds are inline with arm builds
* Update toolchains and env variables for x86
* Move profile for build program up to parent
* Fix blas vendor and add comment
* Update cuda presets version
* Set default value and move properties back to child pom
* Change program from hard coded to use the script as the program
* Update pom.xml
* Update pom.xml
* Static lib fix
* Update static lib output
* Get rid of old comments
* Update static for buiding
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* = namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after cudnn lib had been started to use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementaion of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporary
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>