agibsonccc
6dc7e2f08f
Update c++ copyrights
2021-02-01 21:31:45 +09:00
raver119
ac7fb903d7
C++ rearrangements ( #485 )
* initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* some minor singleton changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* more iterations
Signed-off-by: raver119 <raver119@gmail.com>
* more singletons updated
Signed-off-by: raver119 <raver119@gmail.com>
* more singletons updated
Signed-off-by: raver119 <raver119@gmail.com>
* more changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* CUDA updates
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Java side update
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one commented out test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
2020-06-06 15:26:55 +03:00
raver119
320924278d
Legacy API changes ( #441 )
* initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* another initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* another initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Refactored buffer() and shapeInfo() methods usage with NDArray class.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt Graph class methods to use const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt choose op to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt where op shape method to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt lstsq op to use constant empty shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt matrix_diag_part op shape routine to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt determinant ops to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt mean_pairwssqerr_loss ops to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt ops shape methods.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt shape methods for loss ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt log_loss op shape method.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt shape methods for ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt dilation2d ops shape methods.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted deconv2d ops shape methods.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted dynamicRNN op shape method.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted shape methods for ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted shape methods for lstm layer ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* few updates
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* first cuda tweak
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Adopt constant shapes for sconv2d ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt constant shapes for gru ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt constant shapes with shape methods for segment ops and so on.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted constant shapes with unsorted_segment_* ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted constant shapes with gamma op shape method.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted shape methods of reduce_stddev ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted shape methods for reduce_* ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt shape method for squeeze op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt strided_slice shape method.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored concat op shape method to adopt constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted shape method for mirror_pad op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted split op shape method.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted tile ops shape methods.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added const cast for mkldnn routines handles.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored logSoftMaxForVector_ routine to conform with proper data and shape pointer casts.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cosmetic changes to proper usage of constant pointers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored a couple of shape comparators for strides and addBias helpers to properly use data pointers with the inplace option.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored depthToSpace helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored histogram helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored im2col helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored gather and gatherND helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage on percentile helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed gather shape with helpers and range buffer usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with space to depth helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage and constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with LUP decomposition.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored onehot_ helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored pad and prefix to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored softmax helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed space to batch helpers to use buffers properly.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed stack and split helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with sparse to dense helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with mindistance_ helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with tile helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed constant shape usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed constant shape usage with legacy pairwise bool ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored a couple of methods to adopt constant shape usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed broadcasting with constant shape.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const usage with inplace reverse and constant shapes with legacy reduction.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored legacy ops with const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored sort to adopt constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected sort for constant shape usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed constant shape usage with special methods.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored Context to conform with constant shape usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* CUDA broadcasting headers
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* pairwise/indexreduce/random headers
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Refactored native ops to adopt constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* legacy reduce3/scalar headers
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Corrected pullRow signature and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected routines for proper use of constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored tests to use constant shapes properly.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored legacy ops tests to use constant shapes properly.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored buffer usage with NDArray tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed native ops tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed special concat routine.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with test.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed buffer usage with a test.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored TAD.h and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored calcStrides* routines to use constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed miscellaneous errors with constant shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* NativeOps const changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Corrected definitions for declared functions.
Signed-off-by: shugeo <sgazeos@gmail.com>
* NativeOps const changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* few more const changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Fixed const shapes with shape routines.
Signed-off-by: shugeo <sgazeos@gmail.com>
* few more const changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Fixed shape method for broadcastable case.
Signed-off-by: shugeo <sgazeos@gmail.com>
* few more const changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* xw_plus_b BP shape fn restored
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Fixed signatures with broadcasting.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Repaired backprops shape methods for a set of operations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored broadcast bool for cuda.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored methods for 3 args with const qualifier.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed a couple of kernel signatures for broadcasting.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed kernels signatures for const buffers and shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored pairwise methods to persistent buffers and shapes usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt const to buffers and shapes with kernels.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopt const to buffers and shapes with scalar kernels.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored indexreduce kernels signatures to use const buffers and shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored pairwise kernels to adopt const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored pairwise bool kernels to adopt const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored random special ops to conform with const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored native ops to conform with const shapes and buffers under cuda platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cosmetic changes only.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const shapes and buffers error.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected start pos routine.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored methods to conform with const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored helpers to use proper methods instead.
Signed-off-by: shugeo <sgazeos@gmail.com>
* bunch of changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next bunch of changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next bunch of changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Fixed execScalar declaration.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed execScalar declaration.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected const shape cases with sort and so on.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const shapes for sort.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored kernel declarations to adopt const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed kernels declarations to adopt const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected kernel declarations to adopt const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed kernels declarations to adopt const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed segment helpers kernels declarations and so on to adopt const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const shape usage with segment and solve helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed kernel declaration with adjustWeight helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed cuda implementations for constant shape helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted const shape usage with kernels.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adopted top_k kernels to use const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected kernels declarations to adopt const shapes with helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored NDArray definitions to adopt const shapes and buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const shapes with image suppression helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Slight improvement with buffers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored buffer usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored buffer usage with tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed const shape usage with definitions.
Signed-off-by: shugeo <sgazeos@gmail.com>
* minor updates on cpu side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Refactored const shape usage with ConstantDescriptor and native ops with cuda platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored tear and tile kernels to adopt with const shapes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* softmax_loop fix
Signed-off-by: raver119 <raver119@gmail.com>
* update missing signature
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* softmax again
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* few more missing consts
Signed-off-by: raver119 <raver119@gmail.com>
* new methods updated
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
2020-05-09 08:06:14 +03:00
raver119
57210b936c
Revert "OpenMP Threads execution ( #297 )" ( #299 )
This reverts commit dd2043ef48.
2020-03-09 08:22:49 +03:00
raver119
dd2043ef48
OpenMP Threads execution ( #297 )
* omp threads backported
Signed-off-by: raver119 <raver119@gmail.com>
* omp scalar reduce
Signed-off-by: raver119 <raver119@gmail.com>
* timing
Signed-off-by: raver119 <raver119@gmail.com>
* timing
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* namespace change
Signed-off-by: raver119 <raver119@gmail.com>
* num_threads
Signed-off-by: raver119 <raver119@gmail.com>
* one minor fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-09 08:21:44 +03:00
raver119
63fa3c2ef3
libnd4j polishing ( #273 )
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 12:49:41 +03:00
raver119
5332ace32b
better inplace exec with FastPath ( #280 )
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-28 12:06:30 +03:00
raver119
31e3a2f7a5
transparent conversion to FastPath execution within Graph ( #278 )
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-27 16:10:38 +03:00
raver119
f6442b6724
few minor tweaks ( #272 )
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-25 11:13:23 +03:00
raver119
5d28e6143d
OpContext handling ( #214 )
* nano tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* OpContext tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* OpContext deallocators
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of few mkldnn safety checks
Signed-off-by: raver119 <raver119@gmail.com>
* databuffer setSpecial fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-05 07:27:24 +03:00
raver119
ba961c7601
DataTypes & FlatBuffers ( #197 )
* flatbuffers version upgrade
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers version upgrade java side
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers dependency version upgrade java side
Signed-off-by: raver119 <raver119@gmail.com>
* MKLDNN version upgrade
Signed-off-by: raver119 <raver119@gmail.com>
* DArgs first pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures first pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures second pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures third pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures third pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures fourth pass
Signed-off-by: raver119 <raver119@gmail.com>
* signatures fifth pass
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers UI version upgrade java side
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers ui update
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers downgrade
Signed-off-by: raver119 <raver119@gmail.com>
* flatbuffers downgrade java side
Signed-off-by: raver119 <raver119@gmail.com>
2020-01-30 10:07:24 +03:00
raver119
531a72fabd
execution mode ( #183 )
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* execution mode java side
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* move exec mode to ContextPrototype
Signed-off-by: raver119 <raver119@gmail.com>
* copyrights
Signed-off-by: raver119 <raver119@gmail.com>
2020-01-27 10:00:07 +03:00
raver119
7783012f39
cuDNN integration ( #150 )
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after the cudnn lib came into use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporarily
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
2020-01-20 21:32:46 +03:00
raver119
29e8e09db6
String changes ( #3 )
* initial commit
* additional data types & tensor type
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* missing include
* sparse_to_dense
Signed-off-by: raver119 <raver119@gmail.com>
* few more tests files
Signed-off-by: raver119 <raver119@gmail.com>
* draft
Signed-off-by: raver119 <raver119@gmail.com>
* numeric sparse_to_dense
Signed-off-by: raver119 <raver119@gmail.com>
* comment
Signed-off-by: raver119 <raver119@gmail.com>
* string sparse_to_dense version
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA DataBuffer expand
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks for CUDA build
Signed-off-by: raver119 <raver119@gmail.com>
* shape fn for string_split
Signed-off-by: raver119 <raver119@gmail.com>
* one more comment
Signed-off-by: raver119 <raver119@gmail.com>
* string_split indices
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* test passes
Signed-off-by: raver119 <raver119@gmail.com>
* few rearrangements for databuffer implementations
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer: move inline methods to common implementations
Signed-off-by: raver119 <raver119@gmail.com>
* add native DataBuffer to Nd4j presets
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer creation
Signed-off-by: raver119 <raver119@gmail.com>
* use DataBuffer for allocation
Signed-off-by: raver119 <raver119@gmail.com>
* cpu databuffer as deallocatable
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer setters for buffers
Signed-off-by: raver119 <raver119@gmail.com>
* couple of wrappers
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffers being passed around
Signed-off-by: raver119 <raver119@gmail.com>
* Bunch of ByteBuffer-related signatures gone
Signed-off-by: raver119 <raver119@gmail.com>
* - few more Nd4j signatures removed
- minor fix for bfloat16
Signed-off-by: raver119 <raver119@gmail.com>
* nullptr pointer is still a pointer, but 0 as address :)
Signed-off-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* empty string array init
Signed-off-by: raver119 <raver119@gmail.com>
* one more test in cpp
Signed-off-by: raver119 <raver119@gmail.com>
* memcpy instead of databuffer swap
Signed-off-by: raver119 <raver119@gmail.com>
* special InteropDataBuffer for front-end languages
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks for java
Signed-off-by: raver119 <raver119@gmail.com>
* pointer/indexer actualization
Signed-off-by: raver119 <raver119@gmail.com>
* CustomOp returns list for inputArguments and outputArguments instead of array
Signed-off-by: raver119 <raver119@gmail.com>
* redundant call
Signed-off-by: raver119 <raver119@gmail.com>
* print_variable op
Signed-off-by: raver119 <raver119@gmail.com>
* - view handling (but wrong one)
- print_variable java wrapper
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* - empty arrays handling
Signed-off-by: raver119 <raver119@gmail.com>
* - deserialization works now
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* one more fix
Signed-off-by: raver119 <raver119@gmail.com>
* initial cuda commit
Signed-off-by: raver119 <raver119@gmail.com>
* print_variable message validation
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA views
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA special buffer size
Signed-off-by: raver119 <raver119@gmail.com>
* minor update to match master changes
Signed-off-by: raver119 <raver119@gmail.com>
* - consider arrays always actual on device for CUDA
- additional PrintVariable constructor
- CudaUtf8Buffer now allocates host buffer by default
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - print_variable now allows print from device
Signed-off-by: raver119 <raver119@gmail.com>
* InteropDataBuffer data type fix
Signed-off-by: raver119 <raver119@gmail.com>
* ...
Signed-off-by: raver119 <raver119@gmail.com>
* disable some debug messages
Signed-off-by: raver119 <raver119@gmail.com>
* master pulled in
Signed-off-by: raver119 <raver119@gmail.com>
* couple of new methods for DataBuffer interop
Signed-off-by: raver119 <raver119@gmail.com>
* java side
Signed-off-by: raver119 <raver119@gmail.com>
* offsetted constructor
Signed-off-by: raver119 <raver119@gmail.com>
* new CUDA deallocator
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart 2
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart 3
Signed-off-by: raver119 <raver119@gmail.com>
* - few new tests
- few new methods for DataBuffer management
Signed-off-by: raver119 <raver119@gmail.com>
* few more tests + few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* two failing tests
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* two failing tests pass
Signed-off-by: raver119 <raver119@gmail.com>
* now we pass DataBuffer to legacy ops too
Signed-off-by: raver119 <raver119@gmail.com>
* Native DataBuffer for legacy ops, Java side
Signed-off-by: raver119 <raver119@gmail.com>
* CPU java side update
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA java side update
Signed-off-by: raver119 <raver119@gmail.com>
* no more prepare/register action on java side
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray::prepare/register use now accepts vectors
Signed-off-by: raver119 <raver119@gmail.com>
* InteropDataBuffer now has few more convenience methods
Signed-off-by: raver119 <raver119@gmail.com>
* java bindings update
Signed-off-by: raver119 <raver119@gmail.com>
* tick device in NativeOps
Signed-off-by: raver119 <raver119@gmail.com>
* Corrected usage of OpaqueBuffer for tests.
* Corrected usage of OpaqueBuffer for java tests.
* NativeOpsTests fixes.
* print_variable now returns scalar
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* compat_string_split fix for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* - CUDA execScalar fix
- CUDA lazyAllocateHostPointer now checks java indexer/pointer instead of native pointer
Signed-off-by: raver119 <raver119@gmail.com>
* legacy ops DataBuffer migration prototype
Signed-off-by: raver119 <raver119@gmail.com>
* ignore device shapeinfo coming from java
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* minor transformAny fix
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweak for lazy host allocation
Signed-off-by: raver119 <raver119@gmail.com>
* - DataBuffer::memcpy method
- bitcast now uses memcpy
Signed-off-by: raver119 <raver119@gmail.com>
* - IndexReduce CUDA dimension buffer fix
Signed-off-by: raver119 <raver119@gmail.com>
* views for CPU and CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* less spam
Signed-off-by: raver119 <raver119@gmail.com>
* optional memory init
Signed-off-by: raver119 <raver119@gmail.com>
* async memset
Signed-off-by: raver119 <raver119@gmail.com>
* - SummaryStats CUDA fix
- DataBuffer.sameUnderlyingData() impl
- execBroadcast fix
Signed-off-by: raver119 <raver119@gmail.com>
* - reduce3All fix
- switch to CUDA 10 temporarily
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA version
Signed-off-by: raver119 <raver119@gmail.com>
* proper memory deallocator registration
Signed-off-by: raver119 <raver119@gmail.com>
* HOST_ONLY workspace allocation
Signed-off-by: raver119 <raver119@gmail.com>
* temp commit
Signed-off-by: raver119 <raver119@gmail.com>
* few conflicts resolved
Signed-off-by: raver119 <raver119@gmail.com>
* few minor fixes
Signed-off-by: raver119 <raver119@gmail.com>
* one more minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray permute should operate on JVM primitives
Signed-off-by: raver119 <raver119@gmail.com>
* - create InteropDataBuffer for shapes as well
- update pointers after view creation in Java
Signed-off-by: raver119 <raver119@gmail.com>
* - addressPointer temporarily moved to C++
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA: don't account offset twice
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA: DataBuffer pointer constructor updated
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA NDArray.unsafeDuplication() simplified
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA minor workspace-related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* CPU DataBuffer.reallocate()
Signed-off-by: raver119 <raver119@gmail.com>
* print_affinity op
Signed-off-by: raver119 <raver119@gmail.com>
* print_affinity java side
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA more tweaks for data locality
Signed-off-by: raver119 <raver119@gmail.com>
* - compat_string_split tweak
- CudaUtf8Buffer update
Signed-off-by: raver119 <raver119@gmail.com>
* INDArray.close() mechanic restored
Signed-off-by: raver119 <raver119@gmail.com>
* one more test fixed
Signed-off-by: raver119 <raver119@gmail.com>
* - CUDA DataBuffer.reallocate() updated
- cudaMemcpy (synchronous) restored
Signed-off-by: raver119 <raver119@gmail.com>
* one last fix
Signed-off-by: raver119 <raver119@gmail.com>
* bad import removed
Signed-off-by: raver119 <raver119@gmail.com>
* another small fix
Signed-off-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* fix bad databuffer size
Signed-off-by: raver119 <raver119@gmail.com>
* release primaryBuffer on replace
Signed-off-by: raver119 <raver119@gmail.com>
* higher timeout
Signed-off-by: raver119 <raver119@gmail.com>
* disable timeouts
Signed-off-by: raver119 <raver119@gmail.com>
* dbCreateView now validates offset and length of a view
Signed-off-by: raver119 <raver119@gmail.com>
* additional validation for dbExpand
Signed-off-by: raver119 <raver119@gmail.com>
* restore timeout back again
Signed-off-by: raver119 <raver119@gmail.com>
* smaller distribution for rng test to prevent timeouts
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA DataBuffer::memcpy now copies to device all the time
Signed-off-by: raver119 <raver119@gmail.com>
* OpaqueDataBuffer now contains all required methods for interop
Signed-off-by: raver119 <raver119@gmail.com>
* some javadoc
Signed-off-by: raver119 <raver119@gmail.com>
* GC on failed allocations
Signed-off-by: raver119 <raver119@gmail.com>
* minor memcpy tweak
Signed-off-by: raver119 <raver119@gmail.com>
* one more bitcast test
Signed-off-by: raver119 <raver119@gmail.com>
* - NDArray::deviceId() propagation
- special multi-threaded test for data locality checks
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer additional syncStream
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer additional syncStream
Signed-off-by: raver119 <raver119@gmail.com>
* one ignored test
Signed-off-by: raver119 <raver119@gmail.com>
* skip host alloc for empty arrays
Signed-off-by: raver119 <raver119@gmail.com>
* ByteBuffer support is back
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer::memcpy minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few minor prelu/bp tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* nullify-related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* PReLU fixes (#157 )
Signed-off-by: Alex Black <blacka101@gmail.com>
* Build fixed
* Fix tests
* one more ByteBuffer signature restored
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-jdbc-hsql profiles fix
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-jdbc-hsql profiles fix
Signed-off-by: raver119 <raver119@gmail.com>
* PReLU weight init fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small PReLU fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* - INDArray.migrate() reactivated
- DataBuffer::setDeviceId(...) added
- InteropDataBuffer Z syncToDevice added for views
Signed-off-by: raver119 <raver119@gmail.com>
* missed file
Signed-off-by: raver119 <raver119@gmail.com>
* Small tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* cuda 10.2
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-04 13:27:50 +03:00
raver119
451d9d57fd
shape function override ( #161 )
...
Signed-off-by: raver119 <raver119@gmail.com>
2020-01-04 09:06:44 +03:00
raver119
29990b1214
[WIP] Clang/macOS fixes ( #8412 )
...
* build fix for clang
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* [WIP] clang for jcpp (#53 )
* clang as compiler for jcpp
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* we don't need macos profile
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-17 13:45:58 +03:00
raver119
1780dcc883
[WIP] Small fixes here and there ( #50 )
...
* one range test
Signed-off-by: raver119 <raver119@gmail.com>
* few Context convenience signatures
Signed-off-by: raver119 <raver119@gmail.com>
* one more range test
Signed-off-by: raver119 <raver119@gmail.com>
* "range" "fix"
Signed-off-by: raver119 <raver119@gmail.com>
* adjust_contrast_v2 now allows scale factor to be provided via input_variable
Signed-off-by: raver119 <raver119@gmail.com>
* adjust_contrast now allows scale factor as variable too
Signed-off-by: raver119 <raver119@gmail.com>
* bitcast shape tests
Signed-off-by: raver119 <raver119@gmail.com>
* BitCast import dtype added
Signed-off-by: raver119 <raver119@gmail.com>
* few more BitCast signatures
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-15 17:04:29 +03:00
raver119
1eb3de90d7
[WIP] Platform helpers switches ( #44 )
...
* - platform helpers can be disabled on per-op basis now via Context::allowHelpers
- java has access to it as well
Signed-off-by: raver119 <raver119@gmail.com>
* global platform-helpers trigger
Signed-off-by: raver119 <raver119@gmail.com>
* few signatures renamed
Signed-off-by: raver119 <raver119@gmail.com>
* - few new env variables to follow
- maxThreads/masterThreads differentiation
Signed-off-by: raver119 <raver119@gmail.com>
* Javadoc update
Signed-off-by: raver119 <raver119@gmail.com>
2019-11-14 14:35:02 +03:00
raver119
98e2814879
Platform helpers ( #8216 )
...
* platform helpers draft
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* disable platform cmake
Signed-off-by: raver119 <raver119@gmail.com>
* another draft
Signed-off-by: raver119 <raver119@gmail.com>
* mkldnn convolution refactored
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* one more safety check
Signed-off-by: raver119 <raver119@gmail.com>
* prototype works
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* force static library mode for mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* - ismax fix
- experimental arg fix
- don't enforce openblas on Apple hardware
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of small fixes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* declare concurrent
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - MKLDNN version upgrade to 1.0.2
- avgpool2d/maxpool2d APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - avgpool2d_bp/maxpool2d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - conv2d/batchnorm APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* - lrn/conv2d_bp/conv3d/conv3d_bp APIs update
Signed-off-by: raver119 <raver119@gmail.com>
* all ops converted to MKLDNN 1.x
Signed-off-by: raver119 <raver119@gmail.com>
* bunch of tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* namespace for platform helpers
Signed-off-by: raver119 <raver119@gmail.com>
* make sure platform helpers aren't optimized out
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* build cpu_features on x86 systems
Signed-off-by: raver119 <raver119@gmail.com>
* more of cpu_features
Signed-off-by: raver119 <raver119@gmail.com>
* - mkldnn removed from java
- cpu_features checks in CpuNDArrayFactory
Signed-off-by: raver119 <raver119@gmail.com>
* F16C definition renamed
Signed-off-by: raver119 <raver119@gmail.com>
* some mkldnn rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* check supported instructions before doing anything
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* missed impl
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC option
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool2d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* avgpool3d_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* maxpool bp leaks fixed
Signed-off-by: raver119 <raver119@gmail.com>
* printf removed
Signed-off-by: raver119 <raver119@gmail.com>
* batchnorm fix
Signed-off-by: raver119 <raver119@gmail.com>
* AVX warning/error polishing
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* remove previous MKL-DNN support layer
Signed-off-by: raver119 <raver119@gmail.com>
* avx2 tweak
Signed-off-by: raver119 <raver119@gmail.com>
* allow static for apple
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* exclude mkldnn in one more place
Signed-off-by: raver119 <raver119@gmail.com>
* restore OPENBLAS_PATH use
Signed-off-by: raver119 <raver119@gmail.com>
* add runtime check for avx/avx2 support
Signed-off-by: raver119 <raver119@gmail.com>
* convolution_auto
Signed-off-by: raver119 <raver119@gmail.com>
* Add logic for helper argument
* minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* skip OpTracker props for non-x86 builds
Signed-off-by: raver119 <raver119@gmail.com>
* linux arm isn't x86 :)
Signed-off-by: raver119 <raver119@gmail.com>
* avx-512
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA presets fix
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC
Signed-off-by: raver119 <raver119@gmail.com>
* prefetchw for avx2
Signed-off-by: raver119 <raver119@gmail.com>
* BUILD_PIC again
Signed-off-by: raver119 <raver119@gmail.com>
2019-09-11 21:50:28 +03:00
raver119
c969b724bb
[WIP] more CUDA stuff ( #57 )
...
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Added gradcheck test for dynamic_partition_bp op.
* - implementation of dilation op (cpu and cuda)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed broadcast_dynamic_shape 1D case and tests.
* Fixed usage of default integer arguments.
* Fixed dynamic_partition_bp op and tests.
* Eliminated test with grad check for dynamic_partition_bp op.
* start working on cuda svd - porting available corresponding api from cuSOLVER library
Signed-off-by: Yurii <yurii@skymind.io>
* provide prelu_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - provide gruCell_bp (old version ??)
Signed-off-by: Yurii <yurii@skymind.io>
* - polishing cumsum_bp and cumprod_bp tests
Signed-off-by: Yurii <yurii@skymind.io>
* provide sparseSoftmaxCrossEntropyWithLogits and sparseSoftmaxCrossEntropyWithLogits_grad
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed atomicMul with float input/output
* implementation of cuda kernel for triu_bp operation
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored lup helper to add parallel computing.
* cusolver libraries
Signed-off-by: raver119 <raver119@gmail.com>
* uncomment cuSolver APIs in svd.cu
Signed-off-by: Yurii <yurii@skymind.io>
* cusolver var
Signed-off-by: raver119 <raver119@gmail.com>
* - further work on cuSolver svd
Signed-off-by: Yurii <yurii@skymind.io>
* Implement usage of cuda solver to LUP decomposition.
* - correct names in lup functions
Signed-off-by: Yurii <yurii@skymind.io>
* correct svdQR cuda
Signed-off-by: Yurii <yurii@skymind.io>
* - provide transpositions of input matrices in case of c order in svdCudaQR
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed implementation issues with LUP using cuda solver.
* Implementation of matrix_determinant helper with cuda kernels. Working revision.
* Implemented log_matrix_determinant helper with cuda kernels.
* - implementation of batched cuda svd
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored cholesky helper and implementation of cuda solver cholesky batch.
* - implementation of cuda kernel for tile bp
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cholesky and logdet with cuda kernels.
* - implementation of cuda kernel for sru_bidirectional
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed cholesky helper.
* Cholesky op helper implementation. Working double-based cublas implementation.
* bad import excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Finished with cuda implementation of cholesky helper and tests.
* - implementation of cuda kernel for sru_bidirectional_backprop operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of matrix_inverse op helper with cuda kernels. The first revision.
* - start working on gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of matrix_inverse helper.
* - further work on new gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* cuBLAS related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* calculateOutputShapes() now passes device buffers as well
Signed-off-by: raver119 <raver119@gmail.com>
* special concat/average/accumulate init host pointers now
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* additional CudaDataBufferFactory signatures for certain data types
Signed-off-by: raver119 <raver119@gmail.com>
* cuSolver host buffer
Signed-off-by: raver119 <raver119@gmail.com>
* buffer to buffer memcpy host ptr allocation
Signed-off-by: raver119 <raver119@gmail.com>
2019-07-20 23:05:21 +10:00
skymindops
b5f0ec072f
Eclipse Migration Initial Commit
2019-06-06 15:21:15 +03:00