* - provide faster index2coords function for cpu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - new faster index2coords function is introduced into cpu code
Signed-off-by: Yurii <iuriish@yahoo.com>
* - replace long long coordinates with int coordinates
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add missing overload of coords2index function
Signed-off-by: Yurii <iuriish@yahoo.com>
* - restart jenkins
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rollback changes in convolutions.cu and addBias.cu
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling of stack and unstack ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fix bug in cpu concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correction of cuda stack and unstack
Signed-off-by: Yurii <iuriish@yahoo.com>
* - change shape.h method which operates on strides of unit dimensions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - rearrange stack tests
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct evaluation of smallest stride for moving through contiguous axis
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to update signature of function strideOverContigAxis in cuda concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove ShapeUtils::shapeAsString calls made before input array validation
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further removal of ShapeUtils::shapeAsString
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take sub-array shapeInfo/offset calculation out of NDArray class
- allow contiguous memory copy in execTransformAny op when opNum == assign
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct test_empty_scatter_2 in EmptyTests.cpp
Signed-off-by: Yurii <iuriish@yahoo.com>
* - profiling of slice op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of contiguous memcpy for some cases in concat and split ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to declare void nd4j::SpecialMethods<T>::splitCpuGeneric
Signed-off-by: Yurii <iuriish@yahoo.com>
* - correct typo in calculation of threads in cuda split op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - forgot to correct another set of threads variables in split cuda ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further conflict resolution
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling of concat op (both cuda and cpu)
Signed-off-by: Yurii <iuriish@yahoo.com>
* better comparison for large concat
Signed-off-by: raver119 <raver119@gmail.com>
* - further improvement of concat op
Signed-off-by: Yurii <iuriish@yahoo.com>
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* - add ability to check for trailing unit dimensions in shape and set strides/ews accordingly
- restrict second simple case in concat op to c order only
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move concat op to specials_single.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
* - get rid of second concat op declaration in transforms.cpp file
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* local memory for concat
Signed-off-by: raver119 <raver119@gmail.com>
* fixed grid size for concat
Signed-off-by: raver119 <raver119@gmail.com>
* fixed grid size for concat
Signed-off-by: raver119 <raver119@gmail.com>
* test commented out
Signed-off-by: raver119 <raver119@gmail.com>
* - profiling bias_add op
- add some documentation
Signed-off-by: Yurii <yurii@skymind.io>
* - minor change
Signed-off-by: Yurii <yurii@skymind.io>
* - provide addBias cuda kernel
Signed-off-by: Yurii <yurii@skymind.io>
* - improve shape::getIndexOffset and change its signature
Signed-off-by: Yurii <yurii@skymind.io>
* - same as previous
Signed-off-by: Yurii <yurii@skymind.io>
* - improve and change signatures of some shape:: functions that calculate offsets for array elements
Signed-off-by: Yurii <yurii@skymind.io>
* - minor changes in flatten
Signed-off-by: Yurii <shyrma@skymind.io>
* - add function shape::getIndexOffsetOrdered
Signed-off-by: Yurii <shyrma@skymind.io>
* - correct shape::getIndexOffsetOrdered()
Signed-off-by: Yurii <shyrma@skymind.io>
* - move getIndexOffsetOrdered to flatten.h header in order to isolate this function
Signed-off-by: Yurii <shyrma@skymind.io>
* - correct cuda concat
Signed-off-by: Yurii <yurii@skymind.io>
* - pooling 2d/3d: account for the case when input and gradI have different strides
Signed-off-by: Yurii <yurii@skymind.io>
* master pulled in
Signed-off-by: raver119 <raver119@gmail.com>
* floordiv_bp test reverted
Signed-off-by: raver119 <raver119@gmail.com>
* - add NDArray::printLinearBuffer method
Signed-off-by: Yurii <yurii@skymind.io>