raver119
c396fcb960
More pre-release fixes (#456)
* - numPrefixBlocks fix for threshold_encoding
- temparrays pointers fixed
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* auto configuration of memory workspace for gradients sharing
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* limit sparse encoding message size
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more workspace test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more CUDA-specific test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more CUDA-specific workspace test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more CUDA-specific workspace test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more CUDA-specific workspace test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* add separate host/device reset for circular workspace mode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* new PW builder method for encoder memory amount
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* "inplace" execution for threshold encoding
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
2020-05-13 08:12:07 +03:00
raver119
0613485654
compression ops (#436)
* Added declarations for decode/encode_bitmap ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added implementation for bitmap encoding/decoding ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added helpers for encode/decode bitmap ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored encodingBitmap helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* threshold encode/decode skeleton
* helper skeleton
* minor import fix
* encoder shape fn & op impl
* thresholdEncode cpu impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* thresholdDecode cpu impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Only cosmetic changes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* placeholder
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Added cuda implementation for bitmap decode helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* cuda thresholdEstimate
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* cuda thresholdDecode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - nano cmakelist update (get rid of Clion section)
- fixed forgotten throw in AtomicTests
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* thresholdEncode cuda impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Added tests for bitmap encoding/decoding ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed tests for encode/decode bitmaps.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored decode/encode helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed crashes with bitmap decode/encode helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* bitmap encode/decode CPU
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* bitmap encode/decode CUDA
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* C API removed for threshold/bitmap encode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeBitmap/DecodeBitmap Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeThreshold/DecodeThreshold Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeThreshold/DecodeThreshold Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* few more tests for threshold encoding
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* minor test tweak
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* two special tests
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* encodeBitmap CPU fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* parallel_long/parallel_double proper spans fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* encodeThreshold CUDA fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* nano fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* grid tweaks
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* RTX adaptation for thresholdEncode
Signed-off-by: raver119 <raver119@gmail.com>
* don't allow threshold encoding for length < 2
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* get rid of NDArrayCompressor in EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more minor update of EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more minor tweak of EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - matmul allows integer data types use
- EncodingHandler boundary default value
- few tests for integer matmul
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* minor fix of CUDA bitmap encode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* boundary changed to integer everywhere
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* boundary changed to integer everywhere
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* re-enable CUDA deallocator
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* threshold encoder fix for systems without omp
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - encode_threshold now requires non-negative boundary
- minor tweak in EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* restore parallelism in decode_bitmap
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* fall back to omp for encode_bitmap cpu
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* single time casts
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - additional test for encode_threshold
- sync buffers to device before calling for shape function
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
2020-05-08 20:59:39 +03:00
raver119
3e2dbc65dd
MatMul for gemm/gemv calls (#365)
* libnd4j added optional alpha and beta support to matmul
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j typos fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j add optional alpha and beta to matmul_bp
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j one more typo fix
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j added optional alpha and beta to mkl implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* MatMul alpha/beta on java side
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in libnd4j
Signed-off-by: raver119 <raver119@gmail.com>
* alpha/beta fix in matmul_bp
Signed-off-by: raver119 <raver119@gmail.com>
* restored view validation
Signed-off-by: raver119 <raver119@gmail.com>
* gemv/gemm now use MatMul op
Signed-off-by: raver119 <raver119@gmail.com>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* additional INDArray.mmul signature
Signed-off-by: raver119 <raver119@gmail.com>
* make C order default for INDArray.mmul, unless both A/B have F order
Signed-off-by: raver119 <raver119@gmail.com>
* Nd4j.gemm validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable mkldnn matmul for xxf with beta != 0 case
Signed-off-by: raver119 <raver119@gmail.com>
* SimpleRnn workspace fix + timeouts
Signed-off-by: Alex Black <blacka101@gmail.com>
* two more tests + minor fix in matmul platform check
Signed-off-by: raver119 <raver119@gmail.com>
* Flaky test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* propagate testresources profile
Signed-off-by: raver119 <raver119@gmail.com>
* Resources fix + flaky test fix
Signed-off-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
2020-04-10 17:57:02 +03:00
Oleh
c3223dbc7a
Improve ResultSet usage in libnd4j (#281)
* libnd4j profiling: DeclarableOp and tests now return a ResultSet instance instead of a ResultSet pointer
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j profiling semantic change in tests cases
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j some corrections to make the new ResultSet semantics work, fixed one test
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j more tests fixes
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* - correct copy and move assignment operators of ResultSet class
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
2020-03-10 07:42:50 +03:00
raver119
63fa3c2ef3
libnd4j polishing (#273)
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 12:49:41 +03:00
raver119
9e3c1b02b1
Perf improvements (#242)
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* better ExpandDims impl
Signed-off-by: raver119 <raver119@gmail.com>
* better Squeeze impl
Signed-off-by: raver119 <raver119@gmail.com>
* better Softmax impl
Signed-off-by: raver119 <raver119@gmail.com>
* one test disabled
Signed-off-by: raver119 <raver119@gmail.com>
* more accurate impl
Signed-off-by: raver119 <raver119@gmail.com>
* - GraphProfiler now prints full shapeInfo instead of shape
- softmax typo fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-14 16:20:31 +03:00
raver119
9bb5798cac
Null arrays fix (#208)
* don't skip null arrays
Signed-off-by: raver119 <raver119@gmail.com>
* one test tweak
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-02 23:14:00 +03:00
raver119
29e8e09db6
String changes (#3)
* initial commit
* additional data types & tensor type
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* missing include
* sparse_to_dense
Signed-off-by: raver119 <raver119@gmail.com>
* few more tests files
Signed-off-by: raver119 <raver119@gmail.com>
* draft
Signed-off-by: raver119 <raver119@gmail.com>
* numeric sparse_to_dense
Signed-off-by: raver119 <raver119@gmail.com>
* comment
Signed-off-by: raver119 <raver119@gmail.com>
* string sparse_to_dense version
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA DataBuffer expand
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks for CUDA build
Signed-off-by: raver119 <raver119@gmail.com>
* shape fn for string_split
Signed-off-by: raver119 <raver119@gmail.com>
* one more comment
Signed-off-by: raver119 <raver119@gmail.com>
* string_split indices
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* test passes
Signed-off-by: raver119 <raver119@gmail.com>
* few rearrangements for databuffer implementations
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer: move inline methods to common implementations
Signed-off-by: raver119 <raver119@gmail.com>
* add native DataBuffer to Nd4j presets
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer creation
Signed-off-by: raver119 <raver119@gmail.com>
* use DataBuffer for allocation
Signed-off-by: raver119 <raver119@gmail.com>
* cpu databuffer as deallocatable
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer setters for buffers
Signed-off-by: raver119 <raver119@gmail.com>
* couple of wrappers
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffers being passed around
Signed-off-by: raver119 <raver119@gmail.com>
* Bunch of ByteBuffer-related signatures gone
Signed-off-by: raver119 <raver119@gmail.com>
* - few more Nd4j signatures removed
- minor fix for bfloat16
Signed-off-by: raver119 <raver119@gmail.com>
* nullptr pointer is still a pointer, but 0 as address :)
Signed-off-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* empty string array init
Signed-off-by: raver119 <raver119@gmail.com>
* one more test in cpp
Signed-off-by: raver119 <raver119@gmail.com>
* memcpy instead of databuffer swap
Signed-off-by: raver119 <raver119@gmail.com>
* special InteropDataBuffer for front-end languages
Signed-off-by: raver119 <raver119@gmail.com>
* few tweaks for java
Signed-off-by: raver119 <raver119@gmail.com>
* pointer/indexer actualization
Signed-off-by: raver119 <raver119@gmail.com>
* CustomOp returns list for inputArguments and outputArguments instead of array
Signed-off-by: raver119 <raver119@gmail.com>
* redundant call
Signed-off-by: raver119 <raver119@gmail.com>
* print_variable op
Signed-off-by: raver119 <raver119@gmail.com>
* - view handling (but wrong one)
- print_variable java wrapper
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* - empty arrays handling
Signed-off-by: raver119 <raver119@gmail.com>
* - deserialization works now
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* one more fix
Signed-off-by: raver119 <raver119@gmail.com>
* initial cuda commit
Signed-off-by: raver119 <raver119@gmail.com>
* print_variable message validation
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA views
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA special buffer size
Signed-off-by: raver119 <raver119@gmail.com>
* minor update to match master changes
Signed-off-by: raver119 <raver119@gmail.com>
* - consider arrays always actual on device for CUDA
- additional PrintVariable constructor
- CudaUtf8Buffer now allocates host buffer by default
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - print_variable now allows print from device
Signed-off-by: raver119 <raver119@gmail.com>
* InteropDataBuffer data type fix
Signed-off-by: raver119 <raver119@gmail.com>
* ...
Signed-off-by: raver119 <raver119@gmail.com>
* disable some debug messages
Signed-off-by: raver119 <raver119@gmail.com>
* master pulled in
Signed-off-by: raver119 <raver119@gmail.com>
* couple of new methods for DataBuffer interop
Signed-off-by: raver119 <raver119@gmail.com>
* java side
Signed-off-by: raver119 <raver119@gmail.com>
* offsetted constructor
Signed-off-by: raver119 <raver119@gmail.com>
* new CUDA deallocator
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart 2
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA backend torn apart 3
Signed-off-by: raver119 <raver119@gmail.com>
* - few new tests
- few new methods for DataBuffer management
Signed-off-by: raver119 <raver119@gmail.com>
* few more tests + few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* two failing tests
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* two failing tests pass
Signed-off-by: raver119 <raver119@gmail.com>
* now we pass DataBuffer to legacy ops too
Signed-off-by: raver119 <raver119@gmail.com>
* Native DataBuffer for legacy ops, Java side
Signed-off-by: raver119 <raver119@gmail.com>
* CPU java side update
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA java side update
Signed-off-by: raver119 <raver119@gmail.com>
* no more prepare/register action on java side
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray::prepare/register use now accepts vectors
Signed-off-by: raver119 <raver119@gmail.com>
* InteropDataBuffer now has few more convenience methods
Signed-off-by: raver119 <raver119@gmail.com>
* java bindings update
Signed-off-by: raver119 <raver119@gmail.com>
* tick device in NativeOps
Signed-off-by: raver119 <raver119@gmail.com>
* Corrected usage of OpaqueBuffer for tests.
* Corrected usage of OpaqueBuffer for java tests.
* NativeOpsTests fixes.
* print_variable now returns scalar
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* compat_string_split fix for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* - CUDA execScalar fix
- CUDA lazyAllocateHostPointer now checks java indexer/pointer instead of native pointer
Signed-off-by: raver119 <raver119@gmail.com>
* legacy ops DataBuffer migration prototype
Signed-off-by: raver119 <raver119@gmail.com>
* ignore device shapeinfo coming from java
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* minor transformAny fix
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweak for lazy host allocation
Signed-off-by: raver119 <raver119@gmail.com>
* - DataBuffer::memcpy method
- bitcast now uses memcpy
Signed-off-by: raver119 <raver119@gmail.com>
* - IndexReduce CUDA dimension buffer fix
Signed-off-by: raver119 <raver119@gmail.com>
* views for CPU and CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* less spam
Signed-off-by: raver119 <raver119@gmail.com>
* optional memory init
Signed-off-by: raver119 <raver119@gmail.com>
* async memset
Signed-off-by: raver119 <raver119@gmail.com>
* - SummaryStats CUDA fix
- DataBuffer.sameUnderlyingData() impl
- execBroadcast fix
Signed-off-by: raver119 <raver119@gmail.com>
* - reduce3All fix
- switch to CUDA 10 temporarily
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA version
Signed-off-by: raver119 <raver119@gmail.com>
* proper memory deallocator registration
Signed-off-by: raver119 <raver119@gmail.com>
* HOST_ONLY workspace allocation
Signed-off-by: raver119 <raver119@gmail.com>
* temp commit
Signed-off-by: raver119 <raver119@gmail.com>
* few conflicts resolved
Signed-off-by: raver119 <raver119@gmail.com>
* few minor fixes
Signed-off-by: raver119 <raver119@gmail.com>
* one more minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* NDArray permute should operate on JVM primitives
Signed-off-by: raver119 <raver119@gmail.com>
* - create InteropDataBuffer for shapes as well
- update pointers after view creation in Java
Signed-off-by: raver119 <raver119@gmail.com>
* - addressPointer temporary moved to C++
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA: don't account offset twice
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA: DataBuffer pointer constructor updated
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA NDArray.unsafeDuplication() simplified
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA minor workspace-related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* CPU DataBuffer.reallocate()
Signed-off-by: raver119 <raver119@gmail.com>
* print_affinity op
Signed-off-by: raver119 <raver119@gmail.com>
* print_affinity java side
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA more tweaks for data locality
Signed-off-by: raver119 <raver119@gmail.com>
* - compat_string_split tweak
- CudaUtf8Buffer update
Signed-off-by: raver119 <raver119@gmail.com>
* INDArray.close() mechanic restored
Signed-off-by: raver119 <raver119@gmail.com>
* one more test fixed
Signed-off-by: raver119 <raver119@gmail.com>
* - CUDA DataBuffer.reallocate() updated
- cudaMemcpy (synchronous) restored
Signed-off-by: raver119 <raver119@gmail.com>
* one last fix
Signed-off-by: raver119 <raver119@gmail.com>
* bad import removed
Signed-off-by: raver119 <raver119@gmail.com>
* another small fix
Signed-off-by: raver119 <raver119@gmail.com>
* one special test
Signed-off-by: raver119 <raver119@gmail.com>
* fix bad databuffer size
Signed-off-by: raver119 <raver119@gmail.com>
* release primaryBuffer on replace
Signed-off-by: raver119 <raver119@gmail.com>
* higher timeout
Signed-off-by: raver119 <raver119@gmail.com>
* disable timeouts
Signed-off-by: raver119 <raver119@gmail.com>
* dbCreateView now validates offset and length of a view
Signed-off-by: raver119 <raver119@gmail.com>
* additional validation for dbExpand
Signed-off-by: raver119 <raver119@gmail.com>
* restore timeout back again
Signed-off-by: raver119 <raver119@gmail.com>
* smaller distribution for rng test to prevent timeouts
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA DataBuffer::memcpy now copies to device all the time
Signed-off-by: raver119 <raver119@gmail.com>
* OpaqueDataBuffer now contains all required methods for interop
Signed-off-by: raver119 <raver119@gmail.com>
* some javadoc
Signed-off-by: raver119 <raver119@gmail.com>
* GC on failed allocations
Signed-off-by: raver119 <raver119@gmail.com>
* minor memcpy tweak
Signed-off-by: raver119 <raver119@gmail.com>
* one more bitcast test
Signed-off-by: raver119 <raver119@gmail.com>
* - NDArray::deviceId() propagation
- special multi-threaded test for data locality checks
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer additional syncStream
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer additional syncStream
Signed-off-by: raver119 <raver119@gmail.com>
* one ignored test
Signed-off-by: raver119 <raver119@gmail.com>
* skip host alloc for empty arrays
Signed-off-by: raver119 <raver119@gmail.com>
* ByteBuffer support is back
Signed-off-by: raver119 <raver119@gmail.com>
* DataBuffer::memcpy minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few minor prelu/bp tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* nullify-related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* PReLU fixes (#157)
Signed-off-by: Alex Black <blacka101@gmail.com>
* Build fixed
* Fix tests
* one more ByteBuffer signature restored
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-jdbc-hsql profiles fix
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-jdbc-hsql profiles fix
Signed-off-by: raver119 <raver119@gmail.com>
* PReLU weight init fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small PReLU fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* - INDArray.migrate() reactivated
- DataBuffer::setDeviceId(...) added
- InteropDataBuffer Z syncToDevice added for views
Signed-off-by: raver119 <raver119@gmail.com>
* missed file
Signed-off-by: raver119 <raver119@gmail.com>
* Small tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* cuda 10.2
Signed-off-by: raver119 <raver119@gmail.com>
* minor fix
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-04 13:27:50 +03:00