agibsonccc
6dc7e2f08f
Update c++ copyrights
2021-02-01 21:31:45 +09:00
raver119
ac7fb903d7
C++ rearrangements ( #485 )
* initial commit
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* some minor singleton changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* more iterations
Signed-off-by: raver119 <raver119@gmail.com>
* more singletons updated
Signed-off-by: raver119 <raver119@gmail.com>
* more singletons updated
Signed-off-by: raver119 <raver119@gmail.com>
* more changes
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* CUDA updates
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Java side update
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one commented out test
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
2020-06-06 15:26:55 +03:00
raver119
0613485654
compression ops ( #436 )
* Added declarations for decode/encode_bitmap ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added implementation for bitmap encoding/decoding ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added helpers for encode/decode bitmap ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored encodingBitmap helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* threshold encode/decode skeleton
* helper skeleton
* minor import fix
* encoder shape fn & op impl
* thresholdEncode cpu impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* thresholdDecode cpu impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Only cosmetic changes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* placeholder
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Added cuda implementation for bitmap decode helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* cuda thresholdEstimate
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* cuda thresholdDecode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* next step
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - nano cmakelist update (get rid of Clion section)
- fixed forgotten throw in AtomicTests
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* thresholdEncode cuda impl
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* Added tests for bitmap encoding/decoding ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed tests for encode/decode bitmaps.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored decode/encode helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed crashes with bitmap decode/encode helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* bitmap encode/decode CPU
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* bitmap encode/decode CUDA
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* C API removed for threshold/bitmap encode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeBitmap/DecodeBitmap Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeThreshold/DecodeThreshold Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* EncodeThreshold/DecodeThreshold Java side
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* few more tests for threshold encoding
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* minor test tweak
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* two special tests
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* encodeBitmap CPU fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* parallel_long/parallel_double proper spans fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* encodeThreshold CUDA fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* nano fix
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* grid tweaks
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* RTX adaptation for thresholdEncode
Signed-off-by: raver119 <raver119@gmail.com>
* don't allow threshold encoding for length < 2
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* get rid of NDArrayCompressor in EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more minor update of EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* one more minor tweak of EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - matmul allows integer data types use
- EncodingHandler boundary default value
- few tests for integer matmul
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* minor fix of CUDA bitmap encode
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* boundary changed to integer everywhere
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* boundary changed to integer everywhere
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* re-enable CUDA deallocator
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* threshold encoder fix for systems without omp
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - encode_threshold now requires non-negative boundary
- minor tweak in EncodingHandler
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* restore parallelism in decode_bitmap
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* fall back to omp for encode_bitmap cpu
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* single time casts
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - additional test for encode_threshold
- sync buffers to device before calling for shape function
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: shugeo <sgazeos@gmail.com>
2020-05-08 20:59:39 +03:00
raver119
57210b936c
Revert "OpenMP Threads execution ( #297 )" ( #299 )
This reverts commit dd2043ef48.
2020-03-09 08:22:49 +03:00
raver119
dd2043ef48
OpenMP Threads execution ( #297 )
* omp threads backported
Signed-off-by: raver119 <raver119@gmail.com>
* omp scalar reduce
Signed-off-by: raver119 <raver119@gmail.com>
* timing
Signed-off-by: raver119 <raver119@gmail.com>
* timing
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* namespace change
Signed-off-by: raver119 <raver119@gmail.com>
* num_threads
Signed-off-by: raver119 <raver119@gmail.com>
* one minor fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-09 08:21:44 +03:00
raver119
63fa3c2ef3
libnd4j polishing ( #273 )
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 12:49:41 +03:00
Abdelrauf
bead656feb
Initial performance improvement for Bias Add and etc #8556 ( #217 )
* Initial performance improvement for Bias Add, loop coords helpers and increment aligned parallel threading
Signed-off-by: AbdelRauf <rauf@konduit.ai>
* One more test for Rauf
Signed-off-by: raver119 <raver119@gmail.com>
* disable couple of perf tests
Signed-off-by: raver119 <raver119@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
2020-02-08 15:31:30 +03:00
raver119
8b877a8ddf
- 3d loops parallelism fix ( #135 )
- additional check for maxMasterThreads <= maxThreads
Signed-off-by: raver119 <raver119@gmail.com>
2019-12-19 16:50:08 +03:00
raver119
6de00bf75f
[WIP] Weekly update of repo ( #8390 )
* [WIP] Fix compilation after nd4j changes (#37 )
* Fix compilation.
* Some tests fixed
* Disable tests temporarily.
* Restored test
* Tests restored.
* Test restored.
* [WIP] perf tests (#40 )
* special maxpool test
Signed-off-by: raver119 <raver119@gmail.com>
* special maxpool test
Signed-off-by: raver119 <raver119@gmail.com>
* Shyrma bnorm bp (#41 )
Batchnorm backprop mkldnn
* Add SameDiff memory reuse memory manager (array cache) (#39 )
* Attention op comments
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ArrayCacheMemoryMgr - first pass
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tweak array cache for use with SameDiff identity arrays
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ArrayCacheMemoryMgr javadoc and properly get max memory
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* LRU cache policy + add tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Resize arrays internally if required for ArrayCacheMemoryMgr
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test improvement
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small polish
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* SameDiff op runtime benchmarking listener (#42 )
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* INLINE_LOOPS for windows
Signed-off-by: raver119 <raver119@gmail.com>
* [WIP] ThreadPool (#8 )
This PR removes OpenMP use in 95% of cases
2019-11-13 17:15:18 +03:00