Commit Graph

6 Commits (25db3a44f17279d4c53e8619f8061c6918373092)

Author SHA1 Message Date
Yurii Shyrma 425c747330 - permute threadsPerBlock and blocksPerGrid in signature of launching of cuda kernel for trueBroadcast op (#120)
Signed-off-by: Yurii <iuriish@yahoo.com>
2019-12-09 20:08:36 +03:00
raver119 ee5d25caa9 cuda broadcast exec fix
Signed-off-by: raver119 <raver119@gmail.com>
2019-12-09 11:17:16 +03:00
raver119 cea68c18f1 cuda broadcast fix
Signed-off-by: raver119 <raver119@gmail.com>
2019-12-09 09:27:50 +03:00
raver119 064a56ccf1
Few fixes (#66)
* skip legacy transforms execution in case of empty input arrays

Signed-off-by: raver119 <raver119@gmail.com>

* - BroadcastBool ops now accept extraParams to make MatchCondition possible
- TrueBroadcastHelper now uses samediff::threads

Signed-off-by: raver119 <raver119@gmail.com>

* java side

Signed-off-by: raver119 <raver119@gmail.com>

* trigger jenkins

Signed-off-by: raver119 <raver119@gmail.com>

* update LessThanOrEqual opNum mapping

Signed-off-by: raver119 <raver119@gmail.com>

* update LessThanOrEqual opNum mapping

Signed-off-by: raver119 <raver119@gmail.com>
2019-11-21 15:43:03 +03:00
raver119 6de00bf75f
[WIP] Weekly update of repo (#8390)
* [WIP] Fix compilation after nd4j changes (#37)

* Fix compilation.

* Some tests fixed

* Disable tests temporarily.

* Restored test

* Tests restored.

* Test restored.

* [WIP] perf tests (#40)

* special maxpool test

Signed-off-by: raver119 <raver119@gmail.com>

* special maxpool test

Signed-off-by: raver119 <raver119@gmail.com>

* Shyrma bnorm bp (#41)

Batchnorm backprop mkldnn

* Add SameDiff memory reuse memory manager (array cache) (#39)

* Attention op comments

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr - first pass

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tweak array cache for use with SameDiff identity arrays

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr javadoc and properly get max memory

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* LRU cache policy + add tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Resize arrays internally if required for ArrayCacheMemoryMgr

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test improvement

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Small polish

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* SameDiff op runtime benchmarking listener (#42)

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* INLINE_LOOPS for windows

Signed-off-by: raver119 <raver119@gmail.com>

* [WIP] ThreadPool (#8)

This PR removes OpenMP use in 95% of cases
2019-11-13 17:15:18 +03:00
raver119 44a8d19ac6
[WIP] Broadcast changes (#8257)
* - provide correct call NDArray::applyBroadcast inside of NDArray::applyTrueBroadcast

Signed-off-by: Yurii <yurii@skymind.io>

* - provide new trueBroadcast helper

Signed-off-by: Yurii <yurii@skymind.io>

* example for yurii

Signed-off-by: raver119 <raver119@gmail.com>

* - provide new trueBroadcast helper for cpu

Signed-off-by: Yurii <yurii@skymind.io>

* - start working on new trueBroadcat helper for cuda

Signed-off-by: Yurii <yurii@skymind.io>

* - further work on trueBroadcast for cuda

Signed-off-by: Yurii <yurii@skymind.io>

* - fix bugs in cuda helper trueBroadcast

Signed-off-by: Yurii <yurii@skymind.io>
2019-10-01 09:10:19 +03:00