Commit Graph

178 Commits (0cf4a45573d79d846cb2053664c981223a10ed80)

Author SHA1 Message Date
Shams Ul Azeem 9c77bfa85f
Support for more numpy datatypes (#241)
* Adding more datatypes support in datavec-python

* Using numpy C API for creating numpy arrays

* Adding parameterized tests

* Adding support for BFLOAT16 (by converting it to FLOAT)

* Cleanup

* Using casting instead of creating an array

* Giving out a warning while casting array from BFLOAT16 to FLOAT

* Add syncToPrimary and syncToSpecial methods to BaseDataBuffer

Signed-off-by: Alex Black <blacka101@gmail.com>

* Python exec: sync to host before passing pointers

Signed-off-by: Alex Black <blacka101@gmail.com>

* Added copyright header

* use np api (#267)

* python exec / numpy - check object type before cast (#268)

* use np api

* verify object before cast

* fix cong

* cuda fix

* inplace test + tiny fix

* more test

* fix double alloc

* rem tags

* fix cuda check

* Fix implicit CUDA dependency in datavec-python tests; remove new method, add test

Signed-off-by: Alex Black <blacka101@gmail.com>

Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Fariz Rahman <farizrahman4u@gmail.com>
2020-03-19 00:48:37 +11:00
Alex Black 2cd4522f94
Add updater tests/validation (#319)
Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-16 10:35:15 +03:00
Yurii Shyrma e42b4e96c3
correct output empty shapes deducing in split op (#311)
* - correct output empty shapes deducing in split op

Signed-off-by: Yurii <iuriish@yahoo.com>

* java test fixed

Signed-off-by: raver119 <raver119@gmail.com>

* - split broadcast::exec function on individual functions corresponding to switch arg

Signed-off-by: Yurii <iuriish@yahoo.com>

* - split broadcast::exec _int and _bool function on individual functions corresponding to switch arg

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-03-12 18:25:54 +03:00
raver119 57210b936c
Revert "OpenMP Threads execution (#297)" (#299)
This reverts commit dd2043ef48.
2020-03-09 08:22:49 +03:00
raver119 dd2043ef48
OpenMP Threads execution (#297)
* omp threads backported

Signed-off-by: raver119 <raver119@gmail.com>

* omp scalar reduce

Signed-off-by: raver119 <raver119@gmail.com>

* timing

Signed-off-by: raver119 <raver119@gmail.com>

* timing

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* namespace change

Signed-off-by: raver119 <raver119@gmail.com>

* num_threads

Signed-off-by: raver119 <raver119@gmail.com>

* one minor fix

Signed-off-by: raver119 <raver119@gmail.com>
2020-03-09 08:21:44 +03:00
Alex Black a80fb99a5f
DL4J integrations tests updates + add SameDiff support (#298)
* Revive and start updating DL4J integration tests

Signed-off-by: Alex Black <blacka101@gmail.com>

* Add SameDiff support - first pass

Signed-off-by: Alex Black <blacka101@gmail.com>

* SameDiff test case generation

Signed-off-by: Alex Black <blacka101@gmail.com>

* SameDiff integration tests polishing

Signed-off-by: Alex Black <blacka101@gmail.com>

* More SameDiff integration test fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Final polish

Signed-off-by: Alex Black <blacka101@gmail.com>

* Small test tweak

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-07 22:44:41 +11:00
Alex Black e6a7b94fe4
Loss namespace (#294)
* codegen for SDLoss. WIP.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* first pass of SDLoss.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* wip. Firsat cut of new op constructors. UNTESTED , NOT COMPILED YET.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* updated op signatures.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* add NDLoss tests.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* fix test.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* adds loss default params. factory.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* Regenerate NDLoss

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* adds tests for null weights.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* Last few tweaks

Signed-off-by: Alex Black <blacka101@gmail.com>

Co-authored-by: Robert Altena <Rob@Ra-ai.com>
2020-03-06 16:07:22 +11:00
Alex Black 19d5a8d49d
Various fixes (#290)
* Add check to ensure ALL tests extend BaseND4JTest for proper timeouts + logging

Signed-off-by: Alex Black <blacka101@gmail.com>

* Add 'must extend BaseDL4JTest' check for deeplearning4j-core

Signed-off-by: Alex Black <blacka101@gmail.com>

* Flush logging on workspace exit during tests

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-06 00:02:32 +11:00
raver119 0f581e74e3 one small test rearrangement
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 19:52:11 +03:00
Alex Black 483c3d7b8c
Assorted SameDiff/DL4J fixes (#279)
* #8565 Normalizer toString/hashcode

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8731 ImagePreProcessingScaler lables/segmentation fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8691 Fix SameDiffLayer/Vertx finetuning and parameter setting support

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8663 DL4J embedding layer weight init - don't depend on vocab size

Signed-off-by: Alex Black <blacka101@gmail.com>

* EmbeddingLayer test tweak

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-02 16:15:49 +11:00
shugeo 330a69d4e2
Shugeo solve ls (#203)
* lstsq op. Initial commit.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Least squares linear problem solve op (lstsq). Cpu draft implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed shape routine and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added test for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Rectification for lstsq op implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected test to avoid numerical inconsistensy.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added prints for check computing.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected tests to use evalueate facility instead.

Signed-off-by: shugeo <sgazeos@gmail.com>

* CPU implementation of MatrixSolveLs op and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added cuda implementation for helpers with lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored tests for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added processing for empty inputs.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Merged tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored lstsq op for fast case.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed some issues with solve.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed lstsq op to avoid erros.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added kernel for giagonal factor

Signed-off-by: shugeo <sgazeos@gmail.com>

* lstsq wrapper and triangular_solve fixed

* Added proper processing empty inputs and test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* SequenceMask test

* Build fixed

* Added proper processing of empty inputs with solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Mapping added

* Added check of input shapes with solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added a couple of tests for lstsq op and minor changes with cuda helper for one.'

Signed-off-by: shugeo <sgazeos@gmail.com>

* Tests on

* Refactored test for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed test

* Added another approach for lstsq op aka solve_ls.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Finished cpu part for solve_ls op helpers.

* Added helper for low triangular matrix inversion.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored alternate solve_ls cpu implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Removed alternate approach for solve_ls op. Added multithreading with matrix inversion.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Assert fixed

* Refactored multithreading for inverse matricies.

Signed-off-by: shugeo <sgazeos@gmail.com>

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-02-28 11:37:26 +03:00
Yurii Shyrma f7a9190407
profiling of concat op (both cuda and cpu) (#151)
* - profiling of concat op (both cuda and cpu)

Signed-off-by: Yurii <iuriish@yahoo.com>

* better comparison for large concat

Signed-off-by: raver119 <raver119@gmail.com>

* - further improving of concat op

Signed-off-by: Yurii <iuriish@yahoo.com>

* some loggin

Signed-off-by: raver119 <raver119@gmail.com>

* - add possibility to verify presence of trailing unities in shape and set strides/ews correspondingly
- restrict second simple case in concat op to c order only

Signed-off-by: Yurii <iuriish@yahoo.com>

* - move concat op to specials_single.cpp file

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of second concat op declaration in transforms.cpp file

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-02-20 21:19:01 +03:00
Alexander Stoyakin 4206171b70
Ignored tests (#243) 2020-02-14 09:27:46 +03:00
Yurii Shyrma fe47f52896
Oleh tenzor mmul (#231)
* Libnd4j: TensorMMul backprop op #8174, raw implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 merge master and some corrections

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 algorithm update, need testing, sync with  master

* Libnd4j: TensorMMul backprop op #8174 fixed incorrect B axes calculation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 optimize axes identification and fix bug of indeces overlapping, added first test. need testing with different shapes

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 some fixes and improvements need more testing

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 fixed order of matrix multiply

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 fixed issue of incorrect axes definition, add tests based on TF, need additional testing for case dLdC not equal 1

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 fixed scalar case add test

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 fixed bp algorithm, axes definition, need some mode testing with different orders combination f,c; c,f f,f and add some checks for inputs

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 some checks and corrections added tests, exists the problem with different input orders support A-f B-c and A-f B-f

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: TensorMMul backprop op #8174 sync master

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* - correct bug in MmulHelper::tensorDot(a, b, c, axes_a, axes_b,permutForC)

Signed-off-by: Yurii <iuriish@yahoo.com>

* Libnd4j: TensorMMul backprop op #8174 code clean up and refactoring

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* - add check for linspase ordered permutations in ShapeUtils::evalShapeForTensorDot

Signed-off-by: Yurii <iuriish@yahoo.com>

* - provide additional code in shape::reshape stuff in order to reduce amount of allocation/copy operations during reshaping procedure

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on problem of wrong shape evaluation during permute/reshape procedures

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still looking for bug reason in reshape/permute stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct bug in transform cuda native ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct bug in NDArray::assign

Signed-off-by: Yurii <iuriish@yahoo.com>

* - remove old shape::reshape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add possibility to disable copy of old buffer to new buffer during reshape operation in NDArray class

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct bug in tensorDot which had to do with wrong pointers assigments

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: Oleh <oleg.semeniv@gmail.com>
2020-02-13 20:33:54 +03:00
Alexander Stoyakin 8c0e378ec3
Improving SameDiff tests coverage (#227)
* Gradients tests added

* Fix for Standard deviation serialization + test

Signed-off-by: Alex Black <blacka101@gmail.com>

* More fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Test fixed

* Spark config driver host config for CI

Signed-off-by: Alex Black <blacka101@gmail.com>

* Op validation timeout increase

Signed-off-by: Alex Black <blacka101@gmail.com>

* Gradient check - fix for low probability test failure due to randomly all 0s mask

Signed-off-by: AlexDBlack <blacka101@gmail.com>

Co-authored-by: Alex Black <blacka101@gmail.com>
2020-02-13 10:29:08 +11:00
raver119 1dfac9a736
DataBuffer.write() tweak (#221)
* special workaround methods for DataBuffer.write

Signed-off-by: raver119 <raver119@gmail.com>

* one test removed

Signed-off-by: raver119 <raver119@gmail.com>

* more of unsynced

Signed-off-by: raver119 <raver119@gmail.com>

* missing asLong for BaseCudaDataBuffer

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-07 18:16:11 +03:00
Alex Black 569a46f87d
Fixes (#213)
* Increase timeouts for 2 tests occasionally failing on CI

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Explicitly set character encoding via argline for maven surefire tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* CUDA gradient check timeout fix + simple rnn masking fix

Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-02-05 17:07:36 +11:00
raver119 5d28e6143d
OpContext handling (#214)
* nano tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* OpContext tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* OpContext deallocators

Signed-off-by: raver119 <raver119@gmail.com>

* get rid of few mkldnn safety checks

Signed-off-by: raver119 <raver119@gmail.com>

* databuffer setSpecial fix

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-05 07:27:24 +03:00
shugeo 41ff907bc6
Shugeo solve linear (#191)
* linear equations systems solve op. Initial commit.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed compiling issues.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Linear equations systems solve. The next stage commit.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added test for linear equations systems solve operation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added additional test and fixed lower matrix retrievance.

* Implementation for solve of the systems of linear equations."

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored permutation generation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added restore for permutations batched with cuda helper for solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Finished cuda implementation for solve op helpers.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored cpu helpers for solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fix gtest output on Windows

* Fixed issue with permutation matrix for cuda implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed issue with permutation matrix for cpu implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Eliminated waste comments.

Signed-off-by: shugeo <sgazeos@gmail.com>

* LinearSolve added

* Mapping added

* Javadoc added

* Refactored implementation of triangular_solve helpers and tests for solve matrix equations generally.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added a test for solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Solve test added

* Fix for TF import

Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-02-04 08:59:11 +03:00
Alex Black ddf70ac450
Avoid double printing of start/stop test in a few cases (#210)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-02-03 22:18:01 +11:00
raver119 9bb5798cac
Null arrays fix (#208)
* don't skip null arrays

Signed-off-by: raver119 <raver119@gmail.com>

* one test tweak

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-02 23:14:00 +03:00
raver119 81efa5c3b6
[WIP] one small fix (#207)
* one small fix

Signed-off-by: raver119 <raver119@gmail.com>

* assert added

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-02 19:17:26 +03:00
Alex Black 0756e3fe70
Small fixes. (#206)
* Logging format tweaks for file logging

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Min abs error tweak for Util layer gradient checks

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8648 Fix SameDiff NPE instead of error for missing placeholders

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test runtime reduction

Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-02-01 18:19:36 +11:00
raver119 1ab86d1306
Range op data type (#204)
* - range op now accepts dargs
- dargs now can be in signature

Signed-off-by: raver119 <raver119@gmail.com>

* range dtype java side

Signed-off-by: raver119 <raver119@gmail.com>

* linspace fix

Signed-off-by: raver119 <raver119@gmail.com>

* lin_space fix for scalar outputs

Signed-off-by: raver119 <raver119@gmail.com>
2020-01-31 10:45:40 +03:00
raver119 5d98cfcf47
Configurable DataType for ops (#201)
* initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* - one more test for OneHot with dtype
- one more signature in Nd4j

Signed-off-by: raver119 <raver119@gmail.com>

* ones_as/zeros_as now accept dtype

Signed-off-by: raver119 <raver119@gmail.com>

* one more test

Signed-off-by: raver119 <raver119@gmail.com>

* - more updates for configurable data types
- ones_as/zeros_as java side + tests

Signed-off-by: raver119 <raver119@gmail.com>

* few c++ tests fixed

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes around DArgs

Signed-off-by: raver119 <raver119@gmail.com>
2020-01-30 18:46:12 +03:00
raver119 9f719488b9
CUDA sync tweaks (#194)
* ThreadLocal cache for CudaContext

Signed-off-by: raver119 <raver119@gmail.com>

* temp commit

Signed-off-by: raver119 <raver119@gmail.com>

* remove unwanted synchronization

Signed-off-by: raver119 <raver119@gmail.com>
2020-01-28 10:55:06 +03:00
raver119 7ef0ef907e
Packages fix (#193)
* packages fix

Signed-off-by: raver119 <raver119@gmail.com>

* few imports fixed

Signed-off-by: raver119 <raver119@gmail.com>

* few imports fixed

Signed-off-by: raver119 <raver119@gmail.com>
2020-01-27 23:04:21 +03:00
Alex Black 458d141d8e
Fix SDLoss null weights array issue (#185)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-01-25 20:13:23 +11:00
Alexander Stoyakin 4db28a9300 Cleanup of multiple projects (#175)
* Cleanup modules

* Moving subprojects to nd4j-api

* Project cleanup

* Dropped AWS sub-project

* dl4j-util moved to core

* dl4j-perf moved to core

* Tests coverage

* Revert "Moving subprojects to nd4j-api"

This reverts commit bc6eb573c6b60c407ade47172c5d204725077e6b.

* Moved nd4j-buffer and nd4j-context to nd4j-api

* Rolled back change

* Revert "Project cleanup"

This reverts commit 64ac7f369b2d968f7be437718034f093fc886ffc.

* Datavec cleaned up

* Revert "Moved nd4j-buffer and nd4j-context to nd4j-api"

This reverts commit 75f4e8da80d2551e44e1251dd6c5923289fff8e1.

# Conflicts:
#	nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/autodiff/opvalidation/ReductionBpOpValidation.java

* Resolve conflict

* Compilation fixed.

* nd4j-context and nd4j-buffer moved to nd4j-api

* Fixed TF mapping for mmul

* Fix for dl4j-cuda tests

Signed-off-by: Alex Black <blacka101@gmail.com>

* Move last few tests from deeplearning4j-nn to -core

Signed-off-by: Alex Black <blacka101@gmail.com>

* Remove incorrect TF import mapping for TensorMmul op

Signed-off-by: Alex Black <blacka101@gmail.com>

* Cleaned TF mapping

* Fix path for test results on windows

* Remove old dependency

Signed-off-by: Alex Black <blacka101@gmail.com>

* One more attempt to fix path for test results on windows

* fixup! One more attempt to fix path for test results on windows

* fixup! One more attempt to fix path for test results on windows

Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
2020-01-24 22:35:00 +03:00
raver119 5d69069177
[WIP] Memory limits (#167)
* initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* one more initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* additional initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* subsequent initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit testing

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit per device

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit per group

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit for cuda

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit for cuda + few missed lines

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit for cuda + missed includes

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit for cuda + one more missed include

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit shouldn't count host mem as dev0 in cuda

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit that tracks HOST group limits for CUDA

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit with some Environment changes

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit with more Environment changes

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit with maxMasterThreads fix

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit with maxMasterThreads fix

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit without maxMasterThreads exception

Signed-off-by: raver119 <raver119@gmail.com>

* initial commit without Nd4jULong in Environment

Signed-off-by: raver119 <raver119@gmail.com>

* add sleep and more iterations for OOM cases

Signed-off-by: raver119 <raver119@gmail.com>

* limits propagation from java side

Signed-off-by: raver119 <raver119@gmail.com>

* - consume ErrorCode every time
- one test for memory limits

Signed-off-by: raver119 <raver119@gmail.com>

* unordered_map

Signed-off-by: raver119 <raver119@gmail.com>

* unordered_map

Signed-off-by: raver119 <raver119@gmail.com>

* unordered_map

Signed-off-by: raver119 <raver119@gmail.com>

* RSub op mapping fixed

Signed-off-by: raver119 <raver119@gmail.com>

* typo fixed

Signed-off-by: raver119 <raver119@gmail.com>

* one bad test fixed

Signed-off-by: raver119 <raver119@gmail.com>
2020-01-24 10:11:09 +03:00
Alex Black a25bb6a11c
Unit/integration test split + test speedup (#166)
* Add maven profile + base tests methods for integration tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Switch from system property to environment variable; seems more reliable in intellij

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Add nd4j-common-tests module, and common base test; cleanup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Ensure all ND4J tests extend BaseND4JTest

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test spam reduction, import fix

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Add test logging to nd4j-aeron

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix unintended change

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Reduce sprint test log spam

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More test spam cleanup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Significantly speed up TSNE tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* W2V iterator test unit/integration split

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More NLP test speedups

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Avoid debug/verbose mode leaking between tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* test tweak

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Arbiter extends base DL4J test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Arbiter test speedup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* nlp-uima test speedup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More test speedups

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix ND4J base test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Few small ND4J test speed improvements

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* DL4J tests speedup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More tweaks

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Even more test speedups

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More tweaks

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Various test fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* More test fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Add ability to specify number of threads for C++ ops in BaseDL4JTest and BaseND4JTest

Signed-off-by: Alex Black <blacka101@gmail.com>

* nd4j-aeron test profile fix for CUDA

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-01-22 22:27:01 +11:00
shugeo 815a2908af Shugeo solve triangular (#173)
* Added implementation of the triangular_solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed compilation issues.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added verification of input data and helpers facilities for triangular_solve op.'

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added cpu implementation for triangular_solve helpers.

* Added tests and implementation for upper triangular equations.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added a pair of cases to tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added multithreading with cpu helpers for triangular_solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added cuda implementation of triangular_solve op helpers.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Finished cuda implementation of triangular_solve helpers and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed copyright marks.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected grammar errors with doc and error messages.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored matricies processing with triangular_solve cuda helper implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added triangular_solve wrapper

* Fixed mapping

* Added processing for adjoint with cpu helpers of triangular_solve op implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added implementation for adjoint routine with cuda platform.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added multithreading with adjoint routine for cpu platform.

Signed-off-by: shugeo <sgazeos@gmail.com>

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-22 10:48:03 +03:00
shugeo e50b285c2c Shugeo resize area (#162)
* Added implementation for resize_area op. Initial commit.

* Added implementation of resize_area op. Initial revision.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected resizeArea functor call.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Implementation of resize_area. Cpu platform helpers.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Implementation for resize_area helpers. The first part revision.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added a set of tests for resize_area op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Cuda implementation for resize_area. Initial approach.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Adding multithreading for resize_area algorithm.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Cuda implementation of resize_area helpers. Shared memory approach.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored resizeAreaKernel with cuda implementation.

* Eliminated compilation errors.

* ResizeArea helpers for cuda platform. The first working revision.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added test for batched resize_area op testing.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Implementation of resize_are for cuda platform and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed multithreading with resize_area op helper.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected copyright marks with sources.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected copyright mark for resize_area op implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected copyright mark for parity ops header.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected typo in strings and so on with image resize ops.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored resize_area helpers and multithreading.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added ResizeArea wrapper

* Added test with align_corners and fixed shape processing with only int args given for output size.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added test

* TF mapping for ResizeArea

* Fixed implementation issues with resize_area op for both platforms.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored image resizer struct to use flexible types for ints and floats.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Improved multithreading with resizeAreaKernel launch.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Use asynchronical memory copying with cuda platform image resize allocations.

Signed-off-by: shugeo <sgazeos@gmail.com>

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-22 10:46:33 +03:00
Oleh 8fc0e63ce7 Oleh powderev (#171)
* Libnd4j: Add broadcastable elementwise power derivative #7461 first step of Pow_bp operation implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 some corrections of calculation steps

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 some bug fixes, the PowDerevative op made broadcastable, add the raw tests for op, need refactoring to use broadcast ops

* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed several bugs add broadcast support and tests, need to fix scalar+array and array+scalar

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed bugs for scalar inputs, fixed multinomial tests, added tests

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 fised bugs for different shapes support, tests updated

* Libnd4j: Add broadcastable elementwise power derivative #7461 applied all possible variants via tiled arrays, add support of broadcast for Pow and PowDerivative ops, covered by tests, before review have to be replaced tiled implementation by applyTrueBroadcast

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 replaced tile by broadcast implementation, fixed issue with negative x input, corrected tests, need additional testing

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 added and corrected test cases, corrected implementation need review

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up

* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up, removed some tests, add tests with scalar

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 code improvement and clean up, split tests

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative #7461 some code clean up

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* Libnd4j: Add broadcastable elementwise power derivative replace __isnanf by internal realization

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* pow_bp wrapper

* Fixed PowBp wrapper

* Tests added

* Test fixed

* Fix return type

* Disable powBp usage

* Pow backprop changed

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-20 12:59:12 +03:00
shugeo 6943a5f57a Shugeo lgamma (#170)
* lgamma op. Initial version.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored lgamma op and test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Lgamma wrapper

* Added TF mapping

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-20 12:29:36 +03:00
Alex Black c84307a6fe
Small SameDiff execution fix (#168)
* SameDiff exec: Fix for switch op when predicate is constant, and op is inside loop

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Update ignores for failing zoo models

Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-01-08 23:57:23 +11:00
raver119 29e8e09db6
String changes (#3)
* initial commit

* additional data types & tensor type

Signed-off-by: raver119 <raver119@gmail.com>

* next step

Signed-off-by: raver119 <raver119@gmail.com>

* missing include

* sparse_to_dense

Signed-off-by: raver119 <raver119@gmail.com>

* few more tests files

Signed-off-by: raver119 <raver119@gmail.com>

* draft

Signed-off-by: raver119 <raver119@gmail.com>

* numeric sparse_to_dense

Signed-off-by: raver119 <raver119@gmail.com>

* comment

Signed-off-by: raver119 <raver119@gmail.com>

* string sparse_to_dense version

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA DataBuffer expand

Signed-off-by: raver119 <raver119@gmail.com>

* few tweaks for CUDA build

Signed-off-by: raver119 <raver119@gmail.com>

* shape fn for string_split

Signed-off-by: raver119 <raver119@gmail.com>

* one more comment

Signed-off-by: raver119 <raver119@gmail.com>

* string_split indices

Signed-off-by: raver119 <raver119@gmail.com>

* next step

Signed-off-by: raver119 <raver119@gmail.com>

* test passes

Signed-off-by: raver119 <raver119@gmail.com>

* few rearrangements for databuffer implementations

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer: move inline methods to common implementations

Signed-off-by: raver119 <raver119@gmail.com>

* add native DataBuffer to Nd4j presets

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer creation

Signed-off-by: raver119 <raver119@gmail.com>

* use DataBuffer for allocation

Signed-off-by: raver119 <raver119@gmail.com>

* cpu databuffer as deallocatable

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer setters for bufers

Signed-off-by: raver119 <raver119@gmail.com>

* couple of wrappers

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffers being passed around

Signed-off-by: raver119 <raver119@gmail.com>

* Bunch of ByteBuffer-related signatures gone

Signed-off-by: raver119 <raver119@gmail.com>

* - few more Nd4j signatures removed
- minor fix for bfloat16

Signed-off-by: raver119 <raver119@gmail.com>

* nullptr pointer is still a pointer, but 0 as address :)

Signed-off-by: raver119 <raver119@gmail.com>

* one special test

Signed-off-by: raver119 <raver119@gmail.com>

* empty string array init

Signed-off-by: raver119 <raver119@gmail.com>

* one more test in cpp

Signed-off-by: raver119 <raver119@gmail.com>

* memcpy instead of databuffer swap

Signed-off-by: raver119 <raver119@gmail.com>

* special InteropDataBuffer for front-end languages

Signed-off-by: raver119 <raver119@gmail.com>

* few tweaks for java

Signed-off-by: raver119 <raver119@gmail.com>

* pointer/indexer actualization

Signed-off-by: raver119 <raver119@gmail.com>

* CustomOp returns list for inputArumgents and outputArguments instead of array

Signed-off-by: raver119 <raver119@gmail.com>

* redundant call

Signed-off-by: raver119 <raver119@gmail.com>

* print_variable op

Signed-off-by: raver119 <raver119@gmail.com>

* - view handling (but wrong one)
- print_variable java wrapper

Signed-off-by: raver119 <raver119@gmail.com>

* one more test

Signed-off-by: raver119 <raver119@gmail.com>

* - empty arrays handling

Signed-off-by: raver119 <raver119@gmail.com>

* - deserialization works now

Signed-off-by: raver119 <raver119@gmail.com>

* minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* meh

Signed-off-by: raver119 <raver119@gmail.com>

* one more fix

Signed-off-by: raver119 <raver119@gmail.com>

* initial cuda commit

Signed-off-by: raver119 <raver119@gmail.com>

* print_variable message validation

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA views

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA special buffer size

Signed-off-by: raver119 <raver119@gmail.com>

* minor update to match master changes

Signed-off-by: raver119 <raver119@gmail.com>

* - consider arrays always actual on device for CUDA
- additional PrintVariable constructor
- CudaUtf8Buffer now allocates host buffer by default

Signed-off-by: raver119 <raver119@gmail.com>

* meh

Signed-off-by: raver119 <raver119@gmail.com>

* - print_variable now allows print from device

Signed-off-by: raver119 <raver119@gmail.com>

* InteropDataBuffer data type fix

Signed-off-by: raver119 <raver119@gmail.com>

* ...

Signed-off-by: raver119 <raver119@gmail.com>

* disable some debug messages

Signed-off-by: raver119 <raver119@gmail.com>

* master pulled in

Signed-off-by: raver119 <raver119@gmail.com>

* couple of new methods for DataBuffer interop

Signed-off-by: raver119 <raver119@gmail.com>

* java side

Signed-off-by: raver119 <raver119@gmail.com>

* offsetted constructor

Signed-off-by: raver119 <raver119@gmail.com>

* new CUDA deallocator

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA backend torn apart

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA backend torn apart 2

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA backend torn apart 3

Signed-off-by: raver119 <raver119@gmail.com>

* - few new tests
- few new methods for DataBuffer management

Signed-off-by: raver119 <raver119@gmail.com>

* few more tests + few more tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* two failing tests

Signed-off-by: raver119 <raver119@gmail.com>

* one more test

Signed-off-by: raver119 <raver119@gmail.com>

* two failing tests pass

Signed-off-by: raver119 <raver119@gmail.com>

* now we pass DataBuffer to legacy ops too

Signed-off-by: raver119 <raver119@gmail.com>

* Native DataBuffer for legacy ops, Java side

Signed-off-by: raver119 <raver119@gmail.com>

* CPU java side update

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA java side update

Signed-off-by: raver119 <raver119@gmail.com>

* no more prepare/register action on java side

Signed-off-by: raver119 <raver119@gmail.com>

* NDArray::prepare/register use now accepts vectors

Signed-off-by: raver119 <raver119@gmail.com>

* InteropDataBuffer now has few more convenience methods

Signed-off-by: raver119 <raver119@gmail.com>

* java bindings update

Signed-off-by: raver119 <raver119@gmail.com>

* tick device in NativeOps

Signed-off-by: raver119 <raver119@gmail.com>

* Corrected usage of OpaqueBuffer for tests.

* Corrected usage of OpaqueBuffer for java tests.

* NativeOpsTests fixes.

* print_variable now returns scalar

Signed-off-by: raver119 <raver119@gmail.com>

* one more test

Signed-off-by: raver119 <raver119@gmail.com>

* compat_string_split fix for CUDA

Signed-off-by: raver119 <raver119@gmail.com>

* - CUDA execScalar fix
- CUDA lazyAllocateHostPointer now checks java indexer/pointer instead of native pointer

Signed-off-by: raver119 <raver119@gmail.com>

* legacy ops DataBuffer migration prototype

Signed-off-by: raver119 <raver119@gmail.com>

* ignore device shapeinfo coming from java

Signed-off-by: raver119 <raver119@gmail.com>

* minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* minor transformAny fix

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweak for lazy host allocation

Signed-off-by: raver119 <raver119@gmail.com>

* - DataBuffer::memcpy method
- bitcast now uses memcpy

Signed-off-by: raver119 <raver119@gmail.com>

* - IndexReduce CUDA dimension buffer fix

Signed-off-by: raver119 <raver119@gmail.com>

* views for CPU and CUDA

Signed-off-by: raver119 <raver119@gmail.com>

* less spam

Signed-off-by: raver119 <raver119@gmail.com>

* optional memory init

Signed-off-by: raver119 <raver119@gmail.com>

* async memset

Signed-off-by: raver119 <raver119@gmail.com>

* - SummaryStats CUDA fix
- DataBuffer.sameUnderlyingData() impl
- execBroadcast fix

Signed-off-by: raver119 <raver119@gmail.com>

* - reduce3All fix
switch to CUDA 10 temporarily

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA version

Signed-off-by: raver119 <raver119@gmail.com>

* proper memory deallocator registration

Signed-off-by: raver119 <raver119@gmail.com>

* HOST_ONLY workspace allocation

Signed-off-by: raver119 <raver119@gmail.com>

* temp commit

Signed-off-by: raver119 <raver119@gmail.com>

* few conflicts resolved

Signed-off-by: raver119 <raver119@gmail.com>

* few minor fixes

Signed-off-by: raver119 <raver119@gmail.com>

* one more minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* NDArray permute should operate on JVM primitives

Signed-off-by: raver119 <raver119@gmail.com>

* - create InteropDataBuffer for shapes as well
- update pointers after view creation in Java

Signed-off-by: raver119 <raver119@gmail.com>

* - addressPointer temporary moved to C++

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA: don't account offset twice

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA: DataBuffer pointer constructor updated

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA NDArray.unsafeDuplication() simplified

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA minor workspace-related fixes

Signed-off-by: raver119 <raver119@gmail.com>

* CPU DataBuffer.reallocate()

Signed-off-by: raver119 <raver119@gmail.com>

* print_affinity op

Signed-off-by: raver119 <raver119@gmail.com>

* print_affinity java side

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA more tweaks for data locality

Signed-off-by: raver119 <raver119@gmail.com>

* - compat_string_split tweak
- CudaUtf8Buffer update

Signed-off-by: raver119 <raver119@gmail.com>

* INDArray.close() mechanic restored

Signed-off-by: raver119 <raver119@gmail.com>

* one more test fixed

Signed-off-by: raver119 <raver119@gmail.com>

* - CUDA DataBuffer.reallocate() updated
- cudaMemcpy (synchronous) restored

Signed-off-by: raver119 <raver119@gmail.com>

* one last fix

Signed-off-by: raver119 <raver119@gmail.com>

* bad import removed

Signed-off-by: raver119 <raver119@gmail.com>

* another small fix

Signed-off-by: raver119 <raver119@gmail.com>

* one special test

Signed-off-by: raver119 <raver119@gmail.com>

* fix bad databuffer size

Signed-off-by: raver119 <raver119@gmail.com>

* release primaryBuffer on replace

Signed-off-by: raver119 <raver119@gmail.com>

* higher timeout

Signed-off-by: raver119 <raver119@gmail.com>

* disable timeouts

Signed-off-by: raver119 <raver119@gmail.com>

* dbCreateView now validates offset and length of a view

Signed-off-by: raver119 <raver119@gmail.com>

* additional validation for dbExpand

Signed-off-by: raver119 <raver119@gmail.com>

* restore timeout back again

Signed-off-by: raver119 <raver119@gmail.com>

* smaller distribution for rng test to prevent timeouts

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA DataBuffer::memcpy now copies to device all the time

Signed-off-by: raver119 <raver119@gmail.com>

* OpaqueDataBuffer now contains all required methods for interop

Signed-off-by: raver119 <raver119@gmail.com>

* some javadoc

Signed-off-by: raver119 <raver119@gmail.com>

* GC on failed allocations

Signed-off-by: raver119 <raver119@gmail.com>

* minoe memcpu tweak

Signed-off-by: raver119 <raver119@gmail.com>

* one more bitcast test

Signed-off-by: raver119 <raver119@gmail.com>

* - NDArray::deviceId() propagation
- special multi-threaded test for data locality checks

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer additional syncStream

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer additional syncStream

Signed-off-by: raver119 <raver119@gmail.com>

* one ignored test

Signed-off-by: raver119 <raver119@gmail.com>

* skip host alloc for empty arrays

Signed-off-by: raver119 <raver119@gmail.com>

* ByteBuffer support is back

Signed-off-by: raver119 <raver119@gmail.com>

* DataBuffer::memcpy minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* few minor prelu/bp tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* nullify-related fixes

Signed-off-by: raver119 <raver119@gmail.com>

* PReLU fixes (#157)

Signed-off-by: Alex Black <blacka101@gmail.com>

* Build fixed

* Fix tests

* one more ByteBuffer signature restored

Signed-off-by: raver119 <raver119@gmail.com>

* nd4j-jdbc-hsql profiles fix

Signed-off-by: raver119 <raver119@gmail.com>

* nd4j-jdbc-hsql profiles fix

Signed-off-by: raver119 <raver119@gmail.com>

* PReLU weight init fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* Small PReLU fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* - INDArray.migrate() reactivated
- DataBuffer::setDeviceId(...) added
- InteropDataBuffer Z syncToDevice added for views

Signed-off-by: raver119 <raver119@gmail.com>

* missed file

Signed-off-by: raver119 <raver119@gmail.com>

* Small tweak

Signed-off-by: Alex Black <blacka101@gmail.com>

* cuda 10.2

Signed-off-by: raver119 <raver119@gmail.com>

* minor fix

Signed-off-by: raver119 <raver119@gmail.com>

Co-authored-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-04 13:27:50 +03:00
Robert Altena 53d3bd1269 shallow delete of assign from SDBase. (#164)
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2020-01-04 15:26:39 +11:00
Alex Black 29104083cc
Various fixes (#143)
* #8568 ArrayUtil optimization

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #6171 Keras ReLU and ELU support

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Keras softmax layer import

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8549 Webjars dependency management

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix for TF import names ':0' suffix issue / NPE

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* BiasAdd: fix default data format for TF import

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Update zoo test ignores

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8509 SameDiff Listener API - provide frame + iteration

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8520 ND4J Environment

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Deconv3d

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Deconv3d fixes + gradient check

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Conv3d fixes + deconv3d DType test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix issue with deconv3d gradinet check weight init

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8579 Fix BaseCudaDataBuffer constructor fix for UINT16

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* DataType.isNumerical() returns false for BOOL type

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* #8504 Reduce Spark log spam for tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Clean up DL4J gradient check test spam

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More Gradient check spam reduction

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* SameDiff test spam reduction

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fixes for FlatBuffers mapping

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* SameDiff log spam cleanup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tests should extend BaseNd4jTest

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Remove debug line in c++ op

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ND4J test spam cleanup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* DL4J test spam reduction

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More Dl4J and datavec test spam cleanup

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix for bad conv3d test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Additional test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Embedding layers: don't inherit global default activation function

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Trigger CI

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Consolidate all BaseDL4JTest classes to single class used everywhere; make timeout configurable per class

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test fixes and timeout increases

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Timeouts and PReLU fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Restore libnd4j build threads arg for CUDA build

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Increase timeouts on a few tests to avoid spurious failures on some CI machines

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More timeout fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More test timeout fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tweak timeout for one more test

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Final tweaks

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* One more ignore

Signed-off-by: AlexDBlack <blacka101@gmail.com>
2020-01-04 13:45:07 +11:00
Alexander Stoyakin 010744ef9c Lu wrapper and tests fixes (#144)
* Tests fixed

* Lu added

* Test fixed

* Default timeout

* Tests timeouts fixed.

* TF import fix

* Timeouts added

* Timeout fixed.

* Test corrected

* rgb and yiq conversion ops added

* Converter ops added

* Header

* Yuv converters

* API added

* Empty test for matmul

* Explanation

* skip gemm/gemv on empty inputs

Signed-off-by: raver119 <raver119@gmail.com>

* Test added

* Correct test

* one more empty pass-through for mmul

Signed-off-by: raver119 <raver119@gmail.com>

* Cleanup

* Test added

* Test fixed

* Added missing mapping

* Added missing mapping

Co-authored-by: raver119 <raver119@gmail.com>
2019-12-30 15:06:12 +03:00
Alex Black ce02b6fae7
Small fixes (#140)
* Allow scalar op result array auto allocation

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Don't swallow underlying exception for calculateOutputShape execution failures

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Ignore for known keras failure

Signed-off-by: AlexDBlack <blacka101@gmail.com>
2019-12-21 17:00:46 +11:00
Alexander Stoyakin 6d8a063c9b nd4j-tests cleanup (#137)
* Fixed tests

* Invalid test removed
2019-12-20 16:38:33 +03:00
Alex Black 3d8f6d50a1
SameDiff profiler / tracing and profile analysis/comparison (#133)
* Profiler

Signed-off-by: Alex Black <blacka101@gmail.com>

* Next steps, polishing, and loading SD/TF format JSON

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Next steps

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Profile comparison method

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Make profiling result writing async to reduce main thread overhead

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Profiling polishing

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Profile analyzer fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Polish

Signed-off-by: Alex Black <blacka101@gmail.com>

* Cleanup

Signed-off-by: Alex Black <blacka101@gmail.com>

* Small formatting improvement

Signed-off-by: Alex Black <blacka101@gmail.com>

* Formatting tweak

Signed-off-by: Alex Black <blacka101@gmail.com>

* License headers

Signed-off-by: Alex Black <blacka101@gmail.com>
2019-12-19 23:43:58 +11:00
Alexander Stoyakin f5068f3980 Added missing Java ops wrappers (#122)
* Timeouts added

* Added some ops

* Ops added

* Fixed tests

* Minor fix

* Some fixes

* Digamma added

* Small fixes

* Timeouts added

* Added some ops

* Ops added

* Fixed tests

* Minor fix

* Some fixes

* Digamma added

* Small fixes

* Fused batch norm fixes-

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tests switched off.

* Added test for resize_bicubic.

* Eliminated wasted in test of bicubic resize.

* Switched off multithreading explicit.

* HsvToRgb and RgbToHsv added

* Eliminated waste comments and conform proper float constants.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed multithreading with resize_bicubic helper for cpu platform.

Signed-off-by: shugeo <sgazeos@gmail.com>

* ResizeBicubic was fixed.

* Some fixes

* Fix op name

* Validation fixed.

* Clarifications for tests

* Wrappers and small fixes for new ops.
2019-12-19 20:15:48 +11:00
AlexDBlack 0df1b46c8c Merge 2019-12-10 15:08:50 +11:00
raver119 a5f5ac72b1
reduce bool changes (#118)
* reduce bool changes

Signed-off-by: raver119 <raver119@gmail.com>

* reduce bool tweaks

Signed-off-by: raver119 <raver119@gmail.com>
2019-12-09 20:08:59 +03:00
Alexander Stoyakin 927d591421 ResizeBicubic added (#117)
* ResizeBicubic added
Some fixes.

* Test fixed

* Narrowed argument type changed to boolean

* Clean up
2019-12-09 18:25:39 +11:00
raver119 b32dd1bf92
[WIP] resize_bicubic types (#116)
* resize_bicubic: allow more dtypes

Signed-off-by: raver119 <raver119@gmail.com>

* resize_bicubic: allow less dtypes

Signed-off-by: raver119 <raver119@gmail.com>

* Refactored resize_bicubic op to full conform with TF1.5 and tests.

* Corrected test to proper data type output.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected double input test to float constant outputs.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Finished with correction of tests for bicubic interpolated resizes expected.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed adjust_contrast ops to allow non-RGB inputs.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored adjust_contrast_v2 to conform with TF one.

Signed-off-by: shugeo <sgazeos@gmail.com>

* AdjustContrast tests activated

* two typos fixed

Signed-off-by: raver119 <raver119@gmail.com>
2019-12-06 18:58:37 +03:00
raver119 972fae60dc
Update master (#8511)
* cleaned up bert iterator tests (#110)

Signed-off-by: eraly <susan.eraly@gmail.com>

* Various pre-release fixes (#111)

* Various fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix default dtypes for MaxPoolWithArgmax

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Small pre-release tweak (#112)

* Log UI address on launch as in previous Play-based UI

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Logging level tweak for UI

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* http not https

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* datavec python ensure host (#113)

* ensure host

* one more host ensure

* info->debug

* [WIP] reverse improvements (#115)

* initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* reverse draft

Signed-off-by: raver119 <raver119@gmail.com>

* reverse kernel

Signed-off-by: raver119 <raver119@gmail.com>

* reverse kernel

Signed-off-by: raver119 <raver119@gmail.com>

* 2 micro fixes

Signed-off-by: raver119 <raver119@gmail.com>

* Shugeo resize fix5 (#102)

* Refactored resize images ops to use TF-like bool args as input.

* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.

* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.

* Refactored nearest_neighbor resize op.

* Added a pair of tests for special case of resize_bilinear algorithm.

* Fixed issue with resize_bilinear op.

* Refactored cpu implementation for helpers with resize_nearest_neighbor op.

* Final fixed for resize ops to conform TF v.1.5

* Refactored cuda helpers for resize_neares_neighbor op.

* Fixed resize_bilinear to accept proper data.

* Fixed issue with non-float input for resize_bilinear op.

* Refactored cuda helper for resize_bilinear to proper process non-float inputs.

* Added tests for resize_bilinear to int inputs.

* Fixed ResizeBilinear wrapper

* Tests fixed

* Fixed float and bool constant to avoid overflow for some kind of compilers.

* Corrected float constants with float data type.

* Added f suffix for float constants.

* Corrected float constant to avoid overflow with initializing lists.

* Corrected float initializing list with float input.

* Corrected bool constant with initalizing list.

* Corrected float and bool values with initializing lists.

* Fixed wrong constant.

* Fixed issue with 1x1 input picture for resize.

* ResizeBilinear default values on import fix

Signed-off-by: raver119 <raver119@gmail.com>
2019-12-06 11:10:44 +03:00
Robert Altena e7730eded4 delete unused and refactor. (#8262)
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
2019-12-05 22:25:41 -05:00