cavis/libnd4j
Adam Gibson f9aebec79e
Development updates (#9098)
* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Fix L2NormalizeVertex and eclipse#9054 (#513)

* update

* Fix L2NormalizeVertex

Fix eclipse#9054

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

* Python GIL overhaul (#517)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Ag pythongiloverhaul (#518)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

* Re update python4j

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Bump formatter-maven-plugin from 2.0.0 to 2.12.1 (#505)

Bumps [formatter-maven-plugin](https://github.com/revelc/formatter-maven-plugin) from 2.0.0 to 2.12.1.
- [Release notes](https://github.com/revelc/formatter-maven-plugin/releases)
- [Changelog](https://github.com/revelc/formatter-maven-plugin/blob/formatter-maven-plugin-2.12.1/CHANGELOG.md)
- [Commits](https://github.com/revelc/formatter-maven-plugin/compare/formatter-maven-plugin-2.0.0...formatter-maven-plugin-2.12.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Ag fix9060 (#519)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec code cleaup (#9071)

* removed unnecessary semicolons

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Use standard charset object

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed unused imports

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* WIP: Fix Conv1d causal case

* Add inital tests

* Update Conv1d tests to be a bit more robust

* Remove redundant test

* Reset from master

* Remove cuda definition (left over)

* Update rl4j again

* Update pom.xml

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Fixes 9061 (#521)

* Get rid of edge case in validation

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Using embedded copying of an array instead of manual (#9073)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec bulk operation (#9075)

* Bulk operation can be used instead of iteration inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Redundant 'Collection.addAll()' call inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed infinitely loop (#9076)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Revert "Merge eclipse changes" (#526)

* Revert rl4j to 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182 (#527)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Fix L2NormalizeVertex and eclipse#9054 (#513)

* update

* Fix L2NormalizeVertex

Fix eclipse#9054

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

* Python GIL overhaul (#517)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Ag pythongiloverhaul (#518)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

* Re update python4j

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Bump formatter-maven-plugin from 2.0.0 to 2.12.1 (#505)

Bumps [formatter-maven-plugin](https://github.com/revelc/formatter-maven-plugin) from 2.0.0 to 2.12.1.
- [Release notes](https://github.com/revelc/formatter-maven-plugin/releases)
- [Changelog](https://github.com/revelc/formatter-maven-plugin/blob/formatter-maven-plugin-2.12.1/CHANGELOG.md)
- [Commits](https://github.com/revelc/formatter-maven-plugin/compare/formatter-maven-plugin-2.0.0...formatter-maven-plugin-2.12.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Ag fix9060 (#519)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec code cleaup (#9071)

* removed unnecessary semicolons

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Use standard charset object

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed unused imports

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* WIP: Fix Conv1d causal case

* Add inital tests

* Update Conv1d tests to be a bit more robust

* Remove redundant test

* Reset from master

* Remove cuda definition (left over)

* Update rl4j again

* Update pom.xml

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Fixes 9061 (#521)

* Get rid of edge case in validation

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Using embedded copying of an array instead of manual (#9073)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec bulk operation (#9075)

* Bulk operation can be used instead of iteration inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Redundant 'Collection.addAll()' call inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed infinitely loop (#9076)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

(cherry picked from commit 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182)

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

(cherry picked from commit 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182)

* Revert rl4j to 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182

* Delete jnind4jaurora.cpp

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* RL4J: Add partial support for RNN (#514)

* Added partial recurrent support

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Made sure the RNN always see the observation in EpsGreedy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Converted all line endings of rl4j-core to LF (#530)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* NDJ4: Bundle configuration files required by AOT compilation with GraalVM (#529)

* NDJ4: Bundle configuration files required by AOT compilation with GraalVM

* Update dependencies to just released JavaCPP and JavaCV 1.5.4

* Ag fixtests 831 (#523)

* Update UnderSamplingPreProcessorTest.java

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Add proper annotation

* Fix classcast exception for recurrent model import case

* Update keras import to allow for proper handling of changing NCHW -> NHWC mid later

* Add output to test to ensure proper activation

* Fixes computation graphs to allow dimension ordering to change mid graph

* Add NHWC support for keras import.

* Update tests to pass /ignore out of date ones

* Add  multi RNNDataformat  support

* Update tests to make more pass.

Updates some tests to be correct, double checked existing models and updated reasons they may or may  not fail.

* Add back old default values to ensure legacy serialization works.  Replace null value default with sentinel value for default value overridden.

* Update layers to preserve changed values

* Exclude default value over ridden from comparison

* Fix conv1d import (no permute weights anymore)

* Update KerasConvolution1D.java

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* GPU compute capability  (#532)

* - GPU cpu capability flags
- CUDA MAJOR VERSION provided by cmake

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Readme

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Readme

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* RL4J: Add new network implementation to help support recurrent networks (#531)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Abdelrauf <qwr@live.ru>
2020-09-23 19:11:29 +09:00
..
auto_vectorization auto-vectorization check for gcc (#172) 2020-01-28 19:00:12 +03:00
blas Development updates (#9098) 2020-09-23 19:11:29 +09:00
cmake Pi build and initial ArmCompute library support (#494) 2020-06-26 10:03:46 +03:00
include Development updates (#9098) 2020-09-23 19:11:29 +09:00
minifier C++ rearrangements (#485) 2020-06-06 15:26:55 +03:00
msi Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
packages Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
profile Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
server C++ rearrangements (#485) 2020-06-06 15:26:55 +03:00
tests_cpu Development updates (#9098) 2020-09-23 19:11:29 +09:00
.gitignore Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
AddingNewOps.md Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
CMakeLists.txt Pi build and initial ArmCompute library support (#494) 2020-06-26 10:03:46 +03:00
CMakeLists.txt.cpu_features.in Platform helpers (#8216) 2019-09-11 21:50:28 +03:00
CMakeLists.txt.in roll back flatbuffers version 2020-01-31 15:57:55 +03:00
CMakeLists.txt.mkldnn.in mkldnn version upgrade: v1.4 2020-04-20 08:19:03 +03:00
CMakeSettings.json libnd4j polishing (#273) 2020-03-02 12:49:41 +03:00
README.md Development updates (#9098) 2020-09-23 19:11:29 +09:00
RaspberryPi.md Update links to eclipse repos (#252) 2019-09-10 19:09:46 +10:00
UnderstandingGraph.md C++ rearrangements (#485) 2020-06-06 15:26:55 +03:00
assembly-cuda.xml Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
assembly.xml Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
buildnativeoperations.sh Development updates (#9098) 2020-09-23 19:11:29 +09:00
cibuild.sh Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
flatproto.txt Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
iOS.md Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
linuxOnPower.md Update links to eclipse repos (#252) 2019-09-10 19:09:46 +10:00
macOSx10 (CPU only).md Update links to eclipse repos (#252) 2019-09-10 19:09:46 +10:00
pi_build.sh Pi build and initial ArmCompute library support (#494) 2020-06-26 10:03:46 +03:00
pom.xml Development updates (#9053) 2020-07-26 21:59:27 +09:00
proto.sh Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
setuposx.sh Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00
windows.md windows build instructions minor update 2020-04-30 18:13:09 +03:00

README.md

LibND4J

Native operations for nd4j. Build using cmake

Prerequisites

  • GCC 4.9+
  • CUDA Toolkit Versions 10 or 11
  • CMake 3.8 (as of Nov 2017, in near future will require 3.9)

Additional build arguments

There's few additional arguments for buildnativeoperations.sh script you could use:

 -a XXXXXXXX// shortcut for -march/-mtune, i.e. -a native
 -b release OR -b debug // enables/desables debug builds. release is considered by default
 -j XX // this argument defines how many threads will be used to binaries on your box. i.e. -j 8 
 -cc XX// CUDA-only argument, builds only binaries for target GPU architecture. use this for fast builds
 --check-vectorization  auto-vectorization report for developers. (Currently, only GCC is supported)

More about AutoVectorization report

You can provide the compute capability for your card on the NVIDIA website here or use auto.
Please also check your Cuda Toolkit Release notes for supported and dropped features.
Here is the latest CUDA Toolkit Release note.
You can find the same information for the older Toolkit versions in the CUDA archives.

-cc and --compute option examples description
-cc all builds for common GPUs
-cc auto tries to detect automatically
-cc Maxwell GPU microarchitecture codename
-cc 75 compute capability 7.5 without a dot
-cc 7.5 compute capability 7.5 with a dot
-cc "Maxwell 6.0 7.5" space-separated multiple arguments within quotes (note: numbers only with a dot)

OS Specific Requirements

Android

Download the NDK, extract it somewhere, and execute the following commands, replacing android-xxx with either android-arm or android-x86:

git clone https://github.com/deeplearning4j/libnd4j
git clone https://github.com/deeplearning4j/nd4j
export ANDROID_NDK=/path/to/android-ndk/
cd libnd4j
bash buildnativeoperations.sh -platform android-xxx
cd ../nd4j
mvn clean install -Djavacpp.platform=android-xxx -DskipTests -pl '!:nd4j-cuda-9.0,!:nd4j-cuda-9.0-platform,!:nd4j-tests'

OSX

Run ./setuposx.sh (Please ensure you have brew installed)

See macOSx10 CPU only.md

Linux

Depends on the distro - ask in the earlyadopters channel for specifics on distro

Ubuntu Linux 15.10

wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo apt-get install cmake
sudo apt-get install gcc-4.9
sudo apt-get install g++-4.9
sudo apt-get install git
git clone https://github.com/deeplearning4j/libnd4j
cd libnd4j/
export LIBND4J_HOME=~/libnd4j/
sudo rm /usr/bin/gcc
sudo rm /usr/bin/g++
sudo ln -s /usr/bin/gcc-4.9 /usr/bin/gcc
sudo ln -s /usr/bin/g++-4.9 /usr/bin/g++
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -сс YOUR_DEVICE_ARCH

Ubuntu Linux 16.04

sudo apt install cmake
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit nvidia-361
export TRICK_NVCC=YES
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -сс YOUR_DEVICE_ARCH

The standard development headers are needed.

CentOS 6

yum install centos-release-scl-rh epel-release
yum install devtoolset-3-toolchain maven30 cmake3 git
scl enable devtoolset-3 maven30 bash
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -сс YOUR_DEVICE_ARCH

Windows

See Windows.md

Setup for All OS

  1. Set a LIBND4J_HOME as an environment variable to the libnd4j folder you've obtained from GIT

    • Note: this is required for building nd4j as well.
  2. Setup cpu followed by gpu, run the following on the command line:

    • For standard builds:

      ./buildnativeoperations.sh
      ./buildnativeoperations.sh -c cuda -сс YOUR_DEVICE_ARCH
      
    • For Debug builds:

      ./buildnativeoperations.sh blas -b debug
      ./buildnativeoperations.sh blas -c cuda -сс YOUR_DEVICE_ARCH -b debug
      
    • For release builds (default):

      ./buildnativeoperations.sh
      ./buildnativeoperations.sh -c cuda -сс YOUR_DEVICE_ARCH
      

OpenMP support

OpenMP 4.0+ should be used to compile libnd4j. However, this shouldn't be any trouble, since OpenMP 4 was released in 2015 and should be available on all major platforms.

Linking with MKL

We can link with MKL either at build time, or at runtime with binaries initially linked with another BLAS implementation such as OpenBLAS. In either case, simply add the path containing libmkl_rt.so (or mkl_rt.dll on Windows), say /path/to/intel64/lib/, to the LD_LIBRARY_PATH environment variable on Linux (or PATH on Windows), and build or run your Java application as usual. If you get an error message like undefined symbol: omp_get_num_procs, it probably means that libiomp5.so, libiomp5.dylib, or libiomp5md.dll is not present on your system. In that case though, it is still possible to use the GNU version of OpenMP by setting these environment variables on Linux, for example:

export MKL_THREADING_LAYER=GNU
export LD_PRELOAD=/usr/lib64/libgomp.so.1

##Troubleshooting MKL

Sometimes the above steps might not be all you need to do. Another additional step might be the need to add:

export LD_LIBRARY_PATH=/opt/intel/lib/intel64/:/opt/intel/mkl/lib/intel64

This ensures that mkl will be found first and liked to.

Packaging

If on Ubuntu (14.04 or above) or CentOS (6 or above), this repository is also set to create packages for your distribution. Let's assume you have built:

  • for the cpu, your command-line was ./buildnativeoperations.sh ...:
cd blasbuild/cpu
make package
  • for the gpu, your command-line was ./buildnativeoperations.sh -c cuda ...:
cd blasbuild/cuda
make package

Uploading package to Bintray

The package upload script is in packaging. The upload command for an rpm built for cpu is:

./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cpu/libnd4j-0.8.0.fc7.3.1611.x86_64.rpm https://github.com/deeplearning4j

The upload command for a deb package built for cuda is:

./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cuda/libnd4j-0.8.0.fc7.3.1611.x86_64.deb https://github.com/deeplearning4j

Running tests

Tests are written with gtest, run using cmake. Tests are currently under tests_cpu/

There are 2 directories for running tests:

1. libnd4j_tests: These are older legacy ops tests.
2. layers_tests: This covers the newer graph operations and ops associated with samediff.

For running the tests, we currently use cmake or CLion to run the tests.

To run tests using CUDA backend it's pretty much similar process:

1. ./buildnativeoperations.h -c cuda -cc <YOUR_ARCH> -b debug -t -j <NUMBER_OF_CORES>
2. ./blasbuild/cuda/tests_cpu/layers_tests/runtests (.exe on Windows)

Development

In order to extend and update libnd4j, understanding libnd4j's various cmake flags is the key. Many of them are in buildnativeoperations.sh. The pom.xml is used to integrate and auto configure the project for building with deeplearning4j.

At a minimum, you will want to enable tests. An example default set of flags for running tests and getting cpu builds working is as follows:

-DSD_CPU=true -DBLAS=TRUE -DSD_ARCH=x86-64 -DSD_EXTENSION= -DSD_LIBRARY_NAME=nd4jcpu -DSD_CHECK_VECTORIZATION=OFF  -DSD_SHARED_LIB=ON -DSD_STATIC_LIB=OFF -DSD_BUILD_MINIFIER=false -DSD_ALL_OPS=true -DCMAKE_BUILD_TYPE=Release -DPACKAGING=none  -DSD_BUILD_TESTS=OFF -DCOMPUTE=all -DOPENBLAS_PATH=C:/Users/agibs/.javacpp/cache/openblas-0.3.10-1.5.4-windows-x86_64.jar/org/bytedeco/openblas/windows-x86_64 -DDEV=FALSE -DCMAKE_NEED_RESPONSE=YES -DMKL_MULTI_THREADED=TRUE -DSD_BUILD_TESTS=YES

The way the main build script works, it dynamically generates a set of flags suitable for use for building the projects. Understanding the build script will go a long way in to configuring cmake for your particular IDE.