Commit Graph

22 Commits (13cae7fb60283966b45e126c4564c86b3e34ab53)

Author SHA1 Message Date
agibsonccc 18d165c915 Update cuda compute defaults and ctc loss 2021-03-12 18:58:45 +09:00
agibsonccc e10b172376 Merge branch 'master' into ag_build_fixes 2021-03-11 11:29:24 +09:00
agibsonccc f0b6b517c3 Update 2021-03-11 08:34:31 +09:00
AbdelRauf becdc100ee cross_build: added android {x86, x86_64} and jetson {cross arm64 + host cuda + target cuda aarch64}
jetson note: host cuda toolkit should be installed to setup cross environment

Signed-off-by: AbdelRauf <rauf@konduit.ai>
2021-03-09 02:23:07 +01:00
agibsonccc 6def2ed68d Libnd4j build changes 2021-03-05 10:59:02 +09:00
agibsonccc 1eaee7f6d9 Copyright updates, removal of extra nlp modules 2021-02-18 11:46:53 +09:00
agibsonccc c715aea405 Update LICENSE 2021-02-01 17:47:29 +09:00
agibsonccc 65c6a9a42e Dev commits 2021-02-01 14:31:20 +09:00
Adam Gibson f9aebec79e
Development updates (#9098)
* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Fix L2NormalizeVertex and eclipse#9054 (#513)

* update

* Fix L2NormalizeVertex

Fix eclipse#9054

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

* Python GIL overhaul (#517)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Ag pythongiloverhaul (#518)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

* Re update python4j

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Bump formatter-maven-plugin from 2.0.0 to 2.12.1 (#505)

Bumps [formatter-maven-plugin](https://github.com/revelc/formatter-maven-plugin) from 2.0.0 to 2.12.1.
- [Release notes](https://github.com/revelc/formatter-maven-plugin/releases)
- [Changelog](https://github.com/revelc/formatter-maven-plugin/blob/formatter-maven-plugin-2.12.1/CHANGELOG.md)
- [Commits](https://github.com/revelc/formatter-maven-plugin/compare/formatter-maven-plugin-2.0.0...formatter-maven-plugin-2.12.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Ag fix9060 (#519)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec code cleaup (#9071)

* removed unnecessary semicolons

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Use standard charset object

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed unused imports

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* WIP: Fix Conv1d causal case

* Add inital tests

* Update Conv1d tests to be a bit more robust

* Remove redundant test

* Reset from master

* Remove cuda definition (left over)

* Update rl4j again

* Update pom.xml

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Fixes 9061 (#521)

* Get rid of edge case in validation

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Using embedded copying of an array instead of manual (#9073)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec bulk operation (#9075)

* Bulk operation can be used instead of iteration inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Redundant 'Collection.addAll()' call inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed infinitely loop (#9076)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Revert "Merge eclipse changes" (#526)

* Revert rl4j to 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182 (#527)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Fix L2NormalizeVertex and eclipse#9054 (#513)

* update

* Fix L2NormalizeVertex

Fix eclipse#9054

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

* Python GIL overhaul (#517)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Ag pythongiloverhaul (#518)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

* Re update python4j

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Bump formatter-maven-plugin from 2.0.0 to 2.12.1 (#505)

Bumps [formatter-maven-plugin](https://github.com/revelc/formatter-maven-plugin) from 2.0.0 to 2.12.1.
- [Release notes](https://github.com/revelc/formatter-maven-plugin/releases)
- [Changelog](https://github.com/revelc/formatter-maven-plugin/blob/formatter-maven-plugin-2.12.1/CHANGELOG.md)
- [Commits](https://github.com/revelc/formatter-maven-plugin/compare/formatter-maven-plugin-2.0.0...formatter-maven-plugin-2.12.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Ag fix9060 (#519)

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec code cleaup (#9071)

* removed unnecessary semicolons

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Use standard charset object

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed unused imports

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* WIP: Fix Conv1d causal case

* Add inital tests

* Update Conv1d tests to be a bit more robust

* Remove redundant test

* Reset from master

* Remove cuda definition (left over)

* Update rl4j again

* Update pom.xml

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* Fixes 9061 (#521)

* Get rid of edge case in validation

* Added support for the archunit (#9062)

* Added support for the archunit

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Updated pom files

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Using embedded copying of an array instead of manual (#9073)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Datavec bulk operation (#9075)

* Bulk operation can be used instead of iteration inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Redundant 'Collection.addAll()' call inspection

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Removed infinitely loop (#9076)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

(cherry picked from commit 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182)

* RL4J: Add async training and advantage actor-critic (#507)

* Added async training & Advantage Actor Critic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compiler error

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Renamed ActorCriticPolicy back to ACPolicy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>

(cherry picked from commit 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182)

* Revert rl4j to 72f5c18c830f62df2c04fbf8dc7b1353cc2d3182

* Delete jnind4jaurora.cpp

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* RL4J: Add partial support for RNN (#514)

* Added partial recurrent support

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Made sure the RNN always see the observation in EpsGreedy

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Converted all line endings of rl4j-core to LF (#530)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* NDJ4: Bundle configuration files required by AOT compilation with GraalVM (#529)

* NDJ4: Bundle configuration files required by AOT compilation with GraalVM

* Update dependencies to just released JavaCPP and JavaCV 1.5.4

* Ag fixtests 831 (#523)

* Update UnderSamplingPreProcessorTest.java

* Development updates (#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving of cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish to resolve conflicts after master has been merged

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation mistakes of cuda stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NAtiveOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependecy duplication in python4j-parent pom

* Fix group id for in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Development updates (#9064)

 * Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Add proper annotation

* Fix classcast exception for recurrent model import case

* Update keras import to allow for proper handling of changing NCHW -> NHWC mid later

* Add output to test to ensure proper activation

* Fixes computation graphs to allow dimension ordering to change mid graph

* Add NHWC support for keras import.

* Update tests to pass /ignore out of date ones

* Add  multi RNNDataformat  support

* Update tests to make more pass.

Updates some tests to be correct, double checked existing models and updated reasons they may or may  not fail.

* Add back old default values to ensure legacy serialization works.  Replace null value default with sentinel value for default value overridden.

* Update layers to preserve changed values

* Exclude default value over ridden from comparison

* Fix conv1d import (no permute weights anymore)

* Update KerasConvolution1D.java

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* GPU compute capability  (#532)

* - GPU cpu capability flags
- CUDA MAJOR VERSION provided by cmake

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Readme

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Readme

Signed-off-by: AbdelRauf <rauf@konduit.ai>

* RL4J: Add new network implementation to help support recurrent networks (#531)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Abdelrauf <qwr@live.ru>
2020-09-23 19:11:29 +09:00
Serhii Shepel 3cbba49518
Bugfix failing builds (#341)
* Fix interpreter for libnd4j tests and drop test script

* Remove mingw when specifying javacpp.platform, add new profile that triggers when javacpp.platform is windows-x86_64

* Update android 32 bit toolchain for x86

* Try triple instead of -target

* Change to -target

* Update 32 bit arm

* Change android bin path

* Update arm 32 bit build again

Co-authored-by: Adam Gibson <1144306+agibsonccc@users.noreply.github.com>
2020-03-24 12:55:47 +11:00
Adam Gibson 30a28fae45
Windows fix (#333)
* Fix cmake detection in msys

* Revert windows change

* Update to unix line endings
2020-03-20 12:14:03 +09:00
Adam Gibson 0cf4a45573
Fixes #8763 (#310)
* Fix cmake detection in msys

* Fix toolchain file on windows

* Make android 64 bit work

* Fix libnd4j build script on msys

* Update build script for windows/linux

* Encoding issue for ci

* Update pom.xml

* Update pom.xml

* Update pom.xml

* Remove mingw

* Ensure android x86 builds are inline with arm builds

* Update toolchains and env variables for x86

* Move profile for build program up to parent

* Fix blas vendor and add comment

* Update cuda presets version

* Set default value and move properties back to child pom

* Change program from hard coded to use the script as the program

* Update pom.xml

* Update pom.xml

* Static lib fix

* Update static lib output

* Get rid of old comments

* Update static for buiding
2020-03-19 14:53:21 +09:00
raver119 63fa3c2ef3
libnd4j polishing (#273)
* initial set of include changes

Signed-off-by: raver119 <raver119@gmail.com>

* one more tweak

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* cuda includes rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>

* = namespace changed to sd
- few CMake variables renamed with SD_ prefix

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>

* LoopKind minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* sanitizer is optional now

Signed-off-by: raver119 <raver119@gmail.com>

* dev tests updated

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* last update

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 12:49:41 +03:00
raver119 e78be14cc1
arm fix (#260)
* range check for scalar_int

Signed-off-by: raver119 <raver119@gmail.com>

* no simd

Signed-off-by: raver119 <raver119@gmail.com>

* no ops

Signed-off-by: raver119 <raver119@gmail.com>

* cyclic shift?

Signed-off-by: raver119 <raver119@gmail.com>

* left split

Signed-off-by: raver119 <raver119@gmail.com>

* left split

Signed-off-by: raver119 <raver119@gmail.com>

* rot ops unrolled templates

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64 2

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64 3

Signed-off-by: raver119 <raver119@gmail.com>

* ARM_BUILD declared

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-21 14:31:00 +03:00
Abdelrauf f25056363b auto-vectorization check for gcc (#172)
* Autovectorization tool:
- sync output for gnu make
- Reduced html output
- links for line numbers
- AutoVectorization.md
Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Detailed report with `-fsave-optimization-record` option
Signed-off-by: AbdelRauf <rauf@konduit.ai>

* Readme

Signed-off-by: AbdelRauf <rauf@konduit.ai>

Co-authored-by: raver119 <raver119@gmail.com>
2020-01-28 19:00:12 +03:00
raver119 7783012f39
cuDNN integration (#150)
* initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* one file

Signed-off-by: raver119 <raver119@gmail.com>

* few more includes

Signed-off-by: raver119 <raver119@gmail.com>

* m?

Signed-off-by: raver119 <raver119@gmail.com>

* const

Signed-off-by: raver119 <raver119@gmail.com>

* cudnn linkage in tests

Signed-off-by: raver119 <raver119@gmail.com>

* culibos

Signed-off-by: raver119 <raver119@gmail.com>

* static reminder

Signed-off-by: raver119 <raver119@gmail.com>

* platform engine tag

Signed-off-by: raver119 <raver119@gmail.com>

* HAVE_CUDNN moved to config.h.in

Signed-off-by: raver119 <raver119@gmail.com>

* include

Signed-off-by: raver119 <raver119@gmail.com>

* include

Signed-off-by: raver119 <raver119@gmail.com>

* skip cudnn handle creation if there's not cudnn

Signed-off-by: raver119 <raver119@gmail.com>

* meh

Signed-off-by: raver119 <raver119@gmail.com>

* target device in context

Signed-off-by: raver119 <raver119@gmail.com>

* platform engines

Signed-off-by: raver119 <raver119@gmail.com>

* platform engines

Signed-off-by: raver119 <raver119@gmail.com>

* allow multiple -h args

Signed-off-by: raver119 <raver119@gmail.com>

* allow multiple -h args

Signed-off-by: raver119 <raver119@gmail.com>

* move mkldnn out of CPU block

Signed-off-by: raver119 <raver119@gmail.com>

* link to mkldnn on cuda

Signed-off-by: raver119 <raver119@gmail.com>

* less prints

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* next step

Signed-off-by: raver119 <raver119@gmail.com>

* conv2d NCHW draft

Signed-off-by: raver119 <raver119@gmail.com>

* conv2d biasAdd

Signed-off-by: raver119 <raver119@gmail.com>

* test for MKL/CUDNN combined use

Signed-off-by: raver119 <raver119@gmail.com>

* - provide additional code for conv2d ff based on cudnn api, not tested yet

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on conv2d helper based on using cudnn api

Signed-off-by: Yurii <iuriish@yahoo.com>

* - fixing several cuda bugs which appeared after cudnn lib had been started to use

Signed-off-by: Yurii <iuriish@yahoo.com>

* - implementation of conv2d backprop op based on cudnn api

Signed-off-by: Yurii <iuriish@yahoo.com>

* - implementaion of conv3d and conv3d_bp ops based on cudnn api

Signed-off-by: Yurii <iuriish@yahoo.com>

* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)

Signed-off-by: Yurii <iuriish@yahoo.com>

* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api

Signed-off-by: Yurii <iuriish@yahoo.com>

* - implementation of batchnorm ff op based on cudnn api

Signed-off-by: Yurii <iuriish@yahoo.com>

* - disable cudnn batchnorm temporary

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add minor change in cmake

Signed-off-by: Yurii <iuriish@yahoo.com>

* engine for depthwise mkldnn

Signed-off-by: raver119 <raver119@gmail.com>

* couple of includes

Signed-off-by: raver119 <raver119@gmail.com>

* - provide permutation to cudnn batchnorm ff when format is NHWC

Signed-off-by: Yurii <iuriish@yahoo.com>

* lgamma fix

Signed-off-by: raver119 <raver119@gmail.com>

* - eliminate memory leak in two tests

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
2020-01-20 21:32:46 +03:00
Samuel Audet e51e6ebfd2 Update CMake toolchains for more recent versions of Android NDK (#8502) 2019-12-05 12:46:01 +02:00
raver119 25b3cd9b80
[WIP] CUDA tests (#95)
* one more CI test

Signed-off-by: raver119 <raver119@gmail.com>

* export additional symbols

Signed-off-by: raver119 <raver119@gmail.com>

* few more tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* one more tweak for linux

Signed-off-by: raver119 <raver119@gmail.com>

* fix dtype in few tests

Signed-off-by: raver119 <raver119@gmail.com>

* missing sync and memset in couple of tests

Signed-off-by: raver119 <raver119@gmail.com>

* copy step for libnd4j cuda

Signed-off-by: raver119 <raver119@gmail.com>

* no-op on empty for adjust hue/contrast/saturation

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA_VERBOSE Off

Signed-off-by: raver119 <raver119@gmail.com>

* BroadcastBool fix + few tests

Signed-off-by: raver119 <raver119@gmail.com>

* trigger jenkins

Signed-off-by: raver119 <raver119@gmail.com>

* trigger jenkins

Signed-off-by: raver119 <raver119@gmail.com>

* - ignore couple of warnings
- remove redundant compiler options

Signed-off-by: raver119 <raver119@gmail.com>
2019-12-02 21:37:21 +03:00
raver119 6de00bf75f
[WIP] Weekly update of repo (#8390)
* [WIP] Fix compilation after nd4j changes (#37)

* Fix compilation.

* Some tests fixed

* Disable tests temporarily.

* Restored test

* Tests restored.

* Test restored.

* [WIP] perf tests (#40)

* special maxpool test

Signed-off-by: raver119 <raver119@gmail.com>

* special maxpool test

Signed-off-by: raver119 <raver119@gmail.com>

* Shyrma bnorm bp (#41)

Batchnorm backprop mkldnn

* Add SameDiff memory reuse memory manager (array cache) (#39)

* Attention op comments

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr - first pass

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tweak array cache for use with SameDiff identity arrays

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr javadoc and properly get max memory

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* LRU cache policy + add tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Resize arrays internally if required for ArrayCacheMemoryMgr

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test improvement

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Small polish

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* SameDiff op runtime benchmarking listener (#42)

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* INLINE_LOOPS for windows

Signed-off-by: raver119 <raver119@gmail.com>

* [WIP] ThreadPool (#8)

This PR removes OpenMP use in 95% of cases
2019-11-13 17:15:18 +03:00
raver119 98e2814879
Platform helpers (#8216)
* platform helpers draft

Signed-off-by: raver119 <raver119@gmail.com>

* typo

Signed-off-by: raver119 <raver119@gmail.com>

* disable platform cmake

Signed-off-by: raver119 <raver119@gmail.com>

* another draft

Signed-off-by: raver119 <raver119@gmail.com>

* mkldnn convolution refactored

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* one more safety check

Signed-off-by: raver119 <raver119@gmail.com>

* prototype works

Signed-off-by: raver119 <raver119@gmail.com>

* meh

Signed-off-by: raver119 <raver119@gmail.com>

* force static library mode for mkldnn

Signed-off-by: raver119 <raver119@gmail.com>

* - ismax fix
- experimental arg fix
- don't enforce openblas on Apple hardware

Signed-off-by: raver119 <raver119@gmail.com>

* bunch of small fixes

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* declare concurrent

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* - MKLDNN version upgrade to 1.0.2
- avgpool2d/maxpool2d APIs update

Signed-off-by: raver119 <raver119@gmail.com>

* - avgpool2d_bp/maxpool2d_bp APIs update

Signed-off-by: raver119 <raver119@gmail.com>

* - conv2d/batchnorm APIs update

Signed-off-by: raver119 <raver119@gmail.com>

* - lrn/conv2d_bp/conv3d/conv3d_bp APIs update

Signed-off-by: raver119 <raver119@gmail.com>

* all ops converted to MKLDNN 1.x

Signed-off-by: raver119 <raver119@gmail.com>

* bunch of tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* namespace for platform helpers

Signed-off-by: raver119 <raver119@gmail.com>

* make sure platform helpers aren't opimized out

Signed-off-by: raver119 <raver119@gmail.com>

* build cpu_features on x86 systems

Signed-off-by: raver119 <raver119@gmail.com>

* build cpu_features on x86 systems

Signed-off-by: raver119 <raver119@gmail.com>

* more of cpu_features

Signed-off-by: raver119 <raver119@gmail.com>

* - mkldnn removed from java
- cpu_features checks in CpuNDArrayFactory

Signed-off-by: raver119 <raver119@gmail.com>

* F16C definition renamed

Signed-off-by: raver119 <raver119@gmail.com>

* some mkldnn rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* check supported instructions before doing anything

Signed-off-by: raver119 <raver119@gmail.com>

* typo

Signed-off-by: raver119 <raver119@gmail.com>

* missied impl

Signed-off-by: raver119 <raver119@gmail.com>

* BUILD_PIC option

Signed-off-by: raver119 <raver119@gmail.com>

* conv2d fix

Signed-off-by: raver119 <raver119@gmail.com>

* avgpool3d fix

Signed-off-by: raver119 <raver119@gmail.com>

* avgpool3d_bp fix

Signed-off-by: raver119 <raver119@gmail.com>

* avgpool2d_bp leak fix

Signed-off-by: raver119 <raver119@gmail.com>

* avgpool3d_bp leak fix

Signed-off-by: raver119 <raver119@gmail.com>

* maxpool bp leaks fixed

Signed-off-by: raver119 <raver119@gmail.com>

* printf removed

Signed-off-by: raver119 <raver119@gmail.com>

* batchnorm fix

Signed-off-by: raver119 <raver119@gmail.com>

* AVX warning/error polishing

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fix

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* More polish

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Polish

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* remove previous MKL-DNN support layer

Signed-off-by: raver119 <raver119@gmail.com>

* avx2 tweak

Signed-off-by: raver119 <raver119@gmail.com>

* allow static for apple

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* exclude mkldnn in one more place

Signed-off-by: raver119 <raver119@gmail.com>

* exclude mkldnn in one more place

Signed-off-by: raver119 <raver119@gmail.com>

* restore OPENBLAS_PATH use

Signed-off-by: raver119 <raver119@gmail.com>

* add runtime check for avx/avx2 support

Signed-off-by: raver119 <raver119@gmail.com>

* convolution_auto

Signed-off-by: raver119 <raver119@gmail.com>

* Add logic for helper argument

* minor test fix

Signed-off-by: raver119 <raver119@gmail.com>

* few tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* few tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* skip OpTracker props for non-x86 builds

Signed-off-by: raver119 <raver119@gmail.com>

* linux arm isn't x86 :)

Signed-off-by: raver119 <raver119@gmail.com>

* avx-512

Signed-off-by: raver119 <raver119@gmail.com>

* CUDA presets fix

Signed-off-by: raver119 <raver119@gmail.com>

* BUILD_PIC

Signed-off-by: raver119 <raver119@gmail.com>

* prefetchw for avx2

Signed-off-by: raver119 <raver119@gmail.com>

* BUILD_PIC again

Signed-off-by: raver119 <raver119@gmail.com>
2019-09-11 21:50:28 +03:00
raver119 85e212fece
exclude memory tracker for android/ios/macos platforms (#8005)
Signed-off-by: raver119 <raver119@gmail.com>
2019-07-11 18:28:19 +03:00
skymindops b5f0ec072f Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00