* - provide nhwc format in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl conv3d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl batchnorm
Signed-off-by: Yurii <iuriish@yahoo.com>
* - corrections in mkl maxpooling2d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add format format_tag::any to outputs in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - complete corrections in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add test for comparison of execution speeds of mkl conv2d op with different weights format
Signed-off-by: Yurii <iuriish@yahoo.com>
* - take into account order f in mkl conv ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* Fixed sequence_mask op and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda fix for sequence_mask op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed sequence_mask op for both platforms and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed solve and triangular_solve for more than 2D for adjoint cases.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added adjoint solve test again.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a set of tests for triangual_solve and generic solve ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a pair tests for triangular_solve
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added tests for triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* wrapper
* builtins
* context mgr
* direct place
* ret all var
* call fix
* fix ndarray serde
* jobs
* try-with gil management
* cleanup
* exec tests passing
* list tests
* transforms test passing
* all pass
* headers
* dict fixes+test
* python path
* bool isinstance
* job tests
* nits
* transform fix+test
* transform tests
* leak fixes
* more mem leak fixes
* more fixes
* nits for adam
* PythonJob lombok builder
* checked exceptions
* more nits
* small leak fix
* more nits
* pythonexceptions
* fix jvm crash when bad python code
* Exception->PythonException
* Add support for boolean types in arrow records and ability to cast from float, double to int for TypeConversion (#178)
* nits for alex
* update tests
* fix test
* all pass
* refacc
* rem old code
* dtypes
* bytes working+exception pass through+cleanup (#209)
* more bytes tests
* header
* rem dummy test
* rem bad import
* alex nits + refacc
* Small error fixes (wrong type in msg) + minor formatting
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* use actual python type names (dictionary->dict, boolean->bool)
Co-authored-by: Shams Ul Azeem <shamsazeem20@gmail.com>
Co-authored-by: Alex Black <blacka101@gmail.com>
* linear equations systems solve op. Initial commit.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed compiling issues.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Linear equations systems solve. The next stage commit.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for linear equations systems solve operation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added additional test and fixed lower matrix retrievance.
* Implementation for solve of the systems of linear equations."
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored permutation generation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added restore for permutations batched with cuda helper for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished cuda implementation for solve op helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored cpu helpers for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fix gtest output on Windows
* Fixed issue with permutation matrix for cuda implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed issue with permutation matrix for cpu implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated waste comments.
Signed-off-by: shugeo <sgazeos@gmail.com>
* LinearSolve added
* Mapping added
* Javadoc added
* Refactored implementation of triangular_solve helpers and tests for solve matrix equations generally.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a test for solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Solve test added
* Fix for TF import
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Fixes for global pooling + masking with different mask datatypes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Global pooling backprop dtype fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Added isSkipped() to Observation
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Changed refacInitMdp to use isSkipped()
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Changed getHistoryProcessor()
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Fixed getEpochCounter() incorrectly changed to getCurrentEpochStep() calls
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* Removed StepCountable
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* Fix build
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* Fixed a problem in QLearningDiscrete and another in CartpoleNative
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Update versions of JavaCPP Presets for NumPy, MKL, Gym, and TensorFlow
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
* RL4J: Add ability to set a random seed for GymEnv
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
* StringUtils for utf convertor raw implementation of all possible combinations, need to be add counter of bytes per symbol for any type and add api to call convertors and store data
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor more corrections to support convertors
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor some corrections and bug fixes, need review to discuss how to add multi-threading
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 some corrections to move to multi-threading, add one test need discussion data inputs/outputs array presentation, need discussion the way of multi-threading
* StringUtils for utf convertor #8613 tests added some corrections to optimize build
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 some corrections and code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 code clean up and optimize usage, need update ndarray factory before replace std usage
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 some staff to integrate converters into NDArrayFactory, update tests and add some functionality
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 minor corrections and bug fix before discussion
* StringUtils for utf convertor #8613 some fixes and tets
* StringUtils for utf convertor #8613 some more staff to support different unicode
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 fix linking bug
* StringUtils for utf convertor #8613 corrected several tests as defaults for string ndarray changed
* StringUtils for utf convertor #8613 replace some incorrect implementation, revert some test changes, need sync before testing
* StringUtils for utf convertor #8613 fixed several thing that were badly implemented yesterday, need optimization, testing (before testing have to be add support of u32 and u16 buffer visualization)
* StringUtils for utf convertor #8613 fixed to support u16 and u32, and convertor in ndarray, fix buffer print, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 merge master and sync with server
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 some correction for string cast, need print check only asci support
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 merge master, remove copies and add cast, need test, refactoring according review and clean up
* StringUtils for utf convertor #8613 fixed cast and copy issues
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 fixed cuda and update tests
* StringUtils for utf convertor #8613 integration into NdArray, fix several tests for build pass, refactoring, etc
* - avoid ambiguity of NDArray ctrs overloading in some tests
Signed-off-by: Yurii <iuriish@yahoo.com>
* StringUtils for utf convertor #8613 NDArray string constructors added, updated NDArrayFactory, refactoring unicode and tests, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 fixed cuda build and test, refactoring and void* added to some functions
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 void* integration, removed copy operation, refactoring, added tests for NDArray string constructors, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 several more fixes, improvements and updates
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 master merge, code clean up and optimization before review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 minor fixes string element size define
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 revert last changes as mistake
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 fixed NDArray constructor build problem, remove order from string factory, fixed order use for factory via project, added catch of incorrect sync in cast of arrays to data types, fixed e method for strings, etc
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 added javacpp hack, added multi-threading, minor corrections in license agreement
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* StringUtils for utf convertor #8613 windows builds fix, as "sting" is not treated as utf8
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* - range op now accepts dargs
- dargs now can be in signature
Signed-off-by: raver119 <raver119@gmail.com>
* range dtype java side
Signed-off-by: raver119 <raver119@gmail.com>
* linspace fix
Signed-off-by: raver119 <raver119@gmail.com>
* lin_space fix for scalar outputs
Signed-off-by: raver119 <raver119@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* - one more test for OneHot with dtype
- one more signature in Nd4j
Signed-off-by: raver119 <raver119@gmail.com>
* ones_as/zeros_as now accept dtype
Signed-off-by: raver119 <raver119@gmail.com>
* one more test
Signed-off-by: raver119 <raver119@gmail.com>
* - more updates for configurable data types
- ones_as/zeros_as java side + tests
Signed-off-by: raver119 <raver119@gmail.com>
* few c++ tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes around DArgs
Signed-off-by: raver119 <raver119@gmail.com>
* - implementation of cudnn batchnorm_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - testing and fixing bugs in batchnorm_bp based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - move pooling mkl code and delete some unnecessary files
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling2d ops (avg/max, ff/bp)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation and testing cudnn pooling 3d (ff/bp) ops
Signed-off-by: Yurii <iuriish@yahoo.com>
* - provide ff step in case of cudnn maxpool3d_bp op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - remove half type from set of supported types in mkl dpethwise conv op
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bring back cudaStreamSynchronize in batchnorm and pooling cudnn ops
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Update Japanese translation for Deeplearning4J UI (#8525)
Signed-off-by: k-tamura <ktamura.biz.80@gmail.com>
* RL4J: Remove processing done on observations in Policy & Async (#8471)
* Removed processing from Policy.play() and fixed missing resets
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Adjusted unit test to check if DQNs have been reset
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Fixed a couple of problems, added and updated unit tests
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Removed processing from AsyncThreadDiscrete
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Fixed a few problems
Signed-off-by: unknown <aboulang2002@yahoo.com>
* python version bump
* increase
* RL4J: Replace gym-java-client with JavaCPP (#8595)
* RL4J: Replace gym-java-client with JavaCPP
Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Kohei Tamura <ktamura.biz.80@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Max Pumperla <max.pumperla@googlemail.com>
Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
* Cleanup modules
* Moving subprojects to nd4j-api
* Project cleanup
* Dropped AWS sub-project
* dl4j-util moved to core
* dl4j-perf moved to core
* Tests coverage
* Revert "Moving subprojects to nd4j-api"
This reverts commit bc6eb573c6b60c407ade47172c5d204725077e6b.
* Moved nd4j-buffer and nd4j-context to nd4j-api
* Rolled back change
* Revert "Project cleanup"
This reverts commit 64ac7f369b2d968f7be437718034f093fc886ffc.
* Datavec cleaned up
* Revert "Moved nd4j-buffer and nd4j-context to nd4j-api"
This reverts commit 75f4e8da80d2551e44e1251dd6c5923289fff8e1.
# Conflicts:
# nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/autodiff/opvalidation/ReductionBpOpValidation.java
* Resolve conflict
* Compilation fixed.
* nd4j-context and nd4j-buffer moved to nd4j-api
* Fixed TF mapping for mmul
* Fix for dl4j-cuda tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* Move last few tests from deeplearning4j-nn to -core
Signed-off-by: Alex Black <blacka101@gmail.com>
* Remove incorrect TF import mapping for TensorMmul op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Cleaned TF mapping
* Fix path for test results on windows
* Remove old dependency
Signed-off-by: Alex Black <blacka101@gmail.com>
* One more attempt to fix path for test results on windows
* fixup! One more attempt to fix path for test results on windows
* fixup! One more attempt to fix path for test results on windows
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Added test for issue with resize_area op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a pair of tests for resize_are op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored resize_area kernel to avoid shared memory overflow.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated prints with tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* ignore bad test
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed test with resize_area.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed test for float constants.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* missing alloc validation in RandomGenerator for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* set error message if rng alloc failed
Signed-off-by: raver119 <raver119@gmail.com>
* check for error code during RNG creation in java
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* kryo profile
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* Add maven profile + base tests methods for integration tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch from system property to environment variable; seems more reliable in intellij
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add nd4j-common-tests module, and common base test; cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ensure all ND4J tests extend BaseND4JTest
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test spam reduction, import fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add test logging to nd4j-aeron
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix unintended change
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Reduce sprint test log spam
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test spam cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Significantly speed up TSNE tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* W2V iterator test unit/integration split
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More NLP test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Avoid debug/verbose mode leaking between tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* test tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter extends base DL4J test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* nlp-uima test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix ND4J base test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Few small ND4J test speed improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J tests speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Even more test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Various test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Add ability to specify number of threads for C++ ops in BaseDL4JTest and BaseND4JTest
Signed-off-by: Alex Black <blacka101@gmail.com>
* nd4j-aeron test profile fix for CUDA
Signed-off-by: Alex Black <blacka101@gmail.com>
* Added qr op implementation. Initial version.
* Fixed doc for qr op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of QR decomposition. CPU platform version.
* Added a pair of tests for qr op testing.
Signed-off-by: shugeo <sgazeos@gmail.com>
* QR implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected norm using.
* Properly calculated intermediate results with QR decomposition.
* Another step to implement QR algorithm by householder.
* Cpu implementatio for QR decomposition. The first working edition.
* Corrected test to QR decomposition.
* Added tad multithreading with QR implementation.
* Finished cpu implementation for QR decomposition helpers.
* Refactored tests and improved multithreading.
* Refactored QR cpu implementation and update cuda implementation helpers.
* Cuda QR helper implementation. The first working edition.
* Eliminated waste prints.
* Restore multithreading with cuda implementation.
* Ops names corrected
* Refactored qr op helpers to optimize.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated waste manual ticking.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored memory allocation to avoid waste memory usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored matrixMinor method both for cuda and cpu platforms.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored method of vmul to use raw buffers instead type conversion.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored temporary array of matricies.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Added implementation of the triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed compilation issues.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added verification of input data and helpers facilities for triangular_solve op.'
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added cpu implementation for triangular_solve helpers.
* Added tests and implementation for upper triangular equations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a pair of cases to tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added multithreading with cpu helpers for triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added cuda implementation of triangular_solve op helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished cuda implementation of triangular_solve helpers and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed copyright marks.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected grammar errors with doc and error messages.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored matricies processing with triangular_solve cuda helper implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added triangular_solve wrapper
* Fixed mapping
* Added processing for adjoint with cpu helpers of triangular_solve op implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added implementation for adjoint routine with cuda platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added multithreading with adjoint routine for cpu platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Added implementation for resize_area op. Initial commit.
* Added implementation of resize_area op. Initial revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected resizeArea functor call.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of resize_area. Cpu platform helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation for resize_area helpers. The first part revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a set of tests for resize_area op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda implementation for resize_area. Initial approach.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adding multithreading for resize_area algorithm.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda implementation of resize_area helpers. Shared memory approach.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored resizeAreaKernel with cuda implementation.
* Eliminated compilation errors.
* ResizeArea helpers for cuda platform. The first working revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for batched resize_area op testing.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of resize_are for cuda platform and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed multithreading with resize_area op helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright marks with sources.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright mark for resize_area op implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright mark for parity ops header.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected typo in strings and so on with image resize ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored resize_area helpers and multithreading.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added ResizeArea wrapper
* Added test with align_corners and fixed shape processing with only int args given for output size.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test
* TF mapping for ResizeArea
* Fixed implementation issues with resize_area op for both platforms.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored image resizer struct to use flexible types for ints and floats.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Improved multithreading with resizeAreaKernel launch.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Use asynchronical memory copying with cuda platform image resize allocations.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after cudnn lib had been started to use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementaion of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporary
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 first step of Pow_bp operation implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some corrections of calculation steps
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some bug fixes, the PowDerevative op made broadcastable, add the raw tests for op, need refactoring to use broadcast ops
* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed several bugs add broadcast support and tests, need to fix scalar+array and array+scalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed bugs for scalar inputs, fixed multinomial tests, added tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 fised bugs for different shapes support, tests updated
* Libnd4j: Add broadcastable elementwise power derivative #7461 applied all possible variants via tiled arrays, add support of broadcast for Pow and PowDerivative ops, covered by tests, before review have to be replaced tiled implementation by applyTrueBroadcast
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 replaced tile by broadcast implementation, fixed issue with negative x input, corrected tests, need additional testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 added and corrected test cases, corrected implementation need review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up
* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up, removed some tests, add tests with scalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 code improvement and clean up, split tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative replace __isnanf by internal realization
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* pow_bp wrapper
* Fixed PowBp wrapper
* Tests added
* Test fixed
* Fix return type
* Disable powBp usage
* Pow backprop changed
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>