* Cleanup modules
* Moving subprojects to nd4j-api
* Project cleanup
* Dropped AWS sub-project
* dl4j-util moved to core
* dl4j-perf moved to core
* Tests coverage
* Revert "Moving subprojects to nd4j-api"
This reverts commit bc6eb573c6b60c407ade47172c5d204725077e6b.
* Moved nd4j-buffer and nd4j-context to nd4j-api
* Rolled back change
* Revert "Project cleanup"
This reverts commit 64ac7f369b2d968f7be437718034f093fc886ffc.
* Datavec cleaned up
* Revert "Moved nd4j-buffer and nd4j-context to nd4j-api"
This reverts commit 75f4e8da80d2551e44e1251dd6c5923289fff8e1.
# Conflicts:
# nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/autodiff/opvalidation/ReductionBpOpValidation.java
* Resolve conflict
* Compilation fixed.
* nd4j-context and nd4j-buffer moved to nd4j-api
* Fixed TF mapping for mmul
* Fix for dl4j-cuda tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* Move last few tests from deeplearning4j-nn to -core
Signed-off-by: Alex Black <blacka101@gmail.com>
* Remove incorrect TF import mapping for TensorMmul op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Cleaned TF mapping
* Fix path for test results on windows
* Remove old dependency
Signed-off-by: Alex Black <blacka101@gmail.com>
* One more attempt to fix path for test results on windows
* fixup! One more attempt to fix path for test results on windows
* fixup! One more attempt to fix path for test results on windows
Co-authored-by: Alex Black <blacka101@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: raver119 <raver119@gmail.com>
* missing alloc validation in RandomGenerator for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* set error message if rng alloc failed
Signed-off-by: raver119 <raver119@gmail.com>
* check for error code during RNG creation in java
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* nd4j-aeron profiles
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* skip one long test
Signed-off-by: raver119 <raver119@gmail.com>
* kryo profile
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* few more profiles
Signed-off-by: raver119 <raver119@gmail.com>
* Add maven profile + base tests methods for integration tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Switch from system property to environment variable; seems more reliable in intellij
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add nd4j-common-tests module, and common base test; cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ensure all ND4J tests extend BaseND4JTest
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test spam reduction, import fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Add test logging to nd4j-aeron
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix unintended change
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Reduce sprint test log spam
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test spam cleanup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Significantly speed up TSNE tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* W2V iterator test unit/integration split
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More NLP test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Avoid debug/verbose mode leaking between tests
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* test tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter extends base DL4J test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Arbiter test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* nlp-uima test speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix ND4J base test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Few small ND4J test speed improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* DL4J tests speedup
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Even more test speedups
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* More tweaks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Various test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* More test fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Add ability to specify number of threads for C++ ops in BaseDL4JTest and BaseND4JTest
Signed-off-by: Alex Black <blacka101@gmail.com>
* nd4j-aeron test profile fix for CUDA
Signed-off-by: Alex Black <blacka101@gmail.com>
* Added qr op implementation. Initial version.
* Fixed doc for qr op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of QR decomposition. CPU platform version.
* Added a pair of tests for qr op testing.
Signed-off-by: shugeo <sgazeos@gmail.com>
* QR implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected norm using.
* Properly calculated intermediate results with QR decomposition.
* Another step to implement QR algorithm by householder.
* Cpu implementatio for QR decomposition. The first working edition.
* Corrected test to QR decomposition.
* Added tad multithreading with QR implementation.
* Finished cpu implementation for QR decomposition helpers.
* Refactored tests and improved multithreading.
* Refactored QR cpu implementation and update cuda implementation helpers.
* Cuda QR helper implementation. The first working edition.
* Eliminated waste prints.
* Restore multithreading with cuda implementation.
* Ops names corrected
* Refactored qr op helpers to optimize.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Eliminated waste manual ticking.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored memory allocation to avoid waste memory usage.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored matrixMinor method both for cuda and cpu platforms.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored method of vmul to use raw buffers instead type conversion.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored temporary array of matricies.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Added implementation of the triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed compilation issues.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added verification of input data and helpers facilities for triangular_solve op.'
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added cpu implementation for triangular_solve helpers.
* Added tests and implementation for upper triangular equations.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a pair of cases to tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added multithreading with cpu helpers for triangular_solve op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added cuda implementation of triangular_solve op helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished cuda implementation of triangular_solve helpers and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed copyright marks.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected grammar errors with doc and error messages.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored matricies processing with triangular_solve cuda helper implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added triangular_solve wrapper
* Fixed mapping
* Added processing for adjoint with cpu helpers of triangular_solve op implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added implementation for adjoint routine with cuda platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added multithreading with adjoint routine for cpu platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Added implementation for resize_area op. Initial commit.
* Added implementation of resize_area op. Initial revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected resizeArea functor call.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of resize_area. Cpu platform helpers.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation for resize_area helpers. The first part revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a set of tests for resize_area op.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda implementation for resize_area. Initial approach.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Adding multithreading for resize_area algorithm.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cuda implementation of resize_area helpers. Shared memory approach.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored resizeAreaKernel with cuda implementation.
* Eliminated compilation errors.
* ResizeArea helpers for cuda platform. The first working revision.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for batched resize_area op testing.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implementation of resize_are for cuda platform and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed multithreading with resize_area op helper.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright marks with sources.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright mark for resize_area op implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected copyright mark for parity ops header.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected typo in strings and so on with image resize ops.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored resize_area helpers and multithreading.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added ResizeArea wrapper
* Added test with align_corners and fixed shape processing with only int args given for output size.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test
* TF mapping for ResizeArea
* Fixed implementation issues with resize_area op for both platforms.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored image resizer struct to use flexible types for ints and floats.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Improved multithreading with resizeAreaKernel launch.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Use asynchronical memory copying with cuda platform image resize allocations.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* one file
Signed-off-by: raver119 <raver119@gmail.com>
* few more includes
Signed-off-by: raver119 <raver119@gmail.com>
* m?
Signed-off-by: raver119 <raver119@gmail.com>
* const
Signed-off-by: raver119 <raver119@gmail.com>
* cudnn linkage in tests
Signed-off-by: raver119 <raver119@gmail.com>
* culibos
Signed-off-by: raver119 <raver119@gmail.com>
* static reminder
Signed-off-by: raver119 <raver119@gmail.com>
* platform engine tag
Signed-off-by: raver119 <raver119@gmail.com>
* HAVE_CUDNN moved to config.h.in
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* include
Signed-off-by: raver119 <raver119@gmail.com>
* skip cudnn handle creation if there's not cudnn
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* target device in context
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* platform engines
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* allow multiple -h args
Signed-off-by: raver119 <raver119@gmail.com>
* move mkldnn out of CPU block
Signed-off-by: raver119 <raver119@gmail.com>
* link to mkldnn on cuda
Signed-off-by: raver119 <raver119@gmail.com>
* less prints
Signed-off-by: raver119 <raver119@gmail.com>
* minor tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* next step
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d NCHW draft
Signed-off-by: raver119 <raver119@gmail.com>
* conv2d biasAdd
Signed-off-by: raver119 <raver119@gmail.com>
* test for MKL/CUDNN combined use
Signed-off-by: raver119 <raver119@gmail.com>
* - provide additional code for conv2d ff based on cudnn api, not tested yet
Signed-off-by: Yurii <iuriish@yahoo.com>
* - further work on conv2d helper based on using cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - fixing several cuda bugs which appeared after cudnn lib had been started to use
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of conv2d backprop op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementaion of conv3d and conv3d_bp ops based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - bugs fixing in conv3d/conv3d_bp ops (cudnn in use)
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of depthwiseConv2d (ff/bp) op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - implementation of batchnorm ff op based on cudnn api
Signed-off-by: Yurii <iuriish@yahoo.com>
* - disable cudnn batchnorm temporary
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add minor change in cmake
Signed-off-by: Yurii <iuriish@yahoo.com>
* engine for depthwise mkldnn
Signed-off-by: raver119 <raver119@gmail.com>
* couple of includes
Signed-off-by: raver119 <raver119@gmail.com>
* - provide permutation to cudnn batchnorm ff when format is NHWC
Signed-off-by: Yurii <iuriish@yahoo.com>
* lgamma fix
Signed-off-by: raver119 <raver119@gmail.com>
* - eliminate memory leak in two tests
Signed-off-by: Yurii <iuriish@yahoo.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 first step of Pow_bp operation implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some corrections of calculation steps
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some bug fixes, the PowDerevative op made broadcastable, add the raw tests for op, need refactoring to use broadcast ops
* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed several bugs add broadcast support and tests, need to fix scalar+array and array+scalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 fixed bugs for scalar inputs, fixed multinomial tests, added tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 fised bugs for different shapes support, tests updated
* Libnd4j: Add broadcastable elementwise power derivative #7461 applied all possible variants via tiled arrays, add support of broadcast for Pow and PowDerivative ops, covered by tests, before review have to be replaced tiled implementation by applyTrueBroadcast
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 replaced tile by broadcast implementation, fixed issue with negative x input, corrected tests, need additional testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 added and corrected test cases, corrected implementation need review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up
* Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up, removed some tests, add tests with scalar
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 code improvement and clean up, split tests
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative #7461 some code clean up
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* Libnd4j: Add broadcastable elementwise power derivative replace __isnanf by internal realization
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* pow_bp wrapper
* Fixed PowBp wrapper
* Tests added
* Test fixed
* Fix return type
* Disable powBp usage
* Pow backprop changed
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* SameDiff exec: Fix for switch op when predicate is constant, and op is inside loop
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Update ignores for failing zoo models
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8555 SameDiff profiler analysis improvements
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix TF sub-op aggregation
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small filtering tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* Copyright headers
Signed-off-by: Alex Black <blacka101@gmail.com>
* Allow scalar op result array auto allocation
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Don't swallow underlying exception for calculateOutputShape execution failures
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ignore for known keras failure
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profiler
Signed-off-by: Alex Black <blacka101@gmail.com>
* Next steps, polishing, and loading SD/TF format JSON
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profile comparison method
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Make profiling result writing async to reduce main thread overhead
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profiling polishing
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profile analyzer fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Polish
Signed-off-by: Alex Black <blacka101@gmail.com>
* Cleanup
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small formatting improvement
Signed-off-by: Alex Black <blacka101@gmail.com>
* Formatting tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* License headers
Signed-off-by: Alex Black <blacka101@gmail.com>
* Timeouts added
* Added some ops
* Ops added
* Fixed tests
* Minor fix
* Some fixes
* Digamma added
* Small fixes
* Timeouts added
* Added some ops
* Ops added
* Fixed tests
* Minor fix
* Some fixes
* Digamma added
* Small fixes
* Fused batch norm fixes-
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tests switched off.
* Added test for resize_bicubic.
* Eliminated wasted in test of bicubic resize.
* Switched off multithreading explicit.
* HsvToRgb and RgbToHsv added
* Eliminated waste comments and conform proper float constants.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed multithreading with resize_bicubic helper for cpu platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* ResizeBicubic was fixed.
* Some fixes
* Fix op name
* Validation fixed.
* Clarifications for tests
* Wrappers and small fixes for new ops.
* Add op counting to TensorFlowImportValidator
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Test tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* resize_bicubic: allow more dtypes
Signed-off-by: raver119 <raver119@gmail.com>
* resize_bicubic: allow less dtypes
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored resize_bicubic op to full conform with TF1.5 and tests.
* Corrected test to proper data type output.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected double input test to float constant outputs.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished with correction of tests for bicubic interpolated resizes expected.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed adjust_contrast ops to allow non-RGB inputs.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored adjust_contrast_v2 to conform with TF one.
Signed-off-by: shugeo <sgazeos@gmail.com>
* AdjustContrast tests activated
* two typos fixed
Signed-off-by: raver119 <raver119@gmail.com>
* cleaned up bert iterator tests (#110)
Signed-off-by: eraly <susan.eraly@gmail.com>
* Various pre-release fixes (#111)
* Various fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix default dtypes for MaxPoolWithArgmax
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small pre-release tweak (#112)
* Log UI address on launch as in previous Play-based UI
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Logging level tweak for UI
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* http not https
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* datavec python ensure host (#113)
* ensure host
* one more host ensure
* info->debug
* [WIP] reverse improvements (#115)
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* reverse draft
Signed-off-by: raver119 <raver119@gmail.com>
* reverse kernel
Signed-off-by: raver119 <raver119@gmail.com>
* reverse kernel
Signed-off-by: raver119 <raver119@gmail.com>
* 2 micro fixes
Signed-off-by: raver119 <raver119@gmail.com>
* Shugeo resize fix5 (#102)
* Refactored resize images ops to use TF-like bool args as input.
* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.
* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.
* Refactored nearest_neighbor resize op.
* Added a pair of tests for special case of resize_bilinear algorithm.
* Fixed issue with resize_bilinear op.
* Refactored cpu implementation for helpers with resize_nearest_neighbor op.
* Final fixed for resize ops to conform TF v.1.5
* Refactored cuda helpers for resize_neares_neighbor op.
* Fixed resize_bilinear to accept proper data.
* Fixed issue with non-float input for resize_bilinear op.
* Refactored cuda helper for resize_bilinear to proper process non-float inputs.
* Added tests for resize_bilinear to int inputs.
* Fixed ResizeBilinear wrapper
* Tests fixed
* Fixed float and bool constant to avoid overflow for some kind of compilers.
* Corrected float constants with float data type.
* Added f suffix for float constants.
* Corrected float constant to avoid overflow with initializing lists.
* Corrected float initializing list with float input.
* Corrected bool constant with initalizing list.
* Corrected float and bool values with initializing lists.
* Fixed wrong constant.
* Fixed issue with 1x1 input picture for resize.
* ResizeBilinear default values on import fix
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored resize images ops to use TF-like bool args as input.
* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.
* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.
* Refactored nearest_neighbor resize op.
* Added a pair of tests for special case of resize_bilinear algorithm.
* Fixed issue with resize_bilinear op.
* Refactored cpu implementation for helpers with resize_nearest_neighbor op.
* Final fixed for resize ops to conform TF v.1.5
* Refactored cuda helpers for resize_neares_neighbor op.
* Fixed resize_bilinear to accept proper data.
* Fixed issue with non-float input for resize_bilinear op.
* Refactored cuda helper for resize_bilinear to proper process non-float inputs.
* Added tests for resize_bilinear to int inputs.
* Fixed ResizeBilinear wrapper
* Tests fixed
* Fixed float and bool constant to avoid overflow for some kind of compilers.
* Corrected float constants with float data type.
* Added f suffix for float constants.
* Corrected float constant to avoid overflow with initializing lists.
* Corrected float initializing list with float input.
* Corrected bool constant with initalizing list.
* Corrected float and bool values with initializing lists.
* Fixed wrong constant.
* Fixed issue with 1x1 input picture for resize.
* ResizeBilinear default values on import fix
Signed-off-by: raver119 <raver119@gmail.com>
* - add causal mode of padding to convolutions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add additional tests for causal conv1d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add causal mode for cuda conv kernels
Signed-off-by: Yurii <iuriish@yahoo.com>
* Java side of Conv1D changes
Signed-off-by: raver119 <raver119@gmail.com>
* Add Conv1DDerivative op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Causal Conv1D gradient checks
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks
Signed-off-by: Alex Black <blacka101@gmail.com>
* - add causal padding mode to conv2d_bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* More thorough causal conv1d tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* Implementation for non_max_suppression_v3 was added. Initial version
* Added check for overcome threshold.
* Added definition for V3 method.
* java remapping for NonMaxSuppressionV3
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proporly processing of an empty output and test.
* Refactored op to less threshold data to float.
* Implemented cuda-based helper for non_max_suppression_v3 op.
* Fixed fake_quant_with_min_max_vars op.
* Fixed tests with float numbers.
* - assert now stops execution
- sortByKey/sortByValue now have input validation
Signed-off-by: raver119 <raver119@gmail.com>
* missing var
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proper processing for zero max_size inputs.
* Refactored kernel callers.
* Fixed return statement for logdet op helper.
* Refactored unsorted segment SqrtN op.
* get back 8 tail bytes on CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored segment prod ops and helpers for cuda and tests.
* Additional test.
* CudaWorkspace tests updated for 8 tail bytes
Signed-off-by: raver119 <raver119@gmail.com>
* special atomic test
Signed-off-by: raver119 <raver119@gmail.com>
* atomicMul/atomicDiv fix for 16bit values
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated waste prints.