* Profiler
Signed-off-by: Alex Black <blacka101@gmail.com>
* Next steps, polishing, and loading SD/TF format JSON
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Next steps
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profile comparison method
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Make profiling result writing async to reduce main thread overhead
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profiling polishing
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Profile analyzer fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Polish
Signed-off-by: Alex Black <blacka101@gmail.com>
* Cleanup
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small formatting improvement
Signed-off-by: Alex Black <blacka101@gmail.com>
* Formatting tweak
Signed-off-by: Alex Black <blacka101@gmail.com>
* License headers
Signed-off-by: Alex Black <blacka101@gmail.com>
* Timeouts added
* Added some ops
* Ops added
* Fixed tests
* Minor fix
* Some fixes
* Digamma added
* Small fixes
* Fused batch norm fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tests switched off.
* Added test for resize_bicubic.
* Eliminated waste in test of bicubic resize.
* Explicitly switched off multithreading.
* HsvToRgb and RgbToHsv added
* Eliminated unneeded comments and conformed to proper float constants.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed multithreading with resize_bicubic helper for cpu platform.
Signed-off-by: shugeo <sgazeos@gmail.com>
* ResizeBicubic was fixed.
* Some fixes
* Fix op name
* Validation fixed.
* Clarifications for tests
* Wrappers and small fixes for new ops.
* resize_bicubic: allow more dtypes
Signed-off-by: raver119 <raver119@gmail.com>
* resize_bicubic: allow less dtypes
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored resize_bicubic op to fully conform with TF1.5 and tests.
* Corrected test to proper data type output.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Corrected double input test to float constant outputs.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Finished correcting expected values in tests for bicubic interpolated resizes.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed adjust_contrast ops to allow non-RGB inputs.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored adjust_contrast_v2 to conform with the TF one.
Signed-off-by: shugeo <sgazeos@gmail.com>
* AdjustContrast tests activated
* two typos fixed
Signed-off-by: raver119 <raver119@gmail.com>
* cleaned up bert iterator tests (#110)
Signed-off-by: eraly <susan.eraly@gmail.com>
* Various pre-release fixes (#111)
* Various fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix default dtypes for MaxPoolWithArgmax
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small pre-release tweak (#112)
* Log UI address on launch as in previous Play-based UI
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Logging level tweak for UI
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* http not https
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* datavec python ensure host (#113)
* ensure host
* one more host ensure
* info->debug
* [WIP] reverse improvements (#115)
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* reverse draft
Signed-off-by: raver119 <raver119@gmail.com>
* reverse kernel
Signed-off-by: raver119 <raver119@gmail.com>
* reverse kernel
Signed-off-by: raver119 <raver119@gmail.com>
* 2 micro fixes
Signed-off-by: raver119 <raver119@gmail.com>
* Shugeo resize fix5 (#102)
* Refactored resize images ops to use TF-like bool args as input.
* Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops.
* Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers.
* Refactored nearest_neighbor resize op.
* Added a pair of tests for special case of resize_bilinear algorithm.
* Fixed issue with resize_bilinear op.
* Refactored cpu implementation for helpers with resize_nearest_neighbor op.
* Final fixes for resize ops to conform to TF v.1.5
* Refactored cuda helpers for resize_nearest_neighbor op.
* Fixed resize_bilinear to accept proper data.
* Fixed issue with non-float input for resize_bilinear op.
* Refactored cuda helper for resize_bilinear to properly process non-float inputs.
* Added tests for resize_bilinear to int inputs.
* Fixed ResizeBilinear wrapper
* Tests fixed
* Fixed float and bool constants to avoid overflow with some compilers.
* Corrected float constants with float data type.
* Added f suffix for float constants.
* Corrected float constant to avoid overflow with initializing lists.
* Corrected float initializing list with float input.
* Corrected bool constant with initializing list.
* Corrected float and bool values with initializing lists.
* Fixed wrong constant.
* Fixed issue with 1x1 input picture for resize.
* ResizeBilinear default values on import fix
Signed-off-by: raver119 <raver119@gmail.com>
* - add causal mode of padding to convolutions
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add additional tests for causal conv1d
Signed-off-by: Yurii <iuriish@yahoo.com>
* - add causal mode for cuda conv kernels
Signed-off-by: Yurii <iuriish@yahoo.com>
* Java side of Conv1D changes
Signed-off-by: raver119 <raver119@gmail.com>
* Add Conv1DDerivative op
Signed-off-by: Alex Black <blacka101@gmail.com>
* Causal Conv1D gradient checks
Signed-off-by: Alex Black <blacka101@gmail.com>
* Tweaks
Signed-off-by: Alex Black <blacka101@gmail.com>
* - add causal padding mode to conv2d_bp
Signed-off-by: Yurii <iuriish@yahoo.com>
* More thorough causal conv1d tests
Signed-off-by: Alex Black <blacka101@gmail.com>
* - create op
- skip exec for empty inputs for non_max_suppression
- EmptyHandling idea
Signed-off-by: raver119 <raver119@gmail.com>
* Create op and mapping for it
Signed-off-by: raver119 <raver119@gmail.com>
* Added implementation files for image_resize and resize_bicubic ops.
* Image resize and image.resize_bicubic ops implementation. Initial revision.
* Minor fix
* Some TF imports disabled.
* Finished with infrastructure development for image.resize_bilinear op and image_resize op implementation.
* Refactored resize methods.
* Added processing for Mitchellcubic algorithm.
* adjust_contrast
* Small fix for TF import expected value loading when variable name starts with the test name
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tests
* Tests added.
* Removed tf names absent in mapping.
* Some fixes.
* Small fixes
* Minor change
* Some failing tests.
* Disable failed test
* Ignore some tests
* Fix import class mapping
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix float property mapping (flatbuffers)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Override equality function for model 'dropout'
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fail tests
* Failed tests ignored temporarily.
* Minor fixes
* Small fix
* Conflict resolved
* Default implementations of tensorflowName and onnxName
* one range test
Signed-off-by: raver119 <raver119@gmail.com>
* few Context convenience signatures
Signed-off-by: raver119 <raver119@gmail.com>
* one more range test
Signed-off-by: raver119 <raver119@gmail.com>
* "range" "fix"
Signed-off-by: raver119 <raver119@gmail.com>
* adjust_contrast_v2 now allows scale factor to be provided via input_variable
Signed-off-by: raver119 <raver119@gmail.com>
* adjust_contrast now allows scale factor as variable too
Signed-off-by: raver119 <raver119@gmail.com>
* bitcast shape tests
Signed-off-by: raver119 <raver119@gmail.com>
* BitCast import dtype added
Signed-off-by: raver119 <raver119@gmail.com>
* few more BitCast signatures
Signed-off-by: raver119 <raver119@gmail.com>
* #8280 biasadd_bp nchw arg fixes (java side) + test
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #8285 Concat op Java side fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Concat op cpp fix - allow dynamic axis to be negative, same as static axis
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* ignores for deconv3d import tests until deconv3d_tf op is implemented
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* update javadocs and a few method signatures
Signed-off-by: Ryan Nett <rnett@skymind.io>
* add PRelu op
Signed-off-by: Ryan Nett <rnett@skymind.io>
* test and fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* slightly better test
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Fixed signatures. SameDiff tests
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Tests fixed
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Test fixed
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Small fix
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* Fixed test
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* fix execBackwards training issue
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix validation not specifying outputs
Signed-off-by: Ryan Nett <rnett@skymind.io>
* another fix for validation listeners and history
Signed-off-by: Ryan Nett <rnett@skymind.io>
* tests
Signed-off-by: Ryan Nett <rnett@skymind.io>
* add single batch dataset output methods
Signed-off-by: Ryan Nett <rnett@skymind.io>
* SDCNN cleanup
Signed-off-by: Ryan Nett <rnett@skymind.io>
* NonNull annotations
Signed-off-by: Ryan Nett <rnett@skymind.io>
* better javadoc, NonNull fix for sconv
Signed-off-by: Ryan Nett <rnett@skymind.io>
* update builders to fix names
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* even more fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix for null bias
Signed-off-by: Ryan Nett <rnett@skymind.io>
* one test for alex
Signed-off-by: raver119 <raver119@gmail.com>
* fix
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of safety offset in cpp
Signed-off-by: raver119 <raver119@gmail.com>
* bfloat16
Signed-off-by: raver119 <raver119@gmail.com>
* minor test rearrangement to fastpath launch
Signed-off-by: raver119 <raver119@gmail.com>
* - atomicAdd/Mul/Div fix for float16/bfloat16 misalignment
- one special test for maxpoolbp java
- safety offset of 8 bytes is back to libnd4j legacy
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored kernels for segment_max/min/sum ops.
* Refactored segment_prod kernels.
* Refactored segment_prod kernels.
* DynamicPartition test
Signed-off-by: raver119 <raver119@gmail.com>
* Added linear test for dynamic_partition op.
* Refactored test with int datatype.
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* some logging
Signed-off-by: raver119 <raver119@gmail.com>
* dynamicPartition fix
Signed-off-by: raver119 <raver119@gmail.com>
* get rid of some logging
Signed-off-by: raver119 <raver119@gmail.com>
* one more test for dynamic_stitch
Signed-off-by: raver119 <raver119@gmail.com>
* one more test for dynamic_stitch
Signed-off-by: raver119 <raver119@gmail.com>
* empty check for stitch
Signed-off-by: raver119 <raver119@gmail.com>
* minor print changes
Signed-off-by: raver119 <raver119@gmail.com>
* remove some unneeded java-side output shape calculations
Signed-off-by: Ryan Nett <rnett@skymind.io>
* delete Broadcast
Signed-off-by: Ryan Nett <rnett@skymind.io>
* delete Linear and Module,
Signed-off-by: Ryan Nett <rnett@skymind.io>
* update Identity, HashCode, and NoOp
Signed-off-by: Ryan Nett <rnett@skymind.io>
* removed Cast java-side shape function, added tests and SDVariable.isEmpty
Signed-off-by: Ryan Nett <rnett@skymind.io>
* ignoring test w/ issues on master
Signed-off-by: Ryan Nett <rnett@skymind.io>
* noop needs more work, fixed BaseArithmeticBackprop and BaseDynamicTransform ops
merge in master for c++ build fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix EqualTo
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix other cond ops
Signed-off-by: Ryan Nett <rnett@skymind.io>
* "fake" ops calculateOutputShape() throws exception
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use c++ shape calc for Linspace
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix exception message, move most to BaseCompatOp
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove SDVariable.isEmpty
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove commented out code
Signed-off-by: Ryan Nett <rnett@skymind.io>
* one noop test
Signed-off-by: raver119 <raver119@gmail.com>
* skip input validation for no-input ops
Signed-off-by: raver119 <raver119@gmail.com>
* - one more noop empty test
- one more validation before sync
Signed-off-by: raver119 <raver119@gmail.com>
* typo
Signed-off-by: raver119 <raver119@gmail.com>
* one more validation fix
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA empty reductions java side
Signed-off-by: raver119 <raver119@gmail.com>
* one svd test
Signed-off-by: raver119 <raver119@gmail.com>
* Corrected segment_mean helpers and added another test.
* Refactored segment_mean kernels to avoid race_condition.
* CUDA empty reduction
Signed-off-by: raver119 <raver119@gmail.com>
* - listdiff synchronization fix for CUDA
- listdiff test
Signed-off-by: raver119 <raver119@gmail.com>
* - IndexReduce ops now allow INDEXING_TYPES output
- topK op accepts only INDEXING_TYPES as output
Signed-off-by: raver119 <raver119@gmail.com>
* one test for maxpool2d_bp
Signed-off-by: raver119 <raver119@gmail.com>
* - maxpool2d_bp cuda fix for NaNs
- streamSync after each custom op execution
Signed-off-by: raver119 <raver119@gmail.com>
* one test for size
Signed-off-by: raver119 <raver119@gmail.com>
* - few tests for size op
- size/rank/size_at ops now use p instead of assign
Signed-off-by: raver119 <raver119@gmail.com>
* throw exception if op execution failed
Signed-off-by: raver119 <raver119@gmail.com>
* expected for test
Signed-off-by: raver119 <raver119@gmail.com>
* one more ismax test
Signed-off-by: raver119 <raver119@gmail.com>
* ismax view fix
Signed-off-by: raver119 <raver119@gmail.com>
* Nd4j pad update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* switched from guava Immutables to Collections.unmodifiableList/Map
Signed-off-by: Ryan Nett <rnett@skymind.io>
* javadoc
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use new pad
Signed-off-by: Ryan Nett <rnett@skymind.io>
* conv tests use OpValidation
Signed-off-by: Ryan Nett <rnett@skymind.io>
* deconv3d overrides
Signed-off-by: Ryan Nett <rnett@skymind.io>
* test fix for the new pad method
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more test fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more test fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* rename SameDiff function methods to op (except for the actual SameDiff function ones)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more pad overloads, test fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* test updates
Signed-off-by: Ryan Nett <rnett@skymind.io>
* conv1d test
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove Conv1D tf import (there isn't a TF conv1d op)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove numThreads from Nd4j
Signed-off-by: Ryan Nett <rnett@skymind.io>
* replace Old ops with their newer versions, deprecate ones that haven't already been deprecated
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove use of setNumThreads
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix for Reverse and ATan2
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fix test for wrong equals type
Signed-off-by: Ryan Nett <rnett@skymind.io>
* well it works now
Signed-off-by: Ryan Nett <rnett@skymind.io>
* better javadocs
Signed-off-by: Ryan Nett <rnett@skymind.io>
* NonNulls
Signed-off-by: Ryan Nett <rnett@skymind.io>
* better array literal
Signed-off-by: Ryan Nett <rnett@skymind.io>
* re-add tf import stuff (will remove later)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* conv1d config load fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* partial config usage changes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove Old op classes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* config property fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* removed one too many ops
Signed-off-by: Ryan Nett <rnett@skymind.io>
* refactoring
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* wip
* fix: make test public.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* make test public.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* fixes read refactoring.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* First pass on SameDiff op exec debug listener
Signed-off-by: Alex Black <blacka101@gmail.com>
* #7555 DL4J helpers - don't fall back on builtin for op profiler exceptions
Signed-off-by: Alex Black <blacka101@gmail.com>
* Exec debugging listener + fixes
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix import counts for TF ops in OpValidationSuite
Signed-off-by: Alex Black <blacka101@gmail.com>
* Fix bad DL4J test configuration
Signed-off-by: Alex Black <blacka101@gmail.com>
* Exec debugging listener polish
Signed-off-by: Alex Black <blacka101@gmail.com>
* Small fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* Another fix
Signed-off-by: Alex Black <blacka101@gmail.com>
* wip
* update interface, add null implementations.
* Breaking one test in a weird way.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* createUninitializedDetached refactored.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* remove create method with unused parameter.
* removed more unused methods.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* removing more unused code.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* last removal of unused code.
Signed-off-by: Robert Altena <Rob@Ra-ai.com>
* Conv Config validation & tests
Signed-off-by: Ryan Nett <rnett@skymind.io>
* stackOutputs utility method
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use constructor for validation, support negative kernel sizes (inferred from weights)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* better output methods
Signed-off-by: Ryan Nett <rnett@skymind.io>
* move output to be with fit and evaluate
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove SDVariable inplace methods
* import methods
* npe fix in OpVal
* removed SameDiff inplace ops from tests
* Naming updates, moved to centralized methods in SameDiff, should use op_#:# for everything
* quick fixes
* javadoc
* SDVariable eval with placeholders
* use regex match
* better matching
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Added gradcheck test for dynamic_partition_bp op.
* - implementation of dilation op (cpu and cuda)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed broadcast_dynamic_shape 1D case and tests.
* Fixed usage of default integer arguments.
* Fixed dynamic_partition_bp op and tests.
* Eliminated test with grad check for dynamic_partition_bp op.
* start working on cuda svd - porting available corresponding api from cuSOLVER library
Signed-off-by: Yurii <yurii@skymind.io>
* provide prelu_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - provide gruCell_bp (old version ??)
Signed-off-by: Yurii <yurii@skymind.io>
* - polishing cumsum_bp and cumprod_bp tests
Signed-off-by: Yurii <yurii@skymind.io>
* provide sparseSoftmaxCrossEntropyWithLogits and sparseSoftmaxCrossEntropyWithLogits_grad
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed atomicMul with float input/output
* implementation of cuda kernel for triu_bp operation
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored lup helper to add parallel computing.
* cusolver libraries
Signed-off-by: raver119 <raver119@gmail.com>
* uncomment cuSolver APIs in svd.cu
Signed-off-by: Yurii <yurii@skymind.io>
* cusolver var
Signed-off-by: raver119 <raver119@gmail.com>
* - further work on cuSolver svd
Signed-off-by: Yurii <yurii@skymind.io>
* Implement usage of cuda solver for LUP decomposition.
* - correct names in lup functions
Signed-off-by: Yurii <yurii@skymind.io>
* correct svdQR cuda
Signed-off-by: Yurii <yurii@skymind.io>
* - provide transpositions of input matrices in case of c order in svdCudaQR
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed implementation issues with LUP using cuda solver.
* Implementation of matrix_determinant helper with cuda kernels. Working revision.
* Implemented log_matrix_determinant helper with cuda kernels.
* - implementation of batched cuda svd
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored cholesky helper and implementation of cuda solver cholesky batch.
* - implementation of cuda kernel for tile bp
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cholesky and logdet with cuda kernels.
* - implementation of cuda kernel for sru_bidirectional
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed cholesky helper.
* Cholesky op helper implementation. Working double-based cublas implementation.
* bad import excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Finished with cuda implementation of cholesky helper and tests.
* - implementation of cuda kernel for sru_bidirectional_backprop operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of matrix_inverse op helper with cuda kernels. The first revision.
* - start working on gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of matrix_inverse helper.
* - further work on new gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* cuBLAS related fixes
Signed-off-by: raver119 <raver119@gmail.com>
* calculateOutputShapes() now passes device buffers as well
Signed-off-by: raver119 <raver119@gmail.com>
* special concat/average/accumulate init host pointers now
Signed-off-by: raver119 <raver119@gmail.com>
* few more tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* additional CudaDataBufferFactory signatures for certain data types
Signed-off-by: raver119 <raver119@gmail.com>
* cuSolver host buffer
Signed-off-by: raver119 <raver119@gmail.com>
* buffer to buffer memcpy host ptr allocation
Signed-off-by: raver119 <raver119@gmail.com>
* softmax and logSoftmax w/ dimension
Signed-off-by: Ryan Nett <rnett@skymind.io>
* start of while
Signed-off-by: Ryan Nett <rnett@skymind.io>
* if, start of javadocs
Signed-off-by: Ryan Nett <rnett@skymind.io>
* while forward pass working, backprop WIP
Signed-off-by: Ryan Nett <rnett@skymind.io>
* no backprop
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Tensorflow style if/while (& tests), name scope fixes (and test), argument interceptor (for if/while), use '_' in op names instead of ':'
Signed-off-by: Ryan Nett <rnett@skymind.io>
* javadoc
Signed-off-by: Ryan Nett <rnett@skymind.io>
* many fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* many fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Some fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* cleanup if condition doesn't return boolean
Signed-off-by: Ryan Nett <rnett@skymind.io>
* serialization fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use constants instead of magic numbers
Signed-off-by: Ryan Nett <rnett@skymind.io>
* LeakyReLU: Use serScalar to set alpha correctly in TF import
LogX: remove incorrect TF mapping
Pow: remove TF import method (no mapping)
BaseOp: remove duplicate extraArgs
Signed-off-by: Ryan Nett <rnett@skymind.io>
* un-ignore cifar-10 gan, as it is now passing
Signed-off-by: Ryan Nett <rnett@skymind.io>
* new issue for b_d_s (old is closed), new issue for boolean_mask (old is fixed)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* date update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* "embedding_lookup/.*multiple.*" is passing
Signed-off-by: Ryan Nett <rnett@skymind.io>
* new bincount issue (old was fixed)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* date update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* where op comment
Signed-off-by: Ryan Nett <rnett@skymind.io>
* where passing
Signed-off-by: Ryan Nett <rnett@skymind.io>
* ignore gan
Signed-off-by: Ryan Nett <rnett@skymind.io>
* scatter_nd/rank2shape_2indices passing
Signed-off-by: Ryan Nett <rnett@skymind.io>
* topk update, test fix (need to verify)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* batch_to_space issue update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* updates, no change
Signed-off-by: Ryan Nett <rnett@skymind.io>
* date updates
Signed-off-by: Ryan Nett <rnett@skymind.io>
* batch_to_space is fixed
Signed-off-by: Ryan Nett <rnett@skymind.io>
* equality function helper
Signed-off-by: Ryan Nett <rnett@skymind.io>
* topk passing
Signed-off-by: Ryan Nett <rnett@skymind.io>
* dropout equality check
Signed-off-by: Ryan Nett <rnett@skymind.io>
* adding ignores for zoo models
Signed-off-by: Ryan Nett <rnett@skymind.io>
* ignore libnd4j rnn tests
Signed-off-by: Ryan Nett <rnett@skymind.io>
* added batch to space test
Signed-off-by: Ryan Nett <rnett@skymind.io>
* issue comments
Signed-off-by: Ryan Nett <rnett@skymind.io>
* minor cmake changes to make macos happy
* space_to_batch/batch_to_space validation fix
* - choose op tweaks
- tests updated to match applied tweaks
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - get rid of bad import
Signed-off-by: raver119@gmail.com <raver119@gmail.com>
* - choose now uses shape function
- choose test updated
* System info export for debugging and bug reporting
Signed-off-by: Ryan Nett <rnett@skymind.io>
* class name fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* add version information, pointer memory info
Signed-off-by: Ryan Nett <rnett@skymind.io>
* add nvidia-smi and nvcc info
Signed-off-by: Ryan Nett <rnett@skymind.io>
* line cleanup
Signed-off-by: Ryan Nett <rnett@skymind.io>
* nvidia-smi run works
Signed-off-by: Ryan Nett <rnett@skymind.io>
* add oshi dependency
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use OS info, add workspaces info
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use ServiceLoader to load GPU information
Signed-off-by: Ryan Nett <rnett@skymind.io>
* register service
Signed-off-by: Ryan Nett <rnett@skymind.io>
* moved service out of NativeOpsHolder (private constructor)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* added newline
Signed-off-by: Ryan Nett <rnett@skymind.io>
* added license
Signed-off-by: Ryan Nett <rnett@skymind.io>
* and one more
Signed-off-by: Ryan Nett <rnett@skymind.io>
* copyright update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* removed unused imports
Signed-off-by: Ryan Nett <rnett@skymind.io>
* removed more unused imports
Signed-off-by: Ryan Nett <rnett@skymind.io>
* close streams
Signed-off-by: Ryan Nett <rnett@skymind.io>
* and another one
Signed-off-by: Ryan Nett <rnett@skymind.io>
* use method
Signed-off-by: Ryan Nett <rnett@skymind.io>
* one more copyright
Signed-off-by: Ryan Nett <rnett@skymind.io>
* remove double license
Signed-off-by: Ryan Nett <rnett@skymind.io>
* moved test to correct package
Signed-off-by: Ryan Nett <rnett@skymind.io>
* classpath update
Signed-off-by: Ryan Nett <rnett@skymind.io>
* classpath for java >8 fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* changed [] to ...
Signed-off-by: Ryan Nett <rnett@skymind.io>
* added randn(long seed, int... shape)
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Fixed a couple of methods
Signed-off-by: Ryan Nett <rnett@skymind.io>
* ToString methods w/ options
Signed-off-by: Ryan Nett <rnett@skymind.io>
* fixes, less toString methods, and a few ops I missed
Signed-off-by: Ryan Nett <rnett@skymind.io>
* some javadocs, change int... to long... where possible
Signed-off-by: Ryan Nett <rnett@skymind.io>
* another javadoc
Signed-off-by: Ryan Nett <rnett@skymind.io>
* javadoc fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* just javadoc in INDArray
Signed-off-by: Ryan Nett <rnett@skymind.io>
* local/static fix
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Add @NonNull to options
Signed-off-by: Ryan Nett <rnett@skymind.io>
* javadoc updates/fixes
Signed-off-by: Ryan Nett <rnett@skymind.io>
* more @NonNulls
Signed-off-by: Ryan Nett <rnett@skymind.io>
* even more @NonNulls, this time on varargs
Signed-off-by: Ryan Nett <rnett@skymind.io>
* Added observation classes and tests
Signed-off-by: unknown <aboulang2002@yahoo.com>
* Now uses DataSetPreProcessors
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* CompositeDataSetPreProcessor can now stop processing on empty dataset; Some DataSetPreProcessors moving from RL4J to ND4J
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* Did requested minor changes
Signed-off-by: Alexandre Boulanger <Alexandre.Boulanger@ia.ca>
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
* Shugeo strided slice zeros (#14)
* Modified strided_slice op to properly work with empty-like shapes.
* Fixed test for reduce_mean with empty-like input.
* [WIP] Last merge (#15)
* correct logsoftmax loss (#2)
* Small SameDiff listener fix (#4)
* Various fixes (#6)
* #7839 Fix for asXMatrix and tests
* #7866 EmbeddingSequenceLayer dtype fix + test
* #7856 SameDiff save/load stream methods
* #7859 RegressionEvaluation rank 4 fix + tests + axis configuration
* EvaluationBinary 3d/4d
* More evaluation 3d/4d tests
* #7847 Evaluation empty checks
* Small test fix
* #7848 Fix median edge case
* Improve DL4J samediff layer tests
* [WIP] FastText wrapper implemented (#8)
* FastText implemented
* Some fixes
* Fix shapes for wordsNearest
* Validation of input vectors
* Fixes
* Fixed test
* Thread tagged
* Some tweaks
* setContextClassLoader for DeallocatorServiceThread
* Numpy format tests (#1)
* Various fixes (#11)
* #7852 SameDiff gather fix
* #7892 SameDiff placeholder to constant conversion
* #7890 validate input rank for MLN/CG init methods
* Fix broken permute shape calculation
* Permute and gather fixes
* Tests
* #7850 LogSumExp fix + test
* Handful of test fixes
* Empty arrays with non-scalar shapes (#10)
* minor rearrangements for lambdas
* empty tensors with non-scalar shapes
* numpy empty tensors with non-scalar shapes
* few more empty tweaks
* Small fixes
* conv3d signature update
* micro fix in batchnorm mkldnn
* Import fixes
* Fix
* MKL-DNN update
* Small fill fix
* fill with empty input + test
* Fixes
* Small error improvement
* Fix
* one special test
* couple of fixes for lstm
* Rewrite TFGraphMapper.getNDArrayFromTensor to be maintainable and less error prone
* Fixes
* FP16
* Unsigned
* BFloat16
* Fill op - empty tweaks
* - couple of fixes for empty arrays construction
- stack updated
* strided slice fix
* one transform test
* provide method for reducing shapeInfo in case the input array is empty
* Fixed reduceAlongDimensions to use empty input properly.
* couple of broadcast tests
* couple of broadcast tests + tweak to make them pass
* add check of non-empty to methods producing sub-arrays
* Fixed reshapeC with zeros in shape.
* complete empty check in reduce_... legacy ops
* Concat and cumsum/prod
* Tweak to empty shape inference on import
* add empty check to the rest of reduce legacy ops
* one more test
* correct typo in evalReduceShapeInfoEmpty
* Added tests for reduce_* ops to tests with zero shapes.
* few more tests for empty reductions
* Fixed strided_slice op with empty case and tests.
* one more empty reduction test
* Fixed strided_slice test.
* add empty check to NDArray::reshapei
* infOrMax
* empty min/max with infinity tests
* made unstack work correctly with empty arrays
* few IndexReduce tests + tweaks for empty shapes
* add test for empty concat
* few tests fixed
* Validation fix for reductions on empty shapes
* Reverse fix
* Reduction shape calc fixes
* SameDiff.generateOutputVariable: don't use shape function to determine number of outputs
* Range fix
* - NDArray constructor updated for scalars/empty arrays
- few tests fixed
* More fixes
* Empty creator fixes
* concat fix
* concat fix
* TF import tests: allow 'both all NaN' and 'both all inf' to pass
* Slice, zero fraction, and reshape fixes
* transpose, gather
* Zero fraction
* scalar cast fix
* Empty reduction axis support
* few more tests fixed
* Fixed input checks conforming with TF for concat op and tests.
* few tests fixed
* matmul scalar shape fix
* Fixed checkout for data type and scalarity with concat to allow non-empty scalars with vector concats.
* broadcast bool fix
* few more tests
* few more tests
* correct evalReduceShapeInfoEmpty
* argmax/argmin + tests
* one more empty edge case + one more test
* argmax/argmin/realdiv_bp tweaks
* empty reshape test + fix
* Helper fixes
* Small fixes
* Gather test fix
* Gather test fix
* Small fixes
* reduce scalar zero values
* scalar mean workaround
* Remove debug code
* along dim mean workaround
* one more test
* - equalsTo() tweak for empty arrays
- one more test
* broadcast tweaks
* [WIP] Fixing outstanding issues for NLP (#9)
* Avoid using not-inited objects
* Test fixed.
* Redundant method avoided for models like FastText
* KMeans++ implementation
* KMeans++ implementation
* Disable parallel execution
* KMeans++
* Tests
* Dev branch merge (#16)
* SameDiff: convertDataType and gradient check util improvements (#12)
* GradCheck util improvements
* StopGradient constructor + test
* SameDiff: Add datatype conversion
* Javadoc and add DataType.isNumerical()
* Small fix
* Fix SameDiff TF import test cases intermediate naming (workaround for bad default)
* TFGraphTestAllHelper: check intermediates in execution order
* Add missing debug listener
* [WIP] lstmBlock fix + other changes (#13)
- fixes lstmBlock issue
- changes NDArray method reshape(), permute(), transpose() by making them return instance instead of pointer
- CheckNumerics op
- fixes for ReduceBool IsInfOrNan & IsFinite
* Small test fix
* CheckNumerics op wrapper
* Fix some issues on master (#17)
* Fix DataVec test issue
* Fix issue with dl4j SameDiff output layer
* Dtype fix for lambda layers
* #7912 BertIterator dtype fix (use float32 not global default)
* [WIP] Next set of CUDA stuff (#7)
New CUDA implementations and improvements
* bad file
* Dev branch master merge (#23)
* SameDiff: convertDataType and gradient check util improvements (#12)
* GradCheck util improvements
* StopGradient constructor + test
* SameDiff: Add datatype conversion
* Javadoc and add DataType.isNumerical()
* Small fix
* Fix SameDiff TF import test cases intermediate naming (workaround for bad default)
* TFGraphTestAllHelper: check intermediates in execution order
* Add missing debug listener
* [WIP] lstmBlock fix + other changes (#13)
- fixes lstmBlock issue
- changes NDArray method reshape(), permute(), transpose() by making them return instance instead of pointer
- CheckNumerics op
- fixes for ReduceBool IsInfOrNan & IsFinite
* Small test fix
* CheckNumerics op wrapper
* Compatibility of deserialization (#18)
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* SameDiff: add activation gradient checking support for debugging (#19)
* SameDiff gradient checker: first pass on activation gradient checks
* Fixes + tests for activation gradient checking
* Javadoc
* [WIP] Some nd4j data type corrections (#20)
* Adjust data type
* Set correct Data type.
* Size of proper data type.
* fix averaged cpu load (#22)
* SameDiff ops, TF import and fixes (#24)
* CheckNumerics tests + fixes + misc fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fake quant
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* FakeQuantWithMinMaxArgs
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* CheckNumerics fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix libnd4j ALL_INTS and ALL_FLOATS declaration (uint and bfloat types)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Small fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Javadoc
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Exception tweak
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix for out of scope stack allocated var use
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ignores
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Ignore for known failing test (already logged issue)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Merge upstream to fork (#25)
* Add thousand-separator commas to TotalParams (#7915)
* Add thousand-separator commas to TotalParams
The number of parameters can be quite large, and it would help the reading of the summary printout to have the TotalParams column & values at the bottom have thousand-separator-commas in them.
* Add thousand-separator commas to MultiLayerNetwork
Corresponding change to MultiLayerNetwork
Signed-off-by: Jxtps Jxtps <jxtps435@gmail.com>
* Update contributing and issue/PR templates (#7934)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Fix link to AdaDelta paper (#7942)
Fix link to AdaDelta paper hosted on matthewzeiler.com
Signed-off-by: Jxtps
* Fixes, and ignores for known/logged failing issues (#7943)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* SameDiff + DL4J/SameDiff: Multiple fixes (#28)
* #7919 HDF5 attribute buffer length fix
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7909 Arbiter constructor exception ux improvements
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7925 RNN output layer length checks
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7939 Add listener for validating inputs are not incorrectly modified
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* #7939 Integrate NonInplaceValidationListener into tests
* #7844 DL4J SameDiff fixes for variable minibatch size
* DL4J SameDiff fixes - ensure gradient for input placeholder is available
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* Tweaks to ExternalErrorsFunction - use placeholders, make more robust
* Another fix
* More fixes
* More SameDiff/DL4J fixes
* Scope out scalar array creation in BaseScalarOp
* Remove debug code
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* [WIP] Final dev branch merge (#29)
* SameDiff: convertDataType and gradient check util improvements (#12)
* GradCheck util improvements
* StopGradient constructor + test
* SameDiff: Add datatype conversion
* Javadoc and add DataType.isNumerical()
* Small fix
* Fix SameDiff TF import test cases intermediate naming (workaround for bad default)
* TFGraphTestAllHelper: check intermediates in execution order
* Add missing debug listener
* [WIP] lstmBlock fix + other changes (#13)
- fixes lstmBlock issue
- changes NDArray method reshape(), permute(), transpose() by making them return instance instead of pointer
- CheckNumerics op
- fixes for ReduceBool IsInfOrNan & IsFinite
* Small test fix
* CheckNumerics op wrapper
* Compatibility of deserialization (#18)
Signed-off-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
* SameDiff: add activation gradient checking support for debugging (#19)
* SameDiff gradient checker: first pass on activation gradient checks
* Fixes + tests for activation gradient checking
* Javadoc
* [WIP] Some nd4j data type corrections (#20)
* Adjust data type
* Set correct Data type.
* Size of proper data type.
* fix averaged cpu load (#22)
* [WIP] Multiple dataset iterators (#27)
* Splitting dataset into arbitrary number
* Fixes
* Multiple split of iterator
* Test
* Test
* Some fixes
* signature change
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* one more test for sequential use of DataSetIteratorSplitter
Signed-off-by: raver119 <raver119@gmail.com>
* Fixes
* Fixes
* one more test for Alexander
Signed-off-by: raver119 <raver119@gmail.com>
* Some fixes
* Some fixes
* one more test for Alexander
Signed-off-by: raver119 <raver119@gmail.com>
* minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* Some fixes
* Some fixes
* couple of assertions tweaked
Signed-off-by: raver119 <raver119@gmail.com>
* MDS splitter test :/
Signed-off-by: raver119 <raver119@gmail.com>
* Minor refactoring
* Multi dataset
* Some fixes
* More tests
* Small number of test fixes/improvements (failures on CI) (#31)
Signed-off-by: AlexDBlack <blacka101@gmail.com>
* [WIP] More CUDA stuff (#26)
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* LRN BP CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* less memory
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed bug with crop_and_resize op helper.
* get rid of unnecessary index-calculation function
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed sort with nth_element cuda-based helper.
* Refactored nth_element.
* Refactored nth_element op and tests.
* Modified usage of dim array with sortTad routine.
* Refactored main routine of helper for non_max_image_suppression op.
* non_max_image_suppression op helper with cuda kernel implementation. Initial revision.
* fix vol2col cuda kernel
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* topK concept
Signed-off-by: raver119 <raver119@gmail.com>
* unsorted topK with scanWidth of 1
Signed-off-by: raver119 <raver119@gmail.com>
* correct vol2col tests
* sorted/unsorted topK
Signed-off-by: raver119 <raver119@gmail.com>
* implementation and fixing col2im/col2vol
* Corrected usage flags with input/output with reverse op.
* dup is const now
Signed-off-by: raver119 <raver119@gmail.com>
* percentile op
Signed-off-by: raver119 <raver119@gmail.com>
* group tests for maxpool2d
Signed-off-by: Yurii <yurii@skymind.io>
* special test for george
Signed-off-by: raver119 <raver119@gmail.com>
* less threads for sortTad
Signed-off-by: raver119 <raver119@gmail.com>
* provide conv2d for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* remove author in sort tad kernel code
Signed-off-by: Yurii <yurii@skymind.io>
* provide depthwise_conv2d for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* - max_pooling_with_argmax
- null check for special use
Signed-off-by: raver119 <raver119@gmail.com>
* dts cuda
Signed-off-by: raver119 <raver119@gmail.com>
* provide sconv2d for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* std cuda
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored non_max_suppression op to conform to the TF implementation.
* Improved suppression helper.
* provide pooling3d for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* minor lstm rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* more of minor lstm rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* (bi)dynamic_rnn
Signed-off-by: raver119 <raver119@gmail.com>
* templates init order
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored non_max_suppression op.
* Added cuda kernel for non_max_suppression.
* CPU sort by key/value
Signed-off-by: raver119 <raver119@gmail.com>
* CPU sort TAD by key/value
Signed-off-by: raver119 <raver119@gmail.com>
* CPU sort TAD by key/value tests
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminate compiler error with cuda implementation.
* - repaired gradCheck in cuda
- provide conv2d_bp for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* missed signature
Signed-off-by: raver119 <raver119@gmail.com>
* provide depthwise_conv2d_bp for cuda
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of lup helper with cuda kernel. Initial commit.
* further work on backprops for convolutions
Signed-off-by: Yurii <yurii@skymind.io>
* CUDA linear sort by key/val
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA tad sort by key/val
Signed-off-by: raver119 <raver119@gmail.com>
* start providing of backprop for pooling2d/3d
Signed-off-by: Yurii <yurii@skymind.io>
* Added atomicAdd for bool datatype.
* dynamic partition concept
Signed-off-by: raver119 <raver119@gmail.com>
* dynamic partition concept
Signed-off-by: raver119 <raver119@gmail.com>
* dynamic partition scalar CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* important comment
Signed-off-by: raver119 <raver119@gmail.com>
* fix pooling2d/3d backprop helpers
Signed-off-by: Yurii <yurii@skymind.io>
* Added non-linear test with dynamic_partition.
* Improved test for dynamic_partition.
* dynamic_partition TAD concept
Signed-off-by: raver119 <raver119@gmail.com>
* - dynamic_partition TAD CUDA impl
- dynamic_partition TAD CPU fix
Signed-off-by: raver119 <raver119@gmail.com>
* - rewrite cpu code for upsampling2d/3d
- write cuda code for upsampling2d/3d
Signed-off-by: Yurii <yurii@skymind.io>
* dynamic_stitch CUDA vector case
Signed-off-by: raver119 <raver119@gmail.com>
* dynamic_stitch CUDA TAD case concept
Signed-off-by: raver119 <raver119@gmail.com>
* dynamic_stitch CUDA TAD case impl
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for dynamic_stitch 3D-4D cases.
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed type check for dynamic stitch.
* min/max bp
Signed-off-by: raver119 <raver119@gmail.com>
* rewrite code for upsampling2d/3d cpu
Signed-off-by: Yurii <yurii@skymind.io>
* reduce min/max/norm_max bp
Signed-off-by: raver119 <raver119@gmail.com>
* lup implementation. Additional enhancements.
* provide code for upsampling2d/3d backprop
Signed-off-by: Yurii <yurii@skymind.io>
* weightedCrossEntropyWithLogits
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed template math atomicMul for 64bit ints.
* Refactored dynamic_partition_bp op.
* inverseBroadcast fix
Signed-off-by: raver119 <raver119@gmail.com>
* DynamicPartitionBP test datatype fixed.
* - nd4j_atomicMul Windows fix
- cpu/NDArrayLambda.hpp excluded from CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* correct logsoftmax loss (#2)
* Small SameDiff listener fix (#4)
* Various fixes (#6)
* #7839 Fix for asXMatrix and tests
* #7866 EmbeddingSequenceLayer dtype fix + test
* #7856 SameDiff save/load stream methods
* #7859 RegressionEvaluation rank 4 fix + tests + axis configuration
* EvaluationBinary 3d/4d
* More evaluation 3d/4d tests
* #7847 Evaluation empty checks
* Small test fix
* #7848 Fix median edge case
* Improve DL4J samediff layer tests
* [WIP] FastText wrapper implemented (#8)
* FastText implemented
* Some fixes
* Fix shapes for wordsNearest
* Validation of input vectors
* Fixes
* Fixed test
* Thread tagged
* Some tweaks
* setContextClassLoader for DeallocatorServiceThread
* Numpy format tests (#1)
* Various fixes (#11)
* #7852 SameDiff gather fix
* #7892 SameDiff placeholder to constant conversion
* #7890 validate input rank for MLN/CG init methods
* Fix broken permute shape calculation
* Permute and gather fixes
* Tests
* #7850 LogSumExp fix + test
* Handful of test fixes
* Empty arrays with non-scalar shapes (#10)
* minor rearrangements for lambdas
* empty tensors with non-scalar shapes
* numpy empty tensors with non-scalar shapes
* few more empty tweaks
* Small fixes
* conv3d signature update
* micro fix in batchnorm mkldnn
* Import fixes
* Fix
* MKL-DNN update
* Small fill fix
* fill with empty input + test
* Fixes
* Small error improvement
* Fix
* one special test
* couple of fixes for lstm
* Rewrite TFGraphMapper.getNDArrayFromTensor to be maintainable and less error prone
* Fixes
* FP16
* Unsigned
* BFloat16
* Fill op - empty tweaks
* - couple of fixes for empty arrays construction
- stack updated
* strided slice fix
* one transform test
* provide method for reducing shapeInfo in case the input array is empty
* Fixed reduceAlongDimensions to use empty input properly.
* couple of broadcast tests
* couple of broadcast tests + tweak to make them pass
* add check of non-empty to methods producing sub-arrays
* Fixed reshapeC with zeros in shape.
* complete empty check in reduce_... legacy ops
* Concat and cumsum/prod
* Tweak to empty shape inference on import
* add empty check to the rest of reduce legacy ops
* one more test
* correct typo in evalReduceShapeInfoEmpty
* Added tests for reduce_* ops to tests with zero shapes.
* few more tests for empty reductions
* Fixed strided_slice op with empty case and tests.
* one more empty reduction test
* Fixed strided_slice test.
* add empty check to NDArray::reshapei
* infOrMax
* empty min/max with infinity tests
* made unstack work correctly with empty arrays
* few IndexReduce tests + tweaks for empty shapes
* add test for empty concat
* few tests fixed
* Validation fix for reductions on empty shapes
* Reverse fix
* Reduction shape calc fixes
* SameDiff.generateOutputVariable: don't use shape function to determine number of outputs
* Range fix
* - NDArray constructor updated for scalars/empty arrays
- few tests fixed
* More fixes
* Empty creator fixes
* concat fix
* concat fix
* TF import tests: allow 'both all NaN' and 'both all inf' to pass
* Slice, zero fraction, and reshape fixes
* transpose, gather
* Zero fraction
* scalar cast fix
* Empty reduction axis support
* few more tests fixed
* Fixed input checks conforming with TF for concat op and tests.
* few tests fixed
* matmul scalar shape fix
* Fixed checkout for data type and scalarity with concat to allow non-empty scalars with vector concats.
* broadcast bool fix
* few more tests
* few more tests
* correct evalReduceShapeInfoEmpty
* argmax/argmin + tests
* one more empty edge case + one more test
* argmax/argmin/realdiv_bp tweaks
* empty reshape test + fix
* Helper fixes
* Small fixes
* Gather test fix
* Gather test fix
* Small fixes
* reduce scalar zero values
* scalar mean workaround
* Remove debug code
* along dim mean workaround
* one more test
* - equalsTo() tweak for empty arrays
- one more test
* broadcast tweaks
* Capsnet test runtime improvements
* Slow test speedups
* Next round of test speed improvements
* More test improvements
* Improve test speed
* Next round of test speedups
* Another round
* More test speedups
* Another round
* Another round of test speedups
* Another round of speedups...
* CuDNN test speedups + more tests extending BaseDL4JTest
* Minor fix + more BaseDL4JTest use in other modules