* Added implementation files for image_resize and resize_bicubic ops.
* Image resize and image.resize_bicubic ops implementation. Initial revision.
* Finished with infrastructure development for image.resize_bilinear op and image_resize op implementation.
* Refactored resize methods.
* Added processing for Mitchellcubic algorithm.
* Added check for input/output sizes.
* Added int and float types for crop_and_resize op.
* Refactored crop_and_resize output type check.
* Added helper for bicubic interpolation as TF v.1 does.
* Added TF v.1 bicubic helper for cuda platform.
* Added cached class for bicubic algorithm.
* Refactored cuda implementation for crop_and_resize helper to use proper output type.
* Added facilities for bicubic interpolation.
* Ported bicubic interpolation from TF.
* Added tests for resize_bilinear testing.
* Working implementation of bicubic interpolation and tests.
* Refactored routines with image_resize bicubic op helper.
* Refactored code with coding standards.
* Refactored cpu helpers for resize_bicubic op.
* Refactored bicubic helpers.
* Added bicubic resize facilities.
* Implementing cuda kernels for bicubic interpolation. Implementation step.
* Cuda implementation of resize_bicubic op helper.
* Refactor image.resize_bicubic op helpers.
* Refactored helpers for resize_bicubic. Added error checking with cuda implementation.
* Refactored cuda implementation of resize_bicubic op helper. The first working revision.
* Cuda arch implementation for resize_bicubic op helper. Full working single-threaded revision.
* Intermediate bicubic interpolation helper for cuda.
* Refactored cpu helper for resize_bicubic.
* Multithreaded cuda implementation for resize_bicubic.
* Fixed merge issues.
* Refactored nlp helpers.
* Replicated resize_bicubic for 3D also.
* Eliminated waste comments of unused code.
* Eliminated waste comments with unused code.
* Eliminated waste template definitions.
* Eliminated waste debug code.
* Eliminated waste comments.
* Fixed multithreading with helpers.
* Fixed test suites for float and double in floating-point input lists.
* Fixed usage of reshape with 3D/4D on resizes.
* Final fixes.
* Fixed resize_neighbor op problem.
* Added tests for get_seed/set_seed ops.
* Added missed tests for scatter_sub/mul/div ops.
* Added tests for hardsigmoid and hardsigmoid_bp.
* Added tests for hardtanh and hardtanh_bp ops.
* Added test for histogram op.
* Added tests for identity op.
* Refactored mergemaxindex op. Added tests for log1p,mergemaxindex, mod and mod_bp ops.
* Fixed tests for FloorDiv.
* Added test for rank op.
* Added tests for rationaltanh/rationaltanh_bp ops.
* Added tests for realdiv/realdiv_bp.
* Added tests for rectifiedtanh/_bp ops.
* Added tests for shapes_of op.
* Added tests for shapes_of op.
* Added tests for size op.
* Added tests for softplus/_bp ops.
* Added tests for softsign/_bp ops.
* Added tests for toggle_bits op. Fixed processing of OP_IMPL and similar definitions.
* Added test for truncatediv op.
* Added another test for truncatediv op.
* Added another test for histogram.
* Added tests for unstack_list op.
* Refactored to_int32/uint32/float16/float32/double/int64/uint64 ops and tests.
* Refactored mergemaxindex op helper for cuda platform and tests.
* Fixed cuda kernel for histogram op helper.
* Refactor skipgram to avoid early buffers shift.
* Fixed check in non_max_suppression op cuda helper. Added cuda kernel implementation for skipgram op helpers.
* Added implementation of skipgram op helper for cuda platform. Working revision.
* Fixed mergeMaxIndex kernel and moved it to a separate source file.
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* Implementation of hashcode cuda helper. Working edition.
* Fixed parallel test input arrangements.
* Fixed tests for hashcode op.
* Fixed shape calculation for image:crop_and_resize op and test.
* NativeOps tests. Initial test suite.
* Added tests for indexReduce methods.
* Added test on execBroadcast with NDArray as dimensions.
* Added test on execBroadcastBool with NDArray as dimensions.
* Added tests on execPairwiseTransform and execPairwiseTransformBool.
* Added tests for execReduce with scalar results.
* Added reduce tests for non-empty dims array.
* Added tests for reduce3.
* Added tests for execScalar.
* Added tests for execSummaryStats.
* - provide cpu/cuda code for batch_to_space
- testing it
Signed-off-by: Yurii <yurii@skymind.io>
* - remove old test for batch_to_space (had wrong format and numbers were not checked)
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed compilation errors with test.
* Added test for execTransformFloat.
* Added test for execTransformSame.
* Added test for execTransformBool.
* Added test for execTransformStrict.
* Added tests for execScalar/execScalarBool with TADs.
* Added test for flatten.
* - provide cpu/cuda code for space_to_batch operation
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for concat.
* comment unnecessary stuff in s_t_b
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for specialConcat.
* Added tests for memcpy/set routines.
* Fixed pullRow cuda test.
* Added pullRow test.
* Added average test.
* - correct typo in NDArray::applyPairwiseTransform(nd4j::pairwise::BoolOps op...)
Signed-off-by: Yurii <yurii@skymind.io>
* - debugging and fixing cuda tests in JavaInteropTests file
Signed-off-by: Yurii <yurii@skymind.io>
* - correct some tests
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for shuffle.
* Fixed ops declarations.
* Restored omp and added shuffle test.
* Added convertTypes test.
* Added tests for execRandom. Eliminated usage of RandomBuffer with NativeOps.
* Added sort tests.
* Added tests for execCustomOp.
* - further debugging and fixing of tests that terminated with a crash
Signed-off-by: Yurii <yurii@skymind.io>
* Added tests for calculateOutputShapes.
* Added Benchmarks test.
* Commented benchmark tests.
* change assertion
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for apply_sgd op. Added cpu helper for that op.
* Implemented cuda helper for apply_sgd op. Fixed tests for NativeOps.
* Added test for assign broadcastable.
* Added tests for assign_bp op.
* Added tests for axpy op.
* - assign/execScalar/execTransformAny signature change
- minor test fix
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed axpy op.
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* - fix tests for nativeOps::concat
Signed-off-by: Yurii <yurii@skymind.io>
* sequential transform/scalar
Signed-off-by: raver119 <raver119@gmail.com>
* allow nested parallelism
Signed-off-by: raver119 <raver119@gmail.com>
* assign_bp leak fix
Signed-off-by: raver119 <raver119@gmail.com>
* block setRNG fix
Signed-off-by: raver119 <raver119@gmail.com>
* enable parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* enable nested parallelism by default
Signed-off-by: raver119 <raver119@gmail.com>
* Added cuda implementation for row_count helper.
* Added implementation for tsne gains op helper.
* - take into account possible situations when input arrays are empty in reduce_ cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implemented tsne/edge_forces op cuda-based helper. Parallelized cpu-based helper for edge_forces.
* Added kernel for tsne/symmetrized op helper.
* Implementation of tsne/symmetrized op cuda helper. Working edition.
* Eliminated waste printfs.
* Added test for broadcastgradientargs op.
* host-only fallback for empty reduce float
Signed-off-by: raver119 <raver119@gmail.com>
* - some tests fixes
Signed-off-by: Yurii <yurii@skymind.io>
* - correct the rest of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* - further correction of reduce_ stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Added test for Cbow op. Also added cuda implementation for cbow helpers.
* - improve code of stack operation for scalar case
Signed-off-by: Yurii <yurii@skymind.io>
* - provide cuda kernel for gatherND operation
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow helpers with cuda kernels.
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* minor tests tweaks
Signed-off-by: raver119 <raver119@gmail.com>
* - further correction of cuda stuff
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of cbow op helper with cuda kernels. Working edition.
* Skip random testing for cudablas case.
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for ELU and ELU_BP ops.
* Added tests for eq_scalar, gt_scalar, gte_scalar and lte_scalar ops.
* Added tests for neq_scalar.
* Added test for noop.
* - further work on clipbynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - get rid of concat op call, use instead direct concat helper call
Signed-off-by: Yurii <yurii@skymind.io>
* lstmBlockCell context fix
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for lrelu and lrelu_bp.
* Added tests for selu and selu_bp.
* Fixed lrelu derivative helpers.
* - some corrections in lstm
Signed-off-by: Yurii <yurii@skymind.io>
* operator * result shape fix
Signed-off-by: raver119 <raver119@gmail.com>
* - correct typo in lstmCell
Signed-off-by: Yurii <yurii@skymind.io>
* few tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* CUDA inverse broadcast bool fix
Signed-off-by: raver119 <raver119@gmail.com>
* disable MMAP test for CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* BooleanOp syncToDevice
Signed-off-by: raver119 <raver119@gmail.com>
* meh
Signed-off-by: raver119 <raver119@gmail.com>
* additional data types for im2col/col2im
Signed-off-by: raver119 <raver119@gmail.com>
* Added test for firas_sparse op.
* one more RandomBuffer test excluded
Signed-off-by: raver119 <raver119@gmail.com>
* Added tests for flatten op.
* Added test for Floor op.
* bunch of tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* mmulDot tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Implemented floordiv_bp op and tests.
* Fixed scalar case with cuda implementation for bds.
* - work on cuda kernel for clip_by_norm backprop op is completed
Signed-off-by: Yurii <yurii@skymind.io>
* Eliminated cbow crash.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated abort in batched nlp test.
* more tests fixed
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed shared flag initializing.
* disabled bunch of cpu workspaces tests
Signed-off-by: raver119 <raver119@gmail.com>
* scalar operators fix: missing registerSpecialUse call
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed logdet for cuda and tests.
* - correct clipBynorm_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Fixed crop_and_resize shape datatype.
* - correct some mmul tests
Signed-off-by: Yurii <yurii@skymind.io>