* Implementation for non_max_suppression_v3 was added. Initial version
* Added check for overcome threshold.
* Added definition for V3 method.
* java remapping for NonMaxSuppressionV3
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proporly processing of an empty output and test.
* Refactored op to less threshold data to float.
* Implemented cuda-based helper for non_max_suppression_v3 op.
* Fixed fake_quant_with_min_max_vars op.
* Fixed tests with float numbers.
* - assert now stops execution
- sortByKey/sortByValue now have input validation
Signed-off-by: raver119 <raver119@gmail.com>
* missing var
Signed-off-by: raver119 <raver119@gmail.com>
* Fixed proper processing for zero max_size inputs.
* Refactored kernel callers.
* Fixed return statement for logdet op helper.
* Refactored unsorted segment SqrtN op.
* get back 8 tail bytes on CUDA
Signed-off-by: raver119 <raver119@gmail.com>
* Refactored segment prod ops and helpers for cuda and tests.
* Additional test.
* CudaWorkspace tests updated for 8 tail bytes
Signed-off-by: raver119 <raver119@gmail.com>
* special atomic test
Signed-off-by: raver119 <raver119@gmail.com>
* atomicMul/atomicDiv fix for 16bit values
Signed-off-by: raver119 <raver119@gmail.com>
* Eliminated waste prints.
* Added comments to tileKernel routine.
* Refactored kernel and added doc to it.
* Refactored setDiagonal kernel and added doc for it.
* Added doc for tnse cuda helpers.
* Added doc for diag kernels.
* Added doc for kernel.
* Refactored code with fake quantization.
* Added docs for image resize and crop kernels.
* Added docs for image suppression helpers.
* Added docs to matrix_band helpers.
* Added docs for matrix_diag_part and nth_element helpers.
* Fixed syntax error and refactored getIndexOffset usage.