Commit Graph

4 Commits (5c806d2fb545edf400f9fb56f2640ff41adfce22)

Author SHA1 Message Date
shugeo e1a7460f8e Shugeo cuda doc2 (#255)
* Added comments to tileKernel routine.

* Refactored kernel and added doc to it.

* Refactored setDiagonal kernel and added doc for it.

* Added doc for tsne cuda helpers.

* Added doc for diag kernels.

* Added doc for kernel.

* Refactored code with fake quantization.

* Added docs for image resize and crop kernels.

* Added docs for image suppression helpers.

* Added docs to matrix_band helpers.

* Added docs for matrix_diag_part and nth_element helpers.

* Fixed syntax error and refactored getIndexOffset usage.
2019-09-11 21:04:43 +03:00
raver119 589401477d [WIP] bunch of improvements (#257)
* - profiling bias_add op
- add some documentation

Signed-off-by: Yurii <yurii@skymind.io>

* - minor change

Signed-off-by: Yurii <yurii@skymind.io>

* - provide addBias cuda kernel

Signed-off-by: Yurii <yurii@skymind.io>

* - improve shape::getIndexOffset and change its signature

Signed-off-by: Yurii <yurii@skymind.io>

* - same as previous

Signed-off-by: Yurii <yurii@skymind.io>

* - improve and change the signatures of some shape:: functions that calculate offsets for array elements

Signed-off-by: Yurii <yurii@skymind.io>

* - minor changes in flatten

Signed-off-by: Yurii <shyrma@skymind.io>

* - add function shape::getIndexOffsetOrdered

Signed-off-by: Yurii <shyrma@skymind.io>

* - correct shape::getIndexOffsetOrdered()

Signed-off-by: Yurii <shyrma@skymind.io>

* - move getIndexOffsetOrdered to flatten.h header in order to isolate this function

Signed-off-by: Yurii <shyrma@skymind.io>
2019-09-11 20:12:09 +03:00
shugeo c78f5a8225 Shugeo cuda cuda (#105)
* Refactored extract_image_patches op helpers.

* Eliminated compiler errors with helper implementation.

* Finished implementation of both cpu and cuda helpers for extract_image_patches.

* Improved cpu implementation.

* Improved cuda implementation for extract_image_patches helper.

* Added omp to ClipByGlobalNorm helpers implementation.

* Added implementation for thresholdedrelu_bp op.

* Fixed cuda kernel with F order.

* Fixed tests for subarray.

* Refactored tests for Gaussian_3 and Truncated_22.

* Added tests for GaussianDistribution with native ops.

* Modified tests for Gaussian distribution.

* Fixed random tests.

* Fixed atomicMin/atomicMax for 64bit cases.

* Fixed execReduce3TAD tests.

* Removed unnecessary comments.
2019-08-07 15:29:17 +03:00
skymindops b5f0ec072f Eclipse Migration Initial Commit 2019-06-06 15:21:15 +03:00