* Added dtype formulation for poisson and gamma distributions.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored gamma distribution generator and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added generator for gamma distribution when alpha (shape) between 0 and 1
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implemented gamma distribution for shape param less than 1 and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Implemented gamma distributed randoms for shape (alpha) parameter greater than 1.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added cuda implementation for gamma distribution.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Refactored cuda and cpu implementation of gamma distribution.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed crash with default beta param with gamma distribution.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed pow for arm arch.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Gamma test fixed
* Cosmetic changes only.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Fixed random value retrieving
* Eliminated overflow attempts.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Modified random retrieving.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Lightened density of tests for Gamma distribution.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* Fixed bound problem with Exponential distribution implementation.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added test for Exponential distribution to avoid infinities.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Added a test for exponential distribution with 1M elements.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Cosmetic changes only and tests.
Signed-off-by: shugeo <sgazeos@gmail.com>
* Modified test and implementation for exponential_distribution op.
Signed-off-by: shugeo <sgazeos@gmail.com>
Co-authored-by: raver119 <raver119@gmail.com>
* initial set of include changes
Signed-off-by: raver119 <raver119@gmail.com>
* one more tweak
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* few more rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* cuda includes rearrangements
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* - namespace changed to sd
- few CMake variables renamed with SD_ prefix
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* LoopKind minor fix
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* sanitizer is optional now
Signed-off-by: raver119 <raver119@gmail.com>
* dev tests updated
Signed-off-by: raver119 <raver119@gmail.com>
* few more changes
Signed-off-by: raver119 <raver119@gmail.com>
* last update
Signed-off-by: raver119 <raver119@gmail.com>
* java update
Signed-off-by: raver119 <raver119@gmail.com>
* libnd4j: Multinomial op #8570 first raw step of multinomial random data generator implementation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op #8570 next step of multinomial random categories generator implementation on both cpu and cuda, need corrections and code clean up before review and testing
* libnd4j: Multinomial op #8570 code clean up and fixed issues data selecting, moved from coords to tads
* libnd4j: Multinomial op #8570 fixed cuda build, added reference to the math materials that were used for implementation
* libnd4j: Multinomial op #8570 fixed several bugs, added several tests and improved cuda version. current implementation works, need testing of reproduction with the same seed
* libnd4j: Multinomial op #8570 fixes and optimization after discussion in both cuda and cpu
* libnd4j: Multinomial op #8570 add corrections after review, removed tads, replace 2D parallel loop by 3D
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op fixed declaration and add tests need discussion
* libnd4j: Multinomial op fix in test
* libnd4j: Multinomial op corrected behavior to get reproducible results, fixed issue in uniform value getting, tests added, need cuda review and cuda testing
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op fixed indexing on uniform calculation
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op some corrections in max min declaration
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op fixed index calculation, added rewind, corrected input declaration, added stats tests, both cuda and cpu. cuda need testing
* libnd4j: Multinomial op fixed bugs on cuda and cpu. need review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op corrected tests to handle different orders
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op some improvements after code review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op more corrections after review
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op fixed seed usage, update tests, fixed cuda based on comments, fixed bug of rewind, removed one behavior, minor corrections.
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op minor corrections
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op raise the bound of fluctuation for random cases
Signed-off-by: Oleg <oleg.semeniv@gmail.com>
* libnd4j: Multinomial op modified operation inputs and update implementation and tests on both cpu and cuda
* libnd4j: Multinomial op corrected data types according to ops.proto
Co-authored-by: raver119 <raver119@gmail.com>
* Corrected randomuniform declaration.
* Refactored uniform distribution for both cuda and cpu platforms.
* Refactored uniform distribution and tests.
* Fixed type usage with indices.
* Refactored uniform distribution implementation and tests to fully conform with TF implementation.
* Refactored gamma function to use type util method.
* Copyright changes and fixes with ConstantHelper.
* Added error checking on allocate cuda device memory and operations.
* Added implementation for random_gamma op.
* Added implementation for random_poisson op and support classes.
* Added helpers for random_poisson and random_gamma ops.
* Implementation of random_poisson. The first working edition.
* Implementation of random_poisson. Parallelized working edition.
* Implementation of random_gamma. Parallelized working edition with alpha only.
* Added cuda implementation for helper of poisson distribution.
* Corrected shape calculation with random_gamma and tests.
* Finished cpu implementation for gamma distribution.
* Finished cuda implementation for random_gamma op.
* Refactored cpu helpers for random_gamma and random_poisson ops.
* Refactored cuda helpers for gamma and poisson distribution.
* Refactored cuda helper for gamma distribution.
* Refactored cpu helper for random_poisson op.
* Refactored cpu helper for random_gamma op.
* Added tests for get_seed/set_seed ops.
* Added missed tests for scatter_sub/mul/div ops.
* Added tests for hardsigmoid and hardsigmoid_bp.
* Added tests for hardtanh and hardtanh_bp ops.
* Added test for histogram op.
* Added tests for identity op.
* Refactored mergemaxindex op. Added tests for log1p,mergemaxindex, mod and mod_bp ops.
* Fixed tests for FloorDiv.
* Added test for rank op.
* Added tests for rationaltanh/rationaltanh_bp ops.
* Added tests for realdiv/realdiv_bp.
* Added tests for rectifiedtanh/_bp ops.
* Added tests for shapes_of op.
* Added tests for shapes_of op.
* Added tests for size op.
* Added tests for softplus/_bp ops.
* Added tests for softsign/_bp ops.
* Added tests for toggle_bits op. Fixed processing of OP_IMPL and so on definitions.
* Added test for truncatediv op.
* Added another test for truncatediv op.
* Added another test for histogram.
* Added tests for unstack_list op.
* Refactored to_int32/uint32/float16/float32/double/int64/uint64 ops and tests.
* Refactored mergemaxindex op helper for cuda platform and tests.
* Fixed cuda kernel for histogram op helper.
* Refactor skipgram to avoid early buffers shift.
* Fixed check up with non_max_suppression op cuda helper. Added cuda kernel implementation for skipgram op helpers.
* Added implementation of skipgram op helper for cuda platform. Working revision
* Fixed mergeMaxIndex kernel and move it to separate source file.
* initial commit
Signed-off-by: raver119 <raver119@gmail.com>
* - gruCell_bp further
Signed-off-by: Yurii <yurii@skymind.io>
* - further work on gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* Inverse matrix cublas implementation. Partial working revision.
* Separation of segment ops helpers. Max separation.
* Separated segment_min ops.
* Separation of segment_mean/sum/prod/sqrtN ops helpers.
* Fixed diagonal processing with LUP decomposition.
* Modified inversion approach using current state of LU decomposition.
* Implementation of matrix_inverse op with cuda kernels. Working revision.
* Implemented sequence_mask cuda helper. Eliminated stray printf in matrix_inverse implementation. Added proper tests.
* - further work on gruCell_bp (ff/cuda)
Signed-off-by: Yurii <yurii@skymind.io>
* comment one test for gruCell_bp
Signed-off-by: Yurii <yurii@skymind.io>
* - provide cuda static_rnn
Signed-off-by: Yurii <yurii@skymind.io>
* Refactored random_shuffle op to use new random generator.
* Refactored random_shuffle op helper.
* Fixed debug tests with random ops tests.
* Implement random_shuffle op cuda kernel helper and tests.
* - provide cuda scatter_update
Signed-off-by: Yurii <yurii@skymind.io>
* Implementation of random_shuffle for linear case with cuda kernels and tests.
* Implemented random_shuffle with cuda kernels. Final revision.
* - finally gruCell_bp is completed
Signed-off-by: Yurii <yurii@skymind.io>
* Dropout op cuda helper implementation.
* Implemented dropout_bp cuda helper.
* Implemented alpha_dropout_bp with cuda kernel helpers.
* Refactored helper.
* Implementation of suppression helper with cuda kernels.
* - provide cpu code for hsvToRgb, rgbToHsv, adjustHue
Signed-off-by: Yurii <yurii@skymind.io>
* Using sort by value method.
* Implementation of image.non_max_suppression op cuda-based helper.
* - correcting and testing adjust_hue, adjust_saturation cpu/cuda code
Signed-off-by: Yurii <yurii@skymind.io>
* Added cuda device prefixes to declarations.
* Implementation of hashcode op with cuda helper. Initial revision.
* rnn cu impl removed
Signed-off-by: raver119 <raver119@gmail.com>