cavis

Author	SHA1	Message	Date
agibsonccc	c3f04caef4	Add ctc loss from KonduitAI PR, add missing java bits	2021-03-11 14:22:34 +09:00
AbdelRauf	1dc8a2109c	compare_and_bitpack: correct documentation of the current implementation Signed-off-by: AbdelRauf <rauf@konduit.ai>	2021-02-28 19:26:09 +01:00
AbdelRauf	fe22bd5726	Compare_and_bitpack: It was reimplemented. now the last dimension should be divisible by 8 Signed-off-by: AbdelRauf <rauf@konduit.ai>	2021-02-28 19:19:59 +01:00
AbdelRauf	a4efb4d4e9	AdaBelief updater: it was agreed to modify changes on the copy of AdamUpdater. This way we can improve it later. https://arxiv.org/pdf/2010.07468.pdf Signed-off-by: AbdelRauf <rauf@konduit.ai>	2021-02-19 17:45:55 +01:00
agibsonccc	e88d0fe96c	Fix unsorted segment ops	2021-02-15 16:16:40 +09:00
agibsonccc	46dbd0b203	Update copyrights remove attic and relocate elsewhere	2021-02-09 13:16:31 +09:00
agibsonccc	6dc7e2f08f	Update c++ copyrights	2021-02-01 21:31:45 +09:00
agibsonccc	65c6a9a42e	Dev commits	2021-02-01 14:31:20 +09:00
shugeo	2aed216c2a	Eliminated error with resize implementation. (#418 ) * Eliminated error with resize implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize caller implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored image.resize op helper. Signed-off-by: shugeo <sgazeos@gmail.com> * Added dumb implementations for missed resize methods. Signed-off-by: shugeo <sgazeos@gmail.com> * Added resize_images op. Refactored image_resize op. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored matrix_band_part op and test. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize_images op to comply with preserve_aspect_ratio flag properly. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize_images and tests for resizeArea method. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize methods and test. Signed-off-by: shugeo <sgazeos@gmail.com> * Added new methods for TF2 resize op. Signed-off-by: shugeo <sgazeos@gmail.com> * Portion of resize algorithms from TF2 Signed-off-by: shugeo <sgazeos@gmail.com> * Added routine to process resize with given algorithm. Signed-off-by: shugeo <sgazeos@gmail.com> * Added new image resize via scale and translate process helper. Signed-off-by: shugeo <sgazeos@gmail.com> * Cpu implementation for V2 image resize operation helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Added implementation for lancos5 algorithm of resize and test. Signed-off-by: shugeo <sgazeos@gmail.com> * Added prints for span computing. Signed-off-by: shugeo <sgazeos@gmail.com> * The first working implementation and tests for lancos5 resize. Signed-off-by: shugeo <sgazeos@gmail.com> * Eliminated waste prints. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored image_resize op and tests." Signed-off-by: shugeo <sgazeos@gmail.com> * Lanczos3 resize implementation and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Implemented bicubic resize algorithm and tests. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Added a couple of tests and cosmetic changes with image resize helper. Signed-off-by: shugeo <sgazeos@gmail.com> * Added bilinear implementation for image resize. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored bicubic algorithm and also implement area and neighbor algoritms for image resize on cpu arch. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a couple of tests for nearest neighbor and area resize. Signed-off-by: shugeo <sgazeos@gmail.com> * Cosmetic changes for cpu implementation and added cuda implementation for resize methods. Signed-off-by: shugeo <sgazeos@gmail.com> * Separated cuda implementation of v2 image resize. Signed-off-by: shugeo <sgazeos@gmail.com> * Added kernels for span calculation and span gathering with new image resize cuda implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored cuda implementation of image resize kernels. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished the first working implementation of image resize op and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed resize_images and image_resize ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored shape construction and output validation. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed test to properly initalized with float. Signed-off-by: shugeo <sgazeos@gmail.com> * Added 3D input opotunity for resize ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed test for resize_images op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed test and call for resize_images op. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored image_resize op output data type handling for nearest neighbors method and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed issue with wrong resize method. Signed-off-by: shugeo <sgazeos@gmail.com> * Added checkup for wrong resize methods for resize ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize methods and test. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Added output data type validation for given resize method. Signed-off-by: shugeo <sgazeos@gmail.com> * - ResizeMethod rearranged in order to match C++ side - minor test fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * Refactored resize_images op. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: raver119@gmail.com <raver119@gmail.com>	2020-05-27 21:15:03 +03:00
Yurii Shyrma	753ce28a92	Shyrma sqrtm (#429 ) * - start working on implementation of sqrtm op Signed-off-by: Yurii <iuriish@yahoo.com> * - improving householder procedure Signed-off-by: Yurii <iuriish@yahoo.com> * - further polishing householder stuff Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing hh pivoting qr procedure Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing BiDiagonalUp procedure Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing householder sequence class Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing jacobi svd class Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing svd stuff 1 Signed-off-by: Yurii <iuriish@yahoo.com> * - polishing svd stuff 2 Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation and testing class which performs Hessenberg decomposition of square matrix Signed-off-by: Yurii <iuriish@yahoo.com> * - add static method to JacobiSVD class which makes the continuous Givens rotation generation algorithm Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation and testing auxiliary methods of Schur decomp class Signed-off-by: Yurii <iuriish@yahoo.com> * some references here and there Signed-off-by: raver119 <raver119@gmail.com> * - trying figure out difference between eigen and our Schur alg Signed-off-by: Yurii <iuriish@yahoo.com> * - testing fixing bugs in Schur decomposition op Signed-off-by: Yurii <iuriish@yahoo.com> * - start to implement class which performs calculation of eigen values and vectors Signed-off-by: Yurii <iuriish@yahoo.com> * - add to EigenValsAndVecs method which calculates complex eigen vectors Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in EigenValsAndVecs class Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation and testing triangularSolver class Signed-off-by: Yurii <iuriish@yahoo.com> * Added a 2D routine for triangular systems solve. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored triangularSolve2D routine and tests. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored another test for triangularSolve2D. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored test for triangularSolve for vector-bar case. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored triangularSolve2D routine and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * - implementation of FullPivLU class Signed-off-by: Yurii <iuriish@yahoo.com> * - fix bugs in FullPivLU::solve method Signed-off-by: Yurii <iuriish@yahoo.com> * - correct permutation vector in FullPivLU::solve Signed-off-by: Yurii <iuriish@yahoo.com> * - correct include headers Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation of Sqrtm class Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in Sqrtm class Signed-off-by: Yurii <iuriish@yahoo.com> * - include sqrtm classes to cuda folder, investigate in what places synchronization doesn't work Signed-off-by: Yurii <iuriish@yahoo.com> * Added implementation for cuda triangularSolve2D and also refactored triangularSolve2D for cpu. Signed-off-by: shugeo <sgazeos@gmail.com> * Eliminated waste implementations. Signed-off-by: shugeo <sgazeos@gmail.com> * - make offset calculation faster in t<> methods Signed-off-by: Yurii <iuriish@yahoo.com> * - rename refference T& NDArray::t<> method Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on cuda sqrtm Signed-off-by: Yurii <iuriish@yahoo.com> * - provide correct synchronization to device in Sqrtm class Signed-off-by: Yurii <iuriish@yahoo.com> * - add tests for sqrtm op Signed-off-by: Yurii <iuriish@yahoo.com> * - correct fails which appeared while testing on jenkins Signed-off-by: Yurii <iuriish@yahoo.com> * - trying to find out mistake in svd::deflation method Signed-off-by: Yurii <iuriish@yahoo.com> * Revert "- trying to find out mistake in svd::deflation method" This reverts commit 19d37baddbc509028e4bc67bc932fe7449becdb6. 
* Revert "- trying to find out mistake in svd::deflation method" This reverts commit 19d37baddbc509028e4bc67bc932fe7449becdb6. Signed-off-by: Yurii <iuriish@yahoo.com> * - change call semantic of r<> and t<> methods Signed-off-by: Yurii <iuriish@yahoo.com> * - ged rid of ambiguity in * operator overloads for windows buikd Signed-off-by: Yurii <iuriish@yahoo.com> * - get rid of ambiguity in * operator overloads for windows build 2 Signed-off-by: Yurii <iuriish@yahoo.com> * - get rid of ambiguity in * operator overloads for windows build 3 Signed-off-by: Yurii <iuriish@yahoo.com> * - resolve conflicts with master Signed-off-by: Yurii <iuriish@yahoo.com> * cmakelists updated Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - minor fix in merge cpu helper - make use of reference getter Signed-off-by: Yurii <iuriish@yahoo.com> Co-authored-by: raver119 <raver119@gmail.com> Co-authored-by: shugeo <sgazeos@gmail.com>	2020-05-14 18:06:13 +03:00
Abdelrauf	69d91e272a	- new implementations for Index Reductions (#421 ) * - new implementations for Index Reductions - small fix in the legacy reduction - disabled index reduction bench tests inside Playground Signed-off-by: Abdelrauf <rauf@konduit.ai> * Allow LIBND4J_TYPES Signed-off-by: Abdelrauf <rauf@konduit.ai> * index reduction stuff split into bunch of units * meh * IMax switched to new impl Signed-off-by: raver119@gmail.com <raver119@gmail.com> * minor fix + test * minor fix * index range fix Signed-off-by: Abdelrauf <rauf@konduit.ai> * noop on empty outputs * minor fix * minor fix Signed-off-by: Abdelrauf <rauf@konduit.ai> * ArgMax replaces IMax Signed-off-by: raver119@gmail.com <raver119@gmail.com> * argmax/argmin/argamax/argamin shape functions updated * ArgAmax/ArgAmin/ArgMin replaces IAMax/IAMin/IMin Signed-off-by: raver119@gmail.com <raver119@gmail.com> * argmax/argmin/argamax/argamin CUDA * IMax replaced in dl4j Signed-off-by: raver119@gmail.com <raver119@gmail.com> * Codegen output * imports fixed Signed-off-by: raver119@gmail.com <raver119@gmail.com> * fix compilation issue Signed-off-by: Abdelrauf <rauf@konduit.ai> * Auto-generate compilation units Signed-off-by: Abdelrauf <rauf@konduit.ai> * Should fix NDArray refactored function calls in indexReductions.cu Signed-off-by: Abdelrauf <rauf@konduit.ai> Co-authored-by: raver119@gmail.com <raver119@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-05-14 13:41:55 +03:00
Yurii Shyrma	76f3553679	Shyrma merge max ind (#443 ) * - provide correct possible output types in mergeMaxIndex op Signed-off-by: Yurii <iuriish@yahoo.com> * - cleaning up the unneeded backprop arg in reverse_bp op Signed-off-by: Yurii <iuriish@yahoo.com> * - improve clipByNorm both ff and bp Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation and testing clipByAvgNorm_bp op Signed-off-by: Yurii <iuriish@yahoo.com> * - pass biases in any way in dnnl lstm op, they are zeros when user doesn't provide them to us Signed-off-by: Yurii <iuriish@yahoo.com> * - start working on mkldnn concat op Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on mkldnn concat Signed-off-by: Yurii <iuriish@yahoo.com> * missing declaration fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - polishing mkl ops Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in mkl concat op Signed-off-by: Yurii <iuriish@yahoo.com> * - fix linkage error for windows cuda build Signed-off-by: Yurii <iuriish@yahoo.com> * - further conflicts resolving with master Signed-off-by: Yurii <iuriish@yahoo.com> * - fix format tags in mkldnn matmul op Signed-off-by: Yurii <iuriish@yahoo.com> * - provide additional type cast in clip.cu Signed-off-by: Yurii <iuriish@yahoo.com> * - finally bug in mkldnn tanh_bp was caught Co-authored-by: raver119@gmail.com <raver119@gmail.com>	2020-05-12 07:47:09 +03:00
raver119	0613485654	compression ops (#436 ) * Added declarations for decode/encode_bitmap ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Added implementation for bitmap encoding/decoding ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Added helpers for encode/decode bitmap ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored encodingBitmap helper. Signed-off-by: shugeo <sgazeos@gmail.com> * threshold encode/decode skeleton * helper skeleton * minor import fix * encoder shape fn & op impl * thresholdEncode cpu impl Signed-off-by: raver119@gmail.com <raver119@gmail.com> * thresholdDecode cpu impl Signed-off-by: raver119@gmail.com <raver119@gmail.com> * Only cosmetical changes. Signed-off-by: shugeo <sgazeos@gmail.com> * placeholder Signed-off-by: raver119@gmail.com <raver119@gmail.com> * Added cuda implementation for bitmap decode helper. Signed-off-by: shugeo <sgazeos@gmail.com> * cuda thresholdEstimate Signed-off-by: raver119@gmail.com <raver119@gmail.com> * cuda thresholdDecode Signed-off-by: raver119@gmail.com <raver119@gmail.com> * next step Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - nano cmakelist update (get rid of Clion section) - fixed forgotten throw in AtomicTests Signed-off-by: raver119@gmail.com <raver119@gmail.com> * thesholdEncode cuda impl Signed-off-by: raver119@gmail.com <raver119@gmail.com> * Added tests for bitmap encoding/decoding ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed tests for encode/decode bitmaps. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored decode/encode helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed crashes with bitmap decode/encode helpers. 
Signed-off-by: shugeo <sgazeos@gmail.com> * bitmap encode/decode CPU Signed-off-by: raver119@gmail.com <raver119@gmail.com> * bitmap encode/decode CUDA Signed-off-by: raver119@gmail.com <raver119@gmail.com> * C API removed for threshold/bitmap encode Signed-off-by: raver119@gmail.com <raver119@gmail.com> * EncodeBitmap/DecodeBitmap Java side Signed-off-by: raver119@gmail.com <raver119@gmail.com> * EncodeThreshold/DecodeThreshold Java side Signed-off-by: raver119@gmail.com <raver119@gmail.com> * EncodeThreshold/DecodeThreshold Java side Signed-off-by: raver119@gmail.com <raver119@gmail.com> * few more tests for threshold encoding Signed-off-by: raver119@gmail.com <raver119@gmail.com> * minor test tweak Signed-off-by: raver119@gmail.com <raver119@gmail.com> * two special tests Signed-off-by: raver119@gmail.com <raver119@gmail.com> * encodeBitmap CPU fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * parallel_long/parallel_double proper spans fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * encodeThreshold CUDA fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * nano fix Signed-off-by: raver119@gmail.com <raver119@gmail.com> * grid tweaks Signed-off-by: raver119@gmail.com <raver119@gmail.com> * RTX adaptation for thresholdEncode Signed-off-by: raver119 <raver119@gmail.com> * don't allow threshold encoding for length < 2 Signed-off-by: raver119@gmail.com <raver119@gmail.com> * get rid of NDArrayCompressor in EncodingHandler Signed-off-by: raver119@gmail.com <raver119@gmail.com> * one more minor update of EncodingHandler Signed-off-by: raver119@gmail.com <raver119@gmail.com> * one more minor tweak of EncodingHandler Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - matmul allows integer data types use - EncodingHandler boundary default value - few tests for integer matmul Signed-off-by: raver119@gmail.com <raver119@gmail.com> * minor fix of CUDA bitmap encode Signed-off-by: raver119@gmail.com <raver119@gmail.com> * boundary 
changed to integer everywhere Signed-off-by: raver119@gmail.com <raver119@gmail.com> * boundary changed to integer everywhere Signed-off-by: raver119@gmail.com <raver119@gmail.com> * re-enable CUDA deallocator Signed-off-by: raver119@gmail.com <raver119@gmail.com> * threshold encoder fix for systems without omp Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - encode_threshold now requires non-negative boundary - minor tweak in EncodingHandler Signed-off-by: raver119@gmail.com <raver119@gmail.com> * restore parallelism in decode_bitmap Signed-off-by: raver119@gmail.com <raver119@gmail.com> * fall back to omp for encode_bitmap cpu Signed-off-by: raver119@gmail.com <raver119@gmail.com> * single time casts Signed-off-by: raver119@gmail.com <raver119@gmail.com> * - additional test for encode_threshold - sync buffers to device before calling for shape function Signed-off-by: raver119@gmail.com <raver119@gmail.com> Co-authored-by: shugeo <sgazeos@gmail.com>	2020-05-08 20:59:39 +03:00
Oleh	3d15706ffa	Lin_space operation improve (#373) * libnd4j update linspace op Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j #8513 update lin_space op, tests added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * - minor linspace tweaks (num_elements now iArg) - java linspace updates - couple of additional tests for linspace Signed-off-by: raver119 <raver119@gmail.com> * roll back timeout change Signed-off-by: raver119 <raver119@gmail.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-04-16 14:53:56 +03:00
Yurii Shyrma	4247718f61	Shyrma gru bp (#377) * - update gru ff op Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation and testing gru_bp op Signed-off-by: Yurii <iuriish@yahoo.com> * - neglect dependencies between dLdh/dLdhLast/dLdcLast in lstmLayer backprop Signed-off-by: Yurii <iuriish@yahoo.com>	2020-04-16 08:09:04 +03:00
Yurii Shyrma	23e4aa99ad	Shyrma lstm layer bp (#370 ) * - start working on bp for lstm Signed-off-by: Yurii <iuriish@yahoo.com> * - further working on bp for lstmLayer Signed-off-by: Yurii <iuriish@yahoo.com> * - minor change Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 2 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 3 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 4 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 5 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 6 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 7 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 8 Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 9 Signed-off-by: Yurii <iuriish@yahoo.com> * - provide lstmLayerCell and lstmLayerCellBp as separate CUSTOM_OPs Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing lstmLayerCellBp helper Signed-off-by: Yurii <iuriish@yahoo.com> * - implement lstmLayerCellBp as separate op Signed-off-by: Yurii <iuriish@yahoo.com> * - implement lstmLayerBp as separate op (not tested) Signed-off-by: Yurii <iuriish@yahoo.com> * - fixing calculations of dLdWp and dLdb in lstmLayerCellBp Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 10 Signed-off-by: Yurii <iuriish@yahoo.com> * - fixing typo in lstmLayerTimeLoop Signed-off-by: Yurii <iuriish@yahoo.com> * - forgot to perform clipping of c array and calculate corresponding derivative in lstmLayerCellBp Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on bp for lstmLayer 10 Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in lstmLayer_bp op 1 Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in lstmLayer_bp op 2 Signed-off-by: Yurii <iuriish@yahoo.com> * - turn off heavy tests for cuda for lstmLayer_bp 
op Signed-off-by: Yurii <iuriish@yahoo.com> * - forgot to nullify gradients at eliminated time steps (when sequnce length array is present ) Signed-off-by: Yurii <iuriish@yahoo.com>	2020-04-13 13:21:51 +03:00
raver119	04b2b4f9b6	Few fixes (#361) * INDArray.close() fix for CPU Signed-off-by: raver119 <raver119@gmail.com> * - BroadcastableBoolOp introduced - ConfusionMatrix now supports explicit DataType argument Signed-off-by: raver119 <raver119@gmail.com> * confusion_matrix: dtype is still optional Signed-off-by: raver119 <raver119@gmail.com> * disable bert tests in debug builds Signed-off-by: raver119 <raver119@gmail.com> * Affinity fix Signed-off-by: raver119 <raver119@gmail.com> * minor workspace tweak to allow close() on scoped out borrowed workspace Signed-off-by: raver119 <raver119@gmail.com>	2020-04-06 21:01:59 +03:00
Oleh	1d004b542a	xw_plus_b mkldnn implementation (#247 ) * libnd4j first step of mkldnn for xw_plus_b and test of aurora crash in imageHelper * libnd4j sync folders with master * libnd4j merge master, raw implementation of xw_plus_b on mkldnn, clean up, need testing and adding checks for corresponded input shapes * libnd4j corrections and checks added to xw_plus_b mkl * libnd4j corrected dataType description based on mkl operation description, need more investigation * libnd4j fixe xw_blus_b mkl implementation, need testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j two unit tests added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed check input dimensions bug Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libndj4 one more test added to cover different order handling Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added optional int arg support to define weights format, if arg == 1, mkldnn (do not need transpose in mkldnn implementation), else mmul weights format, corrected check points, added unit test Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j merge master Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some improvements to avoid NDArray transpose in xw_plus_b operation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed issues connected with weights rank, also added support of one case based on tf (for mkldnn, cpu, cuda), test case added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added proper handling of empty inputs (all implementations) * libnd4j fixed compilation error * libnd4j several more corrections after conflict solve and fixed typos Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j removed unsupported data types Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j merge master and fixed issues Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added propagation implementation for xw_plus_b, fixed issue connected with mkl weights data format, avoided data copy in transpose 
mode, test cases added, manually tested with gradCheck Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j one minor fix of double operation declaration Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j code clean up Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j minor tests fixes Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed build problem, integrate helpers changes Signed-off-by: Oleg <oleg.semeniv@gmail.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-03-31 13:03:10 +03:00
Oleh	e8cbf5255a	Backpropagation implementation of mergemax, mergeadd and mergeavg ops (#343 ) * libnd4j: first step of merge_max implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed typos Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some corrections for mergeMaxBp Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some minor corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j test added for mergemax_bp Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed several problems tests added, check with gradCheck Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j remove duplicated tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j split implementation of transforms ops into separate file implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j code clean up, added mergeavg_bp and mergeadd_bp, need testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j merge master, fixed typos and added tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some minor fixes Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added helper for mergeAddBp operation, this permits to skip nullify Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j file renaming changes and cuda some corrections, need some additional corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some additional corrections for merge ops Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j more corrections per request for cuda more proper usage Signed-off-by: Oleg <oleg.semeniv@gmail.com>	2020-03-25 08:40:30 +03:00
Oleh	69c92ca5ae	Learning updaters for gradient (#335 ) * libnd4j raw implementation of sgd upader Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some corrections and simple test added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some corrections after discussion Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j integrate applyScalar Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j raw implementation of rmsPropUpdater on cpu Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fix operations declaration Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j rmsPropUpdater added, test cases for sgd, etc Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed several typos Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some fixes and improvements for rmsPropUpdater based on Java tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed cuda implementation, update tests and corrected behavior according java tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j adaGrad updater added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j one minor fix for ada grad Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j several more fixes for ada_grad Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j nesterovs updater added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed nesterovs updater behavior, several typos and rename file Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j one minor typo Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j ada max updater added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed several typos in adaMax updater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed several typos in adaMaxUpdater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j several fixes for adaMax, added Adam Updater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j adaDeltaUpdater added, minor fixes for adamUpdater Signed-off-by: Oleg 
<oleg.semeniv@gmail.com> * libnd4j several fixes for adaDeltaUpdater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j nadamUpdater added Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j one more correction for nadam updater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j several fixes for nadam updater and added amsGradUpdater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j several typos fixed in amsGradUpdater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some corrections and added f order support rmsProp updater Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added support of f order for all updaters and modify tests for testing in place Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j fixed issues for updates when not in place mode used, added tests for f order Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j added input shape checks Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some corrections for different cases handling Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j some code clean up and optimize per request Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j updaters refactoring after review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * SgdUpdater wrapper Signed-off-by: raver119 <raver119@gmail.com> * first test Signed-off-by: raver119 <raver119@gmail.com> * RmsPropUpdater added Signed-off-by: raver119 <raver119@gmail.com> * NadamUpdater + NesterovsUpdater Signed-off-by: raver119 <raver119@gmail.com> * AmsGradUpdater Signed-off-by: raver119 <raver119@gmail.com> * AdamUpdater added Signed-off-by: raver119 <raver119@gmail.com> * AdaGradUpdater + AdaDeltaUpdater + AdaMaxUpdater Signed-off-by: raver119 <raver119@gmail.com> * AdaGradUpdater test added Signed-off-by: raver119 <raver119@gmail.com> * libnd4j remove input parameters parsing through NDArray, split implementation of helpers to separate files, added some rename, etc Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j next 
step to split operations implementation into separate files Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j merge master and minor corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j revert some changes of split implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j forgot to add header file Signed-off-by: Oleg <oleg.semeniv@gmail.com> * public default constructors Signed-off-by: raver119 <raver119@gmail.com> * ImportClassMapping updated Signed-off-by: raver119 <raver119@gmail.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-03-23 07:28:31 +03:00
raver119	63fa3c2ef3	libnd4j polishing (#273 ) * initial set of include changes Signed-off-by: raver119 <raver119@gmail.com> * one more tweak Signed-off-by: raver119 <raver119@gmail.com> * few more rearrangements Signed-off-by: raver119 <raver119@gmail.com> * few more rearrangements Signed-off-by: raver119 <raver119@gmail.com> * few more rearrangements Signed-off-by: raver119 <raver119@gmail.com> * cuda includes rearrangements Signed-off-by: raver119 <raver119@gmail.com> * java update Signed-off-by: raver119 <raver119@gmail.com> * = namespace changed to sd - few CMake variables renamed with SD_ prefix Signed-off-by: raver119 <raver119@gmail.com> * java update Signed-off-by: raver119 <raver119@gmail.com> * LoopKind minor fix Signed-off-by: raver119 <raver119@gmail.com> * few more changes Signed-off-by: raver119 <raver119@gmail.com> * few more changes Signed-off-by: raver119 <raver119@gmail.com> * few more changes Signed-off-by: raver119 <raver119@gmail.com> * sanitizer is optional now Signed-off-by: raver119 <raver119@gmail.com> * dev tests updated Signed-off-by: raver119 <raver119@gmail.com> * few more changes Signed-off-by: raver119 <raver119@gmail.com> * last update Signed-off-by: raver119 <raver119@gmail.com> * java update Signed-off-by: raver119 <raver119@gmail.com>	2020-03-02 12:49:41 +03:00
shugeo	330a69d4e2	Shugeo solve ls (#203 ) * lstsq op. Initial commit. Signed-off-by: shugeo <sgazeos@gmail.com> * Least squares linear problem solve op (lstsq). Cpu draft implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed shape routine and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Added test for lstsq op. Signed-off-by: shugeo <sgazeos@gmail.com> * Rectification for lstsq op implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected test to avoid numerical inconsistensy. Signed-off-by: shugeo <sgazeos@gmail.com> * Added prints for check computing. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected tests to use evalueate facility instead. Signed-off-by: shugeo <sgazeos@gmail.com> * CPU implementation of MatrixSolveLs op and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Added cuda implementation for helpers with lstsq op. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored tests for lstsq op. Signed-off-by: shugeo <sgazeos@gmail.com> * Added processing for empty inputs. Signed-off-by: shugeo <sgazeos@gmail.com> * Merged tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored lstsq op for fast case. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed test. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored lstsq op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed some issues with solve. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed lstsq op to avoid erros. Signed-off-by: shugeo <sgazeos@gmail.com> * Added kernel for giagonal factor Signed-off-by: shugeo <sgazeos@gmail.com> * lstsq wrapper and triangular_solve fixed * Added proper processing empty inputs and test. Signed-off-by: shugeo <sgazeos@gmail.com> * SequenceMask test * Build fixed * Added proper processing of empty inputs with solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Mapping added * Added check of input shapes with solve op. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Added a couple of tests for lstsq op and minor changes with cuda helper for one.' Signed-off-by: shugeo <sgazeos@gmail.com> * Tests on * Refactored test for lstsq op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed test * Added another approach for lstsq op aka solve_ls. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished cpu part for solve_ls op helpers. * Added helper for low triangular matrix inversion. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored alternate solve_ls cpu implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Removed alternate approach for solve_ls op. Added multithreading with matrix inversion. Signed-off-by: shugeo <sgazeos@gmail.com> * Assert fixed * Refactored multithreading for inverse matricies. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-02-28 11:37:26 +03:00
raver119	3de3cd8277	R119 tests (#238 ) * one small test Signed-off-by: raver119 <raver119@gmail.com> * one small test Signed-off-by: raver119 <raver119@gmail.com> * bert test Signed-off-by: raver119 <raver119@gmail.com> * Graph FlowPath fix Signed-off-by: raver119 <raver119@gmail.com> * - GraphProfiler tweaks - NodeProfile now includes shapes Signed-off-by: raver119 <raver119@gmail.com> * RELU_layer inplace tweak Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * identity tweaks Signed-off-by: raver119 <raver119@gmail.com> * bert result validation Signed-off-by: raver119 <raver119@gmail.com> * - bunch of Shape ops have inplace exec forbidden now - Legacy ops have inplace exec disabled by default now Signed-off-by: raver119 <raver119@gmail.com> * ffast-math enabled Signed-off-by: raver119 <raver119@gmail.com> * ffast-math enabled Signed-off-by: raver119 <raver119@gmail.com> * allow some legacy ops to be inplace Signed-off-by: raver119 <raver119@gmail.com> * disable -fast_math Signed-off-by: raver119 <raver119@gmail.com> * disable expensive test for cuda Signed-off-by: raver119 <raver119@gmail.com>	2020-02-13 20:59:35 +03:00
Yurii Shyrma	fe47f52896	Oleh tenzor mmul (#231 ) * Libnd4j: TensorMMul backprop op #8174, raw implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 merge master and some corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 algorithm update, need testing, sync with master * Libnd4j: TensorMMul backprop op #8174 fixed incorrect B axes calculation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 optimize axes identification and fix bug of indeces overlapping, added first test. need testing with different shapes Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 some fixes and improvements need more testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 fixed order of matrix multiply Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 fixed issue of incorrect axes definition, add tests based on TF, need additional testing for case dLdC not equal 1 Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 fixed scalar case add test Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 fixed bp algorithm, axes definition, need some mode testing with different orders combination f,c; c,f f,f and add some checks for inputs Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 some checks and corrections added tests, exists the problem with different input orders support A-f B-c and A-f B-f Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: TensorMMul backprop op #8174 sync master Signed-off-by: Oleg <oleg.semeniv@gmail.com> * - correct bug in MmulHelper::tensorDot(a, b, c, axes_a, axes_b,permutForC) Signed-off-by: Yurii <iuriish@yahoo.com> * Libnd4j: TensorMMul backprop op #8174 code clean up and refactoring Signed-off-by: Oleg <oleg.semeniv@gmail.com> * 
- add check for linspase ordered permutations in ShapeUtils::evalShapeForTensorDot Signed-off-by: Yurii <iuriish@yahoo.com> * - provide additional code in shape::reshape stuff in order to reduce amount of allocation/copy operations during reshaping procedure Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on problem of wrong shape evaluation during permute/reshape procedures Signed-off-by: Yurii <iuriish@yahoo.com> * - still looking for bug reason in reshape/permute stuff Signed-off-by: Yurii <iuriish@yahoo.com> * - correct bug in transform cuda native ops Signed-off-by: Yurii <iuriish@yahoo.com> * - correct bug in NDArray::assign Signed-off-by: Yurii <iuriish@yahoo.com> * - remove old shape::reshape stuff Signed-off-by: Yurii <iuriish@yahoo.com> * - add possibility to disable copy of old buffer to new buffer during reshape operation in NDArray class Signed-off-by: Yurii <iuriish@yahoo.com> * - correct bug in tensorDot which had to do with wrong pointers assigments Signed-off-by: Yurii <iuriish@yahoo.com> Co-authored-by: Oleh <oleg.semeniv@gmail.com>	2020-02-13 20:33:54 +03:00
shugeo	41ff907bc6	Shugeo solve linear (#191 ) * linear equations systems solve op. Initial commit. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed compiling issues. Signed-off-by: shugeo <sgazeos@gmail.com> * Linear equations systems solve. The next stage commit. Signed-off-by: shugeo <sgazeos@gmail.com> * Added test for linear equations systems solve operation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added additional test and fixed lower matrix retrievance. * Implementation for solve of the systems of linear equations." Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored permutation generation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added restore for permutations batched with cuda helper for solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished cuda implementation for solve op helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored cpu helpers for solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fix gtest output on Windows * Fixed issue with permutation matrix for cuda implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed issue with permutation matrix for cpu implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Eliminated waste comments. Signed-off-by: shugeo <sgazeos@gmail.com> * LinearSolve added * Mapping added * Javadoc added * Refactored implementation of triangular_solve helpers and tests for solve matrix equations generally. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a test for solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Solve test added * Fix for TF import Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com> Co-authored-by: raver119 <raver119@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-02-04 08:59:11 +03:00
raver119	5d98cfcf47	Configurable DataType for ops (#201) * initial commit Signed-off-by: raver119 <raver119@gmail.com> * - one more test for OneHot with dtype - one more signature in Nd4j Signed-off-by: raver119 <raver119@gmail.com> * ones_as/zeros_as now accept dtype Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * - more updates for configurable data types - ones_as/zeros_as java side + tests Signed-off-by: raver119 <raver119@gmail.com> * few c++ tests fixed Signed-off-by: raver119 <raver119@gmail.com> * few more changes around DArgs Signed-off-by: raver119 <raver119@gmail.com>	2020-01-30 18:46:12 +03:00
shugeo	2717b25931	Shugeo qr (#153 ) * Added qr op implementation. Initial version. * Fixed doc for qr op. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of QR decomposition. CPU platform version. * Added a pair of tests for qr op testing. Signed-off-by: shugeo <sgazeos@gmail.com> * QR implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected norm using. * Properly calculated intermediate results with QR decomposition. * Another step to implement QR algorithm by householder. * Cpu implementatio for QR decomposition. The first working edition. * Corrected test to QR decomposition. * Added tad multithreading with QR implementation. * Finished cpu implementation for QR decomposition helpers. * Refactored tests and improved multithreading. * Refactored QR cpu implementation and update cuda implementation helpers. * Cuda QR helper implementation. The first working edition. * Eliminated waste prints. * Restore multithreading with cuda implementation. * Ops names corrected * Refactored qr op helpers to optimize. Signed-off-by: shugeo <sgazeos@gmail.com> * Eliminated waste manual ticking. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored memory allocation to avoid waste memory usage. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored matrixMinor method both for cuda and cpu platforms. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored method of vmul to use raw buffers instead type conversion. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored temporary array of matricies. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-01-22 13:59:36 +03:00
shugeo	815a2908af	Shugeo solve triangular (#173 ) * Added implementation of the triangular_solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed compilation issues. Signed-off-by: shugeo <sgazeos@gmail.com> * Added verification of input data and helpers facilities for triangular_solve op.' Signed-off-by: shugeo <sgazeos@gmail.com> * Added cpu implementation for triangular_solve helpers. * Added tests and implementation for upper triangular equations. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a pair of cases to tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Added multithreading with cpu helpers for triangular_solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Added cuda implementation of triangular_solve op helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished cuda implementation of triangular_solve helpers and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed copyright marks. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected grammar errors with doc and error messages. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored matricies processing with triangular_solve cuda helper implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added triangular_solve wrapper * Fixed mapping * Added processing for adjoint with cpu helpers of triangular_solve op implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added implementation for adjoint routine with cuda platform. Signed-off-by: shugeo <sgazeos@gmail.com> * Added multithreading with adjoint routine for cpu platform. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-22 10:48:03 +03:00
shugeo	e50b285c2c	Shugeo resize area (#162 ) * Added implementation for resize_area op. Initial commit. * Added implementation of resize_area op. Initial revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected resizeArea functor call. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of resize_area. Cpu platform helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation for resize_area helpers. The first part revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a set of tests for resize_area op. Signed-off-by: shugeo <sgazeos@gmail.com> * Cuda implementation for resize_area. Initial approach. Signed-off-by: shugeo <sgazeos@gmail.com> * Adding multithreading for resize_area algorithm. Signed-off-by: shugeo <sgazeos@gmail.com> * Cuda implementation of resize_area helpers. Shared memory approach. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resizeAreaKernel with cuda implementation. * Eliminated compilation errors. * ResizeArea helpers for cuda platform. The first working revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Added test for batched resize_area op testing. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of resize_are for cuda platform and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed multithreading with resize_area op helper. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright marks with sources. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright mark for resize_area op implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright mark for parity ops header. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected typo in strings and so on with image resize ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize_area helpers and multithreading. Signed-off-by: shugeo <sgazeos@gmail.com> * Added ResizeArea wrapper * Added test with align_corners and fixed shape processing with only int args given for output size. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Added test * TF mapping for ResizeArea * Fixed implementation issues with resize_area op for both platforms. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored image resizer struct to use flexible types for ints and floats. Signed-off-by: shugeo <sgazeos@gmail.com> * Improved multithreading with resizeAreaKernel launch. Signed-off-by: shugeo <sgazeos@gmail.com> * Use asynchronical memory copying with cuda platform image resize allocations. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-22 10:46:33 +03:00
Oleh	8fc0e63ce7	Oleh powderev (#171 ) * Libnd4j: Add broadcastable elementwise power derivative #7461 first step of Pow_bp operation implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some corrections of calculation steps Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some bug fixes, the PowDerevative op made broadcastable, add the raw tests for op, need refactoring to use broadcast ops * Libnd4j: Add broadcastable elementwise power derivative #7461 fixed several bugs add broadcast support and tests, need to fix scalar+array and array+scalar Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 fixed bugs for scalar inputs, fixed multinomial tests, added tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 fised bugs for different shapes support, tests updated * Libnd4j: Add broadcastable elementwise power derivative #7461 applied all possible variants via tiled arrays, add support of broadcast for Pow and PowDerivative ops, covered by tests, before review have to be replaced tiled implementation by applyTrueBroadcast Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 replaced tile by broadcast implementation, fixed issue with negative x input, corrected tests, need additional testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 added and corrected test cases, corrected implementation need review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up * Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up, removed some tests, add tests with scalar Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add 
broadcastable elementwise power derivative #7461 code improvement and clean up, split tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some code clean up Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative replace __isnanf by internal realization Signed-off-by: Oleg <oleg.semeniv@gmail.com> * pow_bp wrapper * Fixed PowBp wrapper * Tests added * Test fixed * Fix return type * Disable powBp usage * Pow backprop changed Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-20 12:59:12 +03:00
shugeo	6943a5f57a	Shugeo lgamma (#170) * lgamma op. Initial version. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored lgamma op and test. Signed-off-by: shugeo <sgazeos@gmail.com> * Lgamma wrapper * Added TF mapping Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-20 12:29:36 +03:00
Oleh	2404be5fe0	Oleh multinomial (#163 ) * libnd4j: Multinomial op #8570 first raw step of multinomial random data generator implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op #8570 next step of multinomial random categories generator implementation on both cpu and cuda, need corrections and code clean up before review and testing * libnd4j: Multinomial op #8570 code clean up and fixed issues data selecting, moved from coords to tads * libnd4j: Multinomial op #8570 fixed cuda build add reference for math materials that was used for implementation * libnd4j: Multinomial op #8570 fixed several bugs, added several tests and improved cuda version. current implementation works, need testing of reproduction with the same seed * libnd4j: Multinomial op #8570 fixes and optimization after discussion in both cuda and cpu * libnd4j: Multinomial op #8570 add corrections after review, removed tads, replace 2D parallel loop by 3D Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op fixed declaration and add tests need discussion * libnd4j: Multinomial op fix in test * libnd4j: Multinomial op corrected behavior to get reproducible results, fixed issue in uniform value getting, tests added, need cuda review and cuda testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op fixed indexing on uniform calculation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op some corrections in max min declaration Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op fixed index calculation, added rewind, corrected input declaration, added stats tests, both cuda and cpu. cuda need testing * libnd4j: Multinomial op fixed bugs on cuda nad cpu. 
need review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op corrected tests to handle different orders Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op some improvements after code review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op more corrections after review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op fixed seed usage, update tests, fixed cuda based on comments, fixed bug of rewind, removed one behavior, minor corrections. Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op minor corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op rise the bound of fluctuation for random cases Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: Multinomial op modified operation inputs and update implementation and tests on both cpu and cuda * libnd4j: Multinomial op corrected data types according ops.proto Co-authored-by: raver119 <raver119@gmail.com>	2020-01-06 22:35:05 +03:00
raver119	29e8e09db6	String changes (#3 ) * initial commit * additional data types & tensor type Signed-off-by: raver119 <raver119@gmail.com> * next step Signed-off-by: raver119 <raver119@gmail.com> * missing include * sparse_to_dense Signed-off-by: raver119 <raver119@gmail.com> * few more tests files Signed-off-by: raver119 <raver119@gmail.com> * draft Signed-off-by: raver119 <raver119@gmail.com> * numeric sparse_to_dense Signed-off-by: raver119 <raver119@gmail.com> * comment Signed-off-by: raver119 <raver119@gmail.com> * string sparse_to_dense version Signed-off-by: raver119 <raver119@gmail.com> * CUDA DataBuffer expand Signed-off-by: raver119 <raver119@gmail.com> * few tweaks for CUDA build Signed-off-by: raver119 <raver119@gmail.com> * shape fn for string_split Signed-off-by: raver119 <raver119@gmail.com> * one more comment Signed-off-by: raver119 <raver119@gmail.com> * string_split indices Signed-off-by: raver119 <raver119@gmail.com> * next step Signed-off-by: raver119 <raver119@gmail.com> * test passes Signed-off-by: raver119 <raver119@gmail.com> * few rearrangements for databuffer implementations Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer: move inline methods to common implementations Signed-off-by: raver119 <raver119@gmail.com> * add native DataBuffer to Nd4j presets Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer creation Signed-off-by: raver119 <raver119@gmail.com> * use DataBuffer for allocation Signed-off-by: raver119 <raver119@gmail.com> * cpu databuffer as deallocatable Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer setters for bufers Signed-off-by: raver119 <raver119@gmail.com> * couple of wrappers Signed-off-by: raver119 <raver119@gmail.com> * DataBuffers being passed around Signed-off-by: raver119 <raver119@gmail.com> * Bunch of ByteBuffer-related signatures gone Signed-off-by: raver119 <raver119@gmail.com> * - few more Nd4j signatures removed - minor fix for bfloat16 Signed-off-by: raver119 <raver119@gmail.com> * 
nullptr pointer is still a pointer, but 0 as address :) Signed-off-by: raver119 <raver119@gmail.com> * one special test Signed-off-by: raver119 <raver119@gmail.com> * empty string array init Signed-off-by: raver119 <raver119@gmail.com> * one more test in cpp Signed-off-by: raver119 <raver119@gmail.com> * memcpy instead of databuffer swap Signed-off-by: raver119 <raver119@gmail.com> * special InteropDataBuffer for front-end languages Signed-off-by: raver119 <raver119@gmail.com> * few tweaks for java Signed-off-by: raver119 <raver119@gmail.com> * pointer/indexer actualization Signed-off-by: raver119 <raver119@gmail.com> * CustomOp returns list for inputArumgents and outputArguments instead of array Signed-off-by: raver119 <raver119@gmail.com> * redundant call Signed-off-by: raver119 <raver119@gmail.com> * print_variable op Signed-off-by: raver119 <raver119@gmail.com> * - view handling (but wrong one) - print_variable java wrapper Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * - empty arrays handling Signed-off-by: raver119 <raver119@gmail.com> * - deserialization works now Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * one more fix Signed-off-by: raver119 <raver119@gmail.com> * initial cuda commit Signed-off-by: raver119 <raver119@gmail.com> * print_variable message validation Signed-off-by: raver119 <raver119@gmail.com> * CUDA views Signed-off-by: raver119 <raver119@gmail.com> * CUDA special buffer size Signed-off-by: raver119 <raver119@gmail.com> * minor update to match master changes Signed-off-by: raver119 <raver119@gmail.com> * - consider arrays always actual on device for CUDA - additional PrintVariable constructor - CudaUtf8Buffer now allocates host buffer by default Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * - print_variable now allows 
print from device Signed-off-by: raver119 <raver119@gmail.com> * InteropDataBuffer data type fix Signed-off-by: raver119 <raver119@gmail.com> * ... Signed-off-by: raver119 <raver119@gmail.com> * disable some debug messages Signed-off-by: raver119 <raver119@gmail.com> * master pulled in Signed-off-by: raver119 <raver119@gmail.com> * couple of new methods for DataBuffer interop Signed-off-by: raver119 <raver119@gmail.com> * java side Signed-off-by: raver119 <raver119@gmail.com> * offsetted constructor Signed-off-by: raver119 <raver119@gmail.com> * new CUDA deallocator Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart 2 Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart 3 Signed-off-by: raver119 <raver119@gmail.com> * - few new tests - few new methods for DataBuffer management Signed-off-by: raver119 <raver119@gmail.com> * few more tests + few more tweaks Signed-off-by: raver119 <raver119@gmail.com> * two failing tests Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * two failing tests pass Signed-off-by: raver119 <raver119@gmail.com> * now we pass DataBuffer to legacy ops too Signed-off-by: raver119 <raver119@gmail.com> * Native DataBuffer for legacy ops, Java side Signed-off-by: raver119 <raver119@gmail.com> * CPU java side update Signed-off-by: raver119 <raver119@gmail.com> * CUDA java side update Signed-off-by: raver119 <raver119@gmail.com> * no more prepare/register action on java side Signed-off-by: raver119 <raver119@gmail.com> * NDArray::prepare/register use now accepts vectors Signed-off-by: raver119 <raver119@gmail.com> * InteropDataBuffer now has few more convenience methods Signed-off-by: raver119 <raver119@gmail.com> * java bindings update Signed-off-by: raver119 <raver119@gmail.com> * tick device in NativeOps Signed-off-by: raver119 <raver119@gmail.com> * Corrected usage of OpaqueBuffer 
for tests. * Corrected usage of OpaqueBuffer for java tests. * NativeOpsTests fixes. * print_variable now returns scalar Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * compat_string_split fix for CUDA Signed-off-by: raver119 <raver119@gmail.com> * - CUDA execScalar fix - CUDA lazyAllocateHostPointer now checks java indexer/pointer instead of native pointer Signed-off-by: raver119 <raver119@gmail.com> * legacy ops DataBuffer migration prototype Signed-off-by: raver119 <raver119@gmail.com> * ignore device shapeinfo coming from java Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> * minor transformAny fix Signed-off-by: raver119 <raver119@gmail.com> * minor tweak for lazy host allocation Signed-off-by: raver119 <raver119@gmail.com> * - DataBuffer::memcpy method - bitcast now uses memcpy Signed-off-by: raver119 <raver119@gmail.com> * - IndexReduce CUDA dimension buffer fix Signed-off-by: raver119 <raver119@gmail.com> * views for CPU and CUDA Signed-off-by: raver119 <raver119@gmail.com> * less spam Signed-off-by: raver119 <raver119@gmail.com> * optional memory init Signed-off-by: raver119 <raver119@gmail.com> * async memset Signed-off-by: raver119 <raver119@gmail.com> * - SummaryStats CUDA fix - DataBuffer.sameUnderlyingData() impl - execBroadcast fix Signed-off-by: raver119 <raver119@gmail.com> * - reduce3All fix switch to CUDA 10 temporarily Signed-off-by: raver119 <raver119@gmail.com> * CUDA version Signed-off-by: raver119 <raver119@gmail.com> * proper memory deallocator registration Signed-off-by: raver119 <raver119@gmail.com> * HOST_ONLY workspace allocation Signed-off-by: raver119 <raver119@gmail.com> * temp commit Signed-off-by: raver119 <raver119@gmail.com> * few conflicts resolved Signed-off-by: raver119 <raver119@gmail.com> * few minor fixes Signed-off-by: raver119 <raver119@gmail.com> * one more minor fix Signed-off-by: raver119 
<raver119@gmail.com> * NDArray permute should operate on JVM primitives Signed-off-by: raver119 <raver119@gmail.com> * - create InteropDataBuffer for shapes as well - update pointers after view creation in Java Signed-off-by: raver119 <raver119@gmail.com> * - addressPointer temporary moved to C++ Signed-off-by: raver119 <raver119@gmail.com> * CUDA: don't account offset twice Signed-off-by: raver119 <raver119@gmail.com> * CUDA: DataBuffer pointer constructor updated Signed-off-by: raver119 <raver119@gmail.com> * CUDA NDArray.unsafeDuplication() simplified Signed-off-by: raver119 <raver119@gmail.com> * CUDA minor workspace-related fixes Signed-off-by: raver119 <raver119@gmail.com> * CPU DataBuffer.reallocate() Signed-off-by: raver119 <raver119@gmail.com> * print_affinity op Signed-off-by: raver119 <raver119@gmail.com> * print_affinity java side Signed-off-by: raver119 <raver119@gmail.com> * CUDA more tweaks for data locality Signed-off-by: raver119 <raver119@gmail.com> * - compat_string_split tweak - CudaUtf8Buffer update Signed-off-by: raver119 <raver119@gmail.com> * INDArray.close() mechanic restored Signed-off-by: raver119 <raver119@gmail.com> * one more test fixed Signed-off-by: raver119 <raver119@gmail.com> * - CUDA DataBuffer.reallocate() updated - cudaMemcpy (synchronous) restored Signed-off-by: raver119 <raver119@gmail.com> * one last fix Signed-off-by: raver119 <raver119@gmail.com> * bad import removed Signed-off-by: raver119 <raver119@gmail.com> * another small fix Signed-off-by: raver119 <raver119@gmail.com> * one special test Signed-off-by: raver119 <raver119@gmail.com> * fix bad databuffer size Signed-off-by: raver119 <raver119@gmail.com> * release primaryBuffer on replace Signed-off-by: raver119 <raver119@gmail.com> * higher timeout Signed-off-by: raver119 <raver119@gmail.com> * disable timeouts Signed-off-by: raver119 <raver119@gmail.com> * dbCreateView now validates offset and length of a view Signed-off-by: raver119 <raver119@gmail.com> * additional 
validation for dbExpand Signed-off-by: raver119 <raver119@gmail.com> * restore timeout back again Signed-off-by: raver119 <raver119@gmail.com> * smaller distribution for rng test to prevent timeouts Signed-off-by: raver119 <raver119@gmail.com> * CUDA DataBuffer::memcpy now copies to device all the time Signed-off-by: raver119 <raver119@gmail.com> * OpaqueDataBuffer now contains all required methods for interop Signed-off-by: raver119 <raver119@gmail.com> * some javadoc Signed-off-by: raver119 <raver119@gmail.com> * GC on failed allocations Signed-off-by: raver119 <raver119@gmail.com> * minoe memcpu tweak Signed-off-by: raver119 <raver119@gmail.com> * one more bitcast test Signed-off-by: raver119 <raver119@gmail.com> * - NDArray::deviceId() propagation - special multi-threaded test for data locality checks Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * one ignored test Signed-off-by: raver119 <raver119@gmail.com> * skip host alloc for empty arrays Signed-off-by: raver119 <raver119@gmail.com> * ByteBuffer support is back Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer::memcpy minor fix Signed-off-by: raver119 <raver119@gmail.com> * few minor prelu/bp tweaks Signed-off-by: raver119 <raver119@gmail.com> * nullify-related fixes Signed-off-by: raver119 <raver119@gmail.com> * PReLU fixes (#157) Signed-off-by: Alex Black <blacka101@gmail.com> * Build fixed * Fix tests * one more ByteBuffer signature restored Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * PReLU weight init fix Signed-off-by: Alex Black <blacka101@gmail.com> * Small PReLU fix Signed-off-by: Alex Black <blacka101@gmail.com> * - INDArray.migrate() reactivated - DataBuffer::setDeviceId(...) 
added - InteropDataBuffer Z syncToDevice added for views Signed-off-by: raver119 <raver119@gmail.com> * missed file Signed-off-by: raver119 <raver119@gmail.com> * Small tweak Signed-off-by: Alex Black <blacka101@gmail.com> * cuda 10.2 Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> Co-authored-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alex Black <blacka101@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-04 13:27:50 +03:00
Oleh	75123b0a4c	[WIP] Oleh rgb yuv (#147) * libnd4j: RgbToYuv and YuvToRgb, both implementations for both cpu and cuda. Need adding tests and review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToYuv and YuvToRgb, replace coords method on Tad in both cpu and cuda, add tests, fixed bugs Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToYuv and YuvToRgb minor corrections Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToYuv and YuvToRgb corrections to use operations in-place	2019-12-24 18:30:54 +03:00
Abdelrauf	39d43ca170	RgbToYiq and YiqToRgb operations (#142 ) * RgbToYiq and YiqToRgb Signed-off-by: Abdelrauf <rauf@konduit.ai> * CUDA impl for RgbToYiq and YiqToRgb Signed-off-by: raver119 <raver119@gmail.com> * remove print Signed-off-by: raver119 <raver119@gmail.com> * allow inplace for hsv,rgb,yiq ops Signed-off-by: Abdelrauf <rauf@konduit.ai> Co-authored-by: raver119 <raver119@gmail.com>	2019-12-24 15:20:35 +03:00
raver119	495256c827	minor build fix (#139 ) Signed-off-by: raver119 <raver119@gmail.com>	2019-12-21 08:07:13 +03:00
Oleh	211c0df76f	Oleh rgb to gray scale (#138 ) * libnd4j: RgbToGrayscale op #8536 - raw implementation in user branch, need checks for integration and adding other orders Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToGrayscale op #8536 next step of merging images Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToGrayscale op #8536, Revert merge of hsv_to_rgb and rgb_to_hsv as cause conflicts in naming need refactoring before merge, implementation of rbg_to_grs added * libnd4j: RgbToGrayscale op #8536 imlementation and conflict resolve * libnd4j: RgbToGrayscale op #8536 merged operations with images into image, renamed methods and files * libnd4j: RgbToGrayscale op #8536 added test for rgbToGrayScale, need clarification and fixed tests case run Signed-off-by: Oleg <oleg.semeniv@gmail.com> * libnd4j: RgbToGrayscale op #8536 bug fixing and need review * libnd4j: RgbToGrayscale op #8536 some additional corrections after review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * - minor corrections in rgbToGrs test1 Signed-off-by: Yurii <iuriish@yahoo.com> * libnd4j: RgbToGrayscale op #8536, corrected tests and rbf_to_grs, fixed problems, refactoring, need review * libnd4j: RgbToGrayscale op #8536 fix for 'f' order in rgbToGrs * libnd4j: RgbToGrayscale op #8536 fixed several bugs with dimC, test case refactoring and improve Signed-off-by: Oleg <oleg.semeniv@gmail.com> * - add cuda kernel for rgbToGrs op Signed-off-by: Yurii <iuriish@yahoo.com> * - fix linkage errors Signed-off-by: Yurii <iuriish@yahoo.com> Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>	2019-12-20 20:59:29 +03:00
shugeo	67d8199165	[WIP] Shugeo lup (#126 ) * Added infrastructure for implementation op lu for both cuda and cpu platforms. * Added implementation of helpers with lu op. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored LU decomposition to use vector of permutations instead. * Refactored helpers for lu op. * Fixed crash with determinant op. * Refactored cpu LU op heleper. * Added implementation for lu op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed issue with argmax on column. * Added multithreaded behaviour for lu op helper. * Fixed multithreaded cpu implementation helpers for lu op. * Added cuda implementation for lu op helper. * Finished lu helper implementation for cuda platform. * Eliminated waste prints and comments. * Fixed race condition and multithreading issues. * Fixed memory leak with shape construction. * Corrected test for lu op to avoid near zero elements on the main diagonal." Signed-off-by: shugeo <sgazeos@gmail.com> * Improved test for adjust_constast op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed issues with cuda implementation of resize_bicubic helpers. Signed-off-by: shugeo <sgazeos@gmail.com>	2019-12-20 17:56:28 +03:00
Abdelrauf	e0a9cb6c08	[WIP] HSV,RGB color model conversions (#125 ) * CUDA implementation for hsv_to_rgb and rgb_to_hsv Signed-off-by: raver119 <raver119@gmail.com> * hsv_to_rgb and rgb_to_hsv operations Test coverage: c order 1d, 2d, 3d array Signed-off-by: Abdelrauf <rauf@konduit.ai> * Index check Signed-off-by: Abdelrauf <rauf@konduit.ai> * Suppress Msvc floating point errors Signed-off-by: Abdelrauf <rauf@konduit.ai> * Added Index Check for adjust_saturation and adjust_hue Signed-off-by: Abdelrauf <rauf@konduit.ai> * minor fix Signed-off-by: raver119 <raver119@gmail.com> * Fixes missed Msvc floating narrowing errors Signed-off-by: Abdelrauf <rauf@konduit.ai>	2019-12-17 09:42:09 +03:00
Yurii Shyrma	1f5e15b541	Shyrma adjust (#98 ) * - add possibility of passing scalar-array as input parameter for scale factor in adjust hue/contrast/saturation ops - correct typo in function which calculates regularized incomplete beta integral Signed-off-by: Yurii <iuriish@yahoo.com> * - fix bug in betainc cuda kernel Signed-off-by: Yurii <iuriish@yahoo.com> * - start working on implementation of digamma function Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on digamma function (cpu) Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing bugs in digamma op Signed-off-by: Yurii <iuriish@yahoo.com> * - make correction n cuda kernel for polyGamma Signed-off-by: Yurii <iuriish@yahoo.com> * - remove unnecessary stuff from betaInc cuda kernel Signed-off-by: Yurii <iuriish@yahoo.com> * - resolve conflicts in DeclarableOpsTests3.cpp after master branch has been merged Signed-off-by: Yurii <iuriish@yahoo.com> * - restore id number of Not opertion in legacy_ops.h Signed-off-by: Yurii <iuriish@yahoo.com> * - correct padding calculation in mkl dnn conv1d causal Signed-off-by: Yurii <iuriish@yahoo.com> * restore empty check in adjust_contrast_v2 Signed-off-by: raver119 <raver119@gmail.com>	2019-12-03 09:40:45 +03:00
Yurii Shyrma	d19eeaec52	Shyrma casual conv1d (#90 ) * - add causal mode of padding to convolutions Signed-off-by: Yurii <iuriish@yahoo.com> * - add additional tests for causal conv1d Signed-off-by: Yurii <iuriish@yahoo.com> * - add causal mode for cuda conv kernels Signed-off-by: Yurii <iuriish@yahoo.com> * Java side of Conv1D changes Signed-off-by: raver119 <raver119@gmail.com> * Add Conv1DDerivative op Signed-off-by: Alex Black <blacka101@gmail.com> * Causal Conv1D gradient checks Signed-off-by: Alex Black <blacka101@gmail.com> * Tweaks Signed-off-by: Alex Black <blacka101@gmail.com> * - add causal padding mode to conv2d_bp Signed-off-by: Yurii <iuriish@yahoo.com> * More thorough causal conv1d tests Signed-off-by: Alex Black <blacka101@gmail.com>	2019-11-29 14:14:30 +03:00
shugeo	009007120b	Shugeo_release_fixes3 (#81 ) * Implementation for non_max_suppression_v3 was added. Initial version * Added check for overcome threshold. * Added definition for V3 method. * java remapping for NonMaxSuppressionV3 Signed-off-by: raver119 <raver119@gmail.com> * Fixed proporly processing of an empty output and test. * Refactored op to less threshold data to float. * Implemented cuda-based helper for non_max_suppression_v3 op. * Fixed fake_quant_with_min_max_vars op. * Fixed tests with float numbers. * - assert now stops execution - sortByKey/sortByValue now have input validation Signed-off-by: raver119 <raver119@gmail.com> * missing var Signed-off-by: raver119 <raver119@gmail.com> * Fixed proper processing for zero max_size inputs. * Refactored kernel callers. * Fixed return statement for logdet op helper. * Refactored unsorted segment SqrtN op. * get back 8 tail bytes on CUDA Signed-off-by: raver119 <raver119@gmail.com> * Refactored segment prod ops and helpers for cuda and tests. * Additional test. * CudaWorkspace tests updated for 8 tail bytes Signed-off-by: raver119 <raver119@gmail.com> * special atomic test Signed-off-by: raver119 <raver119@gmail.com> * atomicMul/atomicDiv fix for 16bit values Signed-off-by: raver119 <raver119@gmail.com> * Eliminated waste prints.	2019-11-28 21:08:51 +03:00
raver119	83cb0d9329	[WIP] Create and small fix (#67 ) * - create op - skip exec for empty inputs for non_max_suppression - EmptyHandling idea Signed-off-by: raver119 <raver119@gmail.com> * Create op and mapping for it Signed-off-by: raver119 <raver119@gmail.com>	2019-11-21 13:31:20 +03:00
Alex Black	da1944e8e1	SameDiff TF import (#49 ) * Added implementation files for image_resize and resize_bicubic ops. * Image resize and image.resize_bicubic ops implementation. Initial revision. * Minor fix * Some TF imports disabled. * Finished with infrastructure development for image.resize_bilinear op and image_resizo op implementation. * Refactored resize methods. * Added processing for Mitchelcubic algorithm. * adjust_contrast * Small fix for TF import expected value loading when variable name starts with the test name Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tests * Tests added. * Removed tf names absent in mapping. * Some fixes. * Small fixes * Minor change * Some failing tests. * Disable failed test * Ignore some tests * Fix import class mapping Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix float property mapping (flatbuffers) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Override equality function for model 'dropout' Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fail tests * Failed tests ignored temporarily. * Minor fixes * Small fix * Conflict resolved * Default implementations of tensorflowName and onnxName	2019-11-19 22:44:29 +11:00
raver119	1780dcc883	[WIP] Small fixes here and there (#50 ) * one range test Signed-off-by: raver119 <raver119@gmail.com> * few Context convenience singatures Signed-off-by: raver119 <raver119@gmail.com> * one more range test Signed-off-by: raver119 <raver119@gmail.com> * "range" "fix" Signed-off-by: raver119 <raver119@gmail.com> * adjuct_contrast_v2 now allows scale factor to be provided via input_variable Signed-off-by: raver119 <raver119@gmail.com> * adjust_contrast now allows scale factor as variable too Signed-off-by: raver119 <raver119@gmail.com> * bitcast shape tests Signed-off-by: raver119 <raver119@gmail.com> * BitCast import dtype added Signed-off-by: raver119 <raver119@gmail.com> * few more BitCast signatures Signed-off-by: raver119 <raver119@gmail.com>	2019-11-15 17:04:29 +03:00
Alex Black	47d19908f4	Various fixes (#43 ) * #8172 Enable DL4J MKLDNN batch norm backward pass Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8382 INDArray.toString() rank 1 brackets / ambiguity fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8308 Fix handful of broken links (inc. some in errors) Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 1 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 2 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Unused dependencies, round 3 Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Uniform distribution TF import fix Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-14 19:38:20 +11:00
shugeo	08853c7829	Shugeo random uniform int (#30 ) * Corrected randomuniform declaration. * Refactored uniform distribution for both cuda and cpu platforms. * Refactored uniform distribution and tests. * Fixed type usage with indices. * Refactored uniform distribution implementation and tests to full conform with TF implementation. * Refactored gamma function to use type util method. * Copyright changes and fixes with ConstantHelper. * Added error checking on allocate cuda device memory and operations.	2019-11-06 12:49:27 +02:00
shugeo	7b14a9f603	Gamma and Poisson distributions (#27 ) * Added implementation for random_gamma op. * Added implementation for random_poisson op and support classes. * Added helpers for random_poisson and random_gamma ops. * Implementation of random_poisson. The first working edition. * Implementation of random_poisson. Parallelized working edition. * Implementation of random_gamma. Parallelized working edition with alpha only. * Added cuda implementation for helper of poisson distribution. * Corrected shape calculation with random_gamma and tests. * Finished cpu implementation for gamma distribution. * Finished cuda implementation for random_gamma op. * Refactored cpu helpers for random_gamma and random_poisson ops. * Refactored cuda helpers for gamma and poisson distribution. * Refactored cuda helper for gamma distribution. * Refactored cpu helper for random_poisson op. * Refactored cpu helper for random_gamma op.	2019-11-04 15:42:28 +02:00
Yurii Shyrma	0cdb5750e0	Shyrma concat (#24 ) * - provide possibility to pass axis as last input array in concat op - corrcect sumation in bias_add_bp op for NHWC case Signed-off-by: Yurii <iuriish@yahoo.com> * - write code for deconv2d op based on mkl dnn api * no unsafe math Signed-off-by: raver119 <raver119@gmail.com> * no unsafe math Signed-off-by: raver119 <raver119@gmail.com> * - get rid of e<> and p<> methods in svd helper Signed-off-by: Yurii <iuriish@yahoo.com> * - provide mkl api support for deconvolution 3d Signed-off-by: Yurii <iuriish@yahoo.com> * - write deconv2d_bp based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - write deconv3d_bp based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - testing and fixing deconv based on mkl api Signed-off-by: Yurii <iuriish@yahoo.com> * - remove dilation form conv2d/3d mkl Signed-off-by: Yurii <iuriish@yahoo.com> * - minor changes Signed-off-by: Yurii <iuriish@yahoo.com> * - further corrections of deconv ops based on mkl dnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - provide deconv2d_tf based on mkl dnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - add minor corrections required by reviewer Signed-off-by: Yurii <iuriish@yahoo.com>	2019-11-03 12:37:19 +02:00
shugeo	95f7ad7b94	Shugeo suppression overlaps (#9 ) * Added non_max_suppression_overlaps op and tests. * Refactored implementation of non_max_suppression_overlaps. * Refactoring of implementation of non_max_suppression_overlaps op. * Refactoring of implementation of non_max_suppression op. * Fixed portion error. * Added cuda frontends for image suppression ops. * Eliminated crash with cuda arch on image.non_max_suppression_overlaps op. * Improved implementation of image_suppression helper for cpu platform. * The generic approach of non_max_suppression_overlaps op helper with cuda platform. * Working cuda implementation of helper non_max_suppression_overlaps op. * Eliminated waste comments. * Improved implementations for both platforms * Refactored cuda implementation of image.non_max_suppression_overlaps op helper. * Improved cuda implementation of non_max_suppression op helper. * Refactored cuda implementation of image.non_max_suppression_overlaps op helper. * Improved cuda implementation of image.non_max_suppression_overlaps op helper. * Added modifications into cuda implementation for image suppression overlaps op. * Correct queue emulation with cuda implementation of non_max_suppression_overlaps op. * Prefinal stage of cuda implementation of non_max_suppression_overlaps. * Worked cuda implementation of non_max_suppresion_overlaps helper. * Fixed return to proper thread. * Improvements for cuda implementation of image.non_max_suppression_overlaps op helper. * Fixed implementation issues with non_max_suppression_overlaps on cuda platform. * Fixed skip for non_max_suppression_overlaps on cuda platform. * Finalize implementation of image_suppression helper and tests. * Cosmetic changes only.	2019-10-30 13:43:45 +02:00

88 Commits