cavis

Author	SHA1	Message	Date
raver119	9f719488b9	CUDA sync tweaks (#194) * ThreadLocal cache for CudaContext Signed-off-by: raver119 <raver119@gmail.com> * temp commit Signed-off-by: raver119 <raver119@gmail.com> * remove unwanted synchronization Signed-off-by: raver119 <raver119@gmail.com>	2020-01-28 10:55:06 +03:00
raver119	7ef0ef907e	Packages fix (#193) * packages fix Signed-off-by: raver119 <raver119@gmail.com> * few imports fixed Signed-off-by: raver119 <raver119@gmail.com> * few imports fixed Signed-off-by: raver119 <raver119@gmail.com>	2020-01-27 23:04:21 +03:00
raver119	531a72fabd	execution mode (#183 ) * initial commit Signed-off-by: raver119 <raver119@gmail.com> * execution mode java side Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * move exec mode to ContextPrototype Signed-off-by: raver119 <raver119@gmail.com> * copyrights Signed-off-by: raver119 <raver119@gmail.com>	2020-01-27 10:00:07 +03:00
Alex Black	458d141d8e	Fix SDLoss null weights array issue (#185) Signed-off-by: AlexDBlack <blacka101@gmail.com>	2020-01-25 20:13:23 +11:00
Alexander Stoyakin	4db28a9300	Cleanup of multiple projects (#175 ) * Cleanup modules * Moving subprojects to nd4j-api * Project cleanup * Dropped AWS sub-project * dl4j-util moved to core * dl4j-perf moved to core * Tests coverage * Revert "Moving subprojects to nd4j-api" This reverts commit bc6eb573c6b60c407ade47172c5d204725077e6b. * Moved nd4j-buffer and nd4j-context to nd4j-api * Rolled back change * Revert "Project cleanup" This reverts commit 64ac7f369b2d968f7be437718034f093fc886ffc. * Datavec cleaned up * Revert "Moved nd4j-buffer and nd4j-context to nd4j-api" This reverts commit 75f4e8da80d2551e44e1251dd6c5923289fff8e1. # Conflicts: # nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/autodiff/opvalidation/ReductionBpOpValidation.java * Resolve conflict * Compilation fixed. * nd4j-context and nd4j-buffer moved to nd4j-api * Fixed TF mapping for mmul * Fix for dl4j-cuda tests Signed-off-by: Alex Black <blacka101@gmail.com> * Move last few tests from deeplearning4j-nn to -core Signed-off-by: Alex Black <blacka101@gmail.com> * Remove incorrect TF import mapping for TensorMmul op Signed-off-by: Alex Black <blacka101@gmail.com> * Cleaned TF mapping * Fix path for test results on windows * Remove old dependency Signed-off-by: Alex Black <blacka101@gmail.com> * One more attempt to fix path for test results on windows * fixup! One more attempt to fix path for test results on windows * fixup! One more attempt to fix path for test results on windows Co-authored-by: Alex Black <blacka101@gmail.com> Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-01-24 22:35:00 +03:00
raver119	5d69069177	[WIP] Memory limits (#167 ) * initial commit Signed-off-by: raver119 <raver119@gmail.com> * one more initial commit Signed-off-by: raver119 <raver119@gmail.com> * additional initial commit Signed-off-by: raver119 <raver119@gmail.com> * subsequent initial commit Signed-off-by: raver119 <raver119@gmail.com> * initial commit testing Signed-off-by: raver119 <raver119@gmail.com> * initial commit per device Signed-off-by: raver119 <raver119@gmail.com> * initial commit per group Signed-off-by: raver119 <raver119@gmail.com> * initial commit for cuda Signed-off-by: raver119 <raver119@gmail.com> * initial commit for cuda + few missed lines Signed-off-by: raver119 <raver119@gmail.com> * initial commit for cuda + missed includes Signed-off-by: raver119 <raver119@gmail.com> * initial commit for cuda + one more missed include Signed-off-by: raver119 <raver119@gmail.com> * initial commit shouldn't count host mem as dev0 in cuda Signed-off-by: raver119 <raver119@gmail.com> * initial commit that tracks HOST group limits for CUDA Signed-off-by: raver119 <raver119@gmail.com> * initial commit with some Environment changes Signed-off-by: raver119 <raver119@gmail.com> * initial commit with more Environment changes Signed-off-by: raver119 <raver119@gmail.com> * initial commit with maxMasterThreads fix Signed-off-by: raver119 <raver119@gmail.com> * initial commit with maxMasterThreads fix Signed-off-by: raver119 <raver119@gmail.com> * initial commit without maxMasterThreads exception Signed-off-by: raver119 <raver119@gmail.com> * initial commit without Nd4jULong in Environment Signed-off-by: raver119 <raver119@gmail.com> * add sleep and more iterations for OOM cases Signed-off-by: raver119 <raver119@gmail.com> * limits propagation from java side Signed-off-by: raver119 <raver119@gmail.com> * - consume ErrorCode every time - one test for memory limits Signed-off-by: raver119 <raver119@gmail.com> * unordered_map Signed-off-by: raver119 <raver119@gmail.com> * 
unordered_map Signed-off-by: raver119 <raver119@gmail.com> * unordered_map Signed-off-by: raver119 <raver119@gmail.com> * RSub op mapping fixed Signed-off-by: raver119 <raver119@gmail.com> * typo fixed Signed-off-by: raver119 <raver119@gmail.com> * one bad test fixed Signed-off-by: raver119 <raver119@gmail.com>	2020-01-24 10:11:09 +03:00
Robert Altena	0caf50f80f	SDLoss cleanup (#180) Signed-off-by: Robert Altena <Rob@Ra-ai.com>	2020-01-23 22:22:06 +11:00
raver119	256c9d20b0	alloc check for RNG (#179) * missing alloc validation in RandomGenerator for CUDA Signed-off-by: raver119 <raver119@gmail.com> * set error message if rng alloc failed Signed-off-by: raver119 <raver119@gmail.com> * check for error code during RNG creation in java Signed-off-by: raver119 <raver119@gmail.com>	2020-01-23 09:51:02 +03:00
raver119	25db3a44f1	[WIP] few fixes for tests (#177 ) * nd4j-aeron profiles Signed-off-by: raver119 <raver119@gmail.com> * nd4j-aeron profiles Signed-off-by: raver119 <raver119@gmail.com> * skip one long test Signed-off-by: raver119 <raver119@gmail.com> * skip one long test Signed-off-by: raver119 <raver119@gmail.com> * kryo profile Signed-off-by: raver119 <raver119@gmail.com> * few more profiles Signed-off-by: raver119 <raver119@gmail.com> * few more profiles Signed-off-by: raver119 <raver119@gmail.com> * few more profiles Signed-off-by: raver119 <raver119@gmail.com>	2020-01-22 16:12:30 +03:00
Alex Black	a25bb6a11c	Unit/integration test split + test speedup (#166 ) * Add maven profile + base tests methods for integration tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * Switch from system property to environment variable; seems more reliable in intellij Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add nd4j-common-tests module, and common base test; cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Ensure all ND4J tests extend BaseND4JTest Signed-off-by: AlexDBlack <blacka101@gmail.com> * Test spam reduction, import fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add test logging to nd4j-aeron Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix unintended change Signed-off-by: AlexDBlack <blacka101@gmail.com> * Reduce sprint test log spam Signed-off-by: AlexDBlack <blacka101@gmail.com> * More test spam cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Significantly speed up TSNE tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * W2V iterator test unit/integration split Signed-off-by: AlexDBlack <blacka101@gmail.com> * More NLP test speedups Signed-off-by: AlexDBlack <blacka101@gmail.com> * Avoid debug/verbose mode leaking between tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * test tweak Signed-off-by: AlexDBlack <blacka101@gmail.com> * Arbiter extends base DL4J test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Arbiter test speedup Signed-off-by: AlexDBlack <blacka101@gmail.com> * nlp-uima test speedup Signed-off-by: AlexDBlack <blacka101@gmail.com> * More test speedups Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix ND4J base test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Few small ND4J test speed improvements Signed-off-by: AlexDBlack <blacka101@gmail.com> * DL4J tests speedup Signed-off-by: AlexDBlack <blacka101@gmail.com> * More tweaks Signed-off-by: AlexDBlack <blacka101@gmail.com> * Even more test speedups Signed-off-by: AlexDBlack <blacka101@gmail.com> * More tweaks Signed-off-by: 
AlexDBlack <blacka101@gmail.com> * Various test fixes Signed-off-by: Alex Black <blacka101@gmail.com> * More test fixes Signed-off-by: Alex Black <blacka101@gmail.com> * Add ability to specify number of threads for C++ ops in BaseDL4JTest and BaseND4JTest Signed-off-by: Alex Black <blacka101@gmail.com> * nd4j-aeron test profile fix for CUDA Signed-off-by: Alex Black <blacka101@gmail.com>	2020-01-22 22:27:01 +11:00
shugeo	2717b25931	Shugeo qr (#153 ) * Added qr op implementation. Initial version. * Fixed doc for qr op. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of QR decomposition. CPU platform version. * Added a pair of tests for qr op testing. Signed-off-by: shugeo <sgazeos@gmail.com> * QR implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected norm using. * Properly calculated intermediate results with QR decomposition. * Another step to implement QR algorithm by householder. * Cpu implementatio for QR decomposition. The first working edition. * Corrected test to QR decomposition. * Added tad multithreading with QR implementation. * Finished cpu implementation for QR decomposition helpers. * Refactored tests and improved multithreading. * Refactored QR cpu implementation and update cuda implementation helpers. * Cuda QR helper implementation. The first working edition. * Eliminated waste prints. * Restore multithreading with cuda implementation. * Ops names corrected * Refactored qr op helpers to optimize. Signed-off-by: shugeo <sgazeos@gmail.com> * Eliminated waste manual ticking. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored memory allocation to avoid waste memory usage. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored matrixMinor method both for cuda and cpu platforms. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored method of vmul to use raw buffers instead type conversion. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored temporary array of matricies. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com> Co-authored-by: raver119 <raver119@gmail.com>	2020-01-22 13:59:36 +03:00
shugeo	815a2908af	Shugeo solve triangular (#173) * Added implementation of the triangular_solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed compilation issues. Signed-off-by: shugeo <sgazeos@gmail.com> * Added verification of input data and helper facilities for triangular_solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Added cpu implementation for triangular_solve helpers. * Added tests and implementation for upper triangular equations. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a pair of cases to tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Added multithreading with cpu helpers for triangular_solve op. Signed-off-by: shugeo <sgazeos@gmail.com> * Added cuda implementation of triangular_solve op helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished cuda implementation of triangular_solve helpers and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed copyright marks. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected grammar errors with doc and error messages. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored matrices processing with triangular_solve cuda helper implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added triangular_solve wrapper * Fixed mapping * Added processing for adjoint with cpu helpers of triangular_solve op implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Added implementation for adjoint routine with cuda platform. Signed-off-by: shugeo <sgazeos@gmail.com> * Added multithreading with adjoint routine for cpu platform. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-22 10:48:03 +03:00
shugeo	e50b285c2c	Shugeo resize area (#162 ) * Added implementation for resize_area op. Initial commit. * Added implementation of resize_area op. Initial revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected resizeArea functor call. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of resize_area. Cpu platform helpers. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation for resize_area helpers. The first part revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Added a set of tests for resize_area op. Signed-off-by: shugeo <sgazeos@gmail.com> * Cuda implementation for resize_area. Initial approach. Signed-off-by: shugeo <sgazeos@gmail.com> * Adding multithreading for resize_area algorithm. Signed-off-by: shugeo <sgazeos@gmail.com> * Cuda implementation of resize_area helpers. Shared memory approach. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resizeAreaKernel with cuda implementation. * Eliminated compilation errors. * ResizeArea helpers for cuda platform. The first working revision. Signed-off-by: shugeo <sgazeos@gmail.com> * Added test for batched resize_area op testing. Signed-off-by: shugeo <sgazeos@gmail.com> * Implementation of resize_are for cuda platform and tests. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed multithreading with resize_area op helper. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright marks with sources. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright mark for resize_area op implementation. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected copyright mark for parity ops header. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected typo in strings and so on with image resize ops. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored resize_area helpers and multithreading. Signed-off-by: shugeo <sgazeos@gmail.com> * Added ResizeArea wrapper * Added test with align_corners and fixed shape processing with only int args given for output size. 
Signed-off-by: shugeo <sgazeos@gmail.com> * Added test * TF mapping for ResizeArea * Fixed implementation issues with resize_area op for both platforms. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored image resizer struct to use flexible types for ints and floats. Signed-off-by: shugeo <sgazeos@gmail.com> * Improved multithreading with resizeAreaKernel launch. Signed-off-by: shugeo <sgazeos@gmail.com> * Use asynchronical memory copying with cuda platform image resize allocations. Signed-off-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-22 10:46:33 +03:00
raver119	7783012f39	cuDNN integration (#150 ) * initial commit Signed-off-by: raver119 <raver119@gmail.com> * one file Signed-off-by: raver119 <raver119@gmail.com> * few more includes Signed-off-by: raver119 <raver119@gmail.com> * m? Signed-off-by: raver119 <raver119@gmail.com> * const Signed-off-by: raver119 <raver119@gmail.com> * cudnn linkage in tests Signed-off-by: raver119 <raver119@gmail.com> * culibos Signed-off-by: raver119 <raver119@gmail.com> * static reminder Signed-off-by: raver119 <raver119@gmail.com> * platform engine tag Signed-off-by: raver119 <raver119@gmail.com> * HAVE_CUDNN moved to config.h.in Signed-off-by: raver119 <raver119@gmail.com> * include Signed-off-by: raver119 <raver119@gmail.com> * include Signed-off-by: raver119 <raver119@gmail.com> * skip cudnn handle creation if there's not cudnn Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * target device in context Signed-off-by: raver119 <raver119@gmail.com> * platform engines Signed-off-by: raver119 <raver119@gmail.com> * platform engines Signed-off-by: raver119 <raver119@gmail.com> * allow multiple -h args Signed-off-by: raver119 <raver119@gmail.com> * allow multiple -h args Signed-off-by: raver119 <raver119@gmail.com> * move mkldnn out of CPU block Signed-off-by: raver119 <raver119@gmail.com> * link to mkldnn on cuda Signed-off-by: raver119 <raver119@gmail.com> * less prints Signed-off-by: raver119 <raver119@gmail.com> * minor tweaks Signed-off-by: raver119 <raver119@gmail.com> * next step Signed-off-by: raver119 <raver119@gmail.com> * conv2d NCHW draft Signed-off-by: raver119 <raver119@gmail.com> * conv2d biasAdd Signed-off-by: raver119 <raver119@gmail.com> * test for MKL/CUDNN combined use Signed-off-by: raver119 <raver119@gmail.com> * - provide additional code for conv2d ff based on cudnn api, not tested yet Signed-off-by: Yurii <iuriish@yahoo.com> * - further work on conv2d helper based on using cudnn api Signed-off-by: Yurii 
<iuriish@yahoo.com> * - fixing several cuda bugs which appeared after cudnn lib had been started to use Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation of conv2d backprop op based on cudnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - implementaion of conv3d and conv3d_bp ops based on cudnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - bugs fixing in conv3d/conv3d_bp ops (cudnn in use) Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation of depthwiseConv2d (ff/bp) op based on cudnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - implementation of batchnorm ff op based on cudnn api Signed-off-by: Yurii <iuriish@yahoo.com> * - disable cudnn batchnorm temporary Signed-off-by: Yurii <iuriish@yahoo.com> * - add minor change in cmake Signed-off-by: Yurii <iuriish@yahoo.com> * engine for depthwise mkldnn Signed-off-by: raver119 <raver119@gmail.com> * couple of includes Signed-off-by: raver119 <raver119@gmail.com> * - provide permutation to cudnn batchnorm ff when format is NHWC Signed-off-by: Yurii <iuriish@yahoo.com> * lgamma fix Signed-off-by: raver119 <raver119@gmail.com> * - eliminate memory leak in two tests Signed-off-by: Yurii <iuriish@yahoo.com> Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>	2020-01-20 21:32:46 +03:00
Oleh	8fc0e63ce7	Oleh powderev (#171 ) * Libnd4j: Add broadcastable elementwise power derivative #7461 first step of Pow_bp operation implementation Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some corrections of calculation steps Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some bug fixes, the PowDerevative op made broadcastable, add the raw tests for op, need refactoring to use broadcast ops * Libnd4j: Add broadcastable elementwise power derivative #7461 fixed several bugs add broadcast support and tests, need to fix scalar+array and array+scalar Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 fixed bugs for scalar inputs, fixed multinomial tests, added tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 fised bugs for different shapes support, tests updated * Libnd4j: Add broadcastable elementwise power derivative #7461 applied all possible variants via tiled arrays, add support of broadcast for Pow and PowDerivative ops, covered by tests, before review have to be replaced tiled implementation by applyTrueBroadcast Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 replaced tile by broadcast implementation, fixed issue with negative x input, corrected tests, need additional testing Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 added and corrected test cases, corrected implementation need review Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up * Libnd4j: Add broadcastable elementwise power derivative #7461 code clean up, removed some tests, add tests with scalar Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add 
broadcastable elementwise power derivative #7461 code improvement and clean up, split tests Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative #7461 some code clean up Signed-off-by: Oleg <oleg.semeniv@gmail.com> * Libnd4j: Add broadcastable elementwise power derivative replace __isnanf by internal realization Signed-off-by: Oleg <oleg.semeniv@gmail.com> * pow_bp wrapper * Fixed PowBp wrapper * Tests added * Test fixed * Fix return type * Disable powBp usage * Pow backprop changed Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-20 12:59:12 +03:00
shugeo	6943a5f57a	Shugeo lgamma (#170) * lgamma op. Initial version. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored lgamma op and test. Signed-off-by: shugeo <sgazeos@gmail.com> * Lgamma wrapper * Added TF mapping Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-20 12:29:36 +03:00
Alex Black	c84307a6fe	Small SameDiff execution fix (#168) * SameDiff exec: Fix for switch op when predicate is constant, and op is inside loop Signed-off-by: AlexDBlack <blacka101@gmail.com> * Update ignores for failing zoo models Signed-off-by: AlexDBlack <blacka101@gmail.com>	2020-01-08 23:57:23 +11:00
raver119	29e8e09db6	String changes (#3 ) * initial commit * additional data types & tensor type Signed-off-by: raver119 <raver119@gmail.com> * next step Signed-off-by: raver119 <raver119@gmail.com> * missing include * sparse_to_dense Signed-off-by: raver119 <raver119@gmail.com> * few more tests files Signed-off-by: raver119 <raver119@gmail.com> * draft Signed-off-by: raver119 <raver119@gmail.com> * numeric sparse_to_dense Signed-off-by: raver119 <raver119@gmail.com> * comment Signed-off-by: raver119 <raver119@gmail.com> * string sparse_to_dense version Signed-off-by: raver119 <raver119@gmail.com> * CUDA DataBuffer expand Signed-off-by: raver119 <raver119@gmail.com> * few tweaks for CUDA build Signed-off-by: raver119 <raver119@gmail.com> * shape fn for string_split Signed-off-by: raver119 <raver119@gmail.com> * one more comment Signed-off-by: raver119 <raver119@gmail.com> * string_split indices Signed-off-by: raver119 <raver119@gmail.com> * next step Signed-off-by: raver119 <raver119@gmail.com> * test passes Signed-off-by: raver119 <raver119@gmail.com> * few rearrangements for databuffer implementations Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer: move inline methods to common implementations Signed-off-by: raver119 <raver119@gmail.com> * add native DataBuffer to Nd4j presets Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer creation Signed-off-by: raver119 <raver119@gmail.com> * use DataBuffer for allocation Signed-off-by: raver119 <raver119@gmail.com> * cpu databuffer as deallocatable Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer setters for bufers Signed-off-by: raver119 <raver119@gmail.com> * couple of wrappers Signed-off-by: raver119 <raver119@gmail.com> * DataBuffers being passed around Signed-off-by: raver119 <raver119@gmail.com> * Bunch of ByteBuffer-related signatures gone Signed-off-by: raver119 <raver119@gmail.com> * - few more Nd4j signatures removed - minor fix for bfloat16 Signed-off-by: raver119 <raver119@gmail.com> * 
nullptr pointer is still a pointer, but 0 as address :) Signed-off-by: raver119 <raver119@gmail.com> * one special test Signed-off-by: raver119 <raver119@gmail.com> * empty string array init Signed-off-by: raver119 <raver119@gmail.com> * one more test in cpp Signed-off-by: raver119 <raver119@gmail.com> * memcpy instead of databuffer swap Signed-off-by: raver119 <raver119@gmail.com> * special InteropDataBuffer for front-end languages Signed-off-by: raver119 <raver119@gmail.com> * few tweaks for java Signed-off-by: raver119 <raver119@gmail.com> * pointer/indexer actualization Signed-off-by: raver119 <raver119@gmail.com> * CustomOp returns list for inputArumgents and outputArguments instead of array Signed-off-by: raver119 <raver119@gmail.com> * redundant call Signed-off-by: raver119 <raver119@gmail.com> * print_variable op Signed-off-by: raver119 <raver119@gmail.com> * - view handling (but wrong one) - print_variable java wrapper Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * - empty arrays handling Signed-off-by: raver119 <raver119@gmail.com> * - deserialization works now Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * one more fix Signed-off-by: raver119 <raver119@gmail.com> * initial cuda commit Signed-off-by: raver119 <raver119@gmail.com> * print_variable message validation Signed-off-by: raver119 <raver119@gmail.com> * CUDA views Signed-off-by: raver119 <raver119@gmail.com> * CUDA special buffer size Signed-off-by: raver119 <raver119@gmail.com> * minor update to match master changes Signed-off-by: raver119 <raver119@gmail.com> * - consider arrays always actual on device for CUDA - additional PrintVariable constructor - CudaUtf8Buffer now allocates host buffer by default Signed-off-by: raver119 <raver119@gmail.com> * meh Signed-off-by: raver119 <raver119@gmail.com> * - print_variable now allows 
print from device Signed-off-by: raver119 <raver119@gmail.com> * InteropDataBuffer data type fix Signed-off-by: raver119 <raver119@gmail.com> * ... Signed-off-by: raver119 <raver119@gmail.com> * disable some debug messages Signed-off-by: raver119 <raver119@gmail.com> * master pulled in Signed-off-by: raver119 <raver119@gmail.com> * couple of new methods for DataBuffer interop Signed-off-by: raver119 <raver119@gmail.com> * java side Signed-off-by: raver119 <raver119@gmail.com> * offsetted constructor Signed-off-by: raver119 <raver119@gmail.com> * new CUDA deallocator Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart 2 Signed-off-by: raver119 <raver119@gmail.com> * CUDA backend torn apart 3 Signed-off-by: raver119 <raver119@gmail.com> * - few new tests - few new methods for DataBuffer management Signed-off-by: raver119 <raver119@gmail.com> * few more tests + few more tweaks Signed-off-by: raver119 <raver119@gmail.com> * two failing tests Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * two failing tests pass Signed-off-by: raver119 <raver119@gmail.com> * now we pass DataBuffer to legacy ops too Signed-off-by: raver119 <raver119@gmail.com> * Native DataBuffer for legacy ops, Java side Signed-off-by: raver119 <raver119@gmail.com> * CPU java side update Signed-off-by: raver119 <raver119@gmail.com> * CUDA java side update Signed-off-by: raver119 <raver119@gmail.com> * no more prepare/register action on java side Signed-off-by: raver119 <raver119@gmail.com> * NDArray::prepare/register use now accepts vectors Signed-off-by: raver119 <raver119@gmail.com> * InteropDataBuffer now has few more convenience methods Signed-off-by: raver119 <raver119@gmail.com> * java bindings update Signed-off-by: raver119 <raver119@gmail.com> * tick device in NativeOps Signed-off-by: raver119 <raver119@gmail.com> * Corrected usage of OpaqueBuffer 
for tests. * Corrected usage of OpaqueBuffer for java tests. * NativeOpsTests fixes. * print_variable now returns scalar Signed-off-by: raver119 <raver119@gmail.com> * one more test Signed-off-by: raver119 <raver119@gmail.com> * compat_string_split fix for CUDA Signed-off-by: raver119 <raver119@gmail.com> * - CUDA execScalar fix - CUDA lazyAllocateHostPointer now checks java indexer/pointer instead of native pointer Signed-off-by: raver119 <raver119@gmail.com> * legacy ops DataBuffer migration prototype Signed-off-by: raver119 <raver119@gmail.com> * ignore device shapeinfo coming from java Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> * minor transformAny fix Signed-off-by: raver119 <raver119@gmail.com> * minor tweak for lazy host allocation Signed-off-by: raver119 <raver119@gmail.com> * - DataBuffer::memcpy method - bitcast now uses memcpy Signed-off-by: raver119 <raver119@gmail.com> * - IndexReduce CUDA dimension buffer fix Signed-off-by: raver119 <raver119@gmail.com> * views for CPU and CUDA Signed-off-by: raver119 <raver119@gmail.com> * less spam Signed-off-by: raver119 <raver119@gmail.com> * optional memory init Signed-off-by: raver119 <raver119@gmail.com> * async memset Signed-off-by: raver119 <raver119@gmail.com> * - SummaryStats CUDA fix - DataBuffer.sameUnderlyingData() impl - execBroadcast fix Signed-off-by: raver119 <raver119@gmail.com> * - reduce3All fix switch to CUDA 10 temporarily Signed-off-by: raver119 <raver119@gmail.com> * CUDA version Signed-off-by: raver119 <raver119@gmail.com> * proper memory deallocator registration Signed-off-by: raver119 <raver119@gmail.com> * HOST_ONLY workspace allocation Signed-off-by: raver119 <raver119@gmail.com> * temp commit Signed-off-by: raver119 <raver119@gmail.com> * few conflicts resolved Signed-off-by: raver119 <raver119@gmail.com> * few minor fixes Signed-off-by: raver119 <raver119@gmail.com> * one more minor fix Signed-off-by: raver119 
<raver119@gmail.com> * NDArray permute should operate on JVM primitives Signed-off-by: raver119 <raver119@gmail.com> * - create InteropDataBuffer for shapes as well - update pointers after view creation in Java Signed-off-by: raver119 <raver119@gmail.com> * - addressPointer temporary moved to C++ Signed-off-by: raver119 <raver119@gmail.com> * CUDA: don't account offset twice Signed-off-by: raver119 <raver119@gmail.com> * CUDA: DataBuffer pointer constructor updated Signed-off-by: raver119 <raver119@gmail.com> * CUDA NDArray.unsafeDuplication() simplified Signed-off-by: raver119 <raver119@gmail.com> * CUDA minor workspace-related fixes Signed-off-by: raver119 <raver119@gmail.com> * CPU DataBuffer.reallocate() Signed-off-by: raver119 <raver119@gmail.com> * print_affinity op Signed-off-by: raver119 <raver119@gmail.com> * print_affinity java side Signed-off-by: raver119 <raver119@gmail.com> * CUDA more tweaks for data locality Signed-off-by: raver119 <raver119@gmail.com> * - compat_string_split tweak - CudaUtf8Buffer update Signed-off-by: raver119 <raver119@gmail.com> * INDArray.close() mechanic restored Signed-off-by: raver119 <raver119@gmail.com> * one more test fixed Signed-off-by: raver119 <raver119@gmail.com> * - CUDA DataBuffer.reallocate() updated - cudaMemcpy (synchronous) restored Signed-off-by: raver119 <raver119@gmail.com> * one last fix Signed-off-by: raver119 <raver119@gmail.com> * bad import removed Signed-off-by: raver119 <raver119@gmail.com> * another small fix Signed-off-by: raver119 <raver119@gmail.com> * one special test Signed-off-by: raver119 <raver119@gmail.com> * fix bad databuffer size Signed-off-by: raver119 <raver119@gmail.com> * release primaryBuffer on replace Signed-off-by: raver119 <raver119@gmail.com> * higher timeout Signed-off-by: raver119 <raver119@gmail.com> * disable timeouts Signed-off-by: raver119 <raver119@gmail.com> * dbCreateView now validates offset and length of a view Signed-off-by: raver119 <raver119@gmail.com> * additional 
validation for dbExpand Signed-off-by: raver119 <raver119@gmail.com> * restore timeout back again Signed-off-by: raver119 <raver119@gmail.com> * smaller distribution for rng test to prevent timeouts Signed-off-by: raver119 <raver119@gmail.com> * CUDA DataBuffer::memcpy now copies to device all the time Signed-off-by: raver119 <raver119@gmail.com> * OpaqueDataBuffer now contains all required methods for interop Signed-off-by: raver119 <raver119@gmail.com> * some javadoc Signed-off-by: raver119 <raver119@gmail.com> * GC on failed allocations Signed-off-by: raver119 <raver119@gmail.com> * minoe memcpu tweak Signed-off-by: raver119 <raver119@gmail.com> * one more bitcast test Signed-off-by: raver119 <raver119@gmail.com> * - NDArray::deviceId() propagation - special multi-threaded test for data locality checks Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * one ignored test Signed-off-by: raver119 <raver119@gmail.com> * skip host alloc for empty arrays Signed-off-by: raver119 <raver119@gmail.com> * ByteBuffer support is back Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer::memcpy minor fix Signed-off-by: raver119 <raver119@gmail.com> * few minor prelu/bp tweaks Signed-off-by: raver119 <raver119@gmail.com> * nullify-related fixes Signed-off-by: raver119 <raver119@gmail.com> * PReLU fixes (#157) Signed-off-by: Alex Black <blacka101@gmail.com> * Build fixed * Fix tests * one more ByteBuffer signature restored Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * PReLU weight init fix Signed-off-by: Alex Black <blacka101@gmail.com> * Small PReLU fix Signed-off-by: Alex Black <blacka101@gmail.com> * - INDArray.migrate() reactivated - DataBuffer::setDeviceId(...) 
added - InteropDataBuffer Z syncToDevice added for views Signed-off-by: raver119 <raver119@gmail.com> * missed file Signed-off-by: raver119 <raver119@gmail.com> * Small tweak Signed-off-by: Alex Black <blacka101@gmail.com> * cuda 10.2 Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> Co-authored-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alex Black <blacka101@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>	2020-01-04 13:27:50 +03:00
raver119	451d9d57fd	shape function override (#161 ) Signed-off-by: raver119 <raver119@gmail.com>	2020-01-04 09:06:44 +03:00
Robert Altena	53d3bd1269	shallow delete of assign from SDBase. (#164 ) Signed-off-by: Robert Altena <Rob@Ra-ai.com>	2020-01-04 15:26:39 +11:00
Alex Black	29104083cc	Various fixes (#143 ) * #8568 ArrayUtil optimization Signed-off-by: AlexDBlack <blacka101@gmail.com> * #6171 Keras ReLU and ELU support Signed-off-by: AlexDBlack <blacka101@gmail.com> * Keras softmax layer import Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8549 Webjars dependency management Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix for TF import names ':0' suffix issue / NPE Signed-off-by: AlexDBlack <blacka101@gmail.com> * BiasAdd: fix default data format for TF import Signed-off-by: AlexDBlack <blacka101@gmail.com> * Update zoo test ignores Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8509 SameDiff Listener API - provide frame + iteration Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8520 ND4J Environment Signed-off-by: AlexDBlack <blacka101@gmail.com> * Deconv3d Signed-off-by: AlexDBlack <blacka101@gmail.com> * Deconv3d fixes + gradient check Signed-off-by: AlexDBlack <blacka101@gmail.com> * Conv3d fixes + deconv3d DType test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix issue with deconv3d gradinet check weight init Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8579 Fix BaseCudaDataBuffer constructor fix for UINT16 Signed-off-by: AlexDBlack <blacka101@gmail.com> * DataType.isNumerical() returns false for BOOL type Signed-off-by: AlexDBlack <blacka101@gmail.com> * #8504 Reduce Spark log spam for tests Signed-off-by: AlexDBlack <blacka101@gmail.com> * Clean up DL4J gradient check test spam Signed-off-by: AlexDBlack <blacka101@gmail.com> * More Gradient check spam reduction Signed-off-by: AlexDBlack <blacka101@gmail.com> * SameDiff test spam reduction Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fixes for FlatBuffers mapping Signed-off-by: AlexDBlack <blacka101@gmail.com> * SameDiff log spam cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tests should extend BaseNd4jTest Signed-off-by: AlexDBlack <blacka101@gmail.com> * Remove debug line in c++ op Signed-off-by: AlexDBlack 
<blacka101@gmail.com> * ND4J test spam cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * DL4J test spam reduction Signed-off-by: AlexDBlack <blacka101@gmail.com> * More Dl4J and datavec test spam cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix for bad conv3d test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Additional test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Embedding layers: don't inherit global default activation function Signed-off-by: AlexDBlack <blacka101@gmail.com> * Trigger CI Signed-off-by: AlexDBlack <blacka101@gmail.com> * Consolidate all BaseDL4JTest classes to single class used everywhere; make timeout configurable per class Signed-off-by: AlexDBlack <blacka101@gmail.com> * Test fixes and timeout increases Signed-off-by: AlexDBlack <blacka101@gmail.com> * Timeouts and PReLU fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Restore libnd4j build threads arg for CUDA build Signed-off-by: AlexDBlack <blacka101@gmail.com> * Increase timeouts on a few tests to avoid spurious failures on some CI machines Signed-off-by: AlexDBlack <blacka101@gmail.com> * More timeout fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * More test timeout fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tweak timeout for one more test Signed-off-by: AlexDBlack <blacka101@gmail.com> * Final tweaks Signed-off-by: AlexDBlack <blacka101@gmail.com> * One more ignore Signed-off-by: AlexDBlack <blacka101@gmail.com>	2020-01-04 13:45:07 +11:00
Susan Eraly	c32acb2ec7	fix if dir does not exist (#129 ) * fix if dir does not exist Signed-off-by: eraly <susan.eraly@gmail.com> * added simple test Signed-off-by: eraly <susan.eraly@gmail.com>	2019-12-30 19:48:57 -05:00
Alexander Stoyakin	010744ef9c	Lu wrapper and tests fixes (#144 ) * Tests fixed * Lu added * Test fixed * Default timeout * Tests timeouts fixed. * TF import fix * Timeouts added * Timeout fixed. * Test corrected * rgb and yiq conversion ops added * Converter ops added * Header * Yuv converters * API added * Empty test for matmul * Explanation * skip gemm/gemv on empty inputs Signed-off-by: raver119 <raver119@gmail.com> * Test added * Correct test * one more empty pass-through for mmul Signed-off-by: raver119 <raver119@gmail.com> * Cleanup * Test added * Test fixed * Added missing mapping * Added missing mapping Co-authored-by: raver119 <raver119@gmail.com>	2019-12-30 15:06:12 +03:00
Alex Black	1f9e1b6022	SameDiff profiler analysis improvements (#141 ) * #8555 SameDiff profiler analysis improvements Signed-off-by: Alex Black <blacka101@gmail.com> * Fix TF sub-op aggregation Signed-off-by: Alex Black <blacka101@gmail.com> * Small filtering tweak Signed-off-by: Alex Black <blacka101@gmail.com> * Copyright headers Signed-off-by: Alex Black <blacka101@gmail.com>	2019-12-23 15:24:20 +11:00
Alex Black	ce02b6fae7	Small fixes (#140 ) * Allow scalar op result array auto allocation Signed-off-by: AlexDBlack <blacka101@gmail.com> * Don't swallow underlying exception for calculateOutputShape execution failures Signed-off-by: AlexDBlack <blacka101@gmail.com> * Ignore for known keras failure Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-21 17:00:46 +11:00
Alexander Stoyakin	6d8a063c9b	nd4j-tests cleanup (#137 ) * Fixed tests * Invalid test removed	2019-12-20 16:38:33 +03:00
Alex Black	3d8f6d50a1	SameDiff profiler / tracing and profile analysis/comparison (#133 ) * Profiler Signed-off-by: Alex Black <blacka101@gmail.com> * Next steps, polishing, and loading SD/TF format JSON Signed-off-by: AlexDBlack <blacka101@gmail.com> * Next steps Signed-off-by: AlexDBlack <blacka101@gmail.com> * Profile comparison method Signed-off-by: AlexDBlack <blacka101@gmail.com> * Make profiling result writing async to reduce main thread overhead Signed-off-by: AlexDBlack <blacka101@gmail.com> * Profiling polishing Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * Profile analyzer fixes Signed-off-by: Alex Black <blacka101@gmail.com> * Polish Signed-off-by: Alex Black <blacka101@gmail.com> * Cleanup Signed-off-by: Alex Black <blacka101@gmail.com> * Small formatting improvement Signed-off-by: Alex Black <blacka101@gmail.com> * Formatting tweak Signed-off-by: Alex Black <blacka101@gmail.com> * License headers Signed-off-by: Alex Black <blacka101@gmail.com>	2019-12-19 23:43:58 +11:00
Alexander Stoyakin	f5068f3980	Added missing Java ops wrappers (#122 ) * Timeouts added * Added some ops * Ops added * Fixed tests * Minor fix * Some fixes * Digamma added * Small fixes * Timeouts added * Added some ops * Ops added * Fixed tests * Minor fix * Some fixes * Digamma added * Small fixes * Fused batch norm fixes- Signed-off-by: AlexDBlack <blacka101@gmail.com> * Tests switched off. * Added test for resize_bicubic. * Eliminated wasted in test of bicubic resize. * Switched off multithreading explicit. * HsvToRgb and RgbToHsv added * Eliminated waste comments and conform proper float constants. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed multithreading with resize_bicubic helper for cpu platform. Signed-off-by: shugeo <sgazeos@gmail.com> * ResizeBicubic was fixed. * Some fixes * Fix op name * Validation fixed. * Clarifications for tests * Wrappers and small fixes for new ops.	2019-12-19 20:15:48 +11:00
Alex Black	bfd9e3692a	Add op counting to TensorFlowImportValidator (#128 ) * Add op counting to TensorFlowImportValidator Signed-off-by: AlexDBlack <blacka101@gmail.com> * Test tweak Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-17 10:23:37 +11:00
AlexDBlack	0ab39a2274	nd4j-jackson: exclude java.xml.stream.XML*Factory from service loader to avoid clashes with other non-shaded jackson etc on classpath Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-13 21:41:28 +11:00
AlexDBlack	0df1b46c8c	Merge	2019-12-10 15:08:50 +11:00
raver119	a5f5ac72b1	reduce bool changes (#118 ) * reduce bool changes Signed-off-by: raver119 <raver119@gmail.com> * reduce bool tweaks Signed-off-by: raver119 <raver119@gmail.com>	2019-12-09 20:08:59 +03:00
Alex Black	0175ace4c3	Small tweaks (#119 ) Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-09 23:08:00 +11:00
Alexander Stoyakin	927d591421	ResizeBicubic added (#117 ) * ResizeBicubic added Some fixes. * Test fixed * Narrowed argument type changed to boolean * Clean up	2019-12-09 18:25:39 +11:00
Alex Black	b66154a9d4	Add ArraySavingListener for debugging (#114 ) Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-09 14:16:11 +11:00
raver119	b32dd1bf92	[WIP] resize_bicubic types (#116 ) * resize_bicubic: allow more dtypes Signed-off-by: raver119 <raver119@gmail.com> * resize_bicubic: allow less dtypes Signed-off-by: raver119 <raver119@gmail.com> * Refactored resize_bicubic op to full conform with TF1.5 and tests. * Corrected test to proper data type output. Signed-off-by: shugeo <sgazeos@gmail.com> * Corrected double input test to float constant outputs. Signed-off-by: shugeo <sgazeos@gmail.com> * Finished with correction of tests for bicubic interpolated resizes expected. Signed-off-by: shugeo <sgazeos@gmail.com> * Fixed adjust_contrast ops to allow non-RGB inputs. Signed-off-by: shugeo <sgazeos@gmail.com> * Refactored adjust_contrast_v2 to conform with TF one. Signed-off-by: shugeo <sgazeos@gmail.com> * AdjustContrast tests activated * two typos fixed Signed-off-by: raver119 <raver119@gmail.com>	2019-12-06 18:58:37 +03:00
raver119	972fae60dc	Update master (#8511 ) * cleaned up bert iterator tests (#110) Signed-off-by: eraly <susan.eraly@gmail.com> * Various pre-release fixes (#111) * Various fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix default dtypes for MaxPoolWithArgmax Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small pre-release tweak (#112) * Log UI address on launch as in previous Play-based UI Signed-off-by: AlexDBlack <blacka101@gmail.com> * Logging level tweak for UI Signed-off-by: AlexDBlack <blacka101@gmail.com> * http not https Signed-off-by: AlexDBlack <blacka101@gmail.com> * datavec python ensure host (#113) * ensure host * one more host ensure * info->debug * [WIP] reverse improvements (#115) * initial commit Signed-off-by: raver119 <raver119@gmail.com> * reverse draft Signed-off-by: raver119 <raver119@gmail.com> * reverse kernel Signed-off-by: raver119 <raver119@gmail.com> * reverse kernel Signed-off-by: raver119 <raver119@gmail.com> * 2 micro fixes Signed-off-by: raver119 <raver119@gmail.com> * Shugeo resize fix5 (#102) * Refactored resize images ops to use TF-like bool args as input. * Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops. * Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers. * Refactored nearest_neighbor resize op. * Added a pair of tests for special case of resize_bilinear algorithm. * Fixed issue with resize_bilinear op. * Refactored cpu implementation for helpers with resize_nearest_neighbor op. * Final fixed for resize ops to conform TF v.1.5 * Refactored cuda helpers for resize_neares_neighbor op. * Fixed resize_bilinear to accept proper data. * Fixed issue with non-float input for resize_bilinear op. * Refactored cuda helper for resize_bilinear to proper process non-float inputs. * Added tests for resize_bilinear to int inputs. 
* Fixed ResizeBilinear wrapper * Tests fixed * Fixed float and bool constant to avoid overflow for some kind of compilers. * Corrected float constants with float data type. * Added f suffix for float constants. * Corrected float constant to avoid overflow with initializing lists. * Corrected float initializing list with float input. * Corrected bool constant with initalizing list. * Corrected float and bool values with initializing lists. * Fixed wrong constant. * Fixed issue with 1x1 input picture for resize. * ResizeBilinear default values on import fix Signed-off-by: raver119 <raver119@gmail.com>	2019-12-06 11:10:44 +03:00
Robert Altena	e7730eded4	delete unused and refactor. (#8262 ) Signed-off-by: Robert Altena <Rob@Ra-ai.com>	2019-12-05 22:25:41 -05:00
shugeo	e09a785232	Shugeo resize fix5 (#102 ) * Refactored resize images ops to use TF-like bool args as input. * Refactored helpers for cpu implementation of resize_bilinear and resize_nearest_neighbor ops. * Refactored cuda implementation for image.resize_bilinear and image.resize_nearest_neighbor ops helpers. * Refactored nearest_neighbor resize op. * Added a pair of tests for special case of resize_bilinear algorithm. * Fixed issue with resize_bilinear op. * Refactored cpu implementation for helpers with resize_nearest_neighbor op. * Final fixed for resize ops to conform TF v.1.5 * Refactored cuda helpers for resize_neares_neighbor op. * Fixed resize_bilinear to accept proper data. * Fixed issue with non-float input for resize_bilinear op. * Refactored cuda helper for resize_bilinear to proper process non-float inputs. * Added tests for resize_bilinear to int inputs. * Fixed ResizeBilinear wrapper * Tests fixed * Fixed float and bool constant to avoid overflow for some kind of compilers. * Corrected float constants with float data type. * Added f suffix for float constants. * Corrected float constant to avoid overflow with initializing lists. * Corrected float initializing list with float input. * Corrected bool constant with initalizing list. * Corrected float and bool values with initializing lists. * Fixed wrong constant. * Fixed issue with 1x1 input picture for resize. * ResizeBilinear default values on import fix Signed-off-by: raver119 <raver119@gmail.com>	2019-12-05 22:05:33 +03:00
Alex Black	2052ce7026	Various pre-release fixes (#111 ) * Various fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Fix default dtypes for MaxPoolWithArgmax Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-12-05 14:20:03 +11:00
Fariz Rahman	0d14032d26	TF Updates (#87 ) * tf updates * pom * copyright * graphrunner tests * gpu test * getSessionOptionsConfigProto * dtype fix * Small fix Signed-off-by: AlexDBlack <blacka101@gmail.com> * cast graphs * savemodel test fix * testresource instead of local * Logging level Signed-off-by: AlexDBlack <blacka101@gmail.com> * gson dependency issue fix; fix GraphRunnerTest for no session options config case Signed-off-by: Alex Black <blacka101@gmail.com> * Final tweaks Signed-off-by: AlexDBlack <blacka101@gmail.com> * few minor fixes Signed-off-by: raver119 <raver119@gmail.com> * one more fix Signed-off-by: raver119 <raver119@gmail.com> * Tweak configuration for GraphRunnerTest Signed-off-by: AlexDBlack <blacka101@gmail.com> * nd4j align config * tf warmup	2019-12-04 17:11:03 +11:00
raver119	25b3cd9b80	[WIP] CUDA tests (#95 ) * one more CI test Signed-off-by: raver119 <raver119@gmail.com> * export additional symbols Signed-off-by: raver119 <raver119@gmail.com> * few more tweaks Signed-off-by: raver119 <raver119@gmail.com> * one more tweak for linux Signed-off-by: raver119 <raver119@gmail.com> * fix dtype in few tests Signed-off-by: raver119 <raver119@gmail.com> * missing sync and memset in couple of tests Signed-off-by: raver119 <raver119@gmail.com> * copy step for libnd4j cuda Signed-off-by: raver119 <raver119@gmail.com> * no-op on empty for adjust hue/contrast/saturation Signed-off-by: raver119 <raver119@gmail.com> * CUDA_VERBOSE Off Signed-off-by: raver119 <raver119@gmail.com> * BroadcastBool fix + few tests Signed-off-by: raver119 <raver119@gmail.com> * trigger jenkins Signed-off-by: raver119 <raver119@gmail.com> * trigger jenkins Signed-off-by: raver119 <raver119@gmail.com> * - ignore couple of warnings - remove redundant compiler options Signed-off-by: raver119 <raver119@gmail.com>	2019-12-02 21:37:21 +03:00
Alexander Stoyakin	5e152c0d9a	TF import tests - adding missing operations (#65 ) * Add and fix mappings. * Intermediate * Added and fixed some mappings * Added op * Missing constructors added. * Added new mappings * SDImage wrappers and minor tweaks. * Added missing constructor * Some corrections * Cleanup * Small fixes * Ops wrappers * Minor fixes. * Max Pooling * MaxPoolWithArgmax * Some fixes * Ignores for failures * Some ops fixed. * Some fixes * Missing package added * Some fixes * Ignored tests fixed. * Some fixes * Merge master * bitcast fix Signed-off-by: raver119 <raver119@gmail.com> * Bitcast fixed	2019-12-02 21:23:06 +11:00
Alex Black	8123d9fa9b	SameDiff: Add Java-level assertion check/exception (#96 ) Signed-off-by: Alex Black <blacka101@gmail.com>	2019-12-02 18:07:54 +11:00
Alex Black	2be47082c9	#8470 TrainingConfig json fix for Evaluation instances (#93 ) Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-30 20:08:30 +11:00
Alex Black	35ab4a72ba	TF import test resources loading precision fixes (#92 ) * Fix precision issues when loading from CSV Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small tweak Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-30 18:58:37 +11:00
Alex Black	4fb9fa7748	Add ND4J namespaces (#83 ) * Add NDValidation Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add bitwise namespace Signed-off-by: AlexDBlack <blacka101@gmail.com> * Math namespace op constructor fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Constructor fixes Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add Math namespace Signed-off-by: AlexDBlack <blacka101@gmail.com> * Update NDBitwise Signed-off-by: AlexDBlack <blacka101@gmail.com> * Add random namespaces Signed-off-by: AlexDBlack <blacka101@gmail.com> * Update Signed-off-by: AlexDBlack <blacka101@gmail.com> * NN namespace Signed-off-by: AlexDBlack <blacka101@gmail.com> * Small cleanup Signed-off-by: AlexDBlack <blacka101@gmail.com>	2019-11-30 18:39:32 +11:00
Yurii Shyrma	d19eeaec52	Shyrma casual conv1d (#90 ) * - add causal mode of padding to convolutions Signed-off-by: Yurii <iuriish@yahoo.com> * - add additional tests for causal conv1d Signed-off-by: Yurii <iuriish@yahoo.com> * - add causal mode for cuda conv kernels Signed-off-by: Yurii <iuriish@yahoo.com> * Java side of Conv1D changes Signed-off-by: raver119 <raver119@gmail.com> * Add Conv1DDerivative op Signed-off-by: Alex Black <blacka101@gmail.com> * Causal Conv1D gradient checks Signed-off-by: Alex Black <blacka101@gmail.com> * Tweaks Signed-off-by: Alex Black <blacka101@gmail.com> * - add causal padding mode to conv2d_bp Signed-off-by: Yurii <iuriish@yahoo.com> * More thorough causal conv1d tests Signed-off-by: Alex Black <blacka101@gmail.com>	2019-11-29 14:14:30 +03:00
Samuel Audet	5e07998e59	Add support for CUDA 10.2 (#89 )	2019-11-29 16:31:03 +11:00
shugeo	009007120b	Shugeo_release_fixes3 (#81 ) * Implementation for non_max_suppression_v3 was added. Initial version * Added check for overcome threshold. * Added definition for V3 method. * java remapping for NonMaxSuppressionV3 Signed-off-by: raver119 <raver119@gmail.com> * Fixed proporly processing of an empty output and test. * Refactored op to less threshold data to float. * Implemented cuda-based helper for non_max_suppression_v3 op. * Fixed fake_quant_with_min_max_vars op. * Fixed tests with float numbers. * - assert now stops execution - sortByKey/sortByValue now have input validation Signed-off-by: raver119 <raver119@gmail.com> * missing var Signed-off-by: raver119 <raver119@gmail.com> * Fixed proper processing for zero max_size inputs. * Refactored kernel callers. * Fixed return statement for logdet op helper. * Refactored unsorted segment SqrtN op. * get back 8 tail bytes on CUDA Signed-off-by: raver119 <raver119@gmail.com> * Refactored segment prod ops and helpers for cuda and tests. * Additional test. * CudaWorkspace tests updated for 8 tail bytes Signed-off-by: raver119 <raver119@gmail.com> * special atomic test Signed-off-by: raver119 <raver119@gmail.com> * atomicMul/atomicDiv fix for 16bit values Signed-off-by: raver119 <raver119@gmail.com> * Eliminated waste prints.	2019-11-28 21:08:51 +03:00

286 Commits