Commit Graph

745 Commits (58550b7c98390d008bae637db10500e891064265)

Author SHA1 Message Date
Yurii Shyrma 58550b7c98
[WIP] Shyrma coords (#305)
* - provide faster index2coords function for cpu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - new faster index2coords function is introduced into cpu code

Signed-off-by: Yurii <iuriish@yahoo.com>

* - replace long long coordinates with int coordinates

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add missed reload of coords2index function

Signed-off-by: Yurii <iuriish@yahoo.com>

* - reststart  jenkins

Signed-off-by: Yurii <iuriish@yahoo.com>

* - rollback changes in convolutions.cu and addBias.cu

Signed-off-by: Yurii <iuriish@yahoo.com>
2020-03-11 16:21:59 +03:00
raver119 50b7d82b96 more compilation units
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-11 11:43:18 +03:00
raver119 a7a97d8259 rl4j: update host pointers content before reading them
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-11 10:57:55 +03:00
Yurii Shyrma 6aaca58506
Shyrma broadcast (#302)
* - profiling TrueBroadcastHelper

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further improving of TrueBroadcastHelper

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further profiling of broadcast op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - implementation of broadcastShapeHelper which inserts unities in shapes of arrays to be broadcasted

Signed-off-by: Yurii <iuriish@yahoo.com>

* - provide additional method in ConstantShapeHelper class for deducing broadcast shapes with unities

Signed-off-by: Yurii <iuriish@yahoo.com>

* - provide new NativeOps helpers for usual and true broadcast methods

Signed-off-by: Yurii <iuriish@yahoo.com>

* enable bert profiler

Signed-off-by: raver119 <raver119@gmail.com>

* - delete unnessesary tests

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-03-10 16:29:09 +03:00
Oleh c3223dbc7a
Improve ResultSet usage in libnd4j (#281)
* libnd4j profiling DeclarableOp and Tests by replacing return ResultSet pointer by instance

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j profiling semantic change in tests cases

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j some corrections to make new ResultSet semantic works, fixed one test

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more tests fixes

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* - correct copy and move assignment operators of ResultSet class

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: Yurii <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
2020-03-10 07:42:50 +03:00
raver119 57210b936c
Revert "OpenMP Threads execution (#297)" (#299)
This reverts commit dd2043ef48.
2020-03-09 08:22:49 +03:00
raver119 dd2043ef48
OpenMP Threads execution (#297)
* omp threads backported

Signed-off-by: raver119 <raver119@gmail.com>

* omp scalar reduce

Signed-off-by: raver119 <raver119@gmail.com>

* timing

Signed-off-by: raver119 <raver119@gmail.com>

* timing

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* minor tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* namespace change

Signed-off-by: raver119 <raver119@gmail.com>

* num_threads

Signed-off-by: raver119 <raver119@gmail.com>

* one minor fix

Signed-off-by: raver119 <raver119@gmail.com>
2020-03-09 08:21:44 +03:00
Andrii T a2ec3dbc97
Image namespace (#176)
* created NDImage.java and fixed constructor in AdjustContrast.java

* created NDImage.java and fixed constructor in AdjustContrast.java

* created NDImage.java and fixed constructor in AdjustContrast.java v2

* regenerated NDImage from cleaned Image,kt also cleaned AdjustContrast.java

* draft of NDCNN

* draft of NDCNN

* started NDRNN

* started NDRNN

* looking like finished with namespace

* Regenerate namespaces

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Add ND4J namespace methods for new namespaces

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fixes, cleanup

Signed-off-by: Alex Black <blacka101@gmail.com>

* More fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Fix

Signed-off-by: Alex Black <blacka101@gmail.com>

Co-authored-by: Andrii Tuzhykov <andrew@unrealists.com>
Co-authored-by: Andrii Tuzhykov <andrew@konduit.ai>
Co-authored-by: AlexDBlack <blacka101@gmail.com>
2020-03-09 13:35:17 +11:00
Alex Black a80fb99a5f
DL4J integrations tests updates + add SameDiff support (#298)
* Revive and start updating DL4J integration tests

Signed-off-by: Alex Black <blacka101@gmail.com>

* Add SameDiff support - first pass

Signed-off-by: Alex Black <blacka101@gmail.com>

* SameDiff test case generation

Signed-off-by: Alex Black <blacka101@gmail.com>

* SameDiff integration tests polishing

Signed-off-by: Alex Black <blacka101@gmail.com>

* More SameDiff integration test fixes

Signed-off-by: Alex Black <blacka101@gmail.com>

* Final polish

Signed-off-by: Alex Black <blacka101@gmail.com>

* Small test tweak

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-07 22:44:41 +11:00
Oleh ead5162c97
Tanh mkldnn implementation (#296)
* libnd4j first step of softmax mkldnn implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j raw implementation of mkldnn softmax

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge master and added softmax to MklDnnTests

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j some corrections for softmax mkldnn

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge branch, fixed problem with negative axis, fixed dnnl::memory::format_tag selection, test cases added

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j minor corrections to avoid risk connected with negative axis usage

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed windows builds, added switcher to use mkldnn sofmax version only for 3D, 4D, 5D, 6D arrays

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed dataType selection per request

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fix for mac and windows builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j builds fix

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j first spet of elementwize tanh implementation on mkldnn

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed typo in error message for softmax MKLDNN, test case added, implementation of tanh on MKLDNN, need supported DataType testing

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j several fixes for tanh and temporary performance test added

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed mkldnn platform loader for tanh

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j MklDnn tanh removed unsupported data types, removed performance test case, added more appropriate equivalence test case, code clean up

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed problem with empty input case for MklDnn tanh and softmax

Signed-off-by: Oleg <oleg.semeniv@gmail.com>
2020-03-06 17:11:22 +03:00
Alex Black e6a7b94fe4
Loss namespace (#294)
* codegen for SDLoss. WIP.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* first pass of SDLoss.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* wip. Firsat cut of new op constructors. UNTESTED , NOT COMPILED YET.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* updated op signatures.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* add NDLoss tests.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* fix test.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* adds loss default params. factory.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* Regenerate NDLoss

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* adds tests for null weights.

Signed-off-by: Robert Altena <Rob@Ra-ai.com>

* Last few tweaks

Signed-off-by: Alex Black <blacka101@gmail.com>

Co-authored-by: Robert Altena <Rob@Ra-ai.com>
2020-03-06 16:07:22 +11:00
Alex Black 7494117e90
#8751 Arbiter grid search candidate generator fix [WIP] (#292)
* #8751 Arbiter grid search candidate generator fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* Small fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* Timeout

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-06 12:01:21 +11:00
Alex Black 19d5a8d49d
Various fixes (#290)
* Add check to ensure ALL tests extend BaseND4JTest for proper timeouts + logging

Signed-off-by: Alex Black <blacka101@gmail.com>

* Add 'must extend BaseDL4JTest' check for deeplearning4j-core

Signed-off-by: Alex Black <blacka101@gmail.com>

* Flush logging on workspace exit during tests

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-06 00:02:32 +11:00
raver119 2911da061b
blas fallback (#291)
Signed-off-by: raver119 <raver119@gmail.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-03-05 14:11:13 +03:00
raver119 784a2d13f8
separate omp impl for softmax (#289)
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-05 11:14:22 +03:00
raver119 3bb22a6ff8
strided_slice without view (#288)
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-05 09:56:52 +03:00
raver119 ca96a13ed0 softmax as standalone compilation unit
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-05 08:45:10 +03:00
Oleh 4d81af9fe9
Softmax operation implementation for mkldnn (#286)
* libnd4j first step of softmax mkldnn implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j raw implementation of mkldnn softmax

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge master and added softmax to MklDnnTests

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j some corrections for softmax mkldnn

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge branch, fixed problem with negative axis, fixed dnnl::memory::format_tag selection, test cases added

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j minor corrections to avoid risk connected with negative axis usage

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed windows builds, added switcher to use mkldnn sofmax version only for 3D, 4D, 5D, 6D arrays

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed dataType selection per request

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fix for mac and windows builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j builds fix

Signed-off-by: Oleg <oleg.semeniv@gmail.com>
2020-03-04 19:36:42 +03:00
Samuel Audet 1c89512ec0
Add Maven profiles for ARM builds to pom.xml files (#265)
* Add Maven profiles for ARM builds to pom.xml files

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Remove mkl from dependencies when running on non intel/amd platforms

* Downgrade openblas for now

* Change back to 0.3.8

Co-authored-by: Adam Gibson <1144306+agibsonccc@users.noreply.github.com>
2020-03-04 11:11:01 +03:00
raver119 f990b2486d
simplified addBias2D for CUDA (#285)
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-04 09:50:55 +03:00
raver119 11d148a5eb get back to single-byte reads
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-04 08:14:39 +03:00
Fariz Rahman fec620fafa
TensorflowConversion Data Types (#284)
* dtypes

* bf16 and bool

* tests
2020-03-04 11:46:32 +11:00
raver119 d9cfa8073f bigger reads
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-03 14:19:55 +03:00
raver119 ebee7687e8 mkldnn version upgrade
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-03 08:57:02 +03:00
Yurii Shyrma 78934c17ad
profiling of stack and unstack ops (#261)
* - profiling of stack and unstack ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - fix bug in cpu concat op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correction of cuda stack and unstack

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change shape.h method which operates with unity dimensions strides

Signed-off-by: Yurii <iuriish@yahoo.com>

* - rearrange stack tests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct evaluation of smallest stride for moving through contiguous axis

Signed-off-by: Yurii <iuriish@yahoo.com>

* - forgot to update signature of function strideOverContigAxis in cuda concat and split ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - remove ShapeUtils::shapeAsString method applied before input arrays validations

Signed-off-by: Yurii <iuriish@yahoo.com>

* -  further removing of ShapeUtils::shapeAsString

Signed-off-by: Yurii <iuriish@yahoo.com>

* - take sub-array shapeIndo/offset calculation out of NDArray class
- add possibility of contiguous memory copy in execTransformAny op if opNum == assign

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct test_empty_scatter_2 in EmptyTests.cpp

Signed-off-by: Yurii <iuriish@yahoo.com>

* - profiling of slice op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of contiguous memcpy for some cases in concat and split ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - forgot to declare oid nd4j::SpecialMethods<T>::splitCpuGeneric

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct typo in calculation of threads in cuda split op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - forgot to correct another set of threads variables in split cuda ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further conflicts resolving

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-03-03 07:32:37 +03:00
raver119 0f581e74e3 one small test rearrangement
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 19:52:11 +03:00
raver119 c54cdaab75
full bert graph (#282)
Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 18:14:32 +03:00
raver119 63fa3c2ef3
libnd4j polishing (#273)
* initial set of include changes

Signed-off-by: raver119 <raver119@gmail.com>

* one more tweak

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* few more rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* cuda includes rearrangements

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>

* = namespace changed to sd
- few CMake variables renamed with SD_ prefix

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>

* LoopKind minor fix

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* sanitizer is optional now

Signed-off-by: raver119 <raver119@gmail.com>

* dev tests updated

Signed-off-by: raver119 <raver119@gmail.com>

* few more changes

Signed-off-by: raver119 <raver119@gmail.com>

* last update

Signed-off-by: raver119 <raver119@gmail.com>

* java update

Signed-off-by: raver119 <raver119@gmail.com>
2020-03-02 12:49:41 +03:00
Alex Black 483c3d7b8c
Assorted SameDiff/DL4J fixes (#279)
* #8565 Normalizer toString/hashcode

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8731 ImagePreProcessingScaler lables/segmentation fix

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8691 Fix SameDiffLayer/Vertx finetuning and parameter setting support

Signed-off-by: Alex Black <blacka101@gmail.com>

* #8663 DL4J embedding layer weight init - don't depend on vocab size

Signed-off-by: Alex Black <blacka101@gmail.com>

* EmbeddingLayer test tweak

Signed-off-by: Alex Black <blacka101@gmail.com>
2020-03-02 16:15:49 +11:00
Oleh f116f53d61
Loops auto-vectorization problem fix (#277)
* libnd4j cast loop types

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more type castination added to loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j sync casting types of iterated variable in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more loops reviewed for vectorization problem fix

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed several typos

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j several more files reviewed to fix auto-vectorization problem in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge master and reviewed more files to fix auto-vectorization problem in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j several type casting added in broadcasting that were missed, fixed mac builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j double check all files and fix several more places in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j revert changes for lup.cpp

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more files reviewed for auto-vectorization problem fix

Signed-off-by: Oleg <oleg.semeniv@gmail.com>
2020-02-28 17:04:45 +03:00
raver119 5332ace32b
better inplace exec with FastPath (#280)
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-28 12:06:30 +03:00
shugeo 330a69d4e2
Shugeo solve ls (#203)
* lstsq op. Initial commit.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Least squares linear problem solve op (lstsq). Cpu draft implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed shape routine and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added test for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Rectification for lstsq op implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected test to avoid numerical inconsistensy.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added prints for check computing.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Corrected tests to use evalueate facility instead.

Signed-off-by: shugeo <sgazeos@gmail.com>

* CPU implementation of MatrixSolveLs op and tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added cuda implementation for helpers with lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored tests for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added processing for empty inputs.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Merged tests.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored lstsq op for fast case.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed some issues with solve.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed lstsq op to avoid erros.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added kernel for giagonal factor

Signed-off-by: shugeo <sgazeos@gmail.com>

* lstsq wrapper and triangular_solve fixed

* Added proper processing empty inputs and test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* SequenceMask test

* Build fixed

* Added proper processing of empty inputs with solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Mapping added

* Added check of input shapes with solve op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Added a couple of tests for lstsq op and minor changes with cuda helper for one.'

Signed-off-by: shugeo <sgazeos@gmail.com>

* Tests on

* Refactored test for lstsq op.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed test

* Added another approach for lstsq op aka solve_ls.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Finished cpu part for solve_ls op helpers.

* Added helper for low triangular matrix inversion.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored alternate solve_ls cpu implementation.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Removed alternate approach for solve_ls op. Added multithreading with matrix inversion.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Assert fixed

* Refactored multithreading for inverse matricies.

Signed-off-by: shugeo <sgazeos@gmail.com>

Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-02-28 11:37:26 +03:00
raver119 358c650b62 one micro fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-27 19:28:26 +03:00
raver119 31e3a2f7a5
transparent conversion to FastPath execution within Graph (#278)
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-27 16:10:38 +03:00
Alexander Stoyakin 353f901c7c
[WIP] Handle missing functionality for binary models (#269)
* Handle missing functionality for binary models

* Exception text fixed
2020-02-27 12:00:15 +11:00
Oleh b4575d11e9
Loops auto-vectorization problem fix (#274)
* libnd4j cast loop types

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more type castination added to loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j sync casting types of iterated variable in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j more loops reviewed for vectorization problem fix

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed several typos

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j several more files reviewed to fix auto-vectorization problem in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j merge master and reviewed more files to fix auto-vectorization problem in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j several type casting added in broadcasting that were missed, fixed mac builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j double check all files and fix several more places in loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed builds

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j revert changes for lup.cpp

Signed-off-by: Oleg <oleg.semeniv@gmail.com>
2020-02-26 21:12:19 +03:00
raver119 5c806d2fb5
reshape tweak (#275)
* - expand dims tweak
- reshape memcpy

Signed-off-by: raver119 <raver119@gmail.com>

* validation fix

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-26 14:05:32 +03:00
Oleh b686368b82
Refactoring split operation (#266)
* libnd4j moved split operation implementation to helpers before special case adding

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j minor fixes for general split operation move, merge master

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libndj4 split cpu implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* - provide cuda helper for split op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor correction

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor correction 2

Signed-off-by: Yurii <iuriish@yahoo.com>

* libnd4j moved split implementation from specials to split.cpp

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j update loopkind selections for 3D, 4D and 5D cases

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j removed unnecessary BUILD_SINGLE_TEMPLATE

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
2020-02-26 10:20:39 +03:00
raver119 cf67c7165a nano fix
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-25 15:20:51 +03:00
raver119 f6442b6724
few minor tweaks (#272)
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-25 11:13:23 +03:00
raver119 241ed05c64
VariableSpace uses unordered maps as well (#270)
Signed-off-by: raver119 <raver119@gmail.com>
2020-02-24 21:58:23 +03:00
Oleh f0706b21aa
Split operation improvement (#262)
* libnd4j moved split operation implementation to helpers before special case adding

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j minor fixes for general split operation move, merge master

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libndj4 split cpu implementation

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* - provide cuda helper for split op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor correction

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor correction 2

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
2020-02-24 08:22:41 +03:00
shugeo 1bb3ae4b03
Shugeo unordered map (#256)
* Refactored usage of std::map to std::unordered_map instead.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Eliminated crash with wrong ShapeDescriptor hash.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Eliminated crash with TadDescriptor hash.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored Stash hash.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored hashes.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored TadDescriptor hash and top_k mapping.

* Refactored hashes for ShapeDescriptor and TadDescriptor classes.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Refactored hash for ConstantDescriptor and ShapeDescriptor classes.

Signed-off-by: shugeo <sgazeos@gmail.com>

* Fixed map using with cuda platform.

Signed-off-by: shugeo <sgazeos@gmail.com>

* - few rearrangements for hash functions
- shared openblas allowed

Signed-off-by: raver119 <raver119@gmail.com>

* exports

Signed-off-by: raver119 <raver119@gmail.com>

* exports

Signed-off-by: raver119 <raver119@gmail.com>

* Stash reverted to std::map

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Added additional test.

Signed-off-by: shugeo <sgazeos@gmail.com>

* different maps for different compilers

Signed-off-by: raver119 <raver119@gmail.com>

* missing include

Signed-off-by: raver119 <raver119@gmail.com>

* fix the leak

Signed-off-by: raver119 <raver119@gmail.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-02-24 07:51:01 +03:00
Alex Black a0ed5487ca
Upgrade openblas version to 0.3.8 (#264)
Signed-off-by: Alex Black <blacka101@gmail.com>
2020-02-22 23:42:52 +11:00
raver119 e78be14cc1
arm fix (#260)
* range check for scalar_int

Signed-off-by: raver119 <raver119@gmail.com>

* no simd

Signed-off-by: raver119 <raver119@gmail.com>

* no ops

Signed-off-by: raver119 <raver119@gmail.com>

* cyclic shift?

Signed-off-by: raver119 <raver119@gmail.com>

* left split

Signed-off-by: raver119 <raver119@gmail.com>

* left split

Signed-off-by: raver119 <raver119@gmail.com>

* rot ops unrolled templates

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64 2

Signed-off-by: raver119 <raver119@gmail.com>

* no rotl/rotr for uint64 3

Signed-off-by: raver119 <raver119@gmail.com>

* ARM_BUILD declared

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-21 14:31:00 +03:00
Alex Black d19dbb955c Merge remote-tracking branch 'eclipse/master' 2020-02-21 19:57:07 +11:00
Oleh 0748c7e7c2
Oleh broadcast4d (#257)
* libnd4j raw implementation of native broadcast for special cases

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j fixed bugs for special case of 4D loop broadcast, add some tests, need more testing and discussion

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j added 3D and 5D cases support and tests, need testing with different orders

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j correctd case selection for broadcast 3,4,5D loops, fixed several places for more stable behavior, clean up

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j minor corrections to avoid some risks in strides selection, added tests and rename some variables

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j optimize usage the stride selection for all loops in separate ShapeUtils method copyCertainStridesFromShapeInfo, merge master

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j remove per request several tests for 3D, 4D and 5D broadcast loops

Signed-off-by: Oleg <oleg.semeniv@gmail.com>

* libnd4j removed some loac changes that had not been sync with serve playground, turn on new loops usage
2020-02-21 07:46:05 +03:00
Yurii Shyrma f7a9190407
profiling of concat op (both cuda and cpu) (#151)
* - profiling of concat op (both cuda and cpu)

Signed-off-by: Yurii <iuriish@yahoo.com>

* better comparison for large concat

Signed-off-by: raver119 <raver119@gmail.com>

* - further improving of concat op

Signed-off-by: Yurii <iuriish@yahoo.com>

* some loggin

Signed-off-by: raver119 <raver119@gmail.com>

* - add possibility to verify presence of trailing unities in shape and set strides/ews correspondingly
- restrict second simple case in concat op to c order only

Signed-off-by: Yurii <iuriish@yahoo.com>

* - move concat op to specials_single.cpp file

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of second concat op declaration in transforms.cpp file

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>
2020-02-20 21:19:01 +03:00
raver119 215641ea9e
Minor improvements (#255)
* static increments in loops

Signed-off-by: raver119 <raver119@gmail.com>

* specials and concat split into separate units

Signed-off-by: raver119 <raver119@gmail.com>
2020-02-20 11:43:26 +03:00
Serhii Shepel d9058b469a
Add classifier property for dl4j-test-resources (#249) 2020-02-19 15:31:21 +02:00