Commit c969b724bb by raver119: [WIP] more CUDA stuff (#57)
* initial commit

Signed-off-by: raver119 <raver119@gmail.com>

* Added gradcheck test for dynamic_partition_bp op.

* - implementation of dilation op (cpu and cuda)

Signed-off-by: Yurii <yurii@skymind.io>

* Fixed broadcast_dynamic_shape 1D case and tests.

* Fixed usage of default integer arguments.

* Fixed dynamic_partition_bp op and tests.

* Eliminated test with grad check for dynamic_partition_bp op.

* start working on cuda svd - porting the corresponding available API from the cuSOLVER library

Signed-off-by: Yurii <yurii@skymind.io>

* provide prelu_bp

Signed-off-by: Yurii <yurii@skymind.io>

* - provide gruCell_bp (old version ??)

Signed-off-by: Yurii <yurii@skymind.io>

* - polishing cumsum_bp and cumprod_bp tests

Signed-off-by: Yurii <yurii@skymind.io>

* provide sparseSoftmaxCrossEntropyWithLogits and sparseSoftmaxCrossEntropyWithLogits_grad

Signed-off-by: Yurii <yurii@skymind.io>
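
For context, the gradient of sparse softmax cross entropy with respect to the logits has the well-known closed form softmax(logits) minus a one-hot encoding of the label. A minimal CUDA sketch of that identity (the kernel name and memory layout are illustrative, not libnd4j's actual code):

```cuda
// One thread per row: grad = softmax(logits) - one_hot(label),
// with max subtraction for numerical stability.
__global__ void sparseSoftmaxXentGrad(const float* logits, const int* labels,
                                      float* grad, int batch, int classes) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= batch) return;

    const float* x = logits + row * classes;
    float maxVal = x[0];
    for (int j = 1; j < classes; j++) maxVal = fmaxf(maxVal, x[j]);

    float sum = 0.f;
    for (int j = 0; j < classes; j++) sum += expf(x[j] - maxVal);

    for (int j = 0; j < classes; j++) {
        float p = expf(x[j] - maxVal) / sum;
        grad[row * classes + j] = p - (j == labels[row] ? 1.f : 0.f);
    }
}
```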

* Fixed atomicMul with float input/output
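
CUDA provides no built-in atomicMul, so a float version is conventionally built from atomicCAS on the value's bit pattern. A minimal sketch of that standard pattern (the helper name is illustrative, not necessarily the one used in libnd4j):

```cuda
// Atomically multiplies *address by val via compare-and-swap on the
// integer reinterpretation of the float; returns the previous value.
__device__ float atomicMulFloat(float* address, float val) {
    int* addressAsInt = reinterpret_cast<int*>(address);
    int old = *addressAsInt, assumed;
    do {
        assumed = old;
        old = atomicCAS(addressAsInt, assumed,
                        __float_as_int(val * __int_as_float(assumed)));
    } while (assumed != old);  // retry if another thread intervened
    return __int_as_float(old);
}
```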

* implementation of cuda kernel for triu_bp operation

Signed-off-by: Yurii <yurii@skymind.io>

* Refactored lup helper to add parallel computation.

* cusolver libraries

Signed-off-by: raver119 <raver119@gmail.com>

* uncomment cuSolver APIs in svd.cu

Signed-off-by: Yurii <yurii@skymind.io>

* cusolver var

Signed-off-by: raver119 <raver119@gmail.com>

* - further work on cuSolver svd

Signed-off-by: Yurii <yurii@skymind.io>

* Implement usage of cuda solver for LUP decomposition.

* - correct names in lup functions

Signed-off-by: Yurii <yurii@skymind.io>

* correct svdQR cuda

Signed-off-by: Yurii <yurii@skymind.io>

* - provide transposition of input matrices in case of c-order in svdCudaQR

Signed-off-by: Yurii <yurii@skymind.io>
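
For reference, cuSOLVER's QR-based SVD routine cusolverDn&lt;t&gt;gesvd only supports m >= n and expects column-major (f-order) input, which is why c-order matrices need to be transposed first. A minimal sketch of the call sequence (simplified, error checking omitted; not libnd4j's actual helper):

```cuda
#include <cuda_runtime.h>
#include <cusolverDn.h>

// Sketch of a QR-based SVD of an m x n column-major device matrix, m >= n.
// dA is overwritten; dS receives min(m,n) singular values, dU is m x m, dVT is n x n.
void svdQR(float* dA, int m, int n, float* dS, float* dU, float* dVT) {
    cusolverDnHandle_t handle;
    cusolverDnCreate(&handle);

    int lwork = 0;
    cusolverDnSgesvd_bufferSize(handle, m, n, &lwork);

    float* dWork = nullptr;
    int* devInfo = nullptr;
    cudaMalloc((void**)&dWork, sizeof(float) * lwork);
    cudaMalloc((void**)&devInfo, sizeof(int));

    // 'A' requests all m columns of U and all n rows of V^T
    cusolverDnSgesvd(handle, 'A', 'A', m, n, dA, m,
                     dS, dU, m, dVT, n, dWork, lwork,
                     /*rwork=*/nullptr, devInfo);

    cudaFree(dWork);
    cudaFree(devInfo);
    cusolverDnDestroy(handle);
}
```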

* Fixed implementation issues with LUP using cuda solver.

* Implementation of matrix_determinant helper with cuda kernels. Working revision.

* Implemented log_matrix_determinant helper with cuda kernels.

* - implementation of batched cuda svd

Signed-off-by: Yurii <yurii@skymind.io>

* Refactored cholesky helper and implementation of cuda solver cholesky batch.

* - implementation of cuda kernel for tile bp

Signed-off-by: Yurii <yurii@skymind.io>

* Implementation of cholesky and logdet with cuda kernels.
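
For context, once the Cholesky factor L of an SPD matrix A = L * L^T is available, logdet(A) = 2 * sum_i log(L[i][i]), so the log determinant reduces to a sum over the diagonal of the factor. An illustrative kernel (names and layout are hypothetical, not libnd4j's code):

```cuda
// Accumulates 2 * sum(log(diag(L))) into *logDet for an n x n row-major
// factor L; *logDet must be zero-initialized before launch.
__global__ void logDetFromCholesky(const float* L, int n, float* logDet) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(logDet, 2.0f * logf(L[i * n + i]));
}
```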

* - implementation of cuda kernel for sru_bidirectional

Signed-off-by: Yurii <yurii@skymind.io>

* Fixed cholesky helper.

* Cholesky op helper implementation. Working double-based cublas implementation.

* bad import excluded

Signed-off-by: raver119 <raver119@gmail.com>

* Finished with cuda implementation of cholesky helper and tests.

* - implementation of cuda kernel for sru_bidirectional_backprop operation

Signed-off-by: Yurii <yurii@skymind.io>

* Implementation of matrix_inverse op helper with cuda kernels. The first revision.

* - start working on gruCell_bp

Signed-off-by: Yurii <yurii@skymind.io>

* Implementation of matrix_inverse helper.

* - further work on new gruCell_bp

Signed-off-by: Yurii <yurii@skymind.io>

* cuBLAS-related fixes

Signed-off-by: raver119 <raver119@gmail.com>

* calculateOutputShapes() now passes device buffers as well

Signed-off-by: raver119 <raver119@gmail.com>

* special concat/average/accumulate now initialize host pointers

Signed-off-by: raver119 <raver119@gmail.com>

* few more tweaks

Signed-off-by: raver119 <raver119@gmail.com>

* additional CudaDataBufferFactory signatures for certain data types

Signed-off-by: raver119 <raver119@gmail.com>

* cuSolver host buffer

Signed-off-by: raver119 <raver119@gmail.com>

* buffer to buffer memcpy host ptr allocation

Signed-off-by: raver119 <raver119@gmail.com>
Committed on 2019-07-20 23:05:21 +10:00
Path                       Latest commit                                                           Date
.github                    Update contributing and issue/PR templates (#7934)                     2019-06-22 16:21:27 +10:00
arbiter                    Merge master to upstream (#7945)                                        2019-06-27 18:37:04 +03:00
datavec                    Merge master to upstream (#7945)                                        2019-06-27 18:37:04 +03:00
deeplearning4j             [WIP] INDArray hashCode() impl (#50)                                    2019-07-20 22:22:11 +10:00
docs                       Quick start ND4J (#7916)                                                2019-07-02 13:43:07 +10:00
gym-java-client            Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
jumpy                      Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
libnd4j                    [WIP] more CUDA stuff (#57)                                             2019-07-20 23:05:21 +10:00
nd4j                       [WIP] more CUDA stuff (#57)                                             2019-07-20 23:05:21 +10:00
nd4s                       ND4S build fixed (#47)                                                  2019-07-20 22:20:36 +10:00
pydatavec                  Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
pydl4j                     Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
rl4j                       RL4J refac: Added some observation transform classes (#7958)           2019-07-20 10:28:20 +10:00
scalnet                    Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
.gitignore                 Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
CONTRIBUTING.md            Update contributing and issue/PR templates (#7934)                     2019-06-22 16:21:27 +10:00
Jenkinsfile                Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
LICENSE                    Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
README.md                  Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
change-cuda-versions.sh    Update dependencies to just released JavaCPP and JavaCV 1.5.1 (#8004)  2019-07-14 21:07:33 +03:00
change-scala-versions.sh   Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
change-spark-versions.sh   Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
perform-release.sh         Eclipse Migration Initial Commit                                        2019-06-06 15:21:15 +03:00
pom.xml                    Update dependencies to just released JavaCPP and JavaCV 1.5.1 (#8004)  2019-07-14 21:07:33 +03:00

README.md

Monorepo of Deeplearning4j

Welcome to the new monorepo of Deeplearning4j. It contains the source code for all of the following projects, in addition to the original Deeplearning4j repository, which has moved into the deeplearning4j directory:

- nd4j
- libnd4j
- datavec
- arbiter
- nd4s
- gym-java-client
- rl4j
- scalnet
- jumpy
- pydatavec
- pydl4j

To build everything, we can use commands like

./change-cuda-versions.sh x.x
./change-scala-versions.sh 2.xx
./change-spark-versions.sh x
mvn clean install -Dmaven.test.skip -Dlibnd4j.cuda=x.x -Dlibnd4j.compute=xx

or

mvn -B -V -U clean install -pl '!jumpy,!pydatavec,!pydl4j' -Dlibnd4j.platform=linux-x86_64 -Dlibnd4j.chip=cuda -Dlibnd4j.cuda=9.2 -Dlibnd4j.compute=<your GPU CC> -Djavacpp.platform=linux-x86_64 -Dmaven.test.skip=true

An example of a GPU "CC", or compute capability, is 61 for a Titan X Pascal.
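
For instance, assuming CUDA 9.2, Scala 2.11, Spark 2, and a Titan X Pascal (compute capability 61), a concrete invocation might look like:

./change-cuda-versions.sh 9.2
./change-scala-versions.sh 2.11
./change-spark-versions.sh 2
mvn clean install -Dmaven.test.skip -Dlibnd4j.cuda=9.2 -Dlibnd4j.compute=61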

Want some examples?

We have a separate repository with various examples available: https://github.com/deeplearning4j/dl4j-examples

In the examples repo, you'll also find a tutorial series in Zeppelin: https://github.com/deeplearning4j/dl4j-examples/tree/master/tutorials