LibND4J
Native operations for nd4j. Build using CMake.
Prerequisites
- GCC 4.9+
- CUDA 8.0 or 9.0 (if desired)
- CMake 3.8 (as of Nov 2017; 3.9 will be required in the near future)
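A quick way to check the prerequisites above is to compare the installed versions against the listed minimums. The sketch below is only an illustration: version_ge is a hypothetical helper, not part of this repository, and it assumes a sort implementation that supports -V (as GNU coreutils does).

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes a sort implementation that supports -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Check CMake against the 3.8 minimum, but only if cmake is on the PATH.
if command -v cmake >/dev/null 2>&1; then
  # The first line of the output looks like "cmake version X.Y.Z".
  cmake_version=$(cmake --version | awk 'NR==1 {print $3}')
  if version_ge "$cmake_version" 3.8; then
    echo "CMake $cmake_version satisfies the 3.8 minimum"
  else
    echo "CMake $cmake_version is older than 3.8"
  fi
fi
```

The same helper could be applied to the output of gcc -dumpversion, keeping in mind that on some systems gcc is an alias for another compiler.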
Additional build arguments
There are a few additional arguments for the buildnativeoperations.sh script that you can use:
-a XXXXXXXX // shortcut for -march/-mtune, e.g. -a native
-b release OR -b debug // enables/disables debug builds; release is the default
-j XX // defines how many threads will be used to build binaries on your box, e.g. -j 8
-cc XX // CUDA-only argument, builds binaries only for the target GPU architecture; use this for fast builds
--check-vectorization // auto-vectorization report for developers (currently, only GCC is supported)
More about AutoVectorization report
You can find the compute capability for your card on the NVIDIA website here.
For example, a GTX 1080 has compute capability 6.1, for which you would use -cc 61 (note no decimal point).
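The conversion from a compute capability to the -cc value can be sketched as a one-line helper. The function name cc_value is hypothetical and not part of the build script; it only drops the decimal point as described above.

```shell
# Hypothetical helper: turn a compute capability such as "6.1" into the
# value expected by -cc by removing the decimal point.
cc_value() {
  printf '%s\n' "$1" | tr -d '.'
}

# Print (without running) the CUDA build command for a GTX 1080.
echo "./buildnativeoperations.sh -c cuda -cc $(cc_value 6.1)"
# prints: ./buildnativeoperations.sh -c cuda -cc 61
```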
OS Specific Requirements
Android
Download the NDK, extract it somewhere, and execute the following commands, replacing android-xxx with either android-arm or android-x86:
git clone https://github.com/deeplearning4j/libnd4j
git clone https://github.com/deeplearning4j/nd4j
export ANDROID_NDK=/path/to/android-ndk/
cd libnd4j
bash buildnativeoperations.sh -platform android-xxx
cd ../nd4j
mvn clean install -Djavacpp.platform=android-xxx -DskipTests -pl '!:nd4j-cuda-9.0,!:nd4j-cuda-9.0-platform,!:nd4j-tests'
OSX
Run ./setuposx.sh (Please ensure you have brew installed)
Linux
Requirements depend on the distro; ask in the earlyadopters channel for distro-specific details.
Ubuntu Linux 15.10
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo apt-get install cmake
sudo apt-get install gcc-4.9
sudo apt-get install g++-4.9
sudo apt-get install git
git clone https://github.com/deeplearning4j/libnd4j
cd libnd4j/
export LIBND4J_HOME=~/libnd4j/
sudo rm /usr/bin/gcc
sudo rm /usr/bin/g++
sudo ln -s /usr/bin/gcc-4.9 /usr/bin/gcc
sudo ln -s /usr/bin/g++-4.9 /usr/bin/g++
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
Ubuntu Linux 16.04
sudo apt install cmake
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit nvidia-361
export TRICK_NVCC=YES
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
The standard development headers are needed.
CentOS 6
yum install centos-release-scl-rh epel-release
yum install devtoolset-3-toolchain maven30 cmake3 git
scl enable devtoolset-3 maven30 bash
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
Windows
See windows.md
Setup for All OS
- Set LIBND4J_HOME as an environment variable pointing to the libnd4j folder you obtained from Git.
  - Note: this is required for building nd4j as well.
- Set up the CPU build followed by the GPU build by running the following on the command line:
  - For standard builds:
    ./buildnativeoperations.sh
    ./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
  - For debug builds:
    ./buildnativeoperations.sh blas -b debug
    ./buildnativeoperations.sh blas -c cuda -cc YOUR_DEVICE_ARCH -b debug
  - For release builds (default):
    ./buildnativeoperations.sh
    ./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
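Before building, it can help to confirm that LIBND4J_HOME really points at the checkout. The following is a minimal sketch of such a pre-flight check; check_libnd4j_home is a hypothetical function, not something shipped with the repository, and it only looks for buildnativeoperations.sh in the configured directory.

```shell
# Hypothetical pre-flight check (not part of the repository): verify that
# LIBND4J_HOME is set and points at a directory containing the build script.
check_libnd4j_home() {
  if [ -z "${LIBND4J_HOME:-}" ]; then
    echo "LIBND4J_HOME is not set" >&2
    return 1
  fi
  # Strip a trailing slash so that ~/libnd4j and ~/libnd4j/ both work.
  if [ ! -f "${LIBND4J_HOME%/}/buildnativeoperations.sh" ]; then
    echo "buildnativeoperations.sh not found in $LIBND4J_HOME" >&2
    return 1
  fi
  echo "LIBND4J_HOME looks valid: $LIBND4J_HOME"
}

# Example usage:
#   export LIBND4J_HOME=~/libnd4j/
#   check_libnd4j_home && ./buildnativeoperations.sh
```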
OpenMP support
OpenMP 4.0+ should be used to compile libnd4j. This shouldn't cause any trouble, since OpenMP 4.0 was released in 2013 and should be available on all major platforms.
Linking with MKL
We can link with MKL either at build time, or at runtime with binaries initially linked with another BLAS implementation such as OpenBLAS. In either case, simply add the path containing libmkl_rt.so (or mkl_rt.dll on Windows), say /path/to/intel64/lib/, to the LD_LIBRARY_PATH environment variable on Linux (or PATH on Windows), and build or run your Java application as usual. If you get an error message like undefined symbol: omp_get_num_procs, it probably means that libiomp5.so, libiomp5.dylib, or libiomp5md.dll is not present on your system. In that case, it is still possible to use the GNU version of OpenMP by setting these environment variables on Linux, for example:
export MKL_THREADING_LAYER=GNU
export LD_PRELOAD=/usr/lib64/libgomp.so.1
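The fallback described above can be made conditional on whether an Intel OpenMP runtime is actually missing. This is only a sketch under stated assumptions: has_iomp5 is a hypothetical helper, and the directories searched are guesses that should be adjusted for your installation.

```shell
# Hypothetical helper: succeeds if an Intel OpenMP runtime (libiomp5) exists
# in any of the directories passed as arguments.
has_iomp5() {
  for dir in "$@"; do
    for name in libiomp5.so libiomp5.dylib libiomp5md.dll; do
      if [ -e "$dir/$name" ]; then
        return 0
      fi
    done
  done
  return 1
}

# Assumed search locations; adjust them for your system.
# Fall back to GNU OpenMP only when libiomp5 is missing and libgomp exists.
if ! has_iomp5 /opt/intel/lib/intel64 /usr/lib64 /usr/lib \
    && [ -e /usr/lib64/libgomp.so.1 ]; then
  export MKL_THREADING_LAYER=GNU
  export LD_PRELOAD=/usr/lib64/libgomp.so.1
fi
```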
Troubleshooting MKL
Sometimes the above steps are not enough. An additional step that may be needed is:
export LD_LIBRARY_PATH=/opt/intel/lib/intel64/:/opt/intel/mkl/lib/intel64
This ensures that MKL will be found first and linked to.
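Note that the export shown above replaces any existing LD_LIBRARY_PATH value. If other entries need to be preserved, the MKL directories can be prepended instead. A minimal sketch, assuming the same Intel install paths as above; prepend_lib_path is a hypothetical helper:

```shell
# Hypothetical helper: print $1 followed by the current value $2, joined by a
# colon, without leaving a trailing colon when $2 is empty.
prepend_lib_path() {
  if [ -n "$2" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Put the MKL directories first while keeping whatever was already set.
LD_LIBRARY_PATH=$(prepend_lib_path \
  "/opt/intel/lib/intel64/:/opt/intel/mkl/lib/intel64" \
  "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```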
Packaging
If on Ubuntu (14.04 or above) or CentOS (6 or above), this repository is also set up to create packages for your distribution. Let's assume you have built:
- for the CPU, with the command line ./buildnativeoperations.sh ...:
cd blasbuild/cpu
make package
- for the GPU, with the command line ./buildnativeoperations.sh -c cuda ...:
cd blasbuild/cuda
make package
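The two packaging cases above differ only in the build directory, so they can be wrapped in one small function. This is a sketch rather than part of the repository: build_package is a hypothetical name, and it assumes it is run from the libnd4j root after the corresponding build has finished.

```shell
# Hypothetical wrapper: run "make package" in the build directory for the
# given backend ("cpu" or "cuda"). Run it from the libnd4j root directory.
build_package() {
  case "$1" in
    cpu|cuda)
      # Use a subshell so the caller's working directory is unchanged.
      ( cd "blasbuild/$1" && make package )
      ;;
    *)
      echo "unknown backend: $1 (expected cpu or cuda)" >&2
      return 2
      ;;
  esac
}

# Example usage:
#   build_package cpu
#   build_package cuda
```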
Uploading package to Bintray
The package upload script is in packages. The upload command for an rpm built for CPU is:
./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cpu/libnd4j-0.8.0.fc7.3.1611.x86_64.rpm https://github.com/deeplearning4j
The upload command for a deb package built for cuda is:
./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cuda/libnd4j-0.8.0.fc7.3.1611.x86_64.deb https://github.com/deeplearning4j
Running tests
Tests are written with gtest and run using cmake. Tests are currently under tests_cpu/.
There are 2 directories for running tests:
1. libnd4j_tests: These are older legacy ops tests.
2. layers_tests: This covers the newer graph operations and ops associated with samediff.
We currently use cmake or CLion to run the tests.
Running tests with the CUDA backend is a very similar process:
1. ./buildnativeoperations.sh -c cuda -cc <YOUR_ARCH> -b debug -t -j <NUMBER_OF_CORES>
2. ./blasbuild/cuda/tests_cpu/layers_tests/runtests (.exe on Windows)
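Because the tests are gtest binaries, the standard Google Test command-line flags can be used to list tests or run a subset. The sketch below assumes the runtests location given above; runtests_path is a hypothetical helper, and the suite name SomeSuite is a placeholder, not a real suite in this repository.

```shell
# Hypothetical helper: print the runtests path for a backend directory under
# blasbuild. "cuda" matches the path shown above; other values are assumptions.
runtests_path() {
  printf './blasbuild/%s/tests_cpu/layers_tests/runtests\n' "$1"
}

bin=$(runtests_path cuda)
if [ -x "$bin" ]; then
  # List every available test without running anything.
  "$bin" --gtest_list_tests
  # Run only the tests of one suite (placeholder name).
  "$bin" --gtest_filter='SomeSuite.*'
else
  echo "test binary not found at $bin; build with -t first" >&2
fi
```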