RL4J: Reinforcement Learning for Java
RL4J is a reinforcement learning framework integrated with deeplearning4j and released under an Apache 2.0 open-source license. By contributing code to this repository, you agree to make your contribution available under an Apache 2.0 license.
- DQN (Deep Q Learning with double DQN)
- Async RL (A3C, Async NStepQlearning)
Both support low-dimensional (array of values) and high-dimensional (pixels) input.
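To illustrate what "double DQN" refers to in the list above, here is a minimal sketch of the double DQN learning target in plain Java. It does not use the RL4J API: the class `DoubleDqnTarget`, its method names and the sample Q-values are assumptions made up for this example. The idea is that the online network picks the next action and the target network scores it.

```java
public final class DoubleDqnTarget {
    private DoubleDqnTarget() {}

    /** Index of the largest value (first one wins on ties). */
    public static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Double DQN target for one transition.
     * onlineNextQ: Q-values of the next state from the online network (used to select the action).
     * targetNextQ: Q-values of the next state from the target network (used to evaluate it).
     */
    public static double target(double reward, double gamma, boolean terminal,
                                double[] onlineNextQ, double[] targetNextQ) {
        if (terminal) {
            return reward; // no bootstrapping past the end of an episode
        }
        int selected = argMax(onlineNextQ);
        return reward + gamma * targetNextQ[selected];
    }

    public static void main(String[] args) {
        // Hypothetical Q-values for a 3-action problem.
        double[] online = {1.0, 3.0, 2.0};
        double[] target = {0.5, 0.25, 4.0};
        // Online net selects action 1, target net evaluates it: 1.0 + 0.5 * 0.25
        System.out.println(target(1.0, 0.5, false, online, target)); // prints 1.125
    }
}
```

With the same numbers, plain DQN would take the maximum of the target network's values (4.0) and produce 3.0, which shows how decoupling selection from evaluation can damp over-estimated Q-values.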
Here is a useful blog post I wrote to introduce you to reinforcement learning, DQN and Async RL:
Disclaimer
This is a tech preview, distributed as is. Comments are welcome on our Gitter channel.
Quickstart
- mvn install
Visualisation
Quicktry cartpole:
- run with this main
Doom
Doom is not ready yet, but if you feel adventurous you can make it work with some additional steps:
- You will need vizdoom: compile the native lib and move it into a folder at the root of your project
- export MAVEN_OPTS=-Djava.library.path=THEFOLDEROFTHELIB
- mvn compile exec:java -Dexec.mainClass="YOURMAINCLASS"
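If the native library is not picked up, it can help to check what the JVM actually sees. The following is a small sketch, not part of RL4J: the class `LibraryPathCheck` and its method are hypothetical names for this example. It only tests whether a folder appears in `java.library.path`, the standard system property set by the `-Djava.library.path` option above.

```java
import java.io.File;
import java.util.Arrays;

public final class LibraryPathCheck {
    private LibraryPathCheck() {}

    /** True if dir is one of the entries of libraryPath (compared as absolute paths). */
    public static boolean containsDir(String libraryPath, String dir) {
        if (libraryPath == null || dir == null) {
            return false;
        }
        String wanted = new File(dir).getAbsolutePath();
        return Arrays.stream(libraryPath.split(File.pathSeparator))
                .filter(entry -> !entry.isEmpty())
                .map(entry -> new File(entry).getAbsolutePath())
                .anyMatch(wanted::equals);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: LibraryPathCheck <folder-of-the-native-lib>");
            return;
        }
        String libraryPath = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + libraryPath);
        System.out.println("contains " + args[0] + ": " + containsDir(libraryPath, args[0]));
    }
}
```

Because `exec:java` runs the main class inside the Maven JVM, running this class the same way as your own main should report the folder you exported in `MAVEN_OPTS`. The comparison is a plain string match on absolute paths, so symbolic links or trailing `.` segments are not resolved.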
Malmo (Minecraft)
- Download and unzip Malmo from here
- export MALMO_HOME=YOURMALMO_FOLDER
- export MALMO_XSD_PATH=$MALMO_HOME/Schemas
- launch malmo per instructions
- run with this main
WIP
- Documentation
- Serialization/Deserialization (load save)
- Compression of pixels in order to store 1M states in a reasonable amount of memory
- Async learning: A3C and n-step learning (requires some features missing from dl4j: calculating and applying gradients).
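To give a sense of why pixel compression matters for a replay memory of 1M states, here is a rough sketch of one simple approach: storing each pixel as a single byte instead of a 32-bit float. This is not the RL4J implementation. The class `FrameCompression`, the assumption that pixel intensities are normalised to [0, 1], and the 84x84 frame size are all choices made for this example.

```java
public final class FrameCompression {
    private FrameCompression() {}

    /** Quantise intensities in [0, 1] to one unsigned byte per pixel (values outside are clamped). */
    public static byte[] compress(float[] frame) {
        byte[] packed = new byte[frame.length];
        for (int i = 0; i < frame.length; i++) {
            float clamped = Math.max(0f, Math.min(1f, frame[i]));
            packed[i] = (byte) Math.round(clamped * 255f);
        }
        return packed;
    }

    /** Restore approximate intensities in [0, 1]; the byte is read as unsigned. */
    public static float[] decompress(byte[] packed) {
        float[] frame = new float[packed.length];
        for (int i = 0; i < packed.length; i++) {
            frame[i] = (packed[i] & 0xFF) / 255f;
        }
        return frame;
    }

    /** Raw memory needed to hold the given number of single-channel frames. */
    public static long bytesNeeded(long states, int width, int height, int bytesPerPixel) {
        return states * width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Hypothetical 84x84 grayscale frames, 1M of them.
        System.out.println(bytesNeeded(1_000_000L, 84, 84, 4)); // floats: 28224000000 bytes
        System.out.println(bytesNeeded(1_000_000L, 84, 84, 1)); // bytes:   7056000000 bytes
    }
}
```

Under these assumptions the byte representation needs a quarter of the memory, at the cost of quantising each pixel to 256 levels. Further savings, such as sharing frames between consecutive stacked states, are outside this sketch.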
Author
Proposed contribution areas:
- Continuous control
- Policy Gradient
- Update rl4j-gym to make it compatible with pixel environments, to play with Pong, Doom, etc.