RL4J: Reinforcement Learning for Java
RL4J is a reinforcement learning framework integrated with deeplearning4j and released under an Apache 2.0 open-source license. By contributing code to this repository, you agree to make your contribution available under an Apache 2.0 license.
- DQN (Deep Q Learning with double DQN)
- Async RL (A3C, Async NStepQlearning)
Both support low-dimensional (array of info) and high-dimensional (pixel) input.
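To make the "double DQN" item concrete, here is a minimal plain-Java sketch of how a double DQN learning target is usually computed: the online network picks the next action and the target network scores it. This is a textbook illustration, not RL4J's implementation; the class name, method names and parameters are hypothetical and no RL4J or deeplearning4j API is used.

```java
// Illustrative sketch only: names are hypothetical and not part of RL4J.
final class DoubleDqnTarget {

    /** Index of the largest value (first one wins on ties). */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Double DQN target for one transition.
     * onlineNextQ: Q-values of the next state from the online network (selects the action).
     * targetNextQ: Q-values of the next state from the target network (evaluates it).
     */
    static double target(double reward, double gamma, boolean terminal,
                         double[] onlineNextQ, double[] targetNextQ) {
        if (terminal) {
            return reward; // no bootstrapping past the end of an episode
        }
        int bestAction = argMax(onlineNextQ);
        return reward + gamma * targetNextQ[bestAction];
    }

    public static void main(String[] args) {
        double[] onlineQ = {1.0, 3.0, 2.0}; // online network prefers action 1
        double[] targetQ = {0.5, 0.25, 4.0};
        // Action 1 is evaluated with the target network: 1.0 + 0.5 * 0.25
        System.out.println(target(1.0, 0.5, false, onlineQ, targetQ));
    }
}
```

Plain DQN would instead take the maximum of `targetNextQ` directly, which tends to overestimate values; decoupling selection from evaluation is the point of the double variant.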
Here is a useful blog post I wrote to introduce you to reinforcement learning, DQN and Async RL:
Disclaimer
This is a tech preview and is distributed as is. Comments are welcome on our Gitter channel.
Quickstart
- mvn install
Visualisation
Quick try with cartpole:
- run with this main
Doom
Doom is not ready yet, but if you feel adventurous you can make it work with some additional steps:
- You will need ViZDoom: compile the native library and move it into a folder in the root of your project
- export MAVEN_OPTS=-Djava.library.path=THEFOLDEROFTHELIB
- mvn compile exec:java -Dexec.mainClass="YOURMAINCLASS"
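If the native library fails to load, it can help to check what the JVM actually sees in `java.library.path`. The sketch below is a small standalone helper, not part of RL4J; the class name and the default library name `vizdoom` are assumptions, so pass the real base name of the library you compiled as the first argument.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Illustrative helper only: not part of RL4J or ViZDoom.
final class NativeLibCheck {

    /** Looks for the platform-specific file of libName in each entry of a java.library.path-style string. */
    static Optional<Path> find(String libraryPath, String libName) {
        // e.g. "foo" becomes "libfoo.so" on Linux, "foo.dll" on Windows
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "vizdoom" is an assumed base name; override it with the first argument.
        String libName = args.length > 0 ? args[0] : "vizdoom";
        String libraryPath = System.getProperty("java.library.path", "");
        Optional<Path> found = find(libraryPath, libName);
        if (found.isPresent()) {
            System.out.println("Found native library: " + found.get());
        } else {
            System.out.println("No " + System.mapLibraryName(libName)
                    + " found in java.library.path=" + libraryPath);
        }
    }
}
```

Run it with the same `MAVEN_OPTS` as above so that it reports the path your real main class will use.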
Malmo (Minecraft)
- Download and unzip Malmo from here
- export MALMO_HOME=YOURMALMO_FOLDER
- export MALMO_XSD_PATH=$MALMO_HOME/Schemas
- launch Malmo per its instructions
- run with this main
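Before launching, you may want to confirm that both environment variables are visible to the JVM and point to real directories. The following is a minimal sketch of such a check; the class and method names are hypothetical and it does not call any Malmo or RL4J API.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative helper only: not part of RL4J or Malmo.
final class MalmoEnvCheck {

    /** Returns one message per variable that is unset or does not point to a directory. */
    static List<String> problems(Map<String, String> env) {
        List<String> found = new ArrayList<>();
        for (String name : new String[] {"MALMO_HOME", "MALMO_XSD_PATH"}) {
            String value = env.get(name);
            if (value == null || value.isEmpty()) {
                found.add(name + " is not set");
            } else if (!Files.isDirectory(Paths.get(value))) {
                found.add(name + " does not point to a directory: " + value);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> found = problems(System.getenv());
        if (found.isEmpty()) {
            System.out.println("Malmo environment variables look fine.");
        } else {
            for (String problem : found) {
                System.out.println(problem);
            }
        }
    }
}
```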
WIP
- Documentation
- Serialization/Deserialization (load save)
- Compression of pixels in order to store 1M states in a reasonable amount of memory
- Async learning: A3C and nstep learning (requires some missing features from dl4j (calc and apply gradients)).
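To illustrate why the pixel-compression item matters, here is a rough sketch of one common approach: keeping each grayscale pixel as a single byte in the replay memory and expanding it to a floating-point value only when a batch is built. This is not how RL4J stores states; the class is hypothetical, and the 84x84 frame size is an assumption borrowed from typical DQN preprocessing.

```java
// Illustrative sketch only: not RL4J's storage format.
final class FrameCodec {

    /** Quantizes pixels in [0, 1] to one unsigned byte each (values outside the range are clamped). */
    static byte[] compress(double[] pixels) {
        byte[] out = new byte[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            double clamped = Math.max(0.0, Math.min(1.0, pixels[i]));
            out[i] = (byte) Math.round(clamped * 255.0);
        }
        return out;
    }

    /** Expands stored bytes back to doubles in [0, 1]. */
    static double[] decompress(byte[] stored) {
        double[] out = new double[stored.length];
        for (int i = 0; i < stored.length; i++) {
            out[i] = (stored[i] & 0xFF) / 255.0; // mask to read the byte as unsigned
        }
        return out;
    }

    /** Raw storage needed for a number of frames, ignoring object overhead. */
    static long bytesFor(long frames, int width, int height, int bytesPerPixel) {
        return frames * width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Assumed frame size: 84x84 grayscale.
        System.out.println("1M frames as bytes:   " + bytesFor(1_000_000L, 84, 84, 1));
        System.out.println("1M frames as doubles: " + bytesFor(1_000_000L, 84, 84, 8));
    }
}
```

Under these assumptions, one million frames take about 7 GB as bytes versus about 56 GB as doubles, at the cost of a quantization error of at most 1/510 per pixel.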
Author
Proposed contribution areas:
- Continuous control
- Policy Gradient
- Update rl4j-gym to make it compatible with pixel environments to play with Pong, Doom, etc.