12 Commits

Author SHA1 Message Date
Chris Bamford
74420bca31
RL4J: Sanitize async learner (#327)
* refactoring global async to use a much simpler update procedure with a single global lock

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* simplification of async learning algorithms, stabilization + better hyperparameters

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* started to play with using mockito for tests

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* Working on refactoring tests for async classes and trying to make async simpler

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* more work on mockito tests and making some tests much less complex and more explicit in what they are testing

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* some fixes from merging

* do not allow copying of the current network to worker threads, fixing debug line

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* adding some more tests around PR review

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* Adding more tests after review comments

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* few more tests and fixes from PR review

* remove rename of maxEpochStep to maxStepsPerEpisode as we agreed to review this in a separate PR

* 2019 instead of 2018 on copyright header

* adding konduit copyright to files

* some more copyright headers

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

Co-authored-by: Alexandre Boulanger <aboulang2002@yahoo.com>
2020-04-20 11:21:01 +09:00
Chris Bamford
1a35ebec2e
RL4J: Add Backwardly Compatible Builder patterns (#326)
* Starting to switch configs of RL algorithms to use more fluent builder patterns. Many parameter choices in different algorithms default to SOTA values and only need to be changed in specific cases

Signed-off-by: Bam4d <chris.bam4d@gmail.com>

* remove personal gpu-build file

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* refactored out configurations so they are hierarchical and re-usable; this is a step towards having a plug-and-play framework for different algorithms

* backwardly compatible configurations

* adding documentation to new configuration classes

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* private access modifiers are better suited here

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* RL4J does not compile without Java 8 due to previous updates

fixing null pointers when listener arrays are empty

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* fixing copyright headers

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* uncomment logging line

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

* fixing default value for learningUpdateFrequency

fixing test failure due to #352

Signed-off-by: Bam4d <chrisbam4d@gmail.com>

Co-authored-by: Bam4d <chris.bam4d@gmail.com>
2020-04-06 12:36:12 +09:00
Alexandre Boulanger
8b10f0b876
RL4J: Add TransformProcess, part 2 (#8766)
* Part 2 of TransformProcess

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix compile errors

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Revert unrelated changes

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
2020-03-11 11:56:41 +09:00
Alexandre Boulanger
20e3039f2e
RL4J: Change frame skipping logic (#8596)
* Added isSkipped() to Observation

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Changed refacInitMdp to use isSkipped()

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Changed getHistoryProcessor()

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Fixed getEpochCounter() incorrectly changed to getCurrentEpochStep() calls

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Removed StepCountable

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fix build

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Fixed a problem in QLearningDiscrete and another in CartpoleNative

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Update versions of JavaCPP Presets for NumPy, MKL, Gym, and TensorFlow

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* RL4J: Add ability to set a random seed for GymEnv

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
2020-02-04 12:23:39 +09:00
Samuel Audet
9edbefdc67
RL4J: Replace gym-java-client with JavaCPP (#8595)
* RL4J: Replace gym-java-client with JavaCPP

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>
2020-01-20 17:13:57 +09:00
Alexandre Boulanger
de3975f088
RL4J: Remove processing done on observations in Policy & Async (#8471)
* Removed processing from Policy.play() and fixed missing resets

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Adjusted unit test to check if DQNs have been reset

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Fixed a couple of problems, added and updated unit tests

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Removed processing from AsyncThreadDiscrete

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Fixed a few problems

Signed-off-by: unknown <aboulang2002@yahoo.com>
2019-12-18 16:27:05 +09:00
Alexandre Boulanger
47c58cf69d
RL4J: Add Observation and LegacyMDPWrapper (#8368)
* Added Observable & LegacyMDPWrapper

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Moved observation processing to LegacyMDPWrapper

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Observation using DataSets, changes in Transition and BaseTDTargetAlgorithm

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Added javadoc to Transition new methods

Signed-off-by: unknown <aboulang2002@yahoo.com>
2019-11-26 23:05:11 +09:00
Alexandre Boulanger
d5e98afcef
RL4J: Add VideoRecorder (#8106)
* Added VideoRecorder

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Added missing header

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Changed HistoryProcessor to use VideoRecorder

Signed-off-by: unknown <aboulang2002@yahoo.com>
2019-09-30 13:40:32 +09:00
Alexandre Boulanger
59f1cbf0c6
RL4J - AsyncTrainingListener (#8072)
* Code clarity: Extracted parts of run() into private methods

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Added listener pattern to async learning

Signed-off-by: unknown <aboulang2002@yahoo.com>

* Merged all listeners logic

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Added interface and common data to training events

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fixed missing info log file

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Fixed bad merge; removed useless TrainingEvent

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Removed param from training start/end event

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Removed 'event' classes from the training listener

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Reverted changes to QLearningDiscrete.setTarget()
2019-09-19 11:28:13 +10:00
Alexandre Boulanger
b2145ca780
RL4J Added listener pattern to SyncLearning (#8050)
* Added listener pattern to SyncLearning

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Did requested changes

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
2019-08-02 12:43:45 +10:00
Alexandre Boulanger
87d2b2cd3d
Added interface IDataManager (#8034)
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
2019-07-25 21:34:54 +10:00
skymindops
b5f0ec072f
Eclipse Migration Initial Commit
2019-06-06 15:21:15 +03:00