# Arbiter
A tool dedicated to tuning (hyperparameter optimization) of machine learning models. Part of the DL4J Suite of Machine Learning / Deep Learning tools for the enterprise.
## Modules
Arbiter contains the following modules:
- arbiter-core: Defines the API and core functionality, and also contains functionality for the Arbiter UI
- arbiter-deeplearning4j: For hyperparameter optimization of DL4J models (MultiLayerNetwork and ComputationGraph networks)
## Hyperparameter Optimization Functionality
The open-source version of Arbiter currently defines two methods of hyperparameter optimization:
- Grid search
- Random search
For optimization of complex models such as neural networks (those with more than a few hyperparameters), random search is generally superior to grid search, though Bayesian hyperparameter optimization schemes can outperform both. For a comparison of random and grid search methods, see [Random Search for Hyper-parameter Optimization (Bergstra and Bengio, 2012)](https://jmlr.org/papers/v13/bergstra12a.html).
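To make that comparison concrete, here is a toy sketch in plain Java (this is not Arbiter's API; the two hyperparameters and their ranges are invented for illustration). With the same budget of nine candidates, grid search visits only three distinct values of each hyperparameter, while random search visits nine:

```java
import java.util.Random;

public class GridVsRandomSearch {
    public static void main(String[] args) {
        // Two hypothetical hyperparameters: log10(learning rate) in [-4, -1]
        // and layer size in [16, 256], with a budget of 9 candidates.

        // Grid search: a 3x3 grid spends the whole budget on just 3 distinct
        // values per hyperparameter.
        for (double logLr = -4.0; logLr <= -1.0; logLr += 1.5) {
            for (int size = 16; size <= 256; size += 120) {
                System.out.printf("grid:   lr=%.5f, layerSize=%d%n",
                        Math.pow(10, logLr), size);
            }
        }

        // Random search: the same budget yields 9 distinct values of EACH
        // hyperparameter - valuable when only one of them strongly affects
        // the score (the core argument of Bergstra & Bengio, 2012).
        Random rng = new Random(42);
        for (int i = 0; i < 9; i++) {
            double lr = Math.pow(10, -4.0 + 3.0 * rng.nextDouble());
            int size = 16 + rng.nextInt(241); // uniform in [16, 256]
            System.out.printf("random: lr=%.5f, layerSize=%d%n", lr, size);
        }
    }
}
```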
## Core Concepts and Classes in Arbiter for Hyperparameter Optimization
In order to conduct hyperparameter optimization in Arbiter, it is necessary for the user to understand and define the following:
- Parameter Space: A `ParameterSpace<P>` specifies the type and allowable values of hyperparameters for a model configuration of type `P`. For example, `P` could be a MultiLayerConfiguration for DL4J.
- Candidate Generator: A `CandidateGenerator<C>` is used to generate candidate model configurations of some type `C`. The following implementations are defined in arbiter-core:
  - `RandomSearchCandidateGenerator`
  - `GridSearchCandidateGenerator`
- Score Function: A `ScoreFunction<M,D>` is used to score a model of type `M` given data of type `D`. For example, in DL4J a score function might be used to calculate the classification accuracy from a DataSetIterator (see the score-function sketch after this list).
  - A key concept here is that the score is a single numerical (double precision) value that we either want to minimize or maximize - this is the goal of hyperparameter optimization.
- Termination Conditions: One or more `TerminationCondition` instances must be provided to the `OptimizationConfiguration`. `TerminationCondition` instances are used to control when hyperparameter optimization should be stopped. Some built-in termination conditions (see the termination-condition sketch after this list):
  - `MaxCandidatesCondition`: Terminate if more than the specified number of candidate hyperparameter configurations have been executed.
  - `MaxTimeCondition`: Terminate after a specified amount of time has elapsed since starting the optimization.
- Result Saver: The `ResultSaver<C,M,A>` interface specifies how the results of each hyperparameter optimization run should be saved: for example, to local disk, to a database, to HDFS, or simply in memory.
  - Note that the `ResultSaver.saveModel` method returns a `ResultReference` object, which provides a mechanism for re-loading both the model and score from wherever they may be saved.
- Optimization Configuration: An `OptimizationConfiguration<C,M,D,A>` ties together the above configuration options in a fluent (builder) pattern.
- Candidate Executor: The `CandidateExecutor<C,M,D,A>` interface provides a layer of abstraction between the configuration and execution of each instance of learning. Currently, the only implementation is `LocalCandidateExecutor`, which executes learning on a single machine (in the current JVM). In principle, other execution methods (for example, on Spark or cloud computing machines) could be implemented.
- Optimization Runner: The `OptimizationRunner` uses an `OptimizationConfiguration` and a `CandidateExecutor` to actually run the optimization and save the results. A toy end-to-end sketch of this loop follows below.
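The score-function sketch referenced above is a minimal illustration of the idea that a model plus evaluation data reduces to a single double. This is a concept sketch only, not Arbiter's actual `ScoreFunction` interface; the `Example` record and the classifier type are hypothetical stand-ins:

```java
import java.util.List;
import java.util.function.Function;

// Concept stand-in for a score function: model + data -> one double.
interface ToyScoreFunction<M, D> {
    double score(M model, D data);
    boolean minimize(); // true for losses; false for scores like accuracy
}

// Hypothetical labeled example: a feature vector and an integer class label.
record Example(double[] features, int label) {}

// Classification accuracy (to be maximized) for any classifier that maps
// features to a predicted label.
class AccuracyScore implements ToyScoreFunction<Function<double[], Integer>, List<Example>> {
    @Override
    public double score(Function<double[], Integer> model, List<Example> data) {
        long correct = data.stream()
                .filter(e -> model.apply(e.features()) == e.label())
                .count();
        return correct / (double) data.size();
    }

    @Override
    public boolean minimize() {
        return false; // higher accuracy is better
    }
}
```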
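The termination-condition sketch referenced above follows. The two class names match the built-ins listed in this README, but the bodies are illustrative assumptions, not the library's code; the point is that optimization stops as soon as any provided condition fires:

```java
import java.util.List;

// Concept version of a termination condition: inspect the run so far and
// decide whether to stop.
interface ToyTerminationCondition {
    boolean shouldTerminate(int candidatesExecuted, long elapsedMillis);
}

class MaxCandidatesCondition implements ToyTerminationCondition {
    private final int maxCandidates;
    MaxCandidatesCondition(int maxCandidates) { this.maxCandidates = maxCandidates; }
    @Override
    public boolean shouldTerminate(int candidatesExecuted, long elapsedMillis) {
        return candidatesExecuted >= maxCandidates;
    }
}

class MaxTimeCondition implements ToyTerminationCondition {
    private final long maxMillis;
    MaxTimeCondition(long maxMillis) { this.maxMillis = maxMillis; }
    @Override
    public boolean shouldTerminate(int candidatesExecuted, long elapsedMillis) {
        return elapsedMillis >= maxMillis;
    }
}

class TerminationCheck {
    // The optimization loop stops when ANY condition is satisfied.
    static boolean shouldStop(List<ToyTerminationCondition> conditions,
                              int executed, long elapsedMillis) {
        return conditions.stream()
                .anyMatch(c -> c.shouldTerminate(executed, elapsedMillis));
    }
}
```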
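Finally, the toy end-to-end sketch referenced in the Optimization Runner item: a self-contained plain-Java loop that plays the roles of candidate generator (random search), candidate executor and score function (a synthetic score stands in for training a model), termination conditions, and result saver. It illustrates how the pieces interact; it is not Arbiter code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ToyOptimizationRunner {
    public static void main(String[] args) {
        Random rng = new Random(123);
        int maxCandidates = 20;      // plays the role of MaxCandidatesCondition
        long maxMillis = 5_000;      // plays the role of MaxTimeCondition
        long start = System.currentTimeMillis();

        List<double[]> results = new ArrayList<>(); // in-memory "result saver": {lr, score}
        double bestScore = Double.NEGATIVE_INFINITY;
        double bestLr = Double.NaN;

        for (int i = 0; i < maxCandidates
                && System.currentTimeMillis() - start < maxMillis; i++) {
            // Candidate generation: random search over log10(lr) in [-4, -1].
            double lr = Math.pow(10, -4.0 + 3.0 * rng.nextDouble());

            // Execution + scoring: a synthetic score peaking near lr = 1e-2
            // stands in for training and evaluating a real model.
            double score = -Math.pow(Math.log10(lr) + 2.0, 2);

            results.add(new double[]{lr, score});
            if (score > bestScore) { // this toy score is maximized
                bestScore = score;
                bestLr = lr;
            }
        }
        System.out.printf("Best candidate: lr=%.5f (score %.4f) out of %d runs%n",
                bestLr, bestScore, results.size());
    }
}
```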
## Optimization of DeepLearning4J Models
(This section: forthcoming)