ND4J Op Definitions and Code Generation

This project contains the ND4J op definitions, the DSL (Domain Specific Language) used to write those definitions, and the code generators that turn them into the Java code through which the defined operations are used.

Why define ops externally?

As we started to support SameDiff, we also started to introduce inconsistencies between SameDiff and ND4J. Even though both of those libraries use the same underlying implementations for operations, there are both small and large differences in the API that we provide for them. Sometimes, we have provided an official API only for one usage, and not the other. And very often the documentation for a single op is in many different places.

In the future we want to support other programming languages with libnd4j, and provide more ways to use our C++ backend. This would only increase the aforementioned problems.

The root of all of those problems is that Ops are used across different environments, and there is no single way of defining them with an enforced interface.

How does this project help with enforcing a single consistent interface for ops?

The solution we propose is to define the operations separately and then generate the necessary API code for them. All generated code is to be considered untouchable: any manual edits will be overwritten sooner rather than later.

The combination of external op definitions and code generation opens up many opportunities for us. The first is that we can easily create consistent APIs for both ND4J and SameDiff in Java. Looking into the future, we can also create those APIs for other programming languages such as Python, Swift, or even C#. We can even go beyond programming languages and use the op definitions to create better documentation than what JavaDoc or similar tools support out of the box.

Maintenance

This project is currently maintained by Paul Dubs, with feedback often collected from raver119 and Alex Black.

Current Status

At the moment we are still focused on nailing down an easily readable and contribution-friendly DSL for op definition and code generation that can replace namespace definitions. This means that, for now, we still rely on the Op definition classes that already exist in ND4J.

Roadmap

  • Replace Bitwise and Random namespaces with autogenerated code (in progress).
  • Implement a convenient CLI tool.
  • Define all Ops using the DSL.
  • Automatically generate derivative op declarations from existing ops
  • Replace all namespace definitions in ND4J / SameDiff with automatically generated ones
  • Replace all Op classes with automatically generated ones.

Usage

Pre-requisites:

  • JDK 8 or higher
  • Maven 3.3 or higher

TODO: Show usage output of the project itself

TODO: Show how to use from mvn

Generating Code - ND4J Namespaces

A script - generate.sh - is provided in the project root. This can be used (at present) to generate ND4J namespace classes. It is assumed that the deeplearning4j mono repo and the dl4j-dev-tools repo both exist and have a common parent directory i.e., somedir/deeplearning4j and somedir/dl4j-dev-tools both exist.

The script takes as arguments the name (or names) of the ND4J namespaces to generate (not case sensitive) and, optionally, the projects to generate them for. Supported projects are nd4j and sd; by default, code is generated for both.

As of 26/11, namespace names (and hence valid args) include: bitwise, neuralnetwork, random, and math. Note also that all may be passed to the script to generate all namespaces.

For example, to generate both bitwise and random namespaces for both nd4j and SameDiff:

./generate.sh bitwise,random

Or to generate all namespaces for both nd4j and SameDiff, use:

./generate.sh all

To generate namespaces for one project only, use:

./generate.sh linalg -projects sd

or:

./generate.sh linalg -projects nd4j

The script will first compile the project, before running. Internally, the org.nd4j.codegen.cli.CLI class is used. Classes are written to deeplearning4j/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/factory/ops/
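
The argument handling described above (comma-separated names, not case sensitive, with all as a shortcut) can be sketched as follows. This is only an illustration and not the code of org.nd4j.codegen.cli.CLI: the class NamespaceArgSketch, its parse method and the rejection of unknown names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT part of the codegen project.
final class NamespaceArgSketch {

    // Turns an argument such as "Bitwise,RANDOM" into a list of namespace names.
    static List<String> parse(String arg, List<String> known) {
        List<String> selected = new ArrayList<>();
        for (String raw : arg.split(",")) {
            // Names are not case sensitive, so normalise before comparing.
            String name = raw.trim().toLowerCase(Locale.ROOT);
            if (name.equals("all")) {
                return new ArrayList<>(known); // "all" selects every known namespace
            }
            if (!known.contains(name)) {
                // Assumption: unknown names are rejected rather than ignored.
                throw new IllegalArgumentException("Unknown namespace: " + raw);
            }
            if (!selected.contains(name)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("bitwise", "neuralnetwork", "random", "math");
        System.out.println(parse("Bitwise,RANDOM", known)); // prints [bitwise, random]
        System.out.println(parse("all", known));
    }
}
```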

Generating documentation.

generate.sh can generate code only, docs in markdown format only, or both docs and code. To generate docs only for all namespaces and store them in a new folder "docs":

./generate.sh all -docsdir ../../docs

Generation for selected namespaces works in the same way as for code:

./generate.sh -docsdir ../../docs bitwise,linalg

Code structure

The project is implemented using a mix of Java and Kotlin. The DSL definition and the accompanying data structures are implemented in Kotlin. At the moment the code generators are implemented in Java, so that people who know Java but are not fluent in Kotlin can contribute to them.

The source code for this project is structured a bit differently than what you would typically see in a Java or Kotlin project. When you take a look inside the src/main directory, you will find 4 sub-directories.

The java and kotlin directories contain Java and Kotlin code respectively.

In order to not confuse op definitions with the machinery that allows them to be defined in that way, ops are kept in a separate folder called ops.

Because we use JavaPoet for Java code generator implementation, we also have yet another folder called stubs. That folder contains stub classes, that are used to reference other classes available in ND4J. These stub classes are intentionally left empty, as JavaPoet only requires them for naming and automatically creating proper imports. We use stub classes instead of depending on the actual nd4j API in order to break a cyclic dependency that would otherwise be created (i.e. in order to be able to generate code for ND4J, we would need an already compiled nd4j to be available). Note: If something is stubbed here and is moved in ND4J, then it also has to be moved to the appropriate place here, otherwise the generated code will be wrong.

The adr folder contains "Architecture Decision Records". These files give you more insight into the "why" of some of the bigger decisions within this project.

DSL for Op Definition

Ops are defined using a DSL that is implemented in Kotlin. This means that, in addition to the DSL constructs described below, you can also use all of Kotlin when defining Ops. However, doing things the obvious and clearly understandable way is better than coming up with a clever way, so prefer to use the DSL as described if unsure.

val mathNs = Namespace("math") {
    Op("add") {
        javaPackage = "org.nd4j.linalg.api.ops.impl.transforms.pairwise.arithmetic"

        Input(NUMERIC, "x") { description = "First input to add" }
        Input(NUMERIC, "y") { count = AtLeast(1); description = "Second input to add" }
        Arg(INT, "shape") { count = AtLeast(1); description = "shape" }

        Output(NUMERIC, "z") { description = "Output (x+y)" }

        Doc(Language.ANY, DocScope.ALL) {
            """
            (From AddOp) Add op doc text that will appear everywhere - classes, constructors, op creators
            """.trimIndent()
        }
        Doc(Language.ANY, DocScope.CLASS_DOC_ONLY) {
            "Add op doc text that will appear in all class docs (javadoc etc)"
        }
        Doc(Language.ANY, DocScope.CONSTRUCTORS_ONLY) {
            "Add op doc text for constructors only"
        }

    }
}

This example shows how a namespace is defined. Namespaces are at the top layer, and ops can only be defined within the context of a namespace. This example namespace contains only a single op, called "add". If we wanted to add another op, we would simply add it below the first.

As you can see, every op has to have a name; if you try to create one without a name, you will get a compile error. Within the context of the op, we first set the Java package in which the op class can be found, then define its inputs, arguments and outputs, and finally add some free-form documentation about what the op does.

Like with the op itself, the inputs, arguments and outputs all have to have a name, but unlike the op, they also require a type. Within their context, you can set a description and a count of how many parameters they can take respectively.

If an input, argument or output takes anything other than exactly 1 value, it will be treated as an array. Typically you would use this to define ops like concat, which can take multiple input tensors, or ops that take shape arguments.
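
The count rule can be modelled in a few lines of plain Java. This is a minimal sketch, not the DSL's own Count types: the class CountSketch and its methods are hypothetical, and it assumes that Range bounds are inclusive and that AtMost(n) allows zero values.

```java
// Hypothetical model of the DSL's count values; not the project's actual classes.
final class CountSketch {
    final int min; // smallest allowed number of values
    final int max; // largest allowed number of values (Integer.MAX_VALUE = unbounded)

    private CountSketch(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("invalid count bounds");
        }
        this.min = min;
        this.max = max;
    }

    static CountSketch exactly(int n) { return new CountSketch(n, n); }
    static CountSketch atLeast(int n) { return new CountSketch(n, Integer.MAX_VALUE); }
    static CountSketch atMost(int n) { return new CountSketch(0, n); }           // assumption: lower bound 0
    static CountSketch range(int from, int to) { return new CountSketch(from, to); } // assumption: inclusive

    // Anything other than exactly one value is treated as an array.
    boolean isArray() { return !(min == 1 && max == 1); }

    // True if a parameter holding n values satisfies this count.
    boolean accepts(int n) { return n >= min && n <= max; }

    public static void main(String[] args) {
        System.out.println(exactly(1).isArray());   // prints false
        System.out.println(atLeast(1).isArray());   // prints true
        System.out.println(range(2, 4).accepts(5)); // prints false
    }
}
```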

Examples

The following shows what a typical op definition looks like and how the generated Java code may look.

An op might be defined like this:

Op("binomial") {
    javaPackage = "org.nd4j.linalg.api.ops.random.custom"
    Arg(INT, "nTrials") { description = "Number of trials parameter for the binomial distribution" }
    Arg(FLOATING_POINT, "p") { description = "Probability of success for each trial" }
    Arg(INT, "shape") { count = AtLeast(1); description = "Shape of the new random SDVariable, as a 1D array" }

    Output(NUMERIC, "output") { description = "new random SDVariable, where values are randomly sampled according to a Binomial distribution" }

    Doc(Language.ANY, DocScope.ALL) {
        """
        Generate a new random SDVariable, where values are randomly sampled according to a Binomial distribution,
        with the specified number of trials and probability.
        """.trimIndent()
    }
}

The java code generator will create a method like the following for it:

  /**
   * Generate a new random SDVariable, where values are randomly sampled according to a Binomial distribution,
   * with the specified number of trials and probability.
   *
   * @param nTrials Number of trials parameter for the binomial distribution
   * @param p Probability of success for each trial
   * @param shape Shape of the new random SDVariable, as a 1D array (Size: AtLeast(min=1))
   * @return output new random SDVariable, where values are randomly sampled according to a Binomial distribution (NUMERIC type)
   */
  public static INDArray binomial(long nTrials, double p, long... shape) {
    Preconditions.checkArgument(shape.length >= 1, "shape has incorrect count. Expected: AtLeast(min=1)");
    return Nd4j.exec(new org.nd4j.linalg.api.ops.random.custom.BinomialOp(nTrials, p, shape))[0];
  }

Or an op with some more constraints:

Op("and") {
    javaPackage = "org.nd4j.linalg.api.ops.impl.transforms.custom"
    val x = Input(INT, "x") { description = "First input array" }
    val y = Input(INT, "y") { description = "Second input array" }
    Constraint("Must be same types"){ sameType(x, y) }
    Constraint("Must have broadcastable shapes"){ broadcastableShapes(x, y) }

    Output(INT, "output"){ description = "Bitwise AND array" }

    Doc(Language.ANY, DocScope.ALL){
        """
        Bitwise AND operation. Supports broadcasting.
        """.trimIndent()
    }
}

will be converted to java like this:

  /**
   * Bitwise AND operation. Supports broadcasting.
   *
   * Inputs must satisfy the following constraints: 
   * Must be same types: isSameType(x, y)
   * Must have broadcastable shapes: isBroadcastableShapes(x, y)
   *
   * @param x First input array (INT type)
   * @param y Second input array (INT type)
   * @return output Bitwise AND array (INT type)
   */
  public static INDArray and(INDArray x, INDArray y) {
    NDValidation.validateInteger("and", x);
    NDValidation.validateInteger("and", y);
    Preconditions.checkArgument(isSameType(x, y), "Must be same types");
    Preconditions.checkArgument(isBroadcastableShapes(x, y), "Must have broadcastable shapes");
    return Nd4j.exec(new org.nd4j.linalg.api.ops.impl.transforms.custom.AndOp(x, y))[0];
  }

Full DSL Description

Namespace

fun NamespaceName() = Namespace("name"){ /* Op definitions in namespace context */}

Defines a namespace.

Op

Only available within a namespace context

Op("opName") { /* op properties in op context */ }
Op("anotherOp", mixin) { /* op properties in op context */ }
Op("anotherOp2", mixin, keepInputs=false) { /* op properties in op context */ }

Every op requires a namespace unique op name.

When defining an op, you can also pass a mixin that it should inherit initial properties from. This has the same effect as using useMixin(mixin) as the very first thing in the op definition. If you don't want to inherit all of the parameters of the mixin, you can pass the same additional configuration as you would pass to useMixin(mixin, ...options..). See Mixin for more information.

Op properties

  • javaPackage (String): Package where the op is to be found in the java implementation.
  • javaOpClass (String): Name of java op class if inconsistent with opName. Default: same as opName
  • libnd4jName (String): The name the op has in libnd4j. Default: same as opName

Mixin

Available in global context.

val mixin = Mixin("name"){ /* op properties in op context */ }
// within an op context:
useMixin(mixin)
useMixin(mixin, ...options...)

// When needing to access something from within the mixin
mixin.input("name")
mixin.arg("name")
mixin.config("name")
mixin.output("name")

Mixins provide the facility to share commonalities between Ops. You can think of it like inheritance, especially when you declare the use of a mixin on Op definition. In contrast to normal (single) inheritance, where only a single super class is possible, the mixin mechanism allows an op to "inherit" from multiple sources.

You can define almost all the same things within a mixin that you can within an Op. The only things that can not be configured within a mixin are Op name, libnd4jName and javaOpClass.

As mixins can be configured within the global context, you can share them across namespaces by defining them in their own file. If a mixin is namespace specific, you can also define it within the namespace context.

Mixins are used either on definition as a parameter Op("opname", mixin){...}, or with useMixin(mixin) within the op definition. While the former version only supports a single mixin, the latter version allows you to use as many mixins as are required.

You can also build up mixins by using useMixin(mixin) inside a Mixin itself.

useMixin(mixin, ...options...) supports a few additional options: keepInputs, keepArgs, keepConfigs, keepOutputs, keepSignatures, keepDoc, keepConstraints. They default to true. If you want to skip including some of them, you simply set the parameter for it to false, e.g. useMixin(mixin, keepDoc=false).

When using useMixin(mixin), all definitions within the mixin are applied as if this invocation was replaced with the content of the mixin itself. This means, that if you have already defined anything prior to using a mixin, the mixin's definitions will be after the previously defined things. This can be very useful if the commonality between ops is that they have a few trailing options.

If a named property or section is defined in both a mixin (or multiple mixins) and the op, then the last to define it will win. The named properties are legacy and javaPackage; the named sections are Input, Arg, Output and Config.

For example, assume you have javaPackage defined in both an op and a mixin. Then you can have the following two cases:

First case:

    Op("foo"){
        useMixin(exampleMixin)
        javaPackage = "some.example.package"
    }

Second case:

    Op("foo"){
        javaPackage = "some.example.package"
        useMixin(exampleMixin)
    }

In the first case, the op will have the javaPackage value that is defined within the op. In the second case it will have the javaPackage value defined in the mixin.

For inputs, args, outputs, it works similarly. Assume you have Input(dataType, "a") defined in both the mixin and the op. Again you can have two cases:

First case:

    Op("foo"){
        useMixin(exampleMixin)
        Input(NUMERIC, "a")
    }

Second case:

    Op("foo"){
        Input(NUMERIC, "a")
        useMixin(exampleMixin)
    }

In the first case, the op's definition will overwrite the input from the mixin. In the second case, the mixin will overwrite the input from the op.
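
The "last definition wins" rule can be illustrated with a small stand-alone Java model. It is a sketch under the assumption that applying a mixin simply replays its definitions at the point of the call; MixinOrderSketch and its methods are hypothetical and are not the DSL implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: an op (or mixin) collects named inputs as they are defined.
final class MixinOrderSketch {
    // Maps input name -> data type; a later definition replaces an earlier one.
    final Map<String, String> inputs = new LinkedHashMap<>();

    MixinOrderSketch input(String type, String name) {
        inputs.put(name, type);
        return this;
    }

    // Applying a mixin replays its definitions at the point of the call.
    MixinOrderSketch useMixin(MixinOrderSketch mixin) {
        inputs.putAll(mixin.inputs);
        return this;
    }

    public static void main(String[] args) {
        MixinOrderSketch mixin = new MixinOrderSketch().input("INT", "a");

        // First case: the op defines "a" after using the mixin, so the op wins.
        MixinOrderSketch first = new MixinOrderSketch().useMixin(mixin).input("NUMERIC", "a");
        // Second case: the mixin is applied last, so the mixin wins.
        MixinOrderSketch second = new MixinOrderSketch().input("NUMERIC", "a").useMixin(mixin);

        System.out.println(first.inputs.get("a"));  // prints NUMERIC
        System.out.println(second.inputs.get("a")); // prints INT
    }
}
```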

Config

Only available within a namespace context

val nameConfig = Config("Name"){
    /* input, arg, constraint, doc properties */
}

Every config requires a namespace unique name.

A config defines a configuration class that can be used as a holder for complex properties of specific ops and is passed to an op as a parameter.

Similar to an op itself, it supports Input, Arg, Constraint and Doc definitions.

In order to use the config within an op, you either use useConfig(cfg) or val configRef = useConfig(cfg). The second form allows you to reference the config.

Referencing the config allows you to reference its inputs and args by name: configRef.input("name") and configRef.arg("name"). It also allows you to use a config in a signature: Signature(a, b, c, configRef).

When default and shorthand signatures are used, configs will be always placed at the end.

If a config is defined but not used, an IllegalStateException will be thrown.

See also ADR 0007 "Configuration Objects".

Input

Available within an op, mixin and a config context

Input(FLOATING_POINT, "b"){ /* input properties in input context */ }
val a = Input(INT, "a"){ /* input properties in input context */ }

Inputs represent tensors. They are what the op will work on.

Every input requires a data type (either INT, FLOATING_POINT, NUMERIC or BOOLEAN) and an op unique name.

When defining an input, you can assign it to a variable in order to be able to reference it later on. You might want to do this when defining constraints.

If you want an input to represent an array, you will have to set a count accordingly. If no count is set, it is assumed that the count is meant to be Exactly(1).

Input properties

  • description (String): A short description of what this input represents. Setting this is recommended.
  • count (Count): Can take one of Exactly(n); AtLeast(n); AtMost(n); Range(from, to)
  • defaultValue (Input): use another input as the default if this isn't set explicitly. The data type of the other input has to match the data type of this input. The other input may also have a default value.
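
Because a default may itself have a default, resolving the value of an input means following a chain. The sketch below shows one way this could work; DefaultChainSketch, its fields and the cycle check are assumptions for illustration and do not describe how the code generators actually resolve defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of chained default values; not the project's implementation.
final class DefaultChainSketch {
    // Parameter name -> name of the parameter that serves as its default.
    final Map<String, String> defaultOf = new HashMap<>();

    // Follows the default chain until a parameter with an explicit value is found.
    String resolve(String name, Map<String, String> provided) {
        String current = name;
        int steps = 0;
        while (!provided.containsKey(current)) {
            String next = defaultOf.get(current);
            if (next == null) {
                throw new IllegalStateException("No value or default for " + current);
            }
            // A chain without a cycle can never be longer than the number of defaults.
            if (++steps > defaultOf.size()) {
                throw new IllegalStateException("Cyclic default chain starting at " + name);
            }
            current = next;
        }
        return provided.get(current);
    }

    public static void main(String[] args) {
        DefaultChainSketch op = new DefaultChainSketch();
        op.defaultOf.put("c", "b"); // c defaults to b
        op.defaultOf.put("b", "a"); // b defaults to a

        Map<String, String> provided = new HashMap<>();
        provided.put("a", "tensorA");
        System.out.println(op.resolve("c", provided)); // prints tensorA
    }
}
```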

Argument

Available within an op, mixin and config context

Arg(FLOATING_POINT, "b"){ /* Arg properties in arg context */ }
val a = Arg(INT, "a"){ /* Arg properties in arg context */ }

Args represent arguments. They modify how the op works on its inputs.

Every arg requires a data type (either INT, FLOATING_POINT, NUMERIC or BOOLEAN) and an op unique name.

When defining an arg, you can assign it to a variable in order to be able to reference it later on. You might want to do this when defining constraints.

If you want an arg to represent an array, you will have to set a count accordingly. If no count is set, it is assumed that the count is meant to be Exactly(1).

Note (Java specific): If the last arg is defined to represent an array, it will be translated to a vararg parameter, e.g. Arg(INT, "a"){ count = AtLeast(1); description = "..." } will be turned into long... a.

Argument properties

  • description (String): A short description of what this argument represents. Setting this is recommended.
  • count (Count): Can take one of Exactly(n); AtLeast(n); AtMost(n); Range(from, to)
  • defaultValue (null|Number|Boolean|int[]|double[]|boolean[]|Arg|TensorShapeValue|TensorDataTypeValue|String): Use the given value as the default value if this isn't explicitly set. Can refer to inputs and outputs using x.shape() and x.dataType(). The given default value has to match the data type of this argument. May also refer to another Arg, and that Arg may also have a default value. Args whose default values are based on outputs are treated as having no default in SameDiff mode.
  • possibleValues (String[]): only available when the ENUM data type is used for the argument. Takes a list of possible values for the Enum. If used in an abstract base op, the enum will only be created once. See also ADR 0006 "Op specific enums".

Output

Only available within an op and mixin context

Output(FLOATING_POINT, "b"){ /* Output properties in output context */ }

Every output requires a data type (either INT, FLOATING_POINT, NUMERIC or BOOLEAN) and an op unique name.

While outputs can be assigned to a variable, there is no intended use-case for it. In contrast to inputs and args, outputs can not be used in constraints.

Output properties

  • description (String): A short description of what this output represents. Setting this is recommended.

Signature

Only available within an op and mixin context

Signature(a,b,c)
Signature(a,b,c) { "Some Documentation" }
AllParamSignature()
AllDefaultParamSignature()

For some ops only specific signatures make sense, as for example some optional parameters may become required in the presence of other optional parameters. This feature is mainly meant to help with the fact that not all programming languages (e.g. Java) support default parameters. Each signature is meant to describe one overload in those languages.

See also ADR 0005 "Optional parameters and signatures".

Signatures can also reference the output(s) of an op. Those signatures are only relevant in NDArray programming mode. They are not to be generated in SameDiff mode.

AllParamSignature() and AllDefaultParamSignature() are shorthands for Signature(...all parameters...) and Signature(...only parameters with no default values...). Their parameters include references to outputs unless disabled using withOutput=false (e.g. AllParamSignature(withOutput=false)).

If no signature is specified for an op, it is treated as if AllParamSignature() and AllDefaultParamSignature() are both specified.

Each signature must satisfy the condition, that all required parameters are listed there. If this condition is not satisfied, an IllegalStateException will be thrown on construction.
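
The relationship between the two shorthands and the validity check can be sketched in plain Java. This is an illustration only: SignatureSketch and its methods are hypothetical, outputs and configs are left out, and a parameter is assumed to be required exactly when it has no default value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of signature handling; not the project's implementation.
final class SignatureSketch {
    // Parameter name -> whether it has a default value, kept in definition order.
    final Map<String, Boolean> hasDefault = new LinkedHashMap<>();

    SignatureSketch param(String name, boolean withDefault) {
        hasDefault.put(name, withDefault);
        return this;
    }

    // Mirrors AllParamSignature(): every parameter.
    List<String> allParams() {
        return new ArrayList<>(hasDefault.keySet());
    }

    // Mirrors AllDefaultParamSignature(): only parameters without a default value.
    List<String> requiredParams() {
        List<String> required = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : hasDefault.entrySet()) {
            if (!e.getValue()) {
                required.add(e.getKey());
            }
        }
        return required;
    }

    // Rejects a signature that leaves out a required parameter.
    List<String> signature(String... names) {
        List<String> listed = Arrays.asList(names);
        for (String required : requiredParams()) {
            if (!listed.contains(required)) {
                throw new IllegalStateException("Signature is missing required parameter: " + required);
            }
        }
        return listed;
    }

    public static void main(String[] args) {
        SignatureSketch op = new SignatureSketch().param("x", false).param("alpha", true);
        System.out.println(op.allParams());      // prints [x, alpha]
        System.out.println(op.requiredParams()); // prints [x]
        System.out.println(op.signature("x"));   // prints [x]
        // op.signature("alpha") would throw, because the required parameter x is missing.
    }
}
```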

Documentation

Only available within an op and mixin context

Doc(Language.ANY, DocScope.ALL){
    """ Some documentation
    It can be multiline. And indented.
    """.trimIndent()
}

Documentation can be language specific, and can be set to be only given at specific places. The documentation itself is given as a string. Because Kotlin supports multiline strings along with proper indentation, we are using them directly here.

Note: At the moment we are only creating java code, so the documentation can use JavaDoc syntax.

You can have multiple Doc definitions; they are treated as additive.

Any instances of the following values will be replaced when generating code:

  • %OPNAME% -> operation name ("Add", "Sub", etc)
  • %LIBND4J_OPNAME% -> libnd4j op name ("add", "sub", etc)
  • %INPUT_TYPE% -> input / output type depending on the generated api, i.e. SDVariable for SameDiff and INDArray for ND4J

See DocTokens class for more details.
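
To make the substitution concrete, here is a minimal sketch of replacing the three tokens with plain string replacement. It is not the DocTokens class: DocTokenSketch and its render method are hypothetical names chosen for this example.

```java
// Hypothetical helper showing the token substitution; not the project's DocTokens class.
final class DocTokenSketch {

    // Replaces the documented placeholders in a doc string with concrete values.
    static String render(String doc, String opName, String libnd4jOpName, String inputType) {
        return doc.replace("%OPNAME%", opName)
                  .replace("%LIBND4J_OPNAME%", libnd4jOpName)
                  .replace("%INPUT_TYPE%", inputType);
    }

    public static void main(String[] args) {
        String doc = "%OPNAME% (%LIBND4J_OPNAME%) returns %INPUT_TYPE%";
        System.out.println(render(doc, "Add", "add", "SDVariable")); // prints Add (add) returns SDVariable
        System.out.println(render(doc, "Add", "add", "INDArray"));   // prints Add (add) returns INDArray
    }
}
```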

Constraints

Available within an op, mixin and a config context.

Constraint("Error Message if constraint isn't satisfied"){ /* constraint definition */ }
BackendConstraint("Error Message if constraint isn't satisfied"){ /* constraint definition */ }

Many ops expect their inputs and arguments to satisfy some specific rules. Those rules can be expressed with the constraint system.

Constraints are to be enforced within the frontend language, while BackendConstraints are currently only to be used as a part of the documentation. They will be enforced within the C++ backend, so there is no point in double checking them.

There is a system in place to define even complex constraints for inputs and arguments.

In a constraint definition, you can reference inputs and arguments directly, if they are previously assigned to a variable using val name = Input(...). Inside the Constraint block, you can use the following operations:

  • eq: Compare equality (applicable to numbers and booleans), e.g. x eq 7, x eq true
  • neq: Compare inequality (applicable to numbers and booleans), e.g. x neq 3, x neq true
  • lt, lte: less than, less than or equal (applicable to numbers), e.g. x lt 3, x lte 4
  • gt, gte: greater than, greater than or equal (applicable to numbers), e.g. x gt 5, x gte 6
  • and: combine two comparisons where both have to be true, e.g. (x eq 8) and (y lt 3)
  • or: combine two comparisons where one has to be true, e.g. (x eq 8) or (y eq true)
  • all: combine N comparisons where all have to be true, e.g. all(x eq 8, y lt 3, z eq true)
  • some: combine N comparisons where at least one has to be true, e.g. some(x eq 8, y lt 3, z eq true)
  • not: negates a comparison, e.g. not(x eq 3)

In addition to those operations, you also get access to some more complex constraints:

  • sameType(...): true if all given inputs are the same type, e.g. sameType(x,y,z)
  • sameShape(...): true if all given inputs have the same shape, e.g. sameShape(x,y,z)
  • broadcastableShapes(...): true if all given inputs have broadcast compatible shapes, e.g. broadcastableShapes(x,y,z)

Inputs also get some additional methods on them to define useful constraints:

  • input.rank(): Rank of the given input
  • input.sizeAt(i): size of the given input at the i-th dimension
  • input.isScalar(): Short hand for x.rank() == 1

Examples

Some examples of constraints, and what they evaluate to. The example code contains a little bit of context.

val x = Input(INT, "x") { description = "First input array" }
val y = Input(INT, "y") { description = "Second input array" }
Constraint("foo bar"){
   x.sizeAt(7) eq 7 and y.isScalar()
}

will evaluate to:

    Preconditions.checkArgument((x.sizeAt(7) == 7) && (y.rank() == 1), "foo bar");

More examples (only the constraint itself, without context code):

Some

some(x.rank() eq 3, x.sizeAt(2) gte 7, x.sizeAt(4) lt 5)

turns to:

((x.rank() == 3) || (x.sizeAt(2) >= 7)) || (x.sizeAt(4) < 5)
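
One way to arrive at output of this shape is to build a small expression tree and let each node print itself as Java source. The sketch below reproduces the two generated expressions shown in this section; ConstraintRenderSketch and its methods are hypothetical and are not the project's ConstraintCodeGenerator.

```java
// Hypothetical expression tree that prints constraints as Java source text.
final class ConstraintRenderSketch {

    // An expression node that can render itself as Java source.
    interface Expr {
        String render();
    }

    // A comparison such as x.rank() == 3.
    static Expr compare(String left, String op, String right) {
        return () -> left + " " + op + " " + right;
    }

    // A boolean combination; each operand is wrapped in parentheses.
    static Expr combine(Expr left, String op, Expr right) {
        return () -> "(" + left.render() + ") " + op + " (" + right.render() + ")";
    }

    static Expr and(Expr left, Expr right) { return combine(left, "&&", right); }
    static Expr or(Expr left, Expr right) { return combine(left, "||", right); }

    // some(...) folds its operands from left to right with ||.
    static Expr some(Expr first, Expr... rest) {
        Expr result = first;
        for (Expr next : rest) {
            result = or(result, next);
        }
        return result;
    }

    public static void main(String[] args) {
        Expr someExpr = some(
                compare("x.rank()", "==", "3"),
                compare("x.sizeAt(2)", ">=", "7"),
                compare("x.sizeAt(4)", "<", "5"));
        // prints ((x.rank() == 3) || (x.sizeAt(2) >= 7)) || (x.sizeAt(4) < 5)
        System.out.println(someExpr.render());

        Expr andExpr = and(compare("x.sizeAt(7)", "==", "7"), compare("y.rank()", "==", "1"));
        // prints (x.sizeAt(7) == 7) && (y.rank() == 1)
        System.out.println(andExpr.render());
    }
}
```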

Contributing to this project

If you want to contribute to this project other than by adding or improving op definitions, the following sections might be of special interest to you.

Extending the DSL

The DSL is implemented using Kotlin's type-safe builders feature (see https://kotlinlang.org/docs/reference/type-safe-builders.html). The basic principle is that function calls can receive blocks that can be executed in a specified context. When combined with the fact that we are just looking to create an object graph that is then going to be used as input to the code generators, this allows us to create a very feature rich DSL without actually having to write a lot of code to support it.

Most of the DSL specific code can be found in src/kotlin/org/nd4j/codegen/dsl/OpBuilder.kt. The actual class definitions for the object graph we are building can be found in src/kotlin/org/nd4j/codegen/api.

If you want to add just a simple field to one of the objects, it is usually enough to directly add it to the particular class.

If you want to add a specific section to the op definition, i.e. a section like Input or Doc, you will have to add both the class for the object that it is going to be creating, as well as a function within OpBuilder.kt to create and register that section within the op.

Note: When you extend the DSL you will most likely also have to update all code generators to support the feature you have added.

Adding / extending code generators

Code generators can be written in either Java or Kotlin. Java has the advantage that more people will have experience in using it. Kotlin has the advantage of more convenient syntax, especially for plain string manipulation and when dealing with Enums and fixed sets of subclasses (called sealed classes in Kotlin).

All generators have to implement the org.nd4j.codegen.api.generator.Generator interface. For automatic detection by the CLI tool, they should also be within the org.nd4j.codegen.impl.LANGUAGE package, where LANGUAGE is the actual language that they generate.

Code generators can also use an auxiliary generator for constraint generation. Those auxiliary generators have to implement the org.nd4j.codegen.api.generator.ConstraintCodeGenerator interface.