---
title: DataVec Executors
short_title: Executors
description: Execute ETL and vectorization in a local instance.
category: DataVec
weight: 3
---

## Local or remote execution?

Because datasets are commonly large, you can choose the execution mechanism that best suits your needs. For example, if you are vectorizing a large training dataset, you can process it on a distributed Spark cluster. However, if you need to do real-time inference, DataVec also provides a local executor that doesn't require any additional setup.

## Executing a transform process

Once you've created your `TransformProcess` using your `Schema`, and you've either loaded your dataset into an Apache Spark `JavaRDD` or have a `RecordReader` that loads your dataset, you can execute a transform.

Locally this looks like:

```java
import org.datavec.local.transforms.LocalTransformExecutor;

List<List<Writable>> transformed = LocalTransformExecutor.execute(recordReader, transformProcess);
List<List<List<Writable>>> transformedSeq = LocalTransformExecutor.executeToSequence(sequenceReader, transformProcess);
List<List<Writable>> joined = LocalTransformExecutor.executeJoin(join, leftReader, rightReader);
```

When using Spark this looks like:

```java
import org.datavec.spark.transform.SparkTransformExecutor;

JavaRDD<List<Writable>> transformed = SparkTransformExecutor.execute(inputRdd, transformProcess);
JavaRDD<List<List<Writable>>> transformedSeq = SparkTransformExecutor.executeToSequence(inputSequenceRdd, transformProcess);
JavaRDD<List<Writable>> joined = SparkTransformExecutor.executeJoin(join, leftRdd, rightRdd);
```

## Available executors

{{autogenerated}}
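
For reference, here is a minimal, self-contained sketch that builds a `Schema` and a `TransformProcess` and runs them through the local executor. The column names, example values, and class name are illustrative, and it assumes `LocalTransformExecutor.execute` accepts in-memory records as a `List<List<Writable>>`; adapt it to however your data is actually loaded (for example, from a `RecordReader`).

```java
import java.util.Arrays;
import java.util.List;

import org.datavec.api.transform.MathOp;
import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.schema.Schema;
import org.datavec.api.writable.DoubleWritable;
import org.datavec.api.writable.IntWritable;
import org.datavec.api.writable.Writable;
import org.datavec.local.transforms.LocalTransformExecutor;

public class LocalExecutorSketch {
    public static void main(String[] args) {
        // Describe the raw input: an integer "label" column and a double "feature" column
        Schema inputSchema = new Schema.Builder()
                .addColumnInteger("label")
                .addColumnDouble("feature")
                .build();

        // Define a simple transform: scale the "feature" column by 2.0
        TransformProcess transformProcess = new TransformProcess.Builder(inputSchema)
                .doubleMathOp("feature", MathOp.Multiply, 2.0)
                .build();

        // Two in-memory records matching the schema (in practice these would
        // typically come from a RecordReader or another data source)
        List<List<Writable>> records = Arrays.asList(
                Arrays.<Writable>asList(new IntWritable(0), new DoubleWritable(1.5)),
                Arrays.<Writable>asList(new IntWritable(1), new DoubleWritable(2.5)));

        // Execute the transform locally and print the transformed records
        List<List<Writable>> transformed = LocalTransformExecutor.execute(records, transformProcess);
        System.out.println(transformed);
    }
}
```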