---
title: DataVec Transforms
short_title: Transforms
description: Data wrangling and mapping from one schema to another.
category: DataVec
weight: 1
---

## Data wrangling

Transformations are one of the key tools in DataVec. DataVec helps you map a dataset from one schema to another, and provides a list of operations to convert types, format data, and convert a 2D dataset to sequence data.

## Building a transform process

A transform process requires a `Schema` to transform data. Both the schema and the transform process classes come with a helper `Builder` class, which is useful for organizing code and avoiding complex constructors. Combined, they look like the sample code below. Note how `inputDataSchema` is passed into the `Builder` constructor. Your transform process will fail to compile without it.

```java
import java.util.Arrays;
import java.util.HashSet;

import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.condition.ConditionOp;
import org.datavec.api.transform.condition.column.CategoricalColumnCondition;
import org.datavec.api.transform.condition.column.DoubleColumnCondition;
import org.datavec.api.transform.filter.ConditionFilter;
import org.datavec.api.transform.transform.time.DeriveColumnsFromTimeTransform;
import org.datavec.api.writable.DoubleWritable;
import org.joda.time.DateTimeFieldType;
import org.joda.time.DateTimeZone;

TransformProcess tp = new TransformProcess.Builder(inputDataSchema)
    .removeColumns("CustomerID","MerchantID")
    .filter(new ConditionFilter(new CategoricalColumnCondition("MerchantCountryCode", ConditionOp.NotInSet, new HashSet<>(Arrays.asList("USA","CAN")))))
    .conditionalReplaceValueTransform(
        "TransactionAmountUSD",     //Column to operate on
        new DoubleWritable(0.0),    //New value to use, when the condition is satisfied
        new DoubleColumnCondition("TransactionAmountUSD",ConditionOp.LessThan, 0.0)) //Condition: amount < 0.0
    .stringToTimeTransform("DateTimeString","yyyy-MM-dd HH:mm:ss.SSS", DateTimeZone.UTC) //Joda-Time pattern: dd = day of month (DD would be day of year)
    .renameColumn("DateTimeString", "DateTime")
    .transform(new DeriveColumnsFromTimeTransform.Builder("DateTime").addIntegerDerivedColumn("HourOfDay", DateTimeFieldType.hourOfDay()).build())
    .removeColumns("DateTime")
    .build();
```

## Executing a transformation

Different "backends" for executors are available. Using the `tp` transform process above, here's how you can execute it locally using plain DataVec.
```java
import java.util.List;

import org.datavec.api.writable.Writable;
import org.datavec.local.transforms.LocalTransformExecutor;

List<List<Writable>> processedData = LocalTransformExecutor.execute(originalData, tp);
```

## Debugging

Each operation in a transform process represents a "step" in schema changes. Sometimes the resulting transformation is not what you intended. You can debug this by printing the schema after each step in the transform process `tp`:

```java
//Now, print the schema after each time step:
int numActions = tp.getActionList().size();
for(int i=0; i