title | short_title | description | category | weight |
---|---|---|---|---|
DataVec Readers | Readers | Read individual records from different formats. | DataVec | 2 |
Why readers?
Readers iterate over the records of a dataset in storage and load the data into DataVec. They are useful for more than reading individual entries: for example, you may want to train a text generator on a corpus, or programmatically compose two entries into a new record. Reader implementations are also useful for complex file types and distributed storage mechanisms.
Readers return Writable classes that describe each column in a Record. These classes are used to convert each record to a tensor/ND-Array format.
Usage
Each reader implementation extends BaseRecordReader and provides a simple API for selecting the next record in a dataset, acting similarly to iterators.
Useful methods include:
next: Return a batch of Writable.
nextRecord: Return a single Record, optionally with RecordMetaData.
reset: Reset the underlying iterator.
hasNext: Iterator method to determine if another record is available.
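The iterator-style contract in the list above can be sketched without DataVec on the classpath. The class below is a hypothetical in-memory stand-in, not a DataVec class: it stores each record as a list of strings where a real reader would return Writable values, and it borrows only the hasNext, next, and reset method names from the list. Treat it as an illustration of the calling pattern under those assumptions, not as the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a record reader; NOT part of DataVec.
// Real readers return Writable values; plain strings are used here for brevity.
class InMemoryReaderSketch {
    private final List<List<String>> rows;
    private int cursor = 0;

    InMemoryReaderSketch(List<List<String>> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // True while at least one unread record remains.
    boolean hasNext() {
        return cursor < rows.size();
    }

    // Returns the next record and advances the cursor.
    List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more records");
        }
        return rows.get(cursor++);
    }

    // Rewinds to the first record so the data can be read again.
    void reset() {
        cursor = 0;
    }

    // Consumes every remaining record and reports how many were read.
    static int countRemaining(InMemoryReaderSketch reader) {
        int count = 0;
        while (reader.hasNext()) {
            reader.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        InMemoryReaderSketch reader = new InMemoryReaderSketch(Arrays.asList(
                Arrays.asList("5.1", "3.5", "setosa"),
                Arrays.asList("6.2", "2.9", "versicolor")));
        System.out.println(countRemaining(reader)); // prints 2
        reader.reset();                             // start a second pass
        System.out.println(reader.next());          // prints [5.1, 3.5, setosa]
    }
}
```

The same hasNext/next loop followed by reset is the pattern to expect when making several passes over a dataset with a concrete reader.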
Listeners
You can hook a custom RecordListener to a record reader for debugging or visualization purposes. Pass your custom listener to the addListener base method immediately after initializing your class.
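The listener hook can be illustrated with a small standalone sketch. Everything in it is hypothetical: the nested Listener interface and its recordRead callback are simplified stand-ins and do not reproduce the method set of DataVec's RecordListener, and the reader stores records as lists of strings. Only the idea of registering a listener through addListener before reading comes from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical reader that notifies listeners on every read; NOT a DataVec class.
class ListeningReaderSketch {

    // Hypothetical callback; simplified compared with DataVec's RecordListener.
    interface Listener {
        void recordRead(List<String> record);
    }

    private final List<List<String>> rows;
    private final List<Listener> listeners = new ArrayList<>();
    private int cursor = 0;

    ListeningReaderSketch(List<List<String>> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // Register the listener right after construction, before any record is read.
    void addListener(Listener listener) {
        listeners.add(listener);
    }

    boolean hasNext() {
        return cursor < rows.size();
    }

    // Returns the next record after passing it to every registered listener.
    List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more records");
        }
        List<String> record = rows.get(cursor++);
        for (Listener listener : listeners) {
            listener.recordRead(record);
        }
        return record;
    }

    public static void main(String[] args) {
        ListeningReaderSketch reader = new ListeningReaderSketch(Arrays.asList(
                Arrays.asList("a", "1"),
                Arrays.asList("b", "2")));
        List<String> log = new ArrayList<>();
        reader.addListener(record -> log.add("read " + record)); // debugging hook
        while (reader.hasNext()) {
            reader.next();
        }
        System.out.println(log);
    }
}
```

Registering the listener before the first call to next means no record passes through unobserved, which is why the hook is attached immediately after construction.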
Types of readers
{{autogenerated}}