public class MapFileSequenceRecordReader extends Object implements SequenceRecordReader
SequenceRecordReader implementation for reading from a Hadoop MapFile
A typical use case is with a TransformProcess executed on Spark (perhaps Spark
local), followed by non-distributed training on a single machine. For example:

```java
JavaRDD<List<List<Writable>>> myRDD = ...;
String mapFilePath = ...;
SparkStorageUtils.saveMapFileSequences(mapFilePath, myRDD);

SequenceRecordReader rr = new MapFileSequenceRecordReader();
rr.initialize(new FileSplit(new File(mapFilePath)));
//Pass to DataSetIterator or similar
```
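Each sequence returned by this reader is a nested list: the outer list indexes time steps, and each inner list holds the values for one step. A minimal plain-JDK sketch of that shape (Strings stand in for DataVec `Writable` values, so no Hadoop or DataVec dependency is needed):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SequenceShapeSketch {
    // Build a toy sequence in the nested-list shape the reader returns:
    // outer list = time steps, inner list = values at one time step.
    public static List<List<String>> toySequence() {
        List<List<String>> sequence = new ArrayList<>();
        sequence.add(Arrays.asList("step0-col0", "step0-col1"));
        sequence.add(Arrays.asList("step1-col0", "step1-col1"));
        sequence.add(Arrays.asList("step2-col0", "step2-col1"));
        return sequence;
    }

    public static void main(String[] args) {
        List<List<String>> sequence = toySequence();
        System.out.println("time steps: " + sequence.size());              // 3
        System.out.println("columns per step: " + sequence.get(0).size()); // 2
    }
}
```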
Alternatively, use MapFileSequenceRecordWriter.

Fields: APPEND_LABEL, LABELS, NAME_SPACE

| Constructor and Description |
|---|
| MapFileSequenceRecordReader() Create a MapFileSequenceRecordReader with no randomisation, and assuming MapFile keys are LongWritable values |
| MapFileSequenceRecordReader(IndexToKey indexToKey, Random rng) Create a MapFileSequenceRecordReader with optional randomisation, with a custom IndexToKey instance to handle MapFile keys |
| MapFileSequenceRecordReader(Random rng) Create a MapFileSequenceRecordReader with optional randomisation, and assuming MapFile keys are LongWritable values |
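Conceptually, the reader addresses each sequence by a long index, converts that index to a MapFile key via an IndexToKey implementation, and, when a Random is supplied, visits the indices in shuffled order. The sketch below (plain JDK, illustrative names only, not the DataVec implementation) shows that index-shuffling idea:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ReadOrderSketch {
    // Return the order in which sequence indices would be visited:
    // natural order when rng is null, shuffled order otherwise.
    public static List<Long> readOrder(long numSequences, Random rng) {
        List<Long> order = new ArrayList<>();
        for (long i = 0; i < numSequences; i++) {
            order.add(i);
        }
        if (rng != null) {
            Collections.shuffle(order, rng);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(readOrder(5, null));              // [0, 1, 2, 3, 4]
        System.out.println(readOrder(5, new Random(12345))); // some permutation of 0..4
    }
}
```

Passing the same seeded Random reproduces the same read order across epochs after reset().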
public MapFileSequenceRecordReader()

Create a MapFileSequenceRecordReader with no randomisation, and assuming MapFile keys are LongWritable values.

public MapFileSequenceRecordReader(Random rng)

Create a MapFileSequenceRecordReader with optional randomisation, and assuming MapFile keys are LongWritable values.
rng - If non-null, will be used to randomize the order of examples

public MapFileSequenceRecordReader(IndexToKey indexToKey, Random rng)

Create a MapFileSequenceRecordReader with optional randomisation, with a custom IndexToKey instance to handle MapFile keys.
indexToKey - Handles conversion between long indices and key values (see for example LongIndexToKey)
rng - If non-null, will be used to randomize the order of examples

public void initialize(InputSplit split) throws IOException, InterruptedException
Specified by: initialize in interface RecordReader
Throws: IOException, InterruptedException

public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException

Specified by: initialize in interface RecordReader
Throws: IOException, InterruptedException

public void setConf(Configuration conf)

Specified by: setConf in interface Configurable

public Configuration getConf()

Specified by: getConf in interface Configurable

public List<List<Writable>> sequenceRecord()

Specified by: sequenceRecord in interface SequenceRecordReader

public List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream) throws IOException

Specified by: sequenceRecord in interface SequenceRecordReader
Throws: IOException

public SequenceRecord nextSequence()

Specified by: nextSequence in interface SequenceRecordReader

public SequenceRecord loadSequenceFromMetaData(@NonNull RecordMetaData recordMetaData) throws IOException

Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Throws: IOException

public List<SequenceRecord> loadSequenceFromMetaData(@NonNull List<RecordMetaData> recordMetaDatas) throws IOException

Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Throws: IOException

public boolean batchesSupported()

Specified by: batchesSupported in interface RecordReader

public List<Writable> next(int num)

Specified by: next in interface RecordReader

public List<Writable> next()

Specified by: next in interface RecordReader

public boolean hasNext()

Specified by: hasNext in interface RecordReader

public List<String> getLabels()

Specified by: getLabels in interface RecordReader

public void reset()

Specified by: reset in interface RecordReader

public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException

Specified by: record in interface RecordReader
Throws: IOException

public Record nextRecord()

Specified by: nextRecord in interface RecordReader

public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException

Specified by: loadFromMetaData in interface RecordReader
Throws: IOException

public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException

Specified by: loadFromMetaData in interface RecordReader
Throws: IOException

public List<RecordListener> getListeners()

Specified by: getListeners in interface RecordReader

public void setListeners(RecordListener... listeners)

Specified by: setListeners in interface RecordReader

public void setListeners(Collection<RecordListener> listeners)

Specified by: setListeners in interface RecordReader

public void close() throws IOException

Specified by: close in interface Closeable, close in interface AutoCloseable
Throws: IOException

Copyright © 2017. All rights reserved.