public class MapFileRecordReader extends Object implements RecordReader

A RecordReader implementation for reading records from a Hadoop MapFile.
A typical use case is with a TransformProcess executed on Spark (perhaps Spark local), followed by non-distributed training on a single machine. For example:

```java
JavaRDD<List<Writable>> myRDD = ...;
String mapFilePath = ...;
SparkStorageUtils.saveMapFile(mapFilePath, myRDD);

RecordReader rr = new MapFileRecordReader();
rr.initialize(new FileSplit(new File(mapFilePath)));
//Pass to DataSetIterator or similar
```
Alternatively, use MapFileRecordWriter.

| Constructor and Description |
|---|
| `MapFileRecordReader()` Create a MapFileRecordReader with no randomisation, assuming MapFile keys are `LongWritable` values |
| `MapFileRecordReader(IndexToKey indexToKey, Random rng)` Create a MapFileRecordReader with optional randomisation, with a custom `IndexToKey` instance to handle MapFile keys |
| `MapFileRecordReader(Random rng)` Create a MapFileRecordReader with optional randomisation, assuming MapFile keys are `LongWritable` values |
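The `rng` parameter controls example order: when non-null, the reader visits record indices in a shuffled order rather than sequentially. A minimal, self-contained sketch of that idea in plain Java (a `List` stands in for the MapFile; this is not the DataVec implementation):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Conceptual sketch: randomised visiting order over an indexed store,
// as a MapFileRecordReader constructed with a Random might do.
public class ShuffledIndexOrder {
    // Build the order in which record indices 0..numRecords-1 are visited.
    // A null rng means sequential (no randomisation), matching the no-arg constructor.
    public static List<Integer> readOrder(int numRecords, Random rng) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < numRecords; i++) order.add(i);
        if (rng != null) Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        // A plain List standing in for the MapFile's records
        List<String> store = List.of("rec0", "rec1", "rec2", "rec3");
        for (int idx : readOrder(store.size(), new Random(42))) {
            System.out.println(store.get(idx));
        }
    }
}
```

Every record is still visited exactly once per epoch; only the order changes, and a seeded `Random` makes that order reproducible.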
| Modifier and Type | Method and Description |
|---|---|
| `boolean` | `batchesSupported()` |
| `void` | `close()` |
| `Configuration` | `getConf()` |
| `List<String>` | `getLabels()` |
| `List<RecordListener>` | `getListeners()` |
| `boolean` | `hasNext()` |
| `void` | `initialize(Configuration conf, InputSplit split)` |
| `void` | `initialize(InputSplit split)` |
| `List<Record>` | `loadFromMetaData(List<RecordMetaData> recordMetaDatas)` |
| `Record` | `loadFromMetaData(RecordMetaData recordMetaData)` |
| `List<Writable>` | `next()` |
| `List<Writable>` | `next(int num)` |
| `Record` | `nextRecord()` |
| `List<Writable>` | `record(URI uri, DataInputStream dataInputStream)` |
| `void` | `reset()` |
| `void` | `setConf(Configuration conf)` |
| `void` | `setListeners(Collection<RecordListener> listeners)` |
| `void` | `setListeners(RecordListener... listeners)` |
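The two `loadFromMetaData` overloads above let records be re-read later from metadata captured while iterating (for example, a record's position in the file). A minimal, self-contained sketch of that pattern in plain Java — the `Meta` type and list store here are hypothetical stand-ins, not the DataVec `RecordMetaData` API:

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of metadata-based reloading: each record's metadata
// (here just its index) is enough to re-read that record on demand.
public class MetaDataReload {
    // Hypothetical minimal metadata: just the record's position
    record Meta(long index) {}

    private final List<String> records;  // stands in for the MapFile contents

    public MetaDataReload(List<String> records) { this.records = records; }

    // Re-read a single record from its metadata
    public String loadFromMetaData(Meta meta) {
        return records.get((int) meta.index());
    }

    // Re-read several records at once, preserving the requested order
    public List<String> loadFromMetaData(List<Meta> metas) {
        List<String> out = new ArrayList<>();
        for (Meta m : metas) out.add(loadFromMetaData(m));
        return out;
    }

    public static void main(String[] args) {
        MetaDataReload r = new MetaDataReload(List.of("a", "b", "c"));
        System.out.println(r.loadFromMetaData(new Meta(2)));  // prints "c"
    }
}
```

This is why metadata-based loading is useful in practice: one can iterate once (possibly in random order), keep only the lightweight metadata for records of interest, and fetch the full records again later without re-reading the whole file.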
public MapFileRecordReader() throws Exception

Create a MapFileRecordReader with no randomisation, assuming MapFile keys are LongWritable values.

Throws: Exception

public MapFileRecordReader(Random rng)

Create a MapFileRecordReader with optional randomisation, assuming MapFile keys are LongWritable values.

rng - If non-null, will be used to randomize the order of examples

public MapFileRecordReader(IndexToKey indexToKey, Random rng)

Create a MapFileRecordReader with optional randomisation, with a custom IndexToKey instance to handle MapFile keys.

indexToKey - Handles conversion between long indices and key values (see for example LongIndexToKey)
rng - If non-null, will be used to randomize the order of examples

public void initialize(InputSplit split) throws IOException, InterruptedException

Specified by: initialize in interface RecordReader
Throws: IOException, InterruptedException

public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException

Specified by: initialize in interface RecordReader
Throws: IOException, InterruptedException

public void setConf(Configuration conf)

Specified by: setConf in interface Configurable

public Configuration getConf()

Specified by: getConf in interface Configurable

public boolean batchesSupported()

Specified by: batchesSupported in interface RecordReader

public List<Writable> next(int num)

Specified by: next in interface RecordReader

public List<Writable> next()

Specified by: next in interface RecordReader

public boolean hasNext()

Specified by: hasNext in interface RecordReader

public List<String> getLabels()

Specified by: getLabels in interface RecordReader

public void reset()

Specified by: reset in interface RecordReader

public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException

Specified by: record in interface RecordReader
Throws: IOException

public Record nextRecord()

Specified by: nextRecord in interface RecordReader

public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException

Specified by: loadFromMetaData in interface RecordReader
Throws: IOException

public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException

Specified by: loadFromMetaData in interface RecordReader
Throws: IOException

public List<RecordListener> getListeners()

Specified by: getListeners in interface RecordReader

public void setListeners(RecordListener... listeners)

Specified by: setListeners in interface RecordReader

public void setListeners(Collection<RecordListener> listeners)

Specified by: setListeners in interface RecordReader

public void close() throws IOException

Specified by: close in interface Closeable, close in interface AutoCloseable
Throws: IOException

Copyright © 2017. All rights reserved.