public class MapFileSequenceRecordWriter extends AbstractMapFileWriter<List<List<Writable>>> implements SequenceRecordWriter
See Also: MapFileSequenceRecordReader

Fields inherited from class AbstractMapFileWriter: convertTextTo, counter, DEFAULT_FILENAME_PATTERN, DEFAULT_INDEX_INTERVAL, DEFAULT_MAP_FILE_SPLIT_SIZE, filenamePattern, hadoopConfiguration, indexInterval, isClosed, KEY_CLASS, MAP_FILE_INDEX_INTERVAL_KEY, mapFileSplitSize, opts, outputDir, outputFiles, writers

Fields inherited from interface SequenceRecordWriter: APPEND

| Constructor and Description |
|---|
| MapFileSequenceRecordWriter(File outputDir) - Constructor for all default values. |
| MapFileSequenceRecordWriter(File outputDir, int mapFileSplitSize) - Constructor for most default values. |
| MapFileSequenceRecordWriter(File outputDir, int mapFileSplitSize, WritableType convertTextTo) |
| MapFileSequenceRecordWriter(File outputDir, int mapFileSplitSize, WritableType convertTextTo, org.apache.hadoop.conf.Configuration hadoopConfiguration) |
| MapFileSequenceRecordWriter(File outputDir, int mapFileSplitSize, WritableType convertTextTo, int indexInterval, org.apache.hadoop.conf.Configuration hadoopConfiguration) |
| MapFileSequenceRecordWriter(File outputDir, int mapFileSplitSize, WritableType convertTextTo, int indexInterval, String filenamePattern, org.apache.hadoop.conf.Configuration hadoopConfiguration) |
| MapFileSequenceRecordWriter(File outputDir, WritableType convertTextTo) |
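As the constructor details below describe, mapFileSplitSize caps the number of records per map file (0 means a single file for all output). The resulting number of output map files can be sketched with simple ceiling division; the record counts here are hypothetical, for illustration only:

```java
public class SplitSizeSketch {
    // Ceiling division: number of map files needed for `totalRecords`
    // when each file holds at most `splitSize` records.
    static int numMapFiles(int totalRecords, int splitSize) {
        if (splitSize <= 0) {
            return 1; // split size 0 means a single map file for all output
        }
        return (totalRecords + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        System.out.println(numMapFiles(2500, 1000)); // 3 files of at most 1000 records each
        System.out.println(numMapFiles(2500, 0));    // 1 file (no splitting)
    }
}
```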
| Modifier and Type | Method and Description |
|---|---|
| protected org.apache.hadoop.io.Writable | getHadoopWritable(List<List<Writable>> input) |
| protected Class<? extends org.apache.hadoop.io.Writable> | getValueClass() |
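The filenamePattern constructor argument (documented in the constructor details below) is expanded with String.format(pattern, int) to name each split map file. A quick sketch of how such a pattern behaves; the pattern value here is illustrative, not necessarily the class's DEFAULT_FILENAME_PATTERN:

```java
public class FilenamePatternSketch {
    public static void main(String[] args) {
        // Hypothetical pattern: fixed prefix plus a zero-padded five-digit file index
        String pattern = "part-r-%05d";
        System.out.println(String.format(pattern, 0)); // part-r-00000
        System.out.println(String.format(pattern, 7)); // part-r-00007
    }
}
```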
Methods inherited from class AbstractMapFileWriter: close, convertTextWritables, getConf, setConf, write

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface SequenceRecordWriter: close, write

Methods inherited from interface Configurable: getConf, setConf

public MapFileSequenceRecordWriter(File outputDir)
Constructor for all default values.
Parameters:
outputDir - Output directory for the map file(s)

public MapFileSequenceRecordWriter(@NonNull File outputDir, int mapFileSplitSize)
Parameters:
outputDir - Output directory for the map file(s)
mapFileSplitSize - Split size for the map file: if 0, use a single map file for all output. If > 0, multiple map files will be used: each will contain a maximum of mapFileSplitSize examples. This can be used to avoid having a single multi-gigabyte map file, which may be undesirable in some cases (transfer across the network, for example).

public MapFileSequenceRecordWriter(@NonNull File outputDir, WritableType convertTextTo)
Parameters:
outputDir - Output directory for the map file(s)
convertTextTo - If null: make no changes to Text writable objects. If non-null, Text writable instances will be converted to this type. This is useful when you would rather store numerical values, even if the original record reader produces strings/text.

public MapFileSequenceRecordWriter(@NonNull File outputDir, int mapFileSplitSize, WritableType convertTextTo)
Parameters:
outputDir - Output directory for the map file(s)
mapFileSplitSize - Split size for the map file: if 0, use a single map file for all output. If > 0, multiple map files will be used: each will contain a maximum of mapFileSplitSize examples. This can be used to avoid having a single multi-gigabyte map file, which may be undesirable in some cases (transfer across the network, for example).
convertTextTo - If null: make no changes to Text writable objects. If non-null, Text writable instances will be converted to this type. This is useful when you would rather store numerical values, even if the original record reader produces strings/text.

public MapFileSequenceRecordWriter(@NonNull File outputDir, int mapFileSplitSize, WritableType convertTextTo, org.apache.hadoop.conf.Configuration hadoopConfiguration)
Parameters:
outputDir - Output directory for the map file(s)
mapFileSplitSize - Split size for the map file: if 0, use a single map file for all output. If > 0, multiple map files will be used: each will contain a maximum of mapFileSplitSize examples. This can be used to avoid having a single multi-gigabyte map file, which may be undesirable in some cases (transfer across the network, for example).
convertTextTo - If null: make no changes to Text writable objects. If non-null, Text writable instances will be converted to this type. This is useful when you would rather store numerical values, even if the original record reader produces strings/text.
hadoopConfiguration - Hadoop configuration.

public MapFileSequenceRecordWriter(@NonNull File outputDir, int mapFileSplitSize, WritableType convertTextTo, int indexInterval, org.apache.hadoop.conf.Configuration hadoopConfiguration)
Parameters:
outputDir - Output directory for the map file(s)
mapFileSplitSize - Split size for the map file: if 0, use a single map file for all output. If > 0, multiple map files will be used: each will contain a maximum of mapFileSplitSize examples. This can be used to avoid having a single multi-gigabyte map file, which may be undesirable in some cases (transfer across the network, for example).
convertTextTo - If null: make no changes to Text writable objects. If non-null, Text writable instances will be converted to this type. This is useful when you would rather store numerical values, even if the original record reader produces strings/text.
indexInterval - Index interval for the map file. Defaults to 1, which is suitable for most cases.
hadoopConfiguration - Hadoop configuration.

public MapFileSequenceRecordWriter(@NonNull File outputDir, int mapFileSplitSize, WritableType convertTextTo, int indexInterval, String filenamePattern, org.apache.hadoop.conf.Configuration hadoopConfiguration)
Parameters:
outputDir - Output directory for the map file(s)
mapFileSplitSize - Split size for the map file: if 0, use a single map file for all output. If > 0, multiple map files will be used: each will contain a maximum of mapFileSplitSize examples. This can be used to avoid having a single multi-gigabyte map file, which may be undesirable in some cases (transfer across the network, for example).
convertTextTo - If null: make no changes to Text writable objects. If non-null, Text writable instances will be converted to this type. This is useful when you would rather store numerical values, even if the original record reader produces strings/text.
indexInterval - Index interval for the map file. Defaults to 1, which is suitable for most cases.
filenamePattern - The naming pattern for the map files. Used with String.format(pattern, int).
hadoopConfiguration - Hadoop configuration.

protected Class<? extends org.apache.hadoop.io.Writable> getValueClass()
Specified by: getValueClass in class AbstractMapFileWriter<List<List<Writable>>>

protected org.apache.hadoop.io.Writable getHadoopWritable(List<List<Writable>> input)
Specified by: getHadoopWritable in class AbstractMapFileWriter<List<List<Writable>>>

Copyright © 2017. All rights reserved.