public class HadoopSpillableTupleList extends SpillableTupleList
An Iterable object that can store an unlimited number of Tuple instances by spilling excess to a temporary disk file.
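The spilling behaviour described above can be pictured with a small, self-contained sketch. This is not the Cascading implementation: the class `SpillingStringList`, its line-per-item file format, and its method names are hypothetical, and it stores plain strings (assumed to contain no line breaks) rather than Tuple instances. It only shows the general idea of keeping at most `threshold` items in memory and writing the overflow to temporary files.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not the Cascading implementation.
class SpillingStringList implements Iterable<String> {
    private final int threshold;                        // max items held in memory
    private final List<String> memory = new ArrayList<>();
    private final List<File> spills = new ArrayList<>(); // one temp file per spill
    private int size = 0;

    SpillingStringList(int threshold) {
        if (threshold < 1) throw new IllegalArgumentException("threshold must be >= 1");
        this.threshold = threshold;
    }

    void add(String item) {
        if (memory.size() >= threshold) spill();        // buffer full: flush it to disk first
        memory.add(item);
        size++;
    }

    // Writes the in-memory items to a new temp file, one item per line, then clears the buffer.
    private void spill() {
        try {
            File file = File.createTempFile("spill-sketch", ".txt");
            file.deleteOnExit();
            Files.write(file.toPath(), memory, StandardCharsets.UTF_8);
            spills.add(file);
            memory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int size() { return size; }

    int spillCount() { return spills.size(); }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int nextSpill = 0;
            private boolean memoryDone = false;
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Walk the spill files in order, loading one file at a time, then the memory buffer.
                while (!current.hasNext()) {
                    if (nextSpill < spills.size()) {
                        current = readSpill(spills.get(nextSpill++)).iterator();
                    } else if (!memoryDone) {
                        memoryDone = true;
                        current = memory.iterator();
                    } else {
                        return false;
                    }
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                return current.next();
            }
        };
    }

    private static List<String> readSpill(File file) {
        try {
            return Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a threshold of 2, adding five items in this sketch produces two spill files and leaves one item in memory; iteration still returns all five in insertion order.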
Spills will automatically be compressed using the defaultCodecs values. To disable compression or change the codecs, see SpillableProps.SPILL_COMPRESS and SpillableProps.SPILL_CODECS.
It is recommended to add Lzo if available, for example:
"org.apache.hadoop.io.compress.LzoCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec"
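As a rough sketch of how these two settings might be supplied, the snippet below builds plain `java.util.Properties` objects. The helper class `SpillCompressionConfigSketch` is hypothetical, and the literal key strings are assumptions about the values behind SpillableProps.SPILL_COMPRESS and SpillableProps.SPILL_CODECS; real code should reference those constants directly rather than copying the strings.

```java
import java.util.Properties;

// Hypothetical helper; not part of Cascading.
class SpillCompressionConfigSketch {
    // ASSUMPTION: literal values of SpillableProps.SPILL_COMPRESS and SpillableProps.SPILL_CODECS.
    static final String SPILL_COMPRESS_KEY = "cascading.spill.compress";
    static final String SPILL_CODECS_KEY = "cascading.spill.codecs";

    // Properties that turn spill compression off.
    static Properties withoutCompression() {
        Properties properties = new Properties();
        properties.setProperty(SPILL_COMPRESS_KEY, "false");
        return properties;
    }

    // Properties that list codec class names in order of preference, comma separated.
    static Properties withCodecs(String... codecClassNames) {
        Properties properties = new Properties();
        properties.setProperty(SPILL_CODECS_KEY, String.join(",", codecClassNames));
        return properties;
    }
}
```

For instance, `withCodecs("org.apache.hadoop.io.compress.LzoCodec", "org.apache.hadoop.io.compress.GzipCodec")` yields a codec list with Lzo first, matching the recommendation above.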
Spillable.SpillListener, Spillable.SpillStrategy
Modifier and Type | Field and Description |
---|---|
static java.lang.String | defaultCodecs |
SPILL_CODECS, SPILL_COMPRESS, SPILL_THRESHOLD
Constructor and Description |
---|
HadoopSpillableTupleList(int threshold, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.mapred.JobConf jobConf) Constructor SpillableTupleList creates a new SpillableTupleList instance using the given threshold value, and the first available compression codec, if any. |
HadoopSpillableTupleList(int threshold, TupleSerialization tupleSerialization, org.apache.hadoop.io.compress.CompressionCodec codec) |
Modifier and Type | Method and Description |
---|---|
protected TupleInputStream | createTupleInputStream(java.io.File file) |
protected TupleOutputStream | createTupleOutputStream(java.io.File file) |
static org.apache.hadoop.io.compress.CompressionCodec | getCodec(FlowProcess flowProcess, java.lang.String defaultCodecs) |
add, addAll, clear, contains, containsAll, getCodecClass, getGrouping, getThreshold, isEmpty, iterator, remove, removeAll, retainAll, setGrouping, setSpillListener, setSpillStrategy, size, spillCount, toArray, toArray
public static final java.lang.String defaultCodecs
public HadoopSpillableTupleList(int threshold, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.mapred.JobConf jobConf)
threshold - of type int
codec - of type CompressionCodec
public HadoopSpillableTupleList(int threshold, TupleSerialization tupleSerialization, org.apache.hadoop.io.compress.CompressionCodec codec)
public static org.apache.hadoop.io.compress.CompressionCodec getCodec(FlowProcess flowProcess, java.lang.String defaultCodecs)
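The page does not say how getCodec chooses among the names in defaultCodecs, beyond the constructor summary's mention of "the first available compression codec, if any". One plausible reading, sketched below under that assumption, is to walk the comma-separated class names and return the first one that can be loaded. The class `FirstAvailableClassSketch` and its method are hypothetical, and the sketch uses plain JDK class loading rather than Hadoop's codec machinery.

```java
// Hypothetical illustration; not the actual getCodec implementation.
class FirstAvailableClassSketch {
    // Returns the first class in the comma-separated list that can be loaded, or null if none can.
    static Class<?> firstAvailable(String commaSeparatedNames) {
        if (commaSeparatedNames == null) return null;
        for (String name : commaSeparatedNames.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) continue;            // tolerate stray commas
            try {
                return Class.forName(trimmed);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not usable on this classpath (for example, Lzo is not installed); try the next name.
            }
        }
        return null;
    }
}
```

Under this reading, listing an optional codec such as Lzo first costs nothing when it is absent, because the lookup simply falls through to the next entry.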
protected TupleOutputStream createTupleOutputStream(java.io.File file)
createTupleOutputStream in class SpillableTupleList
protected TupleInputStream createTupleInputStream(java.io.File file)
createTupleInputStream in class SpillableTupleList