Hadoop Streaming in Practice: aggregate

http://blog.csdn.net/yfkiss/article/details/7019022

 

2. aggregate class summary

DoubleValueSum This class implements a value aggregator that sums up a sequence of double values.
LongValueMax This class implements a value aggregator that maintains the maximum of a sequence of long values.
LongValueMin This class implements a value aggregator that maintains the minimum of a sequence of long values.
LongValueSum This class implements a value aggregator that sums up a sequence of long values.
StringValueMax This class implements a value aggregator that maintains the biggest of a sequence of strings.
StringValueMin This class implements a value aggregator that maintains the smallest of a sequence of strings.
UniqValueCount This class implements a value aggregator that dedupes a sequence of objects.
UserDefinedValueAggregatorDescriptor This class implements a wrapper for a user defined value aggregator descriptor.
ValueAggregatorBaseDescriptor This class implements the common functionalities of the subclasses of ValueAggregatorDescriptor class.
ValueAggregatorCombiner<K1 extends WritableComparable,V1 extends Writable> This class implements the generic combiner of Aggregate.
ValueAggregatorJob This is the main class for creating a map/reduce job using the Aggregate framework.
ValueAggregatorJobBase<K1 extends WritableComparable,V1 extends Writable> This abstract class implements some common functionalities of the generic mapper, reducer and combiner classes of Aggregate.
ValueAggregatorMapper<K1 extends WritableComparable,V1 extends Writable> This class implements the generic mapper of Aggregate.
ValueAggregatorReducer<K1 extends WritableComparable,V1 extends Writable> This class implements the generic reducer of Aggregate.
ValueHistogram This class implements a value aggregator that computes the histogram of a sequence of strings.
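
To make the summaries above concrete, here is a minimal local sketch of what a few of these aggregators compute over a sequence of values. It is plain Python written for illustration, not the Hadoop implementation: the `AGGREGATORS` table and the `aggregate` helper are hypothetical names, values are taken as strings (as they would arrive from a streaming mapper), and string comparison follows Python's ordering, which is assumed to match Java's for ASCII input.

```python
# Local stand-ins for some Aggregate classes (illustrative only, not Hadoop code).
AGGREGATORS = {
    "LongValueSum": lambda values: sum(int(v) for v in values),
    "LongValueMax": lambda values: max(int(v) for v in values),
    "LongValueMin": lambda values: min(int(v) for v in values),
    "DoubleValueSum": lambda values: sum(float(v) for v in values),
    "StringValueMax": lambda values: max(values),
    "StringValueMin": lambda values: min(values),
    # UniqValueCount reports how many distinct values were seen.
    "UniqValueCount": lambda values: len(set(values)),
}


def aggregate(function, values):
    """Apply the named aggregator to a sequence of string values."""
    return AGGREGATORS[function](list(values))
```

For example, `aggregate("LongValueSum", ["1", "2", "3"])` adds the three values, while `aggregate("UniqValueCount", ["a", "b", "a"])` counts the distinct ones.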

 

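3. Using aggregate in streaming
Prefix each record the mapper task emits with the name of an aggregator function, in this form (\t is a tab character):
function:key\tvalue
e.g.:
LongValueSum:key\tvalue
In addition, set the -reducer option to aggregate. The reducer then applies the Aggregate class named by function to the values that share a key. For example, with function set to LongValueSum, the values for each key are summed.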
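
As an illustration of the record format described above, here is a minimal sketch of a word-count mapper written for Hadoop Streaming. The script and the helper name `to_records` are hypothetical; the sketch assumes whitespace-separated words on each input line and emits one `LongValueSum:word\t1` record per word, so that the aggregate reducer can sum the counts per word.

```python
#!/usr/bin/env python
import sys


def to_records(line, function="LongValueSum"):
    """Turn one input line into 'function:key<TAB>value' records, one per word."""
    return ["%s:%s\t1" % (function, word) for word in line.split()]


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Streaming feeds input lines on stdin and reads records from stdout.
    for line in stdin:
        for record in to_records(line):
            stdout.write(record + "\n")


if __name__ == "__main__":
    main()
```

A job would pass this script as the `-mapper` and use `-reducer aggregate`; the streaming jar location and the input and output paths depend on the installation.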
