Aggregatorby the fact that it operates on unique groups of values. It differs by the fact that an
Iteratoris provided and it is the responsibility of the
operate(cascading.flow.FlowProcess, BufferCall)method to iterate overall all the input arguments returned by this Iterator, if any. For the case where a Buffer follows a CoGroup, the method
operate(cascading.flow.FlowProcess, BufferCall)will be called for every unique group whether or not there are values available to iterate over. This may be counter-intuitive for the case of an 'inner join' where the left or right stream may have a null grouping key value. Regardless, the current grouping value can be retrieved through
BufferCall.getGroup(). Buffer is very useful when header or footer values need to be inserted into a grouping, or if values need to be inserted into the middle of the group values. For example, consider a stream of timestamps. A Buffer could be used to add missing entries, or to calculate running or moving averages over a smaller "window" within the grouping. By default, if a result is emitted from the Buffer before the argumentsIterator is started or after it is completed (
argumentsIterator.hasNext() == false), non-grouping values are forced to null (to allow for header and footer tuple results). By setting
Operation.prepare(cascading.flow.FlowProcess, OperationCall)method, the last seen Tuple values will not be nulled after completion and will be treated as the current incoming Tuple when merged with the Buffer result Tuple via the Every outgoing selector. There may be only one Buffer after a
CoGroup. And there may not be any additional
Everypipes before or after the buffers Every pipe instance. A
PlannerExceptionwill be thrown if these rules are violated. Buffer implementations should be re-entrant. There is no guarantee a Buffer instance will be executed in a unique vm, or by a single thread. Also, note the Iterator will return the same
TupleEntryinstance, but with new values in its child
Tuple. As of Cascading 2.5, if the previous CoGroup uses a
Joiner, a Buffer may be used to implement differing Joiner strategies. Instead of calling
BufferCall.getArgumentsIterator()(which will return null),
BufferCall.getJoinerClosure()will return an
JoinerClosureinstance with direct access to each CoGrouped Iterator.
|Modifier and Type||Method and Description|
Method operate is called once for each grouping.
void operate(FlowProcess flowProcess, BufferCall<Context> bufferCall)
BufferCallpasses in an
Iteratorthat returns an argument
TupleEntryfor each value in the grouping defined by the argument selector on the parent Every pipe instance. TupleEntry entry, or entry.getTuple() should not be stored directly in a collection or modified. A copy of the tuple should be made via the
new Tuple( entry.getTuple() )copy constructor. This method is called for every unique group, whether or not there are values in the arguments Iterator.
flowProcess- of type FlowProcess
bufferCall- of type BufferCall
Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.