java.lang.Object cascading.pipe.Pipe cascading.pipe.Splice cascading.pipe.GroupBy
public class GroupBy
The GroupBy pipe groups the Tuple stream by the given groupFields.
If more than one Pipe instance is provided on the constructor, all branches will be merged. All Pipe instances must output the same field names; otherwise the FlowConnector will fail to create a Flow instance. Note that the Pipe instances are merged together as if they were one Tuple stream; they are not joined. See CoGroup for joining by common fields.
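To make the merge-then-group behavior concrete, here is a minimal plain-Java sketch. It does not use Cascading; the class `MergeThenGroupSketch` and its helpers are hypothetical names chosen for illustration, and the two-column (key, value) tuple shape is an assumption. It only shows that the branches are concatenated into one stream before grouping, rather than being matched against each other as a join would.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this is not Cascading code.
final class MergeThenGroupSketch {

    // Hypothetical helper: builds a (key, value) tuple.
    static Map.Entry<String, Integer> tuple(String key, int value) {
        return new SimpleEntry<String, Integer>(key, value);
    }

    // Concatenates both branches into one stream, then groups values by key.
    static Map<String, List<Integer>> mergeAndGroup(
            List<Map.Entry<String, Integer>> lhs,
            List<Map.Entry<String, Integer>> rhs) {
        List<Map.Entry<String, Integer>> merged = new ArrayList<>(lhs);
        merged.addAll(rhs); // merge: no matching of lhs rows against rhs rows

        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> t : merged) {
            groups.computeIfAbsent(t.getKey(), k -> new ArrayList<>()).add(t.getValue());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> lhs = Arrays.asList(tuple("a", 1), tuple("b", 2));
        List<Map.Entry<String, Integer>> rhs = Arrays.asList(tuple("a", 3));
        System.out.println(mergeAndGroup(lhs, rhs));
    }
}
```

Both branches here share the same field layout, which mirrors the requirement that all merged Pipe instances output the same field names.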
Typically an Every follows GroupBy to apply an Aggregator function to every grouping. The Each operator may also follow GroupBy to apply a Function or Filter to the resulting stream. But an Each cannot come immediately before an Every.
Optionally, a stream can be further sorted by providing sortFields. This allows an Aggregator to receive values in the order of the sortFields.
Note that local sorting always happens on the groupFields; sortFields are a secondary sort on the grouped values within the current grouping. sortFields is particularly useful if the Aggregators following the GroupBy need to see their arguments in order.
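The relationship between the primary sort on the grouping key and the secondary sort on the values can be sketched in plain Java. This is an illustration only, not Cascading code: `SecondarySortSketch` and its helpers are hypothetical, and the single-key, single-value tuple shape is an assumption.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is not Cascading code.
final class SecondarySortSketch {

    // Hypothetical helper: builds a (groupKey, value) tuple.
    static Map.Entry<String, Integer> tuple(String key, int value) {
        return new SimpleEntry<String, Integer>(key, value);
    }

    // Sorts by group key first, then by value within each key, and collects
    // the values per group in the order they would reach an aggregator.
    static Map<String, List<Integer>> groupWithSecondarySort(
            List<Map.Entry<String, Integer>> rows) {
        Comparator<Map.Entry<String, Integer>> order = (a, b) -> {
            int byKey = a.getKey().compareTo(b.getKey());                        // primary: group key
            return byKey != 0 ? byKey : a.getValue().compareTo(b.getValue());    // secondary: value
        };
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(rows);
        sorted.sort(order);

        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> row : sorted) {
            groups.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(groupWithSecondarySort(Arrays.asList(
                tuple("a", 3), tuple("b", 1), tuple("a", 1), tuple("a", 2))));
    }
}
```

With the values already ordered inside each group, an aggregation such as "first value per group" becomes a matter of reading the first element.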
For more control over sorting at the group or secondary sort level, use Fields containing Comparator instances for the appropriate fields when setting the groupFields or sortFields values. Fields allows you to set a custom Comparator instance for each field name or position. Each Comparator class must also be Serializable.
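The Serializable requirement is easy to miss, so here is a small sketch using only the Java standard library. `SerializableComparatorSketch`, `CaseInsensitiveComparator`, and `survivesSerialization` are hypothetical names; the sketch shows a comparator class that declares Serializable and a quick check that an instance survives a serialization round trip. How the comparator is then attached to a Fields instance is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Plain-Java illustration only; this is not Cascading code.
final class SerializableComparatorSketch {

    // A comparator that is also Serializable, as the text above requires.
    static final class CaseInsensitiveComparator implements Comparator<String>, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public int compare(String lhs, String rhs) {
            return lhs.compareToIgnoreCase(rhs);
        }
    }

    // Returns true if the comparator can be written to and read back from a byte stream.
    static boolean survivesSerialization(Comparator<?> comparator) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(comparator);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject() instanceof Comparator;
            }
        } catch (IOException | ClassNotFoundException e) {
            return false; // e.g. NotSerializableException for a non-serializable comparator
        }
    }

    public static void main(String[] args) {
        Comparator<String> plainLambda = (x, y) -> x.compareTo(y); // not Serializable
        System.out.println(survivesSerialization(new CaseInsensitiveComparator()));
        System.out.println(survivesSerialization(plainLambda));
    }
}
```

A check like this can be run in a unit test before the comparator is handed to a flow, since a plain lambda or anonymous class often fails it.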
Note that on MapReduce systems, distributed group sorting is not 'total'. That is, groups are sorted as seen by each Reducer, but they are not sorted across Reducers. See the MapReduce algorithm for details.
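The following plain-Java sketch simulates why the sort is not total. It assumes a simplified hash partitioner that assigns each key to `hashCode mod reducers`; `PartitionedSortSketch` is a hypothetical name, and real partitioning depends on the platform configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java simulation only; this is not Cascading or Hadoop code.
final class PartitionedSortSketch {

    // Assigns each key to a partition by hash (an assumed, simplified partitioner),
    // then sorts each partition independently, as each reducer would.
    static List<List<Integer>> partitionAndSort(List<Integer> keys, int reducers) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < reducers; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Integer key : keys) {
            partitions.get(Math.floorMod(key.hashCode(), reducers)).add(key);
        }
        for (List<Integer> partition : partitions) {
            Collections.sort(partition); // sorted as seen by this reducer only
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Each inner list is sorted, but reading the partitions one after
        // another does not yield a globally sorted sequence.
        System.out.println(partitionAndSort(Arrays.asList(5, 2, 6, 1, 4, 3), 2));
    }
}
```

Each partition is internally ordered, yet concatenating the partitions gives 2, 4, 6, 1, 3, 5, which is the "sorted per Reducer, not across Reducers" effect described above.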
See the Hasher interface when providing a custom Comparator on the grouping keys that makes two values with differing hashCode values equal. For example, new BigDecimal( 100.0D ) and new Double( 100.0D ) are equal using a custom Comparator, but Object.hashCode() will be different, thus forcing each value into differing partitions.
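The mismatch between comparator equality and hash codes can be reproduced with the standard library alone. In this sketch, `NumericHashSketch`, `BY_DOUBLE_VALUE`, and `numericHash` are hypothetical names; `numericHash` stands in for the kind of normalization a Hasher implementation might perform and is not the Cascading interface itself.

```java
import java.math.BigDecimal;
import java.util.Comparator;

// Plain-Java illustration only; this is not Cascading code.
final class NumericHashSketch {

    // A custom comparator that treats numbers as equal when their double values match.
    static final Comparator<Number> BY_DOUBLE_VALUE =
            Comparator.comparingDouble(Number::doubleValue);

    // Hypothetical stand-in for a custom hash: hash the normalized double value
    // so that values the comparator considers equal also hash alike.
    static int numericHash(Number value) {
        return Double.hashCode(value.doubleValue());
    }

    public static void main(String[] args) {
        Number decimal = new BigDecimal(100.0D);
        Number boxed = Double.valueOf(100.0D);

        System.out.println("comparator equal: " + (BY_DOUBLE_VALUE.compare(decimal, boxed) == 0));
        System.out.println("hashCode equal:   " + (decimal.hashCode() == boxed.hashCode()));
        System.out.println("numericHash equal: " + (numericHash(decimal) == numericHash(boxed)));
    }
}
```

The comparator reports the two values as equal while their built-in hash codes differ, which is the situation in which hash-based partitioning would separate them unless a consistent hash is supplied.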
Note that grouping a String key with a lowercase value together with a String key with an uppercase value using a "case insensitive" Comparator will not have consistent results. The grouping will execute and be correct, but the actual values in the key columns may be replaced with "equivalent" values from other streams.
That is, if two streams are merged and then grouped on a key, where the key values are uppercase in one stream and lowercase in the other, the resulting key value for the grouping may arbitrarily be either upper or lower case.
If the original key values must be retained, consider normalizing the keys with a Function and then grouping on the
resulting field.
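A rough plain-Java analogy of this effect uses a TreeMap with a case-insensitive comparator. `CaseInsensitiveGroupingSketch` and its methods are hypothetical names. TreeMap happens to keep whichever equivalent key arrived first, whereas the text above only says the surviving value is arbitrary, so treat this as an analogy rather than a description of the actual mechanism.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java analogy only; this is not Cascading code.
final class CaseInsensitiveGroupingSketch {

    // Counts keys using a case-insensitive comparator. The key text that
    // represents each group depends on arrival order.
    static Map<String, Integer> groupIgnoringCase(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Normalizes each key first, then groups on the normalized value,
    // so the group key no longer depends on arrival order.
    static Map<String, Integer> normalizeThenGroup(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(groupIgnoringCase(Arrays.asList("Foo", "foo", "FOO")));
        System.out.println(groupIgnoringCase(Arrays.asList("foo", "Foo")));
        System.out.println(normalizeThenGroup(Arrays.asList("Foo", "foo", "FOO")));
    }
}
```

The counts are the same either way; only the text of the key that represents the group changes with input order, and normalizing first removes that dependence.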
Field Summary |
---|
Fields inherited from class cascading.pipe.Splice |
---|
declaredFields, keyFieldsMap, resultGroupFields, sortFieldsMap |
Fields inherited from class cascading.pipe.Pipe |
---|
configDef, name, parent, previous, stepConfigDef |
Constructor Summary | |
---|---|
GroupBy(Pipe pipe)
Creates a new GroupBy instance that will group on Fields.ALL fields. |
|
GroupBy(Pipe[] pipes)
Creates a new GroupBy instance that will first merge the given pipes, then group on Fields.ALL fields. |
|
GroupBy(Pipe[] pipes,
Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names. |
|
GroupBy(Pipe[] pipes,
Fields groupFields,
Fields sortFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(Pipe[] pipes,
Fields groupFields,
Fields sortFields,
boolean reverseOrder)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(Pipe pipe,
Fields groupFields)
Creates a new GroupBy instance that will group on the given groupFields field names. |
|
GroupBy(Pipe pipe,
Fields groupFields,
boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names. |
|
GroupBy(Pipe pipe,
Fields groupFields,
Fields sortFields)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(Pipe pipe,
Fields groupFields,
Fields sortFields,
boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(Pipe lhsPipe,
Pipe rhsPipe,
Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names. |
|
GroupBy(String groupName,
Pipe[] pipes,
Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names. |
|
GroupBy(String groupName,
Pipe[] pipes,
Fields groupFields,
Fields sortFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(String groupName,
Pipe[] pipes,
Fields groupFields,
Fields sortFields,
boolean reverseOrder)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(String groupName,
Pipe pipe,
Fields groupFields)
Creates a new GroupBy instance that will group on the given groupFields field names. |
|
GroupBy(String groupName,
Pipe pipe,
Fields groupFields,
boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names. |
|
GroupBy(String groupName,
Pipe pipe,
Fields groupFields,
Fields sortFields)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(String groupName,
Pipe pipe,
Fields groupFields,
Fields sortFields,
boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names. |
|
GroupBy(String groupName,
Pipe lhsPipe,
Pipe rhsPipe,
Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names. |
Method Summary |
---|
Methods inherited from class cascading.pipe.Splice |
---|
equals, getDeclaredFields, getJoinDeclaredFields, getJoiner, getKeySelectors, getName, getNumSelfJoins, getPipePos, getPrevious, getSortingSelectors, hashCode, isCoGroup, isEquivalentTo, isGroupBy, isJoin, isMerge, isSorted, isSortReversed, outgoingScopeFor, printInternal, resolveIncomingOperationPassThroughFields, toString |
Methods inherited from class cascading.pipe.Pipe |
---|
getConfigDef, getHeads, getParent, getStepConfigDef, getTrace, hasConfigDef, hasStepConfigDef, id, named, names, pipes, print, resolveIncomingOperationArgumentFields, setParent |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface cascading.pipe.Group |
---|
getKeySelectors, getName, getSortingSelectors, isCoGroup, isGroupBy, isSorted, isSortReversed |
Methods inherited from interface cascading.flow.FlowElement |
---|
getConfigDef, getStepConfigDef, hasConfigDef, hasStepConfigDef, isEquivalentTo, outgoingScopeFor, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields |
Constructor Detail |
---|
@ConstructorProperties(value="pipe")
public GroupBy(Pipe pipe)
Creates a new GroupBy instance that will group on Fields.ALL fields.
pipe - of type Pipe

@ConstructorProperties(value={"pipe","groupFields"})
public GroupBy(Pipe pipe, Fields groupFields)
pipe - of type Pipe
groupFields - of type Fields

@ConstructorProperties(value={"pipe","groupFields","reverseOrder"})
public GroupBy(Pipe pipe, Fields groupFields, boolean reverseOrder)
pipe - of type Pipe
groupFields - of type Fields
reverseOrder - of type boolean

@ConstructorProperties(value={"groupName","pipe","groupFields"})
public GroupBy(String groupName, Pipe pipe, Fields groupFields)
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields

@ConstructorProperties(value={"groupName","pipe","groupFields","reverseOrder"})
public GroupBy(String groupName, Pipe pipe, Fields groupFields, boolean reverseOrder)
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields
reverseOrder - of type boolean

@ConstructorProperties(value={"pipe","groupFields","sortFields"})
public GroupBy(Pipe pipe, Fields groupFields, Fields sortFields)
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields

@ConstructorProperties(value={"groupName","pipe","groupFields","sortFields"})
public GroupBy(String groupName, Pipe pipe, Fields groupFields, Fields sortFields)
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields

@ConstructorProperties(value={"pipe","groupFields","sortFields","reverseOrder"})
public GroupBy(Pipe pipe, Fields groupFields, Fields sortFields, boolean reverseOrder)
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

@ConstructorProperties(value={"groupName","pipe","groupFields","sortFields","reverseOrder"})
public GroupBy(String groupName, Pipe pipe, Fields groupFields, Fields sortFields, boolean reverseOrder)
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

@ConstructorProperties(value="pipes")
public GroupBy(Pipe[] pipes)
pipes - of type Pipe[]

@ConstructorProperties(value={"pipes","groupFields"})
public GroupBy(Pipe[] pipes, Fields groupFields)
pipes - of type Pipe[]
groupFields - of type Fields

public GroupBy(Pipe lhsPipe, Pipe rhsPipe, Fields groupFields)
lhsPipe - of type Pipe
rhsPipe - of type Pipe
groupFields - of type Fields

@ConstructorProperties(value={"groupName","pipes","groupFields"})
public GroupBy(String groupName, Pipe[] pipes, Fields groupFields)
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields

public GroupBy(String groupName, Pipe lhsPipe, Pipe rhsPipe, Fields groupFields)
groupName - of type String
lhsPipe - of type Pipe
rhsPipe - of type Pipe
groupFields - of type Fields

@ConstructorProperties(value={"pipes","groupFields","sortFields"})
public GroupBy(Pipe[] pipes, Fields groupFields, Fields sortFields)
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields

@ConstructorProperties(value={"groupName","pipes","groupFields","sortFields"})
public GroupBy(String groupName, Pipe[] pipes, Fields groupFields, Fields sortFields)
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields

@ConstructorProperties(value={"pipes","groupFields","sortFields","reverseOrder"})
public GroupBy(Pipe[] pipes, Fields groupFields, Fields sortFields, boolean reverseOrder)
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

@ConstructorProperties(value={"groupName","pipes","groupFields","sortFields","reverseOrder"})
public GroupBy(String groupName, Pipe[] pipes, Fields groupFields, Fields sortFields, boolean reverseOrder)
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean