java.lang.Object
  cascading.pipe.Pipe
    cascading.pipe.SubAssembly
      cascading.pipe.assembly.AggregateBy
        cascading.pipe.assembly.CountBy
public class CountBy
Class CountBy is used to count duplicates in a tuple stream, where "duplicates" means all tuples with the same values for the groupingFields fields. The resulting count is output as a long value in the specified countField.
Typically, finding the count of a field in a tuple stream relies on a GroupBy and a Count Aggregator operation.
The CountBy SubAssembly is a (typically) more efficient replacement for these two steps, because it does map-side pre-reduce counting (via the CountBy.CountPartials AggregateBy.Functor) before the GroupBy operator; this reduces network I/O between the map and reduce phases.
This strategy is similar to using combiners, except that no sorting or serialization is invoked, which results in a much simpler mechanism.
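The two-phase idea described above can be sketched in plain Java without any Cascading classes. This is only an illustration of the technique, not Cascading's implementation: the class name PartialCountSketch and both method names are hypothetical, and each "map task" is modelled as a simple list of grouping keys.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for map-side partial counting followed by a reduce-side sum.
class PartialCountSketch {

    // Map-side step: collapse one task's keys into (key, partialCount) pairs.
    static Map<String, Long> partialCounts(List<String> keys) {
        Map<String, Long> partials = new LinkedHashMap<>();
        for (String key : keys) {
            partials.merge(key, 1L, Long::sum);
        }
        return partials;
    }

    // Reduce-side step: sum the partial counts that arrive for each key.
    static Map<String, Long> sumPartials(List<Map<String, Long>> perTaskPartials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partials : perTaskPartials) {
            partials.forEach((key, count) -> totals.merge(key, count, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> task1 = partialCounts(Arrays.asList("a", "b", "a", "a"));
        Map<String, Long> task2 = partialCounts(Arrays.asList("b", "c", "b"));
        // Seven input tuples become four partial tuples before the grouping step.
        System.out.println(sumPartials(Arrays.asList(task1, task2)));
    }
}
```

Only the partial pairs need to cross from the map phase to the reduce phase, which is where the reduction in network I/O comes from.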
The threshold value tells the underlying CountPartials functions how many unique key counts to accumulate in the LRU cache before emitting the least recently used entry. This accumulation happens map-side, and thus is bounded by the size of your map task JVM and the typical size of each group key.
By default, either the value of the AggregateByProps.AGGREGATE_BY_CAPACITY System property or AggregateByProps.AGGREGATE_BY_DEFAULT_CAPACITY will be used.
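To make the role of the threshold concrete, here is a minimal plain-Java sketch of an LRU cache of partial counts. It is an assumption-laden illustration rather than the CountPartials code itself: the class LruPartialCounter, its methods, and the "key=count" string used for emitted entries are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LRU cache that holds at most `threshold` unique keys.
class LruPartialCounter {
    private final int threshold;
    // accessOrder=true iterates from least to most recently used entry.
    private final LinkedHashMap<String, Long> cache = new LinkedHashMap<>(16, 0.75f, true);
    private final List<String> emitted = new ArrayList<>();

    LruPartialCounter(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
    }

    // Count one occurrence of key, evicting the least recently used entry if full.
    void count(String key) {
        Long current = cache.get(key); // a hit also marks the key as recently used
        if (current == null && cache.size() >= threshold) {
            Iterator<Map.Entry<String, Long>> it = cache.entrySet().iterator();
            Map.Entry<String, Long> eldest = it.next();
            emitted.add(eldest.getKey() + "=" + eldest.getValue());
            it.remove();
        }
        cache.put(key, current == null ? 1L : current + 1L);
    }

    // End of the map task: emit whatever is still cached.
    List<String> flush() {
        for (Map.Entry<String, Long> entry : cache.entrySet()) {
            emitted.add(entry.getKey() + "=" + entry.getValue());
        }
        cache.clear();
        return emitted;
    }

    public static void main(String[] args) {
        LruPartialCounter counter = new LruPartialCounter(2);
        for (String key : new String[] {"a", "b", "a", "c", "b"}) {
            counter.count(key);
        }
        System.out.println(counter.flush()); // [b=1, a=2, c=1, b=1]
    }
}
```

With a threshold of 2, the key b is evicted once and therefore emitted twice with a partial count of 1 each time; the later grouping step still sums these to the correct total. A larger threshold emits fewer partial tuples at the cost of more map-side memory.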
If include is CountBy.Include.NO_NULLS, argument tuples with all null values will be ignored.
The values in the argument Tuple are normally all the remaining fields not used for grouping, but this can be narrowed using the valueFields parameter. When counting the occurrence of a single field (when valueFields is set on the constructor), this is the same behavior as select count(foo) ... in SQL. If include is CountBy.Include.ONLY_NULLS, then only argument tuples with all null values will be counted.
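The null-handling rules above can be sketched as a small filter in plain Java. The enum below is a local stand-in for CountBy.Include, and the ALL constant (count every argument tuple) is an assumption, since the text names only NO_NULLS and ONLY_NULLS; argument tuples are modelled as Object arrays.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical model of how the include setting filters argument tuples.
class IncludeSketch {
    // Local stand-in for CountBy.Include; ALL is assumed, not taken from the text.
    enum Include { ALL, NO_NULLS, ONLY_NULLS }

    // True when every value in the argument tuple is null.
    static boolean allNull(Object[] tuple) {
        return Arrays.stream(tuple).allMatch(Objects::isNull);
    }

    // Count the argument tuples that the given include setting lets through.
    static long count(List<Object[]> tuples, Include include) {
        long count = 0;
        for (Object[] tuple : tuples) {
            boolean nulls = allNull(tuple);
            switch (include) {
                case ALL:
                    count++;
                    break;
                case NO_NULLS:
                    if (!nulls) count++;
                    break;
                case ONLY_NULLS:
                    if (nulls) count++;
                    break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Object[]> tuples = Arrays.<Object[]>asList(
            new Object[] {"x", null},   // partially null: not "all null"
            new Object[] {null, null},  // all null
            new Object[] {"y", "z"});   // no nulls
        for (Include include : Include.values()) {
            System.out.println(include + " -> " + count(tuples, include));
        }
    }
}
```

Note that a tuple with only some null values is not treated as "all null" in this sketch, so NO_NULLS still counts it. With a single value field this matches the select count(foo) behavior mentioned above.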
See Also: AggregateBy, Serialized Form
Nested Class Summary | |
---|---|
static class | CountBy.CountPartials: Class CountPartials is an AggregateBy.Functor that is used to count observed duplicates from the tuple stream. |
static class | CountBy.Include |
Nested classes/interfaces inherited from class cascading.pipe.assembly.AggregateBy |
---|
AggregateBy.Cache, AggregateBy.CompositeFunction, AggregateBy.Flush, AggregateBy.Functor |
Field Summary | |
---|---|
static int | DEFAULT_THRESHOLD: Deprecated. |
Fields inherited from class cascading.pipe.assembly.AggregateBy |
---|
AGGREGATE_BY_THRESHOLD, USE_DEFAULT_THRESHOLD |
Fields inherited from class cascading.pipe.Pipe |
---|
configDef, parent, stepConfigDef |
Constructor Summary | |
---|---|
CountBy(Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Fields valueFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Fields valueFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(Pipe pipe, Fields groupingFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, int threshold) | Constructor CountBy creates a new CountBy instance. |
Method Summary |
---|
Methods inherited from class cascading.pipe.assembly.AggregateBy |
---|
getAggregators, getArgumentFields, getCapacity, getFieldDeclarations, getFunctors, getGroupBy, getGroupingFields, getThreshold, initialize, initialize, verify |
Methods inherited from class cascading.pipe.SubAssembly |
---|
getName, getPrevious, getTailNames, getTails, setPrevious, setTails, unwind |
Methods inherited from class cascading.pipe.Pipe |
---|
equals, getConfigDef, getHeads, getParent, getStepConfigDef, getTrace, hasConfigDef, hashCode, hasStepConfigDef, id, isEquivalentTo, named, names, outgoingScopeFor, pipes, print, printInternal, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields, setParent, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
@Deprecated public static final int DEFAULT_THRESHOLD
Constructor Detail |
---|
@ConstructorProperties(value="countField")
public CountBy(Fields countField)
Use this constructor when used with an AggregateBy instance.
Parameters:
countField - of type Fields

@ConstructorProperties(value={"countField","include"})
public CountBy(Fields countField, CountBy.Include include)
Use this constructor when used with an AggregateBy instance.
Parameters:
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"valueFields","countField"})
public CountBy(Fields valueFields, Fields countField)
Use this constructor when used with an AggregateBy instance.
Parameters:
countField - of type Fields

@ConstructorProperties(value={"valueFields","countField","include"})
public CountBy(Fields valueFields, Fields countField, CountBy.Include include)
Use this constructor when used with an AggregateBy instance.
Parameters:
countField - of type Fields

@ConstructorProperties(value={"pipe","groupingFields","countField"})
public CountBy(Pipe pipe, Fields groupingFields, Fields countField)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"pipe","groupingFields","countField","threshold"})
public CountBy(Pipe pipe, Fields groupingFields, Fields countField, int threshold)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"name","pipe","groupingFields","countField"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"name","pipe","groupingFields","countField","threshold"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, int threshold)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"pipes","groupingFields","countField"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields countField)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"pipes","groupingFields","countField","threshold"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, int threshold)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"name","pipes","groupingFields","countField"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"name","pipes","groupingFields","countField","threshold"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, int threshold)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"pipe","groupingFields","countField","include"})
public CountBy(Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"pipe","groupingFields","countField","include","threshold"})
public CountBy(Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"name","pipe","groupingFields","countField","include"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"name","pipe","groupingFields","countField","include","threshold"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"pipes","groupingFields","countField","include"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"pipes","groupingFields","countField","include","threshold"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"name","pipes","groupingFields","countField","include"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"name","pipes","groupingFields","countField","include","threshold"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"pipe","groupingFields","valueFields","countField"})
public CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"pipe","groupingFields","valueFields","countField","threshold"})
public CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, int threshold)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"name","pipe","groupingFields","valueFields","countField"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"name","pipe","groupingFields","valueFields","countField","threshold"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, int threshold)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"pipes","groupingFields","valueFields","countField"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"pipes","groupingFields","valueFields","countField","threshold"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, int threshold)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"name","pipes","groupingFields","valueFields","countField"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields

@ConstructorProperties(value={"name","pipes","groupingFields","valueFields","countField","threshold"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, int threshold)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
threshold - of type int

@ConstructorProperties(value={"pipe","groupingFields","valueFields","countField","include"})
public CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"pipe","groupingFields","valueFields","countField","include","threshold"})
public CountBy(Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"name","pipe","groupingFields","valueFields","countField","include"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"name","pipe","groupingFields","valueFields","countField","include","threshold"})
public CountBy(String name, Pipe pipe, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
name - of type String
pipe - of type Pipe
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"pipes","groupingFields","valueFields","countField","include"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"pipes","groupingFields","valueFields","countField","include","threshold"})
public CountBy(Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int

@ConstructorProperties(value={"name","pipes","groupingFields","valueFields","countField","include"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include

@ConstructorProperties(value={"name","pipes","groupingFields","valueFields","countField","include","threshold"})
public CountBy(String name, Pipe[] pipes, Fields groupingFields, Fields valueFields, Fields countField, CountBy.Include include, int threshold)
Parameters:
name - of type String
pipes - of type Pipe[]
groupingFields - of type Fields
valueFields - of type Fields
countField - of type Fields
include - of type Include
threshold - of type int