public class RegexParser extends RegexOperation<Pair<java.util.regex.Matcher,TupleEntry>> implements Function<Pair<java.util.regex.Matcher,TupleEntry>>
RegexParser expects only one field value. If more than one argument value is passed, only the first is handled; the remainder are ignored.
Sometimes it's useful to parse out a value from a key/value pair in a string, if the key exists. If the key does not exist, returning an empty string instead of failing is typically expected.
The following regex can extract a value from key1=value1&key2=value2 if key1 exists; otherwise an empty string is returned:
(?<=key1=)[^&]*|$
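As a rough illustration of how this pattern behaves, the sketch below applies it with plain java.util.regex rather than through Cascading. The class KeyValueExtract and its extract method are hypothetical names introduced only for this example; within a flow, the same pattern string would be passed to one of the RegexParser constructors described on this page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cascading: applies the pattern with plain java.util.regex.
final class KeyValueExtract {
    // The lookbehind anchors the match right after "key1="; the "|$" alternative
    // matches the empty string at the end of the input when the key is absent.
    static final Pattern KEY1 = Pattern.compile("(?<=key1=)[^&]*|$");

    static String extract(String input) {
        Matcher matcher = KEY1.matcher(input);
        // find() always succeeds here because "$" matches at the end of any input
        return matcher.find() ? matcher.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(extract("key1=value1&key2=value2")); // prints "value1"
        System.out.println(extract("key2=value2").isEmpty());   // prints "true"
    }
}
```

The lookbehind keeps the key itself out of the returned match, so no capture group is needed to isolate the value.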
Note that a null-valued argument passed to the parser will be converted to an empty string ("") before the regex is applied.
Any Object value will be coerced to a String type if type information is provided. See the CoercibleType interface to control how custom Object types are converted to String values.
Also, any type information on the declaredFields will be honored by coercing the parsed String value to the canonical declared type. This is useful when creating or using CoercibleType classes, like DateType.
Fields inherited from class RegexOperation: patternString
Fields inherited from class BaseOperation: fieldDeclaration, numArgs, trace
Constructor and Description |
---|
RegexParser(Fields fieldDeclaration, java.lang.String patternString) Constructor RegexParser creates a new RegexParser instance, where the argument Tuple value is matched and returned as the given Field. |
RegexParser(Fields fieldDeclaration, java.lang.String patternString, int... groups) Constructor RegexParser creates a new RegexParser instance, where the patternString is a regular expression with match groups and whose groups designated by groups are stored in the named fieldDeclarations. |
RegexParser(java.lang.String patternString) Constructor RegexParser creates a new RegexParser instance, where the argument Tuple value is matched and returned in a new Tuple. |
RegexParser(java.lang.String patternString, int... groups) Constructor RegexParser creates a new RegexParser instance, where the patternString is a regular expression with match groups and whose groups designated by groups are stored in the appropriate number of new fields. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(java.lang.Object object) |
int[] | getGroups() |
int | hashCode() |
void | operate(FlowProcess flowProcess, FunctionCall<Pair<java.util.regex.Matcher,TupleEntry>> functionCall) Method operate provides the implementation of this Function. |
void | prepare(FlowProcess flowProcess, OperationCall<Pair<java.util.regex.Matcher,TupleEntry>> operationCall) Method prepare does nothing, and may safely be overridden. |
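Methods inherited from class RegexOperation: getPattern, getPatternString
Methods inherited from class BaseOperation: cleanup, flush, getFieldDeclaration, getNumArgs, getTrace, isSafe, printOperationInternal, toString, toStringInternal
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface Operation: cleanup, flush, getFieldDeclaration, getNumArgs, isSafe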
@ConstructorProperties(value="patternString") public RegexParser(java.lang.String patternString)
If the given patternString declares regular expression groups, each group will be returned as a value in the resulting Tuple. If no groups are declared, the match will be returned as the only value in the resulting Tuple.
The fields returned will be Fields.UNKNOWN, so a variable number of values may be emitted based on the regular expression given.
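The group handling described above can be sketched with plain java.util.regex. This is only an illustration of the stated rule, not Cascading's implementation: GroupsOrMatch and parse are hypothetical names, and the text does not say what happens when nothing matches, so the sketch simply returns an empty list in that case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the described rule with plain java.util.regex.
final class GroupsOrMatch {
    static List<String> parse(String patternString, String value) {
        Matcher matcher = Pattern.compile(patternString).matcher(value);
        List<String> values = new ArrayList<>();
        if (!matcher.find()) {
            return values; // assumption: no match yields no values in this sketch
        }
        if (matcher.groupCount() == 0) {
            values.add(matcher.group()); // no groups declared: the whole match is the only value
        } else {
            for (int i = 1; i <= matcher.groupCount(); i++) {
                values.add(matcher.group(i)); // one value per declared group
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("(\\d+)-(\\d+)", "10-20")); // prints [10, 20]
        System.out.println(parse("\\d+-\\d+", "10-20"));     // prints [10-20]
    }
}
```

Because the number of values depends on the pattern, a fixed field declaration cannot be known in advance, which is consistent with the fields being reported as Fields.UNKNOWN.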
patternString - of type String
@ConstructorProperties(value={"fieldDeclaration","patternString"}) public RegexParser(Fields fieldDeclaration, java.lang.String patternString)
If the given patternString declares regular expression groups, each group will be returned as a value in the resulting Tuple. If no groups are declared, the match will be returned as the only value in the resulting Tuple.
If the number of fields in the fieldDeclaration does not match the number of groups matched, an OperationException will be thrown at runtime.
To overcome this, either use the constructors that take an array of groups, or use the (?: ...) sequence to tell the regular expression matcher not to capture the group.
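To see how the (?: ...) sequence changes the number of captured groups, the following sketch counts capturing groups with java.util.regex. CaptureCount and the sample HTTP-style patterns are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper: reports how many capturing groups a pattern declares.
final class CaptureCount {
    static int groupCount(String patternString) {
        // groupCount() depends only on the pattern, so an empty input is sufficient
        return Pattern.compile(patternString).matcher("").groupCount();
    }

    public static void main(String[] args) {
        // Two capturing groups: a single declared field would not line up with two values
        System.out.println(groupCount("(GET|POST) (\\S+)"));   // prints 2
        // (?: ...) keeps the alternation but does not capture it, leaving one group
        System.out.println(groupCount("(?:GET|POST) (\\S+)")); // prints 1
    }
}
```

With the second pattern, a field declaration naming a single field would line up with the single captured group.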
fieldDeclaration - of type Fields
patternString - of type String
@ConstructorProperties(value={"patternString","groups"}) public RegexParser(java.lang.String patternString, int... groups)
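Constructor RegexParser creates a new RegexParser instance, where the patternString is a regular expression with match groups and whose groups designated by groups are stored in the appropriate number of new fields.
The number of resulting fields will match the number of groups given (groups.length).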
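Selecting groups by index can be sketched as follows with plain java.util.regex. SelectedGroups and parse are hypothetical names, and returning an empty list when nothing matches is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: keeps only the groups named in the int... argument.
final class SelectedGroups {
    static List<String> parse(String patternString, String value, int... groups) {
        Matcher matcher = Pattern.compile(patternString).matcher(value);
        List<String> values = new ArrayList<>();
        if (!matcher.find()) {
            return values; // assumption: no match yields no values in this sketch
        }
        for (int group : groups) {
            values.add(matcher.group(group)); // in java.util.regex, group 0 is the whole match
        }
        return values;
    }

    public static void main(String[] args) {
        // Three groups are declared, but only groups 1 and 3 are kept
        System.out.println(parse("(\\d{4})-(\\d{2})-(\\d{2})", "2017-03-15", 1, 3)); // prints [2017, 15]
    }
}
```

The result has exactly groups.length values, regardless of how many groups the pattern declares.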
patternString - of type String
groups - of type int[]
@ConstructorProperties(value={"fieldDeclaration","patternString","groups"}) public RegexParser(Fields fieldDeclaration, java.lang.String patternString, int... groups)
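Constructor RegexParser creates a new RegexParser instance, where the patternString is a regular expression with match groups and whose groups designated by groups are stored in the named fieldDeclarations.
fieldDeclaration - of type Fields
patternString - of type String
groups - of type int[]
public int[] getGroups()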
public void prepare(FlowProcess flowProcess, OperationCall<Pair<java.util.regex.Matcher,TupleEntry>> operationCall)
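Description copied from class BaseOperation: Method prepare does nothing, and may safely be overridden.
Specified by: prepare in interface Operation<Pair<java.util.regex.Matcher,TupleEntry>>
Overrides: prepare in class BaseOperation<Pair<java.util.regex.Matcher,TupleEntry>>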
public void operate(FlowProcess flowProcess, FunctionCall<Pair<java.util.regex.Matcher,TupleEntry>> functionCall)
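Description copied from interface Function: Method operate provides the implementation of this Function.
Specified by: operate in interface Function<Pair<java.util.regex.Matcher,TupleEntry>>
flowProcess - of type FlowProcess
functionCall - of type FunctionCall
public boolean equals(java.lang.Object object)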
Overrides: equals in class RegexOperation<Pair<java.util.regex.Matcher,TupleEntry>>
public int hashCode()
Overrides: hashCode in class RegexOperation<Pair<java.util.regex.Matcher,TupleEntry>>
Copyright © 2007-2017 Cascading Maintainers. All Rights Reserved.