cascading.scheme.local
Class TextLine

java.lang.Object
  extended by cascading.scheme.Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
      extended by cascading.scheme.local.TextLine
All Implemented Interfaces:
Traceable, Serializable

public class TextLine
extends Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>

A TextLine is a type of Scheme for plain text files. Files are broken into lines; either a line-feed or a carriage-return signals the end of a line.
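
As a rough illustration of the line-splitting rule, the sketch below uses the JDK's java.io.LineNumberReader, the reader type named in this class's signature. The class LineSplitSketch and its method are hypothetical and not part of Cascading; the sketch only shows how the JDK reader treats a line-feed, a carriage-return, and the two in sequence.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Cascading.
final class LineSplitSketch {
    // Splits text as java.io.LineNumberReader does:
    // "\n", "\r", and "\r\n" each terminate one line.
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Calling LineSplitSketch.lines("a\nb\rc\r\nd") yields four lines, because the JDK reader counts a carriage-return followed immediately by a line-feed as a single terminator.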

By default, this scheme returns a Tuple with two fields, "num" and "line", where "num" is the line number for "line".

Many of the constructors take both "sourceFields" and "sinkFields". sourceFields denotes the field names to be used instead of the default names "num" and "line". sinkFields is a selector and defaults to Fields.ALL. Any available field names can be given if only a subset of the incoming fields should be written.

If the Fields instance passed to the constructor as sourceFields has only one field, the returned tuples will contain only the "line" value, under the given field name.
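
The one-field versus two-field rule can be modelled with plain JDK classes. This is a sketch under stated assumptions, not Cascading code: SourceFieldsSketch and read are hypothetical names, records are plain maps rather than Tuples, the first line is numbered 0 (this page does not state the starting value), and rejecting any other number of field names is a choice made for the sketch.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the source-side field rule; not part of Cascading.
final class SourceFieldsSketch {
    static List<Map<String, Object>> read(String text, String... fieldNames) {
        if (fieldNames.length < 1 || fieldNames.length > 2) {
            // Assumption of this sketch: only one or two names are meaningful.
            throw new IllegalArgumentException("expected one or two field names");
        }
        List<Map<String, Object>> records = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            while (true) {
                // Assumption: the count is taken before the read, so the first line is 0.
                int num = reader.getLineNumber();
                String line = reader.readLine();
                if (line == null) {
                    break;
                }
                Map<String, Object> record = new LinkedHashMap<>();
                if (fieldNames.length == 1) {
                    // One name: the record holds only the line text.
                    record.put(fieldNames[0], line);
                } else {
                    // Two names: line number first, then the line text.
                    record.put(fieldNames[0], num);
                    record.put(fieldNames[1], line);
                }
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

With a single name, as in read(text, "text"), each record has one entry. With two names, as in read(text, "n", "t"), each record pairs a line number with the line.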

Note that TextLine will concatenate all the Tuple values for the selected fields with a TAB delimiter before writing out the line.
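
The sink-side behaviour described above, selecting fields and joining their values with a TAB, can be sketched as follows. SinkLineSketch and toLine are hypothetical names, a map stands in for a Tuple, a null selector stands in for Fields.ALL, and the handling of null values and unknown field names is the sketch's own choice rather than documented Cascading behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of the sink-side rule; not part of Cascading.
final class SinkLineSketch {
    // Joins the values of the selected fields with a TAB character.
    // A null selector means "all fields, in the map's iteration order".
    static String toLine(Map<String, Object> tuple, List<String> selectedFields) {
        Iterable<String> names = selectedFields == null ? tuple.keySet() : selectedFields;
        List<String> values = new ArrayList<>();
        for (String name : names) {
            if (!tuple.containsKey(name)) {
                // Sketch-only choice: fail fast on a name the tuple does not have.
                throw new IllegalArgumentException("unknown field: " + name);
            }
            // Sketch-only choice: a null value is written as the text "null".
            values.add(String.valueOf(tuple.get(name)));
        }
        return String.join("\t", values);
    }
}
```

Passing an explicit list both narrows and reorders the output, which mirrors the role of a field selector. An insertion-ordered map such as LinkedHashMap keeps the "all fields" case deterministic.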

By default, all text is encoded/decoded as UTF-8. This can be changed via the charsetName constructor argument.
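
To show why the charset name matters, the sketch below encodes and decodes text with a named charset using only java.nio.charset.Charset. CharsetSketch and its methods are hypothetical and are not how TextLine is implemented; they only illustrate that bytes written under one charset may not decode correctly under another.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Cascading.
final class CharsetSketch {
    // Encodes text to bytes with the named charset.
    // Charset.forName throws an unchecked exception for an unknown name.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Decodes bytes to text with the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }
}
```

For example, "caf\u00e9" encoded as ISO-8859-1 occupies four bytes and round-trips under ISO-8859-1, but decoding those same bytes as UTF-8 does not reproduce the original string.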

See Also:
Serialized Form

Field Summary
static String DEFAULT_CHARSET
           
 
Constructor Summary
TextLine()
          Creates a new TextLine instance that sources "num" and "line" fields, and sinks all incoming fields, where "num" is the line number of the line in the input file.
TextLine(Fields sourceFields)
          Creates a new TextLine instance.
TextLine(Fields sourceFields, Fields sinkFields)
          Creates a new TextLine instance.
TextLine(Fields sourceFields, Fields sinkFields, String charsetName)
          Creates a new TextLine instance.
TextLine(Fields sourceFields, String charsetName)
          Creates a new TextLine instance.
 
Method Summary
 LineNumberReader createInput(InputStream inputStream)
           
 PrintWriter createOutput(OutputStream outputStream)
           
 String getCharsetName()
           
 void presentSinkFields(FlowProcess<Properties> process, Tap tap, Fields fields)
           
 void presentSourceFields(FlowProcess<Properties> process, Tap tap, Fields fields)
           
 void sink(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall)
           
 void sinkCleanup(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall)
           
 void sinkConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf)
           
 void sinkPrepare(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall)
           
 boolean source(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall)
           
 void sourceCleanup(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall)
           
 void sourceConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf)
           
 void sourcePrepare(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall)
           
protected  void verify(Fields sourceFields)
           
 
Methods inherited from class cascading.scheme.Scheme
equals, getNumSinkParts, getSinkFields, getSourceFields, getTrace, hashCode, isSink, isSource, isSymmetrical, presentSinkFieldsInternal, presentSourceFieldsInternal, retrieveSinkFields, retrieveSourceFields, setNumSinkParts, setSinkFields, setSourceFields, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_CHARSET

public static final String DEFAULT_CHARSET
See Also:
Constant Field Values
Constructor Detail

TextLine

public TextLine()
Creates a new TextLine instance that sources "num" and "line" fields, and sinks all incoming fields, where "num" is the line number of the line in the input file.


TextLine

@ConstructorProperties(value="sourceFields")
public TextLine(Fields sourceFields)
Creates a new TextLine instance. If sourceFields has one field, only the text line will be returned in the subsequent tuples.

Parameters:
sourceFields - of Fields

TextLine

@ConstructorProperties(value={"sourceFields","charsetName"})
public TextLine(Fields sourceFields,
                String charsetName)
Creates a new TextLine instance. If sourceFields has one field, only the text line will be returned in the subsequent tuples.

Parameters:
sourceFields - of Fields
charsetName - of type String

TextLine

@ConstructorProperties(value={"sourceFields","sinkFields"})
public TextLine(Fields sourceFields,
                Fields sinkFields)
Creates a new TextLine instance. If sourceFields has one field, only the text line will be returned in the subsequent tuples.

Parameters:
sourceFields - of Fields
sinkFields - of Fields

TextLine

@ConstructorProperties(value={"sourceFields","sinkFields","charsetName"})
public TextLine(Fields sourceFields,
                Fields sinkFields,
                String charsetName)
Creates a new TextLine instance. If sourceFields has one field, only the text line will be returned in the subsequent tuples.

Parameters:
sourceFields - of Fields
sinkFields - of Fields
charsetName - of type String
Method Detail

getCharsetName

public String getCharsetName()

verify

protected void verify(Fields sourceFields)

createInput

public LineNumberReader createInput(InputStream inputStream)

createOutput

public PrintWriter createOutput(OutputStream outputStream)
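
This page gives no description for createInput or createOutput. Judging only from their signatures, a plausible shape is a charset-aware wrapper around the raw stream. The sketch below is a guess built from JDK classes: StreamWrapSketch, wrapInput, wrapOutput, and roundTrip are hypothetical names, and the actual Cascading implementation may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Cascading.
final class StreamWrapSketch {
    // Wraps a byte stream in a line-counting reader that decodes with the named charset.
    static LineNumberReader wrapInput(InputStream in, String charsetName) {
        return new LineNumberReader(new InputStreamReader(in, Charset.forName(charsetName)));
    }

    // Wraps a byte stream in a writer that encodes with the named charset.
    static PrintWriter wrapOutput(OutputStream out, String charsetName) {
        return new PrintWriter(new OutputStreamWriter(out, Charset.forName(charsetName)));
    }

    // Writes one line into memory, then reads it back through the matching wrapper.
    static String roundTrip(String line, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (PrintWriter writer = wrapOutput(bytes, charsetName)) {
            writer.println(line);
        }
        try (LineNumberReader reader =
                 wrapInput(new ByteArrayInputStream(bytes.toByteArray()), charsetName)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Using the same charset name on both sides is what makes the round trip lossless, which is why the sketch threads a single charsetName through both wrappers.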

presentSourceFields

public void presentSourceFields(FlowProcess<Properties> process,
                                Tap tap,
                                Fields fields)
Overrides:
presentSourceFields in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>

presentSinkFields

public void presentSinkFields(FlowProcess<Properties> process,
                              Tap tap,
                              Fields fields)
Overrides:
presentSinkFields in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>

sourceConfInit

public void sourceConfInit(FlowProcess<Properties> flowProcess,
                           Tap<Properties,InputStream,OutputStream> tap,
                           Properties conf)
Specified by:
sourceConfInit in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>

sinkConfInit

public void sinkConfInit(FlowProcess<Properties> flowProcess,
                         Tap<Properties,InputStream,OutputStream> tap,
                         Properties conf)
Specified by:
sinkConfInit in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>

sourcePrepare

public void sourcePrepare(FlowProcess<Properties> flowProcess,
                          SourceCall<LineNumberReader,InputStream> sourceCall)
                   throws IOException
Overrides:
sourcePrepare in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException

source

public boolean source(FlowProcess<Properties> flowProcess,
                      SourceCall<LineNumberReader,InputStream> sourceCall)
               throws IOException
Specified by:
source in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException

sourceCleanup

public void sourceCleanup(FlowProcess<Properties> flowProcess,
                          SourceCall<LineNumberReader,InputStream> sourceCall)
                   throws IOException
Overrides:
sourceCleanup in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException

sinkPrepare

public void sinkPrepare(FlowProcess<Properties> flowProcess,
                        SinkCall<PrintWriter,OutputStream> sinkCall)
                 throws IOException
Overrides:
sinkPrepare in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException

sink

public void sink(FlowProcess<Properties> flowProcess,
                 SinkCall<PrintWriter,OutputStream> sinkCall)
          throws IOException
Specified by:
sink in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException

sinkCleanup

public void sinkCleanup(FlowProcess<Properties> flowProcess,
                        SinkCall<PrintWriter,OutputStream> sinkCall)
                 throws IOException
Overrides:
sinkCleanup in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Throws:
IOException


Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.