cascading.tuple
Class TupleEntryCollector

java.lang.Object
  extended by cascading.tuple.TupleEntryCollector
Direct Known Subclasses:
BasePartitionTap.PartitionCollector, TupleEntrySchemeCollector, TupleListCollector

public abstract class TupleEntryCollector
extends Object

Class TupleEntryCollector is used to allow BaseOperation instances to emit one or more result Tuple values.

The general rule in Cascading is that if you are handed a Tuple, you cannot change or cache it. Attempts at modifying such a Tuple will result in an Exception. Preventing caching is harder; see below.

If you create the Tuple, you can re-use or modify it.

When calling add(Tuple) or add(TupleEntry), you are passing a Tuple to the downstream pipes and operations. Since no downstream operation may modify or cache the Tuple instance, it is safe to re-use the Tuple instance once add() returns.

That said, Tuple copies do get cached in order to perform specific operations in the underlying platforms. Currently only a shallow copy is made (via the Tuple copy constructor). Thus, any mutable type or collection placed inside a Tuple will not itself be copied, but will likely be cached whenever the Tuple passed downstream is copied.

Any subsequent changes to that nested type or collection will therefore be reflected in the cached copy, a likely source of hard-to-find errors.

There is currently no way to specify that a deep copy must be performed when making a Tuple copy.
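
The shallow-copy behaviour described above can be illustrated with a small, self-contained sketch. MiniTuple below is a hypothetical stand-in written only for this example; it is not cascading.tuple.Tuple, and the only thing it shares with it is the idea of a copy constructor that copies the element list but not the elements. Under that assumption, re-using a top-level slot after the copy is harmless, while mutating a nested collection leaks into the copy unless the caller makes a defensive copy first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShallowCopyDemo {
    // Hypothetical stand-in for a tuple; NOT cascading.tuple.Tuple.
    static final class MiniTuple {
        private final List<Object> elements;

        MiniTuple(Object... values) {
            this.elements = new ArrayList<>(Arrays.asList(values));
        }

        // Shallow copy: the element list is copied, the elements are shared.
        MiniTuple(MiniTuple other) {
            this.elements = new ArrayList<>(other.elements);
        }

        Object get(int pos) { return elements.get(pos); }

        void set(int pos, Object value) { elements.set(pos, value); }
    }

    // Overwriting a top-level slot after the copy does not affect the copy.
    static Object reusedTopLevel() {
        MiniTuple original = new MiniTuple("first");
        MiniTuple cached = new MiniTuple(original); // what a downstream cache might hold
        original.set(0, "second");
        return cached.get(0);
    }

    // Mutating a nested list after the copy IS visible through the copy.
    static int sharedNestedSize() {
        List<String> tags = new ArrayList<>(Arrays.asList("a"));
        MiniTuple original = new MiniTuple("id-1", tags);
        MiniTuple cached = new MiniTuple(original);
        tags.add("b");
        return ((List<?>) cached.get(1)).size();
    }

    // A defensive copy of the nested list isolates the tuple from later changes.
    static int isolatedNestedSize() {
        List<String> tags = new ArrayList<>(Arrays.asList("a"));
        MiniTuple original = new MiniTuple("id-1", new ArrayList<>(tags));
        MiniTuple cached = new MiniTuple(original);
        tags.add("b");
        return ((List<?>) cached.get(1)).size();
    }

    public static void main(String[] args) {
        System.out.println(reusedTopLevel());     // first
        System.out.println(sharedNestedSize());   // 2
        System.out.println(isolatedNestedSize()); // 1
    }
}
```

In practice this suggests placing only immutable values in a Tuple, or copying any nested collection before adding it, since no deep copy can be requested.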


Field Summary
protected  TupleEntry tupleEntry
           
 
Constructor Summary
protected TupleEntryCollector()
           
  TupleEntryCollector(Fields declared)
          Constructor TupleEntryCollector creates a new TupleEntryCollector instance.
 
Method Summary
 void add(Tuple tuple)
          Method add inserts the given Tuple into the outgoing stream.
 void add(TupleEntry tupleEntry)
          Method add inserts the given TupleEntry into the outgoing stream.
 void close()
          Method close closes the underlying resource being written to.
protected abstract  void collect(TupleEntry tupleEntry)
           
 void setFields(Fields declared)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

tupleEntry

protected TupleEntry tupleEntry
Constructor Detail

TupleEntryCollector

protected TupleEntryCollector()

TupleEntryCollector

public TupleEntryCollector(Fields declared)
Constructor TupleEntryCollector creates a new TupleEntryCollector instance.

Parameters:
declared - of type Fields
Method Detail

setFields

public void setFields(Fields declared)

add

public void add(TupleEntry tupleEntry)
Method add inserts the given TupleEntry into the outgoing stream. Note that the method add(Tuple) is more efficient as it simply calls TupleEntry.getTuple().

See TupleEntryCollector on when and how to re-use a Tuple instance.

Parameters:
tupleEntry - of type TupleEntry

add

public void add(Tuple tuple)
Method add inserts the given Tuple into the outgoing stream.

See TupleEntryCollector on when and how to re-use a Tuple instance.

Parameters:
tuple - of type Tuple

collect

protected abstract void collect(TupleEntry tupleEntry)
                         throws IOException
Throws:
IOException
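
The relationship between the public add methods and the protected collect method can be sketched as follows. This is a minimal model, not the Cascading implementation: MiniCollector and ListCollector are hypothetical names, entries are plain strings rather than TupleEntry instances, and wrapping the IOException in a RuntimeException is an assumption of the sketch, since this page does not state how add handles it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CollectSketch {
    // Hypothetical stand-in for an abstract collector; NOT the Cascading class.
    static abstract class MiniCollector {
        // Public entry point: callers use add(), never collect() directly.
        public void add(String entry) {
            try {
                collect(entry);
            } catch (IOException exception) {
                // Assumption of this sketch: checked failures surface as unchecked ones.
                throw new RuntimeException("unable to collect entry", exception);
            }
        }

        // Subclasses decide where an entry actually goes.
        protected abstract void collect(String entry) throws IOException;

        public void close() {
            // Nothing to release by default.
        }
    }

    // Hypothetical subclass that keeps every entry in memory.
    static final class ListCollector extends MiniCollector {
        final List<String> written = new ArrayList<>();
        private boolean closed = false;

        @Override
        protected void collect(String entry) throws IOException {
            if (closed)
                throw new IOException("collector is closed");
            written.add(entry);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        ListCollector collector = new ListCollector();
        collector.add("a");
        collector.add("b");
        collector.close();
        System.out.println(collector.written); // [a, b]
    }
}
```

Keeping collect protected means the code that emits values only sees add and close, while each subclass supplies the write logic for its own target.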

close

public void close()
Method close closes the underlying resource being written to.

This method should be called when an instance is returned via Tap.openForWrite(cascading.flow.FlowProcess) and no more Tuple instances will be written out.

This method must not be called when an instance is returned from getOutputCollector() from any of the relevant OperationCall implementations (inside a Function, Aggregator, or Buffer).



Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.