cascading.tuple.collect
Interface TupleMapFactory<Config>

All Superinterfaces:
CascadingFactory<Config,Map<Tuple,Collection<Tuple>>>

public interface TupleMapFactory<Config>
extends CascadingFactory<Config,Map<Tuple,Collection<Tuple>>>

Interface TupleMapFactory allows developers to plug in alternative implementations of the "tuple map" used to back in-memory "join" and "co-group" operations. Typically these implementations are "spillable": to avoid using up all memory in the JVM, values are persisted to disk after some threshold is met or an event is triggered.

The Map instances returned must take a Tuple as a key and a Collection of Tuples as a value. Further, Map.get(Object) must never return null; instead, the first call to get() for a given key must create and store an empty Collection.

That is, Map.put(Object, Object) is never called on the map instance internally; values are only added through map.get(groupTuple).add(valuesTuple).
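
The get() contract described above can be illustrated with a minimal sketch. This is not the Cascading implementation: the class names AutoCreateMapSketch and AutoCreateMap are hypothetical, String stands in for Tuple so the example has no Cascading dependency, and nothing here spills to disk. It only shows a Map whose get() creates and stores an empty Collection the first time a key is requested.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;

public class AutoCreateMapSketch {

    // Hypothetical stand-in for a tuple map: K plays the role of the
    // grouping Tuple, V the role of a value Tuple.
    public static class AutoCreateMap<K, V> extends HashMap<K, Collection<V>> {

        @Override
        @SuppressWarnings("unchecked")
        public Collection<V> get(Object key) {
            Collection<V> values = super.get(key);
            if (values == null) {
                // First access for this key: create and store an empty
                // Collection so that callers never see null.
                values = new ArrayList<>();
                super.put((K) key, values);
            }
            return values;
        }
    }

    public static void main(String[] args) {
        AutoCreateMap<String, String> map = new AutoCreateMap<>();

        // Mirrors the internal usage pattern: put() is never called directly.
        map.get("group-a").add("value-1");
        map.get("group-a").add("value-2");

        System.out.println(map.get("group-a"));           // prints [value-1, value-2]
        System.out.println(map.get("group-b").isEmpty()); // prints true
        System.out.println(map.size());                   // prints 2
    }
}
```

Note that in this sketch even a read of an unseen key stores an entry, which is why the final size is 2. A real implementation would presumably obtain its Collections from a TupleCollectionFactory rather than instantiating ArrayList directly.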

Using the TupleCollectionFactory to create the underlying Tuple Collections makes that aspect pluggable as well.

If the Map implementation implements the Spillable interface, it will receive a Spillable.SpillListener instance that calls back to the appropriate logging mechanism for the platform. This instance should be passed down to any child Spillable types, namely an implementation of SpillableTupleList.

The default implementation for the Hadoop platform is the HadoopTupleMapFactory, which creates a HadoopSpillableTupleMap instance.

The class SpillableTupleMap may be used as a base class.

See Also:
SpillableTupleMap, HadoopTupleMapFactory, TupleCollectionFactory, HadoopTupleCollectionFactory

Field Summary
static String TUPLE_MAP_FACTORY
           
 
Method Summary
 
Methods inherited from interface cascading.provider.CascadingFactory
create, initialize
 

Field Detail

TUPLE_MAP_FACTORY

static final String TUPLE_MAP_FACTORY
See Also:
Constant Field Values


Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.