cascading.flow
Class FlowConnector

java.lang.Object
  extended by cascading.flow.FlowConnector

public abstract class FlowConnector
extends Object

Class FlowConnector is the base class for all platform planners.

See the FlowDef class for a fluent way to define a new Flow.

Use the FlowConnector to link source and sink Tap instances with an assembly of Pipe instances into an executable Flow.

FlowConnector invokes a planner for the target execution environment.

For executing Flows in local memory against local files, see LocalFlowConnector.

For Apache Hadoop, see the HadoopFlowConnector. Or if you have a pre-existing custom Hadoop job to execute, see MapReduceFlow, which doesn't require a planner.

Note that all connect methods take a single tail or an array of tail Pipe instances. "tail" refers to the last connected Pipe instances in a pipe-assembly. Pipe-assemblies are graphs of object with "heads" and "tails". From a given "tail", all connected heads can be found, but not the reverse. So "tails" must be supplied by the user.

The FlowConnector and the underlying execution framework (Hadoop or local mode) can be configured via a Map or Properties instance given to the constructor.

This properties map must be populated before constructing a FlowConnector instance. Many planner specific properties can be set through the FlowConnectorProps fluent interface.

Some planners have required properties. Hadoop expects AppProps.setApplicationJarPath(java.util.Map, String) or AppProps.setApplicationJarClass(java.util.Map, Class) to be set.

Any properties set and passed through the FlowConnector constructor will be global to all Flow instances created through the that FlowConnector instance. Some properties are on the FlowDef and would only be applicable to the resulting Flow instance.

These properties are used to influence the current planner and are also passed down to the execution framework to override any default values. For example when using the Hadoop planner, the number of reducers or mappers can be set by using platform specific properties.

Custom operations (Functions, Filter, etc) may also retrieve these property values at runtime through calls to FlowProcess.getProperty(String) or FlowProcess.getStringProperty(String).

Most applications will need to call AppProps.setApplicationJarClass(java.util.Map, Class) or AppProps.setApplicationJarPath(java.util.Map, String) so that the correct application jar file is passed through to all child processes. The Class or path must reference the custom application jar, not a Cascading library class or jar. The easiest thing to do is give setApplicationJarClass the Class with your static main function and let Cascading figure out which jar to use.

Note that Map is compatible with the Properties class, so properties can be loaded at runtime from a configuration file.

By default, all Assertions are planned into the resulting Flow instance. This can be changed for a given Flow by calling FlowDef.setAssertionLevel(cascading.operation.AssertionLevel) or globally via FlowConnectorProps.setAssertionLevel(cascading.operation.AssertionLevel).

Also by default, all Debugs are planned into the resulting Flow instance. This can be changed for a given flow by calling FlowDef.setDebugLevel(cascading.operation.DebugLevel) or globally via FlowConnectorProps.setDebugLevel(cascading.operation.DebugLevel).

See Also:
LocalFlowConnector, HadoopFlowConnector

Field Summary
protected  Map<Object,Object> properties
          Field properties
 
Constructor Summary
protected FlowConnector()
           
protected FlowConnector(Map<Object,Object> properties)
           
 
Method Summary
 Flow connect(FlowDef flowDef)
           
 Flow connect(Map<String,Tap> sources, Map<String,Tap> sinks, Pipe... tails)
          Method connect links the named sources and sinks to the given pipe assembly.
 Flow connect(Map<String,Tap> sources, Tap sink, Pipe tail)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(String name, Map<String,Tap> sources, Map<String,Tap> sinks, Map<String,Tap> traps, Pipe... tails)
          Method connect links the named sources, sinks and traps to the given pipe assembly.
 Flow connect(String name, Map<String,Tap> sources, Map<String,Tap> sinks, Pipe... tails)
          Method connect links the named sources and sinks to the given pipe assembly.
 Flow connect(String name, Map<String,Tap> sources, Tap sink, Map<String,Tap> traps, Pipe tail)
          Method connect links the named source and trap Taps and sink Tap to the given pipe assembly.
 Flow connect(String name, Map<String,Tap> sources, Tap sink, Pipe tail)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(String name, Tap source, Map<String,Tap> sinks, Collection<Pipe> tails)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(String name, Tap source, Map<String,Tap> sinks, Pipe... tails)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(String name, Tap source, Tap sink, Map<String,Tap> traps, Pipe tail)
          Method connect links the named trap Taps, source and sink Tap to the given pipe assembly.
 Flow connect(String name, Tap source, Tap sink, Pipe tail)
          Method connect links the given source and sink Taps to the given pipe assembly.
 Flow connect(String name, Tap source, Tap sink, Tap trap, Pipe tail)
          Method connect links the given source, sink, and trap Taps to the given pipe assembly.
 Flow connect(Tap source, Map<String,Tap> sinks, Collection<Pipe> tails)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(Tap source, Map<String,Tap> sinks, Pipe... tails)
          Method connect links the named source Taps and sink Tap to the given pipe assembly.
 Flow connect(Tap source, Tap sink, Pipe tail)
          Method connect links the given source and sink Taps to the given pipe assembly.
protected abstract  FlowPlanner createFlowPlanner()
           
protected abstract  Class<? extends Scheme> getDefaultIntermediateSchemeClass()
           
 Class getIntermediateSchemeClass(Map<Object,Object> properties)
          Method getIntermediateSchemeClass is used for debugging.
 PlatformInfo getPlatformInfo()
          Method getPlatformInfo returns an instance of PlatformInfo for the underlying platform.
 Map<Object,Object> getProperties()
          Method getProperties returns the properties of this FlowConnector object.
static void setApplicationJarClass(Map<Object,Object> properties, Class type)
          Deprecated. 
static void setApplicationJarPath(Map<Object,Object> properties, String path)
          Deprecated. 
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

properties

protected Map<Object,Object> properties
Field properties

Constructor Detail

FlowConnector

protected FlowConnector()

FlowConnector

protected FlowConnector(Map<Object,Object> properties)
Method Detail

getIntermediateSchemeClass

public Class getIntermediateSchemeClass(Map<Object,Object> properties)
Method getIntermediateSchemeClass is used for debugging.

Parameters:
properties - of type Map
Returns:
Class

setApplicationJarClass

@Deprecated
public static void setApplicationJarClass(Map<Object,Object> properties,
                                                     Class type)
Deprecated. 

This has moved to AppProps.setApplicationJarClass(java.util.Map, Class).

Parameters:
properties -
type -

setApplicationJarPath

@Deprecated
public static void setApplicationJarPath(Map<Object,Object> properties,
                                                    String path)
Deprecated. 

This has moved to AppProps.setApplicationJarPath(java.util.Map, String).

Parameters:
properties -
path -

getDefaultIntermediateSchemeClass

protected abstract Class<? extends Scheme> getDefaultIntermediateSchemeClass()

getProperties

public Map<Object,Object> getProperties()
Method getProperties returns the properties of this FlowConnector object. The returned Map instance is immutable to prevent changes to the underlying property values in this FlowConnector instance.

If a Properties instance was passed to the constructor, the returned object will be a flattened Map instance.

Returns:
the properties (type Map) of this FlowConnector object.

connect

public Flow connect(Tap source,
                    Tap sink,
                    Pipe tail)
Method connect links the given source and sink Taps to the given pipe assembly.

Parameters:
source - source Tap to bind to the head of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Tap source,
                    Tap sink,
                    Pipe tail)
Method connect links the given source and sink Taps to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
source - source Tap to bind to the head of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Tap source,
                    Tap sink,
                    Tap trap,
                    Pipe tail)
Method connect links the given source, sink, and trap Taps to the given pipe assembly. The given trap will be linked to the assembly head along with the source.

Parameters:
name - name to give the resulting Flow
source - source Tap to bind to the head of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
trap - trap Tap to sink all failed Tuples into
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(Map<String,Tap> sources,
                    Tap sink,
                    Pipe tail)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Parameters:
sources - all head names and source Taps to bind to the heads of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Map<String,Tap> sources,
                    Tap sink,
                    Pipe tail)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
sources - all head names and source Taps to bind to the heads of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Map<String,Tap> sources,
                    Tap sink,
                    Map<String,Tap> traps,
                    Pipe tail)
Method connect links the named source and trap Taps and sink Tap to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
sources - all head names and source Taps to bind to the heads of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
traps - all pipe names and trap Taps to sink all failed Tuples into
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Tap source,
                    Tap sink,
                    Map<String,Tap> traps,
                    Pipe tail)
Method connect links the named trap Taps, source and sink Tap to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
source - source Tap to bind to the head of the given tail Pipe
sink - sink Tap to bind to the given tail Pipe
traps - all pipe names and trap Taps to sink all failed Tuples into
tail - tail end of a pipe assembly
Returns:
Flow

connect

public Flow connect(Tap source,
                    Map<String,Tap> sinks,
                    Collection<Pipe> tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Since only once source Tap is given, it is assumed to be associated with the 'head' pipe. So the head pipe does not need to be included as an argument.

Parameters:
source - source Tap to bind to the head of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Tap source,
                    Map<String,Tap> sinks,
                    Collection<Pipe> tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Since only once source Tap is given, it is assumed to be associated with the 'head' pipe. So the head pipe does not need to be included as an argument.

Parameters:
name - name to give the resulting Flow
source - source Tap to bind to the head of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(Tap source,
                    Map<String,Tap> sinks,
                    Pipe... tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Since only once source Tap is given, it is assumed to be associated with the 'head' pipe. So the head pipe does not need to be included as an argument.

Parameters:
source - source Tap to bind to the head of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Tap source,
                    Map<String,Tap> sinks,
                    Pipe... tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.

Since only once source Tap is given, it is assumed to be associated with the 'head' pipe. So the head pipe does not need to be included as an argument.

Parameters:
name - name to give the resulting Flow
source - source Tap to bind to the head of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(Map<String,Tap> sources,
                    Map<String,Tap> sinks,
                    Pipe... tails)
Method connect links the named sources and sinks to the given pipe assembly.

Parameters:
sources - all head names and source Taps to bind to the heads of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Map<String,Tap> sources,
                    Map<String,Tap> sinks,
                    Pipe... tails)
Method connect links the named sources and sinks to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
sources - all head names and source Taps to bind to the heads of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(String name,
                    Map<String,Tap> sources,
                    Map<String,Tap> sinks,
                    Map<String,Tap> traps,
                    Pipe... tails)
Method connect links the named sources, sinks and traps to the given pipe assembly.

Parameters:
name - name to give the resulting Flow
sources - all head names and source Taps to bind to the heads of the given tail Pipes
sinks - all tail names and sink Taps to bind to the given tail Pipes
traps - all pipe names and trap Taps to sink all failed Tuples into
tails - all tail ends of a pipe assembly
Returns:
Flow

connect

public Flow connect(FlowDef flowDef)

createFlowPlanner

protected abstract FlowPlanner createFlowPlanner()

getPlatformInfo

public PlatformInfo getPlatformInfo()
Method getPlatformInfo returns an instance of PlatformInfo for the underlying platform.

Returns:
of type PlatformInfo


Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.