cascading.flow.hadoop
Class HadoopFlow
java.lang.Object
cascading.flow.BaseFlow<JobConf>
cascading.flow.hadoop.HadoopFlow
- All Implemented Interfaces:
- Flow<JobConf>, UnitOfWork<FlowStats>
- Direct Known Subclasses:
- MapReduceFlow, ProcessFlow
public class HadoopFlow
- extends BaseFlow<JobConf>
Class HadoopFlow is the Apache Hadoop specific implementation of a Flow
.
HadoopFlow must be created through a HadoopFlowConnector
instance.
If classpath paths are provided on the FlowDef
, the Hadoop distributed cache mechanism will be used
to augment the remote classpath.
Any path elements that are relative will be uploaded to HDFS, and the HDFS URI will be used on the JobConf. Note
all paths are added as "files" to the JobConf, not archives, so they aren't needlessly uncompressed cluster side.
- See Also:
HadoopFlowConnector
Constructor Summary |
protected |
HadoopFlow()
|
|
HadoopFlow(PlatformInfo platformInfo,
Map<Object,Object> properties,
JobConf jobConf,
FlowDef flowDef)
|
protected |
HadoopFlow(PlatformInfo platformInfo,
Map<Object,Object> properties,
JobConf jobConf,
String name,
Map<String,String> flowDescriptor)
|
Methods inherited from class cascading.flow.BaseFlow |
addListener, addStepListener, areSinksStale, areSourcesNewer, cleanup, complete, createConfig, createFlowThread, deleteCheckpointsIfNotUpdate, deleteCheckpointsIfReplace, deleteSinks, deleteSinksIfNotUpdate, deleteSinksIfReplace, deleteTrapsIfNotUpdate, deleteTrapsIfReplace, fireOnCompleted, fireOnStarting, fireOnStopping, fireOnThrowable, getCascadeID, getCascadingServices, getCheckpointNames, getCheckpoints, getCheckpointsCollection, getClassPath, getFieldsFor, getFlowDescriptor, getFlowSession, getFlowSkipStrategy, getFlowStats, getFlowSteps, getFlowStepStrategy, getHolder, getID, getName, getPlatformInfo, getRunID, getSink, getSink, getSinkModified, getSinkNames, getSinks, getSinksCollection, getSource, getSourceNames, getSources, getSourcesCollection, getSpawnStrategy, getStats, getSubmitPriority, getTags, getTrapNames, getTraps, getTrapsCollection, handleExecutorShutdown, hasListeners, hasStepListeners, initialize, initializeNewJobsMap, initSteps, internalStopAllJobs, isSkipFlow, isStopJobsOnExit, logInfo, openSink, openSink, openSource, openSource, openTapForRead, openTapForWrite, openTrap, openTrap, prepare, presentSinkFields, presentSourceFields, registerShutdownHook, removeListener, removeStepListener, resourceExists, retrieveSinkFields, retrieveSourceFields, setCascade, setCheckpoints, setFlowSkipStrategy, setFlowStepGraph, setFlowStepStrategy, setName, setSinks, setSources, setSpawnStrategy, setSubmitPriority, setTraps, start, stop, toString, updateSchemes, writeDOT, writeStepsDOT |
HadoopFlow
protected HadoopFlow()
HadoopFlow
protected HadoopFlow(PlatformInfo platformInfo,
Map<Object,Object> properties,
JobConf jobConf,
String name,
Map<String,String> flowDescriptor)
HadoopFlow
public HadoopFlow(PlatformInfo platformInfo,
Map<Object,Object> properties,
JobConf jobConf,
FlowDef flowDef)
initFromProperties
protected void initFromProperties(Map<Object,Object> properties)
- Overrides:
initFromProperties
in class BaseFlow<JobConf>
initConfig
protected void initConfig(Map<Object,Object> properties,
JobConf parentConfig)
- Specified by:
initConfig
in class BaseFlow<JobConf>
setConfigProperty
protected void setConfigProperty(JobConf config,
Object key,
Object value)
- Specified by:
setConfigProperty
in class BaseFlow<JobConf>
newConfig
protected JobConf newConfig(JobConf defaultConfig)
- Specified by:
newConfig
in class BaseFlow<JobConf>
getConfig
public JobConf getConfig()
getConfigCopy
public JobConf getConfigCopy()
getConfigAsProperties
public Map<Object,Object> getConfigAsProperties()
getProperty
public String getProperty(String key)
- Method getProperty returns the value associated with the given key from the underlying properties system.
- Parameters:
key
- of type String
- Returns:
- String
getFlowProcess
public FlowProcess<JobConf> getFlowProcess()
isPreserveTemporaryFiles
public boolean isPreserveTemporaryFiles()
- Method isPreserveTemporaryFiles returns false if temporary files will be cleaned when this Flow completes.
- Returns:
- the preserveTemporaryFiles (type boolean) of this Flow object.
internalStart
protected void internalStart()
- Specified by:
internalStart
in class BaseFlow<JobConf>
stepsAreLocal
public boolean stepsAreLocal()
internalClean
protected void internalClean(boolean stop)
- Specified by:
internalClean
in class BaseFlow<JobConf>
internalShutdown
protected void internalShutdown()
- Specified by:
internalShutdown
in class BaseFlow<JobConf>
getMaxNumParallelSteps
protected int getMaxNumParallelSteps()
- Specified by:
getMaxNumParallelSteps
in class BaseFlow<JobConf>
Copyright © 2007-2014 Concurrent, Inc. All Rights Reserved.