Driven Administration Guide
version 1.1.1
Installing the Driven Plugin
The Driven Plugin is the component that collects data from a running Cascading application and sends it to a Driven Server for processing and analysis. It must be installed and accessible to the Cascading process.
Configuration Options
General use of Driven requires no customization beyond making the plugin available to the running job. There are two main ways to configure the plugin: a cascading-service.properties file and environment variables.
To use cascading-service.properties, place the file in the Hadoop classpath; the easiest way is to add it to Hadoop’s configuration directory. To use environment variables, set them for the process that runs the Cascading job.
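As a minimal sketch of the properties-file approach, the following writes a cascading-service.properties file into Hadoop’s configuration directory. It assumes that the HADOOP_CONF_DIR environment variable identifies that directory (a temporary directory stands in when it is unset), and the jar path is the placeholder used in this guide rather than a real location.

```shell
# Assumption: HADOOP_CONF_DIR points at Hadoop's configuration directory.
# A temporary directory is used as a stand-in when it is not set.
conf_dir="${HADOOP_CONF_DIR:-$(mktemp -d)}"

# Write the properties file; the jar path is a placeholder to replace.
cat > "$conf_dir/cascading-service.properties" <<'EOF'
cascading.management.service.jar=/path/to/driven-plugin.jar
EOF

echo "Wrote $conf_dir/cascading-service.properties"
```

Note that this overwrites any existing cascading-service.properties in that directory, so append to the file instead if it already holds other settings.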
Driven Plugin Path
The Driven plugin must be placed in the Cascading job classpath.
cascading-service.properties
cascading.management.service.jar=/path/to/driven-plugin.jar
or, Hadoop classpath
$ export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/path/to/driven-plugin.jar"
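When HADOOP_CLASSPATH may be unset or empty, the export above leaves a leading colon (an empty classpath entry). A hedged variant that adds the separator only when there is an existing value is sketched below; the plugin_jar variable name is illustrative and the path is a placeholder.

```shell
# Placeholder path; replace with the actual location of the plugin jar.
plugin_jar="/path/to/driven-plugin.jar"

# ${VAR:+text} expands to "text" only when VAR is set and non-empty,
# so the ":" separator is added only if there is an existing classpath.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}${plugin_jar}"

echo "$HADOOP_CLASSPATH"
```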
Driven Server URL
The self-hosted version of Driven Server requires the plugin to be configured with the server’s URL.
cascading-service.properties
cascading.management.document.service.hosts=http<s>://hostname<:port>
or, Environment Variable
$ export DRIVEN_SERVER_HOSTS=http<s>://hostname<:port>
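For illustration, the URL pattern filled in with concrete values might look like the sketch below. The host name driven.example.com and port 8080 are hypothetical and stand in for the address of your own Driven Server.

```shell
# Hypothetical host and port; substitute your own Driven Server address.
driven_host="driven.example.com"
driven_port="8080"

# Use "http" instead of "https" if the server is not served over TLS.
export DRIVEN_SERVER_HOSTS="https://${driven_host}:${driven_port}"

echo "$DRIVEN_SERVER_HOSTS"
```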
API Key
API keys map to Teams in Driven (refer to Configuring Teams for Collaboration in the Driven User Guide). An app that uses an API key is searchable by members of the team that owns the key.
cascading-service.properties
cascading.management.document.service.apikey=YOUR_API_KEY
or, Environment Variable
$ export DRIVEN_API_KEY=YOUR_API_KEY
Default App Tags
In addition to app tags added by the developer, default tags can be added automatically via configuration.
cascading-service.properties
driven.protocol.tags=CSV_TAGS
or, Environment Variable
$ export DRIVEN_DEFAULT_TAGS=CSV_TAGS
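Reading CSV_TAGS as a comma-separated list of tag names (an assumption based on the placeholder's name), a filled-in setting might look like the following sketch. The tag names are hypothetical.

```shell
# Hypothetical tag names, joined with commas and no surrounding spaces.
export DRIVEN_DEFAULT_TAGS="production,cluster-a,etl"

echo "$DRIVEN_DEFAULT_TAGS"
```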
Slice Data Suppression
Slice data transmission can be suppressed by setting a property in cascading-service.properties or the equivalent environment variable. If present, this setting overrides the server-side configuration.
cascading-service.properties
driven.protocol.slice.suppress=true
or, Environment Variable
$ export DRIVEN_SUPPRESS_SLICE_DATA=true
Slice Data on Completion
Controls whether slice data is sent on completion. The default is true.
cascading-service.properties
driven.protocol.slice.state_change_only=true
or, Environment Variable
$ export DRIVEN_SLICE_STATE_CHANGE_ONLY=true
Java Command Suppression
The Java command for a Cascading job can be suppressed by setting a property in cascading-service.properties or the equivalent environment variable. If present, this setting overrides the server-side configuration.
cascading-service.properties
driven.protocol.command.suppress=true
or, Environment Variable
$ export DRIVEN_SUPPRESS_JAVA_COMMAND=true
Archive Mode
The Driven plugin can be run in archive mode. When archive mode is enabled, all records sent to the Driven Server are written to disk, even if the server is unreachable. This is useful when the Driven Server is unavailable, because the archive can be replayed later once the server is reachable. Sending data to the server is idempotent, so replaying data does not corrupt data that has already been recorded.
cascading-service.properties
cascading.management.document.service.archive.dir=/path/to/archive/directory
or, Environment Variable
$ export DRIVEN_ARCHIVE_DIR=/path/to/archive/directory
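Because this guide advises against tilde (~) expansion in the archive path, one way to point at a directory under the user's home is to build an absolute path from $HOME, as sketched below. The driven-archive directory name is hypothetical, and creating the directory up front is a precaution rather than a documented requirement.

```shell
# Hypothetical location; $HOME yields an absolute path without relying on "~".
archive_dir="${HOME:-/tmp}/driven-archive"

# Create the directory in advance (a precaution, not a documented requirement).
mkdir -p "$archive_dir"

export DRIVEN_ARCHIVE_DIR="$archive_dir"
echo "$DRIVEN_ARCHIVE_DIR"
```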
Amazon EMR
To use the Driven plugin with Amazon Elastic MapReduce (EMR), pass the following argument on the client command line.
--bootstrap-action //driven-plugin/install-driven-plugin.sh --args "--host,${driven_server_hosts},--api-key,${driven_api_key}"
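The ${driven_server_hosts} and ${driven_api_key} placeholders in that argument can be supplied as shell variables. The sketch below shows only how the comma-separated --args value is assembled; the server address is hypothetical, and the rest of the EMR client invocation is omitted.

```shell
# Hypothetical values; substitute your own server address and API key.
driven_server_hosts="https://driven.example.com:8080"
driven_api_key="YOUR_API_KEY"

# Comma-separated argument list in the form shown in this guide.
bootstrap_args="--host,${driven_server_hosts},--api-key,${driven_api_key}"

echo "$bootstrap_args"
```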
Note: In archive mode, the compressed archive files are written to the configured archive path, marked with the timestamp of when the Cascading job started. Do not use tilde (~) expansion in the path.
License
Make sure that the copyright information is included.
Copyright (c) 2007-2014 Concurrent, Inc. All Rights Reserved. Project and contact information: http://www.concurrentinc.com/