Installing the self-hosted version of Driven

version 1.1.1

Overview

Driven is a web application that helps you visualize the operational details of every phase of your Cascading application: application development, debugging, performance tuning, and operator monitoring. The Driven application receives telemetry data from your running Cascading applications (through the Driven Plugin) and analyzes the signals to provide you with meaningful insights.

You do not need to make any changes to your existing Cascading applications to integrate with the Driven application. To use Driven, the Driven Plugin must be visible to your Cascading application. Driven is free for development use.

Using the self-hosted version of Driven

Step 1: To get the latest version of the Driven Plugin jar for the self-hosted version (from Unix or Mac OS), invoke:

$ wget -i http://files.concurrentinc.com/driven/1.1/driven-plugin/latest-shaded-jar.txt

Step 2: If you have not done so already, log into the Driven application and click the user icon in the top-right corner of the landing page. Select 'Team list', and note the key associated with your team. Please note that features associated with Teams are only available in the on-premise version of the Driven application.

Running your application on Hadoop

Step 1: Make the API key visible to the Driven Plugin. Create cascading-service.properties in your Hadoop configuration directory (referred to below as $HADOOP_CONF). For Hadoop 1.x, this may be $HADOOP_HOME/conf; for Hadoop 2.x, it will be $HADOOP_INSTALL/etc/hadoop.

$ echo cascading.management.document.service.apikey=_YOUR_API_KEY_ >> $HADOOP_CONF/cascading-service.properties

Step 2: Make the plugin available by adding the following property to $HADOOP_CONF/cascading-service.properties:

$ echo cascading.management.service.jar=path/to/driven-plugin-1.1.1-io.jar >> $HADOOP_CONF/cascading-service.properties

Alternatively, you can set the path to the plugin through the $HADOOP_CLASSPATH environment variable:

$ export HADOOP_CLASSPATH=path/to/driven-plugin-1.1.1-io.jar
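Note that a plain assignment replaces whatever HADOOP_CLASSPATH already contained. If other entries need to be kept, the plugin can be appended instead. The following is a minimal sketch; PLUGIN_JAR is a hypothetical helper variable, not something Driven or Hadoop defines, and the path is the same placeholder used above.

```shell
# Sketch: add the Driven plugin to HADOOP_CLASSPATH without discarding existing entries.
# PLUGIN_JAR is a hypothetical helper variable; point it at your copy of the jar.
PLUGIN_JAR="path/to/driven-plugin-1.1.1-io.jar"

# ${VAR:+...} expands only when VAR is set and non-empty, so no stray ":" is added.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$PLUGIN_JAR"

echo "$HADOOP_CLASSPATH"
```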

Step 3: Finally, set the host URL key.

$ echo cascading.management.document.service.hosts=_YOUR_DRIVEN_SERVER_URL_ >> $HADOOP_CONF/cascading-service.properties
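The three Hadoop steps can also be collected into one small script. This is only a sketch under stated assumptions: the variable names API_KEY, SERVER_URL and PLUGIN_JAR and the set_prop helper are made up for illustration, the values are the same placeholders used in the steps, and HADOOP_CONF falls back to a scratch directory so the script can be tried without touching a real cluster configuration.

```shell
#!/bin/sh
# Sketch: write the three Driven properties to cascading-service.properties.
set -eu

# Assumptions: replace the placeholder values with your own settings.
HADOOP_CONF="${HADOOP_CONF:-/tmp/hadoop-conf-sketch}"   # scratch default, not a real Hadoop path
API_KEY="${DRIVEN_API_KEY:-_YOUR_API_KEY_}"
SERVER_URL="${DRIVEN_SERVER_HOSTS:-_YOUR_DRIVEN_SERVER_URL_}"
PLUGIN_JAR="${PLUGIN_JAR:-path/to/driven-plugin-1.1.1-io.jar}"

PROPS="$HADOOP_CONF/cascading-service.properties"
mkdir -p "$HADOOP_CONF"
touch "$PROPS"

# Hypothetical helper: append key=value unless the file already contains "key=".
set_prop() {
  if grep -qF "$1=" "$PROPS"; then
    echo "unchanged: $1 is already present in $PROPS"
  else
    echo "$1=$2" >> "$PROPS"
  fi
}

set_prop cascading.management.document.service.apikey "$API_KEY"
set_prop cascading.management.service.jar "$PLUGIN_JAR"
set_prop cascading.management.document.service.hosts "$SERVER_URL"

cat "$PROPS"
```

Unlike the plain `>>` commands, the helper skips keys that are already present, so running the script twice does not leave duplicate entries. To change an existing value, edit the file by hand.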

Running your application in Local Mode

Step 1: Expose the Driven API key either by setting a local environment variable:

$ export DRIVEN_API_KEY=_YOUR_API_KEY_

or by setting a JVM-level system property named cascading.management.document.service.apikey:

$ java -Dcascading.management.document.service.apikey=_YOUR_API_KEY_ ....

Step 2: Next, add the Driven plugin to your classpath. The plugin only needs to be placed into the CLASSPATH environment variable or specified on the java command line using the -cp flag.

To run, you will have a command line similar to this:

$ java -cp your.jar:driven-plugin-1.1.1-io.jar your.Main arg1 arg2

Step 3: Finally, specify how the plugin will reach the Driven application. This can be done either by setting a local environment variable:

$ export DRIVEN_SERVER_HOSTS=_YOUR_DRIVEN_SERVER_URL_

or by setting a JVM-level system property named cascading.management.document.service.hosts:

$ java -Dcascading.management.document.service.hosts=_YOUR_DRIVEN_SERVER_URL_ ....
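Putting the three local-mode steps together, a launch might look like the sketch below. It uses the system-property form for both settings. The jar names, main class and arguments are the placeholders from Step 2, and CLASSPATH_ENTRIES, CMD and DRY_RUN are hypothetical names introduced here. By default the script only prints the command, so it can be inspected before anything is run.

```shell
#!/bin/sh
# Sketch: assemble a local-mode launch command from the three steps above.

# Placeholders: substitute your own key, server URL, jars and main class.
DRIVEN_API_KEY="${DRIVEN_API_KEY:-_YOUR_API_KEY_}"
DRIVEN_SERVER_HOSTS="${DRIVEN_SERVER_HOSTS:-_YOUR_DRIVEN_SERVER_URL_}"
CLASSPATH_ENTRIES="your.jar:driven-plugin-1.1.1-io.jar"

# Build the command as positional parameters so each argument stays intact.
set -- java \
  "-Dcascading.management.document.service.apikey=$DRIVEN_API_KEY" \
  "-Dcascading.management.document.service.hosts=$DRIVEN_SERVER_HOSTS" \
  -cp "$CLASSPATH_ENTRIES" \
  your.Main arg1 arg2

CMD="$*"
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CMD"   # dry run (the default): print the command only
else
  "$@"          # DRY_RUN=0: actually launch the application
fi
```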

If you wish to package the Driven Plugin directly into your application, you may simply add the Maven spec to your project file as a dependency. This is particularly useful if you plan to use Driven from your IDE as you develop your Cascading application. See Configuring Maven and Gradle for details.

Note
There are multiple ways to integrate your Driven plugin jar into your application, so it is always wise to verify that you are picking up the file from the correct location. When your Cascading application runs, it will log the path of the Driven Plugin jar file through the properties 'ServiceLoader', 'CascadingService', and 'DrivenDocumentService'. Ensure that all three properties refer to the same Driven plugin.
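One way to make the check described in the note is to capture the application's output and filter it for the three names. The sketch below assumes the output was saved to a file. The exact format of the log lines is not specified here, so the function only matches on the names themselves, and show_driven_plugin_lines is a hypothetical helper.

```shell
# Sketch: print the log lines that mention the three names listed in the note.
# Usage: show_driven_plugin_lines LOG_FILE
show_driven_plugin_lines() {
  grep -E 'ServiceLoader|CascadingService|DrivenDocumentService' "$1"
}
```

For example, after running the application with its output redirected to app.log (an assumed file name), `show_driven_plugin_lines app.log` lists the relevant lines so the jar paths can be compared by eye.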

Now, when you run your Cascading application, you will be able to visualize its operational details by going to the URL for your Driven application.

Running on Amazon EMR

Step 1: You can use the Driven plugin with Amazon Elastic MapReduce by adding this to the elastic-mapreduce client command line arguments:

--bootstrap-action s3://files.concurrentinc.com/driven/1.1/driven-plugin/install-driven-plugin.sh

This installation method installs the plugin via the Hadoop configuration so that Cascading applications launched from the Elastic MapReduce web console or the elastic-mapreduce command line client will pick up the plugin.

Step 2: Alternatively, if you pre-installed the plugin, you can pass the API key and the Driven server URL as arguments:

--bootstrap-action s3://files.concurrentinc.com/driven/1.1/driven-plugin/install-driven-plugin.sh --args "--api-key,_YOUR_API_KEY_,--host,_YOUR_DRIVEN_SERVER_URL_"

Maven and Gradle Dependencies

The Driven Plugin is available through our Maven compatible repository.

Note
It is important to specify the proper artifact classifier. The Driven Plugin is provided as a self-contained 'fat jar' with all its dependencies embedded. See the build file notes for restrictions on fatJar, shading, uberjar, and other means of repackaging the Driven plugin.

Maven

In the <repositories> block add:

<repository>
  <id>conjars</id>
  <url>http://conjars.org/repo</url>
</repository>

and then add the dependency:

<dependency>
  <groupId>driven</groupId>
  <artifactId>driven-plugin</artifactId>
  <version>1.1.1</version>
  <classifier>io</classifier>
</dependency>

Gradle

mavenRepo name: 'conjars', url: 'http://conjars.org/repo'

And then add the dependency:

compile group: 'driven', name: 'driven-plugin', version: '1.1.1', classifier:'io'

Getting Support

Support for Driven is provided via a public forum:

or by emailing:

FAQ

For topics not covered here, check out the FAQ.