Driven FAQ

version 1.2

What are the benefits of registering for the beta version of Driven instead of using the beta version anonymously?

Driven beta releases are incremental builds of the product that precede the general release of the final version. Beta versions let you preview and test new features without installing the Driven Server because the cascading.io service hosts the server.

An anonymous implementation of Driven has limited functionality. For example, if you download and install the beta Driven Plugin without registering, you can view data about individual application instances, but you cannot view the aggregated-metric dashboards and tables that help you compare application runs. However, an anonymous beta environment can be an easier way to try the product for the first time.

When you sign up at the Driven Beta registration page and configure the Driven API key, you can view compilations of application runs and other data with filtering capabilities. In addition, a registered beta environment can use Driven’s teams feature. The additional capabilities are useful for categorizing applications according to lines of business or for comparing historical execution behavior across the same class of applications to identify outliers and trends.

Can I run my own Driven Server on-premises or remotely at my cloud or hosting provider?

Yes, you can deploy Driven in both environments with a trial license, which you can obtain on the Driven Trial registration page.

What versions of Hadoop are supported?

Driven works with any version of Hadoop that is compatible with Cascading. See the Cascading compatibility matrixes for the full list of supported versions.

What versions of the JDK are supported?

The matrix of supported JDKs is on the Cascading compatibility matrixes page.

Does the Driven Plugin affect the execution of the Cascading application?

If your Cascading application executes successfully without the Driven Plugin, it executes successfully whether or not the Driven Plugin transmits all the telemetry data successfully. The push of the data to the Driven Server is decoupled from the execution of the Cascading MapReduce application.

I am seeing out-of-memory errors. What are the possible causes?

Out-of-memory (OOM) messages from the Driven Plugin are most likely a symptom of an underlying problem rather than the problem itself.

Some possible underlying reasons for the OOM messages are:

The Cascading application is transmitting data to the Driven Plugin at a rate faster than the plugin can send the data to the Driven Server.
The Driven Server is unavailable.
The Cascading application is unable to connect to the Driven Server.
The cluster is inadequately provisioned for the NameNode processes and their host machine (the Driven Plugin pings the NameNode for slice information). Try setting the -Xms and -Xmx options of both your Cascading application and the NameNode processes to the following: -Xms4096m -Xmx4096m
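
As a rough sketch of how the heap settings in the last item might be applied: the environment variable names below are assumptions about a typical Hadoop installation and are not taken from the Driven documentation. HADOOP_CLIENT_OPTS is commonly read by the client JVM that launches a job, and HADOOP_NAMENODE_OPTS is commonly set in hadoop-env.sh on the NameNode host. Your Hadoop distribution or version may use different names, so check its documentation before relying on them.

```shell
# Heap settings suggested in the list above.
HEAP_OPTS="-Xms4096m -Xmx4096m"

# Assumption: the Cascading application is launched through the hadoop
# client script, which reads HADOOP_CLIENT_OPTS for the client JVM.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} ${HEAP_OPTS}"

# Assumption: the NameNode JVM reads HADOOP_NAMENODE_OPTS. This line would
# normally live in hadoop-env.sh on the NameNode host, and the NameNode
# must be restarted for the change to take effect.
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} ${HEAP_OPTS}"

# Print both values so that they can be checked before a restart.
echo "client:   ${HADOOP_CLIENT_OPTS}"
echo "namenode: ${HADOOP_NAMENODE_OPTS}"
```

Appending to the existing values, rather than overwriting them, keeps any options that are already configured for those JVMs.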

If your Cascading application executes successfully without the Driven Plugin, it will execute successfully whether or not the Driven Plugin transmits all the telemetry data successfully. The push of the data to the Driven Server is decoupled from the execution of the Cascading MapReduce application.

How can I ensure that I see the application execution data that the Driven Plugin detects?

The multilayered Driven web client renders much of the application performance data that the plugin sends. Provided that the Driven Server and the plugin are operational and connected, you should not typically need to go beyond the Driven graphical user interface to discover the information that you need.

An "archive mode" configuration is available for the plugin, which writes to disk all data that is sent by the plugin. The archive can be "replayed" to view the information in a log-like format. Archive mode allows you to view data transmitted from the plugin when the Driven Server is not functioning or cannot be accessed. The archive also provides additional details of application execution that are not displayed in the Driven web client.

To use archive mode:

Either pass the following JVM property to the Cascading job:

-Dcascading.management.document.service.archive.dir=/path/to/archive

Or you can set the HADOOP_OPTS environment variable:

> export HADOOP_OPTS="$HADOOP_OPTS -Dcascading.management.document.service.archive.dir=/path/to/archive"

The compressed archive files are saved to path names that consist of the Cascading application name and the ID. Specify the full path to the archive directory; do not rely on ~ expansion for the path.
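
One way to avoid ~ in the archive path is to build the path from $HOME, which the shell expands inside double quotes (a ~ inside double quotes is passed through literally). The following is a minimal sketch; the directory name driven-archive is a hypothetical example, and the fallback to /tmp is only an illustration for the case in which HOME is not set.

```shell
# Hypothetical archive location, built from $HOME instead of ~.
ARCHIVE_DIR="${HOME:-/tmp}/driven-archive"

# Warn if the path is not absolute (illustrative sanity check only).
case "$ARCHIVE_DIR" in
  /*) ;;
  *) echo "warning: archive path is not absolute: $ARCHIVE_DIR" >&2 ;;
esac

# Append the archive property to any options that are already set.
export HADOOP_OPTS="${HADOOP_OPTS:-} -Dcascading.management.document.service.archive.dir=${ARCHIVE_DIR}"

# Show the resulting options before launching the Cascading job.
echo "$HADOOP_OPTS"
```

Printing HADOOP_OPTS before you start the job makes it easy to confirm that the property contains a fully expanded path.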