Driven Administration Guide
version 1.1.1
Setting up persistence cluster
Driven persists all historical operational details about your Hadoop applications. Size the persistence cluster according to how many applications you run, how often you run them, and how much data they process.
Note
Driven uses Elasticsearch for its persistence. To make configuration changes specific to Elasticsearch, refer to its documentation. Most of the application-specific parameters related to storing the data can be set through properties in the file driven.properties.
Step 1: Specify the directory used by Driven to save data
driven.storage.data.path=./driven
Step 2: Give your persistence cluster a name
driven.storage.cluster.name=driven
Step 3: Specify shards and replicas for persistence scalability
In Elasticsearch, an index is roughly equivalent to a database in a relational DBMS. Just as a relational database has a schema, an Elasticsearch index has a mapping. An index is divided into shards to distribute the data and scale. Replicas are copies of the shards that provide reliability if a node is lost.
To define the settings for your Driven cluster, set these values:
driven.storage.index.cascading.shards=<NUMBER_OF_SHARDS>
driven.storage.index.cascading.replicas=<NUMBER_OF_REPLICAS>
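As a rough planning aid, the arithmetic behind these two values can be sketched as follows. This is a minimal sketch and not part of Driven: the function name shard_layout and its arguments are hypothetical. It relies only on general Elasticsearch behavior: each primary shard has the configured number of replica copies, and a replica is never assigned to the same node as another copy of the same shard.

```python
def shard_layout(primary_shards: int, replicas: int, nodes: int) -> dict:
    """Summarize the shard copies an index needs (hypothetical helper, not a Driven API)."""
    if primary_shards < 1 or replicas < 0 or nodes < 1:
        raise ValueError("need at least 1 shard, 0 or more replicas, and 1 node")

    # Every primary shard is stored once, plus once per replica.
    total_copies = primary_shards * (1 + replicas)

    # Copies of the same shard must live on different nodes, so all
    # replicas can only be assigned when there are at least replicas + 1 nodes.
    fully_assignable = nodes >= replicas + 1

    return {
        "total_shard_copies": total_copies,
        "fully_assignable": fully_assignable,
    }


if __name__ == "__main__":
    # Example values only; choose numbers that match your own cluster.
    print(shard_layout(primary_shards=5, replicas=1, nodes=3))
```

With the example values, 5 shards and 1 replica give 10 shard copies in total, and 3 nodes are enough to assign all of them. On a single node, any replica count above 0 leaves the replicas unassigned.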
Step 4: Define the hosts participating in the Elasticsearch cluster (leave the value blank for single-node systems)
driven.storage.cluster.discovery.unicast.hosts=
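The guide does not show the format of a non-empty host list. The sketch below assumes the value follows the usual Elasticsearch unicast convention of comma-separated host or host:port entries; that format, the function name parse_unicast_hosts, and the example host names are assumptions, so confirm the accepted syntax for your Driven version. The default of 9300 is the standard Elasticsearch transport port. A helper like this can be used to check a value before placing it in driven.properties.

```python
def parse_unicast_hosts(value: str, default_port: int = 9300) -> list:
    """Split a comma-separated host list into (host, port) pairs.

    Hypothetical helper: assumes entries look like "host" or "host:port".
    IPv6 literals are not handled.
    """
    hosts = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            # A blank value means a single-node system: no hosts to contact.
            continue
        host, sep, port = entry.partition(":")
        # int() raises ValueError if the port is missing or not numeric.
        hosts.append((host, int(port) if sep else default_port))
    return hosts


if __name__ == "__main__":
    # Placeholder host names for illustration only.
    print(parse_unicast_hosts("es1.example.com, es2.example.com:9301"))
    print(parse_unicast_hosts(""))
```

An empty result corresponds to the single-node case described in this step.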