Driven User Guide

version 1.2

Focusing on Relevant Data with Searches and Saved Views

The Status Timeline and Status Frequency graphs provide an overall profile of accumulated application-wide processing. Use other areas of the Driven page to:

Drill into more granular processes of separate runs of the same application
Save searches with particular filters and search terms as views so that you can retrieve the search criteria and apply them to data later with one click
See application execution information formatted in reports that facilitate service-level agreement analysis
Automatically retrieve a status view for each application that you can access as a member of a team with viewing permission

Figure 1 shows the areas of the user interface that allow you to filter what data populates the reported metrics on the page when you want to work beyond the Status Timeline and Status Frequency graphs. The figure shows graphs and a table in a Driven status view. When you drill deeper into the components and runtimes of particular applications (see application view documentation), the main area of the window redraws in a different format. However, the application view has the same filtering and search handles.

Figure 1: Searching and filtering controls of application data, including saved views

The Driven Plugin collects a rich set of operational data with each application run, which is indexed in the persistence layer of the Driven server. The search feature lets you query for these insights.

The following sections cover some key search and filtering functions of Driven. With these parameters, you can focus on the dimensions of the applications in your cluster that are of most interest to you. You can also save these snapshots of information as views. Views let you shift the focus of your work in Driven without fear of losing the insights from a data set that is collected by particular criteria.

Starting a Search

Searches that you initiate at the top of the Driven window support several application-, process-, and tap-level query attributes. This is the starting point for mining Driven data if you do not have access to any saved views that can help you.

Note	You can use the asterisk (*) as a wildcard character. The Search feature does not support spaces or other special characters ("", @, #, $, <, >, ?, etc.) in search strings.
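The restriction in the note above can be checked before a term is pasted into the search field. The following Python sketch is only an illustration: the function name is hypothetical, it is not part of Driven, and the character set covers only the characters that the note lists explicitly (the note ends with "etc.", so the real set may be larger).

```python
# Characters that the note lists as unsupported. The note ends with "etc.",
# so this set is illustrative and may be incomplete.
UNSUPPORTED_CHARS = set('"@#$<>?')


def is_supported_search_term(term: str) -> bool:
    """Return True if the term has no whitespace and no listed special character."""
    return not any(ch.isspace() or ch in UNSUPPORTED_CHARS for ch in term)


print(is_supported_search_term("Cascading-Hive*"))  # asterisk wildcard is allowed
print(is_supported_search_term("sales region"))     # rejected: contains a space
print(is_supported_search_term("user@host"))        # rejected: contains @
```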

There are multiple parameters that you can invoke to query the application runs:

Application, process, and tap metadata - You can use a filter to specify a category for your search term. Click the All drop-down menu to select a parameter. The available search filter parameters are shown in Figure 2.
Date - Click the All dates drop-down menu to select a predefined date range or to customize the date or dates.
Status - Click the Status drop-down menu to filter on an application-run parameter that is listed in Figure 3. The pre-defined Active states filter queries applications that are in Pending, Started, Submitted, or Running state. The pre-defined Finished states filter queries applications that are in Successful, Failed, or Stopped state.
Teams - Click the All teams drop-down menu to filter application data based on association with a Driven team. See the documentation about teams for more information.
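The two predefined status filters described above group the individual states as follows. This Python sketch only models that grouping; the function and the sample run records are hypothetical and do not reflect how Driven stores or queries status values.

```python
# State groups as described for the predefined status filters.
ACTIVE_STATES = {"Pending", "Started", "Submitted", "Running"}
FINISHED_STATES = {"Successful", "Failed", "Stopped"}


def status_group(status: str) -> str:
    """Classify a single application-run state into its predefined filter group."""
    if status in ACTIVE_STATES:
        return "Active"
    if status in FINISHED_STATES:
        return "Finished"
    return "Unknown"


# Hypothetical run records, used only to demonstrate the grouping.
runs = [
    {"name": "etl-load", "status": "Running"},
    {"name": "daily-report", "status": "Failed"},
]
active_runs = [run["name"] for run in runs if status_group(run["status"]) == "Active"]
print(active_runs)
```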

Figure 2: Search filter parameters

Figure 3: Status filter parameters

Tip	If the application ID, tag, or owner is not displayed in the table rows of the search results, use the column chooser to bring these dimensions into view. Figure 4 shows how to access column-chooser attributes.

Figure 4: Column chooser icon and excerpt of selectable attributes

Statement and Process ID Filters

You can search by filters that focus on components that are more granular than an application.

The Statement search filter is useful for finding applications that contain select statements at the flow level. Applications can contain SQL statements; for example, Hive-based applications contain Hive Query Language (HQL) statements. Consider the select statement for the Hive Flow -CalculateAverageQuantity flow, which contains:

HiveFlow_Select

Tip	Place the asterisk wildcard (*) at the beginning and end of the select-statement text that you want to search for.

The Process ID search filter is helpful for finding applications with Cascading objects that are executed at the step level. The process ID is correlated with the job ID of Hadoop’s JobTracker. Use this filter as a convenient method of finding an application when you already know its process ID.

Note	The wildcard character (*) is NOT supported in the search criteria when the Process ID search filter is selected. Such a search would potentially return all components from every slice in the system as well as all applications, which can be a very expensive operation.

Saving Search Queries as Views

After searching for applications based on the criteria and filters you have defined, you can save the query as a view. You can then return to the view to retrieve all applications that match the search criteria and filters. Search results of a saved view are dynamic, displaying applications with matching search parameters at the time that the view is opened.
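The dynamic behavior of a saved view can be pictured as storing the search criteria rather than the search results. The Python sketch below is a conceptual model only: the `SavedView` class, its fields, and the sample run records are hypothetical and are not Driven's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

AppRun = Dict[str, str]


@dataclass(frozen=True)
class SavedView:
    """Stores the criteria of a search, not the results it returned."""
    name: str
    criteria: Callable[[AppRun], bool]

    def open(self, current_runs: List[AppRun]) -> List[AppRun]:
        # The criteria are re-applied to whatever data exists when the view is opened.
        return [run for run in current_runs if self.criteria(run)]


view = SavedView(
    name="production-failures",
    criteria=lambda run: run["tags"] == "production" and run["status"] == "Failed",
)

runs = [{"name": "etl", "tags": "production", "status": "Failed"}]
first_open = view.open(runs)

# A new matching run arrives after the view was saved.
runs.append({"name": "report", "tags": "production", "status": "Failed"})
second_open = view.open(runs)

print(len(first_open), len(second_open))  # 1 2
```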

When you save the view, select what information about the matched applications to display:

Status View highlights data about application states, graphically displaying Status Timeline and Status Frequency.
Application View shows the Application Runtime graph, which is a gateway to monitoring details of application runs (see application view documentation).

Links to your saved views appear in the sidebar.

Figure 5: Saving a view

Your saved views can be retrieved by clicking them in the side panel.

My Teams Views

Each item in the My Teams area of the side panel is an entry point to a status view that is associated with a team of which you are a member. All members of the same team have the same status view under My Teams. A My Teams link opens a status view that is the same as the Show All view, except that the data is filtered to show only information that is correlated with the selected team.

Case Example

Joanne is a member of the ServerStar and EasternIT teams. She sees links to the status views for both of the teams under My Teams. Other members of the ServerStar team see the same status view in their My Teams area of the sidebar and access the same information if they click the view. The same applies to the EasternIT team.

After Joanne opens the ServerStar team search, she finds the information useful in general, but she wants to refine the filter parameters to focus on application status for the past five days. She clicks the drop-down menu for filtering dates and selects Custom Dates to select the past five days. After she names and saves the search, its name appears in Joanne’s Status Views list. The view that she created is not saved in the My Teams list.

Auto-Refreshed Views

When you are in a view that is displaying application-level data, Driven can refresh the displayed information as updates stream in from the plugin. Ensure that the Refresh toggle in the top right corner is enabled to allow the displayed Driven data to auto-refresh in real time. If the Refresh toggle is disabled, you must manually refresh the browser window to see the latest updates. Generally, this feature is useful if you want to monitor applications as they run.

Click and slide the Refresh icon to toggle between enabled and disabled.

Figure 6: Refresh icon

Customizing Searches

You can create custom searches for application runs that have specific attributes, such as having a certain time range for processing or populating a counter with a defined value range. Custom searches are based on Lucene query syntax, which is entered in the search field of a Search View or Application View.

As with searches that do not use Lucene query syntax, custom searches return applications that match your search-parameter values and that are associated with your Driven teams.

Table 1 shows examples of the information that a custom search can retrieve, along with the query statements that return the matching applications. Use the statement syntax examples as guidance for constructing similar Lucene queries. For detailed information about the required syntax in search queries, see the Apache Lucene Query Parser Syntax documentation.

Note	The letters in the query statement syntax are case-sensitive.

Table 1. Examples of Custom Search Goals and Statement Syntax
Application Attributes	Sample Values for Parameters	Statement Syntax
Processing duration; Tag identifier	Duration greater than or equal to 5 minutes; Tag identifier = production	`duration:[300000 TO *] AND tags:production`
Pending status time; Runtime	Pending time does not equal 0; Runtime > 0	`NOT pendingTime:0 AND (NOT runTime:0)`
Application name; Processing duration	Name = Cascading-Hive; Duration greater than or equal to 5 minutes	`name:Cascading-Hive* AND duration:[300000 TO *]`
Application name containing spaces	Name = sales region	`sales_region*`
Counter with a particular value	The BYTES_WRITTEN counter equals 814186673	`counters.org\:apache\:hadoop\:mapreduce\:lib\:output\:FileOutputFormatCounter.BYTES_WRITTEN:814186673`
Counters with a value in a particular range	The BYTES_WRITTEN counter equals or is greater than 814186000	`counters.org\:apache\:hadoop\:mapreduce\:lib\:output\:FileOutputFormatCounter.BYTES_WRITTEN:[814186000 TO *]`
Path to user directory	Path is /Users/smith/company/code/project	`userDir:\/Users\/smith\/company\/code\/project`

Note	You must prepend a custom search query for counter values with `counters.` (including the period). As the counter examples also show, each colon in the counter path must be escaped with a backslash so that it is not parsed as Lucene syntax.
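Escaping every colon by hand is error-prone, so a small helper can assemble counter queries. The Python sketch below follows the pattern of the counter rows in Table 1. The function names are hypothetical, and the code assumes that the counter group is written with colons as separators and that the finished query contains no whitespace inside the counter path.

```python
def escape_colons(path: str) -> str:
    """Prefix each colon with a backslash so Lucene does not treat it as a field separator."""
    return path.replace(":", "\\:")


def counter_query(group: str, counter: str, value: str) -> str:
    """Build a counter query: 'counters.' prefix, escaped group path, counter name, value."""
    return f"counters.{escape_colons(group)}.{counter}:{value}"


# Counter group as it appears (colon-separated) in Table 1.
GROUP = "org:apache:hadoop:mapreduce:lib:output:FileOutputFormatCounter"

exact = counter_query(GROUP, "BYTES_WRITTEN", "814186673")
ranged = counter_query(GROUP, "BYTES_WRITTEN", "[814186000 TO *]")
print(exact)
print(ranged)
```

Only the colons inside the counter path are escaped. The final colon, which separates the field from its value, is left as is.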

Table 2 lists Cascading application attributes that are most relevant to searches in Driven. The most commonly used search targets, such as app name, are integrated in the Driven GUI so that you do not need to run a Lucene query to find matches. When an attribute can be located by a search filter or column chooser, Table 2 lists the part of the user interface that can be used to track the information.

Although an application attribute might be searchable with the GUI controls, you might prefer to query for the attribute and value in a Lucene query. This is particularly true when you want to find matches based on a range of values or if you want to run a complex query.
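Range queries such as the duration examples in Table 1 can be generated rather than typed. The following Python sketch uses hypothetical helper names and assumes that the duration attribute is expressed in milliseconds, which is what the value 300000 for 5 minutes in Table 1 implies.

```python
from typing import Optional


def minutes_to_millis(minutes: float) -> int:
    """Convert minutes to milliseconds (assumed unit of the duration attribute)."""
    return int(minutes * 60 * 1000)


def range_clause(field: str, low: Optional[int] = None, high: Optional[int] = None) -> str:
    """Build an inclusive Lucene range clause; a missing bound becomes the * wildcard."""
    low_text = "*" if low is None else str(low)
    high_text = "*" if high is None else str(high)
    return f"{field}:[{low_text} TO {high_text}]"


# Runs tagged "production" that took at least 5 minutes.
query = f"{range_clause('duration', low=minutes_to_millis(5))} AND tags:production"
print(query)  # duration:[300000 TO *] AND tags:production
```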

Elasticsearch truncates field values that exceed 16,000 characters. If a search parameter value includes characters that appear only in a string beyond the character limit (such as a very long classpath), Driven does not return a match because it is unable to search for possible matches after 16,000 characters of a string.
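The effect of the limit can be illustrated with a simplified model in Python. This sketch treats a search as a plain substring match against the first 16,000 characters of a value. That is an assumption made for illustration and not a description of how Elasticsearch indexes or matches text; the function name and sample classpath are hypothetical.

```python
FIELD_LIMIT = 16_000  # character limit described above


def is_findable(term: str, field_value: str, limit: int = FIELD_LIMIT) -> bool:
    """Simplified model: only the first `limit` characters of a value are searchable."""
    return term in field_value[:limit]


# Hypothetical classpath: 18,000 characters of short entries, then one late entry.
classpath = "a.jar:" * 3000 + "late-entry.jar"

print(is_findable("a.jar", classpath))           # True: appears within the limit
print(is_findable("late-entry.jar", classpath))  # False: appears only after the limit
```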

Table 2. Searchable Cascading Application Attributes
Attribute	Displayed in GUI?
(Duration and other time-based attributes are listed at the bottom of the table.)
cascadingVersion	Application Details page
classpath	No
command	Searchable with App Command filter; Displayed on Application Details page
counter (Note: `counters.` must prepend the path in the query)	Application Details and Flow Details tables
finished	Searchable with Status filter
frameworks	Application Details page
id (application ID)	Searchable with the App ID filter; Displayed as the last node of the URL for the Application Details page
jarName	Application Details page
jarPath	Application Details page (click on JAR Information link to display the path)
javaClassVersion	No
javaCompiler	No
javaHome	No
javaInterpreterVersion	No
javaIoTmpdir	No
javaVendor	No
javaVendorUrl	No
jvmMaxMemory	No
localeCountry	No
messagingProtocolVersion	No
name	Searchable with App Name filter; Displayed on Application View and Application Details pages
osArch	No
osName	No
osVersion	No
owner	Searchable with the App Owner filter; Displayed on the Application Details page
pid	No
pluginVersion	Displayed on the Application Details page when you hover over the information icon
status	Searchable with one of the Status filters; Displayed in Status View and Application View
tags	Searchable with the App Tags filter; Displayed on the Application Details page
type	Types that are identified in the Driven user interface are applications, flows, and steps
userDir	No
userHome	No
userLanguage	No
userRegion	No
userTimezone	No
version	No
Time Attributes	All time attributes with corresponding values can be displayed in the Application Details and Flow Details tables, except for the lastCounterFetchTime and statusTime attributes.
Absolute-time attributes Values for these attributes are in Unix time:	finishedTime lastCounterFetchTime pendingTime runTime startTime statusTime submitTime
Duration-time attributes Values for these attributes are in milliseconds:	duration (duration = Amount of time from Started to Finished status. Equivalent to the startTime:finishedTime attribute.) pendingTime:finishedTime pendingTime:runTime pendingTime:startTime pendingTime:submitTime runTime:finishedTime startTime:finishedTime startTime:runTime startTime:submitTime
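The relationship between the two kinds of time attributes can be sketched as a subtraction: a duration-time attribute named `first:second` is the time elapsed between the two absolute-time attributes. The Python example below is illustrative only. The helper name and the sample values are hypothetical, and it assumes that absolute-time values are millisecond timestamps.

```python
from typing import Dict


def interval(times: Dict[str, int], start_attr: str, end_attr: str) -> int:
    """Elapsed time between two absolute-time attributes of one application run."""
    return times[end_attr] - times[start_attr]


# Hypothetical absolute-time values for a single run (assumed millisecond timestamps).
run_times = {
    "submitTime": 1_400_000_000_000,
    "startTime": 1_400_000_060_000,
    "finishedTime": 1_400_000_360_000,
}

# Models duration, which is described as equivalent to startTime:finishedTime.
duration = interval(run_times, "startTime", "finishedTime")
print(duration)  # 300000, that is, 5 minutes in milliseconds
```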

Application Views