Overview

Meandre Workbench is a visual programming environment that lets users connect software components together in a data flow environment. The application is built with the Google Web Toolkit (GWT) and is accessed through a web browser. You can use this interface to develop diagrams of data operations relevant to your research. Each operation is represented by an icon, and the icons are linked together in a flow representing the movement of data through each operation. Each icon represents a component: a reusable piece of software that facilitates collaboration among developers. Components can be written in Java, Python, or Lisp. A set of components can be loaded into your Workbench for creating a flow (application).
Flows are essentially applications composed of components connected together. Flow complexity is limited only by the needs of your project. Pre-built flows can be loaded into the Workbench and modified as needed. A number of flows for use in a variety of data mining problems and domains have been developed and can be added to your Workbench.
Components and flows have tags and additional metadata associated with them that can be used to assist in searching and sorting.

Getting Started

Meandre Workbench (by default) runs on port 1712. Using your browser, go to http://localhost:1712, replacing “localhost” with the proper address of the machine where the Meandre Workbench instance is running. This will display the following login screen:

Workbench Login Screen

Login

This login screen prompts for your user id, password, and the server and port where the Meandre server is running. After logging in, you can view only the components and flows to which you have access.

Upon successful login, a screen similar to the following is presented:

Workbench Main Screen

This is the main Workbench application screen. It is divided into four sections that will be described shortly: the Workspace, the Repository panel, the Details panel, and the Output panel. Each of these sections can be resized as needed by dragging the desired section border in the appropriate direction.

Workbench Top-level Menu

The menu in the upper-right part of the Workbench shows the current user logged in and, upon hovering, shows the Meandre server to which this Workbench session is connected.

Logging out is accomplished by clicking the “logout” link. The application will prompt for confirmation if there are unsaved changes to the flows in the Workbench.

Workbench

The Workbench provides the most flexible and feature-rich user interface for composing flows and controlling knowledge discovery tasks. It supports the creation of data flow graphs where different methods can build and visualize models. The Workbench was designed to effectively use screen real estate, with panels that can be expanded and collapsed on the sides and bottom when needed.

Workspace

The Workspace is the large tabbed region in the center of the Workbench window where applications are constructed. Components can be dragged into this region from the “Components” section of the Repository panel and interconnected to create flows.

Workbench Workspace

In the picture above, three components were added to the Workspace and interconnected to create a very simple flow that takes a string, converts its characters to uppercase, and then prints out the result. The Workspace is the highlighted area. If a flow needs more space than the Workspace area provides, scrollbars appear automatically.
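The data movement in this simple flow can be approximated in ordinary Python. This is only an illustrative sketch and not Meandre Component API code: the function names push_string, to_uppercase, print_result, and run_flow are hypothetical stand-ins for the three components and their connections.

```python
# Illustrative sketch only: plain Python functions standing in for the three
# components of the simple flow. None of these names come from the Meandre API.


def push_string(message):
    """Source step: takes no incoming data and emits the configured string."""
    return message


def to_uppercase(text):
    """Transform step: converts the incoming characters to uppercase."""
    return text.upper()


def print_result(text):
    """Output step: prints the data it receives and returns it unchanged."""
    print(text)
    return text


def run_flow(message):
    """Chain the three steps in the order the connections dictate."""
    return print_result(to_uppercase(push_string(message)))


result = run_flow("Hello SEASR")
```

In the Workbench, the same chaining is expressed by drawing connections between ports rather than by nesting function calls.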

Toolbar

The Toolbar contains a row of buttons that provide access to frequently used functions. These buttons deal with saving flows, removing components, and controlling flow execution.

Workspace Toolbar

Save Flow

Saves all modifications made to the current open flow, overwriting as necessary.

Save Flow As

Presents a dialog that allows the user to change the details of the current flow before it is saved.

Workbench Save As

The Description section can contain rich text and other formatting elements.

Export

The flow can be saved locally either as a ZigZag script file or as a MAU (Meandre Archive Unit) file.

Copy Component

Copies the selected component(s) and connections between these component(s) so that they can be pasted in this flow or into another flow open in this same session.

Paste Component

Pastes the previously copied component(s), along with the connections between them, into this flow or into another flow open in this same session.

Remove Component

Removes the selected component from the flow. The same functionality can be achieved by right-clicking on the component and selecting “Remove” from the context menu.

Workspace Remove Component

Execute Flow

Executes the current flow loaded in the Workspace. Any output from the flow will be displayed in the Output panel. If the flow contains interactive components, they will be displayed automatically. Please be sure to set your browser to allow pop-ups from the Workbench, otherwise the web interactive components will not display.

Stop Flow

Sends a request to the Meandre server to abort the currently executing flow.

For more information about the Workspace, see the “Using the Workspace” section later in this document.

Repository Panel

The Repository panel is the area to the left of the Workspace that contains the necessary ingredients for the creation, importing, and execution of flows. The Repository panel also provides searching and sorting capabilities, and its table display can be customized to show exactly the information the user needs.

Workbench Repository Panel

The Repository panel hosts three sections: Components, Flows, and Locations. These sections are described in more detail next. To view a particular section, click on the section name or on the [+] button to the right of the section title. Once a section is expanded, it can be collapsed again by clicking on the [-] button.

The Repository panel can be collapsed as well to maximize the screen real estate. This can be accomplished by clicking on the [<>] button on the side bar.

A “Refresh” button, located in the top right corner of the Repository panel, tells the Workbench to retrieve a new copy of the components and flows from the Meandre server.

As previously mentioned, searching components and flows is supported. Typing a value into the search box in the Components panel displays a list of components that satisfy the query. Typing a value into the search box in the Flows panel displays a list of flows that satisfy the query. The user can control which metadata is searched by clicking on the search button and selecting the metadata categories to be searched (see figure below). By default, all metadata is searched.

Workbench Repository Search
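The effect of restricting a search to selected metadata categories can be sketched as follows. This is a hypothetical illustration, not the Workbench's actual search code: the field names, the sample records, and the case-insensitive substring matching are all assumptions.

```python
# Hypothetical sketch of a metadata search. The field names, sample records,
# and substring matching are assumptions made for illustration only.


def search(items, query, fields=None):
    """Return the items whose selected metadata fields contain the query.

    items  -- list of dicts, one per component or flow
    fields -- metadata keys to search; None means search every field
    """
    needle = query.lower()
    matches = []
    for item in items:
        keys = item.keys() if fields is None else fields
        if any(needle in str(item.get(key, "")).lower() for key in keys):
            matches.append(item)
    return matches


components = [
    {"name": "Push String", "creator": "alice", "tags": "string input"},
    {"name": "To Uppercase", "creator": "bob", "tags": "string transform"},
]

everything = search(components, "string")          # all metadata searched
by_name = search(components, "string", ["name"])   # only the name field
```

Searching all metadata matches both records because the word appears in their tags, while limiting the search to the name field matches only the first.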

Components

A Component is a software unit that is designed to accomplish a particular task. A standalone component has limited value by itself; the power of components is unleashed when several of them are logically connected to form a flow (application). To be connectable, a component must define at least one input port or one output port. These ports represent the communication points to other components. A component can define more than one input and/or output port. Input ports are always on the left, and output ports on the right. For example, in the figure below, the component depicted has one input port and two output ports:

Workbench Component
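As a rough mental model, a component can be thought of as a named unit with a list of input ports and a list of output ports. The sketch below is hypothetical and does not reflect the Meandre Component API: the Component class, its fields, and the port names are invented for illustration.

```python
# Hypothetical data model of a component and its ports; the class and the
# port names are assumptions, not part of the Meandre Component API.
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    inputs: list = field(default_factory=list)   # drawn on the left side
    outputs: list = field(default_factory=list)  # drawn on the right side

    def is_connectable(self):
        """A component needs at least one input or output port to be wired."""
        return bool(self.inputs or self.outputs)


# One input port and two output ports, like the component in the figure.
example = Component("Example", inputs=["in"], outputs=["out_a", "out_b"])
```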

Components may also have properties that can be set. This can be determined by looking for a symbol in the lower left hand corner of a component icon (see picture below).

Workbench Component Properties

These properties allow the user to configure the execution behavior or other aspects of the component at flow design time. Once set, these properties are persisted across multiple executions of the flow.

Every component follows the Meandre Component API. This API specifies a software contract that each component developer must adhere to when creating components. Click on the Components tab in the Repository panel to view all the components in your Workbench. The components listed are available from the Meandre server specified during login. Components are listed with their Name, Creator, and Date by default. To change the columns displayed, hover over a column title, click on the small downward-pointing arrow that appears (see below), and place a check mark next to each column to be displayed in the Columns submenu. The icon in the first column identifies the component type as Java, Python, or Lisp.

Workbench Display Customization

The column labels also allow sorting of the table display. This feature can be invoked through the context menu as displayed above, or by clicking on the column title itself to change the sort method. You can also group the list of components or flows by one of the column labels.

Workbench Group By Feature

The figure above shows how the display has changed after selecting to group the components by their creator.
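Conceptually, grouping collects the rows that share a value in the chosen column. The following sketch shows the idea with made-up component records; the group_by function and the sample data are hypothetical and are not taken from the Workbench.

```python
# Hypothetical sketch of the "group by" display option using invented rows.
from collections import defaultdict


def group_by(rows, column):
    """Map each distinct value of the column to the sorted names in its group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["name"])
    return {key: sorted(names) for key, names in sorted(groups.items())}


rows = [
    {"name": "Push String", "creator": "alice"},
    {"name": "To Uppercase", "creator": "bob"},
    {"name": "Print Object", "creator": "alice"},
]

grouped = group_by(rows, "creator")
```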

When a component is selected here, its associated documentation is shown in the Details panel (discussed in another section). Components can be added to the Workspace using drag and drop. Once in the Workspace, they can be connected to other components.

Flows

A Flow is essentially an application: a group of components connected together to perform a certain task. Click on the Flows tab in the Repository panel to view the flows in your Workbench. The flows listed are available from the Meandre server specified during login. When a flow is selected in the Repository panel, the description and metadata associated with this flow are shown in the Details panel. Double-click on a flow to load that flow into the Workspace.

Locations

The Meandre server can access components and flows that have been uploaded directly to the server (via the Meandre Server Interface, the Meandre Development Eclipse Plugin, or Ant scripts). It can also load RDF repositories from other Meandre servers or from an RDF file that exists on a web server. We have created several component and flow repositories that the user can load. In the context of the Workbench, these repositories are called “Locations”. You can find a list of available repository locations at http://www.seasr.org/documentation.

Click on the Locations tab in the Repository panel to view the locations added to your Meandre server. To add a location, click on the “Add location” button.

Workbench Add Location
Workbench Add Location Dialog

This will prompt for the URL of the RDF repository and a name that you want to associate with this repository location. After a location has been added successfully, it is listed in the “Locations” section. Adding a repository location causes all the components and flows hosted at that location to be imported into the user’s private repository on the server. A repository location can be removed by selecting it and clicking the “Remove location” button. This also removes the components and flows in this repository location from the server.

Details Panel

This panel shows information about the currently selected component or flow. When a component is selected in the Repository Panel or in the Workspace, its associated information is displayed. Displayed information includes Description and Properties.

Workbench Details Panel

When a flow is selected in the Repository panel, its associated information is displayed in the Description section and nothing is displayed in the Properties section.

Properties

The Properties section shows the default values for the component’s properties. This is a read-only display when a component is selected in the Components section table of the Repository panel. The Properties section allows editing of a component’s properties only when a component in the Workspace area is selected. The current selection is indicated by the blue background. In the screenshot above, the component “Push String” in the Workspace represents the current selection. In this case, the “message” and “times” properties of that component can be edited by clicking on the property name, or double-clicking the property value to be changed. Press ENTER to accept the new value and finalize the editing operation.

Workbench Component Properties

Description

For components, the Description displays information about the component function. Later versions may also include inputs, outputs and properties. For flows, the Description displays information about the flow and the components that are included and their property values.

Output Panel

The Output panel displays output and error messages generated by the Workbench, including any output produced by an executing flow.

Workbench Output Panel

In the figure above, the Output panel shows the result of running the simple flow presented. The resulting “HELLO SEASR” string is the uppercase version of “Hello SEASR” which was set as the value of the “message” property of the “Push String” component.

Using the Workspace

The Workspace occupies the largest area of the Workbench, and is the focus of user activity. The functional elements central to the environment, components and flows, are activated by user actions within the Workspace. Examples of such actions include placing components, connecting components, setting component properties, and executing flows.

Placing Components

The first step in building a flow is to choose components from the Repository panel and place them into the Workspace. As described in a previous section, Meandre follows a data flow paradigm, and each of the various components relates to a step in the Knowledge Discovery in Databases (KDD) process. Consequently, the first components you place will often be I/O Components or User Input Components. To place one of these types of components, click on the Components section in the Repository panel and drag the desired component over into the Workspace area. A flow must have at least one component with no inputs to be able to be executed by the Meandre server.
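The requirement that a flow contain at least one component with no inputs can be expressed as a simple check. The sketch below is hypothetical: it assumes a flow is represented as a mapping from component label to that component's list of input port names, and it reads “no inputs” as “no input ports”. The Meandre server's real validation is not shown here.

```python
# Hypothetical pre-execution check. The dict-based flow representation and
# the port names are assumptions made for illustration.


def can_execute(flow):
    """Return True if at least one component in the flow has no input ports."""
    return any(len(input_ports) == 0 for input_ports in flow.values())


simple_flow = {
    "Push String": [],            # no inputs, so it can start the flow
    "To Uppercase": ["string"],
    "Print Object": ["object"],
}

no_source_flow = {
    "To Uppercase": ["string"],
    "Print Object": ["object"],
}
```

Here can_execute(simple_flow) is true because “Push String” has no inputs, whereas can_execute(no_source_flow) is false because every component waits for incoming data.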

Components that have been placed in the Workspace can be moved by clicking and dragging them around. Be careful not to click on an input or output port as this will initiate another operation that will be discussed shortly.

Selecting Components

Components can be selected by single clicking on them in the Workspace. When a component is selected, other selected items are deselected. While selected, a component can be moved about the Workspace or deleted. A selected component (or flow) can be unselected by using CTRL+click on that component (or flow).

Labeling Components

Components are displayed in the Workspace with an accompanying label that can be changed. Editing the component label only changes the name of the component in the given flow. However, the label must remain unique among the other component labels in the flow. Component labels are edited in a fashion similar to editing a file name in the Macintosh or Windows operating systems. Simply click on the component label, just below the component icon, and enter the desired text. When finished editing, click outside the label or press ENTER to apply the new value. Pressing ESC while renaming a component causes the rename operation to be cancelled and the previous name to be restored.

Workbench Component Label
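The uniqueness rule for labels can be sketched as a small rename helper. This is a hypothetical illustration, not the Workbench's code: the rename function and the sample labels are invented, and how the Workbench itself reports a duplicate label is not covered here.

```python
# Hypothetical sketch of renaming a component label within one flow while
# keeping all labels in that flow unique.


def rename(labels, old, new):
    """Return a new label list with old replaced by new.

    Raises KeyError if old is not present and ValueError if new would
    duplicate another component's label in the same flow.
    """
    if old not in labels:
        raise KeyError(old)
    if new != old and new in labels:
        raise ValueError("label already used in this flow: " + new)
    return [new if label == old else label for label in labels]


labels = ["Push String", "To Uppercase", "Print Object"]
renamed = rename(labels, "Print Object", "Show Result")
```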

Connecting and Disconnecting Components

The power and usefulness of components are only realized once they are connected together to form a larger system known as a flow. Constructing a flow directs the movement of data through a series of software components, each performing some operation on the data.

The ports of two components should only be connected if their data types are compatible with one another. Any errors resulting from data incompatibilities will occur at run time.

To make a connection, click on the output port of the desired source component (the port you clicked will be colored red), and then click on the input port to which you wish to connect. You should now have a line connecting the output and input port. If, after selecting a port, you wish to cancel the operation, simply clicking the same port again will deselect it.

The user may remove connections by bringing up the context menu of a component (right clicking on a component) and selecting “Disconnect” and then selecting either “Inputs”, “Outputs”, or “All”, depending on need. The user may also remove a connection for a port by bringing up the context menu of a port (right clicking on a port) and selecting “Disconnect”.

Workbench Component Disconnect

A component’s output port may only be connected to one input port. However, a component’s input port may be connected to several different output ports. This could be useful when you are retrieving the same data format from multiple components.
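These two connection rules can be captured in a short sketch. The Flow class below is hypothetical and exists only to illustrate the rules: each output port is stored as a key with a single target, so it can feed only one input port, while any number of output ports may point at the same input port.

```python
# Hypothetical model of the connection rules; the class, method names, and
# port identifiers are assumptions, not Workbench or Meandre API code.


class Flow:
    def __init__(self):
        # Maps (component, output port) to (component, input port).
        self.connections = {}

    def connect(self, source, target):
        """Connect an output port to an input port.

        An output port may feed only one input port, so connecting an
        already connected output raises ValueError.
        """
        if source in self.connections:
            raise ValueError("output port is already connected")
        self.connections[source] = target

    def sources_of(self, target):
        """List every output port connected to the given input port."""
        return sorted(
            src for src, dst in self.connections.items() if dst == target
        )


flow = Flow()
flow.connect(("Reader A", "text"), ("Printer", "object"))
flow.connect(("Reader B", "text"), ("Printer", "object"))  # same input: allowed
```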

The connection line is highlighted if the user hovers over an input or output port. This is useful for verifying connections in a complex flow. When hovering over a component port, the description of that port is also briefly displayed.

Workbench Component Connection

Setting Component Properties

With flow execution stopped, click on a component icon to edit its properties. The properties for this component will be displayed in the Details panel. Multiple instances of the same component with settable properties will each have properties to be set. Editing the properties of one will not change the properties of another. Component properties are saved when a flow is saved, and reloaded when the flow is loaded again.
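The independence of property values between instances can be pictured as each instance receiving its own copy of the component's defaults. The sketch below is hypothetical: the property names “message” and “times” come from the “Push String” example earlier, but the default values and the new_instance function are assumptions.

```python
# Hypothetical sketch: each component instance in a flow gets its own copy of
# the default properties. The default values shown here are invented.
import copy

PUSH_STRING_DEFAULTS = {"message": "Hello", "times": "1"}


def new_instance(defaults):
    """Create an independent property set for one component instance."""
    return copy.deepcopy(defaults)


first = new_instance(PUSH_STRING_DEFAULTS)
second = new_instance(PUSH_STRING_DEFAULTS)

# Editing one instance leaves the other instance and the defaults untouched.
first["message"] = "Hello SEASR"
```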

Executing and Aborting Flows

Executing a flow is a straightforward operation. With the flow loaded in the Workspace, click once on the Execute Flow button in the Toolbar. The Workbench then sends the request to execute the flow to the Meandre server.

Clicking the Stop Flow button will send a request to abort the flow to the Meandre server. The aborted state may not be reached immediately, because the Meandre server must first allow components to finish their current operations before processing is truly stopped.

In order to optimize flow performance, it may be desirable to visualize thread and component activity on the machine, or machines, executing your flow. Doing so can help identify bottlenecks and/or poorly allocated system resources. Such problems are hardest to troubleshoot in distributed computing situations. We will be working on features like this in the future.

Component Types

Components are organized into categories, enabling users to easily identify the functionality of any given component in an application. Each component belongs to exactly one of these categories.

Input Component (#input)

Input components load data from files or databases. For example, components that read a file, crawl the Web, or read from a database would be Input Components.

Output Component (#output)

Output components save data to files or databases, or show it to the user on the screen. For example, components that write files or print information to the console would be Output Components.

Data Transformation Component (#transform)

Data Transformation components perform functions that prepare the data for analysis. Data selection, data cleaning, and data transformation algorithms would be of this type. For example, components that perform binning or normalization of data would be Transform Components.
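For readers unfamiliar with these terms, the sketch below shows one common form of each operation: min-max normalization, which rescales values to the range 0 to 1, and equal-width binning, which assigns each value to one of a fixed number of intervals. The functions are generic illustrations with hypothetical names and are not taken from any Meandre component.

```python
# Generic illustrations of normalization and binning; these functions are not
# Meandre components and their names are hypothetical.


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values identical: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, bin_count):
    """Assign each value the index of its equal-width interval (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bin_count
    # The maximum value would land one past the last bin, so clamp it.
    return [min(int((v - lo) / width), bin_count - 1) for v in values]


normalized = min_max_normalize([2, 4, 6])
binned = equal_width_bins([0, 4, 10], 2)
```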

Analytics Component (#analytics)

Analytics components typically perform the main calculations for the application. For example, algorithms used to solve data mining problems would be implemented as Analytics components.

Visualization Component (#vis)

Visualization components provide visual feedback to the user. For example, components that display a scatter plot graphical view of the data or a decision tree visualization of the data mining results would be Vis Components.

Control Component

Control components provide additional data flow control. For example, components that replicate a data object would be Control Components.