=Overview=
This document describes how to create a Java component for Meandre and how to specify meta-information about the component with annotations. It also describes the documentation requirements for information displayed to the component user, as opposed to developer-directed documentation such as that provided via JavaDocs.

Default values are established for all annotation fields except name, creator, description, and tags. The defaults can be changed to reflect the developer’s specific preferences.

We also have established an annotation for specifying applets called from a component to package the jar files for the applet.

=Component Information=

A Meandre Component contains information about the component that is stored in RDF. In Java, this descriptive information is added through annotations. The fields below are added inside the “@Component ()” annotation, with a comma separating the fields.

==Component name:==

There must be a display name for the component. This should be a name that makes sense to an end user (for example, one that describes the component’s functionality), not necessarily a repeat of the component class name. This name is concatenated with the base URL to form the component name URI. No default value is given, so the developer must specify the name.

Code Sample:
name = "My Component"

==Component creator:==

Creator identifies the name of the person who created the component. No default value is given, so the creator is required to be specified by the developer.

Code Sample:
creator = "Jane Doe"

==Component tags:==

Tags are labels that describe selected facets of a component’s functionality. The Meandre Workbench can use a set of tags to add an icon and color coding to components. The workbench recognizes tags of the following format: _io_, _transform_, _analytics_, _ui_, _vis_ and _control_. If appropriate, please select one of these tags for your components. If none of these tags is specified, the component is treated as an “Analytics” component. In general, all tags should be lowercase. No default value is given, so the developer must specify the tags.

Code Sample:
tags = "_io_, input, filename"
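As a rough illustration of how a tag string like the one above could be interpreted, the sketch below splits the comma-separated value and looks for one of the workbench tags. The class and method names (TagSketch, parse, workbenchCategory) are hypothetical, and the fallback to _analytics_ simply mirrors the behavior described above; the workbench's actual parsing may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Meandre: interprets a tags string.
class TagSketch {
    // The workbench tags listed in this section.
    static final List<String> WORKBENCH_TAGS = Arrays.asList(
            "_io_", "_transform_", "_analytics_", "_ui_", "_vis_", "_control_");

    // Split on commas, trim whitespace, lowercase, and drop empty entries.
    static List<String> parse(String tags) {
        List<String> result = new ArrayList<>();
        for (String tag : tags.split(",")) {
            String cleaned = tag.trim().toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                result.add(cleaned);
            }
        }
        return result;
    }

    // Return the first workbench tag found, or "_analytics_" if none is present.
    static String workbenchCategory(String tags) {
        for (String tag : parse(tags)) {
            if (WORKBENCH_TAGS.contains(tag)) {
                return tag;
            }
        }
        return "_analytics_";
    }

    public static void main(String[] args) {
        System.out.println(parse("_io_, input, filename"));
        System.out.println(workbenchCategory("_io_, input, filename"));
    }
}
```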

==Component rights:==

Rights specifies the license in use for a component. The following values can be specified: UofINCSA, ASL_2, Other. The default value is Licenses.UofINCSA.

Code Sample:
rights = Licenses.UofINCSA

==Component firingPolicy:==

The firing policy refers to the way Meandre handles inputs. When a component has more than one input port, Meandre needs to know which policy to use to fire the execution of the component. Two options are available: all or any. The all policy requires every input data port to be populated before the component fires, whereas the any policy fires the component any time a data port is populated. The default value is FiringPolicy.all.

Code Sample:
firingPolicy = FiringPolicy.all
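The difference between the two policies can be sketched as a simple check over the set of populated input ports. This is only an illustration of the rule described above, not Meandre's scheduler; FiringSketch and shouldFire are hypothetical names, and the FiringPolicy enum here is a local stand-in.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the two firing policies; not Meandre's scheduler.
class FiringSketch {
    // Local stand-in mirroring the values of Component.FiringPolicy.
    enum FiringPolicy { all, any }

    // Decide whether a component would fire, given which input ports hold data.
    static boolean shouldFire(FiringPolicy policy,
                              Set<String> inputPorts,
                              Set<String> populatedPorts) {
        if (policy == FiringPolicy.all) {
            // "all": every input port must be populated.
            return populatedPorts.containsAll(inputPorts);
        }
        // "any": at least one input port must be populated.
        for (String port : inputPorts) {
            if (populatedPorts.contains(port)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> inputs = new HashSet<>(Arrays.asList("Item Sets", "Frequent Item Sets"));
        Set<String> populated = new HashSet<>(Arrays.asList("Item Sets"));
        System.out.println(shouldFire(FiringPolicy.all, inputs, populated));
        System.out.println(shouldFire(FiringPolicy.any, inputs, populated));
    }
}
```

With only one of the two ports populated, the all policy holds the component back while the any policy lets it fire.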

==Component baseURL:==

The base URL is concatenated with the component name to form the component name URI. The default value is “http://www.seasrproject.org/components/”.

Code Sample:
baseURL = "http://www.myorg.org/components/"
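Because the component name is concatenated with the base URL to form the component URI, it can help to picture the result. The sketch below is only an illustration: the normalization step (lowercasing and replacing whitespace with hyphens) is an assumption made for this example, and the exact rule Meandre applies is not described in this document. ComponentUriSketch and componentUri are hypothetical names.

```java
import java.util.Locale;

// Hypothetical illustration of combining baseURL and name; not Meandre code.
class ComponentUriSketch {
    static String componentUri(String baseURL, String name) {
        // Make sure the base ends with exactly one separator before appending.
        String base = baseURL.endsWith("/") ? baseURL : baseURL + "/";
        // ASSUMPTION: lowercase the name and replace whitespace with hyphens.
        String slug = name.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "-");
        return base + slug;
    }

    public static void main(String[] args) {
        System.out.println(componentUri("http://www.myorg.org/components/", "My Component"));
    }
}
```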

==Component runnable:==

This runnable property identifies the type of executable component identified by the URI. Three types are currently supported: java, python, and lisp. The default value is “java”.

Code Sample:
runnable = "java"

==Component format:==

The format property describes the format of the binary object that implements the described component. The default value is “java/class”, but it can also take the values “clojure” for lisp components and “jython” for python-based components.

Code Sample:
format = "java/class"

==Component dependency:==

Dependency indicates any additional jar file dependencies that this component needs. The default value is “” (an empty string).

Code Sample:
dependency = {"myAdditionalJar.jar"}

==Component resources:==

Resources identifies any additional file dependencies that a component needs, such as property files or other dependencies that are not jar libraries. The default value is “” (an empty string).

Code Sample:
resources = {"myProperty.txt"}

==Component rightsOther:==

The default value is “” (an empty string).

Code Sample:
rightsOther = "BSD"

==Component description:==

Component descriptions are entered using descriptive textual paragraphs and key terms, to provide uniformity between the components, and to ensure that in the future the information may be extracted and formatted for separate documents and enhanced presentation. Please follow the outline given here.  Although some keyphrases will not be relevant for all components, all appropriate keyphrases should be entered.

All components must have the Overview and Detailed Description sections. No default values are given; the developer must supply them. Here are some guidelines to follow:

  • If you refer to a property anywhere in the component description section, italicize the name of the property to make it stand out. For example: “The example table is divided into <i>Number of Folds</i> train/test sets.”, where Number of Folds is the property display name that is fully documented later in the Property Information. Note: <em> can be used instead of <i> for italics.
  • The component description should begin with an overview of the component functionality that is no more than a couple of sentences long. This is intended to give users an idea of the component functionality, so that they can make an initial determination as to the applicability of the component to their needs. The overview will likely be extracted and included in a separate document listing all components. The first paragraph should contain the overview and should be tagged with the keyphrase <p>Overview:
  • Following the overview, the component functionality must be fully described to the extent that the user knows what to expect from the component and the testers know how to test the correctness of the component. Include comments on how missing values are handled and similar “boundary” information. Use a new paragraph and keyphrase to start this section: </p><p>Detailed Description: For long descriptions, text in this section should be broken into multiple paragraphs by inserting </p><p> directives into the text.
  • In the next section, for some components (for example, those implementing well-known model-building algorithms), references to relevant papers, texts, technical reports, or other descriptive resources should be provided. Use a new paragraph and keyphrase to introduce the reference information: </p><p>References:
  • Document cases where the component is not applicable. For example, if the model can only use nominal input types or predict continuous values, note that in this section. If the component fails when nominal attributes have a large number of distinct values, document it. These are the types of things that are not checked by port compatibility checks. Separate different types of constraints listed in this section into different paragraphs. Use a new paragraph and keyphrase to introduce this section: </p><p>Data Type Restrictions:
  • Discuss how the component treats the data on its input ports. In particular, if it makes modifications to any of the data, note it. This would be expected behavior for components that take mutable tables, for example. It is worth highlighting for users so that they don’t create a flow where a mutable table is directed to a splitter and the output of the splitter then goes to two components that both modify the data. Some components may instead make a copy of the input data and modify the copy, in which case the input data remains intact. Put the information about treatment of input data in a section with the keyphrase </p><p>Data Handling:
  • Document scalability issues. These may concern memory, compute time, or navigation time in the case of visualizations. If the component does not scale well for a large number of examples or for a large number of attributes, document that in this section. This section should begin in a new paragraph with the keyphrase </p><p>Scalability:
  • Nonstandard component execution criteria must be clearly described. The default behavior is that a component must have all its inputs before it will fire. Document any deviation from this standard. If the component sets the firing policy to “any,” the behavior must be documented. This section begins with the keyphrase </p><p>Execution Criteria:

Code Sample:
description = "This component implements ..."
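Since the Overview and Detailed Description sections are mandatory, a small check can catch descriptions that omit them. The following is a minimal sketch of such a check, assuming the keyphrases appear literally in the description string; DescriptionCheckSketch and missingKeyphrases are hypothetical names and are not part of Meandre, and the sample description text is invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical lint helper, not part of Meandre: checks for required keyphrases.
class DescriptionCheckSketch {
    // The two sections that every component description must contain.
    static final List<String> REQUIRED = Arrays.asList("Overview:", "Detailed Description:");

    // Return the required keyphrases that do not occur in the description.
    static List<String> missingKeyphrases(String description) {
        List<String> missing = new ArrayList<>();
        for (String keyphrase : REQUIRED) {
            if (!description.contains(keyphrase)) {
                missing.add(keyphrase);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Invented sample text that follows the outline above.
        String description =
                "<p>Overview: This component reads a text file. " +
                "</p><p>Detailed Description: The file named by the <i>File Name</i> " +
                "property is read line by line.</p>";
        System.out.println(missingKeyphrases(description));
        System.out.println(missingKeyphrases("This component implements ..."));
    }
}
```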

==Complete code example from ComputeConfidence for the component annotation==

@Component(

name = "Compute Confidence",
tags = "_analytics_, frequent pattern, rule association, confidence",
creator = "Boris Capitanu",
description = "<p>This module works in conjunction with other modules implementing the Apriori " +
"rule association algorithm to generate association rules satisfying a minimum confidence " +
"threshold. " +

"</p><p>Detailed Description: " +
"This module takes as input an <i>Item Sets</i> object generated by the <i>Table To Item Sets</i> " +
"module, and <i>Frequent Itemsets</i> generated by the <i>Apriori</i> module. " +
"From these inputs, it develops a set of possible association rules, each with a single " +
"target item, where an item consists of an [attribute,value] pair. " +
"For each possible rule, this module computes the <i>Confidence</i> in the prediction, " +
"and accepts those rules that meet a minimum confidence threshold specified via the " +
"property editor. " +

"</p><p>" +
"For a rule of the form Antecedent A implies Consequent C, the <i>Confidence</i> is the percentage of " +
"examples in the original data that contain A that also contain C. The " +
"formula to compute the confidence of the rule A->C is: <br>" +
" Confidence = ( (# of examples with A and C) / (# of examples with A ) ) * 100.00 " +

"</p><p>Limitations: " +
"The <i>Apriori</i> and <i>Compute Confidence</i> modules currently " +
"build rules with a single item in the consequent. " +

"</p><p>Scalability: " +
"This module searches all the Items Sets to compute the confidence for each Frequent Itemset. " +

"The module-allocated memory for the resulting Rule Table. </p>"
)
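The confidence formula quoted in the description above can be checked with a few lines of arithmetic. The sketch below only restates that formula; it is not the ComputeConfidence implementation, and ConfidenceSketch and its parameter names are hypothetical.

```java
// Hypothetical worked example of the confidence formula; not the module's code.
class ConfidenceSketch {
    // Confidence = ((# of examples with A and C) / (# of examples with A)) * 100
    static double confidence(int examplesWithAAndC, int examplesWithA) {
        if (examplesWithA <= 0) {
            throw new IllegalArgumentException("at least one example must contain A");
        }
        // Cast to double first so the division is not truncated to an integer.
        return ((double) examplesWithAAndC / examplesWithA) * 100.0;
    }

    public static void main(String[] args) {
        // 40 examples contain A, and 30 of those also contain C.
        System.out.println(confidence(30, 40));
    }
}
```

With the default threshold of 70.0 used later in this document, a rule with this confidence would be accepted.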

=Component Input and Output Ports=

This information is displayed when the user performs a mouse-over on an input or output port.
Coding guidelines:

  • use consistent names across components (stay tuned for table of names)
  • don’t use abbreviations (Decision Tree Model, not DTmodel)
  • capitalize each term (Prediction Table, not prediction table)

==Port name:==

There must be a name for the input or output port. This should be a name that makes sense to a user, not necessarily a repeat of the data structure object.

Code Sample:
name = "Frequent Item Sets"

==Port description:==

There must be a description for the input or output port. This description should provide information about the data object expected as input or output.

Code Sample:

description = "The frequent itemsets found by an <i>Apriori</i> module. These are the " +
"item combinations that frequently appear together in the original examples."

==Complete code example from ComputeConfidence for the input and output port annotations==

@ComponentInput(

description = "The frequent itemsets found by an <i>Apriori</i> module. These are " +
"the item combinations that frequently appear together in the original examples.",

name = "Frequent Item Sets")
final static String DATA_INPUT_FREQ_ITEM_SETS = "Frequent Item Sets";

@ComponentOutput(

description = "A representation of the association rules found and accepted by this module. " +
"This output is typically connected to a <i>Rule Visualization</i> module.",

name = "Rule Table")
final static String DATA_OUTPUT_RULE_TABLE = "Rule Table";
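To show how port annotations on constant fields can be discovered, the sketch below reads them back with reflection. It is self-contained, so it declares its own stand-in ComponentInput annotation; the runtime retention and field target on that stand-in are assumptions made so the example runs, and PortSketch and inputPortNames are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Meandre annotation; retention and target are assumptions.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ComponentInput {
    String name();
    String description();
}

// Hypothetical component with one annotated input port constant.
class PortSketch {
    @ComponentInput(
            name = "Frequent Item Sets",
            description = "The frequent itemsets found by an <i>Apriori</i> module.")
    static final String DATA_INPUT_FREQ_ITEM_SETS = "Frequent Item Sets";

    // Collect the display names of all fields annotated as input ports.
    static List<String> inputPortNames(Class<?> componentClass) {
        List<String> names = new ArrayList<>();
        for (Field field : componentClass.getDeclaredFields()) {
            ComponentInput input = field.getAnnotation(ComponentInput.class);
            if (input != null) {
                names.add(input.name());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(inputPortNames(PortSketch.class));
    }
}
```

Keeping the constant's value identical to the annotation's name, as the example from ComputeConfidence does, lets the component code refer to the port by the same string the user sees.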

=Component Properties=
Component properties are attributes and their values that are used to modify component behavior. All properties have a name, description, and default value.

==Property name:==

Properties should be given display names that will make sense to the user. These names appear next to property input boxes in the property dialog.

Code Sample:
name = "Maximum number of rounds"

==Property description:==

The property description should describe what the property controls. Properties that appear in many components (label name, for example) should be identified and documented consistently in every occurrence across all the components that have that property. Properties are currently presented in alphabetical order. Property descriptions should:

  • provide the user with an idea of valid ranges / selections.
  • provide the user with enough information to have an understanding of the implications of one setting over another

Code Sample:
description = "This property is..."

==Property defaultValue:==
A defaultValue must be given for every property so that the component can execute without user input.

Code Sample:
defaultValue = "70.0"

==Complete code example from ComputeConfidence for the component property annotation==

@ComponentProperty(description = "The percent of the examples containing a rule antecedent " +
"that must also contain the rule consequent before a potential association rule is accepted. " +
"This value must be greater than 0 and less than or equal to 100. ", name = "confidence",
defaultValue = "70.0")
final static String DATA_PROPERTY_CONFIDENCE = "confidence";
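Because defaultValue is declared as a String, a component has to convert the property to a number and enforce the documented range itself. The sketch below shows one way this might look for the confidence property above; it does not use any Meandre API, and ConfidencePropertySketch and parseConfidence are hypothetical names.

```java
// Hypothetical helper, not part of Meandre: parses and validates the property.
class ConfidencePropertySketch {
    // Same default as in the @ComponentProperty annotation above.
    static final String DEFAULT_VALUE = "70.0";

    // Fall back to the default when no value is supplied, then check the range.
    static double parseConfidence(String value) {
        String text = (value == null || value.trim().isEmpty()) ? DEFAULT_VALUE : value.trim();
        double confidence = Double.parseDouble(text);
        // Documented range: greater than 0 and less than or equal to 100.
        if (confidence <= 0.0 || confidence > 100.0) {
            throw new IllegalArgumentException(
                    "confidence must be in (0, 100], got " + confidence);
        }
        return confidence;
    }

    public static void main(String[] args) {
        System.out.println(parseConfidence(null));
        System.out.println(parseConfidence("85.5"));
    }
}
```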

=ComponentNature=
An annotation was added to handle the ability to package jar files for applets called from a component.
==type==
Type indicates that this component will be using an applet. No default value is given.

Code Sample:
type = "applet"

==extClass==
The extClass specifies the applet’s main class. No default value is given.

Code Sample:
extClass = com.applet.FooApplet.class

==dependency==

Dependency indicates any additional jar file dependencies that this applet needs. The default value is “”. Do not use the @Component.dependency attribute to list those dependencies.

Code Sample:
dependency = {"myAdditionalJar.jar"}

==resources==

Resources identifies any additional file dependencies that the applet needs, such as property files or other dependencies that are not jar libraries. The default value is “”. Do not use the @Component.resources attribute for these files, because files listed there are all added to the component jar file.

Code Sample:
resources = {"myProperty.txt"}

=Interface Definitions from Java Code=
The interface definitions for the component, its input and output ports, its properties, and the component nature are given below.

==Component==
public @interface Component {
public enum FiringPolicy { all, any};
public enum Licenses {UofINCSA, ASL_2, Other};
public enum Runnable {java,python,lisp};
String name();
String baseURL() default "http://www.seasrproject.org/components/";
String creator();
String description();
String tags();
Licenses rights() default Licenses.UofINCSA;
String format() default "java/class";
Runnable runnable() default Runnable.java;
FiringPolicy firingPolicy() default FiringPolicy.all;
String rightsOther() default "";
/*only jar dependencies*/
String[] dependency() default "";
/*property files or other dependencies that are not jar libraries*/
String[] resources() default "";
}
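To see how the required fields and the defaults interact, the sketch below applies a trimmed-down stand-in of the interface above to a class and reads the values back with reflection. The stand-in keeps only a few fields, and its runtime retention is an assumption made so the example runs; MyComponent, ComponentMetadataSketch, and summarize are hypothetical names.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Trimmed stand-in for the interface above; runtime retention is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@interface Component {
    enum FiringPolicy { all, any }
    String name();
    String creator();
    String description();
    String tags();
    String baseURL() default "http://www.seasrproject.org/components/";
    FiringPolicy firingPolicy() default FiringPolicy.all;
}

// Hypothetical component that sets only the four required fields.
@Component(
        name = "My Component",
        creator = "Jane Doe",
        description = "This component implements ...",
        tags = "_io_, input, filename")
class MyComponent {
}

class ComponentMetadataSketch {
    // Read the annotation back; unspecified fields report their defaults.
    static String summarize(Class<?> componentClass) {
        Component meta = componentClass.getAnnotation(Component.class);
        if (meta == null) {
            return "not a component";
        }
        return meta.name() + " by " + meta.creator()
                + " (firingPolicy=" + meta.firingPolicy() + ")";
    }

    public static void main(String[] args) {
        System.out.println(summarize(MyComponent.class));
    }
}
```

Omitting any of name, creator, description, or tags is a compile-time error, which is how the "required" fields are enforced.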

==Component Input==

public @interface ComponentInput {
String name();
String description();
}

==Component Output==

public @interface ComponentOutput {
String name();
String description();
}

==Component Property==
public @interface ComponentProperty {
String name();
String description();
String defaultValue();
}

==ComponentNature==
public @interface ComponentNature {
String type();
Class extClass();
String[] dependency() default "";
/*property files or other dependencies that are not jar libraries*/
String[] resources() default "";
}

==ComponentNatures==
public @interface ComponentNatures {
ComponentNature[] natures();
}
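The following sketch shows how the two nature annotations might be combined on a component that uses an applet. It declares its own stand-ins for the interfaces above, with runtime retention assumed so the example runs. Placing @ComponentNatures on the component class is also an assumption, and FooApplet, AppletComponent, NatureSketch, and describe are hypothetical names.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

// Stand-ins for the interfaces above; runtime retention is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@interface ComponentNature {
    String type();
    Class<?> extClass();
    String[] dependency() default "";
    String[] resources() default "";
}

@Retention(RetentionPolicy.RUNTIME)
@interface ComponentNatures {
    ComponentNature[] natures();
}

// Hypothetical placeholder for the applet's main class.
class FooApplet {
}

// Hypothetical component declaring one applet nature.
@ComponentNatures(natures = {
        @ComponentNature(
                type = "applet",
                extClass = FooApplet.class,
                dependency = {"myAdditionalJar.jar"},
                resources = {"myProperty.txt"})
})
class AppletComponent {
}

class NatureSketch {
    // Describe the first declared nature of a component class.
    static String describe(Class<?> componentClass) {
        ComponentNatures natures = componentClass.getAnnotation(ComponentNatures.class);
        if (natures == null || natures.natures().length == 0) {
            return "no natures";
        }
        ComponentNature nature = natures.natures()[0];
        return nature.type() + " -> " + nature.extClass().getSimpleName()
                + " " + Arrays.toString(nature.dependency());
    }

    public static void main(String[] args) {
        System.out.println(describe(AppletComponent.class));
    }
}
```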