4. Using the Taverna API to programmatically execute a workflow

Table of Contents

4.1. Initialising the Repository
4.2. Registering the repository with the Taverna SPI registry
4.3. Example application
4.4. Example source code

Since Taverna version 1.4 it has been possible to execute a workflow through the WorkflowLauncher helper class. However, since Raven was introduced in version 1.5, this has become non-trivial, because the application needs to be bootstrapped.

This section describes an approach to accessing the Taverna API directly, without the need to bootstrap your application. This is particularly useful if you want to use the Taverna API from an application or service and do not want to migrate your application to Maven and/or Raven.

What Raven provides for Taverna is dynamic loading of dependencies and components, such as processor types. Normally in Taverna everything except a tiny bootstrapper is loaded dynamically, which makes plugins and online updates possible. The approach in this section is a middle way: you keep normal Java dependencies on the parts of Taverna you need to interface with, and let Raven handle the rest. Instead of launching your application through a Raven bootstrapper, it is enough to initialise Raven programmatically.

Although this example relates to executing a workflow through the WorkflowLauncher, it also applies to accessing other parts of the Taverna API with careful selection of the relevant artifacts, for example for building a workflow using the API.

4.1. Initialising the Repository

To avoid the bootstrapped approach, initialise the Raven repository programmatically. This is achieved by using the LocalRepository method introduced with Taverna 1.5.2:

public Repository getRepository(File base, ClassLoader loader, Set<Artifact> systemArtifacts);

The parameter base is a File representing a directory to which any necessary artifacts will be downloaded. These are external artifacts that shouldn't be included in the classpath of the application that is using the Taverna API.

The parameter loader is the ClassLoader of the application that is invoking the Taverna API; this is where Raven will try to find any classes listed in systemArtifacts. Normally this will be your application's own classloader, and will therefore include your classpath.

The parameter systemArtifacts is a set of Artifacts that are included within the application invoking the Taverna API. Defining these tells Raven that classes within these artifacts can be found in the classloader provided as the loader parameter, and that it should not use its own internal classloaders to create instances of these classes. This is important because otherwise Raven may download these artifacts and load a second copy of the same classes through its own internal classloaders, and ClassCastExceptions will then occur wherever Taverna interacts with your application. In practice this has been found to be rare, but providing these system artifacts acts as a good safety net.
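As a minimal sketch of preparing the base argument, the helper below makes sure the download directory exists and is writable before it is handed to getRepository. It uses only the JDK; the class name RepositoryBase and the method prepare are hypothetical and are not part of the Taverna API.

```java
import java.io.File;

// Hypothetical helper, not part of Taverna or Raven.
final class RepositoryBase {

    /**
     * Returns the given directory, creating it (and any missing parents)
     * if it does not exist yet. Fails early if it cannot be used.
     */
    static File prepare(File base) {
        if (!base.isDirectory() && !base.mkdirs()) {
            throw new IllegalStateException("Cannot create repository directory: " + base);
        }
        if (!base.canWrite()) {
            throw new IllegalStateException("Repository directory is not writable: " + base);
        }
        return base;
    }
}
```

An application might call this as RepositoryBase.prepare(new File(System.getProperty("user.home"), ".myapp/repository")), where the path is only an illustration, and pass the result as base together with its own classloader as loader.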

Next, any external artifacts need to be defined. These are artifacts that are accessed through the Taverna SPI extension points and are loaded through Taverna's internal plugin machinery. In this example of invoking a workflow, these artifacts are the Processor artifacts. They are added to the Repository instance created previously through the method:

repository.addArtifact(Artifact artifact);

For these external artifacts, Raven needs to know where to find them. This is achieved by providing the repository with a list of Maven 2 repository URLs that contain the required artifacts. This should at least include:

http://www.mygrid.org.uk/maven/repository/ - the myGrid artifact repository
http://moby.ucalgary.ca/moby/moby_maven/ - Biomoby specific artifacts
http://www.ibiblio.org/maven2/ - the central Maven repository. You should also include some additional mirror sites.

Tip

The raven.properties included with Taverna (in the conf folder) provides a comprehensive list of mirrors assigned to the property raven.repository.<number>. Essentially here we are providing this same information programmatically.
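If you would rather reuse the mirror list from raven.properties than hard-code it, the entries can be read with java.util.Properties. The sketch below assumes the keys have the form raven.repository.<number> as described in the tip and returns the URLs ordered by that number; the class RavenMirrors and its methods are hypothetical helpers, not part of Taverna.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of Taverna or Raven.
final class RavenMirrors {
    private static final String PREFIX = "raven.repository.";

    /** Collects raven.repository.<number> values as URLs, ordered by number. */
    static List<URL> fromProperties(Properties props) {
        Map<Integer, URL> ordered = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(PREFIX)) {
                continue; // unrelated property
            }
            String suffix = key.substring(PREFIX.length());
            try {
                URL url = new URI(props.getProperty(key).trim()).toURL();
                ordered.put(Integer.valueOf(suffix), url);
            } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
                // Skip entries whose suffix is not a number or whose value
                // is not a usable absolute URL.
            }
        }
        return new ArrayList<>(ordered.values());
    }

    /** Convenience overload that parses properties-file text. */
    static List<URL> fromText(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fromProperties(props);
    }
}
```

Each URL in the returned list can then be registered with the repository in the same way as a hard-coded one.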

To finalise the initialisation of the repository the method

repository.update();

now needs to be called. This will download any missing artifacts to the local repository base directory defined previously, which can take a few minutes on the first run. Once they have been downloaded, subsequent calls will be quick as long as the repository location does not change. Optionally, you may wish to distribute a copy of this repository with your application and set the local repository directory accordingly.
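The order of the steps in this section can be sketched as follows. So that the sketch compiles without Taverna on the classpath, Repository and Artifact here are minimal stand-in interfaces declared inside the example; in a real application you would use Raven's own types and obtain the repository from getRepository as shown earlier. The method name addRemoteRepository is an assumption about how repository URLs are registered, so check it against the Raven version you are using.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.List;

final class RavenSetupSketch {

    /** Stand-in for Raven's Artifact type (hypothetical, for illustration). */
    interface Artifact { }

    /** Stand-in for the part of Raven's Repository used in this section. */
    interface Repository {
        void addArtifact(Artifact artifact);
        void addRemoteRepository(URL location); // assumed method name
        void update();
    }

    /**
     * Registers the external artifacts and the Maven 2 repository URLs,
     * then calls update() last so that missing artifacts are downloaded.
     */
    static void initialise(Repository repository,
                           List<? extends Artifact> externalArtifacts,
                           List<String> mirrorUrls) {
        for (Artifact artifact : externalArtifacts) {
            repository.addArtifact(artifact);
        }
        for (String mirror : mirrorUrls) {
            try {
                repository.addRemoteRepository(new URI(mirror).toURL());
            } catch (URISyntaxException | MalformedURLException e) {
                throw new IllegalArgumentException("Bad repository URL: " + mirror, e);
            }
        }
        repository.update();
    }
}
```

Keeping update() as the final call matches the text above: it is the step that resolves the registered artifacts against the registered repository locations.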