2.3. Running Taverna

You should be able to run an installed Taverna system by invoking 'runme.bat' on Windows or 'runme.sh' on Linux. If this fails, re-check the installation, the Java version and (on Linux only) the GraphViz version, and carefully read through the section on configuration. If you have done this and the system still doesn't work, please let us know via the mailing lists.
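If the start script fails, a quick way to re-check the Java and GraphViz installations is to ask each tool for its version. The following is a minimal sketch and not part of Taverna: the helper names 'find_tool' and 'tool_version' are made up for illustration, and it assumes that 'java' and GraphViz's 'dot' are expected to be on your PATH. It only reports what is installed; see the installation section for the versions Taverna requires.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


def tool_version(name, flag):
    """Run the tool with a version flag and return the first line it prints."""
    path = find_tool(name)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, flag], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Version banners often go to stderr rather than stdout, so read both.
    lines = (result.stdout + result.stderr).strip().splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print("java:", tool_version("java", "-version") or "not found on PATH")
    print("dot: ", tool_version("dot", "-V") or "not found on PATH")
```

A "not found on PATH" line for either tool points to an installation problem rather than a Taverna configuration problem.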

2.3.1. Enacting a predefined workflow

As a first quick demonstration of Taverna you can load one of the example workflows. Select 'open workflow' from the 'File' menu. You should be presented with a file chooser dialog box, the exact appearance of which is determined by your operating system.

2.3.1.1. Loading a workflow

Workflow examples are stored in the 'examples' subdirectory of your Taverna installation, so use the file browser to navigate to this directory and select the 'SNPsForRegionsSurroundingGene' workflow. The Advanced Model Explorer and Workflow Diagram panels should then be populated with the workflow defined in this file. There may be a short delay while Taverna contacts the network to get more information about the resources the workflow uses. If the delay is very long and ends in an error, you probably need to check your proxy settings; see the configuration section for more details.

2.3.1.2. Enacting the current workflow

Aside from creating and editing, the most useful thing to be able to do with a workflow is to run it. Select the 'Run Workflow' option from the 'File' menu. If your workflow requires input, you will be presented with a new window in which to enter the starting data on which the workflow is to run. In this case the workflow has a single input called 'GeneIDList'. Select 'GeneIDList' and right-click. You should see that the 'New Input' and 'New List' options are now available at the top of the window. Click on 'New Input' to create a single data input to the workflow. In the panel that has just appeared, replace the text 'Some input data goes here' with the input term; try entering 'ENSG00000131959'. Once you have entered this input data, press the 'Run Workflow' button at the bottom of this window to start the workflow engine on the supplied input.

The workflow run will be displayed in the 'Results' window. This window shows you the progress of the workflow and also the results on completion. Additionally, if anything goes wrong with the enactment you can interrogate other parts of this display to find out exactly what has happened (but we won't cover this here). Initially you should see a table listing the component operations along with their status. This table updates until the workflow has finished. You can inspect the inputs and outputs of any given part of the workflow by selecting the process you're interested in and viewing the intermediate values in the bottom half of the window. Once the workflow completes you will jump into the results view.

2.3.1.3. Browsing results

This particular workflow has two outputs called 'ReportList' and 'GeneIDList'. Select 'ReportList' from the tabs. Select the only available item in the list on the left hand side of the window and repeatedly expand the list. You should see a list of results appear in the right hand side of the window. If this worked then congratulations: you have installed and configured Taverna correctly and run your first workflow.

2.3.1.4. Saving a workflow

To save the workflow, select the 'Save' option from the 'File' menu and save the document in the usual fashion for your operating system. There is not much point in doing this for now, as we haven't changed anything.

When saving a workflow you can choose to save any nested workflows in their full form rather than just as URLs. Check the box marked "include full nested workflow" before selecting "Save".

2.3.1.5. Closing a workflow

To finish, and to prepare for the next part of the introduction, you'll want to reset the workflow. Select the 'Design' view from the top of the workbench and then click on the red 'close workflow' icon in the 'File' menu. This only resets the definition; your results from the workflow invocation are still there. You can clear these by selecting 'close' in the top right corner of the 'Results' window.

2.3.2. Creating a (very) simple workflow

2.3.2.1. Workflow inputs and outputs

First things first: the workflow needs to have an input, in this case the ID of a sequence to fetch. You can create a new workflow input by right-clicking on the 'Workflow inputs' node in the Advanced Model Explorer in the 'Design' view and selecting 'Create new input'. You will then be asked for a name for the input. You can change this later, but for now use 'sequenceID'. Click on 'OK' and you should see your new input appear in the Advanced Model Explorer and Workflow Diagram windows.

Similarly, the workflow will need an output. Follow the equivalent process, but this time right-click on the 'Workflow outputs' node and use the name 'sequence' to create an output. Again, you should see the output appear in the two windows.

2.3.2.2. A single sequence fetch component

Now that the workflow has an input and an output, it needs something in between to actually do the work. If you're familiar with the EMBOSS tool suite you might be aware that there's a program called 'seqret' which does exactly this: it fetches a sequence from an ID. Fortunately the default services available from Taverna include a system called Soaplab at the EBI which contains all EMBOSS tools, including seqret. Go to the 'Available Services' window and either scroll down until you find it or, more sensibly, enter 'seqret' into the search box at the top and press return. You should see the tree narrow to show you two matching operations (shown in yellow to denote Soaplab services), 'seqretsplit' and 'seqret'.

To add the operation to the workflow, drag the 'seqret' service from the 'Available Services' window into the 'Advanced Model Explorer' window, dropping the service into the empty space on the right of this window. You should see a new entry under the 'Processors' node called 'seqret', with an array of child items hanging off it. These are the available inputs and outputs of the newly created processor.

2.3.2.3. Connecting everything together

You should see the workflow input, processor and workflow output in the 'Workflow Diagram' window, but they are not linked together yet. To link them, right-click on the 'sequenceID' input in the 'Advanced Model Explorer' and select the 'seqret' child menu (under 'connect to…processor…seqret'). This child menu shows you all the inputs of the seqret tool to which you can link your workflow input. The exact details of these inputs depend on the processor. In this case you want to link the 'sequenceID' workflow input to the 'sequence_usa' processor input. Select 'sequence_usa' from the child menu and you should see the link appear in the workflow diagram.

The remaining link in the workflow is from the output of the processor to the workflow output. Conceptually we always link from the source of the information to its destination (following the flow of data), so you'll need to expand the 'seqret' node in the Advanced Model Explorer (if it isn't already expanded) and scroll down to see the available outputs, denoted by purple circles with outgoing arrows. There are two outputs from this processor, 'report' and 'outseq'; the one we want to use is 'outseq'. As with the first link, right-click on the 'outseq' node and follow the 'Workflow outputs' child menu. There is only one workflow output, so select it and you should see the now completed workflow in the 'Workflow Diagram' window.
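The two links made in this section can be pictured as a small table of source-to-destination pairs. The sketch below is purely illustrative: it is not Taverna's file format or API, and the 'Link' tuple, the label strings and the 'unlinked_sinks' helper are hypothetical names. Only the port names ('sequenceID', 'sequence_usa', 'outseq', 'sequence') come from the text.

```python
from collections import namedtuple

# A link always runs from the source of the data to its destination.
Link = namedtuple("Link", ["source", "sink"])

links = [
    Link(source="workflow input: sequenceID", sink="seqret: sequence_usa"),
    Link(source="seqret: outseq", sink="workflow output: sequence"),
]

# Destinations that must receive data for this workflow to be complete.
required_sinks = ["seqret: sequence_usa", "workflow output: sequence"]


def unlinked_sinks(links, required):
    """Return the required destinations that no link feeds yet."""
    fed = {link.sink for link in links}
    return [sink for sink in required if sink not in fed]


print(unlinked_sinks(links, required_sinks))      # []
print(unlinked_sinks(links[:1], required_sinks))  # ['workflow output: sequence']
```

The second call drops the 'outseq' link, which corresponds to the state of the workflow before the final step above: the workflow output exists but nothing feeds it.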

2.3.2.4. Describing the input

You can add descriptive information to the workflow inputs. This information is useful to guide potential users of your new workflow. Select the 'sequenceID' node by left-clicking on it in the 'Advanced Model Explorer'. You should see a tab at the top of the window with a purple icon and the text "Metadata for 'sequenceID'". Select this tab by clicking on it.

This is the metadata editor. Select the 'Description' tab and enter some description of the input, for example a quick paragraph explaining that this input should hold the sequence ID to fetch, perhaps with an example value ('swallid:ops2_*' is a reasonable one). Click the 'Update' button when you're done and go back to the 'Workflow' tab to return to the default view of the workflow.

2.3.2.5. Enacting the workflow

You should now be able to run the workflow just as in the previous section. Try entering your example 'swallid:ops2_*' as the input (without the quotes). You should see the status display (briefly) followed by the results which, all being well, should contain the fetched sequences in FASTA format. If this worked then great, you've created a simple workflow from scratch and enacted it.
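If you want to check the result outside Taverna, FASTA is simple to read: each record starts with a header line beginning with '>', followed by one or more lines of sequence. The following is a minimal sketch of a reader, assuming you have copied the result text into a string. The 'parse_fasta' helper is a hypothetical name, and the sample records are made up for illustration; they are not real output from seqret.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample data, not real seqret output.
sample = """>example_1 first placeholder record
MNGTEGPNFY
VPFSNKTGVV
>example_2 second placeholder record
MAQQWSLQRL
"""

for header, sequence in parse_fasta(sample):
    # Print each record's identifier and sequence length.
    print(header.split()[0], len(sequence))
```

Because the example input ends in a wildcard, the result may contain several records, which is why the sketch returns a list rather than a single sequence.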