How can I get Taverna to execute tasks in parallel?
Taverna runs tasks in parallel whenever possible. In fact, it is
serialising execution that takes effort: where you need it, you have to
ask for it explicitly by using "Coordinate from".
As a dataflow-oriented workflow language, SCUFL runs a processor as soon
as all the data it needs is available. So if you as the workflow
designer just worry about the dataflow, Taverna will usually handle the
parallel calls.
What can cause problems is that not all data becomes available at the
same time. For instance, a processor that takes three parameters as
inputs, fed from three different processors upstream, will not execute
until all three parameters are present.
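
The firing rule described above can be imitated in plain Java. The
following is only a rough sketch and does not use any Taverna or SCUFL
API: the class DataflowJoin, the helper upstream() and the delays are
hypothetical stand-ins for three upstream processors and the processor
they feed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DataflowJoin {
    // Hypothetical upstream processor: produces its value after a delay.
    static CompletableFuture<String> upstream(String value, long delayMillis,
                                              ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return value;
        }, pool);
    }

    // The downstream processor takes three inputs and fires only once
    // all three have arrived; the upstream calls run in parallel.
    static String runWorkflow() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            CompletableFuture<String> a = upstream("A", 300, pool);
            CompletableFuture<String> b = upstream("B", 100, pool);
            CompletableFuture<String> c = upstream("C", 200, pool);
            return CompletableFuture.allOf(a, b, c)
                    .thenApply(ignored -> a.join() + b.join() + c.join())
                    .join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String result = runWorkflow();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result + " after about " + elapsedMillis + " ms");
    }
}
```

Because the three upstream calls overlap, the downstream step waits
roughly as long as the slowest input rather than the sum of all three.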
Another issue has to do with lists. Although Taverna's implicit
iteration means that you can provide a list to a processor that only
cares about single inputs (such as "Concatenate two strings"), the way
the iteration is done in Taverna 1.x is such that the processor is
temporarily turned into one that takes a list as input and produces a
list as output.
As with any processor taking a list as an input, it can't start until
the full list has arrived from upstream. The same applies to the
output: the next processors downstream can't start until the full
result list has been returned. This makes sense for a processor that,
for instance, is to find an average of the inputs, but isn't really
necessary for implicit iterations.
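
To make the list-in, list-out behaviour concrete, here is a small Java
analogy. It is not how Taverna 1.x is implemented internally; the class
ImplicitIteration and the method liftToList() are hypothetical names
chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ImplicitIteration {
    // Wraps a single-item processor so that it consumes a complete list
    // and hands back a complete list.
    static <A, B> Function<List<A>, List<B>> liftToList(Function<A, B> single) {
        return inputs -> {
            List<B> outputs = new ArrayList<>(inputs.size());
            for (A item : inputs) {
                outputs.add(single.apply(item));
            }
            // Nothing is visible downstream before this return, even
            // though the first output was ready long ago.
            return outputs;
        };
    }

    public static void main(String[] args) {
        Function<String, String> addSuffix = s -> s + "_x";
        Function<List<String>, List<String>> lifted = liftToList(addSuffix);
        System.out.println(lifted.apply(Arrays.asList("a", "b", "c")));
    }
}
```

The wrapper needs the whole input list before it is called and releases
its results only as a whole list, which is the barrier described above.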
The enactor of Taverna 2 will improve on this and introduce streams,
which means that processors which just 'pipe through' the items of a
list can start processing as soon as the first item has been produced
upstream. This can be compared to iterators in Java.
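
The comparison with Java iterators can be sketched as follows. This only
illustrates item-by-item piping in general; it says nothing about how
the Taverna 2 enactor is actually built. StreamingSketch, pipe() and the
log are hypothetical, and the sketch is single-threaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class StreamingSketch {
    // Wraps an upstream iterator so that each item is processed only
    // when the downstream side asks for it. The log records the order
    // in which the steps handle items.
    static <A, B> Iterator<B> pipe(Iterator<A> upstream, Function<A, B> step,
                                   String name, List<String> log) {
        return new Iterator<B>() {
            @Override
            public boolean hasNext() {
                return upstream.hasNext();
            }

            @Override
            public B next() {
                A item = upstream.next();
                log.add(name + ":" + item);
                return step.apply(item);
            }
        };
    }

    // Chains two pass-through steps and drains the result.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Iterator<String> source = Arrays.asList("a", "b").iterator();
        Iterator<String> first = pipe(source, s -> s + "1", "first", log);
        Iterator<String> second = pipe(first, s -> s + "2", "second", log);
        while (second.hasNext()) {
            second.next();
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [first:a, second:a1, first:b, second:b1]
        System.out.println(run());
    }
}
```

The log is interleaved: the second step has already handled the first
item before the first step sees the second one, so no step waits for a
complete list.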