r10 - 17 May 2006 - 10:41:01 - JuneFinch

High Level Requirement Specification

Allow Looping

Reference: Techreq.AllowLooping
Referenced use-cases: MiasGrid, LandscapeGenomics
Dependencies: T2Architecture
Champion: Tom Oinn
Status: DEFERRED

                 Taverna 1.3   Taverna 2.0
Priority                       3
Rough estimate   -             MONTH

Overview

The ability for a processor of any type to be invoked multiple times within a loop, each time with different pieces of data. This allows 'feedback' into a previously invoked processor. It differs from the existing iteration mechanism, which invokes a processor multiple times over a set of data but then continues downstream and never returns to that processor during the workflow execution.

Overall Goals

This requirement needs to allow a processor to be defined with two different sets of inputs: the initial inputs, whose source is upstream, and inputs fed back from a processor downstream.

Workflows need to be constructed so that there is a condition under which the loop will end. This can currently be handled with the conditional local widgets (Fail if true and Fail if false), but that will become unwieldy for workflows with many loops. An additional conditional mechanism for breaking out of loops may be required, together with safety conditions that prevent a workflow from never ending (e.g. a limit on the number of loops).
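As a rough illustration of the exit condition and loop limit described above, here is a minimal Python sketch. It is not Taverna code: the names `run_loop`, `processor`, `should_exit`, `max_loops` and `LoopLimitExceeded` are hypothetical and chosen only for this example.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class LoopLimitExceeded(RuntimeError):
    """Raised when the loop limit is reached before the exit condition holds."""


def run_loop(processor: Callable[[T], T],
             should_exit: Callable[[T], bool],
             initial: T,
             max_loops: int) -> T:
    """Feed the processor's output back into it until should_exit is true."""
    value = initial
    for _ in range(max_loops):
        value = processor(value)   # one invocation of the (hypothetical) processor
        if should_exit(value):     # exit condition, checked after every invocation
            return value
    # Safety condition: the workflow is not allowed to run forever.
    raise LoopLimitExceeded(f"no exit after {max_loops} invocations")


# Halve a number until it drops below 1: 10 -> 5 -> 2.5 -> 1.25 -> 0.625
result = run_loop(lambda x: x / 2, lambda x: x < 1, 10.0, max_loops=100)
print(result)  # 0.625
```

Raising an error when the limit is hit is only one possible design. A workflow engine might instead return the last value or mark the processor as failed.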

It is also possible that looping could result in recursive behaviour, e.g. by feeding outputs from a processor back into the inputs of the workflow before the current workflow has completed. To safeguard against infinite recursion, a bailout depth or bailout condition should be enforced.

Looping will have a significant impact on the semantics describing the states of a workflow, and these will need updating.

Assessment

Looping itself cannot be supported (see Tom's notes below). However, the "real" desired functionality, which could be recursion, parameter sweep or iteration, is covered by the dispatch stack architecture in T2.

Affected Components

Taverna workbench, Freefluo enactment engine, Provenance, Scufl Model

Key Tasks

  • Extend the Scufl Model to allow for additional inputs for a given processor as well as the initial inputs, whilst maintaining backward compatibility with all existing Scufl Models.
  • Modify the Processor architecture to allow for the above change.
  • Design and implement a mechanism for handling the conditions for exiting a loop. This may require a new conditional processor, or may be achievable using the current Fail if true and Fail if false local widgets.
  • Update the semantic language to accommodate the new logic and possible new states caused by the addition of looping.

Surely the solution described below supersedes this? Actually, I'll make that less interrogative: the solution below is a better and cleaner way of doing this, and we will never allow loops in the structure of the dataflow graph.

Appendix

Recursive Invocation Example for Taverna 2

Recursive invocation is the repeated invocation of an operation where the input to each invocation can include results from the previous invocation in the input data set. The configuration of a recursive invocation therefore consists of three aspects:

  • A set of mappings from output to input port names. If a mapping exists from output ‘a’ to input ‘b’, then on every invocation after the first, the result from output port ‘a’ of the previous invocation is inserted into input port ‘b’ of the new input set.
  • A recursion condition predicated on the input and output values for a given invocation, which determines whether the result should be passed back into the next invocation or propagated back up the stack.
  • A maximum recursion depth or bailout condition preventing infinite recursion and guaranteeing termination.
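The three configuration aspects listed above could be modelled along the following lines. This is a hedged Python sketch rather than the Taverna 2 implementation: `RecursionConfig`, `next_inputs`, `should_recurse`, `max_depth` and the port names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

DataSet = Dict[str, Any]  # port name -> value (assumed representation)


@dataclass
class RecursionConfig:
    # Output port name -> input port name, applied on invocations after the first.
    mappings: Dict[str, str]
    # Given the (inputs, outputs) of one invocation, True means "invoke again".
    should_recurse: Callable[[DataSet, DataSet], bool]
    # Maximum recursion depth, i.e. the bailout that guarantees termination.
    max_depth: int


def next_inputs(config: RecursionConfig, inputs: DataSet, outputs: DataSet) -> DataSet:
    """Assemble the next input set from the previous inputs and results."""
    new_inputs = dict(inputs)                    # unmapped inputs carry over unchanged
    for out_port, in_port in config.mappings.items():
        new_inputs[in_port] = outputs[out_port]  # result from 'a' inserted into 'b'
    return new_inputs


config = RecursionConfig(
    mappings={"a": "b"},
    should_recurse=lambda ins, outs: outs["a"] < 100,
    max_depth=10,
)
print(next_inputs(config, {"b": 1, "step": 2}, {"a": 3}))  # {'b': 3, 'step': 2}
```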

This dispatcher handles incoming job events from the layer above and incoming result set events from the layer below. When a job is received, it is logged within the dispatcher state and propagated down to the next layer in the stack. The dispatcher keeps a reference to the input state for every job index, along with a recursion count initially set to zero.

When a result event is received, the dispatcher locates the previously stored input set and checks for bailout (whether the recursion depth has exceeded that allowed in the configuration). Assuming the depth is less than the bailout, it passes both the input data set and the result data set into the recursion condition check. If the check determines that recursion should not be applied, or if the bailout condition is hit, the result set is propagated up to the next stage in the stack. If recursion is to be applied, the mapping configuration is used to assemble a new input data set from the previous input set and the result data, and this new job is propagated down to the next layer in the stack.
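The job and result handling described above can be sketched as a single dispatch layer. This Python sketch rests on simplifying assumptions: layers are wired together with plain callables (`send_down`, `send_up`), events are delivered synchronously, and every name (`RecursionLayer`, `receive_job`, `receive_result`) is hypothetical rather than part of any Taverna API.

```python
from typing import Any, Callable, Dict, Tuple

DataSet = Dict[str, Any]  # port name -> value (assumed representation)


class RecursionLayer:
    """Hypothetical dispatch layer between a layer above and a layer below."""

    def __init__(self,
                 mappings: Dict[str, str],
                 should_recurse: Callable[[DataSet, DataSet], bool],
                 max_depth: int,
                 send_down: Callable[[int, DataSet], None],
                 send_up: Callable[[int, DataSet], None]) -> None:
        self.mappings = mappings
        self.should_recurse = should_recurse
        self.max_depth = max_depth
        self.send_down = send_down  # passes a job to the layer below
        self.send_up = send_up      # passes a result to the layer above
        # job index -> (stored input set, recursion count)
        self.state: Dict[int, Tuple[DataSet, int]] = {}

    def receive_job(self, index: int, inputs: DataSet) -> None:
        self.state[index] = (inputs, 0)  # log the job with a count of zero
        self.send_down(index, inputs)

    def receive_result(self, index: int, outputs: DataSet) -> None:
        inputs, depth = self.state[index]
        # Bailout is checked first; the condition runs only below the limit.
        if depth >= self.max_depth or not self.should_recurse(inputs, outputs):
            del self.state[index]
            self.send_up(index, outputs)
            return
        new_inputs = dict(inputs)
        for out_port, in_port in self.mappings.items():
            new_inputs[in_port] = outputs[out_port]
        self.state[index] = (new_inputs, depth + 1)
        self.send_down(index, new_inputs)


results: Dict[int, DataSet] = {}


def below(index: int, inputs: DataSet) -> None:
    # Stand-in for the layer below: doubles 'x' and reports it on port 'y'.
    layer.receive_result(index, {"y": inputs["x"] * 2})


def above(index: int, outputs: DataSet) -> None:
    results[index] = outputs


layer = RecursionLayer({"y": "x"}, lambda ins, outs: outs["y"] < 50, 10, below, above)
layer.receive_job(0, {"x": 1})
print(results)  # {0: {'y': 64}}
```

In this run the output is fed back five times (2, 4, 8, 16, 32) before the condition fails at 64, so the bailout depth of 10 is never reached.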

(Taverna v2 Aims and Vision, TomOinn)
