High Level Requirement Specification
Dealing with Complex Events in Provenance
| | Taverna 1.3 | Taverna 2.0 |
| Priority | | |
| Rough estimate | WEEK | - |
Overview
Capturing and storing provenance information from workflows with complex structures.
At present, events are generated for each process in a workflow execution, except for processes inside nested workflows. For a nested workflow, only the initial inputs and final outputs are exposed, so the execution of a potentially complex workflow fragment cannot be recorded and therefore cannot be reproduced. Similarly, the ability to record and reproduce iterative processes is limited.
Overall Goals
Alter the event handling model to enable the capture of events (and therefore the provenance) generated as part of nested workflows and iterative processes. This will extend the functionality of the provenance components, KAVE and the provenance browser, so that they can support real-world biological problems.
WorkflowCompletionEvents should identify whether they result from the completion of the parent workflow or of a nested workflow.
Similarly, WorkflowFailureEvents should distinguish between a failure of the whole workflow and a failure of a nested workflow.
If a ProcessCompletionEvent is the result of an iteration, then it should be possible to navigate from that event to all the IterationCompletionEvents that occurred as part of that process.
A need for a ProcessCreationEvent (in addition to a WorkflowCreationEvent) has been identified, but adding it would currently break the API with regard to EventListeners. This API is used by third parties, so it cannot be changed freely. A workaround has been found for Taverna 1.3, but the issue needs to be considered in the Taverna 2.0 design.
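As a rough illustration of the first two goals, the sketch below shows one way a listener could tell parent and nested completion events apart if the nested event is modelled as a subtype of the existing one. The class bodies, the `WorkflowEventListener` interface, its `workflowCompleted` method and the `getParentWorkflowId` accessor are assumptions made for this example and are not taken from the Taverna API.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedEventSketch {

    // Simplified stand-in for the existing event; only a workflow id is modelled (assumption).
    static class WorkflowCompletionEvent {
        private final String workflowId;
        WorkflowCompletionEvent(String workflowId) { this.workflowId = workflowId; }
        String getWorkflowId() { return workflowId; }
    }

    // Proposed subtype for nested workflows; the parent id field is hypothetical.
    static class NestedWorkflowCompletionEvent extends WorkflowCompletionEvent {
        private final String parentWorkflowId;
        NestedWorkflowCompletionEvent(String workflowId, String parentWorkflowId) {
            super(workflowId);
            this.parentWorkflowId = parentWorkflowId;
        }
        String getParentWorkflowId() { return parentWorkflowId; }
    }

    // Hypothetical listener interface; its method signature does not change
    // when the subtype is introduced.
    interface WorkflowEventListener {
        void workflowCompleted(WorkflowCompletionEvent event);
    }

    // A listener that records whether each completion came from a nested workflow.
    static class RecordingListener implements WorkflowEventListener {
        final List<String> log = new ArrayList<String>();

        @Override
        public void workflowCompleted(WorkflowCompletionEvent event) {
            if (event instanceof NestedWorkflowCompletionEvent) {
                NestedWorkflowCompletionEvent nested = (NestedWorkflowCompletionEvent) event;
                log.add("nested " + nested.getWorkflowId() + " in " + nested.getParentWorkflowId());
            } else {
                log.add("top-level " + event.getWorkflowId());
            }
        }
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        listener.workflowCompleted(new NestedWorkflowCompletionEvent("wf-1.1", "wf-1"));
        listener.workflowCompleted(new WorkflowCompletionEvent("wf-1"));
        for (String line : listener.log) {
            System.out.println(line);
        }
    }
}
```

Under these assumptions, a listener that is unaware of the subtype would still receive nested events through the same method, which is one reason subclassing may be less disruptive to third parties than adding a new listener method.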
Affected Components
Taverna workbench, freefluo enactment engine, data store and metadata store, KAVE, Provenance
Key Tasks
- Extend WorkflowCompletionEvent to create a new event NestedWorkflowCompletionEvent
- Extend WorkflowFailureEvent to create a new event NestedWorkflowFailureEvent
- Update Enactor to fire the correct type of event depending upon whether it is fired by the parent or a nested workflow
- Update ProcessCompletionEvent to cache related IterationCompletionEvents
- Update the Enactor to gather IterationCompletionEvents as they are fired, and then associate them with the final ProcessCompletionEvent
- Refactor ProcessorTask so that it deals correctly with failures during an iteration. Currently a failure is associated only with the overall process, not with a particular iteration.
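The iteration-related tasks above can be sketched as follows. This is a minimal illustration only: the `IterationCollector` helper, the `ProcessFailureEvent` class as written here, and all fields and accessors are hypothetical names chosen for the example, not the existing Taverna or freefluo classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class IterationEventSketch {

    // Simplified stand-in: records only which iteration completed (assumption).
    static class IterationCompletionEvent {
        private final int index;
        IterationCompletionEvent(int index) { this.index = index; }
        int getIndex() { return index; }
    }

    // Completion event that caches the iteration events belonging to the process.
    static class ProcessCompletionEvent {
        private final String processorName;
        private final List<IterationCompletionEvent> iterations;

        ProcessCompletionEvent(String processorName, List<IterationCompletionEvent> iterations) {
            this.processorName = processorName;
            // Defensive copy so later changes in the collector do not alter the event.
            this.iterations = Collections.unmodifiableList(
                    new ArrayList<IterationCompletionEvent>(iterations));
        }
        String getProcessorName() { return processorName; }
        List<IterationCompletionEvent> getIterations() { return iterations; }
    }

    // Hypothetical failure event that names the iteration in which the failure occurred.
    static class ProcessFailureEvent {
        private final String processorName;
        private final int failedIteration;
        private final Exception cause;

        ProcessFailureEvent(String processorName, int failedIteration, Exception cause) {
            this.processorName = processorName;
            this.failedIteration = failedIteration;
            this.cause = cause;
        }
        String getProcessorName() { return processorName; }
        int getFailedIteration() { return failedIteration; }
        Exception getCause() { return cause; }
    }

    // Hypothetical per-process helper that an enactor could use to gather iteration events.
    static class IterationCollector {
        private final String processorName;
        private final List<IterationCompletionEvent> gathered =
                new ArrayList<IterationCompletionEvent>();

        IterationCollector(String processorName) { this.processorName = processorName; }

        // Called each time an iteration finishes successfully.
        void iterationCompleted(int index) {
            gathered.add(new IterationCompletionEvent(index));
        }

        // Called once the whole process finishes; attaches everything gathered so far.
        ProcessCompletionEvent processCompleted() {
            return new ProcessCompletionEvent(processorName, gathered);
        }

        // Called when an iteration fails; the failure keeps its iteration index.
        ProcessFailureEvent iterationFailed(int index, Exception cause) {
            return new ProcessFailureEvent(processorName, index, cause);
        }
    }

    public static void main(String[] args) {
        IterationCollector collector = new IterationCollector("blast");
        collector.iterationCompleted(0);
        collector.iterationCompleted(1);
        ProcessCompletionEvent done = collector.processCompleted();
        System.out.println(done.getProcessorName() + " completed with "
                + done.getIterations().size() + " iterations");

        ProcessFailureEvent failure =
                collector.iterationFailed(2, new IllegalStateException("service unavailable"));
        System.out.println("failure in iteration " + failure.getFailedIteration());
    }
}
```

Keeping the gathered events in a per-process helper, rather than in the event class itself, is only one possible design; it keeps the completion event immutable once it has been fired.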
Appendix