High Level Requirement Specification
The publishing of workflows.
| Reference | Techreq.PublishingWorkflows |
| Referenced Use-cases | |
| Dependencies | ApiSep |
| Champion | Tom Oinn |
| Status | PRIORITISED |
| | Taverna 1.3 | Taverna 2.0 |
| Priority | 2 | |
| Rough estimate | MONTHS | |
Overview
Purpose
Beneficiaries
- myGrid users looking to build workflows based on prior work.
Scope
- The tool allows the publication, versioning and dissemination of workflows and their input data. It provides for uploading and downloading of workflows both through Web browsing and in batch.
References
- The 'repository' module in Taverna's CVS includes servlets and JSPs that accept workflow submissions, autogenerate their diagrams, and present the diagrams with hyperlinks to larger versions, alongside the HTML summary text as a table. It can run workflows, but not in a useful sense; the code is generally there, though.
Overall Goals
Workflow Documentation Template
- Offer users a quick and standardised format for documenting workflows. Encourage users to fill in a workflow fact sheet when submitting workflows.
Annotations and comments
- In a wiki-ish way, allow anyone to annotate, tag and/or comment on the workflows. Then, even if a good workflow is uploaded with almost no facts, other users could write, say, "This is a good example of blah blah analysis", and possibly provide (links to) newer versions that fix outdated services etc.
Versioning/evolution of workflows
- Perhaps a CVS/Subversion based solution
A mechanism to upload/download/delete workflows and workflow data, with a user authentication/quota system
- Through a Web browser
- From a button within Taverna, "File -> Publish workflow"
- In batch, preferably through a combination of publicly available tools/protocols, such as WebDAV / LDAP / SSH.
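The quota part of this mechanism could be sketched roughly as below. This is only an illustration under assumptions: the names UserAccount, QuotaExceeded, record_upload and record_delete are hypothetical, and the per-user byte quota is one possible policy, not something this specification has decided.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when an upload would push a user past their quota."""


@dataclass
class UserAccount:
    # Hypothetical account record; field names are assumptions.
    name: str
    quota_bytes: int
    used_bytes: int = 0


def record_upload(account: UserAccount, size_bytes: int) -> int:
    """Account for an upload and return the remaining quota in bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if account.used_bytes + size_bytes > account.quota_bytes:
        raise QuotaExceeded(
            f"{account.name} would exceed quota of {account.quota_bytes} bytes"
        )
    account.used_bytes += size_bytes
    return account.quota_bytes - account.used_bytes


def record_delete(account: UserAccount, size_bytes: int) -> int:
    """Release space after a delete and return the remaining quota in bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    # Never let the usage counter drop below zero.
    account.used_bytes = max(0, account.used_bytes - size_bytes)
    return account.quota_bytes - account.used_bytes
```

The same check would apply whichever route (Web browser, Taverna button, batch) delivers the upload, which is one argument for keeping it on the repository side.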
Collection of general use workflows to get people started
- Sample inputs for each workflow (also helps to test which workflows are still alive)
Mechanism to auto-test workflows
- Overnight polling, checking how many and which of the constituent services are dysfunctional or inaccessible. It's likely we can represent workflows as Web services, in which case they become a target for BioNanny (http://bionanny.sourceforge.net/), and it can do the QoS data gathering.
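A minimal sketch of what such a polling pass might compute is given below. It assumes each workflow can be reduced to a list of service URLs and that some probe function decides whether a URL is reachable; WorkflowHealth, check_workflows and is_reachable are hypothetical names, and nothing here uses BioNanny's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class WorkflowHealth:
    # Hypothetical per-workflow result of one polling pass.
    workflow: str
    total_services: int
    failed_services: List[str]

    @property
    def operational(self) -> bool:
        # A workflow counts as operational only if no service failed.
        return not self.failed_services


def check_workflows(
    workflows: Dict[str, List[str]],
    is_reachable: Callable[[str], bool],
) -> List[WorkflowHealth]:
    """Probe every service once and report failures per workflow."""
    cache: Dict[str, bool] = {}  # services shared by workflows are probed once
    report: List[WorkflowHealth] = []
    for name, services in workflows.items():
        failed: List[str] = []
        for url in services:
            if url not in cache:
                cache[url] = is_reachable(url)
            if not cache[url]:
                failed.append(url)
        report.append(WorkflowHealth(name, len(services), failed))
    return report
```

Passing the probe in as a function keeps the sketch independent of how reachability is actually tested (an HTTP request, a WSDL fetch, or a call out to a monitoring tool).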
Calculation and publication of quality metrics/QoS metadata
- Indication of which workflows are operational
- Submission date of workflows
- Last successful run
- Top 10 of most popular/downloaded workflows
- 5* rating mechanism (as in iTunes)
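Two of these metrics, the download ranking and the star rating, could be computed roughly as follows. This is a sketch under assumptions: downloads are taken to be a plain log of workflow names and ratings a list of integers from 1 to 5; top_downloaded and average_rating are hypothetical names.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def top_downloaded(downloads: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most downloaded workflows with their download counts."""
    return Counter(downloads).most_common(n)


def average_rating(stars: Iterable[int]) -> Optional[float]:
    """Return the mean star rating rounded to one decimal, or None if unrated."""
    values = list(stars)
    if not values:
        return None
    for s in values:
        if not 1 <= s <= 5:
            raise ValueError(f"rating out of range: {s}")
    return round(sum(values) / len(values), 1)
```

Returning None for an unrated workflow, rather than 0, lets the repository distinguish "no ratings yet" from "rated badly".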
Integration with Grimoires
Assessment
This is outside of the codebase. No other dependencies.
Affected Components
Key Tasks
- Identify key use cases for the workflow repository, think "Who is this really for?"
- Investigate and decide platforms:
For the repository:
* SQL
* RDF
* Plain files (as in today's WebDAV)
* CVS/Subversion-versioned files
For the metadata (annotations, scores, etc.):
* Same as above
How to communicate with the repository:
* Web service (easy to integrate with Taverna)
* WebDAV? (solutions exist, but there are problems with directory structure and access control)
* Just a web form (which we need anyway)
Appendix
Also talk to Stian about his testrepo for some ideas.