
This is my interpretation of Juk's scenario as described in an e-mail of 22nd January 2010, significantly updated following the Skype call of 27th January 2010.

Initial creation of the overall workflow

An overall workflow is created within Taverna.  It includes services to gather data, perform data mining, and distribute and save the results.  At this stage the data-mining services are placeholders with no specified implementation.

The placeholders for the data-mining services include:

  • the input ports and their types in some ontology
  • the output ports and their types in some ontology
  • a description in some ontology of what the service will do

This ontological information may be left unspecified at this point and supplied in the next stage instead.
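As a rough illustration of what a placeholder carries, the sketch below models one as a plain Python data structure. It is not Taverna code: the class name `PlaceholderService`, its fields, and the ontology term `ex:DataTable` are all hypothetical, chosen only to mirror the three items listed above. A port type or description of `None` stands for ontological information that has not been specified yet.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PlaceholderService:
    """Hypothetical model of a data-mining placeholder (not a Taverna class)."""

    name: str
    # Port name -> ontology term for its type; None means "not yet specified".
    input_ports: Dict[str, Optional[str]] = field(default_factory=dict)
    output_ports: Dict[str, Optional[str]] = field(default_factory=dict)
    # Ontology term describing what the service is meant to do.
    description: Optional[str] = None
    # Reference to the DMWF that realizes the service; None while a placeholder.
    implementation: Optional[str] = None

    def is_placeholder(self) -> bool:
        return self.implementation is None

    def is_fully_specified(self) -> bool:
        """True once every port type and the description have been given."""
        types = list(self.input_ports.values()) + list(self.output_ports.values())
        return self.description is not None and all(t is not None for t in types)


# "ex:DataTable" is an invented ontology term, used purely for illustration.
clustering = PlaceholderService(
    name="cluster_samples",
    input_ports={"samples": "ex:DataTable"},
    output_ports={"clusters": None},
)
```

Here `clustering` is still a placeholder, and it is not fully specified because the type of its output port and its description are missing.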

Filling in of the placeholders

For each placeholder in turn, the IDA is called with the information specified for that placeholder to help the user build a DMWF that meets those requirements.  (Alternatively, the ontological information could be specified only when the IDA is called.)  Sufficient information is passed back to Taverna so that the placeholder can be filled in by the enactment of the DMWF.  The realization of each placeholder is specified as a DMWF, which could be run either as a Taverna nested workflow (via translation) or as a RapidMiner workflow on a RapidMiner instance (via translation).

Editing of the sub-workflows

Some changes may need to be made to the sub-workflows.  This can be done by translating them into nested Taverna workflows and editing those workflows.  Note that the result is a Taverna workflow.  Note also that some DMWFs cannot be translated into Taverna workflows (although they can be translated into RapidMiner workflows), so the user cannot edit those workflows in Taverna.

Configuration of the sub-workflows

The user can specify whether a sub-workflow will be enacted as a RapidMiner workflow or as a Taverna workflow (the Taverna option is unavailable if the sub-workflow cannot be translated).  There could be a user default preference, with the option to override it for a particular sub-workflow.
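The preference rule could be sketched as follows. The names `Engine` and `resolve_engine` are hypothetical, and falling back to RapidMiner when Taverna was requested but translation is impossible is an assumption; the scenario only says that the Taverna option does not apply in that case.

```python
from enum import Enum
from typing import Optional


class Engine(Enum):
    TAVERNA = "taverna"
    RAPIDMINER = "rapidminer"


def resolve_engine(
    default: Engine,
    override: Optional[Engine],
    translatable_to_taverna: bool,
) -> Engine:
    """Pick the engine for one sub-workflow.

    A per-sub-workflow override wins over the user's default preference.
    Assumption: if Taverna is chosen but the DMWF cannot be translated
    into a Taverna workflow, enactment falls back to RapidMiner.
    """
    choice = override if override is not None else default
    if choice is Engine.TAVERNA and not translatable_to_taverna:
        return Engine.RAPIDMINER
    return choice
```

An implementation might prefer to report an error rather than fall back silently; that choice is left open here.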

Enactment of the overall workflow

During the enactment of the overall workflow, the data-mining services are enacted either by calls to a RapidMiner instance that runs the sub-workflows or as Taverna nested workflows.

For enactment by RapidMiner, the Taverna engine only handles the passing of input data and the receiving of output data; RapidMiner, not Taverna, runs the sub-workflow.  Note that the enactment of a particular sub-workflow happens on a single RapidMiner instance, but different sub-workflows may be enacted on different RapidMiner instances.

For enactment within Taverna, the workflows are enacted as normal nested workflows.

Collection of provenance

Taverna can collect provenance information about the enactment of a workflow.  RapidMiner can also generate provenance information.  When a RapidMiner operator is called, or when a RapidMiner workflow is enacted as a sub-workflow, RapidMiner will return its provenance on an additional output port.  Taverna will take the RapidMiner provenance and collate it with its own provenance.
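A minimal sketch of the collation step, under stated assumptions: the port name `provenance`, the record layout, and the functions `split_outputs` and `collate` are all hypothetical. Because the form of the RapidMiner provenance is still an open issue, the sketch treats it as an opaque payload and makes no attempt to interpret it.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical name of the additional output port that carries provenance.
PROVENANCE_PORT = "provenance"


def split_outputs(outputs: Dict[str, Any]) -> Tuple[Dict[str, Any], Any]:
    """Separate ordinary output data from the provenance payload, if any."""
    data = {port: value for port, value in outputs.items() if port != PROVENANCE_PORT}
    return data, outputs.get(PROVENANCE_PORT)


def collate(
    taverna_records: List[Dict[str, Any]],
    service_name: str,
    external: Any,
) -> List[Dict[str, Any]]:
    """Return Taverna's records with the external provenance appended.

    The external payload is stored as-is, tagged with the service that
    produced it, so the two sources can be told apart later.
    """
    records = list(taverna_records)
    if external is not None:
        records.append(
            {"source": "RapidMiner", "service": service_name, "payload": external}
        )
    return records
```

Keeping the payload opaque means the sketch does not depend on how the questions under Issues are eventually answered.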

Issues

  1. What form does the RapidMiner provenance take?
  2. Where does the provenance data go?  How is it referenced?

Musings

  1. If the RapidMiner sub-workflows and the overall Taverna workflows are all put on myExperiment, how are they related?
  2. Are the sub-workflows registered somewhere as services (cf. the BioCatalogue) so that they can be reused?

UNIMAN work

  1. A service, called the DMWF service, that can be a placeholder for data mining or enact DMWFs either as RapidMiner workflows or as nested Taverna workflows
  2. Ability to specify the semantics of the ports and the intent of a DMWF service when it is a placeholder
  3. Ability to call the IDA to obtain the DMWF for a placeholder DMWF service
  4. Translator from DMWF to Taverna workflows, including detection of when translation is not possible
  5. A way of replacing a DMWF service with a nested Taverna workflow
  6. Preferences for DMWF services
  7. Ability to detect workflows that contain placeholders and refuse to enact them
  8. A service, called the RM service, that can enact a RapidMiner operation
  9. Generic capability to get extra provenance back from a service enactment (only DMWF and RM services will implement it initially)
  10. Generic capability to collate provenance information
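Item 7 in the list above could be approached along the following lines. This is only a sketch over an invented dictionary representation of a workflow; the keys `services`, `placeholder` and `nested`, and the names `find_placeholders`, `enact` and `UnfilledPlaceholderError`, are hypothetical and do not correspond to Taverna's own data model.

```python
from typing import Any, Dict, List


class UnfilledPlaceholderError(Exception):
    """Raised when a workflow still contains unfilled placeholders."""


def find_placeholders(workflow: Dict[str, Any], path: str = "") -> List[str]:
    """Return the paths of all placeholder services, searching nested workflows."""
    found: List[str] = []
    for service in workflow.get("services", []):
        name = f"{path}/{service['name']}"
        if service.get("placeholder", False):
            found.append(name)
        nested = service.get("nested")
        if nested is not None:
            found.extend(find_placeholders(nested, name))
    return found


def enact(workflow: Dict[str, Any]) -> str:
    """Refuse to enact a workflow that still contains placeholders."""
    missing = find_placeholders(workflow)
    if missing:
        raise UnfilledPlaceholderError(
            "unfilled placeholders: " + ", ".join(missing)
        )
    return "enacted"  # stand-in for the real enactment
```

The check recurses into nested workflows because a placeholder could sit inside a sub-workflow rather than at the top level.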

 

  1. 2010-02-10

    Alan Williams says:

    Should generalise to

    • getting back provenance
    • getting back errors
    • xml splitter hiding