r1 - 08 Apr 2003 - 13:49:08 - NickSharman

Use Cases derived from the Workbench (Lab Book) Story Board

We aim to capture in the MyGridInformationRepository, for in silico experiments, the kind of data and provenance information that a 'wet lab' scientist would record in a lab book. Where possible, we aim to support this capture automatically. In other cases, we aim to provide suitable mechanisms whereby the scientist can capture the data or provenance information themselves. In some of these cases, the myGrid system should prompt the user for provenance information.

AlansLabBookStoryBoard shows a possible sequence of interactions between a scientist and the myGrid system. Chris Greenhalgh has produced an annotated version, AlansLabBookStoryBoardWP6. Over two Access Grid sessions, AccessGrid28Feb2003 and AccessGrid14Mar2003, the team jointly walked through every step in the story board to match it with our current and planned implementation and to identify gaps and issues.

For IF-4 and demonstrations around the GravesDiseaseScenario, the scientist's principal means of interacting with myGrid will be via the myGrid WorkBench, a NetBeans-based GUI application.

This topic seeks to identify, from the above, some of the interactions between the e-scientist and myGrid via the WorkBench, and match them to new or existing use cases.

1. I log onto the myGrid system to give me access to my mIR & other services.

2. I check the status of any workflows that I started previously to see whether they have now finished, so that I can check the results. I may also have notifications detailing people or services that have annotated the contents of my mIR, e.g. my supervisor has signed off on the work I did last week with her digital signature. N.B. These annotations need not be stored in my personal mIR; however, I must have given the annotator (a person or a service) permission to look at my mIR. An annotator may also have created their own annotations on the contents of my mIR but decided to keep these private.

The user may either browse the MIR to locate Workflow instances, and then check their status & details:

  • ViewWorkContext?
  • ViewWorkflowInstanceStatus?
  • ViewWorkflowInstanceDetails?
  • ViewWorkflowProvenance?
  • ViewDataThing?

Or may choose to view all or recent (since last login/view) notifications of workflow completion and other events:

  • ViewNotifications?
  • ViewNotificationDetails?

After dealing with a notification, the user may wish to delete it:

  • DeleteNotification?
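
As a rough illustration of the notification use cases above (view recent notifications, then delete one once dealt with), here is a minimal Python sketch. The NotificationInbox class, its method names, and the sample events and dates are all hypothetical; nothing here reflects the actual myGrid notification service.

```python
from datetime import datetime


class NotificationInbox:
    """Hypothetical in-memory stand-in for a user's notifications."""

    def __init__(self):
        self._items = {}

    def add(self, note_id, event, when):
        self._items[note_id] = (when, event)

    def since(self, last_view):
        """Return (id, event) pairs newer than last_view, oldest first."""
        recent = [(when, note_id, event)
                  for note_id, (when, event) in self._items.items()
                  if when > last_view]
        return [(note_id, event) for when, note_id, event in sorted(recent)]

    def delete(self, note_id):
        # Deleting an unknown id is treated as a no-op in this sketch.
        self._items.pop(note_id, None)


inbox = NotificationInbox()
inbox.add("n1", "workflow instance completed", datetime(2003, 4, 1, 9, 0))
inbox.add("n2", "supervisor annotated last week's work", datetime(2003, 4, 7, 16, 30))

last_login = datetime(2003, 4, 5)
recent = inbox.since(last_login)   # only notifications after the last login
inbox.delete("n2")                 # the user has dealt with this one
```

Keeping "since last login/view" as a plain timestamp comparison means the same call serves both the "all" and "recent" views: passing a very early date returns everything.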

3. I browse through my projects, experiments, workflows.

  • ViewWorkContext?
  • ViewWorkflowInstanceDetails?
  • ViewWorkflowDefinition?

4. I create a new project (myProject1) with a title & a free-text description (provenance annotation). I assume that myGrid will record provenance metadata including date, time, user name, user group, default security permissions, the host from which I created the project. The provenance is associated with myProject1 & stored in the mIR.
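
To make the automatically recorded provenance in step 4 concrete, the following Python sketch shows one possible shape for it. The Provenance and Project classes, the create_project function, and the sample user, group, host, and title values are assumptions made for illustration; the actual mIR schema is not described in this topic.

```python
import socket
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record of the metadata step 4 expects to be captured."""
    user: str
    group: str
    host: str
    created: datetime
    permissions: str


@dataclass
class Project:
    name: str
    title: str
    description: str      # the free-text provenance annotation
    provenance: Provenance


def create_project(name, title, description, user, group,
                   host=None, permissions="owner-only"):
    """Create a project, capturing provenance without prompting the user."""
    prov = Provenance(
        user=user,
        group=group,
        host=host or socket.gethostname(),
        created=datetime.now(timezone.utc),
        permissions=permissions,
    )
    return Project(name, title, description, prov)


project = create_project("myProject1", "Example project title",
                         "Free-text description of the project",
                         user="alice", group="lab-a", host="workstation-1")
```

The provenance record is frozen so that, once captured, it cannot be edited through the project object; only the title and description remain under the user's control.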

5. I upload some data (myData1) that I've obtained to my mIR & that will be part of my new project (myProject1) (automatically trap basic provenance metadata of this operation & the data).

6. I attach some metadata to this uploaded data, e.g. a name, a description, a type (e.g. unordered collection of protein sequences), where did it come from, how was it produced. Perhaps also some other annotation, e.g. what do I think of its quality.

  • AnnotateThing?

7. I have an idea of what I want to do with the data as part of an in silico experiment, so I create a new experiment (myExpt1) that is associated with my project (myProject1) with a title, free text description, plus the automatically trapped basic provenance metadata.

8. Somehow I find a workflow template (WFTemplate1) that satisfies my requirements & describes the services that are to be used in this in silico experiment.

  • ViewWorkContext?
  • SearchForThingByConcept?
  • SearchForThingByKeyword?

9. I want to associate this workflow template (WFTemplate1) with my experiment (myExpt1), plus some annotation about why I've chosen this workflow template. I trust myGrid to have recorded in the mIR where WFTemplate1 came from, plus provenance metadata about time, date, permissions, etc.

  • AssociateSubjectToObject?

[At another time I may want to find both "what experiments did I use WFTemplate1 in?" & "In myExpt1, what workflow templates did I use?".]

  • ViewThingAnnotations?
  • ViewAnnotation?
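
The two questions in step 9 ("what experiments did I use WFTemplate1 in?" and "in myExpt1, what workflow templates did I use?") amount to querying the same association in both directions. A minimal Python sketch, assuming a hypothetical AssociationStore and a made-up predicate name "usesTemplate":

```python
from collections import defaultdict


class AssociationStore:
    """Toy subject-predicate-object store, indexed in both directions."""

    def __init__(self):
        self._by_subject = defaultdict(set)
        self._by_object = defaultdict(set)

    def associate(self, subject, predicate, obj):
        self._by_subject[(subject, predicate)].add(obj)
        self._by_object[(obj, predicate)].add(subject)

    def objects_of(self, subject, predicate):
        return sorted(self._by_subject.get((subject, predicate), set()))

    def subjects_of(self, obj, predicate):
        return sorted(self._by_object.get((obj, predicate), set()))


store = AssociationStore()
store.associate("myExpt1", "usesTemplate", "WFTemplate1")

# "In myExpt1, what workflow templates did I use?"
templates = store.objects_of("myExpt1", "usesTemplate")
# "What experiments did I use WFTemplate1 in?"
experiments = store.subjects_of("WFTemplate1", "usesTemplate")
```

Maintaining both indexes at write time keeps either lookup cheap, at the cost of storing each association twice.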

10. I need a registry service that can return specific instances of the services that the workflow template requires. The registry will return metadata about all service instances fitting the criteria. As well as criteria that the workflow template may require, e.g. a particular version of an application, I may have my own personal preferences, e.g. it has to be free.

  • SearchForServiceByConcept?
  • SearchForServiceByKeyword?
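
One way to read step 10 is as a filter that combines the workflow template's criteria with the user's own preferences. The Python sketch below assumes a registry that returns metadata as simple key-value records; the field names, the version number, and the find_service_instances function are hypothetical, and the two service names are borrowed from steps 11 and 12.

```python
# Hypothetical registry contents; the field names and version are illustrative.
registry = [
    {"id": "emma@EBI", "application": "emma", "version": "2.0",
     "cost": "free"},
    {"id": "emma@HGMP", "application": "emma", "version": "2.0",
     "cost": "free-to-registered-users"},
]


def find_service_instances(registry, template_criteria, user_preferences):
    """Return instances whose metadata matches every criterion.

    User preferences are merged over the template's criteria, so a
    preference on the same key overrides the template (an assumption
    of this sketch).
    """
    wanted = {**template_criteria, **user_preferences}
    return [service for service in registry
            if all(service.get(key) == value for key, value in wanted.items())]


matches = find_service_instances(
    registry,
    template_criteria={"application": "emma", "version": "2.0"},
    user_preferences={"cost": "free"},
)
match_ids = [service["id"] for service in matches]
```

With no user preferences, both instances match, which is the situation step 11 describes: the user must then choose between them.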

11. I may need to choose between alternative service instances (e.g. emma@EBI vs. emma@HGMP) using the metadata returned by the service registry & to be used in the workflow.

  • SelectServiceInstance?

12. I want to record why I made a choice in favour of particular service instances for this workflow, e.g. emma@EBI is free to anyone, but emma@HGMP is free to registered users only.

[To what do I attach the annotation about my choices of service instance? Is a part of the workflow annotatable? Or only the whole thing? I.e. is a service instance recognised in the mIR as a separate entity & to which metadata may be attached? I have a vision of a large XML file with multiple name spaces: how conflated is data & metadata in an XML representation of the workflow?]

  • AnnotateSelectedServiceInstance?

13. I configure each activity of the workflow with my choice of parameters or default to those recommended in the workflow for the service instance, as well as my starting data. I want to annotate why I made those choices on the parameters.

14. I want to store the details of my configured workflow (WFConfigured1) for this experiment (myExpt1) in the mIR.

15. I decide it's time to run my workflow.

  • EnactWorkflowDefinition?

16. As each activity in the workflow is completed, I'd like to be notified & to have the intermediate results stored in the mIR & associated with WFConfigured1 so that I can find them easily. I expect that provenance metadata about the service instances which are run is stored along with the results: location, input parameters I specified, default parameters the service instance used, including resources that the service instance used, e.g. which version of SWISS-PROT did the BLAST server use. For each result stored in the mIR, as well as the usual provenance metadata (who, date, time), I expect that its metadata includes a syntactic & semantic type for the data (taken from where?).

The EE doesn't write intermediate results to the MIR: they are only present in the provenance record, which is not visible until the end of the workflow.

[Who writes the results & provenance metadata to my mIR? - The enactment engine? The service instance? The Gateway? Who do I trust with my credentials? I might trust my lab's workflow enactment engine to write to my mIR, but I probably wouldn't trust a service instance or a public enactment engine. Is it inefficient to have everything shuttled through the Gateway?

The EE accumulates provenance data and currently writes it directly to the MIR. For IF-4, it will make it available on completion so that the GW retrieves it and stores it in the MIR.

I would expect that the results generated from a service instance & stored to the mIR should be immutable to protect against fraud - maybe use PKI to monitor if the stored results are the same as those sent originally?]

Not for IF-4.
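
Although this is out of scope for IF-4, the comparison at the heart of the tamper check can be sketched briefly. The Python example below only compares a stored result against a digest recorded when the result was first sent; it is not a PKI. A real scheme would additionally have the service instance sign the digest so that its origin could be verified. The function names and sample data are hypothetical.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """SHA-256 digest of a result, recorded when the service sends it."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(stored: bytes, recorded_digest: str) -> bool:
    """True if the stored result still matches the recorded digest."""
    return hmac.compare_digest(digest(stored), recorded_digest)


original = b"example result sent by the service instance"
recorded = digest(original)

unchanged = is_unaltered(original, recorded)
tampered = is_unaltered(original + b" (edited)", recorded)
```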

17. For each activity, I want to look at the results & record thoughts about them (i.e. annotate them). If I don't like the intermediate results & the workflow is still running, I may want to terminate it.

The use case here is that a relatively low-cost step produces input for a following, expensive action. The later action will be useless if the earlier step produces junk, so the user prefers not to waste time or money running it. However, the user does not want to 'nurse' the workflow until the earlier step completes; they would rather be informed by a notification that it has completed and then kill the later action if necessary.

Since the intermediate results are not available until the workflow completes, it is currently necessary for a workflow to include an explicit store action to support this. Also, stopping the workflow will not automatically kill a service invocation that is already running. We may be able to exploit SOAPLAB's Abort action where appropriate, but there is no general Web Service abort operation.
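
The cheap-step-then-expensive-step pattern described above can be sketched as a runner that stores each intermediate result, notifies a callback, and stops if the callback rejects the result. This is a synchronous Python toy with hypothetical names (run_workflow, on_step_complete, the step functions); it does not model the enactment engine, and it does not address aborting an invocation that has already started.

```python
def run_workflow(steps, on_step_complete):
    """Run steps in order; stop if the callback rejects an intermediate result."""
    results = {}
    data = None
    for name, action in steps:
        data = action(data)
        results[name] = data                   # explicit store of the result
        if not on_step_complete(name, data):   # notification to the user
            break                              # later steps are never started
    return results


expensive_calls = []


def cheap_filter(_):
    return []            # pretend the cheap step produced junk (nothing)


def expensive_analysis(data):
    expensive_calls.append(data)
    return "analysis of %d items" % len(data)


def review(name, data):
    """Stand-in for the user: reject an empty result from the cheap step."""
    return not (name == "cheap_filter" and len(data) == 0)


results = run_workflow(
    [("cheap_filter", cheap_filter), ("expensive_analysis", expensive_analysis)],
    review,
)
```

Here the check happens between steps, which matches the limitation noted above: a step that is already running is left to finish.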

18. I log out of myGrid.

19. Later I return & look at my results…

As step (1).

20. After logging in, I am notified that my in silico experiment has finished & the results have been stored, including the final result (as R1), along with metadata & provenance, in myExpt1 of myProject1.

As step (2).

21. I find & select that I want to look at the final result (R1) & a viewer displays it for me.

  • ViewWorkflowInstanceDetails?
  • ViewDataThing?

22. I decide to change some of the parameters for a service instance in the workflow (WFTemplate1) from their recommended values to produce a new workflow (WFConfigured1.1) & see how this changes the final result (R1.1). I'll need to record why I've done this & have all the parameter values captured & stored in the mIR.

Otherwise, and for running the new workflow, as steps (8)-(17).

23. I decide that the original result using the recommended parameters is the best & I add two annotations to the final result (R1): one is my conclusions (myNote1), the other is ideas for the next experiment (myNote2).

  • AnnotateThing?

24. I decide the results are so good that I'm going to share them with my supervisor & the colleague from whom I got the original data, so I send them the location of my result (R1) so that they can view it.

  • ChangeThingPermissions?

25. Since she's my boss, my supervisor has permissions to see everything I've done. From the result (R1), she can follow the trail back to see how it was generated, using which services, from which workflow template (WFTemplate1), with which parameters (WFConfigured1) & using which starting data (myData1). At each stage, she may also see the annotation that I may have attached to the objects describing why I made particular choices, e.g. which workflow template, which service instances & which parameters. She may make some comments of her own, and/or sign off on the work. Before signing off on the work, she may want to check that I haven't altered R1 to falsify the results. She could do this either by re-running the workflow (WFConfigured1) herself with my data (myData1) or, if we had a PKI, by checking with the service instance that the result hasn't been tampered with.

As in step (24), we will not support access dependent on role & group. The Workbench should support browsing annotations and associated objects where they are visible (see also below).

26. I'm a little more paranoid about my colleague. Although I'd like her to see my final results (R1), conclusions (myNote1) and how I got there (WFTemplate1 & WFConfigured1), I don't want her to see the annotations about what I want to do next (myNote2), or possibly comments I made about my perceptions of the quality of her data attached to myData1.

As in step (24), we will not support access control of this complexity.
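
For reference, the per-annotation visibility that step 26 asks for (and that will not be supported) could look something like the following Python sketch. The record layout, the "readers" field, the annotation id "qualityNote", and the visible_annotations function are all hypothetical.

```python
# Hypothetical annotation records; "qualityNote" stands for the comment on
# the quality of the colleague's data attached to myData1.
annotations = [
    {"id": "myNote1", "target": "R1", "readers": {"supervisor", "colleague"}},
    {"id": "myNote2", "target": "R1", "readers": {"supervisor"}},
    {"id": "qualityNote", "target": "myData1", "readers": {"supervisor"}},
]


def visible_annotations(annotations, viewer, owner="me"):
    """Ids of annotations the viewer may see; the owner sees everything."""
    return [a["id"] for a in annotations
            if viewer == owner or viewer in a["readers"]]


for_colleague = visible_annotations(annotations, "colleague")
for_supervisor = visible_annotations(annotations, "supervisor")
```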

27. Following comments from my supervisor & colleague, I decide to retrieve my original conclusions (myNote1) & re-write parts of it (myNote1.1).

  • EditAnnotation?
  • DeleteThing?

28. I run two further in silico experiments in this project: myExpt2 & myExpt3. myExpt2 is a different workflow template (WFTemplate2) which has the same function as WFTemplate1, but uses a different methodology - I decide the result (R2) of this experiment isn't as good as the first one, which I document & have stored in the mIR with R2. The other in silico experiment, myExpt3, takes the final result (R1) of the workflow (WFConfigured1) in the first in silico experiment (myExpt1), plus some new data (myData2) & runs it through a new workflow (WFTemplate3 & WFConfigured3) to produce a final result (R3).

This should all be OK. We noted that, in current thinking, these different workflow instances would probably be part of the same experiment, as they are presumably testing the same hypothesis.

29. Some time later, I am in the process of writing up the paper about these experiments. I find my final result (R3). I need to know how & why that was generated all the way back to myData1, i.e. trace back through two workflows.

  • ViewThingAnnotations?
  • ViewAnnotation?
  • ViewWorkflowInstanceDetails?
  • ViewWorkContext?
  • ViewDataThing?
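
The trace in step 29, from R3 back through two workflows to myData1, is a walk over derivation records. A minimal Python sketch, assuming each result records the configured workflow that produced it and the inputs that workflow consumed (the dictionary layout and the trace_back function are hypothetical; the object names come from step 28):

```python
# Derivation records as described in step 28: R3 was produced by
# WFConfigured3 from R1 and myData2; R1 by WFConfigured1 from myData1.
derivations = {
    "R3": {"workflow": "WFConfigured3", "inputs": ["R1", "myData2"]},
    "R1": {"workflow": "WFConfigured1", "inputs": ["myData1"]},
}


def trace_back(thing, derivations):
    """Return (output, workflow, input) triples back to the uploaded data."""
    trail = []
    seen = set()
    stack = [thing]
    while stack:
        current = stack.pop()
        if current in seen or current not in derivations:
            continue          # uploaded data has no derivation record
        seen.add(current)
        record = derivations[current]
        for source in record["inputs"]:
            trail.append((current, record["workflow"], source))
            stack.append(source)
    return trail


trail = trace_back("R3", derivations)
```

Each triple in the trail is a point at which the Workbench could offer the attached annotations, so that the "why" is available alongside the "how".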

30. My supervisor wants to identify what people in the group have been working on & where there's commonality.

[A text analysis engine is run over the contents of everyone's mIR (or at least parts of them such as the descriptions & annotations) to identify terms & concepts. Look for people, projects & experiments where the same concepts are being used.]

Unlikely for the next IF.
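
Even though this is unlikely for the next IF, the basic idea can be shown with a deliberately naive Python sketch: collect the terms each person uses in their descriptions and report the terms shared by two or more people. A real text analysis engine would map terms to concepts; this one only matches words. The stopword list, the people, and the description texts are invented for illustration.

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "of", "and", "in", "for", "to", "with"}


def shared_terms(descriptions):
    """Map each term used by two or more people to the sorted list of them."""
    users_by_term = defaultdict(set)
    for person, text in descriptions.items():
        for term in set(re.findall(r"[a-z][a-z0-9-]+", text.lower())):
            if term not in STOPWORDS:
                users_by_term[term].add(person)
    return {term: sorted(people)
            for term, people in users_by_term.items() if len(people) > 1}


# Invented description text standing in for the contents of each mIR.
descriptions = {
    "alice": "BLAST search of protein sequences for Graves disease candidates",
    "bob": "Alignment of protein sequences with emma",
    "carol": "Microarray analysis for Graves disease",
}

common = shared_terms(descriptions)
```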

-- NickSharman - 8 Apr 2003
