r3 - 11 Mar 2003 - 10:50:00 - AlanRobinson

Minutes of Access Grid meeting, 21 Feb 2003

Attendees

Manchester: Carole Goble, Phil Lord

Newcastle: Peter Li, Neil Wipat, Savas Parastatidis, Paul Watson

Nottingham: Milena Radenkovic, Kevin Glover

Southampton: Simon Miles, Juri Papay, Victor Tan

IT Innovation: Justin Ferris

EBI: Alan Robinson

Agenda

  1. Walk through the GravesDiseaseScenario produced by Newcastle.
  2. Review Savas' version of the service interaction matrix: do the identified interactions support the scenarios from (1)? [We didn’t have time to start point 2.]

ACTIONS:

  • Review Savas' version of the service interaction matrix: This is to be done offline by all sites creating their own copy of the matrix.
  • Savas volunteered to produce a security policy document for the scenario with Luc Moreau

Next Meeting:

We will discuss the new information model at the regular AG meeting on 28 Feb 2003.

GravesDiseaseScenario

The scenario turns out to be a good basis for the June demonstrator Lab Book [btw: I still haven’t heard of a new name for the lab book. I’m thinking of calling it Estudio, which is Spanish for study].

The workflow is split into 3 parts.

Annotation pipeline

The services identified by the scenario that we will need to access via the workflow enactment engine are:

  • Transfac: This is technically OK, as we can access it through EMBOSS, but AlanRobinson raised some issues with licences.

  • MEDLINE:
    • AJR (11/3/03): MEDLINE is available through both OpenBQS & SRS.

  • OMIM: Alan proposed that OMIM would be a good candidate (better than MEDLINE) for the text processing in WP7, as it is more constrained.
    • AJR (11/3/03): My suggestion is based on OMIM being much smaller than MEDLINE (15,000 entries rather than 12,000,000); I also suspect it has a higher signal-to-noise ratio.

  • BLAST: this would show off the invocation of a web service with a heap of parameters (see GenericOperationInvocation), and open out the possibility of a computationally intensive Grid activity.
    • AJR (11/3/03): BLAST is currently submitted from Soaplab onto the EBI Linux farm using LSF. At one time we looked at using globus_run to submit jobs.

These services need to be:

  • wrapped as web services
    • AJR (11/3/03): Any command-line application can be made available as a web service using Soaplab. I still have issues as to what it means to wrap a database as a web service, e.g. do you just want a method, 'executeSQL(in string SQL)'? Are OGSA/OGSA-DAI in a form where we should be examining them?
  • described by the ontology
  • described by the UDDI-M/view registry
  • at least one should have multiple views
  • registered in a UDDI registry
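
AJR's question about what it means to wrap a database can be made concrete by contrasting the two interface styles. The sketch below is illustrative only: an in-memory SQLite table stands in for a real database, and the class names, table layout and sample row are all hypothetical rather than anything agreed at the meeting.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table; the schema and the row are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (accession TEXT PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO entry VALUES ('X00001', 'example entry')")
    conn.commit()
    return conn


class GenericSqlService:
    """Style 1: a single pass-through operation, as in executeSQL(in string SQL)."""

    def __init__(self, conn):
        self._conn = conn

    def execute_sql(self, sql):
        # The caller must know the schema, and can run any statement at all.
        return self._conn.execute(sql).fetchall()


class EntryLookupService:
    """Style 2: a narrow, typed operation that hides the schema."""

    def __init__(self, conn):
        self._conn = conn

    def get_title(self, accession):
        # Parameter binding keeps the input from being interpreted as SQL.
        row = self._conn.execute(
            "SELECT title FROM entry WHERE accession = ?", (accession,)
        ).fetchone()
        return row[0] if row else None


conn = make_demo_db()
generic = GenericSqlService(conn)
typed = EntryLookupService(conn)
print(generic.execute_sql("SELECT accession, title FROM entry"))
print(typed.get_title("X00001"))
```

The first style is trivial to provide but pushes schema knowledge onto every client; the second is easier to describe in a registry or ontology, at the cost of writing one operation per query.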

For each probe id we need to get the corresponding EMBL accession number, Ensembl id, MEDLINE id, dbSNP id and OMIM id. This implies an implementation of LSID and an LSID resolver by Nottingham.

  • create personalised view of database entries for candidate gene: this is quite DAS/SRSish. This view will again fall upon Nottingham to present in the Lab Book.
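
As a rough illustration of what an LSID resolver has to handle, the sketch below parses identifiers of the form urn:lsid:authority:namespace:object[:revision] and looks up the cross-references for a probe id. The authority "example.org", the probe id and all accession values are placeholders, and the in-memory dictionary is an assumption standing in for whatever lookup Nottingham would actually implement.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError("not an LSID: %r" % text)
    if any(p == "" for p in parts[2:]):
        raise ValueError("empty LSID component in %r" % text)
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


# Hypothetical cross-reference table: probe id -> identifiers in other databases.
# A real resolver would look these up rather than hold them in a dict.
CROSS_REFERENCES = {
    "probe-0001": [
        "urn:lsid:example.org:embl:X00001",
        "urn:lsid:example.org:omim:000000",
    ],
}


def resolve_probe(probe_id):
    """Return the parsed LSIDs recorded for a probe id (empty list if unknown)."""
    return [parse_lsid(text) for text in CROSS_REFERENCES.get(probe_id, [])]


for lsid in resolve_probe("probe-0001"):
    print(lsid.namespace, lsid.object_id)
```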

Genotype assay design system

Design primers: we have three choices here:

1. a highly interactive process with the biologist, perhaps using Talisman or some other application. The opportunities that this affords are:

  • user interaction with a workflow (halting and resuming a workflow)
  • a workflow notifying the user proxy
  • launching a third party tool from the workflow in the lab book, and notifying the workflow when the tool is exited
  • collecting provenance data as free text notes

2. we use an autogenerating primer and just run through the workflow, perhaps picking up user preferences.

3. the scenario is thought of as 2 separate workflows with an application in the middle. The lab book would host the launching of the primer application. On its close, a set of possible workflows that could follow could be suggested.

The opinion was that, for June and then December, we should do them in the order 2, 3, 1.
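
The difference between the interactive and automatic choices comes down to what happens when the workflow halts for a decision. The sketch below is one possible way to model halting and resuming, using a Python generator; it is not how the enactment engine works, and the primer names and function names are hypothetical.

```python
def primer_workflow(candidates):
    """A workflow written as a generator: each yield is a point where it halts."""
    # Step 1: automatic part; here we simply pass the candidates on.
    shortlist = list(candidates)
    # Step 2: halt and hand the shortlist to the user (or a third-party tool).
    chosen = yield ("choose_primer", shortlist)
    if chosen not in shortlist:
        raise ValueError("unknown primer: %r" % (chosen,))
    # Step 3: resume with the user's choice and finish.
    yield ("done", chosen)


def run_with_user(workflow, choose):
    """Drive the workflow, calling choose() whenever it halts for input."""
    event, payload = next(workflow)
    while event != "done":
        # The enactor is idle here; the lab book could launch a tool and wait.
        event, payload = workflow.send(choose(payload))
    return payload


# Hypothetical primer names; the lambda stands in for the biologist's choice.
result = run_with_user(
    primer_workflow(["primer_A", "primer_B"]), lambda options: options[-1]
)
print(result)
```

Passing a function that applies stored user preferences as `choose` gives the non-interactive run of choice 2; passing one that waits for the biologist gives choice 1.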

Determine restriction enzyme for the above SNP.

This is a simple workflow that can be enacted by the workflow enactment engine.

3D Protein structure and SNP visualisation

Get PDB id for candidate gene.

  • AJR (11/3/03): Unless the gene has been crystallised, this is a non-trivial step. Even if it has been crystallised, it's still far from trivial since, for very good practical reasons, the translated DNA sequence & that used for crystallography may not be identical.

  • this means making MSD a service for myGrid
    • AJR (11/3/03): We've talked to Kim Henrick at the EBI about web services. They've tried them, but are sceptical about the performance.
  • MSD is a relational database in Oracle. So it is a candidate for turning into an OGSA-DAI service
    • AJR (11/3/03): An expert on OGSA-DAI needs to talk to Kim Henrick at the EBI.
  • When MSD changes then we could notify the user proxy; this would be an example of DatabaseUpdateNotification that would be convincing. The MSD people are keen on this.
    • AJR (11/3/03): An expert on myGrid notification needs to talk to Kim Henrick at the EBI.
  • we need to match up notification topics with MSD, and set these up through the lab book.
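
Matching notification topics to MSD changes amounts to a topic-based publish/subscribe arrangement. The sketch below shows the shape of that interaction in a single process; the topic names and message contents are invented for illustration, and the real myGrid notification service and MSD interfaces are not represented here.

```python
from collections import defaultdict


class NotificationService:
    """Minimal in-process publish/subscribe keyed by topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to every subscriber of topic; return how many were notified."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


# The user proxy is modelled as a plain list that collects notifications.
inbox = []
service = NotificationService()
service.subscribe("msd/entry-updated", lambda topic, msg: inbox.append((topic, msg)))

# In the scenario this call would be driven by a change in the database,
# for example by a trigger; here it is invoked by hand.
delivered = service.publish("msd/entry-updated", {"entry": "hypothetical-id"})
ignored = service.publish("msd/entry-deleted", {"entry": "hypothetical-id"})
print(delivered, ignored, inbox)
```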

Obtain information about protein and extract information about active site

  • InterPro, Swiss-Prot, Pesto: this is a conventional workflow.
    • AJR (11/3/03): Not sure what you mean by this. We'll have already analysed InterPro & SWISS-PROT as part of the DNA & protein sequence analysis. I guess this is Sheffield's call.

Display 3D protein structure to user and highlight location of amino acid change caused by SNP using RasMol viewer

  • AJR (11/3/03): "Here be dragons!!". You have to make sure that the DNA sequence & (fragment of) crystallised protein structure line up. I've been down this road with the p53 gene: http://industry.ebi.ac.uk/~alan/MutationViewer/
  • Phil raised the issue about associating MIME types with a service, which will have to be included in the service description.
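
To make the line-up problem concrete: a crystallised fragment usually covers only part of the translated sequence, so residue numbers have to be shifted before a SNP position can be highlighted. The sketch below handles only the easy case where the fragment is an exact substring of the translated sequence; the sequences are made up, and the case AJR warns about, where the two differ, needs a proper alignment instead of this shortcut.

```python
def map_to_structure(translated, fragment, position):
    """Map a 1-based residue position in the translated sequence onto the fragment.

    Returns the 1-based position within the fragment, or None when the
    residue lies outside the crystallised region. Raises ValueError when the
    fragment is not an exact substring, which is the case that needs a real
    alignment rather than this shortcut.
    """
    if not 1 <= position <= len(translated):
        raise ValueError("position outside the translated sequence")
    offset = translated.find(fragment)
    if offset < 0:
        raise ValueError("fragment does not match the translated sequence exactly")
    index = position - 1 - offset
    if 0 <= index < len(fragment):
        return index + 1
    return None


# Made-up sequences, for illustration only.
translated = "MKTAYIAKQRQISFVK"
fragment = "AYIAKQRQ"
print(map_to_structure(translated, fragment, 6))
print(map_to_structure(translated, fragment, 2))
```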

Get MEDLINE ids for PDB id, extract protein structure data.

  • an opportunity for PESTO to get involved

The roles of each MyGrid service in the GravesDiseaseScenario

The role of MetaData and ServiceOntologies in the scenario

  • discovering services when building the workflow
  • discovering the workflow so that it is reused from the MIR
  • editing the workflow with an alternative service that is semantically proposed
  • support in constructing the workflow
  • using the concepts in provenance record, data, and workflow to link together the various components for browsing and searching purposes. This might mean using an annotation tool.

Ideally we would like to do a semantic service substitution during a workflow enactment, but this seems unlikely. We should certainly do the substitution of an alternative service instance.
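
Substituting an alternative instance of the same service is the simpler of the two cases, since the interface does not change. A minimal sketch, assuming each instance can be represented as a callable and that failure shows up as an exception; the instance names and behaviours are invented:

```python
def invoke_with_substitution(instances, request):
    """Try each instance of a service in turn; return (name, result) from the first that works."""
    errors = []
    for name, call in instances:
        try:
            return name, call(request)
        except Exception as exc:  # a real enactor would catch narrower errors
            errors.append("%s: %s" % (name, exc))
    raise RuntimeError("all instances failed: " + "; ".join(errors))


def primary_instance(request):
    # Stand-in for a service instance that is currently unavailable.
    raise ConnectionError("primary unavailable")


def mirror_instance(request):
    # Stand-in for an alternative instance offering the same operation.
    return request.upper()


used, result = invoke_with_substitution(
    [("primary", primary_instance), ("mirror", mirror_instance)], "acgt"
)
print(used, result)
```

Semantic substitution would differ in that the candidate list comes from the ontology rather than from a fixed set of mirrors, and inputs and outputs might need converting between services.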

The role of ProvenanceData

  • the workflow provenance (and there have to be many runs of the flow!) is put into the MyGridInformationRepository
  • we also need some free annotation provenance.

  • mining/querying provenance. Through canned queries on mIR? through the ontology?
  • AJR (11/3/03): There is provenance data of sorts in the EMBOSS outputs that we can capture.

The role of NotificationService

to be completed

  • AJR (11/3/03): Updates generated from a database, e.g. generated by triggers in MSD's Oracle database.

The role of the MyGridInformationRepository

to be completed

The role of the ServiceDirectory/views

to be completed

The role of the Gateway / EScienceLayer

to be completed

The role of the LabBook

to be completed

The role of the WorkFlow Enactment

to be completed

The role of the WorkFlow Design

to be completed

The role of the DistributedQueryProcessor

  • the MSD and the MyGridInformationRepository are both relational databases. So there is the possibility of some sort of distributed query demo.
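
One simple form such a demo could take is a mediator that queries each database separately and joins the results itself. The sketch below uses two independent in-memory SQLite databases as stand-ins for MSD and the mIR; the table names, columns and values are hypothetical, and a real distributed query processor would also plan and optimise where each part of the query runs.

```python
import sqlite3


def make_db(schema, rows, insert):
    """Create an independent in-memory database and load it with rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert, rows)
    conn.commit()
    return conn


# Two separate databases standing in for MSD and the mIR.
# Table names, columns and values are all hypothetical.
msd = make_db(
    "CREATE TABLE structure (pdb_id TEXT, gene TEXT)",
    [("0abc", "geneA"), ("0xyz", "geneB")],
    "INSERT INTO structure VALUES (?, ?)",
)
mir = make_db(
    "CREATE TABLE experiment (run_id INTEGER, gene TEXT)",
    [(1, "geneA"), (2, "geneC")],
    "INSERT INTO experiment VALUES (?, ?)",
)


def structures_for_runs(msd_conn, mir_conn):
    """Join the two sources on gene: query each separately, then hash-join the results."""
    by_gene = {}
    for pdb_id, gene in msd_conn.execute("SELECT pdb_id, gene FROM structure"):
        by_gene.setdefault(gene, []).append(pdb_id)
    joined = []
    for run_id, gene in mir_conn.execute(
        "SELECT run_id, gene FROM experiment ORDER BY run_id"
    ):
        for pdb_id in by_gene.get(gene, []):
            joined.append((run_id, gene, pdb_id))
    return joined


print(structures_for_runs(msd, mir))
```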

The role of the BioServices

  • what is the role of Soaplab?
    • AJR (11/3/03): Soaplab provides a consistent interface to access any command-line driven application.
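
The idea of one consistent interface over any command-line program can be sketched as follows. This is not Soaplab's API: the class and method names are invented, and the Python interpreter is used as the wrapped program only so that the example runs anywhere.

```python
import subprocess
import sys


class CommandLineService:
    """Expose a command-line program through one uniform run() operation."""

    def __init__(self, argv_prefix):
        self._argv_prefix = list(argv_prefix)

    def run(self, arguments, stdin_text=""):
        """Run the program and return its exit code, standard output and standard error."""
        completed = subprocess.run(
            self._argv_prefix + list(arguments),
            input=stdin_text,
            capture_output=True,
            text=True,
            check=False,
        )
        return {
            "exit_code": completed.returncode,
            "stdout": completed.stdout,
            "stderr": completed.stderr,
        }


# The wrapped "application" is the Python interpreter itself, chosen only
# because it is certain to be installed; it reverses the text on stdin.
reverser = CommandLineService(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]
)
report = reverser.run([], stdin_text="ACGT")
print(report["exit_code"], report["stdout"])
```

Every wrapped tool then looks the same to the workflow enactment engine: arguments and input go in, and an exit code plus output streams come back.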

The role of TextExtraction

  • there are possibilities with Medline and OMIM

Security

We discussed security and digital signatures. Our conclusion was that we should use the scenario as a framework for producing a myGrid policy document on security. Carole noted that Comb-e-Chem and Geodise had built authorisation and authentication mechanisms using Globus and web services for databases and ontology services. One play is that whoever is logged in can only see their own experimental components in the mIR. Is this possible?

ACTION: Savas volunteered to produce such a document with Luc Moreau.
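
As a starting point for the policy discussion, the "only see your own components" rule is easy to state as a filter over ownership. The sketch below assumes every component in the mIR records a single owner and that the user has already been authenticated by some other means; the user and component names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    owner: str


class Repository:
    """Toy stand-in for the mIR that filters every listing by owner."""

    def __init__(self, components):
        self._components = list(components)

    def visible_to(self, user):
        # The only rule: a user sees the components they own, and nothing else.
        return [c for c in self._components if c.owner == user]


# Hypothetical users and components.
repo = Repository(
    [
        Component("workflow-1", "alice"),
        Component("result-7", "bob"),
        Component("workflow-2", "alice"),
    ]
)
print([c.name for c in repo.visible_to("alice")])
```

Sharing between collaborators, group ownership and signed provenance are not covered by this rule and would need to be addressed in the policy document.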

- CaroleGoble - 21 Feb 2003
