r39 - 04 Apr 2005 - 15:15:00 - ArijitMukherjee

See also InformationModel, MirNotifications, MirQueries, MirLsidAuthority and (New Sep 04) MirBrowser

Overview

The primary objective of the myGrid information repository (MIR) is to store users' data. Other requirements on the MIR are described in the WP3 Requirements document.

The June 2003 MIR (MIR3) is undergoing a major revision. See InformationModel for details.

-- PeterLi - 04 Feb 2003

Outstanding issues

MirFourImplementationIssues details outstanding issues in using the July04 implementation of the MIR.

-- ChrisWroe - 09 Aug 2004

MIR3 Implementation

See also Mir3Reflections.

I have revised the initial implementation of the MIR (still under branch 'mir3' in the CVS mygrid/PersRepository). The corresponding information model and schema diagrams from Together-J are: info model, schema and Java API. This includes all of the changes previously proposed and documented below:

  • added 'addUnknownSubjectAs:String' and changed 'addUnknownAsConcept:boolean' to 'addUnknownObjectAs:String' in addAssociation() and addStringAnnotation() and addXMLAnnotation().
  • replaced 'includeDeleted:boolean', 'authoritative:boolean' and 'nonAuthoritative:boolean' with a hashtable keyed on property names with standard values in getMetadataExternalURIsBySimpleMatch() and getAssociatedExternalURIsBySimpleMatch().
  • replaced 'includeDeleted:boolean' with 'properties:Hashtable' in getExternalURIsByUserAndLocalType() and getExternalURIsByLocalType().
  • added addProxyThing().
  • added addCollection(). Note however that metadata over a Collection's members is not yet supported by getMetadataExternalURIsBySimpleMatch or getAssociatedExternalURIsBySimpleMatch.
  • added getCollectionStandardMetadata() and associated 'membersExternalURI' property to act as metadata subject or object on behalf of its members (rather than as the collection as a thing in itself).
  • changed 'valueXML:String' to 'value:Object' (should be type String or byte[] only) in addDataThing() and addOutputDataThing().
  • similarly, changed return type of getDataThingValue() to 'Object' from 'String'.
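As an illustration of two of the changes listed above (a 'properties' hashtable in place of boolean flags, and 'value:Object' restricted to String or byte[]), here is a minimal in-memory sketch. It is not the real IMIR3 interface: the class name, the simplified method signatures, the "includeDeleted" property key and its "true" value are all assumptions made for the example; the actual standard property names and values are those defined in the attached Java API.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

// Illustrative in-memory stand-in; NOT the real IMIR3 interface.
class MirPropertiesSketch {
    // Assumed property key; the actual standard names are defined by the MIR3 API.
    static final String INCLUDE_DELETED = "includeDeleted";

    private static final class Thing {
        final String externalURI;
        final String localType;
        final Object value;   // String or byte[] only
        boolean deleted;

        Thing(String externalURI, String localType, Object value) {
            this.externalURI = externalURI;
            this.localType = localType;
            this.value = value;
        }
    }

    private final List<Thing> things = new ArrayList<>();

    // Simplified signature (assumption): accepts text or binary values only.
    void addDataThing(String externalURI, Object value) {
        if (!(value instanceof String) && !(value instanceof byte[])) {
            throw new IllegalArgumentException("value must be a String or byte[]");
        }
        things.add(new Thing(externalURI, "DataThing", value));
    }

    // Returns Object, so callers must check for String or byte[].
    Object getDataThingValue(String externalURI) {
        for (Thing t : things) {
            if (t.externalURI.equals(externalURI)) return t.value;
        }
        return null;
    }

    // Things are marked as deleted and retained, not removed.
    void markDeleted(String externalURI) {
        for (Thing t : things) {
            if (t.externalURI.equals(externalURI)) t.deleted = true;
        }
    }

    // Options arrive in a hashtable, so new qualifiers need no signature change.
    List<String> getExternalURIsByLocalType(String localType,
                                            Hashtable<String, String> properties) {
        boolean includeDeleted =
            properties != null && "true".equals(properties.get(INCLUDE_DELETED));
        List<String> uris = new ArrayList<>();
        for (Thing t : things) {
            if (t.localType.equals(localType) && (includeDeleted || !t.deleted)) {
                uris.add(t.externalURI);
            }
        }
        return uris;
    }
}
```

With an empty hashtable the query skips deleted things; putting the assumed "includeDeleted" key with value "true" into the hashtable returns them as well.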

-- ChrisGreenhalgh - 24 Apr 2003

I have done an initial implementation of the MIR (currently under branch 'mir3' in the CVS mygrid/PersRepository). The corresponding information model and schema diagrams from Together-J are: info model, schema and Java API.

-- ChrisGreenhalgh - 08 Apr 2003

Here are some refinements/extensions currently under consideration:

  • Add ProxyThing localType and API operation to create - used to support annotations and associations involving things outside the MIR such as web services.
  • Extend addAssociation/addAnnotation accordingly to allow automatic addition of ProxyThing for unknown URIs (e.g. change addUnknownAsConcept:boolean to addUnknownAsLocalType:String, with null as false, and include this option also for the subject as well as the object).
  • Support binary as well as text-based data for DataThing and XMLAnnotation - allow easier inclusion of images, etc. in MIR. This probably means adding a binary flag or encoding attribute to the DataThing table, and changing the API to addDataThing, addXMLAnnotation and getDataThingValue e.g. to use Object (which will actually be String or byte[]) rather than String, and adding 'binary' (or encoding/whatever) as a StandardMetadata property for DataThing.
  • Drop suggestion in API docs that all DataThing values are XML-wrapped - most of the scenario (e.g. GD) stuff at the moment is basically ASCII or images! Removing this lets clients work more intuitively in terms of (raw) values and their MIME types.
  • Allow simple matches for deleted items only (currently can ask for non-deleted, or all including deleted).
  • Consider generalisation of arguments to extensible set of flags or qualifiers to subsume deleted/notdeleted and authoritative/nonAuthoritative?
  • Add Collection localType and API operation to create - used to support Collections; not a ProxyThing because its ExternalURI will be a new internally generated MIR LSID. (?)
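The automatic addition of things for unknown URIs, proposed in the second refinement above, might behave roughly as follows. This is only a sketch under stated assumptions: the class, fields and exception are hypothetical, the parameter names follow the later 'addUnknownSubjectAs'/'addUnknownObjectAs' wording, and treating null as "do not add, reject instead" is taken from the "null as false" remark rather than from the real implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; NOT the real MIR API.
class MirAssociationSketch {
    // externalURI -> localType of every known Thing
    final Map<String, String> localTypeByURI = new LinkedHashMap<>();
    // each association is {subjectURI, kind, objectURI}
    final List<String[]> associations = new ArrayList<>();

    void addThing(String uri, String localType) {
        localTypeByURI.put(uri, localType);
    }

    // addUnknownSubjectAs / addUnknownObjectAs: localType to create for an
    // unknown URI (e.g. "ProxyThing"), or null to reject unknown URIs (assumed).
    void addAssociation(String subjectURI, String kind, String objectURI,
                        String addUnknownSubjectAs, String addUnknownObjectAs) {
        boolean subjectKnown = localTypeByURI.containsKey(subjectURI);
        boolean objectKnown = localTypeByURI.containsKey(objectURI);
        // Validate both ends first so a failure leaves the store unchanged.
        if (!subjectKnown && addUnknownSubjectAs == null) {
            throw new IllegalArgumentException("unknown subject URI: " + subjectURI);
        }
        if (!objectKnown && addUnknownObjectAs == null) {
            throw new IllegalArgumentException("unknown object URI: " + objectURI);
        }
        if (!subjectKnown) localTypeByURI.put(subjectURI, addUnknownSubjectAs);
        if (!objectKnown) localTypeByURI.put(objectURI, addUnknownObjectAs);
        associations.add(new String[] {subjectURI, kind, objectURI});
    }
}
```

A String parameter, rather than a boolean, lets the caller choose which localType the new thing gets, and the same option applies to both the subject and the object of the association.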

-- ChrisGreenhalgh - 10 Apr 2003

Information Model

Following the brainstorm of the information model at IF-3 ( inf-model.jpg ) I have had a go at UML-ifying and clarifying an initial attempt at the information model. This is a bit broader than the MIR schema, but there we go.

Known issues:

  • It is not clear where Subscription and Notification should go.

  • Implementation details are deliberately glossed here (e.g. local vs global references, object vs relational styles).

  • GridResource and Service are more or less just place-holders at the moment; should they be reflected here?

  • ServiceDescription is also a placeholder, and calls for a range of different kinds of description: for actual services available, for the actual service used, and for requirements, whether syntactic, semantic or QoS-related.

  • ServiceProvenanceLog is a place-holder for Luc's proposed 3rd party provenance recording.

  • No RDF mapping is yet defined.

InformationDescription is a structured description of the broader information model.

MIR schema for IF-4

Introduction and Goals

I (ChrisGreenhalgh) want to rapidly adopt an initial and provisional subset of this for a new MIR schema to support lab-book style functionality. Concurrently, the broader model can be expanded and refined. The IF-4 MIR should include:

  • Thing - common functionality across a range of Entities, including nominal support for LSIDs.

  • DataThing - which replaces DomainEntity, and becomes a new base class for workflow definitions, etc.

  • User - which expands existing User info. with extra fields to (a) relate to user agent and (b) potentially tie in to GSI.

  • ActionPerformed (and WFInstance), plus newly separated ActionProvenanceLog (and WFProvenanceLog), to record WF and direct action provenance, plus Input to preserve data-flow dependencies, which expand existing WFInstance.

  • ActionDefinition (and WFDefinition), which expand existing WFDefinition and extend DataThing.

  • WorkContext, and in particular Experiment, for organising the lab book. This effectively replaces DEGroup.

  • Report, and in particular LabBookReport, for the narrative/document presentation of the lab book report.

  • Annotation, for 3rd party and other subsequent annotations of Things. In the first instance, signing off a lab book report can be done with a simple Annotation, although in future this should be a SigningAnnotation.

-- ChrisGreenhalgh - 11 Feb 2003

Version 1 (superseded)

I have done an initial relational schema design based on a subset of the above information model. See the design doc and/or Together ControlCentre project.

-- ChrisGreenhalgh - 11 Feb 2003

Version 2 (superseded)

Hmm. Having trouble getting a stable relational schema, especially when I start to think about searching with concepts stuff, dealing with notifications, etc, etc. Perhaps it would be better to run for now with an abstract relational schema that directly supports a more RDF-style view of the world, so that stuff is more consistent at this level. Requires another level of schema of course, but then we have some of that in the Ontology and elsewhere... What do you think? abstract schema gif

-- ChrisGreenhalgh - 27 Feb 2003

Version 3 (current)

Here is my third attempt at a schema; it is semi-abstract, i.e. a number of relationships which were previously separate tables - and some new relationships - are all realised in terms of a general Association class (implemented in the relational schema by the MetaData table). Concept relationships are also included (equiv, super). See: Class diagram; Entity-Relationship diagram for Relational Schema.

This is making minimal changes from the PersRep2 schema (!). Notes:

  • I have not looked in detail at whether SQL queries will support the queries we need over this; I doubt it. The two options to improve queryability are (i) use the DB2 XML extensions to include XQueries into the documents (e.g. provenance logs) or (ii) have the repository and/or the gateway create additional explicit MetaData to reflect other information 'hidden' in these documents in a way that is directly queryable using SQL.

  • Thing is a new common base class across all referable resources.
    • deleted things can just be marked and retained using the deleted attribute.
    • externalURI is used e.g. for Concept URIs.
    • localType is the name of the nominal class in the information model (class diagram)
  • Permission provides a trivial expression of access control (placeholder, only); I assume that we will make WFDefinitions public (shared read) by default.
  • DataThing replaces DomainEntity.
    • DEGroupID is replaced by Associations of kind isPart, isUsedIn, wasCreatedIn, many of which can apply to each (Data)Thing.
    • Some column names are changed to reflect the change.
  • WFDefinition becomes a kind of DataThing (and requires no additional data).
  • WFInstance becomes a subclass of DataThing.
    • InstanceWFDefinition is dropped, replaced by the definedBy:WFDefinitionID, since every WFInstance is an instance of precisely one WFDefinition.
  • DEGroup is generalised to WorkContext, which in turn is fully implemented by Thing.
    • DEGroup.rDEGroupID is replaced by Associations of kind isPartOf.
  • MetaData is added, to express Annotations and Associations.
    • See above for some uses; also
    • EntityConcept is replaced by Associations of kind isA.
    • MetaData.associationDistance allows expanded transitive associations to be placed in the mIR explicitly (with non-zero distance) to avoid recursive queries.
  • User is retained but becomes a subclass of Thing.
    • User.name is old UserID (string).
    • User.X509DN is a placeholder for additional user-related information, in this case their Distinguished Name as used on the Public Key Certificate.
  • EntityWF is replaced by Associations of kind hasInput and hasOutput.
  • ConceptType becomes a kind of Thing.
    • Concept relationships are to be included, expressed using Associations of kind isSuperClass and isEquivClass.
  • A LabBookReport is just a kind of DataThing.
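To make the role of MetaData.associationDistance in the notes above more concrete, here is a small sketch of how transitive associations could be expanded ahead of time so that a later lookup needs no recursion. The class, its field names and the expansion loop are assumptions for illustration only; they show the idea, not how the repository actually populates the MetaData table.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-memory stand-in for rows of the MetaData table.
class AssociationDistanceSketch {
    static final class MetaData {
        final String subject;
        final String kind;
        final String object;
        final int associationDistance;   // 0 = directly asserted

        MetaData(String subject, String kind, String object, int associationDistance) {
            this.subject = subject;
            this.kind = kind;
            this.object = object;
            this.associationDistance = associationDistance;
        }
    }

    // Returns the direct rows of one kind plus derived transitive rows,
    // each derived row carrying a non-zero distance.
    static List<MetaData> expand(List<MetaData> rows, String kind) {
        List<MetaData> direct = new ArrayList<>();
        for (MetaData m : rows) {
            if (m.kind.equals(kind) && m.associationDistance == 0) direct.add(m);
        }
        List<MetaData> all = new ArrayList<>(direct);
        boolean changed = true;
        while (changed) {                 // repeat until no new row is derived
            changed = false;
            for (MetaData a : new ArrayList<>(all)) {
                for (MetaData d : direct) {
                    if (a.object.equals(d.subject) && !exists(all, a.subject, d.object)) {
                        all.add(new MetaData(a.subject, kind, d.object,
                                             a.associationDistance + 1));
                        changed = true;
                    }
                }
            }
        }
        return all;
    }

    private static boolean exists(List<MetaData> rows, String subject, String object) {
        for (MetaData m : rows) {
            if (m.subject.equals(subject) && m.object.equals(object)) return true;
        }
        return false;
    }

    // With expanded rows stored, "everything X is part of" is one flat scan.
    static List<String> objectsOf(List<MetaData> expanded, String subject) {
        List<String> result = new ArrayList<>();
        for (MetaData m : expanded) {
            if (m.subject.equals(subject)) result.add(m.object);
        }
        return result;
    }
}
```

The trade-off is extra rows and extra work on every insert in exchange for a lookup that plain SQL can express as a simple selection on subject and kind, optionally filtered by distance.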

I intend to prototype a new repository web service API for this real soon now, basing it strongly on the PersRep2 API wherever possible (although many methods are still likely to change).

-- ChrisGreenhalgh - 13 Mar 2003

Views over MIR

As mentioned in the Overview above, the primary objective of the MIR is to store users' data. However, there is also the issue of supporting users in browsing their own data, and the data of others for which they have the requisite permission.

RobertsMIRViews provides an initial target set of alternative views.

-- MarkGreenwood - 30 Apr 2003

ChrisWroemIRInstallationNotes as at 20th May 2003.

-- ChrisWroe - 20 May 2003

Some notes on architectures for content management in other projects.

-- ChrisWroe - 12 Sep 2003

mIR Generic Query Interface

The mIR Generic Query Interface would use the OGSA-DAI WS-I version. The version which is available on the OGSA-DAI web site (http://www.ogsadai.org.uk/downloads) is preconfigured for OMII-1. The bundle attached here is reconfigured to work on any platform independent of OMII. I am also attaching a text document on how to use it.

This has a bug in it - please do not use it. I am uploading a new version -- Arijit

The Twiki for some reason isn't allowing me to upload a new version of the zipped file. I am uploading the client code instead, please replace the client in the zipped file with this one.

-- ArijitMukherjee - 24 Jan 2005

mIR Performance and benchmarking

There are some performance issues around mIR which are highlighted with some benchmarking data in the attached document.

-- ArijitMukherjee - 24 Jan 2005

This is the first draft version of the MIR benchmarking document. The previous document is superseded by this one.

Probably the final version of the benchmarking document - mIR is being modified to de-normalize some of the tables and use a serialization/de-serialization patch submitted by Ian (Roberts). The changes are committed to the CVS branch MIRv0_3 and will be merged with the HEAD branch soon.

More on mIR benchmarking - this version of the document contains the results achieved by setting the plug-in property for storing the traces on a PER_WORKFLOW basis. For workflows producing large volumes of data, this requires a larger heap size in Tomcat; for smaller workflows, the default Tomcat setting works.

Topic attachments
| Attachment | Size | Date | Who | Comment |
| inf-model.jpg | 233.3 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | From IF-3 brainstorm of information model |
| WP3Requirements0_2.pdf | 700.4 K | 24 Jul 2006 - 09:40 | PeterLi | Requirements for the MIR |
| mygrid-info-model-cmg-20030211.pdf | 31.3 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | multi-page PDF of the UML class diagram |
| MirPilotPoster.pdf | 650.9 K | 24 Jul 2006 - 09:40 | PeterLi | Poster for the MIR at the Pilot Projects meeting |
| mygrid-info-model-cmg-20030211.gif | 62.8 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | large GIF of the UML class diagram |
| mygrid-info-model-cmg-20030211.zip | 19.0 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Together ControlCentre (UML) version of info model |
| mir3-schema-cmg-20030211.zip | 6.8 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Together CC project for proposed MIR-3 schema |
| mir3-intermediate-schema-rev2.pdf | 15.0 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | ER diagram for MIR-3 attempt no. 3 rev 2 |
| mir3-schema-2003-04-08.pdf | 26.5 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Initial implementation of MIR3 relational schema |
| mygrInfoModel-deployment-0.2.pdf | 150.1 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the infoModel - updated version (PDF) |
| mir3-schema.pdf | 27.7 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Revised schema 2003-04-24 |
| IMDeployment-FromUseCasesToArchitecture-v0.2.pdf | 173.8 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | IMDeployment - From Use Cases To Architecture (PDF) |
| MIR-GenericQuery-Opts.pdf | 41.5 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | Evaluation of MIR Generic Query Options |
| myGridInfoModel-deployment-0.2.doc | 213.0 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the information model - updated version |
| myGridTypes-thoughts.pdf | 149.4 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | Thoughts on myGrid Type System |
| mygrInfoModel-deployment.pdf | 111.9 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the information model |
| mygrInfoModel-deployment-0.3.1.pdf | 180.3 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the infoModel - version 0.3.1 (PDF) |
| mygrInfoModel-deployment-0.3.pdf | 180.3 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the infoModel - version 0.3 (PDF) |
| mir3-abstract-schema.gif | 9.8 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | |
| EscienceMed-and-Dev-Plan.doc | 176.5 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | A discussion doc on Info. Model and Esci. mediator |
| IMIR3.java | 49.8 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Revised MIR3 API 2003-04-24b |
| mir3-intermediate-classes-rev2.pdf | 16.5 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Class diagram for MIR-3 attempt no. 3 rev 2 |
| respository.pdf | 92.7 K | 24 Jul 2006 - 09:40 | ChrisWroe | |
| mir3-info-model.pdf | 28.5 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Revised info model 2003-04-24 |
| mir3-info-model-2003-04-08.pdf | 27.8 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Initial implementation of MIR3 info model |
| mir3-design-note-cmg.pdf | 244.9 K | 24 Jul 2006 - 09:40 | ChrisGreenhalgh | Design notes and Schema for proposed MIR-3 |
| mygrInfoModel-deployment-0.3.2.pdf | 178.9 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | deploying the infoModel - version 0.3.2 (PDF) |
| IMDeployment-CoreInterfaceSpecifications-v0.1.pdf | 153.7 K | 24 Jul 2006 - 09:40 | NedimAlpdemir | IMDeployment - Core interface Specifications (PDF) |
| MIR-Benchmarking-v0.1.pdf | 181.6 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | |
| InitialMIRBenchmarking.pdf | 23.9 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | |
| MIR-Benchmarking-v0.3.pdf | 209.2 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | 3rd version of mIR benchmarking document |
| WSIExampleClient.java | 12.8 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | WS-I Client |
| OGSA-DAIWS-IHow-to.txt | 1.9 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | How To guide for OGSA-DAI WS-I |
| UserGuide-all.doc | 1943.0 K | 24 Jul 2006 - 09:40 | KatyWolstencroft | |
| ogsadai-wsi.zip | 6641.1 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | OGSA-DAI WS-I bundle configured for any platform |
| MIR-Benchmarking-v0.2.pdf | 202.2 K | 24 Jul 2006 - 09:40 | ArijitMukherjee | 2nd version of mIR benchmarking document |