Introduction
The aim of this scenario is to identify and characterise genes located in regions of human chromosomes that show linkage to Graves’ disease (GD). GD is an autoimmune disease of the thyroid in which an individual's immune system attacks cells in the thyroid gland, resulting in hyperthyroidism. This is caused by stimulation of the thyrotrophin receptor by thyroid-stimulating autoantibodies secreted by lymphocytes of the immune system.
The GD scenario has been divided into 4 sections:
Affymetrix microarray studies
The GD candidate genes were identified by microarray analysis.
Affymetrix U95A arrays were probed with RNA extracted from CD4-positive lymphocytes from four GD patients and four healthy controls. The four GD microarray datasets were then compared with the four control datasets using the Data Mining Tool (Affymetrix) to identify differentially expressed genes.
Annotation pipeline
Over 50 genes were found to be differentially expressed in CD4-positive lymphocytes from GD patients. To understand why these genes were expressed in lymphocytes from GD patients but not in healthy individuals, Claire, our biologist working on the GD scenario, would like to use myGrid to query public databases such as Ensembl, OMIM, dbSNP and Medline. These queries would let her view information about gene structure and function, chromosome location, the presence of single nucleotide polymorphisms (SNPs), expression control features and association with other genetic diseases. She may also like to identify other experimental conditions or diseases in which the candidate genes' expression is significantly altered.
Genotype Assay Design System
SNPs are small genetic variations (single base-pair changes) that differ between the genomes of individuals. The differential expression of the candidate genes in GD individuals may be due to, or related to, the presence of SNPs associated with GD. Claire is interested in identifying the SNPs found in her GD patients and determining their frequency.
Restriction fragment length polymorphism (RFLP) assays are developed to genotype SNPs in her candidate genes. A region containing the SNP and the sequence flanking it on either side is amplified using the polymerase chain reaction (PCR). The amplified PCR product is digested with a suitable restriction enzyme (i.e. one that will cut at one SNP allele and not the other), and the products are run on agarose gels to view product size and determine the genotype.
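The choice of a "suitable" enzyme can be sketched in a few lines of Python. This is only an illustration, not a myGrid service: the flanking sequence and the C/T SNP are made up, and the enzyme list is a small hand-picked set of common enzymes with palindromic recognition sites (so scanning one strand is enough). The function names are hypothetical.

```python
# Recognition sites for a few common restriction enzymes. All are palindromic,
# so scanning a single strand is sufficient in this sketch.
ENZYME_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
    "MspI": "CCGG",
    "HaeIII": "GGCC",
    "AluI": "AGCT",
}


def cut_positions(sequence, site):
    """Return every 0-based index at which `site` occurs in `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def discriminating_enzymes(left_flank, alleles, right_flank):
    """Map enzyme name -> alleles whose amplicon contains its site,
    keeping only enzymes that cut some alleles but not all of them."""
    result = {}
    for name, site in ENZYME_SITES.items():
        cut_alleles = [
            allele for allele in alleles
            if cut_positions(left_flank + allele + right_flank, site)
        ]
        if 0 < len(cut_alleles) < len(alleles):
            result[name] = cut_alleles
    return result


# Hypothetical C/T SNP with made-up flanking sequence.
hits = discriminating_enzymes("ATGCATTTGAC", ["C", "T"], "GGTTAACATCA")
print(hits)
```

In this made-up example only MspI (CCGG) is reported, because its site is created by the C allele and absent from the T allele. A real assay would also need to check that the enzyme does not cut elsewhere in the amplicon in a way that obscures the band pattern.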
Claire would like to use myGrid to:
- Help her design primers (short DNA sequences that mark the start and end points of the region of DNA she wants to amplify) for the PCR experiment.
- Select the restriction enzyme that is specific to a particular SNP for the RFLP experiment.
- Design primers for other forms of experiments to characterise SNPs.
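As a rough illustration of the primer design step, the sketch below takes the two ends of a target region as a primer pair and estimates melting temperatures with the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rule of thumb for short oligonucleotides. The function names and the default primer length are assumptions for this example; a real design tool would also check GC content, self-complementarity and specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Estimate melting temperature (degrees C) with the Wallace rule."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def naive_primer_pair(template, primer_length=20):
    """Pick primers from the two ends of `template`.

    The forward primer is the first `primer_length` bases; the reverse
    primer is the reverse complement of the last `primer_length` bases.
    """
    forward = template[:primer_length]
    reverse = reverse_complement(template[-primer_length:])
    return forward, reverse


# Made-up 12-base template, with 4-base primers to keep the example readable.
forward, reverse = naive_primer_pair("ATGCCCGGGTAA", primer_length=4)
print(forward, wallace_tm(forward))
print(reverse, wallace_tm(reverse))
```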
3D Protein Structure & SNP visualisation
Any SNPs occurring in the coding regions of a candidate gene may potentially give rise to a change in the amino acid sequence of the protein encoded by the gene.
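To make the link between a coding SNP and the protein concrete, the following sketch translates the affected codon before and after the substitution using the standard genetic code. The coding sequence and SNP position are made up, and `amino_acid_change` is a hypothetical helper that assumes the sequence starts in frame at the first codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_change(cds, snp_index, alt_base):
    """Return (1-based residue number, reference residue, variant residue).

    `cds` is an in-frame coding sequence, `snp_index` is the 0-based
    position of the SNP within it, and `alt_base` is the variant base.
    """
    codon_start = snp_index - snp_index % 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = snp_index - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    return codon_start // 3 + 1, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


# Made-up sequence ATG GAG CTG (Met-Glu-Leu).
print(amino_acid_change("ATGGAGCTG", 4, "T"))  # GAG -> GTG, non-synonymous
print(amino_acid_change("ATGGAGCTG", 8, "A"))  # CTG -> CTA, synonymous
```

The first call reports a Glu to Val change at residue 2; the second shows a SNP that leaves the amino acid unchanged, which is why not every coding SNP alters the protein.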
Claire would like to use myGrid to:
- Query a protein structure database, e.g. PDB or MSD, to determine whether a structure of the protein encoded by her candidate gene is available. If so, view the protein structure and highlight the amino acid change caused by the SNP.
- Obtain information about the protein, e.g. its function and functional domains, by querying SWISS-PROT and InterPro. Sheffield’s PESTO information extraction service could be used to extract information on active sites that may be present on the protein.
List of services required for the Graves’ disease scenario:
Annotation Pipeline
| Service Name | Parameters | Returns | Function | Group | Priority | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Microarray query | SQL statement (string) | Collection of probe set Ids (string array) | Performs a query on an existing microarray data set | Newcastle? | Not a priority | Pending |
| AffyMapper | Probe set Id (string) | Accession number (string) | Maps a probe set Id to an accession number | Newcastle | Required | Done |
| Srs.queryById() | Data entry Id, databank Id | Data entry (string) | Provides the databank entry given an entry Id and databank Id | EBI | Required | ASAP |
| Srs.queryByXRef() | Data entry Id, databank Id, XRefDatabank Id | Data entry Id | Retrieves a data entry Id from a databank by following a cross-reference (X-link) from the supplied data entry Id | EBI | Required | ASAP |
| analyseGOTerms() | Collection of GO Ids | Collection of parent GO Ids | Used to classify changes in gene expression based on biological processes and molecular function | | Optional | |
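
The table gives only the inputs and outputs of analyseGOTerms(), so the following is a minimal sketch of one way such a service could collect parent GO Ids, not its actual implementation. The parent map uses placeholder identifiers (TOY:...) rather than real Gene Ontology terms, and the function name and signature are assumptions.

```python
def analyse_go_terms(go_ids, parents):
    """Return the set of all ancestor Ids reachable from `go_ids`.

    `parents` maps each term Id to a list of its direct parent Ids.
    """
    seen = set()
    stack = list(go_ids)
    while stack:
        term = stack.pop()
        for parent in parents.get(term, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Placeholder hierarchy: TOY:0003 -> TOY:0002 -> TOY:0001 <- TOY:0004
TOY_PARENTS = {
    "TOY:0003": ["TOY:0002"],
    "TOY:0002": ["TOY:0001"],
    "TOY:0004": ["TOY:0001"],
}

ancestors = analyse_go_terms(["TOY:0003", "TOY:0004"], TOY_PARENTS)
print(sorted(ancestors))
```

Grouping differentially expressed genes by shared ancestors in this way is what would allow expression changes to be classified by biological process or molecular function.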