r1 - 29 Jun 2004 - 10:21:06 - StefanEgglestone
Message 1

Dear Tim

On Friday you suggested a workflow I should try writing. As I understood it, the workflow should look up the ids of all proteins with a given name in Swiss-Prot, then, for each protein id, look up the nucleotide sequence from which it was translated. I've got that working now. For each protein id found for a given protein name, the workflow generates a list of the EMBL nucleotide sequences cross-referenced by the Swiss-Prot record for that protein id.

So, for example, given the protein name apolipoprotein, the following list of protein ids is generated:

SWISSPROT:ABME_HUMAN SWISSPROT:ABME_MESAU SWISSPROT:ABME_MONDO SWISSPROT:ABME_MOUSE SWISSPROT:ABME_RABIT ... (loads more)

then for SWISSPROT:ABME_HUMAN, the following embl ids are generated:

EMBL:AB009422 EMBL:AB009423 EMBL:AB009424 ...(about 5 more)

The sequences these ids refer to can then easily be looked up in EMBL using SRS.
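
The lookup chain described above can be sketched roughly as follows. This is only an illustration in Python, not the workflow itself: the function name `embl_ids_by_protein` and the two lookup callables are hypothetical, and the stand-in tables hold just the ids quoted in this message rather than real Swiss-Prot or SRS queries.

```python
from typing import Callable, Dict, List


def embl_ids_by_protein(
    protein_name: str,
    find_protein_ids: Callable[[str], List[str]],
    find_embl_xrefs: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map each protein id found for protein_name to its cross-referenced EMBL ids."""
    return {pid: find_embl_xrefs(pid) for pid in find_protein_ids(protein_name)}


# Stand-in lookup tables (assumption: a real run would query Swiss-Prot and SRS).
# They contain only the ids quoted in this message; the elided ones are left out.
PROTEIN_IDS = {"apolipoprotein": ["SWISSPROT:ABME_HUMAN"]}
EMBL_XREFS = {
    "SWISSPROT:ABME_HUMAN": ["EMBL:AB009422", "EMBL:AB009423", "EMBL:AB009424"],
}

result = embl_ids_by_protein(
    "apolipoprotein",
    lambda name: PROTEIN_IDS.get(name, []),
    lambda pid: EMBL_XREFS.get(pid, []),
)
```

Passing the lookups in as callables keeps the sketch independent of how the databases are actually queried.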

I think you suggested these sequences should then be aligned against each other using clustalw, but I wasn't sure exactly how.

Should I do one alignment per protein (e.g. align EMBL:AB009422, EMBL:AB009423, EMBL:AB009424, ... against each other, then align all the sequences for SWISSPROT:ABME_MESAU against each other, then all the sequences for SWISSPROT:ABME_MONDO against each other, and so on, producing a list of clustalw results), or do you want all of the sequences generated above aligned against each other, producing one clustalw result?
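
To make the two options concrete, here is a minimal sketch in Python of how the alignment inputs would be grouped in each case. It is an illustration under assumptions, not the workflow itself: the function names are hypothetical, the EMBL ids shown for SWISSPROT:ABME_MESAU are placeholders rather than real accessions, and the clustalw call itself is not shown.

```python
from typing import Dict, List


def per_protein_inputs(xrefs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Option 1: one alignment per protein id, giving a list of clustalw results.

    Proteins with fewer than two sequences are skipped, since an
    alignment needs at least two sequences.
    """
    return {pid: ids for pid, ids in xrefs.items() if len(ids) >= 2}


def combined_input(xrefs: Dict[str, List[str]]) -> List[str]:
    """Option 2: one alignment over every sequence, giving one clustalw result.

    Duplicate EMBL ids are dropped and first-seen order is kept.
    """
    seen = set()
    combined = []
    for ids in xrefs.values():
        for embl_id in ids:
            if embl_id not in seen:
                seen.add(embl_id)
                combined.append(embl_id)
    return combined


SAMPLE_XREFS = {
    "SWISSPROT:ABME_HUMAN": ["EMBL:AB009422", "EMBL:AB009423", "EMBL:AB009424"],
    # Placeholder ids for illustration only, not real EMBL accessions.
    "SWISSPROT:ABME_MESAU": ["EMBL:PLACEHOLDER_1", "EMBL:PLACEHOLDER_2"],
}

alignment_jobs = per_protein_inputs(SAMPLE_XREFS)  # two separate alignments
single_job = combined_input(SAMPLE_XREFS)          # one alignment of five sequences
```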

Stef
