FlyWeb/FunctionalRequirements

This page collects use cases and user stories for FlyWeb.

Actors

  • Researcher: a Drosophila functional genomics researcher.

Use Cases (In Scope)

These use cases are in scope for the current round of development.

Use Cases (Out of Scope)

These use cases are not in scope for the current round of development.

Find Images by Gene

A researcher wants to find images of in situ gene expression, anywhere in the organism, for a specific gene.

E.g. find images depicting in situ expression of gene "aly" anywhere in the organism, show me thumbnails of the images accompanied by a short summary (containing what information?), and give me links to the image at its source location.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

How do researchers want to identify a gene? What's the difference between symbol, name, annotation symbol and flybase ID in flybase? What about synonyms and secondary ids in flybase? -- any of the above!
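
One way to support "any of the above" is a lookup index that maps every identifier a gene carries to a single record. The following is a minimal sketch, assuming a hypothetical in-memory record layout: the class, the field names and the example values (including the placeholder FlyBase ID and annotation symbol) are illustrative and are not taken from FlyBase.

```python
from dataclasses import dataclass, field


@dataclass
class GeneRecord:
    # Hypothetical fields mirroring the identifier kinds listed in the notes.
    flybase_id: str
    symbol: str
    name: str
    annotation_symbol: str
    synonyms: list = field(default_factory=list)
    secondary_ids: list = field(default_factory=list)

    def all_identifiers(self):
        return [self.flybase_id, self.symbol, self.name,
                self.annotation_symbol, *self.synonyms, *self.secondary_ids]


def build_index(records):
    """Map each case-folded identifier to the records that carry it."""
    index = {}
    for record in records:
        for identifier in record.all_identifiers():
            bucket = index.setdefault(identifier.casefold(), [])
            if not any(existing is record for existing in bucket):
                bucket.append(record)
    return index


def resolve(index, query):
    """Return every matching record; more than one result means ambiguity."""
    return index.get(query.strip().casefold(), [])


# Placeholder data: the IDs below are invented for illustration only.
example = GeneRecord("FBgn0000000", "aly", "always early", "CG00000",
                     synonyms=["example-synonym"],
                     secondary_ids=["FBgn9999999"])
index = build_index([example])
matches = resolve(index, "  Always Early ")
```

Returning a list rather than a single record is deliberate: synonyms may be shared between genes, and case-folding can merge identifiers that differ only in case, so the caller has to be able to see and handle ambiguity.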

What does BDGP use to identify genes (and "gene products")? Where does BDGP get its gene product identifiers from?

DS: researchers commonly fail to distinguish between genes and gene products.

DS: BDGP has done in situ hybridisation of mRNA, as has Helen.

TODO look into BDGP concept of "gene product", cDNA libraries, ESTs etc.

Find Chromosomal Neighbours of a Given Gene

A researcher wants to find all chromosomal neighbours of a given gene.

E.g. find all chromosome neighbours of gene "aly".

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

DS: genes always read 5' to 3' (maybe either strand, unusually overlapping on complementary strands). Go to genomic database, look along the strand. Traditionally do by genetic crossovers, e.g. blue flowers, pointy leaves, breed against each other, see how frequently alleles are segregated. If on same chromosome close together, almost always expressed together. If far apart, easier to split up.
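
"Look along the strand" in a genomic database can be sketched as an interval query. This is a rough illustration only, assuming a hypothetical record layout with chromosome arm, start, end and strand; the window size, the gene names other than "aly", and all coordinates are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneLocation:
    # Hypothetical layout; coordinates are 1-based and inclusive.
    symbol: str
    chromosome_arm: str
    start: int
    end: int
    strand: str  # "+" or "-"


def find_neighbours(genes, target_symbol, window=50_000, same_strand_only=False):
    """Return genes on the target's arm within `window` bases, nearest first."""
    target = next(g for g in genes if g.symbol == target_symbol)
    hits = []
    for gene in genes:
        if gene.symbol == target.symbol:
            continue
        if gene.chromosome_arm != target.chromosome_arm:
            continue
        if same_strand_only and gene.strand != target.strand:
            continue
        # Distance between the two intervals; 0 when they overlap.
        gap = max(gene.start - target.end, target.start - gene.end, 0)
        if gap <= window:
            hits.append((gap, gene.start, gene))
    return [gene for _, _, gene in sorted(hits, key=lambda hit: hit[:2])]


# Invented coordinates, for illustration only.
genes = [
    GeneLocation("aly", "3L", 100_000, 103_000, "+"),
    GeneLocation("geneA", "3L", 104_000, 106_000, "-"),
    GeneLocation("geneB", "3L", 90_000, 95_000, "+"),
    GeneLocation("geneC", "3L", 400_000, 402_000, "+"),
    GeneLocation("geneD", "2R", 101_000, 102_000, "+"),
]
nearby = find_neighbours(genes, "aly", window=10_000)
nearby_same_strand = find_neighbours(genes, "aly", window=10_000,
                                     same_strand_only=True)
```

What counts as a "neighbour" (a fixed window, the next N genes, or the same strand only) is still an open question for this use case, which is why the sketch exposes both as parameters.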

TODO why do this (i.e. look for neighbours)?

DS: genes located together tend to be under similar control. They also get disrupted together if there is a chromosomal fault. Genes with similar developmental functions are often close together, e.g. all HOX genes occur as a linear sequence. Many MHC genes important in immunology cluster together. It is significant when genes with similar function lie close together, because historically they arose by gene duplication.

Find GO Annotations for a Given Gene

A researcher wants to know what GO says about function, process and cellular component of a given gene.

E.g. what does GO say about function, process and cellular component of gene "aly".

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

DS: And find GO id for the gene as well.

Get the Sequence for a Given Gene

A researcher has a gene name, and wants to find its published sequence of nucleotides.

E.g. what is the published sequence for the gene "aly".

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Note: this sequence could be either a nucleotide sequence or a protein sequence.

Notes

Could be genomic sequence or mRNA sequence.

DS: Could want whole sequence, or bit of sequence to make primer or probe from.

Find Other Genes with High Level Sequence Similarity to a Given Gene

A researcher has a gene sequence, and wants to find other genes with high level sequence similarity.

E.g. what are the other genes with high level sequence similarity to my gene "amo"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Note: we are trying to distinguish between a search for sequence similarity and a search for homologues.

Note (a thought): how about wrapping up a BLAST service as a data query service in the Data Web, so that data returned from a BLAST query could be programmatically integrated with other data on the Data Web?

Notes

DS: many do not distinguish between sequence similarity and homology, just put sequence into blast and extrapolate from sequence similarity to homology.

... this is often done to get clue about what gene is doing (function). Blast against other people's genome, see if homologues have identified function.

Find Homologues for a Given Gene in Other Organisms

A researcher has a gene sequence, and wants to find homologues from other organisms.

E.g. what are homologues for "aly" in humans.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004 and DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

TODO check this story is genuine.

Note

DS: this use case synonymous with above.

Find Genes that Are Known to Interact with a Given Gene

A researcher has a gene, wants to find other genes known to interact with it.

E.g. what genes are known to interact with "aly"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

TODO check this, genuine? more detail? what does "genetic interaction" mean?

Notes

DS: Don't get genes interacting, get gene products interacting with genes, because many genes code for DNA binding proteins, which either promote or inhibit expression of a target gene, e.g. by blocking the binding site for polymerase.

... May also mean that two gene products interact, e.g. proteins, where one is a regulator of an enzyme, or they may form an active complex, e.g. alpha and beta haemoglobin subunits. E.g. trypsin inhibitor interacts with trypsin. Title of this use case is ambiguous. Strictly (clinically) speaking, genes don't interact with each other.

... Can also get interaction at RNA level, RNA interference. Can silence a gene by introducing a double-stranded RNA molecule, where one strand is complementary to the message. An enzyme called Dicer cuts the double-stranded RNA into small fragments (about 22 bases), the two chains separate, and one chain gets incorporated into "RISC" (the RNA-induced silencing complex), which then seeks out and binds to the complement on the mRNA and cuts up the mRNA. Can use this to systematically knock out every gene e.g. in mouse and look at phenotype.

... Also e.g. transgenic cotton plant, express dsRNA specific ... normally cotton makes gossypol and insects die. Some weevils are resistant because they have an enzyme which breaks it down. If create dsRNA to knock this out, insect ingests it and is no longer resistant. Also done for maize, knock out resistance in weevil.
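
The RNA interference steps described in these notes (dice the double-stranded RNA into short fragments, keep one strand as a guide, find its complement on the message) can be sketched as string operations. This is a toy model under strong assumptions: it ignores the real biochemistry (overhangs, strand selection, mismatch tolerance), requires perfect base pairing, and uses invented sequences.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand, fragment_length=22):
    """Cut the sense strand into consecutive fragments, dropping a short tail."""
    last_start = len(sense_strand) - fragment_length
    return [sense_strand[i:i + fragment_length]
            for i in range(0, last_start + 1, fragment_length)]


def guide_strands(sense_strand, fragment_length=22):
    """Antisense guides: each is complementary to one fragment of the message."""
    return [reverse_complement(fragment)
            for fragment in dice(sense_strand, fragment_length)]


def target_sites(mrna, guides):
    """Return (position, guide) pairs where a guide pairs perfectly with the mRNA."""
    sites = []
    for guide in guides:
        site = reverse_complement(guide)  # the stretch of mRNA the guide binds
        position = mrna.find(site)
        while position != -1:
            sites.append((position, guide))
            position = mrna.find(site, position + 1)
    return sites


# Invented sequences, for illustration only.
trigger = "ACGUACGUACGUACGUACGUAC" + "UUGGCCAAUUGGCCAAUUGGCC"
mrna = "GGG" + trigger + "CCC"
guides = guide_strands(trigger)
sites = target_sites(mrna, guides)
```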

TODO heading ambiguous, look at this at all levels, DNA-RNA, RNA-RNA, Protein-protein etc.

DS: for protein-protein interactions, technique called yeast 2 hybrid technique, look for protein interactions in cytoplasmic extract. Can tell if proteins "holding hands", important for their function. If find associated proteins, maybe working together. Also other, less tidy techniques, e.g. co-precipitate.

In Which Tissues is a Given Gene Expressed?

A researcher has a gene, and wants to know in which tissues that gene is known to be expressed.

E.g. in which tissues is "aly" known to be expressed?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

TODO check this, maybe infer from different sources of information, e.g. ESTs or in-situ images

Notes

DS: Where in cells of that tissue, in what cells, in what parts of cells? Ilan Davis is looking at high resolution at what parts of cells a gene is expressed in. How do you do that? 2 methods. The FlyAtlas work from Julian Dow dissects the fly into a series of organs and assays the organs for expression of mRNA with a DNA-array screen. That's how he knows expression is high in brain, low in testis. Helen talked about discrepancies between FlyAtlas data and her data due to imperfect dissection, e.g. gut contamination in brain sample.

... Other thing is to do in situ mRNA hybridisation at organ level, cell level, sub-cellular level. Or look at protein expression at organ, cell, sub-cell level. Can look at protein expression in several ways, e.g. bind a labelled antibody (immunocytochemistry), with the label a blue reaction product or, usually, fluorescence. Or, recently, tag the protein with GFP or YFP, by inserting the sequence for GFP upstream or downstream (so expressed as a unit); the two bits fold up separately. That's what the Cambridge people are doing in the FlyTrap work. The Cambridge guys randomly insert GFP in proteins, isolate mutants, then farm them out to people, e.g. Helen, to see what's happening in a given tissue. Helen then says e.g. early spermatocytes are brightly coloured, and feeds the data back to Cambridge in a private database (not called flytrap?); you are only allowed to see other people's data if you have deposited your own. For every 50 flytrap strains you do, you get to see 50 others. Point is to look at protein gene product expression in cells.

Compare gene expression location (tissue) for two homologous genes in two different organisms?

A researcher has two homologous genes from two different organisms, and wants to compare information about gene expression location for each of these two genes.

E.g. TODO

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Same gene, e.g. actin, in different organisms. Or the same gene initially had different names, but then turned out to be the same gene. E.g. bcl1 in worms is called something else in Drosophila or humans. In FlyBase, many synonyms are names used in other species.

... Stricter definition is the Drosophila alpha actin gene. E.g. porcine pancreatic elastase gene (gives species and tissue). Also can have different isoforms of the same gene expressed. Will find biologists talk both strictly and loosely, so have to use context to disambiguate.

... Loosely speaking, talk about actin gene across species, cytochrome c highly variable used as a species marker (mitochondrial gene used to study maternal inheritance).

Find EST expression profiles for a given gene in a given organism?

A researcher has a given gene, and wants to find EST expression profiles for that gene in a specific organism.

E.g. find EST profiles for aly-1 in worms.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Note: See domain model, question about "genes" and "species".

Notes

Really looking at mRNA expression -- how much, where? Either microarray (how much) or in situ (where). So this might mean go to fly atlas, find out how much expressed per tissue.

Is My Gene Expressed in Somatic and/or Germ Line Cells?

A researcher has a gene, and wants to know if this gene is expressed in somatic cells, germ line cells or both.

E.g. is VASA expressed in germ-line cells only?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Note: restrict to Drosophila only, or expand to other species?

Notes

Interrogation of FlyTED, don't have equivalent female database. According to cell-type annotations, in which cell type is expression? Also could look at fly atlas, if find in gut then know not exclusively expressed in germ line.
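
The inference in this note, that expression found in a tissue such as gut rules out germ-line-only expression, can be written down as a small rule. A minimal sketch, assuming tissue-level data is just a collection of tissue labels; the labels, the default set of gonad tissues and the result strings are all illustrative assumptions.

```python
def germ_line_exclusivity(expressed_tissues,
                          gonad_tissues=frozenset({"testis", "ovary"})):
    """Classify tissue-level expression data for the germ-line question.

    Tissue-level data can only rule germ-line exclusivity out; it cannot
    confirm it, because gonads also contain somatic cells.
    """
    tissues = {tissue.strip().lower() for tissue in expressed_tissues}
    if not tissues:
        return "no expression data"
    if tissues - gonad_tissues:
        return "not exclusively germ line"
    return "gonad only: check cell-type annotations"
```

For example, `germ_line_exclusivity(["Testis", "gut"])` falls into the second branch, while a gene seen only in testis still needs the cell-type annotations mentioned above before anything can be concluded.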

Is My Gene Known to be Involved in any Signalling Pathways?

A researcher has a gene, and wants to know if it is involved in any known signalling pathways.

E.g. is "aly" involved in any signalling pathways?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

There are databases of protein domains. Take sequence, BLAST bit of sequence against database, identify domains of protein, being independently folding bits, e.g. kinase domain. By recognising that, suspect protein is a kinase. Could then go to specialised kinase or phosphatase database and look more specifically ... for what?

Or may know from biochemical knowledge of signalling pathways, may know kinase X interacts with regulatory protein Y, by blasting find protein is one of these other proteins, then deduce involvement in pathway. Cf. systems biology, motility in bacteria, 5 receptors, 5 intracellular kinases, mechanism for propellor, trying to model levels of phosphorylation of different proteins, affect propellors go forwards or in reverse.

Because many proteins have been sequenced, and some crystallised, we know how they work. There are known relationships between protein sequences (domains) and catalytic functions. So just by looking at sequence, can deduce protein function. Some might be signalling pathways.

Also get membrane receptors in signalling pathways, bind ligands outside cell. Receptors fall into known classes with known structures with known sequence profiles.

Find Published Papers About a Given Gene

A researcher has a gene, and wants to review published literature concerning that gene.

E.g. what papers talk about "aly" in Drosophila?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Go to MEDLINE or PubMed, put gene identifier or name in search box (how reliable?) ... then question of gene name disambiguation. Usually done at stage of results of search, read title of paper to disambiguate.

Cf. Martijn Schuemie's gene disambiguation service in Rotterdam, based on Collexis software for conceptual fingerprinting of documents. IBRG used to use that as a web service on the fly for bioimage search.

Find Published Papers About a Given Protein

A researcher has a protein, and wants to review published literature concerning that protein.

E.g. what papers talk about "polycystin"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005


Find proteins from published papers

A researcher has a reference, and wants to find what proteins this paper talks about.

E.g. which protein is discussed in a literature reference?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes

Tool called iHOP, Information Hyperlinked over Proteins. Takes MEDLINE abstracts; if two protein names appear close together, assumes a relationship, so can go from one protein to another. (The whole web linked in three clicks.) Also looks at type of relationship between protein names (text mining). Robert Hoffmann.
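
The co-occurrence idea in this note can be sketched in a few lines. This is not the tool's actual method, only a rough illustration that treats "close together" as "in the same sentence" (an assumption) and uses invented abstracts with placeholder protein names.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrences(abstracts, protein_names):
    """Count how often each pair of protein names shares a sentence."""
    patterns = {
        name: re.compile(r"(?<![\w-])" + re.escape(name) + r"(?![\w-])",
                         re.IGNORECASE)
        for name in protein_names
    }
    counts = Counter()
    for abstract in abstracts:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            found = sorted(name for name, pattern in patterns.items()
                           if pattern.search(sentence))
            counts.update(combinations(found, 2))
    return counts


def linked_proteins(counts, protein):
    """Return the proteins co-mentioned with `protein`, with their counts."""
    links = Counter()
    for (first, second), n in counts.items():
        if first == protein:
            links[second] += n
        elif second == protein:
            links[first] += n
    return links


# Invented text with placeholder names; not real findings.
abstracts = [
    "ProtA binds ProtB in vitro. ProtC was not detected.",
    "Loss of ProtB reduces ProtC levels. ProtA and ProtB colocalise.",
]
counts = cooccurrences(abstracts, ["ProtA", "ProtB", "ProtC"])
links = linked_proteins(counts, "ProtB")
```

Navigating from one protein to its linked proteins, and from those to theirs, gives the "hyperlinked" traversal described above. Classifying the type of relationship would need real text mining on top of this.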

Find Published Papers About a Specific Species From a Known Author

A researcher knows an author in her/his research field, and wants to retrieve publications about a specific species from this author.

E.g. what papers written by "matunis" (last name of the author) talk about "drosophila".

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes

Note: one possible source for searching for such paper could be "Pubmed".

E.g. go to medline abstracts, do dual search on author name and species.
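
A dual search on author name and species could be expressed as a PubMed query through the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and does not send it. The function name is hypothetical, and the choice of the [Title/Abstract] field for the species term is an assumption, since a MeSH term search might suit this use case better.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_author_species_query(author, species, since_year=None, retmax=20):
    """Build an ESearch URL for papers by `author` that mention `species`."""
    term = f"{author}[Author] AND {species}[Title/Abstract]"
    params = {"db": "pubmed", "term": term,
              "retmax": retmax, "retmode": "json"}
    if since_year is not None:
        # Restrict by publication date; ESearch expects mindate and maxdate together.
        params.update({"datetype": "pdat",
                       "mindate": str(since_year), "maxdate": "3000"})
    return ESEARCH_URL + "?" + urlencode(params)


url = pubmed_author_species_query("matunis", "drosophila")
recent_url = pubmed_author_species_query("matunis", "drosophila",
                                         since_year=2003)
```

The optional `since_year` parameter also covers the "more recent papers from a given author" story further down this page.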

Find Images of In-Situ mRNA Expression for a Given Gene in a Given Tissue/Organ?

A researcher has a gene, and wants to find images depicting in-situ mRNA expression of that gene in different tissues/organs.

E.g. find images of "aly" expression in embryos.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

In given species, or across species (e.g. in testis across species?)?

Could do that by literature search in present world, tissue-specific databases are few and not linked.

Find Images of In-Situ Protein Expression for a Given Protein in a Given Tissue/Organ?

A researcher has a protein, and wants to find images depicting in-situ expression of that protein in different tissues/organs.

E.g. find images of "polycystin-2" expression in embryos.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Is My Gene Expressed in the Cytoplasm or the Nucleus?

A researcher has a gene, and wants to know if the gene is expressed in the cytoplasm or the nucleus.

E.g. is "aly" expressed in the nucleus?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Many gene products are in the nucleus at some part of the cell cycle and in the cytoplasm at other parts of the cycle. Certain proteins, made in cytoplasm, are specifically sequestered into the nucleus because they bear a nuclear import signal. If that signal is mutated, they stay in cytoplasm. So may need when as well as where, might want to use fluorescent label at high mag under light microscope, see spread in cytoplasm or nuclear localisation. Helen's data might not answer this, because done at low magnification (so might need only images over a certain magnification (need image size data to understand this)).

N.B. all mRNA is exported to cytoplasm for translation to protein. E.g. protein of large ribosomal subunit. So this use case possibly nonsense?

This only makes sense in the case of proteins, i.e. localisation of protein as gene product.

N.B. some genes make functional RNA molecules, e.g. tRNA, ribosomal RNA. But this is not what Helen's studied.

Is My Protein Expressed in the Cytoplasm or the Nucleus?

A researcher has a protein, and wants to know if the protein is expressed in the cytoplasm or the nucleus.

E.g. is "polycystin-2" expressed in the nucleus?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

TODO merge above use case here

At what developmental stage (i.e. when) is a given gene expressed?

A researcher has a gene, and wants to know at which developmental stage(s) the given gene is expressed.

E.g. is aly expressed at TODO stage?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

In drosophila testis, spatial location correlates with developmental stage, so can determine stage by looking at location. This is unusual.

The other way is to look at embryonic development at different ages, see what genes expressed where -- this is what BDGP does.

Depends what you mean by "development". Development of whole organism from egg, or development of particular cell types in adult (e.g. from stem cells). Rare to find correlation of spatial and temporal, do find in gut, crypts, near bottom of crypt 4-5 stem cells. N.b. there is Java model of stem cell development in gut. As develop, move up epithelium.

Where are the coding and regulatory regions for a given gene?

A researcher has a gene, and wants to know the location of coding and regulatory regions.

E.g. where are coding and regulatory regions for "aly"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Go to genomic database, look at the sequence and its annotations.

Is a given gene involved in regulating any other genes?

A researcher has a gene, and wants to know if this gene is involved in gene regulation, and if so, what other genes are regulated.

E.g. is UBX involved in gene regulation? What other genes are regulated?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Can find answer from literature.

Can also look for sequence of transcription factor, certain domains known to be DNA binding domains.

Is UBX a single gene? Does it code for a single protein? Do genes always code for a single protein?

Is a given gene associated with polycistronic mRNA? If so, what are the related genes/protein products?

E.g. (TODO)

Source: DavidShotton's notes from discussion with HelenWhiteCooper 22/12/2004

Notes

Note: TODO check the wording here ... how is word "gene" used? E.g. same "gene" codes for two proteins? Or two "genes" share same piece of DNA / mRNA?

Where get polycistronic other than bacteria?

Cf. lac operon, single promoter, transcribe whole sequence of genes as single message, then ribosome goes along and reads them off. All in a pathway, all made synchronously, very efficient. Doesn't occur in eukaryotes?

Helen says, can get bicistronic transcripts in eukaryotes.

Related to question, who are gene neighbours.

TODO research this

Find Orthologous Genes for a Given Gene from Other Organisms

A researcher has a gene, and wants to find orthologs for this gene from other organisms.

E.g. The researcher has a gene "amo(Pkd2)" and wants to look for orthologs for this gene.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Find the Synonyms for a Given Gene Symbol or a Gene Name

A researcher has a gene, and wants to find the synonyms for this gene.

E.g. what are the symbol synonyms or the name synonyms of the gene "Pkd2"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Find the Evidence for a Given Gene Ontology (GO) Annotation

A researcher has a gene and the GO annotations about this gene, and wants to find out the evidence for these annotations, e.g. links to references, how ESTs match gene predictions, etc.

E.g. what are evidence for the GO annotations of the gene "Pkd2"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Find the ESTs for a Given Gene

A researcher has a gene, and wants to find out the ESTs for this gene.

E.g. what are ESTs for the gene "Pkd2"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes

Note: it is not clear what the purpose is for looking for the ESTs for the given gene and how this information can serve as evidence for the GO annotations about this given gene.

TODO ask helen if valid and where you go for the answer

Find other papers in a given journal issue

A researcher has a reference or a journal issue number, and wants to find other papers in the same journal issue.

E.g. what are the other papers from the same issue as my paper "Gao et al., 2003 Curr. Biol".

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Find other papers in a given journal issue in a given topic

A researcher has a reference or a journal issue number, and wants to find other papers in the same journal issue talking about a given topic.

E.g. what are the other papers from the same journal issue as my paper "Gao et al., 2003 Curr. Biol" and talking about "male fertility"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Find information about crystal structure of a given protein?

A researcher has a given protein, and wants to find information about the crystal structure of this protein. N.B. This could be published papers containing empirical evidence for the structure of a protein, or published crystal structure data (cf. eCrystals), or predicted protein structure.

E.g. find papers investigating the structure of polycystin-1.

E.g. find crystal structure data for polycystin-1.

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes: TODO check this, maybe split this out into several use cases.

Find more recent papers from a given author

A researcher knows the name of a colleague, and wants to read more recent publications from this colleague.

E.g. what has been published by "Gao" since 2003?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes

Cf. also crossref, ed pentz. Citation navigation (in and out).

Find conference abstracts about a given gene

A researcher has a gene, and wants to find abstracts from past and latest conferences about this gene.

E.g. what abstracts have been published for the gene "PKD2"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

Notes

Grey literature, not well indexed in MEDLINE etc.

N.B. easier for drosophila because manually curated in flybase annotations.

Find personal communications about a given gene

A researcher has a gene, and wants to find all the personal communications about this gene.

E.g. what personal communications exist for the gene "PKD2"?

Source: DavidShotton's notes from discussion with HelenWhiteCooper 31/10/2005

What's new? What's in the past? Has information about a given gene from a given source changed since a given time/release?

Has new information been discovered about a given gene? Has the concept of a gene changed? Where does each piece of information about a given gene come from (i.e. which data source, which version)? ...

E.g. TODO

Notes

Notes: TODO split this up!

Addition of new information order of magnitude faster than changes in gene name or schema. If you have an updating service, does most of the work.
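
The "has anything changed since release X?" question can be sketched as a diff between two snapshots of the same source. This is a minimal illustration, assuming each release is available as a mapping from gene ID to a dictionary of fields; the IDs and field names below are placeholders.

```python
def diff_releases(old, new):
    """Compare two snapshots that map gene ID -> {field name: value}."""
    changes = {"added": {}, "removed": {}, "changed": {}}
    for gene_id in new.keys() - old.keys():
        changes["added"][gene_id] = new[gene_id]
    for gene_id in old.keys() - new.keys():
        changes["removed"][gene_id] = old[gene_id]
    for gene_id in old.keys() & new.keys():
        fields = {}
        for name in old[gene_id].keys() | new[gene_id].keys():
            before = old[gene_id].get(name)
            after = new[gene_id].get(name)
            if before != after:
                fields[name] = (before, after)
        if fields:
            changes["changed"][gene_id] = fields
    return changes


# Placeholder snapshots, for illustration only.
release_1 = {
    "gene-1": {"symbol": "abc", "go_terms": ["GO:0000001"]},
    "gene-2": {"symbol": "def", "go_terms": []},
}
release_2 = {
    "gene-1": {"symbol": "abc", "go_terms": ["GO:0000001", "GO:0000002"]},
    "gene-3": {"symbol": "ghi", "go_terms": []},
}
changes = diff_releases(release_1, release_2)
```

Keeping added information separate from changed fields matches the observation above: additions are expected to dominate, so an updating service could handle them automatically and flag only renames or schema changes for review.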
