Collaborative Ontology Normalization
Collaborative Ontology Normalization Workshop II
A second workshop to validate and extend the ontology developed in workshop I.
Details here http://imageweb.zoo.ox.ac.uk/wiki/index.php/Collaborative_Ontology_Normalization2
Collaborative Ontology Normalization Workshop
Oxford, Wolfson College 24-26 June 2008. Local Organizer: David Shotton
Notes from an experiment in collaborative normalization of the Cell Type Ontology.
Attendees
Mikel Egaña, University of Manchester
Simon Jupp, University of Manchester
Philip Lord, University of Newcastle
James Malone, European Bioinformatics Institute
Helen Parkinson, European Bioinformatics Institute
David Randall, Manchester Metropolitan University
David Shotton, University of Oxford
Robert Stevens, University of Manchester
PHOTOS
[1] Group Photo 1
[2] Group Photo 2
[3] - Archive
Day 1. 25 June 2008
Introduction from Robert Stevens Collaborative Ontology Normalization
Hope to make some observations of the discussion and process, to inform us about tool dev and what kind of support, other than ontology representation, we need for collaborations.
Reason for this group: right size, and David is our domain specialist for cells; all of us are biologists of some sort, possibly excepting DR; we all know OWL to some good degree; all are familiar with the process of normalization as practised by Manchester and others, and proposed by Alan Rector (** add ref).
Aims:
RS: As a team, take the OBO CTO, a hand-crafted taxonomy of cell types, with development, a 'tangle'. When built by hand there will be errors; Mikel and I have looked at GO, normalizing and rebuilding the h'archy, and 1/10 has a missing or erroneous subsumption reln. The process of norm will give models that are maintainable and 'highly' axiomatized, and make better inferences. Will go through the norm and by the end will come out with a plan for how to normalize it: primary and secondary axes of classification, restrictions on cell types etc. Then we may rebuild some of it. After the event we will come up with a plan; BioHealth uses an ontology pre-processing lang (OPPL) for generating and transforming OWL format ontologies. Mikel will prog in OPPL and we will get a complete normalized OWL ontology. We will also have, through Dave's observation, a record of what we have done and how we decided what to do. Also annotating bits of the CTO as we go along with decisions. Supported by OntoClean, a technique for evaluating decisions.
DS:what about publishing the ontology
RS:OBO are reworking the CTO, this can be a product that informs the process
HP:why the CTO?
RS:small enough, 900 ish terms, domain is focussed, molecular function is too diverse, Simon and I chose it for some other work, also familiar. Within ontogenesis we have David who is a cell biologist. Happy to hand over the finished artefact to Alan Ruttenberg, who knows OWL.
RS:Jonathan Bard is now in Oxford, we'll do normalization and OWL and try and use his skills later in the process to answer questions. Need to keep records about questions that we ask.
RS:flow of activity is looking at the CTO, quick look at structure, what it describes, need to id the axes of classification, people in the room have had a look at the CTO so we have a basis to proceed. Will see terms like 'cell by lineage' - what does that mean? Explicit axis of classification, mature cell, immature cells, these are hidden in names on OBO terms. Having id'd the axes we need to decide on a primary axis, and we will decide on the asserted is-a relns that will form the backbone, others will be in supp ontologies, and others may exist in PATO - ploidy for example: PATO talks about ploidy, and then we take cells as leaves of the CTO, so 'Renshaw cell' will have 'x' ploidy through a restriction. Then can recreate the intermediate classes, haploid etc, and OWL will classify these and infer the ss relns, and it will be complete and dynamic. Cells that take part in secretions: GO process will talk about secretion. Maturity, morphology may or may not exist and we may need to build these if not in PATO, and we can make a simple tree and send off to PATO if they don't have it. Hope that we can do this formally, using an aspect of OntoClean.
RS:are people familiar with upper level ontologies a la BFO
DS:do you want to use one here
RS:no, it will be implicit. Ontoclean a way to eval ss to check right
RIGIDITY - things which are inherent to the essence of things; properties held by an entity for the whole of its duration, or for only part of it. A person is a person from start to end, but a student for only some of that time: student is a role, always a person. Things that are essential, and things that are essential to all of the entities. Held for all of existence: RIGID; for part of existence: ANTI-RIGID; by only some class members: SEMI-RIGID. Should id 1 axis as a rigid property, will make a safe tree. Go through the axes in CTO and annotate as ANTI, SEMI, etc
UNITY - parts and wholes - errors with is-a: ocean is a kind of water vs water is part of ocean
IDENTITY - necessity and sufficiency, and distinguishing between these.
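Sketch, not from the workshop: one standard OntoClean check is that an anti-rigid class should not subsume a rigid one. The class names, tags and the deliberately wrong second link below are illustrative assumptions, just to show how annotating the axes as RIGID / ANTI / SEMI could be used to flag a suspect is-a.

```python
# Hypothetical rigidity tags, in the spirit of the notes above.
RIGID, ANTI_RIGID, SEMI_RIGID = "rigid", "anti-rigid", "semi-rigid"

rigidity = {
    "person": RIGID,
    "student": ANTI_RIGID,
    "cell": RIGID,
    "mature cell": ANTI_RIGID,
}

# Asserted is-a links as (child, parent); the second one is deliberately wrong.
is_a = [
    ("student", "person"),
    ("cell", "mature cell"),
]

def rigidity_violations(links, tags):
    """Return links where a rigid class sits under an anti-rigid parent."""
    return [
        (child, parent)
        for child, parent in links
        if tags[child] == RIGID and tags[parent] == ANTI_RIGID
    ]

violations = rigidity_violations(is_a, rigidity)
print(violations)
```

Only the second link is reported; student under person is fine because the anti-rigid class is the child there.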
RS: is maturity inherent to a cell? No - anti rigid then. Will take some of cells in CTO and look at creating restrictions and create fillers for these. All of that in one day - and then iterate.
ME:limit discussion to 20 mins, and tomorrow revisit
RS:Go meetings things that are not decided so we decided to try and limit discussion
DS:There's a lot of thought into development A->B and C. can you comment? Stem cell divides to be a stem cell and a daughter cell that differentiates - myoblasts fuse to be a multinucleate muscle fibre, process of change has been considered.
RS:Am I correct that all blood cells come from hm stem cells? May want to say hm stem cell -> erythrocyte, but that assumes that all -> at least one erythrocyte, which is not true; want to say it the other way around, erythrocyte develops from hm stem cell. Discussion on whether stem cells are immortal in this context.
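Sketch, not from the workshop, of why the direction of the restriction matters. The individuals and the develops_from links are made up for illustration; the point is only that "every hm stem cell develops into some erythrocyte" can fail while "every erythrocyte develops from some hm stem cell" holds.

```python
# Toy individuals (hypothetical): three stem cells, one with erythrocyte progeny.
cell_type = {
    "hsc1": "hm stem cell", "hsc2": "hm stem cell", "hsc3": "hm stem cell",
    "e1": "erythrocyte", "e2": "erythrocyte",
}
develops_from = {"e1": "hsc1", "e2": "hsc1"}  # child -> precursor

def all_have_some(subject_type, has_link):
    """True if every individual of subject_type satisfies has_link."""
    return all(has_link(x) for x, t in cell_type.items() if t == subject_type)

# 'hm stem cell develops-into some erythrocyte': fails for hsc2 and hsc3.
forward = all_have_some(
    "hm stem cell", lambda s: s in develops_from.values()
)

# 'erythrocyte develops-from some hm stem cell': holds for e1 and e2.
backward = all_have_some(
    "erythrocyte",
    lambda e: cell_type.get(develops_from.get(e)) == "hm stem cell",
)

print(forward, backward)
```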
HP:is this a question of temporal processes and how we model those?
DS:no more about modelling change, RS said that he is a person, and was a student, how do we model a myoblast that has become a multinucl muscle fibre
RS:look at RO - they make a distinction between development and derivation, in one case identity is maintained, in the other not. glucose->fructose - is derived from, fetus->adult is development
PL:in case of a cell, once a cell divides is two cells, when you get older are still the same person. is about identity, A-> B and C, less clear from differentiation, many do both at the same time. clone could exist even if a cell doesn't. Lineages differentiate
DS:type and individual differences - clone and instances of that clone.
PL:not clear what CTO was modelling, sin of person class, 'person' should be just the cell ontology.
DS:for biol use needs to be the CTO not the CO
HP:just a label
Discussion on truth and beauty.
PL:do we want an artefact in OWL and how do we do that
RS:we will do in OWL not in 2 days. Plan to produce a plan OPPL script -> ontology. We may take a part some cells and mock up in OWL, and show it working, will be a fragment. Will not be able to do for all axes and see where we get to.
DS:hepatocyte appears nine times in the CTO
ME:we should focus on axes not exceptions
RS:CTO classified by nucleation, we can pull that out, muscle cells are polynucleate, cardiac - write a definition. They don't talk about a cell component, could add restrictions connecting cells to this.
HP:the experimentally modified bit is tricky, and we may want to disregard that
PL:any permanent cell line is a cross product with part of the rest of the CTO, and this is an issue. Cell lines have undergone process.
DS:issues with experimentally modified organisms.
HP:natural and non natural things were discussed in OBI and we decided for various reasons that this was not useful
Looking at the top level many classes
ME:cell by organism could be a candidate for primary axis
Cell by function - primary end goal or behaviour
Cell by histology - a classification by their microscopics - we think this is incomplete and what cell by histology means.
HP:is basically morphology, or stainability
ME:not by tissues - I expected that
PL:morphology in vivo, but also fixed
HP:could be any microscopy, light, scanning em
RS:they do not discuss brain cell, eye cell, etc - is species neutral,
AI: Axis - histology become morphology
ME:epithelial cell is a category of cells, and bipolar neuron is more precise, and this is a test case that we can look into.
Class by lineage - no def, suggests lineage by reproduction, mesoderm, ectoderm, endoderm, only some sponges don't fit the pattern. Based on higher eukaryote body plan, jellyfish etc not included.
HP:there are prokaryotes, and eukaryotes
PL:is a bacteria an anucleate cell
RS:we understand lineage, if we do agree
DS:we could think about triploblastic organisms, and get that right, and most of this ontology is about eukaryotes, cover mods not At though
RS:we should make sure we leave the hooks in for prokaryotes
DS:cell by nuclear number - no macro and micronucleus - we need that
PL:skeletal cell, cardiac muscle cell - contradicts having no anatomy, is a kind of multi-nucleate cell. not a ss
HP:we can see an axis of organism and also got a sensu terms in labels, we need to be aware of that.
ME:we are sensu - as appears in mammalia and also appears in all mammalia which are we using here
RS: I think we need to decide on a case by case basis
Cell by ploidy - diploid, haploid, polyploid,
HP:I want to say tetraploid
DS:we want to have a sep axis that gives nuclear number,
RS:we can make defined classes for that
RS:polynucleate is anything above 1, polyploid is anything above 2 - is this reasonable?
DS:multinucleate and polyploid is the standard language
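Sketch, not from the workshop, of the defined classes RS proposes: multinucleate as nuclear number above 1, polyploid as ploidy above 2. The Cell record, the class labels and the example counts are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    name: str
    nuclei: int  # number of nuclei
    ploidy: int  # chromosome sets per nucleus (1 = haploid, 2 = diploid)

def defined_classes(cell):
    """Classes a cell falls under, using the thresholds from the discussion."""
    classes = set()
    if cell.nuclei > 1:
        classes.add("multinucleate cell")   # anything above 1
    if cell.ploidy == 1:
        classes.add("haploid cell")
    elif cell.ploidy == 2:
        classes.add("diploid cell")
    elif cell.ploidy > 2:
        classes.add("polyploid cell")       # anything above 2
    return classes

# Illustrative values only; the nuclear count is not a measured figure.
fibre = Cell("example muscle fibre", nuclei=50, ploidy=2)
tetra = Cell("example tetraploid cell", nuclei=1, ploidy=4)

print(defined_classes(fibre))
print(defined_classes(tetra))
```

HP's tetraploid would just be a further defined class (ploidy == 4) sitting under polyploid.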
'Non terminally differentiated cells' stem cells are sibs of this
DS:there is a missing class 'terminally differentiated' is a list of cell precursor types
RS:why is cell by org not a sib of the above?
AI:we think cell by class is not really that useful, and we could lose that, seems to be a container class
RS:we have in vivo vs exp modified, we will ignore these, cell by class is removed. Nuclear number is not a primary axis
DS:number, histology, function will not work,
RS:Why? is function true some of the time?
JM:function is intrinsic by BFO
PL:if you use BFO function doesn't change, function is intrinsic to entity due to design, hammer meant to bang in nails a role for screws -
DS:if we are doing multicell orgs is cell by lineage as a starting point
PL: development is the primary axis
JM:what about function
PL:cross product
RS:most will be go processes and insulin secreting etc will be in Go, if it isn't should be
PL:photosynthesis is a process - metabolism, - are processes so will not only be the
RS: function in GO is a stage in a process, not function in BFO speak
PL:first pass, we can blitz the tree, and replace with involved in process from GO
DS:lumping cell types secretory cells, are very different, secretion is common
ME:apoptosis fated cell, is fate not function
HP:do defence cells include stingers?
RS:is defence a role?
PL:defender is a role, defence is a process.
RS:process here is a ragbag
PL:harsh, cell by function is badly defined. quality function is conflated with the cell type, need a h'archy of cell functions, not cells with a particular function
RS:in BFO function is intrinsic to bearer and in that sense is RIGID
PL:yes ROLE is ANTI/SEMI rigid
DS:annotate the existing cell ontology for the readers, or create a new ontology
RS:annotate
PL: we can make the cross product with GO process
HP:could have a critique paper
RS:will be part of the methodology paper
Histology - we will describe this as morphology -
HP:schmoo cell - in yeast, columnar epithelial cell
RS:columnar vs squamous
PL:epithelial is about lineage
RS:always columnar
PL:this can change and is the same cell -
ANTI-RIGID then
PL:some defs based on artefacts - gram negative e.g. in terms of assay
RS:columnar and squamous should be in PATO, staining etc we'll need to deal with.
PL:immature and mature - disjoints are a mess as well
Lineage
RS:cell from mesoderm is always mesodermal
DS:expts in stem cells, take a mesod sc into mice and see them turn up as neural, etc, cells can transdifferentiate from one lineage to another - epithelial cell, make into all cell types.
HP:ok but does this happen in vivo
DS:maybe, there is a possibility that that occurs. having said that - this ecto, endo, meso and germ is a good place.
RS:RIGID for now, caveat that understanding might change - these boundaries are not permanent in vitro -
DS:like elements, where uranium can decay into another element
ME:so lineage is the best candidate?
RS:we need to be sure that if we use lineage we can still work less complex organisms
DS:suggest then higher classification unicell, multi-cell -
HP:suggests this could be quality
PL:could use lineaged vs non lineaged cell
DS:is this for multicellular organisms, and do we need to classify cell types in yeast
HP:if we consider the data, then most of the data is higher euk
RS:we don't want to decide halfway through proceed
Cell by nuclear number - does it change over the life time of a cell,
PL:depends how you define cell, sync blastoderm - not a cell.
HP:how do we define cell, in this case by a plasma membrane - maximally connected compartment,
PL:defined by partonomy, stuff inside a plasma membrane, in this case syncytium is a cell
DS:I set this as an essay, bounding pm is one thing, store and replicate genetic material, and metabolise
PL:any plasma membrane could be included and cell is in the definition - bad, what does maximally connected cell compartment
HP:we could spend the rest of the day doing that
PL:we could think about multinuclear cells, comp questions
RS: Nuclear number is intrinsic - and if that changes
DS:in mitosis there are 2 nuclei, and sometimes it disappears, is SEMI and ANTI-RIGID
PL:yeast cell, 2 cells with .5 of a nucleus each,
RS:ploidy doubles in cell div
PL:depends how is replicating
HP:steady state then, are we deciding that we don't need to account for cell cycle - but we do need to deal with differentiation
PL:dividing cell is out of scope, cell which could divide is in scope. s phase cell is out of scope
DS:cells losing the ability to divide
Ploidy -
RS:haploid cells always haploid -
HP:I think we have approximated yes
PL:if we have 2 cells haploid cells, diploid is not the same cell
DS: tetraploid yeast cells - were selected by sel breeding
PL:retinal cell - LOH - change chromosome number, but still has original ploidy
PL:ploidy is RIGID, biologically at least, some edge cases where it is SEMI-RIGID - likely pathological, we can keep 'normal' in scope
Non terminally differentiated cells
RS:ANTI-RIGID - this will change.
Stem cells vs embryonic stem cells - these differentiate and are not left behind adult stem cells persist
SJ:stem cells - 3 properties, not terminally differentiated, divide without limit, each daughter cell has a choice, even tho all may be differentiated.
ME:blasts shouldn't be under stem cell
PL:need poss of stem cell's kids being differentiated or another stem cell
SJ:looking at alberts, stem cell has change, rather than by division not being stem cell.
DS:we need to know when the switch occurs
PL:sc can diff and be another cell type, or a period where two children may be stem cells or not, or has to gain stem cell ness following mitosis. Suggest that the gain of sc s tricky, sc should be able to differentiate and we need to have that in the ontology
PL:sc is not RIGID
RS:are we being consistent in assigning this? Ploidy doesn't change, due to bio convention, sc ness doesn't change
PL:edge case for ploidy are pathological
RS:in cell division we are talking about the base case
PL:unclear when new cells appear, steady state vs temporal
RS:stem cell, divide into 2 cell, no longer cell A -> B and C - was it a stem cell for all life of stem cell A stem cellness is RIGID
DS:B or C may be stem cells we don't know
DS:two diagrams, mitotic daughter may be same or different
RS:rigidity is about instances
DS:cell B looks like a stem cell after mitosis, B and C may differentiate
PL:concl. If sc ness is rigid B is not known to be a stem cell, or will be at some point, B or C may be a stem cell, we don't know yet - A which is always a stem Cell rigid, B and C - may or may not be a stem cell. Suggest not RIGID, easier to model.
RS:we describe that there is an issue is ANTI-RIGID/RIGID - consequences of both
Cell by organism - RIGID
RS: if I am a human cell, am I always a human cell
DS:some animals eat a cnidaria and put the stingers on the surface ? Need to find an example of this
RS:3 rigid - organism, rigid and function - BUT function is rigid, but the CTO def of function is wrong and many were processes we looked at - cell by function is therefore RIGID, cell by process is not RIGID. Want to look at these cases and lineage and organism relations.
RS:We rename cell by process. We have lineage and organism and there is a relationship. Organism would allow us to leave the hooks in.
We discuss whether how much if any taxonomy is needed. and what the upper level h'archy should be like
we consider
Euk
Animalia etc
Prok
PL:define by other properties
RS:ok cell, and then leaves, do everything by restriction - have a flat list instead of
HP:also a good way to build views into the ontology this way for different user group
RS:do it all by restriction, ultra norm
DS:how handle the terms that have the same names and different contexts
HP:difft id and synonyms,
RS:class name is just a URI and this is unique, these are numeric, can have the same label on all of them
PL:cell parent - can't hold the functions etc. Will need other sibs.
DS:we'll need a process h'archy from GO. DS: mammalian erythrocytes, avian are nucleate, we have one entry?
RS:We'll have two entries for this
PL:we don't seem to have another option, haploid, - is-a to cell,
RS:they'll only be kids of cell prior to classification using the reasoner
HP:if we pick 20 cell types then we can have a go at getting the restrictions then go back and generalize
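Sketch, not from the workshop, of the 'ultra norm' idea: leaves are asserted only under cell, everything else is a restriction, and the intermediate classes are inferred. A real reasoner does much more; here subsumption is approximated as "the defined class's conditions are a subset of the leaf's restrictions". Property names and the second leaf are hypothetical.

```python
# Leaves asserted only under 'cell', described by (property, filler) restrictions.
leaves = {
    "fast muscle cell": {
        ("has_ploidy", "diploid"),
        ("has_nucleation", "multinucleate"),
    },
    "hypothetical gamete": {
        ("has_ploidy", "haploid"),
        ("has_nucleation", "mononucleate"),
    },
}

# Defined (intermediate) classes: necessary and sufficient conditions.
defined = {
    "diploid cell": {("has_ploidy", "diploid")},
    "haploid cell": {("has_ploidy", "haploid")},
    "multinucleate cell": {("has_nucleation", "multinucleate")},
}

def classify(leaves, defined):
    """Infer, for each leaf, the defined classes whose conditions it meets."""
    return {
        leaf: sorted(
            name for name, conditions in defined.items()
            if conditions <= restrictions
        )
        for leaf, restrictions in leaves.items()
    }

inferred = classify(leaves, defined)
print(inferred)
```

Adding a new axis is then a matter of adding fillers and defined classes; the flat list of leaves is untouched.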
AI:select 20 cell types, good rep spread across CTO h'archies
ME:what about cell components RS:go thru processes in CTO see if they have something e.g. insulin secretion
AI:typo - Endopterygota
PL:pick some by hand, and some random so we've got reasonable coverage
HP:that's a separate exercise, test phase
DS:do we want 20 leaves and do we want higher categories as well, should be able to see some intermediate classes and write definitions
we decide criteria are the ones that we've discussed today e.g. epithelial cell
DS:apoptosis - once cell that is destined to die in programmed cell death - scaffold cell
RS:we have 51 in the list, we need to ensure that we only have leaves
DS:would retain these
we run through the list and pull out cases where we have non leaf nodes and look for a child term, and check for obsoletion (SJ has the s/sheet for this)
Helen and Phil lineage PATO -
none of the terms are in PATO so we looked at child terms of these classes for terms that might be in PATO
child terms mapped to PATO:
- branched - is in there
- superficial - almost quality of being on the surface, superficial to - relational quality - nothing that's the most superficial, nearest thing we could find
- keratinized - some quality that is a result of having undergone a process - not there in PATO
- periarticular chondrocyte - surrounding a joint - proximity and anatomy
- hypertrophic chondrocyte - OK in PATO
- columnar - rod shaped? sort of
- endothelial tip cell - 'tip' is the sensory or the tipness that's key - seems to be tip, is a region - continuant - in this case is the part that's growing, cf. apical meristem - are apical and basal should be in PATO - spatiotemporal quality is a parent in PATO, motility is a child term - growing tip - is a spatio temporal quality
- non-branched duct epithelial cell - unbranched is in PATO
- stromal - not in PATO - no position info
- white, brown fat cells - in PATO - presence of mitochondrial - can it be defined by the name? in this case the colour relates to the tissue ie anatomical
AI:HP and JM will talk to George G about PATO below and discussions here and report back.
-
RS: again that seems that it's all doable, for our purposes for describing the lineage we make our embryonic tissues and use derived from for cell that derive from these. We should trace up the tree that what's being said there is useful and consistent
SJ:we didn't look at all the context so there may be some assumptions here
RS:anatomy - leave on one side -
HP:anatomy issue is that these ontologies are many and so the work is much harder to do, we need a complete CARO
RS:we need to take some of our cell types and make some restrictions, or we do the restrictions and generate a definition.
some kinds are a 1-1 mapping from cell to go, also cases where very specific terms, lymph circulation, happens all over the place. Things can take place in many processes. e.g. phagocyte - different processes and motile cells -
HP:do we need to take the highest useful term, or to use a slim?
MI:we've taken higher terms, and we need to make many restrictions to allow for multiple processes
examples:
gut absorptive cell -> intestinal absorption
apoptosis fated cell -> apoptosis
Schwann cell -> ensheathment of neurons
circulating cell -> circulatory system process
contractile cell -> muscle contraction
defense cell -> NOTE: defense is bad in CTO: split into immune response and transport
electrically active cell -> cell-cell signaling
germ line cell ->
metabolising cell -> metabolism process
mitogenic signaling cell -> cell cycle
motile cell -> cell motility
nitrogen fixing cell -> nitrogen fixation
RS:some processes that are absent from GO -
MI:nitrogen fixation, copper accumulation
PL:any of them functions?
SJ:no functions not all under cellular processes in go
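Sketch, not from the workshop, of turning a few of the cell -> process pairs above into restriction text. The property name participates_in and the Manchester-style output are assumptions; the process labels are copied from the notes and have not been checked against GO, and the strings are not validated against any parser.

```python
# A few of the pairs listed above (labels as written in the notes).
cell_to_process = [
    ("gut absorptive cell", "intestinal absorption"),
    ("apoptosis fated cell", "apoptosis"),
    ("contractile cell", "muscle contraction"),
    ("motile cell", "cell motility"),
]

def to_axiom(cell, process, prop="participates_in"):
    """Render one existential restriction as Manchester-style text."""
    return f"Class: '{cell}' SubClassOf: {prop} some '{process}'"

axioms = [to_axiom(cell, process) for cell, process in cell_to_process]

for axiom in axioms:
    print(axiom)
```

A cell that takes part in several processes would simply get one such line per process.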
sex - PATO
morphology, size
maturity PATO
nuclear number PATO
ploidy PATO
potentiality - totipotent, unipotent
proximity/location - juxta, extra, neuron associated cell
Sex PATO 0000047 lacks genotypic sex
Ploidy PATO 0001374. Lacks heterokaryon should be present under nucleate/ploidy
Nucleate quality PATO 001404 lacks syncytium
Heterokaryon should be under ploidy, not nucleate quality
Shape PATO 0000052 Lacks bipolar, polarized, apical, basolateral, columnar, cuboid, stratifies, dendritic
Size PATO 0000117 lacks large, medium, small; has gigantic, dwarf
Deviation (from normal) PATO *** has hypertrophic, hypotrophic
Structure PATO 0000141 has lots of words like matted, spongy, not relevant for cells
Fragility is not a structure
Maturity PATO 0000261 has terms relevant to animals
Lacks: stem, blast, differentiated - these are an expression of potentiality
Lacks: embryonic, foetal, neonatal, adult
PL:they have pubescent etc, above are an omission
PL:mature, immature and juvenile are there
Cellular potency PATO 0000197
Relational spatial quality: PATO 0001631 has proximal to, distal to
Lacks next to / juxta / adjacent
Apoptotic PATO 0000638 is under Morphology, defining appearance of an apoptotic cell
Conclusion:some easy additions, but not bad
Example terms and their restrictions
Looking at 25 term list, select 2 to make restrictions
Defining CD8-positive alpha, beta T cell
ME:in Go there is a similar process, GO:43369 how do we relate
kind-of 'lymphocyte'; function: cytotoxic cell; differentiates in the thymus; developed from haemopoietic stem cell; peripheral blood; has lineage 'mesoderm' in CTO - nicely done, many intermediates
RS:would expect to see a series of derived from stages back to mesodermal cell.
SJ:next derived from is lymphocyte, we only need to capture lymphocyte only, this cell is derived from lymphocyte, will get back to mesoderm
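Sketch, not from the workshop, of SJ's point: if derived-from is treated as transitive, capturing only the next step is enough to get back to mesoderm. The chain below is a simplified placeholder (the real CTO has many intermediates), and the dict-based walk stands in for what a reasoner would do with a transitive property.

```python
# One asserted step per cell (simplified, illustrative chain).
derives_from = {
    "CD8-positive alpha-beta T cell": "lymphocyte",
    "lymphocyte": "haemopoietic stem cell",
    "haemopoietic stem cell": "mesodermal cell",
}

def ancestors(cell, links):
    """Follow derived-from links to the root, guarding against cycles."""
    chain, seen = [], set()
    while cell in links and cell not in seen:
        seen.add(cell)
        cell = links[cell]
        chain.append(cell)
    return chain

lineage = ancestors("CD8-positive alpha-beta T cell", derives_from)
print(lineage)
```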
ME:there's a GO term that is about this lineage commitment
RS:one of its ancestor cells takes place in that and this is the outcome, is it a property of this?
DS:Suspect immature T cell doesn't undergo a mitosis prior to being committed.
RS:do all of them take part in that process, if some do need disjunction
RS:what makes something a lymphocyte
DS:circulating white blood cell which is part of the adaptive cellular immune system
PL:what's not part
DS:dendritic cells
HP:circulatory system anatomical
PL:part of the adaptive immune system, and something that distinguishes between it and dendrites
DS:thinks dendritic cell is also part of adaptive immune cell
HP:so what distinguishes
DS:macrophages present antigen too, need another difference Consulting Alberts - Mol Biol of the Cell
PL:wikipedia - leukocytes - table showing differences, lymphocytes, T, B, NK cells
DS:dendritic cell is innate not adaptive immune system in Alberts. Narrow def back to adaptive cells of immune system are the lymphocytes. Adaptive is that cell surface molecules have seq that are specific for particular antigen not general. Done by molecular recognition. CD8 etc is a cell surface marker, not about function. NK cells are innate immune system. NK cell is not a lymphocyte
SJ:is this a kind of
DS:T-cell is short form of T lymphocyte
RS:gene expression roles about alpha beta etc, CD4 and CD8 - we can add restrictions for these, no ontology for doing that
PL:CD8 is a partonomy of cell surface
HP:then we need an extension of the GO component plasma membrane
RS:we need a tiny marker ontology to deal with this now so we can proceed
ME:we can add the proper classification later
PL:immune system process part of GO might give enough definition, could allow lymphocyte diffn from all the rest
DS:www.bioscience.org/atlases/cdclass/cdclass.htm - all CD nos are there
RS:we can do the roles, stuff in GO that is in processes e.g.
RS:nuclei? 1, it's 2 n, shape
DS:shows some videos of them doing different things, they are motile, this is in PATO, they are polarized
DS:do we want to add images -
RS:we can do that technically
RS:organism - mammalian?
DS:don't know about non mammals, wikipedia thinks vertebrates -
RS:Sex is not a defining characteristic in this case, we decide not to say anything
DS:shape is an artefact of way grown
RS:done process, lineage, morphology, sex, size, potentiality ? can become stimulated or is virgin
DS:some of these go onto to be memory cells,
RS:are these then a leaf?
PL:memory cells are a state of a single cell,
DS:virgin, activated, etc fact that are disjoint in CTO could be wrong. Looking at Alberts, naive, effector, memory cells says that these are matured. Can change further in response to signals.
RS:are all the things we have said so far true of e.g. memory cells -
DS:there are different roles for the different maturation state - naive' potential to respond , role could be monitoring, still in the immune process, effector cells have some other role.
AI:this term needs checking and we need to decide if these are kinds of this cell, or if there is some other aspect of this cell.
Defining Fast Muscle Cell
Most muscle is a mix of fast and slow fibres: speed of contraction, whether they get energy from anaerobic/aerobic - changed by training for athletes. Flight are fast, posture slow - they don't fatigue as easily as fast do. True for vertebrates, also attach to skeleton hence skeletal muscle. Three classes: cardiac, smooth, skeletal.
Morphology/Shape = cylindrical HP:is this instead thing that can be observed value - striation PL:
Size - this is relative, DS thinks that we want to say something about size, and maybe volume. Here we might need large, medium, small,
RS:thinks we can use a data type to add micron information and so we'd be able to do this
Function/Process - skeletal muscle contraction process, but is fast twitch, some complexity in GO, we think there's a missing ss in GO - 'fast twitch skeletal muscle fibre contraction' new parent 'involuntary skeletal muscle contraction'.
DS:Movement of the skeleton, contractile to move the skeleton?
RS:is that muscle contraction vs the cell's property. Some discussion about this. We note and leave it out for now.
PL:skeletal muscle contraction doesn't move in the same way that an amoeba moves.
Role: thermoregulation (shivering), lactose metabolism (more cardiac muscle), RS:we don't need to be complete, we are being descriptive, we need to think about questions we want to ask.
Ploidy - diploid
Nucleate - multi-nucleate (syncytial) - indeterminate number and changes so not the same as multinucleate - changeable, this is the difference in flies at least, should be in PATO. Nuclei come from division.
Potential:fully differentiated
Mature: mature
Lineage:develops from fast muscle myoblast, mesenchymal
Sex:not relevant
Anatomy:cell type is a necessary component of skeletal muscle
DS:we also need somatic cell as a parent for the mesoderm, ectoderm, endoderm in the h'archy.
AI:decide if use has quality means need to use cardinality, or has morphology - then there'll be a relationship explosion - can only be columnar - patterns. Not an issue in OWL if done sensibly. Need to handle cardinality max 1.
PL:thinks need a sub property so that can be blue and columnar.
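Sketch, not from the workshop, of PL's suggestion: give each kind of quality its own sub-property of has quality with at most one value, so a cell can be blue and columnar but not columnar and squamous. The sub-property names are hypothetical, and the check below only mimics what a max 1 cardinality with disjoint fillers would catch in OWL.

```python
# Hypothetical sub-properties of has_quality, each allowed at most one value.
FUNCTIONAL = {"has_shape", "has_colour"}

def cardinality_violations(assertions):
    """Return functional sub-properties given more than one distinct value."""
    values = {}
    for prop, value in assertions:
        values.setdefault(prop, set()).add(value)
    return {
        prop: sorted(vals)
        for prop, vals in values.items()
        if prop in FUNCTIONAL and len(vals) > 1
    }

ok = [("has_colour", "blue"), ("has_shape", "columnar")]
bad = ok + [("has_shape", "squamous")]

print(cardinality_violations(ok))
print(cardinality_violations(bad))
```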
Day 2 CTO Normalization Experiment
RS:things to note to do
AI:PATO and GO need to be careful that the parts used that we have all the disjoints or the reasoning will not work. PATO should be easier.
SJ:already got GO and PATO imported into OWL
RS:we'll add some axioms, with trees easier to manage disjoints as sibs in a tree are disjoint. When we do the translation we need to retain the CTO ids as annotation properties and use our own ID scheme, will have semantic free ids, and english labels. Default CTO labels will be used, if we do not, CTO label will be an annotation property. We have 25 cell types and we have cell types A, need to add predecessor and all the chain, we will fake that for now and will go A-B-F-G so we have some transitivity working but not all. Asks David, mesodermal cell, what's the start?
DS:gastrulation, then mesoderm
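Sketch, not from the workshop, of the ID scheme RS describes: mint semantic-free numeric ids and keep the CTO id and label as annotations. The prefix CON_, the zero-padding width and the placeholder CTO ids are assumptions for illustration.

```python
from itertools import count

def mint_ids(cto_terms, prefix="CON_", width=7):
    """Assign semantic-free ids, keeping the CTO id and label as annotations."""
    counter = count(1)
    minted = {}
    for cto_id, label in cto_terms:
        new_id = f"{prefix}{next(counter):0{width}d}"
        minted[new_id] = {"label": label, "cto_id": cto_id}
    return minted

# Placeholder CTO ids, not real identifiers.
terms = [
    ("CL:EXAMPLE1", "mesodermal cell"),
    ("CL:EXAMPLE2", "fast muscle cell"),
]

minted = mint_ids(terms)
for new_id, annotations in minted.items():
    print(new_id, annotations)
```

Because the id carries no meaning, two entries can share an English label (e.g. the mammalian and avian erythrocyte case) without clashing.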
HP:there's a missing ss relation in the CTO for mesodermal cell.
RS:so we'll need mesodermal cell, etc. V pleased at progress
RS Review: looked at RIGID properties, couldn't find a RIGID property that is for primary axis, true of euk and prok. Lineage nice for prok, need scaffold for prok ended up replicating taxonomy - bad. With Mikel and Alan R will check the RIGIDITY analysis is correct. We looked at the cell types looking at implicit classification axes, size, shape etc. Looked at supporting ontologies, PATO and GO cover most of it. Will make some requests from PATO and usual issues with anatomy. Nice methodological thing, restrict to euk cells, and excluded cell by experimental process, 50 representative leaf node cells that cover the axes we id'd. Chosen 25 to do by hand now, and 25 to test what we have done. Finished by looking at 2 in detail and restrictions on those. Some of them were easy, some were tricky
Ploidy, nucleation, morphology, sex, all quite easy, lineage, organism, (look up in CTO, record what predecessor cell was) size, - more problems, most discussion on processes (CTO function). Propose to split the lists and divide in pairs and do the easy bits. In pairs will get as far as we can. We want something about everything.
RS:T cell example yesterday, cell quiescent or growing
DS:maturity?
PL:on in immune systems, spore is quiescent, not naive
JM:GO has cell cycle etc - quiescent etc
Task to complete classification for selected terms
3 pairs (David/James, Helen/Phil, Mikel/Simon) provided restrictions for the selected ~25 terms. ME and SJ are collating these on a single sheet which will form the basis of building the OWL file and also a data collection exercise going forward.
DS: Hard things were morphology, process
PL: Test cases - platelets not formed by division
PL: giant cell - fusion rather than division events
ME:multi process terms, antigen presentation, and lymph circulation. Need two properties, does a function and participates in - active vs passive
HP:also had a case of forming a structure - we assumed that this was not a partonomy reln
SJ:also motility, needs to be added. GO has annotations for genes and so does PATO - cell has phenotype of being motile,
ME:cell motility in GO is involved in cytoskeleton, term is not about cell motility.
RS:PATO motile is better for cells. GO are for processes and we need to be careful that we are using a GO process that is appropriate to a cell and not a protein.
Group review of the classification spreadsheets
discussion on the definitions of pluripotent, totipotent, etc. - we may not have used these consistently.
we discuss not overloading the cells with their future potential - e.g. if it is some prior thing to a phagocyte then we don't need to annotate that, we can query by the relationship types. There is a tendency to overload terms with future potential.
we add cell surface antigen - plays the role of a marker
Discussion on using GO terms - GO has protein terms, we need cell terms and these are at different granularity or poss should be in a different ontology anyway. If a protein is involved, is the cell getting all the processes that the protein is involved in by transitivity? Not clear. GO does provide a lot of what's needed for cross products. We are coming up against some granularity issues.
DS: e.g. new term 'neuronal signalling' is more precise for cell needs, but some are ok as subclass of cell-cell signalling
RS:is neuronal same as nerve signalling
DS:yes
Seeing a lot of incompleteness for GO term addition in this exercise. Presumably due to not being that familiar with the GO, GO is focussed at a different level.
we get to term 10 on the list.
RS:shall we carry on for the next 15 or shall we start encoding
HP:can we continue and keep defining
AI:Simon and Mikel will start coding up to term 10 and the rest of us will continue defining.
DS:we will stop doing defining at 3 and come back together.
Looking at the RO - derives-from and transformation-of as appropriate
e.g. osteocyte derives-from osteoblast; e.g. reticulocyte and erythrocyte
located-in etc.
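The earlier point about not overloading cells with future potential can be illustrated with a small sketch: if each cell records only its immediate predecessor, earlier predecessors can be recovered by following the relation. This is a minimal Python sketch and not the workshop's OWL encoding. The dictionary and function name are hypothetical, only the osteocyte/osteoblast link comes from the notes, and the second link is an assumption added purely for illustration.

```python
# Hypothetical direct derives-from links (cell -> immediate predecessor).
# Only osteocyte -> osteoblast comes from the notes; the second link is
# an illustrative assumption, not a workshop decision.
DERIVES_FROM = {
    "osteocyte": "osteoblast",
    "osteoblast": "osteoprogenitor cell",
}


def predecessors(cell, links=DERIVES_FROM):
    """Return all predecessors of `cell`, nearest first, by following links."""
    chain = []
    seen = {cell}  # guards against accidental cycles in the data
    current = links.get(cell)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = links.get(current)
    return chain


print(predecessors("osteocyte"))
```

With links stored this way, a question such as "which cells are prior to X" becomes a query over the relation instead of an extra annotation on every term.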
ME:question about actively doing something vs passive is that agent in vs participates in?
AI:cellular components have not been discussed in the context of this work. We need to decide what to do.
AI:prepare a note about the schema, and assn RO choices with the schema - Manchester people
ME and SJ start coding in OWL and others continue on the s/sheet.
AI:SJ to use OWL perl api to find out leaf nodes - DONE - 450 in total
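The leaf-node step can be sketched independently of any OWL API: a leaf is a term that never appears as a parent in an is-a pair. The sketch below is plain Python and does not reproduce the Perl tooling SJ used; the function name and the toy hierarchy are assumptions for illustration only.

```python
def leaf_nodes(is_a_pairs):
    """Return terms that never appear as a parent in (child, parent) pairs."""
    children = {child for child, _ in is_a_pairs}
    parents = {parent for _, parent in is_a_pairs}
    return sorted(children - parents)


# Toy hierarchy, assumed for illustration only (not the real CTO structure).
IS_A = [
    ("fibroblast", "cell"),
    ("muscle cell", "cell"),
    ("cardiac muscle cell", "muscle cell"),
]

print(leaf_nodes(IS_A))
```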
back to reviewing the sheet:
Term 11, gut absorptive cell.
Term 12, secondary spermatocyte - diploid, depends on how we define haploid and diploid, we agree on diploid.
Term 13 zoospore - flagellate non sex repro cells from fungi plants protists. added some more info.
Term 14. APUD cell. Complete
Term 15. phagocyte sensu Nematoda and protostomia. Complete
Term 16. inhibitory interneuron
Term 17. Diploid cell - this may disappear - we don't think this is a leaf in the normalized CTO. We delete the row.
Term 18. epiblast - 'epiblast' cell definition will be epithelia that are part of the epiblast. Columnar here is defined as 2x as high as wide. We think this is also missing an epithelial cell is-a parent, pluripotent
Term 19. xylem element - vessel element - lignified. was a cell, not now a cell. - will need an alternate parent if a cell definition is not alive.
Term 20. platelet - see above.
Term 21. fibroblast - OK
Term 22. M cell - added some information -
Term 23. multi-nucleated giant cell -
AI:ploidy is looking at the nucleus, diploid and nucleate are not disjoint. Diploid is a feature of the nucleus not the cell.
AI:pleiomorphic - should be changed in PATO to has shape 'shape'.
Term 24. ascospore
discussion on maleness and femaleness - male gamete - has potential to be male or female - not who made them, this comes from anatomy. Sperm are neither male nor female, they have a mating type - as they are haploid. Yeast is opposite, haploid have a mating type. Do we mean involved in sexual reproduction?
RS:N/A is where we made no comment, usage is somatic etc, and sex is not relevant. If I have an egg and a sperm, my egg is female, and my sperm is male or female but not both
PL:scrotal sac epithelia is in a male - ovary is in a female, not true of a skin epithelial cell.
DS:male germ line, or female germ line - specific and not problematic
AI:Secondary spermatocyte - is now male germline not male female
Term 25. amnioserosal cell - problematic, skipped review
Term 26. cardiac muscle cell -
Discussion Conclusions
AI:HP to clean up the wiki, sort out the AI into a separate page
AI:Everything except Dave Randall's videos of discussion, including notes, ontologies and spreadsheets, will be made publicly available.
Prototype normalized CTO in OWL
imported PATO terms, properties from RO, added a couple. Flat list of cells chosen, did 3/4 definitions - 'stratified keratinized epithelial stem cell' used derived from ectodermal cell, morphology,
fibroblast - done
secondary spermatocyte
9 inferred classes, and some multiple inheritance
ME:it's nearly the same amt of work but they are all explicit
RS:restriction matrix - would work well for me.
RS:Suggest that we dump out an OBO version of the inferred h'archy
PL:look at SIBS, which ones have most kids, and tells which will classify new cell types
AI:Create an OBO version of the inferred h'archy to show JB - SJ/ME
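How a restriction matrix leads to inferred parents and multiple inheritance can be shown with a toy sketch. It assumes each defined class is a set of (property, value) restrictions and treats a cell type as a child of every defined class whose restrictions it satisfies. This is a deliberate simplification: a real OWL reasoner does far more, and all class names, cell names and values below are hypothetical. The count at the end follows PL's suggestion of seeing which classes have the most kids.

```python
from collections import Counter

# Defined classes as sets of (property, value) restrictions (all hypothetical).
DEFINED = {
    "motile cell": {("has_quality", "motile")},
    "ectoderm-derived cell": {("derives_from", "ectodermal cell")},
}

# One row of the restriction matrix per cell type (placeholder names and values).
CELLS = {
    "cell type X": {("has_quality", "motile"), ("derives_from", "ectodermal cell")},
    "cell type Y": {("has_quality", "motile")},
}


def classify(cells, defined):
    """Map each cell to the defined classes whose restrictions it satisfies."""
    return {
        name: sorted(cls for cls, required in defined.items() if required <= restrictions)
        for name, restrictions in cells.items()
    }


inferred = classify(CELLS, DEFINED)
# Count inferred children per defined class.
kids = Counter(parent for parents in inferred.values() for parent in parents)

print(inferred)
print(kids.most_common())
```

Here "cell type X" ends up under two inferred parents without either link being asserted by hand, which is the kind of multiple inheritance noted above.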
RS:Simon now has all the 440 ish leaf nodes, got 425 to do. We will partition these into coherent lumps and make a s/sheet template with columns per bio category or portion of category and we'll dish them out.
AI:SJ and ME will complete the s/sheet with leaf node terms make it available for people to work on. We decide to go for the bio partition.
HP:will Ontogenesis need a report, or is it just the wiki? Apparently just the wiki is OK.
SJ:s.sheet the format was ad hoc, if we have some consistency - process where there are multiple terms add in GO or PATO id.
DS:shall we all use the wiki? Agreed that HP will clean up
Categories for the terms to be handed out
Secretory Cells - HP
Electrically Active cells and all children - PL
Contractile Cells - DS
Circulating Cells - DS (all the immune cells)
Structural Cells - JM
AI:we will ask JB to review these
AI:SJ will make 12 spreadsheets of bits of hierarchy; some have been spoken for, some will be handed out and some are tricky and may be left out.
RS:suggest when done a sig proportion of the ontology have another get together, physically or virtually
AI:Deadline for term submission is - August 15th 2008.
AI:ME will make a spreadsheet parser -
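A spreadsheet parser of this kind might look roughly like the sketch below. It is only an illustration in Python using the standard `csv` module, not ME's parser. The column names and the function `parse_sheet` are assumptions; the two example rows reuse values agreed in the review (secondary spermatocyte: diploid, male germline). Blank cells are skipped so that partially completed sheets still parse.

```python
import csv
import io

# Hypothetical sheet layout: one row per term, one column per category.
SHEET = """term,ploidy,nucleation,morphology,sex
fibroblast,,,,
secondary spermatocyte,diploid,,,male germline
"""


def parse_sheet(text):
    """Return {term: {column: value}} and ignore blank cells."""
    restrictions = {}
    for row in csv.DictReader(io.StringIO(text)):
        term = row.pop("term").strip()
        restrictions[term] = {
            column: value.strip()
            for column, value in row.items()
            if value and value.strip()
        }
    return restrictions


print(parse_sheet(SHEET))
```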
Summarized Workshop Action Items and Status
AI: Current CTO hierarchy for histology becomes morphology in the normalization experiment - DONE. Added to the list of categories.
AI:CTO 'cell by class' is not thought to be a useful classification and this is a container class that could be removed. DONE. Not used as an axis of classification during normalization.
AI:select 20 cell types from current CTO, current leaf nodes, implicit classification examples included e.g. mature/immature, large etc - DONE and annotated on a spreadsheet during the workshop.
AI:Typo in CTO - Endopterygota is spelled Endopyerygota, communicate to CTO. HP. DONE, added as CTO tracker item [4]
AI:HP and JM will report the discussion to George Gkoutos of PATO and communicate any new terms or relevant PATO experiences. DONE - HP has mailed George to ask for a meeting and will provide a list of comments and suggestions for that
AI:When using has-quality consider if we use a general has_quality and cardinality or specific relationship type derived from RO terms. No technical issue in OWL, just need to decide on cardinalities, RO derivation and naming.
AI:when importing PATO/GO add their respective disjoints or reasoning will fail.
AI:build a prototype during the workshop for a small number of terms PARTIALLY DONE by ME, SJ - will be completed for terms we defined on the spreadsheet
AI:Cellular components - we did not make any classification based on these, though the property of having a flagellum, e.g., could be made as a cross product term with the relevant part of GO. Need to decide if we want to use this as a set of potential cross products and hence axis of classification
AI:SJ to identify CTO leaf nodes. DONE there are 450 in total.
AI:we decided that ploidy refers to the nucleus, not the cell. Diploid and nucleate are not disjoint: diploid is a feature of the nucleus not the cell, and is not related to how many nuclei are present in a cell when the cell is multinucleate.
AI:pleiomorphic from PATO could be redefined as - has shape 'shape', suggest to GG
AI:HP to clean up the WIKI DONE. Anyone spotting typos etc feel free to edit.
AI:Consensus that everything except Dave Randall's videos of discussion including notes, ontologies and spreadsheets will be made publicly available on this wiki which will be cross linked to the ontogenesis one.
AI:SJ/ME will create an OBO version of the inferred h'archy to show JB If this could be done by 2 July HP will show JB
AI:HP will talk to JB about reviewing the definitions and normalized CTO
AI:SJ and ME will produce a template spreadsheet with leaf node terms and make it available for people to work on. We decide to go for the bio partition ie all secretory cells will be annotated together to make things quicker. We aim to do at least 50% of the leaf nodes.
AI:DS will annotate contractile cells and circulating cells and their children
AI:HP will annotate secretory cells and children
AI:PL will annotate electrically active cells and all children
AI:JM will annotate structural cells.
AI:Deadline for submitting annotated sheets back to Mikel and Simon will be 15 August. Partially completed sheets are OK as they'll be parsed using OPPL.

