Collaborative Ontology Normalization

From ImageWeb


Collaborative Ontology Normalization Workshop II

A second workshop to validate and extend the ontology developed in workshop I.

Details here http://imageweb.zoo.ox.ac.uk/wiki/index.php/Collaborative_Ontology_Normalization2

Collaborative Ontology Normalization Workshop

Oxford, Wolfson College 24-26 June 2008. Local Organizer: David Shotton

Notes from an experiment in collaborative normalization of the Cell Type Ontology.

Attendees

Mikel Egaña, University of Manchester

Simon Jupp, University of Manchester

Philip Lord, University of Newcastle

James Malone, European Bioinformatics Institute

Helen Parkinson, European Bioinformatics Institute

David Randall, Manchester Metropolitan University

David Shotton, University of Oxford

Robert Stevens, University of Manchester

PHOTOS

[1] Group Photo 1

[2] Group Photo 2

[3] - Archive

Day 1. 25 June 2008

Introduction from Robert Stevens Collaborative Ontology Normalization

We hope to make some observations of the discussion and process, to inform us about tool development and about what kind of support we need for the non ontology representation side of collaborations.

Reasons for this group: it is the right size; David is our domain specialist for cells; all of us are biologists of some sort, possibly excepting DR; we all know OWL to some good degree; and all are familiar with the process of normalization as practised by Manchester and others, and proposed by Alan Rector (** add ref).

Aims:

RS: As a team, take the OBO CTO, a hand-crafted taxonomy of cell types, with development, a 'tangle'. When built by hand there will be errors: Mikel and I have looked at GO, normalizing and rebuilding the hierarchy, and 1/10 has a missing or erroneous subsumption relation. The process of normalization will give models that are maintainable and 'highly' axiomatized, and that make better inferences. We will go through the normalization and by the end come out with a plan for how to normalize the CTO: primary and secondary axes of classification, restrictions on cell types, etc. Then we may rebuild some of it. After the event we will come up with a plan; BioHealth uses an ontology pre-processing language for generating and transforming OWL format ontologies. Mikel will program in OPPL and we will get a complete normalized OWL ontology. Through Dave's observation we will also have a record of what we have done and how we decided what to do, and we will annotate bits of the CTO with decisions as we go along. Supported by OntoClean, a technique for evaluating decisions.

DS:what about publishing the ontology

RS:OBO are reworking the CTO, this can be a product that informs the process

HP:why the CTO?

RS:small enough, 900 ish terms; domain is focussed (molecular function is too diverse); Simon and I chose it for some other work, so it is also familiar. Within ontogenesis we have David, who is a cell biologist. Happy to hand over the finished artefact to Alan Ruttenberg, who knows OWL.

RS:Jonathan Bard is now in Oxford, we'll do normalization and OWL and try and use his skills later in the process to answer questions. Need to keep records about questions that we ask.

RS:flow of activity is: look at the CTO, quick look at structure and what it describes, then identify the axes of classification. People in the room have had a look at the CTO so we have a basis to proceed. We will see terms like 'cell by lineage' - what does that mean? That is an explicit axis of classification; others (mature cell, immature cell) are hidden in the names of OBO terms. Having identified the axes we need to decide on a primary axis, and decide on the asserted is-a relations that will form the backbone; others will be in supporting ontologies, and some may exist in PATO. Ploidy, for example: PATO talks about ploidy, so take the cells that are leaves of the CTO, e.g. 'Renshaw cell' will have 'x' ploidy through a restriction. Then we can recreate the intermediate classes (haploid etc.) and OWL will classify these and infer the subsumption relations, which will be complete and dynamic. For cells that take part in secretion, GO process will talk about secretion. Maturity and morphology may or may not exist; we may need to build these if they are not in PATO, and we can make a simple tree and send it off to PATO if they don't have it. Hope that we can do this formally using an aspect of OntoClean.
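The pattern RS describes (leaves carry restrictions, intermediate classes are defined and then inferred) can be sketched in plain Python. This is only an illustration under stated assumptions: the property names `has_ploidy` and `participates_in`, the cell types other than 'Renshaw cell', and all the filler values are hypothetical examples, and `classify` is a toy stand-in for an OWL reasoner, not a real one.

```python
# Toy illustration of normalization: leaf cell types are asserted only
# under 'cell' and carry restrictions; intermediate classes are defined
# by a restriction and their members are inferred, not asserted.
# All property names and filler values here are hypothetical.

leaf_restrictions = {
    "Renshaw cell": {"has_ploidy": "diploid"},
    "sperm": {"has_ploidy": "haploid"},
    "type B pancreatic cell": {
        "has_ploidy": "diploid",
        "participates_in": "insulin secretion",
    },
}

# Defined classes: a leaf belongs to one if it satisfies every restriction.
defined_classes = {
    "haploid cell": {"has_ploidy": "haploid"},
    "diploid cell": {"has_ploidy": "diploid"},
    "insulin secreting cell": {"participates_in": "insulin secretion"},
}


def classify(leaves, definitions):
    """Return {defined class: [leaf, ...]} by matching restrictions."""
    inferred = {name: [] for name in definitions}
    for defined, required in definitions.items():
        for leaf, asserted in leaves.items():
            if all(asserted.get(prop) == value for prop, value in required.items()):
                inferred[defined].append(leaf)
    return inferred


inferred = classify(leaf_restrictions, defined_classes)
for defined, members in inferred.items():
    print(defined, "->", members)
```

The point of the design is that adding a new defined class (say, by nuclear number) needs no edits to the leaves' asserted parents; the tangle is recomputed rather than maintained by hand.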

RS:are people familiar with upper level ontologies a la BFO?

DS:do you want to use one here

RS:no, it will be implicit. OntoClean is a way to evaluate subsumptions to check they are right

RIGIDITY - things which are inherent to the essence of things: properties held by an entity for the whole of its existence. E.g. a person is a person from start to end, but a student for only some of that time; student is a role, always a person. Distinguish things that are essential, and things that are essential to all of the entities in a class. Held for all of existence: RIGID. Held for only part of existence: ANTI-RIGID. Rigid for only some class members: SEMI-RIGID. Should identify the primary axis as a rigid property; that will make a safe tree. Go through the axes in CTO and annotate as ANTI, SEMI, etc

UNITY - parts and wholes - catches errors with is-a, e.g. 'ocean is a kind of water' vs 'water is part of ocean'.

IDENTITY - necessity and sufficiency, and the distinction between these.
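One check that follows from rigidity tagging can be sketched as below. It uses the standard OntoClean constraint that an anti-rigid property should not subsume a rigid one. The tags reuse the person/student example from the notes; the deliberately wrong second is-a link and all names in the code are illustrative assumptions.

```python
# Sketch of an OntoClean-style rigidity check on asserted is-a links.
RIGID, ANTI_RIGID, SEMI_RIGID = "rigid", "anti-rigid", "semi-rigid"

# Hypothetical tags, following the person/student example.
rigidity = {"person": RIGID, "student": ANTI_RIGID}

# Asserted is-a links as (child, parent); the second one is deliberately wrong.
is_a = [("student", "person"), ("person", "student")]


def rigidity_violations(is_a_links, tags):
    """Return links where an anti-rigid parent subsumes a rigid child."""
    return [
        (child, parent)
        for child, parent in is_a_links
        if tags[parent] == ANTI_RIGID and tags[child] == RIGID
    ]


violations = rigidity_violations(is_a, rigidity)
print(violations)
```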


RS: is maturity inherent to a cell? No - anti rigid then. Will take some of cells in CTO and look at creating restrictions and create fillers for these. All of that in one day - and then iterate.

ME:limit discussion to 20 mins, and tomorrow revisit

RS:in GO meetings there are things that are not decided, so we decided to try and limit discussion

DS:There's a lot of thought into development A->B and C. can you comment? Stem cell divides to be a stem cell and a daughter cell that differentiates - myoblasts fuse to be a multinucleate muscle fibre, process of change has been considered.

RS:Am I correct that all blood cells come from haematopoietic stem cells? We may want to say haematopoietic stem cell -> erythrocyte, but that assumes that every such stem cell gives at least one erythrocyte, which is not true. We want to say it the other way around: erythrocyte develops from haematopoietic stem cell. Discussion on whether stem cells are immortal in this context.

HP:is this a question of temporal processes and how we model those?

DS:no more about modelling change, RS said that he is a person, and was a student, how do we model a myoblast that has become a multinucl muscle fibre

RS:look at RO - they make a distinction between development and derivation, in one case identity is maintained, in the other not. glucose->fructose is derived from, fetus->adult is development

PL:in case of a cell, once a cell divides is two cells, when you get older are still the same person. is about identity, A-> B and C, less clear from differentiation, many do both at the same time. clone could exist even if a cell doesn't. Lineages differentiate

DS:type and individual differences - clone and instances of that clone.

PL:not clear what CTO was modelling, sin of person class, 'person' should be just the cell ontology.

DS:for biol use needs to be the CTO not the CO

HP:just a label

Discussion on truth and beauty.

PL:do we want an artefact in OWL and how do we do that

RS:we will do in OWL not in 2 days. Plan to produce a plan OPPL script -> ontology. We may take a part some cells and mock up in OWL, and show it working, will be a fragment. Will not be able to do for all axes and see where we get to.

DS:hepatocyte appears nine times in the CTO

ME:we should focus on axes not exceptions

RS:CTO classifies by nucleation, we can pull that out: muscle cells are polynucleate, cardiac - write a definition. They don't talk about cell components; we could add restrictions connecting cells to these.

HP:the experimentally modified bit is tricky, and we may want to disregard that

PL:any permanent cell line is a cross product with part of the rest of the CTO, and this is an issue. Cell lines have undergone process.

DS:issues with experimentally modified organisms.

HP:natural and non natural things were discussed in OBI and we decided for various reasons that this was not useful

Looking at the top level many classes

ME:cell by organism could be a candidate for primary axis

Cell by function - primary end goal or behaviour

Cell by histology - a classification by their microscopics - we think this is incomplete and what cell by histology means.

HP:is basically morphology, or stainability

ME:not by tissues - I expected that

PL:morphology in vivo, but also fixed

HP:could be any microscopy, light, scanning em

RS:they do not discuss brain cell, eye cell, etc - is species neutral,

AI: Axis - histology becomes morphology

ME:epithelial cell is a category of cells, and bipolar neuron is more precise, and this is a test case that we can look into.

Class by lineage - no def, suggests lineage by reproduction, mesoderm, ectoderm, endoderm, only some sponges don't fit the pattern. Based on higher eukaryote body plan, jellyfish etc not included.

HP:there are prokaryotes, and eukaryotes

PL:is a bacteria an anucleate cell

RS:we understand lineage, if we do agree

DS:we could think about triploblastic organisms, and get that right, and most of this ontology is about eukaryotes, cover mods not At though

RS:we should make sure we leave the hooks in for prokaryotes

DS:cell by nuclear number - no macro and micronucleus - we need that

PL:skeletal cell, cardiac muscle cell - contradicts having no anatomy, is a kind of multi-nucleate cell. not a ss

HP:we can see an axis of organism and also got a sensu terms in labels, we need to be aware of that.

ME:we are sensu - as appears in mammalia and also appears in all mammalia which are we using here

RS: I think we need to decide on a case by case basis

Cell by ploidy - diploid, haploid, polyploid,

HP:I want to say tetraploid

DS:we want to have a sep axis that gives nuclear number,

RS:we can make defined classes for that

RS:polynucleate is anything above 1, polyploid is anything above 2 - is this reasonable?

DS:multinucleate and polyploid is the standard language

'Non terminally differentiated cells' stem cells are sibs of this

DS:there is a missing class 'terminally differentiated' is a list of cell precursor types

RS:why is cell by org not a sib of the above?

AI:we think cell by class is not really that useful, and we could lose that, seems to be a container class

RS:we have in vivo vs exp modified, we will ignore these, cell by class is removed. Nuclear number is not a primary axis

DS:number, histology, function will not work,

RS:Why? is function true some of the time?

JM:function is intrinsic by BFO

PL:if you use BFO function doesn't change, function is intrinsic to entity due to design, hammer meant to bang in nails a role for screws -

DS:if we are doing multicell orgs is cell by lineage as a starting point

PL: development is the primary axis

JM:what about function

PL:cross product

RS:most will be go processes and insulin secreting etc will be in Go, if it isn't should be

PL:photosynthesis is a process - metabolism, - are processes so will not only be the

RS: function in GO is a stage in a process, not function in BFO speak

PL:first pass, we can blitz the tree, and replace with involved in process from GO

DS:lumping cell types secretory cells, are very different, secretion is common

ME:apoptosis fated cell, is fate not function

HP:do defence cells include stingers?

RS:is defence a role?

PL:defender is a role, defence is a process.

RS:process here is a ragbag

PL:harsh, cell by function is badly defined. quality function is conflated with the cell type, need a h'archy of cell functions, not cells with a particular function

RS:in BFO function is intrinsic to bearer and in that sense is RIGID

PL:yes ROLE is ANTI/SEMI rigid

DS:annotate the existing cell ontology for the readers, or create a new ontology

RS:annotate

PL: we can make the cross product with GO process

HP:could have a critique paper

RS:will be part of the methodology paper

Histology - we will describe this as morphology -

HP:shmoo cell - in yeast, columnar epithelial cell

RS:columnar vs squamous

PL:epithelial is about lineage

RS:always columnar

PL:this can change and is the same cell -

ANTI-RIGID then

PL:some defs based on artefacts - gram negative e.g. in terms of assay

RS:columnar and squamous should be in PATO, staining etc we'll need to deal with.

PL:immature and mature - disjoints are a mess as well

Lineage

RS:cell from mesoderm is always mesodermal

DS:expts in stem cells, take a mesod sc into mice and see them turn up as neural, etc, cells can transdifferentiate from one lineage to another - epithelial cell, make into all cell types.

HP:ok but does this happen in vivo

DS:maybe, there is a possibility that that occurs. having said that - this ecto, endo, meso and germ is a good place.

RS:RIGID for now, caveat that understanding might change - these boundaries are not permanent in vitro -

DS:like elements, where uranium can decay into another element

ME:so lineage is the best candidate?

RS:we need to be sure that if we use lineage we can still work less complex organisms

DS:suggest then higher classification unicell, multi-cell -

HP:suggests this could be quality

PL:could use lineaged vs non lineaged cell

DS:is this for multicellular organisms, and do we need to classify cell types in yeast

HP:if we consider the data, then most of the data is higher euk

RS:we don't want to decide halfway through proceed

Cell by nuclear number - does it change over the life time of a cell,

PL:depends how you define cell, syncytial blastoderm - not a cell.

HP:how do we define cell, in this case by a plasma membrane - maximally connected compartment,

PL:defined by partonomy, stuff inside a plasma membrane, in this case syncytium is a cell

DS:I set this as an essay, bounding pm is one thing, store and replicate genetic material, and metabolise

PL:any plasma membrane could be included and cell is in the definition - bad, what does maximally connected cell compartment mean?

HP:we could spend the rest of the day doing that

PL:we could think about multinuclear cells, comp questions

RS: Nuclear number is intrinsic - and if that changes

DS:in mitosis there are 2 nuclei, and sometimes the nucleus disappears, so it is SEMI and ANTI-RIGID

PL:yeast cell, 2 cells with .5 of a nucleus each,

RS:ploidy doubles in cell div

PL:depends how is replicating

HP:steady state then, are we deciding that we don't need to account for cell cycle - but we do need to deal with differentiation

PL:dividing cell is out of scope, cell which could divide is in scope. s phase cell is out of scope

DS:cells losing the ability to divide

Ploidy -

RS:haploid cells always haploid -

HP:I think we have approximated yes

PL:if we have 2 cells haploid cells, diploid is not the same cell

DS: tetraploid yeast cells - were selected by sel breeding

PL:retinal cell - LOH - change chromosome number, but still has original ploidy

PL:ploidy is RIGID, biologically at least, some edge cases where it is SEMI-RIGID - likely pathological, we can keep 'normal' in scope

Non terminally differentiated cells

RS:ANTI-RIGID - this will change.

Stem cells vs embryonic stem cells - these differentiate and are not left behind adult stem cells persist

SJ:stem cells - 3 properties, not terminally differentiated, divide without limit, each daughter cell has a choice, even tho all may be differentiated.

ME:blasts shouldn't be under stem cell

PL:need poss of stem cell's kids being differentiated or another stem cell


SJ:looking at Alberts, stem cell has change, rather than by division not being stem cell.

DS:we need to know when the switch occurs

PL:sc can diff and be another cell type, or a period where two children may be stem cells or not, or has to gain stem cell ness following mitosis. Suggest that the gain of sc ness is tricky, sc should be able to differentiate and we need to have that in the ontology

PL:sc is not RIGID

RS:are we being consistent in assigning this? Ploidy doesn't change, due to bio convention, sc ness doesn't change

PL:edge case for ploidy are pathological

RS:in cell division we are talking about the base case

PL:unclear when new cells appear, steady state vs temporal

RS:stem cell A divides into 2 cells, B and C, and is no longer cell A. Was it a stem cell for all the life of stem cell A? If so, stem cellness is RIGID

DS:B or C may be stem cells we don't know

DS:two diagrams, mitotic daughter may be same or different

RS:rigidity is about instances

DS:cell B looks like a stem cell after mitosis, B and C may differentiate

PL:concl. If sc ness is rigid B is not known to be a stem cell, or will be at some point, B or C may be a stem cell, we don't know yet - A which is always a stem Cell rigid, B and C - may or may not be a stem cell. Suggest not RIGID, easier to model.

RS:we describe that there is an issue is ANTI-RIGID/RIGID - consequences of both

Cell by organism - RIGID

RS: if I am a human cell, am I always a human cell

DS:some animals eat a cnidaria and put the stingers on the surface ? Need to find an example of this

RS:3 rigid - organism, lineage and function. Function is rigid, but the CTO def of function is wrong and many of those we looked at were processes - cell by function is therefore RIGID, cell by process is not RIGID. Want to look at these cases and at lineage and organism relations.

RS:We rename cell by process. We have lineage and organism and there is a relationship. Organism would allow us to leave the hooks in.

We discuss how much taxonomy, if any, is needed, and what the upper level hierarchy should be like

we consider

Euk

Animalia
etc

Prok

PL:define by other properties

RS:ok cell, and then leaves, do everything by restriction - have a flat list instead of

HP:also a good way to build views into the ontology this way for different user group

RS:do it all by restriction, ultra norm

DS:how handle the terms that have the same names and different contexts

HP:different ids and synonyms,

RS:class name is just a URI and this is unique, these are numeric, can have the same label on all of them

PL:cell parent - can't hold the functions etc. Will need other sibs.

DS:we'll need a process hierarchy from GO.

DS:mammalian erythrocytes are anucleate, avian are nucleate, yet we have one entry?

RS:We'll have two entries for this

PL:we don't seem to have another option, haploid, - is-a to cell,

RS:they will only be kids of cell prior to classification using the reasoner

HP:if we pick 20 cell types then we can have a go at getting the restrictions then go back and generalize

AI:select 20 cell types, good rep spread across CTO h'archies

ME:what about cell components

RS:go thru processes in CTO see if they have something e.g. insulin secretion

AI:typo - Endopterygota

PL:pick some by hand, and some random so we've got reasonable coverage
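PL's suggestion (some hand-picked, some random) might be sketched like this. It is a minimal sketch under assumptions: the term names are placeholders rather than real CTO labels, `pick_test_terms` is a hypothetical helper, and the fixed seed is only there so the selection can be reproduced and recorded.

```python
import random


def pick_test_terms(leaf_terms, hand_picked, total, seed=0):
    """Keep the hand-picked terms, then fill up to `total` at random."""
    hand = [t for t in hand_picked if t in leaf_terms][:total]
    pool = [t for t in leaf_terms if t not in hand]
    n_random = min(total - len(hand), len(pool))
    rng = random.Random(seed)  # fixed seed: the same list every run
    return hand + rng.sample(pool, n_random)


# Placeholder leaf terms; a real run would load these from the CTO.
leaf_terms = [f"cell type {i}" for i in range(1, 51)]
hand_picked = ["cell type 3", "cell type 7"]

selection = pick_test_terms(leaf_terms, hand_picked, total=20)
print(len(selection), selection[:2])
```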

HP:that's a separate exercise, test phase

DS:do we want 20 leaves and do we want higher categories as well, should be able to see some intermediate classes and write definitions

we decide criteria are the ones that we've discussed today e.g. epithelial cell

DS:apoptosis - a cell that is destined to die in programmed cell death - scaffold cell

RS:we have 51 in the list, we need to ensure that we only have leaves

DS:would retain these


we run through the list and pull out cases where we have non leaf nodes and look for a child term, and check for obsoletion (SJ has the s/sheet for this)
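The leaf check was done by hand from the spreadsheet; a small script could do the same thing. The sketch below assumes the is-a links are available as (child, parent) pairs; the fragment shown is a made-up example, not the real CTO hierarchy.

```python
def non_leaf_terms(selected, is_a_links):
    """Return {term: [children]} for selected terms that have children."""
    children = {}
    for child, parent in is_a_links:
        children.setdefault(parent, []).append(child)
    return {t: sorted(children[t]) for t in selected if t in children}


# Hypothetical fragment of an is-a hierarchy, as (child, parent) pairs.
is_a_links = [
    ("T cell", "lymphocyte"),
    ("B cell", "lymphocyte"),
    ("lymphocyte", "leukocyte"),
]
selected = ["lymphocyte", "B cell"]

to_replace = non_leaf_terms(selected, is_a_links)
print(to_replace)
```

Each non-leaf term comes back with its candidate child terms, so one of them can be chosen as the replacement.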



Helen and Phil lineage PATO -

none of the terms are in PATO so we looked at child terms of these classes for terms that might be in PATO

child terms mapped to PATO:
branched - is in there
superficial - almost quality of being on the surface, superficial to - relational quality - nothing that's the most superficial, nearest thing we could find
keratinized - some quality that is a result of having undergone a process - not there in PATO
periarticular chondrocyte - surrounding a joint - proximity and anatomy
hypertrophic chondrocyte - OK in PATO
columnar - rod shaped? sort of
endothelial tip cell - 'tip' is the sensory or the tipness that's key - seems to be tip, is a region - continuant - in this case is the part that's growing, cf. apical meristem - apical and basal should be in PATO - spatiotemporal quality is a parent in PATO, motility is a child term - growing tip is a spatiotemporal quality
non-branched duct epithelial cell - unbranched is in PATO
stromal - not in PATO - no position info
white, brown fat cells - in PATO - presence of mitochondria - can it be defined by the name? in this case the colour relates to the tissue, i.e. anatomical

AI:HP and JM will talk to George G about PATO below and discussions here and report back.

-

RS: again it seems that it's all doable. For our purposes, for describing the lineage we make our embryonic tissues and use derived from for cells that derive from these. We should trace up the tree to check that what's being said there is useful and consistent

SJ:we didn't look at all the context so there may be some assumptions here

RS:anatomy - leave on one side -

HP:anatomy issue is that these ontologies are many and so the work is much harder to do, we need a complete CARO

RS:we need to take some of our cell types and make some restrictions, or we do the restrictions and generate a definition.


some kinds are a 1-1 mapping from cell to go, also cases where very specific terms, lymph circulation, happens all over the place. Things can take place in many processes. e.g. phagocyte - different processes and motile cells -

HP:do we need to take the highest useful term, or to use a slim?

MI:we've taken higher terms, and we need to make many restrictions to allow for multiple processes

examples:
gut absorptive cell -> intestinal absorption
apoptosis fated cell -> apoptosis
Schwann cell -> ensheathment of neurons
circulating cell -> circulatory system process
contractile cell -> muscle contraction
defense cell -> NOTE: defense is bad in CTO: split into immune response and transport
electrically active cell -> cell-cell signaling
germ line cell ->
metabolising cell -> metabolism process
mitogenic signaling cell -> cell cycle
motile cell -> cell motility
nitrogen fixing cell -> nitrogen fixation
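The cell-to-process pairs above could be turned into restrictions mechanically. The sketch below is an assumption-laden illustration: the property name `participates_in` is hypothetical, the output strings only imitate Manchester-style syntax, and GO labels are used rather than GO ids because no ids were recorded in the notes. 'defense cell' is left out because the notes flag it as needing a split first.

```python
# CTO 'function' terms mapped to GO process labels, as noted above.
# None marks a term for which no process was recorded.
cto_function_to_go_process = {
    "gut absorptive cell": "intestinal absorption",
    "apoptosis fated cell": "apoptosis",
    "Schwann cell": "ensheathment of neurons",
    "circulating cell": "circulatory system process",
    "contractile cell": "muscle contraction",
    "electrically active cell": "cell-cell signaling",
    "germ line cell": None,
    "metabolising cell": "metabolism process",
    "mitogenic signaling cell": "cell cycle",
    "motile cell": "cell motility",
    "nitrogen fixing cell": "nitrogen fixation",
}


def restriction_axioms(mapping, prop="participates_in"):
    """Return (axiom strings, terms with no process recorded)."""
    axioms, unmapped = [], []
    for cell, process in mapping.items():
        if process is None:
            unmapped.append(cell)
        else:
            axioms.append(f"'{cell}' SubClassOf {prop} some '{process}'")
    return axioms, unmapped


axioms, unmapped = restriction_axioms(cto_function_to_go_process)
print("\n".join(axioms))
print("still to map:", unmapped)
```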

RS:some processes that are absent from GO -

MI:nitrogen fixation, copper accumulation

PL:any of them functions?

SJ:no functions not all under cellular processes in GO


sex - PATO morphology, size maturity PATO nuclear number PATO ploidy PATO potentiality - totipotent, unipotent proximity/location - juxta, extra, neuron associated cell


Sex PATO 0000047 lacks genotypic sex

Ploidy PATO 0001374. Lacks heterokaryon should be present under nucleate/ploidy

Nucleate quality PATO 001404 lacks syncytium

Heterokaryon should be under ploidy, not nucleate quality

Shape PATO 0000052 Lacks bipolar, polarized, apical, basolateral, columnar, cuboid, stratified, dendritic

Size PATO 0000117 lacks large, medium, small; has gigantic, dwarf

Deviation (from normal) PATO *** has hypertrophic, hypotrophic
Structure PATO 0000141 has lots of words like matted, spongy, not relevant for cells
Fragility is not a structure

Maturity PATO 0000261 has terms relevant to animals

 Lacks: stem, blast, differentiated - these are an expression of potentiality
 Lacks: embryonic, foetal, neonatal, adult 

PL:they have pubescent etc, above are an omission

PL:mature, immature and juvenile are there

Cellular potency PATO 0000197

Relational spatial quality: PATO 0001631 has proximal to, distal to

Lacks next to / juxta /adjacent
Apoptotic PATO 0000638 is under Morphology, defining appearance of an apoptotic cell

Conclusion:some easy additions, but not bad


Example terms and their restrictions

Looking at 25 term list, select 2 to make restrictions

Defining CD8-positive alpha, beta T cell

ME:in GO there is a similar process, GO:43369 how do we relate

kind-of 'lymphocyte'
function: cytotoxic cell
differentiates in the thymus
developed from haemopoietic stem cell
peripheral blood
has lineage 'mesoderm' in CTO - nicely done, many intermediates

RS:would expect to see a series of derived from stages back to mesodermal cell.

SJ:next derived from is lymphocyte, we only need to capture lymphocyte only, this cell is derived from lymphocyte, will get back to mesoderm

ME:there's a GO term that is about this lineage commitment

RS:one of its ancestor cells takes place in that and this is the outcome, is it a property of this?

DS:Suspect immature T cell doesn't undergo a mitosis prior to being committed.

RS:do all of them take part in that process, if some do need disjunction

RS:what makes something a lymphocyte

DS:circulating white blood cell which is part of the adaptive cellular immune system

PL:what's not part

DS:dendritic cells

HP:circulatory system anatomical

PL:part of the adaptive immune system, and something that distinguishes between it and dendrites

DS:thinks dendritic cell is also part of adaptive immune cell

HP:so what distinguishes

DS:macrophages present antigen too, need another difference

Consulting Alberts - Mol Biol of the Cell

PL:wikipedia - leukocytes - table showing differences, lymphocytes, T, B, NK cells

DS:dendritic cell is innate not adaptive immune system in Alberts. Narrow def back to adaptive cells of immune system are the lymphocytes. Adaptive is that cell surface molecules have seq that are specific for particular antigen not general. Done by molecular recognition. CD8 etc is a cell surface marker, not about function. NK cells are innate immune system. NK cell is not a lymphocyte

SJ:is this a kind of

DS:T-cell is short form of T lymphocyte

RS:gene expression roles about alpha beta etc, CD4 and CD8 - we can add restrictions for these, no ontology for doing that

PL:CD8 is a partonomy of cell surface

HP:then we need an extension of the GO component plasma membrane

RS:we need a tiny marker ontology to deal with this now so we can proceed

ME:we can add the proper classification later

PL:immune system process part of GO might give enough definition, could allow lymphocyte diffn from all the rest

DS:www.bioscience.org/atlases/cdclass/cdclass.htm - all CD nos are there

RS:we can do the roles, stuff in GO that is in processes e.g.

RS:nuclei? 1, it's 2 n, shape

DS:shows some videos of them doing different things, they are motile, this is in PATO, they are polarized

DS:do we want to add images -

RS:we can do that technically

RS:organism - mammalian?

DS:don't know about non mammals, wikipedia thinks vertebrates -

RS:Sex is not a defining characteristic in this case, we decide not to say anything

DS:shape is an artefact of way grown

RS:done process, lineage, morphology, sex, size, potentiality ? can become stimulated or is virgin

DS:some of these go on to be memory cells,

RS:are these then a leaf?

PL:memory cells are a state of a single cell,

DS:virgin, activated, etc fact that are disjoint in CTO could be wrong. Looking at Alberts, naive, effector, memory cells says that these are matured. Can change further in response to signals.

RS:are all the things we have said so far true of e.g. memory cells -

DS:there are different roles for the different maturation states - 'naive': potential to respond, role could be monitoring, still in the immune process; effector cells have some other role.

AI:this term needs checking and we need to decide if these are kinds of this cell, or if there is some other aspect of this cell.

Defining Fast Muscle Cell

Most muscle is a mix of fast and slow fibres, differing in speed of contraction and in whether they get energy anaerobically or aerobically - this can change with training for athletes. Flight muscles are fast, posture muscles slow - slow fibres don't fatigue as easily as fast do. True for vertebrates; they also attach to the skeleton, hence skeletal muscle. Three classes: cardiac, smooth, skeletal.

Morphology/Shape = cylindrical

HP:is this instead thing that can be observed value - striation

PL:

Size - this is relative, DS thinks that we want to say something about size, and maybe volume. Here we might need large, median, small,

RS:thinks we can use a data type to add micron information and so we'd be able to do this

Function/Process - skeletal muscle contraction process, but is fast twitch, some complexity in GO, we think there's a missing ss in GO - 'fast twitch skeletal muscle fibre contraction' new parent 'involuntary skeletal muscle contraction'.

DS:Movement of the skeleton, contractile to move the skeleton?

RS:is that muscle contraction vs the cell's property. Some discussion about this. We note and leave it out for now.

PL:skeletal muscle contraction doesn't move in the same way that an amoeba moves.

Role: thermoregulation (shivering), lactose metabolism (more cardiac muscle), RS:we don't need to be complete, we are being descriptive, we need to think about questions we want to ask.

Ploidy - diploid

Nucleate - multi - nucleate (syncytial) - indeterminate number and changes so not the same as multinucleate - changeable, this is the difference in flies at least, should be in PATO. Nuclei come from division.

Potential:fully differentiated

Mature: mature

Lineage:develops from fast muscle myoblast, mesenchymal

Sex:not relevant

Anatomy:cell type is a necessary component of skeletal muscle

DS:we also need somatic cell as a parent for the mesoderm, ectoderm, endoderm in the h'archy.


AI:decide whether to use 'has quality', which means we need to use cardinality, or 'has morphology' etc, in which case there'll be a relationship explosion. A cell can only be columnar (one shape) - patterns. Not an issue in OWL if done sensibly. Need to handle cardinality max 1.

PL:thinks we need a sub property so that a cell can be blue and columnar.
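The 'max 1 per sub property' idea can be illustrated with a toy check. In OWL this would be expressed with cardinality restrictions and disjoint fillers and left to the reasoner; the Python below only mimics the intended outcome. The sub property names `has_shape` and `has_colour` and the cells are hypothetical.

```python
def max_one_violations(assertions):
    """assertions: (cell, sub_property, value) triples.

    Returns the (cell, sub_property) pairs that have more than one
    distinct value, i.e. that would break a max-1 cardinality.
    """
    values = {}
    for cell, prop, value in assertions:
        values.setdefault((cell, prop), set()).add(value)
    return sorted(key for key, vals in values.items() if len(vals) > 1)


# Hypothetical assertions: each sub property of 'has quality' is max 1.
assertions = [
    ("cell X", "has_shape", "columnar"),
    ("cell X", "has_colour", "blue"),      # fine: a different sub property
    ("cell Y", "has_shape", "columnar"),
    ("cell Y", "has_shape", "squamous"),   # two shapes: violation
]

clashes = max_one_violations(assertions)
print(clashes)
```

With one sub property per axis, 'blue and columnar' is allowed while 'columnar and squamous' is caught, at the cost of one extra property per axis.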


Day 2 CTO Normalization Experiment

RS:things to note to do

AI:for PATO and GO we need to be careful that, for the parts used, we have all the disjoints, or the reasoning will not work. PATO should be easier.

SJ:already got GO and PATO imported into OWL

RS:we'll add some axioms; with trees it is easier to manage disjoints, as sibs in a tree are disjoint. When we do the translation we need to retain the CTO ids as annotation properties and use our own ID scheme, with semantic free ids and English labels. Default CTO labels will be used; if we do not use them, the CTO label will be an annotation property. We have 25 cell types; for a cell type A we need to add the predecessor and all the chain. We will fake that for now and will go A-B-F-G so we have some transitivity working but not all. Asks David: mesodermal cell, what's the start?

DS:gastrulation, then mesoderm
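The faked A-B-F-G chain is enough to show what transitivity buys. The sketch below assumes A is the earliest cell and each later cell has one direct predecessor; that reading of the chain, and the `direct_predecessor` name, are assumptions made for the example.

```python
# Direct 'derives from' links for the faked chain A-B-F-G.
# Assumption: A is the earliest cell and G the latest.
direct_predecessor = {"B": "A", "F": "B", "G": "F"}


def all_predecessors(cell, direct):
    """Follow direct links transitively; stops if a cycle is met."""
    chain, seen = [], set()
    while cell in direct and cell not in seen:
        seen.add(cell)
        cell = direct[cell]
        chain.append(cell)
    return chain


lineage_of_g = all_predecessors("G", direct_predecessor)
print(lineage_of_g)
```

Only the direct link needs recording for each cell type; everything back to the start of the chain can then be queried.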

HP:there's a missing ss relation in the CTO for mesodermal cell.

RS:so we'll need mesodermal cell, etc. V pleased at progress

RS Review: looked at RIGID properties, couldn't find a RIGID property for the primary axis that is true of euk and prok. Lineage is nice for euk, but we need a scaffold for prok and ended up replicating taxonomy - bad. With Mikel and Alan R will check the RIGIDITY analysis is correct. We looked at the cell types looking at implicit classification axes, size, shape etc. Looked at supporting ontologies: PATO and GO cover most of it. Will make some requests to PATO, and there are the usual issues with anatomy. Nice methodological thing: restricted to euk cells and excluded cell by experimental process, leaving 50 representative leaf node cells that cover the axes we identified. Chosen 25 to do by hand now, and 25 to test what we have done. Finished by looking at 2 in detail and the restrictions on those. Some of them were easy, some were tricky

Ploidy, nucleation, morphology, sex were all quite easy; lineage, organism (look up in CTO, record what predecessor cell was), size gave more problems; most discussion was on processes (CTO function). Propose to split the lists, divide into pairs and do the easy bits. In pairs we will get as far as we can. We want something about everything.

RS:T cell example yesterday, cell quiescent or growing

DS:maturity?

PL:on in immune systems, spore is quiescent, not naive

JM:GO has cell cycle etc - quiescent etc


Task to complete classification for selected terms

3 pairs (David/James, Helen/Phil, Mikel/Simon) provided restrictions for the selected ~25 terms. ME and SJ collated these on a single sheet which will form the basis of building the OWL file and also a data collection exercise going forward.

DS: Hard things were morphology, process

PL: Test cases - platelets not formed by division

PL: giant cell - fusion rather than division events

ME:multi process terms, antigen presentation, and lymph circulation. Need two properties, does a function and participates in - active vs passive

HP:also had a case of forming a structure - we assumed that this was not a partonomy reln

SJ:also motility, needs to be added. GO has annotations for genes and so does PATO - cell has phenotype of being motile.

ME:cell motility in GO is involved in cytoskeleton, term is not about cell motility.

RS:PATO motile is better for cells. GO are for processes and we need to be careful that we are using a GO process that is appropriate to a cell and not a protein.

Group Review of the classification spreadsheets

discussion on the definitions of pluripotent, totipotent, etc. - we may not have used these consistently.

we discuss not overloading the cells with their future potential - e.g. if it is some prior thing to a phagocyte then we don't need to annotate that, we can query by the relationship types. tendency to overload terms with future potential

we add cell surface antigen - plays the role of a marker

Discussion on using GO terms - GO has protein terms, we need cell terms and these are at different granularity or poss should be in a different ontology anyway. Protein is involved, is the cell getting all the processes that the protein is involved in by transitivity. Not clear. GO does provide a lot of what's needed for cross products. We are coming up against some granularity issues.

DS: e.g. new term 'neuronal signalling' is more precise for cell needs, but some are ok subclass of cell-cell signalling

RS:is neuronal same as nerve signalling

DS:yes

Seeing a lot of incompleteness for GO term addition in this exercise. Presumably due to not being that familiar with the GO, GO is focussed at a different level.

we get to term 10 on the list.

RS:shall we carry on for the next 15 or shall we start encoding

HP:can we continue and keep defining


AI:Simon and Mikel will start coding up to term 10 and the rest of us will continue defining.


DS:we will stop doing defining at 3 and come back together.


Looking at the RO - derives-from and transformation-of as appropriate

e.g. osteocyte derives-from osteoblast, e.g. reticulocyte and erythrocyte

located-in etc.
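If derives-from is treated as transitive, as the A-B-F-G chain mentioned earlier suggests, then recording only each cell's direct predecessor is enough to recover the whole lineage. The sketch below illustrates that idea in plain Python; the function name and the dictionary layout are assumptions for illustration, and a reasoner over the OWL file would normally do this work.

```python
def all_predecessors(cell, direct):
    """Return every cell that `cell` derives from, directly or indirectly.

    `direct` maps a cell to the list of cells it directly derives from.
    """
    seen = set()
    stack = list(direct.get(cell, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(direct.get(current, ()))
    return seen


# The faked chain from the discussion: G derives from F, F from B, B from A.
direct = {"G": ["F"], "F": ["B"], "B": ["A"], "osteocyte": ["osteoblast"]}
print(sorted(all_predecessors("G", direct)))
print(sorted(all_predecessors("osteocyte", direct)))
```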

ME:question about actively doing something vs passive is that agent in vs participates in?


AI:cellular components have not been discussed in the context of this work. We need to decide what to do.


AI:prepare a note about the schema, and assn RO choices with the schema - Manchester people


ME and SJ start coding in OWL and others continue on the s/sheet.


AI:SJ to use OWL perl api to find out leaf nodes - DONE - 450 in total
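For reference, finding leaf nodes amounts to listing the terms that never appear as anyone's parent. The following is a minimal sketch of that check over is-a pairs, not the script SJ actually used; the `(child, parent)` pair format and the example hierarchy are assumptions for illustration.

```python
def leaf_nodes(is_a_pairs):
    """Return terms that have a parent but are never themselves a parent."""
    children = {child for child, _parent in is_a_pairs}
    parents = {parent for _child, parent in is_a_pairs}
    return sorted(children - parents)


# Illustrative fragment only, not the real CTO hierarchy.
pairs = [
    ("fibroblast", "structural cell"),
    ("structural cell", "cell"),
    ("platelet", "circulating cell"),
    ("circulating cell", "cell"),
]
leaves = leaf_nodes(pairs)
print(leaves)
```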


back to reviewing the sheet:

Term 11, gut absorptive cell.

Term 12, secondary spermatocyte - diploid, depends on how we define haploid and diploid, we agree on diploid.

Term 13 zoospore - flagellate non sex repro cells from fungi plants protists. added some more info.

Term 14. APUD cell. Complete

Term 15. phagocyte sensu Nematoda and protostomia. Complete

Term 16. inhibitory interneuron

Term 17. Diploid cell - this may disappear - we don't think this is a leaf in the normalized CTO. We delete the row.

Term 18. epiblast - 'epiblast' cell definition will be epithelia that are part of the epiblast. Columnar here is defined as 2x as high as wide. We think this is also missing an epithelial cell is-a parent, pluripotent

Term 19. xylem element - vessel element - lignified. was a cell, not now a cell. - will need an alternate parent if a cell definition is not alive.

Term 20. platelet - see above.

Term 21. fibroblast - OK

Term 22. M cell - added some information -

Term 23. multi-nucleated giant cell -

AI:ploidy is looking at the nucleus, diploid and nucleate are not disjoint. Diploid is a feature of the nucleus not the cell.

AI:pleiomorphic - has shape 'shape', should be changed in PATO - from pleiomorphic.

Term 24. ascospore

discussion on maleness and femaleness - male gamete - has potential to be male or female - not who made them, this comes from anatomy. Sperm are neither male nor female they have a mating type - as they are haploid. Yeast is opposite, haploid have a mating type. Do we mean involved in sex reproduction.

RS:N/A is where we made no comment, usage is somatic etc, and sex is not relevant. If I have an egg and a sperm, my egg is female, and my sperm is male or female but not both

PL:scrotal sac epithelia is in a male - ovary is in a female, not true of a skin epithelial cell.

DS:male germ line, or female germ line - specific and not problematic

AI:Secondary spermatocyte - is now male germline not male female

Term 25. amnioserosal cell - problematic, skipped review

Term 26. cardiac muscle cell -

Discussion Conclusions

AI:HP to clean up the wiki, sort out the AI into a separate page

AI:Everything except Dave Randall's videos of discussion, including notes, ontologies and spreadsheets, will be made publicly available.

Prototype normalized CTO in OWL

imported PATO terms, properties from RO, added a couple. Flat list of cells chosen, did 3/4 definitions - 'stratified keratinized epithelial stem cell' used derived from ectodermal cell, morphology,

fibroblast - done

secondary spermatocyte

9 inferred classes, and some multiple inheritance ME:it's nearly the same amt of work but they are all explicit

RS:restriction matrix - would work well for me.

RS:Suggest that we dump out an OBO version of the inferred h'archy

PL:look at SIBS, which ones have most kids, and tells which will classify new cell types
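PL's suggestion can be read as: count the direct children each class receives in the inferred hierarchy and rank the classes. A rough sketch of that count follows; the `(child, parent)` input format and the class names are assumptions for illustration.

```python
from collections import Counter


def rank_by_children(inferred_is_a):
    """Rank classes by their number of direct inferred children, largest first."""
    counts = Counter(parent for _child, parent in inferred_is_a)
    return counts.most_common()


# Hypothetical inferred is-a pairs.
inferred = [
    ("cell 1", "class P"),
    ("cell 2", "class P"),
    ("cell 3", "class Q"),
]
ranking = rank_by_children(inferred)
print(ranking)
```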

AI:Create an OBO version of the inferred h'archy to show JB - SJ/ME

RS:Simon now has all the 440 ish leaf nodes, got 425 to do. We will partition these into coherent lumps and make a s/sheet template with columns per bio category or portion of category and we'll dish them out.

AI:SJ and ME will complete the s/sheet with leaf node terms make it available for people to work on. We decide to go for the bio partition.

HP:will Ontogenesis need a report, or is it just the wiki? Apparently just the wiki is OK.

SJ:s/sheet format was ad hoc, if we have some consistency - process where there are multiple terms, add in GO or PATO id.

DS:shall we all use the wiki? Agreed that HP will clean up

Categories for the terms to be handed out

Secretory Cells - HP

Electrically Active cells and all children - PL

Contractile Cells - DS

Circulating Cells - DS (all the immune cells)

Structural Cells - JM

AI:we will ask JB to review these

AI:SJ will make 12 spreadsheets of bits of hierarchy; some have been spoken for, some will be handed out and some are tricky and may be left out.

RS:suggest when done a sig proportion of the ontology have another get together, physically or virtually

AI:Deadline for term submission is - August 15th 2008.

AI:ME will make a spreadsheet parser -
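A spreadsheet parser of this kind might turn each filled-in cell into a (term, property, value) statement and skip blanks, so that partially completed sheets still parse. The sketch below is only an illustration using Python's standard `csv` module, not ME's parser: the column names, the `;` separator for multiple values and the treatment of `N/A` as "no comment" are all assumptions.

```python
import csv
import io


def parse_sheet(text, term_column="term"):
    """Turn CSV text into (term, property, value) triples, skipping empty cells."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        term = (row.get(term_column) or "").strip()
        if not term:
            continue  # row without a term name: nothing to attach values to
        for column, cell in row.items():
            # Skip the term column, overflow fields (key None) and missing cells.
            if column is None or column == term_column or cell is None:
                continue
            for value in cell.split(";"):  # assumed multi-value separator
                value = value.strip()
                if value and value.upper() != "N/A":
                    triples.append((term, column, value))
    return triples


# Hypothetical sheet layout with illustrative values.
sheet = (
    "term,ploidy,nucleation,derives_from\n"
    "fibroblast,diploid,mononucleate,\n"
    "platelet,N/A,anucleate,megakaryocyte\n"
)
triples = parse_sheet(sheet)
for triple in triples:
    print(triple)
```

Each triple could then be mapped to a restriction on the corresponding class once the property names in the schema are agreed.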

Summarized Workshop Action Items and Status

AI: Current CTO hierarchy for histology becomes morphology in the normalization experiment - DONE. Added to the list of categories.

AI:CTO cell by class is not thought to be a useful classification and this is a container class that could be removed. DONE. Not used as an axis of classification during normalization.

AI:select 20 cell types from current CTO, current leaf nodes, implicit classification examples included e.g. mature/immature, large etc - DONE and annotated on a spreadsheet during the workshop.

AI:Typo in CTO - Endopterygota is spelled Endopyerygota; communicate to CTO. HP. Done, added as CTO tracker item [4]

AI:HP and JM will report the discussion to George Gkoutos of PATO and communicate any new terms or relevant PATO experiences. DONE - HP has mailed George to ask for a meeting and will provide a list of comments and suggestions for that

AI:When using has-quality consider if we use a general has_quality and cardinality or specific relationship type derived from RO terms. No technical issue in OWL, just need to decide on cardinalities, RO derivation and naming.

AI:when importing PATO/GO add their respective disjoints or reasoning will fail.

AI:build a prototype during the workshop for a small number of terms PARTIALLY DONE by ME, SJ - will be completed for terms we defined on the spreadsheet

AI:Cellular components - we did not make any classification based on these, though e.g. the property of having a flagellum could be made as a cross product term with the relevant part of GO. Need to decide if we want to use this as a set of potential cross products and hence axis of classification

AI:SJ to identify CTO leaf nodes. DONE there are 450 in total.

AI:we decided that ploidy refers to the nucleus, not the cell. Diploid and nucleate are not disjoint: diploid is a feature of the nucleus not the cell, and is not related to how many nuclei are present in a cell when the cell is multinucleate.

AI:pleiomorphic from PATO could be redefined as - has shape 'shape', suggest to GG

AI:HP to clean up the WIKI DONE. Anyone spotting typos etc feel free to edit.

AI:Consensus that everything except Dave Randall's videos of discussion including notes, ontologies and spreadsheets will be made publicly available on this wiki which will be cross linked to the ontogenesis one.

AI:SJ/ME will create an OBO version of the inferred h'archy to show JB. If this could be done by 2 July HP will show JB

AI:HP will talk to JB about reviewing the definitions and normalized CTO

AI:SJ and ME will produce a template spreadsheet with leaf node terms and make it available for people to work on. We decide to go for the bio partition ie all secretory cells will be annotated together to make things quicker. We aim to do at least 50% of the leaf nodes.

AI:DS will annotate contractile cells and circulating cells and their children

AI:HP will annotate secretory cells and children

AI:PL will annotate electrically active cells and all children

AI:JM will annotate structural cells.

AI:Deadline for submitting annotated sheets back to Mikel and Simon will be 15 August. Partially completed sheets are OK as they'll be parsed using OPPL.

Spreadsheets containing terms for annotation [5]
