DefiningImageAccess/Resource/SchemaAlignment

Schema Alignment Work

Work on schema and data alignment has been under way for many years in the relational database community (cf. Doan and Halevy, below). More recently the same need has appeared, in slightly different form, for reconciling disparate Semantic Web schemas and ontologies.

Papers

Links

Commentary

We make much of separating schema alignment from coreference, without being entirely clear about what we mean by this. The boundary isn't always sharp but, roughly, schema (or ontology) alignment reconciles the structures and vocabularies used to describe things, while coreference detects references to the same object or instance in different data sources.

Work on relational database alignment has an easier time of it: the distinction between schema and table data is clear. This is discussed in a recent CACM article: Semantic Matching Across Heterogeneous Data Sources, Huimin Zhao, January 2007 (http://portal.acm.org/citation.cfm?id=1188913.1188916).

For Semantic Web data the distinction is sometimes harder to draw, because it is not always obvious whether a term is a class (an ontological or schema element) or an instance -- e.g., see http://www.w3.org/TR/swbp-classes-as-values/. Some examples:

  • animals vs plants - these would usually be recognized as classes of objects
  • Dolly the sheep - an instance of a sheep
  • Hydrogen - the class of hydrogen atoms, but statements about Hydrogen might be treating Hydrogen as an instance.
  • A strain, or genetic line, of drosophila: a subclass of all drosophila, but gene expression observations would likely relate to this as an instance.
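
The Hydrogen case above can be sketched in a few lines of Python. This is only an illustration, assuming data held as plain (subject, predicate, object) tuples; the ex: terms and the function names are made up for the example and do not come from any real vocabulary or library. A term counts as "used as a class" if things are typed by it or it appears in a subclass statement, and as "used as an instance" if it is itself given a type.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS = "rdfs:subClassOf"

def class_and_instance_uses(triples):
    """Return (terms used as classes, terms used as instances)."""
    used_as_class = set()
    used_as_instance = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            # "s rdf:type o": s is treated as an instance, o as a class.
            used_as_instance.add(s)
            used_as_class.add(o)
        elif p == RDFS_SUBCLASS:
            # Both ends of a subclass statement are treated as classes.
            used_as_class.add(s)
            used_as_class.add(o)
    return used_as_class, used_as_instance

def ambiguous_terms(triples):
    """Terms that appear in both roles, like Hydrogen in the list above."""
    used_as_class, used_as_instance = class_and_instance_uses(triples)
    return used_as_class & used_as_instance

# Hypothetical example data.
EXAMPLE = [
    ("ex:Dolly", RDF_TYPE, "ex:Sheep"),
    ("ex:Sheep", RDFS_SUBCLASS, "ex:Animal"),
    ("ex:atom1", RDF_TYPE, "ex:Hydrogen"),
    ("ex:Hydrogen", RDF_TYPE, "ex:ChemicalElement"),
]

print(ambiguous_terms(EXAMPLE))
```

Here only ex:Hydrogen is flagged: ex:Dolly is only ever an instance, and ex:Sheep only ever a class.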

On reflection, I think the initial distinction may be fairly easy: if a term appears in instance data, then recognizing different terms meaning the same thing in different instance data is a coreference problem. If different instance stores describe the same attributes using different schematic structures, then schema alignment is needed. This distinction isn't unambiguous, but I think it serves as a starting point.
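
As a rough sketch of this starting point, again assuming plain (subject, predicate, object) tuples and invented a: and b: terms: terms in predicate position, or used as the object of rdf:type, are taken as schema terms, and everything else as instance terms. Terms that one store uses and the other does not then become candidates for schema alignment or coreference respectively. The split rule is a deliberate simplification, not a definitive test, and it inherits the ambiguity discussed above.

```python
RDF_TYPE = "rdf:type"

def partition_terms(triples):
    """Split the terms of one store into (schema_terms, instance_terms)."""
    schema_terms = set()
    instance_terms = set()
    for s, p, o in triples:
        schema_terms.add(p)          # predicates describe structure
        instance_terms.add(s)        # subjects are taken as instance data
        if p == RDF_TYPE:
            schema_terms.add(o)      # a class named in a type statement
        else:
            instance_terms.add(o)    # any other object is instance data
    return schema_terms, instance_terms

def alignment_tasks(store_a, store_b):
    """List the unmatched terms of each kind as (only in a, only in b)."""
    schema_a, inst_a = partition_terms(store_a)
    schema_b, inst_b = partition_terms(store_b)
    return {
        "schema_alignment": (schema_a - schema_b, schema_b - schema_a),
        "coreference": (inst_a - inst_b, inst_b - inst_a),
    }

# Hypothetical stores describing the same sheep with different vocabularies.
STORE_A = [
    ("a:dolly", RDF_TYPE, "a:Sheep"),
    ("a:dolly", "a:bornIn", "a:Roslin"),
]
STORE_B = [
    ("b:Dolly_the_sheep", RDF_TYPE, "b:Sheep"),
    ("b:Dolly_the_sheep", "b:birthPlace", "b:RoslinInstitute"),
]

tasks = alignment_tasks(STORE_A, STORE_B)
print(tasks["schema_alignment"])
print(tasks["coreference"])
```

With these stores, a:bornIn and a:Sheep would need aligning with b:birthPlace and b:Sheep (schema alignment), while a:dolly and a:Roslin would need matching against b:Dolly_the_sheep and b:RoslinInstitute (coreference). The sketch only lists the candidates; deciding which ones actually match is the hard part and is not attempted here.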
