Meetings/20070131/SemanticPortals etc

Defining Image Access

Meeting on 31 Jan 2007 at ILRT

Present:

  • Nikki Rogers
  • Damian Steer
  • David Shotton
  • Graham Klyne

Discussing semantic portals and IUGO, with a view to applicability for creating image data webs.

IUGO

Emma Tonkin wrote a very good evaluation report for IUGO.

Tentative connection with data mining: Simon Price doing PhD in Bristol. DS mentioned the National Text Mining Centre in Liverpool.

Image metadata

W3C content labelling incubator; now a working group (POWDER?)

XMP (Adobe) metadata was mentioned -- there's a lot of it out there, mostly workflow related.

For automated metadata extraction, Steve Cayzer's work on biologically inspired computing was mentioned (for machine learning type tasks).

Nikki notes that we need a range of tools for metadata acquisition.


Related activities

  • Museum sector has many classified images of objects
  • English Heritage activities supporting tourism
    • cf. Brian Fuchs' use cases for CLAROSnet
    • Has some dedicated funding, but only small amounts
  • CIDOC CRM mentioned; Martin Doerr has previously been in similar orbit to ILRT work (detail?)
  • Dan Brickley was involved in some early work on cross-searching Biomedical repositories


Offline investigation

Yes, Z39.50 profiles seem the place to look.

A quick review of http://www.niso.org/standards/resources/Z39_89final.pdf?CFID=32183435&CFTOKEN=51600042 reveals a number of OIDs defined that seem to correspond to attributes like those one might find used with Dublin Core. To make sense of this, remember that Z39.50 is based on ASN.1, which is a completely defined (binary) syntax for protocol data units (messages), and internally uses a particular form of identifier called an OID. As such, the Z39.50 standard is very generic, and particular profiles are created to restrict the particular OIDs that are recognized. The term "vocabulary" is not used, but the set of recognized OIDs corresponds roughly to a vocabulary of URIs used with (say) RDF.
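To make the OID/URI comparison concrete, here is a minimal sketch in Python. It assumes the registered OID for the bib-1 attribute set and a handful of well-known bib-1 "use" attribute numbers. The pairing with Dublin Core element URIs is my own illustration of the "roughly a vocabulary" idea, not an official crosswalk and not taken from either profile document. The names USE_ATTRIBUTE_TO_DC, oid_arcs and to_uri are hypothetical.

```python
# OID registered for the Z39.50 bib-1 attribute set.
BIB1_ATTRIBUTE_SET = "1.2.840.10003.3.1"

# Namespace for the Dublin Core element set, version 1.1.
DC = "http://purl.org/dc/elements/1.1/"

# Illustrative pairing only, NOT an official crosswalk: a few bib-1 "use"
# attribute numbers next to the Dublin Core terms they loosely resemble.
USE_ATTRIBUTE_TO_DC = {
    4: DC + "title",       # bib-1 use attribute 4: Title
    21: DC + "subject",    # bib-1 use attribute 21: Subject heading
    1003: DC + "creator",  # bib-1 use attribute 1003: Author
}


def oid_arcs(oid):
    """Split a dotted OID string into its integer arcs."""
    return tuple(int(arc) for arc in oid.split("."))


def to_uri(use_attribute):
    """Return the loosely corresponding DC term URI, or None if unmapped."""
    return USE_ATTRIBUTE_TO_DC.get(use_attribute)


if __name__ == "__main__":
    print(oid_arcs(BIB1_ATTRIBUTE_SET))
    for number in sorted(USE_ATTRIBUTE_TO_DC):
        print(number, "->", to_uri(number))
```

The point of the sketch is only that an attribute set OID plus an attribute number plays much the same role as a namespace URI plus a local name; a profile then fixes which of these a conforming server has to recognize.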

See also http://www.collectionscanada.ca/bath/bp-current.htm, The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery, upon which the previous specification is based. Section 4.2 of this lists all the registered Z39.50 objects referenced by the specification, described as: bib-1 attribute set, holdings attribute set, utility attribute set, cross domain attribute set, bib-1 diagnostic set, holdings schema, eSpec-q, UNIMARC record syntax, MARC21 record syntax, Simple unstructured records syntax (SUTRS), Generic record syntax (GRS-1), XML record syntax.

From here on, it all starts to get very complex, which is quite common with specifications based on ASN.1, and there is a lot of functionality present that very few people would ever bother to use. I can well imagine Dublin Core being proposed as a reaction against this complexity (and I've heard conference speakers almost say as much).

As a document, the Bath Profile seems to be quite readable, given the complex nature of the material it covers.


SWED Semantic Portal Software

The semantic portal software developed for SWED as part of the SWAD-E project is still in use. The IUGO project has used it.

About the software:

  • It is fairly stable, and still has some (limited) active support.
  • There are some scaling problems, particularly to do with queries that generate large result sets (the system insists on generating exact counts, even when these are not needed; this can be expensive).
  • There may be an IUGO follow-on at ILRT.
  • User interface is basically a faceted browser; there are some scaling problems related to user perceptions at the user interface.
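
As an aside on the exact-count problem noted above: the following is a small Python sketch of counting only up to a cap, so that a very large result set costs no more than the cap. This is an illustration of the general idea only; it is not how SWED works, and capped_count and its cap parameter are hypothetical names.

```python
from itertools import islice


def capped_count(results, cap=1000):
    """Count items from an iterable of results, giving up once cap is exceeded.

    Returns (count, exact). When exact is False the true count is larger
    than cap and a display would show something like "more than 1000".
    """
    # Pull at most cap + 1 items, so a huge result set is never fully read.
    seen = sum(1 for _ in islice(results, cap + 1))
    if seen > cap:
        return cap, False
    return seen, True


if __name__ == "__main__":
    print(capped_count(iter(range(10)), cap=100))
    print(capped_count(iter(range(10**9)), cap=100))
```

Whether an approximate count is acceptable at the user interface is a separate question, and ties in with the note about user perceptions.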

HP is apparently developing a commercial version of SWED (semantic portal software), which will not be open source, and will be a completely new code base. They are, however, still using and supporting the SWED code base as a basis for prototyping and user requirement gathering.

SWED design is somewhat driven by faceted browse.

Nikki comments: suggest using faceted browse model to drive requirements on queries that a data web must support; this allows users to see the richness of the data.

GK thinks: we need to raise priority on developing descriptions of search themes to guide subsequent design work.

SWED software configuration is very flexible, but not very dynamic.

For image (content?) classification terminology, suggestion to use a "terminology service" - there's some JISC interest here (HILT?)


Browsing styles

Faceted browsing, column browsing, more? As opposed to searching.

Survey of interaction styles?

The Argos web site was mentioned (http://www.argos.co.uk/) as an example of faceted browsing: notice how at each step there are several choices for refining the search, and at each step, new choices are offered based on what has gone before.
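
The refinement behaviour described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Argos or SWED implementation: the catalogue records, facet names and the functions refine and facet_choices are all made up for the example.

```python
from collections import Counter

# Hypothetical catalogue records; field names and values are invented.
ITEMS = [
    {"type": "kettle", "brand": "A", "colour": "white"},
    {"type": "kettle", "brand": "B", "colour": "black"},
    {"type": "toaster", "brand": "A", "colour": "white"},
    {"type": "toaster", "brand": "B", "colour": "white"},
]


def refine(items, selections):
    """Keep only the items that match every facet value chosen so far."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_choices(items, selections):
    """For each facet not yet chosen, count the values left after refining."""
    choices = {}
    for item in refine(items, selections):
        for facet, value in item.items():
            if facet not in selections:
                choices.setdefault(facet, Counter())[value] += 1
    return choices


if __name__ == "__main__":
    # Before any choice is made, every facet is offered.
    print(facet_choices(ITEMS, {}))
    # After choosing a type, only values that still occur are offered.
    print(facet_choices(ITEMS, {"type": "kettle"}))
```

Each call offers only the values still present in the narrowed set, which is the "new choices based on what has gone before" behaviour; the per-value counts are also where the exact-count cost shows up in a real system.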

 Semantic Web posts facts for the world
 SW User Interaction shows these facts
 SWUI techniques include:
   (Faceted) browsing
   Graphs
   Recommenders and other personalization
   Clustering
   Narrative and Rhetoric

By comparison, mSpace was described as a column browser (?)

There was mention of a report by Emma Tonkin and Paul Shabajee (SWED/IUGO?) that characterizes browsing states. I searched for this but couldn't find anything. Did I get this right?

Data models

(Context for this?):

  • Subject hierarchy - 300,000 words
  • Project, for viewing and navigation - uses SKOS hierarchy for navigation only
  • The need for flexibility in the model used for constructing a view leads to manual construction only (for now)