Meetings/20070131/SemanticPortals etc
From ImageWeb
Meeting on 31 Jan 2007 at ILRT
Present:
- Nikki Rogers
- Damian Steer
- David Shotton
- Graham Klyne
Discussing semantic portals and IUGO, with a view to applicability for creating image data webs.
IUGO
Emma Tonkin wrote a very good evaluation report for IUGO
Tentative connection with data mining: Simon Price doing PhD in Bristol. DS mentioned the National Text Mining Centre in Liverpool.
Image metadata
W3C content labelling incubator; now a working group (POWDER?)
- http://www.w3.org/2005/Incubator/wcl/
- http://www.w3.org/2005/Incubator/wcl/XGR-report/
- http://www.w3.org/2005/Incubator/wcl/use_req.html
- http://www.w3.org/2005/Incubator/wcl/matching.html
XMP (Adobe) metadata was mentioned -- there's a lot of it out there, mostly workflow related.
For automated metadata extraction, Steve Cayzer's work on biologically inspired computing was mentioned (for machine learning type tasks).
- http://www.hpl.hp.com/personal/Steve_Cayzer/index.html
- http://www.hpl.hp.com/personal/Steve_Cayzer/downloads/presentations/Presentation020514_hawaii.ppt - A Recommender System based on the Immune Network
- Hmmm... is there any potential here for collaborative ontology building, maybe linked to something like Alistair Miles' "vines"?
Nikki notes that we need a range of tools for metadata acquisition.
Related activities
- Museum sector has many classified images of objects
- English Heritage activities supporting tourism
- cf. Brian Fuchs' use cases for CLAROSnet
- Has some dedicated funding, but only small amounts
- CIDOC CRM mentioned; Martin Doerr has previously been in similar orbit to ILRT work (detail?)
- Dan Brickley was involved in some early work on cross-searching Biomedical repositories
- Z39.50; where are the schemas? Maybe, look for "profiles"
Offline investigation
Yes, Z39.50 profiles seem the place to look.
A quick review of http://www.niso.org/standards/resources/Z39_89final.pdf?CFID=32183435&CFTOKEN=51600042 reveals a number of OIDs defined that seem to correspond to attributes like those one might find used with Dublin Core. To make sense of this, remember that Z39.50 is based on ASN.1, which is a completely defined (binary) syntax for protocol data units (messages), and internally uses a particular form of identifier called an OID. As such, the Z39.50 standard is very generic, and particular profiles are created to restrict the particular OIDs that are recognized. The term "vocabulary" is not used, but the set of recognized OIDs corresponds roughly to a vocabulary of URIs used with (say) RDF.
See also http://www.collectionscanada.ca/bath/bp-current.htm, The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery, upon which the previous specification is based. Section 4.2 of this lists all the registered Z39.50 objects referenced by the specification, described as: bib-1 attribute set, holdings attribute set, utility attribute set, cross domain attribute set, bib-1 diagnostic set, holdings schema, eSpec-q, UNIMARC record syntax, MARC21 record syntax, Simple unstructured records syntax (SUTRS), Generic record syntax (GRS-1), XML record syntax.
From here on, it all starts to get very complex, which is quite common with specifications based on ASN.1, and there is a lot of functionality present that very few people would ever bother to use. I can well imagine Dublin Core being proposed as a reaction against this complexity (and I've heard conference speakers almost say as much).
As a document, the Bath Profile seems to be quite readable, given the complex nature of the material it covers.
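The rough correspondence between a profile's recognized OIDs and an RDF-style vocabulary of URIs can be sketched as a lookup table. This is only an illustration: the bib-1 attribute set OID and the Use attribute numbers are the registered Z39.50 values, but the pairing with Dublin Core elements and the function name `use_attribute_to_uri` are assumptions made for this example, not something defined by the Bath Profile.

```python
# Illustrative sketch: the OID-to-URI pairing below is an assumption made
# for this example, not a mapping defined by Z39.50 or the Bath Profile.

BIB1_ATTRIBUTE_SET_OID = "1.2.840.10003.3.1"  # registered OID of the bib-1 attribute set
DC = "http://purl.org/dc/elements/1.1/"       # Dublin Core element set namespace

# bib-1 Use attribute value -> (bib-1 label, assumed Dublin Core URI)
BIB1_USE_TO_DC = {
    4: ("Title", DC + "title"),
    1003: ("Author", DC + "creator"),
    21: ("Subject heading", DC + "subject"),
    31: ("Date of publication", DC + "date"),
}


def use_attribute_to_uri(attribute_set_oid, use_value):
    """Return the assumed Dublin Core URI for a bib-1 Use attribute, or None.

    A profile plays the role of the dictionary here: anything outside the
    recognized attribute set or Use values is simply not in the vocabulary.
    """
    if attribute_set_oid != BIB1_ATTRIBUTE_SET_OID:
        return None
    entry = BIB1_USE_TO_DC.get(use_value)
    return entry[1] if entry else None
```

For example, `use_attribute_to_uri("1.2.840.10003.3.1", 4)` gives the Dublin Core `title` URI, while an unrecognized Use value gives `None`.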
SWED Semantic Portal Software
The semantic portal software developed for SWED as part of the SWAD-E project is still in use. The IUGO project has used it.
- http://www.w3.org/2001/sw/Europe/
- http://www.w3.org/2001/sw/Europe/showcase/sem-portal.html
- http://www.swed.org.uk/swed/index.html
- http://www.swed.org.uk/swed/swed_technical_resources.htm
- http://iugo.ilrt.bris.ac.uk/
- http://iugo.ilrt.bris.ac.uk/about/ - "There is often a great deal of informal content spread across the web, e.g. the abstract, paper, presentation, information about the related project, online notes (e.g. as a blog), etc. At present individual users must locate these themselves, which is often problematic and time consuming. Iugo means to bind together."
About the software:
- It is fairly stable, and still has some (limited) active support.
- There are some scaling problems, particularly to do with queries that generate large result sets (the system insists on generating exact counts, even when these are not needed; this can be expensive).
- There may be an IUGO follow-on at ILRT.
- User interface is basically a faceted browser; there are some scaling problems related to user perceptions at the user interface.
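One common way around the exact-count cost noted above is to stop counting once a cap is reached and display something like "1000+". The sketch below is generic Python over any iterable of results; it is not how SWED works, and the names `capped_count` and `format_count` are hypothetical.

```python
from itertools import islice


def capped_count(results, cap=1000):
    """Count items in an iterable, reading at most cap + 1 of them.

    Returns (count, exact). When more than `cap` items exist, the result
    is (cap, False) and the rest of the result set is never read.
    """
    seen = sum(1 for _ in islice(results, cap + 1))
    if seen > cap:
        return cap, False
    return seen, True


def format_count(count, exact):
    """Render a count for display, e.g. '42' or '1000+'."""
    return str(count) if exact else f"{count}+"
```

With a lazily evaluated result set, `capped_count(results, cap=1000)` touches no more than 1001 rows however large the full result is.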
HP is apparently developing a commercial version of SWED (semantic portal software), which will not be open source, and will be a completely new code base. They are, however, still using and supporting the SWED code base as a basis for prototyping and user requirement gathering.
SWED design is somewhat driven by faceted browse.
Nikki comments: suggest using faceted browse model to drive requirements on queries that a data web must support; this allows users to see the richness of the data.
GK thinks: we need to raise priority on developing descriptions of search themes to guide subsequent design work.
SWED software configuration is very flexible, but not very dynamic.
For image (content?) classification terminology, there was a suggestion to use a "terminology service"; there's some JISC interest here (HILT?)
Browsing styles
Faceted browsing, column browsing, more? As opposed to searching.
Survey of interaction styles?
The Argos web site was mentioned (http://www.argos.co.uk/) as an example of faceted browsing: notice how at each step there are several choices for refining the search, and at each step, new choices are offered based on what has gone before.
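The refinement behaviour described here (each step offers new choices computed from what has already been selected) can be sketched in a few lines. This is a minimal in-memory illustration, not the SWED or Argos implementation; the records in `ITEMS`, the facet names, and the functions `select` and `refinements` are all made up for the example.

```python
from collections import Counter

# Hypothetical image records; facet names and values are invented.
ITEMS = [
    {"type": "photograph", "subject": "cell", "licence": "open"},
    {"type": "photograph", "subject": "tissue", "licence": "restricted"},
    {"type": "micrograph", "subject": "cell", "licence": "open"},
    {"type": "drawing", "subject": "cell", "licence": "open"},
]


def select(items, chosen):
    """Return the items that match every facet value chosen so far."""
    return [it for it in items
            if all(it.get(facet) == value for facet, value in chosen.items())]


def refinements(items, chosen):
    """For each facet not yet chosen, count the values among matching items.

    These counts are the choices offered at the next browsing step.
    """
    matching = select(items, chosen)
    remaining = sorted({facet for it in matching for facet in it} - set(chosen))
    return {
        facet: dict(Counter(it[facet] for it in matching if facet in it))
        for facet in remaining
    }
```

Calling `refinements(ITEMS, {})` lists every facet with its value counts; after choosing `{"subject": "cell"}` only the `type` and `licence` facets are offered, with counts restricted to the three matching records.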
- http://homepages.cwi.nl/~lloyd/WIS/SWUI.ppt - interesting presentation about UIs for the semantic web; lots of nice examples. From the conclusions slide:
Semantic Web posts facts for the world. SW User Interaction shows these facts. SWUI techniques include: (faceted) browsing; graphs; recommenders and other personalization; clustering; narrative and rhetoric.
- http://www.miskatonic.org/library/facet-biblio.html
- http://www.mel.nist.gov/msid/conferences/SWESE/repository/15ws_reps.pdf (section 5)
- http://www.searchtools.com/info/faceted-metadata.html
- Also check (offline when I tried):
- simile.mit.edu/wiki/Faceted_Browser
- simile.mit.edu/wiki/Longwell_User_Guide
By comparison, mSpace was described as a column browser (?)
There was mention of a report by Emma Tonkin and Paul Shabajee (SWED/IUGO?) that characterizes browsing states. I searched for this but couldn't find anything. Did I get this right?
Data models
(Context for this?):
- Subject hierarchy - 300,000 words
- Project, for viewing and navigation - uses SKOS hierarchy for navigation only
- The need for flexibility in the model used for constructing a view leads to manual construction only (for now)

