DefiningImageAccess/Project/SCULPTEUR
SCULPTEUR
| DefiningImageAccess/Project/SCULPTEUR | |
|---|---|
| Homepage | http://www.sculpteurweb.org/ |
| Has sub-project | DefiningImageAccess/Project/eChase |
| Status | Completed |
| JISC project | false |
| Focus | Metadata Creation, Metadata Harvesting, Metadata Query and Virtual Exhibition |
SCULPTEUR project details
- http://www.sculpteurweb.org/ - the project web site, which is singularly unhelpful.
- http://www.sculpteurweb.org/html/events/D7.1_Public.zip - "SCULPTEUR D7.1: Semantic Network of Concepts and their Relationships - Public Version", contains a version of the material from "SCULPTEUR D7.5: Semantic Layer Implementation", mentioned below.
- http://eprints.ecs.soton.ac.uk/8593/
- http://www.ecs.soton.ac.uk/research/projects/sculpteur
See also:
EU project, IST 5th framework. 3M Euro funding. May 2002 to May 2005.
"This paper describes the design and prototype implementation of a novel architecture for integrated concept, metadata and content based browsing and retrieval of museum information. ..."
"In the same way that there is no centre of the Web there is no centre of the SCULPTEUR system ... The SCULPTEUR system will enable users to approach the same data in multiple different ways using a single interface."
The SCULPTEUR project appears to be about a system for retrieval of cultural heritage information that mirrors in many respects what we wish to do for research images, using Semantic Web ideas to make metadata content explicit and provide a basis for combining information from multiple sources. "The extensive use of semantic web technologies in SCULPTEUR provides a way to establish common semantics between heterogeneous digital libraries containing multimedia collections".
SCULPTEUR uses CIDOC CRM as the basis for its main ontology, and augments this with some additional terms for its specific area of application. It also implements a concept browser, allowing a user to find terms or concepts that are related in the ontology. The project also identified missing information and developed tools that use knowledge extraction techniques to locate and extract such information from ordinary web pages. "Information processing is required for extracting the missing relations from the Web. Whereas search engines (e.g. 'Google' or 'Yahoo') can retrieve pages which are possibly relevant to the information required, the extraction of the specific relations within the pages can be better served by knowledge extraction techniques." This is used to assist rather than replace the curation process: "Human experts are used to validate correctness of information before it is committed to the knowledgebase."
The concept browser is used to assist the construction of queries that can use this semantic information. The work includes some evaluation of the effectiveness of the user query interface thus created.
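As a rough illustration of what a concept browser has to compute when a user selects a term, the sketch below walks a small, simplified fragment of the CIDOC CRM class hierarchy to list the superclasses of a concept and the properties it inherits. The fragment (with intermediate classes omitted), the dictionary layout and the function names are assumptions made for illustration; they are not taken from the SCULPTEUR implementation.

```python
from collections import deque

# Simplified fragment of the CIDOC CRM class hierarchy (child -> parents).
# Intermediate classes are omitted; the CRM definition is authoritative.
SUPERCLASSES = {
    "E22 Man-Made Object": ["E19 Physical Object", "E24 Physical Man-Made Thing"],
    "E19 Physical Object": ["E18 Physical Thing"],
    "E24 Physical Man-Made Thing": ["E18 Physical Thing", "E71 Man-Made Thing"],
    "E18 Physical Thing": ["E1 CRM Entity"],
    "E71 Man-Made Thing": ["E1 CRM Entity"],
    "E1 CRM Entity": [],
}

# Properties listed against the class that declares them (a tiny sample).
DECLARED_PROPERTIES = {
    "E1 CRM Entity": ["P1 is identified by"],
    "E18 Physical Thing": ["P45 consists of"],
    "E71 Man-Made Thing": ["P102 has title"],
}


def ancestors(concept):
    """Return every superclass of `concept`, in breadth-first order."""
    seen, order, queue = {concept}, [], deque([concept])
    while queue:
        current = queue.popleft()
        for parent in SUPERCLASSES.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def inherited_properties(concept):
    """Collect properties declared on `concept` and on all its superclasses."""
    props = set(DECLARED_PROPERTIES.get(concept, []))
    for cls in ancestors(concept):
        props.update(DECLARED_PROPERTIES.get(cls, []))
    return sorted(props)


print(ancestors("E22 Man-Made Object"))
print(inherited_properties("E22 Man-Made Object"))
```

Even in this toy fragment a leaf class picks up properties from three different ancestors, which hints at why a raw display of the full ontology becomes cluttered.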
The ontology can also be used to enable different ways of presenting query results: "... possible to view these outputs in many ways such as 2D thumbnails ordered by similarity to a query image, art object titles chronologically ordered on a timeline, or paintings presented in a virtual gallery". This information also includes "Digital Attributes", which can be used to determine the applicability of specialized metadata analysis (e.g. image colour space analyses). "SCULPTEUR will use the system ontology to determine the right tools, data and algorithms to use for a particular query".
One of the papers [1] contains some discussion of using SRW and extensions to access multimedia collections.
[1] Much of the information for this survey comes from: Addis, M., Boniface, M., Goodall, S., Grimwood, P., Kim, S., Lewis, P., Martinez, K. and Stevenson, A. (2003) SCULPTEUR: Towards a New Paradigm for Multimedia Museum Information Handling. In Proceedings of Semantic Web ISWC 2870, 582-596, also available from http://eprints.ecs.soton.ac.uk/8593/.
[2] Sinclair, P. A. S., Goodall, S., Lewis, P. H., Martinez, K. and Addis, M. J. (2005) Concept browsing for multimedia retrieval in the SCULPTEUR project. In Proceedings of The 2nd Annual European Semantic Web Conference (in press), Heraklion, Crete. Metadata at http://eprints.ecs.soton.ac.uk/10913/, but full text is not available.
Further observations
Further observations from two additional documents supplied privately to us, and apparently not published on the web or elsewhere:
- SCULPTEUR D7.5 Semantic Layer Implementation - this is an important document, and there are many lessons here for our proposed data web developments. A version of this material is also in project deliverable D7.1 (the public version linked above).
- SCULPTEUR D7.6 Interoperability Protocol
Semantic layer
- This section based on: SCULPTEUR D7.5 Semantic Layer Implementation
The Semantic Layer Implementation document contains a survey of reference models, including CIDOC CRM, Iconclass, Getty's Art & Architecture Thesaurus (AAT), Getty Thesaurus of Geographic Names (TGN), Union List of Artist Names (ULAN), and the ABC Ontology from the Harmony project:
- "[CIDOC] CRM was found to be the most suitable for SCULPTEUR. As a formal ontology for integrating heterogeneous cultural heritage information, it is able to model the complex objects and relations present in the metadata schemas of our museum partners. The CRM is an open model, so it can be extended to cover virtually any possible relation; for example, there may be specific documentation techniques used by a single museum. Although domain specific terminology can be modelled with the CRM, no vocabularies or thesauri are provided. Reliable sources of such information are required and in SCULPTEUR we have been investigating the use of controlled lists that are defined in the museum metadata."
- "An advantage of using the CRM is that it avoids term mismatches in the different metadata schemas used by cultural heritage institutions. By using the CRM as the common ontology, cross collection searching will be possible enhancing the opportunities of sharing and communicating information among the museum partners."
- "In order to demonstrate the efficiency of CRM for capturing and representing museum information, attempts to map existing cultural databases to the CRM were carried out and reported in [11][24]. The mapping results showed that user datasets could be transformed to the CRM entities without loss of information meaning that the CRM is sufficient for dealing with a variety of information types." Some other ontology mapping efforts are also mentioned.
- "... for some users, it can be a challenge to make use of the CRM due to its lack of documentation and its complexity. To those who have no or little experience of how to employ an ontology, it is even more difficult to apply the CRM to their systems. A few examples provided by the CRM experts are good places to start, but some introductory courses on ontology studies are crucial for such users."
- "The problems and difficulties in developing the semantic layer in SCULPTEUR should not be underestimated. The importing and mapping of legacy museum data, both in terms of concepts and instances in the ontology is a complex manual process involving collaboration between technologists, domain experts and CRM experts.
- Problems occurred at various levels: from establishing coherent semantics at the highest level to interpreting or unravelling obscure coding and data formats at the lowest level. It was crucial to establish the mappings so that development could start on the tools for automatically structuring the data in the museum partner legacy systems according to the ontology."
- "Members of the CRM committee believe that the mapping process is straightforward, and they have had experiences where over a short workshop they are able to train museum information specialists and aid them in producing a complete mapping of their metadata schema. In hindsight, the ontology workshop organised at the start of the project would have benefited greatly with the input from a CRM expert, as many of the problems encountered would have been resolved immediately."
- "It may also be useful to develop tools so that the mapping process can be supported incrementally and more dynamically. For instance, instead of performing a complete mapping of an entire dataset, one could start with a few key fields and return to add mappings as they are needed."
- "The approach taken to instance population was to use an SRW service to dynamically structure instance information directly from a relational database into CRM-based XML. Although advanced semantic web queries are not possible through this interface, it is possible to obtain RDF dumps, either in part or complete, of the museum datasets. Using the SRW to perform these types of queries might also be possible with further work." (There is also a description of an earlier attempt based on D2R Map, which fell foul of Jena model memory limitations; these problems may be obviated in later versions of Jena and/or D2R Server.)
- "The Concept Browser is able to display the CIDOC CRM ontology in a graphical way, using a graph-based approach for the visualisation based on the “TouchGraph” library ..."
- "It is important to visualise the properties defined for each concept in the ontology, so a new view was developed that displayed these relations. Users would be able to switch between the concept hierarchy view and the property view."
- "The CRM contains 80 classes and 130 properties, and when considering that low-level concepts inherit the properties of the higher-level concepts the display of this information can get very cluttered and confusing."
- "... recommendations by the CRM special interest group stated that the CRM should never be shown raw to the user. The design of the CRM was motivated more by completeness and logical correction than human comprehension, and as such it would not be suitable as a base for a user interface. The concept browser interface being developed could be useful for learning about the CRM, or even as a reference for browsing the CRM documentation. It could be developed into a useful tool to support the process of mapping metadata schemas to the CRM. However, it was clear that following this design would not be result in a useful interface for browsing information in the museum collection metadata systems."
- Discussion of simplified display strategies leads to mention of a tool in which users can define simplifications of the display. (I'm not sure I fully follow this, as it would appear to be inconsistent with not allowing users to see the CRM; maybe it's the ontology mappers, or similar, who use this tool?)
- "The implementation of the mSpace interface has been useful, and our museum partners have commented favourably on the technique. Applying the mSpace idea to a real world application has raised interesting issues. The final evaluation process is still needed to obtain an idea of the interface’s effectiveness"
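The incremental mapping idea quoted above (start with a few key fields and add mappings as they are needed) can be sketched as a small lookup table that turns a legacy record into CRM-style statements. The column names, the sample record and the `object/<id>` identifier scheme are hypothetical; only the CRM property labels come from the CRM itself (the notation for inverse properties varies between CRM versions). This is not the SRW-based mechanism the project actually built.

```python
# Hypothetical mapping from legacy column names to CRM property paths.
# Only a few key fields are mapped to begin with.
FIELD_MAPPINGS = {
    "title": ["P102 has title"],
    "artist": ["P108i was produced by", "P14 carried out by"],
}


def map_record(record, mappings, id_field="object_id"):
    """Turn one legacy row into (subject, property path, value) statements.

    Columns without a mapping are reported rather than silently dropped,
    so the mapper can see what still needs attention.
    """
    subject = f"object/{record[id_field]}"
    statements, unmapped = [], []
    for column, value in record.items():
        if column == id_field:
            continue
        path = mappings.get(column)
        if path is None:
            unmapped.append(column)
        else:
            statements.append((subject, " / ".join(path), value))
    return statements, unmapped


# A made-up legacy row, for illustration only.
row = {"object_id": 42, "title": "Untitled", "artist": "Unknown", "medium": "oil"}
statements, unmapped = map_record(row, FIELD_MAPPINGS)
print(statements, unmapped)

# Later, a further mapping is added without redoing the earlier ones.
FIELD_MAPPINGS["medium"] = ["P45 consists of"]
statements_v2, unmapped_v2 = map_record(row, FIELD_MAPPINGS)
print(statements_v2, unmapped_v2)
```

Reporting unmapped columns on each run is one simple way to support the "return to add mappings as they are needed" workflow.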
Interoperability protocol
- This section based on: SCULPTEUR D7.6 Interoperability Protocol
The Sculpteur interoperability protocol document describes the project's use of XML, Web services, SRW and CQL as the basis of an interoperable protocol for finding and accessing information about cultural heritage artifacts. These "nuts and bolts" are combined with semantic interoperability layers (CIDOC CRM, vocabularies for content, and search facilities) to define a complete interoperability stack:
- Search capabilities, content analysis
- Content semantics, controlled vocabularies
- Structural semantics, meaning of fields and properties
- Query language (e.g. CQL)
- Protocol (e.g. SRW)
- Access mechanism (web services, WSDL)
- Syntax and data structures (XML, etc.)
Members of the Southampton University team have since told us that, if they were starting out today, and in light of developments in the years since the inception of Sculpteur, they would choose differently, basing the "nuts and bolts" elements on RDF, REST style protocol interactions over HTTP and SPARQL queries.
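For comparison, the RDF, REST and SPARQL alternative mentioned above might look roughly like the following: a SPARQL query carried as the `query` parameter of a plain HTTP GET, as the SPARQL protocol allows. The endpoint URL and the CRM namespace URI are placeholders chosen for illustration, and the request is only constructed here, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; no such service is implied by the source documents.
ENDPOINT = "http://example.org/sparql"

# The namespace URI is a placeholder standing in for whichever RDF
# encoding of the CRM a real deployment would use.
QUERY = """\
PREFIX crm: <http://example.org/crm#>
SELECT ?object ?title WHERE {
  ?object crm:P102_has_title ?title .
} LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query sent as an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)

# Round-trip check: the query text survives URL encoding unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

The whole interaction is a single URL, which is part of the appeal of this style when compared with a SOAP-based service description.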
The description of semantic elements in this document seems to largely overlap content in the "Semantic Layer Implementation" document.
Section 10 on "Specification of the interoperability protocol" specifies the use of SRW (http://www.loc.gov/z3950/agency/zing/srw/), CQL (http://www.loc.gov/z3950/agency/zing/cql/) and ZeeRex (http://explain.z3950.org/). It further profiles the elements of SRW that form a basis for interoperability between Sculpteur components. A small number of supporting services are also described:
- MediaService, provides feature vector generation and media object upload capabilities and methods to create both 2D and 3D model thumbnails
- LightboxService, provides lightbox management
- QueryHistoryService, provides query history management and retrieval
Section 11 on "CQL query syntax" specifies a subset of CQL syntax that is supported by Sculpteur, noting that "Only a limited set of the CQL specification will be supported ...". Further, it explains "Two new keywords have been added to CQL to add extra functionality required for content based searching and to provide distinct enumerations of terms for a particular metadata attribute".
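To give a feel for the kind of query involved, the sketch below assembles a simple CQL string from index/relation/term clauses joined with a boolean `and`. The index names (`dc.title`, `dc.creator`) and search terms are generic examples, not taken from the Sculpteur profile, and the helper functions are hypothetical. The two Sculpteur-specific keywords are not named in our notes, so they are not shown.

```python
def cql_term(term):
    """Quote a search term, escaping backslashes and embedded double quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def cql_clause(index, relation, term):
    """Build a single search clause such as: dc.title = "madonna"."""
    return f"{index} {relation} {cql_term(term)}"


def cql_and(*clauses):
    """Join clauses with the boolean operator `and`, parenthesising each."""
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(clauses) == 1:
        return clauses[0]
    return " and ".join(f"({clause})" for clause in clauses)


query = cql_and(
    cql_clause("dc.title", "=", "madonna"),
    cql_clause("dc.creator", "=", "raphael"),
)
print(query)
```

A string of this form would be passed as the query of an SRW searchRetrieve request; a profile that supports only a subset of CQL would reject anything outside that subset.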
So it seems that the web service based standards for searching and querying need to be quite heavily "adjusted" to provide the interoperability basis for Sculpteur. We have learned that web service interfaces can be complex and inflexible, and this somewhat reinforces our strategy to pursue a REST style of solution. (It should be noted that REST and web services are not necessarily incompatible, but in practice a simple REST style of interface typically does not need the additional capabilities of a full web service protocol stack.)

