Meetings/20070124/SurveySchemas
Survey data schemas meeting
GK/JZ/(DS)
Agenda
- quick review of the state of project and project plan (15-30 mins)
- for each of the topics to be surveyed (repo. software, repo. schema, tool software), identify and record an initial set of information we'd like to collect about each (30 mins each?)
- discuss schema design issues and conventions (namespaces, naming, structure, ...) (30 mins?)
- pick one topic, and explore the use of Semantic Media Wiki and Media Wiki templates to capture schema-related information; try to put some actual (maybe hypothetical) data into the wiki; try to determine if SMW will do the job (rest of available time).
- review of information from Cambridge meeting
Review of project plan
Generally on track at the moment (but it is early days).
Jun notes that the survey work is very broad, going beyond the goals of the immediate project. GK agrees, but notes that it is useful to help us understand the landscape in which we are getting involved, and that the subsequent project phases will be much more tightly scoped to the specified goals.
Closer examination reveals that the project planning software has lost the original effort estimates, and replaced them entirely with duration estimates. (This may be an artefact of the conversion process.)
Information to be collected
Repository software
- Architecture
- Data Storage (database, file, etc)
- Presented repository object format (content+metadata wrapper)
- Form of object identifiers supported (arbitrary, URI, DOI, etc.)
- Granularity of reference (e.g. images within articles?)
- Image-specific features (e.g. thumbnailing)
- Submission methods (e.g. interactive by object, programmatic by object, batch)
- Metadata exposure and machine access
- Presentation format (e.g. METS, RDF, MPEG-21, etc.)
- Access mechanism (e.g. Web browser (interactive), OAI-PMH, HTTP request, Web services (SOAP, WSDL))
- Associated software
- ingest
- query
- search
- image handling
- image aggregation
- user interface
- other
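The list above could be captured as a structured record before it is turned into wiki templates. The following is only a minimal sketch: the class name, the field names and the "ExampleRepo" entry are hypothetical, chosen to mirror the bullet points, and were not agreed at the meeting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RepositorySoftwareRecord:
    """One survey entry for a piece of repository software (hypothetical shape)."""
    name: str
    data_storage: Optional[str] = None             # e.g. "database", "file"
    object_format: Optional[str] = None            # content+metadata wrapper
    identifier_forms: List[str] = field(default_factory=list)       # e.g. "URI", "DOI"
    reference_granularity: Optional[str] = None    # e.g. "images within articles"
    image_features: List[str] = field(default_factory=list)         # e.g. "thumbnailing"
    submission_methods: List[str] = field(default_factory=list)     # e.g. "batch"
    presentation_formats: List[str] = field(default_factory=list)   # e.g. "METS", "RDF"
    access_mechanisms: List[str] = field(default_factory=list)      # e.g. "OAI-PMH"
    associated_software: Dict[str, List[str]] = field(default_factory=dict)  # role -> tools


def unanswered(record):
    """Return the names of fields for which nothing has been recorded yet."""
    return [key for key, value in vars(record).items() if value in (None, [], {})]


# Hypothetical entry, for illustration only.
entry = RepositorySoftwareRecord(
    name="ExampleRepo",
    data_storage="database",
    access_mechanisms=["OAI-PMH", "HTTP request"],
)
print(unanswered(entry))
```

Keeping every field optional means a partly completed survey entry is still a valid record, and `unanswered` shows what remains to be collected.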
Repository content and metadata schemas
- Repository (technical):
- Software system
- Size (number of entries, images)
- Data Storage (database, file, etc)
- Presented repository object format (content+metadata wrapper)
- Form of object identifiers used
- Granularity of reference
- Image-specific metadata
- Submission methods (e.g. interactive by object, programmatic by object, batch)
- Metadata exposure and machine access
- Presentation format (e.g. METS, RDF, MPEG-21, etc.)
- Web browser (interactive)
- OAI-PMH
- HTTP request
- Web services (SOAP, WSDL)
- Other methods
- Associated software
- ingest
- query
- search
- image handling
- image aggregation
- user interface
- other
- Collections (content):
- Name/identifier
- Subject matter
- Media types (image/audio/video/etc)
- Container format (TIFF, MPEG, etc)
- Encoding format (e.g. compression technique used)
- Public or 'dark'
- Suitability for ImageWeb (use as exemplar?)
- Metadata
- availability
- standards used
- overlap with other repositories
- Policy
- Primary purpose (e.g. preservation, research, teaching)
- Material (university library, college library, research, museum, teaching)
- Identifier allocation
- Metadata minimum requirements
- Metadata standards required or preferred
- Experience and observations
- Strengths and weaknesses of repository software
- Strengths and weaknesses of metadata schemas
- Problems with metadata (e.g. lack, ambiguity)
- How to get good metadata (e.g. policy, help and liaison with depositors)
- Availability of thumbnails for end-user browsing
- Formality of schema use (e.g. keywords, taxonomy, ontology)
- Relationship with Google: What Google indexes (web-accessibility of content)
- Scalability (anticipated possible future size)
- Users (e.g. number, kind of user, nature of use)
- depositors
- accessors
- others?
- discovery modes (e.g. search, browse, formal(-ish) query)
Tool software
Schema deferred for later consideration.
Schema design issues
Jun has 2 questions:
- schema or taxonomy? This information doesn't fit a pure taxonomic hierarchy, so a schema is what we want. We seem to agree on this.
- assuming we refine the schema as we go along, how will we handle mappings between versions, so that semantic queries can be supported? Don't know for sure. But try to revise schemas in ways that don't invalidate existing data, even if it becomes incomplete ("missing isn't broken"). If we fail, the project isn't fatally compromised, and we learn important lessons.
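The "missing isn't broken" idea can be illustrated with a small sketch. It assumes, purely for illustration, that survey entries are stored as simple key/value records and that a schema revision only adds fields; the schema contents and the "ExampleRepo" record are hypothetical.

```python
# Version 2 of the (hypothetical) schema only adds a field; nothing is renamed or removed.
SCHEMA_V1 = {"name", "data_storage"}
SCHEMA_V2 = SCHEMA_V1 | {"identifier_forms"}


def read_record(raw, schema):
    """Read a stored record against a schema; fields it lacks come back as None."""
    return {key: raw.get(key) for key in sorted(schema)}


def missing_fields(raw, schema):
    """List schema fields the record has no value for, without treating that as an error."""
    return sorted(key for key in schema if key not in raw)


# A record written under version 1 is still readable under version 2.
old_record = {"name": "ExampleRepo", "data_storage": "database"}
print(read_record(old_record, SCHEMA_V2))
print(missing_fields(old_record, SCHEMA_V2))
```

Under this approach a query against the newer schema sees older entries as incomplete rather than invalid, so no mapping between versions is needed as long as revisions stay additive.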
Looking at Semantic Media Wiki
The template definition syntax can be seen by viewing the source (i.e. editing) of, say, Template:Person at http://www.ontoworld.org (i.e. go to http://ontoworld.org/wiki/Template:Person, then open the edit tab). There are a number of helper templates that are used by this.
For these templates to work properly, it is also necessary to install the ParserFunctions Media Wiki extension.
With these, the semantic templates seem to be working in our wiki.
To make further progress, we need to experiment using the SMW templates. Over time, we expect appropriate design patterns to emerge.
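As a starting point for such experiments, annotation text can be generated from survey data rather than typed by hand. The sketch below assumes the inline forms `[[relation::target]]` and `[[attribute:=value]]`, which match the Creator annotations recorded at the foot of this page; the function names are hypothetical and the syntax should be checked against the installed SMW version.

```python
def relation(name, target):
    """Format an SMW relation annotation, assumed form [[name::target]]."""
    return "[[{}::{}]]".format(name, target)


def attribute(name, value):
    """Format an SMW attribute annotation, assumed form [[name:=value]]."""
    return "[[{}:={}]]".format(name, value)


def creator_annotations(users):
    """Build Creator annotations for wiki users: relations first, then attributes."""
    relations = [relation("Creator", "User:" + user) for user in users]
    attributes = [attribute("Creator", user) for user in users]
    return " ".join(relations + attributes)


print(creator_annotations(["GrahamKlyne", "JunZhao"]))
```

The same two helpers could be reused inside a template-filling script once the survey schemas settle.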
Creator::User:GrahamKlyne Creator::User:JunZhao Creator:=GrahamKlyne Creator:=JunZhao

