Meetings/20060618-TonyHey-eScience-Oxford

Tony Hey - eScience and Scholarly Communications

18 June 2007

Oxford e-Research Centre / Computing Laboratory

Introduction - Anne Trefethen

Led UK program to support e-Science.

Past Dean of Engineering at Southampton.

Now VP technical computing at Microsoft - to become VP for external research.


Tony Hey

Emergence of new science paradigm

Traditional: experimental, theoretical, computational

Today: e-Science or Data centric science

Too much emphasis on "toy" tools - but industrial tools are better

Example: project Neptune; vision for scientific workflow. Internet-addressable sensors along seabed, or mobile at sea. Distributed data collection. Neptune seafloor + IRIS seismic events + NOAA surface temperatures. Need easy way to create "computational workflows"(?). Slide looks like COM model implementation of myGrid?

e-Science and cyberinfrastructure.

Also, for humanities and social sciences. "Cultural commonwealth". Notion of "digital scholarship". Can bring new analytical and interpretive power to bear.

"digital scholarship" means:

  • building digital collection
  • tools for collection-building
  • tools for analysis
  • use of above to create new intellectual products
  • (one more...)

Example from Jane Hunter's work: "digital repatriation". Allowing Oz aboriginals to reclaim their cultural heritage. Repositories. Ownership to traditional owners. Support customs. Protect against misappropriation. (one more...) Leads to "knowledge spiral".

OECD declaration of access to research data from public funds (2004):

  • open access promotes scientific progress
  • open access maximizes derived value
  • risk that restrictions diminish quality and efficacy of scientific research

Scientific data on the web is (another life cycle):

  • produced, acquired, discovered
  • analyzed, processed, transformed
  • published
  • disseminated
  • archived

Desiderata(?):

  • Easily available, analyzable, searchable.
  • Easily sharable (e.g. http://cas.sdss.org/dr5/en/). Astronomy communities historically tend to be segregated by wavelength; sharing brings these together.
  • Services expose functionality; e.g. BLAST service in workflow.
  • Service composition; e.g. Taverna. Importance of provenance.
  • Security of access. Interoperability between different system security identifiers (e.g. ShibGrid).
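
To make the "service composition" and "provenance" points concrete, here is a minimal Python sketch of a linear workflow that records what each step consumed and produced. It is not how Taverna or myGrid work; the Workflow class and both step functions are hypothetical stand-ins for remote services such as a sequence lookup or a BLAST call.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Workflow:
    """A linear chain of named steps that keeps a provenance record."""
    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def add(self, name: str, func: Callable[[Any], Any]) -> "Workflow":
        self.steps.append((name, func))
        return self

    def run(self, data: Any) -> Tuple[Any, List[Dict[str, Any]]]:
        provenance = []
        for name, func in self.steps:
            result = func(data)
            # Record the step name with its input and output so the final
            # result can be traced back through every intermediate value.
            provenance.append({"step": name, "input": data, "output": result})
            data = result
        return data, provenance


# Hypothetical stand-ins for remote services (assumptions, not real APIs).
def fetch_sequence(accession: str) -> str:
    return {"demo-1": "AACG"}[accession]


def reverse_complement(seq: str) -> str:
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


workflow = (
    Workflow()
    .add("fetch_sequence", fetch_sequence)
    .add("reverse_complement", reverse_complement)
)
result, provenance = workflow.run("demo-1")
for record in provenance:
    print(record)
```

The provenance list is what lets someone else check, or rerun, how a published result was derived.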

New tools "can make a difference": annotation, visualization. Need to go beyond a research prototype - proper dependability. Suggests a role here for IT companies.

Knowledge creation, publishing, archiving, discovery. Semantic markup; example: myGrid.

"Cloud services" - (services in the Web?) - "somewhere out there in the cloud". Add in ancillary services - blogging, processing, upload, storage. E.g. Amazon's Simple Storage Service (S3) and Elastic Compute Cloud (EC2).

A social grid built around the web?

Flavours of "grid":

  • Cross-organization
  • Intra-organization
  • Data centre
  • Social

Don't build complex standards: "we've got enough standards out there"

Research: control of ordering?:

  • searching and visualization; clustering (Grokker)
  • live documents (not whole journal, just selected papers, RSS feeds for commentary, blogs, correspondence, errata).
  • reputation and influence. Different types of peer review.

Nature's 5D framework (10 D's?):

  • Deep Data
  • Discussion and Dialog
  • Detailed Discovery
  • Dynamic discovery
  • Data Display

Publications as live documents.

New forms of peer review. "Faculty of 1000 ...." Users can build their own trust models.

I like this idea of personalized trust models
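
One way to picture a personalized trust model, as a rough Python sketch and not anything shown in the talk: each reader assigns their own weights to reviewers, and a paper's score is the weighted mean of the reviewers' ratings. The function name, reviewer names and numbers are all made up for illustration.

```python
from typing import Dict


def trusted_score(ratings: Dict[str, float], trust: Dict[str, float]) -> float:
    """Weighted mean of reviewer ratings, using one reader's trust weights.

    Reviewers the reader has not assigned a weight to count as weight 0.
    """
    total_weight = sum(trust.get(reviewer, 0.0) for reviewer in ratings)
    if total_weight == 0:
        raise ValueError("no trusted reviewer has rated this paper")
    weighted_sum = sum(
        score * trust.get(reviewer, 0.0) for reviewer, score in ratings.items()
    )
    return weighted_sum / total_weight


# Hypothetical ratings for a single paper, on a 1-5 scale.
ratings = {"alice": 4.0, "bob": 2.0, "carol": 5.0}

# Two readers with different trust models see different scores.
reader_a = {"alice": 1.0, "bob": 1.0}
reader_b = {"carol": 1.0, "bob": 3.0}

score_a = trusted_score(ratings, reader_a)
score_b = trusted_score(ratings, reader_b)
print(score_a, score_b)
```

The same set of reviews gives each reader a different ranking, which is the point of letting users build the model themselves.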

Connotea - tagging - del.icio.us - from prose descriptions via a sentence to a few keywords.

Lab notebooks as blogs. Recording experiments that fail - publishable? - useful? Critical details are often missing. Group blog to capture local knowledge. Record "negative results", use for training new graduates.

Jeremy Frey instrument blog: e.g. MQTT Lego Microscope. The instrument itself generates a blogroll.

"Capture digitally as early as possible". Lack of trust in digital recording mechanisms. (Proposes role for commercial companies, again.)

Wikis - a cautionary lesson. Vandals trashed the content. I thought wikis were supposed to keep old revisions of each page, for just this reason.

EU petition on open access, February 2007. Anecdote from Southampton: as student, didn't have access to all of department's output, as library couldn't afford to subscribe to all journals that researchers published in. New balance needed. How much of this is also due to the publish-or-die effect?

... towards a policy on open access for the outputs of EU-funded research.


The future of university libraries

Library 2.0 = Library + Web 2.0. Talis.

Role for librarians: to capture and protect the intellectual output of the institution. Statistic from Google Scholar shows Southampton at no. 15 worldwide in citations/publication, due to ePrints.

1400 repositories, 20 software systems.

Need to link all these together; cf. ORE federation of repositories.

w.r.t. earlier slide of social grid, what about linking between layers?

"Loosely connected federated repositories - search in a way you can get meaningful results"

Gives example of a putative seminar by Bill Gates being videoed and lodged in institutional repository; what about blogged video annotation of such? One for Sabre?

"Digital information lasts forever, or five years, whichever comes first"

How to guarantee spreadsheet data for 20+ years?

EU Planets project. Microsoft OOXML (Office Open XML formats). OOXML is an ECMA standard. Microsoft Covenant Not to Sue (CNS) when using OOXML. OpenXML translator project. Personal statement by TH: "I believe Microsoft are serious in doing this - I wouldn't be there otherwise"

Questions

Mike Brady - takes a pot-shot at Wikipedia; implies that it is generally flawed; cannot be cited as authority? A: some of the articles are good. But how to know which are good and which are not? A': Having access to data creates a kind of authority, or means to evaluate.

I wonder if this ought to be seen not so much as a problem, but an opportunity, in that it creates opportunities to learn how to assess information provided, and to continually practice doing so.


Notes

Instrument arrays: How does this relate to ubiquitous web project at W3C?
