ADMIRAL 20100204 meeting - IT Support

Notes from meeting with Zoology IT support to discuss ADMIRAL issues

When
4-Feb-2010, 10:00
Where
IT Support office
Present
SE (IT Support)
GK (ADMIRAL project manager, Graham Klyne)
DG (ADMIRAL project)
(As these notes are in a public wiki, personal references are restricted to initials, except where explicit permission for public use of a full name has been expressed.)
Duration
45 minutes


Goals

  • Start progress towards a proposal for long-term hosting of the data sharing facility
  • Gather input on supportability of chosen platform options
  • Gather recommendations for authorization/access control (as opposed to authentication) mechanisms
  • Any other information that might help us

Agenda

  1. Introduction, review of ADMIRAL's goals, and issues of supporting department researchers beyond the project lifetime.
  2. Review choice of platform and technologies. We are currently testing systems based on Ubuntu 9.10 JeOS, Samba, Apache2, Kerberos 5, WebAuth, DropBox, to be augmented by web applications to assist submission to the University research archive (ORA/Databank) for long-term preservation.
  3. Review authorization/access control options (based on University SSO authentication).
  4. Hosting platform. We are currently testing in a VM environment based on Linux KVM; among other things, this offers options for EC2-compatible hosting via Ubuntu UEC/Eucalyptus. Other hosting platforms could be considered.
  5. Long term hosting environment. We have some budget for this, to provide an environment that can outlive the project. The immediate choice is departmental vs. OUCS hosting; if local, there are matters of hardware selection.
  6. Other operational matters to consider (e.g. system monitoring, access control configuration, security review, ...)

Discussion

Introduction

Briefly recounted that ADMIRAL's goal is to provide a system for facilitating capture and organization of research data with a view to (selective) submission to the Oxford Research Archive for long-term preservation. The starting point is to create a shared file system that small research groups can use to save their data with minimal setup requirements. Initially this will provide for sharing internally and with selected collaborators, and for automated backup of that data. We will then layer onto this system a set of tools for annotation and organization of the data, followed by packaging and submission of selected data to the Oxford Research Archive.

This system is intended to outlive the ADMIRAL research project.

It was immediately noted that it is currently not clear that departmental IT Support are responsible for, or reasonably resourced to provide, long-term support of such a system. Currently, support would be limited to the virtual hosting environment, preferably based on VMWare.

It is also not clear to what extent support for such systems might be provided for centrally (e.g. by OUCS). It seems fairly clear that support for a diversity of non-standard systems is not viable: if such support is to be provided, the systems need to be standardized. Another support option might be via open-source community support - may be worth discussing this with OSS-Watch (this clarifies the requirement to create an open source repository for version-management of project assets).

ACTION: SE to ask the departmental administration about the policy towards supporting research data management systems

DONE: GK to raise this matter for discussion with other projects and policy influencers (initially as an ADMIRAL project blog posting) - http://admiral-announce.blogspot.com/2010/02/research-data-management-support-issues.html

Review choice of platform and technologies

SE thought that the current choice of Ubuntu/Samba/Apache platform technology was generally sound and reasonable.

Review authorization/access control options

SE suggested that a directory service, probably OpenLDAP, was the way to go, possibly with a simple home-grown web interface for user management. There are PAM modules for LDAP-based authorization, and Samba probably supports this natively.
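
The separation discussed here, with authentication left to University SSO and authorization decided by directory group membership, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the user, group and share names are hypothetical, and plain Python dictionaries stand in for the group entries that would be held in a directory service such as OpenLDAP.

```python
# Hypothetical sketch: authorization by directory group membership.
# GROUP_MEMBERS stands in for group entries in a directory (e.g. OpenLDAP);
# SHARE_ACL stands in for per-share access rules. All names are invented.

GROUP_MEMBERS = {
    "group-a": {"alice", "bob"},
    "group-a-collaborators": {"carol"},
}

SHARE_ACL = {
    "group-a-data": {
        "read": {"group-a", "group-a-collaborators"},
        "write": {"group-a"},
    },
}


def groups_for(user):
    """Return the set of groups that list `user` as a member."""
    return {group for group, members in GROUP_MEMBERS.items() if user in members}


def is_authorized(user, share, mode):
    """True if an already-authenticated `user` may access `share` in `mode`.

    Authentication (who the user is) is assumed to have happened elsewhere,
    e.g. via SSO; this function only answers what the user may do.
    """
    allowed_groups = SHARE_ACL.get(share, {}).get(mode, set())
    return bool(groups_for(user) & allowed_groups)
```

In this sketch a collaborator such as the hypothetical `carol` can read `group-a-data` but not write to it, and unknown users or shares are denied by default.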

Hosting platform

VMWare, Xen and KVM were all mentioned as possible hosting platforms.

For long-term support, SE indicated a distinct preference for VMWare, as such a hosting environment is already running within the department, and it would be (technically) possible to migrate this to the University's NSMS (who run a full VMWare hosting service, at a price) if necessary.

It was noted that the ADMIRAL project is currently using KVM, but agreed that we'd consider migrating our work to VMWare at some stage during the course of the project (though not immediately).

GK noted concerns about remote management access to VMWare from non-Windows clients; apparently, Firefox on Linux (but not on MacOS) works well with the VMWare web-based management interface.

Long term hosting environment

For live hosting, given that we are anticipating several terabytes of research data based on initial survey returns, SE indicated that departmental hosting would be very much more economical than OUCS/NSMS hosting.

As noted previously, IT Support for this would probably be limited to the hardware and virtual hosting environment.

SE noted that the use of Amazon EC2 as a hosting option should be discussed with the University's legal team, as the data might then fall outside UK jurisdiction. Even if the data is required by a research contract to be openly published, it is still (in general) the University's property and responsibility, so open publication does not remove the jurisdictional concerns. DG noted this could also be an issue for DropBox.

As a very rough indication, SE suggested that a suitable dual quad-core server would likely cost around £3.5K, and a few terabytes of high-performance storage (SAS, i.e. Serial Attached SCSI, or possibly network-attached storage?) would cost about £2K. Both estimates exclude VAT. Storage costs might be reduced by using slower SATA drives for less-used "archival" storage. Some allowance should also be made for maintenance over the required lifetime.
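
The arithmetic behind such an estimate can be sketched as below. Only the two hardware figures come from the discussion; the VAT rate, the annual maintenance figure and the lifetime are left as parameters because no values were given for them, and the function name and the decision to add maintenance without VAT are assumptions made purely for illustration.

```python
def hosting_cost_estimate(server_gbp=3500.0, storage_gbp=2000.0,
                          vat_rate=0.0, annual_maintenance_gbp=0.0, years=0):
    """Rough total cost in GBP for departmental hosting hardware.

    server_gbp and storage_gbp default to the ex-VAT figures suggested in
    the meeting. vat_rate, annual_maintenance_gbp and years are placeholders
    (assumptions): no values for them were discussed.
    """
    hardware_ex_vat = server_gbp + storage_gbp
    hardware_inc_vat = hardware_ex_vat * (1.0 + vat_rate)
    # Maintenance is added as a flat yearly allowance (an assumption).
    return hardware_inc_vat + annual_maintenance_gbp * years
```

With the defaults this gives £5,500 ex-VAT for server plus storage; the other parameters can be filled in once the hosting requirements and lifetime are specified.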

ACTION: GK: Need to get some idea about frequency/volume of data access.

ACTION: GK: specify requirements for hosting, then raise for discussion with SE to propose suitable hardware purchase. Q: can we sensibly phase the purchase, rather than fix the specification up-front?

Other operational matters

SE noted that Nagios is used in the department for live system monitoring, so it would be a good choice for us.

SE mentioned security penetration testing (something not previously considered).

GK mentioned performing a security/trust modelling and review exercise, to create a baseline which could be incrementally reviewed and improved as required. SE mentioned that the University has a security best practice review (contact: JA in the OXCERT team), that could form a useful basis for security review.

Follow-up

Silk data

5-Feb-2010

Spoke to CH about data volumes and usage frequency: volumes are currently about 1.8 TB for the entire group, with relatively infrequent access. The large files are videos, confocal image stacks and beam-line data, which are probably best held on slower drives not shared with other research groups.
