GENIE: Grid Enabled Integrated Earth System Model
by Andrew Price, Tim Lenton, Simon Cox, Paul Valdes, John Shepherd and the GENIE team
An understanding of the astonishing and, as yet, unexplained natural variability of past climate is an essential prerequisite for increasing confidence in predictions of long-term future climate change. GENIE is a new Grid-enabled modelling framework that can compose an extensive range of Earth System Models (ESMs) for simulation over multi-millennial timescales, to study ice age cycles and long-term human-induced global change. Grid technology is a key enabler for the flexible coupling of constituent models, the subsequent execution of the resulting ESMs and the management of the data that they generate.
To predict the future, we must understand the past. In the case of planet Earth we do not yet fully understand the mechanisms that have driven the most fundamental change in climate over the past million years: the transitions between ice ages and warm interglacials. To improve understanding of the physical processes and feedbacks that are important in the Earth System, the GENIE project is creating a component framework that allows the flexible coupling of constituent models (ocean, atmosphere, land, etc.) of varying resolution (grid sizes), dimensionality (2D and 3D models) and comprehensiveness (resolved physics vs. parameterisations) to form new integrated ESMs. Through the systematic study of a hierarchy of GENIE models, the project aims to determine the spatial resolution and process complexity that an ESM actually needs in order to exhibit past Earth System behaviour.
The GENIE project is funded by the Natural Environment Research Council (NERC) and brings together expertise from UK and international academic institutions. The Universities of Bristol and East Anglia, the Southampton Oceanography Centre and the Centre for Ecology and Hydrology have provided mature models of major Earth System components including atmosphere, ocean, sea ice, ocean biogeochemistry, sediments, land vegetation and soil, and ice sheets. The e-Science centres at the University of Southampton and Imperial College have been engaged to provide the software infrastructure for the composition, execution and management of the integrated Earth System Models and their output on the Grid. We have strong international collaborations with researchers at the Frontier Research Centre for Global Change in Japan, the University of Bern in Switzerland and the University of British Columbia in Vancouver.
e-Science Challenge
The objectives of the GENIE project are to develop a Grid-based computing framework which will allow us:
- to flexibly couple together state-of-the-art components to form a unified Earth System Model (ESM)
- to execute the resulting ESM across a computational Grid
- to share the distributed data produced by simulation runs
- to provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers.
Software
Grid computing technology is required to ease the construction of new Earth System Model instances, automate the process of model tuning, speed up the execution of individual long integrations, enable large ensembles to be run and managed easily, and feed data back into model development. A principal aim of the project is to ensure that the Grid is usable directly from the environment in which the climate modellers do their work. The software deployed to meet these requirements is built upon products of the first phase of the UK e-Science programme. These include:
- Geodise Compute Toolbox
The Geodise computational toolbox for Matlab is a suite of Matlab functions that give programmatic access to Globus Grid-enabled compute resources. The toolbox uses the APIs provided by the Java CoG toolkit to support the submission of compute jobs to Globus-enabled resources, GridFTP data transfer and the management of proxy certificates. An interface to Condor resources is also provided.
- Geodise Database Toolbox
An augmented version of the Geodise Database Toolbox has been deployed to provide a distributed data management solution for the GENIE project. The Geodise system exploits database technology to enable rich metadata to be associated with any data file, script or binary submitted to the repository for archiving. XML schemas define the structure of the metadata and are mapped into the underlying Oracle 9i database. The database system is built on open, W3C-compliant standards and is accessed through a web services interface. Client tools are provided in Matlab and Jython, allowing both programmatic and GUI access to the system.
- OptionsMatlab
OPTIONS is a design exploration and optimisation package that has been developed in the Computational Engineering and Design Centre at the University of Southampton. This software provides a suite of sophisticated multidimensional optimisation algorithms developed primarily for engineering design optimisation. The package has been made available to Matlab via the OptionsMatlab interface and has been exploited in conjunction with the Geodise Toolboxes to tune GENIE model parameters.
Tuning
A key challenge for the project is to tune or re-tune the parameterisations of individual model components so that the new coupled ESMs simulate reasonable climate states. In particular, the fluxes passed between components must be compatible if the resulting coupled model is to be stable. We have exploited the Grid-enabled toolset in conjunction with the OPTIONS package to apply Response Surface Modelling techniques and Genetic Algorithms to optimise GENIE model parameters. The ensemble Kalman Filter, a data assimilation method, has also been employed. Together these techniques provide a comprehensive set of tools for a programme of extensive model tuning, which has progressed in step with model development.
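To make the idea of parameter tuning with a genetic algorithm concrete, the following is a minimal sketch in Python. It is not the OPTIONS package or the GENIE tuning procedure: the two-parameter `toy_model`, the `TARGET` values, the misfit function and all algorithm settings are hypothetical stand-ins chosen only for illustration. In practice each misfit evaluation would correspond to a full model run.

```python
import random

# Hypothetical "observed" diagnostics that a tuned model should reproduce.
TARGET = (0.3, 1.7)

def toy_model(params):
    """Stand-in for a model run: maps two parameters to two diagnostics."""
    a, b = params
    return (a * a + 0.1 * b, b - 0.5 * a)

def misfit(params):
    """Sum of squared differences between model diagnostics and targets."""
    return sum((o - t) ** 2 for o, t in zip(toy_model(params), TARGET))

def tune(pop_size=40, generations=60, bounds=(0.0, 3.0), seed=0):
    """Very small genetic algorithm: truncation selection with elitism,
    blend crossover and Gaussian mutation, clipped to the bounds."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(pop_size)]
    history = []  # best misfit found in each generation
    for _ in range(generations):
        pop.sort(key=misfit)
        history.append(misfit(pop[0]))
        parents = pop[: pop_size // 4]  # the fittest quarter survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            w = rng.random()  # blend weight between the two parents
            child = tuple(
                min(hi, max(lo, w * x + (1 - w) * y + rng.gauss(0.0, 0.05)))
                for x, y in zip(p, q)
            )
            children.append(child)
        pop = parents + children
    pop.sort(key=misfit)
    history.append(misfit(pop[0]))
    return pop[0], history

best, history = tune()
print("best parameters:", best)
print("best misfit:", misfit(best))
```

Because the fittest members are carried over unchanged, the best misfit never gets worse from one generation to the next. The evaluations within a generation are independent of one another, which is what makes this kind of search a natural fit for execution across many Grid resources.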
Current and Future Study
We are exploiting local resources (Condor pools, Beowulf clusters) and the UK National Grid Service to perform extensive studies of GENIE models. The computational Grid provides the means to perform large ensemble runs. To date, experiments have studied the stability of the thermohaline circulation to multi-parameter freshwater inputs, typically using ~1000 instantiations of the model and involving ~40 million years of model integration in total.
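As a rough illustration of how such an ensemble might be laid out, the sketch below builds a list of run descriptions from a grid of parameter values. The parameter names, the 10 x 10 x 10 layout and the uniform 40,000 years per run are assumptions made for this example; the text above gives only the approximate totals (~1000 runs, ~40 million model years), which imply an average of roughly 40,000 model years per run.

```python
from itertools import product

def build_ensemble(param_grid, years_per_run):
    """Return one run description per combination of parameter values."""
    names = sorted(param_grid)
    members = []
    for i, values in enumerate(product(*(param_grid[n] for n in names))):
        members.append({
            "run_id": f"run{i:04d}",
            "years": years_per_run,
            "params": dict(zip(names, values)),
        })
    return members

# Hypothetical parameter names and values, for illustration only.
grid = {
    "freshwater_flux_scale": [0.1 * k for k in range(10)],
    "atm_diffusivity_scale": [0.5 + 0.1 * k for k in range(10)],
    "ocean_diffusivity_scale": [0.5 + 0.1 * k for k in range(10)],
}

ensemble = build_ensemble(grid, years_per_run=40_000)
total_years = sum(member["years"] for member in ensemble)

print(len(ensemble), "runs,", total_years, "model years in total")
```

Each entry is independent of the others, so the list can be handed to whatever job submission mechanism is available, with the run identifier serving as a key for the results stored in the database.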
The database repository plays a central role in these studies as a resource for both steering computation and sharing of the data. Future work will involve the development of a distributed federated database system, deployment of the database on the National Grid Service data node(s) and further enhancements to the data management tools. The project will adopt the GeodiseLab Toolbox from the OMII (Open Middleware Infrastructure Institute) managed programme when this product is released to the community.
Links:
http://www.genie.ac.uk/
http://www.geodise.org/
http://www.omii.ac.uk/
Please contact:
Andrew Price,
Southampton Regional e-Science Centre, University of Southampton, UK
Tel: +44 23 8059 8375
E-mail: a.r.price@soton.ac.uk