THETIS - A Data Management and Visualization System for the Support of Coastal Zone Management in the Mediterranean Sea
by Catherine Houstis
The aim of the THETIS project is the design and development of an open federated environmental scientific information system for the support of the interdisciplinary area of Coastal Zone Management. The THETIS system architecture takes advantage of distributed computing infrastructures, including Digital Library, Mediation, Geographical Information Systems (GIS) and World Wide Web technology, to provide single-point access, location, integration, retrieval, and visualization of distributed geospatial data and programs remotely via a Web browser.
Scientific repositories have traditionally evolved in isolation. They are essentially legacy systems running in stand-alone mode. There is an increasing need for scientists, engineers, and decision-makers to integrate the wealth of accumulated information. Coastal Zone Management support and, more generally, decision support based on scientific data depend on the dissemination and visualization, from a single point of access, of information that may be geographically distributed over a number of heterogeneous and autonomous scientific data repositories. In addition, such support can be significantly aided by the generation of new information through the on-demand remote execution of simulation and data processing models, possibly located on legacy systems.
Coastal Zone Management (CZM) is a methodology for the holistic management of all coastal resources with the ultimate aim of promoting sustainable development of coastal zones. An abundance of information has been accumulated for CZM. This information is the result of work in several scientific disciplines such as marine biology, oceanography, chemistry, and engineering. It is very diverse in terms of content, resolution or accuracy, storage formats, and the types of repositories that manage it. Typical information used for CZM purposes includes monitored data and images that may be stored in various databases, files, and even spreadsheets. Furthermore, mathematical models exist for simulating physical processes of coastal sea circulation, wave generation, sediment transport, etc. In addition, techniques such as image processing and statistical methods for reformulating, fusing, or extracting information from monitored data are available. Data access may require specialized vendor database tools, and integrated access does not exist. In the simplest case, data exchange is accomplished through surface-mailed diskettes. Moreover, scientific models are often implemented as legacy programs that require specialized hardware and software to execute.
The THETIS system is an Environmental Scientific Information System designed and implemented for managing data for coastal zones in the Mediterranean Sea. The system is accessible through the WWW and provides user access to data, programs and images stored in geographically distributed scientific repositories. The user is able to locate, retrieve and visualize data stored in these repositories from a web browser. In addition, the user can produce data on demand by remotely invoking programs with appropriate data inputs and visualize these data using a GIS-based user interface. Searching for data and programs is also done via a map. GISs provide a natural user interface since environment-related data always have a geographical attribute. In addition, specialized visualization software is used. The user can store the results of program invocations back into the system by means of an automatic publication system that is implemented within THETIS. Thus a globally shared digital library of scientific information is formed dynamically for the application of interest, i.e., coastal zone management.
The THETIS system design is based on a number of components, namely, a distributed search engine, a distributed retrieval engine, the system data repositories and the user interface, as shown in Figure 1. The search engine implements a distributed search of metadata regarding the system repositories or publishing sites. There are two different types of metadata, addressing data and programs respectively. The FGDC standard has been used as the basis for both of these types because of the common geospatial component in environmental data and the thematic information required by the standard. Metadata are entered in an electronic form that implements the standard's agreed fields. The forms are completed and edited by repository publishers (i.e., scientists who make their data, programs or images available to the system) and submitted directly via the web to the THETIS system. 
The search engine automatically indexes these forms. Moreover, when repositories are updated independently, a spider program automatically retrieves their metadata and maps them into the system's metadata. Search queries are based on thematic keywords, specific organization names and/or geographical locations selected graphically on a map (by encircling the area of interest with the mouse) at the user interface. The output of a search query provides metadata information about data, programs and images and a link to their location. The user browses the metadata information, then selects a link of interest and queries it via the retrieval engine.
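The three search criteria described above (thematic keywords, organization name, and a geographical area) can be illustrated with a minimal Python sketch. It is not the THETIS implementation: the record fields, the `search` function, the sample catalog entries and the example.org links are all hypothetical, and the area encircled on the map is approximated here by a simple bounding box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Bounding box in degrees; an assumed stand-in for the encircled map area."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.east < other.west or other.east < self.west
                    or self.north < other.south or other.north < self.south)


@dataclass
class MetadataRecord:
    """Hypothetical, heavily simplified metadata entry (not the FGDC field set)."""
    title: str
    kind: str            # "data", "program" or "image"
    organization: str
    keywords: frozenset
    extent: BBox
    link: str


def search(records, keywords=(), organization=None, area=None):
    """Return the records matching every criterion that was supplied."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for rec in records:
        if wanted and not wanted & {k.lower() for k in rec.keywords}:
            continue  # no thematic keyword in common
        if organization and rec.organization.lower() != organization.lower():
            continue  # published by a different organization
        if area and not rec.extent.intersects(area):
            continue  # outside the selected area
        hits.append(rec)
    return hits


# Illustrative catalog; names, extents and links are invented for the example.
CATALOG = [
    MetadataRecord("Wind and wave observations", "data", "Org A",
                   frozenset({"waves", "wind"}), BBox(23.0, 34.5, 27.0, 36.0),
                   "http://example.org/repo-a/waves"),
    MetadataRecord("Circulation model", "program", "Org B",
                   frozenset({"circulation", "pollutants"}), BBox(5.0, 38.0, 10.0, 42.0),
                   "http://example.org/repo-b/circulation"),
]
```

A call such as `search(CATALOG, keywords=["waves"], area=BBox(24.0, 35.0, 26.0, 35.8))` returns the matching metadata records, each carrying the link that the user would then pass to the retrieval engine.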
The retrieval engine offers two main functionalities. First, it enables access to data and images that reside in distributed publishing sites through a JDBC interface (other interfaces such as FTP and HTTP are also supported). In accordance with this interface, the retrieval engine supports a restricted SQL query language to query the data. To support distributed access to data, data publishers first have to make their data available in the form of relational tables. The data themselves do not have to reside in a relational database, though. Instead, a software module called a data wrapper takes charge of the dynamic translation of the original data into a relational format when queried. Several data wrappers are available for different kinds of data in the THETIS system. Once data are published via data wrappers, the query execution engine at the client site processes a user query locally. This engine maps the user's global query into local queries, each for the query execution engine of a different remote publishing site, and a composition query for producing the final result. Client sites communicate with publishing sites via a CORBA communication module. Local queries received by the query execution engine of a publishing site are sent for execution to the appropriate wrappers of the site. The query execution engine has a runtime system to integrate the results of local queries.
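The wrapper and query decomposition idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CsvWrapper` and `run_global_query` are hypothetical names, the local queries are reduced to a projection plus a row filter, the composition query is run in an in-memory SQLite database, and the CORBA communication between sites is left out entirely.

```python
import csv
import io
import sqlite3


class CsvWrapper:
    """Hypothetical data wrapper: exposes CSV text as a relational table."""

    def __init__(self, table, text):
        self.table = table
        self.text = text

    def execute(self, columns, predicate=lambda row: True):
        """Run a restricted local query: a projection plus a row filter."""
        reader = csv.DictReader(io.StringIO(self.text))
        return [tuple(row[c] for c in columns) for row in reader if predicate(row)]


def run_global_query(local_queries, composition_sql):
    """Execute each local query at its wrapper, then compose the results.

    local_queries: iterable of (wrapper, columns, predicate) triples.
    composition_sql: SQL over the tables named by the wrappers.
    """
    db = sqlite3.connect(":memory:")
    for wrapper, columns, predicate in local_queries:
        rows = wrapper.execute(columns, predicate)
        # Materialize each local result as a temporary table for composition.
        db.execute(f"CREATE TABLE {wrapper.table} ({', '.join(columns)})")
        marks = ", ".join("?" for _ in columns)
        db.executemany(f"INSERT INTO {wrapper.table} VALUES ({marks})", rows)
    result = db.execute(composition_sql).fetchall()
    db.close()
    return result


# Two illustrative "publishing sites" with invented sample data.
wind = CsvWrapper("wind", "station,speed\nS1,12.5\nS2,3.0\n")
waves = CsvWrapper("waves", "station,height\nS1,2.1\nS2,0.4\n")

rows = run_global_query(
    [
        (wind, ["station", "speed"], lambda r: float(r["speed"]) > 10),
        (waves, ["station", "height"], lambda r: True),
    ],
    "SELECT wind.station, waves.height FROM wind "
    "JOIN waves ON wind.station = waves.station",
)
```

Here the filter on wind speed is pushed to the wind site's wrapper, while the join is the composition step performed where the results are integrated. Values stay as strings because the wrapper reads them from CSV text.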
As a second functionality, the retrieval engine enables the invocation of remote program execution with input data arguments that are the result of distributed queries. Such invocations are performed via a job execution language. The processing of job execution commands is somewhat similar to that of queries. First, program publishers publish their programs via program wrappers. Then, a client site sends a job execution command to the publishing site where the program resides. At the publishing site, job execution commands are processed asynchronously. A job manager module requests execution of the queries that compute the program inputs, invokes the execution of the program, makes the result available to the client, and also notifies the client of the result.
Three application demonstrators, integrated from the scientific repositories connected to the system, support coastal zone management. We call them scenarios of use of the THETIS system: a Waste Transport scenario for the computation of the concentration of effluents based on general circulation data, displaying the movement of pollutants and calibrated for the north coast of Heraklion in Crete; a Sea Structure Tracking scenario, which allows the study of the dynamics of oceans through stepwise processing of satellite images of the Mediterranean Sea; and a Wave Prediction/Hindcasting scenario, which is based on the calculation of the wave climate at specified points from historical wind and wave data and is applicable anywhere in the Mediterranean, provided that local input data are available. The scenarios are implemented interactively with the user. The user is also allowed, via the web interface, to submit parameters relevant to the programs he/she is remotely executing, such as grid accuracy, wind direction, etc. These inputs are user dependent and required for the execution of scientific programs. The user interface is shown in Figure 2. 
It displays an instance of the search and retrieval capabilities of the Waste Transport scenario, visualized on a map of the north of Crete. At the user interface, the user may visualize the resulting data via a Geographical Information System, VRML and customized visualization tools automatically invoked by the Web browser.
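The asynchronous job processing described earlier (run the queries that compute the inputs, invoke the program with user-supplied parameters, make the result available, notify the client) can be sketched in Python as follows. This is a minimal illustration, not the THETIS job manager: the `JobManager` class, the `mean_height` program and its `scale` parameter are hypothetical, a thread pool stands in for remote execution, and plain callables stand in for program wrappers and distributed queries.

```python
from concurrent.futures import ThreadPoolExecutor


class JobManager:
    """Hypothetical job manager that processes job commands asynchronously."""

    def __init__(self, programs):
        # Maps a program name to a callable; a stand-in for program wrappers.
        self.programs = programs
        self.pool = ThreadPoolExecutor(max_workers=2)

    def submit(self, program, input_queries, params, notify):
        """Queue a job and return a future through which the result is available."""

        def run():
            # 1. Execute the queries that compute the program inputs.
            inputs = [query() for query in input_queries]
            # 2. Invoke the program with those inputs and the user parameters.
            result = self.programs[program](*inputs, **params)
            # 3. Notify the client that the result is ready.
            notify(program, result)
            return result

        return self.pool.submit(run)


def mean_height(heights, scale=1.0):
    """Toy 'scientific program': a scaled mean of wave heights."""
    return scale * sum(heights) / len(heights)


manager = JobManager({"mean_height": mean_height})
notifications = []

future = manager.submit(
    "mean_height",
    [lambda: [1.0, 2.0, 3.0]],           # stand-in for a distributed query
    {"scale": 2.0},                       # user-supplied program parameter
    lambda name, result: notifications.append((name, result)),
)
outcome = future.result()  # blocks until the job has finished
```

The client is free to continue working after `submit` returns; it learns about completion through the notification callback and collects the value from the returned future.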
The THETIS project started on 1 July 1998 and will be completed on 31 December 2000. The THETIS consortium includes the research organizations Institute of Computer Science and Institute of Applied Mathematics, FORTH, University of Crete, Institute of Marine Biology of Crete, INRIA and IMA-CNR; corporations specializing in the management of environmental information, namely HR-Wallingford (England) and Alcatel Industries (France); and a user group, RECORMED, which consists of delegates of four Marine Research Organizations, namely, ICRAM (Italy), IEO (Spain), IFREMER (France) and NCMR (Greece). The project coordinator is ERCIM. The project leader is ICS-FORTH.
Links:
THETIS web site: http://www.ics.forth.gr/pleiades/THETIS/thetis.html
Demo server: http://kos.ics.forth.gr:8000/CoordsIndex.html
Visualization in THETIS: http://thetis.ima.ge.cnr.it/~thetis/wave/wave.htm
Please contact:
Catherine Houstis - FORTH
Tel: +30 81 39 1729
E-mail: houstis@csi.forth.gr