GeneSyS: Monitoring and Management of Distributed Systems
by Balázs E. Pataki and László Kovács
Application development is moving from monolithic systems towards distributed architectures based on middleware technologies (eg GRID). Applications of this kind require a sophisticated monitoring framework that provides information at all levels, from the hardware through the network up to the application. The goals of the EU Information Society Technologies 'GeneSyS' project are the specification, development and standardisation of an open-source, standards-based monitoring and management framework.
GeneSyS (Generic System Supervision) commenced in March 2002 with the goal of specifying and developing a new system-monitoring and management middleware for distributed systems and applications. Besides specification and development, the project intends to help GeneSyS become a standard or recommendation for system supervision based on Web Services technologies.
The need for such a framework arises from the fact that distributed applications are gaining an increasingly wide acceptance. Driven by global players, the introduction of Web Services has pushed interest in distributed applications further. In order to reflect the requirements of different areas within the architecture, the GeneSyS consortium consists of academic and industrial partners from a number of domains. The partners are EADS Space Transportation (France), RUS/HLRS (Germany), NAVUS GmbH (Germany) and ERCIM member SZTAKI (Hungary).
Drawbacks of Existing Monitoring Systems
The major drawbacks of most supervision tools are their closed and proprietary nature and their limited view, which is restricted to a single monitoring domain such as the network. The main problems can be summarised as follows:
- the application programming interfaces are not open, ie they are neither documented nor publicly available
- the inter-agent protocols are often not based on open standards and use an encoding scheme which depends on the operating system
- most existing supervision solutions target a specific domain of monitoring, such as network monitoring or database monitoring, or are designed to monitor a specific commercial application
- existing systems follow an inflexible architecture in which the transportation core, monitoring agents and data visualisation consoles are not well separated.
Architecture
To achieve a high level of flexibility, the monitoring tools are separated from the visualisation application and the communication bus. GeneSyS has the following building blocks:
- Supervised Entity: the resource (hardware, network, software) that is supervised by GeneSyS. It can be a Monitored Entity resource, which provides information about its internal status, or a Controlled Entity, which provides a command interface enabling its internal state to be manipulated.
- Delegate Agent: the component that is able to produce monitoring data for a given Monitored Entity, as well as controlling the Controlled Entity resource with commands coming from the Supervisor Agents.
- Supervisor Agent: this component is able to consume monitoring data coming from a Delegate and is also used for sending control commands to the Delegate agent to affect (eg restart, reconfigure) the Supervised Entity. A typical human-managed Supervisor combines both monitoring and control functionality and interfaces with a human operator by means of a graphical supervision console.
- Core: this is the component that provides publishing and discovery services to agents. It acts as a directory service provider. Every agent should register with the Core and publish its component information to be queried by other agents.
- Repository: the GeneSyS Repository is capable of archiving supervision data in persistent storage and retrieving it upon request.
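The relationship between these building blocks can be illustrated with a minimal Python sketch. It is not taken from the GeneSyS code base: the class names, the fields of AgentInfo and the "restart" command are assumptions chosen for illustration, and the endpoint is a plain Python object standing in for a network address.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class AgentInfo:
    """Component information an agent publishes to the Core (illustrative fields)."""
    name: str
    kind: str              # "delegate" or "supervisor"
    entity: str = ""       # Supervised Entity served by a delegate
    endpoint: Any = None   # stand-in for a network address


class Core:
    """Directory service: agents register here and are discovered by other agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentInfo] = {}

    def register(self, info: AgentInfo) -> None:
        self._agents[info.name] = info

    def find(self, kind: str) -> List[AgentInfo]:
        return [a for a in self._agents.values() if a.kind == kind]


class DelegateAgent:
    """Produces monitoring data for one entity and applies control commands to it."""

    def __init__(self, name: str, entity: str, state: Dict[str, str]) -> None:
        self.state = state  # stand-in for the real resource's internal status
        self.info = AgentInfo(name, "delegate", entity, endpoint=self)

    def monitor(self) -> Dict[str, str]:
        return dict(self.state)

    def control(self, command: str) -> str:
        if command == "restart":  # hypothetical command name
            self.state["status"] = "running"
            return "ok"
        return "unsupported"


class SupervisorAgent:
    """Discovers delegates through the Core, reads their data and sends commands."""

    def __init__(self, core: Core) -> None:
        self.core = core

    def collect(self) -> Dict[str, Dict[str, str]]:
        return {a.entity: a.endpoint.monitor() for a in self.core.find("delegate")}

    def send(self, entity: str, command: str) -> str:
        for a in self.core.find("delegate"):
            if a.entity == entity:
                return a.endpoint.control(command)
        return "unknown entity"
```

In this sketch a Supervisor never holds a direct reference to a Delegate; it always looks it up through the Core, which mirrors the publishing and discovery role described above.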
Figure 1: Components of the GeneSyS Architecture.
The GeneSyS components are connected by the GeneSyS Messaging Protocol (GMP), which defines communication between Agent and Supervisor, and Agent and Core. The protocol itself is implementation-neutral and depends neither on languages nor operating systems. In GeneSyS version 1.0 the GMP has been implemented for the Simple Object Access Protocol (SOAP) with a message interchange format based on XML.
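To give an impression of what an XML-based, language-neutral message interchange format involves, the following Python sketch encodes and decodes a monitoring report. The element and attribute names (message, entity, metric) are invented for this example and do not reproduce the actual GMP schema or its SOAP envelope.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def encode_report(sender: str, entity: str, metrics: Dict[str, object]) -> str:
    """Serialise a monitoring report to an XML string (hypothetical layout)."""
    msg = ET.Element("message", {"type": "monitoring-report", "sender": sender})
    ent = ET.SubElement(msg, "entity", {"name": entity})
    for key, value in metrics.items():
        metric = ET.SubElement(ent, "metric", {"name": key})
        metric.text = str(value)  # all values travel as text
    return ET.tostring(msg, encoding="unicode")


def decode_report(xml_text: str) -> dict:
    """Parse the XML string produced by encode_report back into a dictionary."""
    msg = ET.fromstring(xml_text)
    ent = msg.find("entity")
    return {
        "sender": msg.get("sender"),
        "entity": ent.get("name"),
        "metrics": {m.get("name"): m.text for m in ent.findall("metric")},
    }
```

Because every value is carried as text inside a self-describing document, a consumer written in another language or running on another operating system can parse the same message without knowing how the producer represents data internally.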
Validation Scenarios
The GeneSyS architecture is evaluated using real, mission-critical applications. GeneSyS V1 was validated in the context of EADS's Preliminary Design Review (PDR) application. This involves up to several hundred remotely connected engineers who review engineering documents of the Automated Transfer Vehicle (ATV). The ATV's mission is to provide transportation services to the International Space Station. The PDR requires audio-video communication between the engineers and access to the Engineering Document Database (EDB). In this scenario GeneSyS has been successfully evaluated and proved to be a vital tool for keeping application and system administrators aware of the status of the underlying systems.
The forthcoming version of GeneSyS (version 2.0) is designed to be used in new areas. The Web Servers Monitoring (WSM) scenario will use GeneSyS to supervise typical Web server configurations consisting of HTTP servers, database servers, application servers and scripts. In this scenario, GeneSyS will introduce the concept of supervised entity dependency, which will allow easier root cause analysis of problems coming from the execution of dependent systems.
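How dependency information can simplify root cause analysis is illustrated by the following Python sketch. It is an assumption about one possible approach, not the GeneSyS 2.0 design: dependencies are given as a plain mapping, and a failing entity is reported as a root cause only if none of the entities it depends on, directly or indirectly, is failing as well.

```python
from typing import Dict, Iterable, List, Set


def transitive_deps(entity: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every entity that `entity` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(depends_on.get(entity, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:  # the seen set also guards against cycles
            seen.add(dep)
            stack.extend(depends_on.get(dep, ()))
    return seen


def root_causes(depends_on: Dict[str, List[str]], failing: Iterable[str]) -> Set[str]:
    """Failing entities whose own dependencies are all healthy."""
    failing_set = set(failing)
    return {
        e for e in failing_set
        if not ((transitive_deps(e, depends_on) - {e}) & failing_set)
    }


# Hypothetical Web server configuration: a script runs on an application
# server, which in turn needs the database and the HTTP server.
example_deps = {
    "script": ["app-server"],
    "app-server": ["database", "http-server"],
}
```

With example_deps, a report that the script, the application server and the database are all failing is reduced to the database alone, since the other two failures can be explained by their dependency on it.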
Figure 2: Layout of the GeneSyS Messaging Protocol.
Future Plans
The first version of GeneSyS and its specification are available in open-source form at SourceForge. The GeneSyS concept has been introduced to standardisation bodies such as the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS) for evaluation. Ongoing discussions target the realisation of a Web Services-based system-monitoring recommendation based on or incorporating ideas from GeneSyS.
The next version of GeneSyS - currently under development - targets the security of communication between agents, a more consistent supervision console architecture, event-based monitoring and the introduction of intelligent agents, which will be able to take decisions and report autonomously in exceptional situations.
Links:
Project home page: http://genesys.sztaki.hu
Department of Distributed Systems at SZTAKI: http://dsd.sztaki.hu
Please contact:
László Kovács, SZTAKI
Tel: +36 1 279 6212
E-mail: laszlo.kovacs@sztaki.hu