ERCIM News No.45 - April 2001
Co-allocating Compute Resources in the Grid
by Gerd Quecke and Wolfgang Ziegler
Multidisciplinary simulations, such as the simulation of aeroplane wings, repeatedly require different compute resources to be available at the same time. Co-allocating resources today usually requires a substantial amount of human communication on all levels. MeSch is a solution to the problem of (synchronous) resource allocation and job scheduling in a distributed heterogeneous environment, developed at the Institute for Algorithms and Scientific Computing (SCAI) at GMD.
Resource management and job scheduling in a typical Grid environment based on multi-MPP systems or clusters is still a challenging problem. Especially in a geographically distributed and heterogeneous environment, it turns out that although scheduling tools and policies are available for each subsystem, global resource management is lacking, and resource allocation is therefore far from being performed automatically. On the contrary: a substantial amount of human communication on all levels is necessary to partition the application, locate resources, and observe the behaviour of distributed modules.
MeSch is a light-weight solution to the problem of resource allocation and job scheduling in a distributed heterogeneous environment. In the same way that a Grid application uses the Grid resources as a metacomputing environment, allowing the use of more than one MPP system or cluster, MeSch builds on the idea of a metascheduler that takes over the burden of resource co-allocation for a metajob. The approach is to build the metascheduler such that it can use the schedulers of all subsystems involved for all co-ordination and resource allocation tasks. The MeSch metascheduler prototype co-ordinates the whole scheduling process during the application lifetime, including resource allocation. The algorithm was designed specifically to allow simultaneous access to the requested resources, a typical requirement of parallel applications.
Until now, the only solution to these problems has been to use scheduling systems that can completely handle resource management for all resources involved. However, such a single task approach becomes difficult when heterogeneous environments are to be used as they are, as a regular service, without any need to change local administration rules and policies. An alternative is to introduce local components, like the GRAMs of the Globus system, that build an additional encapsulating layer interfacing to the local resource management systems.
However, this approach implies an overhead which may not be desirable. We are well aware that there are other powerful systems like Globus, Legion or Unicore providing a broader range of integrated tools. As those systems make more and more use of evolving standards, it becomes possible to exchange standard components for others that are more suitable and efficient in a certain context, eg a metascheduler may replace the standard scheduler to manage distributed parallel applications. In addition, MeSch may also be used stand-alone, providing a simple and efficient way of bundling distributed computing resources for a user's bigger parallel jobs without the need to install one of the systems mentioned above.
The MeSch approach handles resource allocation as a global task which can be divided into subtasks that may be delegated to co-operating schedulers of the subcomponents of a Grid environment. Ideally, we do not discard local schedulers; instead, we build the metascheduler on top of them. This allows us to build a hierarchy of schedulers.
Just as a traditional scheduler maintains nodes/processors as allocatable resources, the metascheduler does so with systems (or partitions of systems). The advantage is that all subsystems can act in their usual way with their own policy. Moreover, allocation of processors remains the responsibility of the local system and is not explicitly done by the metascheduler. As subsystems remain responsible for allocation, the local use of subsystems is not affected. No restriction is imposed on local scheduling strategies and administrative policies.
The MeSch approach does not impose any restrictions on the type of Grid system: systems may be homogeneous or heterogeneous, geographically distributed, and any combination of MPP, cluster, and dedicated systems. However, MeSch requires some local scheduler attributes in order to take over the global synchronisation of the overall scheduling task. To provide simultaneous access to the required resources, methods of getting reliable information about suitable time slots must be available. This information enables MeSch to determine a common time slot on all Grid components that are required for a Grid application. In general, the subschedulers' first suggestions of available time slots will not lead to a solution for the complete metajob. Thus, we must be able to ask for alternative time slots to have a chance of determining a commonly agreed time slot for simultaneous access.
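The search for a common time slot can be illustrated with a small sketch. It is not the MeSch algorithm itself, which the article does not spell out; it only assumes that each subscheduler reports a list of free windows. The names `Slot` and `earliest_common_start`, the subsystem names, and the integer time units are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A free time window [start, end) offered by one subscheduler (hypothetical type)."""
    start: int
    end: int


def earliest_common_start(offers, duration):
    """Return the earliest start time at which every subsystem has a free
    window of at least `duration`, or None if the offers never overlap enough.

    `offers` maps a subsystem name to the list of free slots it reported.
    """
    # The earliest feasible start always coincides with the start of some
    # offered slot, so only those times need to be checked.
    candidates = sorted(s.start for slots in offers.values() for s in slots)
    for t in candidates:
        if all(
            any(s.start <= t and t + duration <= s.end for s in slots)
            for slots in offers.values()
        ):
            return t
    return None


if __name__ == "__main__":
    # Illustrative offers from two assumed subsystems.
    offers = {
        "mpp": [Slot(0, 4), Slot(6, 12)],
        "cluster": [Slot(2, 5), Slot(7, 15)],
    }
    print(earliest_common_start(offers, 3))  # prints 7
```

When the function returns None, the metascheduler would have to ask the subschedulers for alternative time slots and repeat the search, which corresponds to the iteration described in the text.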
If a commonly suitable time slot can be determined, the MeSch metascheduler must be able to instruct each subscheduler to reserve the time slot and to guarantee that it will allocate the required resources at the agreed start time for the agreed time interval.
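A minimal sketch of this reservation step follows, assuming a hypothetical subscheduler interface with `reserve` and `cancel` methods. The all-or-nothing rollback is an assumption made for illustration; the article does not describe how MeSch handles a refusal.

```python
class ToySubscheduler:
    """Stand-in for a local scheduler; this interface is hypothetical."""

    def __init__(self, name, busy_starts=()):
        self.name = name
        # Start times this toy system cannot guarantee.
        self.busy_starts = set(busy_starts)
        # job id -> (start, duration) for guaranteed reservations.
        self.reserved = {}

    def reserve(self, job_id, start, duration):
        """Try to guarantee the slot; return True on success."""
        if start in self.busy_starts:
            return False
        self.reserved[job_id] = (start, duration)
        return True

    def cancel(self, job_id):
        """Release a reservation if one exists."""
        self.reserved.pop(job_id, None)


def reserve_everywhere(subschedulers, job_id, start, duration):
    """Reserve the agreed slot on every subsystem, or on none of them."""
    confirmed = []
    for sub in subschedulers:
        if sub.reserve(job_id, start, duration):
            confirmed.append(sub)
        else:
            # One refusal invalidates the common slot: undo earlier reservations.
            for done in confirmed:
                done.cancel(job_id)
            return False
    return True
```

A failed call leaves no reservation behind on any subsystem, so the metascheduler can retry with a different start time.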
MeSch synchronisation management requires several iterations of interaction with subschedulers to find a suitable time slot. Obviously, offered time slots must be (pre-)reserved by subschedulers while they are under consideration for suitability. An allocation agreement protocol eases the synchronisation process by defining a set of states a job may have from a scheduling request to its final execution (see figure).
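The idea of a protocol defined by job states can be sketched as a small state machine. The states and transitions below are hypothetical and chosen only to mirror the text (request, pre-reserved offer, firm reservation, execution); they are not taken from the MeSch protocol, whose actual states are defined in the figure.

```python
from enum import Enum, auto


class JobState(Enum):
    """Hypothetical job states, not the states of the actual MeSch protocol."""
    REQUESTED = auto()  # scheduling request received
    OFFERED = auto()    # a time slot is offered and pre-reserved
    RESERVED = auto()   # the commonly agreed slot is guaranteed
    RUNNING = auto()
    DONE = auto()
    REJECTED = auto()


# Allowed transitions. OFFERED -> OFFERED models a request for an
# alternative time slot during the iterative negotiation.
ALLOWED = {
    JobState.REQUESTED: {JobState.OFFERED, JobState.REJECTED},
    JobState.OFFERED: {JobState.OFFERED, JobState.RESERVED, JobState.REJECTED},
    JobState.RESERVED: {JobState.RUNNING, JobState.REJECTED},
    JobState.RUNNING: {JobState.DONE},
    JobState.DONE: set(),
    JobState.REJECTED: set(),
}


def advance(current, target):
    """Return `target` if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Making the transitions explicit lets both sides of the negotiation reject out-of-order messages, for example an attempt to start a job whose slot was never reserved.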
Allocation Agreement Protocol.
MeSch is a prototype metajob scheduler for a Grid environment. Its main advantage is that local scheduling policies are not affected by Grid jobs. Metajob scheduling can be viewed as using local schedulers as resource managers in a scheduler hierarchy. However, for an easy-to-implement allocation agreement protocol, local schedulers must provide a run time estimation facility for submitted jobs and must accept and guarantee a dedicated start time specification. The practicability of the approach has been demonstrated by a prototype implementation based on an enhanced version of the EASY scheduler.
Currently we are investigating how to implement scheduling for visualisation devices, such as a workbench, for applications with real-time visualisation demands.
Link:
http://www.gmd.de/SCAI/popcorn/
Please contact:
Gerd Quecke, Wolfgang Ziegler - GMD
Tel: +49 2241 14 2375, +49 2241 14 2258
E-mail: Gerd.Quecke@gmd.de, Wolfgang.Ziegler@gmd.de