ERCIM News No.36 - January 1999
Software Renovation
by Arie van Deursen
In 1976, Belady and Lehman formulated their Laws of Program Evolution Dynamics. First, a software system that is used will undergo continuous modification. Second, the unstructuredness (entropy) of a system increases with time, unless specific work is done to improve the system's structure. This activity of improving legacy software systems is called system renovation. It aims at making existing systems more comprehensible, extensible, robust and reusable.
Because a typical industrial or governmental organization has millions of lines of legacy code in continuous maintenance, well-applied software renovation can lead to significant information technology budget savings. For that reason, in 1996 Dutch bank ABN AMRO and Dutch software house Roccade commissioned a renovation research project. The research was carried out by CWI, the University of Amsterdam, and ID Research. The goals of the project included the development of a generic renovation architecture, as well as the application of this architecture to actual renovation problems.
Software renovation has various facets, such as visualization, database analysis, and domain knowledge, but the enabling factor among them is the analysis and transformation of legacy sources. Since such source code analysis has much in common with compilation (in which sources are analyzed with the purpose of translating them into assembly code), many results from the area of programming language technology could be reused. Of great significance for software renovation are, for example, lexical source code analysis, parsing, dataflow analysis, and type inference.
Program Transformations
Software renovation at the source code level includes automated program transformations for the purpose of step-by-step code improvement. In this project, we successfully applied transformations to COBOL programs, dealing with goto elimination, dialect migration (between COBOL-85 and COBOL-74) and modifications in the conventions for calling library utilities.
To make this possible, we developed a COBOL grammar, instantiated the ASF+SDF Meta-Environment with this grammar to obtain a COBOL parser and pretty printer, and designed term rewriting rules describing the desired transformations. The resulting system is capable of automatically performing the desired transformations on hundreds of thousands of lines of code, yielding a fully automatic transformation factory.
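The idea of describing a transformation as rewrite rules over parsed programs can be illustrated with a minimal sketch. It is written in Python rather than ASF+SDF, and it is not the project's actual rule set: the tuple-based term representation, the rule `rewrite_call`, and the utility names `OLDUTIL`, `NEWUTIL` and `STATUS-FLAG` are all assumptions made up for this example. The sketch shows a single rule that changes the calling convention of a library utility, applied bottom-up to every node of a toy syntax tree.

```python
from typing import Callable, List, Optional, Tuple, Union

# A term is either a leaf (string) or a tuple of sub-terms.
Term = Union[str, Tuple["Term", ...]]
Rule = Callable[[Term], Optional[Term]]


def rewrite_call(term: Term) -> Optional[Term]:
    """Hypothetical rule: a call to OLDUTIL becomes a call to NEWUTIL
    with one extra argument. Returns None when the rule does not match."""
    if (isinstance(term, tuple) and len(term) == 3
            and term[0] == "call" and term[1] == "OLDUTIL"):
        return ("call", "NEWUTIL", term[2] + ("STATUS-FLAG",))
    return None


def rewrite_bottom_up(term: Term, rules: List[Rule]) -> Term:
    """Rewrite the children first, then try each rule on the node itself.
    A rewritten node is normalised again, so rules must eventually stop matching."""
    if isinstance(term, tuple):
        term = tuple(rewrite_bottom_up(child, rules) for child in term)
    for rule in rules:
        result = rule(term)
        if result is not None:
            return rewrite_bottom_up(result, rules)
    return term


# Toy program: a sequence of a MOVE statement and a library call.
program = ("seq",
           ("move", "A", "B"),
           ("call", "OLDUTIL", ("CUSTOMER-REC",)))

transformed = rewrite_bottom_up(program, [rewrite_call])
```

In a real setting the terms would come from a parser for the full language and be turned back into source text by a pretty printer; the sketch only covers the rewriting step in between.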
Object Identification
At a higher level of abstraction, software renovation includes the migration of legacy code to architectures better capable of meeting today's requirements. A typical example is the migration of procedural COBOL code to object technology. This is a process which cannot be fully automated. Instead, renovation tools will have to focus on helping the human reengineer in understanding the legacy system.
Thus, such renovation tools will have to extract as much meaningful information as possible from a legacy system, showing it to the human reengineer in a concise and usable manner. For the purpose of object identification, this information consists of the business data items (candidate object attributes), the programs or procedures performing the key tasks (candidate methods), and an overview of the combined use of these (candidate classes). We were able to develop heuristic techniques based on cluster and concept analysis to extract such class proposals automatically.
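As a rough illustration of how concept analysis can propose candidate classes, the following Python sketch computes all formal concepts of a small usage table. It is a naive textbook construction, not the heuristics developed in the project, and the program names and data items are invented for the example. Each concept pairs a set of programs (candidate methods) with the set of data items they all use (candidate attributes).

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (programs, shared data items)


def concepts(usage: Dict[str, Set[str]]) -> List[Concept]:
    """Naive formal concept analysis: every intent is an intersection of
    the item sets of some programs (the empty intersection being all items)."""
    all_items = frozenset().union(*usage.values())
    intents = {all_items}
    for items in usage.values():
        # Close the current intents under intersection with this program's items.
        intents |= {intent & items for intent in intents}
    result = []
    for intent in intents:
        # The extent is every program that uses all items of the intent.
        extent = frozenset(p for p, items in usage.items() if intent <= items)
        result.append((extent, intent))
    return result


# Hypothetical usage table: which program uses which business data item.
usage = {
    "OPEN-ACCOUNT": {"ACCT-NO", "ACCT-BALANCE", "CUST-NAME"},
    "DEPOSIT": {"ACCT-NO", "ACCT-BALANCE"},
    "CHANGE-ADDRESS": {"CUST-NAME", "CUST-ADDRESS"},
}

for programs, items in sorted(concepts(usage), key=lambda c: len(c[1])):
    print(sorted(programs), "share", sorted(items))
```

In this toy table, the concept grouping OPEN-ACCOUNT and DEPOSIT around ACCT-NO and ACCT-BALANCE would suggest an account class to the reengineer, who remains the one to accept or reject the proposal.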
Conclusion
The overall result of the project is a generic renovation architecture aimed at program transformation and system understanding. A specific instantiation for COBOL has been developed, which has been applied to various real life case studies.
Pointers to publications with more information can be found at:
http://www.cwi.nl/~arie/resolver/
Please contact:
Arie van Deursen - CWI
Tel: +31 20 592 4075
E-mail: arie@cwi.nl