ERCIM News No.26 - July 1996 - GMD
Twelve Languages - one Translation Method
by Maria Theresia Rolland
ERCIM News No. 22 of July 1995 (page 21) described a new method
called logotechnique (word processing). This method enables fully automated
language processing, both for a query system in a single language and for
machine translation. In the following we will illustrate machine translation
according to this method by explaining how one sentence could be treated
in all twelve languages of the thirteen ERCIM institutions. The fundamentals
of the method will be described in a simplified manner and the translation
specifications will be given. Finally, the translations will be presented.
Initially, the structure of each language itself must be identified [details:
Rolland: Sprachverarbeitung durch Logotechnik. Bonn: Dümmler 1994].
In this way we obtain the relational structure of the language, ie the connection
between the initial word and the dependent words in their specific relationships
(eg: to buy -> what? -> computers, not: *to buy -> what? ->
earthquakes). Taken together, all relationships of the initial word form its
construction plan. Each word has such a construction plan. This plan thus
comprises the initial word and the relationships involved, including the
specific dependent words in their correct inflections. The content of a word
consists of the special content and the general content, differentiated into
inflection and construction. The dependent words are grouped into classes of
meaning according to their semantic similarity (eg furniture: table, chair, ...;
device: computer, printer, ...). Within the meaning classes each word
is marked with its possible relationships; each relationship has a reference
to the respective inflection. Therefore, construction plans, meaning classes
and inflection groups have to be determined only once. Each sentence is
an extract from the possibilities which are given by the construction plan
including the references. All construction plans taken together are the
explicit relational structure of the language. The construction plans labelled
for natural language processing represent the relation base.
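
The construction plan and meaning classes described above could be modelled
roughly as follows. This is a minimal sketch, not the author's implementation:
the names MEANING_CLASSES, ConstructionPlan and admits are hypothetical, the
class contents are taken from the examples in the text, and inflection is
left out for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical meaning classes: words grouped by semantic similarity.
MEANING_CLASSES = {
    "furniture": {"table", "chair"},
    "device": {"computer", "printer"},
}


@dataclass
class ConstructionPlan:
    """An initial word plus the relationships it admits.

    Each relationship (a question such as "what?") maps to the names of
    the meaning classes whose members may fill it. Assumed layout only.
    """
    initial_word: str
    relationships: dict = field(default_factory=dict)

    def admits(self, question: str, dependent: str) -> bool:
        """Return True if `dependent` may fill the relationship `question`."""
        allowed_classes = self.relationships.get(question, set())
        return any(
            dependent in MEANING_CLASSES.get(name, set())
            for name in allowed_classes
        )


# Assumed plan for "to buy": its "what?" slot accepts furniture and devices.
to_buy = ConstructionPlan("to buy", {"what?": {"furniture", "device"}})
```

With this plan, to_buy.admits("what?", "computer") holds, while
to_buy.admits("what?", "earthquake") does not, mirroring the example
"to buy -> what? -> computers, not: *to buy -> what? -> earthquakes".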
Additionally, it is necessary to consider specific features of the language.
For example, a relationship may be realized not only by a single word but also
by a subordinate clause, with or without an introductory conjunction or
the like, but always with a predicate that has its own construction plan
(eg to walk on -> in spite of what concession? -> although -> to
be exhausted: They walked on although they were exhausted). Predicates include
verbs, auxiliary verbs, modal verbs etc. Articles and possessive pronouns
are dependent on the corresponding noun [R.e. pp. 83; 203]. In addition,
each language has particular characteristics to be handled by rules. Rules
also govern the word order. When establishing a translation system it is
necessary to link the relation bases of the languages involved. Equivalents may
be, eg: a) a single word - a subordinate clause; b) active form - passive
form; c) reflexive verb - phrase; d) article + noun - noun only etc.
If we assume that the relation bases of the twelve languages of the ERCIM
institutions were explicitly available and had been linked by translation
experts, we could have the following twelve language pairs: German - English;
English - Finnish; Finnish - French; French - Italian; Italian - Greek;
Greek - Dutch; Dutch - Norwegian; Norwegian - Portuguese; Portuguese - Swedish;
Swedish - Spanish; Spanish - Hungarian; Hungarian - German. The sentence
to be translated reads as follows: "Mag sich auch die ganze
Welt gegen die Wahrheit rüsten, so wird man doch ihren Sieg nicht verhindern"
(Let the whole world rise up in arms against the truth, still its victory
cannot be prevented). The construction plans of the verbs verhindern (to
prevent) and sich rüsten (to rise up in arms) would be included in
the linked relation bases of all of the above languages. Now it is possible
to identify the structure of the sentence according to the relation base
in one language and look up the linked units in the other language.
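
Looking up the linked units between two relation bases might be sketched as
below. The table LINKS and the function linked_unit are hypothetical names
introduced for illustration; the two verb entries come from the example
sentence, and using one table for both directions of a pair is an assumption.

```python
# Hypothetical link table between the German and English relation bases.
# Only the two verbs named in the text are included.
LINKS = {
    ("German", "English"): {
        "verhindern": "to prevent",
        "sich rüsten": "to rise up in arms",
    },
}


def linked_unit(word, source, target, links=LINKS):
    """Return the unit linked to `word`, or None if no link is recorded.

    A pair's table is read forwards or backwards, so one set of links
    serves both translation directions (an assumption of this sketch).
    """
    if (source, target) in links:
        return links[(source, target)].get(word)
    if (target, source) in links:
        reverse = {tgt: src for src, tgt in links[(target, source)].items()}
        return reverse.get(word)
    return None
```

For instance, linked_unit("verhindern", "German", "English") yields
"to prevent", and the reverse lookup from English yields "verhindern".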
If a German - Italian translation is needed, but equivalents exist only
along the sequence German - English - Finnish - French - Italian, a computer
system can identify the corresponding sentence structure, beginning with
German and continuing through English, Finnish, French and Italian. Only in
the final step does the system apply word ordering rules, namely the Italian
ones, to present the correct Italian translation. Thus any of the languages
can be the source or target language. The twelve sentences are presented in
tables available in RTF and PS format.
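
The routing through intermediate languages described above can be sketched
as a shortest-path search over the twelve language pairs. This is only an
illustration: PAIRS, build_graph and translation_route are hypothetical names,
and treating every pair as usable in both directions is an assumption based
on the remark that any language can be source or target.

```python
from collections import deque

# The twelve linked language pairs listed in the text.
PAIRS = [
    ("German", "English"), ("English", "Finnish"), ("Finnish", "French"),
    ("French", "Italian"), ("Italian", "Greek"), ("Greek", "Dutch"),
    ("Dutch", "Norwegian"), ("Norwegian", "Portuguese"),
    ("Portuguese", "Swedish"), ("Swedish", "Spanish"),
    ("Spanish", "Hungarian"), ("Hungarian", "German"),
]


def build_graph(pairs):
    """Build an undirected adjacency list from the language pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def translation_route(source, target, pairs=PAIRS):
    """Return a shortest chain of languages from source to target.

    Uses breadth-first search; returns None if the languages are not
    connected and raises ValueError for an unknown language.
    """
    graph = build_graph(pairs)
    if source not in graph or target not in graph:
        raise ValueError("unknown language")
    previous = {source: None}
    queue = deque([source])
    while queue:
        language = queue.popleft()
        if language == target:
            route = []
            while language is not None:
                route.append(language)
                language = previous[language]
            return route[::-1]
        for neighbour in graph[language]:
            if neighbour not in previous:
                previous[neighbour] = language
                queue.append(neighbour)
    return None
```

Calling translation_route("German", "Italian") returns the chain German,
English, Finnish, French, Italian used in the example. Word ordering rules
would be applied only for the last language in the returned route.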
Please contact:
Maria Theresia Rolland - GMD
Tel: +49 2241 14 2087
E-mail: rolland@gmd.de