ERCIM News No.26 - July 1996 - CWI
Computational Linguistics at CWI - the Logic of Ambiguity
by Jan van Eijck
Computational Linguistics combines insights from formal language
theory, empirical linguistics, and logic, with the overall aim of implementing
natural language understanding systems on computers. A group of applied
logicians at CWI has looked at the logical underpinnings of computational
linguistic tools such as semantic representation languages, feature logics,
and tree description logics.
CWI has been involved in a large-scale national project on the application
of tools from dynamic logic to natural language understanding, and in the
European (LRE) project FraCaS (a Framework for Computational Semantics). It
is currently analysing large collections of key phrases
in scientific documents. The main aim here is to build a thesaurus of key
phrases for mathematics, on the basis of a very large collection of key
phrases for mathematical papers (from the Zentralblatt für Mathematik).
The work on dynamic logic has resulted in a proposal for a framework for
dynamic semantics, in an analysis of Discourse Representation Theory in
terms of dynamic logic, and in various publications on modal tree logics.
One of the yields of the FraCaS work has been an analysis of the logic of
ambiguity by J. van Eijck and J. Jaspars. To this we now turn.
In the formal study of natural language semantics the representation of
ambiguous information is one of the major problems. Initial representations
of NL expressions are often ambiguous, due to lack of information about
the meanings of lexical items (lexical ambiguity), the ways in which anaphoric
elements are to be resolved (anaphoric under-specification), attachment
ambiguities (structural ambiguity) and the choice between various possible
scope orderings between operators (scope ambiguity).
The principal reason for wanting to construct a meaning representation for
a natural language sentence is to get a handle on the information conveyed
by that sentence. Is the sentence consistent with a given body of information?
If the sentence is true, what follows from it? If a natural language sentence
is ambiguous, as many natural language sentences are, the key question becomes:
how can we find a representation for it that we can reason with?
There are many kinds of ambiguity in natural language. The most local ambiguities
are lexical ambiguities, like the one in "The ball was splendid" or
"I went to the bank", and referential ambiguities, like "John addressed
her", when there is a fixed list of possible antecedents for the pronoun.
A different kind of ambiguity has to do with scope under-specification caused
by the interaction of parts of speech. Examples are "Every boy didn't
appear" or "Everybody in this room has to sign one document". Related
are ambiguities of distribution, as in "The boys ordered two sandwiches",
where it is left unspecified whether the object distributes over the subject
(two sandwiches each) or not (two sandwiches altogether). Finally, there
is an open-ended spectrum of under-specification caused by some kind of
incompleteness or flaw in the linguistic data (even: corruption of the data).
Under this heading we have structural ambiguities, like the two readings
of "John saw the girl with the telescope", or the problem of what to
make of "John ... (noise) ... the girl with the telescope".
Suppose a sentence A is ambiguous between readings A1 and A2. Here are
some desiderata for what A means:
- if someone informs us that A is true, then one should be allowed to
conclude that at least one of A1, A2 is true
- if one is sure that A1 and A2 are both true, then one can safely assert
that A is true, the ambiguity of A notwithstanding
- if someone informs us that not A is true, then one should be allowed
to conclude that at least one of not A1, not A2 is true
- if one is sure that neither A1 nor A2 is true, then one can safely
assert that not A is true, the ambiguity of A notwithstanding
- unless A1 and A2 are logically equivalent, A or not A cannot be a
logical truth
- unless A1 and A2 are logically equivalent, A and not A need not be
a contradiction.
To explain this final point a bit further, note that we do not insist that
several occurrences of the same expression be disambiguated
in the same way. To take a real-life example, consider the following sentence:
Every boy did not appear, and it is not the case that every boy did not
appear.
If both occurrences of the ambiguous "Every boy did not appear" in this example
sentence are disambiguated in the same way, then the example sentence is
indeed contradictory. If we do not insist on this, however, then it is not.
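
The point can be checked on a small model. The sketch below is only an
illustration: the model (two boys, one of whom appeared), the names in it, and
the two function names for the scope readings are assumptions introduced here,
not taken from the report. It evaluates "S and not S" for every way of choosing
a reading for each occurrence of S.

```python
# Hypothetical model: two boys, only one of whom appeared.
boys = {"alan", "bert"}
appeared = {"alan"}

def wide_universal(boys, appeared):
    # Reading 1: for every boy, that boy did not appear.
    return all(b not in appeared for b in boys)

def wide_negation(boys, appeared):
    # Reading 2: it is not the case that every boy appeared.
    return not all(b in appeared for b in boys)

READINGS = (wide_universal, wide_negation)

# Truth value of "S and not S" for each pair of readings
# (first occurrence of S, second occurrence of S).
combined = {
    (first.__name__, second.__name__):
        first(boys, appeared) and not second(boys, appeared)
    for first in READINGS
    for second in READINGS
}

for pair, value in combined.items():
    print(pair, value)
```

In this model the conjunction comes out true only when the first occurrence is
read with wide-scope negation and the second with wide-scope universal. When
both occurrences get the same reading it is false, as expected.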
In the work of Van Eijck and Jaspars (Ambiguity and Reasoning, CWI Report
CS-R9616, Amsterdam 1996), ambiguous logical languages are introduced which
extend classical propositional and predicate logic and which can deal with
lexical ambiguity and scope ambiguity. It turns out that an ambiguous consequence
relation can be defined and axiomatized that satisfies all of the desiderata
given above.
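
To make the desiderata concrete, here is a minimal propositional sketch. It is
not the system of the report. It assumes one simple candidate relation: an
ambiguous formula is a list of classical readings, and a conclusion follows if,
in every valuation where each premise has some true reading, all readings of
the conclusion are true. The atoms p and q (standing in for A1 and A2) and all
function names are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q")  # hypothetical atoms: A1 is p, A2 is q

def valuations():
    # Enumerate all truth assignments to the atoms.
    for bits in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))

# An ambiguous formula is a list of readings;
# a reading is a function from a valuation to a bool.
def atom(name):
    return [lambda v: v[name]]

def amb(*formulas):
    # Ambiguous between all readings of the given formulas.
    return [r for f in formulas for r in f]

def neg(f):
    return [lambda v, r=r: not r(v) for r in f]

def conj(f, g):
    # Occurrences are disambiguated independently.
    return [lambda v, r=r, s=s: r(v) and s(v) for r in f for s in g]

def disj(f, g):
    return [lambda v, r=r, s=s: r(v) or s(v) for r in f for s in g]

def entails(premises, conclusion):
    # Assumed relation: some reading of each premise true
    # implies all readings of the conclusion true.
    for v in valuations():
        if all(any(r(v) for r in p) for p in premises):
            if not all(r(v) for r in conclusion):
                return False
    return True

A1, A2 = atom("p"), atom("q")
A = amb(A1, A2)
FALSUM = [lambda v: False]

checks = {
    "A |= A1 or A2": entails([A], disj(A1, A2)),
    "A1 and A2 |= A": entails([conj(A1, A2)], A),
    "not A |= not A1 or not A2": entails([neg(A)], disj(neg(A1), neg(A2))),
    "not A1 and not A2 |= not A": entails([conj(neg(A1), neg(A2))], neg(A)),
    "|= A or not A": entails([], disj(A, neg(A))),
    "A and not A |= falsum": entails([conj(A, neg(A))], FALSUM),
}

for name, holds in checks.items():
    print(name, holds)
```

Under these assumptions the first four checks hold and the last two fail, which
matches the six desiderata for non-equivalent A1 and A2. Note that this toy
relation is not reflexive for ambiguous formulas (A does not entail A here), so
it should be read as an illustration only.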
Computational linguistics research at CWI concentrates on theoretical issues
and emphasises the use of tools from programming language analysis for
the analysis of natural language. It is expected, however, that the insights
thus gained will be of great use for the more down-to-earth endeavour of
building practically useful natural language interfaces.
Please contact:
Jan van Eijck - CWI
Tel: +31 20 592 4052
E-mail: jve@cwi.nl