Adversarial Constraint Satisfaction by Game Tree Search
by James Little and Ken Brown
How players taking part in specially configured games can solve real-world optimisation problems.
Many decision problems can be modelled as adversarial constraint satisfaction (CS), which allows us to integrate methods from AI game playing into traditional CS backtracking search. In particular, by using the idea of opponents, we can model both collaborative problem solving, where intelligent participants with different agendas must work together to solve a problem, and multi-criteria optimisation, where one decision maker must balance different objectives. To date in our research we have focused on the case where two opponents take turns to instantiate constrained variables, each trying to direct the solution towards their own objective. We represent the process as game-tree search and as a consequence, we develop variable and value ordering heuristics based on game playing strategies. We have evaluated the performance of these algorithms on general-sum graph colouring games, for both multi-participant and multi-criteria optimisation.
As a motivating example, consider planning university committee meetings. Each committee has possible meeting times, and each room on campus has limited availability. Researchers want to cluster meetings together, to leave more time for research. Administrators want to minimise travel time, preferring to locate the meetings close to the administration block. How should the University produce a schedule? The approach considered in this paper would appoint two agents, one for each interest group, and have them take turns choosing rooms and times for individual meetings, in the hope that the interplay between their choices would produce a fair settlement. The agents would clearly bring their own objectives to the problem. If the university prefers a particular balance, it could appoint agents with appropriate negotiating skills.
Our research ultimately has two main objectives: (i) to provide assistance for self-motivated decision makers in possibly adversarial situations, and (ii) to provide a convenient framework for modelling and solving multi-criteria constrained optimisation problems. Within the context of one particular game scenario, for (i) we propose configurations of the constraint-based searcher for play against known opponents. For (ii) we show how to configure both players to achieve desired results.
What is the Game
The game starts with a set of variables each with a domain of possible values. Players take turns to choose a variable and assign a value to it. All the variables must be assigned values, consistent with the constraints, for the game to terminate. The rules of the game are represented by the constraints on the variables, which prevent certain combinations of variable/values being chosen as the game proceeds. Each player has their own objective, reflected in their strategies for assigning values to variables. The different objectives considered are of two types: (1) maximise the number of nodes coloured with a specific colour; and (2) maximise the sum of weights, where each node has a unary function mapping colours to integer weights.
A position in the game corresponds to some of the variables assigned values consistent with the constraints. As a game, we could expect it to terminate without necessarily assigning all variables with values and still be able to calculate a payoff for each player. Therefore we need to modify the game to include backtracking and allow players to collaborate in allowing another move to be chosen. At a terminal position, each player calculates their payoff indicating how well they have satisfied their objectives. How the player plays the game has a bearing on these eventual payoffs. A variety of strategies have been investigated for the players to use, based on the well-known Minimax algorithm.
Experiments & Results
For the multi-participant games, we ask and answer four questions: what strategy should I play if (a) I want to beat my opponent's score in individual games; (b) I want to achieve, on average, a higher score than my opponent; (c) I want to get as close to my optimal score as possible; and (d) I want to maximise my own score given that my opponent has achieved a particular score. For the multi-criteria problems, we ask how should we configure the two players (e) to achieve a result close to being Pareto optimal, or (f) to bias or balance the performance?
|
Results for one game, Red versus Blue. |
The figure shows the results of one game with a variety of different playing strategies. The results are measured with respect to the Pareto frontier and a balance line, indicating how well each objective does relative to the other. The experiments are run on 50 different graph colouring games all of 16 nodes and the results are averaged across them.
About the Research
The research has been carried out at the Cork Constraint Computation Centre (www.4c.ucc.ie) which is funded by the Science Foundation Ireland (www.sfi.ie). The centre's focus is on making Constraint Technology more accessible and easier to use, predominantly using techniques from Artificial Intelligence. This research has been carried out for 9 months and has resulted in a published workshop paper at CP03 and a submission to ECAI04. We are now seeking funding to investigate this approach further and would like to form relationships with other research groups, especially in the areas of Game Theory and Agents.
Please contact:
James Little or Ken Brown, University College Cork, Ireland
Tel: +353 21 4255410
E-mail: jlittle4c.ucc.ie, kbrown4c.ucc.ie
|