DiSCiPl

Debugging Systems
for Constraint Programming

Project Description

1. Project Summary

Context and Motivations

Conventional programming techniques are not well suited for solving many highly combinatorial industrial problems, like scheduling, decision making, resource allocation or planning. Constraint Programming (CP), an emerging software technology, offers an original approach allowing for efficient and exible solving of complex problems, through combined implementation of various constraint solvers and expert heuristics. Its applications are increasingly elded in various industries.

One of the main features of CP is a new approach to software production: the same program is progressively improved at each step of the development cycle, from the first prototype until the final product. This makes debugging the cornerstone of CP technology.

Topic

We consider debugging in a broad sense: it concerns both validation aspects (to build a correct application) as well as methodological aspects (to find the best solution to a problem by better understanding of constraint solver behaviour). To satisfy these objectives, tools must locate and explain bugs, and graphic tools must help interpreting program behaviour and results. Existing tools reveal to be ineffective in most industrial situations and tools developed for imperative or functional programming are not adapted to the context of CP. The main reasons are that the huge numbers of variables and constraints makes the computation state difficult to understand, and that the non deterministic execution increases drastically the number of computation states which must be analysed.

Objectives and Results

The goal of the proposed project is to define, to implement and to assess novel and effective debugging systems for Constraint Programming.

The expected results of the project include:

novel techniques of high level debugging,
integration of these novel techniques into industrial CP platforms,
assessment of these techniques in industrial applications,

They will allow industrialists to apply this technology readily to applications that can be even more complex in domains such as scheduling, planning, design or configuration.

Approach

We propose to extend three paradigms currently used in advanced debugging:

Declarative debugging: The clear semantics of CP languages allows to adapt debugging techniques developed in the context of Logic Programming.
Assertion based methods: The user may be unable to understand the results or the behaviour of a program, due to the complexity of the constraint system in some intermediate computation state. We propose methods such as the instantiation of the answer constraints and the use of assertions describing the expected properties.
Graphics based methods. The use of graphical tools is of great interest both to display properties of the solutions, and to follow the history of changes of variables or constraints. They may be used during debugging, at several levels, for constraint display and update and combined with assertion based methods.

Impact of Results

The industrial end-user partners stand to benet immediately from the proposed work, in the form of reduced cost for even more complex applications, developed for industries in Europe.

In general, any industry using CP can benet from the proposed work.

The academic partners will use the debugging tools for education and research. Education plays a key role for further proliferation of CP technology in European industry. An important fallout of the
project is the possible use of such debugging systems in other declarative languages.

Mechanisms for Take-up of the Results

The industrial vendor partners will be present throughout the project to allow the academic results to be taken-up in the form of commercial products and training.
Mechanisms for take-up will be:

Production and test of beta releases in commercial products of PrologIA and COSYTEC.
Free academic software based on CLP(FD) and \CIAO" libraries, available by anonymous ftp or WEB access.
Dissemination actions.

2. Approach

2.1 Context and Approach

The two crucial aspects of debugging of CP applications are methodology and validation.

Within application modelling, methodological issues include the choice of a model and of a constraint system, the choice of a resolution strategy and its optimisation.

Validation is the verification of the correspondance between the program and the programmer's intention. It can be modeled by the notions of partial correctness and completeness. The main problem is to locate bugs when symptoms of incorrectness or incompleteness have been observed.

Existing debugging tools are mostly based on displaying variables and constraints, classical dynamic trace box model or GANTT chart display. Graphical tools play an important role to help the user to figure out aspects of the solutions or of an execution state. They are usually programmer oriented and are well adapted to locate simple bugs as long as not too many constraints are involved.

Present methods suffer strong limitations which make them ineffective in most industrial situations. There are two main reasons for that: the huge amount of variables and constraints makes the computation state too complex to understand, and the non determinism of the execution increases drastically the number of computation states which must be analysed.

Since CP is a declarative paradigm, it is particularly well suited to apply declarative analysis methods such as assertion based methods used in validation, and declarative debugging techniques used to locate bugs. A lot of advantages can be derived from using these methods enhanced with adapted graphic tools.

The approach of the proposed debugging systems is to combine the use of graphic tools and declarative analysis methods to assist the user and to simplify bug location.

Such debugging systems will also contribute to resolve the methodological issues, by allowing the user to understand and compare different execution strategies.

The proposed approach is based on the following paradigms:

Declarative debugging for declarative languages,
Assertion based methods,
Graphics based methods.

2.2 Declarative Debugging for Declarative languages

The proposed approach for bug localisation builds up on the notion of Declarative Debugging (DD), developed first for pure logic programs and more recently adapted for other kinds of programming languages, e.g. for a subset of Pascal, and for lazy functional languages including Haskell.

Declarative debugging is based on the fact that in a declarative language (i.e. a language with a semantics which is independent of its execution model) the notion of error is declarative, i.e. a small piece of code can be identified as erroneous without referring to any operational semantics. In the case of definite programs, for example, an error will be an incorrect or, roughly speaking, a missing clause. A user observes an error symptom only by looking at results of his/her program. The basic idea of declarative debugging is that, when some symptom has been identified, the related errors may be located through a user-dialog where questions have to be answered by the user in a way which is independent of the implemented execution model. The dialog may be automatically directed in such a way that only relevant questions are asked, thus avoiding the use of long and ineffective traces.

The kind of symptom to be considered in the project are symptoms which are wrong and missing solutions. Declarative debugging can be viewed as abstract tracing. Each step requires the display of a relevant part of a complex computation state. This brings a number of research problems to be addressed in the project.

A substantial know-how has been already accumulated showing advantages and limits of declarative debugging for various programming languages. Declarative debugging appears to be very promising, but its success is related to the declarativity level of the host language.

Recent industrial practice shows that CP is effectively used in a declarative way. There are several reasons for this. First, the added expressivity of CP languages with regards to classical Logic Programming, and in particular the availability of disequations and of global constraints, make the use of cuts and negation by failure much less mandatory for programmers. Second, CP applications are usually geared towards the exploitation of the numerical capabilities of the language, which means that the meta-programming facilities of the language are less used. Hence, constraint programming languages are much more suited to the application of declarative debugging and should benefit to a large extent from this approach.
Previous studies have shown that declarative diagnosis can be simply adapted to the classical CLP framework, using proof tree and skeleton notions. Nevertheless, both theoretical and practical aspects of CP forbid direct application of this approach. Here are some of the problems which must be further explored :

In pure logic program semantics, Herband interpretations have a foremost importance; declarative diagnosis is very clear because of these privileged interpretations. In constraint programming, they do not represent semantics any more.
Some elements of the constraint domain do not correspond to a ground term of the program language. Moreover, they can not be described by a finite formula of the language (for example, in CLP ( ), the number ).
With respect to the classical theoretical framework, the practical implementations use incomplete constraint solvers. It is important for practical purposes to take into account these features of real implementation. This cannot be expressed in a logical framework: the CLP (R ) constraint solver answers yes for and maybe for , nevertheless ; it answers no for and maybe for , while .

A declarative semantics which considers an incomplete constraint solver cannot be based on the domain or the theory of the underlying interpretation of constraints.

Moreover, in logic programming, each declarative answer (logical consequence of the program) is covered by a more general computed answer. This is not true in constraint programming: if c is a computed answer for the goal then , but if then where are the computed answers for the goal .

For example, let be the constraint domain and be the program then is a -logical consequence of but there is not a more general computed answer. A sufficient condition for this single covering property is the Independence of Negated Constraints (INC), but this condition is verified by very few interesting constraint domains.

As a matter of fact, each declarative answer is covered by a set of computed answers. Sometimes, this set is not finite: for example, let be the constraint domain and be the program . true is a declarative answer, but is not covered by a subset of .

Assume the relation (resp. ) of the previous examples specifies that x is an integer (resp. a real number). From a declarative viewpoint, we cannot consider that the lack of an answer more general than true is a symptom of an error in the program. If the program is correct then its declarative semantics must verify intended properties of the programmer.

More and more CP systems use approximations. The idea is that, for practical purpose, in addition to the current store, the solver uses approximations of the variable value domains. Approximations given by this machinery contain all the answers but possibly other values. Approximations are intervals on continuous domains (: CLP(BNR), PROLOGIV) or (finite) discrete domain (: CHIP).

Moreover, in addition to the answer constraint which can be a complex constraint with many existentially quantified variables, the system displays value domain approximations of the goal variables (this is sometimes the single information considered by the user). We have also to take into account these approximations to elaborate user-adapted diagnosers.

The above problems show that new theoretical foundations are necessary to build effective debugging systems for industrial applications of Constraint Programming.

The semantics based on a rejection criterion presented in abstracts the interpretation of constraint satisfaction. The rejection criterion can be deduced from a constraint domain , a constraint theory (not necessarily satisfaction complete) or an algorithm of constraint satisfaction (not necessarily complete). We believe that incorrectness declarative diagnosis can be studied in this framework, considering the incompleteness of solvers. However, the rejection criterion must be extended to study insufficiency problems.

Declarative diagnosis algorithms of logic programs use well-known properties of logic programs (for example, for insufficiency, the single covering properties). Therefore, there is the need to find new definitions of wrong and missing answers adapted to the more general framework of constraint programming. Then, there is the need to find new declarative diagnosis algorithms in this context.

2.3 Assertion based methods

Declarative debugging is based on the assumption that the user is able to indicate which of the intermediate answers are erroneous. In the case of CP the intermediate answers are constraints representing sets of solutions. In the case of very complex formulae, the user may not be able to decide whether all the solutions of such a constraint correspond to his/her expectations. Thus, querying the user directly about complex constraints in the debugging process should be avoided. Therefore we focus our attention on the possible ways of overcoming the difficulty by adding assertions to the language.

Assertions are partial specifications which are provided by the programmer. They can be attached to the program's predicates, constraints or variables, and they represent expected properties of the program. Assertions are associated to a notion of correctness. A program is correct w.r.t. given assertions if the assertions are satisfied in some formally defined sense.

Two kinds of assertions have been considered. The assertions of the first kind describe properties of the expected answers of the program, thus properties of the expected model. They are often called declarative properties of the program, since they do not depend on the operational semantics (provided that this semantics is sound). A program is correct w.r.t. an assertion if the assertion is entailed by every answer computed by the program. The proof method proposed by Clark is a classical approach to proving declarative properties. A modular approach to proving declarative properties, known as the annotation method has been developed by Deransart. Clark's method can be seen as a special case of the annotation method. In both methods some verification conditions are to be checked locally for each clause. In the context of program debugging this means that the clauses which satisfy the local verification conditions for expected declarative properties may be excluded from debugging, while the other ones may need correction.

Another kind of assertions, concerning execution of the program with Prolog computation rule has been discussed by Drabent and Maluszynski. Such assertions describe input-output behaviour of the predicates. More precisely, they describe the form of predicate calls during the execution of the program, and also the form of the answers for the calls satisfying the assertions. Therefore the described properties are sometimes called dynamic or run-time properties. A program is correct if the initial call satisfying the input assertion of its predicate will result in a computation where every call of a predicate satisfies the input assertion, and every intermediate answer satisfies the output assertion. The proof method of Drabent and Maluszynski defines verification conditions which have to be checked locally for each clause to guarantee correctness of the program. It is worth noticing that the run-time assertions are potentially very useful in the context of debugging. Due to the locality of verification conditions, every clause proved to be correct need not be taken into consideration in debugging of the program. For the remaining clauses a run-time test may be installed for checking whether the form of the local call conforms to the user's expectation stated as the assertion. However, the verification conditions of Drabent and Maluszynski refer to the constraint solving mechanism (in particular to the unification mechanism in the case of Prolog) and are therefore rather difficult to check in practice.

A special kind of run-time assertions where the verification conditions become more simple is known under the name of directional types..

The recent work of Boye and Maluszynski shows a refinement of the verification conditions for directional types and relates the type checking techniques to the annotation method of Deransart.

The work of E. Vétillard shows that directional types are applicable for debugging of CP programs. The assertions are here considered as constraints on an extended constraint structure, and their proof is performed using abstract interpretation techniques. The use of such approximated techniques simplifies the proof, and some experimental results done on a subset of Prolog III have shown that this did not incur a significant precision loss in the proof.

We plan to further investigate the role of abstract interpretation not only in testing assertions, but also in generating them. In particular, assertions can generally be viewed as stating properties at the predicate level in some abstract domain that is understandable to the user. As mentioned before, abstract interpretation techniques can be applied to check such user supplied assertions. Additionally, since the main role of global program analysis based on abstract interpretation is to derive properties of the program, the technique can also be applied to derive new assertions. Such new assertions can for example state properties at intermediate points within the program, providing new, more detailed information for the user. Furthermore, it is also possible to accurately generate missing predicate level assertions from others present in the program. This can greatly reduce the work for the programmer in that when so desired it makes it possible to state assertions only for critical predicates. Also, it is usually the case that small changes to programs take place during program development and debugging. In such a context, assertion information can be updated efficiently and automatically by means of incremental abstract interpretation techniques. We plan to use incremental analysis techniques which are capable of dealing with both user-supplied assertions and those automatically generated. We also plan to extend the techniques to deal with commercially available languages.

In addition to the existence of such static proof mechanisms, which make it possible to debug constraint programs at compile-time, the assertions can be used in the framework of declarative debugging. By providing additional information about the program, the assertions can help the declarative debugging process by answering automatically to some queries. Furthermore, the availability of local static proof methods makes it possible to exclude some clauses altogether from the declarative debugging querying. All this results in a simplification of the dialog between the machine and the programmer during a declarative debugging session.

The dialog itself can be instrumented through the use of assertions, i.e., allowing the user to introduce assertions during a debugging dialog. In such a dynamic use of assertions, there are two important and related points to consider. First, how such assertions can affect, and can be used to incrementally modify, the previous assertions already checked to be correct, so that the annotated program remains correct w.r.t. the old and new assertions. Second, how

to treat programs annotated with assertions. Additionally, these issues should also be considered for constraint languages with complex computation rules, such as passive constraints, delay primitives, and/or suspension of computations. Finally, we plan to investigate other methods such as non-failure analysis and determinacy analysis which can also aid in the program development task.

2.4. Graphics based tools

In the declarative debugging process, one of the research problems mentioned above is to display complex computation states referenced during the dialog. A query to the user comes in the form of a complex constraint, which may sometimes be difficult for the user to understand, and thus to answer. We propose to develop visualization tools in order to help the user to understand such complex constraints, as well as for other purposes related to (performance) debugging, such as visualizing complex search spaces.

There are many challenges involved in developing useful visualization tools for the above mentioned applications. The theory of CLP provides a uniform view of the CLP computation process. There is a generic framework followed by all CP systems, but each system then implements specific types of constraints. Based on the general CP scheme it is possible to develop generic constraint visualization tools, which can be used for any constraint domain, including user--defined constraints. However, for some well--known constraint domains, the user expects some specific properties to hold, which the visualizer should take into account in order to provide a more intuitive depiction of the computation process. This can be accomplished by instantiating the general visualization paradigm with the actual characteristics of the constraint system being used. This instantiation is, in many cases, non trivial (for example, in the cases where different constraint domains are being used at the same time), and can lead to visualization paradigms which look completely different from the generic one, but with which the same underlying theoretical background is shared.

The actual form in which the dialog between the user and the debugging system is maintained is, in principle, an orthogonal issue. Possible interfaces between the user and the system are geometrical representations of the computation space represented by a particular (set of) constraint(s), projections of constraints on a set of variables, representation of the current subset of the search space by means of intervals associated with variables, etc. Adapting them to different constraint systems, as suggested in the above paragraph, can in principle be performed for any of the visualization systems chosen. For example, generic intervals for each variable can be represented as intervals in for clp(), as sets of discrete points in clp (fd) or as subsets of the Herbrand universe in clp(). Designing intuitive constraint representations is a task in which work remains to be done, and which will be addressed in this project.

As mentioned before, one additional important characteristic of the visualization systems to be used in debugging is that they should make it easier for the user to understand the questions produced by the debugger, by making reference to easily recognizable entities such as, for example, original query variables. This can be accomplished by using two different approaches:

The first approach is to identify a subset of relevant variables and constraints and visualize a projection of the query constraint, via an iterative dialog which successively refines the subset.
The second approach is to visualize the query constraint via a set of display tools, which may be generated either automatically or manually.
- Automatic: In CP languages, there are so-called global constraints, which have a very precise, high-level semantics often encountered in industry. For example, the cumulative constraint is applicable to finite-capacity resource. This constraint can be meaningfully visualized with a graph showing the cumulated load versus time.
- Manual: To handle the case where more advanced constraints or complex uses of simpler ones, do not allow simple yet meaningful displays (an example being the Gantt Chart).

Such techniques can be implemented with the use of sophisticated graphic tools. The XGIP module developed by COSYTEC allows the user to design and develop sophisticated graphical interfaces. It is desirable to build drag-and-drop tools on top of XGIP to allow the user to define the displays with the right level of abstraction.

Graphic tools may also be used to update computation states. For example, with the Gantt Chart, the user may post additional constraints via the graphic interface, even to the point of instantiating variables. This interaction can be used within declarative debugging, when the user is queried about more specific states.

A desirable characteristic in the visualization systems to be developed is the ability of performing an abstraction of the represented constraints. This means being able to provide the user with views at different levels of detail, which allow easily understanding the ``big picture'' as well as the more concrete details. This abstraction process can take several forms, depending on the conceptual dimension which is being abstracted: it might very well be as simple as zooming in and out (as done, for example, where a complex process structure is visualized via abstracting sequential tasks and using zooming to move between levels of concretization, and color coding to represent physical resources), or as complex as allowing the user to fade away some classes of constraints (i.e., the user might be interested only on visualizing constraints where variables have a semi--infinite range, or only in constraints which have been suspended due to the inability of the solver to treat them directly). Several interesting dimensions (which in some cases depend on the particular solver being used) must be identified, and the visualization paradigm for them devised.

From a practical point of view, all the tools which implement these visualization ideas and designs need to be easily plugged into existing CP systems. This leads to the need of providing a generic interface with which the constraint language and debugging system can maintain a two--way communication with the visualization. A generic toolkit should be devised to this end. This toolkit can be designed to take into account special characteristics of particular CP systems in much the same way as the generic visualization paradigm can be.

An interesting issue, subject to further study, is representing not only the constraint store as a set of (dis)equations in a (or several) given domain(s), but reflecting how the current state of the constraint store affects the execution flow of the programs. An approach along those lines was the one taken for Prolog. The CP paradigm offers a much richer setting: the set of active and passive constraints associated with every point in the program can be visualized, and the debugger would tell the user why a given execution path was chosen. Special attention should be given to disjunctive constraints which can be seen as encoding alternative execution paths, thus allowing the programs to delay speculative computations and to achieve better performance.

2.5 Research Topics

In addition to the development work (some details are provided in the description of the tasks below), this approach leads to three families of research problems, which have been identified and must be investigated in order to reach the objectives of the project. They are summarized below:

Adaptation of the declarative approach already developed in the framework of logic programming to the CP context.
- Definition of an abstract kernel language aimed at studying aspects of declarative CP such as semantics and execution models, over domains including terms domains, finite domains, rationals and intervals. This will serve as a common kernel for subsequent work by all partners.
- Redefinition of the notions of wrong and missing answers in the context of CP and incomplete solvers, and interactive kernels.
Extensions of CP languages by assertions.

Our approach in handling assertions is to consider different kinds of language extensions including assertions, possibly defined in different languages, either inside the constraints, or outside, as comments. The research will thus include the following points.

Definition of an assertion language, and of the associated validation methods. This work will be led by practical considerations, and will include the analysis of various possible types of assertions, as well as their precise definition in the CP framework. The emphasis is to find a good trade-off between the expressivity of the system and its performance.
Establishment of a link between the assertion language defined above and the CP execution model, and definition of the dynamic use of assertions, in particular during a debugging session.
Develop techniques for assertion generation, user provided assertion checking aand assertion refinement.

Definition of CP visualization paradigms and tools.

Complex constraints may be displayed using graphics and therefore, new forms of interaction during debugging must be designed. Moreover, the display of the history of variables and constraints, showing past events and deductions, will help the user to understand the behavior of the program.

Definition of independent visualization paradigms, abstractionn of complex constraints and of computation states history.
Definition of flexible toolkits to allow users to define visualizations that display details at the most suitable level of abstraction, i.e. domain-dependent or even application-dependent.

As further explained in Chapter 6 on the proposers, the consortium is confident that all required expertise is available within the consortium. Some risk does exist, linked to the research topics described above, but the risk level is acceptable.