Here is the story thus far: a meta-program is a program that takes another program as input or output. Based on this idea we wrote an interpreter for a simple logic programming language and later extended it to build a proof tree. A proof of concept, if you will. Meta-interpreters have lost a lot of steam in the last years. The reason being that they are just too hard to write in most popular programming languages. There’s no a priori reason that prevents us from writing a meta-interpreter in e.g. Python or Java, but the truth is that it’s such a lot of work that it’s not worth the trouble in most cases. The only exception that I can think of are integrated development environments which typically have at least some semantic awareness of the object language. But these languages doesn’t have a simple core and makes parsing awkward to say the least. In logic programming the situation is different. If an interpreter supports definite Horn clauses — facts and rules — and built-in operations it’s powerful enough to run quite a lot of real programs.
So what’s the purpose then? Is meta-programming just a sterile, academic exercise that has no place in real world software development? Since that was a rhetorical question, the answer is no. A resounding no! First, meta-interpreters are great for experimenting with new language features and implementation techniques. For instance we could ask ourself if it would be worthwhile to add support for new search rules in Prolog instead of defaulting to a simple depth-first search. Implementing a new search rule in a meta-interpreter can be done in a few hours, and the resulting program won’t be longer than perhaps a page of code (unless you screwed up, that is). Doing the same task in an imperative programming environment could take days or even weeks depending on the complexity of the existing code base. So meta-programming is useful for prototyping. What else? It can actually be a great aid in debugging. In the following sections we’re going to explain what debugging means in logic programming and develop a simple but functional system for squashing bugs.
Assume that we have a logic program and a goal query . Sterling and Shapiro cites three possible bugs in The Art of Prolog:
- The interpreter could fail to terminate.
- The interpreter could return a false solution . (incorrectness)
- The interpreter could fail to return a true solution . (insufficiency)
Since the first problem is undecidable in general we shall focus on the latter two. But first we need to decide what the words true and false means in this context, and in order to do that some remarks about the semantics of logic programs have to be made. If you’re feeling a bit rusty, I urge you to read up a bit on Herbrand models. Wikipedia and my own earlier post are both good starting points. The basic idea is fortunately rather simple. Logic formulas and programs can be viewed as specifications of models. A model is an interpretation in which the program is true. In general there are many, infinitely many, models of any given definite logic program. Which one should we choose? In a model we are free to reinterpret the non-logical vocabulary in any way we see fit. Consider the following logic program:
It can be seen as a specification of either the set or the set . Notice the subtle difference. The latter model is simpler in the sense that it doesn’t take us outside the domain of the textual representation of the program itself. Such models are known as Herbrand models. Could we be so lucky that Herbrand models are the only kind of models that we need to pay attention to? This is indeed the case. If a logic program has a model then it also has a Herbrand model. But we still need to pick and choose between the infinitely many Herbrand models. The intuition is that a model of a logic program shouldn’t say more than it have to. Hence we choose the smallest Herbrand model as the meaning of a logic program. Or, put more succinct, the intersection of all Herbrand models. For a logic program , let denote the smallest Herbrand model of .
This is good news since we now know that every well-formed logic program has a meaning. Let’s return to the question of false solutions. This notion is only relevant if the programmer has an intended meaning that differs from the actual meaning of the program. In all but the most trivial programming tasks this happens all the time. An intended meaning of a logic program is the set of ground goals for which the program should succeed. Note the “should”. If we briefly return to , the intended meaning is nothing else than the actual meaning, i.e. the set . With this terminology it’s possible to give a precise definition of incorrectness and insufficiency of a logic program :
- is incorrect iff .
- is insufficient iff .
With these definitions we see that the program is neither incorrect nor insufficient. But let’s introduce some bugs in it:
Can you spot them? is incorrect since the base clause is too inclusive. is not a subset of since e.g. the element is not a member of . In the same vein, is insufficient since it’s equivalent to just .
Quite a lot of legwork to explain something which is actually rather simple! What remains is to put everything in practice. Due to space constraints we’ll focus on the incorrectness problem.
A logic program is incorrect if it gives solutions that are not included in the intended model. In a real-world situation this means that the programmer has found a goal which the program should reject, but it doesn’t, and hence it contains at least one bug. The purpose is to find the part in the program that is responsible for the bug. In logic programming terms this is of course a clause. A clause is false iff is true and is false. The purpose of the algorithm is to traverse the proof tree and find such a clause. With this in mind we can at least write the top-level predicate:
false_solution(Goal, Clause) :- %Build a proof tree. interpreter::prove(Goal, Tree), %Find a false clause. false_goal(Tree, Clause).
Well, that wasn’t too hard. What about ? The tree is of the form . Hence there are two cases: either is false or it’s true. If it’s false, then we must continue the search in . If it’s true, then the current clause is the clause that we’re looking for. To determine whether is false we need an auxiliary predicate, , where the first argument is the conjunction of nodes and the second argument is the false clause (if it exists).
false_goal((A :- B), Clause) :- ( false_conjunction(B, Clause) -> true ; Clause = (A :- B1), %Necessary since we don't want the whole tree. extract_body(B, B1) ).
By the way, this is a fine example of top-down development. In each step we’re breaking the original problem into easier problems and assume that we’re able to solve them later. is a bit trickier. The first argument is a conjunction of nodes of the form . Just like before there are two cases since is either false or true. If it’s true, then we move on to the rest of the nodes. If it’s false, then we’d like to know whether is true or false. Luckily we’ve already solved this problem before — a call to will do the trick just fine.
false_conjunction(((A :- B), _Bs), Clause) :- query_goal(A, false), !, false_goal((A :- B), Clause). %Almost the same case as above, but with only one element. false_conjunction((A :- B), Clause) :- query_goal(A, false), !, false_goal((A :- B), Clause). false_conjunction((_A, As), Clause) :- %A is implicitly true. false_conjunction(As, Clause).
Only the most perplexing predicate remains: . The second argument is if is true and if it’s false. How can we know this? This is where the programmer’s intended model enters the picture. For now, we’re just going to use her/him as an oracle and assume that all choices are correct. The predicate is then trivial to write:
query_goal(G, Answer) :- %Change later. write('Is the goal '), write(G), write(, ' true?'), nl, read(Answer).
In essence the user will be asked a series of questions during a session with the program. Depending on the answers, i.e. the intended model, the program will dive deeper and deeper into the proof tree in order to find the troublesome clause. As an example, here’s an append program where the base case is wrong:
append([_X], Ys, Ys) :- true. append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).
And the session with the program would look like this:
 ?- debugging::false_solution(append([a,b,c], [d,e], Xs), Clause). Is the goal append([b,c],[d,e],[b,d,e]) true? |: false. Is the goal append([c],[d,e],[d,e]) true? |: false. Xs = [a, b, d, e], Clause = (append([c], [d, e], [d, e]):-true)
And we clearly see that it’s the base case that’s wrong.
The algorithm was taken from The Art of Prolog. Some simplifying assumptions have been made. Among other things there’s currently no support for built-in operations. This is rather easy to fix, however. A more serious question is if it would be possible to minimize the role of the oracle, since it’s now queried every time a decision needs to be made. There are two techniques for coping with this. Either we do a smarter traversal of the proof tree with e.g. divide and conquer, or we find a way to approximate the intended model of the program without the use of an oracle.
The source code is available at https://gist.github.com/1351227.