
Meta-Programming in Prolog – Part 2

Here is the story thus far: a meta-program is a program that takes another program as input or output. Based on this idea we wrote an interpreter for a simple logic programming language and later extended it to build a proof tree. A proof of concept, if you will. Meta-interpreters have lost a lot of steam in recent years. The reason is that they are just too hard to write in most popular programming languages. There’s no a priori reason that prevents us from writing a meta-interpreter in e.g. Python or Java, but the truth is that it’s so much work that it’s not worth the trouble in most cases. The only exception I can think of is integrated development environments, which typically have at least some semantic awareness of the object language. But those languages don’t have a simple core, which makes parsing awkward to say the least. In logic programming the situation is different: if an interpreter supports definite Horn clauses — facts and rules — and built-in operations, it’s powerful enough to run quite a lot of real programs.

So what’s the purpose then? Is meta-programming just a sterile, academic exercise that has no place in real world software development? Since that was a rhetorical question, the answer is no. A resounding no! First, meta-interpreters are great for experimenting with new language features and implementation techniques. For instance we could ask ourselves whether it would be worthwhile to add support for new search rules in Prolog instead of defaulting to a simple depth-first search. Implementing a new search rule in a meta-interpreter can be done in a few hours, and the resulting program won’t be longer than perhaps a page of code (unless you screwed up, that is). Doing the same task in an imperative programming environment could take days or even weeks depending on the complexity of the existing code base. So meta-programming is useful for prototyping. What else? It can actually be a great aid in debugging. In the following sections we’re going to explain what debugging means in logic programming and develop a simple but functional system for squashing bugs.

Algorithmic debugging

Assume that we have a logic program P and a query \leftarrow G. Sterling and Shapiro cite three possible bugs in The Art of Prolog:

  1. The interpreter could fail to terminate.
  2. The interpreter could return a false solution G\theta. (incorrectness)
  3. The interpreter could fail to return a true solution G\theta. (insufficiency)

Since the first problem is undecidable in general we shall focus on the latter two. But first we need to decide what the words true and false mean in this context, and in order to do that some remarks about the semantics of logic programs have to be made. If you’re feeling a bit rusty, I urge you to read up a bit on Herbrand models. Wikipedia and my own earlier post are both good starting points. The basic idea is fortunately rather simple. Logic formulas and programs can be viewed as specifications of models. A model is an interpretation in which the program is true. In general there are many, infinitely many, models of any given definite logic program. Which one should we choose? In a model we are free to reinterpret the non-logical vocabulary in any way we see fit. Consider the following logic program:

natural(zero).
natural(s(X)) \leftarrow natural(X).

It can be seen as a specification of either the set \{natural(0), natural(1), \ldots\} or the set \{natural(zero), natural(s(zero)), \ldots \}. Notice the subtle difference. The latter model is simpler in the sense that it doesn’t take us outside the domain of the textual representation of the program itself. Such models are known as Herbrand models. Could we be so lucky that Herbrand models are the only kind of models that we need to pay attention to? This is indeed the case. If a logic program has a model then it also has a Herbrand model. But we still need to pick and choose between the infinitely many Herbrand models. The intuition is that a model of a logic program shouldn’t say more than it has to. Hence we choose the smallest Herbrand model as the meaning of a logic program. Or, put more succinctly, the intersection of all Herbrand models. For a logic program P, let M_P denote the smallest Herbrand model of P.

This is good news since we now know that every well-formed logic program has a meaning. Let’s return to the question of false solutions. This notion is only relevant if the programmer has an intended meaning that differs from the actual meaning of the program. In all but the most trivial programming tasks this happens all the time. An intended meaning I_P of a logic program P is the set of ground goals for which the program should succeed. Note the “should”. If we briefly return to natural/1, the intended meaning is nothing else than the actual meaning, i.e. the set \{natural(zero), natural(s(zero)), \ldots \}. With this terminology it’s possible to give a precise definition of incorrectness and insufficiency of a logic program P:

  1. P is incorrect iff M_P \not\subseteq I_P.
  2. P is insufficient iff I_P \not\subseteq M_P.

With these definitions we see that the natural/1 program is neither incorrect nor insufficient. But let’s introduce some bugs in it:

natural1(\_).
natural1(s(X)) \leftarrow natural1(X).

natural2(zero).
natural2(s(X)) \leftarrow natural2(s(X)).

Can you spot them? natural1/1 is incorrect since the base clause is too inclusive: M_P is not a subset of I_P, since e.g. natural1(-1) is a solution even though it is not a member of I_P. In the same vein, natural2/1 is insufficient since it’s equivalent to just natural2(zero).

Quite a lot of legwork to explain something which is actually rather simple! What remains is to put everything in practice. Due to space constraints we’ll focus on the incorrectness problem.

Incorrectness

A logic program P is incorrect if it gives solutions that are not included in the intended model. In a real-world situation this means that the programmer has found a goal which the program should reject, but it doesn’t, and hence it contains at least one bug. The purpose is to find the part in the program that is responsible for the bug. In logic programming terms this is of course a clause. A clause A \leftarrow B is false iff B is true and A is false. The purpose of the algorithm is to traverse the proof tree and find such a clause. With this in mind we can at least write the top-level predicate:

   false_solution(Goal, Clause) :-
       %Build a proof tree.
       interpreter::prove(Goal, Tree),
       %Find a false clause.
       false_goal(Tree, Clause).

Well, that wasn’t too hard. What about false_goal/2? The tree is of the form A \leftarrow B. Hence there are two cases: either B is false or it’s true. If it’s false, then we must continue the search in B. If it’s true, then the current clause is the clause that we’re looking for. To determine whether B is false we need an auxiliary predicate, false_conjunction/2, where the first argument is the conjunction of nodes and the second argument is the false clause (if it exists).

   false_goal((A :- B), Clause) :-
       (    false_conjunction(B, Clause) ->
            true
       ;    Clause = (A :- B1),
            %Necessary since we don't want the whole tree.
            extract_body(B, B1)
       ).

By the way, this is a fine example of top-down development. In each step we’re breaking the original problem into easier problems and assume that we’re able to solve them later. false_conjunction/2 is a bit trickier. The first argument is a conjunction of nodes of the form A \leftarrow B. Just like before there are two cases since A is either false or true. If it’s true, then we move on to the rest of the nodes. If it’s false, then we’d like to know whether B is true or false. Luckily we’ve already solved this problem before — a call to false_goal/2 will do the trick just fine.

   false_conjunction(((A :- B), _Bs), Clause) :-
       query_goal(A, false),
       !,
       false_goal((A :- B), Clause).
   %Almost the same case as above, but with only one element.
   false_conjunction((A :- B), Clause) :-
       query_goal(A, false),
       !,
       false_goal((A :- B), Clause).
   false_conjunction((_A, As), Clause) :-
       %A is implicitly true.
       false_conjunction(As, Clause).

Only the most perplexing predicate remains: query_goal/2. The second argument is bound to true if A is true and to false otherwise. How can we know this? This is where the programmer’s intended model enters the picture. For now, we’re just going to use her/him as an oracle and assume that all answers are correct. The predicate is then trivial to write:

   query_goal(G, Answer) :-
       %Change later.
       write('Is the goal '),
       write(G),
        write(' true?'),
       nl,
       read(Answer).

In essence the user will be asked a series of questions during a session with the program. Depending on the answers, i.e. the intended model, the program will dive deeper and deeper into the proof tree in order to find the troublesome clause. As an example, here’s an append program where the base case is wrong:

append([_X], Ys, Ys) :- true.
append([X|Xs], Ys, [X|Zs]) :-
    append(Xs, Ys, Zs).

And the session with the program would look like this:

[1]  ?- debugging::false_solution(append([a,b,c], [d,e], Xs), Clause).
Is the goal append([b,c],[d,e],[b,d,e]) true?
|: false.
Is the goal append([c],[d,e],[d,e]) true?
|: false.
Xs = [a, b, d, e],
Clause = (append([c], [d, e], [d, e]):-true)

And we clearly see that it’s the base case that’s wrong.

Summary

The algorithm was taken from The Art of Prolog. Some simplifying assumptions have been made. Among other things there’s currently no support for built-in operations. This is rather easy to fix, however. A more serious question is whether it would be possible to minimize the role of the oracle, since it’s currently queried every time a decision needs to be made. There are two techniques for coping with this: either we do a smarter traversal of the proof tree, with e.g. divide and conquer, or we find a way to approximate the intended model of the program without the use of an oracle.

Source code

The source code is available at https://gist.github.com/1351227.

Meta-Programming in Prolog – Part 1

Introduction

Meta-programming is part of the folklore in Prolog, and is in general a rather old concept with roots tracing back to at least the 1950s. To give a definition that captures all the relevant concepts is outside the scope of this introductory text, but I shall at least provide some pointers that’ll be useful later on. Programs are useful in many different domains. We might be working with numbers, with graphs, with lists or with any other data structure. What happens when the domain is another programming language? Well, nothing, really: from the computer’s point of view there’s no difference between this scenario and the former. But conceptually speaking we’re writing programs that are themselves working with programs. Hence the word “meta” in meta-programming. A compiler or interpreter is by this definition a meta-program. But in logic programming we’re usually referring to something more specific when we’re talking about meta-programming, namely programs that take other logic programs as data. Since Prolog is a homoiconic language there’s also nothing that stops us from writing programs that take other Prolog programs as data, and even though there’s a subtle distinction between this and the former scenario they are often referred to as one and the same. So, to summarize, when we’re talking about meta-programs in logic programming we’re quite often referring to Prolog programs that use logic programs as data.

The road map for this post is to see some examples of meta-interpreters in Prolog. Then we’re going to use the interpreters to aid program development with a technique known as algorithmic debugging. But enough talk, let’s do this!

Meta-interpreters

There’s still ample room for confusion regarding the word “meta” in meta-interpreter. I shall use the word whenever I refer to an interpreter for a logic programming language, even though this is not factually correct since one usually demands that the object language and the meta language are one and the same. That is: we write an interpreter for Prolog in Prolog. There are good reasons for not doing this. Prolog is a large and unwieldy language with many impure features such as cut, IO, assert/retract and so on, and when we’re working with meta-interpreters we’re often only interested in a small, declarative part of the language. Hence we shall restrict our focus to a programming language akin to pure Prolog which is basically just a set of Horn clauses/rules.

Even though we still haven’t decided the syntax for the object language we know that we must represent at least two things: facts and rules. Since a fact A is equivalent to the rule A \leftarrow true we can store these in the same manner. Assume that P is a definite logic program. How should we represent it? As a list or a search tree? This could be a good approach if we were interested in implementing dynamic predicates in a declarative way, but since P is static it’s much easier to just use the database and store everything as facts. For every rule A \leftarrow B_1, ..., B_n \in P, represent it as the fact rule(A, [B_1, ..., B_n]). If a rule only has the single atom true in its body, i.e. it is a fact, then the second argument is the empty list. Obviously this is just one of many possible representations, but it’s simple to implement and work with.

As an example, here’s how we would write append/3:

rule(append([], Ys, Ys), []).
rule(append([X|Xs], Ys, [X|Zs]),[append(Xs, Ys, Zs)]).

Simple, but not exactly pleasing to the eye. Fortunately it’s easy to add some syntactic sugar with the help of Prolog’s term expansion mechanism. Instead of directly using rule/2 we can rewrite append/3 as:

append([], Ys, Ys) :- true.
append([X|Xs], Ys, [X|Zs]) :-
    append(Xs, Ys, Zs).

And then define a suitable expansion object so that we end up with a set of rule/2 facts. This is a rather mundane and not very exciting programming task and hence omitted. Now on to the interpreter. It will be defined by a set of prove/1 clauses where the single argument is a list of goals. If you’ve never seen a meta-interpreter in Prolog before, you’re probably in for some serious disappointment since the program is so darn simple. So simple that a first reaction might be that it can’t possibly do anything useful. This first impression is wrong, however, since it’s easy to increase the granularity of the interpreter by implementing features instead of borrowing them from the Prolog system.
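
For completeness, here is a minimal sketch of what such an expansion could look like, written as plain Prolog term_expansion/2 clauses (in Logtalk the same clauses would go into a hook object); the helper body_to_list/2 is a name made up for this sketch, not something from the original code:

%Expand object-language clauses into rule/2 facts.
term_expansion((Head :- true), rule(Head, [])) :- !.
term_expansion((Head :- Body), rule(Head, Goals)) :-
    body_to_list(Body, Goals).

%Flatten a conjunction into a list of goals.
body_to_list((G, Gs), [G|Rest]) :- !,
    body_to_list(Gs, Rest).
body_to_list(G, [G]).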

As mentioned, the interpreter takes a list of goals as argument. This means that there’s a base case and a recursive case. In the base case of the empty list we are done. In the recursive case we have a list of the form [G|Gs] where G is the first goal that shall be proven. How do we prove G then? By checking whether there’s a corresponding rule rule(A, [B_1, ..., B_n]) where A and G are unifiable with mgu \theta, and then recursively proving ([B_1, ..., B_n|Gs]) \theta. In almost any other language this would be considerable work, but since Prolog is a logic programming language we already know how to do unification. Thus we end up with:

%Initialize the goal list with G.
prove(G) :-
   prove1([G]).

prove1([]).
prove1([G|Gs]) :-
   rule(G, B),
   prove1(B),
   prove1(Gs).

This is a prime example of declarative programming. We’ve only described what it means for a conjunction of goals to be provable and left the rest to the Prolog system. If you’re unsure why or how the interpreter works I urge you to try it for yourself.
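
For example, with the rule/2 facts for append/3 above loaded next to the interpreter, a session could look roughly like this:

?- prove(append([a,b], [c,d], Xs)).
Xs = [a, b, c, d].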

Extensions

To prove that I wasn’t lying before I shall illustrate some neat extensions to the bare-bones interpreter. Strictly speaking we don’t really need anything else since the language is already Turing complete. It’s e.g. trivial to define predicates that construct and operate on the natural numbers. For example:

nat(zero) :- true.
nat(s(X)) :- nat(X).

add(zero, Y, Y) :- true.
add(s(X), Y, s(Z)) :-
   add(X, Y, Z).

But since these operations can be implemented much more efficiently on any practical machine it’s better to borrow the functionality. Hence we shall define a set of built-in predicates that are proved by simply executing them. The easiest way is to add a rule/2 definition for every built-in predicate.

rule(rule(A, B), []) :-
    rule(A, B).
rule((X is Y), []) :-
    X is Y.

Why the first clause? So that we can facilitate meta-programming and use rule/2 in our object language. I mentioned earlier that the interpreter as defined is not really a meta-interpreter in the strict sense of the word, and that Prolog is such a large language that writing meta-interpreters for it is probably not worth the hassle. But now we have a very restricted yet powerful language. Can we write a real meta-interpreter in that language? Yes! Actually it’s hardly any work at all since we already have the source code for the old interpreter.

prove(G) :-
    prove1([G]).

prove1([]) :- true.
prove1([G|Gs]) :-
    rule(G, B),
    prove1(B),
    prove1(Gs).

Glorious. Perhaps not very practical, but glorious.

Building a proof tree

When our interpreter gives an answer it doesn’t provide any indication as to why that answer was produced. Perhaps the answer is in fact wrong and we want to localize the part of the code that is responsible for the error. The first step in this process is to build a proof tree. A proof tree for a goal \leftarrow G and logic program P is a tree where 1) the root is labeled G, and  2) each node has a child for every subgoal with respect to P. Hence the proof tree is more or less a representation of a sequence of trace steps.

It might sound like a complex task, but it’s really not. All we need is to extend the prove/1 predicate with an additional argument for the proof tree. In the base case of the empty list the tree contains the single node true. If [G|Gs] are the current goals then we prove G and Gs and build a proof tree from the recursive goals.

prove(G, T) :-
    prove1([G], T).

prove1([], true).
prove1([G|Gs], ((G :- T1), T2)) :-
    rule(G, B),
    prove1(B, T1),
    prove1(Gs, T2).

And when called with G = append([a,b], [c,d], Xs) the resulting tree looks like this:

?- interpreter::prove(append([a,b], [c,d], Xs), T).
Xs = [a, b, c, d],
T = ((append([a, b], [c, d], [a, b, c, d]):- (append([b], [c, d], [b, c, d]):- (append([], [c, d], [c, d]):-true), true), true), true)

NB: this tree has a lot of redundant true entries. How can we fix this?
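
One possible fix, as a sketch only (the debugging post earlier in this archive was written against the original tree shape, so this variant is not used there): treat a single remaining goal as its own case, so that no trailing true nodes are produced.

prove1([], true).
prove1([G], (G :- T)) :-
    rule(G, B),
    prove1(B, T).
prove1([G|Gs], ((G :- T1), T2)) :-
    Gs = [_|_],
    rule(G, B),
    prove1(B, T1),
    prove1(Gs, T2).

Facts still get a (G :- true) leaf, which is intentional, but the trailing “, true” markers disappear.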

Summary

We’re now able to build proof trees. In the next entry we’re going to use them to localize errors in logic programs.

For a good discussion of meta-interpreters in Prolog the reader should turn to The Craft of Prolog by Richard O’Keefe. This post was just the tip of the iceberg. Another interesting subject is to experiment with different search rules, and for this I shamelessly promote my own bachelor’s thesis which is available at http://www.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:325247.

Source code

The source code is available at https://gist.github.com/1330321.

Prolog’s Makin’ Music – Part 3

This time we’ll have a look at some techniques for automatically generating music — or rather, to be more accurate, melodies. Since we’ve deduced that a musical scale is a mathematical structure in which it’s possible to perform all the standard operations, we have quite a lot of freedom when it comes to the choice of a suitable formalism. We’ll also make some simplifications to make the job easier: namely that our melodies consist of a single list of notes where all notes are assumed to be of equal importance, e.g. played in the same timbre and tempo. This means that the resulting melodies won’t be that pleasing to the ear, but there’s of course nothing that stops us from using one of these melodies as a building block in a larger musical composition. I suppose that we still need musicians for something!

Lindenmayer systems

A Lindenmayer system, or an L-system, is a formal grammar quite similar to a context-free grammar. The goal is to rewrite a starting string, the axiom, by repeatedly applying production rules to all of its symbols in parallel. A rule is simply an if-then statement of the form: if the token is X then replace it by Y. Formally speaking this information can be summarized as a tuple:

L = (V, \omega, P)

Where V is a set of variables, \omega the starting axiom and P the set of production rules, i.e. functions from V to the language. The symbols that don’t appear in V, the constants, are always left untouched. The first example on Wikipedia is an Algae system. It has two variables, A and B, A as starting axiom and the two rules:

A \rightarrow AB

B \rightarrow A

So the strings that will be produced are: A, AB, ABA, ABAAB, and so on. It shouldn’t be hard to see how the rules were applied. First the axiom was used. Then, the rule for A was used which produced AB. Then the rules for both A and B were used which produced AB and A, i.e. ABA.

We are free to interpret the structure in an L-system in any way we see fit. For example, we could interpret A in the Algae system as “play the first note in the scale” and B as “play the second note in the scale”. I shall however use something a bit closer to the Logo notation that is commonly used to visualize L-systems. It consists of the following commands:

  • f — draw forward.
  • + — turn right.
  • -  — turn left.
  • s — push the current point on the stack.
  • r — pop an entry from the stack.

But since we’re working with scales, and not images, we have to reinterpret these commands. I propose the following:

  • f — play the current note.
  • + — increase the current note.
  • - — decrease the current note.
  • s — push the current note on the stack.
  • r — pop an entry from the stack.

Hence we’re going to use L-systems that produce strings in this format. From such a string it’s then possible to extract a melody. For example, the string "f+f-f" could be interpreted as the notes 0,1,0.

We’ll return to this later. For now, let’s concentrate on implementing L-systems in Logtalk. This can be done in a large number of ways, but once we’ve chosen a suitable representation everything else will more or less follow automatically. Every L-system will be represented by an axiom and a set of production rules for both variables and constants. Since the production rules take symbols as argument and produce strings/lists, DCG’s are a fine choice. For the moment we can ignore everything else and just stipulate what an L-system is.

:- protocol(l_system).

   :- public(rule//1).
   :- public(axiom/1).

:- end_protocol.

:- object(algae,
    implements(l_system)).

   axiom([a]).

   rule(a) --> [a,b].
   rule(b) --> [a].

:- end_object.

Then we’ll need a predicate that takes a string as input and applies all applicable production rules. Since the rules themselves are written in DCG notation, it’s easiest to continue with this trend. The predicate will take a string and an L-system as input, and iteratively apply the rules for the elements in the string.


next([], _) --> [].
next([X|Xs], L) -->
    L::rule(X),
    next(Xs, L).

And all that remains is a predicate that calls next//2 for a predetermined number of generations. It’s more or less a standard loop: if N is 1, then the resulting string is the axiom of the L-system. Otherwise, recursively run the L-system for N - 1 generations and then run it once more.

generation(1, L, X) :-
    L::axiom(X).
generation(N, L, X) :-
    N > 1,
    N1 is N - 1,
    generation(N1, L, Y),
    phrase(next(Y, L), X, []).
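
Assuming the predicates above are collected in an l_systems object, as they are used further down, running the algae system reproduces the strings from before:

?- l_systems::generation(4, algae, Xs).
Xs = [a, b, a, a, b].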

This is almost too easy! For reference, let’s also implement an L-system that makes use of the Logo commands previously discussed.

:- object(koch_curve,
    implements(l_system)).

    axiom([f]).

    rule(-) --> [-].
    rule(+) --> [+].
    rule(f) --> [f,+,f,-,f,-,f,+,f].
:- end_object.

This structure is known as the Koch curve; interpreted as drawing commands it traces out the familiar zig-zagging fractal.

Now we’ll need a predicate that transforms a list of commands into a list of notes. It’ll need 4 input arguments:

  • Xs — the list of commands.
  • Scale — the scale that the notes shall be generated according to.
  • N — the starting/current note.
  • S — the stack.

And one single output argument:

  • Ns — the resulting list of notes.

It’s not that hard to implement since it only consists of a case-analysis of the command list. For example, if the command list is empty then the list of notes is empty. If the command is f, then we add the current note N to Ns, and so on for all the other commands.

    transform([], _, _, _, []).
    transform([f|Cs], Scale, N, S, [N|Ns]) :-
        transform(Cs, Scale, N, S, Ns).
    transform([-|Cs], Scale, N, S, Ns) :-
        Scale::lower(N, N1),
        transform(Cs, Scale, N1, S, Ns).
    transform([+|Cs], Scale, N, S, Ns) :-
        Scale::raise(N, N1),
        transform(Cs, Scale, N1, S, Ns).
    transform([s|Cs], Scale, N, S, Ns) :-
        transform(Cs, Scale, N, [N|S], Ns).
    transform([r|Cs], Scale, _, [N|S], Ns) :-
        transform(Cs, Scale, N, S, Ns).
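
As a quick check, transforming the small command string from earlier against the c_major scale from the previous post (starting at the tonic, note 0, with an empty stack) gives something like:

?- l_systems::transform([f,+,f,-,f], c_major, 0, [], Ns).
Ns = [0, 2, 0].

The raise in c_major is a whole step, which is why the middle note is 2 rather than 1.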

Putting everything together

We can now generate command strings from L-systems and convert these into notes in a given scale. What remains is to convert the notes into frequencies with a specific duration. These can then be converted into samples and be written to a WAV file.

    generate_notes(L, I, Scale, Notes, Number_Of_Samples) :-
       l_systems::generation(I, L, X),
       Scale::nth(0, Tonic),
       l_systems::transform(X, Scale, Tonic, [], Notes0),
       findall(F-0.2,
              (list::member(Note, Notes0),
               Scale::frequency(Note, F)),
              Notes),
       length(Notes, Length),
       synthesizer::sample_rate(SR),
       Number_Of_Samples is Length*(SR/5).

The value 0.2, the duration of each note, is of course just an example and can be changed at whim. This is all we need in order to crank out some simple tunes. Luckily, I’ve already prepared some samples for your auditory pleasure.

Koch curve in C major

This is the curve depicted earlier. To be frank it sounds kind of dreadful, but fortunately the other samples are somewhat more interesting. Next up is the dragon curve!

Dragon curve in C major

Dragon curve in C minor

I think it sounds much better than the Koch curve, but that might be due to the fact that I view my creations with rose-tinted eyes; unable to see the unholy abomination that is their true form. Let’s have a look at the Hilbert curve.

Hilbert curve in C major

Hilbert curve in C minor

Catchy! The last L-system is a fractal plant.

Fractal plant in C major

Fractal plant in C minor

I think the results are quite interesting, and this is only the tip of the iceberg since it’s possible to create any kind of L-system and interpret it as a melody. The whole set is available at Soundcloud.

I initially intended to include a section in which I created a Prolog interpreter that for each refutation also produced a melody, but the time is already running out. It’s not impossible that I’ll return to the subject at a later date however!

Source code

The source code is available at https://gist.github.com/1034067.

Prolog’s Makin’ Music – Part 2

Scales, scales and scales

It’s time to bring on the noise! To recapitulate the story thus far, it suffices to say that we’re now able to write raw audio data in the WAV format after some mysterious bit-fiddling. As already mentioned we could in principle start to crank out tunes at this stage, but to illustrate why this isn’t a good idea I have prepared a small sample file containing 1 million randomly generated samples in the range [-32768, 32767]. It is perhaps also instructive to see a visual representation of the samples. Here are the results:

Soundcloud link.
As expected the result is garbled noise. There’s obviously no discernible structure to speak of since the samples are literally speaking all over the place. We want something more harmonic and symmetrical that we can actually work with. Why do classical instruments (excluding drums, cymbals and other unpitched instruments) have such nice acoustic properties? To make a long story short, many instruments produce sounds with the help of vibrating strings – oscillations at different frequencies, e.g. a sine wave. Different frequencies give us different tones. In e.g. a piano the keys to the right have higher frequencies than those to the left. Hence, to construct something akin to an instrument we need a system of frequencies and a function that can generate the corresponding waves. Obviously these problems have already been solved many times over in the history of music theory, and it would be ignorant not to take advantage of this. Let’s start with the first problem of finding a system of frequencies, a scale. This is actually harder than expected. We know that the scale should go from lower to higher frequencies and that there should, at the very least, exist some relationship between them. A first attempt might be to start the scale at an arbitrary frequency, e.g. 500, and for every new frequency add 50. This would result in a scale where the difference between any two adjacent frequencies is constant, or in other words linear. With 12 frequencies we would obtain:

0 500
1 550
2 600
3 650
4 700
5 750
6 800
7 850
8 900
9 950
10 1000
11 1050

Our first scale! The first number in the column, the identification number, is called a note (to be more precise a note also needs a duration). Hence the purpose of the scale is to give frequencies to notes. The traditional notation (pun not intended) for a series of 12 notes is A, A\sharp, B, C, C\sharp, D, D\sharp, E, F, F\sharp, G, G\sharp, where the notes with funny looking sharp signs correspond to the small, black keys on a piano (so-called “accidentals”). For simplicity we’ll use the numeric notation though. The next question is how this scale sounds when it is played in succession.

Linear scale

Perhaps surprisingly, it sounds pretty terrible even though it’s not that simple to say why. Wrong starting frequency? Wrong increment? Well, maybe, but that’s not the real problem, which is that as the frequencies increase the perceived difference, the distance, gets smaller and smaller, which results in an unsymmetrical scale. Hence we want a scale where the perceived distance (the frequency ratio) between any two adjacent notes is constant. This is known as equal temperament. To be more precise the distance doesn’t have to be the same everywhere, but it has to be a multiple of the smallest possible step in the scale. For example we could have a scale where the distance between the first and second frequency is 1.5, but where the distance between the second and third frequency is 1.5 \times 2 = 3.

With this in mind it’s not too hard to create a new scale. The frequency of the N:th note, counting from 0, is then Start \times Ratio^{N}, where Start and Ratio are constants. For Start = 500 and Ratio = 1.06978 we get the scale:

0 500.0
1 534.89
2 572.2
3 612.1
4 654.9
5 700.6
6 749.4
7 801.7
8 857.7
9 917.5
10 981.6
11 1050.0

Equal temperament scale

To be honest it still doesn’t sound very good, but at least it’s a step in the right direction. Somehow it doesn’t provide enough closure, and if we were to extend it even further the new notes wouldn’t really relate to the old notes in a natural way (what is “natural” is of course biased by experience and tradition). Here’s an idea: what if the extended frequencies were (approximately) multiples of the first 12 frequencies? That is F_{12} \approx 2 \times F_0, F_{13} \approx 2 \times F_1 and so on. It’s not too hard to derive such a constant. Let x be the constant. Then F_{12} = 2\times F_0 = F_0 \times x ^{12} \Leftrightarrow \frac{2\times F_0}{F_0} = x^{12} \Rightarrow 2^{1/12} = x. Hence the general formula is F_n = F_0 \times x^{n} = F_0 \times (2^{1/12})^n = F_0 \times 2^{n/12}.
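
As a sanity check, a few throwaway lines of Prolog will print such a scale for any starting frequency (this little helper is not part of the project code; it assumes between/3 and format/2, which most systems provide):

%Print the twelve equal temperament frequencies starting at Start.
print_scale(Start) :-
    between(0, 11, N),
    F is Start * 2.0 ** (N / 12),
    format("~w ~1f~n", [N, F]),
    fail.
print_scale(_).

Calling print_scale(500) reproduces the table below, and print_scale(440) the chromatic scale further down.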

If the starting frequency is 500 we then get the scale:

0 500.0
1 529.7
2 561.2
3 594.6
4 629.0
5 667.4
6 707.1
7 749.2
8 793.7
9 840.9
10 890.9
11 943.9

Equal temperament scale – second attempt

It’s hard to notice the difference in the first few notes, but toward the end of the scale the difference gets more and more pronounced. Now we have something quite close to what’s actually used in the real world. The only difference is the starting frequency, which is usually 440 Hz, the so-called standard concert pitch. This value is somewhat arbitrary, but just for reference here’s what we get:

0 440.0
1 466.2
2 493.9
3 523.3
4 554.4
5 587.3
6 622.3
7 659.3
8 698.5
9 739.9
10 783.9
11 830.6

Chromatic scale

Fortunately it’s rather easy to implement scales once we have the theory behind us. There are two basic choices for the representation: either we work with the raw frequencies in the scale, or we work with the notes and extract the frequencies when needed. I shall go with the second option since it’s often easier to work with notes. Interestingly enough, the chromatic 12 tone scale that we just used is an example of an abelian (commutative) group under addition modulo 12, with 0 as the unit element, which means that it’s quite pleasant to work with. The basic operations that we want to perform are:

  • raise/2 – get the next note in the scale.
  • lower/2 – get the preceding note in the scale.
  • add/3 – add two notes in the scale.
  • length/1 – get the number of notes in the scale.
  • nth/2 – get the n:th note in the scale, starting from 0.
  • frequency/2 – get the associated frequency of the note.

Which is easily expressible in terms of a protocol:

:- protocol(scalep).

   :- public(raise/2).
   :- public(lower/2).
   :- public(add/3).
   :- public(nth/2).
   :- public(length/1).
   :- public(frequency/2).

:- end_protocol.

And implementing the chromatic scale is straightforward:

:- object(chromatic_scale,
        implements(scalep)).

    %A, A#, ..., G, G#.
    length(12).

    raise(N, N1) :-
        N1 is (N + 1) mod 12.

    lower(N, N1) :-
        N1 is (N - 1) mod 12.

    add(N1, N2, N3) :-
        N3 is (N1 + N2) mod 12.

    nth(I, I) :-
        % Used so that we can call nth/2 with uninstantiated
        % arguments. Notes are numbered from 0 to 11.
        between(0, 11, I).

    %A4 to G#5.
    frequency(N, F) :-
        F is 440 * 2 ** (N/12).

:- end_object.
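
A quick sanity check of the frequency mapping (the exact number of digits printed will vary between systems):

?- chromatic_scale::frequency(3, F).
F = 523.2511306011972.

That is, note 3 is C5 at roughly 523.25 Hz, which matches the C major table below.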

Extending this scale to use more than 12 elements would of course not be hard either. Just to show something different we’re also going to implement the C major scale. It contains the frequencies:

0 523.3
1 587.3
2 659.3
3 698.5
4 783.9
5 880
6 987.8

The C major scale

It’s slightly harder to implement than the chromatic scale since the distances between adjacent notes are not constant. The distance between any two adjacent notes is either a half step (the distance between two adjacent notes in the chromatic scale) or two half steps. If we then represent each note with its distance from the first note we get:

0 0
1 2
2 4
3 5
4 7
5 9
6 11

Don’t worry if these specific distances don’t make any sense to you. They are not completely arbitrary, though: each note in C major corresponds to a white key on the piano, and C major is in fact the only major scale that uses only the white keys. Since we are now counting half steps we can more or less use the same formula as in the chromatic scale for calculating frequencies.

:- object(c_major,
        implements(scalep)).

    nth(0, 0).
    nth(1, 2).
    nth(2, 4).
    nth(3, 5).
    nth(4, 7).
    nth(5, 9).
    nth(6, 11).

    raise(N, N1) :-
        nth(I1, N),
        I2 is ((I1 + 1) mod 7),
        nth(I2, N1).

    lower(N, N1) :-
        nth(I1, N),
        I2 is ((I1 - 1) mod 7),
        nth(I2, N1).

    % As far as I know, this is the only way to make sense of addition
    % in C major. Simply adding the distance from the tonic doesn't work
    % since that makes it possible to get notes outside the scale.
    add(N1, N2, N3) :-
        nth(I1, N1),
        nth(I2, N2),
        I3 is ((I1 + I2) mod 7),
        nth(I3, N3).

    % C, D, E, F, G, A, B.
    length(7).

    %C5 to B5.
    frequency(N, F) :-
        F is 440 * 2 ** ((N + 3)/12).

:- end_object.

The synthesizer

Whenever we’re going to generate music we’re going to use a specific scale in order to get a linear sequence of notes (since we don’t use chords). From the notes we get a series of frequencies. But to actually produce something that is nice to listen to we need something more. To play e.g. the standard concert pitch at 440 Hz we’re going to generate a wave with 440 oscillations per second. How we generate this wave determines how the note will be played. A sine wave will give a smooth sound while a sawtooth wave will give something reminiscent of a mechanical dentist drill. To create more complex sounds a technique known as additive synthesis can be used. We shall however not pursue this option at the moment.

Our synthesizer will take 3 input arguments: the frequency, the duration and the filter that shall be applied, and returns a list of samples in its single output argument. From the duration and the sample rate it’s possible to calculate how many samples we’ll need to generate. For example, if the duration is 0.5 seconds and the sample rate is 22050, the number of samples is 0.5 \times 22050 = 11025. The wave will be generated with a loop from 0 to Number_Of_Samples where the following operations are performed on each sample:

  • Divide the sample by the sample rate, so that we get the correct resolution. A high sample rate means that we’ll generate more points on the wave.
  • Calculate the angular frequency of the sample, i.e. \omega = 2\pi F, where F is the frequency.
  • Apply the filter. The filter should return a floating point number in [-1, 1].
  • Scale the sample in [-1, 1] with a volume factor so that we get samples in the full sample space.

This can actually be done in rather few lines of code. Without further ado I present to you:

:- object(synthesizer).

    :- public(samples/4).
    :- public(sample_rate/1).
    :- public(bits_per_sample/1).

    :- private(filter/3).
    :- private(volume/3).
    :- private(wave//3).

    bits_per_sample(16).
    sample_rate(22050).

    samples(Frequency, Duration, Filter, Samples) :-
        sample_rate(SR),
        N is floor(SR * Duration),
        phrase(wave(N, Frequency, Filter), Samples).

    %% We could have implemented this as higher order predicates
    %% instead, but the performance loss would not have been worth it
    %% since the filter might be applied to millions of samples.
    filter(sine, Sample0, Sample) :-
        Sample is sin(Sample0).
    filter(sawtooth, Sample0, Sample) :-
        Sample is Sample0 - floor(Sample0).
    filter(triangle, Sample0, Sample) :-
        Sample is -((acos(sin(Sample0)) / pi - 0.5)*2).

    volume(M, N, V) :-
        bits_per_sample(BPS),
        V0 is (2**BPS)/2 - 1,
        %% Decrease the volume over time.
        Percent is (M/N)/2,
        V is V0*(1 - Percent).

    wave(N, Freq, F) --> wave(0, N, Freq, F).
    wave(M, N, _, _) --> {M > N}, [].
    wave(M, N, Freq, F) -->
        {M =< N,
        sample_rate(SR),
        M1 is M + 1,
        volume(M, N, V),
        X is (2*pi*Freq)*M/SR,
        filter(F, X, Sample0),
        Sample is floor(Sample0*V)},
        [word(2, little, Sample)],
        wave(M1, N, Freq, F).

:- end_object.
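
A single note is then just one goal away; the result is a list of word(2, little, _) terms ready to be written with the WAV machinery from part 1:

?- synthesizer::samples(440, 0.1, sine, Samples).

The very first sample is word(2, little, 0), since sin(0) is 0.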

Putting everything together

Somehow we’ve come this far without a suitable name for the project. I’ll name it Xenakis in honor of the Greek-French music theorist, composer and architect Iannis Xenakis. You can listen to one of his most famous pieces here (warning: it’s rather frightening).

Using the components just described is not hard. First we generate a list of frequencies in a scale, which is then used as input to the synthesizer; the synthesizer gives back a list of samples, which are finally written to a WAV file.

:- object(xenakis).

   :- public(init/0).

   init :-
       %% N is the number of samples.
       generate_notes(Ts, N),
       wav::prepare(output, N),
       write_samples(Ts).

   %% Generate the frequencies in the C major scale. Each note has a
   %% duration of 0.5 seconds.
   generate_notes(Ts, N) :-
       Scale = c_major,
       findall(F-0.5,
               (Scale::nth(_, Note),
                Scale::frequency(Note, F)),
               Ts),
       Scale::length(L),
       synthesizer::sample_rate(SR),
       N is L*SR/2.

   %% Write the notes to 'output'.
   write_samples([]).
   write_samples([F-D|Fs]) :-
        synthesizer::samples(F, D, sine, Samples),
        wav::write_audio(output, Samples),
        write_samples(Fs).

:- end_object.
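
Generating the scale is then a single goal (the output file name, output, is hard-wired in init/0 above):

?- xenakis::init.

This writes the seven notes of the C major scale, half a second each, to the file.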

All the scales that are available on Soundcloud were of course generated using this method. We now have a good foundation for the next installment where we at last will look at methods for automatically generating note sequences.

Source code

The source code is available at https://gist.github.com/1007820.

Prolog’s Makin’ Music – Part 1

Interlude

Gather around everyone, and I’ll tell the story of how I sold my soul to the binary devil.

It all began a dark and gloomy night. I’ve had one too many to drink – coffee, that is – and found it hard to concentrate on anything else than the splashing rain. The murky light played tricks on my eyes, or so I thought. Dangling contours everywhere. The buzzing monitor didn’t help either. I stretched out my back with a loud, cracking sound and tried to suppress a yawn.

“Do you want the power to create music from thin air?”

A voice from nowhere. Surely I hadn’t had that much to drink. I held up my keyboard like a club, cursing myself for getting rid of the IBM model M keyboard in favor of an ergonomic one, and slowly turned my head in the direction of the voice. If there was an intruder, I wouldn’t go down without a fight.

“Who’s there?”, I cried.

After a long silence the voice finally answered:

“Do you want to make a deal?”

“A deal?!” I blurted out, getting rather annoyed by his impudence.

“I shall grant your computer the gift of making music. All I ask in return is that your next blog entry contains some steamy, bit-on-bit action that somehow involves the WAV format. Also, I shall need your soul for all eternity.”

Having run out of ideas, I had no choice but to accept his offer.

“Sure! Wait, no!… Who are you?”

A manic laughter followed. He vanished in a hazy puff of smoke and left. All that remained was a chilly wind and a feeling that I had somehow been cheated.

Computer generated music

Now to the point: the goal of this and the following entries will be to create computer generated music in Prolog/Logtalk. That might sound (pun not intended – I can’t help it) like a tall order, but hopefully everything will become clearer once we’ve explicated some of the concepts in music theory. The outline is as follows:

  • Step 1 – Generate audio.
  • Step 2 – Generate tones from audio.
  • Step 3 – Generate melodies from tones, with a suitable formalism such as a cellular automata or an L-system.

Sound as oscillations

In order to generate music we first need to understand what sound is. Wikipedia says:

Sound is a mechanical wave that is an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be heard, or the sensation stimulated in organs of hearing by such vibrations.

Or to put it a bit more pragmatically: a sound is a series of frequencies. Of course, this is a bit too simplistic to be useful in practice. Among other things, we need to decide whether we’re interested in mono or stereo sound, how fine-grained each frequency should be and how fast they should be played.

So we have an idea of how sound should be represented. First we decide how it should be interpreted by the listener, and then we give out the actual frequencies. As one might suspect there exists a myriad of different formats for this purpose. One of the simplest is the WAV format, which we shall use in this project.

Writing to binary files

WAV is a binary format, and thus consists of a sequence of integers of varying sizes. Hence the first step is to learn how one writes to binary files in Prolog. The bad news is that there only exists one ISO primitive for this purpose: put_byte/2, which is not sufficient on its own since it only handles single bytes. The good news is that we can get it to do what we want with some low-level bit-fiddling. Here are the operations that we’ll need in order to produce a fully functional WAV file:

  • Write 4 byte, unsigned integers in big endian format.
  • Write 4 byte, unsigned integers in little endian format.
  • Write 2 byte, unsigned integers in little endian format.
  • Write 2 byte, signed integers in little endian format.

It would be nice if we could handle this in a uniform way, so that the underlying details of how one should use put_byte/2 can be postponed as far as possible. For this purpose we’ll introduce a data structure, word, that has the format:

word(Byte_Count, Endian, Integer)

where Byte_Count is either 2 or 4, Endian is either big or little, and Integer is a positive or negative integer. So to represent the number 135 in the little endian format we would use:

word(2, little, 135)

while the number 1350 in big endian format would be represented as:

word(4, big, 1350)

Simple, but it might feel kind of weird to represent such a low-level concept in this way. In most imperative languages we would of course explicitly declare the data as either char, short, int and so on, but this is the best we can do in Prolog (unless we create the necessary bindings for the host language and borrow some datatypes). Next, we’re going to define write_word/2 that writes a word to a stream. Let’s focus on 2 byte integers for the moment. A first attempt might look like:

write_word(word(2, Endian, I), Stream) :-
    put_byte(Stream, I).

Alas, this only works for single byte integers. If we want to write 2 bytes, we need to extract the individual bytes from the integer and call put_byte/2 twice. This can be done with shifting and the bitwise and-operation.

write_word(word(2, Endian, Bs), S) :-
    X1 is Bs >> 8,
    X2 is Bs /\ 0x00ff,
    (  Endian = big ->
       put_byte(S, X1),
       put_byte(S, X2)
    ;  put_byte(S, X2),
       put_byte(S, X1)
    ).

Note that we also check whether Endian is big, and if so output the most significant byte first. This works fine for positive numbers, but what about negative numbers? Since put_byte/2 only works with non-negative byte values, we need to map a negative number onto the bit pattern it would have as a 16-bit two’s complement integer. This is easy: if the number is negative, add 2^16 (0x10000), so that e.g. -2 becomes 0xfffe. The code will make this easier to understand:

    write_word(word(2, Endian, Bs), S) :-
        Bs >= 0,
        X1 is Bs >> 8,
        X2 is Bs /\ 0x00ff,
        (  Endian = big ->
           put_byte(S, X1),
           put_byte(S, X2)
        ;  put_byte(S, X2),
           put_byte(S, X1)
        ).
    write_word(word(2, Endian, Bs), S) :-
        Bs < 0,
        %% Map onto the 16-bit two's complement bit pattern.
        Bs1 is Bs + 0x10000,
        write_word(word(2, Endian, Bs1), S).
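
A small test of both cases, on a binary stream opened on an arbitrary file (the hexadecimal values in the comments are the bytes that put_byte/2 ends up emitting):

?- open('test.bin', write, S, [type(binary)]),
   write_word(word(2, little, 135), S),  % 0x87, 0x00
   write_word(word(2, little, -2), S),   % 0xfe, 0xff
   close(S).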

(Thanks to Pierpaolo Bernardi who showed me this trick on the SWI-Prolog mailing list!)
Update: Richard O’Keefe also showed a simpler solution that doesn’t need the explicit positive/negative test. It’s left as an exercise to the reader!

The code for 4 byte integers is rather similar and hence omitted.

The WAV format

Now let’s focus on WAV. All my knowledge of the format stems from a single source (click for a useful, visual diagram). A WAV file consists of:

  • A header containing the string “RIFF”, the remaining chunk size and the string “WAVE”.
  • A format subchunk containing the string “fmt” (format), the remaining chunk size, the audio format, the number of channels, the sample rate, the byte rate, the block align and the number of bits that are used for each sample.
  • A data subchunk that contains the string “data”, the remaining size of the subchunk and finally the actual data (the samples).

Don’t worry if some of these terms are unfamiliar or confusing. It’s not necessary to understand all the details. We begin by defining the number of samples, the number of channels, the bits per sample and the sample rate as facts:

    num_samples(100000). %This value will of course differ depending on the audio data.
    num_channels(1). %Mono.
    bits_per_sample(16). %Implies that each sample is a 16 bit, signed integer.
    sample_rate(22050).

All the other values can be derived from these parameters. For simplicity we’re going to produce a list of words that are later written with the help of write_word/2. This can be done in any number of ways, but DCG’s are fairly straightforward in this case. The RIFF chunk is first. It takes the size of the data chunk as argument since it is needed in order to produce the size of the remaining chunk.

    riff_chunk(Data_Chunk_Size) -->
        riff_string,
        chunk_size(Data_Chunk_Size),
        wave_string.

    riff_string --> [word(4, big, 0x52494646)].
    wave_string --> [word(4, big, 0x57415645)].

    chunk_size(Data_Chunk_Size) -->
        {Size is Data_Chunk_Size + 36}, % 36 = "WAVE" (4) + fmt subchunk (8 + 16) + data subchunk header (8).
        [word(4, little, Size)].

The end result will be a list of the form [word(4, big, 0x52494646), ...]. The format chunk follows the same basic structure:

fmt_chunk -->
    fmt_string,
    sub_chunk1_size,
    audio_format,
    number_of_channels,
    sample_rate,
    byte_rate,
    block_align,
    bits_per_sample.

fmt_string --> [word(4, big, 0x666d7420)]. %"fmt " (note the trailing space).

sub_chunk1_size --> [word(4, little, 16)]. %16, for PCM.

audio_format --> [word(2, little, 1)]. %PCM.

number_of_channels -->
    [word(2, little, N)],
    {num_channels(N)}.

.
.
. % And so on for all the remaining stuff.

The remaining data chunk is even simpler:

data_chunk(Data_Chunk_Size) -->
    data_string,
    [word(4, little, Data_Chunk_Size)],
    test_data.

test_data --> ... %This should generate a list of samples.

And finally, we say that a WAV file consists of a riff chunk, an fmt chunk and a data chunk:

    wav_file -->
        {num_samples(N),
         bits_per_sample(BPS),
         num_channels(Cs),
         Data_Chunk_Size is N*BPS*Cs/8},
        riff_chunk(Data_Chunk_Size),
        fmt_chunk,
        data_chunk(Data_Chunk_Size).

It is used in the following way:

    output(File) :-
        open(File, write, S, [type(binary)]),
        %Call the DCG, get a list of words as result.
        phrase(wav_file, Data),
        %Write the list of words.
        write_data(Data, S),
        close(S).

    write_data([], _).
    write_data([B|Bs], S) :-
        write_word(B, S),
        write_data(Bs, S).

As test data, we’re going to generate a 440 Hz sine wave.

    sine_wave(0) --> [].
    sine_wave(N) -->
        {N > 0,
        sample_rate(SR),
        N1 is N - 1,
        %% Standard concert pitch, 440 Hz.
        Freq is 440,
        ScaleFactor is 2*pi*Freq/SR,
        %% Needed since sin(X) returns a float in [-1, 1], which
        %% is barely (if at all) perceivable by the human ear. The
        %% constant 32767 is used since we're dealing with 16 bit,
        %% signed integers, i.e. the range of the samples is [-32768,
        %% 32767].
        VolumeFactor is 32767,
        X is ScaleFactor*N,
        Sample0 is sin(X),
        %% Floor the sample. Otherwise we would end up with a floating
        %% point number, which is not allowed.
        Sample is floor(Sample0*VolumeFactor)},
        [word(2, little, Sample)],
        sine_wave(N1).

It’s not necessary to understand all the details, but the end result is a list of 2 byte words that represent a 440 Hz sine wave. You can listen to it here.
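
Assuming test_data//0 is wired up to call sine_wave//1 with the value of num_samples/1, the whole file is then produced with a goal like this (the file name is arbitrary):

?- output('sine.wav').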

Summary

We’re now able to write samples to WAV files. These samples can represent any tone or sound, so in theory we already have everything that’s needed to generate music. But representing a tune as millions and millions of samples is not very user-friendly and would make it more or less impossible to automatically generate anything interesting. For that we’re going to need further abstractions, and among other things define a sound bank that contains some common tones.

Source code

The source code is available at https://gist.github.com/955626.

Prologomenon Goes Viral

Apparently, social media like Twitter are all the rage these days. They’ve been used to convey critical information in emergency situations, helped fuel revolutions against oppressive governments, and now it’s finally time to overthrow the corrupt imperative languages that have grown fat on the labors of the working class…. Kidding – some of my best friends are imperative languages! I have however started a Twitter account in relation to the blog with the intent to a) troll imperative languages, b) post code snippets within the character limit that accomplish something cool. I’ve currently implemented:

All just for fun, of course. Don’t try this at home unless you’re sure you’re able to use cryptic variable/predicate names and abuse all the language features that normally allow one to write readable, concise source code.

An Adventure Game – Part 5

We’re rapidly approaching the end. This time we’ll implement a meta-language that makes it easier to create new games with the existing engine. Conceptually speaking we’re not adding anything new to the table, but the example game from the previous post was created in an ad-hoc manner that demanded knowledge of both Prolog/Logtalk and the engine in question. What we want is a declarative language in which it’s possible to define rooms, entities and how they should be connected. Exactly how this language is interpreted should not be the game designer’s concern. This is a scenario in which it’s best to start with the language itself, since it’s pretty much impossible to write an interpreter otherwise.

The language

We want the language to be capable of:

  • Creating entities.
  • Adding properties to existing entities.
  • Setting new values for entities. For instance, if we first create a door with the default lock we’ll probably want to change this later on.
  • Adding entities to rooms.
  • Connecting two rooms with an entrance.

All these steps should be expressible within a single file. The first part might look something like (as always, I’m just making stuff up on the fly, it’s quite possible that there are simpler/better ways to accomplish this!):

begin entities.

build(room, room1, "A rather unremarkable room.\n").
build(room, room2, "...").

build(door, door, "...").

build(lock, lock).

build(key, key, "A slightly bent key.\n").

build(player, player).

end entities.

Where build(Type, Id) should be read as: build an entity of type Type and name it Id. build/3 allows us to create printable entities, where the string is the description. This suggests that we’ll need a builder object that’s capable of constructing some default entities. Since entities consist of properties, it would be possible to only build generic entities and manually add all the interesting properties in question, but we make the job easier for the game designer if he/she can assume that some primitive objects are already available. Of course, it wouldn’t be practical to demand that all entities are created in this manner. If two entities are identical except for a few differing properties then it would be simpler to create a generic entity and add the properties manually instead of defining a new build-predicate. For example, say that our game will consist of two different kinds of fruits: magical and non-magical. If we eat the former we finish the game; if we eat the latter we increase the health of the player. This is naturally implemented by creating two different properties: fruit_property and magic_fruit_property. Hence, to create two fruits – one magical and one non-magical – we first create two generic instances and then add the defining properties.

begin entities.

build(generic_entity, apple, "An apple! \n").
build(generic_entity, banana, "A yellow banana. Hot diggity dog!\n").

end entities.

begin properties.

add_property(fruit_property, apple).
add_property(carriable_property, apple).

add_property(magic_fruit_property, banana).
add_property(carriable_property, banana).

end properties.

The Argus-eyed reader will probably realize that it would be even better to factor out the common property (carriable_property) of the two fruits into a prototypical base object, and then clone this object and add the unique properties (magic_fruit_property and fruit_property).

Since the entities that have been created thus far only have the default values we now turn to the problem of sending messages, so that we’re able to change these at will. Say that we want to tell the lock that it should use the key that we just created, and the door that it should use the lock. A first attempt might look like this:

begin relations.

action(set_key, lock, key).
action(set_lock, door, lock).

end relations.

All identifiers here refer to the entities that have already been created. This won’t work, however, due to a subtle semantic difference between how locks and doors work. A door has a lock; it consists of a lock. Therefore it’s correct to send the lock entity as an argument to set\_lock. A lock, on the other hand, doesn’t consist of a key. It only needs to know which key will unlock it, hence it’s not correct to send the whole key entity as argument. In this case we only need one part of the key entity: its identity. To be able to differentiate between these cases we’ll introduce the notation that a term preceded by a dollar sign ($) will be sent as-is, instead of sending the entity corresponding to the identity. The previous attempt should hence be rewritten as:

begin relations.

action(set_key, lock, $ key).
action(set_lock, door, lock).

end relations.

The file will be interpreted from top to bottom, so if we added a property in the preceding block we’re able to change it here. The next step is to connect rooms. Strictly speaking this relation is not necessary since we’re already able to send messages to entities, but including it as a primitive in the language will make it much easier to use. The syntax is:

begin relations.
.
.
.
connect(room1, room2, door).

end relations.

The full example game in all its glory would be written as:

begin entities.

build(room, room1, "A rather unremarkable room.\n").
build(room, room2, "A room almost identical to the previous one. What on earth is going on!?\n").
build(door, door, "A wooden door with a small and rusty lock.\n").
build(lock, lock).
build(key, key, "A slightly bent key.\n").
build(generic_entity, apple, "An apple! \n").
build(generic_entity, banana, "A yellow banana. Hot diggity dog!\n").
build(player, player).

end entities.

begin properties.

add_property(fruit_property, apple).
add_property(carriable_property, apple).

add_property(magic_fruit_property, banana).
add_property(carriable_property, banana).

end properties.

begin relations.

action(set_key, lock, $ key).
action(set_lock, door, lock).

action(set_state, door, $ closed).

action(add_item, room1, apple).
action(add_item, room1, key).

action(add_item, room2, banana).

action(set_location, player, $ room1).

connect(room1, room2, door).

end relations.

A substantial improvement in readability compared to the previous efforts!

Parsing

Another boring, dry entry on parsing? Fear not, because I have a trick up my sleeve: there was a reason why the syntax of the meta-language was the spitting image of Prolog’s all along! One way to interpret the file is to say that begin/end and the dollar sign are all operators with a single argument. Then the file is nothing but a collection of facts that can be accessed as normal and we won’t have to worry about parsing at all. A slightly more involved but more general approach is to use what in Prolog nomenclature is known as term expansion. This is usually the preferred approach to handle embedded languages and is somewhat similar to macros in Lisp. The basic idea is simple: instead of taking a term at face-value we expand it according to a user-defined rule. What’s the point? Basically we don’t have to type as much. For example, let’s say that we have a database consisting of rule/3 facts, where the first argument is an atom, the second a list and the third an integer denoting the length of the list.


rule(a, [], 0).

rule(b, [a], 1).

Furthermore assume that we don’t want to calculate the length of the second argument at run-time. There’s nothing inherently wrong with this approach, but manually entering the length of the list is a drag and quite error-prone. It would be better if we could simply write:


rule(a, []).

rule(b, [a]).

And tell the Prolog system that these facts should be construed as rule/3 facts with an additional third argument which contains the length of the list. Fortunately this can easily be realized with the built-in predicate term\_expansion/2. The first argument of term\_expansion is the term that shall be expanded. The second argument is the expanded term. A suitable definition for the previous example is:

term_expansion((rule(Head, Body)), rule(Head, Body, Length)) :-
    length(Body, Length).

Great success! We don’t have to concern ourselves with how or when this predicate is called, just that it will eventually be called when the file is loaded. Like all powerful language constructs, it’s easy to abuse the term expansion mechanism and create unreadable code. We could for instance expand A :- B to B :- A if we were in a facetious mood (don’t do it, OK?). Fortunately the Logtalk support is a bit more sane than in most Prolog systems. Instead of defining term\_expansion/2 rules in the same file as the rules that shall be expanded, they’re encapsulated in a special expansion object. This object is later used as a hook in logtalk\_load/2 to make sure that the effect is localized to a given file. In summary, there are two steps involved:

  • Define the term\_expansion/2 rules in an expansion object (which must implement the expanding protocol).
  • Load the file (the script file in our case) with the expansion object.

I shan’t spell out the full details of the expansion object, but what it does is remove the begin/end directives and create a set of entity/1 facts with the initial entities. Also, to make things easier in the script interpreter, it removes add\_property/2, action/3 and connect/3 and replaces them with unary facts instead.
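
Still, to make the idea concrete, here is a minimal, hypothetical sketch of what such a hook object might look like. It assumes that begin, end and $ have been declared as prefix operators (so that the script file can be read at all); the wrapper functors b/2, b/3, p/2 and c/3 are invented for illustration, while t/3 matches what interpret\_actions/3 expects below.

:- object(script_expansion,
    implements(expanding)).

    %% Remove the begin/end markers entirely.
    term_expansion(begin(_), []).
    term_expansion(end(_), []).

    %% Keep the initial entities as entity/1 facts.
    term_expansion(build(Type, Id), entity(b(Type, Id))).
    term_expansion(build(Type, Id, Description), entity(b(Type, Id, Description))).

    %% Wrap the remaining directives in unary facts.
    term_expansion(add_property(P, Id), add_property(p(P, Id))).
    term_expansion(action(M, Id, Arg), action(t(M, Id, Arg))).
    term_expansion(connect(R1, R2, Door), connect(c(R1, R2, Door))).

:- end_object.

The script file is then loaded with this object as hook, e.g. logtalk_load(game_script, [hook(script_expansion)]).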

The builder

The builder is in charge of building game entities. As mentioned earlier it’s not strictly needed since it’s always possible to add properties manually, but it does simplify things. Here’s how a map and a room could be created:

    build(world, Id, Rooms, Player, World) :-
        Ps = [map_property-State],
        build(final_state, F),
        map_property::new([[F|Rooms], Player], State),
        entity::new(Ps, Id, World).

    build(room, Id, Description, Room) :-
        Ps = [container_property - State1,
              printable_property - State2],
        container_property::new([], State1),
        printable_property::new([Description], State2),
        entity::new(Ps, Id, Room).
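
The clauses for the remaining entity types follow the same pattern. As a hypothetical example, a generic entity might only carry a description (the exact property set is my guess):

    build(generic_entity, Id, Description, Entity) :-
        Ps = [printable_property - State],
        printable_property::new([Description], State),
        entity::new(Ps, Id, Entity).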

Interpreting

We have a description of the game and want to transform it into an entity that has the map\_property. Strictly speaking this is not an interpreter, but rather a compiler from the script language to the entity language. The most important predicate is interpret\_game/2, which takes an expanded script file as argument and produces a game world. It works by extracting the entity directives, the property directives and the relations directives, and then separately interpreting each one of these. Finally it extracts the rooms and the player and asks the builder to build a game world.

   interpret_game(DB, World) :-
       findall(E, DB::entity(E), Es0),
       findall(P, DB::add_property(P), Ps),
       findall(A, DB::action(A), As),
       findall(C, DB::connect(C), Cs),
       interpret_properties(Ps, Es0, Es1),
       interpret_actions(As, Es1, Es2),
       interpret_connectors(Cs, Es2, Es),
       get_rooms(Es, Rooms),
       get_player(Es, Player),
       builder::build(world, world1, Rooms, Player, World).

The interpret\_x predicates are all rather similar. They iterate through the list of commands and change the set of entities accordingly. For brevity, let’s concentrate on interpret\_actions/3.

    interpret_actions([], Es, Es).
    interpret_actions([t(M, Id1, Id2)|As], Es0, Es) :-
        select_entity(Id1, Es0, E0, Es1),
        lookup_argument(Id2, Es0, Arg),
        entity::action(M, [Arg], E0, E),
        interpret_actions(As, [E|Es1], Es).

    lookup_argument(Id, Es, Arg) :-
        (   Id = $(Symbol) ->
            Arg = Symbol
        ;   lookup_entity(Id, Es, Arg)
        ).

The body of interpret\_actions/3 should be read as: execute action M on the entity corresponding to Id1 with the argument Id2 (remember that arguments preceded by a dollar sign are left unchanged). Since we might need to update an entity several times, it’s re-added to the list of entities whenever it’s updated.
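
The helpers select\_entity/4 and lookup\_entity/3 aren’t shown here, but since every entity carries an identity\_property (as in the previous part) they could be sketched as:

    %% Hypothetical helpers: find an entity by its identity, optionally
    %% removing it from the list.
    select_entity(Id, Es0, E, Es) :-
        list::select(E, Es0, Es),
        entity::get_property(E, identity_property-Id).

    lookup_entity(Id, Es, E) :-
        list::member(E, Es),
        entity::get_property(E, identity_property-Id).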

Putting everything together

We need to make a small change to init in the game object: instead of building a world manually, it now takes a script file as argument (becoming init/1) and asks the interpreter to interpret (compile!) it.

    init(Game) :-
         write('Welcome to Bacchus-Bosch!'), nl,
         current_input(S),
         script_interpreter::interpret_game(Game, World),
         repl(S, [], World).

That’s pretty much it – the end of Bacchus-Bosch. As I suspected when I started the project the final game isn’t very fun. Wait, scratch that, it might be the worst game I’ve ever played, and that includes the infamous yellow car game. But it does have a magic banana; that ought to count for something. In any case it wouldn’t be hard to create more engrossing games since all the building blocks are in place. It should also be noted that the engine is hardly limited to text-based or turn-based games. In a real-time game we could for instance run the update method a fixed number of times per second instead of waiting for player input. We could also add e.g. role playing elements by defining new properties.

I hope that this pentalogy has been at least somewhat comprehensible and coherent. What the future holds for the blog I dare not promise. Hopefully we’ll someday see the return of the magic banana!

Source code

The source code is available at https://gist.github.com/924052.

An Adventure Game – Part 4

And yet again I display exceptional prowess in the (un)holy art of procrastination. This time I blame lack of coffee, Sweden’s gloomy weather and the Illuminati. This post will concern two of the three remaining obstacles. First we’re going to add enough properties so that we can create a map consisting of a few rooms together with some entities. Second, we’re going to translate the parsed commands from the user into a suitable set of action commands and execute them with respect to the game world and its entities. As always it’s best to start with a use-case scenario and extrapolate some requirements.

> look

You see a small, rusty key and a door.

> pick up the key and open the door.

> go through the door.

This room is almost identical to the previous one. How eerie. You see a banana and a door.

> eat the door.

no.

> eat the banana

no.

> take the banana and eat it.

Congratulations! You solved the mystery of the missing banana!

Let’s don our declarative thinking cap. This particular game consists of two rooms and a player. The player starts in the first room, picks up the key, unlocks the door and enters the second room where he/she in a moment of absolute intellectual clarity manages to deduce that the only way to beat the game is to eat the banana. Just like in real life. We can summarize the requirements as:

  • The player must be able to move from one location to another.
  • The player must be able to reference entities by their names.
  • The door must be linked to the second room.
  • The key must be able to lock/unlock the door.
  • The banana must have the property that the game stops when the player eats it.

Fortunately we can already handle some of these. The most fundamental requirement is that of referencing entities. Since each entity is just a list of properties we’re currently unable to distinguish them in any sensible way. The most obvious solution is to add another property, identity\_property, which describes the property of having an identity. For example, the door in the scenario would be represented by the list:

[openable\_property-State_1, printable\_property-State_2, ..., identity\_property-State_n]

For simplicity the state of identity\_property is just going to be an atom, e.g. door. In a real implementation it would of course be preferable to use a more complex data structure so that entities can be referenced by attributes as well, e.g. “wooden door”, but the basic idea is the same. Of course, just because this is a nice, declarative solution doesn’t mean that it’s good. To quote Richard O’Keefe from The Craft of Prolog:

The nasty thing about declarative programming is that some clear specifications make incredibly bad programs.

It’s not too hard to see that storing the identity of an entity as a property is hopelessly inefficient. To find an entity in a container we have to iterate through the whole container and check whether or not every element satisfies the identity property in question. Ouch. It would be a much better idea to demand that all entities have an identity and then store the container as a tree where nodes are entities ordered by their identities. Of course, that doesn’t mean I’m going to use this solution just because it’s better! Sometimes it’s enough to be aware that something is a potential bottleneck and fix it if/when it becomes a real problem. Or buy a faster computer.
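
The property object itself barely needs a body, since everything interesting is inherited from property. A minimal, hypothetical sketch (it’s never spelled out in the post):

:- object(identity_property,
    extends(property)).

    %% The state is simply an atom naming the entity, e.g. door.
    %% update/2 is inherited as a no-op and there are no actions.

:- end_object.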

Next up is the problem of the game world. We shall introduce a new property by the name map\_property and stipulate that it consists of a list of rooms and a player. Why not add the player as an item in the current room instead? Just for simplicity; it’s slightly easier to move the player from room to room if we don’t have to explicitly remove him/her from the first room and add him/her to the new one. Since we have removed the player from the rooms we’re going to need another property, that of having a position/being movable, so that it’s always possible to find the current room.

:- object(movable_property,
    extends(property)).

    new(identity_property-_).

    action(move, [New], Owner, _, Owner, New).

    action(get_location, [Room], Owner, Room, Owner, Room).
:- end_object.

The state of movable\_property is simply an identity property. When someone issues the move command the current location is changed. Exactly how this works should become clearer later on. For now, let’s concentrate on implementing map\_property. Its state will be a tuple of a list of rooms and the player, and it’ll have commands to add/remove rooms, get the current room, update the rooms and so on.

:- object(map_property,
    extends(property)).

    new([]-[]).

    update(E0, E) :-
        entity::update_property(map_property, E0, Rooms0-P0, Rooms-P, E),
        entity::update(P0, P),
        update_rooms(Rooms0, Rooms).

    update_rooms([], []).
    update_rooms([R0|R0s], [R|Rs]) :-
        entity::update(R0, R),
        update_rooms(R0s, Rs).

    action(add_rooms, [Rooms1], Owner, Rooms2-P, Owner, Rooms-P) :-
        ...

    action(get_room, [Property, R], Owner, Rooms-P, Owner, Rooms-P) :-
        ...

    action(select_room, [Property, R], Owner, Rooms-P, Owner, Rooms1-P) :-
        ...

    action(print, [], Owner, Rooms-P, Owner, Rooms-P) :-
        action(current_room, [Room], Owner, Rooms-P, _, _),
        entity::action(print, [], Room, _).

    action(current_room, [Current], Owner, Rooms-P, Owner, Rooms-P) :-
        entity::action(get_location, [Id], P, _),
        list::member(Current, Rooms),
        entity::get_property(Current, Id).

    action(get_player, [P], Owner, Rooms-P, Owner, Rooms-P).

:- end_object.

It’s not necessary to study the details of this particular implementation, but some of the predicates demand an explanation. update/2 uses update\_property/5 in entity to update map\_property with the state obtained by updating the list of rooms and the player. To put it more simply: it just calls update/2 on the rooms and the player. The print command extracts the current room and prints it (it would not be very interesting to print anything else). The current\_room command gets the current location of the player and then uses member/2 to find the room with that particular identity.
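
update\_property/5 itself isn’t shown, but given the list-of-pairs representation of entities it could be as simple as this hypothetical clause:

    %% Swap the old state of the named property for a new one.
    update_property(Name, E0, State0, State, E) :-
        list::select(Name-State0, E0, E1),
        E = [Name-State|E1].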

Then there are doors. To have a text-game without lots of doors would simply be madness. Given the representation of the game world, each door must contain the identity of the area which it leads to. It just stores the identity, not the area itself (since all areas are stored in the map\_property). We shall store this identity in entrance\_property:

:- object(entrance_property,
    extends(property)).

    new(identity_property-_).

    action(get_location, [Location], Owner, Location, Owner, Location).
:- end_object.

So when we create a door we plug in an entrance\_property with a suitable identity of a room.
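
In code, the door of the test world might be built roughly like this (a hypothetical sketch; cf. build\_test\_door/1 from the previous part, now with identity and entrance properties added):

    build_test_door(Door) :-
        Door = [openable_property-State1,
                printable_property-State2,
                identity_property-door,
                entrance_property-(identity_property-room2)],
        openable_property::new(State1),
        door_description(State2).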

Translating commands to entity actions

Before we begin the translation process we’re going to define some useful action commands in container\_property and map\_property. The most frequently used action will be that of extracting an entity according to some property (e.g. the identity), performing some state-changing operation and then adding it anew. This procedure is necessary since we don’t have explicit state: whenever we extract an entity we get a copy of it, and changes to this copy won’t affect the original. For this purpose we’re going to define two additional action commands in container\_property and map\_property that take three arguments:

  • P – the property of the entity that shall be updated.
  • Old – will be unified with the old entity.
  • New – the new entity.

:- object(container_property,
    extends(property)).

     .
     .
     . % As before.

    action(update_item, [P, Old, New], Owner, Items, Owner, [New|Items1]) :-
        list::select(Old, Items, Items1),
        entity::get_property(Old, P).

    .
    .
    . % As before.
:- end_object.

The neat thing about this definition is that we can extract an entity with Old and simply pass a variable as New, and unify this variable with the updated entity later on. The definition in map\_property is similar, but works for the player and the current room.

:- object(map_property,
    extends(property)).

    .
    .
    . % As before.

    action(update_current_room, [Current0, Current],
           Owner, Rooms-P, Owner, [Current|Rooms1]-P) :-
        entity::action(get_location, [Id], P, _),
        list::select(Current0, Rooms, Rooms1),
        entity::get_property(Current0, Id).

    action(update_player, [P0, P], Owner, Rooms-P0, Owner, Rooms-P).

    .
    .
    . % As before.

:- end_object.

Now we can finally begin with the translation process. Since we know that the input from the user will be a list of commands (remember that conjunctions are allowed) we will execute them one by one and thread the state of the game world.

eval_commands([], World, World).
eval_commands([C|Cs], World0, World) :-
    write('The command is: '), write(C), nl,
    eval_command(C, World0, World1),
    eval_commands(Cs, World1, World).

Each eval\_command/3 rule changes the state of World0 to World1 according to the command in question. The simplest one is the look-command with no arguments, that just prints the current room:

    eval_command(look-[], World, World) :-
        entity::action(current_room, [Room], World, _),
        write('You see: '), nl,
        entity::action(print_children, [], Room, _).

It asks the game world for its current room and then issues the print\_children command. The take-command is slightly more convoluted. It takes an identity as argument, tries to find the entity in question and asks the player to pick it up.

eval_command(take-[Id], World0, World) :-
    entity::action(update_current_room, [R0, R], World0, World1),
    entity::action(update_player, [P0, P], World1, World),
    entity::action(select_item, [identity_property-Id, Item], R0, R),
    entity::action(add_item, [Item], P0, P).

The move-command is perhaps the most complex of the bunch, but follows the same basic structure:

    eval_command(move-[Id], World0, World) :-
        entity::action(current_room, [Room], World0, _),
        entity::action(update_player, [P0, P], World0, World),
        entity::action(get_item, [identity_property-Id, Entrance],
                       Room, _),
        entity::action(open, [], Entrance, _),
        entity::action(get_location, [Location], Entrance, _),
        entity::action(move, [Location], P0, P).

It tries to find and open the entrance in the current room, asks it where it leads and finally asks the player to move to that location. The lock/unlock and open/close commands are implemented in the same way. One problem remains though: it’s possible to take the key, unlock the door, open it and go through it, but there’s no way to actually finish the game. Just like everything else this functionality can be implemented in a number of ways. It might be tempting to somehow augment the top-loop and in every iteration check whether or not the final state has been reached, but this is needlessly complicated. Instead we’re going to introduce a special entity with only two properties, that of having the identity final\_state and that of being printable. It’s constructed as:

    build_win_screen(Screen) :-
        Screen = [printable_property - State, identity_property-final_state],
        State = "Congratulations! A winner is you!\n (No, you can't quit. Stop trying.)\n".

Then we need a property object that asks the player to move to the final state whenever its owner, for instance a banana, is used.

:- object(fruit_property,
    extends(property)).

    action(dissolve, [E0, E], Owner, State, Owner, State) :-
        entity::action(move, [identity_property-final_state], E0, E).

:- end_object.

The banana entity is created by combining an identity, a printable, a carriable and a fruit property:

    build_test_banana(Banana) :-
        Banana = [fruit_property - State1, printable_property-State2,
                  identity_property-banana, carriable_property-State3],
        fruit_property::new(State1),
        banana_description(State2),
        carriable_property::new(State3).

Then we of course need an eat-command, but this is straightforward to implement. So what happens when the player eats the banana is that the current location changes to final\_state. This room doesn’t have any entities and doesn’t support any operations besides being printed, which means that the player can’t return to the rest of the game world and has completed the game.
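
For completeness, here’s a hypothetical sketch of the eat-command. It follows the same pattern as the take- and move-commands, assuming that select\_item (used on rooms above) works just as well on the player’s inventory:

    eval_command(eat-[Id], World0, World) :-
        entity::action(update_player, [P0, P], World0, World),
        %% Remove the item from the player's inventory ...
        entity::action(select_item, [identity_property-Id, Item], P0, P1),
        %% ... and let the item move the player (e.g. to the final state,
        %% if the item has the fruit property).
        entity::action(dissolve, [P1, P], Item, _).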

Putting everything together

We shall use the top-loop from part 2, but with some modifications. The input will be parsed into commands that are executed with respect to the current game world. The loop then calls itself recursively with the new state and asks for new input.

    init :-
         write('Welcome to Bacchus-Bosch!'), nl,
         current_input(S),
         build_test_world(World),
         repl(S, [], World).

    repl(S, History, World0) :-
        entity::update(World0, World1),
        entity::action(print, [], World1, _),
        write('> '),
        nlp::parse_line(S, Atoms),
        write('The input is: '),
        meta::map([X] >> (write(X), write(' ')), Atoms), nl,
        nlp::tag_atoms(Atoms, AtomTags),
        write('The tagged input is: '),
        meta::map([X] >> (write(X), write(' ')), AtomTags), nl,
        (   eval(History, AtomTags, World1, World) ->
            true
        ;   write('no.'), % This is Prolog after all.
            nl,
            World = World1
        ),
        write('-------------------'), nl,
        repl(S, AtomTags, World).

    eval(History, AtomTags, World, World1) :-
        nlp::resolve_pronouns(History, AtomTags, AtomTags1),
        nlp::parse_atoms(AtomTags1, _, Commands),
        eval_commands(Commands, World, World1).

Like a professional TV-chef I prepared a small test world and got the following result (with some of the debug output omitted):

Welcome to Bacchus-Bosch!
A rather unremarkable room.
> look

You see:
A slightly bent key.
A wooden door with a small and rusty lock.
-------------------
A rather unremarkable room.
> open the door

no.
-------------------
A rather unremarkable room.
> take the key and unlock the door with it

-------------------
A rather unremarkable room.
> go through the door
-------------------
A room almost identical to the previous one. What on earth is going on!?
> look

You see:
A yellow banana. Hot diggity dog!
A wooden door with a small and rusty lock.
-------------------
A room almost identical to the previous one. What on earth is going on!?
> take the banana
-------------------
A room almost identical to the previous one. What on earth is going on!?
> eat it
-------------------
Congratulations! A winner is you!
(No, you can’t quit. Stop trying.)

The final herculean task, the creation of a script-language, is saved for the next entry!

Source code

The source code is available at https://gist.github.com/900462.

An Adventure Game – Part 3

I’ve been procrastinating too much. It’s time to roll up our sleeves and get dirty with the nitty-gritty core of Bacchus-Bosch. We’re currently able to read input from the player and parse commands. Before even considering how commands should be executed we have to take a step back and make some ontological commitments. We know that the game will consist of interaction between game objects, the entities, but haven’t specified how these should be represented or how they are created. It might be tempting to simply use a set of unit clauses and statically recite everything of interest:

entity(key).
entity(room_1).
entity(room_2).
entity(door).

contains(room_1, key).
contains(room_1, door).

connected(room_1, door, room_2).
.
.
.
%And so on for the whole level.

At first glance this might look like an acceptable solution. It wouldn’t be very hard to add e.g. properties to the entities by adding more facts, but everything breaks down the moment the player changes the state of the world by picking up the key. Then the key is no longer directly contained by the room and should instead be added to the inventory of the player. The quick and easy hack is to use assert/retract in order to add and remove facts according to the fluctuating state of the game world, but in the long run that’s a terrible idea. To be forced to use impure features for such an integral part of the game is either a sign that we’re using the wrong data structures or the wrong language. I’ll go with the former theory!

Let’s begin by considering a simplified problem where the game only consists of a player with the property of having health in the form of an integer: “hit points”. It should be possible to both increase and decrease this value according to events from other entities. What’s an entity? A common and simple solution is to create a class hierarchy with an abstract class at the top and concrete classes such as items and monsters in the lower layers. While there’s nothing inherently wrong with this I think we can do better, for two reasons:

  • The class hierarchy will be huge and hard to make sense of.
  • It’s not (at least not in most languages, Logtalk happens to be an exception) possible to create new classes during runtime, with the effect that behaviour can’t be modified or extended once the game has started.

An often cited design principle is to favor composition over inheritance. We want the entities to be entirely composed of smaller objects, of properties. The sum of all the properties constitutes the entity – nothing more, nothing less (followers of holism should stop reading here!). It should of course be possible to add, remove and update properties during runtime. It’s easiest to start with a basic definition of a property, which we’ll do with a prototypical object. A property should be able to:

  • Give the initial state of the property. For the health property, this might simply be a positive integer.
  • Update the entity to which the property belongs. If the property doesn’t directly influence the entity, it’s left unchanged.
  • Execute an action that changes the state of the property, e.g. increasing or decreasing the health.

With these considerations the object becomes:

:- object(property).
    :- info([
        version is 1.0,
        author is 'Victor Lagerkvist',
        date is 2011/03/19,
        comment is 'A property constitutes the basic behaviours of the objects in Bacchus-Bosch.']).

    :- public(update/2).
    :- mode(update(+entity, -entity), zero_or_more).
    :- info(update/2, [
        comment is 'Update the entity to which the property belong.',
        argnames is ['Entity', 'Entity1']]).

    :- public(action/6).
    :- mode(action(+atom, +list, +entity, +state, -entity, -state), zero_or_more).
    :- info(action/6, [
        comment is 'Execute the action Name with respect to Args, State and Entity, and store the resulting new state and entity in State1 and Entity1.',
        argnames is ['Name', 'Args', 'Entity', 'State', 'Entity1', 'State1']]).

    :- public(new/1).
    :- mode(new(-term), zero_or_more).
    :- info(new/1, [
        comment is 'Unify State with the initial state of the property.',
        argnames is ['State']]).

    %% Basic definition: do nothing!
    update(E, E).

    %% Basic definition, overloaded in almost every descendant prototype.
    new(void).

:- end_object.

Exactly how these predicates should be implemented will be clearer with an example.

:- object(health_property,
    extends(property)).

    new(10).

    action(decrease_health, [], Owner, H0, Owner, H) :-
        H is H0 - 1.
    action(increase_health, [], Owner, H0, Owner, H) :-
        H is H0 + 1.

:- end_object.

Hence, the initial state for the health property is 10. To decrease the value one calls action/6 with the correct action name, an empty argument list, the owner of the property and the old state, and in return obtains the updated owner and new state with the fifth and sixth parameters. If a property doesn’t support a specific action it’ll simply fail. Let’s see how update/2 can be used together with the health property. It should be called in each tick of the game. For example: the health should be decreased if the player is poisoned or has somehow been ignited. Such a property won’t support any actions and will only overload the definition of update/2 from property.
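
As a small aside, here’s a usage sketch of action/6 on the health property (a hypothetical top-level query; the owner argument, here simply the atom player, is passed through unchanged):

    ?- health_property::new(H0),
       health_property::action(decrease_health, [], player, H0, Owner, H).
    %% H0 = 10, Owner = player, H = 9.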

:- object(on_fire_property,
    extends(property)).

    update(E0, E) :-
        entity::action(decrease_health, [], E0, E).

:- end_object.

Since update/2 takes an entity as argument, the on_fire property simply asks this entity to decrease its health. Now we have to decide how to represent entities and how to delegate action messages. As mentioned earlier an entity is simply the sum of its properties, hence we can simply represent it as a list that contains properties and their states. The formal definition reads:

  • [] is an entity.
  • [P|Ps] is an entity if Ps is an entity and P is a tuple of the form: N - S, where N is the name of a property and S is a state of that property.

Continuing our example, [health\_property - 10] is an entity. If we decide to have some fun and for a moment indulge ourselves with arson, [health\_property - 10, on\_fire\_property - void] is also an entity. A few basic definitions remain before we can actually do anything with these entities. First we need a predicate update/2 that iterates through the list of properties that constitute the entity and calls update/2 for every property.

update(E0, E) :-
    update(E0, E0, E).

update(E, [], E).
update(E0, [P|Ps], E) :-
    P = Name - _,
    Name::update(E0, E1),
    update(E1, Ps, E).

The effect is that each property is updated with respect to the current state of the entity. Next up is action/4. It takes the name of the action that shall be performed, its arguments, the entity in question and returns the updated entity if the action could be performed or simply fails otherwise. Since an entity can’t perform any actions it has no choice but to try to delegate the action to its properties.

action(A, Args, E0, E) :-
    %% Select a property from the list such that the action A can
    %% be performed with the arguments Args.
    list::select(P, E0, E1),
    P = PropertyName - State,
    PropertyName::action(A, Args, E1, State, E2, State1),
    %% Add the property with the updated state to E2.
    P1 = PropertyName - State1,
    E = [P1|E2].

As seen from the code, action/4 uses select/3 to find the first property that supports the operation. If there’s more than one potential property it will simply choose the first one but leave a choice point for the others. If we put it together into one massive object we get:

:- object(entity).

    :- info([
        version is 1.0,
        author is 'Victor Lagerkvist',
        date is 2011/03/20,
        comment is 'The entity operations of Bacchus-Bosch.']).

    :- public(update/2).
    :- public(action/4).
    :- public(get_property/2).
    :- public(select_property/3).

    update(E0, E) :-
        %% As before.

    action(A, Args, E0, E) :-
        %% As before.

    get_property(E, P) :-
        list::member(P, E).

    select_property(P, E, E1) :-
        list::select(P, E, E1).

These are more or less the basic building blocks, the engine, but of course quite a lot of work remains before we have an actual game in our hands. For the remainder of the post we shall implement enough properties so that it’s possible to construct a room with some entities in it. A room is an entity that contains other entities. We can model this as a property, namely the property of being a container. This property at least has to support the following operations:

  • Being able to update the state of its children, i.e. overload update/2 from its parent.
  • Add and remove items with actions.

For simplicity the data structure is going to be a list of entities. This makes it almost trivial to write the actions.

:- object(container_property,
    extends(property)).

    new([]).

    update(E0, E) :-
        entity::select_property(container_property-Items, E0, E1),
        update_children(Items, Items1),
        E = [container_property-Items1|E1].

    update_children([], []).
    update_children([E|Es], [E1|E1s]) :-
        entity::update(E, E1),
        update_children(Es, E1s).

    action(add_item, [E], Owner, Items, Owner, [E|Items]).

    action(remove_item, [E], Owner, Items, Owner, Items1) :-
        list::select(E, Items, Items1).
:- end_object.

Then we of course need a door. Doors are awesome: you can walk through them, open them, close them, lock them – the only limit is the imagination! Our particular door will only have one property though, that of being openable. But it’s only possible to open a door if it happens to be unlocked, hence the openable property in turn depends on whether or not the lock can be unlocked with the key in question. For simplicity we’re going to assume that the player uses the key every time he/she wishes to open the door, but extending these properties so that the lock has an internal state would not be terribly difficult.

:- object(openable_property,
    extends(property)).

    new(closed-[lock_property- Lock]) :-
        lock_property::new(Lock).

    action(open, Key, Owner, closed-Lock, Owner, open-Lock) :-
        entity::action(unlock, Key, Lock, _).
    action(close, [], Owner, _-Lock, Owner, closed-Lock).

:- end_object.

The action/6 command should be read as: change the state of the door from closed to open if the key can unlock the lock. To complete the door we of course need to define the lock and key properties.

 :- object(key_property,
    extends(property)).

    %% The default key.
    new(key).

:- end_object.

:- object(lock_property,
    extends(property)).

    %% The (default) key that opens the lock.
    new(key).

    valid_key(E, Key) :-
        entity::get_property(E, key_property-Key).

    action(lock, [Entity], Owner, Key, Owner, Key) :-
        valid_key(Entity, Key).
    action(unlock, [Entity], Owner, Key, Owner, Key) :-
        valid_key(Entity, Key).

:- end_object.

The two actions specify that an entity can lock or unlock the lock if it has the property of being a key with the correct state. Note that the entity in question doesn’t have to be just a key; it can still support other properties as long as the key property is fulfilled.
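
To see how the pieces fit together, here’s a hypothetical query that builds a default (closed) door state and opens it with an entity that merely has the default key property:

    ?- openable_property::new(State0),
       entity::action(open, [[key_property-key]], [openable_property-State0], Door).
    %% State0 = closed-[lock_property-key],
    %% Door   = [openable_property-(open-[lock_property-key])].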

We’re now able to create a room with a door and a key. What remains is the player. For the moment that’s just an entity that has the property of having health, which we defined earlier, and that of having an inventory. The Argus-eyed reader will probably realize that having an inventory is quite similar to that of being a container. The only difference is that the player can only pick up items that are carriable, so instead of defining everything from the ground up we’re going to inherit some definitions from container\_property. What happened to the principle of favoring composition over inheritance? Bah!

:- object(inventory_property,
    extends(container_property)).

    valid_item(E) :-
        entity::get_property(E, carriable_property-_).

    action(add_item, [E], Owner, Items, Owner, [E|Items]) :-
        valid_item(E).

    action(drop_item, [E], Owner, Items, Owner, Items1) :-
        valid_item(E), % This shouldn't be necessary since only valid items are added in the first place.
        list::select(E, Items, Items1).

:- end_object.

:- object(carriable_property,
    extends(property)).

    %% Perhaps not the most interesting property in the game.

:- end_object.

It should be noted that I haven’t actually tested most of these properties, but it should/could work! So do we now have a functional but simplistic game? Not really. We have the basic entities, but there’s a quite glaring omission: everything is invisible since nothing can be printed. For a text game this is a distinct disadvantage! The simplest solution is to add another property, that of being printable, where the state is the string that shall be displayed.

:- object(printable_property,
    extends(property)).

    new(Description) :-
        list::valid(Description).

    action(print, [], Owner, Description, Owner, Description) :-
        format(Description).
:- end_object.

But since we also want the possibility to print items in a container, e.g. a room, we’re going to define an additional action in container\_property.

:- object(container_property,
    extends(property)).

    new([]).
    .
    . % As before.
    .

    action(print_children, Args, Owner, Items, Owner, Items) :-
        % Print all children that are printable.
        meta::include([E] >>
                       (entity::action(print, Args, E, _)),
                      Items,
                      _).
:- end_object.

Putting everything together

Let’s use the properties and construct some entities. The test program is going to build a room and then print its description together with its children.

    init :-
        write('Welcome to Bacchus-Bosch!'), nl,
        build_test_room(Room),
        entity::action(print, [], Room, Room1),
        write('You see: '), nl,
        entity::action(print_children, [], Room1, _Room).

Building the room and its components is not hard, just a bit tedious.

    build_test_room(Room) :-
        build_test_door(Door),
        build_test_player(Player),
        build_test_key(Key),
        Room0 = [container_property - State1, printable_property - State2],
        container_property::new(State1),
        room_description(State2),
        entity::action(add_item, [Door], Room0, Room1),
        entity::action(add_item, [Player], Room1, Room2),
        entity::action(add_item, [Key], Room2, Room).

    build_test_door(Door) :-
        Door = [openable_property-State1, printable_property-State2],
        openable_property::new(State1),
        door_description(State2).

    build_test_key(Key) :-
         Key = [key_property-State1, printable_property-State2],
         key_property::new(State1),
         key_description(State2).

    build_test_player(Player) :-
        Player = [inventory_property-State1, health_property-State2],
        inventory_property::new(State1),
        health_property::new(State2).

The string descriptions are simply given as facts.

    room_description("A rather unremarkable room.\n").
    door_description("A wooden door with a small and rusty lock.\n").
    key_description("A slightly bent key.\n").

When running the test program we get the following output:

Welcome to Bacchus-Bosch!
A rather unremarkable room.
You see:
A slightly bent key.
A wooden door with a small and rusty lock.
true

Only three major obstacles remain before we have a game in our hands:

  • More properties.
  • Conjoin the parsed commands with the game world so that it’s possible for the user to interact with entities. It shouldn’t be too hard to see that commands will be executed by sending action commands to the entities in question.
  • Abstract the creation of game objects, e.g. with the help of an embedded language.

These will be the topics of the next post. Stay tuned!

Source code

The source code is available at https://gist.github.com/878277.

An Adventure Game – Part 2

When we left off last time we were able to read input from the player and tag the atoms with their word classes. This time we’ll use this information to analyze what the input means with two different techniques: pronominal anaphora resolution and chunk-based parsing. Don’t worry about the technical names – I’m way too lazy to do something that would actually be strenuous, so the presented solutions won’t be that hard to grasp. Let’s have a look at some typical user input and some potential answers:

> look

You are in a dimly lit room. A banana hangs from the ceiling by a thin plastic wire.

> look at the banana

It’s a typical banana. Yellow, crooked and utterly delicious. Mmm.

> eat it

The banana is out of reach.

> grab the banana and eat it

Not even your long, hairy ape-arms are enough to bridge the vast gap.

> fling feces around the room

I’m sorry. I don’t know what ‘fling’ means.

All these sentences have something in common: they are imperative in nature and somehow either change the state of the game world or inspect it. We’ll make the basic assumption that every command can be described as a tuple, C-Args, where C is an atom and Args is a list of tags. The “look”-command from the example can be represented as:

command(look, [entity]).
command(look, []).

Since a banana is an entity, the sentence “look at the banana” can be parsed as a look-command with “banana” as argument. It can also be parsed simply as “look”, with no argument at all, but the first interpretation is preferable. There is however one problem that we must tackle before the parsing can take place: what does “it” mean? That depends on the context and is the subject of pronominal resolution.

Pronominal resolution

Just like everything else in natural language processing, resolving pronouns is quite challenging in the general case. Fortunately we’re dealing with a rather restricted subset of English. When a pronoun such as “it” is used we can make the assumption that it’s just a shorthand for a previously identified entity. The task then becomes to replace all occurrences of a pronoun with the corresponding entity before the parsing. Let’s have a look at a few examples:

> take the banana and eat it

Here the pronoun refers to the banana, i.e. the sentence should be replaced by “take the banana and eat the banana”.

> look at the ceiling and take the banana and eat it

Here the pronoun probably refers to the banana, but it could possibly also refer to the first entity: the ceiling. Since we always want the possibility to backtrack and try alternative solutions in case the first one fails later on, it’s a good idea to make the predicate non-deterministic. Assuming that we’re given a list of tagged atoms, a simple algorithm may look as follows:

  • If the list of tagged atoms is non-empty, inspect the first element.
  • If it’s not a pronoun, ignore it and continue with the rest of the list.
  • If it’s a pronoun, replace the pronoun with the corresponding entity, where the candidates of possible entities are the tagged atoms that occur to the left of the pronoun in the sentence.

Putting this into code is straightforward. The predicate will have three arguments: the list of tagged atoms, the tagged atoms that have been processed so far and the result.

    resolve_pronouns(ATs, Xs) :-
        resolve_pronouns(ATs, [], Xs).

    resolve_pronouns([], _, []).
    resolve_pronouns([A-T|Xs], Pre, [Y|Ys]) :-
        (   T = pronoun ->
            resolve_pronoun(A, Pre, Y),
            resolve_pronouns(Xs, Pre, Ys)
        ;   Y = A-T,
            resolve_pronouns(Xs, [Y|Pre], Ys)
        ).

resolve\_pronoun/3 is also a walk in the park. For now, we’re only going to be dealing with “it”. Other pronouns can be added in a similar manner.

    %% NAME:
    %%  resolve_pronoun(Pronoun, AtomTags, X)
    %% DESCRIPTION:
    %% True if X is an entity corresponding to the pronoun resolved
    %% with the help of AtomTags. It is the caller's responsibility to
    %% make sure that AtomTags is in the correct (possibly reversed)
    %% order.
    resolve_pronoun(it, Xs, X) :-
        X = A-entity,
        list::member(X, Xs).

Nota bene: since the words are pushed onto Pre in LIFO order, a pronoun will be resolved from right-to-left. While this algorithm is indeed very simple, it’s not particularly efficient. Due to the call to member/2 in resolve\_pronoun/3, the worst case execution time (to find one solution) for resolve\_pronouns/2 is exponential. But considering the fact that a typical sentence is just a few words long and that they rarely if ever contain more than one or two pronouns, I think we’ll survive.

One step remains. Note that “it” is used in one of the example sentences without referring to anything in that particular sentence. Instead it refers to the entity that was used in the previous sentence. The obvious yet quite flexible solution is to augment resolve\_pronouns/2 with an additional argument that holds the previous sentence. Then a pronoun can first be resolved with respect to the current sentence, but if that fails the previous sentence is tried instead.
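
A hypothetical sketch of this resolve\_pronouns/3 (which is also what eval/2 below ends up calling): a pronoun is first resolved against the current sentence and only then against the previous one.

    resolve_pronouns(History, ATs, Xs) :-
        resolve_pronouns(ATs, [], History, Xs).

    resolve_pronouns([], _, _, []).
    resolve_pronouns([A-T|Xs], Pre, History, [Y|Ys]) :-
        (   T = pronoun ->
            (   resolve_pronoun(A, Pre, Y)
            ;   resolve_pronoun(A, History, Y)
            ),
            resolve_pronouns(Xs, Pre, History, Ys)
        ;   Y = A-T,
            resolve_pronouns(Xs, [Y|Pre], History, Ys)
        ).

Whether History should be reversed before the fallback lookup is, as noted in the documentation of resolve\_pronoun/3, the caller’s responsibility.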

Parsing

Natural language contains a lot of noise and variation. This makes formal methods such as context-free grammars somewhat ill-suited and cumbersome to use. One deceptively simple solution is to ignore everything that we’re not interested in and group the good stuff into chunks. A command, e.g. “eat”, can be considered a chunk. We know that it takes one entity-argument, so after encountering the command chunk we scan the rest of the sentence for an argument chunk of the correct type. This might sound like cheating – and it kind of is – but as long as the reduced, simpler problem still yields the correct output that’s hardly of any importance.

The goal is to produce one or more commands given a list of tagged atoms (where the pronouns are resolved). Remember that commands are defined as:

        command(look, [entity]).
        command(look, []).
        command(eat, [entity]).

To allow some flexibility in the input, we’re also going to allow synonyms. These can be defined as additional facts.

    variants(look, [look, inspect]).
    variants(eat, [eat, munch, digest]).

For every chunk of type X we’re going to write a corresponding X\_chunk predicate that has 3 arguments: the list of tagged atoms, the list that remains after this chunk is parsed, and the resulting chunk. First off is the sentence chunk.

    %% NAME:
    %%  parse_atoms(+AtomTags, -Remainder, -Commands)
    %% DESCRIPTION:
    %% True if Commands is a list of commands parsed from the atoms and their
    %% tags in AtomTags.
    parse_atoms(ATs, Remainder, Commands) :-
        sentence_chunk(ATs, Remainder, Commands).

    %% NAME:
    %%  sentence_chunk(+AtomTags, -Remainder, -Commands)
    %% DESCRIPTION:
    %% True if Commands is a list of commands parsed from the atoms and their
    %% tags in AtomTags. The remainder, i.e. the chunk after the
    %% sentence, is stored in Remainder (most likely just the empty list).
    sentence_chunk(ATs, Remainder, [C-Args|Cs]) :-
        command_chunk(ATs, ATs1, C),
        argument_chunk(ATs1, C, ATs2, Args),
        (   conjunction_chunk(ATs2, Remainder0, _) ->
            sentence_chunk(Remainder0, Remainder, Cs)
        ;   Cs = []
        ).

    conjunction_chunk([Conj-conjunction|ATs], ATs, Conj-conjunction).

This should be read as: first parse a command followed by its argument; then, if the arguments are followed by a conjunction, recursively parse the rest of the list. The command chunk is slightly more involved since we have to deal with variants of commands.

    %% NAME:
    %%  command_chunk(+AtomTags, -Remainder, -Command)
    %% DESCRIPTION:
    %% True if Command is the command parsed from the atoms and their
    %% tags in AtomTags. The remainder, i.e. the chunk after the
    %% command, is stored in Remainder.
    command_chunk(ATs, ATs1, C) :-
        %% First find a verb, i.e. a potential command.
        list::append(_, [C0-verb|ATs1], ATs),
        %% Then check whether or not the verb is a variant of a known
        %% command.
        game_logic::variants(C, Cs),
        list::member(C0, Cs).

Note the use of append/3 to ignore everything in the list up to a verb. When parsing the arguments we scan the input and look for atoms of the correct type.

    %% NAME:
    %%  argument_chunk(+AtomTags, +Command, -Remainder, -Args)
    %% DESCRIPTION:
    %% True if Args are the arguments corresponding to the arity of
    %% Command with respect to the atoms and their tags in
    %% AtomTags. The remainder, i.e. the chunk after the last
    %% argument, is stored in Remainder.
    argument_chunk(ATs, C, ATs1, Args) :-
        game_logic::command(C, Tags),
        matches(ATs, Tags, ATs1, Args).

    %% NAME:
    %%  matches(+AtomTags, +Tags, -Remainder, -Args).
    %% DESCRIPTION:
    %%  True if the list of atoms and their corresponding tags matches
    %%  Tags, i.e. there exists a sequence of atoms, not necessarily
    %%  in a direct linear sequence, such that their tags can be
    %%  mapped to the tags in Tags.
    matches(ATs, [], ATs, []).
    matches(ATs, [T|Ts], Remainder, [A|As]) :-
        list::append(_, [A-T|ATs1], ATs),
        matches(ATs1, Ts, Remainder, As).

That’s all the constructs that we’re going to support at the moment. Two remarks should be made before we move on:

  • The chunk predicates can be rewritten as DCGs instead since they’re essentially just threading state. I’ll leave this as an exercise for the reader (though a possible sketch follows after this list)!
  • argument\_chunk/4 is given the ability to ignore everything that it can’t make sense of. While this is good in many circumstances, it also has the effect that some ill-formed sentences will be incorrectly parsed. For example, consider the sentence “eat the bnana and look at the room”. Assuming that bnana is tagged as unknown, the sentence will be parsed as eat-room since room is the only entity that argument\_chunk/4 discovers. One possible solution is to restrict the argument parsing to the current clause and disallow it from continuing past conjunctions.
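
As a possible answer to the first exercise, here’s an untested sketch of the chunk predicates as DCG rules, where skip//0 plays the role of append/3:

    parse_atoms(ATs, Remainder, Commands) :-
        phrase(sentence_chunk(Commands), ATs, Remainder).

    sentence_chunk([C-Args|Cs]) -->
        command_chunk(C),
        argument_chunk(C, Args),
        (   conjunction_chunk(_) ->
            sentence_chunk(Cs)
        ;   { Cs = [] }
        ).

    conjunction_chunk(Conj-conjunction) -->
        [Conj-conjunction].

    command_chunk(C) -->
        skip,
        [C0-verb],
        { game_logic::variants(C, Cs), list::member(C0, Cs) }.

    argument_chunk(C, Args) -->
        { game_logic::command(C, Tags) },
        matches(Tags, Args).

    matches([], []) --> [].
    matches([T|Ts], [A|As]) -->
        skip,
        [A-T],
        matches(Ts, As).

    %% skip//0 consumes an arbitrary (possibly empty) prefix of the input.
    skip --> [].
    skip --> [_], skip.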

Putting everything together

Now we’re able to dissect the meaning of the user input and extract commands from it. The next task is to actually carry out the commands and update the state of the game world, but for the moment we’re just going to print the parsed commands to verify that everything works. The steps are the following:

  • Read a line and convert it to atoms.
  • Tag the atom list.
  • Resolve the pronouns.
  • Parse the resulting list of tagged atoms.

And it’s rather easy to cobble together.

    init :-
         write('Welcome to Bacchus-Bosch!'), nl,
         current_input(S),
         repl(S, []).

    repl(S, History) :-
        write('> '),
        nlp::parse_line(S, Atoms),
        write('The input is: '),
        meta::map([X] >> (write(X), write(' ')), Atoms), nl,
        nlp::tag_atoms(Atoms, AtomTags),
        write('The tagged input is: '),
        meta::map([X] >> (write(X), write(' ')), AtomTags), nl,
        (   eval(History, AtomTags) ->
            true
        ;   write('I''m sorry, Dave. I''m afraid I can''t do that'),
            nl
        ),
        repl(S, AtomTags).

    eval(History, AtomTags) :-
        nlp::resolve_pronouns(History, AtomTags, AtomTags1),
        eval_commands(AtomTags1).

    eval_commands(AtomTags) :-
        nlp::parse_atoms(AtomTags, _, Commands),
        write('The parsed commands are: '), nl,
        meta::map([X] >> (write(X), nl), Commands).

If we feed this program with the example sentences we end up with the following dialogue:

Welcome to Bacchus-Bosch!

> look

The input is: look
The tagged input is: look-verb
The parsed commands are:
look-[]

> look at the banana

The input is: look at the banana
The tagged input is: look-verb at-preposition the-article banana-entity
The parsed commands are:
look-[banana]

> eat it

The input is: eat it
The tagged input is: eat-verb it-pronoun
The parsed commands are:
eat-[banana]

> grab the banana and eat it

The input is: grab the banana and eat it
The tagged input is: grab-verb the-article banana-entity and-conjunction eat-verb it-pronoun
The parsed commands are:
take-[banana]
eat-[banana]

> fling feces around the room

The input is: fling feces around the room
The tagged input is: fling-unknown feces-unknown around-preposition the-article room-entity
I’m sorry, Dave. I’m afraid I can’t do that

Source code

The source code is available at https://gist.github.com/862464.

Due to a hefty amount of school-work (I should actually be crunching theorems in number theory right now!) part 3 will probably be delayed one or two weeks.