Archive | February 2011

An Adventure Game – Part 1

A Prolog programmer walks into a bar:

– Can I have a beer?

– No.

As promised, in this and the following posts I’ll design and implement a simple, text-driven adventure game in the spirit of Zork. I shan’t be too concerned with superfluous features and instead direct most energy towards nifty algorithms and reasonably well-designed data structures. No guarantees whatsoever will be made regarding the actual playability of the final product! Since they say that a picture is worth more than a thousand words:

Wait. That’s probably not true in the case of a text-adventure game. Crap. In any case the basic ideas involved are not too hard to grasp. The only interface with the user is text, in the form of output and input. Output is, as one might expect, one of the easiest parts to implement since all game objects will have textual descriptions. Input is far, far worse since we’ll be dealing with natural instead of formal languages. In the general case perfect natural language processing is more or less impossible. This is why almost all successful services that make use of natural language are based on statistical/probabilistic methods instead of strictly formal models such as context-free grammars (hybrids also exist). But of course, just because something is very hard in the general case doesn’t necessarily imply that it’s just as hard in every case. We will, however, be forced to impose some severe restrictions on what kind of input we accept, and focus our efforts there.

Where to start? UML-diagrams? Activity diagrams? Of course not — we’ve already put several minutes into the project without savoring a single cup of the brown, precious liquid known to man as coffee. That’s the first step. The second step is to stare blankly at the monitor for at least a few minutes and realize that it would probably be a good idea to put on some music before the mind-numbing silence quells all remaining sanity. Good choices are Einstürzende Neubauten, Melt Banana or Programmer Art, by Yellow Bear. The fourth step is to automatically generate a nonsensical name for the project. I used a Soundex algorithm with random names as input, and after some tinkering I ended up with:

Bacchus-Bosch

No, I’ve got absolutely no idea what it means, but it rolls off the tongue nicely. Now we’re finally in a position where we might be inclined to do some real work! This is an instance where the bottom-up methodology will work nicely: start with something simple that we know we’ll need later on and expand from there. For no particular reason I’ll start with the most fundamental parts of user input: tokenizing and part-of-speech tagging. A typical example of user input is:

open the door

go through the door

How should these sentences be construed? Well, no matter what, we’ll have to start by reading them character by character from an input stream. This will result in a list of characters. We can code this in more or less the same way as in an imperative language: continue reading characters until the end of line is reached.

read_line(S, Chars) :-
    get_char(S, C),
    read_line(C, S, Chars).

% Stop at end of line; also stop at end of file, so that the predicate
% can't loop forever on a stream that lacks a final newline.
read_line('\n', _, []) :- !.
read_line(end_of_file, _, []) :- !.
read_line(C, S, [C|Chars]) :-
    read_line(S, Chars).

Where get_char/2 takes a stream as input and returns the next character. We could in theory analyze this character list already at this stage, but for simplicity it’s better to introduce an additional pre-processing step. Therefore we’ll take our character list and group the characters into words instead. For example, the character list from the first example sentence would be [o,p,e,n,' ',t,h,e,' ',d,o,o,r], and would result in the word list [open, the, door]. There already exists a primitive for this purpose: atom_chars/2, so what we need to do is go through the list, look for groups separated by whitespace and convert each group to an atom. One simple solution is to use a set of DCG-rules (for brevity some basic rules are omitted; a possible rendering of them is sketched below):

chars_to_list([A|As]) -->
	chars_to_atom(A),
	whitespace,
	chars_to_list(As).
chars_to_list([A]) -->
	chars_to_atom(A),
	opt_whitespace.
chars_to_atom(A) -->
	chars(Cs),
	{atom_chars(A, Cs)}.

The goal in the curly braces, atom_chars(A, Cs), makes sure that each group is converted appropriately. This is certainly not the most efficient solution. For example: the last atom always needs to be parsed twice in chars_to_list//1, once in the first rule and once in the second. In a real compiler this might not be acceptable, but since we’re dealing with sentences of just a few words it’s A-OK.
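The omitted helper rules weren’t shown above, but a minimal rendering could look as follows, assuming that words are separated by one or more spaces (tabs and other whitespace are left out for brevity):

chars([C|Cs]) --> char(C), chars(Cs).
chars([C]) --> char(C).

% A single character that is not a word separator.
char(C) --> [C], { C \== ' ', C \== '\n' }.

whitespace --> [' '], opt_whitespace.

opt_whitespace --> whitespace.
opt_whitespace --> [].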

Now we can recognize individual words, but we still don’t know what they mean, i.e. what semantic class they belong to. Are they imperative commands? Directions? Physical objects? For that we’ll use a technique that in natural language processing is known as tagging: each word will be assigned a tag corresponding to its lexical category. For example, the sentences above could be manually tagged as:

open/verb the/article door/noun

go/verb through/preposition the/article door/noun

Don’t fret if the part-of-speech lectures from middle school are nothing but a distant, hazy memory: it won’t get much more complicated than this. Like all other domains in natural language processing, tagging is very hard to do perfectly. Therefore we shall use a simple tagger that either looks up the tag for a word directly in a dictionary, or makes a decision based on the previously tagged words. First we need a small dictionary that contains words and their lexical categories. The simplest solution is to implement it as a set of word/2 facts:

:- object(game_logic).

    :- public(word/2).

    word(entity, door).
    word(entity, key).
    word(entity, banana).

    word(direction, north).
    word(direction, east).
    word(direction, south).
    word(direction, west).

    word(preposition, on).
    word(preposition, through).

    word(pronoun, he).
    word(pronoun, it).
    word(pronoun, she).

    word(verb, go).
    word(verb, take).
    word(verb, open).
    word(verb, eat).
    word(verb, use).

    word(article, a).
    word(article, an).
    word(article, the).

:- end_object.
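Once the object is loaded, the dictionary can be consulted from anywhere with the message-sending operator ::, as in this hypothetical top-level session:

?- game_logic::word(Tag, door).
Tag = entity.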

A remark: it would be preferable to store the facts with the arguments switched, e.g. word(door, entity) instead of word(entity, door), to exploit first-argument indexing, since the word is looked up far more often than the lexical category. Obviously this is just a sample dictionary, but it can easily be extended once we begin working on the rest of the game. Next we’ll need a predicate that, given a word, returns its lexical category as a tag. However, since we also want to be able to construct rules that take the history into account, it will need not only the word but also a list of the preceding words and their tags. For efficiency reasons this list is stored in reverse order. An example of such a rule is: “if the previous tag was an article, then the word must be an entity”.

    %% NAME:
    %% 	tag(+Preceding, +Word, -Tag).
    %% DESCRIPTION:
    %% 	True if Tag is the tag corresponding to Word. Preceding is used
    %% 	whenever a rule needs to look at the history in order to tag the
    %% 	word.
    tag([_-article|_], _, entity).
    tag(_, A, T) :-
        game_logic::word(T, A).
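To see the two clauses in action, consider a couple of hypothetical queries (assuming tag/3 is exported from whatever object it ends up in; the enclosing object isn’t named here). Note how the history rule happily tags a word that isn’t even in the dictionary, as long as the previous word was an article:

?- tag([the-article], portcullis, Tag).
Tag = entity.

?- tag([], eat, Tag).
Tag = verb.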

It’s easy to extend tag/3 with additional rules, but let’s leave it as it is for now. The actual tagger, tag\_atoms/2, will just go through the list of atoms and call tag/3 for each of them.

    %% NAME:
    %%  tag_atoms(+Atoms, -AtomTags).
    %% DESCRIPTION:
    %%  True if AtomTags is a list of pairs of the form Atom-Tag,
    %%  where Tag is the tag corresponding to Atom.
    tag_atoms(As, Ts) :-
        tag_atoms(As, [], Ts).

    tag_atoms([], _, []).
    tag_atoms([A|As], Pre, [A-T|Ts]) :-
        tag(Pre, A, T),
        tag_atoms(As, [A-T|Pre],Ts).
    tag_atoms([A|As], Pre, [A-unknown|Ts]) :-
        %We don't use a cut since we want the ability to try several
        %different tags if necessary.
        \+ tag(Pre, A, _),
        tag_atoms(As, [A-unknown|Pre], Ts).

Note something about tag_atoms/3: it can return several solutions if tag/3 is non-deterministic. This is actually a good thing since natural language is inherently imprecise and ambiguous. If some other part of the game fails due to an erroneous tagging it’s always possible to backtrack and try again.

Here’s how the tagger looks in action:

-> eat the key
The input is: eat the key
The tagged input is: eat-verb the-article key-entity

-> open the door with the banana
The input is: open the door with the banana
The tagged input is: open-verb the-article door-entity with-preposition the-article banana-entity

Whether those specific commands make any sense in the finished game remains to be seen!
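For completeness: a session like the one above could be produced by a small read-tag-print loop along the following lines. This is only a sketch, under the assumption that read_line/2, chars_to_list//1 and tag_atoms/2 all live in the same object; it’s not necessarily the exact code behind the transcript.

play :-
    write('-> '),
    read_line(user_input, Chars),
    phrase(chars_to_list(Atoms), Chars),
    write('The input is: '), write_list(Atoms), nl,
    tag_atoms(Atoms, AtomTags),
    write('The tagged input is: '), write_list(AtomTags), nl,
    play.

% Print the elements of a list separated by single spaces.
write_list([]).
write_list([X|Xs]) :-
    write(X), write(' '),
    write_list(Xs).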

Summary

We’ve implemented some fundamental parts of Bacchus-Bosch and are now able to get input from the player and make sense of it by tagging it. Next we’ll concern ourselves with parsing. The goal is to construct a parse tree from the tagged input that reflects the structure of the command that the player had in mind. Stay tuned!

Source code

The source code is available at https://gist.github.com/847541.


A Crash Course in Logtalk

Or: Logtalk as she is spoke.

Since the last post was quite theoretical it’s only fair that the following ones will be more pragmatic in nature. The goal will be to develop a simplistic but still fairly capable text-adventure game in the same spirit as Zork and its successors and present the parts incrementally. But as you, the reader, probably know, compatibility between various Prolog systems is shaky at best. One remedy which I’ve championed before is Logtalk – an “open source object-oriented logic programming language” developed by Paulo Moura, which allows one to write portable programs without worrying about the details of the underlying Prolog system. As the description suggests, Logtalk leaves the constricted domain of a logic program and introduces a few new concepts that can greatly aid the programmer not only in large projects but also in small and medium-sized ones. The question of what an object is in the context of logic programming is an interesting one. We shall however not concern ourselves with formal definitions here and instead be rather descriptive. A formalistic viewpoint will probably be pursued in a later entry, but for now the reader can refer to, for example, Logic and Objects by Francis G. McCabe, which contains a thorough treatment of the semantics of the Logic and Objects language and answers what inheritance and objects mean in logic programming. Hence the remainder of this post will try to shed some light on the basics of the Logtalk language and explain how it’s installed and used. More advanced concepts will be explained when the time is ripe.

Installation

We begin our journey by downloading the Logtalk distribution. This can be done either via the Logtalk download page or by Subversion – the choice is yours! The next step is to read the exhaustive INSTALL.txt file, but for a POSIX system the procedure is as follows:

  • Set the correct environment variables. Logtalk uses two: LOGTALKHOME and LOGTALKUSER. On a single-user system these can point to the same directory, e.g. $HOME/lgtsvn, or wherever the distribution was unpacked. It’s also a good idea to augment $PATH with $LOGTALKHOME/scripts and $LOGTALKHOME/integration, and $MANPATH with $LOGTALKHOME/man.
  • Run the correct script in $LOGTALKHOME/integration. For SWI-Prolog this is swilgt.sh, for YAP Prolog it’s yaplgt.sh.

That’s it! Of course one may also digress in various customization aspects, but for basic usage this is more than enough.

Usage

Let’s hack away. I shall refrain from giving the idiomatic but ultimately quite meaningless Hello World example and instead present something that uses at least some of the basic Logtalk facilities. The problem is the following: compute the bottom-up closure of a propositional logic program. If you’ve read the last entry on the semantics of logic programs you should immediately realize that this is nothing other than the least Herbrand model. We begin by outlining the algorithm:

  • The facts in the program are true.
  • If there is a rule H \leftarrow B_1, ..., B_n such that all B_i are known to be true, then H must be true.
  • If it’s impossible to collect any new information, we’ve reached a fixpoint and are done.

The first step is to choose a program representation. Rules shall be represented as facts of the form rule(Head, Body), where Body is a list [B_1, ..., B_n] and Head and B_i are atoms. For facts, Body = [].  For example:

rule(a, []).
rule(b, []).

rule(c, [a,b]).

Which should be read as: a and b are true, and if a and b are true then c must be true. This brings us to our first Logtalk concept: the object. The rule set discussed above could be written as:

:- object(database).

    :- public(rule/2).

    rule(a, []).
    rule(b, []).

    rule(c, [a,b]).

:- end_object.

There’s nothing really fancy going on. The :- object directive obviously denotes the start of an object, while the :- end_object directive denotes the end. public(rule/2) means that the set of rule/2 clauses has public visibility, i.e. they can be accessed from anywhere. The only possible source of confusion might be the loose use of the word object. In e.g. Java or C++ there’s a sharp distinction between a class and an object: we say that an object is an instance of a class. This distinction is needed because objects there have explicit, dynamic state. In Logtalk, however, objects are most often used to provide module-like namespaces. (To be fair, there exist dynamic objects, but they should be used sparingly, just like assert/retract in Prolog.) Hence there’s often no need to make a distinction between a class and an object. For a more thorough explication, see the relevant part of the Logtalk manual.
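As a first taste of message sending (more on it below), once database is loaded its public predicate can be queried from the top level. A hypothetical session:

?- database::rule(c, Body).
Body = [a, b].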

Now let’s turn to the algorithm. From the pseudo-code we know that we’ll need a predicate that takes the current set of atoms known to be true and collects the heads of all the satisfied rules. This is easy to do: if we encounter a rule with the head H and body B, such that B is currently satisfied, then add H to the set of new inferences. Since there already exists a library for ordered sets in Logtalk we can write it as:

next(I0, I) :-
    setof(Head,
          Body^( database::rule(Head, Body),
                 satisfied(Body, I0),
                 \+ set::member(Head, I0) ),
          New),
    set::union(I0, New, I).

The ::-operator denotes message sending. Hence, when we want to call a predicate defined in the database we write database::rule(...), but when we want to call a set predicate we prefix it with set. I is the union of the old set of inferences, I0, and the new set of inferences, New. The head Head of a rule is added to New if its body is satisfied and it hasn’t been collected before. Since we’re only dealing with propositional atoms, the satisfied/2 predicate is easily implemented as:

satisfied(Body, Atoms) :-
    set::subset(Body, Atoms).
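Before wiring everything together it’s illuminating to trace next/2 on the sample database (a hypothetical session, assuming next/2 were declared public): the first application collects the facts, the second adds c, and a third fails since nothing new can be inferred.

?- bup::next([], I).
I = [a, b].

?- bup::next([a, b], I).
I = [a, b, c].

?- bup::next([a, b, c], I).
false.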

Next we need a predicate that repeatedly calls next/2 until no new inferences can be made. Since next/2 uses setof/3, which fails when its goal has no solutions, next/2 simply fails when New would be empty. Hence the iteration only needs an if-then-else construct that either unifies its last argument with the obtained set of inferences, or calls itself recursively.

iterate(I0, Fixpoint) :-
    (   next(I0, I) ->
        iterate(I, Fixpoint)
    ;   Fixpoint = I0
    ).

To wrap things up it might be nice to have a predicate that takes a goal as argument, calls iterate/2 and checks whether the goal is a member of the fixpoint or not.

:- public(prove/1).

prove(Goal) :-
    iterate([], Fixpoint),
    set::member(Goal, Fixpoint).

Putting everything together

Logtalk programs are loaded with logtalk_load/1 or logtalk_load/2. Assuming that the bottom-up procedures were included in an object with the name bup, the whole project is loaded with the sequence:

logtalk_load(database). %The database.
logtalk_load(library(types_loader)). %The set library.
logtalk_load(bup). %The main program.

To run a query, one writes:

?- bup::prove(X).
X = a ;
X = b ;
X = c ;
false.

And since it’s a mouthful to manually load the files every time, it’s better to put that information in a loader file named loader.lgt.

:- initialization((
    logtalk_load(library(types_loader)),
    logtalk_load(database),
    logtalk_load(bup))).

The loader file can then be loaded from the top-level with either logtalk_load(loader) or the shorthand {loader}.

That’s it for now. I’ll admit that it would be nice to attach the full source code directory, but unfortunately the free WordPress hosting doesn’t seem to have that functionality. I managed to upload a zip-file by changing the file extension, but that hardly seems ideal. Rest assured that the matter will probably be solved before the next entry!

A Gentle Introduction to Herbrand Models

Almost related pun of the day: Some interpretations are more equal than others.

I’ve not yet encountered a single person — including myself — who has grasped the concept and utility of Herbrand interpretations at first glance. This might lead one to believe that the theory is so hard and abstract that it’s not really related to everyday logic programming, but nothing could be further from the truth. On the contrary, the problem is rather that the definition of a Herbrand interpretation is too simple to make any sense — a wolf in sheep’s clothing. Hence we have a good subject at hand and a splendid opportunity for me to confuse you further with deceptive didacticism.

A logic program is a finite set of rules and facts. Let’s not bother ourselves by giving a formal definition and instead cut right to the chase with an example:

natural(zero).
natural(s(N)) \leftarrow natural(N).

From a pragmatic perspective it should be construed as: zero is natural (i.e. a natural number), and if it is known that N is a natural number, then it’s the case that s(N) is a natural number. I’ll be the first one to admit that it’s not a very interesting program, but it’ll have to do for the moment. Following this intuition it’s possible to answer queries to the program in two different ways: top-down and bottom-up. Let our query be natural(s(zero)). Here’s how the two different strategies differ:

Top-down

To solve natural(s(zero)), we unify natural(s(zero)) with natural(s(N)) and obtain the unifier \{N= zero\}. Then we recursively solve the new goal obtained from the right-hand side of the rule: natural(zero). This can be done in one step, and since there are no more goals to resolve we’re done.

Bottom-up

We begin by noting that natural(zero) holds. A solid start! Then, since the body of the second clause is true with the unifier \{N = zero\} (the same as before), we conclude that natural(s(zero)) must also be true. Since this is the same as the original query there’s no reason to continue.

I’ve admittedly glossed over a lot of intricate details, but I should at least mention that the top-down approach corresponds to SLD-resolution while the bottom-up approach corresponds to unit-resolution (sometimes called hyper-resolution). The difference between the two lies in the application of the resolution rule: in SLD-resolution we produce resolvents between the head of a clause and a goal clause; in unit-resolution we produce resolvents between a unit clause and the body of a clause. So we have two ways of answering a query, but how do we know that they are correct? For that we need to backtrack (pun not intended) and carefully explicate the meaning of the program. This is formally done with a model. The content of this particular program is:

  1. A single constant: zero.
  2. A single function symbol of arity 1: s.
  3. A single predicate symbol of arity 1: natural.

And to provide a meaning to the program we need to explain exactly what these symbols mean. We need two functions: one for the constant symbol, f_{zero}, and one for the function symbol, f_s. In the domain of the natural numbers these can be defined as:

  1. f_{zero}(zero) = 0
  2. f_s(N) = N + 1

Which may then be combined into a single function, f, that allows us to recursively compute the value of any term in the program. It is defined as:

f(zero) = f_{zero}(zero)

f(s(N)) = f_s(f(N))

For example:

f(s(zero)) = f_s(f_{zero}(zero)) = f_s(0) = 0 + 1 = 1

The only predicate symbol, natural, must be mapped to a relation on the domain. The simplest such relation is just the set \{0, 1, ...\}. And we’re done with this particular model.

Is it a good model? That depends. In the context of our logic program it seems to be overkill. Previously we only had to worry about the symbols that actually appeared in the program: s, zero and natural; but now we’ve introduced arithmetical notation that, so to speak, goes beyond the original program. There’s really no reason at all to demand that zero must be evaluated to 0 and that s(zero) must be evaluated to 1. They are just two different representations. With this in mind we can construct a simpler model:

  1. f_{zero}(zero) = zero
  2. f_s(N) = s(N)

And in the same vein as before this may be combined into a single function f. Applied to the same term as before we obtain:
f(s(zero)) = f_s(f_{zero}(zero)) = f_s(zero) = s(zero)

We’re back where we started! This means that the terms in the program are just mapped to themselves, and why not? The predicate symbol natural will now be mapped to the relation \{zero, s(zero), ...\} instead of \{0, 1, ...\}.

These kinds of models are known as Herbrand models. The general definition closely mimics the earlier one, with the effect that all terms are mapped to themselves and that predicates are mapped to relations over the universe of terms (the Herbrand universe). Since the only interesting part of a Herbrand model is the mapping of predicates, it’s often presented as a set that enumerates the values for which the predicates should be true. In our example, which is in fact the least Herbrand model, this was \{natural(zero), natural(s(zero)), ...\}. As should be evident by now this is precisely the set that we already had in mind as programmers: it’s the intended meaning of the program.

But was this just a fluke? Could there be cases where it’s impossible to find a Herbrand model even though there exists some other model? Fortunately the answer is no: if there exists a model for a logic program then there also exists a Herbrand model (given any model, the set of ground atoms true in it forms one). This provides a theoretical justification for both SLD- and unit-resolution: in the first case we use it in our search for unsatisfiable sets of ground clauses from the Herbrand universe; in the second case we actually build the least Herbrand model from the bottom up.

Prolog

And with that awful pun, we’re off. The good news is that it can only get better from here. While I can’t claim to have any grand plans regarding the content of the blog, my aim is to cover both practical and theoretical issues in the realm of logic and logic programming. From this premise it immediately follows that “The blogging of logic programming” would probably have been a more accurate name for the blog, since I’ll leave the constricted domain of Prolog quite often. Oh well.

I won’t make any special assumptions about the reader’s level of knowledge, but I’ll try to keep a didactic and pedagogical approach, for the simple reason that otherwise I probably wouldn’t understand a post myself once the coffee-induced frenzy wore off.