An Adventure Game – Part 2

When we left off last time we were able to read input from the player and tag the atoms with their word classes. This time we’ll use this information to analyze what the input means with two different techniques: pronominal anaphora resolution and chunk-based parsing. Don’t worry about the technical names – I’m way to lazy to do something that would actually be strenuous, so the presented solutions won’t be that hard to grasp. Let’s have a look at some typical user input and some potential answers:

> look

You are in a dimly lit room. A banana hangs from the ceiling in a thin, plastic wire.

> look at the banana

It’s a typical banana. Yellow, crooked and utterly delicious. Mmm.

> eat it

The banana is out of reach.

> grab the banana and eat it

Not even your long, hairy ape-arms are enough to bridge the vast gap.

> fling feces around the room

I’m sorry. I don’t know what ‘fling’ means.

All these sentences have something in common: they are imperative in nature and somehow either changes the state of the game world or inspects it. We’ll make the basic assumption that every command can be described as a tuple, C-Args, where C is an atom and Args is a list of tags. The “look”-command from the example can be represented as:

command(look, [entity]).
command(look, []).

Since a banana is an entity, the sentence “look at the banana” can be parsed as a look-command with “banana” as argument. It can also be parsed simply as “look”, with no argument at all, but the first interpretation is preferable. There is however one problem that we must tackle before the parsing can take place: what does “it” mean? That depends on the context and is the subject of pronominal resolution.

Pronominal resolution

Just like everything else in natural language processing, resolving pronouns is quite challenging in the general case. Fortunately we’re dealing with a rather restricted subset of English. When a pronoun such as “it” is used we can make the assumption that it’s just a shorthand for a previously identified entity. The task then becomes to replace all occurrences of a pronoun with the corresponding entity before the parsing. Let’s have a look at a few examples:

> take the banana and eat it

Here the pronoun refers to the banana, i.e. the sentence should be replaced by “take the banana and eat the banana”.

> look at the ceiling and take the banana and eat it

Here the pronoun probably refers to the banana, but it could possibly also refer to the first entity: the ceiling. Since we always want the possibility to backtrack and try alternative solutions in case the first one fails later on, it’s a good idea to make the predicate non-deterministic. Assuming that we’re given a list of tagged atoms, a simple algorithm may look as follows:

  • If the list of tagged atoms is non-empty, inspect the first element.
  • If it’s not a pronoun, ignore it and continue with the rest of the list.
  • If it’s a pronoun, replace the pronoun with the corresponding entity, where the candidates of possible entities are the tagged atoms that occur to the left of the pronoun in the sentence.

Putting this into code is straightforward. The predicate will have three arguments: the list of tagged atoms, the tagged atoms that have been processed so far and the result.

    resolve_pronouns(ATs, Xs) :-
        resolve_pronouns(ATs, [], Xs).

    resolve_pronouns([], _, []).
    resolve_pronouns([A-T|Xs], Pre, [Y|Ys]) :-
        (   T = pronoun ->
            resolve_pronoun(A, Pre, Y),
            resolve_pronouns(Xs, Pre, Ys)
        ;   Y = A-T,
            resolve_pronouns(Xs, [Y|Pre], Ys)
        ).

resolve\_pronoun/2 is also a walk in the park. For now, we’re only going to be dealing with “it”. Other pronouns can be added in a similar manner.

    %% NAME:
    %%  resolve_pronoun(Pronoun, AtomTags, X)
    %% DESCRIPTION:
    %% True if X is an entity corresponding to the pronoun resolved
    %% with the help of AtomTags. It is the caller's responsibility to
    %% make sure that AtomTags is in the correct (possibly reversed)
    %% order.
    resolve_pronoun(it, Xs, X) :-
        X = A-entity,
        list::member(X, Xs).

Nota bene: since the words are pushed onto Pre in LIFO order, a pronoun will be resolved from right-to-left. While this algorithm is indeed very simple, it’s not particularly efficient. Due to the call to member/2 in resolve\_pronoun/2, the worst case execution time (to find one solution) for resolve\_pronouns/2 is exponential. But considering the fact that a typical sentence is just a few words long and that they rarely if ever contains more than one or two pronouns, I think we’ll survive.

One step remains. Note that “it” is used in one of the example sentences without referring to anything in that particular sentence. Instead it refers to the entity that was used in the previous sentence. The obvious yet quite flexible solution is to augment resolve\_pronouns/2 with an additional argument that holds the previous sentence. Then a pronoun can first be resolved with respect to the current sentence, but if that fails the previous sentence is tried instead.

Parsing

Natural language contains a lot of noise and variations. This makes formal methods such as context-free grammars somewhat ill-suited and cumbersome to use. One deceivingly simple solution is to ignore everything that we’re not interested in and group the good stuff into chunks. A command, e.g. “eat”, can be considered a chunk. We know that it takes one entity-argument, so after encountering the command chunk we scan the rest of the sentence for an argument chunk of the correct type. This might sound like cheating – and it kind of is – but as long as the reduced, simpler problem still yields the correct output that’s hardly of any importance.

The goal is to produce one or more commands given a list of tagged atoms (where the pronouns are resolved). Remember that commands are defined as:

        command(look, [entity]).
        command(look, []).
        command(eat, [entity]).

To allow some flexibility in the input, we’re also going to allow synonyms. These can be defined as additional facts.

    variants(look ,[look, inspect]).
    variants(eat, [eat, munch, digest]).

For every chunk of type X we’re going to write a corresponding X\_chunk predicate that has 3 arguments: the list of tagged atoms, the list that remains after this chunk is parsed, and the resulting chunk. First off is the sentence chunk.

    %% NAME:
    %%  parse_atoms(+AtomTags, -Remainder, -Commands)
    %% DESCRIPTION:
    %% True if Commands is a list of commands parsed from from the atoms and their
    %% tags in AtomTags.
    parse_atoms(ATs, Remainder, Commands) :-
        sentence_chunk(ATs, Remainder, Commands).

    %% NAME:
    %%  sentence_chunk(+AtomTags, -Remainder, -Commands)
    %% DESCRIPTION:
    %% True if Commands is a list of commands parsed from from the atoms and their
    %% tags in AtomTags. The remainder, i.e. the chunk after the
    %% sentence, is stored in Remainder (most likely just the empty list).
    sentence_chunk(ATs, Remainder, [C-Args|Cs]) :-
        command_chunk(ATs, ATs1, C),
        argument_chunk(ATs1, C, ATs2, Args),
        (   conjunction_chunk(ATs2, Remainder0, _) ->
            sentence_chunk(Remainder0, Remainder, Cs)
        ;   Cs = []
        ).

    conjunction_chunk([Conj-conjunction|ATs], ATs, Conj-conjunction).

Which shall be read as “first parse a command followed by its argument, then if the arguments are followed by a conjunction, recursively parse the rest of the list”. The command chunk is slightly more involved since we have to deal with variants of commands.

    %% NAME:
    %%  command_chunk(+AtomTags, -Remainder, -Command)
    %% DESCRIPTION:
    %% True if Command is the command parsed from the atoms and their
    %% tags in AtomTags. The remainder, i.e. the chunk after the
    %% command, is stored in Remainder.
    command_chunk(ATs, ATs1, C) :-
        %% First find a verb, i.e. a potential command.
        list::append(_, [C0-verb|ATs1], ATs),
        %% Then check whether or not the verb is a variant of a known
        %% command.
        game_logic::variants(C, Cs),
        list::member(C0, Cs).

Note the use of append/3 to ignore everything in the list up to a verb. When parsing the arguments we scan the input and look after atoms of the correct type.

    %% NAME:
    %%  argument_chunk(+AtomTags, +Command, -Remainder, -Args)
    %% DESCRIPTION:
    %% True if Args are the arguments corresponding to the arity of
    %% Command with respect to the atoms and their tags in
    %% AtomTags. The remainder, i.e. the chunk after the last
    %% argument, is stored in Remainder.
    argument_chunk(ATs, C, ATs1, Args) :-
        game_logic::command(C, Tags),
        matches(ATs, Tags, ATs1, Args).

    %% NAME:
    %%  matches(+AtomTags, +Tags, -Remainder, -Args).
    %% DESCRIPTION:
    %%  True if the list of atoms and their corresponding tags matches
    %%  Tags, i.e. there exists a sequence of atoms, not necessarily
    %%  in a a direct linear sequence, such that their tags can be
    %%  mapped to the tags in Tags.
    matches(ATs, [], ATs, []).
    matches(ATs, [T|Ts], Remainder, [A|As]) :-
        list::append(_, [A-T|ATs1], ATs),
        matches(ATs1, Ts, Remainder, As).

That’s all the constructs that we’re going to support at the moment. Two remarks should be made before we move on:

  • The chunk predicates can be rewritten as DCGs instead since they’re essentially just threading state. I’ll leave this as an exercise to the reader!
  • argument\_chunk/4 is given the ability to ignore everything that it can’t make sense of. While this is good in many circumstances, it also has the effect that some ill-formed sentences will be incorrectly parsed. For example, consider the sentence “eat the bnana and look at the room”. Assuming that bnana is tagged as unknown, the sentence will be parsed as eat-room since room is the only entity that argument\_chunk/4 discovers. One possible solution is to restrict the argument parsing to the current clause and disallow it to continue past conjunctions.

Putting everything together

Now we’re able to dissect the meaning of the user input and extract commands from it. The next task is to actually carry out the commands and update the state of the game world, but for the moment we’re just going to print the parsed commands to verify that everything works. The steps are the following:

  • Read a line and convert it to atoms.
  • Tag the atom list.
  • Resolve the pronouns.
  • Parse the resulting list of tagged atoms.

And it’s rather easy to cobble together.

    init :-
         write('Welcome to Bacchus-Bosch!'), nl,
         current_input(S),
         repl(S, []).

    repl(S, History) :-
        write('> '),
        nlp::parse_line(S, Atoms),
        write('The input is: '),
        meta::map([X] >> (write(X), write(' ')), Atoms), nl,
        nlp::tag_atoms(Atoms, AtomTags),
        write('The tagged input is: '),
        meta::map([X] >> (write(X), write(' ')), AtomTags), nl,
        (   eval(History, AtomTags) ->
            true
        ;   write('I''m sorry, Dave. I''m afraid I can''t do that'),
            nl
        ),
        repl(S, AtomTags).

    eval(History, AtomTags) :-
        nlp::resolve_pronouns(History, AtomTags, AtomTags1),
        eval_commands(AtomTags1).

    eval_commands(AtomTags) :-
        nlp::parse_atoms(AtomTags, _, Commands),
        write('The parsed commands are: '), nl,
        meta::map([X] >> (write(X), nl), Commands).

If we feed this program with the example sentences we end up with the following dialogue:

Welcome to Bacchus-Bosch!

> look

The input is: look
The tagged input is: look-verb
The parsed commands are:
look-[]

> look at the banana

The input is: look at the banana
The tagged input is: look-verb at-preposition the-article banana-entity
The parsed commands are:
look-[banana]

> eat it

The input is: eat it
The tagged input is: eat-verb it-pronoun
The parsed commands are:
eat-[banana]

> grab the banana and eat it

The input is: grab the banana and eat it
The tagged input is: grab-verb the-article banana-entity and-conjunction eat-verb it-pronoun
The parsed commands are:
take-[banana]
eat-[banana]

> fling feces around the room

The input is: fling feces around the room
The tagged input is: fling-unknown feces-unknown around-preposition the-article room-entity
I’m sorry, Dave. I’m afraid I can’t do that

Source code

The source code is available at https://gist.github.com/862464.

Due to a hefty amount of school-work (I should actually be crunching theorems in number theory right now!) part 3 will probably be delayed one or two weeks.

Advertisements

3 responses to “An Adventure Game – Part 2”

  1. Jensan says :

    Great write-up! The chunk-technique was especially interesting. It would be interesting to write a scripting language using this approach…

    • victorlagerkvist says :

      My pleasure! I think, but I might be wrong here, that the reason that chunk-based parsing is not used with programming languages is because it would most likely lead to the birth of Skynet in the long run. (Or even worse: computer programs that spews nonsensical natural language sentences instead of doing anything productive. Heaven forbid!).

Trackbacks / Pingbacks

  1. An Adventure Game – Part 4 | The Blogging of Prolog - April 3, 2011

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: