Bad, vile and meaningless: Why Perl rocks from Alan's clob

An article today on Slashdot was about glotski-style game. The purpose in this game is to arrange free slots in a maze (similar to the game "fifteen") until you have succeeded in a goal (like, got the big 2x2 square from upper left corner percolated to the lower right corner).

Because I heard that this maze's a tough one, I wanted to see how tough and spent 1 minute converting the maze in my "maze format" file, and then ran it through my solver. After 0.22 seconds of user time, the solution dropped out, totalling 37 moves of pieces (some people say you can solve this in 25 moves, but that's because they count moves a bit differently: you can sometimes slide a piece 1 or 2 units into free space, and they count this as one move).

So, here's the maze as my program requires it:

aabb
aacd
  cd
effh
eggi
a--

Each piece is assigned unique letter. Thus the 2x2 square, called 'a', consists of 4 pieces of a placed on two rows. b forms a 2x1 piece. The final row is the victory condition. The game terminates when the condition is true. What exactly it means will be covered later on.

Any program begins with reading its input:

my ($xdim, $maze, $re) = _read_maze();

The $xdim is the width of the maze, here 4. The $maze is string representation of the maze, essentially the above picture. $re is the final line, and it's not in $maze.

Next, we figure out what pieces are precent by testing if any of the letters can be found in the $maze string:

my @pieces = grep { index($maze, $_) != -1 } 'a' .. 'z';

At this point, you can imagine that the solver is going to go through all the @pieces and try to move them all in different directions. This is the core loop of the code, run in function play_game:

foreach my $dir (qw(d r u l)) {
    foreach my $chr (@pieces) {
        if (my $maze = move_piece($maze, $chr, $dir)) {
            push @moves, [$maze, $moves.$dir.$chr];
        }
    }
}

The function move_piece does the work of testing if a new maze can be built after moving a piece (such as 'a' or 'i') one unit to the desired direction (given as letter, such as 'u' or 'l'). It also records our progress with the $moves . $dir . $chr expression that concatenates the moves led to this point with the moves it's now about to try. This is so that when we find a solution, we can just print out the moves that led to that point and exit.

After this loop is run, @moves hopefully contains some plausible mazes for further analysis:

while (@moves) {
    # if this returns a solution, hail it as victory
    my $ret = play_game(@{shift @moves});
    return $ret if $ret;
}

So, looking above, we see that we "posted" a note for ourselves by pushing a combination of values in @moves. Then, we call play_game on the new values to examine those positions further.

But wait! There must be a mechanism in place for us to constantly re-examining same trivial moves again and again. In my above maze, I can push 'a' down one notch, and then pull it up one notch in the next iteration, on and on to infinity. So, something must be there to stop the iteration. This is probably the trickiest thing yet.

Before play_maze agrees to check new configuration of pieces in $maze, it first checks from %cache whether we have seen this configuration before. Now, it does a few tricks to make it efficient. Firstly, it notices that pieces that have the same shape are equivalent -- that's what the business with $equivalent_pieces is all about. It for instance knows that 'c' and 'd' have same shape by having a common character to represent both pieces. (This char could be anything, but I chose the first of the identical occurences.) If you don't do this, you'll spend a lot of time with mazes that have many small 1x1 pieces. Their different permutations can really eat away your CPU for no good at all!

So, this mouthful of perl:

($a = $maze) =~ s/([^ ])/$equivalent_pieces->{$1} || ''/ges;
if (exists($cache{$a}) && length($moves) >= $cache{$a}) {
    return "";
} else {
    $cache{$a} = length($moves);
}

first assignes $maze to $a, then converts the redundant pieces to same letter, then checks from %cache whether we have seen this configuration before. It's also smart enough to test how many moves had we done last time before we reached that configuration, and it redoes all the work of a previous iteration if it discovers that there was a way to reach this arrangement of pieces again. It's a bit silly, but I don't know how to do it any better. The purpose of that is to ensure that we do find the shortest solution to any maze given to us.

Now I need to acquint you with the layout I chose for maze, and the exit condition that is stored in variable $re and is used as a regex against the maze. The code to test if we have solved the maze is as simple as:

# test victory condition
return $moves if $maze =~ $re;

If we remember our exit condition, we remember that it was 'a--'. So, somehow a-- will match when we have a located at the bottom right corner of the maze. Well, on to the _read_maze():

sub _read_maze {
    die "Usage: $0 <mazefile>" unless @ARGV;

    # read the maze
    chomp(our @maze = <>);
    die unless @maze;

    my $re = pop(@maze);
    $re = qr<$re>;

    my $xdim = length($maze[0]);
    my $ydim = @maze;
    foreach (@maze) {
        die "Maze row '$_' wrong length" unless length($_) eq $xdim;
    }
    return $xdim, join("-", "-" x $xdim, @maze, "-" x $xdim), $re;
}

So, it reads the maze from the first argument using perl's diamond operator, removes newlines, dies if the file was empty, picks the regex out from the last line and compiles it into a regex object, and then computes the width of the maze and ensures that the input was of suitable width on all rows (except the last that was removed above). Finally, it returns the maze as one string that looks like this:

-----aabb-aacd-...-----

There's one - between each consequtive row, and there's some - padding before and after the maze. This also means that a-- will match only when there's some 'a' at the bottom righmost square of the game. Kludgey but to-the-point.

The real jackpot is, however, the move_piece(). The moving of pieces is implemented with some xor + regex tricks. The regex is used to test whether a piece has only space (remember that they can be arbitrary shape) in where it's trying to move, and then the movement itself is performed using rather evil xor string manipulation tricks. Here's the whole thing, without further comments:

sub move_piece {
my ($tmp, $piece, $dir) = @_;

# create XOR mask that allows us to remove this piece from $tmp
# we could also do s/$piece/\00/g or course, but we need the $mask
# to perform the moving.
(my $mask = $tmp) =~ s/[^$piece]/\x00/g;
$mask =~ s/$piece/$piece ^ ' '/ge or die "Object '$piece' got lost in $tmp";

# now try do the move
if      ($dir eq 'r' && $tmp !~ /$piece(?=[^ $piece])/) {
    return $tmp ^ $mask ^ "\x00" . substr($mask, 0, length($mask) - 1);
} elsif ($dir eq 'l' && $tmp !~ /(?<=[^ $piece])$piece/) {
    return $tmp ^ $mask ^ substr($mask, 1);
} elsif ($dir eq 'u' && $tmp !~ /(?<=[^ $piece].{$xdim})$piece/) {
    return $tmp ^ $mask ^ substr($mask, $xdim+1);
} elsif ($dir eq 'd' && $tmp !~ /$piece(?=.{$xdim}[^ $piece])/) {
    return $tmp ^ $mask ^ "\x00" x ($xdim+1) . substr($mask, 0, length($mask) - ($xdim+1));
} else {
    return;
}
}

As we remember above, if the move_piece() fails, then the function returns nothing, and the main loop code can not pursue that tree for further moves.