Rosio Pavoris a blog

Search trees are easy

A decent proportion of my readers are noobie programmers or people who aren’t in a position to receive a formal CS education, so I thought I’d cover the basics of a fundamental concept most people cover in their first semester of algorithms or AI today: search trees. The fact that my college considers this to be third-year material so advanced they cannot in good faith make the class compulsory is neither here nor there.

Consider the famous problem of the farmer who wants to cross a river with his fox, goose, and grain, though the only boat can only carry himself and one of these three possesions. Ignore for a moment why a farmer would own a fox, and let’s stretch credibility a bit more by assuming that while the fox and goose are well-trained enough not to wander off in the absence of the farmer, they are not trained not to eat the goose or the grain, respectively, in said absence. How can he safely get to the other side without losing his goose or grain?

Oh noes!

A straight-forward way to view this is by having a state, and a number of rules of production, which will transform this state into a new one.
The state, of course, is just on which riverbank everyone is at any given point. The rules of production are just all possible moves:

  1. The farmer can take the fox to the opposite bank.
  2. The farmer can take the goose to the opposite bank.
  3. The farmer can take the grain to the opposite bank.
  4. The farmer can go to the opposite bank by himself.

Note that not every rule can be applied to every state. The farmer can’t take anything to the opposite bank if he isn’t on the same side of the river as the thing he’s trying to move, and he can’t do anything if it would leave the fox alone with the goose, or the goose alone with the grain.

We’ll represent states as nodes in a tree, and we’ll let the vertices of that tree be applications of rules of production. Let’s now consider what our options are from our starting state:

Rule 1 can’t be applied, because it would leave the goose alone with the grain. Rule 3 can’t be applied because it would leave the fox alone with the goose. Rule 4 can’t be applied because it would leave the fox with the goose and the goose with the grain. So our options are actually pretty limited:

For the next step, we could just take the goose back to the other bank, but that would put us back at our starting state. Let’s also not allow any move that puts us in a state we’ve been in before, because there’s no point in them.
The farmer can just go to the original bank by himself, and it turns out that’s the only valid move, because obviously he can’t move the fox or the grain because they’re on the opposite side of the river.

Pretty boring so far, but now we at least have two valid moves to choose from: he could take the fox across, or the grain. Rule 2 can’t apply because the goose is across the river, and rule 4 can’t apply because it brings us back to a state we’ve been in before.
Our tree now looks like this:

At this point and for this problem it doesn’t matter too much which path we take, so let’s go with the one where he takes the fox across.
Our state at this point is grain on the left bank, and farmer, fox, and goose on the right. What are our options now? Rule 1 can’t apply because that brings us back to a previous state. Rule 2 can apply. Rule 3 can’t apply because the grain is on the opposite bank. Rule 4 can’t apply because the fox would be left alone with the goose.

Which state do we continue from now? We could go to S4, to get all nodes on that level out of the way first, or we could continue with S5, and see how far down we can go. This is the distinction between a breadth-first and a depth-first search of the tree. Let’s go with depth-first, for now.
On the left bank we have the farmer, the goose, and the grain. The right only has the fox. The only rule that can apply is rule 3, of course, which leaves us with the goose on the left, and the farmer, fox, and grain on the right.

Continuing along, we have two valid moves: rule 1 and rule 4.

From S7 we can’t do anything that wouldn’t put us back in a state that we’ve already seen before, so it’s a dead end. We’ll abandon this branch and continue with S8 instead. From there the only valid move is rule 2, at which point we’ve gotten all of our things on the right side of the river, and we’re done.

Hurray!

Turning this into a computer algorithm is simple. Let’s use Python.
First we need a way to represent state. For clarity, I’ve opted for a dictionary with four keys, and boolean values, True meaning left bank, False meaning right. It’d be much more efficient to use an array, or you could even use half a byte, but clarity trumps efficiency at this point.
Our starting state is this:

> start = {"farmer": True, "fox": True, "goose": True, "grain": True}

Our goal state is of course the inverse:

> goal = {"farmer": False, "fox": False, "goose": False, "grain": False}

Our rules of production are similarly straightforward. I’m using classes with three methods: apply, can_apply, and describe (for clarity). We’ll put them in a list, and because list indices start at 0, the numbers of our rules will too:

> class Rule0():
>     num = 0
>
>     def apply(self, state):
>         return {"farmer": not state["farmer"],
>                 "fox": not state["fox"],
>                 "goose": state["goose"],
>                 "grain": state["grain"]}
>
>     def can_apply(self, state):
>         return state["farmer"] == state["fox"] and \
>                state["goose"] != state["grain"]
>
>     def describe(self):
>         print "Farmer moves fox to the opposite bank."
>
> class Rule1():
>     num = 1
>
>     def apply(self, state):
>         return {"farmer": not state["farmer"],
>                 "fox": state["fox"],
>                 "goose": not state["goose"],
>                 "grain": state["grain"]}
>
>     def can_apply(self, state):
>         return state["farmer"] == state["goose"]
>
>     def describe(self):
>         print "Farmer moves goose to the opposite bank."
>
> class Rule2():
>     num = 2
>
>     def apply(self, state):
>         return {"farmer": not state["farmer"],
>                 "fox": state["fox"],
>                 "goose": state["goose"],
>                 "grain": not state["grain"]}
>
>     def can_apply(self, state):
>         return state["farmer"] == state["grain"] and \
>                state["fox"] != state["goose"]
>
>     def describe(self):
>         print "Farmer moves grain to the opposite bank."
>
> class Rule3():
>     num = 3
>
>     def apply(self, state):
>         return {"farmer": not state["farmer"],
>                 "fox": state["fox"],
>                 "goose": state["goose"],
>                 "grain": state["grain"]}
>
>     def can_apply(self, state):
>         return state["fox"] != state["goose"] and \
>                state["goose"] != state["grain"]
>
>     def describe(self):
>         print "Farmer moves to the opposite bank."
>
> rules = [Rule0(), Rule1(), Rule2(), Rule3()]

Our algorithm will finish with a list of rule numbers to be applied in order, so let’s have a thing that can translate that into readable English, using our describe method from above.

> def long_print(steps):
>     state = start
>
>     for s in steps:
>         rules[s].describe()
>         state = rules[s].apply(state)
>
>         print "On the left bank: ",                         \
>               " ".join(map(lambda (a, b): a,
>                            filter(lambda (a, b): b,
>                                   state.items()))) +        \
>               ("(none)" if not any(state.values()) else "")
>         print "On the right bank:",                         \
>               " ".join(map(lambda (a, b): a,
>                            filter(lambda (a, b): not b,
>                                   state.items()))) +        \
>               ("(none)" if all(state.values()) else "")
>         print

Each node will be represented as a tuple consisting of the steps required to reach that node, and the state after all those steps. It’s obvious that we’ll need a queue of nodes still to examine, which will initially just have the starting node, which takes no steps to reach and has a state equal to the starting state.

> queue = [([], start)]

We’ll also need a list of visited states, to avoid loops.0 And let’s also have a counter that keeps track of how many nodes we’ve visited.

> visited = [start]
> steps = 0

Now for the main loop of our algorithm. We’ll be adding things to a queue and taking things out and looking at them, so a good termination condition is just an empty queue. This means we’ll have looked at all possible nodes in the search tree without success.

> while len(queue) > 0:

Every iteration, we’ll take a node from the front of the queue, and compose a list of nodes we can reach from there.

>     path, state = queue[0]
>     new_nodes = []

We’ll look at each rule to see if it can be applied. If it can, we’ll first see if it gets us to our goal, in which case we’re done. Otherwise, we make sure the state reached isn’t one we’ve seen before, and if not, we add it to our list of new nodes, and add the new state to our list of seen states.

>    for rule in rules:
>        if rule.can_apply(state):
>            n_state, n_path = rule.apply(state), path + [rule.num]
>
>            if n_state == goal:
>                print "Solution found in %d iterations:" % steps, n_path
>                print
>                long_print(n_path)
>                exit(0)
>
>            if not any([n_state == s for s in visited]):
>                new_nodes.append((n_path, n_state))
>                visited.append(n_state)

Then we add the new nodes to the queue. How we add them is important, of course. If we add them to the front, we have the depth-first search we did above. If we add them to the back, you can see we have a breadth-first search. Other things we could do is sort the new nodes based on some heuristic, though that isn’t useful for this particular problem. We could also just add the nodes and sort the entire queue according to a heuristic. Finding the right heuristic for a given problem is an interesting exercise, but not in this case.
Since we’re keeping it simple, we’ll just add them to the front (and remember to remove the node we just looked at).

>     queue = new_nodes + queue[1:]
>     steps += 1

And that’s really it. In this case we know there’s a solution, but for problems where there isn’t, the loop will exit when the entire search tree has been explored, so let’s tell people when that’s happened too.

> print "No solution found!"

Code is here, if you like.
It goes without saying that this algorithm can be applied to any problem where there’s an unambiguous and exhaustive way to represent possible states and a clear starting and ending state, and where it’s possible to enumerate a complete set of production rules.

An important thing to remember about this algorithm is that it finds a solution, not necessarily an optimal one. In this case the solution happens to be optimal because it’s the only (loop-free) one (remove the exit(0) to see for yourself), but there’s no guarantee that that will happen. You can sometimes improve the expected quality of your results through heuristics, and there are ways to turn this into an optimal search, but all of that is beyond the scope of this post. If the cost of a state transition is equal in every case (as it is in our toy problem), breadth-first search is optimal, of course.
Note that this algorithm is complete. If there is a solution, it will be found, subject to certain conditions. For depth-first search, for instance, it’s important that there are no branches of infinite depth, or the thing might not terminate. Breadth-first doesn’t have that limitation, but it has the condition that the branching factor of the search tree be finite (that there’s no node from which an infinite number of steps could be taken; that would be hard to represent in our rules, though, so no worries).

Depth-first and breadth-first basically aren’t very great algorithms, though they perform very well for their simplicity. DFS has a worst-case time complexity of O(bd), where b is the branching factor and d is the depth of the tree. BFS does better1 with O(bm), where m is the depth the first solution is at. On the other hand, DFS has a worst-case memory use of O(d*b), which is a lot better than BFS’s O(bm).
You can do significantly better with algorithms like iterative deepening, which is just a repeated DFS up to a depth n, where n increases with every iteration until a solution is found (which doesn’t sound very sensible, but actually is; excepting degenerate situations, most of the time will be spent in the last iteration anyway, and the time spent on previous iterations is comparatively minute), but the real money is in algorithms that can be adjusted based on outside knowledge of the system. It doesn’t really apply in this situation, but in many cases, you can have a decent idea of which branches of the search tree are more promising than others, and you want to take those first.
That’s where heuristics come in, and if there’s any demand, I’ll cover that at some point in the future.


0 There are actually better ways to avoid loops, and for some algorithms (like, indeed, the breadth-first variety of this one), loops aren’t even a problem. We’re just using a list of all states here because our state space is really small. There’s a total of 24 possible states, and many of those are invalid, reducing the maximum length that list could be in this case to just 10. Another way to do this would be to just look at the states of the nodes in the queue and forget about our list, which usually works adequately for real problems.

1 In the worst case, but not for our toy problem, you’ll find.

6 Comments

  1. Anonymous said,

    >A decent proportion of my readers are noobie programmers or people who aren’t in a position to receive a formal CS education,

    …how did you know?

  2. Robert Peaslee said,

    I’d appreciate a follow-up post on searching with heuristics.

    This post presents the information in a much more concise and accessible way than it was in my CS program. Thanks!

  3. Thanassis said,

    When running your code as-is, I get a solution in 7 steps.
    When I switch to the Breadth First Search (uncommenting line 112 and commenting 111) the solution is found in 8 steps.

    I thought that BFS is optimal – what’s going on?

  4. Thanassis said,

    My mistake – BFS finds the optimal solution, but not necessarily with the least amount of steps.

    Very nice tutorial – keep it up.

  5. Anonymous said,

    > A decent proportion of my readers are noobie programmers or people who aren’t in a position to receive a formal CS education,

    Xarn takes a class in AI, decides he is better than his fellow /prog/riders.

  6. Cairnarvon said,

    Feeling insecure, are we?

Post a Comment

RSS feed for comments on this post · TrackBack URL