Thursday, September 26, 2019

Beatboxing and the Shannon number

There is a famous number called the Shannon number, which "estimates" the number of possible games of chess. In mathematics and computer science, when you estimate something, it doesn't necessarily mean you find a number that's fairly close to it. It means that you compare it to something where you can say with absolute certainty whether it is greater or smaller. If you're thinking about how long it would take for a snail to crawl to the moon, one estimate is that it must take at least one second, since the moon is more than one light-second away.

In the same spirit, Claude Shannon pointed out in a pioneering paper on computer chess that the number of possible games of chess is at least $10^{120}$. Let's write that out just for fun:

1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.

As we will see, Shannon's estimate is off by several thousands of orders of magnitude. In comparison, given that it took only a few hundred million years for slimy creatures to evolve into astronauts, the snail versus speed of light estimate is pretty good, just 16 or 17 orders of magnitude away from best possible.
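Just to sanity-check those orders of magnitude, here is a quick Python sketch. The timescale plugged in is my own rough reading of "a few hundred million years"; nothing hinges on the exact value.

```python
import math

# Rough numbers only: take "a few hundred million years" as 3e8 years.
seconds_per_year = 3.15e7
evolution_time_in_seconds = 3e8 * seconds_per_year

# The estimate was "at least one second", so the gap in orders of magnitude is:
print(math.log10(evolution_time_in_seconds / 1.0))   # roughly 16
```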

The Shannon number turns out not to be that important to the question of how hard chess is. But a while ago it turned up in another context: people asking Siri to compute one trillion to the tenth power, which is $(10^{12})^{10} = 10^{120}$!

Shannon's number was mentioned only in passing in his paper, but it went viral, as we would say today.

In Shannon's calculation, a game of chess means a sequence of moves from the start to the end of the game, like for instance 1. e2-e4  e7-e5, 2. Bf1-c4 Bf8-c5, 3. Qd1-h5 Ng8-f6, 4. Qh5xf7#. That's one game in the space of all chess games, and another one is 1. e2-e4  e7-e5, 2. Qd1-h5 Bf8-c5, 3. Bf1-c4 Ng8-f6, 4. Qh5xf7#. Even though they reach the same position after White's third move, they get there through different move orders and are therefore counted as two different games. On the other hand it doesn't matter how many times the same sequence of moves has been played throughout chess history. If it's exactly the same sequence, it counts as one game.

The assumption in Shannon's paper is that the 50-move rule and the rule of threefold repetition ensure that a game of chess cannot go on forever, and that therefore the number of possible games is finite. This isn't strictly valid under standard tournament play, since one of the players must claim a draw in order for the rules to apply. But let's go with the assumption that a game ends as soon as the same position is repeated three times or 50 moves (100 consecutive "ply") are made without a capture or a pawn move.

Shannon's idea was that a normal game lasts about 40 moves (40 for each player, so 80 ply), and that in an ordinary position there are about 30-35 move options. That gives roughly 1000 possibilities for a pair of a white and a black move, and $1000^{40} = 10^{120}$ is Shannon's number.
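The arithmetic is easy to check in a line of Python:

```python
# About 1000 options for each pair of a white and a black move,
# and about 40 such pairs in a typical game:
print(1000 ** 40 == 10 ** 120)   # True
```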

What he wanted to point out was just that the obvious algorithm of recursively tracing through the whole game tree of chess isn't practical. He must surely have been aware that the estimate of $10^{120}$ is ridiculously off the mark. The true number of games is way, way, way larger. Notice that the 50-move rule and the rule of threefold repetition didn't even enter Shannon's calculation, and without them the number of games would have been infinite.

Anyway, let's imagine that we are navigating through the space of all chess games, inspecting various specimens at random. What do we see? The first thing we notice is that the games don't make sense from a player's point of view, since the pieces just move aimlessly. The next thing that strikes us is that most games are really long. This is because there are many more different ways of playing hundreds of moves than there are ways of playing 30 or 40 moves.

A back-of-the-envelope calculation shows that if we start by pushing all the white pawns to the third rank and the black pawns to the sixth rank to create some space, and then just shuffle the pieces around, pushing a pawn every 50 moves, we can keep going for well over 2000 moves while capturing only two white and two black pawns. Even though this puts some restrictions on the number of move options available, we can make sure that each piece except the pawns always has at least one move option. Avoiding threefold repetition is a little bit of a hassle, but still there must easily be more than $10^{4000}$ different games.
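To see roughly where a figure like $10^{4000}$ comes from, here is a minimal sketch; the "ten usable options per ply" is an illustrative assumption of mine, not a number from the argument above.

```python
import math

# Well over 2000 moves of shuffling, i.e. over 4000 ply, with (say) at least
# ten moves to choose from at each ply once the pawns have made some room:
plies = 2 * 2000
assumed_choices_per_ply = 10

print(plies * math.log10(assumed_choices_per_ply))   # 4000.0, so at least about 10**4000 games
```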

An equally rough estimate (!) shows that the games that last at most 700 moves don't give more than $10^{3500}$ possibilities, meaning they are outnumbered by the longer games by a mind-boggling factor of at least $10^{500}$. According to a list of world records in chess, the longest known tournament game lasted for 269 moves. But as we fly through the space of all chess games, we never encounter a game as short as that one.
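The same kind of sketch gives the bound for the short games. Here I've assumed a cap of a couple of hundred legal moves in any position (218 is the largest figure I've seen cited for a constructed position, but any bound of that size works).

```python
import math

# Games of at most 700 moves, i.e. at most 1400 ply, with at most a couple of
# hundred legal moves to choose from at every ply:
max_plies = 2 * 700
assumed_max_moves_per_position = 218

print(max_plies * math.log10(assumed_max_moves_per_position))  # about 3274, below 10**3500

# So the long shuffling games outnumber the short ones by a factor of at least
# 10**(4000 - 3500) = 10**500.
```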

And how does a "typical" game start? In tournament play, 1. e2-e4 has been the most common first move throughout most of chess history, with 1. d2-d4 being about equally popular in the last century. But from our spaceship, we don't see a single game starting with a pawn move! The games we see from our window all start with White making one of the four possible knight moves, and Black responding similarly with a knight.

This is because for every game that starts with some pawn move, for instance 1. e2-e4, there is a corresponding astronomical number of games that start with the players developing their knights, then moving them around for a while, finally returning them to their original squares just before move 50, and White only then playing e2-e4.

For instance, the famous Opera game started 1. e2-e4, and ended with White checkmating with 17. Rd1-d8#. That's one game. But it has 16 siblings that start with White and Black each making a knight move, then going back with the knights, and continuing 3. e2-e4, ... 19. Rd1-d8#. And it has several hundred cousins that start by the knights jumping around for another two moves before playing 5. e2-e4, ... , 21. Rd1-d8#, and so on. The number of ways that the knights can jump around (and rooks move back and forth) returning to the starting position at move 48 is astronomical (quiz: in how many ways can they return after move 49?).
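To get a feeling for how fast these maneuver counts grow, here is a much-simplified model in Python: a single knight alone on an empty board, so no rooks, no other pieces to bump into, and none of the bookkeeping the real count would need. Even in this stripped-down setting the number of round trips is already astronomical.

```python
def closed_knight_walks(start, n_moves):
    """Count the walks of a lone knight on an empty board that start and
    end on `start` and take exactly n_moves moves."""
    jumps = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]
    ways = {start: 1}                      # ways to stand on each square after k moves
    for _ in range(n_moves):
        nxt = {}
        for (r, c), w in ways.items():
            for dr, dc in jumps:
                nr, nc = r + dr, c + dc
                if 0 <= nr < 8 and 0 <= nc < 8:
                    nxt[(nr, nc)] = nxt.get((nr, nc), 0) + w
        ways = nxt
    return ways.get(start, 0)

# White's king's knight starts on g1, which is (rank index 0, file index 6).
# Let it wander for 48 moves and come home:
print(closed_knight_walks((0, 6), 48))
# An astronomically large number -- and that's a single knight, ignoring everything else.
# (With 49 moves the count is 0 for a lone knight, since every jump changes the
# colour of its square -- a hint for the quiz above.)
```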

So the typical citizen in the space of all chess games is a game where the first 45 or so moves by each player are made by knights and rooks, and where in the next few hundred moves, a pawn is pushed only about every 50 moves.  

So given that there are more than $10^{4000}$ games, does this mean that to play perfect chess we would need a computer capable of performing that many logical operations?

Actually no. We could solve chess completely with far fewer than $10^{120}$ operations, so in that sense, the Shannon number is completely off in the other direction too!

This is because the number of chess positions is much smaller than the number of games. To see this, notice that in a chess position, each of the 64 squares can be in only 13 different states: empty, or occupied by a black or white piece of one of the six types. In addition, we need to keep track of who is about to move, castling rights, and possible en passant captures. But all this can be encoded if we let the squares have 14 states instead of 13. Even without taking into account that each player must have exactly one king, at most 16 pieces and so on, we easily estimate (!) the number of chess positions to be at most $10^{74}$.
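The arithmetic behind that bound, for the record:

```python
import math

# 64 squares, each in one of 14 states (empty, 6 white piece types, 6 black
# piece types, plus one extra state to encode side to move, castling rights
# and en passant):
print(math.log10(14 ** 64))   # about 73.35, so at most 10**74 positions
```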

The idea that a chess endgame can be understood by considering just the positions it can lead to, rather than the much larger number of lines of play, is what lies behind endgame databases. But long before computers it was implicit in the concept of corresponding squares. In particular, it's interesting that the winning strategy from the famous Lasker-Reichhelm position can be completely understood by a human, despite the vast number of possible lines of play.

Even $10^{74}$ is a very rough upper bound, and in order to play perfect chess we probably don't have to examine more than a tiny fraction of the space of all positions. Presumably there is a much smaller set of key positions that would suffice for a complete "endgame database" from the starting position.

At a dinner a couple of years ago I discussed with Anders Sandberg whether solving chess would be feasible. We agreed that yes, we probably have enough resources already in our own solar system! We just need to send a swarm of self-replicating robots to Jupiter, transforming the planet into a giant chess computer, and most likely we'll get the answer!