Thursday, September 26, 2019

Beatboxing and the Shannon number

There is a famous number called the Shannon number, which "estimates" the number of possible games of chess. In mathematics and computer science, when you estimate something, it doesn't necessarily mean you find a number that's fairly close to it. It means that you compare it to something where you can say with absolute certainty whether it is greater or smaller. If you're thinking about how long it would take for a snail to crawl to the moon, one estimate is that it must take at least one second, since the moon is more than one light-second away.

In the same spirit, Claude Shannon pointed out in a pioneering paper on computer chess that the number of possible games of chess is at least $10^{120}$. Let's write that out just for fun:

1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.

As we will see, Shannon's estimate is off by several thousand orders of magnitude. In comparison, given that it took only a few hundred million years for slimy creatures to evolve into astronauts, the snail versus speed of light estimate is pretty good, just 16 or 17 orders of magnitude away from best possible.

The Shannon number turns out not to be that important to the question of how hard chess is. But a while ago it turned up in another context: People asking Siri to compute one trillion to the tenth power!

Shannon's number was just mentioned in a remark in his paper, but went viral, as we would say today.

In Shannon's calculation, a game of chess means a sequence of moves from the start to the end of the game, like for instance 1. e2-e4  e7-e5, 2. Bf1-c4 Bf8-c5, 3. Qd1-h5 Ng8-f6, 4. Qh5xf7#. That's one game in the space of all chess games, and another one is 1. e2-e4  e7-e5, 2. Qd1-h5 Bf8-c5, 3. Bf1-c4 Ng8-f6, 4. Qh5xf7#. Even though they reach the same position after White's third move, they get there through different move orders and are therefore counted as two different games. On the other hand it doesn't matter how many times the same sequence of moves has been played throughout chess history. If it's exactly the same sequence, it counts as one game.

The assumption in Shannon's paper is that the 50-move rule and the rule of threefold repetition ensure that a game of chess cannot go on forever, and that therefore the number of possible games is finite. This isn't strictly valid under standard tournament play, since one of the players must claim a draw in order for the rules to apply. But let's go with the assumption that a game ends as soon as the same position is repeated three times or 50 moves (100 consecutive "ply") are made without a capture or a pawn move.

Shannon's idea was that a normal game lasts about 40 moves (40 for each player, so 80 ply), and that in an ordinary position there are about 30-35 move options. That gives roughly 1000 possibilities for a pair of a white and a black move, and $1000^{40} = 10^{120}$ is Shannon's number.
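
For what it's worth, here is that back-of-the-envelope calculation in Python (the branching factor 33 is just my pick from Shannon's 30-35 range):

from math import log10

print(80 * log10(33))   # about 121.5, i.e. roughly Shannon's 10^120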

What he wanted to point out was just that the obvious algorithm of recursively tracing through the whole game tree of chess isn't practical. He must surely have been aware that the estimate of $10^{120}$ is ridiculously off the mark. The true number of games is way, way, way larger. Notice that the 50-move rule and the rule of threefold repetition didn't even enter Shannon's calculation, and without them the number of games would have been infinite.

Anyway, let's imagine that we are navigating through the space of all chess games, inspecting various specimens at random. What do we see? The first thing we notice is that the games don't make sense from a player's point of view, since the pieces just move aimlessly. The next thing that strikes us is that most games are really long. This is because there are many more different ways of playing hundreds of moves than there are ways of playing 30 or 40 moves.

A back-of-the-envelope calculation shows that if we start by pushing all the white pawns to the third rank and the black pawns to the sixth rank to create some space, and then just shuffle the pieces around, pushing a pawn every 50 moves, we can keep going for well over 2000 moves capturing only two white and two black pawns. Even though this puts some restrictions on the number of move options available, we can make sure that each piece except the pawns always has at least one move option. Avoiding threefold repetition is a little bit of a hassle, but still there must easily be more than $10^{4000}$ different games.

An equally rough estimate (!) shows that the games that last at most 700 moves don't give more than $10^{3500}$ possibilities, meaning they are outnumbered by the longer games by a mind-boggling factor of at least $10^{500}$. According to a list of world records in chess, the longest known tournament game lasted for 269 moves. But as we fly through the space of all chess games, we never encounter a game as short as that one.
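
If you want to redo these two estimates (under my rough assumptions: at least 10 options per ply in the long shuffling games, and generously at most about 320 options per ply in the shorter ones), the arithmetic is a one-liner each:

from math import log10

# Lower bound: 2000 moves = 4000 plies, at least 10 options per ply.
print(4000 * log10(10))    # 4000.0 -> more than 10^4000 long games

# Upper bound: at most 700 moves = 1400 plies, at most ~320 options per ply
# (positions with as many as 218 legal moves have been constructed).
print(1400 * log10(320))   # about 3507 -> at most ~10^3500 shorter games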

And how does a "typical" game start? In tournament play, 1. e2-e4 has been the most common first move throughout most of chess history, with 1. d2-d4 being about equally popular in the last century. But from our spaceship, we don't see a single game starting with a pawn move! The games we see from our window all start with White making one of the four possible knight moves, and Black responding similarly with a knight.

This is because for every game that starts with some pawn move, for instance 1. e2-e4, there is a corresponding astronomical number of games that start with the players developing their knights, then moving them around for a while, finally returning them to their original squares just before move 50, and White only then playing e2-e4.

For instance, the famous Opera game started 1. e2-e4, and ended with White checkmating with 17. Rd1-d8#. That's one game. But it has 16 siblings that start with White and Black each making a knight move, then going back with the knights, and continuing 3. e2-e4, ... 19. Rd1-d8#. And it has several hundred cousins that start by the knights jumping around for another two moves before playing 5. e2-e4, ... , 21. Rd1-d8#, and so on. The number of ways that the knights can jump around (and rooks move back and forth) returning to the starting position at move 48 is astronomical (quiz: in how many ways can they return after move 49?).

So the typical citizen in the space of all chess games is a game where the first 45 or so moves by each player are made by knights and rooks, and where in the next few hundred moves, a pawn is pushed only about every 50 moves.  

So given that there are more than $10^{4000}$ games, does this mean that to play perfect chess we would need a computer capable of performing that many logical operations?

Actually no. We could solve chess completely with far fewer than $10^{120}$ operations, so in that sense, the Shannon number is completely off in the other direction too!

This is because the number of chess positions is much smaller than the number of games. To see this, notice that in a chess position, each of the 64 squares can be in only 13 different states: empty, or occupied by a black or white piece of one of the six types. In addition, we need to keep track of who is about to move, castling rights, and possible en passant captures. But all this can be encoded if we let the squares have 14 states instead of 13. Even without taking into account that each player must have exactly one king, at most 16 pieces and so on, we easily estimate (!) the number of chess positions to be at most $10^{74}$.
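
The bound is easy to check:

from math import log10

print(64 * log10(14))   # about 73.35, so at most 10^74 positions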

The idea that a chess endgame can be understood by considering just the positions it can lead to, rather than the much larger number of lines of play, is what lies behind endgame databases. But long before computers it was implicit in the concept of corresponding squares. In particular, it's interesting that the winning strategy from the famous Lasker-Reichhelm position can be completely understood by a human, despite the vast number of possible lines of play.

Even $10^{74}$ is a very rough upper bound, and in order to play perfect chess we probably don't have to examine more than a tiny fraction of the space of all positions. Presumably there is a much smaller set of key positions that would suffice for a complete "endgame database" from the starting position.

At a dinner a couple of years ago I discussed with Anders Sandberg whether solving chess would be feasible. We agreed that yes, we probably have enough resources already in our own solar system! We just need to send a swarm of self-replicating robots to Jupiter, transforming the planet into a giant chess computer, and most likely we'll get the answer!

Tuesday, August 20, 2019

Testing the chess clock!

In my last blog post I described a new type of chess clock, or rather a new type of time limit: the double-flag system. In short, this is a format of time control where bonus time can only be used in order to obtain a draw, not for playing for a win.

After the publication of that post, we actually tested the clock at Solrosen ("The Sunflower", a vegetarian restaurant and chess club in Gothenburg). I had written a Python hack (code below!) that allowed us to use my laptop as a chess clock. The layout is similar to the display of an ordinary digital chess clock, and to press the clock you press the keys 'P' and 'Q' respectively. In order for those keys to be clearly visible I had attached pieces of red paper to them. It's not as convenient as a real chess clock, but it works.


In order to put the clock to the test, we played with a time limit that's in reality too short for us patzers: 5 minutes each plus a bonus of 5 to 30 seconds. The "5 to 30 seconds" means (as explained in the previous blog post) that when your 5 minutes of main time have run out, you get another 30 seconds, but now your opponent has the right to claim a draw at any time (and it counts as a draw even if you checkmate your opponent on the board). When playing on the extra time, you also get a bonus "increment" of 5 seconds per move, under the constraint that your accumulated extra time will never exceed 30 seconds (details in the earlier post).

This means that in any single game, at most one player will reach the phase of extra time. Whenever the other player runs out of their main time too, the game is automatically drawn.

Normally we want to focus on the board and not the clock (this is the point of the new system!), but this time the purpose was to see if the double-flag clock makes sense in practice.

I met with my friends Anders and Bertil at Solrosen, and later some spectators joined us.

In the first game I had the black pieces against Anders, who launched the four pawns attack against my King's Indian. My position was a bit cramped in the opening, but eventually I could untangle my pieces. I won the exchange when Anders missed a nasty Bd4+ Kh1 Nf2+ Rxf2 Bxf2. After that he had a difficult position with the seconds ticking away. When his five minutes had run out and he was playing on extra time, I had a minute or so left. It was still not clear if I would be able to checkmate in time. I found a queen check on the first rank, first thinking it would force a queen exchange, but as Anders realized before me, it was impossible to prevent a back-rank mate!

In the second game I was White against Bertil and chose the Tarrasch variation against his French defence. Bertil played aggressively and tried a line where he gave two minor pieces for a rook and two pawns by taking twice on c3 and forking my rooks on a1 and e1 (is this winning or sacrificing material?!). In any case it slowed down my development, but I had four minor pieces against his two in the middle-game. Eventually I got some play against his king, but at that point neither of us had much time left. Bertil was the first to run out of the main five minutes, meaning I could play for a win without risk. I didn't have enough time to grind down a long endgame, but there were possibilities like the bishop sacrifice Bxh6 followed by coordinating the queen and a knight. But Bertil calmly defended. With 10 seconds or so left I thought I'd try the piece sacrifice anyway since I had nothing to lose. But then Bertil threw in a queen move threatening a perpetual check. I had to find a defensive move, and those last seconds ran out. That meant that the new possibility, a draw on time, had now been realized!

In the third game Bertil played white against Anders. I was talking to some of the bystanders and didn't follow the game closely, but took a couple of photos that show Anders playing the Budapest gambit.


Eventually Bertil got the upper hand with pressure against Anders' king as well as a dangerous pawn on the queenside. Bertil could play faster and had more than two minutes when Anders went into extra time. Anders kept fighting, but 5 seconds per move isn't much in a difficult position. Finally Anders lost on time in a position that must have been very hard to defend.

After these three games we chatted with the other guys and didn't play any more that evening. But I felt that the clock had passed the first test, and several people thought it was an interesting idea!

Python code

Below is the Python code for the clock. As I said it's a hack, and it was translated from a prototype written in another language. Anyway, for what it's worth...

To try it out, even if you don't know any programming, just download a Python programming environment (for instance PyCharm) for free, check out how to create a Python file, and copy-paste the code into it. And if you do know some programming, you might want to tell me how the program should have been written...

The clock has a simple layout with digital displays and six buttons for various time controls. You start it by pressing 'P' or 'Q'. It can be paused (and resumed) with the space bar.

The main time is displayed with black digits (or green, but we'll get to that in a moment), and extra time with red digits. Depending on how much time is left, it displays either hours:minutes.seconds, just minutes.seconds, or minutes.seconds.tenths.

The "Rapid", "Blitz", and "Bullet" options are quite simple, with time limits given on the buttons. There's also a "Test" option giving each player just 15 seconds, for demonstrating how the clock works.

The "Classical" option has two periods of main time (and then extra time as usual): 1h 45min for the first 40 moves, and 15 more minutes for the rest of the game. To distinguish the periods, the first one is displayed in green and the second in black. The clock doesn't count moves: When the green time is out, it just switches to the black 15 minutes (and finally to the red extra time).


There is also a "Customized" option. It's initially set to the rather serious mode of 2 hours for the first 40 moves followed by 30 minutes for the remaining game, with extra time of 30 seconds to 10 minutes. To actually customize this option, you can go into the program code, lines 208-210 in my code editor. It's in the class "Game" and under the definition of the "new_custom" method. Just below the line that says "def new_custom(self):", there are three lines that currently read:

        self.leftTime, self.rightTime = 7200, 7200
        self.A, self.b = 600, 30
        self.leftExtra, self.rightExtra = 1800, 1800

The first of these three lines determines the length of the first period of main time, in seconds (7200 seconds = 2 hours). This can be set individually for the left and right player by the way. The second line sets the initial batch of extra time and the increment per move, in this case 600 seconds (=10 minutes) and 30 seconds. As the program is designed, these are the same for both players. The third line sets the length of the (optional) second period of main time. To avoid a second period, just set it to zero.

Alright, the code. Have fun!


from tkinter import *
import time

root = Tk()
root.title("Double-Flag Chess Clock")

mainBackground = "light blue"

root.geometry("950x360+0+0")
root.configure(background=mainBackground)

leftDisplay = Label(root, font=("Typewriter", 96), bg="light yellow")
leftDisplay.place(x=50, y=50, width=400, height=150)

rightDisplay = Label(root, font=("Typewriter", 96), bg="light yellow")
rightDisplay.place(x=500, y=50, width=400, height=150)

qLabel = Label(root, font=("System", 18), anchor=W, bg=mainBackground)
qLabel.place(x=50, y=20, width=400)
qLabel.config(text='Q to press clock')

pLabel = Label(root, font=("System", 18), anchor=E, bg=mainBackground)
pLabel.place(x=500, y=20, width=400)
pLabel.config(text='P to press clock')

modeLabel = Label(root, font=("System", 18), anchor=W, bg=mainBackground)
modeLabel.place(x=50, y=200, width=400)
modeLabel.config(text="Mode: Blitz")


def reset_display():
    leftDisplay.config(fg="black")
    rightDisplay.config(fg="black")
    pLabel.config(text="P to press clock")
    qLabel.config(text="Q to press clock")


classicalString = "Classical: 1h 45min + 15min + (30s to 5min)"
rapidString = "Rapid: 15min + (10 to 60s)"
blitzString = "Blitz: 5min + (5 to 30s)"
bulletString = "Bullet: 2min + (5 to 10s)"
customString = "Customized"
testString = "Test"


def classical_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_classical()
        reset_display()
        modeLabel.config(text="Mode: Classical")
        leftDisplay.config(fg="dark green")
        rightDisplay.config(fg="dark green")


classicalButton = Button(root, text=classicalString,
                         highlightbackground=mainBackground,
                         command=classical_callback)
classicalButton.place(x=50, y=240, width=400)
classicalButton.config(bg="green")


def rapid_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_rapid()
        reset_display()
        modeLabel.config(text="Mode: Rapid")


rapidButton = Button(root, text=rapidString,
                     highlightbackground=mainBackground,
                     command=rapid_callback)
rapidButton.place(x=500, y=240, width=400)


def blitz_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_blitz()
        reset_display()
        modeLabel.config(text="Mode: Blitz")


blitzButton = Button(root, text=blitzString,
                     highlightbackground=mainBackground,
                     command=blitz_callback)
blitzButton.place(x=500, y=270, width=400)


def bullet_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_bullet()
        reset_display()
        modeLabel.config(text="Mode: Bullet")


bulletButton = Button(root, text=bulletString,
                      highlightbackground=mainBackground,
                      command=bullet_callback)
bulletButton.place(x=500, y=300, width=400)


def test_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_test()
        reset_display()
        modeLabel.config(text="Mode: Test")
        if g.leftExtra > 0:
            leftDisplay.config(fg="dark green")
            rightDisplay.config(fg="dark green")


testButton = Button(root, text=testString,
                    highlightbackground=mainBackground,
                    command=test_callback)
testButton.place(x=50, y=300, width=400)


def custom_callback():
    if g.paused or g.gameOver or (not g.leftRunning and not g.rightRunning):
        g.new_custom()
        reset_display()
        modeLabel.config(text="Mode: Customized")
        if g.leftExtra > 0:
            leftDisplay.config(fg="dark green")
            rightDisplay.config(fg="dark green")


customButton = Button(root, text=customString,
                      highlightbackground=mainBackground,
                      command=custom_callback)
customButton.place(x=50, y=270, width=400)


def enable_modes():
    classicalButton.config(state=NORMAL)
    rapidButton.config(state=NORMAL)
    blitzButton.config(state=NORMAL)
    bulletButton.config(state=NORMAL)
    testButton.config(state=NORMAL)
    customButton.config(state=NORMAL)


def disable_modes():
    classicalButton.config(state=DISABLED)
    rapidButton.config(state=DISABLED)
    blitzButton.config(state=DISABLED)
    bulletButton.config(state=DISABLED)
    testButton.config(state=DISABLED)
    customButton.config(state=DISABLED)


class Game:
    def __init__(self):
        self.leftTime, self.rightTime = 300, 300
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE
        self.A, self.b = 30, 5
        self.leftExtra, self.rightExtra = 0, 0   # Time added after 40 moves

    def new_classical(self):
        self.leftTime, self.rightTime = 6300, 6300
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE
        self.A, self.b = 300, 30   # 30 s bonus, capped at 5 min (matches the button text)
        self.leftExtra, self.rightExtra = 900, 900

    def new_rapid(self):
        self.leftTime, self.rightTime = 900, 900
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE
        self.A, self.b = 60, 10
        self.leftExtra, self.rightExtra = 0, 0

    def new_blitz(self):
        self.leftTime, self.rightTime = 300, 300
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE
        self.A, self.b = 30, 5
        self.leftExtra, self.rightExtra = 0, 0

    def new_bullet(self):
        self.leftTime, self.rightTime = 120, 120
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE
        self.A, self.b = 10, 5
        self.leftExtra, self.rightExtra = 0, 0

    def new_test(self):
        self.leftTime, self.rightTime = 15, 15
        self.A, self.b = 15, 5
        self.leftExtra, self.rightExtra = 0, 0
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE

    def new_custom(self):
        self.leftTime, self.rightTime = 7200, 7200
        self.A, self.b = 600, 30
        self.leftExtra, self.rightExtra = 1800, 1800
        self.reference = 0
        self.leftRunning, self.rightRunning, self.paused = FALSE, FALSE, FALSE
        self.nLeftFallen, self.nRightFallen = 0, 0
        self.gameOver = FALSE


g = Game()


def key_pressed(event):

    if event.char == 'p' and not g.paused and not g.leftRunning and not g.gameOver:
        disable_modes()
        g.rightRunning = FALSE
        pLabel.configure(text="")
        if g.nRightFallen == 1:
            if g.rightTime > g.A - 2*g.b:
                g.rightTime = (g.A + g.rightTime) / 2
            else:
                g.rightTime += g.b
        g.reference = g.leftTime + time.time()
        g.leftRunning = TRUE
        qLabel.configure(text="Q to press clock")

    if event.char == 'q' and not g.paused and not g.rightRunning and not g.gameOver:
        disable_modes()
        g.leftRunning = FALSE
        qLabel.configure(text="")
        if g.nLeftFallen == 1:
            if g.leftTime > g.A - 2*g.b:
                g.leftTime = (g.A + g.leftTime) / 2
            else:
                g.leftTime += g.b
        g.reference = g.rightTime + time.time()
        g.rightRunning = TRUE
        pLabel.configure(text="P to press clock")

    if event.char == ' ' and (g.leftRunning or g.rightRunning):
        g.paused = not g.paused
        if g.paused:
            enable_modes()
        else:
            disable_modes()
        if not g.paused and g.leftRunning:
            qLabel.configure(text="Q to press clock")
            g.reference = g.leftTime + time.time()
        if not g.paused and g.rightRunning:
            pLabel.configure(text="P to press clock")
            g.reference = g.rightTime + time.time()
        if g.paused and g.leftRunning:
            qLabel.configure(text="Paused")
        if g.paused and g.rightRunning:
            pLabel.configure(text="Paused")


root.bind("<Key>", key_pressed)


def update_clock():
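    # Count down whichever clock is running. nFallen == 0 means main time:
    # when it runs out, the player gets the second main period
    # (leftExtra/rightExtra, displayed in black) if there is one, and
    # otherwise A seconds of red extra time (first flag fall). Running out
    # of red time loses on time; when both players have used up their main
    # time, the game is a draw.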
    if g.leftRunning and not g.paused and g.nLeftFallen == 0:
        g.leftTime = g.reference - time.time()
        if g.leftTime < 0:
            if g.leftExtra > 0:
                g.leftTime = g.leftExtra
                g.leftExtra = 0
                leftDisplay.configure(fg="black")
            else:
                g.leftTime = g.A
                g.nLeftFallen = 1
                leftDisplay.configure(fg="red")
                if g.nLeftFallen + g.nRightFallen >= 2:
                    g.gameOver = TRUE
                    g.leftRunning = FALSE
                    g.rightRunning = FALSE
                    qLabel.config(text="1/2")
                    pLabel.config(text="1/2")
                    enable_modes()
            g.reference = g.leftTime + time.time()

    if g.leftRunning and not g.paused and g.nLeftFallen == 1:
        g.leftTime = g.reference - time.time()
        if g.leftTime < 0:
            g.leftTime = 0
            g.nLeftFallen = 2
            g.gameOver = TRUE
            g.leftRunning = FALSE
            g.rightRunning = FALSE
            qLabel.config(text="0")
            pLabel.config(text="1")
            enable_modes()

    if g.rightRunning and not g.paused and g.nRightFallen == 0:
        g.rightTime = g.reference - time.time()
        if g.rightTime < 0:
            if g.rightExtra > 0:
                g.rightTime = g.rightExtra
                g.rightExtra = 0
                rightDisplay.configure(fg="black")
            else:
                g.nRightFallen = 1
                g.rightTime = g.A
                rightDisplay.configure(fg="red")
                if g.nLeftFallen + g.nRightFallen >= 2:
                    g.gameOver = TRUE
                    g.leftRunning = FALSE
                    g.rightRunning = FALSE
                    qLabel.config(text="1/2")
                    pLabel.config(text="1/2")
                    enable_modes()
            g.reference = g.rightTime + time.time()

    if g.rightRunning and not g.paused and g.nRightFallen == 1:
        g.rightTime = g.reference - time.time()
        if g.rightTime < 0:
            g.rightTime = 0
            g.nRightFallen = 2
            g.gameOver = TRUE
            g.leftRunning = FALSE
            g.rightRunning = FALSE
            qLabel.config(text="1")
            pLabel.config(text="0")
            enable_modes()
    lh = int(g.leftTime / 3600)
    lm = int(g.leftTime / 60) % 60
    ls = int(g.leftTime) % 60
    lt = int(g.leftTime * 10) % 10
    rh = int(g.rightTime / 3600)
    rm = int(g.rightTime / 60) % 60
    rs = int(g.rightTime) % 60
    rt = int(g.rightTime * 10) % 10
    if lh > 0:
        leftDisplay.config(text=f'{lh}:{lm:02}.{ls:02}')
    elif lm > 0:
        leftDisplay.config(text=f'{lm:02}.{ls:02}')
    else:
        leftDisplay.config(text=f'{lm:02}.{ls:02}.{lt}')
    if rh > 0:
        rightDisplay.config(text=f'{rh}:{rm:02}.{rs:02}')
    elif rm > 0:
        rightDisplay.config(text=f'{rm:02}.{rs:02}')
    else:
        rightDisplay.config(text=f'{rm:02}.{rs:02}.{rt}')
    root.after(20, update_clock)  # run itself after given number of milliseconds


# run first time
update_clock()

root.mainloop()




Tuesday, June 11, 2019

The double-flag chess clock, a timer for games with reversible moves

A new type of chess clock could improve the game. By allowing bonus time only in order to play for a draw, we can avoid meaningless time scrambles and at the same time have fewer long-winded games.

Moreover, the "double-flag" system would let us abandon the 50-move rule and the rule of threefold repetition. Apart from opening the possibility of playing out certain winning lines, this would allow players in long endgames to focus on the position on the board without counting moves and repetitions.

Game clocks


Many abstract strategy games are played in organized tournaments and international championships: Chess, Go, Backgammon, Scrabble, Draughts, Othello, Magic: The Gathering and so on.

Competitions in such games require some method of restricting the amount of time that the players can spend on their moves. For this purpose, chess clocks were introduced in the late nineteenth century as the first international chess tournaments were arranged.


A chess clock has one clock and one button for each player. After making your move, you press your button, thereby stopping your own clock and starting your opponent's. A traditional analog clock has a flag (hinged at a couple of minutes before 12 o'clock) for each player that shows indisputably whether or not they have run out of time.

A chess clock is also convenient for casual play. It allows the players to agree in advance on how long the game should take, and to allocate their own time as they wish. You don't have to be annoyed when your opponent spends a lot of time on a move, and you don't have to apologize when you do the same.

And the clock can be used for a variety of games, look for instance at Chess Clock Jenga!

Games with reversible moves


Some games, by the nature of their rules, progress inevitably towards their conclusion. In Othello a new disc is placed on the board at every move, and the discs are never removed. Not counting pass moves, there can therefore be at most 60 moves in a game.

For such games, the simplest form of time limit makes sense: A fixed amount of time, for instance 30 minutes per player, for the whole game.

In other games like chess and draughts, pieces can move back and forth and there is no practical upper limit on the number of moves in a game. In such games, a fixed time limit for the whole game doesn't necessarily work well: The clock provides a new way of winning, and this can be abused.

Suppose for instance that in a game of chess, a position is reached where each player has only a king and a rook:


In such a position it's impossible to make any kind of progress unless the opponent makes a very unlikely blunder. But neither can anyone force the game to end. It ought to be a draw, but a player determined to try to win can keep moving indefinitely. If the game is played with a fixed time limit, a player with better manual dexterity can keep moving until the opponent runs out of time.  

One might have thought that something as silly as this couldn't take place in serious tournament play, but it can actually occur at the highest level. The infamous game between Monika Soćko and Sabina-Francesca Foisor from the women's world championship in 2008 (which can be seen around 1.30 to 2.30 into this video) led to the even more ridiculous king and knight versus king and knight endgame, which White eventually won by playing on until Black ran out of time.

This is not in the spirit of the game, and hardly compatible with the idea of everyone, young, old or physically disabled, competing on the same terms.

Even with many pieces on the board, a game can pass a point of no return after which it becomes obvious that none of the players has enough time to finish the game. The 2008 US women's championship was decided by a tie-break game between Irina Krush and Anna Zatonskih where, after a number of nonsense moves on the side of the board closest to the clock, Zatonskih won on time in a lost position with one second left on her clock.

As was pointed out in an excellent article by Tom Braunlich written shortly after the Soćko - Foisor game, this sort of thing is a consequence of the rules, not something nefarious that the players are doing to cheat.

These games were played more than ten years ago, but so-called armageddon games are still used as tie-breaks in international championships. Something equally absurd could have decided last year's world championship match between Magnus Carlsen and Fabiano Caruana, and no amount of sportsmanship on the part of the players would have resolved the issue.

Bonus time and rules for claiming a draw


In order to avoid meaningless "time scrambles", the practice since the early days of tournament chess has been to allow more time as the game progresses. A system like 2 hours for the first 40 moves and then another hour for every 20 moves (plus time remaining) is easily implemented with a mechanical clock: Every time a flag falls, the player must have made at least the prescribed number of moves.

Nowadays games are required to finish quicker, but digital clocks allow for systems of bonus time (also called increment or add-on) per move, where for instance 30 seconds are added to a player's time at every move.

But this still doesn't solve the problem of how to claim a draw. Special rules (50-move rule, threefold repetition) are required in order to make it possible to claim a draw in a position where an opponent keeps playing without making progress.

The 50-move rule is a compromise which is somewhat flawed at both ends: On one hand it means that some endgames become drawn even though they could otherwise be won. These endgames famously include some positions with king, rook and bishop against king and rook. And something like K+B+B+N vs K+R would even be an easy win for a moderately skilled player, were it not for the 50-move rule. There are also examples with many pieces on the board where the 50-move rule has been highlighted. In the beginning of this video, Magnus Carlsen explains that (in this game against Veselin Topalov, London 2015) he had to allow the exchange of a pair of knights, thereby severely diminishing his winning prospects, because the 50-move limit was approaching.

On the other hand, the 50-move rule is often insufficient and difficult to implement, especially at fast time controls but also in endgames with mobile pawns.

Even though the old article 10.2 has now been removed, the official FIDE rules (Guidelines III) still include situations where an arbiter may have to assess whether a player makes "sufficient attempts to win by normal means".

There have been, over the years, extensive discussions and several changes in regulation of time controls and situations in which a player can claim a draw (see for instance this discussion of the history of the 50-move rule by Edward Winter).

Distractions in long endgames


We have mentioned some rather spectacular situations, but the rules for time control and claiming draws influence many games. Let's look at a fairly common situation: We are far into an endgame and one player has an advantage, say queen versus rook. There is nothing special about this, the same remarks would apply to many other endgames.


Although K+Q vs K+R is a well-known theoretical win for the stronger side, it's quite challenging against good defence. But under current rules there will always be distractions from trying to play the endgame well. Suppose first that the game is played under a fixed time limit, so-called "sudden death". Since this sort of position only occurs near the end of an unusually long game, the rule rather than the exception is that both players are short of time. 

If White has more time left, they might try to win by moving aimlessly but quickly. If it succeeds, the result will appear to be fair, since after all White had a theoretical win. Again notice that White may have had the best intentions, but they too were in time trouble and could have lost if they had spent the time trying to play with precision.

Assuming that White tries to win on the board, another problem presents itself: How long does White dare to play for a win? If White is down to only a few seconds, there is suddenly a risk of losing. This means that White might have to keep an "emergency exit" by playing in such a way that they can force an exchange of the last pieces or a perpetual check if needed. 


If on the other hand the game is played with a classical time limit and a 30 second bonus, there are other distractions. White can now start by rattling off a number of aimless checks, this time in order to accumulate time of their own rather than to make their opponent spend theirs. In the position of the diagram, White could for instance (depending on where the black king moves) give checks on g5, g6, h5, h6 and then g5 again, collecting a couple of minutes of extra time.

But all of a sudden the players have to try to keep track both of the number of moves played since the last pawn move or capture, and of the positions that have occurred before (and those that have occurred twice). And White has to manage a trade-off between gaining time on the clock and staying within the 50-move limit.

The double-flag chess clock


The purpose of this post is to describe a simple system for time control that solves all the problems we have mentioned at once. I'd like to call it the double-flag chess clock. The idea is to have a system of bonus time, but to allow the bonus to be used only in order to obtain a draw.

Let's first look at how this might work in a simple example setting of rapid chess. First the players are given a fixed amount of time, say 15 minutes, that we call the main time. In order to win the game, you have to checkmate your opponent without using more time than this (unless they resign or overstep both their main and their extra time).

If a player uses up their 15 minutes before the end of the game, their first flag falls. They can now no longer win the game (even if they checkmate their opponent!) but are allowed to play on for a draw. At this point they are given a batch of extra time, let's say 1 minute. In this new phase of the game, they (but not their opponent!) will be given a bonus per move that guarantees they always have at least a certain minimum, say 10 seconds, for every move. But the bonus system will be designed so that they can never accumulate more extra time on the clock than 60 seconds (the initial amount).

There are a couple of different ways of implementing this idea (as discussed below). What I will suggest is that when a player using extra time presses the clock, the full bonus of 10 seconds will be added if they have up to 40 seconds left, while if they have between 40 and 60 seconds, the bonus is half of what remains up to 60. For instance, if they have 44 seconds before pressing the clock, 8 seconds will be added and the display will read 52 seconds once they have pressed the clock.

Let's say that Black runs out of their 15 minutes of main time while White still has some of their main time left. Now White can, whenever they wish, stop the game and claim a draw. If White chooses to play on, there are a couple of possibilities. If the game ends on the board with a white win or a draw, then that is the result of the game. But if Black wins on the board while playing on their extra time, the game counts as a draw.

If Black oversteps time again, by failing to move within the stipulated extra time, then they have actually lost the game on time. The final possibility is that White too oversteps their main 15 minutes. If this happens, the game is drawn regardless of the position on the board.

Features of the double-flag system


Let's see how the double-flag system solves the problems we have mentioned. Suppose first that we arrive at a "meaningless" endgame like king and one piece against king and the same sort of piece. Since a player using their extra time can never be forced to overstep it, the outcome if their opponent insists on playing is that eventually they too will run out of their main time and the game will be drawn.

If in a complicated position both players are about to run out of their main time, there will not be a nonsense time scramble for a full point. Instead the logical conclusion is again a draw. If both players are down to only a few seconds of main time, it doesn't matter who runs out of it first, and there is no point in making quick nonsense moves. You might try to bamboozle your opponent with a surprising king's attack though!

Since the game cannot go on indefinitely anyway, the 50-move rule is no longer needed. And neither is the rule of threefold repetition, which too is a hassle in cases other than direct repetition. This means for instance that if a player reaches K+B+N against a bare king with 5 minutes of main time remaining, they will have 5 minutes to try to figure out how to checkmate, regardless of whether it takes them 30 or 60 moves. On the other hand they can't hope to win on time and there is no point in even trying.

In a normal but long endgame, players will be able to focus on the board without being unnecessarily distracted by having to look at the clock or the scoresheet. There is no need to count moves or repetitions.

Long-winded play and repetition of positions can never increase winning chances, but favours a player trying to draw. This gives the players the correct incentives: In a position like K+Q vs K+R discussed earlier, White is the one who has to try to make progress in order to win, while Black will be happy to give repeated checks or maintain the status quo. And if White runs out of time, they can, for all practical purposes, claim a draw just like they could if Black didn't have checkmating material.

Designing the bonus system


There are systems of time control already in use that limit the accumulation of bonus time. With Bronstein delay, a player can get back the time they spent on the last move, but not more than that. In Go, a system called byo-yomi has a similar effect: A certain minimum time per move is guaranteed, but bonus time cannot be accumulated.

The bonus system for the double-flag clock can be regarded as a hybrid between delay and add-on: Bonus time can be accumulated, but only up to a certain threshold.

The reason it shouldn't be possible to accumulate bonus time beyond the initial amount is that a player whose first flag is about to fall should never be able to gain anything by letting their main time run out in order to get more quickly to the bonus per move. With the initial amount as an upper limit on the accumulation of extra time, it will always be better to run out of main time at a later stage: You obtain the initial batch of extra time when your main time runs out, and you could not have had more at that point in the game if your first flag had fallen earlier.

Let's look at some possible implementations, assuming again (as an example) that the minimum time per move is 10 seconds and the maximum accumulated bonus is 60 seconds.

1. Truncated bonus: The simplest implementation of a bonus system satisfying the requirements is just truncating the extra time at 60 seconds. When you make a move playing on extra time, the full 10 second bonus will be added, except if you already have more than 50 seconds, in which case the extra time will be set to 60 seconds. The addition of the bonus follows the formula
\[ t \mapsto \min(t+10, 60).\]
A slight flaw of the truncated system is that a player with more than 50 seconds of extra time has no incentive to move immediately, even if they have decided what to play.

2. Linear bonus: We might therefore prefer the added bonus to decrease gradually as the accumulated time approaches the 60 second limit. An alternative to a truncated system is a linear one, where the added bonus decreases linearly from 10 seconds when the time remaining is essentially zero, to nothing if you already have 60 seconds. This is the same thing as saying that the time added always takes you one sixth of what remains up to 60 seconds. Mathematically:
\[ t \mapsto t + 10 - \frac{10}{60}\cdot t.\]
But with a purely linear system, it might be annoying that the bonus decreases too quickly even at moderate levels of accumulated extra time.

3. Hybrid linear: What I suggest, again to minimize distraction of the players, is a hybrid system where the full 10 seconds are added as long as the player has at most 40 seconds (the initial amount minus two times the minimum per move) when pressing the clock. Above that, the bonus decreases linearly, which means that the time on the clock is taken half-way up to 60 seconds. In mathematical notation,
\[ t \mapsto \min\left(t+10, \frac{t+60}{2}\right).\]
For example, if a player presses the clock with 42 seconds of extra time, the new time is 51 seconds, while if they press it with 58 seconds remaining, the time is adjusted to 59 seconds.
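
For concreteness, here are the three rules as small Python functions, with the cap and the minimum bonus as parameters so that A = 60, b = 10 gives the numbers above:

def truncated(t, A=60, b=10):
    # Rule 1: add the full bonus, truncated at the cap A.
    return min(t + b, A)

def linear(t, A=60, b=10):
    # Rule 2: the added bonus covers the fraction b/A (here 1/6)
    # of what remains up to the cap.
    return t + b * (A - t) / A

def hybrid(t, A=60, b=10):
    # Rule 3: full bonus up to t = A - 2b, then halfway up to the cap.
    return min(t + b, (A + t) / 2)

for t in (0, 42, 58):
    print(t, truncated(t), linear(t), hybrid(t))   # hybrid: 42 -> 51, 58 -> 59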

Classical time limit


The double-flag system can be implemented on all time-scales. For a classical game we normally want to encourage good endgame play by adding extra time at move 40. One possibility would be to let the main time consist of an initial 1h 30 min, or perhaps 1h 45 min to better suit players accustomed to the "90+30"-tempo, and a secondary 15 or 30 minutes added after 40 moves.

With such a system, no mercy needs to be given to a player who oversteps the time for the first 40 moves. Only after the final period of main time should there be extra time available for holding a draw. For the extra time we might have a bonus of the usual 30 seconds per move, and a maximum accumulated bonus of 5 or perhaps 10 minutes.

New possibilities


Apart from the advantages already mentioned, the double-flag clock offers new possibilities at both ends of the spectrum from bullet to classical games.

Abandoning the 50-move rule will allow players to treat certain endgames correctly, but there are other consequences: In a game where one player has run out of their main time, the other player might try a dangerous line in order to stir up winning chances in a position where they would normally have played safely to secure a draw. This could lead to interesting play that would otherwise not have occurred on the board.

At the faster end of the spectrum, the double-flag system offers possibilities of playing interesting games even at "bullet" or "lightning" time controls (1-2 minutes). Under traditional rules, the game deteriorates to a parody of chess as the time is decreased. With the double-flag system, sensible (but aggressive) play is rewarded, and the game remains a miniature version of chess. At extremely fast time controls it just becomes drawish.

Since the double-flag system would allow serious competition at very fast time controls, we can even imagine bullet chess becoming recognized as an e-sport!

Related ideas


The idea that one doesn't necessarily lose the game when time runs out seems historically to have preceded the more modern notion that the game is immediately lost when the flag falls.

The primitive chess clocks of the late 19th century weren't very exact, and the first consequence of overstepping time seems to have been that a tournament director requested you to play faster. According to this article, you could also be fined.

A principle that has been applied in Othello (IV 6. Time Defaults) is that if your flag falls, you are given two minutes of extra time that can only be used in order to minimize the margin of loss. If the player who lost on time later wins or draws on the board, the score will be 33-31 (the smallest possible win) in favour of their opponent. My guess is that the reason for this rule is that rewarding a player with a 64-0 victory for their opponent's bad time management might be considered unfair to a third party.

Advantages of the double-flag system, in short


Games are finished in time.

No nonsense time scrambles.

Long-winded play never increases winning chances.

Allows us to abandon the 50-move rule and the rule of threefold repetition.

Promotes good play and focus on the board.

Allows sensible blitz and bullet games.


Friday, April 5, 2019

Which number is hardest to guess?

I finally figured out how to compute the "devil's distribution", a probability distribution that produces a random positive integer that's in a sense the hardest to guess.

I've known for a long time that this thing must exist, but didn't know until recently what it looked like. Here it is:

The setup is not too complicated: Two people, the Selector and the Devil, play a game with a deck of cards. Some cards are item cards and the rest are, well, ordinary cards. If we play with a standard 52-card deck we might decide for instance that the kings, queens and jacks are item cards.

First, the Devil removes 11 cards of their choice from the deck without showing the Selector. Since there were 12 item cards from the start, at least one and at most 12 are still in the game. Now the remaining 41-card deck is shuffled and the cards turned up one by one. The Selector's task is to choose online the last item card. "Online" means that whenever you see an item, you have to decide immediately whether to accept or reject it, and you can't later go back and change your mind.

If the Selector accepts the last item card, they win the game. Otherwise (if they accept an item that turns out not to be the last one, or end up not accepting anything), the Devil wins.

If you are the Selector, your task is basically to guess how many items there are in the deck. If you had to guess from the start you'd have 1 chance in 12 of winning, but it's better than that because the times when the items arrive will give you a clue. If you see many items early on, you can expect there to be more of them, but if you see only a few you might want to accept before it's too late.

But what's a good strategy? And if you play the Devil, how many item cards should you keep in the deck? If you keep many of them, is that generally going to make it harder or easier for the Selector than if there are only a few? Not so obvious!
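
Just to get a feel for the game, here is a minimal Monte Carlo sketch of the card version, under two crude simplifications of mine: the Devil always keeps the same number of items, and the Selector uses a plain threshold rule (accept the first item from a fixed position onwards) rather than the optimal policy:

import random

def selector_wins(items, threshold, deck_size=41):
    # The item cards end up at random positions in the shuffled deck.
    positions = sorted(random.sample(range(deck_size), items))
    for i, pos in enumerate(positions):
        if pos >= threshold:                 # accept the first item from here on
            return i == len(positions) - 1   # a win iff it was the last item
    return False                             # never accepted: the Devil wins

trials = 100_000
for threshold in (20, 25, 30, 35):
    wins = sum(selector_wins(4, threshold) for _ in range(trials))
    print(threshold, wins / trials)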

An interesting thing about it is that if instead we play with a super-mega-giant deck of cards, the game roughly stays the same. We assume then that the Selector knows the size of the deck and that the Devil can put any number of items in it as long as there's at least one.

There is an "infinite deck limit" where instead the Devil chooses an arbitrary positive integer, and that many items arrive at random times in the interval from 0 to 1 (representing the percentage of the deck that we have seen by the time the item card is drawn).

This is where the devil's distribution enters. It represents the optimal way for the Devil to choose the number of items. Of all the ways you might randomize the number of items, this particular one makes the Selector's task hardest.

To me this distribution is special, because it's a mathematical object that I first didn't think would exist. I came to think of this game when I was working on a different problem, the secretary problem on partially ordered sets (that I eventually solved together with Ragnar Freij). At some point I thought I might have a really neat solution to that problem, but in order for it to work, the Selector would have to have a strategy that wins the continuous time Selector-Devil game with probability at least $1/e$, or approximately 36.8%.

Initially it looked promising. The number $1/e$ is what the Selector's winning chances would have been if the game had simply become harder with more items. And there was a perfect analogy, by change of time-scale, to the classical secretary problem where $1/e$ is indeed the Selector's probability of success, and that indeed gets harder with more items. Ah, mathematics is so beautiful!

And so devilish! Because in the middle of a computation, a coefficient in a Taylor expansion had the wrong sign. It should have been positive, but it was $-1/2$. Well actually it didn't matter what I thought it should have been. It was negative, and you can't negotiate with the truth.

At first it seemed almost paradoxical to me, because it was clear that it can't be optimal for the Devil to randomize the number of items from a finite list of numbers. They must sometimes choose a large number. At the same time the calculation seemed to show that the Selector's task would get easier with many items.

The only possibility that didn't lead to a contradiction was that the Devil must have a specific optimal strategy consisting of choosing reasonably small numbers most of the time, and letting the probabilities of larger numbers decay exponentially.

I proved this and described the Selector's optimal policy in a paper published on the arXiv in 2011. But it wasn't until recently that I realized that computing the devil's distribution with any degree of precision was feasible. It seemed at first that errors would propagate and grow to the point where any results obtained in reasonable time would be meaningless. But it turned out it wasn't that hard, and if we start with a precision of say 100 decimals, then ten or so will still be correct in the end.

So here are some data: The Selector, using an optimal policy, wins with probability approximately \[0.35291700020719554668690761575691114631783787927644651520731735696011.\] The Devil has a unique probability distribution to choose from that makes the Selector's task this hard. With any other, the Selector can increase their winning chances by adjusting their strategy.

Facts about the distribution: The probability of choosing only one item is about 4.444% (but no, it's not $2/45$). The number most frequently chosen is 5, which occurs with probability a little more than 8%. From then on the probabilities decrease, and for large numbers they decay at each step by a factor of approximately 1.316496. The distribution has mean approximately 8.873 and standard deviation 5.956.

And I can't relate any of these numbers to known mathematical constants. Except 5.


Monday, March 11, 2019

Prime numbers and digit patterns

Euclid was able to prove, already 2300 years ago, that there are infinitely many prime numbers. Over time I have come to think that the most fascinating thing is not that he understood that there are infinitely many, but that he argued for it with a proof. Who was he explaining it to? How did people discuss? What did a society look like in which a rigorous logical argument was considered worth writing down and passing on, when in most times and cultures loud-voiced authority has been regarded as more credible?

The method of proof is a piece of intellectual history. If, for example, we want to show that there is a prime number larger than a million, we can consider the number \[M = 1\cdot 2\cdot 3\cdot 4\cdot 5 \cdots 1000000 + 1.\] Here we have multiplied together all the numbers up to a million and added 1. Like every integer greater than 1, this number must be divisible by some prime, for if it isn't divisible by any other prime, it is itself a prime. But the prime that $M$ is divisible by cannot be any of the numbers up to a million, since $M$ leaves remainder 1 when divided by each of them. So we see that there must be myriads of primes lurking in the dark beyond a million, even though the argument gives no further clue as to what those numbers are.
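
Here is the argument in miniature, with the bound of a million replaced by 10 (using factorint from the external sympy library to do the factoring):

from sympy import factorint

bound = 10
M = 1
for k in range(2, bound + 1):
    M *= k
M += 1                # 10! + 1 = 3628801
print(factorint(M))   # {11: 1, 329891: 1} -- both prime factors exceed the bound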

The idea of a formula or a pattern governing the primes has been a sort of holy grail for mathematicians through the ages. In the 17th century, Pierre de Fermat had the idea of starting from 2 and forming a sequence in which each number is the square of the previous one, that is $2, 4, 16, 256, 65536, \dots$ He conjectured that if we add 1 to each of these numbers, so that we get \[3, 5, 17, 257, 65537,\dots\] we would obtain a sequence of nothing but primes. But this fell apart when already the next number, 4294967297, turned out to be divisible by 641. He didn't find the grail...

If you make a list of primes, you discover that all of them except 2 and 5 end in one of the digits 1, 3, 7 or 9. A few years ago there was a commotion in the mathematical world when it was claimed that "prime conspiracies" had been discovered, in the form of unexpected patterns in the final digits of the primes. But in the end it turned out that the patterns were supposed to be there, and that the numbers behave as they should.

I want to talk a little about the fact that we have actually known, since the 1830s, that there are infinitely many primes with each of the final digits 1, 3, 7 and 9. The wonderful proof by Lejeune Dirichlet is valid not only modulo 10 but in every number base.

I have often thought that there ought to be an introduction to Dirichlet's theorem that starts by describing the case modulo 10. Apart from 10 being the base of our number system, it is also the simplest case where complex numbers are needed. But no complex analysis is needed per se. The coefficients of the series involved are just $\pm 1$ and $\pm i$, and it suffices to look at the complex argument of a single series.

Maybe something like that is written down somewhere, but I haven't found it, and if you want something done just the way you want it, you end up having to do it yourself...

But let's begin by seeing how far we can get by modifying Euclid's proof. The first step in Dirichlet's direction is to show that there are infinitely many primes ending in 3 or 7.

Suppose we multiply together all numbers ending in 3 or 7 up to a certain bound, for example one million. If the bound is a multiple of ten, so that we have equally many factors ending in 3 as in 7, the product ends in the digit 1. If we then add 2, we get a number that ends in 3 and that cannot be divisible by any number occurring in the product:
\[M_3 = 3\cdot 7 \cdot 13\cdot 17 \cdot 23\cdot 27\cdots 999997 + 2.\] But a number ending in 3 or 7 must have some prime factor ending in 3 or 7 (if all its prime factors ended in 1 or 9, the number itself would end in 1 or 9), so there must be such a prime greater than one million.
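
The same construction can be tried in miniature, with 50 in place of one million (a sketch using sympy's `factorint`):

```python
from sympy import factorint

N = 50                # a multiple of ten, standing in for one million
prod = 1
for n in range(N):
    if n % 10 in (3, 7):
        prod *= n
M3 = prod + 2
print(M3 % 10)        # 3, so some prime factor of M3 ends in 3 or 7
for p in factorint(M3):
    if p % 10 in (3, 7):
        print(p, p > N)   # and any such factor exceeds the bound
```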

With an argument based on so-called quadratic versus non-quadratic residues (a keyword is quadratic reciprocity), we can show that there are also infinitely many primes ending in 1 or 9. Namely, we can look at which prime factors a number of the form $n^2-5$ can have. If $p$ is a prime dividing $n^2-5$, then $n^2$ is congruent to $5$ modulo $p$. Multiplication by 5 must therefore give what in combinatorics is called an even permutation of the residue classes modulo $p$. But it turns out that if $p$ ends in the digit 3 or 7, multiplication by 5 gives an odd permutation of the residue classes. A number of the form $n^2-5$ can therefore never have a prime factor ending in 3 or 7.
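
An empirical check of that last claim (again just an illustration with sympy):

```python
from sympy import factorint

# No prime factor of n^2 - 5 ends in 3 or 7:
for n in range(3, 100):
    for p in factorint(n * n - 5):
        assert p % 10 not in (3, 7), (n, p)
print("checked n up to 99")
```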

There are lots of interesting digit patterns here. For instance, a number consisting of a run of 2s followed by a run of 1s, with one more 1 than 2s, can only have prime factors ending in 1 or 9. The numbers 211, 22111 and 2221111 are themselves prime, while \[222211111 = 379 \cdot 586309,\] \[22222111111 = 109\cdot 8221\cdot 24799,\] \[2222221111111 = 20411\cdot 108873701,\] and so on. These numbers are not themselves of the form $n^2-5$, but they follow the pattern \[222211111 = \frac{66665^2 - 5}{20}.\] Once we divide away the 20, only factors ending in 1 and 9 remain.
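
The pattern is easy to verify as far as sympy can factor (sketch):

```python
from sympy import factorint

# k twos followed by k+1 ones equals ((66...65)^2 - 5) / 20:
for k in range(1, 7):
    m = int("2" * k + "1" * (k + 1))
    n = int("6" * k + "5")
    assert 20 * m == n * n - 5
    assert all(p % 10 in (1, 9) for p in factorint(m))
    print(m, factorint(m))
```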

If we want a proof in the Euclidean style, we can look at the number \[M_9 = \left(2\cdot 1\cdot 9\cdot 11\cdot 19 \cdot 21 \cdot 29 \cdots 999999\right)^2 - 5.\] The product inside the parentheses consists of all numbers below one million ending in 1 or 9, together with a factor 2 to guarantee that the result after subtracting 5 is an odd number.

The number $M_9$ cannot be divisible by 2 or 5, and by quadratic reciprocity not by any prime ending in 3 or 7 either. All its prime factors must therefore end in 1 or 9, but they cannot occur in the product inside the parentheses, so they must be greater than one million.

In fact we can strengthen the conclusion further: The number $M_9$ ends in the digit 9, so it must have a prime factor ending in 9 (if they all ended in 1, $M_9$ would end in 1).
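
Here too a scaled-down experiment works, with 30 in place of one million (sketch):

```python
from sympy import factorint

N = 30
prod = 2                       # the extra factor 2
for n in range(1, N):
    if n % 10 in (1, 9):
        prod *= n
M9 = prod * prod - 5
print(M9 % 10)                 # 9, so some prime factor ends in 9
for p in factorint(M9):
    print(p, p % 10, p > N)    # every factor ends in 1 or 9 and exceeds N
```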

Incidentally, it was quadratic reciprocity that caused the "prime conspiracies". The primes couldn't sit just anywhere, but were forced to obey the reciprocity law.

We can also construct a sequence of numbers all of whose prime factors end in the digit 1, and at the same time give Pierre de Fermat some redress. We look at numbers of the forms \[n^4 + n^3 + n^2 + n + 1 = \frac{n^5-1}{n-1}, \text{ and}\] \[n^4 - n^3 + n^2 - n + 1 = \frac{n^5+1}{n+1}.\] If the prime $p$ divides one of these numbers, then $n^5$ must be congruent to $1$ or $-1$, respectively, modulo $p$. By a theorem of Fermat, this means that $p-1$ must be divisible by 5, unless $n$ is congruent to $1$ or $-1$. In both cases the conclusion is that the number 5 can occur as a prime factor for certain values of $n$, but that all other prime factors must end in the digit 1.

If we take $n=10^{12}$, for example, $n^4 + n^3 + n^2 + n + 1$ becomes the number \[1000000000001000000000001000000000001000000000001\] with the beautifully ringing (long-scale) name one octillion one sextillion one quadrillion one trillion and one, and the wonderful prime factorization \[31\cdot 41\cdot 61\cdot 211\cdot 241\cdot 271\cdot 2161\cdot 3541 \cdot 9091\cdot 27961\cdot 2906161\cdot 4188901 \cdot 39526741.\] Here too, à la Euclid, we can force prime factors beyond any given bound by letting $n$ be divisible by all numbers up to that bound.
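
Both the general claim and this particular factorization can be checked with sympy (my sketch):

```python
from sympy import factorint

# Every prime factor of n^4 + n^3 + n^2 + n + 1 ends in 1, except possibly 5:
for n in range(2, 50):
    m = n**4 + n**3 + n**2 + n + 1
    assert all(p == 5 or p % 10 == 1 for p in factorint(m))

# The case n = 10**12 named above:
n = 10**12
print(factorint(n**4 + n**3 + n**2 + n + 1))
# the thirteen primes listed above, each to the first power
```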

To summarize so far: we know that there are infinitely many primes ending in the digit 1, infinitely many ending in 9, and infinitely many ending in either 3 or 7. Fermat, Euler and Gauss probably knew all this, but to tie up the loose ends we would like a proof that within that last category there are both infinitely many primes ending in 3 and infinitely many ending in 7.


Dirichlet series

The so-called harmonic series \[ 1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \frac17 + \frac18 + \frac19 + \frac1{10} + \dots\] is divergent. The sum can thus be made as large as we like, provided we include enough terms. The harmonic series can be factorized by splitting all the denominators into prime powers and pulling out one prime at a time: \[ \left(1+\frac12 + \frac14 + \frac18+\dots\right)\cdot\left(1+\frac13+\frac19+\frac1{27}+\dots\right)\cdot \left(1+\frac15+\frac1{25}+\dots\right)\] \[ \cdot \left(1+\frac17+\frac1{49}+\dots\right)\cdot \left(1+\frac1{11}+\frac1{121}+\dots\right)\cdot \left(1+\frac1{13}+\frac1{169}+\dots\right)\cdots\] If we start by pulling out the factor in the first parenthesis, the terms with odd denominators remain. After pulling out the second factor as well, we are left with the terms whose denominators are divisible by neither 2 nor 3, and so on.

Each factor is a geometric series, so we can rewrite the whole product as \[\frac21\cdot \frac32 \cdot \frac54\cdot \frac76 \cdot \frac{11}{10}\cdot \frac{13}{12}\cdot \frac{17}{16}\cdot \frac{19}{18} \cdots\] Here the sequence of all primes appears in the numerators. The partial products don't correspond in any simple way to partial sums of the original series, but it is still clear that the product form must diverge too, and that \[\prod_{p<N}\frac{p}{p-1} \geq \log N.\] This gives us a new proof that there are infinitely many primes.
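
Numerically the partial products stay comfortably above $\log N$; in fact, by Mertens' theorem the ratio approaches $e^\gamma \approx 1.781$, though the bound itself is all we need here. A quick check with sympy's prime generator:

```python
from math import log
from sympy import primerange

# Compare prod_{p<N} p/(p-1) with log N:
for N in (10, 100, 1000, 10**4, 10**5):
    prod = 1.0
    for p in primerange(2, N):
        prod *= p / (p - 1)
    print(N, round(prod, 3), round(log(N), 3))
```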

This is how far Leonhard Euler had come in the 1730s. About a hundred years later, Dirichlet got the idea of factorizing other similar series in the same way. For example, he looked at the series \[1- \frac13 - \frac17 + \frac19 + \frac1{11} - \frac1{13} - \frac1{17} + \frac1{19} + \frac1{21} + \dots\] Here only the terms whose denominators end in 1, 3, 7 or 9 are included, and in addition we have put minus signs on those whose denominators end in 3 or 7.

Now the series is convergent, meaning that the sum is a finite number. If we group the terms four at a time by multiples of ten, we can write it as \[\sum_{k=0}^\infty \left(\frac1{10k+1}-\frac1{10k+3}-\frac1{10k+7} + \frac1{10k+9}\right)\] \[= 120\cdot \sum_{k=0}^\infty \frac{2k+1}{(10k+1)(10k+3)(10k+7)(10k+9)}\approx 0.64561.\] Written this way, all the terms are positive and they tend to zero quickly. Already the first term gives $1-1/3-1/7+1/9 = 40/63 \approx 0.63492$.
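
In the grouped form the sum is easy to evaluate numerically:

```python
# The grouped series converges quickly:
total = 0.0
for k in range(10**5):
    total += 120 * (2*k + 1) / ((10*k + 1) * (10*k + 3) * (10*k + 7) * (10*k + 9))
print(total)   # approximately 0.64561
```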

When we factorize, the signs alternate for all primes ending in 3 or 7, but are plus throughout for those ending in 1 or 9 (and there are no factors for 2 or 5): \[\left(1-\frac13+\frac19-\frac1{27}+\dots\right)\cdot \left(1-\frac17+\frac1{49}-\dots\right) \cdot \left(1+\frac1{11}+\frac1{121}+\dots\right)\] \[\cdot \left(1-\frac1{13} + \frac1{169}-\dots\right) \cdot \left(1-\frac1{17} + \frac1{289}-\dots\right) \cdot \left(1+\frac1{19} + \frac1{361}+\dots\right) \cdots\] \[ = \frac34 \cdot \frac78\cdot\frac{11}{10}\cdot \frac{13}{14}\cdot \frac{17}{18}\cdot\frac{19}{18}\cdot\frac{23}{24}\cdot\frac{29}{28}\cdot \frac{31}{30} \cdots\] But in this case the series has both positive and negative terms, so the relation between partial sums of the series and partial products of the infinite product is not as simple. As a matter of fact, this product converges to the same value as the sum, but that is a very deep result, and nothing that Euler or Dirichlet could sort out. A key phrase here is the prime number theorem for arithmetic progressions.

As Dirichlet himself pointed out in the original paper (which I personally find easiest to read in English translation), a proof that this type of product converges must use the fact that the primes come in a particular order, for if we shuffle the factors around, the product can be made to converge to, or oscillate between, any positive values whatsoever. As long as nothing in our reasoning depends on pulling out the factors in some particular order, we cannot have enough firepower to show that this product converges.

To get around this, Dirichlet introduced a variable $s$, which we can think of as a number just above 1, say $1.01$, and looked instead at what we to this day call $L$-functions, in this case \[1- \frac1{3^s} - \frac1{7^s} + \frac1{9^s} + \frac1{11^s} - \frac1{13^s} - \frac1{17^s} + \frac1{19^s} + \dots\] \[ = \left(1-\frac1{3^s} +  \frac1{9^s}-\dots\right)\cdot \left(1-\frac1{7^s}+\frac1{49^s}-\dots\right) \cdot \left(1+\frac1{11^s}+\frac1{121^s}+\dots\right)\cdots\] \[= \frac{3^s}{3^s + 1}\cdot \frac{7^s}{7^s+1}\cdot\frac{11^s}{11^s-1}\cdot\frac{13^s}{13^s+1}\cdot\frac{17^s}{17^s+1}\cdot\frac{19^s}{19^s-1}\cdots.\] Now the series is absolutely convergent (for each fixed $s>1$), so we can shuffle the terms around however we like. In particular, the infinite product must converge to the same value as the sum. We can think of the variable $s$ as a clever way of truncating so that we get the same result whether we look at the sum or the product.
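
For a fixed $s>1$ we can watch the sum and the product agree (a sketch; the helper `chi` just encodes the signs):

```python
from sympy import primerange

def chi(n):
    # +1 for final digit 1 or 9, -1 for 3 or 7, 0 otherwise
    return {1: 1, 9: 1, 3: -1, 7: -1}.get(n % 10, 0)

s = 1.5
series = sum(chi(n) / n**s for n in range(1, 10**5))
product = 1.0
for p in primerange(3, 10**5):
    if chi(p):
        product *= 1 / (1 - chi(p) / p**s)
print(series, product)   # agree to several decimals
```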

If we do the same thing with the harmonic series, we get what would later come to be called the Riemann zeta function: \[ \zeta(s) = 1 + \frac1{2^s} + \frac1{3^s} + \frac1{4^s} + \dots = \frac{2^s}{2^s - 1}\cdot \frac{3^s}{3^s - 1}\cdot\frac{5^s}{5^s-1}\cdot\frac{7^s}{7^s-1}\cdots.\] This series too is absolutely convergent for each $s>1$, but it grows to infinity as $s$ decreases towards 1, since it then essentially behaves like the harmonic series.

Let us first pause for a moment and contemplate how slowly the series for the zeta function converges. We can compare the sum with \[ \int_1^\infty \frac1{x^s}\, dx = \frac1{s-1}.\] For $s=1.01$, for example, the sum is a little more than 100. But if we add up the first $2^{100}$ terms, we have only come halfway, and at $3^{100}$ terms we still have a third left. We don't get three correct decimals until somewhere around $10^{500}$ terms!
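
The estimate behind those figures is that the tail beyond the first $N$ terms is roughly $N^{1-s}/(s-1)$:

```python
from math import exp, log

s = 1.01
for N in (2**100, 3**100, 10**500):
    print(exp((1 - s) * log(N)) / (s - 1))
# about 50 (half of ~100), about 33 (a third), about 0.001
```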

But to compare with the first-mentioned $L$-function, Dirichlet instead looked at a series with plus signs only, still containing just the terms corresponding to numbers ending in 1, 3, 7, 9: \[1 + \frac1{3^s} + \frac1{7^s} + \frac1{9^s} + \frac1{11^s} + \dots = \frac{3^s}{3^s - 1}\cdot \frac{7^s}{7^s - 1}\cdot\frac{11^s}{11^s-1}\cdot\frac{13^s}{13^s-1}\cdots.\] Let us call this series with only plus signs $L_{\bf 1}$, and the one with varying signs $L_{\bf -1}$. As mentioned, such series really are called $L$-functions, so this notation is almost but not quite standard. In fact, $L_{\bf 1}$ consists of the terms that remain if we pull the factors for 2 and 5 out of the zeta series, i.e. \[ \zeta(s) = \frac{2^s}{2^s-1}\cdot \frac{5^s}{5^s-1} \cdot L_{\bf 1}.\] This means that $L_{\bf 1}$, too, tends to infinity as $s\to 1$, and a little more precisely that \[L_{\bf 1} \approx \frac25\cdot \frac1{s-1}.\] Looking at the product forms of $L_{\bf 1}$ and $L_{\bf -1}$, it is of course interesting that the factors corresponding to the primes ending in 1 and 9 are identical. It is tempting to divide in order to cancel them and see what we can say about what remains. We then get \[\frac{L_{\bf 1}}{L_{\bf -1}} = \frac{1+1/3^s+1/7^s+1/9^s+\dots}{ 1-1/3^s-1/7^s+1/9^s+\dots}\] \[= \frac{3^s+1}{3^s-1}\cdot \frac{7^s+1}{7^s-1}\cdot \frac{13^s+1}{13^s-1}\cdot \frac{17^s+1}{17^s-1}\cdot \frac{23^s+1}{23^s-1}\cdot \frac{37^s+1}{37^s-1}\cdots\] From the series representations we know that $L_{\bf 1}$ tends to infinity and $L_{\bf -1}$ to about $0.64561$ as $s$ approaches 1. Their quotient therefore tends to infinity as well, and one conclusion is that the product in the right-hand side must have infinitely many factors. So we get yet another proof that there are infinitely many primes ending in 3 or 7.
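
We can watch $L_{\bf 1}$ blow up at the predicted rate by computing it from the zeta relation above (using mpmath's `zeta`; a sketch):

```python
from mpmath import zeta

# L_1(s) = zeta(s) * (1 - 2**-s) * (1 - 5**-s), compared with (2/5)/(s-1):
for s in (1.1, 1.01, 1.001):
    print(s, zeta(s) * (1 - 2**-s) * (1 - 5**-s), 0.4 / (s - 1))
```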

If we look at the product of $L_{\bf 1}$ and $L_{\bf -1}$ instead of the quotient, it is the primes ending in 3 and 7 whose factors (almost) cancel: \[L_{\bf 1}\cdot L_{\bf -1} = \left(\frac{3^{2s}}{3^{2s}-1}\cdot  \frac{7^{2s}}{7^{2s}-1}\cdot  \frac{13^{2s}}{13^{2s}-1}\cdots \right)\] \[\cdot \left(\frac{11^s}{11^s-1}\right)^2\cdot \left(\frac{19^s}{19^s-1}\right)^2\cdot \left(\frac{29^s}{29^s-1}\right)^2\cdot \left(\frac{31^s}{31^s-1}\right)^2 \cdots.\] Here the first parenthesis consists of "junk" from the primes ending in 3 and 7 that doesn't quite cancel, but it converges to a finite value. The responsibility for the product tending to infinity as $s\to 1$ therefore rests on the primes ending in 1 and 9 in the following factors, which must consequently be infinitely many.

If we want to sort away the "junk" and express the conclusions a little more quantitatively, we can take logarithms of everything. Since the factors lie close to 1, we only incur a bounded error if we use the approximation $\log x \approx x-1$ throughout. The conclusion of the factorization of $L_{\bf 1}$ then becomes \[ \sum_p \frac1{p^s} = \log\left(\frac1{s-1}\right) + O(1).\] Here we are making a statement about what happens as $s$ tends to $1$, and the term $O(1)$ stands for something that stays bounded as a function of $s$.
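
mpmath happens to have the sum over prime reciprocals built in as `primezeta`, so the statement can be watched numerically (the difference between the last two columns stays bounded):

```python
from mpmath import primezeta, log

for s in (1.1, 1.01, 1.001):
    print(s, primezeta(s), log(1 / (s - 1)))
```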

When we factorize $L_{\bf -1}$, we get on the other hand that \[ \sum_{p=11, 19, 29, 31,\dots}\frac1{p^s} - \sum_{p=3, 7, 13, 17,\dots} \frac1{p^s} = O(1),\] that is, the difference between these sums stays bounded as $s\to 1$. If we split the primes into those ending in 1 and 9 versus those ending in 3 and 7, the two groups therefore contribute roughly equally to the total sum of $1/p^s$. Each of the sums in the left-hand side must consequently be $\frac12\cdot \log(1/(s-1))$ plus something bounded.

Euler could probably have followed along this far, and he was moreover quite a wizard with complex numbers, but it was Dirichlet who first went on to look at factorizations of series like $L_{\bf 1}$ and $L_{\bf -1}$ but with complex coefficients.

The two rustic, down-to-earth series $L_{\bf 1}$ and $L_{\bf -1}$ namely have two cousins from the complex big city, the series (in still slightly improvised notation) \[L_{\bf i} = 1 + \frac{i}{3^s} - \frac{i}{7^s} - \frac{1}{9^s} + \frac1{11^s} + \frac{i}{13^s} - \frac{i}{17^s} - \frac1{19^s} + \dots\] and \[L_{\bf -i} = 1 - \frac{i}{3^s} + \frac{i}{7^s} - \frac{1}{9^s} + \frac1{11^s} - \frac{i}{13^s} + \frac{i}{17^s} - \frac1{19^s} + \dots\] with corresponding product representations \[ L_{\bf i} = \frac{3^s}{3^s-i}\cdot \frac{7^s}{7^s+i}\cdot \frac{11^s}{11^s-1}\cdot \frac{13^s}{13^s-i}\cdot \frac{17^s}{17^s+i}\cdot \frac{19^s}{19^s+1}\cdots\] and \[ L_{\bf -i} = \frac{3^s}{3^s+i}\cdot \frac{7^s}{7^s-i}\cdot \frac{11^s}{11^s-1}\cdot \frac{13^s}{13^s+i}\cdot \frac{17^s}{17^s-i}\cdot \frac{19^s}{19^s+1}\cdots\] which we can obtain by pulling out factors just as Euler did with the harmonic series.
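
The same series-versus-product check as before works in complex arithmetic (sketch):

```python
from sympy import primerange

def chi(n):
    # the character mod 10 sending 3 to i (hence 9 to -1 and 7 to -i)
    return {1: 1, 3: 1j, 9: -1, 7: -1j}.get(n % 10, 0)

s = 1.5
series = sum(chi(n) / n**s for n in range(1, 10**5))
product = 1
for p in primerange(3, 10**5):
    if chi(p):
        product *= 1 / (1 - chi(p) / p**s)
print(series)
print(product)   # the same complex number, to several decimals
```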

At the risk of some historical revisionism, I now want to begin by multiplying the two cousins together. That gives us yet another real series, which we can call $L_{\bf \star}$ for the time being. The reason is that I first want to see how far we can get with plain one-variable real calculus.

In series form we can write the product as \[ L_\star = L_{\bf i}\cdot L_{\bf -i} = \left(1 -\frac1{9^s} +\frac1{11^s} - \frac1{19^s} + \frac1{21^s} - \frac1{29^s}+\dots\right)^2\] \[ + \left(\frac1{3^s} - \frac1{7^s} + \frac1{13^s} - \frac1{17^s} + \frac1{23^s}-\frac1{27^s}+\dots\right)^2.\] Here we have taken real part times real part separately, and imaginary part times imaginary part separately. Since the cousins are each other's complex conjugates, all terms of the type real times imaginary cancel.

In this case we can even say explicitly what the limit is as $s\to 1$. If we plug in $s=1$ (which is perfectly fine in series form), the first parenthesis (the real part of each of the cousins) equals \[ \frac{\pi}{10}\cdot \cot\frac{\pi}{10} = \frac{\pi}{10}\cdot \sqrt{5+2\sqrt{5}}\approx 0.96688.\] Why this is so is another story, one that really begins with the so-called Basel problem.

The second parenthesis, the imaginary part of $L_{\bf i}$, becomes \[ \frac{\pi}{10}\cdot \cot\frac{3\pi}{10} = \frac{\pi}{10}\cdot \sqrt{5-2\sqrt{5}}\approx 0.22825.\] If we square and add, we get \[ \lim_{s\to 1} L_\star = \frac{\pi^2}{10}\approx 0.98696.\] As long as $s>1$ we can also write $L_\star$ in product form by multiplying the products for the two cousins. The product form, too, comes out entirely real: \[L_\star = \prod_{p=3,7,13,17,\dots} \frac{p^{2s}}{p^{2s}+1} \cdot \prod_{p=11,31,41,\dots} \left(\frac{p^s}{p^s-1}\right)^2 \cdot \prod_{p=19,29,59, \dots} \left(\frac{p^s}{p^s+1}\right)^2.\] Here the factors for $p=3, 7, 13, 17, \dots$ lie so close to 1 that the first product tends to a finite nonzero value even as $s$ goes to $1$, while the two following groups of factors have the potential, each on its own, to run off arbitrarily far towards infinity or down towards $0$ as $s$ approaches $1$.
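
The closed forms are easy to confirm numerically:

```python
from math import pi, sqrt, tan

re = (pi / 10) / tan(pi / 10)        # (pi/10) * cot(pi/10),  about 0.96688
im = (pi / 10) / tan(3 * pi / 10)    # (pi/10) * cot(3pi/10), about 0.22825
print(re, (pi / 10) * sqrt(5 + 2 * sqrt(5)))   # the same number twice
print(im, (pi / 10) * sqrt(5 - 2 * sqrt(5)))
print(re**2 + im**2, pi**2 / 10)               # both about 0.98696
```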

If we compare this with the product form of $L_{\bf 1}\cdot L_{\bf -1}$, it again becomes tempting to divide one by the other, since the factors for primes ending in 1 will then cancel. We get \[ \frac{L_{\bf 1}\cdot L_{\bf -1}}{L_\star} = \prod_{p=3,7,13,17\dots}\frac{p^{2s}+1}{p^{2s}-1} \cdot \prod_{p=19,29,59,\dots}\left(\frac{p^s+1}{p^s-1}\right)^2.\] We know from the series representations that $L_{\bf 1}$ tends to infinity as $s\to 1$, while $L_{\bf -1}$ and $L_\star$ both tend to finite nonzero values. So this whole party must tend to infinity as $s\to 1$, which can only be because there are infinitely many primes ending in 9.

Again we can multiply instead of dividing, and then it is the primes ending in 1 that end up in the limelight: \[L_{\bf 1}\cdot L_{\bf -1}\cdot L_\star = \] \[ \prod_{p=3,7,13,17,\dots}\frac{p^{4s}}{p^{4s}-1} \cdot \prod_{p=19,29,59,\dots}\left(\frac{p^{2s}}{p^{2s}-1}\right)^2 \cdot \prod_{p=11,31,41,\dots} \left(\frac{p^s}{p^s-1}\right)^4.\] This too runs off to infinity as $s\to 1$, and everyone except the primes ending in 1 has an alibi.

We have now used real Dirichlet series to reproduce exactly what we knew since Euclid and Fermat: there are infinitely many primes with final digit 1, infinitely many with final digit 9, and infinitely many with 3 or 7.

If we take logarithms and Taylor-approximate again, we get the somewhat more precise information that \[ \sum_{p\equiv 1 \text{ (mod 10)}} \frac1{p^s} = \frac14\cdot \log\left(\frac1{s-1}\right) + O(1),\] \[ \sum_{p\equiv 9 \text{ (mod 10)}} \frac1{p^s} = \frac14\cdot \log\left(\frac1{s-1}\right) + O(1),\] and \[ \sum_{p\equiv 3, 7 \text{ (mod 10)}} \frac1{p^s} = \frac12\cdot \log\left(\frac1{s-1}\right) + O(1).\] But now comes the beauty of Dirichlet's method: if we separate the two teammates $L_{\bf i}$ and $L_{\bf -i}$, so that we are doing arithmetic with complex numbers for real, we obtain information that distinguishes the primes ending in 3 from those ending in 7!

It is natural to look at the quotient $L_{\bf i}/L_{\bf -i}$ in order to cancel all primes ending in 1 and 9 (this time getting opposite signs on 3 and 7). But since that gives us a number on the complex unit circle, the whole analysis boils down to looking at the argument of $L_{\bf i}$, that is, the angle from the positive real axis.

As a reminder: \[L_{\bf i} = 1 + \frac{i}{3^s} - \frac{i}{7^s} - \frac{1}{9^s} + \frac1{11^s} + \frac{i}{13^s} - \frac{i}{17^s} - \frac1{19^s} + \dots\] \[ = \frac{3^s}{3^s-i}\cdot \frac{7^s}{7^s+i}\cdot \frac{11^s}{11^s-1}\cdot \frac{13^s}{13^s-i}\cdot \frac{17^s}{17^s+i}\cdot \frac{19^s}{19^s+1}\cdots\] and \[\lim_{s\to 1} L_{\bf i} = \frac{\pi}{10}\cdot \left(\cot\frac{\pi}{10} + i\cdot \cot\frac{3\pi}{10}\right) \approx 0.96688 + i\cdot 0.22825.\] From the series representation we see that both the real part and the imaginary part of $L_{\bf i}$ are alternating series satisfying the Leibniz criterion, so the argument lies in the interval $[0,\pi/2]$ for all $s>1$. We can even compute the limit of the argument as $s\to1$: \[\arctan\left(\frac{\cot(3\pi/10)}{\cot(\pi/10)}\right) = \arctan\left(\sqrt{5}-2\right) = \frac12\,\arctan\frac12.\] Comparing with the argument of each individual factor in the product representation, we get the remarkable result that \[\arctan\frac1{3^s}-\arctan\frac1{7^s}+\arctan\frac1{13^s}-\arctan\frac1{17^s}+\arctan\frac1{23^s}-\arctan\frac1{37^s}+\dots\] \[ \to \frac12\, \arctan\frac12 \approx 0.23182.\] Here the terms for primes ending in 3 have plus signs, and those for primes ending in 7 have minus signs. We must keep $s>1$; otherwise we don't know (with the reasoning used here) that the left-hand side converges at all. But for each $s>1$ the sum is absolutely convergent, and its value in turn converges to $\frac12\arctan\frac12$ as $s$ approaches 1.

It is tempting to think that some conclusion about the case $s=1$ should follow here, and if we sum over the primes up to a million with $s=1$, we do indeed get about $0.23182$. One may ask whether it is really possible for a sum of this type to first have a limit as $s$ approaches 1, and then behave completely differently at $s=1$. But it is, because as already mentioned, we need only take the terms in a different order, and that is what happens. This means that as long as nothing in our argument singles out the ordering of the primes by size as the "right" way to order the terms, no general convergence theorems can help us.
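
That experiment is easy to repeat (sympy for the primes):

```python
from math import atan
from sympy import primerange

# The arctan sum at s = 1, over the primes below one million:
total = 0.0
for p in primerange(3, 10**6):
    if p % 10 == 3:
        total += atan(1 / p)
    elif p % 10 == 7:
        total -= atan(1 / p)
print(total, 0.5 * atan(0.5))   # both about 0.23182
```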

As a matter of fact, the sum for $s=1$ does converge to $\frac12\arctan\frac12$, but that is a whole different chapter of number theory. Sums of that type can be reached by letting the variable $s$ be complex as well, and it was Bernhard Riemann who showed the way a couple of decades after Dirichlet's paper, even though he didn't bring it all the way home either. Here too we have to refer to the prime number theorem for arithmetic progressions.

There is another exquisite subtlety in the computation of the argument of $L_{\bf i}$, namely how we know that the sum of the arguments of the factors is exactly $0.23182$ and not that number plus some multiple of $2\pi$. This really depends on the fact that we can start with $s$ at infinity, where the argument is close to zero, and then continuously decrease $s$ down to 1. We know from the series representation that $L_{\bf i}$ never leaves the first quadrant, so its value can never wind around zero in the complex plane.

Returning to the primes, we can make a linear approximation of each term in the equation for the argument as well. Using $\arctan x \approx x$ for each term, we get \[\sum_{p=3, 13, 23, 43,\dots}\frac1{p^s} - \sum_{p=7, 17, 37, 47,\dots} \frac1{p^s} = O(1).\] In combination with what we already knew, the conclusion is now that whichever of the four possible final digits we choose, \[\sum_{p\text{ with fixed final digit}} \frac1{p^s} = \frac14\cdot\log\frac1{s-1} + O(1).\] Since the right-hand side tends to infinity as $s\to 1$, we finally know that there are infinitely many primes in each of the four classes.
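
As a final numerical illustration, here are the partial sums of $1/p$ over the primes below a million, split by final digit; the classes containing the small primes 3 and 7 are still visibly ahead at this point, but the differences stay bounded while each sum keeps growing:

```python
from sympy import primerange

sums = {1: 0.0, 3: 0.0, 7: 0.0, 9: 0.0}
for p in primerange(2, 10**6):
    if p % 10 in sums:
        sums[p % 10] += 1 / p
print(sums)
```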

And with that we have still only scratched the surface of Dirichlet's theorem. In the case modulo 10 we could verify, somewhat ad hoc, that the three series $L_{\bf -1}$, $L_{\bf i}$ and $L_{\bf -i}$ cannot tend to zero as $s\to 1$, and we could fairly easily see how the arguments of the complex series behave. But already in the 1830s, Dirichlet worked out which $L$-functions one gets for a general modulus, why none of them can tend to zero as $s\to 1$, and how to combine them to get a sum dominated by primes from a single class. Among other things, he could thereby show that if we specify a final digit string of any length whatsoever, there are always infinitely many primes ending in exactly that string, provided only that the very last digit is 1, 3, 7 or 9.