AlphaGo records third consecutive victory over go world champion


AlphaGo has also won the third game against go world champion Lee Sedol. That means Google DeepMind has won the majority of the go games with its artificial intelligence program and has secured the million dollars in prize money.

The game in which South Korean go world champion Lee took on Google’s artificial intelligence program for the third time lasted more than four hours. “I’m sorry I couldn’t live up to people’s expectations,” Lee said. He thinks the pressure in the third round was too much for him. AlphaGo, on the other hand, continued to perform well, despite facing situations that had not occurred in rounds 1 and 2. The million dollars in prize money will go to various charities.

Despite the fact that AlphaGo has now won the majority of the go rounds against the human world champion, the remaining two rounds will still be played; there is no mercy rule. Although Lee has been defeated by Google DeepMind’s deep learning system, the South Korean could still show that the program can in principle be beaten by a human. The last two rounds will be played on Sunday 13 and Monday 14 March.

AlphaGo

“Every day I have to rewrite my story. Lee never expected that he could be beaten as a human being,” says Leo Dorst of the UvA’s Faculty of Science. Dorst said this during a meeting around the match between AlphaGo and Lee Sedol, on Thursday 10 March. Lee lost to Google’s deep learning system for the second time that Thursday.

The competition between man and machine is causing quite a stir among connoisseurs of the seemingly simple game, which was long regarded as almost impossible for artificial intelligence to master. “That’s exactly what go is: simple but exciting. Always new, simple and complicated. That’s why it’s extra fun for nerds,” Dorst says to the lecture hall, which is mainly filled with students from the science faculty.

To put AlphaGo’s achievement in perspective, Dorst explains how humans progress. “A talented kid can get to 3 dan in about 15 years.” Dan is a rank that indicates how good a player is; 9 dan professional is the highest attainable. “The difference between Fan Hui, who lost to AlphaGo in October, and Lee Sedol is ten years, eight hours a day. Lee is 33 years old and has been a professional since he was twelve,” says Dorst. “He is also creative, because he comes up with new opening moves. That is why it was thought that AlphaGo would have difficulty with him. Lee Sedol is much stronger than Fan Hui. Everyone thought that the makers of AlphaGo would undermine their own success with this match; that view is now put in a slightly different perspective.”

Yet the human is also at a disadvantage during the match in a way: Lee knows that a million dollars is at stake and he knows he is playing against a program. Normally there are three or four game-free days between games; in this case, there is only one day of rest. And because AlphaGo uses a playing style that people would not normally use, Lee is less able to prepare for the next game. In the first game, Lee played a somewhat unorthodox opening, seemingly to test AlphaGo. He also made use of overplay, something that, according to the experts, you would play against weak players.

In the second game, Lee played a so-called waiting game; when he does that against humans, they lose. “It was predicted that you would need 10,000 GPUs to reach Lee’s level. Everyone who thought AlphaGo would win was considered crazy, but things turned out differently for the time being. The go community was shocked at first, but AlphaGo’s second victory turned that around. People now think that AlphaGo will enrich go.”

Dorst goes back to AlphaGo’s win over Fan Hui. Fan, who started playing go in 1988, is Europe’s best go player, a 2 dan professional. During the research that was published in January, AlphaGo lost two matches to Fan and won eight. It was agreed in advance that games played under certain conditions would not be counted. Fan did better in the short practice matches, but these didn’t count.

There is a real fear in online go, namely cheaters. This is already a big problem in online chess. But according to Dorst, that problem has nothing to do with the fact that computers are stronger than humans, but with humans themselves.

The artificial intelligence

Go is a very complicated game because of the enormous number of possible positions. “There are whole studies of the game, and go matches from the distant past are still being actively studied,” Dorst says. “With chess it is clear: you have to take the king. With go it is not so clear; the goal is less tangible. Without guidance, go is therefore a difficult game for beginners. For example, capturing the opponent’s stones is not the main goal, while children especially think that’s what it’s about.”

“In go, the groups of stones and the areas in between are the ‘pieces’ in the game. A strong player knows which group is strong or weak and what the final score will likely be. With professionals, the difference in final score is generally small, something like two points, while a player can score about 180 points on a board with 19 by 19 lines,” says Dorst. “With professionals, a game often lasts about 250 moves. Someone once calculated that the maximum number of moves can be 2×10⁴⁶, but a person does not survive that.” A normal game between professionals usually lasts about five hours, although at a Japanese title match sometimes twenty hours are clocked.

Go is, in any case, a game of big numbers. It is therefore practically impossible for a computer to calculate in advance all possible moves from a given position, as is done in chess. AlphaGo uses various machine learning elements. Max Welling, professor of machine learning at the UvA, explains in a nutshell how AlphaGo works, on the basis of the paper published in January.

“Despite the fact that AlphaGo has probably changed significantly since the last time the computer played against a champion, not much will have changed in the basics,” says Welling. “To go players it was clear: AlphaGo would not win. But the current state of affairs is different.”

AlphaGo uses four machine learning ingredients: supervised deep learning, reinforcement learning, Monte Carlo Tree Search, and deep convolutional networks, which it uses to scan the game board and recognize it as an image.

Learning to predict moves from data of previous matches is called supervised learning. In that case, there is an existing dataset from which predictions are made. If a prediction is wrong, the model is adjusted slightly, until the result is correct.
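The idea can be illustrated with a toy sketch, which is not AlphaGo’s actual code: a tiny logistic-regression “move predictor” is trained on invented position data, and its parameters are nudged whenever its predictions are off.

```python
import numpy as np

# Toy supervised learning: predict an "expert move" (0 or 1) from made-up
# 4-feature board positions. A stand-in for AlphaGo's move-prediction
# network, which is a deep net trained on real professional games.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4)).astype(float)
y = (X[:, 0] + X[:, 1] > 1).astype(float)  # invented "expert" rule

w = np.zeros(4)  # model parameters, adjusted when predictions are wrong
b = 0.0
lr = 0.5

def predict(X):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))  # probability of move 1

def log_loss(p, y):
    return float(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))

loss_before = log_loss(predict(X), y)
for _ in range(200):
    p = predict(X)
    w -= lr * X.T @ (p - y) / len(y)   # gradient step on the log loss
    b -= lr * float(np.mean(p - y))
loss_after = log_loss(predict(X), y)
```

After training, the loss on the dataset is lower than before: the repeated small adjustments are exactly the “adjust until the result is correct” loop described above.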

The second ingredient is reinforcement learning. Here, the neural network itself performs an action, such as placing a stone on a specific position on the board. It then finds out whether that move would lead to a win or a loss. If it wins, the policy can be improved. “But those kinds of actions can be quite noisy,” Welling says.
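The same principle can be shown in miniature with a REINFORCE-style update on an invented two-move choice rather than a real go position; the win probabilities are made up for the example, and the noisy ±1 reward is exactly the noise Welling mentions.

```python
import math
import random

random.seed(1)

# Toy reinforcement learning: two candidate moves; move 1 wins more often.
# The win probabilities are invented for this illustration.
WIN_PROB = [0.3, 0.7]
logits = [0.0, 0.0]  # policy parameters, one per move
lr = 0.1

def policy():
    z = [math.exp(l) for l in logits]
    return [v / sum(z) for v in z]

for _ in range(2000):
    probs = policy()
    move = random.choices([0, 1], weights=probs)[0]
    # The "game": a noisy win (+1) or loss (-1).
    reward = 1.0 if random.random() < WIN_PROB[move] else -1.0
    # REINFORCE update: push the chosen move's probability up after a win,
    # down after a loss.
    for a in (0, 1):
        grad = (1.0 - probs[a]) if a == move else -probs[a]
        logits[a] += lr * reward * grad
```

Despite the noisy individual outcomes, the policy drifts toward the move that wins more often, because wins and losses are averaged over many tries.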

In addition, AlphaGo analyzes games that people have played. How would a human react? Where would a human put a stone? Then there is a network that generates new games itself, millions of them. That dataset allows another network to learn and train. This latter network is not concerned with the value of an individual move; it looks at the value of the position. In this way AlphaGo trains itself on both human moves and its own.
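The value-of-a-position idea can be sketched as follows, with invented “self-play” data and a simple tanh model standing in for AlphaGo’s deep value network: each position is labelled with the final outcome of the game it came from, and the model learns to predict that outcome.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented self-play data: each row is a position encoded as 6 features,
# labelled with the final outcome (+1 win, -1 loss) of the game it came
# from. A linear model with a tanh output stands in for the deep value net.
true_w = np.array([0.8, -0.5, 0.3, 0.0, 0.1, -0.2])
X = rng.normal(size=(500, 6))
outcome = np.sign(X @ true_w + 0.1 * rng.normal(size=500))

w = np.zeros(6)
lr = 0.1
for _ in range(300):
    v = np.tanh(X @ w)                           # predicted value per position
    grad = X.T @ ((v - outcome) * (1 - v**2)) / len(X)
    w -= lr * grad

mse = float(np.mean((np.tanh(X @ w) - outcome) ** 2))
```

Starting from a model that knows nothing (error 1.0 against the ±1 labels), training on the generated games drives the error down: the network learns which positions tend to end in a win without ever rating an individual move.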

“Eventually, Monte Carlo Tree Search comes into play,” Welling says. “Every move has a value. In chess you can try every possible move within a certain amount of time, and then the best possible move is chosen. In go there are too many possible moves.”

However, as noted earlier, AlphaGo does sometimes play out entire games, in a ‘cheap’ way. Once it wins or loses, it returns to the point where the playout started and repeats the process. The information this yields is fed back into the Monte Carlo tree search.
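A bare-bones version of this select / expand / play out / feed back loop can be shown on a game far smaller than go. The sketch below, an illustration and not AlphaGo’s implementation, runs Monte Carlo Tree Search on a one-pile Nim game: players alternately take 1 or 2 stones, and whoever takes the last stone wins.

```python
import math
import random

random.seed(3)

# Stand-in game for go: one-pile Nim. A player removes 1 or 2 stones;
# whoever takes the last stone wins. Small enough to follow MCTS by hand.
def moves(stones):
    return [m for m in (1, 2) if m <= stones]

class Node:
    def __init__(self, stones, parent=None):
        self.stones = stones       # stones left, seen by the player to move
        self.parent = parent
        self.children = {}         # move -> Node
        self.visits = 0
        self.wins = 0.0            # wins for the player who moved into this node

def uct(parent, child):
    # Standard UCT score: exploit good children, keep exploring rare ones.
    return (child.wins / child.visits
            + 1.4 * math.sqrt(math.log(parent.visits) / child.visits))

def rollout(stones):
    # Cheap random playout: True if the player to move wins.
    to_move_wins = True
    while True:
        stones -= random.choice(moves(stones))
        if stones == 0:
            return to_move_wins
        to_move_wins = not to_move_wins

def mcts(root_stones, iters=3000):
    root = Node(root_stones)
    for _ in range(iters):
        node = root
        # 1. Select: descend through fully expanded nodes via UCT.
        while node.stones > 0 and len(node.children) == len(moves(node.stones)):
            parent = node
            node = max(parent.children.values(), key=lambda c: uct(parent, c))
        # 2. Expand: add one untried move, if the game is not over.
        if node.stones > 0:
            m = random.choice([m for m in moves(node.stones)
                               if m not in node.children])
            node.children[m] = Node(node.stones - m, parent=node)
            node = node.children[m]
        # 3. Play out: a random game from here -- the 'cheap' evaluation.
        mover_wins = rollout(node.stones) if node.stones > 0 else False
        # 4. Feed back: walk up the tree, flipping perspective each step.
        win = not mover_wins
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if win else 0.0
            win = not win
            node = node.parent
    # The recommended move is the most-visited child of the root.
    return max(root.children, key=lambda m: root.children[m].visits)
```

With 5 stones on the pile, the search settles on taking 2, which leaves the opponent a pile of 3, a lost position under perfect play; the random playouts, fed back up the tree, are enough to find this without exhaustive search.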

Compared to the chess computer Deep Blue, AlphaGo evaluates about a thousand times fewer board positions. Instead, it relies much more heavily on machine learning.

The full lecture, including slides, can be viewed via the UvA web lectures.
