Computers have already surpassed human champions in many other contests, such as chess, Othello, and the quiz show "Jeopardy!". But Go is an ancient game with a history of more than 2,500 years, and its complexity far exceeds that of chess; until now, top human players could beat even the strongest computer programs almost effortlessly. As recently as earlier this month, artificial intelligence experts outside Google doubted that a breakthrough in computer Go could come so quickly. Until last year, most believed it would take another ten years before a computer could beat a professional Go player.
Yet Google has now done just that. Rémi Coulom, a French researcher who had previously developed the world's strongest computer Go program, said, "This day came sooner than I expected."
In 2014, Google acquired DeepMind, which describes its mission as an "Apollo program for artificial intelligence." In October last year, DeepMind's research team held a match between its artificial intelligence and a human player at its London office. DeepMind's system is called AlphaGo, and its opponent was the European Go champion Fan Hui. Under the supervision of an editor from the journal Nature and a referee from the British Go Association, AlphaGo won the five-game Go match by an overwhelming 5-0. Dr. Tanguy Chouard, an editor at Nature, said in a media conference call on Tuesday: "This is one of the most exciting moments in my career, both as a researcher and as an editor."
A paper published in Nature describes DeepMind's system, which combines several techniques, including an increasingly important one: deep learning. Using a large corpus of human game records (totaling about 30 million moves), DeepMind's team trained AlphaGo to play Go on its own. But that was only the first step: in theory, such training can only produce a program as strong as the best human players it learned from. To beat the best humans, the team had the system play against itself. This generated new data, which could be used to train new versions of the system that eventually surpass top experts.
Demis Hassabis, the head of DeepMind, said: "The most important thing is that AlphaGo is not an expert system following hand-crafted rules. Instead, it uses general machine-learning techniques to figure out for itself how to win at Go."
Such victories for artificial intelligence are not new in themselves. Internet services from Google, Facebook, and Microsoft have long used deep learning to recognize photos and speech and to understand natural language. But DeepMind's technique, which combines deep learning with reinforcement learning and other methods, points the way toward how real-world robots might learn everyday tasks and respond to their environment. Hassabis said: "This is very well suited to robotics."
He also believes these methods can accelerate scientific research: by bringing artificial intelligence systems into their work, scientists could achieve more. "The system can process much larger data sets, extract structure from them, and hand it to human experts, improving efficiency. It could even suggest methods and approaches to human experts, helping them make breakthroughs."
For now, though, Go remains his focus. Having beaten a professional player behind closed doors, Hassabis and his team have set their sights on the world's best. In mid-March, AlphaGo will publicly challenge Lee Sedol in South Korea. Lee ranks second in the number of international titles won, and has had the highest winning percentage of the past ten years. Hassabis calls him "the Roger Federer of Go."
Harder than chess
In early 2014, Crazystone, Coulom's Go program, challenged Norimoto Yoda, a ninth-dan professional, at a tournament in Japan and won. But the victory came with a caveat: Crazystone was given a four-stone handicap. Coulom predicted that it would take another ten years before artificial intelligence could beat a top Go player without a handicap.
The difficulty of the challenge lies in Go itself. No supercomputer has enough processing power to predict, in a reasonable amount of time, the outcome of every possible sequence of moves. When IBM's Deep Blue defeated chess champion Garry Kasparov in 1997, the supercomputer did so with what amounted to brute force: in essence, Deep Blue analyzed the possible outcomes of every move. That approach does not work in Go. In chess, an average turn offers about 35 possible moves; Go, played on a 19×19 board, offers about 250 per turn. Hassabis points out that there are more possible board positions in Go than atoms in the universe.
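A rough back-of-the-envelope calculation shows why brute force fails. The per-turn figures (35 for chess, 250 for Go) come from the article; the typical game lengths used below (about 80 plies for chess, 150 for Go) are outside assumptions, so treat the exponents as illustrative only:

```python
import math

# Illustrative game-tree sizes: (moves per turn) ** (moves per game).
# Assumed game lengths: ~80 plies for chess, ~150 for Go.
chess_tree = 35 ** 80
go_tree = 250 ** 150

# math.log10 handles arbitrarily large Python ints without overflow.
print(f"chess game tree ~ 10^{int(math.log10(chess_tree))}")  # ~10^123
print(f"go    game tree ~ 10^{int(math.log10(go_tree))}")     # ~10^359
# For comparison, the observable universe holds roughly 10^80 atoms.
```

Even the smaller (chess) number dwarfs the atom count, which is why Deep Blue still needed heavy pruning; the Go number is hundreds of orders of magnitude beyond that.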
Using a method called Monte Carlo tree search, systems like Crazystone can look further ahead. Combined with other techniques, such a computer can cut the analysis of possibilities down to something tractable, and it can beat some strong Go players — but not the best. For true masters, intuition is a large part of the game: these players choose moves based on the patterns of stones on the board, rather than by precisely analyzing the possible outcome of every move. Hassabis, himself a Go player, said: "Good positions look beautiful. The game seems to follow some kind of aesthetics. That is why it has endured for thousands of years."
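To make the idea concrete, here is a minimal Monte Carlo tree search (the common UCT variant) on a toy game — not Crazystone's or AlphaGo's actual code. The game is one-pile Nim: players alternately take 1–3 stones, and whoever takes the last stone wins. Instead of exhaustively searching the tree, the algorithm samples random playouts and gradually focuses on promising moves:

```python
import math
import random

def legal_moves(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def rollout(pile):
    """Random playout; True if the player to move from `pile` wins."""
    turn = 0
    while pile > 0:
        pile -= random.choice(legal_moves(pile))
        turn ^= 1
    return turn == 1

class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile, self.parent, self.move = pile, parent, move
        self.children, self.untried = [], legal_moves(pile)
        self.wins = 0    # wins for the player who just moved into this node
        self.visits = 0

    def ucb1(self, c=1.4):
        # Balance exploitation (win rate) against exploration (rarely tried moves).
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(pile, iters=3000):
    root = Node(pile)
    for _ in range(iters):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one untried child position.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.pile - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        result = rollout(node.pile)  # True = player to move at `node` wins
        # 4. Backpropagation: flip the winner's perspective at each level.
        while node:
            node.visits += 1
            node.wins += (not result)
            node, result = node.parent, not result
    # Recommend the most-visited move.
    return max(root.children, key=lambda n: n.visits).move
```

With enough iterations the search converges on the game-theoretically best move (for this game, leaving the opponent a pile that is a multiple of four). Real Go programs use the same selection–expansion–simulation–backpropagation loop, just with vastly larger state spaces and smarter playouts.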
Starting in 2015, however, several artificial intelligence teams — including researchers at the University of Edinburgh, Facebook, and DeepMind — began exploring whether deep learning could crack Go. Their idea was that deep learning could approximate the human intuition the game demands. Hassabis said: "Go has many cues, and pattern matching matters a great deal. Deep learning does that very well."
Self-enhancement
Deep learning is built on neural networks — webs of hardware and software that approximate the neurons in the human brain. They operate not by brute-force calculation or hand-written rules, but by analyzing large amounts of data to "learn" a task. Feed enough photos of wombats into a neural network and it can learn to recognize a wombat. Feed it enough spoken words and it can learn to recognize your speech. Feed it enough Go games and it can learn to play Go.
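As a purely illustrative sketch of "learning from examples without hand-written rules" (nothing like AlphaGo's actual architecture), here is a tiny NumPy neural network trained by backpropagation on XOR — a function no single-layer perceptron can represent, yet one hidden layer learns it from just four examples:

```python
import numpy as np

# Training data: XOR inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer (8 units)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer

losses, lr = [], 2.0
for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((p - y) ** 2)))
    # Backward pass: gradient of mean-squared error through both layers.
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = dp @ W2.T * h * (1 - h)
    dW1, db1 = X.T @ dh, dh.sum(0)
    # Gradient-descent update.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("loss:", losses[0], "->", losses[-1])
```

The network is never told the XOR rule; it infers it from data, exactly the property that makes the same machinery applicable to wombat photos, speech, or Go positions at much larger scale.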
At DeepMind, the University of Edinburgh, and Facebook, researchers hoped that by "observing" board positions, neural networks could master the way the game is played. As Facebook reported in a recent paper, the technique works quite well: by combining deep learning with the Monte Carlo tree method, its system has beaten some human players.
DeepMind, however, went further. After learning from 30 million moves played by humans, its neural network could predict a human player's next move 57 percent of the time — up from a previous record of 44 percent. Then Hassabis and his team turned the network on itself: through a process called reinforcement learning, slightly adjusted versions of the network played against one another, learning which moves yield the best results.
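A toy sketch of the self-play idea — emphatically not DeepMind's algorithm, just the principle on a small scale. Here a single value table plays both sides of one-pile Nim (take 1–3 stones, last stone wins) and updates its move values from the outcomes of its own games, with no human examples at all:

```python
import random

MOVES = (1, 2, 3)
Q = {}  # Q[(pile, move)] = estimated value for the player making the move

def legal(pile):
    return [m for m in MOVES if m <= pile]

def choose(pile, eps):
    # Epsilon-greedy: mostly play the best-known move, sometimes explore.
    if random.random() < eps:
        return random.choice(legal(pile))
    return max(legal(pile), key=lambda m: Q.get((pile, m), 0.0))

def train(games=20000, alpha=0.1, eps=0.2):
    for _ in range(games):
        pile, history = 10, []
        while pile > 0:                 # the table plays BOTH sides
            m = choose(pile, eps)
            history.append((pile, m))
            pile -= m
        # The player who took the last stone won; credit alternates +1/-1
        # back through the game, nudging each move's value toward the outcome.
        reward = 1.0
        for pile, m in reversed(history):
            q = Q.get((pile, m), 0.0)
            Q[(pile, m)] = q + alpha * (reward - q)
            reward = -reward

train()
best = {p: max(legal(p), key=lambda m: Q.get((p, m), 0.0)) for p in range(1, 11)}
```

After training purely against itself, the table's greedy policy recovers the game's known optimal strategy for small piles (take `pile % 4` stones). AlphaGo applies the same self-improvement loop, with deep neural networks in place of the table.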
David Silver, a researcher at DeepMind, said: "By playing millions of games between its neural networks, AlphaGo learned to discover new strategies for itself and gradually improve."
According to Silver, this is what makes AlphaGo better than other Go programs, including Crazystone. The researchers then fed the results into a second neural network which, after the first has suggested the next move, predicts the outcome of each possible play. This resembles what older systems like Deep Blue do — with the difference that AlphaGo learns as it goes and analyzes more data, rather than brute-forcing every possible outcome. In this way, AlphaGo can surpass not only existing artificial intelligence systems but human masters as well.
Dedicated chips
Like most state-of-the-art neural networks, DeepMind's system runs on computers equipped with GPUs (graphics processing units). GPUs were originally designed to render graphics for games and other visual applications, but in recent years they have proved remarkably well suited to deep learning. Hassabis said the system performs quite well on a single computer with multiple GPUs, but for the challenge against Fan Hui the researchers assembled a larger network of machines spanning about 170 GPU cards and 1,200 standard CPUs. This larger network both trained AlphaGo and played the match.
Hassabis said AlphaGo will use the same hardware configuration in the match against Lee Sedol, and the team is continuously improving the system in the meantime. To prepare for the match they also need a reliable Internet connection. "We are installing our own fiber," Hassabis said.
Coulom and other experts point out that the match against Lee Sedol will be harder. But Coulom is betting on DeepMind. For the past ten years he has hoped to build a system that could beat the top players, and he believes that system is now here. "I'm buying some GPUs," he said.
The road to the future
The importance of AlphaGo extends well beyond Go. Its techniques can be applied not only to robotics and scientific research but to many other tasks, from Siri-like smartphone voice assistants to financial investment decisions. Chris Nicholson, founder of the deep learning startup Skymind, said: "You can use it for any adversarial problem — anything that requires strategy, like competitions, war, or business deals."
For some, that prospect is worrying — especially given that DeepMind's system can, in effect, teach itself Go. Its learning material need not come from humans; it can generate its own training data by playing against itself. In recent months, Tesla founder Elon Musk and other prominent figures have warned that such artificial intelligence systems could eventually surpass human intelligence and slip beyond human control.
DeepMind's system, however, remains firmly under the control of Hassabis and his team. And though it tackles the most complex of board games, it is still just playing a game. AlphaGo is far from true human-level intelligence, let alone superintelligence.
Ryan Calo, a law professor who specializes in artificial intelligence at the University of Washington and the founder of the Science and Technology Policy Laboratory, said: "This is still a highly structured situation, not a real human understanding." However, AlphaGo points out the future direction. If DeepMind's artificial intelligence system can understand Go, then it can understand more information. Carlo said, "The universe is just a bigger game of Go."