This year marks the 70th anniversary of the publication of Alan Turing's paper introducing the concept of the "Turing test". In this paper, he addressed the question "Can machines think?" The goal of the test is to determine whether a machine can exhibit conversational behavior that is indistinguishable from a human's.
Turing predicted that by the year 2000, an average person would have less than a 70% chance of telling an artificial intelligence from a real person in an imitation game in which the evaluator does not know whether the responder is a human or a machine.
Why, 20 years past that mark, have we as an industry still not achieved this goal? I believe the goal Turing set is not a useful one for artificial intelligence scientists like me to work toward.
The Turing test is fraught with limitations, some of which Turing himself discussed in his groundbreaking paper. As artificial intelligence becomes widely integrated into our phones, cars, and homes, it is increasingly obvious that people care more about whether their interactions with machines are useful, seamless, and transparent, and that the idea of machines being indistinguishable from people is outdated.
Therefore, it is time to retire this legend, which has been a source of inspiration for 70 years, and set a brand-new challenge that inspires researchers and practitioners alike.
The Turing test and the public imagination
Within a few years of being proposed, the Turing test became the North Star of the artificial intelligence field.
ELIZA and PARRY, the earliest chatbots of the 1960s and 1970s, aimed to pass the Turing test. In 2014, the chatbot "Eugene Goostman" was declared to have passed the Turing test after fooling 33% of human judges into thinking it was a real person. However, as others have pointed out, the bar of fooling 30% of human judges is arbitrary, and even then the victory felt outdated to some.
Still, the Turing test continues to capture the public imagination. OpenAI's Generative Pre-trained Transformer 3 (GPT-3) language model made headlines for its potential to beat the Turing test. Similarly, journalists, business leaders, and other observers still ask me: "When will Alexa pass the Turing test?"
There is no doubt that the Turing test is one way to measure Alexa's intelligence, but is measuring Alexa's intelligence this way really important? Does it even make sense?
To answer this question, let's go back to the time when Turing first proposed the test.
In 1950, the first commercial computer had not yet been sold. The foundational research on fiber-optic cables would not be published for another four years, and artificial intelligence was not established as a field until 1956. Today, a mobile phone has about 100,000 times the computing power of Apollo 11. Coupled with cloud computing and high-bandwidth connections, artificial intelligence can now make decisions based on massive amounts of data within seconds.
Although Turing's original idea can still inspire us, treating the Turing test as the ultimate marker of progress in artificial intelligence is inevitably limited by the era in which it was proposed.
First of all, the Turing test all but ignores the machine-like attributes of artificial intelligence, such as fast computation and information lookup, which are among the most effective features of modern artificial intelligence.
The deliberate emphasis on deceiving humans means that to pass the Turing test, an artificial intelligence must pause before answering questions such as "Do you know what the cube root of 3434756 is?" or "How far is Seattle from Boston?"
In fact, artificial intelligence knows these answers immediately, and pausing to make its answers sound more human is not the best use of its skills.
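To make the point concrete, here is a minimal Python sketch of how quickly a machine arrives at the first of those answers. The function name `cube_root` is purely illustrative and is not taken from any particular assistant.

```python
import time


def cube_root(x: float) -> float:
    """Return the real cube root of a non-negative number."""
    return x ** (1.0 / 3.0)


start = time.perf_counter()
answer = cube_root(3434756)
elapsed = time.perf_counter() - start

# The answer must lie between 150 and 151,
# because 150**3 = 3375000 and 151**3 = 3442951.
print(f"cube root = {answer:.4f}, computed in {elapsed:.6f} seconds")
```

Any delay a system adds on top of this would exist only to imitate human hesitation.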
In addition, the Turing test does not take into account the growing ability of artificial intelligence to use sensors to hear, see, and feel the outside world. Instead, it is limited to text communication.
Second, to be practical today, these systems need to complete our everyday tasks efficiently. When you ask an artificial intelligence assistant to turn off the lights in the garage, you are not looking to start a conversation. You want it to fulfill the request immediately and notify you with a simple acknowledgment such as "OK".
Even when you have a wide-ranging conversation with an artificial intelligence assistant on a trending topic, or have it read a story to your child, you still want to know that it is an artificial intelligence and not a real person. In fact, pretending to be a real person to "fool" users poses a real risk. The dystopian possibilities are all too real: we have already begun to see bots that spread misinformation, as well as the emergence of deepfakes.
New major challenges for artificial intelligence
Instead of obsessing over making artificial intelligence indistinguishable from human beings, we should devote ourselves to building artificial intelligence that augments human intelligence and improves our daily lives in a fair and inclusive way.
A worthy underlying goal is for artificial intelligence to exhibit human-like attributes of intelligence, including common sense, self-supervision, and language proficiency, combined with machine-like efficiency such as fast search, memory recall, and completing tasks on your behalf. The end result is a system that learns and completes a variety of tasks and adapts to new situations, far beyond what an ordinary person can do.
This focus points to the research areas that truly matter in artificial intelligence: sensory understanding, conversation, broad and deep knowledge, efficient learning, reasoning for decision-making, and eliminating any inappropriate bias (that is, achieving fairness). Progress in these areas can be measured in many ways.
One way is to break a challenge down into its constituent tasks. For example, Kaggle's "Abstraction and Reasoning Challenge" focuses on solving reasoning tasks that the artificial intelligence has never seen before.
Another way is to design a large-scale, real-world challenge for human-computer interaction, such as the Alexa Prize Socialbot Grand Challenge, a conversational artificial intelligence competition for university students.
In fact, when we launched the Alexa Prize in 2016, we had an intense debate about how to evaluate the competing "socialbots". Should we try to convince people that a socialbot is a real person, running some form of the Turing test? Or do we want artificial intelligence that can converse naturally in order to advance learning, provide entertainment, or simply offer a welcome distraction?
We settled on a rubric that asks socialbots to converse coherently and engagingly with real people for 20 minutes on a wide range of popular topics, including entertainment, sports, politics, and technology.
In the development phases leading up to the finals, customers rate the socialbots according to whether they would want to talk to them again. In the finals, independent human judges score them on a five-point scale for coherence and naturalness.
If any socialbot holds conversations for an average of 20 minutes and scores above 4.0, it passes this grand challenge.
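The pass criterion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the competition's actual scoring code: the `Conversation` type, the `passes_grand_challenge` function, the treatment of "above 4.0" as a strict inequality on the mean rating, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Conversation:
    duration_minutes: float  # how long the finals conversation lasted
    rating: float            # judge's score on the five-point scale


def passes_grand_challenge(conversations, min_avg_minutes=20.0, min_avg_rating=4.0):
    """Check the two thresholds: average duration and average rating.

    Assumption: both thresholds apply to averages across all finals
    conversations, and the rating must be strictly above the threshold.
    """
    if not conversations:
        return False
    avg_minutes = mean(c.duration_minutes for c in conversations)
    avg_rating = mean(c.rating for c in conversations)
    return avg_minutes >= min_avg_minutes and avg_rating > min_avg_rating


# Made-up sample data for illustration only.
finals = [Conversation(21.5, 4.2), Conversation(19.0, 4.1), Conversation(22.0, 4.3)]
print(passes_grand_challenge(finals))
```

Note that a single short conversation does not fail a socialbot under this reading, because only the averages are compared against the thresholds.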
Although no socialbot has yet passed this grand challenge, the approach is guiding research and development toward artificial intelligence with human-like conversational abilities, powered by neural methods based on deep learning. It prioritizes artificial intelligence that shows humor and empathy where appropriate, without pretending to be a real person.