In the example on the left, S1 corresponds to three different translations, (S1, T1), (S1, T2), (S1, T3 T4), so its translation entropy is relatively high. We replace all the corresponding translations with a single special placeholder token to reduce the entropy of the word's translation distribution. On the right, we propose three methods to improve translation quality: pre-training, multi-task learning, and two-pass decoding. Interested readers can refer to the paper.
The experimental results show that, compared with the Transformer baseline, Chinese-English translation quality improved significantly, and the proportion of missed translations of high-entropy words dropped markedly.
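The entropy measure behind this method can be made concrete with a small sketch. This is illustrative only (the counts, token name, and helper are hypothetical, not from the paper): given how often a source word aligns to each target translation, a word with many spread-out translations has high entropy, and collapsing them into one placeholder drives the entropy to zero.

```python
import math
from collections import Counter

def translation_entropy(translation_counts):
    """Entropy of a source word's translation distribution.
    High entropy means the word is translated many different ways."""
    total = sum(translation_counts.values())
    probs = [c / total for c in translation_counts.values()]
    return -sum(p * math.log2(p) for p in probs)

# Toy alignment counts for source word S1 (hypothetical numbers).
s1_counts = Counter({"T1": 4, "T2": 3, "T3 T4": 3})
h = translation_entropy(s1_counts)  # close to log2(3), i.e. about 1.57 bits

# Collapsing all translations into one placeholder token drops entropy to 0,
# which is exactly what the special-token replacement achieves.
collapsed = Counter({"<S1_PLACEHOLDER>": 10})
```

A uniform two-way split would give exactly 1 bit; the more ways a word scatters across translations, the higher the score, which is how high-entropy words can be selected for replacement.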
Sparse data
The second challenge is sparse data. This problem is more serious for neural machine translation than for statistical machine translation: experiments show that neural models are more sensitive to the amount of training data.
To address sparse data, a multilingual translation model based on multi-task learning was proposed. In multilingual translation, the source languages share an encoder, while on the decoding side different target languages use different decoders. In this way, encoder information is shared across source languages, which alleviates the data sparsity problem. Later, the University of Montreal, Google, and others carried out further work in this direction.
Experiments show that this method converges faster and that translation quality improves noticeably. Please refer to the paper for details.
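The shared-encoder, per-language-decoder architecture described above can be sketched structurally. Everything below is a toy stand-in for illustration, not the paper's code: a real system would use neural encoder/decoder modules, but the routing of one shared encoder into language-specific decoders is the same.

```python
# Minimal structural sketch of one-to-many multilingual NMT:
# a single shared encoder, one decoder per target language.

class SharedEncoder:
    def encode(self, source_tokens):
        # A real encoder returns contextual vectors; we tag tokens as
        # "encoded" so the data flow stays visible.
        return [("enc", tok) for tok in source_tokens]

class Decoder:
    def __init__(self, lang):
        self.lang = lang

    def decode(self, encoded_states):
        # Placeholder: emit one target token per encoder state.
        return [f"{self.lang}:{tok}" for _, tok in encoded_states]

class MultilingualNMT:
    def __init__(self, target_langs):
        self.encoder = SharedEncoder()                          # shared
        self.decoders = {l: Decoder(l) for l in target_langs}   # per-language

    def translate(self, source_tokens, target_lang):
        states = self.encoder.encode(source_tokens)  # one encoder for all pairs
        return self.decoders[target_lang].decode(states)

model = MultilingualNMT(["en", "fr"])
draft_en = model.translate(["wo", "ai", "ni"], "en")
```

Because every language pair passes through the same `SharedEncoder` instance, source-side training signal is pooled, which is the mechanism that helps low-resource pairs.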
This paper, a best paper at EMNLP 2018, proposes a unified framework. In (a), the blue dots and red dots represent sentences in two different languages. The question is: how do we build a translation system from only the monolingual data of the two languages?
First, we need an initialization step, which is (b): build a dictionary that aligns words between the two languages. (c) is the language model: trained on monolingual data, it measures how fluent a sentence is in that language. And what is (d)? (d) is a technique called back-translation, which is a widely used data augmentation method today.
With the dictionary built during initialization in (b), we can translate from one language to the other, even if at first only word by word. The other language's language model then scores these translations; the high-scoring sentences are selected and translated back. This process is back-translation, and the original language's language model scores the resulting sentences in turn. Iterating in this way, the synthetic data gets better and better, and so does the translation quality of the system.
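One round of this loop can be sketched as follows. This is a hedged toy version: the dictionary, the unigram "language model", and the selection rule are all illustrative stand-ins for the neural components in the actual framework, but the (b) initialize, (c) score, (d) back-translate structure is the same.

```python
# Toy sketch of one round of the unsupervised-MT loop:
# (b) dictionary initialization, (c) language-model scoring, (d) back-translation.

def word_by_word_translate(sentence, dictionary):
    # (b) initialization: translate word by word with a bilingual dictionary,
    # passing unknown words through unchanged.
    return [dictionary.get(w, w) for w in sentence]

def lm_score(sentence, lang_model):
    # (c) language model: here just an average unigram "fluency" score.
    if not sentence:
        return 0.0
    return sum(lang_model.get(w, 0.0) for w in sentence) / len(sentence)

def back_translation_round(mono_src, dict_s2t, dict_t2s, lm_tgt, keep=0.5):
    # (d) back-translation: translate monolingual source sentences, keep the
    # most fluent outputs by target-LM score, and translate them back to form
    # synthetic parallel pairs for the next training round.
    translated = [(s, word_by_word_translate(s, dict_s2t)) for s in mono_src]
    translated.sort(key=lambda p: lm_score(p[1], lm_tgt), reverse=True)
    kept = translated[: max(1, int(len(translated) * keep))]
    return [(word_by_word_translate(t, dict_t2s), t) for _, t in kept]

dict_s2t = {"hund": "dog", "katze": "cat"}     # hypothetical seed dictionary
dict_t2s = {v: k for k, v in dict_s2t.items()}
lm_tgt = {"dog": 0.9, "cat": 0.8}              # hypothetical target LM
mono = [["hund"], ["katze"], ["xyz"]]
pairs = back_translation_round(mono, dict_s2t, dict_t2s, lm_tgt)
```

In a real system, the synthetic pairs would be used to retrain both translation directions, and the improved models would generate better data on the next iteration.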
Introducing knowledge
The third challenge is introducing knowledge. How to bring more knowledge into the translation model is a long-standing challenge for machine translation. In this example, a Chinese word (rendered here as "cross flow") was left untranslated in the target output, where it is marked by the special symbol UNK (unknown word).
So what do we do? We introduced several kinds of knowledge. The first is a phrase table or word table: if we find that "cross flow" has not been translated, we look it up in this dictionary, which serves as external knowledge. We also introduce a language model to measure whether the output is fluent in the target language, and a length reward feature that favors longer outputs, since a longer output is less likely to omit information. This work was the first to introduce features from statistical machine translation into neural machine translation, and it can serve as a general framework for incorporating knowledge.
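Combining these features follows the log-linear recipe familiar from statistical machine translation. The sketch below is illustrative only: the weights, scores, and hypothesis strings are made up, and a real system would integrate the features into beam search rather than rescoring two finished hypotheses.

```python
# Hedged sketch of scoring an NMT hypothesis with SMT-style features:
# a phrase/word table, a target language model, and a length reward.

def combined_score(hyp, nmt_logprob, lm_logprob, phrase_table_hits, weights):
    # Log-linear combination, as in statistical machine translation.
    w_nmt, w_lm, w_pt, w_len = weights
    return (w_nmt * nmt_logprob
            + w_lm * lm_logprob
            + w_pt * phrase_table_hits   # coverage by the external dictionary
            + w_len * len(hyp))          # length reward: fewer omissions

weights = (1.0, 0.5, 0.3, 0.1)           # illustrative feature weights
short_hyp = ["river"]                     # omits the "cross flow" content
long_hyp = ["the", "river", "flows", "freely"]

s_short = combined_score(short_hyp, -1.0, -2.0, 0, weights)
s_long = combined_score(long_hyp, -1.2, -2.1, 1, weights)
# With the length reward and dictionary feature, the fuller hypothesis can
# win even though its raw NMT score is slightly lower.
```

This is the sense in which a length reward "rewards long sentences": it counterbalances the NMT model's tendency to prefer short outputs that silently drop content.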
But at present the knowledge being introduced is still shallow; deeper work is needed. For example, this sentence is ambiguous: without context, the Chinese abbreviation in "China and Pakistan" (中巴) could stand for either Pakistan or Brazil.
But the following sentence contains a qualifier, "the BRICS framework". A person then knows how to translate it, because Brazil is a BRICS country and Pakistan is not. Can the machine know this? You can check with a translation engine. People know which countries belong to the BRICS, but machines do not have this knowledge. How to give such knowledge to machines is a very challenging problem.
Another challenge is interpretability: is neural machine translation divine, or merely neural? (In Chinese, "divine" and "neural" share a character, hence the pun.) Although one can optimize the system and improve quality by designing and tuning the network structure, we still lack a deep understanding of how the method works.
There is also a good deal of work studying the internal working mechanisms of these networks. A paper from Tsinghua University approaches this from the perspective of attention.
For example, in the example on the left there is an UNK. How did the UNK arise? Although nothing was translated there, it appears in the right position and occupies a slot. Through the attention alignments, we can see that this UNK corresponds to "debtor country". The example on the right shows repeated translation: neural machine translation often omits or repeats words. Here the word "history" appears twice. The alignments show that the "history" in the sixth position is the repetition, and that its appearance is related not only to "American" in the first position and "history" in the second, but also to "The" in the fifth. Because of the definite article "the", the model believes a "history" should follow. The paper analyzes many such examples and offers both analysis and solutions; for details, please read the original text.
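The diagnostic used in these examples amounts to reading the attention matrix: for each target position, find the source position with the highest attention weight. A minimal sketch, with toy weights rather than real model output:

```python
# For each target position (a row of the attention matrix), find the source
# position it attends to most. Weights below are illustrative toy numbers.

def attended_source(attn_row):
    """Index of the source position with the highest attention weight."""
    return max(range(len(attn_row)), key=lambda j: attn_row[j])

# Rows = target positions, columns = source positions.
attention = [
    [0.7, 0.2, 0.1],   # target word 0 mostly attends to source word 0
    [0.1, 0.1, 0.8],   # target word 1 (say, an UNK) attends to source word 2
]
unk_source = attended_source(attention[1])
# A repeated target word can be traced the same way: if two target rows peak
# at the same source column, the model translated that source word twice.
```

This is how one can tell which source word an UNK "stands in for", and which source word a repeated target word was generated from.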
Document-level translation
The fifth long-standing challenge for machine translation is document-level translation. Most current systems translate sentence by sentence. Taken individually, the translations of these three sentences are acceptable; read together, however, they feel stiff and incoherent.
This is the output of our method. As you can see, the added definite articles and pronouns improve the coherence between sentences.
We propose a two-pass decoding method. In the first pass, a preliminary translation of each sentence is generated independently. In the second pass, each translation is polished using the first-pass results of the whole document, and a reinforcement learning model rewards the system for producing more fluent output. This is the result of our system; overall, fluency has improved.