In recent months, the pace at which Google Translate is evolving seems to have suddenly accelerated.
The first paper is Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.

First of all, the previous translation system had the following shortcomings:

Phrase-based translation produces poor results on long sentences.

Training the system and running translation inference are computationally very expensive.

Rare and uncommon words are handled poorly.

These are obvious shortcomings, and they prevent the translation system from being both accurate and fast in practical use. The following figure shows the framework of the system's core algorithm:

Google's neural machine translation system (GNMT) consists of a deep LSTM network with 8 encoder layers and 8 decoder layers, and it also uses an attention mechanism and residual connections. To improve parallelism and thereby reduce training time, the attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To speed up the final translation, low-precision arithmetic is used during inference. To improve the handling of rare words, words are divided into a limited set of common sub-word units (components of words) for both input and output. This approach provides a balance between the flexibility of character-level models and the efficiency of word-level models, naturally handles the translation of rare words, and ultimately improves the overall accuracy of the system. The beam search technique uses a length-normalization procedure and a coverage penalty, which encourages the generation of output sentences that are likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves results competitive with the current best systems. In a comparative evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared with the phrase-based system that Google had in production.
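To make the decoding step more concrete, here is a minimal sketch (not code from the paper) of the beam-search rescoring described above, combining length normalization with a coverage penalty. The function names and the hyperparameter values alpha and beta are illustrative assumptions; only the scoring formulas follow the paper.

```python
import math

def length_penalty(output_length, alpha=0.6):
    # lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha
    return ((5.0 + output_length) / 6.0) ** alpha

def coverage_penalty(attention, beta=0.2):
    # attention[i][j]: attention weight of the j-th target word on the i-th source word.
    # cp(X; Y) = beta * sum_i log(min(sum_j p_ij, 1.0))
    # Hypotheses whose attention never "covers" some source word are penalized.
    total = 0.0
    for source_word_attention in attention:
        total += math.log(min(sum(source_word_attention), 1.0))
    return beta * total

def hypothesis_score(log_prob, output_length, attention, alpha=0.6, beta=0.2):
    # s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y)
    return log_prob / length_penalty(output_length, alpha) + coverage_penalty(attention, beta)

# Example: compare two candidate translations of a 3-word source sentence.
# Candidate B has a slightly higher log-probability but leaves the third
# source word almost unattended, so the coverage penalty pulls it down.
cand_a = hypothesis_score(-4.2, 4, [[0.5, 0.3, 0.1, 0.1],
                                    [0.3, 0.4, 0.2, 0.1],
                                    [0.2, 0.3, 0.4, 0.1]])
cand_b = hypothesis_score(-3.9, 2, [[0.90, 0.05],
                                    [0.05, 0.90],
                                    [0.05, 0.05]])
print(cand_a, cand_b)
```

Without the length normalization, shorter hypotheses would be unfairly favored because their log-probabilities sum over fewer terms; the coverage penalty additionally discourages translations that drop parts of the source sentence.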

The above is a translation of the paper's abstract. With the application of deep learning to natural language processing and the introduction of new techniques such as batch normalization, various LSTM variants, and attention mechanisms, performance in practical applications has improved. But Google is still Google, always making big news.