Why do the authors of the Transformer paper claim that "attention is all you need"?
As the title of the paper, *Attention Is All You Need*, suggests, the Transformer abandons the traditional CNN and RNN entirely: the whole network structure is built from attention mechanisms. More precisely, the Transformer is composed of, and only of, self-attention and position-wise feed-forward layers.
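
To make the core operation concrete, here is a minimal sketch of the scaled dot-product attention that the paper builds everything on, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. This is an illustrative NumPy implementation, not the paper's code; the function name, shapes, and dimensions are assumptions chosen for the example.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V (Eq. 1 in the paper)."""
    d_k = Q.shape[-1]
    # Similarity of each query to every key, scaled to keep softmax gradients stable.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted sum of the value vectors.
    return weights @ V

# Illustrative shapes: batch of 1, sequence length 4, width d_k = 8 (all assumed).
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 8))
# In self-attention, Q, K, and V are all derived from the same sequence.
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (1, 4, 8)
```

Because every position attends directly to every other position in one step, this operation replaces the recurrence of RNNs and the local receptive fields of CNNs, which is the sense in which attention is "all you need."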