Residual network
The residual network (ResNet for short) was proposed in 2015, after the three classic CNN architectures AlexNet, GoogLeNet, and VGG, and took first place in the classification task of the ImageNet competition that year. Because it is simple and practical, ResNet is now widely used in detection, segmentation, recognition, and other fields.

ResNet can be said to be the most groundbreaking work in computer vision and deep learning of the past few years. It effectively solves the degradation problem, in which training accuracy drops as the network gets deeper, as shown in the following figure:

Anyone who has worked with deep learning knows that as the number of layers grows, one cause of poor training results is vanishing and exploding gradients, which prevent the parameters of the shallower layers from converging. This problem, however, has largely been solved by parameter initialization techniques and batch normalization; interested readers can consult references [2][3][4][5][6]. A small example of requesting such initializations is given below.
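As a small illustration, these initializations can be requested by name in tf.keras. The lines below are only a sketch, assuming TensorFlow 2.x and using purely illustrative layer sizes, showing the Glorot scheme from [3] and the He scheme from [5]:

```python
# Sketch: requesting the initializations from [3] and [5] in tf.keras.
# The layer sizes here are illustrative, not taken from any ResNet model.
from tensorflow.keras import layers

conv = layers.Conv2D(64, 3, padding="same",
                     kernel_initializer="he_normal")        # He et al. [5]
dense = layers.Dense(10,
                     kernel_initializer="glorot_uniform")   # Glorot & Bengio [3]
```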

Even so, once the network becomes very deep (such as the 56-layer network in the figure), performance still deteriorates. From the first three models, AlexNet, GoogLeNet, and VGG, we can see that network depth plays a vital role in image recognition: the deeper the network, the richer the hierarchy of features it can learn automatically. So what causes the deeper network to perform worse?

Figure 3: VGG-19 (left), a plain 34-layer network (middle), and a 34-layer ResNet (right).

On the left is the 19-layer VGG model, with a computational cost of 19.6 billion FLOPs; in the middle is a plain 34-layer convolutional network, with a computational cost of 3.6 billion FLOPs.

On the right is a 34-layer ResNet, also with a computational cost of 3.6 billion FLOPs. In the figure, the solid arrows are identity shortcuts where the feature-map size is unchanged, and the dotted arrows are shortcuts where the size changes. The comparison shows that although VGG has fewer layers, its computational cost is still much higher. The experimental data also show that the 34-layer ResNet performs better than the 19-layer VGG. A sketch of a residual block with both kinds of shortcut is given below.
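To make the two kinds of shortcut concrete, here is a minimal sketch of a basic two-convolution residual block in tf.keras. It assumes TensorFlow 2.x; the function name and the `downsample` flag are illustrative choices, not the paper's exact implementation.

```python
from tensorflow.keras import layers

def basic_block(x, filters, downsample=False):
    """A basic residual block: two 3x3 convolutions plus a shortcut.

    With downsample=False the shortcut is an identity mapping (the solid
    arrows in the figure); with downsample=True the feature-map size
    changes, so a strided 1x1 convolution projects the shortcut to the
    new shape (the dotted arrows).
    """
    stride = 2 if downsample else 1
    shortcut = x

    out = layers.Conv2D(filters, 3, strides=stride, padding="same", use_bias=False)(x)
    out = layers.BatchNormalization()(out)
    out = layers.ReLU()(out)
    out = layers.Conv2D(filters, 3, padding="same", use_bias=False)(out)
    out = layers.BatchNormalization()(out)

    if downsample:
        # Projection shortcut: match the new spatial size and channel count.
        shortcut = layers.Conv2D(filters, 1, strides=2, use_bias=False)(x)
        shortcut = layers.BatchNormalization()(shortcut)

    return layers.ReLU()(layers.Add()([out, shortcut]))
```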

As can be seen from the results, the 34-layer residual network also outperforms VGG and GoogLeNet. Among the three shortcut schemes A, B, and C, scheme C works best, but schemes B and C require much more computation than scheme A while improving the results only slightly, so the authors suggest that scheme A is the more practical choice. A sketch of the parameter-free scheme-A shortcut follows.
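For reference, here is a hedged sketch of what the scheme-A shortcut looks like: when the number of channels grows, the input is simply subsampled with stride 2 and the extra channels are filled with zeros, so no learned parameters are added. It assumes TensorFlow 2.x and NHWC tensors, and is an illustration rather than the authors' code.

```python
import tensorflow as tf

def option_a_shortcut(x, out_channels):
    """Parameter-free shortcut: subsample spatially and zero-pad channels."""
    in_channels = x.shape[-1]
    # Spatial subsampling with stride 2 (no learned parameters).
    shortcut = x[:, ::2, ::2, :]
    # Pad the channel dimension with zeros up to out_channels.
    pad = out_channels - in_channels
    return tf.pad(shortcut, [[0, 0], [0, 0], [0, 0], [0, pad]])
```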

Next, let's introduce the structure used in residual networks of 50 layers and above: the deeper bottleneck architecture. The authors designed this structure to reduce training time. The two block structures are compared in the following figure:
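A minimal sketch of one bottleneck block is shown below, assuming TensorFlow 2.x (the function name and the option-B-style projection on the shortcut are illustrative choices, not the authors' exact code). The first 1x1 convolution reduces the channel count, the 3x3 convolution then works on this smaller representation, and the last 1x1 convolution expands the channels again, which is what keeps the computation affordable in very deep networks.

```python
from tensorflow.keras import layers

def bottleneck_block(x, filters, stride=1):
    """Bottleneck residual block: 1x1 (reduce) -> 3x3 -> 1x1 (expand to 4x)."""
    shortcut = x

    out = layers.Conv2D(filters, 1, strides=stride, use_bias=False)(x)
    out = layers.BatchNormalization()(out)
    out = layers.ReLU()(out)

    out = layers.Conv2D(filters, 3, padding="same", use_bias=False)(out)
    out = layers.BatchNormalization()(out)
    out = layers.ReLU()(out)

    out = layers.Conv2D(4 * filters, 1, use_bias=False)(out)
    out = layers.BatchNormalization()(out)

    # Project the shortcut when the output shape differs from the input.
    if stride != 1 or shortcut.shape[-1] != 4 * filters:
        shortcut = layers.Conv2D(4 * filters, 1, strides=stride, use_bias=False)(x)
        shortcut = layers.BatchNormalization()(shortcut)

    return layers.ReLU()(layers.Add()([out, shortcut]))
```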

ResNet solves the degradation problem of deep networks through residual learning, so much deeper networks can be trained; this can fairly be called a historic breakthrough for deep networks. Perhaps an even better way to train still deeper networks will appear soon. Let's look forward to it!

At present, you can find a 34-layer TensorFlow implementation of the residual network (ResNet) on the artificial intelligence modeling platform Mo. The data set is CIFAR-10 (the 10-class CIFAR data set). This example reaches 90% accuracy on the test set and 98% on the validation set. The main program is in ResNet_Operator.py, the network's block structure is in ResNet_Block.py, and the trained model is saved in the results folder; a rough sketch of how such a run might be wired up is given after the project address below.

Source address of the project: /explore/5d1b0a031afd944132a0797d?Type=Application
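The snippet below is only a rough sketch of how such a CIFAR-10 run might be wired up with tf.keras; `build_resnet_34` is a hypothetical stand-in for the block-building code in ResNet_Block.py and ResNet_Operator.py, and the training hyperparameters are illustrative, not the ones used in the project.

```python
import tensorflow as tf

# Load the 10-class CIFAR-10 data set and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Hypothetical model builder standing in for ResNet_Block.py / ResNet_Operator.py.
model = build_resnet_34(input_shape=(32, 32, 3), num_classes=10)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=50, batch_size=128, validation_split=0.1)
model.evaluate(x_test, y_test)
```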

References:

[1] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.

[2] Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller. Efficient backprop. In Neural Networks: Tricks of the Trade, pp. 9-50. Springer, 1998.

[3] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In AISTATS, 2010.

[4] A. M. Saxe, J. L. McClelland, and S. Ganguli. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120, 2013.

[5] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In ICCV, 2015.

[6] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.

Mo is an online artificial intelligence modeling platform that supports Python and can help you quickly develop, train, and deploy models.

The Mo artificial intelligence club is a club initiated by the product design team of the Hehe website, committed to lowering the threshold for developing and using artificial intelligence. The team has experience in big data processing, analysis, visualization, and data modeling, has undertaken intelligent projects in many fields, and has full-stack design and development capabilities from the back end to the front end. Its main research direction is big data management and analysis and artificial intelligence technology for promoting data-driven scientific research.