The motivation of this paper is to propose an improved GAN (generative adversarial network) for training on small datasets, in order to improve the diversity of generated data.
This paper has two main contributions:
1. The latent space of the original GAN is redefined as a mixture of Gaussians. Experiments show that this small change can effectively improve the diversity of generated samples and improve the stability of GAN training when training data is limited. They name this model DeLiGAN.
2. To quantitatively measure the intra-class diversity of generated samples, they propose an index called the modified inception score, m-IS for short.
The training goal of the original GAN is to map a latent variable z that follows a simple distribution into high-dimensional data that follows a complex distribution. Achieving diverse outputs this way generally requires a deep network, which is hard to train when data is limited. Therefore, the authors focus instead on increasing the complexity of z.
They define the distribution of z as a mixture of Gaussians:

$$p_z(z) = \sum_{i=1}^{N} \phi_i \, g(z \mid \mu_i, \Sigma_i)$$

where $g(z \mid \mu_i, \Sigma_i)$ is the density of the $i$-th Gaussian component and $\phi_i$ is the probability of sampling z from that component (fixed to the uniform choice $\phi_i = 1/N$ in the paper). This distribution is therefore equivalent to randomly selecting one of the N Gaussians and sampling z from it.
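To make the sampling procedure concrete, here is a minimal NumPy sketch of drawing z from such a uniform mixture (the component count `N`, latent dimension `dim`, and initial values are illustrative assumptions, not the paper's settings):

```python
import numpy as np

N, dim = 50, 100                            # illustrative: components, latent size
mu = np.random.uniform(-1, 1, (N, dim))     # component means (learned in DeLiGAN)
sigma = np.full((N, dim), 0.2)              # component std devs (learned in DeLiGAN)

def sample_z(batch_size):
    # phi_i = 1/N: pick a component uniformly at random for each sample,
    # then draw from that component's Gaussian.
    idx = np.random.randint(0, N, size=batch_size)
    return np.random.normal(mu[idx], sigma[idx])

z = sample_z(64)    # batch of latent vectors to feed the generator
```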
Each Gaussian component has two parameters, $\mu_i$ and $\sigma_i$, which belong to the model and must be learned. However, there is a problem: the gradients for these two parameters would have to be backpropagated through z, but z is produced by random sampling, so the gradient cannot pass through this node.
So a technique called reparameterization is needed. This technique was introduced by the authors of the VAE, and its principle is very simple: a sample from any Gaussian distribution can be written as a transformation of a standard Gaussian,

$$z = \mu + \sigma \epsilon, \qquad \epsilon \sim \mathcal{N}(0, 1).$$

In this way, z becomes a deterministic function of $\mu$, $\sigma$, and the random noise $\epsilon$, and we can differentiate through it with respect to $\mu$ and $\sigma$.
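A small PyTorch sketch (library choice and variable names are mine, for illustration only) shows how the reparameterized z lets gradients reach $\mu$ and $\sigma$:

```python
import torch

mu = torch.zeros(100, requires_grad=True)             # learnable mean
sigma = torch.full((100,), 0.2, requires_grad=True)   # learnable std dev

eps = torch.randn(100)   # eps ~ N(0, 1); carries no learnable parameters
z = mu + sigma * eps     # z ~ N(mu, sigma^2), now a differentiable expression

loss = z.pow(2).sum()    # stand-in for the GAN loss downstream of z
loss.backward()          # gradients flow back to mu and sigma through z
print(mu.grad.shape, sigma.grad.shape)   # both populated: torch.Size([100])
```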
The computation-graph nodes before and after applying the reparameterization trick change as follows, which makes the effect of the technique clear:
In this way, z in DeLiGAN is redefined as

$$z = \mu_i + \sigma_i \epsilon, \qquad \epsilon \sim \mathcal{N}(0, 1),$$

where $\mu_i$ and $\sigma_i$ are parameters that are optimized jointly with the rest of the model.
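Putting the mixture and the reparameterization together, a hedged sketch of DeLiGAN's latent layer as a PyTorch module could look like this (the class name `DeLiGANLatent` and all hyperparameter values are illustrative assumptions, not the paper's implementation):

```python
import torch
import torch.nn as nn

class DeLiGANLatent(nn.Module):
    def __init__(self, n_components=50, dim=100):
        super().__init__()
        # mu_i and sigma_i are trained jointly with the generator weights
        self.mu = nn.Parameter(torch.empty(n_components, dim).uniform_(-1, 1))
        self.sigma = nn.Parameter(torch.full((n_components, dim), 0.2))

    def forward(self, batch_size):
        idx = torch.randint(0, self.mu.size(0), (batch_size,))  # phi_i = 1/N
        eps = torch.randn(batch_size, self.mu.size(1))          # eps ~ N(0, 1)
        return self.mu[idx] + self.sigma[idx] * eps             # z = mu_i + sigma_i * eps

latent = DeLiGANLatent()
z = latent(64)   # drop-in replacement for the usual z ~ N(0, I) sampling
```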
The difference between DeLiGAN and the original GAN can be represented by a schematic diagram:
T. Salimans, I. Goodfellow, and others proposed an index named inception score in their paper "Improved Techniques for Training GANs" to evaluate the images generated by a GAN. Besides the quality of the generated images, the inception score also accounts for inter-class diversity. This paper modifies the inception score so that, in addition to image quality, it emphasizes intra-class diversity.
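For reference, the original inception score is defined as

$$\mathrm{IS} = \exp\left(\mathbb{E}_{x}\left[\mathrm{KL}\big(p(y \mid x) \,\|\, p(y)\big)\right]\right),$$

where $p(y)$ is the marginal label distribution over all generated images: a high score requires each $p(y \mid x)$ to be confident while the labels across images remain diverse.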
Details are as follows:
1) First, the images generated by the GAN are classified with the Inception CNN image-classification model, so for each image x we obtain a distribution p(y|x) over its class labels.
2) We want a generated image to be realistic enough for the classifier to judge it easily, so the profile of p(y|x) should have a clear "peak"; that is, p(y|x) should have relatively low entropy.
3) On the other hand, we want the diversity within a category to be large enough. For example, when the GAN generates two images x_i and x_j that are both classified as cars, their image details should still differ. Naturally, p(y|x_i) and p(y|x_j) should then differ noticeably, which in information-theoretic terms means the cross entropy between these two distributions should be relatively large.
In short, the modified inception score (m-IS) can be measured with the KL divergence:

$$\text{m-IS} = \exp\left(\mathbb{E}_{x_i}\left[\mathbb{E}_{x_j}\left[\mathrm{KL}\big(p(y \mid x_i) \,\|\, p(y \mid x_j)\big)\right]\right]\right),$$

where $x_i$ and $x_j$ are generated samples from the same class. Because

$$\mathrm{KL}\big(p(y \mid x_i) \,\|\, p(y \mid x_j)\big) = H\big(p(y \mid x_i), p(y \mid x_j)\big) - H\big(p(y \mid x_i)\big),$$

a relatively large KL divergence means the cross entropy is relatively large and the entropy is relatively small, in line with the above analysis.
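Assuming `probs` holds the classifier's softmax outputs for a batch of same-class generated images (shape `[n_images, n_classes]`), a short NumPy sketch of this computation could look like:

```python
import numpy as np

def m_is(probs, eps=1e-12):
    p = probs + eps                            # guard against log(0)
    # kl[i, j] = KL(p_i || p_j) for every ordered pair of images
    log_ratio = np.log(p)[:, None, :] - np.log(p)[None, :, :]
    kl = np.sum(p[:, None, :] * log_ratio, axis=-1)
    # mean over all ordered pairs (the zero diagonal slightly lowers it,
    # which is acceptable for a sketch), then exponentiate
    return np.exp(kl.mean())

probs = np.random.dirichlet(np.ones(10), size=64)  # stand-in classifier outputs
print(m_is(probs))
```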
In addition, the paper carries out a series of comparative experiments against the original GAN, and the results show that with limited training data the original GAN falls behind DeLiGAN in both image quality and diversity.
Fundamentally, the reason lies in the difference in latent-space complexity between the two models. When the real data distribution is complex and multi-modal and the training data is limited, a mixture of Gaussians is more flexible than a single Gaussian and can fit the real distribution more accurately.
In addition to its performance advantages, DeLiGAN has the advantage of simple implementation and can easily act as a "plug-in" for existing GAN architectures.
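As a usage illustration of this plug-in property (the generator below is a stand-in, assumed purely for demonstration; it reuses the `DeLiGANLatent` module sketched earlier):

```python
import torch.nn as nn

generator = nn.Sequential(                  # any existing generator works unchanged
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Tanh(),
)
latent = DeLiGANLatent(n_components=50, dim=100)
fake_images = generator(latent(64))         # only the sampling of z changed
```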