In this paper, the author proposes DR-GAN, as shown in the figure below:
The following figure compares previous GAN architectures with the DR-GAN proposed by the author:
DR-GAN has two variants. The basic model, called single-image DR-GAN, takes one image as input. The other, multi-image DR-GAN, takes multiple images as input.
Generally speaking, a GAN consists of a generator G and a discriminator D that compete in a two-player minimax game. D tries to distinguish real images from generated ones, while G tries to generate realistic-looking images to fool D. As shown in the figure below:
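The minimax game described above can be illustrated numerically. The following is a minimal sketch, assuming the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]; the function name `gan_value` and the sample probabilities are hypothetical and chosen only for illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs (probability of "real") on real images.
    d_fake: discriminator outputs on generated images G(z).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# The discriminator wants this value to be large, the generator wants it small.
confident = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])  # D separates real from fake
fooled = gan_value(d_real=[0.9, 0.8], d_fake=[0.6, 0.7])     # G fools D more often
print(confident > fooled)
```

When the generator fools the discriminator more often, the second term drops and so does the value, which is the direction the generator pushes in.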
There are two obvious differences between single-image DR-GAN and traditional GAN.
Based on the description above, the problem can be formulated as follows:
Given a face image x and its label y = {y^d, y^p}, where y^d is the identity and y^p is the pose, our goal is to: 1. learn a facial feature representation that is independent of pose; 2. synthesize a face image with the same identity but a different pose. Moreover, the discriminator D is a multi-task CNN that consists of two parts: D = [D^d, D^p].
That is, given a real input face image, D tries to estimate its identity and pose, and given a face image synthesized by the generator, D tries to classify it as fake. This is expressed by the following formula:
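The discriminator's two tasks can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: it assumes the identity head outputs one score per known identity plus one extra "fake" class in the last position, and that the pose head outputs one score per discrete pose. All function names and logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def d_objective_real(id_logits, pose_logits, y_id, y_pose):
    """For a real image: log-prob of the true identity plus log-prob of the true pose."""
    return math.log(softmax(id_logits)[y_id]) + math.log(softmax(pose_logits)[y_pose])

def d_objective_fake(id_logits):
    """For a synthesized image: log-prob of the extra 'fake' class (assumed last index)."""
    return math.log(softmax(id_logits)[-1])

# Hypothetical setup: 3 identities (+1 fake class) and 2 poses.
real_score = d_objective_real([2.0, 0.1, 0.1, 0.1], [0.2, 1.5], y_id=0, y_pose=1)
fake_score = d_objective_fake([0.1, 0.1, 0.1, 2.0])
print(real_score, fake_score)
```

The discriminator is trained to make both quantities as large as possible, so it is rewarded for labeling real faces correctly and for flagging synthesized faces as fake.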
Meanwhile, the generator G consists of an encoder G_enc and a decoder G_dec. The encoder produces the feature representation of the input face image, f(x) = G_enc(x), and the decoder outputs the synthesized face image x̂ = G_dec(f(x), c, z), where c is the target pose code and z is noise. This is expressed by the following formula:
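The encoder-decoder data flow described above can be sketched with toy stand-ins. This sketch assumes the pose is given as a one-hot code and that the decoder receives the feature, pose code, and noise as one concatenated vector. The arithmetic inside `g_enc` and `g_dec` is a placeholder for real convolutional networks, and all names and sizes are hypothetical.

```python
import random

FEAT_DIM, NUM_POSES, NOISE_DIM = 4, 3, 2  # hypothetical sizes

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def g_enc(image):
    """Toy encoder: average each of FEAT_DIM equal chunks of the flat image."""
    step = len(image) // FEAT_DIM
    return [sum(image[i * step:(i + 1) * step]) / step for i in range(FEAT_DIM)]

def decoder_input(f, target_pose, z):
    """Concatenate the feature f(x), the one-hot pose code c, and the noise z."""
    return f + one_hot(target_pose, NUM_POSES) + z

def g_dec(code, out_size=8):
    """Toy decoder: a fixed linear map from the code to out_size pixels."""
    return [sum(v * ((i + j) % 3 - 1) for j, v in enumerate(code))
            for i in range(out_size)]

rng = random.Random(0)
x = [rng.random() for _ in range(8)]                 # flat stand-in for a face image
z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]  # noise vector
f = g_enc(x)                                         # computed once, without any pose input
x_hat_pose0 = g_dec(decoder_input(f, 0, z))
x_hat_pose2 = g_dec(decoder_input(f, 2, z))
```

The point of the split is visible in the last three lines: the feature `f` is computed once from the input image, and only the pose code passed to the decoder changes between the two synthesized outputs.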
As shown below, the multi-image DR-GAN uses the same discriminator D as the single-image model, but a different generator G.
Note that all of the encoders G_enc share the same set of parameters.
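Parameter sharing across multiple input images can be sketched as follows. This is a minimal illustration under two assumptions that the text above does not state: that each encoder pass also returns a scalar confidence, and that the per-image features are combined by a confidence-weighted average. The encoder arithmetic is a toy placeholder and all names are hypothetical.

```python
def shared_encoder(image, params):
    """Toy encoder: the same `params` are reused for every input image."""
    scale, bias = params
    feature = [scale * p + bias for p in image]
    confidence = sum(image) / len(image)  # toy confidence; assumes pixels in (0, 1]
    return feature, confidence

def fuse(features, confidences):
    """Confidence-weighted average of per-image features (assumed fusion rule)."""
    total = sum(confidences)
    dim = len(features[0])
    return [sum(w * f[k] for f, w in zip(features, confidences)) / total
            for k in range(dim)]

params = (2.0, 0.5)                    # one parameter set shared by all encoder passes
images = [[0.2, 0.4], [0.6, 0.8]]      # two toy images of the same subject
outputs = [shared_encoder(img, params) for img in images]
fused = fuse([f for f, _ in outputs], [w for _, w in outputs])
```

Because every image goes through the same parameters, the number of input images can vary without changing the model size; only the fusion step sees how many features were produced.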