Modèle is a deep-learning image generation algorithm proposed by French researchers in 2016. It is built on a generative model, the variational autoencoder (VAE), and combines it with convolutional neural networks (CNNs) to generate high-quality images.
Procedure
1. Dataset preparation
Modèle needs a large amount of image data for training, so the first step is to prepare a sufficiently large dataset. You can use public datasets such as MNIST or CIFAR-10, or your own data. The size and quality of the dataset strongly affect the training result, so choose it carefully.
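The preparation step can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the pipeline the original authors used: the helper names `scale_pixels` and `train_val_split`, the toy 2x2 images, and the 20% validation fraction are all assumptions made for the example.

```python
import random

def scale_pixels(image):
    """Map 8-bit pixel values (0..255) to floats in [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

def train_val_split(samples, val_fraction=0.1, seed=0):
    """Shuffle a copy of the samples and split off a validation subset."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical toy data: ten 2x2 grayscale images with 8-bit pixels.
raw_images = [[[i, 255 - i], [128, 0]] for i in range(0, 250, 25)]

# Scale every image, then hold out part of the data for validation.
images = [scale_pixels(img) for img in raw_images]
train_set, val_set = train_val_split(images, val_fraction=0.2)
print(len(train_set), len(val_set))
```

With a real dataset such as MNIST or CIFAR-10 the same two steps apply; only the loading code differs.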
2. Model structure
Modèle's model consists of two parts: an encoder and a decoder. The encoder maps the input image to a vector in the latent space, and the decoder maps a vector in the latent space back to an output image. The latent space is a low-dimensional vector space, and a latent vector can be regarded as a "feature representation" of the image.
Both the encoder and the decoder are multilayer convolutional neural networks. The encoder compresses the image layer by layer and finally outputs a low-dimensional vector; the decoder expands this vector layer by layer and finally outputs an image similar to the original.
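To make the layer-by-layer compression and expansion concrete, the sketch below traces the spatial size of a feature map through a stack of strided convolutions and the mirrored transposed convolutions. The source does not give layer settings, so the kernel size 4, stride 2, padding 1, the three layers per side, and the 32x32 input (the CIFAR-10 image size) are assumptions. The size formulas are the standard ones for convolution and transposed convolution.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def trace(size, layer_fn, num_layers):
    """Apply the same layer size rule repeatedly and record each size."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(layer_fn(sizes[-1]))
    return sizes

# Assumed setup: 32x32 input, three stride-2 layers on each side.
encoder_sizes = trace(32, conv_out, 3)
decoder_sizes = trace(encoder_sizes[-1], deconv_out, 3)
print(encoder_sizes)  # [32, 16, 8, 4]
print(decoder_sizes)  # [4, 8, 16, 32]
```

In a typical design, the final 4x4 feature map would be flattened and passed through a fully connected layer to produce the latent vector, and the decoder would start with the reverse of that step.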
3. Model training
Model training is the core of Modèle. During training, we minimize two quantities: the image reconstruction error and the mismatch between the distribution of latent vectors and a chosen prior distribution. Specifically, the loss function is defined as the sum of the reconstruction error and a KL divergence term, and the model parameters are updated by backpropagation.
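The loss described above can be written out as follows. This is a sketch under common VAE assumptions that the source does not spell out: the reconstruction error is a sum of squared differences, the encoder outputs a mean `mu` and log-variance `log_var` of a diagonal Gaussian, and the prior is a standard normal, for which the KL divergence has a closed form. The `kl_weight` parameter is a hypothetical knob added for illustration.

```python
import math

def reconstruction_error(original, reconstructed):
    """Sum of squared differences between flattened pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed))

def kl_divergence(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )

def vae_loss(original, reconstructed, mu, log_var, kl_weight=1.0):
    """Total loss: reconstruction error plus weighted KL term."""
    return (reconstruction_error(original, reconstructed)
            + kl_weight * kl_divergence(mu, log_var))

# Toy values: a two-pixel image and a one-dimensional latent vector.
loss = vae_loss([0.0, 1.0], [0.0, 0.5], mu=[1.0], log_var=[0.0])
print(loss)
```

In practice a deep learning framework computes this loss on tensors and derives the gradients for backpropagation automatically. The plain-Python version only shows which terms are being summed.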
4. Image generation
After training, we can use the encoder to convert any image into a latent vector and then use the decoder to convert that latent vector back into an output image. Because the latent vector is low-dimensional, we can interpolate between latent vectors or shift them within the latent space and decode the results to generate a variety of images.
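The two latent-space operations mentioned above, interpolation and translation, can be sketched on plain lists of floats. The function names and the two-dimensional example vectors are hypothetical. In a real system each resulting vector would be passed to the trained decoder to obtain an image.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0 inclusive
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

def translate(z, direction, amount):
    """Shift a latent vector along a direction by the given amount."""
    return [a + amount * d for a, d in zip(z, direction)]

# Hypothetical latent vectors, as if produced by the encoder for two images.
z_a = [0.0, 2.0]
z_b = [1.0, 0.0]

path = interpolate(z_a, z_b, steps=5)
shifted = translate(path[2], direction=[1.0, 0.0], amount=2.0)
print(path[2])   # midpoint: [0.5, 1.0]
print(shifted)   # [2.5, 1.0]
```

Decoding each vector in `path` in order would give a gradual transition from the first image to the second.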