VQ-GAN | PyTorch Implementation


In this video we are implementing the famous Vector Quantized Generative Adversarial Networks (VQGAN) paper using PyTorch. VQGAN is a generative model for image modeling, introduced in Taming Transformers for High-Resolution Image Synthesis. The concept is built upon two stages. The first stage learns in an autoencoder-like fashion: images are encoded into a low-dimensional latent space, then vector quantization is applied using a codebook. Afterwards, the quantized latent vectors are projected back to the original image space by a decoder. Encoder and decoder are fully convolutional. The second stage learns a transformer over the latent space. Over the course of training it learns which codebook vectors go together and which do not. This can then be used in an autoregressive fashion to generate previously unseen images from the data distribution.
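The codebook lookup described above can be sketched as a small PyTorch module: each encoder latent is snapped to its nearest codebook vector, and a straight-through estimator lets gradients flow back through the non-differentiable lookup. This is a minimal illustration, not the video's exact code; the sizes, names, and the `beta` commitment weight are assumptions in the spirit of the VQ-VAE / VQGAN papers.

```python
import torch
import torch.nn as nn

class Codebook(nn.Module):
    """Minimal vector-quantization sketch (illustrative sizes/names)."""

    def __init__(self, num_codes=1024, dim=256, beta=0.25):
        super().__init__()
        self.beta = beta  # commitment-loss weight (assumed, as in VQ-VAE)
        self.embedding = nn.Embedding(num_codes, dim)
        self.embedding.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z):
        # z: (B, C, H, W) latents from the fully convolutional encoder
        b, c, h, w = z.shape
        z_flat = z.permute(0, 2, 3, 1).reshape(-1, c)  # (B*H*W, C)
        # squared L2 distance from every latent to every codebook vector
        d = (z_flat.pow(2).sum(1, keepdim=True)
             - 2 * z_flat @ self.embedding.weight.t()
             + self.embedding.weight.pow(2).sum(1))
        indices = d.argmin(dim=1)  # nearest code per latent position
        z_q = self.embedding(indices).view(b, h, w, c).permute(0, 3, 1, 2)
        # codebook loss pulls codes toward latents; commitment loss does the reverse
        loss = ((z_q - z.detach()) ** 2).mean() \
             + self.beta * ((z_q.detach() - z) ** 2).mean()
        # straight-through estimator: copy gradients from z_q to z
        z_q = z + (z_q - z).detach()
        return z_q, indices.view(b, h, w), loss
```

The returned index grid is exactly what the second-stage transformer is trained on: it models the sequence of codebook indices autoregressively, so sampling indices and decoding them yields new images.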

PyTorch Code: https://github.com/dome272/VQGAN

#deeplearning #gan #generative #vqgan #pytorch #transformer

0:00 Introduction
0:58 Helper modules
5:42 Encoder
7:55 Decoder
9:22 Codebook
12:25 VQGAN
15:49 Discriminator
16:30 LPIPS
17:37 Utils
19:48 Training: First Stage
26:09 Results: First Stage
27:25 Introducing Second Stage
28:00 GPT
29:03 VQGAN Transformer
34:04 Training: Second Stage
36:54 Results: Second Stage
37:41 Github Code & Outro

Further Reading:
• VAE: https://towardsdatascience.com/unders...
• VQVAE: https://arxiv.org/pdf/1711.00937.pdf
• Why CNNS are invariant to sizes: https://www.quora.com/How-are-variabl...
• NonLocal NN: https://arxiv.org/pdf/1711.07971.pdf
• PatchGAN: https://arxiv.org/pdf/1611.07004.pdf
• Hinge Loss: https://arxiv.org/pdf/1705.02894v2.pdf

Follow me on instagram lol:   / dome271  
