StarGAN v2: Diverse Image Synthesis for Multiple Domains

Authors: Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha
Date: Dec 04, 2019
URL: https://arxiv.org/abs/1912.01865

Abstract

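To train a model that performs image-to-image translation well, the following must be satisfied:
- Diversity of generated images
- Scalability over multiple domains
Existing methods address only one of these, resulting in either limited diversity or multiple models (networks) to cover all domains.
StarGAN v2 satisfies both conditions with a single framework.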

Introduction

Domain: a visually distinctive category of images.
Style: the unique appearance characteristics of each individual image.

StarGAN v2

Proposed framework

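The framework consists of four networks.
Generator (G)
- Takes an image x and a style code s as input and generates a new image.
- Uses adaptive instance normalization (AdaIN) to inject s.
Mapping network (F)
- Takes a latent code z and a domain code y as input and generates a style code s.
- Multi-layer perceptron (MLP) architecture.
Style encoder (E)
- Takes an image x and a domain code y as input and extracts the style code s of x.
Discriminator (D)
- Takes an image x as input and classifies it as real or fake for domain y (one output branch D_y per domain).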
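
To make the AdaIN step in G more concrete, here is a minimal NumPy sketch of adaptive instance normalization on a single feature map. It assumes the per-channel scale `gamma` and shift `beta` have already been derived from the style code s; how that mapping is done is not covered in these notes, and all names here are illustrative.

```python
import numpy as np

def adain(x, gamma, beta, eps=1e-5):
    """Adaptive instance normalization for a feature map x of shape (C, H, W).

    gamma and beta have shape (C,). They are assumed to come from the style
    code s (for example through a learned linear layer, not shown here).
    """
    # Normalize every channel to zero mean and unit standard deviation.
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    normalized = (x - mean) / (std + eps)
    # Re-scale and shift each channel with the style-dependent parameters.
    return gamma[:, None, None] * normalized + beta[:, None, None]

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 4))   # toy feature map
gamma = np.array([1.0, 2.0, 0.5])       # illustrative style scale
beta = np.array([0.0, -1.0, 3.0])       # illustrative style shift
out = adain(features, gamma, beta)
```

After the call, each channel of `out` has approximately the mean given by `beta` and the standard deviation given by `gamma`, which is how the style code controls the statistics of the generated features.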

Training objectives

Adversarial objective
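- The standard loss used in GANs.
- The target style code is $\tilde{\mathrm{s}} = F_{\tilde{y}}(\mathrm{z})$ for a sampled latent code z and target domain $\tilde{y}$.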

$$\mathcal{L}_{adv}=\mathbb{E}_{\mathrm{x},y}[\log{D_y}(\mathrm{x})] + \mathbb{E}_{\mathrm{x}, \tilde{y}, \mathrm{z}}[\log{(1-D_{\tilde{y}}(G(\mathrm{x}, \tilde{\mathrm{s}})))}]$$
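
As a rough numerical illustration of this formula, the sketch below evaluates the adversarial objective from discriminator outputs. It assumes the outputs are already probabilities in (0, 1); the function name and the sample values are made up for the example.

```python
import numpy as np

def adversarial_loss(d_real, d_fake, eps=1e-12):
    """L_adv = E[log D_y(x)] + E[log(1 - D_y~(G(x, s~)))].

    d_real: discriminator probabilities on real images.
    d_fake: discriminator probabilities on generated images.
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # eps guards against log(0) when a probability saturates.
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# A discriminator that separates real from fake well ...
confident = adversarial_loss([0.9, 0.8], [0.1, 0.2])
# ... versus one that cannot tell them apart.
confused = adversarial_loss([0.5, 0.5], [0.5, 0.5])
```

The value is higher for the confident discriminator, which matches the roles in the objective: D tries to increase this quantity while G tries to decrease it.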

Style reconstruction
- Feed G(x, s) into the style encoder E to extract a style code, then compare it with the input s.

$$\mathcal{L}_{sty}=\mathbb{E}_{\mathrm{x},\tilde{y}, \mathrm{z}}[\parallel\tilde{\mathrm{s}}-E_{\tilde{y}}(G(\mathrm{x}, \tilde{\mathrm{s}}))\parallel_1]$$
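
A minimal sketch of this term, assuming the target style codes and the codes re-extracted from the generated images are already available as arrays of shape (batch, style_dim). The arrays below are hand-picked toy values, not outputs of a real model.

```python
import numpy as np

def style_reconstruction_loss(s_target, s_recovered):
    """Batch mean of the L1 norm ||s~ - E_y~(G(x, s~))||_1."""
    diff = np.abs(np.asarray(s_target) - np.asarray(s_recovered))
    # Sum over the style dimension (L1 norm), then average over the batch.
    return diff.sum(axis=1).mean()

s_target = np.array([[1.0, -2.0, 0.5], [0.0, 1.0, 1.0]])      # s~
s_recovered = np.array([[1.0, -1.5, 0.5], [0.5, 1.0, 0.0]])   # E(G(x, s~)), toy values
loss = style_reconstruction_loss(s_target, s_recovered)
perfect = style_reconstruction_loss(s_target, s_target)
```

Averaging over the style dimension instead of summing would only rescale the loss by a constant factor, so either convention can be absorbed into $\lambda_{sty}$.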

Style diversification
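- Acts as a regularizer so that G can generate diverse images.
- Two latent codes z1 and z2 are mapped by F to style codes s1 and s2; each is passed to G together with the input x to generate a new image.
- The L1 norm of the difference between the two generated images is computed.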

$$\mathcal{L}_{ds}=\mathbb{E}_{\mathrm{x},\tilde{y}, \mathrm{z}_1, \mathrm{z}_2}[\parallel G(\mathrm{x}, \tilde{\mathrm{s}}_1) - G(\mathrm{x}, \tilde{\mathrm{s}}_2) \parallel_1]$$
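
The following sketch shows why this term discourages mode collapse. The `toy_generator` is a hypothetical stand-in (it just adds the mean of the style code to every pixel) with a `style_strength` knob that does not exist in the actual model; it is only there to contrast a generator that ignores s with one that uses it.

```python
import numpy as np

def diversity_loss(img_1, img_2):
    """Batch mean of ||G(x, s1) - G(x, s2)||_1."""
    diff = np.abs(np.asarray(img_1) - np.asarray(img_2))
    return diff.reshape(diff.shape[0], -1).sum(axis=1).mean()

def toy_generator(x, s, style_strength):
    # Hypothetical generator: adds the mean of the style code to every pixel.
    return x + style_strength * s.mean(axis=1)[:, None, None]

x = np.zeros((2, 4, 4))                   # batch of two 4x4 "images"
s1 = np.array([[1.0, 1.0], [0.0, 2.0]])   # toy style codes from z1
s2 = np.array([[3.0, 3.0], [2.0, 2.0]])   # toy style codes from z2

# A generator that ignores the style code produces identical outputs.
collapsed = diversity_loss(toy_generator(x, s1, 0.0), toy_generator(x, s2, 0.0))
# A generator that uses the style code produces different outputs.
diverse = diversity_loss(toy_generator(x, s1, 1.0), toy_generator(x, s2, 1.0))
```

The collapsed generator scores zero, so maximizing this term pushes G toward outputs that actually depend on the style code.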

Preserving source characteristics
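- The cycle consistency loss from CycleGAN.
- The image translated with the target-domain style is reconstructed back to x' using the style code $\hat{\mathrm{s}} = E_y(\mathrm{x})$ extracted from the original image, and the L1 norm between x and x' is computed.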

$$\mathcal{L}_{cyc}=\mathbb{E}_{\mathrm{x}, y, \tilde{y}, \mathrm{z}}[\parallel \mathrm{x} - G(G(\mathrm{x}, \tilde{\mathrm{s}}), \hat{\mathrm{s}})\parallel_1]$$
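
To illustrate the cycle, the sketch below uses deliberately simple, hypothetical stand-ins: the "style" of an image is its mean intensity, and the generator keeps the zero-mean content while setting the mean to the given style. These are assumptions chosen so the cycle is exactly invertible, not a description of the real networks.

```python
import numpy as np

def toy_encoder(x):
    # Hypothetical style encoder: the style of an image is its mean intensity.
    return x.mean(axis=(1, 2))

def toy_generator(x, s):
    # Hypothetical generator: keep the zero-mean content, set the mean to s.
    content = x - x.mean(axis=(1, 2), keepdims=True)
    return content + s[:, None, None]

def cycle_loss(x, x_rec):
    """Batch mean of ||x - G(G(x, s~), s^)||_1."""
    diff = np.abs(x - x_rec)
    return diff.reshape(diff.shape[0], -1).sum(axis=1).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 4))
s_target = np.array([5.0, -3.0])          # s~, target style
s_source = toy_encoder(x)                 # s^ = E_y(x), style of the original
x_fake = toy_generator(x, s_target)       # translate to the target style
x_rec = toy_generator(x_fake, s_source)   # translate back with the source style
loss = cycle_loss(x, x_rec)
```

Because this toy generator changes only the style and leaves the content untouched, the reconstruction matches x and the loss is numerically zero; a generator that altered the content would be penalized.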

Full objective

$$\mathcal{L}_D = -\mathcal{L}_{adv}$$
$$\mathcal{L}_{F, G, E}=\mathcal{L}_{adv} + \lambda_{sty} \mathcal{L}_{sty} - \lambda_{ds} \mathcal{L}_{ds} + \lambda_{cyc} \mathcal{L}_{cyc}$$
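- $\lambda_{sty}$, $\lambda_{ds}$, and $\lambda_{cyc}$ are hyperparameters that weight each term. $\mathcal{L}_{ds}$ enters with a negative sign because it is maximized.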
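
The way the terms combine can be sketched as plain arithmetic. The default weights of 1.0 and the loss values passed in below are placeholders for illustration, not the settings used in the paper.

```python
def generator_objective(l_adv, l_sty, l_ds, l_cyc,
                        lambda_sty=1.0, lambda_ds=1.0, lambda_cyc=1.0):
    """L_{F,G,E} = L_adv + lambda_sty*L_sty - lambda_ds*L_ds + lambda_cyc*L_cyc.

    The lambda defaults are placeholder weights, not values from the paper.
    """
    return l_adv + lambda_sty * l_sty - lambda_ds * l_ds + lambda_cyc * l_cyc

def discriminator_objective(l_adv):
    """L_D = -L_adv."""
    return -l_adv

# Placeholder loss values, chosen only to show the signs.
total = generator_objective(l_adv=-0.5, l_sty=0.2, l_ds=0.3, l_cyc=0.1)
more_diverse = generator_objective(l_adv=-0.5, l_sty=0.2, l_ds=0.6, l_cyc=0.1)
d_loss = discriminator_objective(-0.5)
```

With everything else fixed, a larger diversity term lowers the value that F, G, and E minimize, which is the effect of the negative sign in front of $\lambda_{ds}$.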