Continuous Conditional Generative Adversarial Networks

[UPDATE! 2023-08-01] We fixed a typo in the code that computes the Label Score for the Steering Angle (128x128) experiments (the 64x64 experiments are NOT affected). The original evaluation code tends to underestimate the Label Scores of the compared methods. After the fix, the Label Scores of cGAN (210 classes), cGAN (concat), and CcGAN (SVDL+ILI) are 31.756 (23.005), 42.757 (27.341), and 18.438 (16.072), respectively. Fortunately, the conclusion that CcGAN substantially outperforms cGANs is unchanged!
[UPDATE! 2022-12-10] A journal version of CcGAN has been accepted by T-PAMI (link)!
[UPDATE! 2021-07-28] We provide code for training CcGAN on high-resolution RC-49, UTKFace, and Steering Angle, where the resolution varies from 128x128 to 256x256. We also provide simplified code for computing NIQE.
[UPDATE! 2021-07-27] We added a new baseline, cGAN (concat), which directly appends regression labels to the input of the generator and the last hidden map of the discriminator. cGAN (K classes) and cGAN (concat) are two modifications of conventional cGANs (to fit the regression scenario), and they show two types of failure of conventional cGANs. (1) cGAN (K classes) has high label consistency but bad visual quality and low intra-label diversity. (2) cGAN (concat) has high intra-label diversity but bad/fair visual quality and terrible label consistency.
[UPDATE! 2021-01-13] A conference version of CcGAN has been accepted by ICLR 2021.


This repository provides the source code for the experiments in our papers on CcGANs.
If you use this code, please cite:

@ARTICLE{9983478,
    author={Ding, Xin and Wang, Yongwei and Xu, Zuheng and Welch, William J. and Wang, Z. Jane},
    journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
    title={Continuous Conditional Generative Adversarial Networks: Novel Empirical Losses and Label Input Mechanisms}, 
    year={2023},
    volume={45},
    number={7},
    pages={8143-8158},
    doi={10.1109/TPAMI.2022.3228915}
}

@inproceedings{
    ding2021ccgan,
    title={Cc{GAN}: Continuous Conditional Generative Adversarial Networks for Image Generation},
    author={Xin Ding and Yongwei Wang and Zuheng Xu and William J Welch and Z. Jane Wang},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=PrzjugOsDeE}
}

Repository Structure

├── RC-49
│   ├── RC-49_64x64
│   │   ├──CcGAN
│   │   ├──CcGAN-improved
│   │   └──cGAN-concat
│   ├── RC-49_128x128
│   │   └──CcGAN-improved
│   └── RC-49_256x256
│       └──CcGAN-improved
├── UTKFace
│   ├── UTKFace_64x64
│   │   ├──CcGAN
│   │   ├──CcGAN-improved
│   │   └──cGAN-concat
│   ├── UTKFace_128x128
│   │   └──CcGAN-improved
│   └── UTKFace_192x192
│       └──CcGAN-improved
├── Cell-200
│   └── Cell-200_64x64
│       ├──CcGAN
│       ├──CcGAN-improved
│       └──cGAN-concat
├── SteeringAngle
│   ├── SteeringAngle_64x64
│   │   ├──CcGAN
│   │   ├──CcGAN-improved
│   │   └──cGAN-concat
│   └── SteeringAngle_128x128
│       └──CcGAN-improved
└── NIQE
    ├── RC-49
    │   ├── NIQE_64x64
    │   ├── NIQE_128x128
    │   └── NIQE_256x256
    ├── UTKFace
    │   ├── NIQE_64x64
    │   ├── NIQE_128x128
    │   └── NIQE_192x192
    ├── Cell-200
    │   └── NIQE_64x64
    └── SteeringAngle
        ├── NIQE_64x64
        └── NIQE_128x128
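
The tree above follows a regular naming pattern, so an experiment folder can be derived from a dataset name, a resolution, and a method folder. The sketch below only illustrates that pattern; `experiment_dir` and `niqe_dir` are hypothetical helper names that are not part of the repository, and the returned paths are relative to the repository root.

```python
from pathlib import PurePosixPath


def experiment_dir(dataset, resolution, method):
    """Build e.g. RC-49/RC-49_64x64/CcGAN from its three components.

    Hypothetical helper: it mirrors the folder naming in the tree above
    and does not check that the folder actually exists.
    """
    folder = f"{dataset}_{resolution}x{resolution}"
    return PurePosixPath(dataset) / folder / method


def niqe_dir(dataset, resolution):
    """Build e.g. NIQE/RC-49/NIQE_128x128 (same caveat as above)."""
    return PurePosixPath("NIQE") / dataset / f"NIQE_{resolution}x{resolution}"


print(experiment_dir("UTKFace", 192, "CcGAN-improved"))
print(niqe_dir("RC-49", 128))
```

Not every combination exists: for example, only `CcGAN-improved` folders are provided at resolutions above 64x64.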

The overall workflow of CcGAN

The overall workflow of CcGAN. Regression labels are input into the generator and the discriminator by novel label input mechanisms (NLI and ILI). Novel empirical losses (HVDL, SVDL, and a generator loss) are used to train the generator and discriminator. CcGAN can also employ modern GAN architectures (e.g., SNGAN and SAGAN) and training techniques (e.g., DiffAugment).


Hard Vicinal Discriminator Loss (HVDL) and Soft Vicinal Discriminator Loss (SVDL)

The formulae for HVDL and SVDL.

An example of the hard vicinity.
An example of the soft vicinity.
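
As a rough illustration of the two vicinities, the sketch below computes per-sample weights around a target label. The hard vicinity keeps samples whose labels lie within a half-width `kappa` of the target, and the soft vicinity down-weights samples by `exp(-nu * (y - target)^2)`. The function names, label values, and hyperparameter values are made up for illustration and are not taken from the repository; see the papers for the exact losses and for how `kappa` and `nu` are chosen.

```python
import math


def hard_vicinal_weights(labels, target, kappa):
    """Indicator weights: 1.0 if |y - target| <= kappa, else 0.0."""
    return [1.0 if abs(y - target) <= kappa else 0.0 for y in labels]


def soft_vicinal_weights(labels, target, nu):
    """Weights exp(-nu * (y - target)^2), which decay with label distance."""
    return [math.exp(-nu * (y - target) ** 2) for y in labels]


# Hypothetical labels, assumed to be normalized to [0, 1].
labels = [0.10, 0.48, 0.50, 0.53, 0.90]
hard = hard_vicinal_weights(labels, target=0.5, kappa=0.05)
soft = soft_vicinal_weights(labels, target=0.5, nu=100.0)

for y, h, s in zip(labels, hard, soft):
    print(f"label={y:.2f}  hard={h:.0f}  soft={s:.4f}")
```

In a discriminator loss, such weights would scale each real or fake sample's contribution, so that samples with nearby labels also inform the estimate at the target label.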

Naive Label Input (NLI) and Improved Label Input (ILI) Mechanisms

The workflow of the naive label input (NLI) mechanism.

CNN for label embedding in ILI.
The embedding network in ILI.

The workflow of the improved label input (ILI) mechanism.


Software Requirements

Item Version
Python 3.9.5
argparse 1.1
CUDA 11.4
cuDNN 8.2
numpy 1.14
torch 1.9.0
torchvision 0.10.0
Pillow 8.2.0
matplotlib 3.4.2
tqdm 4.61.1
h5py 3.3.0
Matlab 2020a

Datasets

The RC-49 Dataset (h5 file)

Download the following h5 files and put them in ./datasets/RC-49.

RC-49 (64x64)

RC-49_64x64_download_link

RC-49 (128x128)

RC-49_128x128_download_link

RC-49 (256x256)

RC-49_256x256_download_link

The preprocessed UTKFace Dataset (h5 file)

Download the following h5 files and put them in ./datasets/UTKFace.

UTKFace (64x64)

UTKFace_64x64_download_link

UTKFace (128x128)

UTKFace_128x128_download_link

UTKFace (192x192)

UTKFace_192x192_download_link

The Cell-200 dataset (h5 file)

Download the following h5 files and put them in ./datasets/Cell-200.

Cell-200_64x64_download_link

The Steering Angle dataset (h5 file)

Download the following h5 files and put them in ./datasets/SteeringAngle.

Steering Angle (64x64)

SteeringAngle_64x64_download_link
SteeringAngle_5_scenes_64x64_download_link

Steering Angle (128x128)

SteeringAngle_128x128_download_link
SteeringAngle_5_scenes_128x128_download_link
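
Before training, it may help to confirm that the downloaded files ended up in the expected folders. The following is a minimal sketch, assuming the files keep an `.h5` extension and that `./datasets` is resolved relative to the current working directory; `list_h5_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Dataset folder names as used in the download instructions above.
DATASET_FOLDERS = ("RC-49", "UTKFace", "Cell-200", "SteeringAngle")


def list_h5_files(datasets_root="./datasets"):
    """Map each dataset folder to the sorted .h5 file names found in it.

    A missing folder maps to an empty list, so the result always
    has one entry per dataset.
    """
    root = Path(datasets_root)
    found = {}
    for name in DATASET_FOLDERS:
        folder = root / name
        if folder.is_dir():
            found[name] = sorted(p.name for p in folder.glob("*.h5"))
        else:
            found[name] = []
    return found


for name, files in list_h5_files().items():
    status = ", ".join(files) if files else "no .h5 files found"
    print(f"{name}: {status}")
```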


Sample Usage

Run ./scripts/run_train.sh in the folders listed below. Remember to set the correct root path, data path, and checkpoint path.

Low-resolution experiments (64x64)

Folders named CcGAN are for the NLI-based CcGAN. Folders named CcGAN-improved are for the ILI-based CcGAN. Folders named cGAN-concat are for the baseline cGAN (concat), which directly appends regression labels to the input of the generator and the last hidden map of the discriminator.

Simulation (./Simulation): The Circular 2D Gaussians experiment in our ICLR paper [1].

RC-49 (64x64) (./RC-49/RC-49_64x64)

./RC-49/RC-49_64x64/CcGAN: Train AE and ResNet-34 for evaluation. Train cGAN (K classes) and NLI-based CcGAN.
./RC-49/RC-49_64x64/CcGAN-improved: Train cGAN (K classes) and ILI-based CcGAN.
./RC-49/RC-49_64x64/cGAN-concat: Train cGAN (concat).

UTKFace (64x64) (./UTKFace/UTKFace_64x64)

./UTKFace/UTKFace_64x64/CcGAN: Train AE and ResNet-34 for evaluation. Train cGAN (K classes) and NLI-based CcGAN.
./UTKFace/UTKFace_64x64/CcGAN-improved: Train cGAN (K classes) and ILI-based CcGAN.
./UTKFace/UTKFace_64x64/cGAN-concat: Train cGAN (concat).

Cell-200 (64x64) (./Cell-200/Cell-200_64x64)

./Cell-200/Cell-200_64x64/CcGAN: Train AE for evaluation. Train cGAN (K classes) and NLI-based CcGAN.
./Cell-200/Cell-200_64x64/CcGAN-improved: Train cGAN (K classes) and ILI-based CcGAN.

Steering Angle (64x64) (./SteeringAngle/SteeringAngle_64x64)

./SteeringAngle/SteeringAngle_64x64/CcGAN: Train AE and ResNet-34 for evaluation. Train cGAN (K classes) and NLI-based CcGAN.
./SteeringAngle/SteeringAngle_64x64/CcGAN-improved: Train cGAN (K classes) and ILI-based CcGAN.
./SteeringAngle/SteeringAngle_64x64/cGAN-concat: Train cGAN (concat).

High-resolution experiments

In high-resolution experiments, we only compare CcGAN (SVDL+ILI) with cGAN (K classes) and cGAN (concat). For all GANs, we use SAGAN [3] as the backbone. We also use hinge loss [2] and DiffAugment [4].

RC-49 (128x128)

./RC-49/RC-49_128x128/CcGAN-improved: Train AE and ResNet-34 for evaluation. Train cGAN (K classes), cGAN (concat) and CcGAN (SVDL+ILI).

RC-49 (256x256)

./RC-49/RC-49_256x256/CcGAN-improved: Train AE and ResNet-34 for evaluation. Train cGAN (K classes), cGAN (concat) and CcGAN (SVDL+ILI).

UTKFace (128x128)

./UTKFace/UTKFace_128x128/CcGAN-improved: Train AE and ResNet-34 for evaluation. Train cGAN (K classes), cGAN (concat) and CcGAN (SVDL+ILI).

UTKFace (192x192)

./UTKFace/UTKFace_192x192/CcGAN-improved: Train AE and ResNet-34 for evaluation. Train cGAN (K classes), cGAN (concat) and CcGAN (SVDL+ILI).

Steering Angle (128x128)

./SteeringAngle/SteeringAngle_128x128/CcGAN-improved: Train AE and ResNet-34 for evaluation. Train cGAN (K classes), cGAN (concat) and CcGAN (SVDL+ILI).


Computing NIQE

The code for computing NIQE is in ./NIQE. Let's take RC-49_128x128 (in ./NIQE/RC-49/NIQE_128x128) as an example. First, create a folder ./NIQE/RC-49/NIQE_128x128/fake_data to hold the folder of fake images generated by a CcGAN or a cGAN. Second, rename the folder that contains the fake images to fake_images, i.e., ./NIQE/RC-49/NIQE_128x128/fake_data/fake_images. Third, unzip ./NIQE/RC-49/NIQE_128x128/models/unzip_this_file.zip (which contains pre-trained NIQE models). Fourth, run ./NIQE/RC-49/NIQE_128x128/run_test.bat. Please note that this directory only provides a Windows batch script for running the evaluation; adapt it if you are on Linux.
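
The first three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `prepare_niqe_dir` is a hypothetical helper that is not part of the repository, and the archive is assumed to be extracted into the `models` folder next to the zip file. The fourth step (running `run_test.bat`) is left to be done manually.

```python
import shutil
import zipfile
from pathlib import Path


def prepare_niqe_dir(niqe_dir, generated_images_dir):
    """Prepare a NIQE folder such as ./NIQE/RC-49/NIQE_128x128.

    Hypothetical helper covering steps 1-3 of the description above.
    Returns the path of the resulting fake_images folder.
    """
    niqe_dir = Path(niqe_dir)

    # Step 1: create the fake_data folder.
    fake_data = niqe_dir / "fake_data"
    fake_data.mkdir(parents=True, exist_ok=True)

    # Step 2: move the generated images in and name the folder fake_images.
    target = fake_data / "fake_images"
    if not target.exists():
        shutil.move(str(generated_images_dir), str(target))

    # Step 3: unzip the pre-trained NIQE models.
    # Assumption: the archive is extracted next to the zip file.
    models_zip = niqe_dir / "models" / "unzip_this_file.zip"
    with zipfile.ZipFile(models_zip) as archive:
        archive.extractall(models_zip.parent)

    return target


# Example call (the second argument is a placeholder for your own output folder):
# prepare_niqe_dir("./NIQE/RC-49/NIQE_128x128", "./path/to/generated_images")
```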


Some Results

Example fake images from CcGAN in high-resolution experiments

Example 128x128 fake RC-49 images generated by CcGAN (SVDL+ILI) with angles varying from 4.5 to 85.5 degrees (from top to bottom).

Example 192x192 fake UTKFace images generated by CcGAN (SVDL+ILI) with ages varying from 3 to 57 (from top to bottom).

Example 128x128 fake Steering Angle images generated by CcGAN (SVDL+ILI) with angles varying from -71.9 to 72 degrees (from top to bottom).

Line graphs for low-resolution experiments

Line graphs for the RC-49 experiment.

Line graphs for the UTKFace experiment.

Line graphs for the Cell-200 experiment.

Line graphs for the Steering Angle experiment.


References

[1] Ding, Xin, et al. "CcGAN: Continuous Conditional Generative Adversarial Networks for Image Generation." International Conference on Learning Representations. 2021.
[2] Lim, Jae Hyun, and Jong Chul Ye. "Geometric GAN." arXiv preprint arXiv:1705.02894 (2017).
[3] Zhang, Han, et al. "Self-Attention Generative Adversarial Networks." International Conference on Machine Learning. PMLR, 2019.
[4] Zhao, Shengyu, et al. "Differentiable Augmentation for Data-Efficient GAN Training." Advances in Neural Information Processing Systems 33 (2020).