metrics

This repo contains information and implementations (PyTorch, TensorFlow) of the IS and FID scores. It is a handy toolbox that you can easily add to your projects. The TF implementations are intended to compute exactly the same output as the official ones, so those numbers can be reported in papers. Discussion, PRs, and issues are very welcome.

Usage

Put this metrics/ folder into your project, then see below (PyTorch) and the comment at the head of each .py for usage.

You also need to download some files into res/; see res/README.md for more details.

TF implementations (almost the same as the official ones, only the interface is changed; results can be reported in papers)

PyTorch implementation (results CANNOT be reported in papers, but it gives a quick view)

  • Requirements

    • pytorch, torchvision, scipy, numpy, tqdm
  • is_fid_pytorch.py

    • Inception Score: around mean=9.67278, std=0.14992 on CIFAR-10 training data with n_split=10
    • FID score
    • calculate stats for custom images in a folder (mu, sigma)
    • multi-GPU support by nn.DataParallel
      • e.g. CUDA_VISIBLE_DEVICES=0,1,2,3 will use 4 GPUs.
  • command line usage

    • calculate IS, FID

      # calc IS score on CIFAR10, will download CIFAR10 data to ../data/cifar10
      python is_fid_pytorch.py
      
      # calc IS score on custom images in a folder/
      python is_fid_pytorch.py --path foldername/
      
      # calc IS, FID score on custom images in a folder/, compared to CIFAR10 (given precalculated stats)
      python is_fid_pytorch.py --path foldername/ --fid res/stats_pytorch/fid_stats_cifar10_train.npz
      
      # calc FID on custom images in two folders/
      python is_fid_pytorch.py --path foldername1/ --fid foldername2/
      
      # calc FID on two precalculated stats
      python is_fid_pytorch.py --path res/stats_pytorch/fid_stats_cifar10_train.npz --fid res/stats_pytorch/fid_stats_cifar10_train.npz
    • precalculate stats

      # precalculate stats and store them as .npz for CIFAR-10, will download CIFAR10 data to ../data/cifar10
      python is_fid_pytorch.py --save-stats-path res/stats_pytorch/fid_stats_cifar10_train.npz
      
      # precalculate stats and store them as .npz for images in a folder/
      python is_fid_pytorch.py --path foldername/ --save-stats-path res/stats_pytorch/fid_stats_folder.npz
  • in-code usage

    • mode=1: the image tensor has already been normalized with mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] (see the mode=1 sketch after the example below)
    • mode=2: the image tensor has already been normalized with mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
      import torch
      import torchvision.datasets as dset
      import torchvision.transforms as transforms
      from metrics import is_fid_pytorch

      cuda = torch.cuda.is_available()  # run the score model on GPU if available

      # using precalculated stats (.npz) for FID calculation
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, stats_file='res/stats_pytorch/fid_stats_cifar10_train.npz', cuda=cuda)
      imgs_nchw = torch.Tensor(50000, C, H, W)  # placeholder: torch.Tensor in [-1, 1], normalized with mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
      is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw)
      
      # we can also pass in mu, sigma for get_score_image_tensor()
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
      mu, sigma = is_fid_pytorch.read_stats_file('res/stats_pytorch/fid_stats_cifar10_train.npz')
      is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw, mu1=mu, sigma1=sigma)
      
      # if FID is not needed
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
      is_mean, is_std, _ = is_fid_model.get_score_image_tensor(imgs_nchw)
      
      # if you want stats (mu, sigma) for imgs_nchw, pass return_stats=True
      is_mean, is_std, _, mu, sigma = is_fid_model.get_score_image_tensor(imgs_nchw, return_stats=True)
      
      # for a PyTorch dataset, use get_score_dataset() instead of get_score_image_tensor(); other usage is the same
      cifar = dset.CIFAR10(root='../data/cifar10', download=True,
                           transform=transforms.Compose([
                               transforms.Resize(32),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
                           ])
                          )
      # IgnoreLabelDataset wraps the dataset so that only images (no labels) are fed to the model
      is_mean, is_std, _ = is_fid_model.get_score_dataset(IgnoreLabelDataset(cifar))
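For mode=1, the tensors are expected to be normalized with the ImageNet statistics listed above. Here is a minimal sketch of that case (the my_images/ folder, the *.png glob pattern, and the 299x299 resize are illustrative assumptions, not part of this repo):

```python
# Sketch: scoring a folder of images normalized with ImageNet statistics (mode=1).
# 'my_images/' and the resize size are hypothetical; adjust them to your data.
import torch
from pathlib import Path
from PIL import Image
from torchvision import transforms

from metrics import is_fid_pytorch

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),                      # give all images a uniform size so they can be stacked
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],    # ImageNet mean (mode=1)
                         std=[0.229, 0.224, 0.225]),    # ImageNet std (mode=1)
])

paths = sorted(Path('my_images/').glob('*.png'))
imgs_nchw = torch.stack([preprocess(Image.open(p).convert('RGB')) for p in paths])

is_fid_model = is_fid_pytorch.ScoreModel(mode=1, cuda=torch.cuda.is_available())
is_mean, is_std, _ = is_fid_model.get_score_image_tensor(imgs_nchw)
```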

TODO

  • Refactor TF implementations of IS, FID together
  • MS-SSIM score - PyTorch
  • MS-SSIM score - Tensorflow

Info

Inception Score (IS)

  • Assumption

    • MEANINGFUL: Generated images should be clear; the output probability of a classifier network should look like [0.9, 0.05, ...] (heavily skewed toward one class), i.e. $p(y|\mathbf{x})$ has low entropy.
    • DIVERSITY: If we have 10 classes, generated images should be evenly distributed across them, so that the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.
    • Better models: the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ should be high.
  • Formulation

    • $\mathbf{IS} = \exp (\mathbb{E}_{\mathbf{x} \sim p_g} D_{KL} [p(y|\mathbf{x}) || p(y)] )$
    • where
      • $\mathbf{x}$ is sampled from generated data
      • $p(y|\mathbf{x})$ is the output probability of Inception v3 when the input is $\mathbf{x}$
      • $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability over all generated data (from Inception v3, a 1000-dim vector)
      • $D_{KL} (\mathbf{p}||\mathbf{q}) = \sum_{j} p_{j} \log \frac{p_j}{q_j}$, where $j$ indexes the dimensions of the output probability
  • Explanation

    • $p(y)$ is an evenly distributed vector
    • larger $\mathbf{IS}$ score -> larger KL divergence -> more diversity and clarity (see the toy sketch after this list)
  • Reference
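A toy numeric sketch of the formula above: the softmax values below are made-up illustrations, not real Inception v3 outputs, and this mirrors only the math, not the repo's exact implementation (which reports the mean and std of IS over n_split splits of the samples, cf. the CIFAR-10 numbers above).

```python
# Toy sketch of the IS formula: IS = exp( E_x D_KL[ p(y|x) || p(y) ] )
# The softmax outputs below are made-up values over 3 classes for 4 samples.
import numpy as np

probs = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
    [0.80, 0.10, 0.10],
])                                                    # p(y|x) for each sample

p_y = probs.mean(axis=0)                              # marginal p(y), averaged over samples
kl = (probs * (np.log(probs) - np.log(p_y))).sum(1)   # D_KL[p(y|x) || p(y)] per sample
inception_score = np.exp(kl.mean())                   # exponentiated average KL divergence
print(inception_score)                                # higher -> clearer and more diverse samples
```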

Fréchet Inception Distance (FID)