Common GAN Evaluation Metrics

1、Human judgment

        1)AMT perceptual studies

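                Amazon Mechanical Turk workers (Turkers) are shown a series of trials, each pitting a “real” image against a “fake” image generated by the model under evaluation; the score is the fraction of trials in which the fake image is judged to be real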

        2)Mean Opinion Score (MOS) testing

                Raters assign an integer score from 1 (bad quality) to 5 (excellent quality); the MOS is the mean of these scores
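The MOS computation can be sketched as below. This is a minimal illustration using only the Python standard library; the function name `mean_opinion_score` and the example ratings are hypothetical, and reporting the standard deviation alongside the mean is an added convenience rather than part of the definition.

```python
from statistics import mean, stdev

def mean_opinion_score(ratings):
    """Return (MOS, sample standard deviation) for integer ratings in 1..5."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    # The spread is undefined for a single rating, so fall back to 0.0.
    spread = stdev(ratings) if len(ratings) > 1 else 0.0
    return mean(ratings), spread

# Hypothetical ratings given by five raters to one generated image.
mos, spread = mean_opinion_score([4, 5, 3, 4, 4])
print(mos, spread)
```

In a real study the ratings for each method would be pooled over many images and raters before averaging.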

2、Using pretrained networks

        1)inception

                a)Inception score (IS)

                        -> measures how well a model captures the full ImageNet class distribution

                        -> while still producing individual samples that are convincing examples of a single class

                        -> drawback: does not reward covering the whole distribution or capturing diversity within a class, so a model that memorizes a small subset of the full dataset can still achieve a high IS
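The points above follow from the usual definition IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the classifier's prediction for one sample and p(y) is the average prediction over all samples. The sketch below assumes the class probabilities have already been computed by a pretrained Inception classifier; the function name `inception_score` and the toy 3-class inputs are hypothetical.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, C) array whose row i is the softmax output p(y | x_i)."""
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all samples.
    marginal = probs.mean(axis=0, keepdims=True)
    # Per-sample KL(p(y|x) || p(y)); eps avoids log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Toy case 1: three samples, each confidently assigned to a different class.
is_confident = inception_score(np.eye(3))
# Toy case 2: every sample gets a uniform (uninformative) prediction.
is_uniform = inception_score(np.full((3, 3), 1.0 / 3.0))
print(is_confident, is_uniform)
```

The first toy case reaches the maximum score for three classes with only three samples, which illustrates the drawback noted above: one memorized image per class is enough, and diversity within a class is not measured.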

                b)Frechet Inception Distance (FID) score

                        -> computed on features from a pretrained Inception-V3 network

                        -> symmetric measure of the distance between two image distributions

                        -> more consistent with human judgement than IS

                        -> A reliable FID for inpainting is usually computed with more than 1000 images
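FID fits a Gaussian to each set of features and computes the Fréchet distance ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). The sketch below assumes the Inception-V3 features are already extracted into arrays; the function name `frechet_distance` and the random stand-in features are hypothetical. It obtains the trace of the matrix square root from the eigenvalues of S_a S_b, which is one possible numerical route and not necessarily what reference implementations do.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """feats_*: (N, D) arrays of feature vectors, one row per image."""
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (up to numerical noise, hence the clip).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Random stand-ins for extracted features (real features are much wider).
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
shifted_feats = real_feats + 2.0  # same covariance, mean shifted by 2 per dim
fd_same = frechet_distance(real_feats, real_feats)
fd_shift = frechet_distance(real_feats, shifted_feats)
print(fd_same, fd_shift)
```

Because the covariance has to be estimated from the samples, the estimate is noisy for small sets, which is consistent with the note above about using more than 1000 images.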

                c)sFID score

                        -> uses spatial features rather than the standard pooled features

                        -> better captures spatial relationships, rewarding image distributions with coherent high-level structure

        2)FCN

                a)FCN score (image-to-image generation)

                        Train a semantic segmentation classifier (an FCN) on real images. The FCN then predicts a label map for each generated photo, and this label map is compared against the ground-truth labels the photo was generated from, using standard semantic segmentation metrics
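The comparison step can be sketched with three common segmentation metrics: per-pixel accuracy, per-class accuracy, and mean class IoU. The sketch assumes the FCN's predicted label map is already available as an integer array; the function name `segmentation_scores` and the 2x2 toy label maps are hypothetical, and which metrics a given paper reports is an assumption here.

```python
import numpy as np

def segmentation_scores(pred, target, num_classes):
    """Return (pixel accuracy, mean per-class accuracy, mean class IoU)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[i, j] = number of pixels of true class i predicted as class j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    tp = np.diag(confusion).astype(np.float64)
    gt_count = confusion.sum(axis=1)
    pred_count = confusion.sum(axis=0)
    # Average only over classes that occur in the ground truth.
    present = gt_count > 0
    pixel_acc = tp.sum() / confusion.sum()
    class_acc = (tp[present] / gt_count[present]).mean()
    union = gt_count + pred_count - tp
    mean_iou = (tp[present] / union[present]).mean()
    return float(pixel_acc), float(class_acc), float(mean_iou)

# Toy 2x2 label maps: ground truth vs. FCN prediction on a generated photo.
target_map = [[0, 0], [1, 1]]
pred_map = [[0, 1], [1, 1]]
pixel_acc, class_acc, mean_iou = segmentation_scores(pred_map, target_map, 2)
print(pixel_acc, class_acc, mean_iou)
```

The intuition is that if generated photos are realistic, a classifier trained on real photos should segment them about as well as it segments real ones.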

3、diversity score

4、LPIPS

5、votes

6、Improved Precision and Recall metrics

        -> Precision: fidelity. fraction of model samples which fall into the data manifold

        -> Recall: diversity. fraction of data samples which fall into the sample manifold
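One way to make "falls into the manifold" concrete is the k-nearest-neighbor construction: each manifold is approximated by a union of balls, one around every feature vector, with radius equal to the distance to that vector's k-th nearest neighbor. The sketch below assumes feature vectors have already been extracted by some pretrained network; the function names, the value of `k`, and the 2-D toy points are hypothetical, and the brute-force distance matrix only suits small sample sets.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbor in the set."""
    d = pairwise_dist(points, points)
    # Sorted column 0 is the zero self-distance, so column k is the k-th neighbor.
    return np.sort(d, axis=1)[:, k]

def coverage(queries, manifold_points, k):
    """Fraction of queries inside at least one ball of the estimated manifold."""
    radii = knn_radii(manifold_points, k)
    d = pairwise_dist(queries, manifold_points)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

def precision_recall(real, fake, k=3):
    precision = coverage(fake, real, k)  # model samples inside the data manifold
    recall = coverage(real, fake, k)     # data samples inside the sample manifold
    return precision, recall

# Toy data: the real set has two clusters, the fake set covers only one.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
real = np.vstack([square, square + np.array([100.0, 0.0])])
fake = square + np.array([0.1, 0.0])
precision, recall = precision_recall(real, fake, k=2)
print(precision, recall)
```

In this toy case every fake point lies near real data, so precision is high, while half of the real points are never covered by the fake set, so recall drops; this is the mode-dropping pattern the two numbers are meant to separate.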
