【论文综述】一篇关于GAN在计算机视觉邻域的综述

前言

这是一篇关于GAN在计算机视觉领域的综述。

正文

生成对抗网络是一种基于博弈论的生成模型,其中神经网络用于模拟数据分布。应用领域:语言生成、图像生成、图像到图像翻译、图像生成文本描述、视频生成。GAN模型能够复制数据分布并生成合成数据,应用一定的标准偏差来创建新的、以前从未见过的数据

图1显示了GAN架构是如何组成的。由于这种架构的复杂性,GANs在训练[16–18]过程中存在不稳定。这些模型中训练的不稳定性导致了模态崩溃等问题,因此人们对[19–23]的这类问题进行了研究。正如[24]所定义的,当GANs模型生成具有不同输入相同类输出时,就会发生模式崩溃

【论文综述】一篇关于GAN在计算机视觉邻域的综述_第1张图片

GAN调查通常集中在GAN模型结构[16,27]或它们在某些任务[28,29]中的应用上。本文主要聚焦在模型结构本身 。文章[34]这样的调查的重点是分析最先进的通用神经网络,并进一步分析各种网络的性能。此外,他们还提出了一套关于哪种损失函数最适合每种使用情况的建议文章[35]关注的是过去几年不同的GAN的架构如何用于不同的问题,而文章[28]则展示了计算机视觉及其应用的不同架构。

文章调研总览

【论文综述】一篇关于GAN在计算机视觉邻域的综述_第2张图片

GAN网络的模型结构时间轴

【论文综述】一篇关于GAN在计算机视觉邻域的综述_第3张图片

GAN网络的损失函数时间轴

【论文综述】一篇关于GAN在计算机视觉邻域的综述_第4张图片

GAN网络的时间轴

【论文综述】一篇关于GAN在计算机视觉邻域的综述_第5张图片

参考文献

[1] I.J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S.
Ozair, A. Courville, Y. Bengio, Generative adversarial networks, 2014.
[2] J. Cheng, Y. Yang, X. Tang, N. Xiong, Y. Zhang, F. Lei, Generative adversarial
networks: A literature review., KSII Trans. Internet Inf. Syst. 14 (12)
(2020).
[3] T. Karras, T. Aila, S. Laine, J. Lehtinen, Progressive growing of GANs for
improved quality, stability, and variation, 2018.
[4] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. Courville, Improved
training of wasserstein GANs, in: Proceedings of the 31st International
Conference on Neural Information Processing Systems, NIPS ’17, Curran
Associates Inc., Red Hook, NY, USA, 2017, pp. 5769–5779.
[5] J. Xu, X. Ren, J. Lin, X. Sun, Diversity-promoting GAN: A cross-entropy
based generative adversarial network for diversified text generation, in:
Proceedings of the 2018 Conference on Empirical Methods in Natural
Language Processing, Association for Computational Linguistics, Brussels,
Belgium, 2018, pp. 3940–3949.
[6] T. Karras, S. Laine, T. Aila, A style-based generator architecture for
generative adversarial networks, 2019.
[7] J.-Y. Zhu, T. Park, P. Isola, A.A. Efros, Unpaired image-to-image translation
using cycle-consistent adversarial networks, in: 2017 IEEE International
Conference on Computer Vision, ICCV, 2017, pp. 2242–2251.
[8] P. Isola, J.-Y. Zhu, T. Zhou, A.A. Efros, Image-to-image translation with
conditional adversarial networks, 2018.
[9] M. Zhu, P. Pan, W. Chen, Y. Yang, DM-GAN: Dynamic memory generative
adversarial networks for text-to-image synthesis, in: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR,
2019.
[10] Y. Li, M. Min, D. Shen, D. Carlson, L. Carin, Video generation from text,
in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32,
2018, p. 1.
[11] S.W. Kim, Y. Zhou, J. Philion, A. Torralba, S. Fidler, Learning to sim
ulate dynamic environments with gamegan, in: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020,
pp. 1231–1240.
[12] D.H. Ackley, G.E. Hinton, T.J. Sejnowski, A learning algorithm for
Boltzmann machines, Cogn. Sci. 9 (1) (1985) 147–169.
[13] D. Bank, N. Koenigstein, R. Giryes, Autoencoders, 2021.
[14] A. van den Oord, N. Kalchbrenner, Pixel RNN, in: ICML, 2016.
[15] Y. Sun, L. Xu, L. Guo, Y. Li, Y. Wang, A comparison study of VAE and
GAN for software fault prediction, in: S. Wen, A. Zomaya, L.T. Yang
(Eds.), Algorithms and Architectures for Parallel Processing, Springer
International Publishing, Cham, 2020, pp. 82–96.
[16] M. Wiatrak, S.V. Albrecht, Stabilizing generative adversarial network
training: A survey, 2019, arXiv.
[17] H. Thanh-Tung, T. Tran, S. Venkatesh, Improving generalization and
stability of generative adversarial networks, 2019.
[18] X. Mao, Q. Li, H. Xie, R.Y. Lau, Z. Wang, S. Paul Smolley, Least squares
generative adversarial networks, in: Proceedings of the IEEE International
Conference on Computer Vision, ICCV, 2017.
[19] Bhagyashree, V. Kushwaha, G.C. Nandi, Study of prevention of mode
collapse in generative adversarial network (GAN), in: 2020 IEEE 4th
Conference on Information Communication Technology, CICT, 2020,
pp. 1–6.
[20] D. Bang, H. Shim, MGGAN: Solving mode collapse using manifold guided
training, 2018.
[21] S. Adiga, M.A. Attia, W.-T. Chang, R. Tandon, On the tradeoff between
mode collapse and sample quality in generative adversarial networks,
in: 2018 IEEE Global Conference on Signal and Information Processing
(GlobalSIP), 2018, pp. 1184–1188.
[22] D. Bau, J.-Y. Zhu, J. Wulff, W. Peebles, H. Strobelt, B. Zhou, A. Torralba,
Seeing what a GAN cannot generate, in: Proceedings of the IEEE/CVF
International Conference on Computer Vision, ICCV, 2019.
[23] R. Durall, A. Chatzimichailidis, P. Labus, J. Keuper, Combating mode
collapse in GAN training: An empirical analysis using hessian eigenvalues,
2020.
[24] H. Thanh-Tung, T. Tran, Catastrophic forgetting and mode collapse in
GANs, in: 2020 International Joint Conference on Neural Networks, IJCNN,
2020, pp. 1–10.
[25] A. Aggarwal, M. Mittal, G. Battineni, Generative adversarial network: An
overview of theory and applications, Int. J. Inf. Manage. Data Insights 1
(1) (2021) 100004.
[26] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN, 2017.
[27] B. Ghosh, I.K. Dutta, M. Totaro, M. Bayoumi, A survey on the progression
and performance of generative adversarial networks, in: 2020 11th
International Conference on Computing, Communication and Networking
Technologies, ICCCNT, 2020, pp. 1–8.
[28] Z. Wang, Q. She, T.E. Ward, Generative adversarial networks in computer
vision: A survey and taxonomy, 2020.
[29] H. Alqahtani, M. Kavakli-Thorne, D.G. Kumar Ahuja, Applications of gen
erative adversarial networks (GANs): An updated review, Arch. Comput.
Methods Eng. 28 (2019).
[30] Z. Pan, W. Yu, X. Yi, A. Khan, F. Yuan, Y. Zheng, Recent progress on
generative adversarial networks (GANs): A survey, IEEE Access 7 (2019)
36322–36333.
[31] K. Wang, C. Gou, Y. Duan, Y. Lin, X. Zheng, F.-Y. Wang, Generative
adversarial networks: introduction and outlook, IEEE/CAA J. Autom. Sin.
4 (4) (2017) 588–598.
[32] V. Sampath, I. Maurtua, J.J.A. Martín, A. Gutierrez, A survey on generative
adversarial networks for imbalance problems in computer vision tasks, J.
Big Data 8 (1) (2021) 1–59.
[33] X. Wu, K. Xu, P. Hall, A survey of image synthesis and editing with
generative adversarial networks, Tsinghua Sci. Technol. 22 (6) (2017)
660–674.
[34] Z. Pan, W. Yu, B. Wang, H. Xie, V.S. Sheng, J. Lei, S. Kwong, Loss functions
of generative adversarial networks (GANs): opportunities and challenges,
IEEE Trans. Emerg. Top. Comput. Intell. 4 (4) (2020) 500–522.
[35] J. Gui, Z. Sun, Y. Wen, D. Tao, J. Ye, A review on generative adversarial
networks: Algorithms, theory, and applications, 2020.
[36] H. Zhang, Z. Le, Z. Shao, H. Xu, J. Ma, MFF-GAN: An unsupervised gen
erative adversarial network with adaptive and gradient joint constraints
for multi-focus image fusion, Inf. Fusion 66 (2021) 40–53.
[37] R. Liu, Y. Ge, C.L. Choi, X. Wang, H. Li, DivCo: Diverse conditional image
synthesis via contrastive generative adversarial network, in: Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
CVPR, 2021, pp. 16377–16386.
[38] D.M. De Silva, G. Poravi, A review on generative adversarial networks, in:
2021 6th International Conference for Convergence in Technology (I2CT),
2021, pp. 1–4.
[39] L. Metz, B. Poole, D. Pfau, J. Sohl-Dickstein, Unrolled generative adversarial
networks, 2017.
[40] S. Suh, H. Lee, P. Lukowicz, Y.O. Lee, CEGAN: Classification enhancement
generative adversarial networks for unraveling data imbalance problems,
Neural Netw. 133 (2021) 69–86.
[41] J. Nash, Non-cooperative games, Ann. of Math. (1951) 286–295.
[42] F. Farnia, A. Ozdaglar, GANs may have no Nash equilibria, 2020.
[43] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, S. Hochreiter, Gans
trained by a two time-scale update rule converge to a local nash
equilibrium, Adv. Neural Inf. Process. Syst. 30 (2017).
[44] Á. González-Prieto, A. Mozo, E. Talavera, S. Gómez-Canaval, Dynamics of
Fourier modes in torus generative adversarial networks, Mathematics 9
(4) (2021).
[45] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen,
Improved techniques for training GANs, 2016.
[46] Z. Zhang, C. Luo, J. Yu, Towards the gradient vanishing, divergence
mismatching and mode collapse of generative adversarial nets, in: Pro
ceedings of the 28th ACM International Conference on Information and
Knowledge Management, CIKM ’19, Association for Computing Machinery,
New York, NY, USA, 2019, pp. 2377–2380.
[47] H.D. Meulemeester, J. Schreurs, M. Fanuel, B.D. Moor, J.A.K. Suykens, The
bures metric for generative adversarial networks, 2021.
[48] W. Li, L. Fan, Z. Wang, C. Ma, X. Cui, Tackling mode collapse in multi
generator GANs with orthogonal vectors, Pattern Recognit. 110 (2021)
107646.
[49] I. Goodfellow, NIPS 2016 tutorial: Generative adversarial networks, 2017.
[50] S. Pei, R.Y. Da Xu, G. Meng, dp-GAN: Alleviating mode collapse in GAN
via diversity penalty module, 2021, arXiv preprint arXiv:2108.02353 .
[51] J. Su, GAN-QP: A novel GAN framework without gradient vanishing and
Lipschitz constraint, 2018.
[52] Y. Zuo, G. Avraham, T. Drummond, Improved training of generative ad
versarial networks using decision forests, in: Proceedings of the IEEE/CVF
Winter Conference on Applications of Computer Vision, WACV, 2021,
pp. 3492–3501.
[53] S. Liu, O. Bousquet, K. Chaudhuri, Approximation and convergence
properties of generative adversarial learning, 2017.
[54] S.A. Barnett, Convergence problems with generative adversarial networks
(GANs), 2018.
[55] A. Borji, Pros and cons of GAN evaluation measures, Comput. Vis. Image
Underst. 179 (2019) 41–65.
[56] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the
inception architecture for computer vision, 2015.
[57] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, Imagenet: A large
scale hierarchical image database, in: 2009 IEEE Conference on Computer
Vision and Pattern Recognition, IEEE, 2009, pp. 248–255.
[58] S. Nowozin, B. Cseke, R. Tomioka, f-GAN: Training generative neural
samplers using variational divergence minimization, 2016.
[59] S. Gurumurthy, R.K. Sarvadevabhatla, V.B. Radhakrishnan, DeLiGAN:
Generative adversarial networks for diverse and limited data, 2017.
[60] T. Karras, M. Aittala, S. Laine, E. Härkönen, J. Hellsten, J. Lehtinen, T. Aila,
Alias-free generative adversarial networks, 2021, arXiv preprint arXiv:
2106.12423 .
[61] G. Daras, A. Odena, H. Zhang, A.G. Dimakis, Your local GAN: Designing
two dimensional local attention mechanisms for generative models, in:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2020, pp. 14531–14539.
[62] Z. Wang, E. Simoncelli, A. Bovik, Multiscale structural similarity for
image quality assessment, in: The Thrity-Seventh Asilomar Conference
on Signals, Systems Computers, 2003, Vol. 2, 2003, pp. 1398–1402, Vol.2.
[63] K. Kurach, M. Lucic, X. Zhai, M. Michalski, S. Gelly, The GAN landscape:
Losses, architectures, regularization, and normalization, 2019.
[64] E.L. Lehmann, J.P. Romano, Testing Statistical Hypotheses, Springer
Science & Business Media, 2006.
[65] D. Lopez-Paz, M. Oquab, Revisiting classifier two-sample tests, 2018.
[66] K. Simonyan, A. Zisserman, Very deep convolutional networks for
large-scale image recognition, in: International Conference on Learning
Representations, 2015.
[67] W. Bounliphone, E. Belilovsky, M.B. Blaschko, I. Antonoglou, A. Gretton, A
test of relative similarity for model selection in generative models, 2016.
[68] C.-L. Li, W.-C. Chang, Y. Cheng, Y. Yang, B. Póczos, MMD GAN: Towards
deeper understanding of moment matching network, 2017.
[69] A. Radford, L. Metz, S. Chintala, Unsupervised representation learning
with deep convolutional generative adversarial networks, 2016.
[70] J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, K. Tunyasuvunakool,
O. Ronneberger, R. Bates, A. Žídek, A. Bridgland, et al., High accuracy
protein structure prediction using deep learning, in: Fourteenth Critical
Assessment of Techniques for Protein Structure Prediction (Abstract
Book), Vol. 22, 2020, p. 24.
[71] J.T. Springenberg, A. Dosovitskiy, T. Brox, M. Riedmiller, Striving for
simplicity: The all convolutional net, 2015.
[72] R. Ayachi, M. Afif, Y. Said, M. Atri, Strided convolution instead of max
pooling for memory efficiency of convolutional neural networks, in:
M.S. Bouhlel, S. Rovetta (Eds.), Proceedings of the 8th International
Conference on Sciences of Electronics, Technologies of Information and
Telecommunications (SETIT’18), Vol. 1, Springer International Publishing,
Cham, 2020, pp. 234–243.
[73] Y. Li, N. Xiao, W. Ouyang, Improved boundary equilibrium generative
adversarial networks, IEEE Access 6 (2018) 11342–11348.
[74] S. Wu, G. Li, L. Deng, L. Liu, D. Wu, Y. Xie, L. Shi, L1 norm batch
normalization for efficient training of deep neural networks, IEEE Trans.
Neural Netw. Learn. Syst. 30 (7) (2019) 2043–2051.
[75] D.H. Hubel, T.N. Wiesel, Receptive fields of single neurones in the cat’s
striate cortex, J. Physiol. 148 (3) (1959) 574–591.
[76] M. Mirza, S. Osindero, Conditional generative adversarial nets, 2014.
[77] M. Loey, G. Manogaran, N.E.M. Khalifa, A deep transfer learning model
with classical data augmentation and cgan to detect covid-19 from chest
ct radiography digital images, Neural Comput. Appl. (2020) 1–13.
[78] Y. Ma, X. Chen, W. Zhu, X. Cheng, D. Xiang, F. Shi, Speckle noise reduction
in optical coherence tomography images based on edge-sensitive cGAN,
Biomed. Opt. Express 9 (11) (2018) 5129–5146.
[79] Y. Li, R. Fu, X. Meng, W. Jin, F. Shao, A SAR-to-optical image translation
method based on conditional generation adversarial network (cGAN), IEEE
Access 8 (2020) 60338–60343.
[80] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel,
Infogan: Interpretable representation learning by information maximiz
ing generative adversarial nets, in: Proceedings of the 30th Inter
national Conference on Neural Information Processing Systems, 2016,
pp. 2180–2188.
[81] A. Odena, C. Olah, J. Shlens, Conditional image synthesis with auxiliary
classifier gans, in: International Conference on Machine Learning, PMLR,
2017, pp. 2642–2651.
[82] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech.
J. 27 (3) (1948) 379–423.
[83] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image
recognition, in: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 2016, pp. 770–778.
22 G. Iglesias, E. Talavera and A. Díaz-Álvarez
Computer Science Review 48 (2023) 100553
[84] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V.
Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceed
ings of the IEEE Conference on Computer Vision and Pattern Recognition,
2015, pp. 1–9.
[85] Y. Zhou, T.L. Berg, Learning temporal transformations from time-lapse
videos, in: European Conference on Computer Vision, Springer, 2016,
pp. 262–277.
[86] J. Johnson, A. Alahi, L. Fei-Fei, Perceptual losses for real-time style
transfer and super-resolution, in: European Conference on Computer
Vision, Springer, 2016, pp. 694–711.
[87] M. Liu, J. Zhu, A. Tao, J. Kautz, B. Catanzaro, High-resolution image
synthesis and semantic manipulation with conditional gans, in: ICCV,
2017.
[88] Y. Qu, Y. Chen, J. Huang, Y. Xie, Enhanced pix2pix dehazing network, in:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2019, pp. 8160–8168.
[89] M. Mori, T. Fujioka, L. Katsuta, Y. Kikuchi, G. Oda, T. Nakagawa, Y.
Kitazume, K. Kubota, U. Tateishi, Feasibility of new fat suppression for
breast MRI using pix2pix, Jpn. J. Radiol. 38 (11) (2020) 1075–1081.
[90] W. Pan, C. Torres-Verdín, M.J. Pyrcz, Stochastic pix2pix: a new machine
learning method for geophysical and well conditioning of rule-based
channel reservoir models, Natural Resour. Res. 30 (2) (2021) 1319–1345.
[91] M. Drob, RF PIX2PIX unsupervised wi-fi to video translation, 2021, arXiv
preprint arXiv:2102.09345 .
[92] N. Sundaram, T. Brox, K. Keutzer, Dense point trajectories by gpu
accelerated large displacement optical flow, in: European Conference on
Computer Vision, Springer, 2010, pp. 438–451.
[93] Z. Kalal, K. Mikolajczyk, J. Matas, Forward-backward error: Automatic
detection of tracking failures, in: 2010 20th International Conference on
Pattern Recognition, IEEE, 2010, pp. 2756–2759.
[94] Z. Yi, H. Zhang, P. Tan, M. Gong, Dualgan: Unsupervised dual learning
for image-to-image translation, in: Proceedings of the IEEE International
Conference on Computer Vision, 2017, pp. 2849–2857.
[95] J. Ye, Y. Ji, X. Wang, X. Gao, M. Song, Data-free knowledge amalgamation
via group-stack dual-gan, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, 2020, pp. 12516–12525.
[96] D. Prokopenko, J.V. Stadelmann, H. Schulz, S. Renisch, D.V. Dylov, Syn
thetic CT generation from MRI using improved DualGAN, 2019, arXiv
preprint arXiv:1909.08942 .
[97] W. Liang, D. Ding, G. Wei, An improved DualGAN for near-infrared image
colorization, Infrared Phys. Technol. 116 (2021) 103764.
[98] C.L.M. Veillon, N. Obin, A. Roebel, Towards end-to-end F0 voice conversion
based on dual-GAN with convolutional wavelet kernels, 2021, arXiv
preprint arXiv:2104.07283 .
[99] F. Yger, A. Rakotomamonjy, Wavelet kernel learning, Pattern Recognit. 44
(10–11) (2011) 2614–2629.
[100] Z. Luo, J. Chen, T. Takiguchi, Y. Ariki, Emotional voice conversion using
dual supervised adversarial networks with continuous wavelet transform
f0 features, IEEE/ACM Trans. Audio Speech Lang. Process. 27 (10) (2019)
1535–1548.
[101] T. Kim, M. Cha, H. Kim, J.K. Lee, J. Kim, Learning to discover cross
domain relations with generative adversarial networks, in: International
Conference on Machine Learning, PMLR, 2017, pp. 1857–1865.
[102] C.R.A. Chaitanya, A.S. Kaplanyan, C. Schied, M. Salvi, A. Lefohn, D.
Nowrouzezahrai, T. Aila, Interactive reconstruction of Monte Carlo image
sequences using a recurrent denoising autoencoder, ACM Trans. Graph.
36 (4) (2017) 1–12.
[103] I.A. Luchnikov, A. Ryzhov, P.-J. Stas, S.N. Filippov, H. Ouerdane, Variational
autoencoder reconstruction of complex many-body physics, Entropy 21
(11) (2019) 1091.
[104] J. Mehta, A. Majumdar, Rodeo: robust de-aliasing autoencoder for
real-time medical image reconstruction, Pattern Recognit. 63 (2017)
499–510.
[105] S. Hicsonmez, N. Samet, E. Akbas, P. Duygulu, GANILLA: Generative
adversarial networks for image to illustration translation, Image Vis.
Comput. 95 (2020) 103886.
[106] A.A. Rusu, N.C. Rabinowitz, G. Desjardins, H. Soyer, J. Kirkpatrick, K.
Kavukcuoglu, R. Pascanu, R. Hadsell, Progressive neural networks, 2016,
arXiv preprint arXiv:1606.04671 .
[107] A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from
tiny images, 2009.
[108] H. Yang, J. Liu, L. Zhang, Y. Li, H. Zhang, ProEGAN-MS: A progressive grow
ing generative adversarial networks for electrocardiogram generation,
IEEE Access 9 (2021) 52089–52100.
[109] V. Bhagat, S. Bhaumik, Data augmentation using generative adversarial
networks for pneumonia classification in chest xrays, in: 2019 Fifth
International Conference on Image Information Processing, ICIIP, IEEE,
2019, pp. 574–579.
[110] L. Liu, Y. Zhang, J. Deng, S. Soatto, Dynamically grown generative ad
versarial networks, in: Proceedings of the AAAI Conference on Artificial
Intelligence, Vol. 35, 2021, pp. 8680–8687.
[111] T. Sainburg, M. Thielk, B. Theilman, B. Migliori, T. Gentner, Generative
adversarial interpolative autoencoding: adversarial training on latent
space interpolations encourage convex latent distributions, 2018, arXiv
preprint arXiv:1807.06650 .
[112] S. Laine, Feature-Based Metrics for Exploring the Latent Space of
Generative Models, ICLR Workshop Poster, 2018.
[113] X. Huang, S. Belongie, Arbitrary style transfer in real-time with adap
tive instance normalization, in: Proceedings of the IEEE International
Conference on Computer Vision, 2017, pp. 1501–1510.
[114] M. Tancik, P.P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U.
Singhal, R. Ramamoorthi, J.T. Barron, R. Ng, Fourier features let networks
learn high frequency functions in low dimensional domains, 2020, arXiv
preprint arXiv:2006.10739 .
[115] R. Xu, X. Wang, K. Chen, B. Zhou, C.C. Loy, Positional encoding as spatial
inductive bias in gans, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, 2021, pp. 13569–13578.
[116] H. Zhang, I. Goodfellow, D. Metaxas, A. Odena, Self-attention generative
adversarial networks, in: International Conference on Machine Learning,
PMLR, 2019, pp. 7354–7363.
[117] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł.
Kaiser, I. Polosukhin, Attention is all you need, in: Advances in Neural
Information Processing Systems, 2017, pp. 5998–6008.
[118] A. Brock, J. Donahue, K. Simonyan, Large scale GAN training for high
fidelity natural image synthesis, 2018, arXiv preprint arXiv:1809.11096 .
[119] A.G. Dimakis, P.B. Godfrey, Y. Wu, M.J. Wainwright, K. Ramchandran,
Network coding for distributed storage systems, IEEE Trans. Inform.
Theory 56 (9) (2010) 4539–4551.
[120] Y. Chen, G. Li, C. Jin, S. Liu, T. Li, SSD-GAN: Measuring the realness in the
spatial and spectral domains, 2020, arXiv preprint arXiv:2012.05535 .
[121] P. Benioff, The computer as a physical system: A microscopic quantum
mechanical Hamiltonian model of computers as represented by turing
machines, J. Stat. Phys. 22 (5) (1980) 563–591.
[122] E.R. MacQuarrie, C. Simon, S. Simmons, E. Maine, The emerging com
mercial landscape of quantum computing, Nat. Rev. Phys. 2 (11) (2020)
596–598.
[123] Y. Cao, J. Romero, J.P. Olson, M. Degroote, P.D. Johnson, M. Kieferová,
I.D. Kivlichan, T. Menke, B. Peropadre, N.P. Sawaya, et al., Quantum
chemistry in the age of quantum computing, Chem. Rev. 119 (19) (2019)
10856–10915.
[124] S.A. Stein, B. Baheri, R.M. Tischio, Y. Mao, Q. Guan, A. Li, B. Fang, S. Xu,
Qugan: A generative adversarial network through quantum states, 2020,
arXiv preprint arXiv:2010.09036 .
[125] M.Y. Niu, A. Zlokapa, M. Broughton, S. Boixo, M. Mohseni, V. Smelyanskyi,
H. Neven, Entangling quantum generative adversarial networks, 2021,
arXiv preprint arXiv:2105.00080 .
[126] W.W. Ng, J. Hu, D.S. Yeung, S. Yin, F. Roli, Diversified sensitivity-based
undersampling for imbalance classification problems, IEEE Trans. Cybern.
45 (11) (2014) 2402–2412.
[127] E. Ramentol, Y. Caballero, R. Bello, F. Herrera, SMOTE-RS B*: a hybrid
preprocessing approach based on oversampling and undersampling for
high imbalanced data-sets using SMOTE and rough sets theory, Knowl.
Inf. Syst. 33 (2) (2012) 245–265.
[128] Z. Pan, F. Yuan, J. Lei, W. Li, N. Ling, S. Kwong, MIEGAN: Mobile image
enhancement via a multi-module cascade neural network, IEEE Trans.
Multimed. 24 (2021) 519–533.
[129] G. Qi, Loss-sensitive generative adversarial networks on lipschitz
densities, 2017, CoRR abs/1701.06264 . arXiv preprint arXiv:1701.06264 .
[130] L. Weng, From gan to wgan, 2019, arXiv preprint arXiv:1904.08994 .
[131] J. Cao, L. Mo, Y. Zhang, K. Jia, C. Shen, M. Tan, Multi-marginal wasserstein
gan, Adv. Neural Inf. Process. Syst. 32 (2019) 1776–1786.
[132] Y. Xiangli, Y. Deng, B. Dai, C.C. Loy, D. Lin, Real or not real, that is the
question, 2020, arXiv preprint arXiv:2002.05512 .
[133] T. Miyato, T. Kataoka, M. Koyama, Y. Yoshida, Spectral normalization for
generative adversarial networks, 2018, arXiv preprint arXiv:1802.05957 .
[134] T. Salimans, D.P. Kingma, Weight normalization: A simple reparameter
ization to accelerate training of deep neural networks, Adv. Neural Inf.
Process. Syst. 29 (2016) 901–909.
[135] K.B. Kancharagunta, S.R. Dubey, Csgan: Cyclic-synthesized generative
adversarial networks for image-to-image transformation, 2019, arXiv
preprint arXiv:1901.03554 .
[136] X. Wang, X. Tang, Face photo-sketch synthesis and recognition, IEEE
Trans. Pattern Anal. Mach. Intell. 31 (11) (2008) 1955–1967.
[137] R. Tyleček, R. Šára, Spatial pattern templates for recognition of objects
with regular structure, in: German Conference on Pattern Recognition,
Springer, 2013, pp. 364–374.
[138] L. Wang, V. Sindagi, V. Patel, High-quality facial photo-sketch synthesis
using multi-adversarial networks, in: 2018 13th IEEE International Con
ference on Automatic Face & Gesture Recognition (FG 2018), IEEE, 2018,
pp. 83–90.
23 G. Iglesias, E. Talavera and A. Díaz-Álvarez
Computer Science Review 48 (2023) 100553
[139] N. Barzilay, T.B. Shalev, R. Giryes, MISS GAN: A multi-IlluStrator style gen
erative adversarial network for image to illustration translation, Pattern
Recognit. Lett. (2021).
[140] S.W. Park, J. Kwon, Sphere generative adversarial network based on
geometric moment matching, in: Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, 2019, pp. 4292–4301.
[141] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A.
Aitken, A. Tejani, J. Totz, Z. Wang, et al., Photo-realistic single image
super-resolution using a generative adversarial network, in: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition, 2017,
pp. 4681–4690.
[142] H. Zhang, T. Zhu, X. Chen, L. Zhu, D. Jin, P. Fei, Super-resolution generative
adversarial network (SRGAN) enabled on-chip contact microscopy, J. Phys.
D: Appl. Phys. 54 (39) (2021) 394005.
[143] O. Dehzangi, S.H. Gheshlaghi, A. Amireskandari, N.M. Nasrabadi, A. Rezai,
OCT image segmentation using neural architecture search and SRGAN, in:
2020 25th International Conference on Pattern Recognition, ICPR, IEEE,
2021, pp. 6425–6430.
[144] S. Zhao, Y. Fang, L. Qiu, Deep learning-based channel estimation with
SRGAN in OFDM systems, in: 2021 IEEE Wireless Communications and
Networking Conference, WCNC, IEEE, 2021, pp. 1–6.
[145] B. Liu, J. Chen, A super resolution algorithm based on attention
mechanism and SRGAN network, IEEE Access (2021).
[146] A. Genevay, G. Peyré, M. Cuturi, GAN and VAE from an optimal transport
point of view, 2017, arXiv preprint arXiv:1706.01807 .
[147] E. Denton, A. Hanna, R. Amironesei, A. Smart, H. Nicole, M.K. Scheuerman,
Bringing the people back in: Contesting benchmark machine learning
datasets, 2020, arXiv preprint arXiv:2007.07399 .
[148] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied
to document recognition, Proc. IEEE 86 (11) (1998) 2278–2324.
[149] J. Susskind, A. Anderson, G.E. Hinton, The Toronto Face Dataset, Tech.
Rep., Technical Report UTML TR 2010-001, U. Toronto, 2010.
[150] R. Zhang, P. Isola, A.A. Efros, E. Shechtman, O. Wang, The unreasonable
effectiveness of deep features as a perceptual metric, in: Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, 2018,
pp. 586–595.
[151] J. Lin, Y. Xia, T. Qin, Z. Chen, T.-Y. Liu, Conditional image-to-image
translation, in: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 2018, pp. 5524–5532.
[152] Q. Guo, W. Feng, R. Gao, Y. Liu, S. Wang, Exploring the effects of blur and
deblurring to visual object tracking, IEEE Trans. Image Process. 30 (2021)
1812–1824.
[153] K. Zhang, W. Luo, Y. Zhong, L. Ma, B. Stenger, W. Liu, H. Li, Deblurring
by realistic blurring, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, 2020, pp. 2737–2746.
[154] M.A. Younus, T.M. Hasan, Effective and fast deepfake detection method
based on haar wavelet transform, in: 2020 International Conference
on Computer Science and Software Engineering, CSASE, IEEE, 2020,
pp. 186–190.
[155] X. Ren, Z. Qian, Q. Chen, Video deblurring by fitting to test data, 2020,
arXiv preprint arXiv:2012.05228 .
[156] M. Westerlund, The emergence of deepfake technology: A review,
Technol. Innov. Manage. Rev. 9 (11) (2019).
[157] V.C. Martínez, G.P. Castillo, Historia del ‘‘fake’’ audiovisual: ‘‘deepfake’’ y
la mujer en un imaginario falsificado y perverso, Hist. Comun. Soc. 24 (2)
(2019) 55.
[158] A.O. Kwok, S.G. Koh, Deepfake: A social construction of technology
perspective, Curr. Issues Tour. 24 (13) (2021) 1798–1802.
[159] P. Korshunov, S. Marcel, Vulnerability assessment and detection of deep
fake videos, in: 2019 International Conference on Biometrics, ICB, IEEE,
2019, pp. 1–6.
[160] B. Dolhansky, J. Bitton, B. Pflaum, J. Lu, R. Howes, M. Wang, C. Can
ton Ferrer, The deepfake detection challenge dataset, 2020, arXiv e-prints
arXiv–2006.
[161] N. Carlini, H. Farid, Evading deepfake-image detectors with white
and black-box attacks, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition Workshops, 2020, pp. 658–659.
[162] H. Zhao, W. Zhou, D. Chen, T. Wei, W. Zhang, N. Yu, Multi-attentional
deepfake detection, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, 2021, pp. 2185–2194.
[163] Y. Chen, Y. Pan, T. Yao, X. Tian, T. Mei, Mocycle-gan: Unpaired video
to-video translation, in: Proceedings of the 27th ACM International
Conference on Multimedia, 2019, pp. 647–655.
[164] A. Bansal, S. Ma, D. Ramanan, Y. Sheikh, Recycle-gan: Unsupervised video
retargeting, in: Proceedings of the European Conference on Computer
Vision, ECCV, 2018, pp. 119–135.
[165] L. Kurup, M. Narvekar, R. Sarvaiya, A. Shah, Evolution of neural text gen
eration: Comparative analysis, in: Advances in Computer, Communication
and Computational Sciences, Springer, 2021, pp. 795–804.
[166] H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang, D.N. Metaxas,
Stackgan: Text to photo-realistic image synthesis with stacked generative
adversarial networks, in: Proceedings of the IEEE International Conference
on Computer Vision, 2017, pp. 5907–5915.
[167] H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang, D.N. Metaxas,
Stackgan++: Realistic image synthesis with stacked generative adversarial
networks, IEEE Trans. Pattern Anal. Mach. Intell. 41 (8) (2018) 1947–1962.
[168] C. Gulcehre, S. Chandar, K. Cho, Y. Bengio, Dynamic neural turing machine
with soft and hard addressing schemes, 2016, arXiv preprint arXiv:1607.
00036 .
[169] J. Weston, S. Chopra, A. Bordes, Memory networks, 2014, arXiv preprint
arXiv:1410.3916 .
[170] M. Tao, H. Tang, S. Wu, N. Sebe, X.-Y. Jing, F. Wu, B. Bao, Df-gan: Deep
fusion generative adversarial networks for text-to-image synthesis, 2020,
arXiv preprint arXiv:2008.05865 .
[171] L. Gao, D. Chen, Z. Zhao, J. Shao, H.T. Shen, Lightweight dynamic condi
tional GAN with pyramid attention for text-to-image synthesis, Pattern
Recognit. 110 (2021) 107384.
[172] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, H. Lee, Generative
adversarial text to image synthesis, in: International Conference on
Machine Learning, PMLR, 2016, pp. 1060–1069.
[173] S.E. Reed, Z. Akata, S. Mohan, S. Tenka, B. Schiele, H. Lee, Learning what
and where to draw, Adv. Neural Inf. Process. Syst. 29 (2016) 217–225.
[174] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár,
C.L. Zitnick, Microsoft coco: Common objects in context, in: European
Conference on Computer Vision, Springer, 2014, pp. 740–755.
[175] C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, The caltech-ucsd
birds-200–2011 dataset, 2011.
[176] M.-E. Nilsback, A. Zisserman, Automated flower classification over a large
number of classes, in: 2008 Sixth Indian Conference on Computer Vision,
Graphics & Image Processing, IEEE, 2008, pp. 722–729.
[177] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput.
9 (8) (1997) 1735–1780.
[178] A.M. Dai, Q.V. Le, Semi-supervised sequence learning, Adv. Neural Inf.
Process. Syst. 28 (2015) 3079–3087.
[179] Y. Zhang, Z. Gan, L. Carin, Generating text via adversarial training, in:
NIPS Workshop on Adversarial Training, Vol. 21, academia. edu, 2016,
pp. 21–32.
[180] S. Bengio, O. Vinyals, N. Jaitly, N. Shazeer, Scheduled sampling for
sequence prediction with recurrent neural networks, 2015, arXiv preprint
arXiv:1506.03099 .
[181] L. Yu, W. Zhang, J. Wang, Y. Yu, Seqgan: Sequence generative adversarial
nets with policy gradient, in: Proceedings of the AAAI Conference on
Artificial Intelligence, Vol. 31, 2017.
[182] C.B. Browne, E. Powley, D. Whitehouse, S.M. Lucas, P.I. Cowling, P.
Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, S. Colton, A survey of
monte carlo tree search methods, IEEE Trans. Comput. Intell. AI Games 4
(1) (2012) 1–43.
[183] L. Floridi, M. Chiriatti, GPT-3: Its nature, scope, limits, and consequences,
Minds Mach. 30 (4) (2020) 681–694.
[184] N.-T. Tran, V.-H. Tran, N.-B. Nguyen, T.-K. Nguyen, N.-M. Cheung, On data
augmentation for GAN training, IEEE Trans. Image Process. 30 (2021)
1882–1897.
[185] M. Frid-Adar, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, Synthetic
data augmentation using GAN for improved liver lesion classification, in:
2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI
2018), IEEE, 2018, pp. 289–293.
[186] D. Kiyasseh, G.A. Tadesse, L. Thwaites, T. Zhu, D. Clifton, et al., Plethaug
ment: Gan-based ppg augmentation for medical diagnosis in low-resource
settings, IEEE J. Biomed. Health Inf. 24 (11) (2020) 3226–3235.
[187] C. Qi, J. Chen, G. Xu, Z. Xu, T. Lukasiewicz, Y. Liu, SAG-GAN: Semi
supervised attention-guided GANs for data augmentation on medical
images, 2020, arXiv preprint arXiv:2011.07534 .
[188] M. Hammami, D. Friboulet, R. Kechichian, Cycle GAN-based data aug
mentation for multi-organ detection in CT images via yolo, in: 2020
IEEE International Conference on Image Processing, ICIP, IEEE, 2020,
pp. 390–393.
[189] A. Graves, G. Wayne, I. Danihelka, Neural turing machines, 2014, arXiv
preprint arXiv:1410.5401 .
[190] P. Guo, P. Wang, J. Zhou, V.M. Patel, S. Jiang, Lesion mask-based si
multaneous synthesis of anatomic and molecular mr images using a
gan, in: International Conference on Medical Image Computing and
Computer-Assisted Intervention, Springer, 2020, pp. 104–113.
[191] T.C. Mok, A. Chung, Learning data augmentation for brain tumor
segmentation with coarse-to-fine generative adversarial networks, in:
International MICCAI Brainlesion Workshop, Springer, 2018, pp. 70–80.
[192] H. Uzunova, J. Ehrhardt, H. Handels, Generation of annotated brain
tumor MRIs with tumor-induced tissue deformations for training and
assessment of neural networks, in: International Conference on Medical
Image Computing and Computer-Assisted Intervention, Springer, 2020,
pp. 501–511.
[193] A. Segato, V. Corbetta, M. Di Marzo, L. Pozzi, E. De Momi, Data aug
mentation of 3D brain environment using deep convolutional refined
auto-encoding alpha GAN, IEEE Trans. Med. Robot. Bionics 3 (1) (2020)
269–272.
[194] T. Kossen, P. Subramaniam, V.I. Madai, A. Hennemuth, K. Hildebrand, A.
Hilbert, J. Sobesky, M. Livne, I. Galinovic, A.A. Khalil, et al., Synthesizing
anonymized and labeled TOF-MRA patches for brain vessel segmentation
using generative adversarial networks, Comput. Biol. Med. 131 (2021)
104254.
[195] T. Xia, A. Chartsias, C. Wang, S.A. Tsaftaris, A.D.N. Initiative, et al., Learning
to synthesise the ageing brain without longitudinal data, Med. Image
Anal. 73 (2021) 102169.
[196] Y. Chen, X.-H. Yang, Z. Wei, A.A. Heidari, N. Zheng, Z. Li, H. Chen, H.
Hu, Q. Zhou, Q. Guan, Generative adversarial networks in medical image
augmentation: a review, Comput. Biol. Med. (2022) 105382.
[197] M. Li, G. Zhou, A. Chen, J. Yi, C. Lu, M. He, Y. Hu, FWDGAN-based data
augmentation for tomato leaf disease identification, Comput. Electron.
Agric. 194 (2022) 106779.
[198] M. Xu, S. Yoon, A. Fuentes, J. Yang, D.S. Park, Style-consistent image
translation: A novel data augmentation paradigm to improve plant
disease recognition, Front. Plant Sci. 12 (2021) 773142.
[199] H. Jin, Y. Li, J. Qi, J. Feng, D. Tian, W. Mu, GrapeGAN: Unsupervised
image enhancement for improved grape leaf disease recognition, Comput.
Electron. Agric. 198 (2022) 107055.
[200] Y. Jing, Y. Bian, Z. Hu, L. Wang, X.-Q.S. Xie, Deep learning for drug design:
an artificial intelligence paradigm for drug discovery in the big data era,
AAPS J. 20 (3) (2018) 1–10.
[201] D. Dana, S.V. Gadhiya, L.G. St. Surin, D. Li, F. Naaz, Q. Ali, L. Paka, M.A.
Yamin, M. Narayan, I.D. Goldberg, et al., Deep learning in drug discovery
and medicine; scratching the surface, Molecules 23 (9) (2018) 2384.
[202] A. Kadurin, A. Aliper, A. Kazennov, P. Mamoshina, Q. Vanhaelen, K.
Khrabrov, A. Zhavoronkov, The cornucopia of meaningful leads: Apply
ing deep adversarial autoencoders for new molecule development in
oncology, Oncotarget 8 (7) (2017) 10883.
[203] A. Kadurin, S. Nikolenko, K. Khrabrov, A. Aliper, A. Zhavoronkov, druGAN:
an advanced generative adversarial autoencoder model for de novo
generation of new molecules with desired molecular properties in silico,
Mol. Pharmaceut. 14 (9) (2017) 3098–3104.
[204] G.R. Padalkar, S.D. Patil, M.M. Hegadi, N.K. Jaybhaye, Drug discovery using
generative adversarial network with reinforcement learning, in: 2021
International Conference on Computer Communication and Informatics,
ICCCI, IEEE, 2021, pp. 1–3.
[205] D. Manu, Y. Sheng, J. Yang, J. Deng, T. Geng, A. Li, C. Ding, W. Jiang,
L. Yang, FL-DISCO: Federated generative adversarial network for graph
based molecule drug discovery: Special session paper, in: 2021 IEEE/ACM
International Conference on Computer Aided Design, ICCAD, IEEE, 2021,
pp. 1–7.
[206] J. Konečn
`
y, H.B. McMahan, F.X. Yu, P. Richtárik, A.T. Suresh, D. Bacon,
Federated learning: Strategies for improving communication efficiency,
2016, arXiv preprint arXiv:1610.05492 .
[207] P. Dhariwal, A. Nichol, Diffusion models beat gans on image synthesis,
Adv. Neural Inf. Process. Syst. 34 (2021) 8780–8794.
[208] J. Ho, A. Jain, P. Abbeel, Denoising diffusion probabilistic models, Adv.
Neural Inf. Process. Syst. 33 (2020) 6840–6851.
[209] Y. Song, S. Ermon, Generative modeling by estimating gradients of the
data distribution, Adv. Neural Inf. Process. Syst. 32 (2019).
[210] F.-A. Croitoru, V. Hondru, R.T. Ionescu, M. Shah, Diffusion models in
vision: A survey, 2022, arXiv preprint arXiv:2209.04747 .
[211] C. Saharia, W. Chan, H. Chang, C. Lee, J. Ho, T. Salimans, D. Fleet, M.
Norouzi, Palette: Image-to-image diffusion models, in: ACM SIGGRAPH
2022 Conference Proceedings, 2022, pp. 1–10.
[212] Y. Jiang, S. Chang, Z. Wang, Transgan: Two transformers can make one
strong gan, 2021, arXiv preprint arXiv:2102.07074 1, 3.
[213] Z. Lv, X. Huang, W. Cao, An improved GAN with transformers for
pedestrian trajectory prediction models, Int. J. Intell. Syst. 37 (8) (2022)
4417–4436.

未完待续... 

你可能感兴趣的:(生成对抗网络,人工智能,机器学习)