ACM (the Association for Computing Machinery) has announced that Yoshua Bengio, Yann LeCun, and Geoffrey Hinton, known as the "three giants of deep learning", are the joint recipients of the 2018 Turing Award, one of the rare occasions since the award was established in 1966 that it has gone to three people in the same year. The award recognizes their major breakthroughs in artificial intelligence, breakthroughs that made deep neural networks a critical component of computing. ACM also announced that the award, which carries a $1 million prize, will be formally presented at its annual awards banquet in San Francisco on June 15, 2019.
Bengio is a professor at the University of Montreal and scientific director of Mila, the Quebec artificial intelligence institute. Hinton is a vice president and Engineering Fellow at Google, chief scientific adviser of the Vector Institute, and professor emeritus at the University of Toronto. LeCun is a professor at New York University and vice president and Chief AI Scientist at Facebook.
Geoffrey Everest Hinton is a computer scientist and psychologist often called the "father of neural networks" and the "godfather of deep learning". He has studied the use of neural networks for machine learning, memory, perception, and symbol processing, publishing more than 200 papers in these areas. He was one of the researchers who introduced the backpropagation algorithm for training multilayer neural networks, and he co-invented the Boltzmann machine. His other contributions to neural networks include distributed representations, time-delay neural networks, mixtures of experts, and Helmholtz machines.
Yann LeCun, who goes by the Chinese name 杨立昆, is a computer scientist known as the "father of convolutional networks" for his foundational contributions to convolutional neural networks (CNNs) and image recognition. He has published more than 190 papers on topics including handwriting recognition, image compression, and AI hardware, has led many deep learning projects, and holds 14 related US patents. With Léon Bottou and Patrick Haffner he created the DjVu image-compression technology, and with Léon Bottou he developed Lush, an open-source programming language designed to be more capable than Matlab; he is also an accomplished Lisp programmer. Backpropagation (BP), the algorithm now routinely used to train artificial neural networks, was developed in the mid-1980s by LeCun, his mentor Geoffrey Hinton, and other scientists; at Bell Labs, LeCun then applied BP to convolutional neural networks, made it practical, and extended it to a wide range of image-related tasks. A minimal sketch of the algorithm follows.
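To make the idea concrete, here is a minimal sketch of backpropagation training a tiny two-layer network on XOR in NumPy. This is an illustrative toy, not code from LeCun's or Hinton's work; the architecture, data, learning rate, and all variable names are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two-layer network, 2 -> 4 -> 1, with sigmoid activations.
W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)        # hidden activations
    p = sigmoid(h @ W2 + b2)        # predictions
    # Backward pass: push the error back through the chain rule.
    dp = (p - y) * p * (1 - p)      # gradient at the output pre-activation
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = (dp @ W2.T) * h * (1 - h)  # gradient at the hidden pre-activation
    dW1 = X.T @ dh; db1 = dh.sum(0)
    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(p.round(2))  # typically converges toward [[0], [1], [1], [0]]

All of the "learning" is in the backward-pass lines: the output error is propagated layer by layer to assign each weight its share of the blame.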
Bengio's paper "A neural probabilistic language model" pioneered neural network language models. Its overall approach influenced and inspired many later neural-network NLP papers and saw wide use in industry. Bengio has also contributed a detailed analysis of the vanishing gradient problem, the precursor ideas behind word2vec, and foundational work on today's popular neural machine translation; a forward-pass sketch of the language model follows.
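For illustration, the following is a minimal forward-pass sketch of the idea in "A neural probabilistic language model": embed each context word with a shared matrix (called C in the paper), feed the concatenated embeddings through a hidden layer, and output a softmax distribution over the next word. The dimensions and word ids here are hypothetical, and training (backpropagation through all parameters, including the embeddings) is omitted.

import numpy as np

rng = np.random.default_rng(0)
V, d, n, H = 10000, 64, 3, 128  # vocab size, embedding dim, context words, hidden units

C = rng.normal(0, 0.01, (V, d))      # shared word-embedding matrix
W = rng.normal(0, 0.01, (n * d, H))  # context -> hidden weights
U = rng.normal(0, 0.01, (H, V))      # hidden -> vocabulary scores

def next_word_distribution(context_ids):
    x = np.concatenate([C[i] for i in context_ids])  # concatenated context embeddings
    h = np.tanh(x @ W)                               # hidden representation
    scores = h @ U
    e = np.exp(scores - scores.max())                # numerically stable softmax
    return e / e.sum()

p = next_word_distribution([17, 42, 301])  # P(w_t | previous n words), arbitrary ids
print(p.shape, round(p.sum(), 6))          # (10000,) 1.0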
Selected papers by Geoffrey Hinton:
1. Use of the backpropagation algorithm
Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323(6088): 533-536.
2. TDNN, the pioneering convolutional approach to speech recognition
Waibel A, Hanazawa T, Hinton G, et al. Phoneme recognition using time-delay neural networks[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1989, 37(3): 328-339.
3. Learning deep belief networks (DBNs)
Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural computation, 2006, 18(7): 1527-1554.
4. The paper that launched deep learning
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.
5. t-SNE, a method for visualizing high-dimensional data
Maaten L, Hinton G. Visualizing data using t-SNE[J]. Journal of machine learning research, 2008, 9(Nov): 2579-2605.
6. Deep Boltzmann machines (DBMs)
Salakhutdinov R, Hinton G. Deep Boltzmann machines[C]//Artificial intelligence and statistics. 2009: 448-455.
7. Use of the ReLU activation function (see the sketch after this list)
Nair V, Hinton G E. Rectified linear units improve restricted Boltzmann machines[C]//Proceedings of the 27th international conference on machine learning (ICML-10). 2010: 807-814.
8. Training restricted Boltzmann machines (RBMs)
Hinton G E. A practical guide to training restricted Boltzmann machines[M]//Neural networks: Tricks of the trade. Springer, Berlin, Heidelberg, 2012: 599-619.
9. The pioneering work on deep learning for speech recognition
Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
10. AlexNet, the breakthrough in deep learning for image recognition
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.
11. Weight initialization and momentum-based optimization
Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning[C]//International conference on machine learning. 2013: 1139-1147.
12. Dropout (see the sketch after this list)
Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
13. The deep learning survey by the three laureates
LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
14. Knowledge distillation
Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv:1503.02531, 2015.
15. Capsule networks
Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules[C]//Advances in neural information processing systems. 2017: 3856-3866.
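Two items from this list, ReLU (item 7) and dropout (item 12), are simple enough to show in a few lines. The following is a minimal, hypothetical NumPy sketch of the two ideas, not the papers' exact formulations; in particular it uses "inverted" dropout, the common variant that rescales during training instead of at test time.

import numpy as np

def relu(z):
    # Item 7: rectified linear unit, elementwise max(0, z).
    return np.maximum(0.0, z)

def dropout(h, p_drop=0.5, rng=np.random.default_rng(0), training=True):
    # Item 12: during training, zero each unit with probability p_drop.
    # Scaling the survivors by 1/(1 - p_drop) keeps the expected activation
    # unchanged, so nothing needs to change at test time.
    if not training:
        return h
    mask = rng.random(h.shape) >= p_drop
    return h * mask / (1.0 - p_drop)

h = relu(np.array([-1.0, 0.5, 2.0]))  # -> [0. , 0.5, 2. ]
print(dropout(h))                     # some units zeroed, survivors scaled by 2

Randomly silencing units is the whole regularization trick: the network cannot come to rely on any single hidden unit.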
Selected papers by LeCun and Bengio:
1. LeNet-5, the convolutional neural network: LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
2. The neural probabilistic language model: Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model[J]. Journal of machine learning research, 2003, 3(Feb): 1137-1155.
3. Greedy layer-wise training: Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks[C]//Advances in neural information processing systems. 2007: 153-160.
4. Deep architectures for AI: Bengio Y. Learning deep architectures for AI[J]. Foundations and trends® in Machine Learning, 2009, 2(1): 1-127.
5. Stacked denoising autoencoders: Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of machine learning research, 2010, 11(Dec): 3371-3408.
6. Xavier initialization (see the sketch after this list): Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010: 249-256.
7. Use of the ReLU activation: Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks[C]//Proceedings of the fourteenth international conference on artificial intelligence and statistics. 2011: 315-323.
8. The Theano framework: Bastien F, Lamblin P, Pascanu R, et al. Theano: new features and speed improvements[J]. arXiv preprint arXiv:1211.5590, 2012.
9. The difficulty of training RNNs: Pascanu R, Mikolov T, Bengio Y. On the difficulty of training recurrent neural networks[C]//International conference on machine learning. 2013: 1310-1318.
10. The maxout activation: Goodfellow I J, Warde-Farley D, Mirza M, et al. Maxout networks[J]. arXiv preprint arXiv:1302.4389, 2013.
11. Generative adversarial networks (GANs): Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.
12. Neural machine translation with attention: Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
13. Binary neural networks: Courbariaux M, Bengio Y, David J P. BinaryConnect: Training deep neural networks with binary weights during propagations[C]//Advances in neural information processing systems. 2015: 3123-3131.
14. The deep learning survey by the three laureates: LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
15. Image captioning with attention: Xu K, Ba J, Kiros R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]//International conference on machine learning. 2015: 2048-2057.
16. The deep learning textbook: Goodfellow I, Bengio Y, Courville A. Deep learning[M]. MIT Press, 2016.
17. Speech synthesis: Sotelo J, Mehri S, Kumar K, et al. Char2Wav: End-to-end speech synthesis[C]//ICLR 2017 Workshop Track. 2017.
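As a small illustration of item 6 above, here is a hypothetical NumPy sketch of Xavier (Glorot) uniform initialization. The limit is chosen so that the weights have variance 2/(fan_in + fan_out), which keeps the variance of activations and gradients roughly constant across layers.

import numpy as np

def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng(0)):
    # Draw weights uniformly from [-limit, limit] with
    # limit = sqrt(6 / (fan_in + fan_out)); a uniform on [-a, a] has
    # variance a^2/3, which here works out to 2 / (fan_in + fan_out).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_uniform(256, 128)
print(round(W.std(), 3), round(np.sqrt(2.0 / (256 + 128)), 3))  # both ~0.072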