- Python Data Analysis and Visualization
jun778895
python, data analysis, development languages
Python data analysis and visualization covers processing and analyzing data and presenting it graphically. It is essential for data scientists, analysts, and anyone who needs to extract insight from data. The following takes a detailed look at Python's role in data analysis and visualization, including the commonly used libraries, the data-processing workflow, visualization techniques, and practical use cases. 1. Why Python data analysis and visualization matter: data visualization expresses data as graphics or images so that people can grasp more intuitively the information and patterns behind the data
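As a minimal sketch of that workflow (the column names and figures below are invented for illustration), the common pandas + matplotlib pairing looks like this:
import pandas as pd
import matplotlib.pyplot as plt

# Invented example data: monthly sales figures.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "sales": [120, 135, 160, 150],
})
print(df.describe())                 # quick numeric summary (the analysis step)
df.plot(x="month", y="sales", kind="bar", legend=False)
plt.ylabel("sales")
plt.tight_layout()
plt.show()                           # the graphical presentation step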
- [Cryptography Basics — RSA Encryption]
XWWW668899
networking, server, notes, python
RSA encryption: RSA (Rivest-Shamir-Adleman) is an asymmetric, widely used public-key encryption algorithm mainly used for secure data transmission. The public key encrypts; the private key decrypts. The algorithm is named after the surnames of its three inventors: R for Ron Rivest, S for Adi Shamir, A for Leonard Adleman. The three computer scientists jointly proposed the algorithm in 1977 and published the accompanying paper. Their work laid an important foundation for public-key encryption, making secure
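A toy sketch of the public/private key roles just described, with deliberately tiny primes; real systems use vetted libraries and keys thousands of bits long:
# Toy RSA, for illustration only (Python 3.8+ for pow(e, -1, phi)).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # Euler's totient of n
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e mod phi
m = 42                         # a message, must be smaller than n
c = pow(m, e, n)               # encrypt with the public key (e, n)
assert pow(c, d, n) == m       # decrypt with the private key (d, n)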
- [ShuQiHere] The World of Bases and Complements: From Sign-Magnitude to Two's Complement
ShuQiHere
binary, computer organization
[ShuQiHere] In computer systems, representing positive numbers is relatively simple: just use the corresponding binary form. How to represent negative numbers effectively, however, has long been a key problem in computer science. To solve it, several representations were proposed, including sign-magnitude representation, one's complement, and two's complement. In this article we take a close look at these representation methods
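A small sketch contrasting the three 8-bit encodings named above, for -5; the helper names are ours, not the article's:
def sign_magnitude(x, bits=8):
    sign = "1" if x < 0 else "0"            # leading bit is the sign
    return sign + format(abs(x), f"0{bits-1}b")

def ones_complement(x, bits=8):
    if x >= 0:
        return format(x, f"0{bits}b")
    return format((1 << bits) - 1 + x, f"0{bits}b")  # flip every bit of |x|

def twos_complement(x, bits=8):
    return format(x & ((1 << bits) - 1), f"0{bits}b")  # how CPUs actually store it

print(sign_magnitude(-5))   # 10000101
print(ones_complement(-5))  # 11111010
print(twos_complement(-5))  # 11111011  (one's complement plus 1)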
- The "Three Sovereigns and Five Emperors in Guanzhong" Theory, Seen Through Mendeleev's Periodic Table
霜叶红似二月花y
Everything in the world is made of different elements, and in the early days of understanding matter, scientists identified those elements one by one over many years. Everyone knows the periodic table first created in 1869 by the famous Russian chemist Dmitri Mendeleev (1834-1907). How did he discover the periodic law and arrange the table? The most authoritative account, recorded in his own notes, is that it came to him in a dream. The table was not complete at first, but the prototype was there; only 63 elements were known at the time, yet Mendeleev predicted there should be 1
- Taming Unruly Static Electricity: Build a Franklin Motor at Home
三个爸爸实验室
This is the 55th experiment we have explored together. Yesterday we got to know the magic of static electricity: friction can generate static charge, and there are two kinds of charge, positive and negative. Two positive charges, or two negative charges, repel each other; a positive charge and a negative charge attract each other. Today we will use these properties to build a simple motor. Because the American scientist Benjamin Franklin studied static electricity extensively, it is called a Franklin motor. Let's see how it is made. — The Franklin Motor — 三个爸爸实验室 No
- The Bees and the Flies
几分暖
The bees and the flies: a scientist laid an open transparent bottle on its side and put bees and flies inside, with the bottom of the bottle toward the light and the mouth away from it. The bees flew toward the bottom again and again, trying to reach the light source; they would never reverse course and try the other direction, and after several failed attempts they gave up, trapped in the bottle for good. The flies all escaped: used to trying every direction, when they could not get out toward the light they kept darting up and down at random, and although they hit the glass many times, in the end they found the mouth of the bottle… "The Bees and the Flies" thinking exercise 1
- How to Write an Imaginative Essay About the Future World
尚未秃头的老师
Imagining the future world, essay 1: I opened my eyes and found myself lying in an unfamiliar room, surrounded by a group of scientists. "Wow, we really have revived a primitive human from a million years ago!" a long-nosed scientist shouted happily. "Yes, now we finally get the reward: a bottle of pure water!" said another scientist with big ears. Hearing this, I was puzzled and asked, "You have made such a great contribution; why is the reward only a bottle of water?" A thin, small, white-haired old man said, "Come outside with me and you will understand." Outside,
- 0053/1000 The Prodigy: Raphael, Master Painter
坚持坚持再坚持00
1. First, meet Raphael, the youngest of the "three masters" of the Renaissance and a genuine child prodigy. Why? Raphael studied painting under the then-famous master Perugino, and within a few short years the student surpassed the teacher: "The Marriage of the Virgin," which he painted at 21, already outshone his master's version. Left: Raphael, "The Marriage of the Virgin." Right: Perugino, "The Marriage of the Virgin." 2. The most outstanding image Raphael left to art history is the Madonna. Many earlier paintings treated the Madonna as a religious symbol shown in suffering. Raphael brought her back to the human world, painting
- Cloud Services Industry Briefing - 20180128
Captain7
1. QingCloud: QingCloud has launched Deep Learning on QingCloud, a deep learning platform bundling the mainstream deep learning frameworks and data-science toolkits. Deployed with one click through the QingCloud AppCenter, it lets algorithm engineers and data scientists stand up a deep learning development environment quickly and spend more of their energy on model and algorithm tuning. 2. Tencent Cloud: 1. Tencent Cloud officially released the Tencent Cloud Enterprise (TCE) private-cloud matrix, covering an enterprise edition, a big-data edition, an AI
- 2022-03-14 Those Years
拂云松
It was the autumn of 2021, and Baishui and Dongcheng's marriage hit a crisis. It was all Baishui's fault: she had fallen for a man nine years her junior named Minglang, her supervisor. From the start Baishui knew it was wrong, and she tried with all her strength to restrain herself, but it was no use. What is love? Scientists say it is oxytocin; Buddhists say it is karma. This vague, hazy, ungraspable feeling left Baishui so distracted that she kept making mistakes at work and was sidelined by her leaders, while at home a veil seemed to separate her from her children and her family. This, she thought, is what it means to be lost in love; yes, lost, and the moment a message from him arrived her whole
- Barui Barui: Opening the Door to Biological Skincare, Creating Miracles of Beauty
3adced8f1ee8
Biological skincare is not a marketing gimmick; it is a new understanding and a new experience of the life sciences. The American dermatology journal JDD wrote in issue 13 of 2014 that skincare had moved up from low-end physical protection and middling fine chemistry into the era of biological skincare. By chance, at a closed-door meeting of senior cosmetics formulators from around the world, the biologists of the Qiangweite (强微特) company learned that of the raw materials used in cosmetics worldwide, more than 90% are chemically synthesized and less than 10% are biological, yet it is precisely that 10% that can truly guide the skin's self-repair. Qiangweite
- Daily Progress, Day 43
刚子_0662
Dear family, hello everyone. (Image from App) I am Zheng Gang from the engineering department of Canon Advertising in Hangzhou. Today is 2018.10.18, Thursday, my 43rd day of daily progress. Let me share today's progress so we can learn from each other, improve together, and grow together. A little progress every day brings success a little closer. 1. Learning: something came up at home tonight, so no reading. 2. Change: learning to control my temper; however angry I get, I keep telling myself things will slowly get better. 3. Giving: only giving leads to excellence; the more you give, the more you will harvest before long. 4. Humility: the more mature
- …method of reproduction? #science #learnsomething #artificialwomb
努力幸运
…to carry a pregnancy in place of a woman. To demonstrate that an artificial womb is feasible, the Children's Hospital of Philadelphia once ran a striking experiment. Scientists placed eight prematurely delivered lambs each into a transparent plastic bag to continue gestation. The lambs' gestational age was equivalent to a human fetus at 22-23 weeks. The viscous fluid clearly visible in the bags simulated the amniotic fluid of the womb, and many tubes, large and small, ran into the bags and the lambs' bodies, continuously delivering nutrients and oxygen. The scientists called these bags biobags. The biobags were kept in incubators to hold the temperature that gestation requires. Four wee
- Book Review | Decoding the Secrets Behind Sweat
汶子_杨家小汶子
Everyone knows what perfume is for: masking body odor so that you smell wonderful and become more attractive. But do you really understand where body odor comes from, and how the sweat on your body forms? Why do some people smell strongly, even of body odor, while others seem fresh and clean? The book in front of us, "The Joy of Sweat: The Strange Science of Perspiration," answers all these riddles: the secrets and science behind sweat, and the research scientists have done on it over the years. Like discovering a new continent, you will come to know a brand-new world and gain a deeper
- 2018-12-12
花开的蕾蕾
Some thoughts on Professor Zhang Shoucheng's course: the recent Huawei affair and Professor Zhang Shoucheng's death have stirred up waves of commotion and sighing. Everyone knows the events, so there is no need to repeat them; today I only want to recall, from a personal angle, Professor Zhang's presence in the classroom, in his memory. Professor Zhang once taught a class at the business school called "First Principles and Entrepreneurship." It was the first time I heard a professor explain, from a scientist's perspective, the connection and inevitability linking startup investing and science; it was brilliant. Admittedly, Professor Zhang's breadth of knowledge and devotion to scientific inquiry gave him the personal charisma peculiar to scientists, and his teaching style was correspondingly restrained and gentle. As a Stanford
- Diagnosis
上学的小先生
Last night I had a dream, very vivid, though I can't remember it clearly. I was grieving, not knowing what to say, sitting in the office, thinking back on two masters I had met, both rather famous scientists. One master I met while climbing a mountain: I was walking a classmate across one of those bridges with water washing over it when we ran into the teacher, who talked with me at length and said I was bound to get cancer and live to 60! The other I seem to have met on the way home, at a bend in a mountain stream where I helped a stranger; when that teacher saw me, he told me I would walk straight into a garage, said my insoles were wrong and were affecting my metabolism or some such, and also said my health was poor. Hearing that, I suddenly fel
- What Kind of Person Do You Want to Become
高然
If you want to become one of the strong, start moving toward the strong now, hold yourself to their standard, and live as they live. From primary school on, people would ask: what do you want to be when you grow up? "A teacher, a doctor, a scientist..." Young as we were, we aspired to a good profession. Year after year of busy yet procrastinating life, twenty-six years have passed, and I have long since drifted with the company's environment into being one of those girls who "shouldn't feel too much pressure, just find something to do." The grand ambitions of childhood are long forgotten, and toward my present situation I am intermittently full of resolve and persistently
- How Much Do Lab-Grown Diamonds Cost: A Price List
美鞋之家
Diamonds, rare and precious, with their unique hardness and brilliance, have always been loved. In nature, however, a diamond takes an extremely long time to form under harsh conditions, which is why natural diamonds are so expensive. Meanwhile, through improved techniques, scientists can now grow high-quality diamonds in the laboratory; these are called "synthetic diamonds" or "lab-grown diamonds." WeChat: 17350898965 (custom diamond styles). The cost of a lab-grown diamond depends on many factors, including its size, color, clarity, and cut quality. Overall,
- Zhou Yong // Feb 8, Fifth Day of Early Spring // Ancient-Style Verse: A Character Study of Wang Miao in "The Three-Body Problem" // Early Spring: Scenes and Moods (5)
高山流水无情剑
Epigraph: on a special front, talent revealed; heavy duty on his shoulders, danger faced head-on. An academician of science researching nanomaterials, he uncovers the secrets of Trisolaris amid constant peril. Cleverly and repeatedly passing on intelligence about the enemy, he quietly aids state security to success after success. The novel crowns him a master of his instruments, one man worth several divisions. Ancient-style verse, a character study of Wang Miao in "The Three-Body Problem": Have you not seen Wang Miao in book one of Three-Body, three identities establishing him between heaven and earth. Have you not heard: his first identity threads the story together, the plot advanced through him. His second: fine bearing, an academician renowned far and wide. His third: weighty duty, assisting state security with the investigation, sweeping away the rot at the frontier of science, protecting elite scientists. Bravely he goes undercover, duty on his shoulders, spirit
- Where to Buy High-Quality MCM Replicas: Nine Recommended Sourcing Channels
金都之家
In the fashion world, a brand's value and influence are crucial. MCM, one of the world's famous luxury brands, is renowned for its distinctive design and outstanding quality. The original MCM, however, is expensive and may be out of reach for ordinary consumers. So where can one buy high-quality MCM replicas? This article introduces nine reliable sourcing channels. For inquiries add WeChat: 7862953 (a fine gift with every order). 1. Online shopping platforms: with the rapid growth of e-commerce, online shopping has become the first choice for many shoppers. On the major platforms such as Taobao and JD.com, you can
- 2019-10-31
振华老凤祥店长崔宁宁
Dear Teacher Li, wise professors, dear fellow travelers: hello everyone! I am Cui Ningning, one of Mr. Li's people at the Laizhou Xinhe gold store. Today is day 162 of my daily-progress practice, and I will share today's changes so we can encourage one another; a little progress every day, and success is not far. 1. Learning: only continuous learning improves me! 2. Change: I am the root of everything; when I change, the world changes! Change my mindset! 3. Giving: only giving brings results; only giving leads to excellence! 4. Humility: learn from the strengths of every outstanding store manager! 5. Gratitude: thanks to the boss for providing this platform; I am grat
- After Stem Cells Are Infused Into the Body, Do They Wander Off? They Carry Their Own "GPS" and Go Wherever They Are Needed!
干细胞精研社
Every child has been asked about their dream for the future, and "I want to be a scientist" surely tops the chart. But why aspire to be a scientist? Is it merely because "the scientists have once again made an outstanding contribution to humanity" is the standard happy ending of every inspirational story? Let me try to complete that stock ending: the scientists have once again made an outstanding contribution to the study of human life; for example, if rebirth is still out of reach, then first work on growing old in good health. While we merely sigh over wrinkles and white hair, the scientists have already gone looking for answers inside our cells. Some truths you only understand when you grow up. For example, Ultra
- A Journey Inside the Human Body
Ryanta
The human body is a marvelous thing, one that modern scientists are still studying. Some organs' functions are familiar and well understood, while others, such as the brain, remain unfathomable. This documentary, a journey inside the human body, walks us through an entire human life and the changes our organs undergo. First, the digestive system, which is made up of many organs, the most important being the small and large intestines. After food is taken in, it first enters the small intestine, which absorbs its nutrients and passes what remains into the large intestine. The large intestine absorbs as much of the remaining nourishment as it can, and
- A Deep Dive in the Big Data Field: Is AI Helping Developers or Replacing Them?
阳爱铭
big data and data platform practice, big data, artificial intelligence, backend, database architecture, database development, ETL engineer, chatgpt
In the big data field, the use of generative AI (AIGC) is expanding rapidly and changing how data scientists and developers work. This article looks, from a big-data practitioner's perspective, at the role AI tools play in the field and how they help developers rather than replace them. 1. The current state of AI tools in big data: AI tools have made notable progress here. Below are several of the main AI tools, their capabilities, and how they are used in practice. Apache Spark + MLlib: Apache Spark is an open-source distributed computing system widely used for
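As a rough illustration of the Spark + MLlib pairing just mentioned, a minimal PySpark sketch (the inline three-row dataset and the app name are invented):
# Train a logistic regression with Spark MLlib on a toy in-memory dataset.
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()
df = spark.createDataFrame(
    [(Vectors.dense([0.0, 1.1]), 0.0),
     (Vectors.dense([2.0, 1.0]), 1.0),
     (Vectors.dense([2.2, 1.4]), 1.0)],
    ["features", "label"])
model = LogisticRegression(maxIter=10).fit(df)   # distributed training step
print(model.coefficients)                        # learned weights
spark.stop()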
- Dreams
小猪的花
My childhood dream really was not complicated; honestly, I did not have much of one. The things other people said, like staying with parents and elders forever, or never growing up, or becoming a scientist, I truly never considered. Really, if a teacher had not prompted us, or if I were not now writing this essay about dreams... Looking back at my childhood, living carefree and playing happily might be the real dream for the me of today. I have joined the workforce and become an office worker; apart from a lucky few, most of us work nine to five. So if I ask myself now whether I still have a dream, I
- Machine Learning, Deep Learning, AGI, and AI: Concepts and Differences
我就是全世界
artificial intelligence, machine learning, deep learning
1. Definition and scope of artificial intelligence (AI). 1.1 The basic concept: artificial intelligence (AI) is the science and technology of simulating human intelligence with computer systems. Its goal is to build systems that can perform tasks normally requiring human intelligence, such as visual recognition, speech recognition, decision making, and language translation. At its core, AI processes and analyzes large volumes of data, extracts useful information from them, and makes decisions or predictions on that basis. AI's development can be traced back to the 1950s, when scientists began exploring how to make machines perform complex tasks. As computing power
- What Happens When Volunteers Deliberately Catch COVID?
英格恩
Researchers in the UK recently published the results of an unprecedented study in which healthy young volunteers were deliberately infected with an early strain of the pandemic coronavirus. As expected, none of the participants became seriously ill. The scientists tracked their symptoms closely and gained unique insight into how SARS-CoV-2 levels and symptoms evolve over the course of an infection, from beginning to end. The success of this first "human challenge" study provides a strategy for future testing of COVID-19 treatments, vaccines, and viral variants. The study may also help scientists understand why the coronavirus damages some people's immu
- 2023-01-03
贵如
2022.1.2 "Acceleration": the law of attraction in building a personal network. In the evening I attended a high-school alumni gathering of a dozen or so people, one of the activities for sharing and integrating alumni resources. The law of attraction gives the alumni association its cohesion, and with everyone giving and taking, the event was warm, tasteful, and full of human feeling. The theme of this gathering: welcoming the new year. The attendees spanned old, middle-aged, and young: teachers who once taught us, accomplished senior alumni willing to give back, and younger alumni. At the start, while introducing the attendees, the secretary-general highlighted what each had contributed to the association
- First Class of the School Year, 2021-09-01
金色忍者
Today I watched "First Class of the School Year." This year's theme was "ideals light up the future." I have an ideal too: to become an "astronomy scientist." I have loved astronomy since I was little. My mom took me to a meteorite class at the Beijing Planetarium and bought me many astronomy books; I find the sky absolutely magical. The most magical thing in astronomy is the black hole, which can swallow almost anything, including light and time. The black hole is still an unsolved mystery, and I hope I can be the one to solve it. Astronomy has much more waiting to be explored and needs us to make breakthroughs. Studying hard now is how I lay a solid founda
- 2022-06-29: This Day in History
玉石儿
2001: groundbreaking ceremony for the Qinghai-Tibet railway. 1984: "Women of China" magazine ran our country's first personal marriage ad. 1976: the Seychelles declared independence from Britain and founded the Republic of Seychelles. 1964: the "Dongfeng-2" missile was launched. 1949: South Africa began implementing its racial segregation program. 1940: the painter Paul Klee died. 1929: the Italian journalist Oriana Fallaci was born. 1900: the Nobel Prize foundation was established. 1895: the eminent British biologist Thomas Henry Huxley died. 1852: the Ameri
- On MD5 Encryption in the Flagleader Rules Engine
何必如此
jsp, MD5, rules, encryption
Normally, to keep personal information from leaking, we encrypt users' login passwords so that the corresponding database field stores the encrypted string rather than the original password.
In the Flagleader rules engine, MD5 hashing can be performed through an external call, as follows:
1. In the object library, choose external call, select "com.flagleader.util.MD5", and under its sub-options select "com.flagleader.util.MD5.getMD5ofStr({arg1})";
2. In the rule
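The steps above are cut off in the source. For reference, the same MD5-of-a-string operation, outside the rules engine, is a one-liner in Python (a generic sketch, not the Flagleader API):
# Generic MD5-of-string, equivalent in spirit to getMD5ofStr above.
import hashlib

def md5_of_str(s: str) -> str:
    return hashlib.md5(s.encode("utf-8")).hexdigest()

print(md5_of_str("my-login-password"))  # store this hex digest, not the password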
- [Spark101] Scala Promise/Future and Their Use in Spark
bit1129
Promise
Promise and Future are the Scala concurrency primitives for making asynchronous calls and collecting their results. Scala's Future means the same thing as the Future interface in JUC (java.util.concurrent); Promise takes a little more effort to grasp. When I have time I will study the semantics and use cases of Promise and Future more carefully; for details see the Scala online docs: http://docs.scala-lang.org/sips/completed/futures-promises.html
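As a rough cross-language analogy (ours, not the post's): a Promise is the write side that one party completes, while a Future is the read side another party waits on. In Python, a single concurrent.futures.Future can play both roles:
# One thread completes the future (the "promise" role); another reads it.
import threading
from concurrent.futures import Future

fut = Future()                      # the shared handle

def producer():
    fut.set_result(6 * 7)           # "promise" side: fulfil the result

threading.Thread(target=producer).start()
print(fut.result(timeout=1))        # "future" side: block until the value; prints 42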
- Configuring Spark SQL Access to Hive Data, in Detail
daizj
spark sql, hive, thriftserver
Spark SQL can reach Hive data through the thriftserver, but the default Spark build does not support Hive access: Hive brings in many dependencies, so the stock package omits hive and thriftserver. You therefore need to download the source and compile it yourself, packaging hive and thriftserver in. The detailed configuration steps are as follows:
1. Download the source code
2. Download and configure Maven
This configuration is simple, so it is skipped here
- HTTP Protocol Communication
周凡杨
java, httpclient, http communication
1. Introduction
HttpClient enables point-to-point communication over the HTTP protocol from Java.
2. Code example
Test class:
import java
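The Java test class above is truncated in the source. As a cross-language sketch of the same point-to-point HTTP exchange, here is the equivalent in Python (httpbin.org is a public echo service, used purely for illustration):
# Minimal HTTP round trip with the requests library.
import requests

resp = requests.get("https://httpbin.org/get", params={"q": "hello"})
print(resp.status_code)          # 200 on success
print(resp.json()["args"])       # the server echoes back {'q': 'hello'}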
- Converting Java Timestamps to Unix Timestamps
g21121
java
Converting a Java timestamp to a Unix timestamp:
Timestamp appointTime=Timestamp.valueOf(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date()))
SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd hh:m
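The Java snippet is cut off above. For comparison, a hedged Python sketch of the same conversion (a Unix timestamp is whole seconds since 1970-01-01 UTC, while Java's Date/Timestamp count milliseconds):
# Current time as a Unix timestamp, two equivalent ways.
import time
from datetime import datetime

unix_ts = int(time.time())                 # like Java's System.currentTimeMillis() // 1000
same_ts = int(datetime.now().timestamp())  # the same value via datetime
print(unix_ts, same_ts)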
- A Summary of Common FineReport Functions (Report Functions)
老A不折腾
web report, finereport, summary
Note: in this summary, every function that takes tableName or viewName as a parameter resolves the name by searching the private data source first and then the public data source.
CLASS
CLASS(object): returns the class to which object belongs.
CNMONEY
CNMONEY(number, unit): returns the amount written out in Chinese capital (RMB) form.
number: the numeric value to convert.
unit: the unit,
- JNI Call From Java Into C++ Code Crashes
墙头上一根草
java, C++, jni
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00000000777c3290, pid=5632, tid=6656
#
# JRE version: Java(TM) SE Ru
- A Small Trick for Event Handling in Spring
aijuans
spring, Spring tutorial, Spring examples, Spring intro, Spring3
Spring provides a number of Aware-related interfaces: BeanFactoryAware, ApplicationContextAware, ResourceLoaderAware, ServletContextAware, and so on. The most commonly used of these is ApplicationContextAware. After a bean implementing ApplicationContextAware is initialized, it is injected with the Applicati
- A Sample linux shell ls Script
annan211
linux, linux ls source, linux source
#! /bin/sh -
# Find the path of an input file:
# search the lookup path for one or more original files or file patterns.
# The lookup path is defined by a specific environment variable.
# The result on standard output is normally the full path of the first instance of each file found on the lookup path,
# or "filename: not found" on standard error.
# The exit code is 0 if every file is found,
# otherwise it is the number of files that were not found.
# Usage: pathfind [--
- Traversing List, Set, and Map (A Collected Resource Worth a Look)
百合不是茶
list, set, Map, traversal
List: elements keep their insertion order and may repeat.
Map: elements are stored as key-value pairs, with no insertion order.
Set: elements have no insertion order and may not repeat (note: although there is no insertion order, an element's position in a set is determined by its hashCode, so its position is in fact fixed).
The List interface has three implementations: LinkedList, ArrayList, Vector.
LinkedList: implemented on top of a linked list; the nodes are scattered in memory, and each element stores its own
- Working Around SimpleDateFormat's Thread-Safety Problem
bijian1013
java, thread, thread safety
In Java projects we usually write a DateUtil class of our own to convert between dates and strings, like this:
public class DateUtil01 {
private SimpleDateFormat dateformat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
public void format(Date d
- An HTTP Request Test Example (Parsed with fastjson)
bijian1013
http, testing
In real development we often write HTTP request code; below is a small unit-test example of making such a request, for reference only.
import java.util.HashMap;
import java.util.Map;
import org.apache.commons.httpclient.HttpClient;
import
- [RPC Framework Hessian, Part 3] Hessian Exception Handling
bit1129
hessian
Overview of RPC exception handling
RPC exception handling addresses this question: when a client calls a remote service and an exception occurs while the service executes, can that exception be serialized back to the client?
If a service may throw during execution, then the exceptions it may throw should be declared on the service interface.
In Hessian, when an exception occurs on the server side, the exception information can be serialized from server to client, because Exception itself implements Serializable
- [Log Analysis] Log Analysis Tools
bit1129
log analysis
1. GoAccess, a real-time website log analyzer
http://www.vpsee.com/2014/02/a-real-time-web-log-analyzer-goaccess/
2. Monitoring and collecting Java application performance data through logs (Perf4J)
http://www.ibm.com/developerworks/cn/java/j-lo-logforperf/
3. log.io
and
- Strengthening nginx: Optimizations and the Pitfalls We Hit
ronin47
nginx optimization
First, the pitfalls. The first was a load-balancing problem tied to the architecture: because my design added two extra layers, session load all ended up directed at a single node. There are two ways out of this: change the load-balancing strategy, or change the architecture.
Because we deployed with dynamic and static content separated, and nginx was also set up to serve static files, clients fetched static content from nginx; as traffic grew, pages loaded very slowly. Solution: keep only one of the two; it is best to keep the apache server.
Now for the optimizations:
- java-50: Given Two Binary Trees A and B, Determine Whether B Is a Substructure of A
bylijinnan
java
The idea comes from:
http://zhedahht.blog.163.com/blog/static/25411174201011445550396/
import ljn.help.*;
public class HasSubtree {
/**Q50.
* Given two binary trees A and B, determine whether B is a substructure of A.
For example, for the two trees A and B in the figure below, part of A's subtree structure is the same as B
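The Java source is truncated above, so here is a compact sketch of the algorithm itself in Python (our rendition of the linked idea: find a node in A that matches B's root, then check that B is covered from there):
# Does tree B occur as a substructure of tree A?
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def covers(a, b):
    # Does the tree rooted at a contain b starting at this exact position?
    if b is None:
        return True          # B exhausted: matched
    if a is None:
        return False         # A exhausted but B is not
    return a.val == b.val and covers(a.left, b.left) and covers(a.right, b.right)

def has_subtree(a, b):
    if a is None or b is None:
        return False         # by convention, the empty tree is not a substructure
    return covers(a, b) or has_subtree(a.left, b) or has_subtree(a.right, b)

# Example: B = 8 with left child 9 occurs inside A.
a = Node(8, Node(8, Node(9), Node(2)), Node(7))
b = Node(8, Node(9))
print(has_subtree(a, b))     # True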
- MongoDB Backup and Restore
开窍的石头
mongoDB backup and restore
MongoDB export and import
1: Import/export can operate on a local mongodb server or on a remote one.
So both share these common options:
-h host
--port port
-u username
-p password
2: mongoexport exports files in JSON format
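As a rough Python (pymongo) analogue of such a mongoexport-style JSON dump; the connection details and the mydb.users collection are hypothetical:
# Dump one collection to a JSON-lines file, like mongoexport does.
from pymongo import MongoClient
from bson.json_util import dumps

client = MongoClient("localhost", 27017)
with open("users.json", "w") as out:
    for doc in client.mydb.users.find():
        out.write(dumps(doc) + "\n")   # bson.json_util handles ObjectId, dates, etc.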
- [Networks and Communications] Some Problems in Computing Elliptical Orbits
comsci
networks
By the ancient Chinese lunar calendar, a certain season should be starting about now, but that calendar rests on astronomical observations made 3000 years ago; corrected against today's astronomical records, the season has in fact been over for some time.....
In other words, we would have to wait another 3000 years for the chance. The elliptical orbits of the solar system's planets are perturbed by outside bodies, and the ordering of the orbits has chan
- How to Apply for a Software Patent
cuiyadll
software patent application
Software technology can be protected by registering software copyright, which covers the source code, or by applying for an invention patent, which covers how the steps of a software process are carried out. A patent protects the idea by which the software solves a problem, while software copyright protects the code, that is, the expression of the idea. Take offline file transfer: an invention patent protects how offline transfer is implemented, while from the same software idea there can be countless different implementations of the transfer code, each enjoying its own copyright. The agency fee for one software invention patent application is roughly 5000-8000; applying for an invention patent can
- Android Study Notes
darrenzhu
android
1. Start an AVD
2. Run adb shell from the command line to connect to the AVD; this is the command-line client
3. How to launch a program:
am start -n package name/.activityName
am start -n com.example.helloworld/.MainActivity
The command to launch the Android Settings tool is as follows:
# am start -
- Apache Virtual Host Configuration: Multiple Local Domain Names for Local Sites
dcj3sjt126com
apache
Suppose you have two directories, one at /htdocs/a and the other at /htdocs/b.
Now, for local testing, you want www.freeman.com to map to the directory /xampp/htdocs/freeman and www.duchengjiu.com to map to /htdocs/duchengjiu.
1. First, under the C:\WINDOWS\system32\drivers\etc directory, modify the
- yii2 RESTful Web Services [Rate Limiting]
dcj3sjt126com
PHP, yii2
Rate limiting
To prevent abuse, you should consider adding rate limiting to your API. For example, you might cap each user at 100 API calls within any 10-minute window. If too many requests are received from a user within that period, a response with status code 429 ("too many requests") is returned.
To enable rate limiting, the [[yii\web\User::identityClass|user identity class]] should implement [[yii\filter
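The yii2-specific steps are cut off above, but the policy itself (100 calls per user per 10 minutes, answer 429 beyond that) can be sketched framework-independently; a minimal Python version of the bookkeeping:
# At most LIMIT calls per user per WINDOW seconds, else respond 429.
import time

WINDOW, LIMIT = 600, 100          # 10 minutes, 100 calls
calls = {}                        # user_id -> timestamps of recent calls

def allow(user_id):
    now = time.time()
    recent = [t for t in calls.get(user_id, []) if now - t < WINDOW]
    if len(recent) >= LIMIT:
        calls[user_id] = recent
        return 429                # too many requests
    recent.append(now)
    calls[user_id] = recent
    return 200

print(allow("alice"))             # 200 until the quota is used up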
- Installing Hadoop 2.5.2: Standalone Mode
eksliang
hadoop, hadoop standalone deployment
When reposting, please cite the source: http://eksliang.iteye.com/blog/2185414 1. Overview
Hadoop has three modes: standalone, pseudo-distributed, and fully distributed; here we briefly cover standalone mode. By default, Hadoop is configured in non-distributed mode, running as a single Java process, which is suitable for early debugging.
2. Download
Hadoop site: http:
- LoadMoreListView + SwipeRefreshLayout (Paging and Pull-to-Refresh): The Basic Structure
gundumw100
android
Everything in the name of rapid iteration
import java.util.ArrayList;
import org.json.JSONObject;
import android.animation.ObjectAnimator;
import android.os.Bundle;
import android.support.v4.widget.SwipeRefreshLayo
- Three Simple Front-End HTML/CSS Questions
ini
html, Web front end, css, questions
When using CSS to apply the same style of layout and appearance to several web pages, the best choice for keeping those pages easy to modify is ( ). http://hovertree.com/shortanswer/bjae/7bd72acca3206862.htm
Adding <table style="color:red; font-size:10pt"> in HTML is an instance of ( ). http://hovertree.com/s
- Compile Error on an Overridden Method
kane_xie
override
Problem description:
One or more @Override methods in an implementation class fail to compile with the following error:
Name clash: The method put(String) of type XXXServiceImpl has the same erasure as put(String) of type XXXService but does not override it
After removing the @Over
- Using a Proxy IP in Java to Fetch Page Content (Avoiding IP Bans When Crawling)
mcj8089
free proxy IPs, proxy IP, data crawlers, JAVA, setting a proxy IP, crawlers and IP bans
Two recommended proxy-IP sites:
1. Quanwang proxy IP: http://proxy.goubanjia.com/
2. Qiaodm free IP: http://ip.qiaodm.com/
Java has two ways to use a proxy IP to access a URL and fetch its content.
Method 1: set System properties
// Set the proxy IP
System.getProper
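The Java code is truncated above. For comparison, the same fetch-through-a-proxy idea in Python (the proxy address is a placeholder; substitute a live one from the lists above):
# Fetch a page through an HTTP proxy with the requests library.
import requests

proxies = {"http": "http://1.2.3.4:8080", "https": "http://1.2.3.4:8080"}
resp = requests.get("http://example.com", proxies=proxies, timeout=10)
print(resp.status_code, len(resp.text))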
- Node.js Express Error: listen EADDRINUSE
qiaolevip
a little progress every day, learning never ends, nodejs, observing all things
When you start the nodejs server and it reports:
>node app
Express server listening on port 80
events.js:85
throw er; // Unhandled 'error' event
^
Error: listen EADDRINUSE
at exports._errnoException (
- Three Uses of new in C++
_荆棘鸟_
C++, new
Reposted from: http://news.ccidnet.com/art/32855/20100713/2114025_1.html
Author: mt
The first is the new operator, also called the new expression; the second is operator new, also called the new operator function. The two English names are so similar that they are easy to confuse, so remember them by their Chinese names instead. The new expression is the more common and most used of the two, for example:
string* ps = new string("
- Ruby Deep-Dive Notes 1
wudixiaotie
Ruby
Modules can define private methods
module MTest
def aaa
puts "aaa"
private_method
end
private
def private_method
puts "this is private_method"
end
end
[–]mdooder 41 points 1 year ago
Hello Prof. Bengio, what motivates you to stay in academia? What do you think about corporate research labs in terms of productivity and innovation compared to academic labs? Does research flexibility (doing what you want, more or less) play a large role in this decision?
[–]yoshua_bengio Prof. Bengio 29 points 1 year ago
I like academia because I can choose what to work on, I can choose to work on long-term goals, I can work for the benefit of humanity rather than for a specific company, and I can talk about my work freely. Note that to different degrees, my esteemed colleagues in large industrial labs also enjoy some of that freedom.
[–]alecradford 30 points 1 year ago*
Hi there! I'm an undergrad and your work combined with Hinton's is a huge inspiration to me! A bunch of questions, so feel free to answer all or none!
Hinton semi-recently offered an awesome MOOC on Coursera on NNs. The resources and lectures it provided are what allowed me and many others to build homebrew nets and really get into the field. It would be a great resource if another researcher at the forefront of the field offered their own take; do you have any plans for something like this?
As a leading professor in the field, how do you personally view the resurgence of interest in modern NN applications? Do you believe it's well-deserved recognition, guilty of overhype, some mixture of the two, or something completely different? On a similar note, how do you feel about the portrayal of modern NN research in popular literature?
I'm interested in using unsupervised techniques to learn automated data augmentations/corruptions for increasing generalization performance, which I hope is a promising hybrid of supervised and unsupervised learning that's different from traditional pretraining. A lot of advances have been made using "simple" data augmentations/corruptions pioneered in your lab like gaussian noise corruption and what we now call input dropout in the context of DAEs. Preliminary results on MNIST seem successful (~0.8% permutation invariant) and I can send code if you are interested but admittedly I'm just an undergrad with no formal research experience. Do you see this as an area with potential and could you point me to any resources or papers that you are aware of - I've had a hard time finding them.
No one has a crystal ball, but what do you see as the most interesting areas of research for continuing to advance your work? The last few years have seen purely supervised techniques make a lot of headway, riding on the success of dropout, for instance.
Thank you so much for doing this AMA, it's great to have you here on /r/MachineLearning!
[–]yoshua_bengio Prof. Bengio 22 points 12 months ago
I have no clear plan for a MOOC but I might do one eventually. In the meantime, I am writing a new and more complete book on deep learning (with Ian Goodfellow and Aaron Courville). Some draft chapters should come out in the next few months, and feedback from the community and students would be great. Note that Hugo Larochelle (formerly a PhD student with me and a post-doc with Hinton) has great videos on deep learning http://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH (and slides on his web page).
I believe that the recent surge of interest in NNets just means that the machine learning community wasted many years not exploring them, in the 1996-2006 decade, mostly. There is also hype, especially if you consider the media. That is unfortunate and dangerous, and will be exploited especially by companies trying to make a quick buck. The danger is to see another bust when wild promises are not followed by outstanding results. Science mostly moves by small steps and we should stay humble.
I have no crystal ball but I believe that improving our ability to model joint distributions (either in an unsupervised way or conditioned on some input, either explicitly or implicitly through learning of good representations) is going to be crucial for future progress of deep learning towards AI-level machine understanding of the world around us.
Another easy prediction is that we need to and will make progress towards efficiently training much larger models. This involves improvements in the way we train models (the numerical optimization involved), as well as in ways to do it computationally more efficiently (e.g. through parallelization and other tricks that avoid doing the computation associated with all the parts of the network for every example).
You can find out more in my arxiv paper on "looking forward": http://arxiv.org/abs/1305.0445
[–]Sigmoid_Freud 14 points 1 year ago
Traditional (deep or non-deep) Neural Networks seem somewhat limited in the sense that they cannot keep any contextual information. Each datapoint/example is viewed in isolation. Recurrent Neural Networks overcome this, but they seem to be very hard to train and have been tried in a variety of designs with apparently relatively limited success.
Do you think RNNs will become more prevalent in the future? For which applications and using what designs?
Thank you very much for taking your time to do this!
[–]yoshua_bengio Prof. Bengio 16 points 1 year ago
Recurrent or recursive nets are really useful tools for modelling all kinds of dependency structures on variable-sized objects. We have made progress on ways to train them and it is one of the important areas of current research in the deep learning community. Examples of applications: speech recognition (especially the language part), machine translation, sentiment analysis, speech synthesis, handwriting synthesis and recognition, etc.
[–]omphalos 2 points 1 year ago
I'd be curious to hear his thoughts on any intersection between liquid state machines (one approach to this problem) and deep learning.
[–]yoshua_bengio Prof. Bengio 11 points 1 year ago*
Liquid state machines and echo state networks do not learn the recurrent weights, i.e., they do not learn the representation. Instead, learning good representations is the central purpose of deep learning. In a way, the echo-state / liquid state machines are like SVMs, in the sense that we put a linear predictor on top of a fixed set of features. The features are functions of the past sequence through the smartly initialized recurrent weights, in the case of echo state networks and liquid state machines. Those features are good, but they can be even better if you learn them!
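To make the fixed-features point concrete, here is a tiny echo-state-network sketch (our illustration; the sizes and the toy next-step prediction task are invented): the recurrent weights are random and frozen, and only a linear readout is fit.
# Echo state network: random frozen reservoir + learned linear readout.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, T = 1, 100, 500
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))     # scale spectral radius below 1

u = np.sin(np.linspace(0, 20 * np.pi, T))[:, None]   # input signal
x = np.zeros(n_res)
states = []
for t in range(T):
    x = np.tanh(W_in @ u[t] + W @ x)          # reservoir update (never trained)
    states.append(x.copy())
X, y = np.array(states[:-1]), u[1:, 0]        # features -> next input value
w_out = np.linalg.lstsq(X, y, rcond=None)[0]  # the only learned weights
print(np.mean((X @ w_out - y) ** 2))          # small training error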
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
See the answer I already gave here: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpboj8
[–]Noncomment 3 points 12 months ago
Did you mean recursion?
[–]omphalos 2 points 12 months ago
Thank you for the reply. Yes I understand the analogy to SVMs. Honestly I was wondering about something more along the lines of using the liquid state machine's untrained "chaotic" states (which encode temporal information) as feature vectors that a deep network can sit on top of, and thereby construct representations of temporal patterns.
[–]rpascanu 3 points 12 months ago
I would add that ESNs or LSMs can provide insight into why certain things do or don't work for RNNs, so having a good grasp of them can definitely be useful for deep learning. An example is Ilya's work on initialization (jmlr.org/proceedings/papers/v28/sutskever13.pdf), where they show that an initialization based on the one Herbert Jaeger proposed for ESNs is very useful for RNNs as well.
They also offer quite a strong baseline most of the time.
[–]freieschaf 2 points 1 year ago
Take a look at Schmidhuber's page on RNNs. There is quite a lot of info on them, and especially on LSTM NNs, an RNN architecture designed precisely to tackle the vanishing gradient when training RNNs, allowing them to keep track of a longer context.
[–]PasswordIsntHAMSTER 13 points 1 year ago
Hi Prof. Bengio, I'm an undergrad at McGill University doing research in type theory. Thank you for doing this AMA!
Questions:
My field is extremely concerned with formal proofs. Is there a significant focus on proofs in machine learning too? If not, how do you make sure to maintain scientific rigor?
Is there research being done on the use of deep learning for program generation? My intuition is that eventually we could use type theory to specify a program and deep learning to "search" for an instantiation of the specification, but I feel like we're quite far from that.
Can you give me examples of exotic data structures used in ML?
How would I get into deep learning starting from zero? I don't know what resources to look at, though if I develop some rudiments I would LOVE to apply for a research position on your team.
[–]yoshua_bengio Prof. Bengio 10 points 12 months ago
There is a simple way to get scientific rigor without proof, and it's used throughout science: it's called the scientific method, and it relies on experiments and hypothesis-testing ;-) Besides, math is getting into more and more deep learning papers. I have been interested for some time in proving properties of deep vs shallow architectures (see my papers with Delalleau, and more recently with Pascanu). With Nicolas Le Roux I worked on the approximation properties of RBMs and DBNs. I encourage you to also look at the papers by Montufar. Fancy math there.
Deep learning from zero? There is lots of material out there, some of it listed on deeplearning.net:
My 2009 paper/book (a new one is on the way!): http://www.iro.umontreal.ca/~bengioy/papers/ftml_book.pdf
Hugo Larochelle's neural networks course & youtube videos: http://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH (slides on his webpage)
Practical recommendations for training deep nets: http://arxiv.org/abs/1206.5533
A recent review: https://arxiv.org/abs/1206.5538
[–]PokerPirate 2 points 1 year ago
On a related note, I am doing research in probabilistic programming languages. Do you think there will ever be a "deep learning programming language" (whatever that means) that makes it easier for non-experts to write deep learning models?
[–]ian_goodfellow[S] 5 points 12 months ago
I am one of Yoshua's graduate students and our lab develops a python package called Pylearn2 that makes it relatively easy for non-experts to do deep learning:
https://github.com/lisa-lab/pylearn2
You'll still need to have some idea of what the algorithms are meant to be doing, but at least you won't have to implement them yourself.
[–]nxvd 5 points 1 year ago
It's not a programming language in the usual sense, but Theano is a pretty neat way to describe and train neural network architectures, however deep they are and whatever their characteristics. It's actually developed by people in Dr. Bengio's lab if I'm not mistaken.
[–]serge_cell 2 points 1 year ago
IMHO there definitely should be. There are several open source packages with similar functionality right now, and different research papers refer to different packages for reproducing results. It would be great if one didn't have to install and learn a new package to reproduce a result, but could just use a ready-made config or script in a DL language. It would improve reproducibility too: results reproduced with different implementations carry more weight.
[–]PokerPirate 1 point 1 year ago
links?
[–]serge_cell 3 points 1 year ago
I'm mostly familiar with convolutional networks, so most of the packages here are for CNNs and autoencoders
Fastest:
1. cuda-convnet - most used gpgpu implementation, used in other packages too
https://code.google.com/p/cuda-convnet/ there are also several forks on github
2. caffe
https://github.com/BVLC/caffe
3. NNforge
http://milakov.github.io/nnForge/
Based on cuda-convnet, but includes more stuff:
4. pylearn2
https://github.com/lisa-lab/pylearn2
other stuff:
http://deeplearning.net/software_links/
[–]polyguo 2 points 1 year ago
What probabilistic programming languages are you researching? Any experience with Church? I have an internship this summer with someone who does research using PPLs and it would be immensely useful to me if you could point me to resources that would allow me to get more familiar with the subject matter. Papers and actual code would be best.
[–]PokerPirate 1 point 1 year ago
Have you been to http://probmods.org? It's a pretty thorough tutorial.
[–]polyguo 2 points 1 year ago
I'm actually taking the probabilistic graphical models course on Coursera and I got a copy of Koller's book. I'm familiar with the theory; I've yet to see mature code written in PPLs.
And, yes, I've been to the site. I'm actually going to be working with one of the authors.
[–]PokerPirate 1 point 1 year ago
me too :)
[–]dwf 1 point 1 year ago
Machine learning is a big field. The folks who submit to COLT would be big on proofs. Others, not as much. Empirical study counts for a lot.
[–]orwells1 1 point 12 months ago
Can't see a reply so this might help:
Ilya Sutskever https://vimeo.com/77050653 2013, 1:05:13
[–]wardnath 15 points 1 year ago*
Dr. Bengio, in your paper Big Neural Networks Waste Capacity you suggest that gradient descent does not work as well with a lot of neurons as it does with fewer. (1) Why do the increased interactions create worse local minima? (2) Do you think Hessian-free methods like in (Martens 2010) are sufficient to overcome these issues?
Thank You!
Ref: Dauphin, Yann N., and Yoshua Bengio. "Big neural networks waste capacity." arXiv preprint arXiv:1301.3583 (2013).
Martens, James. "Deep learning via Hessian-free optimization." Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.
[–]dhammack 9 points 1 year ago
I think the answer to this one is that the increased interactions just lead to more curvature (off-diagonal Hessian terms). Gradient descent, as a first-order technique, ignores curvature (it assumes the Hessian is the identity matrix). So what happens is that gradient descent is less effective in bigger nets, because you tend to "bounce around" minima.
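A toy numerical illustration of that bouncing (our example, not from the thread): on an ill-conditioned quadratic, the step size that is safe for the steep direction is far too small for the shallow one.
# Gradient descent on f(x) = 0.5 * x^T H x with H = diag(1, 100).
import numpy as np

H = np.diag([1.0, 100.0])
x = np.array([1.0, 1.0])
lr = 0.019                     # must stay below 2/100 or the steep axis diverges
for _ in range(50):
    x = x - lr * (H @ x)       # gradient of 0.5 x^T H x is H x
print(x)                       # steep coord ~0.005 (after oscillating), shallow still ~0.38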
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
This is essentially in agreement with my understanding of the issue. It's not clear that we are talking about local minima, but what I call 'effective local minima', because training gets stuck (they could also be saddle points or other kinds of flat regions). We also know that 2nd order methods don't do miracles, in many cases, so something else is going on that we do not understand yet.
[–]ian_goodfellow[S] 10 points 1 year ago
Verification post: https://plus.google.com/103174629363045094445/posts/2fqbkyYULAf
[–]hf98hf43j2klhf9 7 points 1 year ago
We should try to request Yann LeCun as well; he seems to be open to the idea.
[–]Megatron_McLargeHuge 11 points 1 year ago
With the recent success of maxout and hinge activations, how relevant is the older work on RBM pretraining using various contrastive divergence tweaks? What do you think is still worth investigating about stochastic models?
How biologically plausible is maxout, and should we care?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago*
The older work on RBM and auto-encoders is certainly still worth further investigation, along with the construction of other novel unsupervised learning procedures.
For one thing, unsupervised procedures (and pre-training) remain a key ingredient to deal with the semi-supervised and transfer learning cases (and domain adaptation, and non-stationary data), when the number of labeled examples of the new classes (or of the changed distribution) is small. This is how we won the two 2011 transfer learning competitions (held at ICML and NIPS).
Furthermore, looking farther into the future, unsupervised learning is very appealing for other reasons:
take advantage of huge quantities of unlabeled data
learn about the statistical dependencies between all the variables observed so that you can answer NEW questions (not seen during training) about any subset of variables given any other subset
it's a very powerful regularizer and can help the learner to disentangle the underlying factors of variation, making it much easier to solve new tasks from very few examples
it can be used in the supervised case when the output variable (to be predicted) is a very high-dimensional composite object (like an image or a sentence), i.e., a so-called structured output
Maxout and other such pooling units do something that may be related to the local competition (often through inhibitory interneurons) between neighboring neurons in the same area of cortex.
[–]ian_goodfellow[S] 3 points 12 months ago
Right now pretraining does seem to be helpful for preventing overfitting in cases where there is very little labeled training data available. It no longer seems to be necessary as an optimization technique for deep networks, since we can just use the piecewise linear activation functions that are easy to optimize even for very deep networks.
Probabilistic models are still useful for tasks like classification with missing input (because they can reason about the missing inputs), or tasks where the goal is to repair damaged inputs (example: photo touchup) or infer the values of missing inputs, or where the task is just to generate realistic samples of data. It can also often be useful to have a probabilistic model that you use as part of a larger system. For example, if you want to use a neural net as part of an HMM, the HMM requires that its observation and transition models provide real probabilities.
Rectified linear units were partially motivated by biological plausibility concerns, because some neuroscientific evidence suggests that real neurons rarely operate in the regime where they reach their maximum firing rate.
I'm the grad student who came up with maxout, and I didn't have any biological plausibility concerns in mind when I came up with it. After I started using maxout for machine learning, another of Yoshua's grad students, Caglar Gulcehre, told me that there is some neuroscientific evidence for a function similar to maxout, but with an absolute value, being used in the deeper layers of the cortex. I don't know much about this myself. One thing about maxout that makes it a little bit difficult to explain in biological terms is the fact that maxout units can take on negative values. This is a bit awkward for biological neurons, since it's not possible to have a negative firing rate. But maybe biological neurons could use some average firing rate to indicate 0, and indicate negative values by firing less often than that.
My main interest is in engineering intelligent systems, not necessarily understanding how the human brain works. Because that's what my interest is, I am not very concerned with biological plausibility. Right now it seems easier to make progress in machine learning just by working from first principles than by reverse-engineering the brain. We don't have good enough sensor equipment to extract the kind of information from the brain that we would need to make reverse engineering it convenient.
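For readers who have not seen it: a maxout unit simply takes the max over a small group of linear responses. A minimal numpy sketch of the forward pass (sizes invented):
# Maxout layer: each output unit is the max over k linear "pieces".
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, k = 8, 4, 3            # k linear pieces per output unit
W = rng.normal(size=(k, n_out, n_in))
b = rng.normal(size=(k, n_out))

def maxout(x):
    z = np.einsum("koi,i->ko", W, x) + b   # k x n_out linear responses
    return z.max(axis=0)                   # elementwise max over the k pieces

print(maxout(rng.normal(size=n_in)))       # 4 outputs; note they can be negative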
[–]jkyle1234 13 points 1 year ago*
Hello Prof. Bengio, thank you for the AMA. What recommendations would you have for someone without a PhD on getting started with deep learning?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago
See some of the pointers I put above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s
[–]32er234 1 point 12 months ago
Something's wrong with the link
[–]uber_kerbonaut 1 point 12 months ago
maybe he's referring to this one: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpn5yp
[–][deleted] 12 points 1 year ago
Dear Yoshua, thanks for doing this!
You are, to my knowledge, the only ML academic to publicly (and wonderfully!) speculate about the sociocultural perspectives afforded by the vantage of deep representation learning. In your fascinating article "Culture vs Local Minima" you touch on many important things, some of which I'm very curious about:
You describe how individuals learn by being immersed in culture. We both agree that they don't always learn very wholesome things. If you were king of the world, and you could prescribe a set of concepts that should be a part of every childhood learning trajectory, what would those be and to what end?
A corollary of "cultural immersion" is that the specific process of learning is not evident to the learner, the world simply "is" in a particular way. The author David Foster Wallace phrased this phenomenon as akin to fish having to figure out what water is. In your opinion, is this phenomenon an experiential byproduct of the neural architecture, or does it confer some learning benefit?
Why do you think that cultural trends become entrenched and cause their learners to fight to stay in (what could be argued to be) local optima - like e.g. the conflicts between various religious institutions and Enlightenment philosophy, or patriarchal society vs the suffragettes, etc.? Is this a case of very pernicious parameters, or is there some benefit to the learners in question?
Do you have an opinion on such concepts as mindfulness meditation, and if so, how do you think they relate to the exploration of "idea space"?
Again, thanks a lot for taking the time. In the space of human ideas you are a trailblazer, and we are immensely richer for your presence!
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
I am not a social scientist or a psychologist, so my opinions on these subjects should be taken as such. My opinion is that many learners stay entrenched in their beliefs because these beliefs have become part of their identity, their definition of who they are, and it's harder and scary to change that. There may also be a more computational aspect related to the notion of effective local minima (the optimization getting stuck). I believe that a lot of what our brain does is try to bring coherence to all of our experience, in order to construct a better model of the world. Mathematically, this may be related to the problem of inference, by which a learner searches for plausible explanations (latent variables) of the observed data. In stochastic models, inference is done by a form of stochastic exploration of configurations (and a Markov chain really looks like a series of free associations). Meditation and other time spent not doing anything directed but just thinking may well be useful to help us explore in this way. Sometimes it clicks, i.e., we find an explanation that fits well with many things. This is also how scientific ideas often seem to emerge (for me at least).
[–]yoshua_bengio Prof. Bengio 10 points 1 year ago
Verification post: https://plus.google.com/112504130537129706790/posts/eqdBAysAyqR
[–]vondragon 8 points 1 year ago
I live in Montreal, working in the technology startup world. Very interested in your work, thank you for doing this AMA Professor Bengio. I worked hard to filter down to one question:
There seems to be a lot of disinterest from machine learning specialists and academics in general towards ML competitions hosted by Kaggle and the like. I recognize the odds of winning are quite low, making the return on the investment of your time even worse, and it would seem to be even worse for the ML enthusiasts who are flocking to participate. Still, a few hours from an ML domain expert could be really beneficial on the right open datasets. Can you imagine an open, collaborative approach to competitive machine learning where experts and enthusiasts work effectively together?
[–]EJBorey 10 points 1 year ago
Here's an example where experts won a Kaggle contest: http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview/ And here, where they won the Netflix Prize: http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html
But I think the reason they don't work on these problems is that the bad ML researchers won't win (and therefore won't publish), while the good ones can get paid millions of dollars by companies to answer the same questions! Why do it for free?
[–]vondragon 6 points 1 year ago
I would estimate that a majority of the time ML 'experts' do win the competitions, but they might not be recognized experts.
When a "non-expert" does win, they typically make up for their lack of domain-specific ML knowledge by being an expert in a related domain like stats, math, programming, etc.
I think the dataset is an important factor to consider here. Is it possible for an ML researcher to spend an insignificant amount of their time applying some of their knowledge to building the model, at which point a larger crowd of less specialized people can compete on the remaining work?
[–]PasswordIsntHAMSTER 2 points 1 year ago
I'm in Montreal too, where do you work? o.O
[–]vondragon 1 point 1 year ago
Near Sherbrooke =D
[–]dwf 2 points 12 months ago
ML researchers are usually trying to push the methodological envelope, but that's often not required to solve some arbitrary domain problem. Usually dealing with the mountain of annoyances in real-world data sources is what takes up the majority of the time, and then a random forest, boosted tree ensemble or SVM will do an acceptable job (especially compared to the usually pitiful posted baseline). Doing really, really well may require some finesse but also a large time investment that won't typically be rewarded in an academic incentive structure (as far as being rewarded monetarily, there's also something seriously wrong with the economics of Kaggle, as is well articulated by this lightning talk; anyone who's any good and has a clue what they're worth won't bother).
In short, winning competitions is usually only useful to an academic if it demonstrates a particular research-related point.
[–]marvinalone 9 points 1 year ago
What's your opinion of Solomonoff Induction and AIXI? I'm just starting to read up on the topic, and I can't quite decide whether it's serious work, or a fringe theory by a small group of people who all cite each other.
[–]dylanbyte 2 points 1 year ago
I am interested in this also.
[–]eaturbrainz 2 points 1 year ago
Not Bengio, but reasonably well-versed in this specific topic.
It's serious work by theoreticians. You need a freaking Turing oracle to make those algorithms work, and all the relevant proofs are about global optimality in the presence of that Turing oracle, not about how good a learning/error rate you're going to get from a finite sample with limited computing power (which is what you would need to build real algorithms).
That said, Schmidhuber and Hutter (who invented AIXI) have publication and competition records like nobody fucking else.
[–]dwf 2 points 12 months ago
I'll just say that while the IDSIA group's competition record and benchmark results are impressive, it's important to compare apples to apples. Comparing a method that uses elastic distortions and other dataset augmentation strategies against a method that doesn't tells you nothing about either method; it's been known for decades that more data helps, and that you can sometimes acquire more data by artificially augmenting a given training set with distortions. It's important not to conflate impressive engineering with scientific novelty.
[–]EJBorey 9 points 1 year ago
We have all been hearing about the performance achievable via deep learning (in academic journals such as the New York Times, no less!). I've also heard that it's difficult for non-experts to get these techniques to work: Ilya Sutskever says that there is a weighty oral tradition about the design and training of deep networks and that the best way to learn how is to work for years with someone who is already an expert (source: http://vimeo.com/77050653).
I studied machine learning but not deep learning. Going back to grad school is not really an option for me. How can I learn how to design, build, and train deep neural networks without access to the oral tradition? Could you write it down for us somewhere?
[–]yoshua_bengio Prof. Bengio 3 points 12 months ago
See the pointers I put above:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq6wf0
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s
[–]EJBorey 1 point 12 months ago
The second link is broken.
Do Hugo Larochelle's videos answer the questions here: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq4rvi ?
[–]dylanbyte 2 points 1 year ago
Related to this: would it be possible to use a Bayesian approach to try to encode some of this folklore knowledge?
What is the road map to making deep learning accessible to all?
Thank you.
[–]yoshua_bengio Prof. Bengio 8 points 12 months ago
Hyper-parameter optimization has already been found to be a useful way to (partially) automate the search for good configurations in deep learning.
The idea is to automate the process of selecting the knobs, bells and whistles of machine learning algorithms, and especially of deep learning algorithms. We call such "knobs" hyper-parameters. They are different from the parameters that are learned during training, in that they are typically set by hand, by trial and error, or through a dumb and extensive exploration of all combinations of values (called "grid search"). Deep learning and neural networks in general involve many more such knobs to be tuned, and that was one of the reasons why many practitioners stayed far from neural networks in the past. It gave the impression of deep learning as a "black art", and it remains true that strong expertise helps a lot, but the research on hyper-parameter optimization is helping to move towards a more fully automated deep learning.
The idea of optimizing hyper-parameters is old, but had not had as much visible success until recently. One of the main early contributors to this line of work (before it was applied to machine learning hyper-parameter optimization) is Frank Hutter (along with collaborators), who devoted his PhD thesis (2009) to algorithms for optimizing knobs that are typically set by hand in software systems in general. My former PhD student James Bergstra and I worked on hyper-parameter optimization a couple of years ago, and we first proposed a very simple alternative to the standard method (called "grid search"), namely "random sampling", which works very well and is very easy to implement.
http://jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
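The random-search idea from that paper is simple enough to sketch in a few lines (the objective below is a made-up stand-in for a real training-plus-validation run):
# Random search over hyper-parameters: sample configurations independently
# instead of walking a grid.
import random

def validation_score(lr, n_hidden):        # hypothetical objective, for illustration
    return -(lr - 0.01) ** 2 - (n_hidden - 256) ** 2 / 1e6

best, best_score = None, float("-inf")
for _ in range(60):                        # 60 random trials
    cfg = {"lr": 10 ** random.uniform(-5, -1),      # log-uniform learning rate
           "n_hidden": random.randrange(16, 1025)}  # uniform layer width
    s = validation_score(**cfg)
    if s > best_score:
        best, best_score = cfg, s
print(best, best_score)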
We then proposed using for deep learning the kinds of algorithms Hutter had developed for other contexts, called sequential optimization, and this was published at NIPS'2011 in collaboration with another PhD student who devoted his thesis to this work, Remi Bardenet, and his supervisor Balazs Kegl (previously a prof in my lab, now in France).
http://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
This work has been followed up very successfully by researchers at U. Toronto, including Jasper Snoek (then a student of Geoff Hinton), Hugo Larochelle (who did his PhD with me) and Ryan Adams (now a faculty at Harvard) with a paper at NIPS'2012 where they showed that they could push the state-of-the-art on the ImageNet competition, helping to improve the same neural net that made Krizhevsky, Sutskever and Hinton famous for breaking records in object recognition.
http://www.dmi.usherb.ca/~larocheh/publications/gpopt_nips.pdf
Snoek et al. put out software called 'spearmint' that has since been used by many researchers, and I found out recently that Netflix has been using it in their new work aiming to take advantage of deep learning for movie recommendations:
http://techblog.netflix.com/2014/02/distributed-neural-networks-with-gpus.html
[–]james_bergstra 1 point 12 months ago*
Plug for Bayesian Optimization and Hyperopt:
FWIW my take is that Bayesian Optimization + Experts designing the search spaces for SMBO algorithms is the way to deal with this: e.g. other post and ICML paper on tuning ConvNets
The Hyperopt Python package provides SMBO for ConvNets, NNets, and (soon) a range of classifiers from scikit-learn (hyperopt-sklearn).
Sign up for Hyperopt-announce to get alerts about new stuff such as upcoming Gaussian-Process and regression-tree-based SMBO search algorithms similar to Jasper Snoek's Spearmint and Frank Hutter's SMAC software.
[–]EJBorey 2 points 1 year ago
Actually, I wasn't asking about the Bayesian optimization work that Jasper Snoek et al. are doing, because I don't think it will be possible to automate away all human judgement in the design of these things. Rather, I wanted to know how to quickly acquire the necessary intuition without postdoc-ing in Bengio, Hinton, or LeCun's labs.
Deep learning will never be practical if only 10 people on the planet can get it to work! Is there a way to quickly become one of the savants?
[–]orwells1 1 point 12 months ago*
Hello, same here. I fit the bill of their intended PhD students (according to Y. LeCun's page: awesome math + coder), but wanted to avoid more PhD/post-docs. I went through a reasonable number of papers, but in most, explanations are either missing or the authors later comment online on the "human in the loop optimization"/"tricks of the trade"/"black magic". I'm not sure I should be investing much more of my time alone if the full knowledge is not out there. Is it? Thanks a lot for doing this!
[–]serge_cell 9 points 1 year ago
Hi Prof. Bengio, there has been some work on applying "higher" math (algebraic/tropical geometry, category theory) to deep learning. Notably, John Healy claimed several years ago to have improved a neural net (ART1) using category theory. What's your opinion of this approach? Will it remain a toy model for the foreseeable future, or do you see some promise in it?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago
See the suggestions above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s Regarding algebraic/tropical geometry, look at the work of Morton & Montufar.
[–]polyguo 2 points 1 year ago
Source? I'm extremely interested in the intersection between Programming Language Theory and Machine Learning. This seems to be right there.
[–]serge_cell 2 points 1 year ago
Healy:
http://www.ece.unm.edu/~mjhealy/
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.98.6807
Tropical geometry
Tropical geometry of statistical models
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.242.9890
[–]n_dimensional 8 points 1 year ago
Dear Prof. Bengio,
I am about to finish my PhD in computational neuroscience and I am very interested in the "gray area" between neuroscience and machine learning.
What aspects of brain computation do you think are (or will be) most relevant for machine learning?
If you could know the answer to one question about how the brain computes information, what would that be?
Thanks!
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
Understanding how learning proceeds in brains is clearly the subject most relevant to machine learning. We don't have a clue of how brains can learn in the kinds of efficient ways that we are able to implement in artificial neural networks, so this could be really important, and a place where information could flow both ways between machine learning research and computational neuroscience.
[–]exellentpossum 21 points 1 year ago
When asked about sum product networks, one of the original Google Brain team members told me he's not interested in tractable models.
What's your opinion about sum product networks? They made a big splash at NIPS one year and now they've disappeared.
[–]yoshua_bengio Prof. Bengio 6 points 1 year ago
There are many kinds of intractabilities that show up in different places with various learning algorithms. The more tractable, the easier to deal with in general, but that should not come at the price of losing crucial expressive power. I don't have a sufficiently clear mental fix on the expressive power of SPNs to know how much we lose (if any) through this parametrization of a joint distribution. In any case, all the interesting models that I know of suffer from intractability of minimizing the training criterion wrt the parameters (i.e. training is fundamentally hard, at least in theory). SVMs and other related kernel machines do not suffer from that problem, but they may suffer from poor generalization unless you provide them with the right feature space (which is precisely what is hard, and what deep learning is trying to do).
[–]celestec 3 points 1 year ago
Hi exellentpossum, I am studying some machine learning on my own and have not yet come across "tractable models." What exactly is a tractable model? (Searching on my own didn't help much...) Sorry if this is a dumb question.
[–]exellentpossum 3 points 1 year ago
In the context of sum product networks, it means that inference is tractable or doesn't suffer from the exponential growth in computational cost when you add more variables.
This comes at a price though: sum product networks can only represent certain types of distributions, more specifically, probability distributions whose parameterization can be expressed as a product of factors (which, when multiplied out, creates a much larger polynomial). I'm not sure of the exact scope of the distributions this encompasses, but it does include hierarchical mixture models.
[–]Scrofuloid 3 points 1 year ago*
Not quite. All graphical models can be represented as products of factors, and deep belief networks and such are special cases of graphical models. Inference in graphical models is usually exponential in the treewidth of the graph. So, in conventional graphical-model wisdom, low-treewidth graphical models were considered 'tractable', and high-treewidth models were 'intractable', so you'd have to use MCMC or BP or other approximate algorithms to solve them.
Any graphical model can be compiled into an SPN-like structure (an arithmetic circuit, or AC). The problem is that in the worst-case, the resulting circuit can be exponentially large. So even though inference is still linear in the size of the circuit, it's potentially exponential in the size of the original graphical model. But it turns out certain high-treewidth graphical models can still be compiled into compact circuits, so you can still do efficient inference on them. This means that there are certain high-treewidth graphical models on which inference is tractable -- kind of a surprise to the graphical models community.
You can think of ACs and SPNs as a way to compactly represent context-specific independences. They can compactly represent distributions that would result in high-treewidth graphical models if you tried to represent them in the usual graphical models way. The difference between ACs and SPNs is that ACs are compiled from Bayesian networks, as a means of performing inference on them. SPNs directly use the circuit to represent a probability distribution. So instead of training a graphical model and hoping you can compile it into a compact circuit (AC), you directly learn a compact circuit that fits your training data (SPN).
[–]exellentpossum 1 point 1 year ago
I agree, SPNs can represent any probability distribution. But there is a certain set which can be represented efficiently. Can you be more specific about this set of distributions which can take advantage of the factorization property of SPNs (a distribution with a reasonably sized circuit)?
[–]Scrofuloid 1 point 1 year ago
Hm. I don't know if there's a one-line way to characterize that set of distributions. It includes all low-treewidth graphical models, and some high-treewidth distributions with context-specific independences. Poon & Domingos' paper had a section relating SPNs to various other representations.
[–][deleted] 1 year ago
[deleted]
[–]BeatLeJuce 7 points 1 year ago
Why do deep networks actually work better than shallow ones? We know a 1-hidden-layer net is already a universal approximator (for better or worse), yet adding additional fully connected layers usually helps performance. Were there any theoretical or empirical investigations into this? Most papers I read just showed that they WERE better, but there were very few explanations as to why -- and if there was any explanation, it was mostly speculation. What is your view on the matter?
What was your most interesting idea that you never managed to publish?
What was funniest/weirdest/strangest paper you ever had to peer-review?
If I read your homepage correctly, you teach your classes in French rather than English. Is this a personal preference or mandated by your University (or by other circumstances)?
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
Being a universal approximator does not tell you how many hidden units you will need. For arbitrary functions, depth does not buy you anything. However, if your function has structure that can be expressed as a composition, then depth could help you save big, both in a statistical sense (fewer parameters can express a function that has a lot of variation, so fewer examples are needed to learn it) and in a computational sense (fewer parameters = less computation, basically).
I teach in French because U. Montreal is a French-language university. However, three quarters of my graduate students are non-francophones, so it is not a big hurdle.
[–]rpascanu 1 point 12 months ago
Regarding 1, there is some work in this direction. You can check out these papers:
http://arxiv.org/abs/1312.6098 (about rectifier deep MLPs),
http://arxiv.org/abs/1402.1869 (about deep MLPs with piecewise-linear activations),
RBM_Representational_Efficiency.pdf,
http://arxiv.org/abs/1303.7461.
Basically, the universal approximation theorem says that a one-layer MLP can approximate any function if you allow yourself an infinite number of hidden units, which in practice one cannot do. One advantage of deep models over shallow ones is that they can be (exponentially) more efficient at representing certain families of functions (arguably, the families of functions we actually care about).
[–]shanwhiz 6 points 1 year ago
We have seen deep learning work really well for image/video/sound. Do you foresee it working for text classification as well? Most papers that have tried text/document classification using deep learning have not done better than the conventional SVM/Bayes. What are your thoughts on this?
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
I predict that deep learning will have a big impact in natural language processing. It has already had an impact, in part due to an old idea of mine (from NIPS'2000 and a 2003 paper in JMLR): represent words by a learned vector of attributes, learned so as to model the probability distribution of sequences of words in natural language text. The current challenge is to learn distributed representations for sequences of words, phrases and sentences. Look at the work of Richard Socher, which is pretty impressive. Look at the work of Tomas Mikolov, who beat the state of the art in language models using recurrent networks and who found that these distributed representations magically capture some form of analogical relationships between words. For example, if you take the representation for Italy minus the representation for Rome, plus the representation for Paris, you get something close to the representation for France: Italy - Rome + Paris = France. Similarly, you get that King - Man + Woman = Queen, and so on. Since the model was not trained explicitly to do these things, this is really amazing.
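That analogy arithmetic can be reproduced with any set of word vectors. A self-contained toy sketch (the tiny hand-made vectors below only mimic the geometry; real embeddings are learned from text, as in Mikolov's work):
# "Italy - Rome + Paris ≈ France" via cosine similarity on toy vectors.
import numpy as np

vec = {
    "italy":   np.array([1.0, 0.9, 0.1]),
    "rome":    np.array([1.0, 0.1, 0.8]),
    "france":  np.array([0.2, 0.9, 0.1]),
    "paris":   np.array([0.2, 0.1, 0.8]),
    "germany": np.array([0.3, 0.85, 0.15]),
    "berlin":  np.array([0.3, 0.1, 0.75]),
}

def nearest(target, exclude):
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vec if w not in exclude),
               key=lambda w: cos(vec[w], target))

query = vec["italy"] - vec["rome"] + vec["paris"]
print(nearest(query, exclude={"italy", "rome", "paris"}))  # -> "france"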
[–]hapagolucky 10 points 1 year ago
I see more and more pop media articles extolling deep learning as a panacea that will make AI a reality (Wired is especially guilty of this). Given the AI winters of the 1970's and 1980's that arose from overhyped expectations, what can deep learning and ML researchers and advocates do to mitigate this from happening again?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago
Stick to the scientific ways of demonstrating advances (which are often lacking from companies branding themselves as doing deep learning). Avoid overselling. Stay humble while not losing the motivation associated with the long-term vision that brought us here in the first place.
[–][deleted] 7 points 1 year ago
Hi Prof. Bengio. I'm a master's candidate in robotics, mostly doing reinforcement learning mushed together with some ML regression methods for the identification of interesting value functions and state space representations.
How is your work life balance? Do you have fun? What sorts of things do you do to unwind?
I'm considering doing a PhD, but I literally feel like just getting a part-time job and doing independent research, because the academic environment can be pretty stifling.
Also, Montreal seems really fun!
J
[–]yoshua_bengioProf. Bengio 16 points 1 year ago
Life balance. That is tough. Many prominent scientists will tell you the same story. My inclination is to work as much as I can: that is probably part of the reason for my early success, but it may threaten my health and personal life. We live in an environment which puts so much pressure on us that it is easy to forget that we are humans, and we need breaks and to take care of our body (I have some health issues that I cannot just ignore) and our relationships with other humans. Some kind of self-discipline helps, but I found that what works best is to cultivate what is rewarding and pleasurable and at the same time good for me and my physical and emotional well-being. For example, I very much like to walk (many ideas come!), not to mention eating healthily and enjoying a romantic relationship based on authenticity, where I can really be myself.
Oh, and yes, Montreal IS fun ;-)
The advantage of academia is that you can focus on research and that you can benefit enormously from the interactions with other researchers. Research is a collective enterprise. This is NOT like what you tend to see in science-fiction movies. Never forget that!
[–][deleted] 1 point 12 months ago
This is really refreshing to hear!
I have been struggling with balance as well. I think I should find my balanced way of being a scientist as well, and find a supervisor who wants to be my long term colleague and friend - not just a pedantic sort of guide and disciplinary figure. Perhaps giving up on academia is the easy way out. Perhaps what I really need to do is make more inspirational friends, and help join and build the community I want to be a part of.
Thanks so much for the candid response! It's very eye-opening. I hope you keep being awesome and inspiring people like me! (but not so much that we keep losing sleep over our work :p)
[–]Derpscientist 6 points 1 year ago*
Dr. Bengio,
I'd like to thank you for the amazing research and software (Theano, Pylearn2) that your lab has contributed.
What are your feelings on Hinton and LeCun moving to industry?
What about academia and publishing your research is more valuable than the floating-point overflow of money you could make at private companies?
Are you nervous that machine learning will go the way of time-series analysis, where a lot of advanced research takes place behind closed doors because the intellectual property is so valuable?
Given the recent advancements in training discriminative neural networks, what role do you envision generative neural networks playing in the future?
[–]yoshua_bengioProf. Bengio 8 points 12 months ago
I think that with Hinton & LeCun in industry, there will be more rapid advances in applying deep learning to really interesting and large-scale problems. The downside may be a temporarily reduced supply of supervision for new graduate students in deep learning. However, there are many young faculty who are at the forefront of deep learning research and who are eager to take new strong students. And the fact that deep learning is being used heavily in industry means that more students get to know about the field and are excited to jump into it.
Personally, I prefer the freedom of academia over more zeros in my salary. See also what I wrote above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpbc1g
I believe that a lot of research will continue to happen in academia and that in the large industrial labs the incentive to publish will remain high.
I think that generative networks are very important for the future. See what I wrote above about unsupervised learning (the two are not synonymous, but often come together, especially since we found the generative interpretation of auto-encoders; see the work with Guillaume Alain, http://arxiv.org/pdf/1305.6663.pdf):
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7v4v
[–]quaternion 1 point 1 year ago
Could you provide additional info on who and what you are referring to with time series analysis?
[–]tryolabs_feco 9 points 1 year ago*
Hi Yoshua, very excited about this AMA, thank you for your time. I have a few questions:
- What are the biggest challenges in ML nowadays?
- What are the most interesting and/or creative ways you have seen people/businesses using ML?
- What does the future of Machine Learning look like?
[–]freieschaf 4 points 1 year ago
Last year I did my undergrad thesis on NLP using probabilistic models and neural networks, partly inspired by your work. I became interested, and at that point I considered doing further work on NLP. Currently I am pursuing an MSc degree, taking several related courses.
But, after several months, I haven't found NLP to be as motivating as I was expecting it to be; research in this area seems to be a little stagnant, from my limited point of view. What do you think are some challenges that are making or going to make this field move forward?
Thanks for taking the time to answer some questions here!
[–]yoshua_bengioProf. Bengio 8 points 1 year ago
I believe that the really interesting challenge in NLP, which will be the key to actual "natural language understanding", is the design of learning algorithms that will be able to learn to represent meaning. For example, I am working on ways to model sequences of words (language modeling) or to translate a sentence in one language into a corresponding one in another language. In both of these cases we are trying to learn a representation of the meaning of a phrase or sentence (not just of a single word). In the case of translation, you can think of it like an auto-encoder: the encoder (that is specialized to French) can map a French sentence into its meaning representation (represented in a universal way), while a decoder (that is specialized to English) can map this to a probability distribution over English sentences that have the same meaning (i.e., you can sample a plausible translation). With the same kind of tool you can obviously paraphrase, and with a bit of extra work, you can do question answering and other standard NLP tasks.

We are not there yet, and the main challenges I see have to do with numerical optimization (it is difficult not to underfit neural networks when they are trained on huge quantities of data). There are also more computational challenges: we need to be able to train much larger models (say 10000x bigger), and we can't afford to wait 10000x more time for training. Parallelizing is not simple, but it should help.

All this will of course not be enough to get really good natural language understanding. To do this well would basically allow a machine to pass some Turing test, and it would require the computer to understand a lot of things about how our world works. For this we will need to train such models with more than just text. The meaning representation for sequences of words can be combined with the meaning representation for images or video (or other modalities, but image and text seem the most important for humans). Again, you can think of the problem as translating from one modality to another, or of asking whether two representations are compatible (one expresses a subset of what the other expresses). In a simpler form, this is already how Google image search works. And traditional information retrieval also fits the same structure (replace "image" by "document").
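A minimal sketch of the encoder-decoder idea described above, in numpy (all sizes, names and token ids are hypothetical, the weights are untrained, and real systems use gated units and much larger models; this only shows the shape of the computation):

    import numpy as np

    rng = np.random.default_rng(0)
    V, H = 10, 8                       # toy vocabulary and hidden sizes
    E  = rng.normal(0, 0.1, (V, H))    # word embeddings
    W  = rng.normal(0, 0.1, (H, H))    # recurrent weights (encoder)
    Wd = rng.normal(0, 0.1, (H, H))    # recurrent weights (decoder)
    Wo = rng.normal(0, 0.1, (H, V))    # hidden state -> vocabulary logits

    def encode(src_ids):
        # Fold the source sentence into one fixed-size "meaning" vector.
        h = np.zeros(H)
        for i in src_ids:
            h = np.tanh(E[i] + W @ h)
        return h

    def decode_step(h, prev_id):
        # One decoder step: update the state, then output a softmax
        # distribution over the target vocabulary.
        h = np.tanh(E[prev_id] + Wd @ h)
        logits = h @ Wo
        p = np.exp(logits - logits.max())
        return h, p / p.sum()

    h = encode([3, 1, 4])              # hypothetical source token ids
    h, p = decode_step(h, prev_id=0)
    print(p.shape, round(p.sum(), 6))  # (10,) 1.0: a distribution over words

Sampling a word from p, feeding it back in as prev_id, and repeating is what turns this into a distribution over whole target sentences.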
[–]akshayxyz 1 point 12 months ago
I am not from academia, but ever since I started following machine learning stuff, I keep getting interesting ideas/problems to solve. Here is one I got a few years back.
Take simple math word problems, e.g. simple ratio/proportion, rate/motion, age, give/take etc. word problems. They can (have to) be translated to a bunch of constants, unknown(s) and math relations/concepts, eventually to find some unknown(s). And everyone who understands the concepts will come up with similar equations, and definitely one correct answer. You can view it as an NLP problem. How to solve it? Well, I don't know; maybe by first trying to extract basic concepts/relations from standard (and simple) word problems?
Thinking aloud - you might start by doing something like "part of (math) speech" tagging... or get some labeled data (problem -> math equation) and see if you can find some hidden factors/relations defining the translations...
[–]deeperredder 4 points 1 year ago*
While deep nets have helped move the state of the art forward in natural language text understanding, the improvements there haven't really been significant. Where do you think significant progress can come from in that field?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
I do think that significant progress will come in the area of natural language processing, most importantly, natural language understanding. Progressively, though (because full understanding is essentially AI-level understanding of the world around us). See my previous answer:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpje92
[–]CyberByte 11 points 1 year ago
What will be the role of deep neural nets in Artificial General Intelligence (AGI) / Strong AI?
Do you believe AGI can be achieved (solely) by further developing these networks? If so: how? If not: why not, and are they still suitable for part of the problem (e.g. perception)?
Thanks for doing this AMA!
[–]davidscottkrueger 6 points 12 months ago
Hi! My name's David Krueger; I'm a Master's student in Bengio's lab (LISA).
My response is: it is not clear what their role will be. AGI may be theoretically achievable solely by developing NNs (especially if we include RNNs), but this is not how it will actually take place.
What incompetentrobot said is literally false, but there is a kernel of truth, which is that Deep Learning (so far) just provides a set of methods for solving certain well-defined types of general Machine Learning problems (such as function approximation, density estimation, sampling from complex distributions, etc.).
So the point is that the contributions of the Deep Learning community haven't been about solving fundamentally new kinds of problems, but rather finding better ways to solve fundamental problems.
[–]willis77 8 points 1 year ago
Have you observed practical applications where deep learning succeeds but traditional ML fails? That is, not simply improving the state of the art on an image benchmark by X%, but a case where an intractable problem is made tractable, solely via deep learning?
[–]yoshua_bengioProf. Bengio 9 points 12 months ago
There is a constructed task on which all the traditional black-box machine learning algorithms that were tried failed, and where some deep learning variants work reasonably well (and where guiding the hidden representation completely nails the task, showing the importance of looking for algorithms that can discover good intermediate representations that disentangle the underlying factors). Note that many deep learning approaches also failed, so this is interesting. See http://arxiv.org/abs/1301.4083. What's particular about this task is that it is the composition of two much easier tasks (detecting objects, performing a logical operation on the result), i.e., it intrinsically requires more depth than a simple object recognition task.
[–]SnowLong 2 points 1 year ago*
I believe no one had a commercially deployed system that could search untagged images up until deep convolutional nets hugely improved the state of the art on the ImageNet benchmark. It took less than half a year for Google to implement search in personal galleries after promising results were shown. So in a way traditional methods failed - none were good enough to actually put into production...
[–]Should_I_say_this 7 points 1 year ago
Can you describe what you are currently researching, first by bringing us up to speed on the current techniques used and then what you are trying to do to advance that?
[–]SnowLong 8 points 1 year ago
I think your question was answered by Yoshua here:
Deep Learning of Representations: Looking Forward
Yoshua Bengio
arXiv:1305.0445v2 [cs.LG] 7 Jun 2013
[–]Should_I_say_this 1 point 1 year ago
This is excellent thanks!
[–]dwf 4 points 12 months ago
Following on work Ian and I did on maxout, I recently did some work empirically interrogating how and why dropout works, focusing on the rectified linear case. More recently I've been working on hyperparameter optimization.
[–]exellentpossum 3 points 1 year ago
It would be cool if members from Bengio's group could also answer this (like Ian).
[–]rpascanu 7 points 12 months ago
I've done some work lately on the theory side (showing that deep models can be more efficient than shallow ones):
http://arxiv.org/abs/1402.1869
http://arxiv.org/abs/1312.6098
I've been spending quite a bit of time on natural gradient; I'm currently exploring variants of the algorithm, and I'm interested in how it addresses problems specific to non-convex optimization.
And, of course, recurrent networks, which have been the focus of my PhD since I started. In particular, I worked on understanding the difficulties of training them (http://arxiv.org/abs/1211.5063) and on how depth can be added to RNNs (http://arxiv.org/abs/1312.6026).
[–]caglargulcehre 5 points 12 months ago
Hi, my name is Caglar Gulcehre and I am a PhD student at the LISA lab. You can access my academic page here: http://www-etud.iro.umontreal.ca/~gulcehrc/.
I have done some work related to Yoshua Bengio's "Culture and Local Minima" paper; basically we focused on empirically validating the optimization difficulty of learning high-level abstract problems: http://arxiv.org/abs/1301.4083
Recently I've started working on recurrent neural networks, and we have a joint work with Razvan Pascanu, Kyung Hyun Cho and Yoshua Bengio: http://arxiv.org/abs/1312.6026
I've also worked on a new kind of activation function which we claim to be more efficient at representing complicated functions than regular activation functions (i.e., sigmoid, tanh, etc.):
http://arxiv.org/abs/1311.1780
Nowadays I am working on statistical machine translation and learning & generating sequences using RNNs and whatnot. But I am still interested in the optimization difficulty of learning high-level (or abstract) tasks.
[–]ian_goodfellow[S] 5 points 1 year ago
I'm helping Yoshua write a textbook, and working on getting Pylearn2 into a cleaner and better documented state before I graduate.
[–]exellentpossum 1 point 12 months ago
Any particular developments in deep learning that you're excited about?
[–]ian_goodfellow[S] 5 points 12 months ago
I'm very excited about the extremely large scale neural networks built by Jeff Dean's team at Google. The idea of neural networks is that while an individual neuron can't do anything interesting, a large population of neurons can. For most of the 80s and 90s, researchers tried to use neural networks that had fewer artificial neurons than a leech. In retrospect, it's not very surprising that these networks didn't work very well, when they had such a small population of neurons. With the modern, large-scale neural networks, we have nearly as many neurons as a small vertebrate animal like a frog, and it's starting to become fairly easy to solve complicated tasks like reading house numbers out of unconstrained photos: http://www.technologyreview.com/view/523326/how-google-cracked-house-number-identification-in-street-view/ I'm joining Jeff Dean's team when I graduate because it's the best place to do research on very large neural networks like this.
[–]Letter_Guardian 3 points 1 year ago
Hi Prof. Bengio,
Thank you for doing this AMA. Questions:
How much do you think we can actually accomplish in the big data challenge?
Do you think data alone is sufficient to solve practical problems, as opposed to using some kind of expert knowledge?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
At the end of the day there is only data. Expert knowledge also comes from past experience: either communicated by some humans (recently, or in past generations, through cultural evolution) or from genetic evolution (which also relies on experience to engrave knowledge into genes). What this may potentially say is that we may need different kinds of optimization methods and not just those based on local descent (like most learning algorithms).
All that being said, if I try to solve a practical problem in the short term, it can be very useful to use prior knowledge. There are many ways that this has been done in deep learning, either through preprocessing, architecture and/or the training objective (e.g. especially through regularizers and pre-training strategies). However, I much prefer when the data can override the prior that is injected (and this is also theoretically more sound, as one considers that more and more data can be exploited).
[–]FuzzySets 3 points 1 year ago
I'm currently finishing up my undergrad in philosophy of science and logic, and I am trying to make the switch to computer science for master's work with the intention of pursuing machine learning at the PhD level. Besides filling in the obvious knowledge gaps in mathematics and basic programming skills, what are some of the things a person in my position could do to make themselves a more attractive candidate for your field of work? Thanks so much for visiting us at r/MachineLearning!
[–]yoshua_bengioProf. Bengio 10 points 12 months ago
Read deep learning papers and tutorials, starting from the introductory material and moving your way up. Take notes on your reading, trying to summarize what you learned.
Implement some of these algorithms yourself, from scratch, to make sure you understand the math for real; implement variants of them, not just a copy of pseudo-code you found in a paper (see the sketch after this list).
Play with these implementations on real data, maybe competing in Kaggle competitions. The point is that a lot is learned by actually getting your hands into the data and playing with variants of these algorithms (this is true in general for machine learning).
Write about your experiences, results and thoughts in a blog. Initiate contact with researchers in the field and ask them if they would like you to work remotely on some of the projects and ideas they have. Try to do an internship.
Apply to graduate school in a lab that actually does these things.
Is the roadmap clear enough?
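As an illustration of the "implement from scratch" step, here is a minimal denoising auto-encoder in numpy (the toy data, sizes and hyperparameters are made up for the sketch, and it is trained by hand-derived backprop; one simple starting point, not anyone's reference implementation):

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy data: copies of 4 random binary prototype patterns.
    protos = rng.binomial(1, 0.5, size=(4, 20)).astype(float)
    X = protos[rng.integers(0, 4, size=100)]

    D, H, lr = 20, 10, 0.5
    W, b, c = rng.normal(0, 0.1, (D, H)), np.zeros(H), np.zeros(D)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def reconstruct(X):
        return sigmoid(sigmoid(X @ W + b) @ W.T + c)

    print("before:", ((reconstruct(X) - X) ** 2).mean())
    for epoch in range(500):
        # Denoising criterion: corrupt the input, reconstruct the original.
        Xc = X * rng.binomial(1, 0.7, size=X.shape)   # drop 30% of inputs
        Hid = sigmoid(Xc @ W + b)                     # encoder
        R = sigmoid(Hid @ W.T + c)                    # decoder (tied weights)
        # Cross-entropy loss; its gradient w.r.t. the decoder
        # pre-activation (with a sigmoid output) is simply R - X.
        dR = R - X
        dHid = (dR @ W) * Hid * (1 - Hid)             # backprop into encoder
        gW = Xc.T @ dHid + dR.T @ Hid                 # two terms: tied weights
        W -= lr * gW / len(X)
        b -= lr * dHid.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    print("after:", ((reconstruct(X) - X) ** 2).mean())  # should be lower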
[–]karmicthreat 3 points 1 year ago
So I've had a desire to get deep into Deep Learning and general machine learning for a while. I'm currently taking the computational neurology course coursera offers. I'll follow that up with the ML and NN courses.
Where do you recommend someone go from there? I've not seen much that is at the grad level out there.
[–]last_useful_man 1 point 1 year ago
https://www.coursera.org/courses?orderby=upcoming&search=computational%20neurology
(comes up empty) - care to clarify? Clinical neurology perhaps?
[–]karmicthreat 2 points 1 year ago
Sorry, I meant computational neuroscience. Which makes sense, since neurology would be more the study of disorders of the nerves. Which while interesting I'm not really after that particular aspect of the CNS.
[–]last_useful_man 1 point 1 year ago
Holy moly, that exists! https://www.coursera.org/course/compneuro
Awesome, thank you!
[–]lars_ 3 points 1 year ago
Hi! The guys behind the Blue Brain project intend to build a working brain by reverse engineering the human brain. I heard Hinton criticize this approach in a talk. I got the impression that he believed the kind of work that is done within ML would be more likely to lead to a general strong AI.
Let's imagine we are some time in the future, and we have created strong artificial intelligence - that passes the Turing test, and generally passes as alive and conscious. If we look at the code for this AI, do you think it would mostly be a result of reverse engineering the human brain, or would it be mostly made of parts that we humans have invented on our own?
[–]yoshua_bengioProf. Bengio 6 points 12 months ago
I don't think that Hinton was critical of the idea of reverse-engineering the brain, i.e., to consider what we can learn from the brain in order to build intelligent machines. I suspect he was critical of the approach in which one tries to get all the details right without an overarching computational theory that would explain why the computation makes sense (especially from a machine learning perspective). I remember him making that analogy: imagine copying all the details of a car (but with an imperfect copy), putting them together, and then turning on the key and hoping for the car to move forward. It's just not going to work. You need to make sense of these details.
[–]redkk 3 points 1 year ago
Hi Sir, I am a self-learner trying to train a sparse autoencoder with linear/ReLU units. What would be a suitable sparsity cost which is differentiable? I saw something that uses KL divergence but could not understand it. Is the sparsity-inducing formula a holy grail or a secret? Thanks, KK.
[–]yoshua_bengioProf. Bengio 5 points 12 months ago
Not a holy grail or secret. With a denoising auto-encoder setup and rectifiers, you easily get sparsity, especially with an L1 penalty. With sigmoids you are better off with the KL divergence penalty. It just says that the output of the units should be close to some small target (like 0.05) on average, but instead of penalizing the squared difference it uses the KL divergence, which is more appropriate for comparing probabilities. My colleague Roland Memisevic is more involved than I am in experimenting with such things and could probably tell you more.
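Both penalties mentioned here are straightforward to write down; a numpy sketch (the target rho=0.05 and the clipping constant are illustrative choices, not prescriptions):

    import numpy as np

    def kl_sparsity_penalty(hidden, rho=0.05):
        # hidden: (n_examples, n_units) sigmoid activations in (0, 1).
        # Penalize the average activation rho_hat of each unit for
        # straying from the small target rho, with a Bernoulli KL.
        rho_hat = np.clip(hidden.mean(axis=0), 1e-8, 1 - 1e-8)
        kl = (rho * np.log(rho / rho_hat)
              + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
        return kl.sum()

    def l1_sparsity_penalty(hidden):
        # With rectifier units, an L1 penalty on the activations pushes
        # many of them to exactly zero, which is the sparsity we want.
        return np.abs(hidden).sum(axis=1).mean()

Both are differentiable (the L1 penalty almost everywhere), so either can simply be added to the reconstruction cost and trained by backprop.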
[–]evc123 3 points 1 year ago
Hi Prof Bengio,
Is it possible to get into the LISA lab without any machine learning / deep learning publications? The university I'm attending does a tiny bit of research in computer vision, bioinformatics, and 1980s-era neural networks, but none of it is as contemporary or as in-depth as the research at the LISA lab and the other labs listed on Deeplearning.net.
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
We have taken such candidates recently, especially if they are strong in math and computer science. Note that we have pretty much filled the open positions for Fall 2014, though.
[–]ddebarr 3 points 12 months ago
As EJBorey says, "I've heard that it's difficult for non-experts to get these techniques to work." What is the most promising work being done to automate the configuration of deep learning networks? Thanks!
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
Please see this reply: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq884k
[–]SnowLong 6 points 1 year ago
Are there attempts to apply neural nets to the task of machine translation?
When do you think NN-based approaches will replace statistical methods in commercially deployed MT systems? I mean, in speech recognition (all major industry players) and vision (Google, Baidu) NNs are already deployed...
[–]yoshua_bengioProf. Bengio 5 points 12 months ago
I just started a page that lists some of the papers on neural nets for machine translation: https://docs.google.com/document/d/1lqo5N1LzVWNPy1sYuujNa5vVNmyP5Zjv6VtEVgcFr6k
Briefly, since neural nets already beat n-grams on language modeling, you can first use them to replace the language-modeling part of MT. Then you can use them to replace the translation table (after all it's just another table of conditional probabilities). Other fun stuff is going on. The most exciting and ambitious approaches would completely scrap the current MT pipeline and learn to do end-to-end MT purely with a deep model. The interesting aspect of this is that the output is structured (it is a joint distribution over sequences of words), not a simple point-wise prediction (because there are many translations that are appropriate for a given source sentence).
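The first replacement mentioned above is easy to picture as n-best rescoring: the existing pipeline proposes candidate translations and a language model re-ranks them. A sketch with a stubbed-out scorer (the candidate sentences and the toy scoring rule are invented for illustration; a real neural LM would sum the log-probabilities it predicts for each word given the previous ones):

    # Hypothetical n-best list from an existing MT system.
    candidates = [
        "the cat sat on the mat",
        "the cat sat at the mat",
    ]

    def toy_logprob(word, context):
        # Dummy per-word scorer so the sketch runs; a trained neural LM
        # would compute log P(word | context) here.
        return 0.0 if word == "on" else -1.0

    def lm_logprob(sentence):
        words = sentence.split()
        return sum(toy_logprob(w, words[:i]) for i, w in enumerate(words))

    print(max(candidates, key=lm_logprob))  # -> "the cat sat on the mat"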
[–]SnowLong 1 point 12 months ago
Thank you! Insights help and I'm starting to read papers so thanx for the list too (:
[–]EJBorey 1 point 1 year ago
Sure. Here's a New York Times article that talks about real-time machine translation from English into Mandarin: http://www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html
[–]SnowLong 3 points 1 year ago
I saw that video from MS, a very impressive one. But I do not believe the MT part was done using NNs. Speech recognition - YES. Speech synthesis - most likely. MT - nope.
[–]Two-Tone- 5 points 1 year ago
What are your thoughts on Google acquiring all of these different AI-related companies over the last year or so?
[–]EJBorey 2 points 1 year ago
Any advice on hiring your students? What is compelling to the modern machine learning PhD?
[–]kablunk 2 points 1 year ago*
Sorry for being so mundane: what as-yet-unexplored fields do you see machine learning being applied to in the future?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
I would rather ask about fields where machine learning is NOT going to be applied ;-)
[–]sssub 2 points 1 year ago
Dear Prof. Bengio,
In neuroinformatics, several researchers work in the field of 'reservoir computing' (a random sparse RNN in which only a linear read-out is trained). Comparing this architecture to deep networks, I see a lot of similarities between the two approaches. There seems to be a strong link between learning abstract features in deep architectures and plasticity mechanisms in spiking reservoirs.
I would very much like to hear your opinion on this.
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
Biological motivation is indeed very interesting, but learning the recurrent weights is crucial to get computational competence, as I wrote there:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpboj8
[–]rpascanu 1 point 12 months ago
Correct me if I'm wrong, but the reservoir computing paradigm assumes that the reservoir (i.e., the recurrent and input-to-hidden weight matrices) is randomly sampled (from a carefully crafted distribution) and not learned. By plasticity mechanisms, do you mean RC methods that use some local learning mechanism for the weights?
If not, I believe one can answer your question along this line. Both RC approaches and DL approaches are trying to extract useful features from data. However, RC does not learn this feature extractor, while DL does. Of course, as you pointed out, there are a lot of similarities. There are a lot of things DL could learn from RC research, and the other way around.
[–]sssub 1 point 12 months ago
Yes, I am referring to local biologically-inspired learning mechanisms. An example is spike-timing-dependent plasticity (STDP), which is then investigated in reservoir systems. Such architectures look a lot like autoencoders.
[–]yoshua_bengioProf. Bengio 2 points 12 months ago*
"Looking a lot like" is interesting, but we need a theory of how this enables doing something useful, like capturing the distribution of the data, or approximately optimizing a meaningful criterion.
[–]US932H923 2 points 12 months ago
Who are some of the people you have a lot of respect for?
What was the last fiction book that you've read?
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
I have a lot of respect for a lot of people! One clue is whom I cite! Another is whom I invite to the workshops and conferences I organize.
[–]m4linka 2 points 12 months ago
Dear Prof. Bengio.
In my experience with using different neural network models, it seems that either a good initialization (for example via pretraining, or some sort of guided learning), or the structure (think of the convolutional net), or standard regularization like the l2 norm is crucial for learning. In my opinion all of these are special forms of regularization. Therefore, it looks like 'without prior assumptions, there is no learning'. In the era of 'big data' we can slowly decrease the influence of the regularization part - and therefore develop more 'data-driven' approaches.
Nonetheless, some form of regularization is still needed. To me it seems there is a complexity gap between training networks from scratch (keeping the regularization as small as possible) and using regularized networks (structure, l2 norm, pre-training, smart initialization, ...). Something like the gap between P and NP in complexity theory.
Are you aware of any literature that tackles this problem from a formal or experimental perspective?
[–]yoshua_bengioProf. Bengio 4 points 12 months ago
In a theoretical sense, you would imagine that as the amount of data goes to infinity, priors become useless. Not so, I believe. Not only because of the potentially exponential gains (in terms of the number of examples saved) of some priors, but also because some priors have computational implications. For example, the depth prior can save you both statistically and computationally, when it allows you to represent a highly variable function with a reasonable number of parameters. Another example is the time for training. If (effective) local minima are an issue, then even with more training data you could get stuck in poor solutions that a good initialization (like pre-training) could avoid. Unless you take both the amount of data and the computational resources to infinity (and not just "large"), I think some forms of broad priors are really important.
[–]m4linka 1 point 12 months ago
That is interesting. Could you point out some literature on this topic?
[–]davidscottkrueger 1 point 12 months ago
According to yesterday's talk, the private-dataset network in this paper was trained without regularization, suggesting that with enough data it may not be needed (although it likely depends on the dataset/task): http://arxiv.org/pdf/1312.6082v2.pdf
[–]US932H923 2 points 12 months ago
When you're learning something new, do you spend time trying to figure out how the learning process is happening in your own brain?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
Typically not. I get too excited when something clicks. My brain races and my urge is to write my understanding down or talk about it.
But at other times, I do marvel at this phenomenon and I think about it.
[–]DavidJayHarris 2 points 12 months ago
Hi Professor Bengio, thanks so much for answering our questions. I was wondering what you thought of stochastic feedforward methods like the one Tang and Salakhutdinov presented at NIPS last year.
It seems to me like a great way to get some of the benefits of stochastic methods (especially the ability to predict at multiple modes) while retaining the efficiency of feedforward methods that can be trained by backprop. It seems like there are some interesting parallels between this approach and the stochastic networks your lab has been working on, and I'd love to hear your thoughts on the comparison.
Thanks again!
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
I very much like their paper. We have been working on very similar stuff!
[–]nxvd 2 points 12 months ago
Hello Dr. Bengio,
Thank you for your time. There are two questions I would like to ask you, if you don't mind:
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
I consider one of my greatest successes to be having contributed to a collaborative, open, and collegial atmosphere in the lab. The common good is not an idle concept here. It also helps to make students a lot more motivated; they enjoy their time here and contribute to group efforts.
[–]dhammack 1 point 1 year ago
If I were summarizing the results from deep models, I'd say that deep models are excelling in problems where humans held the previous state of the art (vision/audio/language).
Do you know of any successes in problems of the opposite nature, problems where statistical methods are already better than humans? One example I can think of is the Merck Kaggle challenge won by George Dahl, but I'd love to hear of some more.
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
Yes, I know of some such cases, in the realm of recommendation systems or fraud detection, when the number of input variables is large and cannot be easily visualized or digested by a human. Although I don't know of head-to-head comparisons with human performance, the sheer speed advantage makes it impractical to even consider humans for such jobs (except maybe to consider the few cases flagged by a machine).
[–]zach_will 4 points 1 year ago
Hi Professor!
I always find myself resorting to ensembles and random forests in my projects (I think I can just internalize decision trees much better than deep learning). Could you offer the flip side for why I should be excited about neural networks?
(I mostly work with "medium-sized" data, and it usually fits on a single machine.)
Thanks!
[–]yoshua_bengioProf. Bengio 4 points 12 months ago
I wrote some papers explaining why decision trees are doomed to generalize poorly:
http://www.iro.umontreal.ca/~lisa/pointeurs/bengio+al-decisiontrees-2010.pdf
The key point is that decision trees (and many other machine learning algorithms) partition the input space and then allocate separate parameters to each region. Thus there is no generalization to new regions or across regions. There is no way you can learn a function which needs to vary across a number of distinct regions that is greater than the number of training examples. Neural nets do not suffer from that and can generalize "non-locally" because each parameter is re-used over many regions (typically HALF of all the input space, in a regular neural net).
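A toy 1D illustration of the parameter-reuse point (all numbers hypothetical):

    import numpy as np

    # One-hidden-layer ReLU net in 1D: f(x) = sum_i v_i * max(0, w_i*x + b_i).
    # Each unit bends the function once (at x = -b_i/w_i), so n units give
    # a piecewise-linear function with up to n + 1 pieces from only 3n
    # parameters, each parameter shaping roughly half of the input space.
    rng = np.random.default_rng(0)
    n = 10
    w, b, v = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

    kinks = -b / w                         # where each unit switches on/off
    pieces = 1 + np.sum((kinks > -5) & (kinks < 5))
    print(pieces)                          # up to n + 1 pieces on [-5, 5]

    # A decision tree computing a function with that many distinct pieces
    # needs one leaf, with its own separate parameters, per piece.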
[–]kablunk 3 points 1 year ago
What are some things that self-taught machine learning scientists lack that those trained in a formal environment (university or similar) have?
(I'm asking as a member of the first group)
[–]SuperFX 4 points 1 year ago
There seems to be a recent trend where a lot of deep learning researchers have moved to industry, ostensibly to gain access to very large data sets. Do you think deep learning research within academia can continue to flourish without such access? Or is the field invariably moving toward HPC and massive data sets as prerequisites?
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
I think that there are plenty of huge datasets available for free out there. Think about all of Wikipedia, all of YouTube, etc. Not to mention: all of the internet.
Computing power is another question, but actually in some countries like Canada, the government is encouraging (or forcing) scientists to share computational resources. The result is that I have access to more computational power than most of my American colleagues. Plus, the cost of computing power continues to go down.
[–]javiermares 3 points 1 year ago*
Professor Bengio,
What do you think of Ray Kurzweil's PRTM (pattern recognition theory of mind)? Do you think any of its characteristics could be implemented on top of current deep learning techniques to improve their capabilities?
Thank you.
[–]yohamoha 3 points 1 year ago
Hello, professor. I have a question that I always ask experts in their fields: In your field of study, what is the best book/paper you know of? Why? (here "best" can have any meaning, as long as it's specified)
Thanks.
[–]yoshua_bengioProf. Bengio 5 points 12 months ago
There are too many good papers.
My students have put together a list of papers to read for the new students of the lab:
https://docs.google.com/document/d/1IXF3h0RU5zz4ukmTrVKVotPQypChscNGf5k6E25HGvA
[–]hltt 1 point 1 year ago
Can you think of any other interesting deep learning approaches to NLP besides Richard Socher's recursive neural networks?
[–]rpascanu 1 point 12 months ago
RNNs as in recurrent neural networks (e.g. Tomas Mikolov's work) are also very interesting IMHO.
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
Indeed.
[–][deleted] 1 point 1 year ago
Hi professor Yoshua Bengio.
Do you think that machine learning as we understand it today will be the basis of future AI?
Which is a bigger obstacle to making AI stronger, hardware limitations or algorithmic/software problems? What is the biggest obstacle to making AI better in general?
What do you think of Ray Kurzweil's prediction that an AI will pass the Turing test by 2029? He has placed a bet on this prediction.
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
I won't bet on the year that AI will pass the Turing test, but I will certainly bet that machine learning will be a central technology of future AI.
The biggest obstacle to improving AI is to improve machine learning. To improve ML enough to get there, there are still many obstacles. Only some of them have to do with computing power. Others are more conceptual. For example, I am convinced that there are still fundamental obstacles to learning the joint distribution of many variables for AI-like tasks. I also think that we have not even scratched the surface of the optimization challenges involved in training very large deep nets. Then there is reinforcement learning, which will clearly be necessary and on which advances are clearly needed (see the recent exciting work by the DeepMind people on learning to play 80's Atari games, presented at the Deep Learning Workshop at NIPS, which I organized).
[–][deleted] 1 point 12 months ago
Thank you for your response.
[–]edersantana 1 point 1 year ago
What suggestions would you give to a young professor building a new research lab on machine learning, neural networks and the like? What do you think are the most important aspects of the lab environment, and of hardware and software resources? What about international cooperation? Also, how does one stay competitive worldwide?
[–]yoshua_bengioProf. Bengio 6 points 12 months ago
Focus on your research.
Engage in collaboration and discussion with scientists from whom you can learn.
Read. Read. Read.
Focus on your research.
Nourish your graduate students intellectually and at a personal relational level, like a father with his children.
Go to the best conferences of your field. Talk. Talk. Talk.
Keep thinking about the long term and keep steering back toward the directions that you believe are promising, even though it's tempting to follow the trend and make incremental contributions. Believe in yourself.
Focus on your research.
[–]sixwings 1 point 1 year ago
Professor Bengio,
Thank you for taking our questions. How do you respond to this criticism of Deep Learning from Jeff Hawkins:
Source: Deep Learning
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
See the replies below. There is plenty of deep learning work involving temporal structure. More will come, for sure.
[–]richardabrich 1 point 1 year ago
Recurrent neural networks model temporal relationships implicitly. They're often used for speech recognition. There has been some work on deep recurrent neural networks. [1,2]
[1] http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf
[2] http://papers.nips.cc/paper/5166-training-and-analysing-deep-recurrent-neural-networks.pdf
[–]rpascanu 1 point 12 months ago
http://arxiv.org/abs/1312.6026.
RNNs are also used in NLP. Some other interesting work that moves towards recurrent models (for scene parsing, in this case) is this: http://arxiv.org/abs/1306.2795
[–]davidscottkrueger 1 point 12 months ago
Of course, this cannot be taken as a valid criticism of the promise or potential of Deep Learning, because DL can account for the concept of time.
However, I think the point he is making about systems that interact with the world in real time vs. systems that don't is huge, and currently, DL's big successes are not in real-time applications.
I think a greater emphasis on real-time methods across the board would be a good thing. And I think that Reinforcement Learning will ultimately be more important than supervised/unsupervised learning.
[–]hf98hf43j2klhf9 1 point 1 year ago
[META] In the comments at the verification page it looks like Yann LeCun is open to the AMA idea as well! Should we try to request him as well?
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
It would be fun.
[–]32er234 1 point 12 months ago
You sending him an email will probably be more effective than all of us trying to bombard his social media pages ;-)
[–]IdoNotKnowShit 1 point 1 year ago
Bonjour professeur Bengio! Thank you so much for this AMA! Here are a few questions of mine (not chosen i.i.d.):
Where does deep learning show promise? And in what application would it be an absolutely horrible choice?
Why do stacked RBMs work? Is this something that can be explained in a thoroughly formal manner, or is there still some magic that needs to be unraveled?
What would you say is the relationship between ensemble learning and deeply layered learning?
Can you describe some of the work your lab/grad students is/are doing and why you support it?
What are some of the best things about living in Montreal?
How do you like to approach a research question? What kind of working environment do you prefer?
[–]yoshua_bengioProf. Bengio 2 points 12 months ago*
There is no such thing as magic, except in our emotional interpretation. I believe that I have a fairly rounded interpretation of why stacks of RBMs or regularized auto-encoders work so well. I have written about this; see in particular the 2012/2013 review paper with Courville & Vincent:
http://arxiv.org/abs/1206.5538
(also published in PAMI 2013)
I don't know of relationships between ensemble learning and deep layered learning besides the beautiful interpretation of dropout. For example, see http://arxiv.org/abs/1312.6197
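For readers unfamiliar with that interpretation: training with dropout samples a different sub-network at every step, and a simple rescaling at test time approximates averaging this exponentially large, weight-sharing ensemble. A minimal numpy sketch (the sizes and keep probability are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    x, W = rng.normal(size=20), rng.normal(size=(20, 5))
    p = 0.5  # probability of keeping a unit

    # Training: each step samples a random sub-network -- one member of
    # an ensemble of 2^20 sub-networks that all share the same weights.
    mask = rng.binomial(1, p, size=x.shape)
    train_h = np.maximum(0.0, (x * mask) @ W)

    # Test time: no sampling; scaling by p approximates averaging the
    # predictions of all the sub-networks at once.
    test_h = np.maximum(0.0, (x * p) @ W)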
My students have written a few words about studying in Montreal, for new graduate candidates:
http://www.iro.umontreal.ca/~bengioy/yoshua_en/index_files/open_positions.html
Montreal is a large city with 4 universities, a very rich cultural tradition, close to nature, and where the quality of life (including safety) is among the best (the 4th best in North America, according to Mercer). The cost of living is substantially lower than in other similar-sized North American cities.
[–]moseconseco2 1 point 1 year ago
Can you talk about the connection, if there is one, between big, structured knowledge projects like Google's Knowledge Graph (built largely on the entity graph Freebase) and deep learning?
Is it significant that the data of the knowledge graph has this recursive network structure that looks a lot like the layers of abstraction in a deep learning setup?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
There is plenty of room in the Knowledge Graph project for machine learning, and so for deep learning. In particular, you want ML to help you guess the missing attributes of objects in the graph and even guess the missing relationships (so that you can even automatically insert new objects in the graph, based on some of their attributes).
[–]strayadvice 1 point 12 months ago
This question is regarding deep learning. From what I understand, the success of deep neural networks on a training task relies on choosing the right meta-parameters, like network depth, hidden layer sizes, sparsity constraints, etc. And there are papers on searching for these parameters using random search. Perhaps some of this relies on good engineering as well. Is there a resource where one could find "suggested" meta-parameters, maybe for specific classes of tasks? It would be great to start with these tested parameters and then search/tweak for better parameters for a specific task.
What is the state of research on dealing with time series data with deep neural nets? Deep RNNs perhaps?
[–]yoshua_bengioProf. Bengio 3 points 12 months ago
Regarding the first question you asked, please refer to what I wrote earlier about hyper-parameter optimization (including random search):
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq884k
James Bergstra continues to be involved in this line of work.
[–]rpascanu 2 points 12 months ago
Here is a list of more recent work. The idea of deep RNNs (or hierarchical ones) is older; both Jurgen Schmidhuber and Yoshua have had papers about it since the '90s.
http://arxiv.org/abs/1306.2795
http://arxiv.org/abs/1312.6026
http://arxiv.org/abs/1308.0850
http://papers.nips.cc/paper/5166-training-and-analysing-deep-recurrent-neural-networks.pdf
[–]james_bergstra 2 points 12 months ago
I think having a database of known-configurations that make for good starting points for search is a great way to go.
That's pretty much my vision for the "Hyperopt" sub-projects on github: http://hyperopt.github.io/
The hyperopt sub-projects specialized for nnets, convnets, and sklearn currently define priors over what hyperparameters make sense. Those priors take the form of simple factorized distributions (e.g. number of hidden layers should be 1-3, hidden units per layer should be e.g. 50-5000). I think there's room for richer priors, different parameterizations of the hyperparameters themselves, and better search algorithms for optimizing performance over hyperparameter space. Lots of interesting research possibilities. Send me email if you're interested in working on this sort of thing.
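A minimal sketch of what using Hyperopt looks like (the search space and the synthetic objective below are invented for illustration; in practice the objective would train a network and return its validation error):

    import numpy as np
    from hyperopt import fmin, tpe, hp

    def objective(params):
        # Stand-in for "train a model, return validation error": a
        # synthetic bowl whose optimum is near lr=0.01, n_hidden=800.
        return ((np.log10(params["lr"]) + 2) ** 2
                + (params["n_hidden"] - 800) ** 2 / 1e6)

    space = {
        "n_layers": hp.choice("n_layers", [1, 2, 3]),
        "n_hidden": hp.quniform("n_hidden", 50, 5000, 50),
        "lr":       hp.loguniform("lr", np.log(1e-7), np.log(1.0)),
    }

    best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
    print(best)   # the best hyperparameter values found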
[–][deleted] 12 months ago
[deleted]
[–]yoshua_bengioProf. Bengio 4 points 12 months ago
Initially, 90% intuition, 10% math.
Then more math comes. Then you try it out and you find problems and you update your intuition and your math... etc.
And intuition comes from letting a problem sit in your head for a while, reading about it, asking yourself the question, working with it, talking with others about it, etc.
[–]32er234 1 point 12 months ago
Is fluency in French a pre-requisite to becoming your student? Does it matter at all?
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
Not a pre-requisite at all. Most new students know very little or no French when I recruit them.
[–]32er234 1 point 12 months ago
Given three candidates, none of whom have much experience in ML, whom would you rather choose as a potential student (other dimensions being equal):
Someone experienced in applied statistics (say, psychology research, or epidemiology), knows R
Someone who is very good at software development and knows some numpy/scipy, Matlab
Pure math undergrad who has little exposure to either programming or "real world" data
[–]yoshua_bengioProf. Bengio 2 points 12 months ago
I can afford many students. I would not evaluate based only on the above features but also on an interview, in which all aspects come together. Strength in math is an excellent predictor of success in machine learning research, and so math undergrads with good programming skills are very high on my list of preferences. Strong software development is also very important for many of the projects we have, which involve big data and big models, where computational efficiency and top-notch collective programming are really important.
[–]andrewff 1 point 12 months ago
I know I'm a little late to the party, but I was just wondering if you thought there was any room for an evolving-topologies algorithm such as NEAT within deep learning? In some ways, techniques like dropout and dropconnect approach an evolving-topology type of methodology, but overall the idea of an evolving topology is not entirely captured by such techniques.
Thanks for doing this AMA!
[–]rishok 1 point 12 months ago
Hello Prof. Bengio. I am a student from Denmark.
I am trying to add your Maxout Networks solution to the sparse autoencoder to see the potential benefits... do you have any preliminary comment?
Can we be allowed to see more updates on your DL book? .. hehe
[–]meiyordrummer123 1 point 7 months ago
Hello Professor Bengio. I tried to run the Matlab toolbox that you have for DBNs, and I ran the PLearn app at the same time, but I want to know how to run a similar process in both, because some options in PLearn are quite different from the Matlab schemes, and it would be useful for prototyping a faster application.
Thank you
JMM
[–]sasaram 1 point 1 year ago
Hi Prof. Bengio, very happy to see you here.
[–]yoshua_bengioProf. Bengio 3 points 1 year ago
Recurrent nets are deep in the sense that the computation they perform (when you consider unfolding them in time) corresponds to a very deep network (albeit with shared weights across layers).
My definition of deep is that you have multiple levels of representation, with the i-th level obtained as a learned function of the representations at the lower levels. I also insist that the number of such levels be data-dependent, and I expect that higher-level representations capture more abstract features of the data which can only be obtained by the composition of the features at the lower levels, i.e., they are highly non-linear functions of the raw input.
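The "unfolding in time" picture is worth making concrete: a numpy sketch of an RNN processing a sequence of length T, which is exactly a T-layer deep network whose layers all share the same weight matrix (sizes and inputs are arbitrary toy values):

    import numpy as np

    rng = np.random.default_rng(0)
    H, T = 6, 8
    W = rng.normal(0, 0.5, (H, H))    # ONE recurrent weight matrix...
    xs = rng.normal(size=(T, H))      # ...and a toy input sequence

    h = np.zeros(H)
    for x in xs:                      # unfolding in time: T layers deep,
        h = np.tanh(W @ h + x)        # with the same W at every layer
    print(h)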
[–]melipone 1 point 1 year ago
Hi! No experience with deep learning, here. The introduction says that deep learning advances in machine learning can be used to solve artificial intelligence problems. Does that mean solving the consciousness/self-awareness problem or is it in a narrow sense?
[–]yoshua_bengioProf. Bengio 5 points 1 year ago
Deep learning is not about consciousness or self-awareness but about something that I consider much more important from a practical point of view, as well as much more challenging: allowing computers to understand the world around us. I believe that we will have fairly intelligent machines that understand the world around us but have no "consciousness", or rather no "self", in any way close to what humans have. Not because it would be difficult to introduce that, but because it would not be necessary in order to produce a lot of useful technology. Not to speak of the fact that once you introduce a self in intelligent machines, you have to worry about Asimov's rules, etc.
[–]anne-nonymous 1 point 1 year ago
There are some robots who are self-aware :). Seriously.
http://www.scientificamerican.com/article/automaton-robots-become-self-aware/
[–]dnoup -2 points 1 year ago
What exactly is deep learning, and how does it differ from conventional ML?
[–]yoshua_bengioProf. Bengio 3 points 1 year ago
I have my definition above:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfqay3e
Keep in mind that deep learning is part of machine learning.
[–]augustus2010 -3 points 1 year ago
Could you explain the rationale behind sparse and deep learning?
[–]yoshua_bengioProf. Bengio 3 points 1 year ago
I have already explained why deep learning is interesting. It is a broad prior and it brings both statistical and computational advantages, where it is an appropriate prior.
Sparsity is another prior: it assumes that for any given input, only a small subset of all the possible concepts known to the learner are relevant. Again, it is useful where it is applicable.
I believe that both priors are useful for many real-world problems where we want AI.
[–]melipone -9 points 1 year ago
Did we get just 5 responses to ~100 questions?
[–]Noncomment 4 points 1 year ago
I'm actually very impressed with this AMA. He answered almost all of the questions and put a lot of effort into the responses. Your comment was premature.
[–]vinnl -16 points 1 year ago
Do you know all the terms mentioned in the questions here?