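Andrew Ng (吴恩达) has served as a professor in the Department of Computer Science and the Department of Electrical Engineering at Stanford University and as director of its Artificial Intelligence Laboratory. He is one of the most widely recognised scholars in artificial intelligence and machine learning, a co-founder (with Daphne Koller) of the online education platform Coursera, and a former chief scientist at Baidu, where he led Baidu Research and in particular the Baidu Brain project.
In a deep learning lecture at Stanford, Andrew Ng explained in his own words how to read papers effectively and how to use papers to get to know a new field. The lecture is: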
(Stanford CS230: Deep Learning | Autumn 2018 | Lecture 8 - Career Advice / Reading Research Papers)
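Recently, a blogger used pose estimation as a worked example to demonstrate Andrew Ng's paper-reading method first-hand and published the write-up on Medium, where it was well received. A summary follows:
I. Overall workflow
Step 1: Collect and consolidate relevant resources.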
Resources can come in the form of research papers, Medium articles, blog posts, videos, GitHub repositories, etc.
A quick Google search for the phrase “pose estimation” will surface the top resources on the subject. At this initial step, the aim is to collate all resources that are relevant.
Ideally, at this stage, there is no limit to the number of resources you consider important, but be sure to create a shortlist of the papers, videos and articles that are useful.
Step 2: Dig into any resource you consider relevant to the topic.
It is crucial to have a method for tracking your understanding of each shortlisted resource. Andrew Ng suggests a table of resources plotted against your level of understanding, similar to the table below.
It is advisable to go through at least 10–20% of the content of each paper you have added to the list; this ensures that you have seen enough of the introductory content in a resource to gauge its relevance accurately.
For the more relevant papers/resources identified, it is expected that you progress to a higher level of understanding. Eventually, you will have identified some appropriate resources with content that you understand fully.
You are probably asking yourself, “How many papers/resources are sufficient?”
According to Andrew, understanding 5–20 papers gives you a basic understanding of the subject matter, perhaps enough to progress to implementing techniques.
50–100 papers will give you a very good understanding of the domain.
After going through the resources and extracting the vital information, your table might look similar to what’s shown below.
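A tracking table of this kind can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Andrew Ng's or the blogger's tooling: the `Resource` class, the `format_table` helper, and the resource titles are all hypothetical, and "understanding" is approximated by a self-reported percentage.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    title: str         # hypothetical placeholder title, not a real paper
    percent_read: int  # self-reported share of the content covered, 0-100


def format_table(resources: list[Resource], bar_width: int = 10) -> str:
    """Render the shortlist, most-read first, with a text progress bar."""
    rows = []
    for r in sorted(resources, key=lambda r: r.percent_read, reverse=True):
        filled = r.percent_read * bar_width // 100
        bar = "#" * filled + "." * (bar_width - filled)
        rows.append(f"{r.title:<20} [{bar}] {r.percent_read:>3}%")
    return "\n".join(rows)


# Assumed example data: three shortlisted resources at different stages.
reading_list = [
    Resource("Paper A", 100),
    Resource("Blog post B", 20),
    Resource("Paper C", 50),
]
print(format_table(reading_list))
```

Sorting by progress keeps the fully read resources at the top and makes it easy to see which shortlisted items have only had the initial 10–20% skim.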
Step 3: Take notes to consolidate your understanding of the field.
The third step is to take structured notes that summarise the key discoveries, findings and techniques within a paper, in your own words.
II. How to read a single paper, in detail
According to Andrew, reading a paper from the first word to the last in one sitting might not be the best way to form an understanding.
Be prepared to go through a paper at least three times to have a good understanding of its content.
First pass: start by reading the following parts of the paper: title, abstract and figures.
Second pass: read the introduction and conclusion to pick up the key information, take another pass through the figures, and scan the rest of the content.
Third pass: read the whole paper, but skip any complicated maths or technical formulations that might be alien to you. During this pass, you can also skip any terms that you do not understand or are not familiar with.
That said, if you want to understand a field in depth, you will eventually have to work through these formulas and terms.
III. Test your understanding of a paper by asking yourself questions
Andrew provides a set of questions that you should ask yourself as you read a paper. Being able to answer them generally shows that you understand the critical information presented in the paper. I use the questions below as beacons to ensure I don’t stray from the aim of understanding vital information.
1. Describe what the authors of the paper aim to accomplish, or perhaps did achieve.
2. If a new approach/technique/method was introduced in the paper, what are the key elements of the newly proposed approach?
3. What content within the paper is useful to you?
4. What other references do you want to follow?
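One way to turn these four questions into structured notes is a small record per paper. The sketch below is a hypothetical template, not something prescribed in the lecture or the blog post: the `PaperNotes` class, its field names, and the `unanswered` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PaperNotes:
    """Notes for one paper, one field per question in the list above."""
    title: str
    aims: str = ""                                                  # question 1
    key_elements: list[str] = field(default_factory=list)          # question 2
    useful_content: str = ""                                        # question 3
    references_to_follow: list[str] = field(default_factory=list)  # question 4

    def unanswered(self) -> list[str]:
        """Return the names of the questions that still have empty answers."""
        questions = ("aims", "key_elements", "useful_content",
                     "references_to_follow")
        return [name for name in questions if not getattr(self, name)]


# Assumed example: only the first question has been answered so far.
notes = PaperNotes(title="Paper A", aims="Summary of the goal, in my own words")
print(notes.unanswered())
```

An empty result from `unanswered()` is a rough signal that the paper has been read closely enough to answer all four questions.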
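IV. Closing
Andrew Ng also stresses: “Learn steadily rather than short burst for longevity.”
Following Andrew Ng's method, the blogger reads at least four research papers a month to build an understanding of the field. As you read papers more often, you will also read and understand them faster.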
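As a rough back-of-the-envelope check, the blogger's pace can be set against the 5–20 and 50–100 paper ranges quoted earlier. Combining the two is an assumption of this sketch, not a claim made in the lecture or the blog post, and the `months_to_read` helper is hypothetical; it also assumes a constant pace, although the pace is said to increase with practice.

```python
import math

PAPERS_PER_MONTH = 4  # the blogger's stated minimum pace


def months_to_read(target_papers: int,
                   papers_per_month: int = PAPERS_PER_MONTH) -> int:
    """Whole months needed to reach a paper count at a constant pace."""
    return math.ceil(target_papers / papers_per_month)


# Endpoints of the two ranges quoted in the article.
for target in (5, 20, 50, 100):
    print(f"{target:>3} papers: about {months_to_read(target)} months")
```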
Andrew Ng states in his video that he carries a batch of research papers around with him, intending to read them whenever he can.
Blog post link:
https://towardsdatascience.com/how-you-should-read-research-papers-according-to-andrew-ng-stanford-deep-learning-lectures-98ecbd3ccfb3
Video link (Stanford CS230: Deep Learning | Autumn 2018 | Lecture 8 - Career Advice / Reading Research Papers):
https://www.youtube.com/watch?v=733m6qBH-jI