By James Le, New Story Charity.
There is no doubt that machine learning and artificial intelligence have become increasingly popular over the past couple of years. With Big Data being the hottest trend in the tech industry at the moment, machine learning is an incredibly powerful way to make predictions or calculated suggestions based on large amounts of data. Some of the most common examples of machine learning are Netflix’s algorithms, which suggest movies based on movies you have watched in the past, and Amazon’s algorithms, which recommend books based on books you have bought before.
So if you want to learn more about machine learning, how do you start? For me, my first introduction was an Artificial Intelligence class I took while studying abroad in Copenhagen. My lecturer was a full-time Applied Math and CS professor at the Technical University of Denmark whose research areas are logic and artificial intelligence, focusing primarily on the use of logic to model human-like planning, reasoning and problem solving. The class was a mix of discussion of theory/core concepts and hands-on problem solving. The textbook that we used is one of the AI classics: Stuart Russell and Peter Norvig’s Artificial Intelligence: A Modern Approach, in which we covered major topics including intelligent agents, problem-solving by searching, adversarial search, probability theory, multi-agent systems, social AI, and the philosophy/ethics/future of AI. At the end of the class, in a team of 3, we implemented simple search-based agents solving transportation tasks in a virtual environment as a programming project.
I learned a tremendous amount thanks to that class, and decided to keep learning about this specialized topic. In the last few weeks, I have been to multiple tech talks in San Francisco on deep learning, neural networks, and data architecture, as well as a Machine Learning conference with a lot of well-known professionals in the field. Most importantly, I enrolled in Udacity’s Intro to Machine Learning online course at the beginning of June and finished it a few days ago. In this post, I want to share some of the most common machine learning algorithms that I learned from the course.
Machine learning algorithms can be divided into 3 broad categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is useful in cases where a property (label) is available for a certain dataset (training set), but is missing and needs to be predicted for other instances. Unsupervised learning is useful in cases where the challenge is to discover implicit relationships in a given unlabeled dataset (items are not pre-assigned). Reinforcement learning falls between these 2 extremes: there is some form of feedback available for each predictive step or action, but no precise label or error message. Since this was an intro class, I didn’t learn about reinforcement learning, but I hope that 10 algorithms on supervised and unsupervised learning will be enough to keep you interested.
1. Decision Trees: A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance-event outcomes, resource costs, and utility. Take a look at the image to get a sense of what it looks like.
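To make the idea of decisions, chance-event outcomes, costs, and utility concrete, here is a minimal sketch in Python. The node layout (`kind`, `options`, `branches`), the product-launch scenario, and every number are assumptions made up for illustration; they are not from the course.

```python
# A tiny decision tree for decision support. All names and numbers are hypothetical.

def expected_value(node):
    """Recursively evaluate a node and return its (expected) utility."""
    kind = node["kind"]
    if kind == "outcome":
        return node["utility"]
    if kind == "chance":
        # Probability-weighted average over the chance branches.
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        # Value of the best option after subtracting that option's cost.
        return max(expected_value(child) - cost for _, cost, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")

def best_option(node):
    """Name of the option with the highest value at a decision node."""
    return max(node["options"], key=lambda opt: expected_value(opt[2]) - opt[1])[0]

tree = {
    "kind": "decision",
    "options": [
        # (name, resource cost, subtree)
        ("launch", 30.0, {
            "kind": "chance",
            "branches": [
                (0.6, {"kind": "outcome", "utility": 100.0}),  # launch succeeds
                (0.4, {"kind": "outcome", "utility": 0.0}),    # launch fails
            ],
        }),
        ("do nothing", 0.0, {"kind": "outcome", "utility": 20.0}),
    ],
}

print(best_option(tree), expected_value(tree))
```

Under these invented numbers, launching is worth 0.6 × 100 − 30 = 30, which beats the 20 from doing nothing, so the tree recommends launching.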
2. Naive Bayes Classification: Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features. The equation is P(A|B) = P(B|A) P(A) / P(B), where P(A|B) is the posterior probability, P(B|A) is the likelihood, P(A) is the class prior probability, and P(B) is the predictor prior probability.
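Some real-world examples are: marking an email as spam or not spam; classifying a news article as being about technology, politics, or sports; checking whether a piece of text expresses positive or negative emotions; and face recognition software.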
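As a rough illustration of the spam example, the sketch below applies Bayes’ theorem to word counts. It assumes a multinomial model with add-one (Laplace) smoothing, which is one common variant rather than the only one, and the four training messages are invented. P(B) is the same for every class, so it is left out when comparing scores.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (words, label) pairs. Returns the counts the classifier needs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict(model, words):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # log P(A): the class prior.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # log P(B|A) with add-one smoothing. The "naive" step is treating
            # each word as independent of the others given the class.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    # The class with the highest posterior score wins.
    return max(scores, key=scores.get)

# Hypothetical training data.
model = train([
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "tomorrow"], "ham"),
    (["lunch", "now"], "ham"),
])

print(predict(model, ["free", "money"]))
print(predict(model, ["meeting", "lunch"]))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer messages.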
3. Ordinary Least Squares Regression: If you know statistics, you have probably heard of linear regression before. Least squares is a method for performing linear regression. You can think of linear regression as the task of fitting a straight line through a set of points. There are multiple possible strategies to do this, and the “ordinary least squares” strategy goes like this: you draw a line, and then for each of the data points, measure the vertical distance between the point and the line, square it, and add these up; the fitted line is the one where this sum of squared distances is as small as possible.
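For a single input variable, the line that minimizes the sum of squared vertical distances has a closed-form solution. The sketch below uses that formula in plain Python; the five data points are made up, and the function names are only illustrative.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data lying roughly along y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Any other line, such as y = 2x, gives a larger sum of squared distances.
print(sum_squared_residuals(xs, ys, slope, intercept))
print(sum_squared_residuals(xs, ys, 2.0, 0.0))
```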
4. Logistic Regression: Logistic regression is a powerful statistical way of modeling a binomial outcome with one or more explanatory variables. It measures the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities using a logistic function, which is the cumulative logistic distribution.
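In general, regressions can be used in real-world applications such as: credit scoring; measuring the success rates of marketing campaigns; predicting the revenues of a certain product; and asking whether there is going to be an earthquake on a particular day.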
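To show how the logistic function turns a linear score into a probability, here is a minimal sketch of logistic regression with one explanatory variable, fitted by plain gradient descent. The learning rate, the number of epochs, and the pass/fail data are assumptions chosen for illustration; real libraries use more robust optimizers.

```python
import math

def sigmoid(z):
    """The logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: hours studied versus whether the exam was passed (1) or not (0).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
passed = [0, 0, 0, 1, 1, 1]

w, b = train(hours, passed)
print(sigmoid(w * 1.0 + b))  # estimated probability of passing after 1 hour
print(sigmoid(w * 6.0 + b))  # estimated probability of passing after 6 hours
```

The output of `sigmoid` is read as the estimated probability of the positive class, and a threshold such as 0.5 turns it into a yes/no prediction.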
5. Support Vector Machines: SVM is a binary classification algorithm. Given a set of points of 2 types in N-dimensional space, SVM generates an (N - 1)-dimensional hyperplane to separate those points into 2 groups. Say you have some points of 2 types on a piece of paper which are linearly separable. SVM will find a straight line which separates those points into 2 types and is situated as far as possible from all of them.
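The phrase “as far as possible from all those points” refers to the margin: the distance from the separating line to the nearest point. The sketch below is not an SVM solver; it only computes the margin of two hand-picked candidate lines so you can see what an SVM would prefer. The points and both lines are invented for illustration.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance from any point to the line w.x + b = 0.

    Labels are +1 or -1. A negative result means at least one point
    sits on the wrong side of the line.
    """
    norm = math.hypot(w[0], w[1])
    return min(
        y * (w[0] * p[0] + w[1] * p[1] + b) / norm
        for p, y in zip(points, labels)
    )

# Hypothetical, linearly separable 2-D points.
points = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
labels = [-1, -1, -1, 1, 1, 1]

line_a = ((1.0, 1.0), -5.5)  # the line x + y = 5.5
line_b = ((1.0, 0.0), -3.0)  # the vertical line x = 3

margin_a = margin(*line_a, points, labels)
margin_b = margin(*line_b, points, labels)
print(margin_a, margin_b)
```

Both lines separate the two groups, but the first leaves a wider gap to its nearest points, so it is the one a maximum-margin classifier would favor of the two.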
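6. Ensemble Methods: So how do ensemble methods work, and why are they superior to individual models? They average out biases: if you average a bunch of Democratic-leaning polls and Republican-leaning polls together, you will get something that isn’t leaning either way. They reduce the variance: the aggregate opinion of a bunch of models is less noisy than the single opinion of one of the models. In finance, this is called diversification: a mixed portfolio of many stocks will be much less variable than just one of the stocks alone. This is also why your models will be better with more data points rather than fewer. They are unlikely to over-fit: if you have individual models that didn’t over-fit, and you combine the predictions from each model in a simple way (average, weighted average, logistic regression), then there is no room for over-fitting.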
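A small sketch of the simplest way to combine models, a majority vote, is shown below. The three “models” are just hard-coded prediction lists invented for the example, each wrong on a different item; real ensembles only gain this much when the individual errors are not strongly correlated.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of labels per model, all of the same length."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Hypothetical labels and model outputs.
truth   = [1, 0, 1, 1, 0, 1]
model_a = [1, 0, 1, 0, 0, 1]  # wrong on the item at index 3
model_b = [1, 1, 1, 1, 0, 1]  # wrong on the item at index 1
model_c = [0, 0, 1, 1, 0, 1]  # wrong on the item at index 0

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", ensemble)]:
    print(name, accuracy(preds, truth))
```

Because each mistake is outvoted by the two models that got that item right, the combined prediction matches the true labels even though no single model does.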
7. Clustering Algorithms: Clustering is the task of grouping a set of objects such that objects in the same group (cluster) are more similar to each other than to those in other groups.
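Every clustering algorithm is different, and here are a few of them: centroid-based algorithms, connectivity-based algorithms, density-based algorithms, probabilistic, dimensionality reduction, and neural networks / deep learning.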
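As one example of a centroid-based algorithm, here is a minimal k-means sketch in plain Python. The starting centroids are fixed by hand so the result is reproducible, which is an assumption for this example; practical implementations pick them randomly or with a seeding heuristic. The six points are made up.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)  # keep an empty cluster's centroid in place
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```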
8. Principal Component Analysis: PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
9. Singular Value Decomposition: In linear algebra, SVD is a factorization of a real or complex matrix. For a given m × n matrix M, there exists a decomposition such that M = UΣV*, where U is an m × m unitary matrix, Σ is an m × n rectangular diagonal matrix with non-negative real entries, and V* is the conjugate transpose of an n × n unitary matrix V.
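PCA is actually a simple application of SVD. In computer vision, the first face recognition algorithms used PCA and SVD to represent faces as a linear combination of “eigenfaces”, do dimensionality reduction, and then match faces to identities via simple methods; although modern methods are much more sophisticated, many still depend on similar techniques.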
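The relationship between PCA and SVD can be sketched in a few lines with NumPy: center the data, take its SVD, and read the principal directions off the rows of the right factor. The data matrix below is invented (points close to the line y = 2x), and this is only one straightforward way to compute PCA, assuming NumPy is installed.

```python
import numpy as np

# Hypothetical data: 5 observations of 2 correlated variables.
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])

# PCA starts by centering each column at zero.
X_centered = X - X.mean(axis=0)

# SVD: X_centered = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Rows of Vt are the principal directions. The squared singular values
# show how the total variance is split between them.
explained_variance_ratio = S**2 / np.sum(S**2)

# Dimensionality reduction: project the 2-D data onto the first direction.
X_reduced = X_centered @ Vt[0]

print(explained_variance_ratio)
print(X_reduced)
```

Because the two variables are almost perfectly correlated here, nearly all of the variance falls on the first principal component, so the 1-D projection loses very little information.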
10. Independent Component Analysis: ICA is a statistical technique for revealing hidden factors that underlie sets of random variables, measurements, or signals. ICA defines a generative model for the observed multivariate data, which is typically given as a large database of samples. In the model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed to be non-Gaussian and mutually independent, and they are called the independent components of the observed data.
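ICA is related to PCA, but it is a much more powerful technique that is capable of finding the underlying factors of sources when these classic methods fail completely. Its applications include digital images, document databases, economic indicators and psychometric measurements.
Now go forth and wield your understanding of algorithms to create machine learning applications that make better experiences for people everywhere.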
Bio: James Le is a Product Intern at New Story Charity and a Computer Science and Communication student at Denison University.