I saw quite a few people recommending Andrew Ng's machine learning course, so I signed up on Coursera and started studying. January 15, 2016
Arthur Samuel (1959): Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.
Tom Mitchell (1998) (CMU): Well-posed learning problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
(1) Supervised learning
(2) Unsupervised learning
Others include reinforcement learning (e.g., checkers playing) and recommender systems.
(1) Supervised learning refers to the fact that we give the algorithm a data set in which the “right answers” are given.
(2) It mainly divides into regression problems (the output is continuous) and classification problems (the output is discrete); regression can be linear or nonlinear.
(3) SVM (support vector machine)
(1) No labels: the algorithm is not told the right answer.
(2) Clustering problem
E.g., Google News automatically clusters news stories into groups about the same topic.
(3) Applications: Genes; organizing large computing clusters; social network analysis; market segmentation; astronomical data analysis
(4) Cocktail party problem
Separate two sounds, input1 and input2, that come from different sources but are superimposed on each other.
SVD: singular value decomposition
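To make the clustering problem in (2) concrete, here is a minimal sketch of grouping unlabeled points. It uses k-means, which these notes do not name; the algorithm choice, the function name `kmeans_1d`, the sample points, and the starting centers are all assumptions made for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group unlabeled 1-D points around the given number of centers.

    Assumed illustration of clustering: no labels are used, only distances.
    """
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters


# Made-up unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

The algorithm is never told which group a point belongs to; the grouping emerges from the distances alone, which is the sense in which the learning is unsupervised.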
training set
m = number of training examples
(x, y): a single training example
$(x^{(i)}, y^{(i)})$: the $i$-th training example
$h_\theta(x) = \theta_0 + \theta_1 x_1$
cost function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
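The cost function above can be computed directly. This is a minimal sketch for the single-feature case; the function names `hypothesis` and `cost` and the toy data are assumptions for illustration, not part of the course material.

```python
def hypothesis(theta0, theta1, x):
    """Single-feature linear hypothesis: theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost: (1 / 2m) * sum of (h(x_i) - y_i)^2 over the training set."""
    m = len(xs)
    squared_errors = (
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return sum(squared_errors) / (2 * m)


# Made-up training set that lies exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(cost(1.0, 2.0, xs, ys))  # parameters that fit the data exactly
print(cost(0.0, 0.0, xs, ys))  # all-zero parameters
```

With the exact-fit parameters every error term is zero, so the cost is zero; any other choice of parameters gives a strictly positive cost on this data.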
$\theta \in \mathbb{R}^{n+1}$, $J(\theta_0, \theta_1, \ldots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
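Take the partial derivative with respect to each $\theta_j$:
set $\frac{\partial}{\partial \theta_j} J(\theta) = 0$ (for every $j$)
solve for $\theta_0, \theta_1, \ldots, \theta_n$
From this one can derive $\theta = (X^T X)^{-1} X^T Y$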
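The closed-form solution can be sketched in plain Python. Instead of forming the inverse explicitly, this sketch solves the linear system $(X^T X)\,\theta = X^T Y$, which gives the same $\theta$ whenever $X^T X$ is invertible. The helper names, the use of Gauss-Jordan elimination, and the toy data are assumptions for illustration; the first column of `X` is all ones so that it multiplies $\theta_0$.

```python
def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes `a` is square and invertible.
    """
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * t for x, t in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Made-up data on the line y = 1 + 2x; the leading 1.0 is the intercept feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = normal_equation(X, y)
print(theta)  # approximately [1.0, 2.0]
```

No learning rate and no iteration are involved here: one linear solve gives the answer, at the price of working with an $(n+1) \times (n+1)$ matrix.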
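When there are many features, e.g. $n \ge 10^6$, gradient descent is the better fit.
But when $n$ is small, gradient descent still requires choosing $\alpha$ and takes many steps to converge, so it does not work as well as the normal equation.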
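For comparison, here is a minimal sketch of batch gradient descent on the same squared-error cost for a single feature. The function name `gradient_descent`, the learning rate `alpha=0.1`, the step count, and the toy data are assumed values chosen for illustration; on other data a different $\alpha$ may be needed for the iteration to converge.

```python
def gradient_descent(xs, ys, alpha=0.1, steps=2000):
    """Minimize the squared-error cost for theta0 + theta1 * x by batch gradient descent.

    alpha (learning rate) and steps are assumed values, not tuned recommendations.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        # Prediction error on every training example with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Made-up data on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # approaches 1.0 and 2.0
```

Even on this four-point data set the loop runs thousands of updates and depends on a sensible $\alpha$, which illustrates why the normal equation is preferable when $n$ is small.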