Parametric and Nonparametric Models in Machine Learning
Machine learning can be summarized as learning a function (f) that maps input variables (X) to output variables (Y).
Y = f(X)
An algorithm learns this target function from the training data, but the form of the function is unknown. Different algorithms make different assumptions, or biases, about the function's structure, so our task as machine learning practitioners is to test various algorithms and see which one models the underlying function best. Machine learning models are parameterized so that their behavior can be tuned for a given problem. These models can have many parameters, and finding the best combination of parameters can be treated as a search problem.
Let's take a quick look at parameters in machine learning first, to get our understanding right. :)
What is a parameter in a machine learning model?
A model parameter is a configuration variable that is internal to the model and whose value can be estimated from the given data.
- They are required by the model when making predictions.
- Their values define the skill of the model on your problem.
- They are estimated or learned from historical training data.
- They are often not set manually by the practitioner.
- They are often saved as part of the learned model.
Examples of model parameters include:
- The weights in an artificial neural network.
- The support vectors in a support vector machine.
- The coefficients in linear regression or logistic regression.
Machine learning algorithms fall into two distinct groups: parametric and nonparametric.
What is the parametric model?
A parametric model is a learning model that summarizes the data with a fixed-size set of parameters (independent of the number of training instances). Parametric machine learning algorithms simplify the target function to a known form.
In a parametric model, you know exactly which model you are going to fit to the data, for example, a linear regression line:
b0 + b1*x1 + b2*x2 = 0
where
b0, b1, b2 → the coefficients of the line that control the intercept and slope
x1, x2 → input variables
Assuming the functional form of a line simplifies the learning process greatly. Now all we have to do is estimate the coefficients of the line equation, and we have a predictive model for the problem. With the intercept and the coefficients, one can predict any value along the regression line.
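As a minimal sketch of that estimation step, the snippet below fits b0, b1, and b2 by ordinary least squares with NumPy. It assumes the line is used as a predictor of the form y = b0 + b1*x1 + b2*x2; the synthetic data, the "true" coefficient values, and the predict helper are made up for illustration and are not from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: 200 instances with two input variables x1, x2.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
true_b = np.array([0.5, 2.0, -3.0])  # hypothetical b0, b1, b2
y = true_b[0] + X @ true_b[1:] + rng.normal(0.0, 0.1, size=200)

# Prepend a column of ones so the intercept b0 is estimated
# together with the slopes b1 and b2.
A = np.column_stack([np.ones(len(X)), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(x1, x2):
    """Predict y using only the three learned coefficients."""
    return b[0] + b[1] * x1 + b[2] * x2

print("estimated b0, b1, b2:", b)
print("prediction at (0.2, -0.4):", predict(0.2, -0.4))
```

Once b has been estimated, the training data is no longer needed: predict uses only the three coefficients.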
The assumed functional form is often a linear combination of the input variables, and as such parametric machine learning algorithms are also frequently referred to as 'linear machine learning algorithms.'
The equation in these algorithms is predefined. Feeding in more data only changes the coefficients in the equation; increasing the number of instances does not make the model more complex. The estimates simply become more stable.
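To illustrate this point, here is a small sketch that fits the same straight line to 10, 1,000, and 100,000 synthetic instances. The fit_line helper and the data-generating values (intercept 1.0, slope 2.0) are assumptions chosen for the example. Whatever the data size, the fitted model consists of exactly two numbers.

```python
import numpy as np

def fit_line(n, seed=0):
    """Fit y = b0 + b1*x by least squares on n synthetic instances."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
    A = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

for n in (10, 1_000, 100_000):
    coef = fit_line(n)
    # coef.size is the number of parameters; it does not depend on n.
    print(f"n={n:>7}  parameters={coef.size}  b0={coef[0]:.3f}  b1={coef[1]:.3f}")
```

More data only moves the two coefficients closer to the values that generated the data; the form of the model never changes.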
Some more examples of parametric machine learning algorithms include:
- Logistic Regression
- Linear Discriminant Analysis
- Perceptron
- Naive Bayes
- Simple Neural Networks
What is the nonparametric model?
Nonparametric machine learning algorithms are those that do not make specific assumptions about the form of the mapping function. By not making assumptions, they are free to learn any functional form from the training data. The word nonparametric does not mean that the model has no parameters, but rather that the number of parameters is not fixed in advance and can change with the training data. When dealing with ranked data, one may also turn to nonparametric modeling, in which the order of the values carries part of the information.
An easy-to-understand nonparametric model is the k-nearest neighbors algorithm, which makes predictions for a new data instance based on the k most similar training patterns. The only assumption it makes about the data set is that the training patterns that are most similar are most likely to have a similar result.
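A minimal from-scratch sketch of k-nearest neighbors classification is shown below. It assumes Euclidean distance as the similarity measure and a majority vote among the k neighbors; the knn_predict function and the toy data set are hypothetical and only for illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training instance.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training instances.
    nearest = np.argsort(distances)[:k]
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy data set: two well-separated clusters labeled "a" and "b".
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # a
print(knn_predict(X_train, y_train, np.array([5.1, 5.0])))  # b
```

Note that there is no training step that compresses the data into coefficients: the "model" is the stored training set itself, so it grows with every instance added.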
Some more examples of popular nonparametric machine learning algorithms are:
- k-Nearest Neighbors
- Decision Trees like CART and C4.5
- Support Vector Machines
Parametric vs. Nonparametric modeling
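- Parametric models describe the data with a fixed, finite set of parameters, while nonparametric models let the effective number of parameters grow with the training data. Both kinds of model can handle discrete and continuous values.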
- Parametric methods can rely on the traditional measures associated with normal distributions, including mean, median, and mode. Nonparametric methods are used when one cannot assume the data comes from a normal distribution, even though some of the data they handle happens to be approximately normal.
- Feature engineering is important in parametric models, because you can poison a parametric model by feeding it a lot of unrelated features. Some nonparametric models, such as decision trees, handle part of this work themselves: we can feed them all the data we have and the algorithm can ignore unimportant features. This does not make them immune to overfitting, though; nonparametric models are generally more prone to overfitting the training data than parametric ones.
- A parametric model can predict future values using only its parameters. While nonparametric machine learning algorithms are often slower and require large amounts of data, they are rather flexible, as they minimize the assumptions they make about the data.
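The trade-off in the list above can be sketched on a deliberately nonlinear problem. The example below is an illustration under assumed conditions (a noisy sine curve, a straight-line fit as the parametric model, and a 5-nearest-neighbor average as the nonparametric one); the helper names make_data and knn_regress are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(x) on [0, 2*pi]."""
    x = rng.uniform(0.0, 2.0 * np.pi, size=n)
    y = np.sin(x) + rng.normal(0.0, 0.1, size=n)
    return x, y

x_train, y_train = make_data(300)
x_test, y_test = make_data(300)

# Parametric model: a straight line, summarized by two coefficients.
A = np.column_stack([np.ones_like(x_train), x_train])
b, *_ = np.linalg.lstsq(A, y_train, rcond=None)
line_pred = b[0] + b[1] * x_test

# Nonparametric model: average the targets of the k nearest training points.
def knn_regress(x_new, k=5):
    nearest = np.argsort(np.abs(x_train - x_new))[:k]
    return y_train[nearest].mean()

knn_pred = np.array([knn_regress(x) for x in x_test])

line_mse = np.mean((line_pred - y_test) ** 2)
knn_mse = np.mean((knn_pred - y_test) ** 2)

print(f"line: stores {b.size} numbers, test MSE = {line_mse:.3f}")
print(f"k-NN: stores {x_train.size + y_train.size} numbers, test MSE = {knn_mse:.3f}")
```

The line needs only two stored numbers but cannot follow the curve, because its assumed form is wrong for this data. The nearest-neighbor model follows the curve closely, at the cost of keeping the whole training set and searching it for every prediction.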
In this post, we have learned that parametric methods make strong assumptions about the mapping of the input variables to the output variable and in turn are faster to train and require less data, but may not be as powerful. Nonparametric methods make few or no assumptions about the target function and in turn require a lot more data, are slower to train, and have a higher model complexity, but can result in more powerful models.
Translated from: https://medium.com/analytics-vidhya/parametric-and-nonparametric-models-in-machine-learning-a9f63999e233