Machine Learning with Python - Basics

We are living in the "age of data", enriched by better computational power and more storage resources. The volume of data grows every day, but the real challenge is to make sense of it all. Businesses and organizations are trying to deal with it by building intelligent systems using concepts and methodologies from data science, data mining and machine learning. Among them, machine learning is one of the most exciting fields of computer science. It would not be wrong to call machine learning the application and science of algorithms that make sense of data.

What is Machine Learning?

Machine Learning (ML) is the field of computer science that enables computer systems to make sense of data in much the same way as human beings do.

In simple words, ML is a type of artificial intelligence that extracts patterns from raw data by using an algorithm or method. The main focus of ML is to allow computer systems to learn from experience without being explicitly programmed and without human intervention.

Need for Machine Learning

Human beings are, at present, the most intelligent and advanced species on Earth because they can think, evaluate and solve complex problems. AI, by contrast, is still in its early stages and has not surpassed human intelligence in many respects. So why make machines learn? The most suitable reason is "to make decisions, based on data, with efficiency and scale".

Lately, organizations have been investing heavily in newer technologies such as Artificial Intelligence, Machine Learning and Deep Learning to extract key information from data, perform real-world tasks and solve problems. We can call these data-driven decisions taken by machines, particularly to automate a process. Such data-driven decisions can replace programming logic in problems that inherently cannot be solved by explicit programming. We cannot do without human intelligence, but we also need to solve real-world problems efficiently and at a huge scale. That is why the need for machine learning arises.

Why & When to Make Machines Learn?

We have already discussed the need for machine learning, but another question arises: in what scenarios should we make machines learn? There are several circumstances in which we need machines to take data-driven decisions efficiently and at a huge scale. The following are some circumstances where making machines learn would be more effective −

Lack of human expertise

The first scenario in which we want a machine to learn and take data-driven decisions is a domain where human expertise is lacking. Examples include navigation in unknown territories or on other planets.

Dynamic scenarios

Some scenarios are dynamic in nature, i.e. they keep changing over time. For such scenarios and behaviors, we want a machine to learn and take data-driven decisions. Examples include network connectivity and the availability of infrastructure in an organization.

Difficulty in translating expertise into computational tasks

There are various domains in which humans have expertise; however, they are unable to translate this expertise into computational tasks. In such circumstances we want machine learning. Examples include speech recognition and cognitive tasks.

Machine Learning Model

Before discussing the machine learning model, we need to understand the following formal definition of ML given by Professor Tom Mitchell −

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

The above definition focuses on three parameters, which are also the main components of any learning algorithm: Task (T), Performance (P) and Experience (E). In this context, we can simplify the definition as −

ML is a field of AI consisting of learning algorithms that −

  • Improve their performance (P)

  • At executing some task (T)

  • Over time with experience (E)

Based on the above, the following diagram represents a Machine Learning Model −

[Image 1: Machine Learning Model diagram]

Let us now discuss them in more detail −

Task (T)

From the perspective of the problem, we may define the task T as the real-world problem to be solved. The problem can be anything, such as finding the best house price in a specific location or finding the best marketing strategy. In machine learning, however, the definition of a task is different, because ML-based tasks are difficult to solve with a conventional programming approach.

A task T is said to be an ML-based task when it is defined by the process the system must follow to operate on data points. Examples of ML-based tasks are classification, regression, structured annotation, clustering and transcription.

Experience (E)

As the name suggests, experience is the knowledge gained from the data points provided to the algorithm or model. Once provided with the dataset, the model runs iteratively and learns some inherent pattern. The learning thus acquired is called experience (E). By analogy with human learning, we can think of a human being who learns or gains experience from various attributes such as situations and relationships. Supervised, unsupervised and reinforcement learning are some of the ways to learn or gain experience. The experience gained by our ML model or algorithm is then used to solve the task T.
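The idea of a model that runs iteratively over a dataset and picks up an inherent pattern can be sketched with a very small example. The block below is only an illustration under stated assumptions: the toy dataset (points on the line y = 2x + 1), the helper names `mean_squared_error` and `train`, and the choice of batch gradient descent are all hypothetical and do not come from the original text.

```python
# Hypothetical toy dataset (an assumption for illustration): points on y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]


def mean_squared_error(w, b, points):
    """Average squared gap between the predictions w*x + b and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


def train(points, epochs=200, learning_rate=0.05):
    """Fit y ~ w*x + b by batch gradient descent, recording the error per pass."""
    w, b = 0.0, 0.0
    n = len(points)
    history = [mean_squared_error(w, b, points)]  # error before any experience
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mean_squared_error(w, b, points))
    return w, b, history


w, b, history = train(data)
print(f"learned w={w:.3f}, b={b:.3f}")
print(f"error before training: {history[0]:.3f}, after: {history[-1]:.6f}")
```

Each pass over the data is one unit of experience E in this sketch, and the recorded error plays the role of a performance measure: it is lower after training than before.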

Performance (P)

An ML algorithm is supposed to perform a task and gain experience over time. The measure that tells whether the ML algorithm is performing as expected is its performance (P). P is a quantitative metric that tells how well a model performs the task T using its experience E. Many metrics help in understanding ML performance, such as accuracy score, F1 score, confusion matrix, precision, recall and sensitivity.
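Several of the metrics named above can be derived from the four cells of a binary confusion matrix. The following is a minimal sketch rather than a reference implementation: the function names `confusion_counts` and `classification_metrics` and the sample labels are hypothetical, chosen only for illustration. Sensitivity is another name for recall, so it is not computed separately.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 is the positive class, 0 the negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))        # (3, 1, 1, 3)
print(classification_metrics(y_true, y_pred))  # every metric is 0.75 here
```

In practice a library such as scikit-learn provides these metrics ready-made; the hand-written version is shown only to make the definitions explicit.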

Challenges in Machine Learning

While Machine Learning is rapidly evolving and making significant strides in cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to go. The reason is that ML has not yet been able to overcome a number of challenges. The challenges that ML currently faces are −

Quality of data − Having good-quality data for ML algorithms is one of the biggest challenges. Low-quality data leads to problems in data preprocessing and feature extraction.

Time-consuming task − Another challenge faced by ML models is the time consumed, especially by data acquisition, feature extraction and retrieval.

Lack of specialists − Because ML technology is still in its infancy, experts are hard to find.

No clear objective for formulating business problems − The lack of a clear objective and a well-defined goal for business problems is another key challenge for ML, because the technology is not yet mature.

Issue of overfitting & underfitting − A model that overfits or underfits does not represent the problem well.

Curse of dimensionality − Another challenge ML models face is data points with too many features. This can be a real hindrance.

Difficulty in deployment − The complexity of an ML model can make it quite difficult to deploy in real life.
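The overfitting and underfitting challenge listed above can be made concrete with a small experiment. This is a hedged sketch built entirely on assumptions made for illustration: a synthetic one-feature dataset with 20% label noise, a 1-nearest-neighbour model that memorizes the training set (standing in for an overfitting model) and a constant majority-class predictor (standing in for an underfitting model). None of these names or choices come from the original text.

```python
import random


def make_data(n, rng, noise=0.2):
    """Synthetic points: label is 1 when x > 0.5, flipped with probability `noise`."""
    points = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label  # label noise that a memorizing model will also learn
        points.append((x, label))
    return points


def one_nearest_neighbor(train):
    """Overfitting stand-in: predict the label of the closest training point."""
    def predict(x):
        nearest = min(train, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict


def majority_class(train):
    """Underfitting stand-in: ignore x and always predict the most common label."""
    ones = sum(label for _, label in train)
    majority = 1 if ones * 2 >= len(train) else 0
    return lambda x: majority


def accuracy(predict, points):
    """Fraction of points whose predicted label matches the recorded label."""
    return sum(predict(x) == label for x, label in points) / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = make_data(200, rng)
test_set = make_data(200, rng)

memorizer = one_nearest_neighbor(train_set)
constant = majority_class(train_set)

print("1-NN     train/test:", accuracy(memorizer, train_set), accuracy(memorizer, test_set))
print("constant train/test:", accuracy(constant, train_set), accuracy(constant, test_set))
```

The memorizing model scores perfectly on the data it has seen and worse on unseen data, which is the signature of overfitting. The constant model scores poorly on both sets because it ignores the feature altogether, which is the signature of underfitting.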

Applications of Machine Learning

Machine Learning is one of the most rapidly growing technologies, and according to researchers we are in a golden age of AI and ML. It is used to solve many complex real-world problems that cannot be solved with traditional approaches. The following are some real-world applications of ML −

  • Emotion analysis

  • Sentiment analysis

  • Error detection and prevention

  • Weather forecasting and prediction

  • Stock market analysis and forecasting

  • Speech synthesis

  • Speech recognition

  • Customer segmentation

  • Object recognition

  • Fraud detection

  • Fraud prevention

  • Recommendation of products to customers in online shopping

Translated from: https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_basics.htm
