Python has a design philosophy that stresses allowing programmers to express concepts readably and in fewer lines of code. This philosophy makes the language suitable for a diverse set of use cases: simple scripts for the web, large web applications (like YouTube), a scripting language for other platforms (like Blender and Autodesk’s Maya), and scientific applications in several areas, such as astronomy, meteorology, physics, and data science.
It is technically possible to implement scalar and matrix calculations using Python lists. However, this can be unwieldy, and performance is poor compared to languages suited for numerical computation, such as MATLAB or Fortran, or even some general-purpose languages, such as C or C++.
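As a rough illustration of why this gets unwieldy, the sketch below computes a matrix-vector product with nested Python lists. The helper name matvec and the sample values are hypothetical and chosen only for illustration; they are not part of the benchmark in this article.

```python
# Hypothetical helper: multiply a matrix (a list of rows) by a vector (a list).
def matvec(matrix, vector):
    # Each output element is the dot product of one row with the vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Illustrative 2x2 matrix and length-2 vector.
A = [[1.0, 2.0],
     [3.0, 4.0]]
v = [1.0, 1.0]

result = matvec(A, v)
print(result)  # [3.0, 7.0]
```

Every multiplication and addition here runs as interpreted Python, one element at a time, which is the overhead that numerical libraries are designed to avoid.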
To circumvent this deficiency, several libraries have emerged that maintain Python’s ease of use while lending the ability to perform numerical calculations in an efficient manner. Two such libraries worth mentioning are NumPy (one of the pioneer libraries to bring efficient numerical computation to Python) and TensorFlow (a more recently rolled-out library focused more on deep learning algorithms).
But how do these approaches compare? How much faster does an application run when implemented with NumPy instead of pure Python? What about TensorFlow? The purpose of this article is to begin to explore the improvements you can achieve by using these libraries.
To compare the performance of the three approaches, you’ll build a basic regression with native Python, NumPy, and TensorFlow.
To test the performance of the libraries, you’ll consider a simple two-parameter linear regression problem. The model has two parameters: an intercept term, w_0, and a single coefficient, w_1.
Given N pairs of inputs x and desired outputs d, the idea is to model the relationship between the outputs and the inputs using a linear model y = w_0 + w_1 * x, where the output of the model y is approximately equal to the desired output d for every pair (x, d).
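To make the model concrete, here is a minimal pure-Python sketch that evaluates y = w_0 + w_1 * x for a few inputs and compares the result with the desired outputs. The function name predict, the three sample pairs, and the parameter values are all made up for illustration.

```python
# Hypothetical helper: evaluate the linear model for every input in `x`.
def predict(w_0, w_1, x):
    return [w_0 + w_1 * x_i for x_i in x]


# Illustrative inputs and desired outputs (not the article's data set).
x = [0.0, 1.0, 2.0]
d = [3.1, 4.9, 7.0]

# Assumed parameter values: intercept 3.0 and coefficient 2.0.
y = predict(3.0, 2.0, x)

# Difference between desired and modeled output for each pair (x, d).
errors = [d_i - y_i for d_i, y_i in zip(d, y)]

print(y)  # [3.0, 5.0, 7.0]
print(all(abs(e) < 0.2 for e in errors))  # True
```

A good choice of w_0 and w_1 keeps these differences small for every pair, which is what "approximately equal" means here.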
Technical Detail: The intercept term, w_0, is technically just a coefficient like w_1, but it can be interpreted as a coefficient that multiplies elements of a vector of 1s.
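One way to see this is to write the model for all N inputs at once. The indexed symbols x_1, ..., x_N and the bold vector y are notation assumed here for illustration; the article itself only uses x, y, w_0, and w_1.

```latex
\[
\mathbf{y} =
\begin{bmatrix}
1 & x_1 \\
1 & x_2 \\
\vdots & \vdots \\
1 & x_N
\end{bmatrix}
\begin{bmatrix}
w_0 \\
w_1
\end{bmatrix}
=
\begin{bmatrix}
w_0 + w_1 x_1 \\
w_0 + w_1 x_2 \\
\vdots \\
w_0 + w_1 x_N
\end{bmatrix}
\]
```

The first column of the matrix is the vector of 1s that w_0 multiplies, so both parameters can be treated uniformly as entries of a single coefficient vector.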
To generate the training set of the problem, use the following program:
import numpy as np

np.random.seed(444)

N = 10000
sigma = 0.1
noise = sigma * np.random.randn(N)
x = np.linspace(0, 2, N)
d = 3 + 2 * x + noise
d.shape = (N, 1)

# We need to prepend a column vector of 1s to `x`.
X = np.column_stack((np.ones(N, dtype=x.dtype), x))
print(X.shape)
# Output: (10000, 2)
This program creates a set of 10,000 inputs x linearly distributed over the interval from 0 to 2. It then creates a set of desired outputs d = 3 + 2 * x + noise, where nois