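Numba uses the LLVM compiler infrastructure to translate pure Python code into optimized machine code.
It can bring array-oriented, math-heavy Python code to performance comparable to C, C++, and Fortran, without changing the Python interpreter.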
from numba import jit, int32
import math
# Example 1: compile with an explicit signature (int32 result, two int32 arguments)
@jit(int32(int32, int32))
def f(x, y):
    return x + y
f(1, 3)
4
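Numba-compiled functions can call other compiled functions. Those calls may even be inlined in the native code, depending on the optimizer's heuristics. For example: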
# Example 2: one compiled function calling another
@jit
def square(x):
    return x ** 2

@jit
def hypot(x, y):
    return math.sqrt(square(x) + square(y))
hypot(1, 3)
3.1622776601683795
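# Example 3: the same loop written four ways
@jit  # lazy compilation; the type hints are documentation only
def go_fast_sum1(size: int) -> int:
    total = 0
    for i in range(size):
        total += i
    return total

@jit  # lazy compilation, no type hints
def go_fast_sum2(size):
    total = 0
    for i in range(size):
        total += i
    return total

@jit(int32(int32))  # eager compilation with an explicit signature
def go_fast_sum3(size):
    total = 0
    for i in range(size):
        total += i
    return total

def pure_python_sum(size):  # uncompiled baseline
    total = 0
    for i in range(size):
        total += i
    return total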
%timeit go_fast_sum1(1000)
192 ns ± 6.89 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit go_fast_sum2(1000)
193 ns ± 4.93 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit go_fast_sum3(1000)
201 ns ± 4.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
%timeit pure_python_sum(1000)
47.8 µs ± 5.39 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
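The examples above show that the different ways of writing the decorator make little difference: all three compiled versions take about 0.2 µs per call, versus about 48 µs for pure Python. Use whichever style you are most comfortable with.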
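nopython mode (nopython=True) is the recommended mode.
Put simply, the decorated code runs entirely as machine code, without going through the Python interpreter, which is why it is so fast.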
@jit(nopython=True)
def go_fast_sum4(size):
    total = 0
    for i in range(size):
        total += i
    return total
%timeit go_fast_sum4(1000)
190 ns ± 12.5 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
@jit(nopython=True)
def go_fast_sum5(size: int) -> int:
    total = 0
    for i in range(size):
        total += i
    return total
%timeit go_fast_sum5(1000)
195 ns ± 26 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
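Normal mode (object mode) runs the code through the Python interpreter. In older Numba releases such as the one used here, @jit without nopython=True first tries nopython mode and falls back to object mode when that fails; since Numba 0.59, nopython mode is the default and object mode has to be requested with forceobj=True.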
def foo():
    A  # non-numerical work
    for i in range(1000):
        B  # numerical work
    C  # non-numerical work
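Object mode automatically identifies the loop, compiles it, and runs it outside the Python interpreter (Numba calls this loop-lifting). A and C cannot be optimized, so execution has to switch back to the Python interpreter for them, and that switching is expensive. For code shaped like this, avoid nopython mode and use normal (object) mode instead.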
@jit(nopython=True, parallel=True)
def go_fast_sum6(size: int) -> int:
    total = 0
    for i in range(size):
        total += i
    return total
%timeit go_fast_sum6(1000)
/home/ubuntu/.local/lib/python3.8/site-packages/numba/core/typed_passes.py:329: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.
To find out why, try turning on parallel diagnostics, see https://numba.readthedocs.io/en/stable/user/parallel.html#diagnostics for help.
File "../../../../tmp/ipykernel_3446152/3683479791.py", line 1:
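from math import asin, cos, radians, sin, sqrt

# EARTH_R must be defined at module level; its value is not shown here
# (presumably the Earth's radius in kilometres, given the * 1000 below).
@jit(nopython=True)
def haversine_dis(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """
    Compute the distance between two points from their longitude and latitude.
    """
    # Convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # Haversine formula
    d_lon = lon2 - lon1
    d_lat = lat2 - lat1
    aa = sin(d_lat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(d_lon / 2) ** 2
    c = 2 * asin(sqrt(aa))
    return c * EARTH_R * 1000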
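Here the arguments are plain integers, floats, strings, and similar simple values that convert directly to native types, and the body is pure math, which is the case nopython mode handles well.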
@jit()
def cal_cross_points(
        point: List[float],
        point_before: List[float] = None,
        point_after: List[float] = None,
        r: float = 0.003) -> List[Any]:
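The arguments may be None, so the code has to be adjusted to handle that, and the initial values must be given a consistent data type.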
@jit()
def get_center_point(points: List[List[float]]) -> List[float]:
    """
    Compute the center point of a polygon and return its coordinates [x, y].
    :param points: coordinates of the polygon's vertices
    :return:
    """
The arguments are nested lists, dictionaries, or other complex structures whose element types are not uniform.
def least_squares_transform(primary: List[List[float]], secondary: List[List[float]]) -> np.ndarray:
    x_points = np.array(primary)  # convert the nested list to a 2-D array
    y_points = np.array(secondary)  # convert the nested list to a 2-D array
    pad = lambda x: np.hstack([x, np.ones((x.shape[0], 1))])  # append a column of ones, the extra dimension the matrix operation needs
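The arguments are complex structures and the body contains a lambda function; Numba cannot recognize or convert either of these properly.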