NumPy speed tests on different platforms (AMD/Intel) and with different BLAS libraries

NumPy speed test

  • Test script
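
The test script itself is not reproduced in this post. The following is a minimal sketch of a benchmark that prints lines in the same format as the results further down. The function names (`time_call`, `run_benchmark`, `format_report`), the random inputs, the repeat counts, and the use of `np.linalg.eig` for the eigendecomposition are all assumptions made for illustration, not the original script.

```python
import time

import numpy as np


def time_call(fn, repeats=1):
    """Mean wall-clock time in seconds of calling fn() `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def run_benchmark(size=4096, vec_len=524288, svd_shape=(2048, 1024), n=2048, seed=0):
    """Time five BLAS/LAPACK-heavy operations; returns {label: (value, unit)}."""
    rng = np.random.default_rng(seed)
    results = {}

    # Matrix-matrix product (BLAS level 3, gemm).
    a = rng.random((size, size))
    b = rng.random((size, size))
    results[f"Dotted two {size}x{size} matrices"] = (time_call(lambda: a @ b), "s")

    # Vector-vector dot product (BLAS level 1), averaged and reported in ms.
    x = rng.random(vec_len)
    y = rng.random(vec_len)
    dot_ms = 1e3 * time_call(lambda: x @ y, repeats=100)
    results[f"Dotted two vectors of length {vec_len}"] = (dot_ms, "ms")

    # Singular value decomposition (LAPACK).
    m = rng.random(svd_shape)
    svd_s = time_call(lambda: np.linalg.svd(m, full_matrices=False))
    results[f"SVD of a {svd_shape[0]}x{svd_shape[1]} matrix"] = (svd_s, "s")

    # Cholesky needs a symmetric positive definite input, built here from s.
    s = rng.random((n, n))
    spd = s @ s.T + n * np.eye(n)
    chol_s = time_call(lambda: np.linalg.cholesky(spd))
    results[f"Cholesky decomposition of a {n}x{n} matrix"] = (chol_s, "s")

    # Eigendecomposition of a general (non-symmetric) matrix.
    eig_s = time_call(lambda: np.linalg.eig(s))
    results[f"Eigendecomposition of a {n}x{n} matrix"] = (eig_s, "s")
    return results


def format_report(results):
    """Render the results as one line per operation."""
    return [f"{label} in {value:.2f} {unit}." for label, (value, unit) in results.items()]
```

Calling `print("\n".join(format_report(run_benchmark())))` runs the full-size benchmark. Smaller sizes, such as `run_benchmark(size=256, n=256)`, are useful for a quick smoke test before a long run.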

Notes

  • On Intel platforms, conda install numpy installs numpy + mkl by default
  • On AMD platforms, conda install numpy installs numpy + openblas by default
  • On Windows and on AMD platforms, pip install numpy installs numpy + openblas by default
  • On AMD platforms, to use numpy + mkl (the 2019 release):
    • conda create -p miniconda3/envs/my_env python numpy mkl=2019.* blas=*=*mkl
    • conda env config vars set -p miniconda3/envs/my_env MKL_DEBUG_CPU_TYPE=5
    • conda activate my_env
  • On AMD platforms, to use numpy + blis:
    • conda create -p miniconda3/envs/my_env python numpy blas=*=*blis
    • conda activate my_env
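
After creating an environment with one of the recipes above, it helps to confirm which BLAS backend NumPy actually picked up. The sketch below is a heuristic that assumes the backend name appears in the text printed by `np.show_config()`. The helper names `detect_blas_backend` and `describe_environment` are hypothetical. With conda's generic `libblas` packaging the output may not name the backend at all, in which case the function returns "unknown".

```python
import contextlib
import io
import os

import numpy as np


def detect_blas_backend():
    """Guess the BLAS backend from the text printed by np.show_config()."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    text = buf.getvalue().lower()
    # Heuristic keyword search; the first match wins.
    for name in ("mkl", "openblas", "blis", "accelerate"):
        if name in text:
            return name
    return "unknown"


def describe_environment():
    """Collect the detected backend and the MKL debug variable, if set."""
    return {
        "backend": detect_blas_backend(),
        "MKL_DEBUG_CPU_TYPE": os.environ.get("MKL_DEBUG_CPU_TYPE"),
    }
```

Running `print(describe_environment())` inside the activated environment shows whether the variable set via `conda env config vars set` is visible to the Python process.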

Intel platform

  • CPU: i7-11800H @ 2.30GHz, 8 cores / 16 threads

numpy + mkl (installed via conda)

Dotted two 4096x4096 matrices in 0.68 s.
Dotted two vectors of length 524288 in 0.05 ms.
SVD of a 2048x1024 matrix in 0.31 s.
Cholesky decomposition of a 2048x2048 matrix in 0.08 s.
Eigendecomposition of a 2048x2048 matrix in 2.92 s.

numpy + openblas (installed via pip)

Dotted two 4096x4096 matrices in 2.55 s.
Dotted two vectors of length 524288 in 0.14 ms.
SVD of a 2048x1024 matrix in 1.01 s.
Cholesky decomposition of a 2048x2048 matrix in 0.12 s.
Eigendecomposition of a 2048x2048 matrix in 4.81 s.

AMD platform

  • CPU: AMD Ryzen 7 3800X, 8 cores / 16 threads

numpy + openblas (installed via conda)

Dotted two 4096x4096 matrices in 0.55 s.
Dotted two vectors of length 524288 in 0.02 ms.
SVD of a 2048x1024 matrix in 0.48 s.
Cholesky decomposition of a 2048x2048 matrix in 0.14 s.
Eigendecomposition of a 2048x2048 matrix in 3.98 s.

numpy + mkl (installed via conda, mkl 2019 release, which can be sped up with MKL_DEBUG_CPU_TYPE=5)

Dotted two 4096x4096 matrices in 0.55 s.
Dotted two vectors of length 524288 in 0.02 ms.
SVD of a 2048x1024 matrix in 0.40 s.
Cholesky decomposition of a 2048x2048 matrix in 0.14 s.
Eigendecomposition of a 2048x2048 matrix in 3.56 s.

numpy + blis (installed via conda)

Dotted two 4096x4096 matrices in 2.51 s.
Dotted two vectors of length 524288 in 0.09 ms.
SVD of a 2048x1024 matrix in 1.03 s.
Cholesky decomposition of a 2048x2048 matrix in 0.21 s.
Eigendecomposition of a 2048x2048 matrix in 6.46 s.
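
To make the comparison easier to read, the matrix-multiply timings listed above can be expressed as a slowdown relative to the fastest backend on each machine. This is a small sketch of that arithmetic. The numbers are copied from the results above, and the helper name `slowdown_vs_fastest` is hypothetical.

```python
# 4096x4096 matrix-multiply timings in seconds, copied from the results above.
timings = {
    "Intel i7-11800H": {"mkl": 0.68, "openblas": 2.55},
    "AMD Ryzen 7 3800X": {"openblas": 0.55, "mkl": 0.55, "blis": 2.51},
}


def slowdown_vs_fastest(results):
    """Return each backend's time divided by the fastest time, rounded to 2 places."""
    best = min(results.values())
    return {name: round(t / best, 2) for name, t in results.items()}


for cpu, results in timings.items():
    print(cpu, slowdown_vs_fastest(results))
```

On these figures, OpenBLAS is about 3.75 times slower than MKL for the matrix product on the Intel machine, while on the AMD machine BLIS is about 4.56 times slower than OpenBLAS and MKL, which tie.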
