NEON-related compiler options in GCC

    1. GCC command-line options

The main options are -mcpu, -mfpu, and -mfloat-abi.

      1. Option to specify the CPU

Use -mcpu=cpu-name, where cpu-name is the processor name in lowercase. For example, the option for Cortex-A9 is -mcpu=cortex-a9.

| CPU        | Option           |
|------------|------------------|
| Cortex-A5  | -mcpu=cortex-a5  |
| Cortex-A7  | -mcpu=cortex-a7  |
| Cortex-A8  | -mcpu=cortex-a8  |
| Cortex-A9  | -mcpu=cortex-a9  |
| Cortex-A15 | -mcpu=cortex-a15 |

Table 2-3 Cortex-A processors supported by GCC

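If your GCC does not recognize one of the core names in the table, the GCC version is probably too old and should be upgraded. If no CPU is specified, GCC uses its built-in default.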
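The naming rule above (processor name in lowercase after -mcpu=) can be sketched in a few lines of C. This is only an illustration: mcpu_option is a hypothetical helper written for this post, not something GCC provides, and it does not check whether the installed GCC actually accepts the resulting name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): writes "-mcpu=<lowercase name>"
 * into out. Returns 0 on success, -1 if out is too small. */
int mcpu_option(const char *cpu_name, char *out, size_t out_size)
{
    static const char prefix[] = "-mcpu=";
    size_t prefix_len = sizeof prefix - 1;
    size_t name_len;
    size_t i;

    assert(cpu_name != NULL && out != NULL);
    name_len = strlen(cpu_name);

    /* Room is needed for the prefix, the name, and the terminating NUL. */
    if (prefix_len + name_len + 1 > out_size)
        return -1;

    memcpy(out, prefix, prefix_len);
    for (i = 0; i < name_len; i++)
        out[prefix_len + i] = (char)tolower((unsigned char)cpu_name[i]);
    out[prefix_len + name_len] = '\0';
    return 0;
}
```

Calling mcpu_option("Cortex-A9", buf, sizeof buf) with a 32-byte buffer leaves "-mcpu=cortex-a9" in buf, which matches the Cortex-A9 row of Table 2-3.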

 

      2. Option to specify the FPU

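Almost all Cortex-A series processors have a floating-point unit, and most also have a NEON unit. However, exactly which instructions are available depends on the processor's floating-point unit.

GCC needs a separate option, -mfpu, to specify it.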

| Processor  | FP only                                | FP + SIMD        |
|------------|----------------------------------------|------------------|
| Cortex-A5  | -mfpu=vfpv3-fp16, -mfpu=vfpv3-d16-fp16 | -mfpu=neon-fp16  |
| Cortex-A7  | -mfpu=vfpv4, -mfpu=vfpv4-d16           | -mfpu=neon-vfpv4 |
| Cortex-A8  | -mfpu=vfpv3                            | -mfpu=neon       |
| Cortex-A9  | -mfpu=vfpv3-fp16, -mfpu=vfpv3-d16-fp16 | -mfpu=neon-fp16  |
| Cortex-A15 | -mfpu=vfpv4                            | -mfpu=neon-vfpv4 |

Table 2-4 -mfpu options for Cortex-A processors

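Both VFPv3 and VFPv4 implement 32 double-precision registers. However, when no NEON unit is present, the top 16 registers (D16-D31) are optional. The -d16 in an option name means that the top 16 D registers are not available. The fp16 part of an option name indicates that half-precision (16-bit) floating-point load, store, and conversion instructions are present. This is an extension to VFPv3 and is available in all VFPv4 implementations.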
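To make the naming scheme concrete, here is a minimal sketch that decodes an -mfpu value using only the rules stated above. The struct fpu_info and the function decode_mfpu are hypothetical names made up for this illustration. The code is a plain string check that assumes the value is one of those listed in Table 2-4; it does not ask GCC which values it accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* What an -mfpu value from Table 2-4 implies (hypothetical type). */
struct fpu_info {
    bool neon;            /* name starts with "neon": FP + SIMD */
    int  d_registers;     /* 16 if the name contains "d16", otherwise 32 */
    bool half_precision;  /* "fp16" in the name, or any VFPv4 variant */
};

/* Hypothetical helper: decode one of the option values from Table 2-4,
 * passed without the "-mfpu=" prefix, e.g. "vfpv3-d16-fp16". */
struct fpu_info decode_mfpu(const char *value)
{
    struct fpu_info info;

    assert(value != NULL);

    info.neon = strncmp(value, "neon", 4) == 0;
    /* "-d16" marks that the top 16 D registers are unavailable. */
    info.d_registers = strstr(value, "d16") != NULL ? 16 : 32;
    /* Half precision is an extension on VFPv3 and present on all VFPv4. */
    info.half_precision = strstr(value, "fp16") != NULL ||
                          strstr(value, "vfpv4") != NULL;
    return info;
}
```

For "vfpv3-d16-fp16" this reports no NEON, 16 D registers, and half-precision support. For "neon-vfpv4" it reports NEON, 32 D registers, and half-precision support.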

      3. Option to enable the use of NEON and floating-point instructions

This option is -mfloat-abi.

Note: -mfloat-abi can also change the ABI that the compiler conforms to.

The option can take one of three values.

  1. -mfloat-abi=soft Does not use any FPU or NEON instructions. Uses only the core register set. Emulates all floating-point operations using library calls.
  2. -mfloat-abi=softfp Uses the same calling conventions as -mfloat-abi=soft, but uses floating-point and NEON instructions as appropriate. Applications compiled with this option can be linked with a soft float library. If the relevant hardware instructions are available, then you can use this option to improve the performance of code and still have the code conform to a soft-float environment.
  3. -mfloat-abi=hard Uses the floating-point and NEON instructions as appropriate and also changes the ABI calling conventions in order to generate more efficient function calls. Floating-point and vector types can be passed between functions in the NEON registers, which significantly reduces the amount of copying. This also means that fewer calls need to pass arguments on the stack.
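One way to check which of the three settings a file was actually built with is to look at the compiler's predefined macros. The sketch below relies on the ACLE macros __ARM_NEON and __ARM_PCS_VFP and on GCC's __SOFTFP__. The mapping from macros to float ABI is my understanding of GCC's behaviour and should be treated as an assumption; it can be verified on a given toolchain by dumping the predefined macros with -dM -E. The toolchain prefix and file name in the comment are also assumptions.

```c
#include <assert.h>
#include <string.h>

/* Example build line (toolchain prefix and file name are assumptions):
 *   arm-linux-gnueabihf-gcc -mcpu=cortex-a9 -mfpu=neon-fp16 \
 *       -mfloat-abi=hard -c abi_check.c
 */

/* Report the float ABI this translation unit was compiled for.
 * Assumed mapping: __ARM_PCS_VFP -> hard, __SOFTFP__ -> soft,
 * neither on a 32-bit ARM target -> softfp. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not-arm32";   /* e.g. when built on an x86 or AArch64 host */
#elif defined(__ARM_PCS_VFP)
    return "hard";
#elif defined(__SOFTFP__)
    return "soft";
#else
    return "softfp";
#endif
}

/* 1 if the compiler may emit NEON instructions in this unit, else 0. */
int neon_enabled(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if floating-point arguments are passed in FP/NEON registers,
 * which is the case only for the hard-float calling convention. */
int fp_args_in_registers(void)
{
    const char *abi = float_abi_name();

    assert(abi != NULL);
    return strcmp(abi, "hard") == 0;
}
```

Because everything is guarded by preprocessor checks, the file also compiles on a non-ARM host, where float_abi_name() simply returns "not-arm32".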
