gcc优化选项,可在编译时间,目标文件长度,执行效率三个维度,进行不同的取舍和平衡。
arm-linux-gnueabihf-g++ -O3 -march=armv7-a -mcpu=cortex-a9 -ftree-vectorize -mfpu=neon -mfpu=vfpv3-fp16 -mfloat-abi=hard -ffast-math
-c 只编译并生成目标文件。
-E 只运行 C 预编译器。
-g 生成调试信息。GNU 调试器可利用该信息。
-Os 相对语-O2.5。
-o FILE 生成指定的输出文件。用在生成可执行文件时。
-O0 不进行优化处理。
-O 或 -O1 优化生成代码。
-O2 进一步优化。
-O3 比 -O2 更进一步优化,包括 inline 函数。
-shared 生成共享目标文件。通常用在建立共享库时。
-W 开启所有 gcc 能提供的警告。
-w 不生成任何警告信息。
-Wall 生成所有警告信息。
These options control various sorts of optimizations.
Without any optimization option, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
The compiler performs optimization based on the knowledge it has of the program. Using the -funit-at-a-time flag will allow the compiler to consider information gained from later functions in the file when compiling a function. Compiling multiple files at once to a single output file (and using -funit-at-a-time) will allow the compiler to use information gained from all of the files when compiling each of them.
Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed.
-O0: 不做任何优化,这是默认的编译选项。
-O1:优化会消耗少多的编译时间,它主要对代码的分支,常量以及表达式等进行优化。
-O和-O1: 对程序做部分编译优化,对于大函数,优化编译占用稍微多的时间和相当大的内存。使用本项优化,编译器会尝试减小生成代码的尺寸,以及缩短执行时间,但并不执行需要占用大量编译时间的优化。 -O1打开的优化选项, 可参考最后的参考文献。
Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function.
With -O, the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.
-O turns on the following optimization flags:
-fdefer-pop
-fmerge-constants
-fthread-jumps
-floop-optimize
-fif-conversion
-fif-conversion2
-fdelayed-branch
-fguess-branch-probability
-fcprop-registers
-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging.
-O2:会尝试更多的寄存器级的优化以及指令级的优化,它会在编译期间占用更多的内存和编译时间。
Gcc将执行几乎所有的不包含时间和空间折中的优化。当设置O2选项时,编译器并不进行循环打开()loop unrolling以及函数内联。
Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. The compiler does not perform loop unrolling or function inlining when you specify -O2. As compared to -O, this option increases both compilation time and the performance of the generated code.
-O2 turns on all optimization flags specified by -O. It also turns on the following optimization flags:
-fforce-mem
-foptimize-sibling-calls
-fstrength-reduce
-fcse-follow-jumps -fcse-skip-blocks
-frerun-cse-after-loop -frerun-loop-opt
-fgcse -fgcse-lm -fgcse-sm -fgcse-las
-fdelete-null-pointer-checks
-fexpensive-optimizations
-fregmove
-fschedule-insns -fschedule-insns2
-fsched-interblock -fsched-spec
-fcaller-saves
-fpeephole2
-freorder-blocks -freorder-functions
-fstrict-aliasing
-funit-at-a-time
-falign-functions -falign-jumps
-falign-loops -falign-labels
-fcrossjumping
Please note the warning under -fgcse about invoking -O2 on programs that use computed gotos.
-O3: 在O2的基础上进行更多的优化。例如使用伪寄存器网络,普通函数的内联,以及针对循环的更多优化。在包含了O2所有的优化的基础上,又打开了以下优化选项:
l -finline-functions:内联简单的函数到被调用函数中。
l -fweb:构建用于保存变量的伪寄存器网络。 伪寄存器包含数据, 就像他们是寄存器一样, 但是可以使用各种其他优化技术进行优化, 比如cse和loop优化技术。这种优化会使得调试变得更加的不可能,因为变量不再存放于原本的寄存器中。
l -frename-registers:在寄存器分配后,通过使用registers left over来避免预定代码中的虚假依赖。这会使调试变得非常困难,因为变量不再存放于原本的寄存器中了。
l -funswitch-loops:将无变化的条件分支移出循环,取而代之的将结果副本放入循环中。
-Os:相当于-O2.5。是使用了所有-O2的优化选项,但又不缩减代码尺寸的方法。
Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
-Os disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops
-falign-labels -freorder-blocks -fprefetch-loop-arrays
If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.