Can functions make Python code run faster? Why does Python code run faster in a function?

Aside from local/global variable store times, opcode prediction makes the function faster.

As the other answers explain, the function uses the STORE_FAST opcode in its loop. Here is the bytecode for the function's loop:

    >>   13 FOR_ITER         6 (to 22)   # get next value from iterator
         16 STORE_FAST       0 (x)       # set local variable
         19 JUMP_ABSOLUTE   13           # back to FOR_ITER

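Normally, when a program runs, Python executes each opcode one after the other, keeping track of the stack and performing other checks on the stack frame after each opcode is executed. Opcode prediction means that in certain cases Python can jump directly to the next opcode, avoiding some of this overhead.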

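In this case, every time Python sees FOR_ITER (the top of the loop), it "predicts" that STORE_FAST is the next opcode it has to execute. Python then peeks at the next opcode and, if the prediction was correct, jumps straight to STORE_FAST. This has the effect of squeezing the two opcodes into a single opcode.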

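On the other hand, the loop at the global level uses the STORE_NAME opcode. Python does not make a similar prediction when it sees this opcode. Instead, it must go back to the top of the evaluation loop, which has a noticeable effect on how fast the loop executes.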
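The difference in opcodes can be checked directly with the standard `dis` module. The following is a minimal sketch: the helper name `store_opnames` and the loop body are made up for illustration, and the exact bytecode listing varies between CPython versions, so only the STORE_* opcode names are collected here.

```python
import dis


def loop_in_function():
    # Inside a function, x is a local variable.
    for x in range(10):
        pass


# The same loop compiled as module-level (global scope) code.
module_code = compile("for x in range(10):\n    pass\n", "<module-level>", "exec")


def store_opnames(code):
    """Return the names of all STORE_* opcodes found in a function or code object."""
    return [
        ins.opname
        for ins in dis.get_instructions(code)
        if ins.opname.startswith("STORE_")
    ]


function_stores = store_opnames(loop_in_function)
module_stores = store_opnames(module_code)

print("function:", function_stores)
print("module:  ", module_stores)
```

The function version stores the loop variable with STORE_FAST, while the module-level version stores it with STORE_NAME.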

To give some more technical detail about this optimization, here is a quote from the ceval.c file (the "engine" of Python's virtual machine):

    Some opcodes tend to come in pairs thus making it possible to
    predict the second code when the first is run. For example,
    GET_ITER is often followed by FOR_ITER. And FOR_ITER is often
    followed by STORE_FAST or UNPACK_SEQUENCE.

    Verifying the prediction costs a single high-speed test of a register
    variable against a constant. If the pairing was good, then the
    processor's own internal branch predication has a high likelihood of
    success, resulting in a nearly zero-overhead transition to the
    next opcode. A successful prediction saves a trip through the eval-loop
    including its two unpredictable branches, the HAS_ARG test and the
    switch-case. Combined with the processor's internal branch prediction,
    a successful PREDICT has the effect of making the two opcodes run as if
    they were a single new opcode with the bodies combined.

In the source code for the FOR_ITER opcode, we can see exactly where the prediction for STORE_FAST is made:

    case FOR_ITER:                         // the FOR_ITER opcode case
        v = TOP();
        x = (*v->ob_type->tp_iternext)(v); // x is the next value from iterator
        if (x != NULL) {
            PUSH(x);                       // put x on top of the stack
            PREDICT(STORE_FAST);           // predict STORE_FAST will follow - success!
            PREDICT(UNPACK_SEQUENCE);      // this and everything below is skipped
            continue;
        }
        // error-checking and more code for when the iterator ends normally

The PREDICT macro (it is a macro, not a function) expands to if (*next_instr == op) goto PRED_##op, i.e. we simply jump to the start of the predicted opcode. In this case, we jump here:

    PREDICTED_WITH_ARG(STORE_FAST);
    case STORE_FAST:
        v = POP();                     // pop x back off the stack
        SETLOCAL(oparg, v);            // set it as the new local variable
        goto fast_next_opcode;
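To make the saving concrete, here is a toy model of the dispatch loop written in Python. It is only a sketch of the idea, not CPython's implementation: the `run` function, the tuple-based `PROGRAM`, and the `dispatches` counter are all invented for illustration. It counts how many trips through the dispatch loop are needed with and without a PREDICT-style shortcut from FOR_ITER to STORE_FAST.

```python
def run(program, values, use_prediction):
    """Run a toy FOR_ITER / STORE_FAST / JUMP_ABSOLUTE loop.

    Returns (last value stored, number of trips through the dispatch loop).
    """
    iterator = iter(values)
    stack = []
    local_x = None
    dispatches = 0
    pc = 0
    while pc < len(program):
        dispatches += 1              # one full trip through the "eval loop"
        op, arg = program[pc]
        pc += 1
        if op == "FOR_ITER":
            try:
                stack.append(next(iterator))
            except StopIteration:
                pc = arg             # iterator exhausted: jump past the loop
                continue
            # Toy PREDICT(STORE_FAST): peek at the next instruction and, if it
            # matches, run its body right away without another dispatch.
            if use_prediction and program[pc][0] == "STORE_FAST":
                local_x = stack.pop()
                pc += 1
        elif op == "STORE_FAST":
            local_x = stack.pop()
        elif op == "JUMP_ABSOLUTE":
            pc = arg
    return local_x, dispatches


PROGRAM = [
    ("FOR_ITER", 3),       # index 0: get next value, or jump to index 3 at the end
    ("STORE_FAST", 0),     # index 1: store into the local variable
    ("JUMP_ABSOLUTE", 0),  # index 2: back to FOR_ITER
]

print(run(PROGRAM, range(5), use_prediction=False))  # (4, 16)
print(run(PROGRAM, range(5), use_prediction=True))   # (4, 11)
```

With five items, the unpredicted loop makes three dispatches per iteration plus one final FOR_ITER (16 in total). The predicted loop folds STORE_FAST into FOR_ITER and needs only two per iteration plus the final one (11 in total).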

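Now the local variable is set and the next opcode is ready to execute. Python continues through the iterable until it reaches the end, making the successful prediction each time.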
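The overall effect can be measured with the standard `timeit` module. The sketch below is a rough benchmark under stated assumptions: the helper names and the loop size of 100,000 are arbitrary, and the module-level case is emulated by running compiled code with `exec` in a fresh globals dictionary. The gap it shows combines the cheaper local-variable store with any dispatch savings, and its size depends on the CPython version and the machine, so no expected numbers are given.

```python
import timeit

LOOP_SOURCE = "for x in range(100_000):\n    pass\n"


def loop_in_function():
    # x is a local variable here, so it is stored with STORE_FAST.
    for x in range(100_000):
        pass


def time_function(repeats=5):
    """Best-of-N time in seconds for the loop inside a function."""
    return min(timeit.repeat(loop_in_function, number=1, repeat=repeats))


def time_module_level(repeats=5):
    """Best-of-N time in seconds for the same loop run as module-level code."""
    code = compile(LOOP_SOURCE, "<module-level>", "exec")
    # exec with a globals dict runs the code at global scope (STORE_NAME).
    return min(timeit.repeat(lambda: exec(code, {}), number=1, repeat=repeats))


if __name__ == "__main__":
    print(f"in function:  {time_function():.6f} s")
    print(f"module level: {time_module_level():.6f} s")
```

Taking the minimum over several repeats reduces noise from other processes; it does not separate the individual causes of the difference.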

The Python wiki page has more information about how CPython's virtual machine works.
