Effective C# Item 31: Prefer Small, Simple Functions

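As an experienced programmer, whatever language you used before C#, you have absorbed a number of practices for writing efficient code. Sometimes the effort that paid off in an earlier environment works against you in .NET, and this is especially true when you try to optimize code by hand. Such changes often keep the JIT compiler from applying its most effective optimizations, so the extra work you did in the name of performance actually produces slower code. You are better off writing the clearest code you can and leaving the rest to the JIT compiler. The most common example is premature optimization: you write a long, complicated function in the hope of avoiding function calls, and it ends up causing problems. In practice, hoisting a function's logic into a loop body hurts a .NET program. This runs against intuition, so let's look at the details.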

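The introduction to this chapter gives a simplified account of how the JIT compiler works. The .NET runtime calls the JIT compiler to turn the IL produced by the C# compiler into machine code. That work is spread across the running time of the application. Rather than compiling the whole application at startup, the CLR invokes the JIT compiler one function at a time. This keeps the startup cost at a reasonable level and also keeps the application from becoming unresponsive later, when more code has to be compiled. Functions that are never called are never compiled. You can reduce the amount of unneeded code that gets compiled by splitting your code into more, smaller pieces; in other words, many small functions are better than a few large ones. Consider this contrived example: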


public string BuildMsg( bool takeFirstPath )
{
  StringBuilder msg = new StringBuilder( );
  if ( takeFirstPath )
  {
    msg.Append( "A problem occurred." );
    msg.Append( "\nThis is a problem." );
    msg.Append( "imagine much more text" );
  } else
  {
    msg.Append( "This path is not so bad." );
    msg.Append( "\nIt is only a minor inconvenience." );
    msg.Append( "Add more detailed diagnostics here." );
  }
  return msg.ToString( );
}

The first time BuildMsg is called, both branches are compiled, although only one is needed. But suppose you wrote the code this way:


public string BuildMsg( bool takeFirstPath )
{
  if ( takeFirstPath )
  {
    return FirstPath( );
  } else
  {
    return SecondPath( );
  }
}

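Because the body of each branch has been factored into its own small function, the JIT compiler compiles each of those functions only when it is needed, rather than on the first call to BuildMsg. Admittedly this example is contrived, and in practice it makes little difference. But think about how often you write more expensive versions: an if statement with 20 or more statements in each branch. You pay for the JIT compiler to compile both branches the first time the function is called. If one branch is an unlikely error condition, you incur a cost you could easily have avoided. Small functions mean the JIT compiler compiles only the logic it needs, not long stretches of code that will not be used right away. For a long switch statement whose case bodies are written inline, the savings from factoring each case into its own function multiply.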

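Small, simple functions also make enregistration easier for the JIT compiler. Enregistration is the process of choosing which local variables can be kept in registers rather than on the stack. Creating fewer local variables gives the JIT compiler a better chance of putting the best candidates in registers. Simple control flow likewise affects how well the JIT compiler can enregister variables. If a function has a single loop, the loop variable is likely to be enregistered. When a function contains several loops, however, the JIT compiler has to make some hard choices about which variables to enregister. Simpler is better: a small, simple function tends to have only a few variables, which makes it easy for the JIT compiler to optimize register use.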

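The JIT compiler also decides which methods to inline. Inlining means substituting the body of a function for the call to it. Consider this example: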


// readonly name property:
private string _name;
public string Name
{
  get
  {
    return _name;
  }
}

// access:
string val = Obj.Name;

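The body of the property accessor contains fewer instructions than the overhead of calling the function: saving register state, running the method prologue and epilogue, and storing the return value. If there were arguments, pushing them on the stack would take still more work. You would get far fewer machine instructions if you wrote this instead: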

string val = Obj._name;

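Of course, you would not do that, because you know better than to create public data members (see Item 1). The JIT compiler understands that you want both efficiency and clean code, so it inlines the property accessor. The JIT compiler inlines a method when replacing the call with the function body is a win for speed, size, or both. The standard does not define exact rules for inlining, and any existing implementation may change in the future. Besides, inlining functions is not your job. The C# language gives you no keyword for telling the compiler that you would like a function inlined, and the C# compiler passes no inlining hints to the JIT compiler. What you can do is keep your code as clear as possible so that the JIT compiler can easily make the best decision. My recommendation should sound familiar by now: the smaller the method, the more likely it is to be inlined. Keep in mind that virtual methods and functions containing try/catch blocks cannot be inlined.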

Inlining modifies the rule that code is JIT-compiled when it is about to run. Consider the Name property access again:

string val = "Default Name";
if ( Obj != null )
  val = Obj.Name;

If the JIT compiler inlines the property accessor, it has to compile that code when the containing method is called.

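It is not your responsibility to work out the best machine-level representation of your algorithm. The C# compiler and the JIT compiler do that for you together. The C# compiler generates IL for each method, and the JIT compiler translates that IL into machine instructions on the target machine. Do not worry too much about the exact rules the JIT compiler follows in every case; they will change over time as better algorithms are developed. Instead, think about how to express your algorithm in a way that lets the tools in the environment do their best work. Fortunately, those rules (translator's note: the rules the JIT works by) coincide with good software development practice. Once more: use small, simple functions.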

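Remember that your C# code becomes machine-executable instructions in two steps. The C# compiler generates IL, delivered in assemblies. The JIT compiler then generates machine code as needed, one function at a time (or one group of methods, when calls are inlined). Small functions make it much easier for the JIT compiler to spread that cost out. Small functions are also more likely to be inlining candidates. Size is not the whole story, though: simple control flow matters too. Fewer control branches inside a function make it easier for the JIT compiler to enregister variables. This is not only about writing clear code; it is also how you create code that is more efficient at runtime.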

================================
   

Item 31: Prefer Small, Simple Functions
As experienced programmers, in whatever language we favored before C#, we internalized several practices for developing more efficient code. Sometimes what worked in our previous environment is counterproductive in the .NET environment. This is especially true when you try to hand-optimize algorithms for the C# compiler. Your actions often prevent the JIT compiler from applying more effective optimizations. Your extra work, in the name of performance, actually generates slower code. You're better off writing the clearest code you can create. Let the JIT compiler do the rest. One of the most common examples of premature optimization causing problems is when you create longer, more complicated functions in the hope of avoiding function calls. Practices such as hoisting function logic into the bodies of loops actually harm the performance of your .NET applications. It's counterintuitive, so let's go over all the details.

This chapter's introduction contains a simplified discussion of how the JIT compiler performs its work. The .NET runtime invokes the JIT compiler to translate the IL generated by the C# compiler into machine code. This task is amortized across the lifetime of your program's execution. Instead of JITing your entire application when it starts, the CLR invokes the JITer on a function-by-function basis. This minimizes the startup cost to a reasonable level, yet keeps the application from becoming unresponsive later when more code needs to be JITed. Functions that do not ever get called do not get JITed. You can minimize the amount of extraneous code that gets JITed by factoring code into more, smaller functions rather than fewer larger functions. Consider this rather contrived example:

public string BuildMsg( bool takeFirstPath )
{
  StringBuilder msg = new StringBuilder( );
  if ( takeFirstPath )
  {
    msg.Append( "A problem occurred." );
    msg.Append( "\nThis is a problem." );
    msg.Append( "imagine much more text" );
  } else
  {
    msg.Append( "This path is not so bad." );
    msg.Append( "\nIt is only a minor inconvenience." );
    msg.Append( "Add more detailed diagnostics here." );
  }
  return msg.ToString( );
}

 

The first time BuildMsg gets called, both paths are JITed. Only one is needed. But suppose you rewrote the function this way:

public string BuildMsg( bool takeFirstPath )
{
  if ( takeFirstPath )
  {
    return FirstPath( );
  } else
  {
    return SecondPath( );
  }
}
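
The rewritten BuildMsg above calls FirstPath and SecondPath without showing them. A minimal sketch of what they might look like follows; the MessageBuilder class name and both helper bodies are assumptions for illustration, reusing the strings from the first version of BuildMsg.

```csharp
using System;
using System.Text;

// Hypothetical container class; the original text does not name one.
public class MessageBuilder
{
    // Small dispatcher: each helper is compiled only when it is first called.
    public string BuildMsg(bool takeFirstPath)
    {
        if (takeFirstPath)
        {
            return FirstPath();
        }
        else
        {
            return SecondPath();
        }
    }

    // Assumed helper: the body of the original 'if' branch.
    private string FirstPath()
    {
        StringBuilder msg = new StringBuilder();
        msg.Append("A problem occurred.");
        msg.Append("\nThis is a problem.");
        msg.Append("imagine much more text");
        return msg.ToString();
    }

    // Assumed helper: the body of the original 'else' branch.
    private string SecondPath()
    {
        StringBuilder msg = new StringBuilder();
        msg.Append("This path is not so bad.");
        msg.Append("\nIt is only a minor inconvenience.");
        msg.Append("Add more detailed diagnostics here.");
        return msg.ToString();
    }
}
```

A caller that only ever passes false, for example new MessageBuilder().BuildMsg(false), never triggers compilation of FirstPath.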

 

Because the body of each clause has been factored into its own function, that function can be JITed on demand rather than the first time BuildMsg is called. Yes, this example is contrived for space, and it won't make much difference. But consider how often you write more extensive examples: an if statement with 20 or more statements in both branches of the if statement. You'll pay to JIT both clauses the first time the function is entered. If one clause is an unlikely error condition, you'll incur a cost that you could easily avoid. Smaller functions mean that the JIT compiler compiles the logic that's needed, not lengthy sequences of code that won't be used immediately. The JIT cost savings multiply for long switch statements in which the body of each case is defined inline rather than in a separate function.
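
The same factoring applies to switch statements. The sketch below is only an illustration: the Command enum, the CommandProcessor class, and the handler names are all hypothetical. Each case delegates to a small method, so a handler is compiled only if its case is actually reached.

```csharp
using System;

// Hypothetical command set, used only for this illustration.
public enum Command { Open, Save, Close }

public class CommandProcessor
{
    // The switch stays tiny; the real work lives in per-case methods.
    public string Execute(Command command)
    {
        switch (command)
        {
            case Command.Open:
                return HandleOpen();
            case Command.Save:
                return HandleSave();
            case Command.Close:
                return HandleClose();
            default:
                throw new ArgumentOutOfRangeException(nameof(command));
        }
    }

    // Imagine each handler holding the 20 or more statements that would
    // otherwise sit inline in its case.
    private string HandleOpen() { return "opened"; }
    private string HandleSave() { return "saved"; }
    private string HandleClose() { return "closed"; }
}
```

A program that only ever issues Command.Save pays to compile Execute and HandleSave, not the other handlers.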

Smaller and simpler functions make it easier for the JIT compiler to support enregistration. Enregistration is the process of selecting which local variables can be stored in registers rather than on the stack. Creating fewer local variables gives the JIT compiler a better chance to find the best candidates for enregistration. The simplicity of the control flow also affects how well the JIT compiler can enregister variables. If a function has one loop, that loop variable will likely be enregistered. However, the JIT compiler must make some tough choices about enregistering loop variables when you create a function with several loops. Simpler is better. A smaller function is more likely to contain fewer local variables and make it easier for the JIT compiler to optimize the use of the registers.
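
As a rough illustration of the point about loops and local variables, the sketch below splits work that could have been one function with several loops into functions with one loop each. The Stats class and its method names are hypothetical, and whether a given variable actually ends up in a register depends on the JIT compiler and the target machine; nothing here forces it.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class Stats
{
    // One loop and two locals (total, i): few candidates compete for registers.
    public static int Sum(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Again one loop and two locals (max, i).
    public static int Max(int[] values)
    {
        if (values.Length == 0)
        {
            throw new ArgumentException("values must not be empty", nameof(values));
        }
        int max = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i] > max)
            {
                max = values[i];
            }
        }
        return max;
    }

    // Composes the small functions instead of holding every loop and
    // every local in a single body.
    public static string Describe(int[] values)
    {
        return "sum=" + Sum(values) + ", max=" + Max(values);
    }
}
```

The trade-off is two passes over the array instead of one; the sketch favors the simpler per-function control flow that the text recommends.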

The JIT compiler also makes decisions about inlining methods. Inlining means to substitute the body of a function for the function call. Consider this example:

// readonly name property:
private string _name;
public string Name
{
  get
  {
    return _name;
  }
}

// access:
string val = Obj.Name;

 

The body of the property accessor contains fewer instructions than the code necessary to call the function: saving register states, executing method prologue and epilogue code, and storing the function return value. There would be even more work if arguments needed to be pushed on the stack as well. There would be far fewer machine instructions if you were to write this:

string val = Obj._name;

 

Of course, you would never do that because you know better than to create public data members (see Item 1). The JIT compiler understands your need for both efficiency and elegance, so it inlines the property accessor. The JIT compiler inlines methods when the speed or size benefits (or both) make it advantageous to replace a function call with the body of the called function. The standard does not define the exact rules for inlining, and any implementation could change in the future. Moreover, it's not your responsibility to inline functions. The C# language does not even provide you with a keyword to give a hint to the compiler that a method should be inlined. In fact, the C# compiler does not provide any hints to the JIT compiler regarding inlining. All you can do is ensure that your code is as clear as possible, to make it easier for the JIT compiler to make the best decision possible. The recommendation should be getting familiar by now: Smaller methods are better candidates for inlining. But remember that even small functions that are virtual or that contain try/catch blocks cannot be inlined.
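
C# still has no inlining keyword, but the class library does offer an attribute that this item does not cover: MethodImplAttribute with MethodImplOptions.NoInlining, and, from .NET Framework 4.5 onward, MethodImplOptions.AggressiveInlining. The sketch below shows how they are written; the Person class and its members are hypothetical. AggressiveInlining is a request that the JIT compiler may still decline, so the advice to keep methods small stands either way.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class, used only for this illustration.
public class Person
{
    private readonly string _name;

    public Person(string name)
    {
        _name = name;
    }

    // Small, non-virtual, no try/catch: a natural inlining candidate
    // without any attribute at all.
    public string Name
    {
        get { return _name; }
    }

    // Asks the JIT compiler to inline if it can; it is not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int NameLength()
    {
        return _name.Length;
    }

    // Asks the JIT compiler not to inline this method, for example to
    // keep it visible as its own frame in stack traces.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public string Describe()
    {
        return "Person: " + _name;
    }
}
```

Neither attribute changes what the methods return; they only influence how the call is compiled.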

Inlining modifies the principle that code gets JITed when it will be executed. Consider accessing the name property again:

string val = "Default Name";
if ( Obj != null )
  val = Obj.Name;

 

If the JIT compiler inlines the property accessor, it must JIT that code when the containing method is called.

It's not your responsibility to determine the best machine-level representation of your algorithms. The C# compiler and the JIT compiler together do that for you. The C# compiler generates the IL for each method, and the JIT compiler translates that IL into machine code on the destination machine. You should not be too concerned about the exact rules the JIT compiler uses in all cases; those will change over time as better algorithms are developed. Instead, you should be concerned about expressing your algorithms in a manner that makes it easiest for the tools in the environment to do the best job they can. Luckily, those rules are consistent with the rules you already follow for good software-development practices. One more time: write smaller and simpler functions.

Remember that translating your C# code into machine-executable code is a two-step process. The C# compiler generates IL that gets delivered in assemblies. The JIT compiler generates machine code for each method (or group of methods, when inlining is involved), as needed. Small functions make it much easier for the JIT compiler to amortize that cost. Small functions are also more likely to be candidates for inlining. It's not just smallness: Simpler control flow matters just as much. Fewer control branches inside functions make it easier for the JIT compiler to enregister variables. It's not just good practice to write clearer code; it's how you create more efficient code at runtime.
 
   
