Reposted from "Effective C# 49: Prepare for C# 2.0 (translation)"
Item 49: Prepare for C# 2.0
C# 2.0, available in 2005, will have some major new features in the C# language. Some of today's best practices will change with the tools that will be available in the next release. Although you might not be using these features just yet, you should prepare for them.
When Visual Studio .NET 2005 is released, you will get a new, upgraded C# language. The additions to the language are sure to make you a more productive developer: You'll be able to write more reusable code and higher-level constructs in fewer lines of source. All in all, you'll get more done faster.
C# 2.0 has four major new features: generics, iterators, anonymous methods, and partial types. The focus of these new features is to increase your productivity as a C# developer. This item discusses three of those features and how you should prepare for them now. Generics will have more impact on how you develop software than any of the other new features in C#. The generics feature is not specific to C#. To implement C# generics, Microsoft is extending the CLR and Microsoft Intermediate Language (MSIL) as well as the C# language. C#, Managed C++, and VB .NET will be capable of creating generics. J# will be capable of consuming them.
Generics provide "parametric polymorphism," which is a fancy way of saying that you create a series of similar classes from a single source. The compiler generates different versions when you provide a specific type for a generic parameter. You use generics to build algorithms that are parameterized with respect to the structures they act upon. You can find great candidates for generics in the System.Collections namespace: Hashtable, ArrayList, Queue, and Stack all can store different object types without affecting their implementation. These collections are such good candidates that the 2.0 release of the .NET Framework will include the System.Collections.Generic namespace containing generic counterparts for all the current collection classes. The C# 1.0 collections store references to the System.Object type. Although the current design is reusable for all types, it has many deficiencies and is not type-safe. Consider this code:
ArrayList myIntList = new ArrayList( );
myIntList.Add(32 );
myIntList.Add(98.6 );
myIntList.Add("Bill Wagner" );
This compiles just fine, but it almost certainly is not the intent. Did you really create a design that calls for a container that holds totally disparate items? Or were you working around a limitation in the language? This practice means that when you remove items from the collection, you must add extra code to determine what kind of objects were put on the list in the first place. In all cases, you need to cast items from System.Object to the particular type you placed on the list.
But that's not all. Value types pay a particular penalty when they are placed in these 1.0-style collections. Anytime you put a value type in a collection, you must store it in a box. You pay again to remove the item from the box when you access an element in the collection. This penalty is small, but with large collections of thousands of items, it adds up quickly. Generics remove this penalty by generating specific object code for each value type.
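To make the cost concrete, here is a minimal sketch that contrasts the two styles. It assumes the System.Collections.Generic.List<T> class that shipped with .NET 2.0; the names CollectionDemo, SumUntyped, and SumTyped are invented for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionDemo
{
    // 1.x style: every element is stored as a System.Object reference.
    public static int SumUntyped()
    {
        ArrayList items = new ArrayList();
        items.Add(32);            // boxes the int
        items.Add("Bill Wagner"); // compiles, although it is almost certainly a mistake

        int sum = 0;
        foreach (object item in items)
        {
            // The caller must check the runtime type before casting.
            if (item is int)
                sum += (int)item; // unboxes
        }
        return sum;
    }

    // 2.0 style: the element type is part of the collection type.
    public static int SumTyped()
    {
        List<int> items = new List<int>();
        items.Add(32);               // stored as an int, no boxing
        items.Add(98);
        // items.Add("Bill Wagner"); // would not compile

        int sum = 0;
        foreach (int item in items)
            sum += item;
        return sum;
    }
}
```

In the untyped version the stray string is discovered only at runtime, and only because the loop checks for it. In the typed version the same mistake cannot be written at all.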
Those of you familiar with C++ templates will have no trouble working with C# generics because the syntax is very similar. The inner workings for generics, however, are quite different. Let's look at one simple example to see how generics work and how they are implemented. Consider this portion of a list class:
public class List
{
  internal class Node
  {
    internal object val;
    internal Node next;
  }
  private Node first;
  public void AddHead( object t )
  {
    // ...
  }
  public object Head()
  {
    return first.val;
  }
}
This code stores System.Object references in its collection. Anytime you use it, you must add casts on the objects accessed from the collection. But using C# generics, you define the same class like this:
public class List < ItemType >
{
  // The nested class uses the outer type parameter directly.
  // Redeclaring ItemType on Node would hide the outer parameter.
  private class Node
  {
    internal ItemType val;
    internal Node next;
  }
  private Node first;
  public void AddHead( ItemType t )
  {
    // ...
  }
  public ItemType Head( )
  {
    return first.val;
  }
}
You replace object with ItemType, the parameter type in the class definition. The C# compiler replaces ItemType with the proper type when you instantiate the list. For example, take a look at this code:
List < int > intList = new List < int >();
The generated MSIL specifies that intList stores integers, and only integers. Generics have several advantages over the implementations you can create today. For starters, the C# compiler reports compile-time errors if you attempt to put anything but an integer in the collection; today, you need to catch those errors by testing the code at runtime.
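The listing above leaves AddHead empty, so here is one way it might be filled in, as a sketch rather than the author's implementation. SimpleList and SimpleListDemo are hypothetical names, chosen so the class does not clash with System.Collections.Generic.List<T>; the empty-list check is also an addition.

```csharp
using System;

// SimpleList is a hypothetical stand-in for the List < ItemType > class above.
public class SimpleList<ItemType>
{
    // The nested class can use the outer type parameter directly.
    private class Node
    {
        internal ItemType val;
        internal Node next;
    }

    private Node first;

    // Pushes a new node onto the front of the list.
    public void AddHead(ItemType t)
    {
        Node n = new Node();
        n.val = t;
        n.next = first;
        first = n;
    }

    // Returns the most recently added item without removing it.
    public ItemType Head()
    {
        if (first == null)
            throw new InvalidOperationException("The list is empty.");
        return first.val;
    }
}

public static class SimpleListDemo
{
    public static int HeadOfInts()
    {
        SimpleList<int> intList = new SimpleList<int>();
        intList.AddHead(1);
        intList.AddHead(2);
        // intList.AddHead("three"); // compile-time error: not an int
        return intList.Head();       // no cast needed
    }
}
```

Calling SimpleListDemo.HeadOfInts() returns 2, the last value added, and the call site contains no casts.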
In C# 1.0, you pay the boxing and unboxing penalty whenever you move a value type into or out of a collection that stores System.Object references. Using generics, the JIT compiler creates a specific instance of the collection that stores a particular value type; you don't need to box or unbox the items. But there's more to it. The C# designers want to avoid the code bloat often associated with C++ templates. To save space, the JIT compiler generates only one version of the type for all reference types. This provides a size/speed trade-off whereby value types get a specific version of each type (avoiding boxing), and reference types share a single runtime version storing System.Object (avoiding code bloat). The compiler still reports errors when the wrong reference type is used with these collections.
To implement generics, the CLR and the MSIL language undergo some changes. When you compile a generic class, MSIL contains placeholders for each parameterized type. Consider these two method declarations in MSIL:
.method public AddHead (!0 t) {
}
.method public !0 Head () {
}
!0 is a placeholder for a type to be created when a particular instantiation is declared and created. Here's one possible replacement:
.method public AddHead (System.Int32 t) {
}
.method public System.Int32 Head () {
}
Similarly, variable instantiations contain the specific type. The previous declaration for a list of integers becomes this:
.locals (class List<int>)
newobj void List<int>::.ctor ()
This illustrates the way the C# compiler and the JIT compiler work together for generics. The C# compiler generates MSIL that contains placeholders for each type parameter. The JIT compiler turns these placeholders into specific types: either System.Object for all reference types, or specific value types for each value type. Each variable instantiation of a generic type includes type information so the C# compiler can enforce type safety.
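The type information carried by each instantiation can be observed through reflection. The sketch below assumes the .NET 2.0 List<T> class and the Type.GetGenericTypeDefinition method; InstantiationDemo and its members are hypothetical names. It shows only the type identities, not how the JIT compiler shares or specializes machine code.

```csharp
using System;
using System.Collections.Generic;

public static class InstantiationDemo
{
    // Each set of type arguments yields a distinct runtime type.
    public static bool DistinctClosedTypes()
    {
        return typeof(List<int>) != typeof(List<string>);
    }

    // Both instantiations come from the same generic definition,
    // the one whose MSIL contains the !0 placeholder.
    public static bool SameGenericDefinition()
    {
        return typeof(List<int>).GetGenericTypeDefinition() == typeof(List<>)
            && typeof(List<string>).GetGenericTypeDefinition() == typeof(List<>);
    }

    // Inside generic code, the actual type argument is known at runtime.
    public static Type ElementTypeOf<T>(List<T> list)
    {
        return typeof(T);
    }
}
```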
Constraint definitions for generics will have a large impact on how you prepare for generics. Remember that a specific instantiation of a generic runtime class does not get created until the CLR loads and creates that instantiation at runtime. To generate MSIL for all possible instantiations of a generic class, the compiler needs to know the capabilities of the parameterized type in the generic classes you create. The C# solution for this problem is constraints. Constraints declare expected capabilities on the parameterized type. Consider a generic implementation of a binary tree. Binary trees store objects in sorted order; therefore, a binary tree can store only types that implement IComparable. You can specify this requirement using constraints:
public class BinaryTree < ValType >
  where ValType : IComparable < ValType >
{
}
Using this definition, any instantiation of BinaryTree using a class that does not support the IComparable interface won't compile. You can specify multiple constraints. Suppose that you want to limit your BinaryTree to objects that support ISerializable. You simply add more constraints. Notice that interfaces and constraints can be generic types as well:
// All constraints on one type parameter share a single where clause.
public class BinaryTree < ValType >
  where ValType : IComparable < ValType >, ISerializable
{
}
You can specify one base class and any number of interfaces as a set of constraints for each parameterized type. In addition, you can specify that a type must have a parameterless constructor.
Constraints also provide one more advantage: The compiler assumes that the objects in your generic class support any interfaces (or base class methods) specified in the constraint list. In the absence of any constraints, the compiler assumes only the methods defined in System.Object. You would need to add casts to use any other method. Whenever you use a method that is not defined in System.Object, you should document those requirements in a set of constraints.
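As a sketch of what the constraint buys you, the tree below calls CompareTo on its items without any cast. The Add and Min methods, the Node class, and BinaryTreeDemo are assumptions added for illustration; the source shows only the empty class declaration.

```csharp
using System;

public class BinaryTree<ValType> where ValType : IComparable<ValType>
{
    private class Node
    {
        internal ValType val;
        internal Node left;
        internal Node right;
    }

    private Node root;

    // Inserts an item, keeping smaller values in the left subtree.
    public void Add(ValType item)
    {
        Node n = new Node();
        n.val = item;
        if (root == null) { root = n; return; }

        Node current = root;
        while (true)
        {
            // CompareTo is available without a cast because of the constraint.
            if (item.CompareTo(current.val) < 0)
            {
                if (current.left == null) { current.left = n; return; }
                current = current.left;
            }
            else
            {
                if (current.right == null) { current.right = n; return; }
                current = current.right;
            }
        }
    }

    // Returns the smallest value by walking the left edge of the tree.
    public ValType Min()
    {
        if (root == null)
            throw new InvalidOperationException("The tree is empty.");
        Node current = root;
        while (current.left != null)
            current = current.left;
        return current.val;
    }
}

public static class BinaryTreeDemo
{
    public static int SmallestOf(params int[] values)
    {
        BinaryTree<int> tree = new BinaryTree<int>();
        foreach (int v in values)
            tree.Add(v);
        return tree.Min();
    }
}
```

BinaryTree<int> compiles because System.Int32 implements IComparable<int>. A declaration such as BinaryTree<object> is rejected at compile time, because System.Object does not implement IComparable<object>.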
Constraints point out yet another reason to use interfaces liberally (see Item 19): It will be relatively easy to define constraints if you have defined your functionality using interfaces.
Iterators are a new syntax to create a common idiom using much less code. Imagine that you create some specialized new container class. To support your users, you need to create methods that support traversing this collection and returning the objects in the collection.
Today, you would do this by creating a class that implements IEnumerator. IEnumerator contains two methods, Reset and MoveNext, and one property: Current. In addition, you would add IEnumerable to the list of implemented interfaces on your collection, and its GetEnumerator method would return an IEnumerator for your collection. By the time you're done, you have written an extra class with at least three functions, as well as some state management and another method in your main class. To illustrate this, you must write this page of code today to handle list enumeration:
public class List : IEnumerable
{
  internal class ListEnumerator : IEnumerator
  {
    List theList;
    int pos = -1;
    internal ListEnumerator( List l )
    {
      theList = l;
    }
    public object Current
    {
      get
      {
        return theList [ pos ];
      }
    }
    public bool MoveNext( )
    {
      pos++;
      return pos < theList.Length;
    }
    public void Reset( )
    {
      pos = -1;
    }
  }
  public IEnumerator GetEnumerator()
  {
    return new ListEnumerator( this );
  }
  // Other methods removed.
}
C# 2.0 adds new syntax in the yield keyword that lets you write these iterators more concisely. Here is the C# 2.0 version of the previous code:
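public class List : IEnumerable
{
  // An iterator method must return IEnumerator or IEnumerable,
  // and each element is produced with "yield return".
  public IEnumerator GetEnumerator()
  {
    int i = 0;
    while ( i < Length )
      yield return this [ i++ ];
  }
  // Other methods removed.
}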
The yield statement lets you replace roughly 30 lines of code with only 6. This means fewer bugs, less development time, and less source code to maintain, all good things.
Internally, the compiler generates the MSIL that corresponds to those 30 lines of code in today's version. The compiler does it so you don't have to. The compiler generates a class that implements the IEnumerator interface and adds it to your list of supported interfaces.
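A self-contained sketch of an iterator follows. Note that in C# 2.0 as shipped, the statement is written yield return and the method must return IEnumerable, IEnumerator, or one of their generic forms. The Countdown class and the CountdownDemo helper are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public class Countdown
{
    private readonly int start;

    public Countdown(int start)
    {
        this.start = start;
    }

    // The compiler turns this method into a hidden class that implements
    // IEnumerable<int> and IEnumerator<int>, including the position tracking
    // that ListEnumerator does by hand with its pos field.
    public IEnumerable<int> Values()
    {
        for (int i = start; i > 0; i--)
            yield return i;
    }
}

public static class CountdownDemo
{
    // Joins the enumerated values with commas, for example "3,2,1".
    public static string Render(int start)
    {
        List<string> parts = new List<string>();
        foreach (int value in new Countdown(start).Values())
            parts.Add(value.ToString());
        return string.Join(",", parts.ToArray());
    }
}
```

The loop variable i plays the role of the hand-written pos field: the generated class preserves it between calls to MoveNext, so the method body reads like an ordinary loop.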
The last major new feature is partial types. Partial types let you split a C# class implementation into more than one source file. You rarely, if ever, will use this feature yourself to create multiple source files in your daily development. Microsoft proposed this change to C# to support IDEs and code generators. Today, you get a region in your form classes that contains all the code created by the VS .NET designer. In the future, these tools should create partial classes and place the code in a separate file.
To use this feature, you add the partial keyword to your class declaration:
public partial class Form1
{
  // Wizard Code:
  private void InitializeComponent()
  {
    // wizard code...
  }
}
// In another file:
public partial class Form1
{
  public void Method ()
  {
    // etc...
  }
}
Partial types do have some limitations. They are a source-only feature: there is no difference in the MSIL generated from a single source file and one generated from multiple files. You also need to compile all files that form a complete type into the same assembly, and there is no automated way to ensure that you have added all the source files that form a complete class definition to your builds. You can introduce any number of problems when you split your class definition into multiple files, so I recommend that you use this feature only when IDEs generate the source using partial types. This includes forms, as I've shown earlier. VS .NET also generates partial types for typed DataSets (see Item 41) and web service proxies, so you can add your own members to those classes.
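The sketch below shows how the parts of a partial type combine into one class. Both parts appear in a single listing so that it compiles on its own; the Order class and the file names in the comments are hypothetical.

```csharp
// Part 1: imagine this half lives in a generated file such as Order.Designer.cs.
public partial class Order
{
    private int quantity;

    private void InitializeDefaults()
    {
        quantity = 1;
    }
}

// Part 2: imagine this half lives in a hand-written file such as Order.cs.
public partial class Order
{
    public Order()
    {
        // A private member declared in the other part is directly accessible,
        // because both parts compile into the same class.
        InitializeDefaults();
    }

    public int Quantity
    {
        get { return quantity; }
    }
}
```

After compilation there is a single Order type, so new Order().Quantity is 1. Leaving either part out of the build would produce a compile error here only because each part refers to members of the other; parts that do not refer to each other can go missing silently.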
I did not cover a number of C# 2.0 features because their addition will have less of an impact on how you write code today. You can make your types easier to use with generics by defining interfaces to describe behavior: Those interfaces can be used as constraints. The new iterator syntax will provide a more concise way to implement enumerations. You can quickly replace nested enumerator classes with this new syntax. However, custom external classes will not be so easy to replace. Develop your code now with an eye toward where you will leverage these features, and upgrading your existing code to C# 2.0 will take minimal work.