It's Not Just Linq's Distinct That Falls Short

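Yesterday I read an article, 《Linq的Distinct太不给力了》 ("Linq's Distinct Is Too Weak"), which points out that one overload of Linq's Distinct method takes an IEqualityComparer<T> parameter, so calling it usually means creating a new class to implement that interface, which is a nuisance. The article offers a workaround that is somewhat cumbersome. I also wrote 《c# 扩展方法 奇思妙用 基础篇 八:Distinct 扩展》 ("C# Extension Methods, Basics Part 8: A Distinct Extension"), which simplifies the call with an extension method.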

But the problem goes well beyond that. The pain comes from having IEqualityComparer<T> as a parameter, and .NET uses IEqualityComparer<T> as a parameter in quite a few places:

IEqualityComparer<T> as a parameter

In .NET, the places where IEqualityComparer<T> is used as a parameter fall roughly into two groups:

1. Linq

public static class Enumerable
{
    public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value, IEqualityComparer<TSource> comparer);
    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer);
    public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer);
    public static IEnumerable<IGrouping<TKey, TSource>> GroupBy<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer);
    public static IEnumerable<TSource> Intersect<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer);
    public static bool SequenceEqual<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer);
    public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer);
    public static ILookup<TKey, TSource> ToLookup<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer);
    public static IEnumerable<TSource> Union<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer);
    //...
}

The Queryable class has a similar set of methods.

2. Dictionaries and collection classes

public class Dictionary<TKey, TValue> : IDictionary<TKey, TValue>, ICollection<KeyValuePair<TKey, TValue>>,
    IEnumerable<KeyValuePair<TKey, TValue>>, IDictionary, ICollection, IEnumerable, ISerializable, IDeserializationCallback
{
    public Dictionary();
    public Dictionary(IDictionary<TKey, TValue> dictionary);
    public Dictionary(IEqualityComparer<TKey> comparer);
    public Dictionary(int capacity);
    public Dictionary(IDictionary<TKey, TValue> dictionary, IEqualityComparer<TKey> comparer);
    public Dictionary(int capacity, IEqualityComparer<TKey> comparer);
    //...
}

public class HashSet<T> : ISerializable, IDeserializationCallback, ISet<T>, ICollection<T>, IEnumerable<T>, IEnumerable
{
    public HashSet();
    public HashSet(IEnumerable<T> collection);
    public HashSet(IEqualityComparer<T> comparer);
    public HashSet(IEnumerable<T> collection, IEqualityComparer<T> comparer);
    //...
}

The constructors of both Dictionary<TKey, TValue> and HashSet<T> take the IEqualityComparer<T> interface.

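Besides these two, ConcurrentDictionary<TKey, TValue>, KeyedCollection<TKey, TItem> (an abstract class), SynchronizedKeyedCollection<K, T> and others also accept IEqualityComparer<T> as a constructor parameter. (SortedSet<T> is not among them: its constructors take IComparer<T> instead.)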
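As a small illustration of these constructor overloads, the sketch below passes the framework's built-in StringComparer.OrdinalIgnoreCase, which implements IEqualityComparer<string>, to Dictionary<TKey, TValue> and HashSet<T>. The class name, keys and values are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class BuiltInComparerDemo
{
    // Builds a dictionary whose string keys are compared case-insensitively.
    public static Dictionary<string, int> BuildAges()
    {
        var ages = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        ages["alice"] = 30;
        ages["ALICE"] = 31; // same key under this comparer, so the value is overwritten
        return ages;
    }

    // Builds a set that treats "linq" and "LINQ" as the same element.
    public static HashSet<string> BuildTags()
    {
        return new HashSet<string>(new[] { "linq", "LINQ", "distinct" }, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(BuildAges().Count); // 1
        Console.WriteLine(BuildTags().Count); // 2
    }
}
```

A ready-made comparer like this only exists for a few types such as string. For your own types you have to supply one yourself.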

 

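IEqualityComparer<T> mostly appears as a parameter in the more complex overloads, which cater to special cases, whereas the corresponding simple overloads are the ones used most often. So although IEqualityComparer<T> is used widely across .NET, we rarely run into it in everyday coding.
That said, once you do need it, it feels like a real chore. Most of the time you have to create a new class that implements IEqualityComparer<T> and then new up an instance, when all you really want is to compare by some property (such as ID). Creating that class not only adds code but also adds complexity: you have to decide where the class belongs, what to name it, and so on.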
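To make the cost concrete, here is roughly what the hand-written approach looks like when all you want is "distinct by ID". This is only a sketch: the Person class, the PersonIdComparer name and the sample data are assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// A dedicated class whose only job is "compare two Person objects by ID".
public class PersonIdComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(Person obj)
    {
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}

public static class HandWrittenComparerDemo
{
    // Removes persons whose ID has already been seen.
    public static List<Person> DistinctById(IEnumerable<Person> persons)
    {
        return persons.Distinct(new PersonIdComparer()).ToList();
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { ID = 1, Name = "Ann" },
            new Person { ID = 1, Name = "Ann (duplicate)" },
            new Person { ID = 2, Name = "Bob" },
        };
        Console.WriteLine(DistinctById(persons).Count); // 2
    }
}
```

About twenty lines of comparer for a one-property comparison, and a second property would need a second class.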

What we want, then, is a simple way to create an IEqualityComparer<T> instance directly. 《c# 扩展方法 奇思妙用 基础篇 八:Distinct 扩展》 presents a simple, practical class, CommonEqualityComparer<T, V>, which can be reused here to reach that goal.

CommonEqualityComparer<T, V>

using System;
using System.Collections.Generic;

// Compares two T instances by a key of type V projected from each of them.
public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly Func<T, V> keySelector;
    private readonly IEqualityComparer<V> comparer;

    public CommonEqualityComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        this.keySelector = keySelector;
        this.comparer = comparer;
    }

    // Falls back to the default equality comparer for the key type.
    public CommonEqualityComparer(Func<T, V> keySelector)
        : this(keySelector, EqualityComparer<V>.Default)
    { }

    public bool Equals(T x, T y)
    {
        return comparer.Equals(keySelector(x), keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        return comparer.GetHashCode(keySelector(obj));
    }
}

With this class, an IEqualityComparer<T> instance can be created simply from a lambda expression:

var dict = new Dictionary<Person, string>(new CommonEqualityComparer<Person, string>(p => p.Name));

List<Person> persons = null;
Person p1 = null;
//...
var ps = persons.Contains(p1, new CommonEqualityComparer<Person, int>(p => p.ID));
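The snippet above only exercises the one-argument constructor. As a sketch of the two-argument one, the example below compares persons by Name while ignoring case, by handing StringComparer.OrdinalIgnoreCase to the key comparer parameter. The Person class and the sample data are assumptions, and the comparer is repeated in compact form only so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Compact copy of the comparer shown above.
public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly Func<T, V> keySelector;
    private readonly IEqualityComparer<V> comparer;

    public CommonEqualityComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        this.keySelector = keySelector;
        this.comparer = comparer;
    }

    public CommonEqualityComparer(Func<T, V> keySelector)
        : this(keySelector, EqualityComparer<V>.Default) { }

    public bool Equals(T x, T y) { return comparer.Equals(keySelector(x), keySelector(y)); }
    public int GetHashCode(T obj) { return comparer.GetHashCode(keySelector(obj)); }
}

public static class KeyComparerDemo
{
    // Builds a set in which two persons are "equal" when their names match, ignoring case.
    public static HashSet<Person> BuildSet()
    {
        var byName = new CommonEqualityComparer<Person, string>(p => p.Name, StringComparer.OrdinalIgnoreCase);
        var set = new HashSet<Person>(byName);
        set.Add(new Person { ID = 1, Name = "ann" });
        set.Add(new Person { ID = 2, Name = "ANN" }); // rejected: same name ignoring case
        set.Add(new Person { ID = 3, Name = "Bob" });
        return set;
    }

    public static void Main()
    {
        Console.WriteLine(BuildSet().Count); // 2
    }
}
```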

Having read the code above, you will probably find new CommonEqualityComparer<Person, string>(p => p.Name) too verbose. We can improve on it with the following class:

public static class Equality<T>
{
    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector)
    {
        return new CommonEqualityComparer<T, V>(keySelector);
    }

    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        return new CommonEqualityComparer<T, V>(keySelector, comparer);
    }
}

The calling code can then be simplified:

var dict = new Dictionary<Person, string>(Equality<Person>.CreateComparer(p => p.Name));

var ps = persons.Contains(p1, Equality<Person>.CreateComparer(p => p.ID));

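Class and method names aside, Equality<Person>.CreateComparer(p => p.ID) is about as concise as it gets (if you can trim it further, do let me know).

In fact, now that we have the Equality<T> class, we can encapsulate and hide CommonEqualityComparer<T, V> entirely.

The Equality<T> class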

public static class Equality<T>
{
    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector)
    {
        return new CommonEqualityComparer<V>(keySelector);
    }

    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        return new CommonEqualityComparer<V>(keySelector, comparer);
    }

    class CommonEqualityComparer<V> : IEqualityComparer<T>
    {
        private Func<T, V> keySelector;
        private IEqualityComparer<V> comparer;

        public CommonEqualityComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer)
        {
            this.keySelector = keySelector;
            this.comparer = comparer;
        }

        public CommonEqualityComparer(Func<T, V> keySelector)
            : this(keySelector, EqualityComparer<V>.Default)
        { }

        public bool Equals(T x, T y)
        {
            return comparer.Equals(keySelector(x), keySelector(y));
        }

        public int GetHashCode(T obj)
        {
            return comparer.GetHashCode(keySelector(obj));
        }
    }
}

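CommonEqualityComparer<T, V> has been turned into CommonEqualityComparer<V>, a nested class of Equality<T> that is not visible from outside, which makes the API simpler to use.

The Distinct extension methods from 《c# 扩展方法 奇思妙用 基础篇 八:Distinct 扩展》 also become simpler to write: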

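using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctExtensions
{
    // Distinct by a projected key, using the default comparer for the key type.
    public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
    {
        return source.Distinct(Equality<T>.CreateComparer(keySelector));
    }

    // Distinct by a projected key, using the supplied key comparer.
    public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        return source.Distinct(Equality<T>.CreateComparer(keySelector, comparer));
    }
}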

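Besides Distinct, many other Linq methods take the IEqualityComparer<T> interface. Extending them one by one is not necessarily a good approach; using Equality<T>.CreateComparer is the wiser choice.

Summary

.NET often uses IEqualityComparer<T> as a parameter of certain overloads.
These overloads are not used very often in day-to-day work, but when they are, you usually have to create a new class implementing IEqualityComparer<T>, which is tedious and underwhelming.
This article creates the generic Equality<T> class which, together with a lambda expression, lets you create an IEqualityComparer<T> instance quickly.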
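To illustrate the point, the sketch below creates a comparer once with Equality<T>.CreateComparer and then passes it to several Linq methods, with no per-method extension. The Person class, the sample data and the nested class name KeyComparer are assumptions for the example, and Equality<T> is repeated in compact form only so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Compact copy of the Equality<T> class described above.
public static class Equality<T>
{
    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector)
    {
        return new KeyComparer<V>(keySelector, EqualityComparer<V>.Default);
    }

    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector, IEqualityComparer<V> comparer)
    {
        return new KeyComparer<V>(keySelector, comparer);
    }

    private sealed class KeyComparer<V> : IEqualityComparer<T>
    {
        private readonly Func<T, V> keySelector;
        private readonly IEqualityComparer<V> comparer;

        public KeyComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer)
        {
            this.keySelector = keySelector;
            this.comparer = comparer;
        }

        public bool Equals(T x, T y) { return comparer.Equals(keySelector(x), keySelector(y)); }
        public int GetHashCode(T obj) { return comparer.GetHashCode(keySelector(obj)); }
    }
}

public static class LinqReuseDemo
{
    // Returns the element counts of Distinct, Intersect, Union and Except, all compared by ID.
    public static int[] Run()
    {
        var byId = Equality<Person>.CreateComparer(p => p.ID);
        var first = new[]
        {
            new Person { ID = 1, Name = "Ann" },
            new Person { ID = 2, Name = "Bob" },
            new Person { ID = 2, Name = "Bob again" },
        };
        var second = new[]
        {
            new Person { ID = 2, Name = "Robert" },
            new Person { ID = 3, Name = "Cid" },
        };
        return new[]
        {
            first.Distinct(byId).Count(),          // IDs 1, 2
            first.Intersect(second, byId).Count(), // ID 2
            first.Union(second, byId).Count(),     // IDs 1, 2, 3
            first.Except(second, byId).Count(),    // ID 1
        };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 2, 1, 3, 1
    }
}
```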
