C#版的MapReduce

如果不知道MapReduce是怎么工作的,请看这里,如果不知道MapReduce是什么,请google之!

今天“闲”来无事,忽想起C#里没有MapReduce的方法,构思之,coding之:

#region IEnumerable<T>.MapReduce
        public static Dictionary<TKey, TResult> MapReduce<TInput, TKey, TValue, TResult>(
            this IEnumerable<TInput> list,
            Func<TInput, IEnumerable<KeyValuePair<TKey, TValue>>> map,
            Func<TKey, IEnumerable<TValue>, TResult> reduce)
        {
            Dictionary<TKey, List<TValue>> mapResult = new Dictionary<TKey, List<TValue>>();
            foreach (var item in list)
            {
                foreach (var one in map(item))
                {
                    List<TValue> mapValues;
                    if (!mapResult.TryGetValue(one.Key, out mapValues))
                    {
                        mapValues = new List<TValue>();
                        mapResult.Add(one.Key, mapValues);
                    }
                    mapValues.Add(one.Value);
                }
            }
            var result = new Dictionary<TKey, TResult>();
            foreach (var m in mapResult)
            {
                result.Add(m.Key, reduce(m.Key, m.Value));
            }
            return result;
        }
        #endregion

注:由于在map方法里可emit多次,所以这里返回IEnumerable,下文例子中可以看到用yield return来实现。

例:

    public class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public int Age { get; set; }
    }

 

        static void Main(string[] args)
        {
            List<Person> list=new List<Person> ();
            list.Add(new Person { ID=1, Name="user1", Age=23 });
            list.Add(new Person { ID = 2, Name = "user2", Age = 24 });
            list.Add(new Person { ID = 3, Name = "user3", Age = 23 });
            list.Add(new Person { ID = 4, Name = "user4", Age = 25 });
            list.Add(new Person { ID = 5, Name = "user5", Age = 20 });

            var result = list.MapReduce<Person, int, string, string>(Map, 
                (key, values) => string.Join(",", values));
            
            foreach (var d in result)
            {
                Console.WriteLine(d.Key + ":" + d.Value);
            }
        }

        public static IEnumerable<KeyValuePair<int, string>> Map(Person p)
        {
            if (p.Age > 22)
                yield return new KeyValuePair<int, string>(p.Age, p.Name);
        }

 上面程序所做的事为统计年龄大于22的,各个年龄都有谁,显示如:

C:\Windows\system32\cmd.exe
23:user1,user3
24:user2
25:user4
请按任意键继续。。。

 (嫌上传图片太麻烦,弄了个html版控制台,见谅!)

肯定有人会问为什么map不像reduce方法一样用lambda表达式,因为yield return不能在匿名方法和lambda表达式中!MS表示已知道这个问题,但重写yield花费很大,将来肯定会解决!

 

你可能感兴趣的:(mapreduce)