Java 8: Collecting Data with Streams

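Explore as many solutions as you can for the problem at hand, but always choose the most specialized one that is still general enough to solve it. This is usually the best decision for both readability and performance.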

1. Reducing and summarizing

Get the tasty dishes ready; the show is about to begin!

package com.example.chapter5;

public class Dish {
    private String name;
    private boolean vegetarian;
    private int calories;
    private Type type;

    public Dish(String name, boolean vegetarian, int calories, Type type) {
        super();
        this.name = name;
        this.vegetarian = vegetarian;
        this.calories = calories;
        this.type = type;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public boolean isVegetarian() {
        return vegetarian;
    }

    public void setVegetarian(boolean vegetarian) {
        this.vegetarian = vegetarian;
    }

    public int getCalories() {
        return calories;
    }

    public void setCalories(int calories) {
        this.calories = calories;
    }

    public Type getType() {
        return type;
    }

    public void setType(Type type) {
        this.type = type;
    }

    public enum Type {MEAT, FISH, OTHER}

    @Override
    public String toString() {
        return "Dish [name=" + name + ", vegetarian=" + vegetarian + ", calories=" + calories + ", type=" + type + "]";
    }
}

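// The snippets in this post assume the following imports.
import java.util.*;
import static java.util.stream.Collectors.*;

List<Dish> menu = Arrays.asList(new Dish("pork", false, 800, Dish.Type.MEAT),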
                new Dish("beef", false, 700, Dish.Type.MEAT), new Dish("chicken", false, 400, Dish.Type.MEAT),
                new Dish("french fries", true, 800, Dish.Type.OTHER), new Dish("rice", true, 350, Dish.Type.OTHER),
                new Dish("season fruit", true, 120, Dish.Type.OTHER), new Dish("pizza", true, 550, Dish.Type.OTHER),
                new Dish("prawns", false, 300, Dish.Type.FISH), new Dish("salmon", false, 450, Dish.Type.FISH));

1.1 Finding the maximum and minimum in a stream

        // Stream.max returns an Optional because the stream may be empty.
        Optional<Dish> max = menu.stream().max(Comparator.comparingInt(Dish::getCalories));
        System.out.println(max.get());

        // The same result through the maxBy collector.
        max = menu.stream().collect(maxBy(Comparator.comparingInt(Dish::getCalories)));
        System.out.println(max.get());

        Optional<Dish> min = menu.stream().collect(minBy(Comparator.comparingInt(Dish::getCalories)));
        System.out.println(min.get());
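
The snippets above call get() directly, which throws NoSuchElementException when the stream is empty. A minimal sketch of a safer pattern follows; the class name SafeMax and the method longestOr are hypothetical, and plain strings stand in for dish names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class SafeMax {
    // Returns the longest name, or the fallback when the list is empty.
    // (Hypothetical helper, not part of the original post.)
    static String longestOr(List<String> names, String fallback) {
        Optional<String> longest = names.stream()
                .max(Comparator.comparingInt(String::length));
        return longest.orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(longestOr(Arrays.asList("pork", "chicken", "beef"), "none"));
        System.out.println(longestOr(Collections.<String>emptyList(), "none"));
    }
}
```

Using orElse keeps the empty case explicit instead of leaving it to a runtime exception.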

1.2 Summarizing

        menu.stream().collect(summingInt(Dish::getCalories));
        menu.stream().mapToInt(Dish::getCalories).sum();
        menu.stream().map(Dish::getCalories).reduce(Integer::sum).get();

        menu.stream().collect(averagingInt(Dish::getCalories));

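Each of the methods above returns only one summary value at a time. A single summarizing operation yields the element count, sum, average, maximum, and minimum at once: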

        IntSummaryStatistics statistic = menu.stream().collect(summarizingInt(Dish::getCalories));
        System.out.println(statistic);

The output is:

IntSummaryStatistics{count=9, sum=4470, min=120, average=496.666667, max=800}
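
Beyond printing the whole object, each value can be read through its own getter. The sketch below assumes a hypothetical class CalorieStats and reuses the nine calorie values from the menu above as plain integers.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CalorieStats {
    // Calorie values copied from the menu defined earlier in this post.
    static IntSummaryStatistics stats() {
        return Stream.of(800, 700, 400, 800, 350, 120, 550, 300, 450)
                .collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats();
        System.out.println("count   = " + s.getCount());   // long
        System.out.println("sum     = " + s.getSum());     // long
        System.out.println("min     = " + s.getMin());     // int
        System.out.println("max     = " + s.getMax());     // int
        System.out.println("average = " + s.getAverage()); // double
    }
}
```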

1.3 Joining strings

        menu.stream().map(Dish::getName).collect(joining());

        // Separate the joined names with a comma
        menu.stream().map(Dish::getName).collect(joining(", "));

The output is:

pork, beef, chicken, french fries, rice, season fruit, pizza, prawns, salmon
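
joining also has a three-argument overload that takes a delimiter, a prefix, and a suffix. A small sketch, with the hypothetical names JoinNames and bracketed:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoinNames {
    // Joins the names with ", " and wraps the result in square brackets.
    static String bracketed(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("pork", "beef", "chicken")));
        // prints [pork, beef, chicken]
    }
}
```

For an empty stream, this overload returns just the prefix followed by the suffix.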

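1.4 Generalized summarization with reduction

All the collectors above are special cases of the reduction process defined by the reducing factory method. In other words, reducing is the generalization of all of them.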

        int totalCalories = menu.stream().collect(reducing(0, Dish::getCalories, (i, j)->i+j));
        System.out.println(totalCalories);

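It takes three arguments:

The starting value of the reduction, which is also the result for an empty stream
A transformation function applied to each element
A BinaryOperator that combines two transformed values into one

reducing also has a one-argument form, for example to find the maximum:

        Optional<Dish> d = menu.stream().collect(reducing((d1,d2) -> d1.getCalories()>d2.getCalories()? d1:d2));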
        System.out.println(d.get());
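
To illustrate the claim that the earlier collectors are special cases of reducing, here is a sketch that rebuilds counting, summing, and maximum with it. The class ReducingSpecialCases and its method names are hypothetical, and plain integers stand in for dish calories.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class ReducingSpecialCases {
    // counting(): start at 0, map every element to 1, add the ones up.
    static long count(List<Integer> calories) {
        return calories.stream().collect(Collectors.reducing(0L, c -> 1L, Long::sum));
    }

    // summingInt(): start at 0, map each element to its value, add them up.
    static int sum(List<Integer> calories) {
        return calories.stream().collect(Collectors.reducing(0, Integer::intValue, Integer::sum));
    }

    // maxBy(): the one-argument form has no start value, so it returns an Optional.
    static Optional<Integer> max(List<Integer> calories) {
        return calories.stream().collect(Collectors.reducing(Integer::max));
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(800, 700, 400, 800, 350, 120, 550, 300, 450);
        System.out.println(count(calories));
        System.out.println(sum(calories));
        System.out.println(max(calories));
    }
}
```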

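Differences between collect and reduce:

A semantic issue: the reduce method is meant to combine two values into a new one, so it is an immutable reduction. The collect method is designed to mutate a container in order to accumulate the result.
A practical issue: collect is better suited to parallel execution. Misusing reduce to mutate a shared container is not safe when several threads work on it at once.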
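
The distinction can be sketched by building the same list both ways. The class CollectVsReduce and its method names are hypothetical; the reduce version copies the list at every step to stay immutable, while the collect version mutates one container per thread and merges them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectVsReduce {
    // Immutable reduction: each step returns a fresh list, nothing is mutated in place.
    static List<Integer> withReduce(List<Integer> source) {
        return source.stream().reduce(
                Collections.<Integer>emptyList(),
                (acc, x) -> { List<Integer> next = new ArrayList<>(acc); next.add(x); return next; },
                (a, b) -> { List<Integer> merged = new ArrayList<>(a); merged.addAll(b); return merged; });
    }

    // Mutable reduction: supplier creates a container, accumulator fills it,
    // combiner merges the partial containers produced by parallel workers.
    static List<Integer> withCollect(List<Integer> source) {
        return source.parallelStream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(800, 700, 400, 800);
        System.out.println(withReduce(calories));
        System.out.println(withCollect(calories));
    }
}
```

Both produce the same list in encounter order; the reduce version pays for that with one list copy per element.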

2. Grouping

Group dishes by their type:

        menu.stream().collect(groupingBy(Dish::getType));

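Group by a custom calorie level:

        // Declare this enum at class level; local enums are only allowed from Java 16 on.
        enum CaloricLevel {DIET, NORMAL, FAT}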
        menu.stream().collect(groupingBy(dish->{
            if(dish.getCalories()<400) {
                return CaloricLevel.DIET;
            }else if(dish.getCalories()<700) {
                return CaloricLevel.NORMAL;
            }
            return CaloricLevel.FAT;
        }));
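
Because the same calorie thresholds recur in several snippets of this post, one option is to pull the classifier into a named method and pass it as a method reference. This is a sketch only; CaloricLevels, levelOf, and byLevel are hypothetical names, and integers stand in for dishes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CaloricLevels {
    enum CaloricLevel { DIET, NORMAL, FAT }

    // Same thresholds as the lambda above: below 400, below 700, everything else.
    static CaloricLevel levelOf(int calories) {
        if (calories < 400) return CaloricLevel.DIET;
        if (calories < 700) return CaloricLevel.NORMAL;
        return CaloricLevel.FAT;
    }

    // Groups calorie values by level using the named classifier.
    static Map<CaloricLevel, List<Integer>> byLevel(List<Integer> calories) {
        return calories.stream().collect(Collectors.groupingBy(CaloricLevels::levelOf));
    }

    public static void main(String[] args) {
        System.out.println(byLevel(Arrays.asList(800, 700, 400, 350, 120)));
    }
}
```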

2.1 Transforming grouped elements

Use a mapping function to transform the elements within each group.

        System.out.println(menu.stream().collect(groupingBy(Dish::getType, mapping(Dish::getName, toList()))));

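2.2 Multilevel grouping

Two-level grouping: first group by Type, then, within each type, group by calorie level. The result resembles a two-dimensional table.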

System.out.println(menu.stream().collect(groupingBy(Dish::getType, groupingBy(dish->{
            if(dish.getCalories()<400) {
                return CaloricLevel.DIET;
            }else if(dish.getCalories()<700) {
                return CaloricLevel.NORMAL;
            }else {
                return CaloricLevel.FAT;
            }
        }, mapping(Dish::getName, toList())))));

{FISH={DIET=[prawns], NORMAL=[salmon]}, MEAT={FAT=[pork, beef], NORMAL=[chicken]}, OTHER={FAT=[french fries], DIET=[rice, season fruit], NORMAL=[pizza]}}

2.3 Collecting data in subgroups

The second collector passed to the first groupingBy can be of any type; it does not have to be another groupingBy.

        System.out.println(menu.stream().collect(groupingBy(Dish::getType, counting())));

Result:

{FISH=2, MEAT=3, OTHER=4}

Group dishes by type and find the highest-calorie Dish in each type:

        Map<Dish.Type, Optional<Dish>> mostCaloricByType = menu.stream().collect(groupingBy(Dish::getType, maxBy(Comparator.comparingInt(Dish::getCalories))));
        System.out.println(mostCaloricByType);
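
Every group created by groupingBy holds at least one element, so the Optional values in that map are never empty. A sketch of unwrapping them with collectingAndThen follows; the class TopDishByType, its trimmed-down Dish, and sampleMenu are hypothetical stand-ins for the full classes above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class TopDishByType {
    enum Type { MEAT, FISH, OTHER }

    // Reduced version of the post's Dish, just enough for this example.
    static final class Dish {
        final String name; final int calories; final Type type;
        Dish(String name, int calories, Type type) {
            this.name = name; this.calories = calories; this.type = type;
        }
        String getName() { return name; }
        int getCalories() { return calories; }
        Type getType() { return type; }
    }

    static List<Dish> sampleMenu() {
        return Arrays.asList(
                new Dish("pork", 800, Type.MEAT), new Dish("beef", 700, Type.MEAT),
                new Dish("prawns", 300, Type.FISH), new Dish("salmon", 450, Type.FISH),
                new Dish("rice", 350, Type.OTHER), new Dish("pizza", 550, Type.OTHER));
    }

    // maxBy yields Optional<Dish>; collectingAndThen applies Optional::get afterwards.
    static Map<Type, Dish> topByType(List<Dish> menu) {
        return menu.stream().collect(Collectors.groupingBy(
                Dish::getType,
                Collectors.collectingAndThen(
                        Collectors.maxBy(Comparator.comparingInt(Dish::getCalories)),
                        Optional::get)));
    }

    public static void main(String[] args) {
        topByType(sampleMenu()).forEach((type, dish) ->
                System.out.println(type + " -> " + dish.getName()));
    }
}
```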

Group dishes by type and sum the calories of each type:

        Map<Dish.Type, Integer> totalCaloriesByType = menu.stream().collect(groupingBy(Dish::getType, summingInt(Dish::getCalories)));
    System.out.println(totalCaloriesByType);

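Group dishes by type and collect the set of calorie levels found in each type.
This example uses another collector, created by the mapping method. It takes two arguments: a function that transforms the elements of the stream, and a collector that accumulates the transformed objects.

        Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType = menu.stream().collect(groupingBy(Dish::getType, mapping(dish->{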
            if(dish.getCalories()<400) {
                return CaloricLevel.DIET;
            }else if(dish.getCalories()<700) {
                return CaloricLevel.NORMAL;
            }else {
                return CaloricLevel.FAT;
            }
        }, toSet())));
        System.out.println(caloricLevelsByType);

The output is:

{FISH=[DIET, NORMAL], MEAT=[FAT, NORMAL], OTHER=[FAT, DIET, NORMAL]}

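toCollection gives you more control over the result. Here is the same example, this time explicitly collected into a HashSet:

        Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType = menu.stream().collect(groupingBy(Dish::getType, mapping(dish->{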
            if(dish.getCalories()<400) {
                return CaloricLevel.DIET;
            }else if(dish.getCalories()<700) {
                return CaloricLevel.NORMAL;
            }else {
                return CaloricLevel.FAT;
            }
        }, toCollection(HashSet::new))));
        System.out.println(caloricLevelsByType);
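
As one illustration of that extra control, swapping in TreeSet::new keeps each set sorted in enum declaration order, so the printed sets no longer depend on hash ordering. The sketch below uses hypothetical names (SortedLevelsByType, a two-field Dish, sampleMenu) rather than the post's full classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class SortedLevelsByType {
    enum Type { MEAT, FISH, OTHER }
    enum CaloricLevel { DIET, NORMAL, FAT }

    // Reduced Dish with only the fields this example needs.
    static final class Dish {
        final int calories; final Type type;
        Dish(int calories, Type type) { this.calories = calories; this.type = type; }
        Type getType() { return type; }
        CaloricLevel level() {
            if (calories < 400) return CaloricLevel.DIET;
            if (calories < 700) return CaloricLevel.NORMAL;
            return CaloricLevel.FAT;
        }
    }

    static List<Dish> sampleMenu() {
        return Arrays.asList(
                new Dish(800, Type.MEAT), new Dish(400, Type.MEAT),
                new Dish(800, Type.OTHER), new Dish(350, Type.OTHER), new Dish(550, Type.OTHER));
    }

    // TreeSet orders enum constants by declaration order: DIET, NORMAL, FAT.
    static Map<Type, Set<CaloricLevel>> levelsByType(List<Dish> menu) {
        return menu.stream().collect(Collectors.groupingBy(
                Dish::getType,
                Collectors.mapping(Dish::level, Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        System.out.println(levelsByType(sampleMenu()));
    }
}
```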

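3. Partitioning

Partitioning is a special case of grouping in which a predicate (a function returning a boolean) serves as the classification function, called the partitioning function. The result has at most two groups: one for true and one for false.

Partition the menu by whether a dish is vegetarian, and print the vegetarian dishes.

    Map<Boolean, List<Dish>> partitionedMenu = menu.stream().collect(partitioningBy(Dish::isVegetarian));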
    System.out.println(partitionedMenu.get(true));
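
One practical difference from groupingBy with a boolean classifier: the map returned by partitioningBy contains both the true and the false key even when one side is empty (the Javadoc states this explicitly from Java 9 on), while groupingBy only creates keys it actually encounters. A sketch with the hypothetical class PartitionKeys and an arbitrary threshold of 1000:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionKeys {
    // Partition: both keys are present, the unmatched side maps to an empty list.
    static Map<Boolean, List<Integer>> partition(List<Integer> calories) {
        return calories.stream().collect(Collectors.partitioningBy(c -> c > 1000));
    }

    // Grouping: a key exists only if at least one element produced it.
    static Map<Boolean, List<Integer>> group(List<Integer> calories) {
        return calories.stream().collect(Collectors.groupingBy(c -> c > 1000));
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(800, 700, 400);
        System.out.println(partition(calories).get(true)); // empty list, safe to iterate
        System.out.println(group(calories).get(true));     // null, no such key
    }
}
```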

Partition by whether a dish is vegetarian, then group by type within each partition:

        Map<Boolean, Map<Dish.Type, List<Dish>>> vegetarianDishesByType = menu.stream().collect(partitioningBy(Dish::isVegetarian, groupingBy(Dish::getType)));
        System.out.println(vegetarianDishesByType);

Find the highest-calorie dish among the vegetarian and the non-vegetarian dishes:

    Map<Boolean, Dish> mostCaloricPartitionedByVegetarian =  menu.stream().collect(partitioningBy(Dish::isVegetarian, collectingAndThen(maxBy(Comparator.comparingInt(Dish::getCalories)), Optional::get)));
    System.out.println(mostCaloricPartitionedByVegetarian);

Count the items in each partition:

        System.out.println(menu.stream().collect(partitioningBy(Dish::isVegetarian, counting())));