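Collectors are extremely useful because they let you concisely and flexibly define the criteria that collect uses to produce the resulting collection.
The most straightforward and most frequently used collector is the one returned by the toList static method, which gathers all the elements of the stream into a List: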
List<Transaction> transactions = transactionStream.collect(Collectors.toList());
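Predefined collectors are those that can be created from the factory methods (such as groupingBy) provided by the Collectors class. They offer three main functionalities: reducing and summarizing stream elements to a single value, grouping elements, and partitioning elements.
Common predefined collectors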
List<Dish> dishes = menuStream.collect(toList());
Set<Dish> dishes = menuStream.collect(toSet());
Collection<Dish> dishes = menuStream.collect(toCollection(ArrayList::new));
long howManyDishes = menuStream.collect(counting());
double avgCalories = menuStream.collect(averagingInt(Dish::getCalories));
IntSummaryStatistics menuStatistics = menuStream.collect(summarizingInt(Dish::getCalories));
Optional<Dish> lightest = menuStream.collect(minBy(comparingInt(Dish::getCalories)));
int totalCalories = menuStream.collect(reducing(0, Dish::getCalories, Integer::sum));
int howManyDishes = menuStream.collect(collectingAndThen(toList(), List::size));
Map<Boolean, List<Dish>> vegetarianDishes = menuStream.collect(partitioningBy(Dish::isVegetarian));
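The snippets in these notes rely on a Dish class and a menu list that the notes never define. A minimal sketch of what they might look like follows. The class layout and the calorie values are assumptions, chosen so that the totals agree with the statistics output quoted later in these notes (count=9, sum=4300). Note also that a stream can be consumed only once, so each entry in the table above needs its own fresh stream rather than one shared menuStream.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.summingInt;
import static java.util.stream.Collectors.toCollection;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Dish class and menu that the notes assume.
class SampleMenu {
    static class Dish {
        enum Type { MEAT, FISH, OTHER }

        private final String name;
        private final boolean vegetarian;
        private final int calories;
        private final Type type;

        Dish(String name, boolean vegetarian, int calories, Type type) {
            this.name = name;
            this.vegetarian = vegetarian;
            this.calories = calories;
            this.type = type;
        }

        String getName() { return name; }
        boolean isVegetarian() { return vegetarian; }
        int getCalories() { return calories; }
        Type getType() { return type; }

        @Override
        public String toString() { return name; }
    }

    // Sample data; the calorie values are assumed for illustration.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", false, 800, Dish.Type.MEAT),
            new Dish("beef", false, 700, Dish.Type.MEAT),
            new Dish("chicken", false, 400, Dish.Type.MEAT),
            new Dish("french fries", true, 530, Dish.Type.OTHER),
            new Dish("rice", true, 350, Dish.Type.OTHER),
            new Dish("season fruit", true, 120, Dish.Type.OTHER),
            new Dish("pizza", true, 550, Dish.Type.OTHER),
            new Dish("prawns", false, 400, Dish.Type.FISH),
            new Dish("salmon", false, 450, Dish.Type.FISH));

    // Each method opens a fresh stream, because a stream cannot be reused.
    static ArrayList<Dish> asArrayList() {
        return MENU.stream().collect(toCollection(ArrayList::new));
    }

    static long count() {
        return MENU.stream().collect(counting());
    }

    static int totalCalories() {
        return MENU.stream().collect(summingInt(Dish::getCalories));
    }

    public static void main(String[] args) {
        System.out.println(asArrayList());
        System.out.println(count());
        System.out.println(totalCalories());
    }
}
```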
long howManyDishes = menu.stream().collect(Collectors.counting()); // count the dishes in the menu
long howManyDishes = menu.stream().count();
Comparator<Dish> dishCaloriesComparator = Comparator.comparingInt(Dish::getCalories);
Optional<Dish> mostCalorieDish = menu.stream().collect(Collectors.maxBy(dishCaloriesComparator));
int totalCalories = menu.stream().collect(summingInt(Dish::getCalories)); // total calories of the menu
double avgCalories = menu.stream().collect(averagingInt(Dish::getCalories)); // average calories per dish
IntSummaryStatistics menuStatistics = menu.stream().collect(summarizingInt(Dish::getCalories));
Printing the menuStatistics object produces: IntSummaryStatistics{count=9, sum=4300, min=120, average=477.777778, max=800}
Similarly, the summarizingLong and summarizingDouble factory methods have the associated LongSummaryStatistics and DoubleSummaryStatistics types.
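Instead of printing the statistics object, the individual values can be read through its getters. The sketch below works on a plain list of calorie values (an assumption made to keep the example self-contained) rather than on Dish objects.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class CalorieStats {
    // Sample calorie values, assumed for illustration.
    static final List<Integer> CALORIES =
            Arrays.asList(800, 700, 400, 530, 350, 120, 550, 400, 450);

    // One pass over the data yields count, sum, min, max and average together.
    static IntSummaryStatistics summarize() {
        return CALORIES.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = summarize();
        System.out.println("count   = " + stats.getCount());   // long
        System.out.println("sum     = " + stats.getSum());     // long
        System.out.println("min     = " + stats.getMin());     // int
        System.out.println("max     = " + stats.getMax());     // int
        System.out.println("average = " + stats.getAverage()); // double
    }
}
```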
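The collector returned by the joining factory method concatenates the strings in the stream into a single string. It accepts only CharSequence elements, so objects must first be mapped to strings (for example with map(Dish::getName)).
Internally, joining uses a StringBuilder to append the strings one by one.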
String shortMenu = menu.stream().map(Dish::getName).collect(joining()); // concatenate the names of all dishes in the menu
Result: porkbeefchickenfrench friesriceseason fruitpizzaprawnssalmon
String shortMenu = menu.stream().map(Dish::getName).collect(joining(", ")); // a comma-separated list of dish names
Result: pork, beef, chicken, french fries, rice, season fruit, pizza, prawns, salmon
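Besides the no-argument and delimiter-only versions, Collectors.joining also has a three-argument overload that takes a delimiter, a prefix and a suffix. A small sketch, using a hypothetical helper named bracketed and a plain list of names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Joins the names with ", " and wraps the result in square brackets.
    static String bracketed(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("pork", "beef", "chicken")));
    }
}
```

For an empty stream this overload returns just the prefix followed by the suffix.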
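In fact, all the collectors discussed so far are just special cases of a reduction process that can be defined with the reducing factory method.
For example:
You can compute the total calories of your menu with a collector created by the reducing method: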
int totalCalories = menu.stream().collect(reducing(0, Dish::getCalories, (i, j) -> i + j));
Find the highest-calorie dish:
Optional<Dish> mostCalorieDish = menu.stream().collect(reducing((d1, d2) -> d1.getCalories() > d2.getCalories() ? d1 : d2));
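Counting can be written as a reduction in the same way: start from zero, map every element to one, and add. The sketch below uses a hypothetical helper, countWithReducing. It is a demonstration of the idea, not a claim about how the library implements counting.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReducingDemo {
    // Counts elements with the three-argument reducing collector:
    // identity 0L, mapper "every element counts as 1", combiner Long::sum.
    static <T> long countWithReducing(List<T> items) {
        return items.stream().collect(Collectors.reducing(0L, e -> 1L, Long::sum));
    }

    public static void main(String[] args) {
        System.out.println(countWithReducing(Arrays.asList("pork", "beef", "chicken")));
    }
}
```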
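Flexibility of the collection framework: performing the same operation in different ways
Choose the best solution for your situation
The reduction process that computes the total calories of the menu
int totalCalories = menu.stream().collect(reducing(0, // initial value
Dish::getCalories, // transformation function
Integer::sum)); // aggregating function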
int totalCalories = menu.stream().map(Dish::getCalories).reduce(Integer::sum).get();
int totalCalories = menu.stream().mapToInt(Dish::getCalories).sum();
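We prefer the last solution (using IntStream), because it is the most concise and probably the most readable. It also performs best,
because IntStream avoids the auto-unboxing operations, that is, the implicit conversions from Integer to int, which serve no purpose here.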
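There is one more practical difference between the reduce variant and the IntStream variant: their behavior on an empty stream. The sketch below works on a plain list of calorie values (an assumption made to keep it self-contained). reduce without an identity returns an Optional, so calling get() on it throws NoSuchElementException when the menu is empty, whereas IntStream.sum() simply returns 0.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class TotalCaloriesDemo {
    // reduce without an identity value: the result is empty for an empty stream.
    static Optional<Integer> sumWithReduce(List<Integer> calories) {
        return calories.stream().reduce(Integer::sum);
    }

    // IntStream.sum(): returns 0 for an empty stream and works on primitive ints.
    static int sumWithIntStream(List<Integer> calories) {
        return calories.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(800, 700, 400);
        System.out.println(sumWithReduce(calories).orElse(0));
        System.out.println(sumWithIntStream(calories));
        System.out.println(sumWithReduce(Collections.<Integer>emptyList()).isPresent());
    }
}
```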
Joining strings with reducing
String shortMenu = menu.stream().map(Dish::getName).collect(joining());
String shortMenu = menu.stream().map(Dish::getName).collect(reducing((s1, s2) -> s1 + s2)).get();
String shortMenu = menu.stream().collect(reducing("",Dish::getName, (s1, s2) -> s1 + s2 ));
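Although the two reducing versions above are both valid replacements for the joining collector,
for practical purposes we always recommend the joining collector, for both readability and performance.
String shortMenu = menu.stream().collect(reducing((d1, d2) -> d1.getName() + d2.getName())).get(); // does not compile: the one-argument reducing takes a BinaryOperator<Dish>, so the lambda must return a Dish, not a String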
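What grouping is:
For example, suppose you want to classify the dishes in the menu by type, putting those with meat in one group, those with fish in another, and all the others in a third.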
Map<Dish.Type, List<Dish>> dishesByType = menu.stream().collect(groupingBy(Dish::getType));
{FISH=[prawns, salmon], OTHER=[french fries, rice, season fruit, pizza],MEAT=[pork, beef, chicken]}
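For example, you might classify dishes with 400 calories or fewer as "diet", dishes with more than 400 and up to 700 calories as "normal", and dishes with more than 700 calories as "fat".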
public enum CaloricLevel {DIET, NORMAL, FAT}
Map<CaloricLevel, List<Dish>> dishesByCaloricLevel = menu.stream().collect(groupingBy(dish -> {
if (dish.getCalories() <= 400) {
return CaloricLevel.DIET;
} else if (dish.getCalories() <= 700) {
return CaloricLevel.NORMAL;
} else {
return CaloricLevel.FAT;
}
}));
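The classification lambda above is repeated several times in these notes. One way to avoid the repetition is to give the classifier a name and pass it as a method reference. The sketch below does this with a hypothetical helper method levelOf and a simplified Dish class, both of which are assumptions rather than part of the original notes.

```java
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class CaloricGrouping {
    enum CaloricLevel { DIET, NORMAL, FAT }

    // Simplified, hypothetical Dish with only the fields this example needs.
    static class Dish {
        private final String name;
        private final int calories;

        Dish(String name, int calories) {
            this.name = name;
            this.calories = calories;
        }

        String getName() { return name; }
        int getCalories() { return calories; }

        @Override
        public String toString() { return name; }
    }

    // Named classifier: the same thresholds as the lambda, written once.
    static CaloricLevel levelOf(Dish dish) {
        if (dish.getCalories() <= 400) {
            return CaloricLevel.DIET;
        } else if (dish.getCalories() <= 700) {
            return CaloricLevel.NORMAL;
        } else {
            return CaloricLevel.FAT;
        }
    }

    static Map<CaloricLevel, List<Dish>> byLevel(List<Dish> menu) {
        return menu.stream().collect(groupingBy(CaloricGrouping::levelOf));
    }

    public static void main(String[] args) {
        List<Dish> menu = Arrays.asList(
                new Dish("pork", 800), new Dish("beef", 700), new Dish("chicken", 400));
        System.out.println(byLevel(menu));
    }
}
```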
Map<Dish.Type, Map<CaloricLevel, List<Dish>>> dishesByTypeCaloricLevel = menu.stream().collect(
groupingBy(Dish::getType, // first-level classification function
groupingBy(dish -> { // second-level classification function
if (dish.getCalories() <= 400) {
return CaloricLevel.DIET;
} else if (dish.getCalories() <= 700) {
return CaloricLevel.NORMAL;
} else {
return CaloricLevel.FAT;
}
})
)
);
{
MEAT = {DIET =[chicken], NORMAL =[beef],FAT =[pork]},
FISH = {DIET =[prawns], NORMAL =[salmon]},
OTHER = {DIET =[rice, season fruit],NORMAL =[french fries, pizza]}
}
For example, to count how many dishes of each type the menu contains:
Map<Dish.Type, Long> typesCount = menu.stream().collect(groupingBy(Dish::getType, counting()));
{MEAT=3, FISH=2, OTHER=4}
For example, to find the highest-calorie dish in the menu, classified by type of dish:
Map<Dish.Type, Optional<Dish>> mostCaloricByType = menu.stream().collect(groupingBy(Dish::getType, maxBy(comparingInt(Dish::getCalories))));
{FISH=Optional[salmon], OTHER=Optional[pizza], MEAT=Optional[pork]}
For example, to find the highest-calorie Dish in each subgroup without the Optional wrapper:
Map<Dish.Type, Dish> mostCaloricByType =
menu.stream().collect(groupingBy(Dish::getType, // classification function
collectingAndThen(maxBy(comparingInt(Dish::getCalories)), // wrapped collector
Optional::get))); // transformation function
{FISH=salmon, OTHER=pizza, MEAT=pork}
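An alternative that never creates an Optional is toMap with a merge function that keeps the more caloric of two dishes of the same type. The sketch below assumes a simplified Dish class (hypothetical, defined inline); BinaryOperator.maxBy is the standard factory that turns a Comparator into such a merge function.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class MostCaloricByType {
    enum Type { MEAT, FISH, OTHER }

    // Simplified, hypothetical Dish with only the fields this example needs.
    static class Dish {
        private final String name;
        private final int calories;
        private final Type type;

        Dish(String name, int calories, Type type) {
            this.name = name;
            this.calories = calories;
            this.type = type;
        }

        String getName() { return name; }
        int getCalories() { return calories; }
        Type getType() { return type; }

        @Override
        public String toString() { return name; }
    }

    static Map<Type, Dish> mostCaloricByType(List<Dish> menu) {
        Comparator<Dish> byCalories = Comparator.comparingInt(Dish::getCalories);
        return menu.stream().collect(Collectors.toMap(
                Dish::getType,                      // key: the dish type
                Function.identity(),                // value: the dish itself
                BinaryOperator.maxBy(byCalories))); // same key twice: keep the more caloric dish
    }

    public static void main(String[] args) {
        List<Dish> menu = Arrays.asList(
                new Dish("pork", 800, Type.MEAT), new Dish("beef", 700, Type.MEAT),
                new Dish("prawns", 400, Type.FISH), new Dish("salmon", 450, Type.FISH));
        System.out.println(mostCaloricByType(menu));
    }
}
```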
For example, to sum the calories of all the dishes in each group:
Map<Dish.Type, Integer> totalCaloriesByType =
menu.stream().collect(groupingBy(Dish::getType, summingInt(Dish::getCalories)));
Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType =
menu.stream().collect(
groupingBy(Dish::getType, mapping(
dish -> {
if (dish.getCalories() <= 400) {
return CaloricLevel.DIET;
} else if (dish.getCalories() <= 700) {
return CaloricLevel.NORMAL;
} else {
return CaloricLevel.FAT;
}
},
toSet())));
{OTHER=[DIET, NORMAL], MEAT=[DIET, NORMAL, FAT], FISH=[DIET, NORMAL]}
Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType =
menu.stream().collect(
groupingBy(Dish::getType, mapping(
dish -> {
if (dish.getCalories() <= 400) {
return CaloricLevel.DIET;
} else if (dish.getCalories() <= 700) {
return CaloricLevel.NORMAL;
} else {
return CaloricLevel.FAT;
}
},
toCollection(HashSet::new))));
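The same idea extends to the outer Map: groupingBy has a three-argument overload that takes a map factory. Since both the keys and the set elements are enums here, one option is an EnumMap holding EnumSet values, which iterate in enum declaration order. This is a sketch under assumptions: the simplified Dish class and the levelOf helper are hypothetical, and the intermediate variables exist only to make the generic types explicit.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CaloricLevelsByType {
    enum Type { MEAT, FISH, OTHER }
    enum CaloricLevel { DIET, NORMAL, FAT }

    // Simplified, hypothetical Dish with only the fields this example needs.
    static class Dish {
        private final int calories;
        private final Type type;

        Dish(int calories, Type type) {
            this.calories = calories;
            this.type = type;
        }

        int getCalories() { return calories; }
        Type getType() { return type; }
    }

    // Hypothetical named classifier with the thresholds used in these notes.
    static CaloricLevel levelOf(Dish dish) {
        if (dish.getCalories() <= 400) {
            return CaloricLevel.DIET;
        } else if (dish.getCalories() <= 700) {
            return CaloricLevel.NORMAL;
        } else {
            return CaloricLevel.FAT;
        }
    }

    static Map<Type, Set<CaloricLevel>> levelsByType(List<Dish> menu) {
        // Factory for the outer map: an EnumMap keyed by dish type.
        Supplier<Map<Type, Set<CaloricLevel>>> mapFactory = () -> new EnumMap<>(Type.class);
        // Downstream container: an EnumSet instead of whatever toSet() would pick.
        Collector<CaloricLevel, ?, Set<CaloricLevel>> toEnumSet =
                Collectors.toCollection(() -> EnumSet.noneOf(CaloricLevel.class));
        return menu.stream().collect(Collectors.groupingBy(
                Dish::getType,
                mapFactory,
                Collectors.mapping(CaloricLevelsByType::levelOf, toEnumSet)));
    }

    public static void main(String[] args) {
        List<Dish> menu = Arrays.asList(
                new Dish(450, Type.FISH), new Dish(800, Type.MEAT), new Dish(400, Type.MEAT));
        System.out.println(levelsByType(menu));
    }
}
```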
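Partitioning is a special case of grouping: a predicate, called the partitioning function, serves as the classification function. Because the partitioning function returns a boolean, the resulting Map has Boolean keys, so there can be at most two groups: one for true and one for false.
For example, to split the menu into vegetarian and nonvegetarian dishes: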
Map<Boolean, List<Dish>> partitionedMenu =
menu.stream().collect(partitioningBy(Dish::isVegetarian)); // partitioning function
{
false=[pork, beef, chicken, prawns, salmon],
true=[french fries, rice, season fruit, pizza]
}
List<Dish> vegetarianDishes = partitionedMenu.get(true);
The same result can be obtained by filtering:
List<Dish> vegetarianDishes = menu.stream().filter(Dish::isVegetarian).collect(toList());
Map<Boolean, Map<Dish.Type, List<Dish>>> vegetarianDishesByType =
menu.stream().collect(
partitioningBy(Dish::isVegetarian, // partitioning function
groupingBy(Dish::getType))); // second collector
{false={FISH=[prawns, salmon], MEAT=[pork, beef, chicken]},
true={OTHER=[french fries, rice, season fruit, pizza]}}
Map<Boolean, Dish> mostCaloricPartitionedByVegetarian =
menu.stream().collect(
partitioningBy(Dish::isVegetarian,
collectingAndThen(maxBy(comparingInt(Dish::getCalories)), Optional::get)));
{false=pork, true=pizza}
menu.stream().collect(partitioningBy(Dish::isVegetarian, partitioningBy(d -> d.getCalories() > 500)));
{
false={false=[chicken, prawns, salmon], true=[pork, beef]},
true={false=[rice, season fruit], true=[french fries, pizza]}
}
menu.stream().collect(partitioningBy(Dish::isVegetarian, counting()));
{false=5, true=4}
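One difference from groupingBy with a boolean classifier is worth noting: the Map returned by partitioningBy contains entries for both true and false even when one side is empty, whereas groupingBy creates a key only when some element maps to it. The sketch below shows this on plain integers; the helper names are hypothetical.

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.partitioningBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class PartitionVsGroup {
    // Partitioning: both the true and the false key are always present.
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> numbers) {
        return numbers.stream().collect(partitioningBy(n -> n % 2 == 0));
    }

    // Grouping with the same predicate: a key appears only if some element produced it.
    static Map<Boolean, List<Integer>> groupEven(List<Integer> numbers) {
        return numbers.stream().collect(groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        List<Integer> odds = Arrays.asList(1, 3, 5);
        System.out.println(partitionEven(odds)); // the true entry is present, with an empty list
        System.out.println(groupEven(odds));     // there is no true entry at all
    }
}
```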
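Suppose you want to write a method that takes an int n and partitions the first n natural numbers into primes and nonprimes.
First, a predicate that tests whether a given candidate number is prime: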
public boolean isPrime(int candidate) {
return IntStream.range(2, candidate) // generate the range of natural numbers [2, candidate)
.noneMatch(i -> candidate % i == 0); // true if the candidate is not divisible by any number in the stream
}
A simple optimization: test only the factors up to the square root of the candidate
public boolean isPrime(int candidate) {
int candidateRoot = (int) Math.sqrt((double) candidate);
return IntStream.rangeClosed(2, candidateRoot).noneMatch(i -> candidate % i == 0);
}
public Map<Boolean, List<Integer>> partitionPrimes(int n) {
return IntStream.rangeClosed(2, n).boxed().collect(partitioningBy(candidate -> isPrime(candidate)));
}
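Put together, the methods above can be exercised as in the following sketch. The wrapper class name PrimePartition is an assumption, and the methods are made static so they can be called without an instance.

```java
import static java.util.stream.Collectors.partitioningBy;

import java.util.List;
import java.util.Map;
import java.util.stream.IntStream;

class PrimePartition {
    // Trial division up to the square root of the candidate.
    static boolean isPrime(int candidate) {
        int candidateRoot = (int) Math.sqrt((double) candidate);
        return IntStream.rangeClosed(2, candidateRoot).noneMatch(i -> candidate % i == 0);
    }

    // Partitions the numbers 2..n into primes (true) and nonprimes (false).
    static Map<Boolean, List<Integer>> partitionPrimes(int n) {
        return IntStream.rangeClosed(2, n).boxed()
                .collect(partitioningBy(PrimePartition::isPrime));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> result = partitionPrimes(20);
        System.out.println("primes:    " + result.get(true));
        System.out.println("nonprimes: " + result.get(false));
    }
}
```

For n = 20 the true partition holds 2, 3, 5, 7, 11, 13, 17 and 19, and the false partition holds the remaining numbers from 4 to 20.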
-----------------------------------------------------------------------------Reading notes excerpted from the book Java 8实战 (Java 8 in Action), by Raoul-Gabriel Urma (UK), Mario Fusco (Italy), Alan M