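Definition: a Collector is a mutable reduction operation that accumulates input elements into a mutable result container and, optionally, transforms the accumulated result into a final representation after all input elements have been processed. The accumulation can run either sequentially or in parallel.
Accumulation operations: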
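A Collector is specified by four functions that work together to accumulate input elements into a mutable result container and optionally perform a final transformation on the result. The four functions are: supplier() (creates a new result container), accumulator() (incorporates one input element into a result container), combiner() (merges two result containers into one), and finisher() (performs the optional final transformation on the container).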
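A Collector also has a set of characteristics (values of the enum Collector.Characteristics). They give the implementation hints that can simplify the reduction and improve performance. There are three: CONCURRENT, UNORDERED, and IDENTITY_FINISH.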
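Other properties:
A sequential implementation creates a single result container with the supplier function and calls the accumulator function once for each input element. A parallel implementation partitions the input, calls the supplier function once per partition to create a result container, accumulates each partition's elements into that partition's container, and then uses the combiner function to merge the partition containers into one.
To ensure that sequential and parallel execution produce equivalent results, the collector's functions must satisfy an identity constraint and an associativity constraint.
The identity constraint says that combining any partially accumulated result with an empty result container must produce an equivalent result. That is, for a partially accumulated result a produced by any series of accumulator and combiner invocations, a must be equivalent to combiner.apply(a, supplier.get()).
The associativity constraint says that splitting the computation must produce an equivalent result. That is, for any input elements t1 and t2, the results r1 and r2 of the following computations must be equivalent: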
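The roles of the four functions in sequential and parallel execution can be sketched with a small hand-built collector. This is only an illustration: the class name FourFunctionsDemo and its helper methods are made up for this example, and a StringJoiner is chosen as the mutable container purely for convenience.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

public class FourFunctionsDemo {

    // Hypothetical helper: builds a collector from the four functions.
    static Collector<String, StringJoiner, String> commaJoined() {
        return Collector.of(
                () -> new StringJoiner(", "), // supplier: creates an empty result container
                StringJoiner::add,            // accumulator: folds one element into a container
                StringJoiner::merge,          // combiner: merges two partial containers
                StringJoiner::toString);      // finisher: converts the container to the final result
    }

    // Sequential: one container, accumulator called once per element.
    static String joinSequential(List<String> words) {
        return words.stream().collect(commaJoined());
    }

    // Parallel: the input may be split into partitions, each with its own container,
    // and the partial containers are merged by the combiner.
    static String joinParallel(List<String> words) {
        return words.parallelStream().collect(commaJoined());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d");
        System.out.println(joinSequential(words)); // a, b, c, d
        System.out.println(joinParallel(words));   // a, b, c, d
    }
}
```

Because this collector is ordered and its functions satisfy the constraints described next, both calls return the same string.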
A a1 = supplier.get();
accumulator.accept(a1, t1);
accumulator.accept(a1, t2);
R r1 = finisher.apply(a1); // result without splitting
A a2 = supplier.get();
accumulator.accept(a2, t1);
A a3 = supplier.get();
accumulator.accept(a3, t2);
R r2 = finisher.apply(combiner.apply(a2, a3)); // result with splitting
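For collectors that do not have the UNORDERED characteristic, two accumulated results a1 and a2 are equivalent if finisher.apply(a1).equals(finisher.apply(a2)). For unordered collectors, equivalence is relaxed to allow for differences that are only related to order. For example, an unordered collector that accumulates elements into a List would consider two lists equivalent if they contain the same elements, ignoring order.
Libraries that implement reduction on top of the Collector interface, such as Stream#collect(), must obey the following constraints:
The Collectors utility class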
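The two constraints can be spot-checked against a concrete collector. The sketch below is a rough check for a couple of sample elements, not a proof; the class CollectorLawsDemo and its method names are invented for this illustration.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CollectorLawsDemo {

    // Associativity: accumulating t1 and t2 into one container must be equivalent to
    // accumulating them into two containers and combining those.
    static <T, A, R> boolean splitGivesSameResult(Collector<T, A, R> c, T t1, T t2) {
        A a1 = c.supplier().get();
        c.accumulator().accept(a1, t1);
        c.accumulator().accept(a1, t2);
        R r1 = c.finisher().apply(a1); // result without splitting

        A a2 = c.supplier().get();
        c.accumulator().accept(a2, t1);
        A a3 = c.supplier().get();
        c.accumulator().accept(a3, t2);
        R r2 = c.finisher().apply(c.combiner().apply(a2, a3)); // result with splitting
        return r1.equals(r2);
    }

    // Identity: combining a partial result with an empty container must not change it.
    static <T, A, R> boolean emptyContainerIsNeutral(Collector<T, A, R> c, T t) {
        A plain = c.supplier().get();
        c.accumulator().accept(plain, t);
        R expected = c.finisher().apply(plain);

        A partial = c.supplier().get();
        c.accumulator().accept(partial, t);
        R actual = c.finisher().apply(c.combiner().apply(partial, c.supplier().get()));
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        Collector<CharSequence, ?, String> joining = Collectors.joining(",");
        System.out.println(splitGivesSameResult(toList, "x", "y"));    // true
        System.out.println(emptyContainerIsNeutral(toList, "x"));      // true
        System.out.println(splitGivesSameResult(joining, "x", "y"));   // true
        System.out.println(emptyContainerIsNeutral(joining, "x"));     // true
    }
}
```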
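toCollection(Supplier<C> collectionFactory): returns a collector that accumulates the input elements, in encounter order, into a new collection created by collectionFactory; for example: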
Collector<Object, ?, HashSet<Object>> collector = Collectors.toCollection(HashSet::new);
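toList(): returns a collector that accumulates the input elements, in encounter order, into a new List (an ArrayList in the OpenJDK implementation); there are no guarantees on the type, mutability, serializability, or thread-safety of the returned List: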
Collector<Object, ?, List<Object>> list = Collectors.toList();
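toSet(): returns a collector that accumulates the input elements into a new Set; there are no guarantees on the type, mutability, serializability, or thread-safety of the returned Set, and the collector has the UNORDERED characteristic.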
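A small runnable comparison of the three collectors may help; the class CollectionCollectorsDemo and its method names are placeholders chosen for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class CollectionCollectorsDemo {

    // toList keeps duplicates and the encounter order of the stream.
    static List<Integer> asList(List<Integer> in) {
        return in.stream().collect(Collectors.toList());
    }

    // toSet removes duplicates; the iteration order of the result is unspecified.
    static Set<Integer> asSet(List<Integer> in) {
        return in.stream().collect(Collectors.toSet());
    }

    // toCollection lets the caller pick the container, here a sorted TreeSet.
    static TreeSet<Integer> asTreeSet(List<Integer> in) {
        return in.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<Integer> in = Arrays.asList(3, 1, 3, 2);
        System.out.println(asList(in));       // [3, 1, 3, 2]
        System.out.println(asSet(in).size()); // 3
        System.out.println(asTreeSet(in));    // [1, 2, 3]
    }
}
```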
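joining(): concatenates the input elements into a String, in encounter order;
joining(CharSequence delimiter): concatenates the input elements, separated by delimiter, in encounter order;
joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix): concatenates the input elements, separated by delimiter, in encounter order, and wraps the result in the given prefix and suffix.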
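The three joining variants can be compared side by side. In this sketch the class JoiningDemo and its method names are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {

    static String plain(List<String> names) {
        return names.stream().collect(Collectors.joining());
    }

    static String delimited(List<String> names) {
        return names.stream().collect(Collectors.joining(", "));
    }

    static String wrapped(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("wangsan", "lisi");
        System.out.println(plain(names));     // wangsanlisi
        System.out.println(delimited(names)); // wangsan, lisi
        System.out.println(wrapped(names));   // [wangsan, lisi]
    }
}
```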
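mapping(Function<? super T, ? extends U> mapper, Collector<? super U, A, R> downstream): applies mapper to each input element and passes the mapped values to the downstream collector; for example: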
Person wangsan = new Person("wangsan", 20);
Person lisi = new Person("lisi", 21);
List<Person> people = Arrays.asList(wangsan, lisi);
Collector<Person, ?, List<String>> mapping = Collectors.mapping(Person::getName, Collectors.toList());
people.stream().collect(mapping).forEach(System.out::println); // prints wangsan and lisi, one per line
// collectingAndThen: collect into a list, then wrap the list in an unmodifiable view
List<Person> collect = people.stream().collect(Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
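To make the mapping and collectingAndThen snippets runnable on their own, the sketch below supplies a minimal Person class. Its shape (a name and an age with getters) is assumed from the constructor calls in the text, and the class MappingAndThenDemo with its helper methods is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class MappingAndThenDemo {

    // Minimal stand-in for the Person class used in the text (assumed shape).
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // mapping: transform each element before it reaches the downstream collector.
    static List<String> names(List<Person> people) {
        return people.stream().collect(Collectors.mapping(Person::getName, Collectors.toList()));
    }

    // collectingAndThen: run a finishing function on the collected result.
    static List<String> unmodifiableNames(List<Person> people) {
        return people.stream()
                .map(Person::getName)
                .collect(Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
    }

    // Returns true if the list refuses structural modification.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(new Person("wangsan", 20), new Person("lisi", 21));
        System.out.println(names(people));                         // [wangsan, lisi]
        System.out.println(rejectsAdd(unmodifiableNames(people))); // true
    }
}
```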
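summingInt(ToIntFunction<? super T> mapper): returns a collector that sums the int values produced by applying mapper to the input elements; for example: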
Collector<Person, ?, Integer> summingInt = Collectors.summingInt(Person::getAge);
Integer sum = people.stream().collect(summingInt);
System.out.println("sum = " + sum);//sum = 41
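averagingInt(ToIntFunction<? super T> mapper): returns a collector that computes the arithmetic mean of the int values produced by applying mapper to the input elements; for example: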
Collector<Person, ?, Double> averagingInt = Collectors.averagingInt(Person::getAge);
Double avg = people.stream().collect(averagingInt);
System.out.println("avg = " + avg);
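reducing(U identity, Function<? super T, ? extends U> mapper, BinaryOperator<U> op): maps each input element with mapper and reduces the mapped values with op, starting from identity; for example: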
Collector<Person, ?, Integer> reducing = Collectors.reducing(Integer.valueOf(1), Person::getAge, (a, b) -> a * b);
Integer result = people.stream().collect(reducing);
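System.out.println("result = " + result); // prints result = 420
reducing(BinaryOperator<T> op): reduces the input elements with op and returns the result as an Optional<T>, which is empty when there are no input elements; for example: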
Comparator<Person> comparator = Comparator.comparing(Person::getAge);
BinaryOperator<Person> binaryOperator = BinaryOperator.maxBy(comparator);
Optional<Person> optional = people.stream().collect(Collectors.reducing(binaryOperator));
optional.ifPresent(System.out::println);
// Find the tallest person in each city; reducing(BinaryOperator) yields an Optional per group
Comparator<Person> byHeight = Comparator.comparing(Person::getHeight);
Map<City, Optional<Person>> tallestByCity = people.stream().collect(groupingBy(Person::getCity, reducing(BinaryOperator.maxBy(byHeight))));
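groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream): groups the input elements by the key that classifier returns, reduces the elements of each group with the downstream collector, and stores the results in a map created by mapFactory; for example:
// Group people by city, collect the last names in each city into a Set, and return a Map<City, Set<String>>
Map<City, Set<String>> namesByCity = people.stream().collect(groupingBy(Person::getCity, TreeMap::new, mapping(Person::getLastName, toSet())));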
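The grouping example above relies on a Person type with a city, which the text does not define. The following sketch fills that in with a hypothetical Person that carries a name and a city given as a plain String, so the three-argument groupingBy can be run directly; GroupingTreeMapDemo is likewise an invented name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupingTreeMapDemo {

    // Hypothetical Person with a city; not the class from the text.
    static final class Person {
        private final String name;
        private final String city;

        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }

        String getName() { return name; }
        String getCity() { return city; }
    }

    // Groups names by city; TreeMap::new makes the resulting keys sorted.
    static TreeMap<String, Set<String>> namesByCity(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(
                Person::getCity,
                TreeMap::new,
                Collectors.mapping(Person::getName, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("wangsan", "Shanghai"),
                new Person("lisi", "Beijing"),
                new Person("zhaoliu", "Shanghai"));
        // Keys are sorted (Beijing before Shanghai); the order inside each Set is unspecified.
        System.out.println(namesByCity(people));
    }
}
```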
Source code walkthrough
// Get the downstream collector's container supplier
Supplier<A> downstreamSupplier = downstream.supplier();
// Get the downstream collector's accumulator
BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
// The classifier maps each input element to a key of type K. The accumulator looks up the
// container of type A for that key, creates it with downstreamSupplier and puts it into the
// map if it is absent, and then accumulates the input element into that container.
BiConsumer<Map<K, A>, T> accumulator = (m, t) -> {
    // Derive the key from the input element t
    K key = Objects.requireNonNull(classifier.apply(t), "element cannot be mapped to a null key");
    // If the map has no value for the key, put a new container of type A into the map; return the container
    A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
    // Accumulate the input element into the container
    downstreamAccumulator.accept(container, t);
};
// Merge two maps, combining the values of equal keys with downstream.combiner()
BinaryOperator<Map<K, A>> merger = Collectors.<K, A, Map<K, A>>mapMerger(downstream.combiner());
// Unchecked cast of the Supplier<M> map factory to Supplier<Map<K, A>>
@SuppressWarnings("unchecked")
Supplier<Map<K, A>> mangledFactory = (Supplier<Map<K, A>>) mapFactory;
// If the downstream characteristics contain IDENTITY_FINISH, the intermediate type is also the
// final result type and no finisher is needed. Otherwise the two types differ and the
// downstream finisher must convert every value in the map.
if (downstream.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) {
    return new CollectorImpl<>(mangledFactory, accumulator, merger, CH_ID);
} else {
    @SuppressWarnings("unchecked")
    Function<A, A> downstreamFinisher = (Function<A, A>) downstream.finisher();
    Function<Map<K, A>, M> finisher = intermediate -> {
        intermediate.replaceAll((k, v) -> downstreamFinisher.apply(v));
        @SuppressWarnings("unchecked")
        M castResult = (M) intermediate;
        return castResult;
    };
    return new CollectorImpl<>(mangledFactory, accumulator, merger, finisher, CH_NOID);
}
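The core of the walkthrough (derive a key, fetch or create the container, accumulate, and merge maps in the combiner) can be reproduced with public API only. The sketch below is a simplified stand-in rather than the JDK implementation: it always groups into ArrayList values inside a HashMap, and the names SimpleGroupingDemo and groupToLists are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class SimpleGroupingDemo {

    // Simplified grouping collector: fixed HashMap container, fixed ArrayList values.
    static <T, K> Collector<T, Map<K, List<T>>, Map<K, List<T>>> groupToLists(
            Function<? super T, ? extends K> classifier) {
        Supplier<Map<K, List<T>>> supplier = HashMap::new;
        BiConsumer<Map<K, List<T>>, T> accumulator = (map, t) -> {
            // Derive the key, then fetch or create the list for that key and add the element
            K key = Objects.requireNonNull(classifier.apply(t), "element cannot be mapped to a null key");
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(t);
        };
        BinaryOperator<Map<K, List<T>>> combiner = (left, right) -> {
            // Merge the right map into the left one, concatenating lists of equal keys
            right.forEach((k, v) -> left.merge(k, v, (a, b) -> {
                a.addAll(b);
                return a;
            }));
            return left;
        };
        // With no finisher, Collector.of produces an IDENTITY_FINISH collector
        return Collector.of(supplier, accumulator, combiner);
    }

    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(groupToLists(String::length));
    }

    public static void main(String[] args) {
        System.out.println(byLength(Arrays.asList("a", "bb", "cc", "ddd")));
    }
}
```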
Overloads:
public static <T, K> Collector<T, ?, Map<K, List<T>>> groupingBy(Function<? super T, ? extends K> classifier) { return groupingBy(classifier, toList()); }
public static <T, K, A, D> Collector<T, ?, Map<K, D>> groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream) {
return groupingBy(classifier, HashMap::new, downstream);
}
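A short usage sketch of the two overloads, grouping words by length. The class GroupingOverloadsDemo and its method names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingOverloadsDemo {

    // One-argument overload: values are collected with toList().
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // Two-argument overload: the downstream collector decides the value type, here a count.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "ddd");
        System.out.println(byLength(words));
        System.out.println(countByLength(words));
    }
}
```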
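partitioningBy(Predicate<? super T> predicate, Collector<? super T, A, D> downstream): returns a collector that partitions the input elements according to whether predicate evaluates to true or false, reduces the elements of each partition with the downstream collector, and produces a Map<Boolean, D>; for example:
Predicate<Person> predicate = person -> person.getAge() > 20;
Collector<Person, ?, Map<Boolean, List<Person>>> personMapCollector = Collectors.partitioningBy(predicate, Collectors.toList());
Map<Boolean, List<Person>> map = people.stream().collect(personMapCollector);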
System.out.println("map = " + map);//map = {false=[Person{name='wangsan', age=20}], true=[Person{name='lisi', age=21}]}
Source code walkthrough:
// Get the downstream collector's accumulator
BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
// Test each input element against the predicate: if it matches, accumulate it into the
// downstream container stored under true, otherwise into the one stored under false
BiConsumer<Partition<A>, T> accumulator = (result, t) ->
    downstreamAccumulator.accept(predicate.test(t) ? result.forTrue : result.forFalse, t);
BinaryOperator<A> op = downstream.combiner();
BinaryOperator<Partition<A>> merger = (left, right) ->
    new Partition<>(op.apply(left.forTrue, right.forTrue), op.apply(left.forFalse, right.forFalse));
// Supplier that creates a Partition (a Map) of the form {true=container of type A, false=container of type A}
Supplier<Partition<A>> supplier = () ->
    new Partition<>(downstream.supplier().get(), downstream.supplier().get());
// If the downstream characteristics contain IDENTITY_FINISH, the intermediate type is already the
// final result type and no finisher is needed; otherwise the finisher converts both containers
if (downstream.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) {
    return new CollectorImpl<>(supplier, accumulator, merger, CH_ID);
} else {
    Function<Partition<A>, Map<Boolean, D>> finisher = par ->
        new Partition<>(downstream.finisher().apply(par.forTrue),
                        downstream.finisher().apply(par.forFalse));
    return new CollectorImpl<>(supplier, accumulator, merger, finisher, CH_NOID);
}
Overloads:
public static <T>Collector<T, ?, Map<Boolean, List<T>>>partitioningBy(Predicate<? super T> predicate) {
return partitioningBy(predicate, toList());
}
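Because the supplier shown above always creates both containers, the resulting map holds an entry for true and for false even when one side receives no elements. The sketch below illustrates this together with a counting downstream; PartitioningDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningDemo {

    // One-argument overload: each partition is collected with toList().
    static Map<Boolean, List<Integer>> evenOdd(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Two-argument overload: each partition is reduced by the downstream collector.
    static Map<Boolean, Long> evenOddCounts(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(evenOdd(numbers));       // {false=[1, 3, 5], true=[2, 4]}
        System.out.println(evenOddCounts(numbers)); // {false=3, true=2}
        // Both keys are present even for an empty input
        System.out.println(evenOdd(Collections.<Integer>emptyList())); // {false=[], true=[]}
    }
}
```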
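toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier): returns a collector that accumulates elements into a Map whose keys and values are the results of applying keyMapper and valueMapper to the input elements. If the mapped keys contain duplicates, the value mapping function is applied to each equal element and the results are merged with mergeFunction. The map itself is created by mapSupplier. The returned collector is not concurrent; for parallel streams, toConcurrentMap may offer better performance when the results do not have to be merged in encounter order. For example: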
Person wangsan = new Person("wangsan", 20);
Person lisi = new Person("lisi", 21);
Person wangsan1 = new Person("wangsan", 21);
List<Person> people = Arrays.asList(wangsan, lisi,wangsan1);
Collector<Person, ?, TreeMap<String, Person>> personTreeMapCollector = Collectors.toMap(Person::getName, Function.identity(), (Person p1, Person p2) -> new Person(p1.getName(), p2.getAge()), TreeMap::new);
TreeMap<String, Person> collect = people.stream().collect(personTreeMapCollector);
System.out.println("collect = " + collect);//collect = {lisi=Person{name='lisi', age=21}, wangsan=Person{name='wangsan', age=21}}
Overloads:
public static <T, K, U>Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,Function<? super T, ? extends U> valueMapper,
BinaryOperator<U> mergeFunction) {
return toMap(keyMapper, valueMapper, mergeFunction, HashMap::new);
}
public static <T, K, U>Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,Function<? super T, ? extends U> valueMapper) {
return toMap(keyMapper, valueMapper, throwingMerger(), HashMap::new);
}
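The practical difference between the overloads shows up with duplicate keys: the two-argument form uses a throwing merger and fails with an IllegalStateException, while the three-argument form resolves the collision with the supplied merge function. A minimal sketch, with ToMapDemo and its method names invented for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapDemo {

    // Two-argument overload: duplicate keys cause an IllegalStateException.
    static Map<String, String> byInitialStrict(List<String> words) {
        return words.stream().collect(Collectors.toMap(w -> w.substring(0, 1), w -> w));
    }

    // Three-argument overload: values of duplicate keys are merged.
    static Map<String, String> byInitialMerged(List<String> words) {
        return words.stream().collect(
                Collectors.toMap(w -> w.substring(0, 1), w -> w, (a, b) -> a + "|" + b));
    }

    // Returns true if the strict variant rejects the input because of duplicate keys.
    static boolean strictFails(List<String> words) {
        try {
            byInitialStrict(words);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(strictFails(words));     // true
        System.out.println(byInitialMerged(words)); // {a=apple|avocado, b=banana}
    }
}
```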
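summarizingInt(ToIntFunction<? super T> mapper): maps each input element to an int value and returns summary statistics (count, sum, min, average, max) for the resulting values; summarizingLong and summarizingDouble are the counterparts for long and double values. For example: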
Collector<Person, ?, IntSummaryStatistics> personIntSummaryStatisticsCollector = Collectors.summarizingInt(Person::getAge);
IntSummaryStatistics collect2 =people.stream().collect(personIntSummaryStatisticsCollector);
System.out.println("collect2 = " + collect2);//collect2 = IntSummaryStatistics{count=3, sum=62, min=20, average=20.666667, max=21}
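The individual statistics can also be read through getters instead of printing the whole object. The sketch below uses the same three ages (20, 21, 21) as a plain list of integers; the class SummarizingDemo and the method ageStats are made up for this example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

public class SummarizingDemo {

    static IntSummaryStatistics ageStats(List<Integer> ages) {
        return ages.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = ageStats(Arrays.asList(20, 21, 21));
        System.out.println(stats.getCount());   // 3
        System.out.println(stats.getSum());     // 62
        System.out.println(stats.getMin());     // 20
        System.out.println(stats.getMax());     // 21
        System.out.println(stats.getAverage()); // 62 / 3.0, roughly 20.67
    }
}
```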