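[Java Fundamentals] A Brief Look at Streams

Overview

Java 8 introduced two major changes: Lambda expressions and the Stream API. Stream is Java 8's key abstraction for processing collections. It supports complex lookup, filtering, and selection operations on collections, which can greatly improve productivity and helps programmers write efficient, clean, concise code.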

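This style treats the collection of elements to be processed as a stream. The stream flows through a pipeline and can be processed at the nodes of that pipeline, for example by filtering, sorting, or aggregating.

The elements pass through the pipeline's intermediate operations, and a terminal operation finally produces the result of that processing.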
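The pipeline described above can be sketched as follows. This is only an illustration: the class name PipelineSketch, the method longWordsSorted, and the sample words are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineSketch {
    // Source -> filter (intermediate) -> sorted (intermediate) -> collect (terminal).
    static List<String> longWordsSorted(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3)   // keep words longer than 3 characters
                .sorted()                      // natural (lexicographic) order
                .collect(Collectors.toList()); // the terminal operation triggers the work
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "map", "filter", "of", "collect");
        System.out.println(longWordsSorted(words)); // [collect, filter, stream]
    }
}
```

The source list is not modified; the pipeline produces a new list.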

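Creating a Stream

There are five main ways to create a stream:

From a collection
From an array
An empty stream
An infinite stream
An infinite stream that follows a rule

Of these five, "from a collection" is by far the most commonly used. The examples below go through each one in turn.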

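Creating a stream from a collection

    // Imports assumed by the snippets in this article:
    // import java.util.*;
    // import java.util.function.Function;
    // import java.util.stream.*;
    public void testCollectionStream(){
        List<String> list = Arrays.asList("111","xxxx","ccc","222");
        // Create an ordinary sequential stream
        Stream<String> stringStream = list.stream();
        // Create a parallel stream
        Stream<String> parallelStream = list.parallelStream();
    }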

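Creating a stream from an array

public void testArrayStream(){
    // Create with Arrays.stream()
    int[] arrInt = new int[]{1,2,3,4,5};
    IntStream intStream = Arrays.stream(arrInt);

    Student[] students = new Student[]{new Student("xcy",18,false),new Student("lvp",3,false)};
    Stream<Student> studentStream = Arrays.stream(students);
    // Create with Stream.of()
    Stream<Integer> integerStream = Stream.of(1,32,3,2312,332);

    // Note: passing int[] values to Stream.of() yields a Stream<int[]> with two
    // array elements, not a stream of ints, so this prints two array references.
    Stream<int[]> streamInt = Stream.of(arrInt,arrInt);
    streamInt.forEach(System.out::println);
}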

Creating an empty stream

public void testEmptyStream(){
    Stream<String> stream = Stream.empty();
}

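Creating an infinite stream

public void testGenerateStream(){
    // Stream.generate() creates an infinite stream; limit() caps the number of elements.
    Stream.generate(()->"random_"+new Random().nextInt()).limit(100).forEach(System.out::println);
    // new Random().nextInt(50) draws an age in [0, 50); a fixed seed such as
    // new Random(50) would produce the same value for every element.
    Stream.generate(()->new Student("xcy",new Random().nextInt(50),false)).limit(10).forEach(System.out::println);
}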

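Creating an infinite stream that follows a rule

public void testGenerateStream1(){
    // iterate(seed, f) yields seed, f(seed), f(f(seed)), ...; here 0 through 9
    Stream.iterate(0,x->x+1).limit(10).forEach(System.out::println);
}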

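Stream operations

All operations on a Stream fall into two broad categories: intermediate operations and terminal operations. Intermediate operations are lazy: they only describe a step of the pipeline, and nothing is computed until a terminal operation runs. Intermediate operations are further divided into stateless and stateful operations.
A stateless intermediate operation processes each element independently of the elements before it, whereas a stateful one may need to see all elements before its result is known. Sorting, for example, is stateful: the final order is only known once every element has been read. Terminal operations are divided into short-circuiting and non-short-circuiting operations. A short-circuiting operation can stop without processing all the data, for example anyMatch(); a non-short-circuiting operation must process all the data to produce its result, for example forEach(). (An intermediate operation can also be short-circuiting; limit() is one.) The table below is a rough classification: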

Operation type	Stream API
Intermediate, stateless	unordered(); filter(); map(); mapToInt(); mapToLong(); mapToDouble(); flatMap(); flatMapToInt(); flatMapToDouble(); flatMapToLong(); peek();
Intermediate, stateful	distinct(); sorted(); limit(); skip();
Terminal, non-short-circuiting	forEach(); forEachOrdered(); toArray(); reduce(); collect(); max(); min(); count(); sum() (primitive streams such as IntStream only);
Terminal, short-circuiting	anyMatch(); allMatch(); noneMatch(); findFirst(); findAny();
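Laziness and short-circuiting can be observed directly. The sketch below uses peek() to record which elements a sequential pipeline actually touches; the class and method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazySketch {
    // Builds a pipeline but never runs a terminal operation,
    // then reports how many elements peek() has seen.
    static int visitedWithoutTerminal(List<Integer> source) {
        List<Integer> visited = new ArrayList<>();
        Stream<Integer> pipeline = source.stream().peek(visited::add).filter(x -> x > 2);
        return visited.size(); // still 0: intermediate operations are lazy
    }

    // Runs the short-circuiting terminal operation anyMatch() on a sequential
    // stream and returns the elements inspected before it stopped.
    static List<Integer> visitedByAnyMatch(List<Integer> source) {
        List<Integer> visited = new ArrayList<>();
        source.stream().peek(visited::add).anyMatch(x -> x > 2);
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(visitedWithoutTerminal(numbers)); // 0
        System.out.println(visitedByAnyMatch(numbers));      // [1, 2, 3]
    }
}
```

Elements 4 and 5 are never inspected, because anyMatch() stops at the first match.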

The examples below go through some of the commonly used methods.

map()

map() is one of the most frequently used Stream methods. It converts a stream of one type into a stream of another type.

// Convert a Stream<Student> into a Stream<String>
@Test
public void testMap(){
    List<Student> students = Arrays.asList(new Student("xcy",18,false),new Student("lvp",3,false));
    List<String> names = students.stream().map(Student::getName).collect(Collectors.toList());
    System.out.println(names);
}

filter()

filter(), as the name suggests, filters a stream: it keeps the elements that satisfy the predicate and drops the rest.

@Test
public void testFilter(){
    List<Integer> integers = Arrays.asList(1,2,3,4,5,6,7,8,9,10);
    integers = integers.stream().filter(e-> e<5).collect(Collectors.toList());
    //result:[1, 2, 3, 4]
    System.out.println(integers);
}

sorted()

sorted() sorts the stream, either in natural order or with a supplied Comparator:

@Test
public void testSorted(){
    String[] arr = {"a","abc","bcd","abcd","bc"};
    // Sort by string length
    System.out.println("Strings sorted by length===================");
    Arrays.stream(arr).sorted(Comparator.comparingInt(String::length)).forEach(System.out::println);
    System.out.println("Strings sorted in natural (lexicographic) order===================");
    // Natural order
    Arrays.stream(arr).sorted().forEach(System.out::println);
    System.out.println("Strings sorted in reverse natural order===================");
    // Reverse natural order
    Arrays.stream(arr).sorted(Comparator.reverseOrder()).forEach(System.out::println);

    // Objects sorted ascending by a property
    System.out.println("Objects sorted ascending by a property===================");
    List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",3),new Student("小绿",10));
    students.stream().sorted(Comparator.comparing(Student::getAge)).forEach(System.out::println);
    // Objects sorted descending by a property
    System.out.println("Objects sorted descending by a property===================");
    students.stream().sorted(Comparator.comparing(Student::getAge).reversed()).forEach(System.out::println);
}

distinct()

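Like DISTINCT in SQL, distinct() removes duplicates. It decides whether two objects are equal by calling equals(), so before de-duplicating entity objects you need to override equals() on the class (and, as always when overriding equals(), provide a consistent hashCode()).
For example, given [student(name="xcy", age=18); student(name="xcy", age=18)], the duplicate is not removed if equals() has not been overridden.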
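The effect of equals() on distinct() can be sketched as follows. Person and RawPerson are hypothetical classes invented for this example (the article's Student class is not shown); only Person overrides equals() and hashCode().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class DistinctSketch {
    // Hypothetical value class that defines equality by name and age.
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return age == other.age && Objects.equals(name, other.name);
        }

        @Override public int hashCode() { return Objects.hash(name, age); }
    }

    // Hypothetical class that inherits identity-based equals() from Object.
    static final class RawPerson {
        final String name;
        final int age;
        RawPerson(String name, int age) { this.name = name; this.age = age; }
    }

    // Counts the elements that survive distinct().
    static long distinctCount(List<?> items) {
        return items.stream().distinct().count();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(Arrays.asList(
                new Person("xcy", 18), new Person("xcy", 18))));       // 1
        System.out.println(distinctCount(Arrays.asList(
                new RawPerson("xcy", 18), new RawPerson("xcy", 18)))); // 2
    }
}
```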

skip()

skip(n) skips the first n elements:

@Test
public void testSkip(){
    List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",3),new Student("小绿",10));
    // Drops the first student and keeps the remaining two
    students = students.stream().skip(1).collect(Collectors.toList());
    System.out.println(students);
}

limit()

limit(n) returns the first n elements. Usage:

@Test
public void testLimit(){
    List<String> strings = Arrays.asList("aaa","bbb","ccc","ddd","eee","fff");
    strings = strings.stream().limit(3).collect(Collectors.toList());
    System.out.println(strings);
    //result:[aaa, bbb, ccc]
}
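skip() and limit() are often combined, for instance to take one page out of a list. A minimal sketch, in which the class name PagingSketch, the method page, and the zero-based pageIndex convention are all assumptions made for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PagingSketch {
    // Returns the page with the given zero-based index: skip the earlier
    // pages, then keep at most pageSize elements.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("aaa", "bbb", "ccc", "ddd", "eee", "fff");
        System.out.println(page(strings, 1, 2)); // [ccc, ddd]
        System.out.println(page(strings, 5, 2)); // [] (past the end)
    }
}
```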

flatMap()

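flatMap() maps each element of the stream to a stream of its own and then concatenates those streams into a single stream. Here is a concrete example.

Given the word list ["Hello","World"], output ["H","e","l","o","W","r","d"].
One solution:

@Test
public void testFlatMap(){
    String[] ss = new String[]{"Hello","World"};
    // split("") turns each word into a String[] of single characters;
    // flatMap(Arrays::stream) flattens those arrays into one Stream<String>.
    List<String> result = Arrays.stream(ss).map(e->e.split("")).flatMap(Arrays::stream).distinct().collect(Collectors.toList());
    System.out.println(result);
}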

allMatch()/anyMatch()/noneMatch()

These three methods are covered together; their names say what they do:

allMatch() checks whether every element in the stream matches the given predicate;
anyMatch() checks whether at least one element in the stream matches the given predicate;
noneMatch() checks whether no element in the stream matches the given predicate.

    @Test
    public void testMatch(){
        List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",18),new Student("小绿",18));
        boolean allMatch = students.stream().allMatch(e->18 == e.getAge());
        System.out.println("allMatch:"+allMatch);
        //result allMatch:true
        boolean anyMatch = students.stream().anyMatch(e->"小蓝".equals(e.getName()));
        System.out.println("anyMatch:"+anyMatch);
        //result  anyMatch:true
        boolean noneMatch = students.stream().noneMatch(e->"小黄".equals(e.getName()));
        System.out.println("noneMatch:"+noneMatch);
        //result noneMatch:true
    }

forEach()

   @Test
   public void testForEach(){
       List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",18),new Student("小绿",18));
       students.stream().forEach(System.out::println); // same as students.forEach(System.out::println)
       // forEach can also call a method of another class.
       students.stream().forEach(StudentMapper::insert);
   }

max(),min(),count(),sum(),average()

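These are the operations available on numeric streams. They are used as follows:

@Test
public void testMath(){
    List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",18),new Student("小绿",18));
    System.out.println(students.stream().count());
    // mapToInt() produces an IntStream, which provides sum(), average(), max() and min()
    System.out.println(students.stream().mapToInt(Student::getAge).sum());
    // average() returns an OptionalDouble; max() and min() return an OptionalInt,
    // so the Optional wrapper is what gets printed here
    System.out.println(students.stream().mapToInt(Student::getAge).average());
    System.out.println(students.stream().mapToInt(Student::getAge).max());
    System.out.println(students.stream().mapToInt(Student::getAge).min());
}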
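Because average(), max(), and min() return Optional values, an empty stream needs handling, and all five statistics can be computed in a single pass with summaryStatistics(). A minimal sketch; the class and method names are assumptions made for this example:

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class NumericSketch {
    // One pass over the values yields count, sum, min, max and average.
    static IntSummaryStatistics stats(int... ages) {
        return IntStream.of(ages).summaryStatistics();
    }

    // Unwraps the OptionalDouble, falling back to 0.0 for an empty stream.
    static double averageOrZero(int... ages) {
        return IntStream.of(ages).average().orElse(0.0);
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(18, 3, 10);
        System.out.println(s.getCount()); // 3
        System.out.println(s.getSum());   // 31
        System.out.println(s.getMin());   // 3
        System.out.println(s.getMax());   // 18
        System.out.println(averageOrZero()); // 0.0
    }
}
```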

collect()

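The Collectors class provides many ready-made collectors for collect(). Some commonly used ones:

Collectors.toList(): collect into a List
Collectors.toSet(): collect into a Set
Collectors.toCollection(TreeSet::new): collect into any specified collection type
Collectors.minBy(): find the minimum; Collectors.maxBy() is the counterpart for the maximum
Collectors.averagingInt(): compute the average
Collectors.summingInt(): compute the sum (summingLong() and summingDouble() also exist)
Collectors.summarizingDouble(x -> x): yields the maximum, minimum, average, sum, and count.
Collectors.groupingBy(x -> x): grouping; there are three overloads, shown together below.
Collectors.partitioningBy(x -> x > 2): splits the data into two parts keyed by true/false. The one-argument overload calls the two-argument one, with the second argument defaulting to Collectors.toList().
Collectors.joining(): concatenate strings
Collectors.collectingAndThen(Collectors.toList(), x -> x.size()): run the collector first, then apply the function given as the second argument. Here the elements are put into a list and the list size is returned.

Examples of these methods: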

@Test
public void testCollect(){
    List<Student> students = Arrays.asList(new Student("小红",18),new Student("小蓝",18),new Student("小绿",18));
    LinkedList<Student> students1 = students.stream().collect(Collectors.toCollection(LinkedList::new));
    System.out.println("students1:"+students1);
    Map<Integer, List<Student>> studentListMap = students.stream().collect(Collectors.groupingBy(Student::getAge));
    System.out.println("studentListMap:"+studentListMap);
    Map<String, Integer> studentMap = students.stream().collect(Collectors.toMap(Student::getName
    ,Student::getAge));
    System.out.println("studentMap:"+studentMap);
    // With age as the key, toMap() throws IllegalStateException here because
    // all three students share the key 18 and no merge function is supplied.
//        Map<Integer, String> stringMap1 = students.stream().collect(Collectors.toMap(Student::getAge,Student::getName));
//        System.out.println("stringMap1"+stringMap1);
    // minBy() returns an Optional; get() is safe here only because the list is not empty
    Student student = students.stream().collect(Collectors.minBy(Comparator.comparingInt(Student::getAge))).get();
    System.out.println("student : " + student);

    DoubleSummaryStatistics doubleSummaryStatistics = students.stream().collect(Collectors.summarizingDouble(Student::getAge));
    System.out.println("average:"+ doubleSummaryStatistics.getAverage());
    System.out.println("count:"+ doubleSummaryStatistics.getCount());
    Map<Integer, List<Integer>> groupByMap = Stream.of(1, 3, 3, 2).collect(Collectors.groupingBy(Function.identity()));
    System.out.println("groupByMap:" + groupByMap);
    //result:{1=[1], 2=[2], 3=[3, 3]}
    Map<Integer, Integer> groupByMap1 = Stream.of(1, 3, 3, 2).collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(x -> x)));
    System.out.println("groupByMap1:"+groupByMap1);
    //result:{1=1, 2=2, 3=6}
    HashMap<Integer, List<Integer>> groupByMap2 = Stream.of(1, 3, 3, 2).collect(Collectors.groupingBy(Function.identity(), HashMap::new, Collectors.mapping(x -> x + 1, Collectors.toList())));
    System.out.println("groupByMap2:" + groupByMap2);
    //result :{1=[2], 2=[3], 3=[4, 4]}
    Map<Boolean, List<Integer>> partitioningByMap1 = Stream.of(1, 3, 3, 2).collect(Collectors.partitioningBy(x -> x > 2));
    Map<Boolean, Long> longMap = Stream.of(1, 3, 3, 2).collect(Collectors.partitioningBy(x -> x > 1, Collectors.counting()));
    System.out.println("partitioningByMap1:"+partitioningByMap1);
    //result : {false=[1, 2], true=[3, 3]}
    System.out.println("longMap:"+longMap);
    //result : {false=1, true=3}
    Integer integer = Stream.of("1", "2", "3").collect(Collectors.collectingAndThen(Collectors.toList(), x -> x.size()));
    System.out.println("integer:"+integer);
}
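A few of the listed collectors can also be sketched in isolation: joining(), averagingInt(), and toMap() with a merge function, which is one way to cope with duplicate keys. The class name, method names, and sample data are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsSketch {
    // joining(delimiter, prefix, suffix) concatenates the elements.
    static String joinNames(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // toMap() with a third argument: when two words have the same length,
    // the merge function combines their values instead of throwing.
    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(String::length, w -> w, (a, b) -> a + "/" + b));
    }

    // averagingInt() returns the arithmetic mean as a Double.
    static double averageLength(List<String> words) {
        return words.stream().collect(Collectors.averagingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("aa", "bb", "c");
        System.out.println(joinNames(words));            // [aa, bb, c]
        System.out.println(wordsByLength(words).get(2)); // aa/bb
        System.out.println(averageLength(words));
    }
}
```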

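Parallel streams

A parallel stream splits its content into multiple chunks and processes each chunk on a different thread. The Stream API lets you switch declaratively between parallel and sequential streams with parallel() and sequential().

Under the hood, Java 8 parallel streams use the fork/join framework introduced in Java 7. Here is a brief description of fork/join; consult further material for a more thorough understanding.

The fork/join framework splits (forks) a large task, where necessary, into a number of smaller tasks, until they cannot usefully be split any further, and then combines (joins) the results of the small tasks into the overall result.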

[Figure: fork-join.png]
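The fork and join steps can be sketched with a RecursiveTask that sums an array. This is a hand-written illustration of the framework, not how the Stream implementation itself is coded; the class name and the THRESHOLD value are assumptions made for this example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumSketch extends RecursiveTask<Long> {
    // Hypothetical cutoff: ranges of this size or smaller are summed directly.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSumSketch(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ForkJoinSumSketch left = new ForkJoinSumSketch(data, from, mid);
        ForkJoinSumSketch right = new ForkJoinSumSketch(data, mid, to);
        left.fork();                       // fork: run the left half asynchronously
        long rightResult = right.compute(); // compute the right half in this thread
        return left.join() + rightResult;  // join: wait for the left half and combine
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSumSketch(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // 50005000
    }
}
```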

Turning a sequential stream into a parallel one only requires calling parallel():

    @Test
    public void testParallel(){
        Long result  = Stream.iterate(1L,i -> i+1).limit(10).parallel().reduce(0L,Long::sum);
        System.out.println(result);
    }

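Likewise, turning a parallel stream back into a sequential one only requires calling sequential():

stream.parallel().filter(...).sequential().map(...).parallel().reduce();

parallel() and sequential() can be called repeatedly in alternation, but only the last call decides whether the whole pipeline runs sequentially or in parallel. By default, parallel streams run on the common ForkJoinPool, whose parallelism is derived from the number of processor cores (typically the number of available processors minus one, with the calling thread also taking part in the work).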
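The "last call wins" rule can be checked with isParallel(), which reports the mode the pipeline would run in. A minimal sketch; the class and method names are assumptions made for this example:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Stream;

class ParallelFlagSketch {
    // The last mode switch is parallel(), so the whole pipeline is parallel.
    static boolean endsWithParallel() {
        return Stream.of(1, 2, 3)
                .parallel()
                .filter(x -> x > 0)
                .sequential()
                .map(x -> x * 2)
                .parallel()
                .isParallel();
    }

    // The last mode switch is sequential(), so the whole pipeline is sequential.
    static boolean endsWithSequential() {
        return Stream.of(1, 2, 3)
                .parallel()
                .map(x -> x * 2)
                .sequential()
                .isParallel();
    }

    public static void main(String[] args) {
        System.out.println(endsWithParallel());   // true
        System.out.println(endsWithSequential()); // false
        // Parallelism of the common pool on this machine (value varies by machine).
        System.out.println(ForkJoinPool.commonPool().getParallelism());
    }
}
```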

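In fact, a parallel stream is not always faster than a sequential one; as with multi-threaded versus single-threaded code, performance depends on many factors. A parallel stream has to split its data, and not every data structure splits well. The table below lists how well some common data sources decompose.

Data source	Decomposability
ArrayList	Excellent
LinkedList	Poor
IntStream.range	Excellent
Stream.iterate	Poor
HashSet	Good
TreeSet	Good

Besides the decomposability of the data source, the operations applied to the stream also affect performance. Operations that depend on encounter order, such as findFirst(), limit(n), and skip(n), are expensive on a parallel stream, whereas findAny() and anyMatch() are well suited to parallel streams.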
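The difference between the order-dependent findFirst() and the order-free findAny() can be sketched on a parallel IntStream.range, a source that splits well. The class name, method names, and the multiple-of-seven predicate are assumptions made for this example.

```java
import java.util.stream.IntStream;

class EncounterOrderSketch {
    // findFirst() must respect encounter order, so even in parallel it
    // always returns the earliest match in the range.
    static int firstMultipleOfSeven() {
        return IntStream.range(1, 1_000_000)
                .parallel()
                .filter(x -> x % 7 == 0)
                .findFirst()
                .getAsInt();
    }

    // findAny() may return whichever match a worker thread finds first,
    // so the result can differ from run to run.
    static int anyMultipleOfSeven() {
        return IntStream.range(1, 1_000_000)
                .parallel()
                .filter(x -> x % 7 == 0)
                .findAny()
                .getAsInt();
    }

    // anyMatch() only needs to know that some match exists.
    static boolean hasMultipleOfSeven() {
        return IntStream.range(1, 1_000_000).parallel().anyMatch(x -> x % 7 == 0);
    }

    public static void main(String[] args) {
        System.out.println(firstMultipleOfSeven()); // 7
        System.out.println(anyMultipleOfSeven());   // some multiple of 7
        System.out.println(hasMultipleOfSeven());   // true
    }
}
```

Whether the parallel version is actually faster depends on the workload, so measure before relying on it.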