Java 13 Is Almost Here, and You Still Don't Know the New (Old) Features of Java 8?

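Java releases now arrive so quickly that it is easy to miss several versions without noticing. Although the official release is already at Java 12, most Java systems today still run on Java 8, and some legacy systems still run on Java 7 or even Java 6. When I first learned Java, Java 7 was near the end of its run and there was already plenty of talk about the new features in Java 8. As a beginner I paid little attention; I only vaguely remember terms such as "lambda expressions" and "functional programming", without really understanding what they meant. I did not start using Java 8 heavily until about a year into my first job, and I still remember that it began with an article on the IBM forum.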

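A few days ago I was chatting with a university classmate about some problems at his company. Their system is based on JDK 7, and most of the staff do not want to upgrade because they are unwilling to adopt the new features of Java 8. I was quite surprised: Java 13 is almost here, and you still do not want to learn the new (old) features of Java 8? Hence this article. It uses easy-to-follow code to introduce the lambda and stream features of Java 8 and, along the way, to convey the ideas behind functional programming.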

Lambda Expressions

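As a first approximation, a lambda expression can be thought of as more concise syntactic sugar for an anonymous inner class. Compare the two ways of creating a thread below to get a feel for it.

// Anonymous inner class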
new Thread(new Runnable() {
    @Override
    public void run() {
        // ...
    }
}).start();

// lambda
new Thread(() -> {
    // ...
}).start();

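To use lambda expressions fluently you first need to understand functional interfaces. So what is a functional interface? First, it must be an interface declared with the interface keyword; second, it must have exactly one abstract method left to implement (default and static methods do not count). A simple way to tell a functional interface from an ordinary one is to add the @FunctionalInterface annotation to it: if the compiler does not complain, it is a functional interface, and a lambda expression can replace the anonymous inner class. In the examples below, A and B are clearly functional interfaces, whereas C has no abstract method and D is not an interface, so neither C nor D can be used with a lambda expression.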

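// A functional interface
interface A {
    void test();
}

// A functional interface
interface B {
    default void def() {
        // balabala...
    }
    void test();
}

// Not a functional interface: no abstract method
interface C {
    default void def() {}
}

// Not a functional interface: an abstract class, not an interface
abstract class D {
    public abstract void test();
}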
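
The annotation check described above can be sketched as follows. This is only an illustration: the names FunctionalInterfaceDemo and Greeter are hypothetical and do not come from the article. The interface has one abstract method plus a default method, so @FunctionalInterface is accepted and a lambda can implement it.

```java
// Hypothetical names (FunctionalInterfaceDemo, Greeter) used only for illustration.
class FunctionalInterfaceDemo {

    // Compiles: exactly one abstract method, so the annotation is accepted.
    @FunctionalInterface
    interface Greeter {
        String greet(String name);

        // Default methods do not count toward the single abstract method.
        default String greetTwice(String name) {
            return greet(name) + " " + greet(name);
        }
    }

    static String demo() {
        // The lambda supplies the body of greet(String).
        Greeter greeter = name -> "hello " + name;
        return greeter.greetTwice("java");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // hello java hello java
    }
}
```

Adding a second abstract method to Greeter would turn the annotation into a compile error, which is exactly the check the annotation is useful for.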

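Depending on the parameters, return value, and number of statements of the interface method being implemented, a lambda expression can be written in several ways:

No parameters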

interface A {
    void test();
}

A a = () -> {
    // ...
};

A single parameter

interface B {
    void test(String arg);
}

B b = arg -> {
    // ...
};

Multiple parameters

interface C {
    void test(String arg1, String arg2);
}

C c = (a1, a2) -> {
    // ...
};

interface D {
    void test(String arg1, String arg2, String arg3);
}

D d = (a1, a2, a3) -> {
    // ...
};

A body with a single statement, where the braces can be omitted

interface B {
    void test(String arg);
}

B b = arg -> System.out.println("hello " + arg);

With a return value

interface E {
    String get(int arg);
}

E e = arg -> {
    int r = arg * arg;
    return String.valueOf(r);
};
// With a single expression, both return and the braces can be omitted
e = arg -> String.valueOf(arg * arg);

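One thing to note: like anonymous inner classes, lambda expressions can only capture local variables from the enclosing scope that are final or effectively final. Java 8 no longer requires you to declare such a variable final explicitly, but it still cannot be modified inside the lambda expression.

int i = 0;
A a = () -> {
    i++; // does not compile: a captured local variable must be effectively final
    // ...
};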
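
When a lambda really does need to update a counter, one common workaround is to capture a reference to a mutable holder instead of a primitive local. The sketch below assumes java.util.concurrent.atomic.AtomicInteger as the holder; the class and method names (CaptureDemo, countRuns) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: the names CaptureDemo and countRuns are not from the article.
class CaptureDemo {

    static int countRuns(int times) {
        // The reference 'counter' is never reassigned, so it is effectively final;
        // only the state of the object it points to changes.
        AtomicInteger counter = new AtomicInteger();
        Runnable task = () -> counter.incrementAndGet();
        for (int n = 0; n < times; n++) {
            task.run();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countRuns(3)); // 3
    }
}
```

The restriction applies to captured local variables only; the lambda is still free to change the state of the objects those variables refer to.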

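Lambda expressions have an even more compact form, the method reference. Look at the code below: doesn't the :: symbol look familiar? It seems Java still cannot escape the influence of the C family.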

class Math {
    int max(int x, int y) {
        return x < y ? y : x;
    }
    
    static int sum(int x, int y) {
        return x + y;
    }
}

interface Computer {
    int eval(int arg1, int arg2);
}


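// Reference a static method through the class name
Computer sumFun = Math::sum;
// Equivalent to the line above
sumFun = (x, y) -> x + y;

Math math = new Math();
// Reference an instance method through a specific object
Computer maxFun = math::max;
// Equivalent to the line above
maxFun = (x, y) -> x < y ? y : x;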

int sum = sumFun.eval(1, 2);
int max = maxFun.eval(2, 3);

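Let's extend the example above. Look at the following code and get a feel for functional programming: a function is passed as an argument, and which operation is performed is decided only when compute is actually called.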

class Biz {
    int x, y;
    Biz(int x, int y) {
        this.x = x;
        this.y = y;
    }
    int compute(Computer cpt) {
        // ...
        return cpt.eval(x, y);
    }
}
Biz biz = new Biz(1, 2);
int result = biz.compute((x, y) -> x * y);
result = biz.compute(Math::sum);

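Built-in Functional Interfaces

Java 8 ships with many built-in functional interfaces, all in the java.util.function package. They cover most everyday needs and fall mainly into the following categories:

Consumer types: parameters, no return value

Consumer<String> consumer = str -> {
    // ...
};
BiConsumer<String, String> biConsumer = (left, right) -> {
    // ...
};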

Supplier types: no parameters, a return value

Supplier<String> supplier = () -> {
    // ...
    return "hello world";
};

Function types: parameters and a return value

Function<Integer, String> function = i -> {
    // ...
    return "hello world " + i;
};
BiFunction<Integer, Integer, String> biFunction = (m, n) -> {
    int s = m + n;
    return "sum = " + s;
};

Predicate types: parameters and a boolean return value; they can be seen as a special case of Function

Predicate<String> predicate = str -> {
    // ...
    return str.charAt(0) == 'a';
};
BiPredicate<String, String> biPredicate = (left, right) -> {
    // ...
    return left.charAt(0) == right.charAt(0);
};
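
The built-in interfaces also come with default methods for combining them, which is easiest to see in a small sketch. The class and method names below (BuiltInInterfacesDemo, describe) are hypothetical; Predicate.and and Function.andThen are standard methods of java.util.function.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical example: the names BuiltInInterfacesDemo and describe are not from the article.
class BuiltInInterfacesDemo {

    static String describe(String word) {
        Predicate<String> startsWithA = s -> !s.isEmpty() && s.charAt(0) == 'a';
        Predicate<String> isShort = s -> s.length() < 5;
        Function<String, Integer> length = String::length;
        // andThen feeds the result of 'length' into the next function.
        Function<String, String> report = length.andThen(n -> "length = " + n);
        Supplier<String> fallback = () -> "rejected";

        // and() passes only when both predicates pass.
        return startsWithA.and(isShort).test(word) ? report.apply(word) : fallback.get();
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));    // length = 3
        System.out.println(describe("banana")); // rejected
    }
}
```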

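Streams for Collections

Java 8 adds stream processing to the collections framework, which gives us a convenient way to process collection data.
Stream operations fall broadly into two kinds: intermediate operations and terminal operations (we ignore for now whether an intermediate operation is stateful). A pipeline can have many intermediate operations but only one terminal operation. Intermediate operations generally transform or filter the stream's elements; they do not run when they are added, and the whole pipeline starts only once a terminal operation is invoked. Streams are also not reusable: once a stream has been consumed by a terminal operation, no further operation can be attached to it.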
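
Both properties, lazy intermediate operations and single use, can be observed directly. The sketch below records what happens and when; the names LazyStreamDemo and trace are hypothetical, and the log messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example: the names LazyStreamDemo and trace are not from the article.
class LazyStreamDemo {

    static List<String> trace() {
        List<String> log = new ArrayList<>();

        // Adding the filter does not run it.
        Stream<Integer> stream = Stream.of(1, 2, 3)
                .filter(i -> {
                    log.add("filter " + i);
                    return i > 1;
                });
        log.add("pipeline built");

        // The terminal operation starts the pipeline, one element at a time.
        stream.forEach(i -> log.add("saw " + i));

        // A second terminal operation on the same stream is rejected.
        try {
            stream.forEach(i -> log.add("again " + i));
        } catch (IllegalStateException e) {
            log.add("reuse rejected");
        }
        return log;
    }

    public static void main(String[] args) {
        // [pipeline built, filter 1, filter 2, saw 2, filter 3, saw 3, reuse rejected]
        System.out.println(trace());
    }
}
```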

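Creating a Stream

A stream can be created in roughly the following ways:

String[] array = {"1", "2", "3"}; // the initializer is missing in the source; these values are an assumed placeholder
Stream<String> stream;
// 1. Build with Stream.builder()
stream = Stream.<String>builder()
        .add("1")
        .add("2")
        .build();

// 2. Build with Stream.of, which also accepts an array
stream = Stream.of("1", "2", "3");
stream = Stream.of(array);

// 3. Build with Collection.stream(), the most common approach
Collection<String> list = Arrays.asList("1", "2", "3");
stream = list.stream();

// 4. Build primitive streams with IntStream, LongStream, and DoubleStream
IntStream intStream = IntStream.of(1, 2, 3);
LongStream longStream = LongStream.range(0L, 10L);
DoubleStream doubleStream = DoubleStream.of(1d, 2d, 3d);

// 5. Under the hood, the JDK builds all of the streams above through StreamSupport
stream = StreamSupport.stream(list.spliterator(), false);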

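Intermediate Operations

If you are familiar with Spark or Flink, you will notice that intermediate operations are much like the operators in those frameworks, down to many of the names. Calling an intermediate operation on a stream does not execute it immediately; it runs only when a terminal operation is invoked. Each example below ends with the terminal operation toArray, which converts the stream into an array.

filter takes a Predicate and drops every element for which the predicate returns false.

// Returns an array containing only the elements greater than 1
// array = [2, 3]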
Integer[] array = Stream.of(1, 2, 3)
                        .filter(i -> i > 1)
                        .toArray(Integer[]::new);

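map takes a Function and transforms each element of the stream into a new element. mapToInt, mapToLong, and mapToDouble work similarly but produce primitive streams.

// Add 10 to each element
// array = [11, 12, 13]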
Integer[] array = Stream.of(1, 2, 3)
                        .map(i -> i + 10)
                        .toArray(Integer[]::new);
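
As a brief sketch of the primitive variants mentioned above, mapToInt turns a Stream of objects into an IntStream, which offers numeric operations such as sum() directly. The names MapToIntDemo and totalLength are hypothetical.

```java
import java.util.stream.Stream;

// Hypothetical example: the names MapToIntDemo and totalLength are not from the article.
class MapToIntDemo {

    static int totalLength(String... words) {
        // mapToInt yields an IntStream, so sum() works on int values without boxing.
        return Stream.of(words)
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalLength("hello", "hi", "haha", "heheda")); // 17
    }
}
```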

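flatMap also takes a Function, but one whose return value is a Stream. Like map, it processes every element; the difference is that map turns one element into exactly one element, whereas flatMap turns one element into zero or more elements and flattens the results into a single stream. flatMapToInt, flatMapToDouble, and flatMapToLong work similarly.

// Split each element on "," and return the pieces as a Stream
// array = ["1", "2", "3", "4", "5", "6"]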
String[] array = Stream.of("1", "2,3", "4,5,6")
                       .flatMap(s -> {
                           String[] split = s.split(",");
                           return Stream.of(split);
                       })
                       .toArray(String[]::new);

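peek takes a Consumer. It runs the consumer on each element as it passes through and returns a stream of the same elements rather than new objects. The Javadoc notes that it exists mainly to support debugging.

// User is not defined in this article; it is assumed to have accessible name and age fields and a matching toString()
Stream.of(new User("James", 40), new User("Kobe", 45), new User("Durante", 35))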
      .peek(user -> {
          user.name += " NBA";
          user.age++;
      }).forEach(System.out::println);
// User(name=James NBA, age=41)
// User(name=Kobe NBA, age=46)
// User(name=Durante NBA, age=36)
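
Because peek does not change the elements, it is handy for watching what flows through each stage of a pipeline. The following sketch, with the hypothetical names PeekDemo and trace, records the elements seen before and after a filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example: the names PeekDemo and trace are not from the article.
class PeekDemo {

    static List<String> trace() {
        List<String> seen = new ArrayList<>();
        List<Integer> result = Stream.of(1, 2, 3, 4)
                .peek(i -> seen.add("before filter: " + i))
                .filter(i -> i % 2 == 0)
                .peek(i -> seen.add("after filter: " + i))
                .collect(Collectors.toList());
        seen.add("result: " + result);
        return seen;
    }

    public static void main(String[] args) {
        // Each element travels through the whole pipeline before the next one starts.
        trace().forEach(System.out::println);
    }
}
```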

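distinct is, obviously, a deduplication operation; it removes duplicates according to each element's equals method (together with a consistent hashCode).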

// array = [hello, hi]
String[] array = Stream.of("hello", "hi", "hello")
                       .distinct()
                       .toArray(String[]::new);
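
For your own classes, distinct only works as expected when equals and hashCode are overridden consistently. A minimal sketch, assuming a hypothetical Point class that is not part of the article:

```java
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical example: DistinctDemo and Point are not from the article.
class DistinctDemo {

    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static long distinctPoints() {
        // The two (1, 2) points are equal, so only one of them survives.
        return Stream.of(new Point(1, 2), new Point(1, 2), new Point(3, 4))
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints()); // 2
    }
}
```

Without the two overrides, Point would fall back to identity-based equality and all three objects would be kept.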

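sorted is, obviously, a sorting operation. The no-argument version expects the elements to be Comparable and throws a ClassCastException when the pipeline runs if they are not. You can also pass a Comparator, in which case the elements are ordered according to it.

// Sort by string length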
// sorted = [hi, haha, hello]
String[] sorted = Stream.of("hello", "hi", "haha")
                        .sorted(Comparator.comparingInt(String::length))
                        .toArray(String[]::new);
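
Comparators can also be chained, which helps when the first criterion produces ties. The sketch below sorts by length and then alphabetically, using the standard Comparator.thenComparing; the names SortedDemo and byLengthThenAlphabet, and the sample words, are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example: the names SortedDemo and byLengthThenAlphabet are not from the article.
class SortedDemo {

    static List<String> byLengthThenAlphabet(String... words) {
        return Stream.of(words)
                // Primary key: length; ties are broken by natural (alphabetical) order.
                .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.naturalOrder()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // [ha, hi, hey, haha, hello]
        System.out.println(byLengthThenAlphabet("hello", "hi", "haha", "hey", "ha"));
    }
}
```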

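limit takes a non-negative long n and truncates the stream to its first n elements. If n is greater than the length of the stream, it has no effect.

// Keep the first three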
// array = [hello, hi, haha]
String[] array = Stream.of("hello", "hi", "haha", "heheda")
                       .limit(3)
                       .toArray(String[]::new);

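skip takes a non-negative long n and discards the first n elements of the stream. If n is greater than the length of the stream, every element is skipped.

// Skip the first two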
// array = [haha, heheda]
String[] array = Stream.of("hello", "hi", "haha", "heheda")
                       .skip(2)
                       .toArray(String[]::new);
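
Used together, skip and limit give a simple form of in-memory paging. This is only a sketch under that assumption; the names PageDemo and page and the zero-based page index are choices made here for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example: the names PageDemo and page are not from the article.
class PageDemo {

    // pageIndex is zero-based; pageSize is the maximum number of items per page.
    static List<String> page(List<String> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize) // drop the earlier pages
                .limit(pageSize)                   // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "hi", "haha", "heheda", "hey");
        System.out.println(page(words, 1, 2)); // [haha, heheda]
        System.out.println(page(words, 2, 2)); // [hey]
        System.out.println(page(words, 5, 2)); // []
    }
}
```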

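Terminal Operations

A stream can have only one terminal operation. Only when a terminal operation is invoked does the stream actually start executing its intermediate operations; after the elements have passed through them, the terminal operation produces the final result.

forEach takes a Consumer. It is essentially a simple traversal that visits every element of the processed stream.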

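// Note: limit(0) truncates the stream to zero elements, so as written this prints nothing;
// raise the limit to see the elements printed
Stream.of("hello", "hi", "haha", "heheda")
      .limit(0)
      .forEach(s -> System.out.println(">>> " + s));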

toArray has already appeared many times above. It builds a new array from the result of the intermediate operations.

// array = [hello, hi, haha, heheda]
Object[] array = Stream.of("hello", "hi", "haha", "heheda")
                       .toArray();

allMatch, anyMatch, and noneMatch each take a Predicate and are used for matching queries.

// b = false
boolean b = Stream.of("hello", "hi", "haha", "heheda")
                  .allMatch(s -> s.equals("hello"));
// b = true
b = Stream.of("hello", "hi", "haha", "heheda")
          .anyMatch(s -> s.equals("hello"));
// b = true
b = Stream.of("hello", "hi", "haha", "heheda")
          .noneMatch(s -> s.equals("nihao"));

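findFirst and findAny each return one element of the stream, wrapped in an Optional. findFirst returns the first element in encounter order, while findAny is free to return any element.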

String first = Stream.of("hello", "hi", "haha", "heheda")
                     .findFirst().get();
first = Stream.of("hello", "hi", "haha", "heheda")
              .findAny().get();
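
Calling get() on an empty Optional throws a NoSuchElementException, so when the stream may be empty it is safer to supply a fallback. A minimal sketch, with the hypothetical names OptionalDemo and firstLongerThan and an arbitrary placeholder value:

```java
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical example: the names OptionalDemo and firstLongerThan are not from the article.
class OptionalDemo {

    static String firstLongerThan(int minLength, String... words) {
        Optional<String> match = Stream.of(words)
                .filter(w -> w.length() > minLength)
                .findFirst();
        // orElse supplies a fallback instead of throwing when nothing matched.
        return match.orElse("<none>");
    }

    public static void main(String[] args) {
        System.out.println(firstLongerThan(4, "hello", "hi", "haha", "heheda"));  // hello
        System.out.println(firstLongerThan(10, "hello", "hi", "haha", "heheda")); // <none>
    }
}
```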

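reduce is one of the more complex operations. It has three overloads, taking one, two, and three arguments, and is mainly used for cumulative computations. Every overload requires a two-argument accumulator function (a BinaryOperator in the first two overloads, a BiFunction in the third): its first argument is the value accumulated from the preceding elements and its second is the current element. Let's look at a few examples.

// Concatenate the strings
// reduceS = "hello ; hi ; haha ; heheda"
String reduceS = Stream.of("hello", "hi", "haha", "heheda")
                 .reduce((x, y) -> x + " ; " + y)
                 .get();

// Sum the lengths of all strings
// length = 17
int length = Stream.of("hello", "hi", "haha", "heheda")
                   .map(String::length)
                   .reduce(0, (x, y) -> x + y);

// Same as above, except that the third argument is a combiner that merges the partial results of a parallel stream
int reduce = Stream.of("hello", "hi", "haha", "heheda")
                   .reduce(0, (x, y) -> x + y.length(), (m, n) -> m + n);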
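
To see the role of the combiner, the three-argument reduce can be run on both a sequential and a parallel stream; as long as the accumulator and combiner are compatible, the result is the same. The names ReduceDemo and totalLength below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example: the names ReduceDemo and totalLength are not from the article.
class ReduceDemo {

    static int totalLength(List<String> words, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.reduce(
                0,                              // identity: the sum of no elements
                (sum, w) -> sum + w.length(),   // accumulator: add one word's length
                Integer::sum);                  // combiner: merge two partial sums
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "hi", "haha", "heheda");
        System.out.println(totalLength(words, false)); // 17
        System.out.println(totalLength(words, true));  // 17
    }
}
```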

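max, min, and count are straightforward: they return the largest element, the smallest element, and the number of elements in the stream, respectively.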

// max = "heheda"
String max = Stream.of("hello", "hi", "haha", "heheda")
                   .max(Comparator.comparingInt(String::length))
                   .get();
// min = "hi"
String min = Stream.of("hello", "hi", "haha", "heheda")
                   .min(Comparator.comparingInt(String::length))
                   .get();
// count = 4
long count = Stream.of("hello", "hi", "haha", "heheda")
                   .count();

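collect is similar to toArray, except that it typically converts the stream into a Collection or a Map. It is usually used together with the Collectors utility class. Here are a few simple examples:

// Convert to a List: [hello, hehe, hehe, hi, hi, hi]
List<String> list = Stream.of("hello", "hehe", "hehe", "hi", "hi", "hi")
                          .collect(Collectors.toList());
// Convert to a Set: [hi, hehe, hello] (the iteration order of the resulting Set is not guaranteed)
Set<String> set = Stream.of("hello", "hehe", "hehe", "hi", "hi", "hi")
                        .collect(Collectors.toSet());
// This one is a little more involved: it converts a stream of strings into a Map
// whose key is the string itself and whose value is the number of occurrences
// map = {hello=1, hehe=2, hi=3} (LinkedHashMap keeps insertion order)
Map<String, Integer> map = Stream.of("hello", "hehe", "hehe", "hi", "hi", "hi")
                                 .collect(Collectors.toMap(s -> {
                                     // The string is the key of the map
                                     return s;
                                 }, s -> {
                                     // 1 is the initial value for each key
                                     return 1;
                                 }, (x, y) -> {
                                     // Merge function used when the same key appears again
                                     return x + y;
                                 }, () -> {
                                     // The concrete Map type can also be specified
                                     return new LinkedHashMap<>();
                                 }));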
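
Collectors offers more than toList, toSet, and toMap. As a small sketch, the standard Collectors.joining and Collectors.partitioningBy are shown below; the names CollectDemo, joined, and longAndShort, and the length threshold, are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example: the names CollectDemo, joined, and longAndShort are not from the article.
class CollectDemo {

    // Concatenates the elements with a separator, a prefix, and a suffix.
    static String joined() {
        return Stream.of("hello", "hehe", "hi")
                .collect(Collectors.joining(", ", "[", "]"));
    }

    // Splits the elements into two lists keyed by the predicate result.
    static Map<Boolean, List<String>> longAndShort() {
        return Stream.of("hello", "hehe", "hi")
                .collect(Collectors.partitioningBy(w -> w.length() > 2));
    }

    public static void main(String[] args) {
        System.out.println(joined());                  // [hello, hehe, hi]
        System.out.println(longAndShort().get(true));  // [hello, hehe]
        System.out.println(longAndShort().get(false)); // [hi]
    }
}
```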

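A Word-Count Example

Finally, I'll combine some of the operations introduced above in a word-count example, to give a more direct sense of the benefits of stream processing.

Path path = Paths.get("/Users/.../test.txt");
List<String> lines = Files.readAllLines(path);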
lines.stream()
     .flatMap(line -> {
         String[] array = line.split("\\s+");
         return Stream.of(array);
     })
     .filter(w -> !w.isEmpty())
     .sorted()
     .collect(Collectors.toMap(w -> w, w -> 1,
                               (x, y) -> x + y,
                               LinkedHashMap::new))
     .forEach((k, v) -> System.out.println(k + " : " + v));

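Strictly speaking, Stream has no grouping or aggregation methods of its own; such operations are expressed through collectors. The example above counts words with toMap and a merge function, and Collectors.groupingBy combined with Collectors.counting() would achieve the same result.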
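
For comparison, here is a sketch of the same word count written with Collectors.groupingBy and Collectors.counting, both of which are part of Java 8. The names WordCountDemo and countWords are hypothetical, the input lines are passed in rather than read from a file, and a TreeMap is assumed so that the keys come out sorted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example: the names WordCountDemo and countWords are not from the article.
class WordCountDemo {

    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Stream.of(line.split("\\s+")))
                .filter(w -> !w.isEmpty())
                // Group equal words together and count each group; TreeMap sorts the keys.
                .collect(Collectors.groupingBy(Function.identity(),
                                               TreeMap::new,
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("b a", "a c  a"))); // {a=3, b=1, c=1}
    }
}
```

Note that counting() produces Long values, whereas the toMap version above produces Integer values.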

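Java 8 collections provide a parallelStream method for obtaining a parallel stream (built on the Fork/Join framework underneath). I generally do not recommend it: with small data sets a parallel stream is often less efficient than a sequential one, and with very large data sets the computing power of a single machine is limited after all, so I would still recommend the more powerful Spark or Flink for distributed computation.

That wraps up this look at the lambda and Stream features of Java 8. As a classic release, Java 8 of course offers far more than this: Doug Lea's concurrency package also received many updates in Java 8, providing a richer set of concurrency tools, and there is the new time package, among others. Each of these could be a topic of its own, and I hope to share more in future articles.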

Original content takes effort; please credit the source when reposting! www.yangxf.top/