Let's walk through the source code with an example:
Arrays.asList("a", "b", "c").stream().filter(e -> !e.equals("B")).skip(2).forEach(e -> System.out.println(e));
First look at stream(), which is Collection's default stream() method:
// Collection.stream()
default Stream<E> stream() {
    return StreamSupport.stream(spliterator(), false);
}

// StreamSupport.stream()
public static <T> Stream<T> stream(Spliterator<T> spliterator, boolean parallel) {
    Objects.requireNonNull(spliterator);
    return new ReferencePipeline.Head<>(spliterator,
                                        StreamOpFlag.fromCharacteristics(spliterator),
                                        parallel);
}
So stream() creates a ReferencePipeline.Head object. Head is a subclass of ReferencePipeline, and ReferencePipeline in turn implements the Stream interface, so this call is what actually produces our Stream instance.
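If you want to see this for yourself, a quick check like the following works (HeadCheck is just a throwaway demo class; the printed class name is an OpenJDK implementation detail, not public API):

import java.util.Arrays;

public class HeadCheck {
    public static void main(String[] args) {
        // Prints the concrete Stream implementation class; on OpenJDK this is
        // java.util.stream.ReferencePipeline$Head (the exact name may vary by JDK).
        System.out.println(Arrays.asList("a", "b", "c").stream().getClass().getName());
    }
}

Next, the filter() method: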
public final Stream<P_OUT> filter(Predicate<? super P_OUT> predicate) {
    Objects.requireNonNull(predicate);
    return new StatelessOp<P_OUT, P_OUT>(this, StreamShape.REFERENCE,
                                         StreamOpFlag.NOT_SIZED) {
        @Override
        Sink<P_OUT> opWrapSink(int flags, Sink<P_OUT> sink) {
            return new Sink.ChainedReference<P_OUT, P_OUT>(sink) {
                @Override
                public void begin(long size) {
                    downstream.begin(-1);
                }

                @Override
                public void accept(P_OUT u) {
                    if (predicate.test(u))
                        downstream.accept(u);
                }
            };
        }
    };
}
filter() returns a StatelessOp whose opWrapSink() builds a Sink that only overrides begin() and accept(). The four Sink methods, begin(), accept(), end(), and cancellationRequested(), are the core of how the pipeline executes; we will come back to them shortly.
Next is skip(), which delegates to SliceOps.makeRef():
public static <T> Stream<T> makeRef(AbstractPipeline<?, T, ?> upstream,
                                    long skip, long limit) {
    if (skip < 0)
        throw new IllegalArgumentException("Skip must be non-negative: " + skip);

    return new ReferencePipeline.StatefulOp<T, T>(upstream, StreamShape.REFERENCE,
                                                  flags(limit)) {
        Spliterator<T> unorderedSkipLimitSpliterator(Spliterator<T> s,
                                                     long skip, long limit, long sizeIfKnown) {
            if (skip <= sizeIfKnown) {
                // Use just the limit if the number of elements
                // to skip is <= the known pipeline size
                limit = limit >= 0 ? Math.min(limit, sizeIfKnown - skip) : sizeIfKnown - skip;
                skip = 0;
            }
            return new StreamSpliterators.UnorderedSliceSpliterator.OfRef<>(s, skip, limit);
        }
        // ... the anonymous class continues with the overrides shown below
So skip() returns a ReferencePipeline.StatefulOp, i.e. a stateful operation. Two of its overridden methods are worth a look. The first is opEvaluateParallel():
@Override
<P_IN> Node<T> opEvaluateParallel(PipelineHelper<T> helper,
                                  Spliterator<P_IN> spliterator,
                                  IntFunction<T[]> generator) {
    long size = helper.exactOutputSizeIfKnown(spliterator);
    if (size > 0 && spliterator.hasCharacteristics(Spliterator.SUBSIZED)) {
        // Because the pipeline is SIZED the slice spliterator
        // can be created from the source, this requires matching
        // to shape of the source, and is potentially more efficient
        // than creating the slice spliterator from the pipeline
        // wrapping spliterator
        Spliterator<P_IN> s = sliceSpliterator(helper.getSourceShape(), spliterator, skip, limit);
        return Nodes.collect(helper, s, true, generator);
    }
    // ... remaining branches omitted
It also overrides the opWrapSink() method:
@Override
Sink<T> opWrapSink(int flags, Sink<T> sink) {
    return new Sink.ChainedReference<T, T>(sink) {
        long n = skip;
        long m = limit >= 0 ? limit : Long.MAX_VALUE;

        @Override
        public void begin(long size) {
            downstream.begin(calcSize(size, skip, m));
        }

        @Override
        public void accept(T t) {
            if (n == 0) {
                if (m > 0) {
                    m--;
                    downstream.accept(t);
                }
            }
            else {
                n--;
            }
        }

        @Override
        public boolean cancellationRequested() {
            return m == 0 || downstream.cancellationRequested();
        }
    };
}
This time the Sink overrides begin(), accept(), and cancellationRequested(): n counts down the elements still to be skipped, and m counts how many elements the downstream may still receive (relevant when a limit is set).
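cancellationRequested() is what lets the slice operation stop pulling elements early. A small illustration of the observable effect (SliceShortCircuit is just a made-up demo class, not JDK code): with an infinite source, skip() combined with limit() still terminates, because once m reaches 0 the traversal is cancelled:

import java.util.stream.Stream;

public class SliceShortCircuit {
    public static void main(String[] args) {
        // An infinite stream: 0, 1, 2, 3, ...
        // skip(2).limit(3) prints 2, 3, 4 and then stops, because the slice sink's
        // cancellationRequested() returns true once the limit counter m hits 0.
        Stream.iterate(0, i -> i + 1)
              .skip(2)
              .limit(3)
              .forEach(System.out::println);
    }
}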
We can see that each of these Sinks overrides one or more of Sink's begin(), accept(), end(), and cancellationRequested() methods; this is the core of the pipeline. The downstream field seen in the code above refers to the next Sink in the chain, i.e. the next stage of the stream. A Stream executes by first calling begin() down the whole chain, then accept() for every element, then end() down the chain; for short-circuit operations, cancellationRequested() is polled during the traversal so it can stop early. So which step actually sets all of this in motion? This line:
forEach(e -> System.out.println(e))
All of the processing happens only when the terminal operation collects the result, as shown below:
@Override
public void forEach(Consumer<? super P_OUT> action) {
    evaluate(ForEachOps.makeRef(action, false));
}
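Before following evaluate(), it helps to see that nothing actually runs until this terminal operation is invoked. A small demo (LazyPipeline is just an illustrative name) that makes the laziness visible:

import java.util.Arrays;
import java.util.stream.Stream;

public class LazyPipeline {
    public static void main(String[] args) {
        Stream<String> s = Arrays.asList("a", "b", "c").stream()
                .filter(e -> {
                    // This predicate does not run when filter() is called;
                    // filter() only builds a new StatelessOp stage.
                    System.out.println("filter sees: " + e);
                    return !e.equals("B");
                })
                .skip(2);

        System.out.println("pipeline built, nothing filtered yet");

        // Only now, when the terminal operation runs, are the sinks wrapped
        // and the elements pushed through filter and skip.
        s.forEach(e -> System.out.println("forEach got: " + e));
    }
}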
ReferencePipeline's forEach() wraps our lambda in a TerminalOp via ForEachOps.makeRef() and hands it to evaluate(). Let's follow that:
final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
    assert getOutputShape() == terminalOp.inputShape();
    if (linkedOrConsumed)
        throw new IllegalStateException(MSG_STREAM_LINKED);
    linkedOrConsumed = true;

    return isParallel()
           ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
           : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
}
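Note the linkedOrConsumed check at the top: a pipeline can be consumed by exactly one terminal operation. A quick example (StreamReuse is just a demo class):

import java.util.stream.Stream;

public class StreamReuse {
    public static void main(String[] args) {
        Stream<String> s = Stream.of("a", "b", "c");
        s.forEach(System.out::println);   // first terminal operation: fine
        // The second terminal operation fails, because evaluate() already set
        // linkedOrConsumed = true on this pipeline:
        // java.lang.IllegalStateException: stream has already been operated upon or closed
        s.forEach(System.out::println);
    }
}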
This is AbstractPipeline.evaluate(). Since our stream is not parallel, the second branch is taken, calling the terminal operation's evaluateSequential():
@Override
public <S> Void evaluateSequential(PipelineHelper<T> helper,
                                   Spliterator<S> spliterator) {
    return helper.wrapAndCopyInto(this, spliterator).get();
}
This is the evaluateSequential() of ForEachOps' ForEachOp. The helper parameter is our ReferencePipeline (the last stage of the pipeline), and this, the ForEachOp itself, is a Sink and is passed in as the terminal sink to be wrapped:
@Override
final <P_IN, S extends Sink<E_OUT>> S wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator) {
    copyInto(wrapSink(Objects.requireNonNull(sink)), spliterator);
    return sink;
}

@Override
@SuppressWarnings("unchecked")
final <P_IN> Sink<P_IN> wrapSink(Sink<E_OUT> sink) {
    Objects.requireNonNull(sink);

    for ( @SuppressWarnings("rawtypes") AbstractPipeline p=AbstractPipeline.this; p.depth > 0; p=p.previousStage) {
        sink = p.opWrapSink(p.previousStage.combinedFlags, sink);
    }
    return (Sink<P_IN>) sink;
}
The for loop walks backwards through the pipeline using AbstractPipeline's depth field: the source Head has depth 0, and every intermediate stage built on top of it gets the previous stage's depth plus one. Starting from the last stage, each stage's opWrapSink() wraps the sink built so far, so by the time the loop reaches the source, the terminal Sink is wrapped inside skip's Sink, which is wrapped inside filter's Sink.
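To make the wrapping and the begin/accept/end protocol concrete, here is a heavily simplified standalone model. MiniSink, filterSink, skipSink, and the driver loop are all made up for illustration; they only mimic the shape of the JDK's package-private Sink and of copyInto(), they are not the real classes:

import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class MiniPipeline {

    // A made-up stand-in for java.util.stream.Sink (which is package-private in the JDK).
    interface MiniSink<T> {
        default void begin(long size) {}
        void accept(T t);
        default void end() {}
        default boolean cancellationRequested() { return false; }
    }

    // Mimics the Sink returned by filter's opWrapSink(): forwards only matching elements.
    static <T> MiniSink<T> filterSink(Predicate<T> predicate, MiniSink<T> downstream) {
        return new MiniSink<T>() {
            public void begin(long size) { downstream.begin(-1); } // size unknown after filtering
            public void accept(T t) { if (predicate.test(t)) downstream.accept(t); }
            public void end() { downstream.end(); }
            public boolean cancellationRequested() { return downstream.cancellationRequested(); }
        };
    }

    // Mimics the Sink returned by skip's opWrapSink(): drops the first n elements.
    static <T> MiniSink<T> skipSink(long skip, MiniSink<T> downstream) {
        return new MiniSink<T>() {
            long n = skip;
            public void begin(long size) { downstream.begin(size < 0 ? -1 : Math.max(0, size - skip)); }
            public void accept(T t) { if (n == 0) downstream.accept(t); else n--; }
            public void end() { downstream.end(); }
            public boolean cancellationRequested() { return downstream.cancellationRequested(); }
        };
    }

    public static void main(String[] args) {
        // Terminal sink: plays the role of the ForEachOp, it just prints each element.
        MiniSink<String> terminal = t -> System.out.println("terminal got: " + t);

        // wrapSink() equivalent: wrap from the last operation back to the first,
        // so the outermost sink belongs to the first operation (filter).
        MiniSink<String> chain = filterSink(e -> !e.equals("B"), skipSink(2, terminal));

        // copyInto() equivalent: begin once, accept every element, end once.
        List<String> source = Arrays.asList("a", "b", "c");
        chain.begin(source.size());
        source.forEach(chain::accept);
        chain.end();
    }
}

Running this prints "terminal got: c", which matches what the real pipeline in our running example produces.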
Now let's look at the real copyInto() method:
@Override
final <P_IN> void copyInto(Sink<P_IN> wrappedSink, Spliterator<P_IN> spliterator) {
    Objects.requireNonNull(wrappedSink);

    if (!StreamOpFlag.SHORT_CIRCUIT.isKnown(getStreamAndOpFlags())) {
        wrappedSink.begin(spliterator.getExactSizeIfKnown());
        spliterator.forEachRemaining(wrappedSink);
        wrappedSink.end();
    }
    else {
        copyIntoWithCancel(wrappedSink, spliterator);
    }
}
Here begin() is called first, forEachRemaining() drives accept() for every element, and finally end() is called; in the short-circuit case, copyIntoWithCancel() additionally polls cancellationRequested() between elements. That completes the whole flow. One question may remain: if we call sorted() and then skip(), how can skip drop the first two elements of the sorted order? This is where the design is clever. Look at the Sink built by sorted()'s opWrapSink():
@Override
public void begin(long size) {
    if (size >= Nodes.MAX_ARRAY_SIZE)
        throw new IllegalArgumentException(Nodes.BAD_SIZE);
    array = (T[]) new Object[(int) size];
}

@Override
public void end() {
    Arrays.sort(array, 0, offset, comparator);
    downstream.begin(offset);
    if (!cancellationWasRequested) {
        for (int i = 0; i < offset; i++)
            downstream.accept(array[i]);
    }
    else {
        for (int i = 0; i < offset && !downstream.cancellationRequested(); i++)
            downstream.accept(array[i]);
    }
    downstream.end();
    array = null;
}

@Override
public void accept(T t) {
    array[offset++] = t;
}
sorted()'s Sink does not forward anything in accept(); it just buffers every element into an array. Only in end() does it sort the buffer and then push the sorted elements to the downstream Sink with its own begin()/accept()/end() calls. So by the time skip's Sink sees any element, the data is already in sorted order, which solves the ordering problem.
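A quick way to confirm this behaviour from the outside (SortedThenSkip is just a demo class):

import java.util.stream.Stream;

public class SortedThenSkip {
    public static void main(String[] args) {
        // The source order is c, a, b. sorted() buffers and sorts to a, b, c,
        // and only then feeds skip's Sink, so skip(2) drops a and b and prints c.
        Stream.of("c", "a", "b")
              .sorted()
              .skip(2)
              .forEach(System.out::println);
    }
}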
That is the basic flow of a Stream pipeline. The other operations work in much the same way, so if you want to study any of them, you can follow the same approach.