Inspiration from Apache HAWQ

Aspects

Interconnect

UDP(User Datagram Protocol) , additional packet verification, the reliability is equivalent to TCP(Transmission Control Protocol), and the performance and scalability exceeds that of TCP

Execution Runtime

dynamic DOP(degree of parallelism) on each Segment

small point: consume splittable data file concurrently

virtual segment allocation policy, several aspects to decide the allocation per query

Maybe we pay too much attention on master-side scheduling and slave-side parallelism for each query.

There is another way, that each slave decides its own the DOP. Task is not forced to launch or setup something for execution decided by master once running. Slave should maintain its own resources(containers) to assign its allocation dynamically for query.

Resource Management

fine-grained resource management above YARN.

neccessary for online system, especially OLAP.

Query Processing

motion occurs when tuples need moving between segments.

slice occurs when the query occurs motion.

To do hash join, a Redistribue Motion involved, one table is redistributed given the join key.

Inspiration from Apache HAWQ_第1张图片

Typical problem solved as a MPP system. DAG also can solve this problem for OLAP, but this is the key distinguish between MPP and DAG.

The approach behind this is largely different in MPP and DAG. But to be honest, it is not that different, maybe a hybrid runtime can cover these two.

GPORCA

sth.

Runaway Query Termination

neccessary for online system, especially OLAP.

你可能感兴趣的:(olap,Runtime,MPP)