MapReduce interview question: MapReduce join

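1. Map join
Disadvantage: only suitable for joining a large table with a small table, because the small table must fit in memory on every node.
Advantage: no data skew, because the join is done in the map phase and records are never shuffled by key.
Implementation: add the small table to the distributed cache so it is shipped to every compute node, then build an in-memory index on the join key.
// Ship the small table to every node through the distributed cache
job.addCacheFile(new URI("xxxxxxx"));
// Map-only job: no shuffle and no reduce phase
job.setNumReduceTasks(0);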
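
The lookup logic behind a map join can be sketched in plain Java, without any Hadoop dependency. This is only an illustration under assumptions: the class name `MapJoinSketch`, the comma-separated `key,value` record layout, and the sample rows are all hypothetical, and the small table is assumed to have unique join keys and well-formed lines. In a real job the index would typically be built once per map task from the cached file, and each call to the mapper would perform the lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapJoinSketch {

    // Build an in-memory index of the small table, keyed by the join key.
    // Assumes each line looks like "key,value" and keys are unique.
    static Map<String, String> buildIndex(List<String> smallTable) {
        Map<String, String> index = new HashMap<>();
        for (String line : smallTable) {
            String[] fields = line.split(",", 2);
            index.put(fields[0], fields[1]);
        }
        return index;
    }

    // Join the large table against the index, one record at a time.
    // Records whose key is missing from the index are dropped (inner join).
    static List<String> mapJoin(List<String> smallTable, List<String> largeTable) {
        Map<String, String> index = buildIndex(smallTable);
        List<String> joined = new ArrayList<>();
        for (String line : largeTable) {
            String[] fields = line.split(",", 2);
            String match = index.get(fields[0]);
            if (match != null) {
                joined.add(fields[0] + "," + fields[1] + "," + match);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: small table = id,city; large table = id,name
        List<String> small = Arrays.asList("1,Beijing", "2,Shanghai");
        List<String> large = Arrays.asList("1,alice", "3,bob", "2,carol");
        for (String row : mapJoin(small, large)) {
            System.out.println(row);
        }
    }
}
```

Because every large-table record is joined independently by a hash lookup, no record ever has to be routed by key, which is why this approach does not suffer from skew.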
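2. Reduce join
Disadvantage: data skew can occur, because all records with the same join key are sent to the same reducer, so a very frequent key can overload a single reduce task.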
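
For comparison, a reduce join can be simulated in plain Java as follows. This is a rough sketch, not Hadoop code: the class name `ReduceJoinSketch`, the `A:`/`B:` source tags, the `key,value` record layout, and the sample rows are hypothetical, and a `TreeMap` stands in for the shuffle that groups values by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceJoinSketch {

    // "Map" phase: tag each record with its source table and emit it under
    // its join key. The shared map plays the role of the shuffle.
    static void mapPhase(List<String> table, String tag, Map<String, List<String>> shuffle) {
        for (String line : table) {
            String[] fields = line.split(",", 2);
            shuffle.computeIfAbsent(fields[0], k -> new ArrayList<>())
                   .add(tag + ":" + fields[1]);
        }
    }

    // "Reduce" phase for one key: split the values by source tag, then emit
    // every pairing of a table-A value with a table-B value.
    static List<String> reducePhase(String key, List<String> taggedValues) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String value : taggedValues) {
            if (value.startsWith("A:")) {
                fromA.add(value.substring(2));
            } else {
                fromB.add(value.substring(2));
            }
        }
        List<String> joined = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                joined.add(key + "," + a + "," + b);
            }
        }
        return joined;
    }

    static List<String> reduceJoin(List<String> tableA, List<String> tableB) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        mapPhase(tableA, "A", shuffle);
        mapPhase(tableB, "B", shuffle);
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : shuffle.entrySet()) {
            joined.addAll(reducePhase(group.getKey(), group.getValue()));
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: table A = id,name; table B = id,city
        List<String> tableA = Arrays.asList("1,alice", "1,bob", "2,carol");
        List<String> tableB = Arrays.asList("1,Beijing", "3,Guangzhou");
        for (String row : reduceJoin(tableA, tableB)) {
            System.out.println(row);
        }
    }
}
```

In this simulation, all values for a key end up in one list handled by a single `reducePhase` call. If one key accounts for most of the records, that single call does most of the work, which is the skew problem noted above.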
