Big Data Integration

Presenter: Zou Yanyan

Challenges: the 4 Vs

  • Volume
  • Velocity
  • Variety
  • Veracity

Steps:

  1. Schema Mapping
  2. Record Linkage: blocking -> pairwise matching -> clustering
  3. Data Fusion: voting -> source quality -> copy detection
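
The record linkage stages and the simplest data fusion strategy (voting) listed above can be sketched in a few lines of Python. This is only a minimal illustration, not the presenter's method: the toy records, the choice of `zip` as the blocking key, the `difflib` name similarity, the 0.85 threshold, and the function names `block`, `match_pairs`, `cluster`, and `fuse_by_vote` are all assumptions made for the example. Source quality and copy detection are not modeled here.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; every value is invented for illustration.
records = [
    {"id": 0, "name": "Jon Smith",  "zip": "10001", "phone": "555-0100"},
    {"id": 1, "name": "John Smith", "zip": "10001", "phone": "555-0100"},
    {"id": 2, "name": "John Smith", "zip": "10001", "phone": "555-0199"},
    {"id": 3, "name": "Mary Jones", "zip": "94105", "phone": "555-0142"},
]

def block(records, key="zip"):
    """Blocking: group records by a cheap key so only same-block records are compared."""
    blocks = defaultdict(list)
    for r in records:
        blocks[r[key]].append(r)
    return blocks

def match_pairs(blocks, threshold=0.85):
    """Pairwise matching: compare names inside each block with a similarity ratio."""
    matches = []
    for group in blocks.values():
        for a, b in combinations(group, 2):
            if SequenceMatcher(None, a["name"], b["name"]).ratio() >= threshold:
                matches.append((a["id"], b["id"]))
    return matches

def cluster(records, matches):
    """Clustering: connected components over the matched pairs (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r)
    return list(groups.values())

def fuse_by_vote(cluster_records, attr):
    """Data fusion by voting: keep the most frequent value of an attribute."""
    return Counter(r[attr] for r in cluster_records).most_common(1)[0][0]

clusters = cluster(records, match_pairs(block(records)))
for c in clusters:
    print([r["id"] for r in c], fuse_by_vote(c, "name"), fuse_by_vote(c, "phone"))
```

Blocking is what keeps the pairwise step tractable at large volume: only records sharing a block key are compared, at the risk of missing true matches that fall into different blocks. Plain voting treats every source as equally reliable, which is the limitation that source quality weighting and copy detection are meant to address.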
