Federated Learning: Google I/O '19 Notes

Federated Learning: Machine Learning on Decentralized data

https://www.youtube.com/watch?v=89BGjQYA0uE

Table of Contents

  • Federated Learning: Machine Learning on Decentralized data
  • 1. Decentralized data
    • Edge devices
    • Gboard: mobile keyboard
      • Gboard machine learning
      • Gboard data: more, better, decentralized
  • Privacy Principles - like ephemeral reports
  • Privacy Technology
  • 2. Federated computation: Map-reduce for the next generation
    • Federated computation: only-in-aggregation
      • on device dataset
    • Federated computation: Federated aggregation
      • Federated computation challenges
      • Round completion rate by hours (US)
    • Federated computation: secure aggregation, focused collection
      • Gboard: word counts
  • How to compute word frequencies ⇒ **Focused collection**
  • 3. Federated learning
    • Model engineer workflow
      • Federated model averaging
    • FL vs. in-datacenter learning
    • When does it apply?
      • Gboard: language modeling
  • 4. TensorFlow Federated
    • What's in the box
    • Federated computation in TFF
    • Federated learning and corgis

1. Decentralized data

Edge devices

can data live at the edge?

On-device inference offers

  • improved latency
  • works offline
  • better battery life
  • privacy advantages

But what about analytics? How can a model learn from data that stays on edge devices?

⇒ federated learning

Gboard: mobile keyboard

Gboard machine learning

Models are essential for:

  • tap typing
  • gesture typing
  • auto-corrections
  • predictions
  • voice to text
  • and more…

all on-device for latency and reliability

Gboard data: more, better, decentralized

on-device cache of local interactions: touch points, typed text, context, and more.

Used exclusively for federated learning and computation ⇒

Federated Learning: Collaborative Machine Learning without Centralized Training Data

Privacy Principles - like ephemeral reports

  1. Only-in-aggregation: engineer may only access combined device reports

  2. Ephemeral reports: never persist per-device reports

  3. Focused collection: devices report only what is needed for this computation

  4. Don’t memorize individuals’ data ⇒ Differential Privacy: statistical science of learning common patterns in a dataset without memorizing individual examples

    Differential privacy complements federated learning: use noise to obscure an individual’s impact on the learned model.

    tensorflow/privacy
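The noise idea can be illustrated with the classic Laplace mechanism applied to a simple count. This toy sketch is ours, not from the talk; the tensorflow/privacy library mentioned above provides production-grade DP mechanisms.

```python
import random

def dp_count(values, epsilon=1.0):
    """Differentially private count. Each user contributes at most 1,
    so the sensitivity is 1 and Laplace noise with scale 1/epsilon
    hides whether any single user is present in the dataset."""
    true_count = sum(1 for v in values if v)
    # Difference of two exponentials ~ Laplace(0, 1/epsilon)
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

print(dp_count([True] * 100))  # typically within a few units of 100
```

Smaller epsilon means more noise and stronger privacy; the common pattern is learned while any individual example's impact is obscured.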

Privacy Technology

  1. On-device datasets ⇒ keep raw data local. Expire old data. Encrypt at rest.

  2. Federated aggregation ⇒ combine reports from multiple devices.

  3. Secure aggregation: compute a (vector) sum of encrypted device reports; e.g. Gboard word count

    it’s a practical protocol with

    • security guarantees
    • communication efficiency
    • dropout tolerance
  4. Federated model averaging: many steps of gradient descent on each device

  5. Differentially Private model averaging

    1. devices “clip” their updates if they are too large, bounding any single device’s contribution

    2. the server adds noise to the aggregated update, obscuring any individual’s impact on the model
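The secure aggregation item above (the server learns only a sum of device reports) rests on pairwise cancelling masks. The real protocol adds key agreement, communication-efficient secret sharing, and dropout tolerance; the function below is a toy illustration with illustrative names.

```python
import random

MODULUS = 1 << 32  # reports are summed in a finite group

def masked_reports(reports):
    """Toy secure aggregation: every pair of devices (i, j) shares a
    random mask; device i adds it and device j subtracts it, so all
    masks cancel in the sum and the server learns only the total."""
    masked = list(reports)
    n = len(reports)
    for i in range(n):
        for j in range(i + 1, n):
            mask = random.randrange(MODULUS)  # shared pairwise secret
            masked[i] = (masked[i] + mask) % MODULUS
            masked[j] = (masked[j] - mask) % MODULUS
    return masked

reports = [3, 5, 9]  # e.g. per-device word counts
total = sum(masked_reports(reports)) % MODULUS
print(total)  # 17: the sum survives, each masked report looks random
```

Each masked report is uniformly random on its own, which is why the server can satisfy the only-in-aggregation principle even for sensitive vectors like word counts.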
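Items 4 and 5 above can be sketched together: each device runs many local gradient steps and reports only its clipped update, and the server averages the updates and adds noise. A minimal 1-D least-squares sketch; the learning rate, clip bound, and noise scale are illustrative, not from the talk.

```python
import random

def local_update(w0, data, lr=0.1, steps=5):
    """Several steps of gradient descent on one device's local data
    (1-D least squares: loss = (w * x - y) ** 2). The device reports
    only the weight delta, never its raw data."""
    w = w0
    for _ in range(steps):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x
    return w - w0

def dp_federated_average(w, device_data, clip=1.0, noise_scale=0.1):
    """Federated model averaging with the two DP steps:
    clip each device's update, then add noise to the average."""
    updates = []
    for data in device_data:
        u = local_update(w, data)
        if abs(u) > clip:                   # step 1: clip large updates
            u = clip if u > 0 else -clip
        updates.append(u)
    avg = sum(updates) / len(updates)
    return w + avg + random.gauss(0, noise_scale)  # step 2: add noise
```

Repeated rounds move the global weight toward the devices' common optimum while no device ever uploads its (x, y) pairs and no single device can dominate the average.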
