17. Large scale machine learning

Learning with large datasets

With very large $m$ (e.g. $m = 100{,}000{,}000$), each step of batch gradient descent must sum over all $m$ examples, which motivates the cheaper alternatives below.

Stochastic gradient descent

Batch gradient descent: use all $m$ training examples in each iteration.

$J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$

Repeat {

  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$  (for every $j = 0, \dots, n$)

}
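
A minimal NumPy sketch of one batch update, assuming the linear-regression hypothesis $h_\theta(x) = \theta^T x$; the function and variable names are illustrative:

```python
import numpy as np

def batch_gradient_step(theta, X, y, alpha):
    """One iteration of batch gradient descent for linear regression:
    the gradient sums over all m training examples."""
    m = len(y)
    error = X @ theta - y                     # h_theta(x^(i)) - y^(i) for every i
    return theta - alpha * (X.T @ error) / m  # update every theta_j at once
```

Each call costs a full pass over the data, which is the bottleneck when $m$ is huge.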

Stochastic gradient descent: use 1 training example in each iteration.

$cost(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2} (h_\theta(x^{(i)}) - y^{(i)})^2$, so $J_{train}(\theta) = \frac{1}{m} \sum_{i=1}^{m} cost(\theta, (x^{(i)}, y^{(i)}))$

  1. Randomly shuffle the dataset.
  2. Repeat {
       for $i := 1, \dots, m$ {
         $\theta_j := \theta_j - \alpha (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$  (for every $j = 0, \dots, n$)
       }
     }
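
A sketch of one stochastic gradient descent pass under the same linear-regression assumption (names are illustrative):

```python
import numpy as np

def sgd_epoch(theta, X, y, alpha):
    """One pass of stochastic gradient descent:
    1. randomly shuffle the dataset,
    2. update theta using one example at a time."""
    for i in np.random.permutation(len(y)):   # step 1: random order
        error = X[i] @ theta - y[i]           # h_theta(x^(i)) - y^(i)
        theta = theta - alpha * error * X[i]  # step 2: per-example update
    return theta
```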

Mini-batch gradient descent

Mini-batch gradient descent: use $b$ examples in each iteration (batch gradient descent uses all $m$; stochastic uses 1).

$b$ = mini-batch size, typically 2–100 (e.g. $b = 10$).

Say $b = 10$, $m = 1000$:

Repeat {
  for $i = 1, 11, 21, \dots, 991$ {
    $\theta_j := \theta_j - \alpha \frac{1}{10} \sum_{k=i}^{i+9} (h_\theta(x^{(k)}) - y^{(k)}) x_j^{(k)}$  (for every $j$)
  }
}
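
A sketch of one mini-batch pass, again assuming linear regression; `b` defaults to 10 as in the example above:

```python
import numpy as np

def minibatch_epoch(theta, X, y, alpha, b=10):
    """One pass of mini-batch gradient descent with mini-batch size b."""
    m = len(y)
    for start in range(0, m, b):              # i = 1, 11, 21, ... for b = 10
        Xb, yb = X[start:start + b], y[start:start + b]
        error = Xb @ theta - yb
        theta = theta - alpha * (Xb.T @ error) / len(yb)
    return theta
```

Vectorizing the sum over the $b$ examples is what can make mini-batch faster than plain stochastic gradient descent.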

Stochastic gradient descent convergence

Checking for convergence:

  • Batch gradient descent: plot $J_{train}(\theta)$ as a function of the number of iterations (each point requires a sum over all $m$ examples).
  • Stochastic gradient descent: compute $cost(\theta, (x^{(i)}, y^{(i)}))$ just before updating $\theta$ with that example. Every 1000 iterations (say), plot this cost averaged over the last 1000 examples processed by the algorithm.

For stochastic gradient descent, the learning rate $\alpha$ is typically held constant. It can be slowly decreased over time if we want $\theta$ to converge, e.g. $\alpha = \frac{const_1}{iterationNumber + const_2}$.
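
A sketch of how the averaged cost could be collected during stochastic gradient descent, assuming linear regression; `window` plays the role of the "1000 iterations" and the names are illustrative:

```python
import numpy as np

def sgd_with_monitoring(theta, X, y, alpha, window=1000):
    """Stochastic gradient descent that records cost(theta, (x^(i), y^(i)))
    just before each update and averages the last `window` costs, so the
    averages can be plotted to check convergence."""
    costs, averages = [], []
    for i in np.random.permutation(len(y)):
        error = X[i] @ theta - y[i]
        costs.append(0.5 * error ** 2)        # cost on this example, pre-update
        theta = theta - alpha * error * X[i]
        if len(costs) % window == 0:
            averages.append(np.mean(costs[-window:]))
    return theta, averages                    # plot `averages` vs. iteration count
```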

Online learning

Learn from one example at a time as it arrives: update $\theta$ with that example, then discard it (no fixed training set is stored). This also lets the model adapt to changing user preferences over time.

Example: predicting CTR (click-through rate). For each item shown to a user, the label is $y = 1$ if the user clicks and $y = 0$ otherwise; learning $p(y = 1 \mid x; \theta)$ gives the predicted CTR, which can be used to choose what to show.
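
A sketch of a single online update, assuming a logistic-regression model for the click/no-click label (the function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_update(theta, x, y, alpha):
    """Process one (x, y) pair as it arrives, then discard it.
    y = 1 if the user clicked, 0 otherwise; sigmoid(x @ theta) is the
    model's estimate of p(y = 1 | x; theta), i.e. the predicted CTR."""
    return theta - alpha * (sigmoid(x @ theta) - y) * x
```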

Map-reduce and data parallelism

Divide the work into many parts, compute the parts in parallel on different machines, and combine the results.

Map-reduce and summation over the training set:

Many learning algorithms can be expressed as computing sums of functions over the training set, and such sums can be split across machines: machine $k$ computes a partial sum over its share of the data, e.g. $temp_j^{(k)} = \sum_i (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$, and a master node combines them, $\theta_j := \theta_j - \alpha \frac{1}{m} (temp_j^{(1)} + \dots + temp_j^{(4)})$ for 4 machines; see the sketch below.
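
A sketch of this split-and-combine idea, simulating the machines with local worker processes via Python's multiprocessing (on some platforms this must run under an `if __name__ == "__main__":` guard); linear regression is assumed as above:

```python
import numpy as np
from multiprocessing import Pool

def partial_sum(args):
    """Map step: one worker computes the gradient sum over its slice."""
    theta, X_part, y_part = args
    return X_part.T @ (X_part @ theta - y_part)

def mapreduce_gradient_step(theta, X, y, alpha, workers=4):
    """Split the training set, compute partial sums in parallel,
    then combine them in the reduce step for one batch update."""
    parts = zip(np.array_split(X, workers), np.array_split(y, workers))
    with Pool(workers) as pool:
        temps = pool.map(partial_sum, [(theta, Xp, yp) for Xp, yp in parts])
    return theta - alpha * sum(temps) / len(y)  # reduce: combine and update
```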

Multi-core machines: the same split-and-combine idea applies across the cores of a single machine, without network latency. Some numerical linear-algebra libraries already parallelize across cores, so a well-vectorized implementation may get this speedup automatically.
