GD:Learning Rate

It has been proven that if the learning rate α is sufficiently small, then J(θ) will decrease on every iteration. In other words, if an iteration fails to decrease the cost function, the learning rate is probably too large.

Automatic convergence test: declare convergence if J(θ) decreases by less than E in one iteration, where E is some small value such as 10^-3. However, in practice it is difficult to choose this threshold value.
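As a rough illustration, the sketch below implements this stopping rule for a generic cost and gradient. The function names (cost, grad) and the toy quadratic objective are assumptions for the example, not part of the original notes.

```python
def gradient_descent(cost, grad, theta, alpha=0.01, eps=1e-3, max_iters=10_000):
    """Run gradient descent; declare convergence when J(θ) decreases
    by less than eps in a single iteration (the automatic test above)."""
    prev_cost = cost(theta)
    for i in range(max_iters):
        theta = theta - alpha * grad(theta)      # one gradient step
        curr_cost = cost(theta)
        if prev_cost - curr_cost < eps:          # decrease smaller than threshold E
            print(f"Converged after {i + 1} iterations, J = {curr_cost:.6f}")
            break
        prev_cost = curr_cost
    return theta

# Example: minimize J(θ) = (θ - 3)^2, whose minimum is at θ = 3
theta_hat = gradient_descent(cost=lambda t: (t - 3) ** 2,
                             grad=lambda t: 2 * (t - 3),
                             theta=0.0,
                             alpha=0.1)
```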

To summarize:

If α is too small: slow convergence.

If α is too large: may not decrease on every iteration and thus may not converge.
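Both failure modes can be seen on a toy objective. The snippet below uses an assumed example J(θ) = θ² (so the gradient is 2θ) and three hand-picked learning rates to show slow convergence versus divergence.

```python
# Illustrative only: J(θ) = θ², gradient ∇J(θ) = 2θ, minimum at θ = 0.
def run(alpha, steps=20, theta=1.0):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # update θ := θ - α ∇J(θ)
    return theta

print(run(alpha=0.001))  # too small: θ ≈ 0.96 after 20 steps, barely moved (slow convergence)
print(run(alpha=0.1))    # reasonable: θ ≈ 0.012, approaching the minimum
print(run(alpha=1.1))    # too large: |θ| ≈ 38, each update overshoots and J(θ) keeps growing
```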
