CCJ PRML Study Note - Chapter 1.3-1.4 : Model Selection & the Curse of Dimensionality

Christopher M. Bishop, PRML, Chapter 1 Introduction

1. Model Selection

In our example of polynomial curve fitting using least squares:

1.1 Parameters and Model Complexity

  • Order of polynomial: there was an optimal order of polynomial that gave the best generalization. The order of the polynomial controls the number of free parameters in the model and thereby governs the model complexity.
  • Regularization coefficient: with regularized least squares, the regularization coefficient λ also controls the effective complexity of the model.
  • More complex models: for models such as mixture distributions or neural networks, there may be several parameters governing complexity. (A minimal curve-fitting sketch follows this list.)
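To make the two complexity controls above concrete, here is a minimal sketch (not the book's code) of regularized least-squares polynomial fitting. The polynomial degree and the coefficient lam (standing for λ) are the two knobs discussed above; the helper names and the toy sin(2πx) data are illustrative assumptions.

```python
import numpy as np

def fit_polynomial(x, t, degree, lam=0.0):
    """Regularized least-squares fit of a degree-M polynomial.

    Returns the weights w minimizing ||Phi w - t||^2 + lam * ||w||^2,
    where Phi is the polynomial design matrix.
    """
    # Design matrix: columns are x^0, x^1, ..., x^degree
    Phi = np.vander(x, N=degree + 1, increasing=True)
    # Closed-form ridge solution: (Phi^T Phi + lam I)^{-1} Phi^T t
    A = Phi.T @ Phi + lam * np.eye(degree + 1)
    return np.linalg.solve(A, Phi.T @ t)

def predict(x, w):
    Phi = np.vander(x, N=len(w), increasing=True)
    return Phi @ w

# Toy data: noisy samples of sin(2*pi*x), as in PRML's curve-fitting example
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.3, 10)

# Degree M = 9 with ln(lambda) = -18, echoing the values used in the book
w = fit_polynomial(x_train, t_train, degree=9, lam=np.exp(-18))
print(predict(np.array([0.5]), w))
```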

1.2 Model Selection and Why?

The principal objective in determining the values of such parameters is

  • usually to achieve the best predictive performance on new data,
  • which requires finding the appropriate values for complexity parameters within a given model,
  • and may also involve considering a range of different types of model in order to find the best one for our particular application.

1.3 Category 1 - a training set, a validation set, a third test set

If data is plentiful, then one approach is simply to use some of the available data

  • to train 1) a range of models, or 2) a given model with a range of values for its complexity parameters,
  • and then to compare them on independent data (called a validation set), and select the one having the best predictive performance.
  • If the model design is iterated many times using a limited-size data set, then some over-fitting to the validation data can occur, and so it may be necessary to keep aside a third test set on which the performance of the selected model is finally evaluated. (A minimal split-and-select sketch follows this list.)
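The sketch below illustrates this train / validation / test procedure. It reuses the hypothetical fit_polynomial and predict helpers from the sketch in Section 1.1; the split sizes and the candidate polynomial orders are arbitrary illustrations, not prescriptions from the book.

```python
import numpy as np

def mse(y_pred, y_true):
    return float(np.mean((y_pred - y_true) ** 2))

# Hypothetical "plentiful data" setting: 300 noisy samples of sin(2*pi*x)
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 300)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)

# Split into training, validation, and a held-back test set
idx = rng.permutation(x.size)
train, val, test = idx[:200], idx[200:250], idx[250:]

# Compare a range of complexity settings (polynomial order M) on the
# validation set, then evaluate the selected model once on the test set.
best_M, best_err = None, np.inf
for M in range(1, 10):
    w = fit_polynomial(x[train], t[train], degree=M)  # helper from Section 1.1 sketch
    err = mse(predict(x[val], w), t[val])
    if err < best_err:
        best_M, best_err = M, err

w = fit_polynomial(x[train], t[train], degree=best_M)
print("selected order:", best_M, "test MSE:", mse(predict(x[test], w), t[test]))
```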

1.4 Category 2 - cross validation

Problem: What if the supply of data for training and testing is limited?

  • in order to build good models, we wish to use as much of the available data as possible for training.
  • However, if the validation set is small, it will give a relatively noisy estimate of predictive performance.
  • One solution to this dilemma is to use cross-validation, which is illustrated in Figure 1.18.

[Figure 1.18: the technique of S-fold cross-validation]

Cross-validation allows a proportion (S − 1)/S of the available data to be used for training while making use of all of the data to assess performance. When data is particularly scarce, it may be appropriate to consider the case S = N, where N is the total number of data points, which gives the leave-one-out technique.
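Here is a minimal sketch of S-fold cross-validation under the same assumptions as before (the fit_polynomial and predict helpers from the earlier sketches). Setting S equal to the number of data points gives the leave-one-out special case.

```python
import numpy as np

def cross_validation_error(x, t, degree, S):
    """Average held-out error over S folds for the polynomial fit above.

    Each of the S runs trains on S-1 folds and evaluates on the remaining
    fold; S = len(x) corresponds to leave-one-out.
    """
    folds = np.array_split(np.random.default_rng(2).permutation(len(x)), S)
    errors = []
    for i in range(S):
        held_out = folds[i]
        train = np.concatenate([folds[j] for j in range(S) if j != i])
        w = fit_polynomial(x[train], t[train], degree)
        errors.append(np.mean((predict(x[held_out], w) - t[held_out]) ** 2))
    return float(np.mean(errors))

# Example (using the toy x, t from the previous sketch):
# err_4fold = cross_validation_error(x, t, degree=3, S=4)        # as in Figure 1.18
# err_loo   = cross_validation_error(x, t, degree=3, S=len(x))   # leave-one-out
```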

1.5 Drawbacks of Cross-validation

  • One major drawback of cross-validation is that the number of training runs that must be performed is increased by a factor of S, and this can prove problematic for models in which the training is itself computationally expensive.
  • Exploring combinations of settings for multiple complexity parameters for a single model could, in the worst case, require a number of training runs that is exponential in the number of parameters (see the run-count sketch after this list).
  • Clearly, we need a better approach. Ideally, this should rely only on the training data and should allow multiple hyperparameters and model types to be compared in a single training run.
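To see why the run count explodes, here is a back-of-the-envelope calculation with made-up numbers: three complexity parameters, ten candidate settings each, and S = 10 cross-validation folds.

```python
# Illustrative arithmetic only; the parameter counts are made up.
settings_per_parameter = 10
num_parameters = 3
S = 10  # cross-validation folds

grid_points = settings_per_parameter ** num_parameters  # 10**3 = 1000 combinations
training_runs = grid_points * S                          # 10,000 separate training runs
print(grid_points, training_runs)
```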

2. The Curse of Dimensionality

2.1 A Simplistic Classification Approach

One very simple approach would be to divide the input space into regular cells, as indicated in Figure 1.20.

2.2 Problem with This Naive Approach

The origin of the problem is illustrated in Figure 1.21, which shows that, if we divide a region of a space into regular cells, then the number of such cells grows exponentially with the dimensionality of the space. The problem with an exponentially large number of cells is that we would need an exponentially large quantity of training data in order to ensure that the cells are not empty.
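A tiny numerical illustration of this exponential growth; the choice of M (the number of intervals per dimension) is arbitrary here.

```python
# Dividing each of D input dimensions into M intervals yields M**D cells,
# so the cell count -- and the training data needed to populate every
# cell -- grows exponentially with the dimensionality D.
M = 3  # intervals per dimension (illustrative)
for D in (1, 2, 3, 10, 20):
    cells = M ** D
    print(f"D = {D:2d}: {cells:,} cells, each needing at least one training point")
```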

  • The Curse of Dimensionality: The severe difficulty that can arise in spaces of many dimensions is sometimes called the curse of dimensionality.
  • The reader should be warned that not all intuitions developed in spaces of low dimensionality will generalize to high-dimensional spaces.
 

Reposted from: https://www.cnblogs.com/glory-of-family/p/5602320.html
