Some issues in classification

Deep belief networks and deep autoencoders are unsupervised learning algorithms that can be pretrained on unlabeled training sets and then fine-tuned with a small amount of labeled data, which makes both suitable for application fields where only a small number of labeled samples is available.

Issues to consider in remote sensing classification:

  • Parameter tuning. Machine-learning classifiers usually have parameters that must be set by the user. Parameter tuning, in which
    an optimal value of each parameter is estimated for the classification at hand, can be performed using 10-fold cross-validation for each model.
  • Variable normalization. All variables should be centred and rescaled for consistency prior to classification.
  • Variable reduction or feature selection
  • The impact of the number of training samples on classification accuracy
  • The impact of data quality (such as mislabelled training samples) on classification accuracy
  • Sample imbalance. To assess the potential impact of training data imbalance, an additional experiment can be carried out with the data sets balanced by random oversampling, in which samples from rarer classes are duplicated to produce an equal number of samples in each class.
  • Overfitting
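
Two of the preprocessing steps listed above, variable normalization and random oversampling, can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `standardize` and `random_oversample` and the `seed` parameter are assumptions made for this example, not part of any particular library. In practice the mean and standard deviation would normally be estimated on the training set only and then reused for the test set.

```python
import numpy as np


def standardize(X):
    """Centre each variable to zero mean and rescale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant variables are left unscaled (avoids division by zero)
    return (X - mean) / std


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of rarer classes until every
    class has as many samples as the largest class."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing samples with replacement from the same class.
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    selected = np.concatenate(selected)
    return X[selected], y[selected]


# Hypothetical toy data: three samples of class 0, one sample of class 1.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y = np.array([0, 0, 0, 1])
X_balanced, y_balanced = random_oversample(standardize(X), y)
```

Because oversampling only duplicates existing samples, it should be applied to the training data alone; duplicating samples before splitting would leak copies of training samples into the test set.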

Some criteria for evaluating a classification algorithm:

  • Overall Accuracy. Overall accuracy is the probability that an individual sample will be correctly classified; in the two-class case it is the sum of the true positives and true negatives divided by the total number of samples tested, and in the multi-class case it is the number of correctly classified samples divided by the total number of samples.
  • Errors of Omission. A separate omission error is generally calculated for each category, which allows the classification accuracy and error to be evaluated per category. For a given category, the error of omission is the number of samples of that category mislabeled as other categories divided by the total number of reference samples of that category.
  • Errors of Commission. Like the error of omission, the error of commission is generally calculated for each category. For a given category, the error of commission is the number of samples wrongly labeled as this category divided by the total number of samples labeled as this category.
  • Producer’s Accuracy. Producer’s Accuracy is the map accuracy from the point of view of the map maker (the producer). It describes how often real features on the ground are correctly shown on the classified map, that is, the probability that a certain land cover on the ground is classified as such. Producer’s Accuracy is the complement of the Omission Error: Producer’s Accuracy = 100% - Omission Error. Equivalently, it is the number of reference sites classified accurately divided by the total number of reference sites for that class.
  • User’s Accuracy. User’s Accuracy is the accuracy from the point of view of a map user, not the map maker. It tells us how often the class shown on the map will actually be present on the ground. This is referred to as reliability. User’s Accuracy is the complement of the Commission Error: User’s Accuracy = 100% - Commission Error.
  • Kappa Coefficient. The Kappa Coefficient is generated from a statistical test to evaluate the accuracy of a classification. Kappa evaluates how well the classification performed compared with randomly assigning values, i.e. whether the classification did better than random. The Kappa Coefficient can range from -1 to 1. A value of 0 indicates that the classification is no better than a random classification. A negative value indicates that the classification is worse than random. A value close to 1 indicates that the classification is much better than random.
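
All of the criteria above can be derived from a single confusion matrix. The following is a minimal sketch, assuming a square matrix `cm` in which rows are reference (ground) classes and columns are classified (map) classes; the function name `accuracy_report` and the dictionary keys are illustrative choices, and every class is assumed to have at least one reference sample and one classified sample. Kappa is computed here with the usual Cohen's kappa formula, (observed agreement - chance agreement) / (1 - chance agreement).

```python
import numpy as np


def accuracy_report(cm):
    """Accuracy measures from a confusion matrix.

    cm[i, j] is the number of samples whose reference class is i
    and whose classified (map) class is j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    correct = np.diag(cm)              # correctly classified samples per class
    reference_totals = cm.sum(axis=1)  # row sums: samples per reference class
    mapped_totals = cm.sum(axis=0)     # column sums: samples per mapped class

    overall = correct.sum() / total
    producers = correct / reference_totals  # 1 - omission error
    users = correct / mapped_totals         # 1 - commission error

    # Agreement expected if labels were assigned at random
    # with the same row and column totals.
    expected = (reference_totals * mapped_totals).sum() / total**2
    kappa = (overall - expected) / (1.0 - expected)

    return {
        "overall_accuracy": overall,
        "producers_accuracy": producers,
        "omission_error": 1.0 - producers,
        "users_accuracy": users,
        "commission_error": 1.0 - users,
        "kappa": kappa,
    }


# Hypothetical two-class example with 100 reference samples.
report = accuracy_report([[45, 5],
                          [10, 40]])
```

For this made-up matrix the overall accuracy is 0.85, the producer's accuracies are 0.9 and 0.8, and the agreement expected by chance is 0.5, which gives a kappa of 0.7.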
