Notes on Meituan Machine Learning in Practice (《美团机器学习实践》)

https://book.douban.com/subject/30243136/

Performance Metrics

  • F1 score: the harmonic mean of precision and recall, 2/F1 = 1/P + 1/R, i.e., F1 = 2PR / (P + R)
  • Other interpretations of AUC (see the sketch after this list):
    • The Wilcoxon/Mann-Whitney rank statistic: the probability that a randomly chosen positive is scored above a randomly chosen negative
    • Gini index: Gini + 1 = 2 * AUC
    • Depends only on the ranking of the predicted scores, not on their absolute values
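
A minimal sketch of the rank-statistic view on made-up labels and scores; the result matches sklearn.metrics.roc_auc_score, and a monotone transform of the scores leaves it unchanged:

    import numpy as np

    # Toy labels and predicted scores (illustrative only).
    y = np.array([0, 0, 1, 1, 0, 1])
    s = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

    pos, neg = s[y == 1], s[y == 0]
    # AUC = P(score of a random positive > score of a random negative),
    # counting tied pairs as half concordant.
    gt = (pos[:, None] > neg[None, :]).sum()
    eq = (pos[:, None] == neg[None, :]).sum()
    auc = (gt + 0.5 * eq) / (len(pos) * len(neg))
    print(auc)  # 0.888..., same as sklearn.metrics.roc_auc_score(y, s)

    # Rank-only: a monotone transform of the scores gives the same AUC.
    lp, ln = np.log(pos), np.log(neg)
    print((lp[:, None] > ln[None, :]).mean())  # 0.888... again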

Feature Engineering and Feature Selection

Continuous Variables

  • Bucketing continuous variables (by equal width or by percentile), e.g., before feeding them to logistic regression
  • Missing-value treatment: imputation, or adding an indicator (dummy) variable for missingness
  • Feeding tree leaf indices (e.g., from an RF or GBDT) as features to linear models; a combined sketch of all three tricks follows this list
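
A minimal sketch of the three tricks on made-up data, assuming pandas and scikit-learn (column names, labels, and hyperparameters are illustrative):

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import OneHotEncoder

    rng = np.random.default_rng(0)
    df = pd.DataFrame({"income": rng.lognormal(10, 1, 1000)})
    df.loc[rng.random(1000) < 0.1, "income"] = np.nan  # inject missing values
    y = (rng.random(1000) < 0.3).astype(int)           # toy labels

    # Missing values: an indicator column plus median imputation.
    df["income_missing"] = df["income"].isna().astype(int)
    df["income"] = df["income"].fillna(df["income"].median())

    # Bucketing: equal-width bins vs. percentile (equal-frequency) bins.
    df["income_width_bin"] = pd.cut(df["income"], bins=10, labels=False)
    df["income_pct_bin"] = pd.qcut(df["income"], q=10, labels=False, duplicates="drop")

    # Tree leaves as linear-model features: one-hot the leaf index each
    # tree assigns to a sample, then fit a logistic regression on that.
    X = df[["income", "income_missing"]].to_numpy()
    rf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0).fit(X, y)
    leaves = rf.apply(X)  # shape (n_samples, n_trees), one leaf id per tree
    lr = LogisticRegression(max_iter=1000).fit(OneHotEncoder().fit_transform(leaves), y)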

Discrete Variables

  • Cross features: combine two categorical variables into a single one
  • Group statistics (e.g., the number of unique values of B for each value of A); see the sketch after this list
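
A minimal sketch of both on a made-up click log, assuming pandas (column names are illustrative):

    import pandas as pd

    # Toy user/city/POI click log.
    log = pd.DataFrame({
        "user_id": [1, 1, 2, 2, 2, 3],
        "city":    ["bj", "sh", "bj", "bj", "gz", "sh"],
        "poi_id":  [10, 11, 10, 12, 13, 11],
    })

    # Cross feature: treat (user_id, city) as a single categorical variable.
    log["user_x_city"] = log["user_id"].astype(str) + "_" + log["city"]

    # Group statistic: for each user_id (A), count the unique poi_id values (B).
    nunique = log.groupby("user_id")["poi_id"].nunique().rename("user_poi_nunique")
    log = log.merge(nunique, on="user_id")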

Time, Space, Text Features

Popular Models

Logistic Regression

  • Why not OLS for classification: squared loss is sensitive to outliers
  • How to solve: gradient descent or stochastic gradient descent; for online learning, Google's FTRL (see the sketch after this list)
  • Advantages: fast and scalable
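
A minimal per-coordinate FTRL-Proximal sketch for logistic loss, following the McMahan et al. (2013) update rules; the hyperparameters are illustrative, not tuned:

    import numpy as np

    class FTRL:
        """FTRL-Proximal for logistic regression with L1/L2 regularization."""

        def __init__(self, dim, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
            self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
            self.z = np.zeros(dim)  # accumulated adjusted gradients
            self.n = np.zeros(dim)  # accumulated squared gradients

        def weights(self):
            # Closed-form lazy weights; exactly zero where |z| <= l1,
            # which is what makes the learned model sparse.
            w = np.zeros_like(self.z)
            act = np.abs(self.z) > self.l1
            w[act] = -(self.z[act] - np.sign(self.z[act]) * self.l1) / (
                (self.beta + np.sqrt(self.n[act])) / self.alpha + self.l2)
            return w

        def update(self, x, y):
            # One online step: predict, then update z and n per coordinate.
            w = self.weights()
            p = 1.0 / (1.0 + np.exp(-x @ w))  # predicted probability
            g = (p - y) * x                   # gradient of the log loss
            sigma = (np.sqrt(self.n + g * g) - np.sqrt(self.n)) / self.alpha
            self.z += g - sigma * w
            self.n += g * g
            return p

    # Usage on a toy stream of (features, label) pairs:
    model = FTRL(dim=3)
    for x, y in [(np.array([1.0, 0.0, 1.0]), 1), (np.array([0.0, 1.0, 1.0]), 0)]:
        model.update(x, y)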

FM

  • Motivation:
    • Learn feature interactions automatically rather than crafting them by hand
    • A polynomial kernel needs too many parameters, and the pairwise co-occurrence matrix is too sparse to estimate them
  • Approach:
    • Instead of learning an independent weight for every co-occurring pair (i, j), the interaction weight w_ij is computed as the dot product <v_i, v_j> of k-dimensional latent vectors.
    • This imposes a low-rank assumption on the interaction matrix W so that it factorizes as W ≈ V V^T.
    • Parameters for different feature combinations are no longer independent, so rare pairs borrow strength from the shared latent vectors.
  • Improvement:
    • FFM (field-aware FM): group similar features into fields, and learn a separate latent vector per field for each feature
  • Application:
    • Serve as embeddings for neural networks (e.g., to measure user-ad similarity)
    • Outperforms GBDT at learning complicated feature interactions when the feature combinations are sparse; a sketch of the model follows this list
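
The model is y(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j. A minimal numpy sketch with made-up parameters, using the identity sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2], which brings the pairwise term down to O(n*k):

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 8, 4                             # n features, k latent dimensions
    x = rng.random(n)
    w0, w = 0.0, rng.normal(size=n)         # bias and linear weights
    V = rng.normal(scale=0.1, size=(n, k))  # latent vectors, one row per feature

    vx = V.T @ x                            # (k,) per-dimension weighted sums
    pairwise = 0.5 * np.sum(vx ** 2 - (V ** 2).T @ (x ** 2))
    y_hat = w0 + w @ x + pairwise

    # Brute-force check against the explicit sum over pairs.
    brute = sum(V[i] @ V[j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    assert np.isclose(pairwise, brute)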

GBDT

Compared with linear models, GBDT natively handles missing values, attributes with different ranges, outliers, feature interactions, and non-linear decision boundaries.

Data Mining
