[Resource roundup] Several classic sampling methods for imbalanced regression, with code resources

Several classic sampling methods for imbalanced regression

  • Preface
  • SMOGN
  • SMOTE
  • DA-WR (Data Augmentation - Weighted Resampling)
    • REBAGG: REsampled BAGGing for Imbalanced Regression
  • Thesis:
  • ImbalancedLearningRegression
  • Summary
  • Summary


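Preface

As is well known, imbalanced regression has received far less attention than imbalanced classification. Out of practical need, I have collected some data-level methods for handling imbalanced regression.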

SMOGN

Original paper:
Branco, P., Torgo, L., Ribeiro, R. (2017). SMOGN: A Pre-Processing Approach for Imbalanced Regression. Proceedings of Machine Learning Research, 74:36-50. http://proceedings.mlr.press/v74/branco17a/branco17a.pdf.

The official implementation is written in R. The method is also available in the Python package smogn, which can be installed with:

pip install smogn

Project page: https://github.com/nickkunz/smogn
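To give a feel for what SMOGN does with a rare case, here is a minimal pure-Python sketch of its core decision rule as I understand it from the paper: if the chosen neighbour is close, interpolate (as in SMOTER); otherwise perturb the seed with Gaussian noise. This is not the smogn package's code. The function name `smogn_like_sample`, its parameters, and the way the noise scale is set are my own simplifying assumptions.

```python
import math
import random
import statistics


def smogn_like_sample(rare_X, rare_y, i, k=3, pert=0.02, rng=random):
    """Create one synthetic case from rare seed i (hypothetical helper).

    Assumes rare_X / rare_y already contain only the rare cases.
    Returns (new_features, new_target, kind).
    """
    # Distances from the seed to its k nearest rare neighbours
    dists = sorted(
        (math.dist(rare_X[i], rare_X[j]), j)
        for j in range(len(rare_X)) if j != i
    )[:k]
    # "Safe" distance: half the median distance to those neighbours
    max_dist = statistics.median(d for d, _ in dists) / 2
    d, j = rng.choice(dists)
    if d < max_dist:
        # Neighbour is close enough: interpolate between seed and neighbour
        frac = rng.random()
        new_x = [a + frac * (b - a) for a, b in zip(rare_X[i], rare_X[j])]
        new_y = rare_y[i] + frac * (rare_y[j] - rare_y[i])
        return new_x, new_y, "interpolation"
    # Neighbour is too far: add Gaussian noise to the seed instead.
    # Using min(pert, max_dist) directly as the standard deviation is a
    # simplification made for this sketch.
    scale = min(pert, max_dist)
    new_x = [a + rng.gauss(0, scale) for a in rare_X[i]]
    new_y = rare_y[i] + rng.gauss(0, scale)
    return new_x, new_y, "noise"
```

For real use, the smogn package also handles the relevance function, nominal features, and under-sampling of the frequent cases, none of which appear in this sketch.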

SMOTE

Original paper:
Chawla N V, Bowyer K W, Hall L O, et al. SMOTE: synthetic minority over-sampling technique[J]. Journal of artificial intelligence research, 2002, 16: 321-357. https://www.jair.org/index.php/jair/article/download/10302/24590
A large collection of implementations of SMOTE and its variants is available in this project: https://github.com/analyticalmindsltd/smote_variants

Papers applying SMOTE to regression:

  1. Torgo L, Ribeiro R P, Pfahringer B, et al. Smote for regression[C]//Progress in Artificial Intelligence: 16th Portuguese Conference on Artificial Intelligence, EPIA 2013, Angra do Heroísmo, Azores, Portugal, September 9-12, 2013. Proceedings 16. Springer Berlin Heidelberg, 2013: 378-389.
  2. Camacho L, Douzas G, Bacao F. Geometric SMOTE for regression[J]. Expert Systems with Applications, 2022: 116387.
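The main change when moving SMOTE from classification to regression is that a synthetic case also needs a target value. A minimal sketch of the SMOTER-style idea, using only the standard library: interpolate the features between a rare seed and one of its nearest rare neighbours, then set the target to the average of the two targets weighted by inverse distance to the new point. The function name `smoter_oversample` and the fixed `threshold` used to decide which cases are rare are assumptions for illustration; the paper uses a relevance function instead.

```python
import math
import random


def smoter_oversample(X, y, threshold, n_new, k=3, seed=0):
    """Generate n_new synthetic (features, target) pairs (hypothetical helper).

    Cases with y >= threshold are treated as rare; this fixed threshold is
    a simplification of the relevance function used in the paper.
    """
    rng = random.Random(seed)
    rare = [i for i, t in enumerate(y) if t >= threshold]
    if len(rare) < 2:
        raise ValueError("need at least two rare cases to interpolate")
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(rare)
        # k nearest rare neighbours of the seed, excluding the seed itself
        others = sorted(
            (j for j in rare if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
        j = rng.choice(others[:k])
        # Features: random point on the segment from seed to neighbour
        frac = rng.random()
        new_x = [a + frac * (b - a) for a, b in zip(X[i], X[j])]
        # Target: the closer parent gets the larger weight
        d_seed = math.dist(new_x, X[i])
        d_nb = math.dist(new_x, X[j])
        if d_seed + d_nb == 0:
            new_t = (y[i] + y[j]) / 2
        else:
            new_t = (d_nb * y[i] + d_seed * y[j]) / (d_seed + d_nb)
        synthetic.append((new_x, new_t))
    return synthetic
```

The synthetic pairs would then be appended to the training set, usually together with some under-sampling of the frequent cases.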

DA-WR (Data Augmentation - Weighted Resampling)

Paper: Data Augmentation for Imbalanced Regression, AISTATS 2023.
Code: https://github.com/sstocksieker/DAIR

REBAGG: REsampled BAGGing for Imbalanced Regression

Paper: REBAGG: REsampled BAGGing for Imbalanced Regression, LIDTA 2018.

Basic idea: combines resampling with the Bagging ensemble method.
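A rough sketch of the resampled-bagging idea, under my own assumptions: each base model is trained on a resampled set that draws rare and normal cases in equal numbers (with replacement), and the ensemble prediction is the average of the base predictions. The names `rebagg_fit`, `rebagg_predict`, and `fit_knn_1d`, the fixed `threshold` for rarity, and the tiny 1-D k-NN base learner are all hypothetical stand-ins, not the authors' implementation.

```python
import random
import statistics


def fit_knn_1d(xs, ys, k=3):
    """Return a tiny 1-D k-NN regressor (stand-in for any base learner)."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(t for _, t in nearest)
    return predict


def rebagg_fit(xs, ys, threshold, n_models=10, sample_size=20, seed=0):
    """Train n_models base learners, each on a balanced resample."""
    rng = random.Random(seed)
    # Assumption: a case is "rare" when its target reaches the threshold
    rare = [i for i, t in enumerate(ys) if t >= threshold]
    normal = [i for i, t in enumerate(ys) if t < threshold]
    if not rare or not normal:
        raise ValueError("need both rare and normal cases")
    half = sample_size // 2
    models = []
    for _ in range(n_models):
        # Draw rare and normal cases in equal numbers, with replacement
        idx = rng.choices(rare, k=half) + rng.choices(normal, k=half)
        models.append(fit_knn_1d([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def rebagg_predict(models, x):
    """Average the predictions of all base learners."""
    return statistics.fmean(m(x) for m in models)
```

Because every base model sees as many rare cases as normal ones, the ensemble is less biased toward the frequent target range than plain bagging on the raw data would be.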

Thesis:

Thesis, Re-sampling Approaches for Regression Tasks under Imbalanced Domains, 2014.

ImbalancedLearningRegression

Original paper:
Branco P. ImbalancedLearningRegression-A Python Package to Tackle the Imbalanced Regression Problem[J]. 2022. https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_1456.pdf

It is available as the Python package ImbalancedLearningRegression, which can be installed with:

pip install ImbalancedLearningRegression

Official project page: https://github.com/paobranco/ImbalancedLearningRegression

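Summary

This list is short and there are surely more methods out there; I will add them later.

The methods above are mostly applied to datasets with hand-crafted features. How to apply them within end-to-end deep learning methods deserves further study.

For work in this direction, see:
Dablain D, Krawczyk B, Chawla N V. DeepSMOTE: Fusing deep learning and SMOTE for imbalanced data[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9694621, published in the top journal IEEE TNNLS, very impressive.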

Summary

More to be added later.
