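wikipedia.org: history, field overview, and resource links:
Data mining: covers the concept, process, academic conferences, and software of data mining; the sidebar on the right lists subtopic entries;
Category:Data mining: more entries related to data mining;
DMOZ on DM: resource links;
If Google is not accessible, a mirror site is recommended; for searching and downloading e-books, Library Genesis is recommended (more online libraries).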
Stanford courses: CS246 Mining Massive Data Sets, CS246H Mining Massive Data Sets: Hadoop Labs, CS341 Project in Mining Massive Data Sets; companion book Mining of Massive Datasets; DataMiningTalk;
CMU courses: Data Mining: Spring 2013, Statistics 36-350: Data Mining (Fall 2009);
Nanjing University course: Introduction to Data Mining;
Coursera: Data Mining Specialization.
Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeff Ullman, 2015; slides; Chinese translation: 大数据-互联网大规模数据挖掘与分布式处理;
Data Mining: The Textbook, Charu C. Aggarwal, 2015; resource links;
Data Mining: Concepts and Techniques (3rd ed.), Jiawei Han, Micheline Kamber, Jian Pei, 2011; slides; Chinese translation: 数据挖掘:概念与技术;
Data Mining and Analysis: Fundamental Concepts and Algorithms, Mohammed J. Zaki, Wagner Meira Jr, 2014; authors' website;
Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, 2006; slides; Chinese translation: 数据挖掘导论;
A Practical Guide to Data Mining for Business and Industry, Andrea Ahlemeyer-Stubbe, Shirley Coleman, 2014; slides;
Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.), Ian H. Witten, Eibe Frank, Mark A. Hall, 2011; slides; Chinese translation: 数据挖掘:实用机器学习工具与技术;
Programming Collective Intelligence: Building Smart Web 2.0 Applications, Toby Segaran, 2007; Chinese translation: 集体智慧编程;
The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.), Trevor Hastie, Robert Tibshirani, Jerome Friedman, 2009;
Top conferences: KDD, ICDE;
For more conferences and journals, see: Google Scholar DM, Microsoft Academic DM, KDnuggets DM Conferences.
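KDnuggets: links to a wide range of resources, including blog posts, courses, software, and datasets;
Two Chinese sites: 我爱机器学习 (I Love Machine Learning), 机器学习日报 (Machine Learning Daily);
Data Sets: UCI Machine Learning Repository, List of Public Data Sources Fit for Machine Learning;
Competitions: Kaggle, DMC, Knowledge Pit, TunedIT, DrivenData;
More resources are collected here; a list of data mining blogs is collected here; and here you can find terminology explanations, introductions to data mining, book recommendations, and so on, although the material is somewhat dated.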
R: RDataMining, inside-R;
Hadoop: Tutorial, Wiki; implements the MapReduce computing model;
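Spark: Tutorial; it has recently become very popular as an improvement on, or complement to, Hadoop; for how it compares with Hadoop, see the comparison on Zhihu.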
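For readers new to the MapReduce computing model mentioned in the Hadoop entry, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) using the classic word-count task. It is plain Python and does not use Hadoop or Spark; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["data mining", "mining massive data sets"]))
```

In a real cluster the map and reduce calls would run in parallel on different machines and the shuffle would move data across the network; this sketch only shows the data flow between the stages.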