Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

The conference program brings the two groups together through three parallel tracks:

（1）the Research Track;

（2）the Applied Data Science Track;

（3）the Applied Invited Speakers Track.

This year's conference continues its tradition of a strong tutorial and workshop program on leading-edge issues in data mining during the first two days. The last three days are devoted to contributed technical papers describing both novel, important research contributions and deployed, innovative solutions.

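First two days: tutorials and workshops on leading-edge data mining issues.

Last three days: contributed technical papers describing novel, important research results and innovative solutions.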

Contents:

Paper list: http://www.kdd.org/kdd2017/accepted-papers

Three keynote talks, by Cynthia Dwork, Bin Yu, and Renée J. Miller, touch on some of the hard, emerging issues facing the field of data mining.

1. Three keynote talks on hard, emerging issues facing the field of data mining

（1）What’s Fair?

Cynthia Dwork (Microsoft Research & Harvard University)

（2）The Future of Data Integration

Renée J. Miller (University of Toronto)

（3）Three Principles of Data Science: Predictability, Stability and Computability

Bin Yu (University of California, Berkeley)

2. Twelve Applied Invited Talks

（1）Foreword to the Applied Data Science – Invited Talks Track at KDD-2017

（2）More than the Sum of its Parts: Building Domino Data Lab

Eduardo Ariño de la Rubia (Domino Data Lab)

（3）Mining Big Data in Neuro Genetics to Understand Muscular Dystrophy

Andy Berglund (University of Florida)

（4）Industrial Machine Learning

Josh Bloom (GE)

（5）Behavior Informatics to Discover Behavior Insight for Active and Tailored Client Management

Longbing Cao (University of Technology Sydney)

（6）It Takes More than Math and Engineering to Hit the Bullseye with Data

Paritosh Desai (Target)

（7）Planning and Learning under Uncertainty: Theory and Practice

Jonathan P. How (Massachusetts Institute of Technology)

（8）Big Data in Climate: Opportunities and Challenges for Machine Learning

Anuj Karpatne, Vipin Kumar (University of Minnesota)

（9）Addressing Challenges with Big Data for Media Measurement

Mainak Mazumdar (Nielsen)

（10）Machine Learning Software in Practice: Quo Vadis?

Szilárd Pafka (Epoch)

（11）Designing AI at Scale to Power Everyday Life

Rajesh Parekh (Facebook)

（12）Spaceborne Data Enters the Mainstream

David Potere (Tellus Laboratories)

3. KDD 2017 Panels (AI-related)

（1）Benchmarks and Process Management in Data Science: Will We Ever Get Over the Mess?

Usama M. Fayyad (Open Insights), Arno Candel (H2O.ai, Inc.), Eduardo Ariño de la Rubia (Domino Data Lab), Szilárd Pafka (Epoch), Anthony Chong (IKASI), Jeong-Yoon Lee (Microsoft)

（2）The Future of Artificially Intelligent Assistants

Muthu Muthukrishnan (Rutgers University), Andrew Tomkins, Larry Heck (Google), Alborz Geramifard (Amazon), Deepak Agarwal (LinkedIn)

4. KDD 2017 Research Papers (Oral Papers)

（1）Learning Certifiably Optimal Rule Lists

Elaine Angelino (University of California, Berkeley), Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer (Harvard University), Cynthia Rudin (Duke University)

（2）Improved Degree Bounds and Full Spectrum Power Laws in Preferential Attachment Networks

Chen Avin, Zvi Lotker (Ben Gurion University of the Negev), Yinon Nahum, David Peleg (Weizmann Institute of Science)

（3）Unsupervised Network Discovery for Brain Imaging Data

Zilong Bai (University of California, Davis), Peter Walker, Anna Tschiffely (Naval Medical Research Center), Fei Wang (Cornell University), Ian Davidson (University of California, Davis)

（4）Patient Subtyping via Time-Aware LSTM Networks

Inci M. Baytas (Michigan State University), Cao Xiao (IBM T. J. Watson Research Center), Xi Zhang, Fei Wang (Cornell University), Anil K. Jain, Jiayu Zhou (Michigan State University)

（5）Robust Top-k Multiclass SVM for Visual Category Recognition

Xiaojun Chang (Carnegie Mellon University), Yao-Liang Yu (University of Waterloo), Yi Yang (University of Technology Sydney)

（6）KATE: K-Competitive Autoencoder for Text

Yu Chen, Mohammed J. Zaki (Rensselaer Polytechnic Institute)

（7）A Minimal Variance Estimator for the Cardinality of Big Data Set Intersection

Reuven Cohen, Liran Katzir, Aviv Yehezkel (Technion)

（8）HyperLogLog Hyperextended: Sketches for Concave Sublinear Frequency Statistics

Edith Cohen (Google Research)

（9）Fast Enumeration of Large k-Plexes

Alessio Conte (University of Pisa), Donatella Firmani (Roma Tre University), Caterina Mordente (Be Think Solve Execute), Maurizio Patrignani, Riccardo Torlone (Roma Tre University)

（10）Matrix Profile V: A Generic Technique to Incorporate Domain Knowledge into Motif Discovery

（11）metapath2vec: Scalable Representation Learning for Heterogeneous Networks

（12）Ego-Splitting Framework: from Non-Overlapping to Overlapping Clusters

（13）Contextual Motifs: Increasing the Utility of Motifs using Contextual Data

（14）Unsupervised P2P Rental Recommendations via Integer Programming

（15）The Co-Evolution Model for Social Network Evolving and Opinion Migration

（16）Groups-Keeping Solution Path Algorithm for Sparse Regression with Automatic Feature Grouping

（17）Clustering Individual Transactional Data for Masses of Users

（18）Network Inference via the Time-Varying Graphical Lasso

（19）Efficient Correlated Topic Modeling with Topic Embedding

（20）Accelerating Innovation Through Analogy Mining

（21）Communication-Efficient Distributed Block Minimization for Nonlinear Kernel Machines

（22）A Hierarchical Algorithm for Extreme Clustering

（23）Estimating Treatment Effect in the Wild via Differentiated Confounder Balancing

（24）The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables

（25）Constructivism Learning: A Learning Paradigm for Transparent Predictive Analytics

（26）Is the Whole Greater Than the Sum of Its Parts?

（27）Collaborative Variational Autoencoder for Recommender Systems

（28）Linearized GMM Kernels and Normalized Random Fourier Features

（29）Discrete Content-aware Matrix Factorization

（30）Effective and Real-time In-App Activity Analysis in Encrypted Internet Traffic Streams

（31）Functional Annotation of Human Protein Coding Isoforms via Non-convex Multi-Instance Learning

（32）Discovering Reliable Approximate Functional Dependencies

（33）Towards an Optimal Subspace for K-Means

（34）SPARTan: Scalable PARAFAC2 for Large & Sparse Data

（35）struc2vec: Learning Node Representations from Structural Identity

（36）Similarity Forests

（37）Structural Deep Brain Network Mining

（38）On Finding Socially Tenuous Groups for Online Social Networks

（39）A Local Algorithm for Structure-Preserving Graph Cut