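A collection of best papers from major conferences related to data mining; I will analyze their contents over time in later posts:
The main conferences covered are KDD, SIGMOD, VLDB, ICML, and SIGIR.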
KDD (Data Mining) |
||
2013 |
Simple and Deterministic Matrix Sketching |
Edo Liberty, Yahoo! Research |
2012 |
Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping |
Thanawin Rakthanmanon, University of California Riverside; et al. |
2011 |
Leakage in Data Mining: Formulation, Detection, and Avoidance |
Shachar Kaufman, Tel-Aviv University; et al. |
2010 |
Large linear classification when data cannot fit in memory |
Hsiang-Fu Yu, National Taiwan University; et al. |
Connecting the dots between news articles |
Dafna Shahaf & Carlos Guestrin, Carnegie Mellon University |
|
2009 |
Collaborative Filtering with Temporal Dynamics |
Yehuda Koren, Yahoo! Research |
2008 |
FastANOVA: an efficient algorithm for genome-wide association study |
Xiang Zhang, University of North Carolina at Chapel Hill; et al. |
2007 |
Predictive discrete latent factor models for large scale dyadic data |
Deepak Agarwal & Srujana Merugu, Yahoo! Research |
2006 |
Training linear SVMs in linear time |
Thorsten Joachims, Cornell University |
2005 |
Graphs over time: densification laws, shrinking diameters and possible explanations |
Jure Leskovec, Carnegie Mellon University; et al. |
2004 |
A probabilistic framework for semi-supervised clustering |
Sugato Basu, University of Texas at Austin; et al. |
2003 |
Maximizing the spread of influence through a social network |
David Kempe, Cornell University; et al. |
2002 |
Pattern discovery in sequences under a Markov assumption |
Darya Chudova & Padhraic Smyth, University of California Irvine |
2001 |
Robust space transformations for distance-based operations |
Edwin M. Knorr, University of British Columbia; et al. |
2000 |
Hancock: a language for extracting signatures from data streams |
Corinna Cortes, AT&T Laboratories; et al. |
1999 |
MetaCost: a general method for making classifiers cost-sensitive |
Pedro Domingos, Universidade Técnica de Lisboa |
1998 |
Occam's Two Razors: The Sharp and the Blunt |
Pedro Domingos, Universidade Técnica de Lisboa |
1997 |
Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Di... |
Foster Provost & Tom Fawcett, NYNEX Science and Technology |
SIGMOD (Databases) |
||
2013 |
Massive Graph Triangulation |
Xiaocheng Hu, The Chinese University of Hong Kong; et al. |
2012 |
High-Performance Complex Event Processing over XML Streams |
Barzan Mozafari, Massachusetts Institute of Technology; et al. |
2011 |
Entangled Queries: Enabling Declarative Data-Driven Coordination |
Nitin Gupta, Cornell University; et al. |
2010 |
FAST: fast architecture sensitive tree search on modern CPUs and GPUs |
Changkyu Kim, Intel; et al. |
2009 |
Generating example data for dataflow programs |
Christopher Olston, Yahoo! Research; et al. |
2008 |
Serializable isolation for snapshot databases |
Michael J. Cahill, University of Sydney; et al. |
Scalable Network Distance Browsing in Spatial Databases |
Hanan Samet, University of Maryland; et al. |
|
2007 |
Compiling mappings to bridge applications and databases |
Sergey Melnik, Microsoft Research; et al. |
Scalable Approximate Query Processing with the DBO Engine |
Christopher Jermaine, University of Florida; et al. |
|
2006 |
To search or to crawl?: towards a query optimizer for text-centric tasks |
Panagiotis G. Ipeirotis, New York University; et al. |
2004 |
Indexing spatio-temporal trajectories with Chebyshev polynomials |
Yuhan Cai & Raymond T. Ng, University of British Columbia |
2003 |
Spreadsheets in RDBMS for OLAP |
Andrew Witkowski, Oracle; et al. |
2001 |
Locally adaptive dimensionality reduction for indexing large time series databases |
Eamonn Keogh, University of California Irvine; et al. |
2000 |
XMill: an efficient compressor for XML data |
Hartmut Liefke, University of Pennsylvania |
1999 |
DynaMat: a dynamic view management system for data warehouses |
Yannis Kotidis & Nick Roussopoulos, University of Maryland |
1998 |
Efficient transparent application recovery in client-server information systems |
David Lomet & Gerhard Weikum, Microsoft Research |
Integrating association rule mining with relational database systems: alternatives and implications |
Sunita Sarawagi, IBM Research; et al. |
|
1997 |
Fast parallel similarity search in multimedia databases |
Stefan Berchtold, University of Munich; et al. |
1996 |
Implementing data cubes efficiently |
Venky Harinarayan, Stanford University; et al. |
VLDB (Databases) |
||
2013 |
DisC Diversity: Result Diversification based on Dissimilarity and Coverage |
Marina Drosou & Evaggelia Pitoura, University of Ioannina |
2012 |
Dense Subgraph Maintenance under Streaming Edge Weight Updates for Real-time Story Identification |
Albert Angel, University of Toronto; et al. |
2011 |
RemusDB: Transparent High-Availability for Database Systems |
Umar Farooq Minhas, University of Waterloo; et al. |
2010 |
Towards Certain Fixes with Editing Rules and Master Data |
Shuai Ma, University of Edinburgh; et al. |
2009 |
A Unified Approach to Ranking in Probabilistic Databases |
Jian Li, University of Maryland; et al. |
2008 |
Finding Frequent Items in Data Streams |
Graham Cormode & Marios Hadjieleftheriou, AT&T Laboratories |
Constrained Physical Design Tuning |
Nicolas Bruno & Surajit Chaudhuri, Microsoft Research |
|
2007 |
Scalable Semantic Web Data Management Using Vertical Partitioning |
Daniel J. Abadi, Massachusetts Institute of Technology; et al. |
2006 |
Trustworthy Keyword Search for Regulatory-Compliant Records Retention |
Soumyadeb Mitra, University of Illinois at Urbana-Champaign; et al. |
2005 |
Cache-conscious Frequent Pattern Mining on a Modern Processor |
Amol Ghoting, Ohio State University; et al. |
2004 |
Model-Driven Data Acquisition in Sensor Networks |
Amol Deshpande, University of California Berkeley; et al. |
2001 |
Weaving Relations for Cache Performance |
Anastassia Ailamaki, Carnegie Mellon University; et al. |
1997 |
Integrating Reliable Memory in Databases |
Wee Teck Ng & Peter M. Chen, University of Michigan |
ICML (Machine Learning) |
||
2013 |
Vanishing Component Analysis |
Roi Livni, The Hebrew University of Jerusalem; et al. |
Fast Semidifferential-based Submodular Function Optimization |
Rishabh Iyer, University of Washington; et al. |
|
2012 |
Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring |
Sungjin Ahn, University of California Irvine; et al. |
2011 |
Computational Rationalization: The Inverse Equilibrium Problem |
Kevin Waugh, Carnegie Mellon University; et al. |
2010 |
Hilbert Space Embeddings of Hidden Markov Models |
Le Song, Carnegie Mellon University; et al. |
2009 |
Structure preserving embedding |
Blake Shaw & Tony Jebara, Columbia University |
2008 |
SVM Optimization: Inverse Dependence on Training Set Size |
Shai Shalev-Shwartz & Nathan Srebro, Toyota Technological Institute at Chicago |
2007 |
Information-theoretic metric learning |
Jason V. Davis, University of Texas at Austin; et al. |
2006 |
Trading convexity for scalability |
Ronan Collobert, NEC Labs America; et al. |
2005 |
A support vector method for multivariate performance measures |
Thorsten Joachims, Cornell University |
1999 |
Least-Squares Temporal Difference Learning |
Justin A. Boyan, NASA Ames Research Center |
SIGIR (Information Retrieval) |
||
2013 |
Beliefs and Biases in Web Search |
Ryen W. White, Microsoft Research |
2012 |
Time-Based Calibration of Effectiveness Measures |
Mark Smucker & Charles Clarke, University of Waterloo |
2011 |
Find It If You Can: A Game for Modeling Different Types of Web Search Success Using Interaction Data |
Mikhail Ageev, Moscow State University; et al. |
2010 |
Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs |
Ryen W. White, Microsoft Research |
2009 |
Sources of evidence for vertical selection |
Jaime Arguello, Carnegie Mellon University; et al. |
2008 |
Algorithmic Mediation for Collaborative Exploratory Search |
Jeremy Pickens, FX Palo Alto Lab; et al. |
2007 |
Studying the Use of Popular Destinations to Enhance Web Search Interaction |
Ryen W. White, Microsoft Research; et al. |
2006 |
Minimal Test Collections for Retrieval Evaluation |
Ben Carterette, University of Massachusetts Amherst; et al. |
2005 |
Learning to estimate query difficulty: including applications to missing content detection and dis... |
Elad Yom-Tov, IBM Research; et al. |
2004 |
A Formal Study of Information Retrieval Heuristics |
Hui Fang, University of Illinois at Urbana-Champaign; et al. |
2003 |
Re-examining the potential effectiveness of interactive query expansion |
Ian Ruthven, University of Strathclyde |
2002 |
Novelty and redundancy detection in adaptive filtering |
Yi Zhang, Carnegie Mellon University; et al. |
2001 |
Temporal summaries of new topics |
James Allan, University of Massachusetts Amherst; et al. |
2000 |
IR evaluation methods for retrieving highly relevant documents |
Kalervo Järvelin & Jaana Kekäläinen, University of Tampere |
1999 |
Cross-language information retrieval based on parallel texts and automatic mining of parallel text... |
Jian-Yun Nie, Université de Montréal; et al. |
1998 |
A theory of term weighting based on exploratory data analysis |
Warren R. Greiff, University of Massachusetts Amherst |
1997 |
Feature selection, perceptron learning, and a usability case study for text categorization |
Hwee Tou Ng, DSO National Laboratories; et al. |
1996 |
Retrieving spoken documents by combining multiple index sources |
Gareth Jones, University of Cambridge; et al. |
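I recommend the following website, with thanks to its author for the effort of compiling it; it is mainly a collection of best papers from various top conferences.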
http://jeffhuang.com/best_paper_awards.html