[Tutorial] [Random Matrix Sketching] Data Mining Seminar: Matrix Sketching

Instructors: Jeff Phillips and Mina Ghashami

Spring 2015 | Fridays 1:45 pm - 3:00 pm

Location : MEB 3147 (the LCR)

Catalog number: CS 7931 or CS 6961

Description:

A very common way to represent very large data sets is as a matrix. For instance, if there are n data points and each data point has d attributes, then the data can be thought of as an n x d matrix A with n rows and d columns. While matrix approximation and decomposition have been studied in numerical linear algebra for many decades, these methods often require more space and time than is feasible in very large scale settings, and they often aim for more precision than is required. The last decade has witnessed an explosion of work in matrix sketching, where an input matrix A is efficiently approximated by a more compact matrix B (or a product of a few matrices) so that B preserves most of the properties of A up to some guaranteed approximation ratio. This class will attempt to survey the large and growing literature on this topic, focusing on simple algorithms, intuition for error bounds, and practical performance.
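As a rough illustration of the idea, the sketch below compresses an n x d matrix A into an ell x d matrix B using a Gaussian random projection, one of the approaches on the schedule. The function names (`gaussian_sketch`, `gram`, `covariance_error`), the sizes, and the choice of error measure (the Frobenius norm of A^T A - B^T B relative to the squared Frobenius norm of A) are assumptions made for this example, not definitions taken from the course materials.

```python
import random


def gaussian_sketch(A, ell, seed=0):
    """Return B = S A, where S is an ell x n matrix of i.i.d. N(0, 1/ell) entries.

    S is never stored: each entry is drawn once and used immediately.
    """
    rng = random.Random(seed)
    d = len(A[0])
    sigma = 1.0 / ell ** 0.5          # standard deviation of each entry of S
    B = [[0.0] * d for _ in range(ell)]
    for row in A:                     # one pass over the rows of A
        for k in range(ell):
            s = rng.gauss(0.0, sigma)
            Bk = B[k]
            for j in range(d):
                Bk[j] += s * row[j]
    return B


def gram(M):
    """Return M^T M as a d x d list of lists."""
    d = len(M[0])
    G = [[0.0] * d for _ in range(d)]
    for row in M:
        for a in range(d):
            for b in range(d):
                G[a][b] += row[a] * row[b]
    return G


def covariance_error(A, B):
    """Return ||A^T A - B^T B||_F / ||A||_F^2 (illustrative error measure)."""
    GA, GB = gram(A), gram(B)
    d = len(GA)
    diff = sum((GA[a][b] - GB[a][b]) ** 2
               for a in range(d) for b in range(d)) ** 0.5
    frob_sq = sum(GA[a][a] for a in range(d))   # trace(A^T A) = ||A||_F^2
    return diff / frob_sq


# Hypothetical sizes: 500 data points, 4 attributes, sketch with 100 rows.
data_rng = random.Random(1)
n, d, ell = 500, 4, 100
A = [[data_rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
B = gaussian_sketch(A, ell)
err = covariance_error(A, B)
print(len(B), len(B[0]), err)
```

Increasing `ell` should shrink the error on average at the cost of a larger sketch, which is the space versus accuracy trade-off the seminar studies.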

This 1-credit seminar will meet once a week. Instructors will give most lectures. Students will be expected to carry out a small project that explores one or more of the topics we discuss in some depth and pushes the boundaries of research. They will give a short presentation of their results at the end of the class.

Schedule: (subject to change)

| Date | Topic | References | Speaker |
|---|---|---|---|
| Fri 1.16 | Overview | | Jeff Phillips |
| Fri 1.23 | Column Sampling | Woodruff 2.4; DGP 2.1, 3.1, 5.1; Mahoney | Jeff Phillips |
| Fri 1.30 | Random Projection and Hashing | Woodruff 2.1; DGP 2.2, 5.2 | Mina Ghashami |
| Fri 2.06 | Iterative (Frequent Directions) | GLPW; DGP 2.3, 3.2, 5.3 | Jeff Phillips |
| Fri 2.13 | CUR Decompositions | Woodruff 4.1, 4.2; Mahoney | Mina Ghashami |
| Fri 2.20 | (No Class - Grad Visit Day) | | |
| Fri 2.27 | Matrix Concentration Bounds | Tropp (Ch 5+6) | Mina Ghashami |
| Fri 3.06 | Lower Bounds | Woodruff 6 | Jeff Phillips |
| Fri 3.13 | Sparsification (@ 3:15 in WEB 1705) | | Mina Ghashami |
| Fri 3.20 | (Spring Break - No Class) | | |
| Fri 3.27 | Regression and L1 (and Lp) Bounds (@ 3:15 in WEB 1705) | Woodruff 2.5, 3; YMM | Jeff Phillips |
| Fri 4.03 | Distributed Models | Woodruff 4.4; GPL | Mina Ghashami |
| Fri 4.10 | Tensor Decompositions | | Mina Ghashami |
| Fri 4.17 | Project Presentations | | |
| Fri 4.24 | Project Presentations | | |

Useful references:

Woodruff: David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends in Theoretical Computer Science. Vol. 10 (2014), pages 1-157.

Tropp: Joel A. Tropp. An Introduction to Matrix Concentration Inequalities. arXiv:1501.01571. To appear in Foundations and Trends in Machine Learning.

GLPW: Mina Ghashami, Edo Liberty, Jeff M. Phillips, and David Woodruff. Frequent Directions: Simple and Deterministic Matrix Sketching. arXiv:1501.01711.

DGP: Amey Desai, Mina Ghashami, and Jeff M. Phillips. Improved Practical Matrix Sketching with Guarantees. arXiv:1501.06561.

Mahoney: Michael W. Mahoney. Randomized Algorithms for Matrices and Data. Foundations and Trends in Machine Learning. Vol. 3 (2011), pages 123-224.

YMM: Jiyan Yang, Xiangrui Meng, and Michael W. Mahoney. Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments. arXiv:1502.03032.
