MMDS 2008. Workshop on Algorithms for Modern Massive Data Sets

Stanford University
June 25–28, 2008

MMDS 2010. Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 15–18, 2010.

Synopsis

The 2008 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2008) addressed algorithmic, mathematical, and statistical challenges in modern large-scale data analysis. The goals of MMDS 2008 were to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote cross-fertilization of ideas.

The organizers thank all participants and speakers for their time and interest.

Available materials: schedule, abstracts of talks and posters, a single PDF file containing all of the above, and the original conference web page.

Reports about the event: AMSTAT News, Statistical Computing and Statistical Graphics, ACM KDD Explorations, IMS Bulletin, SIAM News Part I, and SIAM News Part II.

Blogs about the event: Mainly Data, Stream of Caffeiness, Nuit Blanche, Ganesh Swami, The AstroStat Slog, Large-Scale Social Network Analysis.

Talk Slides:

Wednesday, June 25, 2008. Theme: Data Analysis and Data Applications

Time	Speaker	Talk
10:00 - 11:00	Tutorial: Christos Faloutsos	Graph mining: laws, generators and tools
11:00 - 11:30	Deepak Agarwal	Predictive discrete latent models for large incomplete dyadic data
11:30 - 12:00	Chandrika Kamath	Scientific data mining: why is it difficult?
2:00 - 3:00	Tutorial: Edward Chang	Challenges in mining large-scale social networks
3:00 - 3:30	Sharad Goel	Predictive indexing for fast search
3:30 - 4:00	James Demmel	Avoiding communication in linear algebra algorithms
4:30 - 5:00	Jun Liu	Bayesian inference of interactions and associations
5:00 - 5:30	Fan Chung	Four graph partitioning algorithms
5:30 - 6:00	Ronald Coifman	Diffusion geometries and harmonic analysis on data sets

Thursday, June 26, 2008. Theme: Networked Data and Algorithmic Tools

Time	Speaker	Talk
9:00 - 10:00	Tutorial: Milena Mihail	Models and algorithms for complex networks, with network elements maintaining characteristic profiles
10:00 - 10:30	Reid Andersen	An algorithm for improving graph partitions
11:00 - 11:30	Michael W. Mahoney	Community structure in large social and information networks
11:30 - 12:00	Nikhil Srivastava and Daniel Spielman	Graph sparsification by effective resistances
12:00 - 12:30	Amin Saberi	Sequential algorithms for generating random graphs
2:30 - 3:00	Pankaj K. Agarwal	Modeling and analyzing massive terrain data sets
3:00 - 3:30	Leonidas Guibas	Detection of symmetries and repeated patterns in 3D point cloud data
3:30 - 4:00	Yuan Yao	Topological methods for exploring pathway analysis in complex biomolecular folding
4:30 - 5:00	Piotr Indyk	Sparse recovery using sparse random matrices
5:00 - 5:30	Ping Li	Compressed counting and stable random projections
5:30 - 6:00	Joel Tropp	Algorithms for matrix column selection

Friday, June 27, 2008. Theme: Statistical, Geometric, and Topological Methods

Time	Speaker	Talk
9:00 - 10:00	Tutorial: Jerome H. Friedman	Fast sparse regression and classification
10:00 - 10:30	Tong Zhang	An adaptive forward/backward greedy algorithm for learning sparse representations
11:00 - 11:30	Jitendra Malik	Classification using intersection kernel SVMs is efficient
11:30 - 12:00	Elad Hazan	Efficient online routing with limited feedback and optimization in the dark
12:00 - 12:30	T.S. Jayram	Cascaded aggregates on data streams
2:30 - 3:30	Tutorial: Gunnar Carlsson	Topology and data
3:30 - 4:00	Partha Niyogi	Manifold regularization and semi-supervised learning
4:30 - 5:00	Sanjoy Dasgupta	Random projection trees and low dimensional manifolds
5:00 - 5:30	Kenneth Clarkson	Tighter bounds for random projections of manifolds
5:30 - 6:00	Yoram Singer	Efficient projection algorithms for learning sparse representations from high dimensional data
6:00 - 6:30	Arindam Banerjee	Bayesian co-clustering for dyadic data analysis

Saturday, June 28, 2008. Theme: Machine Learning and Dimensionality Reduction

Time	Speaker	Talk
9:00 - 10:00	Tutorial: Michael I. Jordan	Sufficient dimension reduction
10:00 - 10:30	Nathan Srebro	More data less work: SVM training in time decreasing with larger data sets
11:00 - 11:30	Inderjit S. Dhillon	Rank minimization via online learning
11:30 - 12:00	Nir Ailon	Efficient dimension reduction
2:30 - 3:00	Ravi Kannan	Spectral algorithms
3:00 - 3:30	Chris Wiggins	Inferring and encoding graph partitions
3:30 - 4:00	Anna Gilbert	Combinatorial group testing in signal recovery
4:30 - 5:00	Lars Kai Hansen	Generalization in high-dimensional matrix factorization
5:00 - 5:30	Holly Jin	Exploring sparse nonnegative matrix factorization
5:30 - 6:00	Elizabeth Purdom	Data analysis with graphs
6:00 - 6:30	Lek-Heng Lim	Ranking via Hodge decompositions of graphs and skew-symmetric matrices