MMDS 2008. Workshop on Algorithms for Modern Massive Data Sets

Stanford University
June 25–28, 2008

MMDS 2010. Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 15–18, 2010.


The 2008 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2008) addressed algorithmic, mathematical, and statistical challenges in modern large-scale data analysis. The goals of MMDS 2008 were to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote cross-fertilization of ideas.

Talk Slides:

Wednesday, June 25, 2008. Theme: Data Analysis and Data Applications

Time Talk
10:00 - 11:00 Tutorial: Christos Faloutsos
Graph mining: laws, generators and tools
11:00 - 11:30 Deepak Agarwal
Predictive discrete latent models for large incomplete dyadic data
11:30 - 12:00 Chandrika Kamath
Scientific data mining: why is it difficult?
2:00 - 3:00 Tutorial: Edward Chang
Challenges in mining large-scale social networks
3:00 - 3:30 Sharad Goel
Predictive indexing for fast search
3:30 - 4:00 James Demmel
Avoiding communication in linear algebra algorithms
4:30 - 5:00 Jun Liu
Bayesian inference of interactions and associations
5:00 - 5:30 Fan Chung
Four graph partitioning algorithms
5:30 - 6:00 Ronald Coifman
Diffusion geometries and harmonic analysis on data sets

Thursday, June 26, 2008. Theme: Networked Data and Algorithmic Tools

Time Talk
9:00 - 10:00 Tutorial: Milena Mihail
Models and algorithms for complex networks, with network elements maintaining characteristic profiles
10:00 - 10:30 Reid Andersen
An algorithm for improving graph partitions
11:00 - 11:30 Michael W. Mahoney
Community structure in large social and information networks
11:30 - 12:00 Nikhil Srivastava and Daniel Spielman
Graph sparsification by effective resistances
12:00 - 12:30 Amin Saberi
Sequential algorithms for generating random graphs
2:30 - 3:00 Pankaj K. Agarwal
Modeling and analyzing massive terrain data sets
3:00 - 3:30 Leonidas Guibas
Detection of symmetries and repeated patterns in 3D point cloud data
3:30 - 4:00 Yuan Yao
Topological methods for exploring pathway analysis in complex biomolecular folding
4:30 - 5:00 Piotr Indyk
Sparse recovery using sparse random matrices
5:00 - 5:30 Ping Li
Compressed counting and stable random projections
5:30 - 6:00 Joel Tropp
Algorithms for matrix column selection

Friday, June 27, 2008. Theme: Statistical, Geometric, and Topological Methods

Time Talk
9:00 - 10:00 Tutorial: Jerome H. Friedman
Fast sparse regression and classification
10:00 - 10:30 Tong Zhang
An adaptive forward/backward greedy algorithm for learning sparse representations
11:00 - 11:30 Jitendra Malik
Classification using intersection kernel SVMs is efficient
11:30 - 12:00 Elad Hazan
Efficient online routing with limited feedback and optimization in the dark
12:00 - 12:30 T.S. Jayram
Cascaded aggregates on data streams
2:30 - 3:30 Tutorial: Gunnar Carlsson
Topology and data
3:30 - 4:00 Partha Niyogi
Manifold regularization and semi-supervised learning
4:30 - 5:00 Sanjoy Dasgupta
Random projection trees and low dimensional manifolds
5:00 - 5:30 Kenneth Clarkson
Tighter bounds for random projections of manifolds
5:30 - 6:00 Yoram Singer
Efficient projection algorithms for learning sparse representations from high dimensional data
6:00 - 6:30 Arindam Banerjee
Bayesian co-clustering for dyadic data analysis

Saturday, June 28, 2008. Theme: Machine Learning and Dimensionality Reduction

Time Talk
9:00 - 10:00 Tutorial: Michael I. Jordan
Sufficient dimension reduction
10:00 - 10:30 Nathan Srebro
More data less work: SVM training in time decreasing with larger data sets
11:00 - 11:30 Inderjit S. Dhillon
Rank minimization via online learning
11:30 - 12:00 Nir Ailon
Efficient dimension reduction
2:30 - 3:00 Ravi Kannan
Spectral algorithms
3:00 - 3:30 Chris Wiggins
Inferring and encoding graph partitions
3:30 - 4:00 Anna Gilbert
Combinatorial group testing in signal recovery
4:30 - 5:00 Lars Kai Hansen
Generalization in high-dimensional matrix factorization
5:00 - 5:30 Holly Jin
Exploring sparse nonnegative matrix factorization
5:30 - 6:00 Elizabeth Purdom
Data analysis with graphs
6:00 - 6:30 Lek-Heng Lim
Ranking via Hodge decompositions of graphs and skew-symmetric matrices


Organizers

Michael Mahoney, Stanford University

Lek-Heng Lim, University of California, Berkeley

Petros Drineas, Rensselaer Polytechnic Institute

Gunnar Carlsson, Stanford University

Related Events

EMMDS 2009. European Workshop on Challenges in Modern Massive Data Sets, Technical University of Denmark, Lyngby, Denmark, July 1–4, 2009.

MMDS 2006. Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 21–24, 2006.

Sponsored by

National Science Foundation
Yahoo! Research
DARPA
LinkedIn
Pacific Institute for the Mathematical Sciences

With support from

Stanford iCME
Berkeley


  • Book display: SIAM
  • Events and meeting planning: Victor Olmo, Mayita Romero
  • Finance: Lisa Ewan, Debbie Lemos
  • Online registration: Victor Olmo, Mayita Romero, Seth Tornborg, Kuan-Chuen Wu
  • Poster session: Victor Olmo
  • Program design: Sou-Cheng Choi, Michael Saunders
  • Publicity: Suzanne Bigas
  • Registration desk: Andrew Bradley, Christopher Carlsson, John Carlsson, Jeffrey Danciger, Victor Olmo, Maksims Ovsjanikovs, Seth Tornborg
  • Slides collection: David Gleich, Prateek Jain