June 15–18, 2010

The Workshop on Algorithms for Modern Massive Data Sets (MMDS 2010) addressed algorithmic and statistical challenges in modern large-scale data analysis. The goals of this series of workshops are to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly-structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote the cross-fertilization of ideas.

- The organizers thank the 216 participants and 40 speakers for their time and interest.

- The schedule and talk/poster abstracts can be found in the 2010 program; see also the original conference web page.

- Blogs about the event: Revolution Analytics, Big Data News, Nuit Blanche.

Time | Talk |
---|---|

8:00 - 10:00 | Breakfast and Registration -- outside Cubberley Auditorium (at the Stanford School of Education, just off the Main Quad) |

9:45 - 10:00 | Welcome and Opening Remarks -- in Cubberley Auditorium |

10:00 - 11:00 | Tutorial: Peter Norvig Internet-Scale Data Analysis |

11:00 - 11:30 | Ashok Srivastava Virtual Sensors and Large-Scale Gaussian Processes |

11:30 - 12:00 | John Langford A Method for Parallel Online Learning |

2:00 - 3:00 | Tutorial: John Gilbert Combinatorial Scientific Computing: Experience and Challenges |

3:00 - 3:30 | Deepak Agarwal Recommender Problems for Content Optimization |

3:30 - 4:00 | James Demmel Minimizing Communication in Linear Algebra |

4:30 - 5:00 | Dmitri Krioukov Hyperbolic Mapping of Complex Networks |

5:00 - 5:30 | Mehryar Mohri Matrix Approximation for Large-Scale Learning |

5:30 - 6:00 | David Bader Massive-Scale Analytics of Streaming Social Networks |

6:00 - 6:30 | Ely Porat Fast Pseudo-Random Fingerprints |

Time | Talk |
---|---|

9:00 - 10:00 | Tutorial: Peter Bickel Statistical Inference for Networks |

10:00 - 10:30 | Jure Leskovec Inferring Networks of Diffusion and Influence |

11:00 - 11:30 | Michael W. Mahoney Geometric Network Analysis Tools |

11:30 - 12:00 | Edward Chang AdHEat - A New Influence-based Social Ads Model and its Tera-Scale Algorithms |

12:00 - 12:30 | Mauro Maggioni Intrinsic Dimensionality Estimation and Multiscale Geometry of Data Sets |

2:30 - 3:00 | Guillermo Sapiro Collaborative Hierarchical Sparse Models |

3:00 - 3:30 | Alekh Agarwal and Peter Bartlett Information-theoretic Lower Bounds on the Oracle Complexity of Convex Optimization |

3:30 - 4:00 | John Duchi and Yoram Singer Composite Objective Optimization and Learning for Massive Datasets |

4:30 - 5:00 | Steven Hillion MAD Analytics in Practice |

5:00 - 5:30 | Matthew Harding Outlier Detection in Financial Trading Networks |

5:30 - 6:00 | Neel Sundaresan Large Dataset Problems at the Long Tail |

Time | Talk |
---|---|

9:00 - 10:00 | Tutorial: Sebastiano Vigna Spectral Ranking |

10:00 - 10:30 | Robert Stine Streaming Feature Selection |

11:00 - 11:30 | Konstantin Mischaikow A Combinatorial Framework for Nonlinear Dynamics |

11:30 - 12:00 | Alfred Hero Sparse Correlation Screening in High Dimension |

12:00 - 12:30 | Susan Holmes Heterogeneous Data Challenge Combining Complex Data |

2:30 - 3:30 | Tutorial: Piotr Indyk Sparse Recovery Using Sparse Matrices |

3:30 - 4:00 | Sayan Mukherjee Efficient Dimension Reduction on Massive Data |

4:30 - 5:00 | Padhraic Smyth Statistical Modeling of Large-Scale Sensor Count Data |

5:00 - 5:30 | Ping Li Compressed Counting and Application in Estimating Entropy of Data Streams |

5:30 - 6:00 | Edo Liberty Scalable Correlation Clustering Algorithms |

Time | Talk |
---|---|

9:00 - 10:00 | Tutorial: Petros Drineas Randomized Algorithms in Linear Algebra and Large Data Applications |

10:00 - 10:30 | Gunnar Martinsson Randomized Methods for Computing the SVD/PCA of Very Large Matrices |

11:00 - 11:30 | Ilse Ipsen Numerical Reliability of Randomized Algorithms |

11:30 - 12:00 | Philippe Rigollet Optimal Rates of Sparse Estimation and Universal Aggregation |

12:00 - 12:30 | Alexandre d'Aspremont Subsampling, Spectral Methods & Semidefinite Programming |

2:30 - 3:00 | Gary Miller Specialized System Solvers for Very Large Systems: Theory and Practice |

3:00 - 3:30 | John Wright and Emmanuel Candes Robust Principal Component Analysis? |

3:30 - 4:00 | Alon Orlitsky Estimation, Prediction, and Classification over Large Alphabets |

4:30 - 5:00 | Ken Clarkson Numerical Linear Algebra in the Streaming Model |

5:00 - 5:30 | David Woodruff Fast Lp Regression in Data Streams |

Alekh Agarwal | University of California, Berkeley |

Deepak Agarwal | Yahoo! Research |

Alexandre d'Aspremont | Princeton University |

David Bader | Georgia Tech College of Computing |

Peter Bickel | University of California, Berkeley |

Emmanuel Candes | Stanford University |

Edward Chang | Google Research |

Ken Clarkson | IBM Almaden Research Center |

James Demmel | University of California, Berkeley |

John Duchi | University of California, Berkeley |

John Gilbert | University of California, Santa Barbara |

Matthew Harding | Stanford University |

Alfred Hero | University of Michigan, Ann Arbor |

Steven Hillion | Greenplum |

Susan Holmes | Stanford University |

Piotr Indyk | Massachusetts Institute of Technology |

Ilse Ipsen | North Carolina State University |

Dmitri Krioukov | Cooperative Association for Internet Data Analysis |

John Langford | Yahoo! Research |

Jure Leskovec | Stanford University |

Ping Li | Cornell University |

Edo Liberty | Yahoo! Research |

Mauro Maggioni | Duke University |

Gunnar Martinsson | University of Colorado, Boulder |

Gary Miller | Carnegie Mellon University |

Konstantin Mischaikow | Rutgers University |

Mehryar Mohri | New York University |

Sayan Mukherjee | Duke University |

Peter Norvig | Google Research |

Alon Orlitsky | University of California, San Diego |

Ely Porat | Bar-Ilan University |

Guillermo Sapiro | University of Minnesota |

Padhraic Smyth | University of California, Irvine |

Ashok Srivastava | National Aeronautics and Space Administration |

Neel Sundaresan | eBay Research |

Robert Stine | University of Pennsylvania |

Sebastiano Vigna | Università Degli Studi Di Milano |

David Woodruff | IBM Almaden Research Center |

John Wright | Microsoft Research Asia |

Peter Bartlett | University of California, Berkeley |

Robert Calderbank | Princeton University |

Fan Chung | University of California, San Diego |

Yoram Singer | Google Research |

Patrick Wolfe | Harvard University |

**MMDS 2008.**
Workshop on Algorithms for Modern Massive Data Sets,
Stanford, CA, June 25–28, 2008.

**MMDS 2006.**
Workshop on Algorithms for Modern Massive Data Sets,
Stanford, CA, June 21–24, 2006.