FINM 33180/STAT 32940. Multivariate Data Analysis via Matrix Decompositions

Department of Statistics
University of Chicago
Fall 2016

This course is about using matrix computations to infer useful information from observed data. One may view it as an "applied" version of Stat 309; the only prerequisite for this course is basic linear algebra. The data analytic tools that we will study go beyond linear and multiple regression and often fall under the heading of "Multivariate Analysis" in Statistics or "Unsupervised Learning" in Machine Learning. These include factor analysis, correspondence analysis, principal components analysis, multidimensional scaling, canonical correlation analysis, Procrustes analysis, partial least squares, etc. We will also discuss a small number of supervised learning techniques, including discriminant analysis and support vector machines. Understanding these techniques requires some facility with matrices (primarily eigenvalue and singular value decompositions, as well as their generalizations) in addition to some basic statistics, both of which the student will acquire during the course.
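To give a flavor of how a matrix decomposition yields a data analytic tool, here is a minimal sketch of principal components analysis computed from the singular value decomposition of a centered data matrix. It is an illustration only, not course material: the function name `pca_via_svd`, the synthetic data, and the choice of NumPy are all assumptions made for this example.

```python
import numpy as np

def pca_via_svd(X, k):
    """Illustrative helper (hypothetical name): top-k PCA via the SVD.

    X is an n-by-p data matrix with observations in rows.
    Returns (components, scores, variances).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each variable (column)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values in descending order
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                        # rows are principal directions
    scores = U[:, :k] * s[:k]                  # coordinates of observations
    variances = s[:k] ** 2 / (X.shape[0] - 1)  # sample variance along each direction
    return components, scores, variances

# Synthetic data, made up purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
components, scores, variances = pca_via_svd(X, 2)
```

The variances returned here agree with the two largest eigenvalues of the sample covariance matrix, which is the link between the singular value and eigenvalue decompositions mentioned above.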

Announcements

Lectures

Location: Kent Chem Lab, Room 120

Times: Mon, 6:30–9:30pm

Course staff

Instructor: Lek-Heng Lim
Office: Jones 122B
lekheng(at)galton.uchicago.edu
Tel: (773) 702-4263
Office hours: Mon, 2:00–4:00pm, Jones 122B

Chicago Course Assistant I: Klakow Akepanidtaworn
klakowa(at)uchicago.edu
Chicago Course Assistant II: Triwit Ariyathugun
triwita1(at)uchicago.edu
Office hours: Mon, 3:30–5:00pm in Math-Stat Library; Thu, 6:30–8:00pm in Room 302, Math-Stat Building (Stevanovich Center)

Syllabus

The last two applications fall under supervised learning but we will discuss them if time permits, if only to give an idea of how supervised learning differs from unsupervised learning.

Problem Sets

Collaborations are permitted but you will need to write up your own solutions and declare your collaborators. The problem sets are designed to get progressively more difficult. You will get about 10 days for each problem set.

You are required to implement your own programs for problems that require some amount of simple coding (using Matlab, Mathematica, R, or SciPy).

Bug report on the problem sets: lekheng(at)galton.uchicago.edu

Grades

Grade composition: 60% Problem Sets, 40% Final Exam (Mon, Dec 5, 6:30–9:30pm, Kent 120).

Supplementary materials

References

You may download some of these books online from a UChicago IP address or via ProxyIt! if you are off-campus.