Please note that the official course website is on Canvas (log in with CNetID), NOT here. This webpage is for those who are interested in STAT 22400 to get an idea of what the course is like.

Prerequisites

STAT 22000 or 23400 with a grade of at least C; or STAT 24500, 24510, PBHS 32100, AP Statistics credit, or equivalent; and two quarters of calculus (MATH 13200, 15200, or 16200 or above).

Course Description

STAT 22400/PBHS 32400 introduces the methods and applications of fitting and interpreting multiple regression models. The primary emphasis is on the method of least squares and its many varieties. Topics include the examination of residuals, the transformation of data, strategies and criteria for the selection of a regression equation, the use of dummy variables, tests of fit, nonlinear models, multicollinearity, biases due to excluded variables and measurement error, and the use and interpretation of computer package regression programs. The techniques discussed are illustrated by many real examples involving data from both the natural and social sciences. Matrix notation is introduced as needed.

Textbooks

Chatterjee & Hadi. Regression Analysis by Example, 5th edition, 2012, Wiley

Course Schedule and Slides

| Week/Date | Slides | Content | Textbook Coverage |
|---|---|---|---|
| Before Class | L00.pdf | Brief Intro to R and RStudio | |
| Week 1 – 9/30, 10/2 | L01_02.pdf | Intro to the ggplot2 Library | |
| Week 1 – 10/4 | L03.pdf | Intro to R Markdown; L03A.qmd, L03A.pdf, L03B.qmd, L03B.pdf, L03C.qmd, L03C.pdf | |
| Week 2 – 10/7, 10/9 | L04.pdf | What are Multiple Linear Regression Models? Least-Squares Estimation; Fitted Values; Residuals and their Properties; Interpretation of Regression Coefficients as Effects of \(X_i\) on \(Y\) After Adjusting for Other Covariates | 3.1-3.5 |
| Week 2 – 10/9, 10/11 | L05.pdf | Standard Errors and Distributions of Least-Squares Estimators; Confidence Intervals and Hypothesis Tests of Individual Regression Coefficients | 3.7, 3.9 |
| Week 2 – 10/11; Week 3 – 10/14, 10/16 | L06.pdf | Sums of Squares and their Degrees of Freedom; Mean Squares; Multiple R-Squared, Adjusted R-Squared; F-Tests of Multiple Coefficients and Comparisons of Nested Models | 3.11, 3.8 |
| Week 3 – 10/16, 10/18 | L07.pdf | Models with Categorical Predictors/Dummy Variables; Interactions of Two Categorical Predictors | 5.1-5.3 |
| Week 3 – 10/18; Week 4 – 10/21, 10/23 | L08.pdf, Ftests.pdf | Interactions of Categorical and Numerical Predictors; F-Distributions & F-Tests | 5.4 |
| Week 4 – 10/23, 10/25 | L09.pdf | Interactions of 3+ Predictors | 5.4 |
| Week 4 – 10/25 | L10.pdf | Polynomial Models | |
| Not Covered in 2024 | L11.pdf | Models with Ordinal Categorical Predictors | |
| Week 5 – 10/28, 11/1 | L12.pdf | Model Diagnostics; Assumptions of MLR; Checking the Linearity Assumptions; Why Pairwise Scatterplots Are Not Useful and Better Tools; Checking Interactions of Two Numerical Predictors; Residual-Plus-Component Plot | 4.1-4.4, 4.12.2 |
| Week 5 – 10/30 | | In-Class Midterm; No Lecture | |
| Week 5 – 11/1; Week 6 – 11/4, 11/6 | L13_14.pdf, qqnorm.pdf | Outliers vs. Influential Points; Hat Matrix, Leverages, High Leverage Points; Types of Residuals, Residual Plots for Checking Assumptions; Measures of Influence (DFFITs, Cook’s D, DFBETAs); Added-Variable Plots; Normal Probability Plots | 4.8, 4.9, 4.11, 4.12.1, 4.7 |
| Week 6 – 11/6, 11/8 | L15.pdf | Transformation of Variables | Chapter 6 |
| Week 6 – 11/8; Week 7 – 11/11 | L16.pdf | Weighted Least-Squares | 7.1-7.2 |
| Week 7 – 11/13, 11/15 | L17.pdf | The Problem of Correlated Errors; Detection (Time Plots, Runs Test, Durbin-Watson Test, Lag Plots, Autocorrelation Function and Plots); Remedies (by removing AR(1) dependence, by including missing predictors, by removing seasonality) | Chapter 8 |
| Week 8 – 11/18, 11/20 | L18.pdf | Multicollinearity | Chapter 9 |
| Week 8 – 11/22; Week 9 – 12/2, 12/4 | L19.pdf, L19_example.pdf | Variable Selection Procedures (Consequences of Model Misspecification, AIC, BIC, Forward Selection, Backward Elimination, Stepwise) | Chapter 11 |
| Week 9 – 12/4, 12/6 | L20.pdf | Ridge and Lasso Regression | Chapter 9 |

Last Update: 12/25/2025