Please note that the official course website is on Canvas (log in with your CNetID), NOT here. This page is meant to give those interested in STAT 22400 an idea of what the course is like.

Prerequisites

STAT 22000 or 23400 with a grade of at least C; or STAT 24500, STAT 24510, PBHS 32100, AP Statistics credit, or equivalent; and two quarters of calculus (MATH 13200, 15200, or 16200 or above).

Course Description

STAT 22400/PBHS 32400 introduces the methods and applications of fitting and interpreting multiple regression models. The primary emphasis is on the method of least squares and its many varieties. Topics include the examination of residuals, the transformation of data, strategies and criteria for the selection of a regression equation, the use of dummy variables, tests of fit, nonlinear models, multicollinearity, biases due to excluded variables and measurement error, and the use and interpretation of regression software. The techniques discussed are illustrated with many real examples involving data from both the natural and social sciences. Matrix notation is introduced as needed.
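
To give a concrete sense of the fitting and interpretation described above, here is a minimal R sketch using the built-in mtcars data (an illustrative choice, not a course dataset): it fits a multiple regression by least squares with lm(), includes a dummy-coded categorical predictor, and looks at the coefficient summary, confidence intervals, and residuals.

```r
# Illustrative only: a small multiple linear regression in base R.
data(mtcars)

# Treat the number of cylinders as a categorical predictor (dummy variables).
mtcars$cyl <- factor(mtcars$cyl)

# Least-squares fit of mpg on weight, horsepower, and cylinder count.
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)

# Coefficient estimates, standard errors, t-tests, R-squared, etc.
summary(fit)

# Confidence intervals for the individual coefficients.
confint(fit)

# Basic residual check: residuals versus fitted values.
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
```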

Textbooks

Chatterjee & Hadi. Regression Analysis by Example, 5th edition 2005, Wiley

Course Schedule and Slides

Week/Date | Slides | Content | Textbook Coverage
Before Class | L00.pdf | Brief intro to R and RStudio |
Week 1 – 9/27 | L01.pdf | Intro to the ggplot2 library |
Week 1 – 9/29 | LA0928slides.pdf; LA0928_demo1.Rmd, LA0928_demo2.Rmd, LA0928_demo3.Rmd | Intro to R Markdown |
Week 1 – 10/1 | LA1001slides.pdf; LA1001.Rmd; LA1001.pdf | Example: NC Birth Data |
Week 2 – 10/4 | L02.pdf | What are multiple linear regression models? Least-squares estimation; fitted values; residuals and their properties; interpretation of regression coefficients as effects of \(X_i\) on \(Y\) after adjusting for other covariates | Sections 3.1-3.5
Week 2 – 10/6 | L03.pdf | Standard errors and distributions of least-squares estimators; confidence intervals and hypothesis tests for individual regression coefficients | Sections 3.7, 3.9
Week 2 – 10/8 | L04.pdf (pp. 1-25) | Confidence intervals and prediction intervals for predictions; sums of squares and their degrees of freedom; mean squares; multiple R-squared and adjusted R-squared | Sections 3.11, 3.8
Week 3 – 10/11 | L04.pdf (p. 26 to end) | F-tests of multiple coefficients (all coefficients, a subset of coefficients, the equality of coefficients, estimation and tests of coefficients under constraints) | Section 3.10
Week 3 – 10/13 | L05.pdf | Models with categorical predictors/dummy variables; interactions of two categorical predictors | Sections 5.1-5.3
Week 3 – 10/15 | L06.pdf | Interactions of categorical and numerical predictors | Section 5.4
Week 4 – 10/18 | L07.pdf | Interactions of three or more predictors | Section 5.4
Week 4 – 10/20 | L08.pdf; L09.pdf | Polynomial models; ordinal categorical predictors |
Week 4 – 10/22 | L10.pdf | Model diagnostics; assumptions of MLR; leverage; standardized and Studentized residuals; residual plots (illustrated in the R sketch after this schedule) | Sections 4.1-4.4
Week 5 – 10/25 | L11.pdf; qqnorm.pdf | Checking assumptions; pairwise scatterplots and better tools; checking interactions of two numerical predictors; residual-plus-component plots; normal probability plots | Sections 4.5-4.7, 4.12.2
Week 5 – 10/27, 10/29 | L12.pdf | Influential points and outliers; hat matrix, leverages, high-leverage points; Cook's distance; added-variable plots | Sections 4.8-4.11, 4.12.1
Week 6 – 11/1 | L13.pdf | Transformation of variables | Chapter 6
Week 6 – 11/3 | | Midterm exam; no lecture |
Week 6 – 11/5; Week 7 – 11/8 | L14.pdf | Weighted least squares | Sections 7.1-7.2
Week 7 – 11/10, 11/12 | L15.pdf | The problem of correlated errors; detection (time plot, runs test, Durbin-Watson test, lag plots, autocorrelation function and plots); remedies (removing AR(1) dependence, including missing predictors, removing seasonality) | Chapter 8
Week 8 – 11/15, 11/17 | L16.pdf | Multicollinearity | Chapter 9
Week 8 – 11/19 | | |
Week 9 – 11/29, 12/1 | L17.pdf; L17_example.pdf | Variable selection procedures | Chapter 11
Week 9 – 12/3 | L18.pdf | Ridge and lasso regression | Chapter 9
Week 10 – 12/8, 12/9 | | Online final exam |
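
As a small illustration of the diagnostics listed for Weeks 4-5, the sketch below (again on the built-in mtcars data, not a course dataset) computes leverages, standardized and Studentized residuals, and Cook's distance, and draws a normal probability plot of the residuals; all of these are available in base R.

```r
# Illustrative only: common regression diagnostics in base R.
data(mtcars)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Leverages (diagonal of the hat matrix) and a rule-of-thumb flag.
h <- hatvalues(fit)
which(h > 2 * mean(h))   # observations with unusually high leverage

# Standardized (internally Studentized) and externally Studentized residuals.
rstandard(fit)
rstudent(fit)

# Cook's distance for identifying influential observations.
cooks.distance(fit)

# Normal probability plot of the residuals.
qqnorm(resid(fit))
qqline(resid(fit))

# The four standard diagnostic plots (residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage) in one call.
par(mfrow = c(2, 2))
plot(fit)
```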

Last Update: 08/14/2022