Suppose we asked four students how many hours of sleep they had last night and their answers were \(9\), \(3\), \(15\), and \(1\) hours.

The size of the sample is \(4\) students.

One summary of these data is the average number of hours per student.

STAT 200

\(\displaystyle \textrm{AVG} \;=\; \frac{\textrm{sum of the values}}{\textrm{number of values}}\)

For these data, the average is \(\quad \displaystyle \textrm{AVG} \;=\; \frac{9 + 3 + 15 + 1}{4} = \frac{28}{4} = 7\) hours.

Whenever we write “AVG” we mean the average: the sum of the values divided by the number of values.
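The same calculation can be sketched in a few lines of Python. The function name `average` and the variable name `hours` are chosen here only for illustration; they are not part of the course notation.

```python
def average(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)

# Hours of sleep reported by the four students
hours = [9, 3, 15, 1]

print(average(hours))  # 28 / 4 = 7.0
```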

The average is the balancing point of the data (the fulcrum).

STAT 220

Includes everything in STAT 200 plus…

\(\displaystyle \textrm{AVG} \;=\; \frac{1}{n} \sum_{i=1}^n x_i \;=\; \overline{x}\)

and we will refer to the average as \(\;\overline{x} =\) “x-bar”.

For these data, \(x_1=9, x_2=3, x_3=15,\) and \(x_4=1\).
The sample size is \(n=4\).

So, \(\;\;\displaystyle \overline{x} \;=\; \frac{1}{4} (9 + 3 + 15 + 1) = \frac{1}{4} (28) \;=\; 7\) hours.
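To connect the summation notation to something concrete, here is a small Python sketch that spells out the sum over \(i = 1, \ldots, n\) as an explicit loop. The variable names `x`, `n`, `total`, and `x_bar` are assumptions made for this illustration. Python lists start counting at 0, so \(x_i\) is stored as `x[i - 1]`.

```python
# Data values x_1, ..., x_4 (Python lists are 0-based, so x_i is x[i - 1])
x = [9, 3, 15, 1]
n = len(x)  # sample size

# Accumulate the sum for i = 1, 2, ..., n
total = 0
for i in range(1, n + 1):
    total += x[i - 1]

x_bar = (1 / n) * total
print(x_bar)  # (1/4)(28) = 7.0
```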

STAT 234

Includes everything in STAT 200 and STAT 220 plus…

Let’s try to quantify the concept that \(\overline{x}\) is the balancing point of the data.

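We can see that \(x_1 = 9\) is \(2\) units to the right of \(\overline{x} = 7\), \(x_2 = 3\) is \(4\) units to the left, and \(x_3 = 15\) is \(8\) units to the right.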

So, for the first \(3\) values, we see \((2 + 8) = 10\) units to the right and \(4\) units to the left for a net distance of \(6\) units to the right of \(\overline{x}\).

To have balance, the last data value must be \(6\) units to the left of \(\overline{x}\).

Indeed, \(x_4 = 1\) is \(6\) units to the left of \(\overline{x} = 7\) since \((1 - 7) = -6\).
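The balancing idea can also be checked numerically. The following Python sketch, with illustrative variable names `right` and `left` that are not part of the course notation, adds up the distances on each side of the average for these four values.

```python
hours = [9, 3, 15, 1]
x_bar = sum(hours) / len(hours)  # 7.0

# Distance of each value from the average, split by side
right = [x - x_bar for x in hours if x > x_bar]  # values above the average
left = [x_bar - x for x in hours if x < x_bar]   # values below the average

print(right, sum(right))  # [2.0, 8.0] 10.0
print(left, sum(left))    # [4.0, 6.0] 10.0
```

The total distance to the right matches the total distance to the left, which is what "balance" means for this data set.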

OK, so this works for the data values \(9\), \(3\), \(15\), and \(1\) and their average \(\overline{x} = 7\) hours.

Does it work for any set of \(n\) values?

First, let’s write a mathematical expression that expresses this idea of “balance”.
You think about it for a little bit and see what you come up with.

DON’T LOOK BELOW UNTIL YOU HAVE TRIED TO FIND A MATHEMATICAL EXPRESSION YOURSELF





I suggest this expression: \(\displaystyle \sum_{i=1}^n (x_i - \overline{x}) = 0\).

Indeed, for the values \(9\), \(3\), \(15\), and \(1\),

\[ \begin{align} \sum_{i=1}^n (x_i - \overline{x}) & \;=\; (9 - 7) + (3 - 7) + (15 - 7) + (1 - 7) \\ & \;=\; (2) + (-4) + (8) + (-6) \\ & \;=\; 0 \end{align} \]

This next part really sets STAT 234 apart from STAT 220.

Show that \(\displaystyle \sum_{i=1}^n (x_i - \overline{x}) = 0\) is true for any \(n\) values \(x_1, x_2, \ldots, x_n\).

This is a “proof” of the form “show that the left side equals the right side”.
Suggestion: Start with the expression on the left side of the equation. Rewrite it, rewrite it, and rewrite it, until you are convinced that it equals the right side.

STOP HERE AND TRY TO CONFIRM ON YOUR OWN THAT THE EQUATION IS TRUE





Here’s my attempt to show that \(\displaystyle \sum_{i=1}^n (x_i - \overline{x}) = 0\) is true.
Your attempt may be equally valid, but different from mine. That’s great!

Starting on the left side…

\[ \begin{align} \sum_{i=1}^n (x_i - \overline{x}) & \;=\; \sum_{i=1}^n x_i \; - \; \sum_{i=1}^n \overline{x} \\ & \;=\; \sum_{i=1}^n x_i \; - \; n \overline{x} \\ & \;=\; \frac{n}{n}\sum_{i=1}^n x_i \; - \; n \overline{x} \\ & \;=\; n \; \left(\frac{1}{n}\sum_{i=1}^n x_i\right) \; - \; n \overline{x} \\ & \;=\; n \overline{x} \; - \; n \overline{x} \\ & \; = \; 0 \end{align} \]

The last step we just borrowed from the math department. :)
The difference between a number and itself is zero: \(n \overline{x} \; - \; n \overline{x} = 0\).
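A numerical check is not a proof, but it can be a reassuring sanity check on the algebra. The Python sketch below is one possible way to try the identity on a few data sets. The helper name `sum_of_deviations` is hypothetical, and exact fractions from the standard library are used (an implementation choice made here) so that floating-point rounding does not blur the result.

```python
from fractions import Fraction
import random

def sum_of_deviations(values):
    """Return the sum of (x_i - x_bar), computed with exact fractions."""
    xs = [Fraction(v) for v in values]
    x_bar = sum(xs) / len(xs)
    return sum(x - x_bar for x in xs)

print(sum_of_deviations([9, 3, 15, 1]))  # 0

# Try a few randomly generated data sets of different sizes
random.seed(0)
for n in (1, 5, 50):
    data = [random.randint(-100, 100) for _ in range(n)]
    print(n, sum_of_deviations(data))  # each sum is exactly 0
```

With ordinary floating-point numbers the computed sum may come out as a tiny nonzero value instead of exactly zero. That is a rounding artifact, not a counterexample to the identity.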


Director of Undergraduate Studies: Dr. Linda Brant Collins
Email: lcollins@uchicago.edu
