Material for Session 5: Analyzing Categorical Data
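The table below simply tabulates how often (what fraction of the time) a random variate will be larger than some specified value. For the Normal row, "larger" means larger in absolute value (both tails counted).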
| Distribution | df | 0.5 | 0.2 | 0.1 | 0.05 | 0.02 | 0.01 | 0.005 |
|---|---|---|---|---|---|---|---|---|
| Normal | | 0.67 | 1.28 | 1.64 | 1.96 | 2.33 | 2.58 | 2.81 |
| Chi Square | 1 | 0.46 | 1.64 | 2.71 | 3.84 | 5.41 | 6.64 | 7.88 |
| Chi Square | 2 | 1.39 | 3.22 | 4.60 | 5.99 | 7.82 | 9.21 | 10.60 |
| Chi Square | 3 | 2.37 | 4.64 | 6.25 | 7.82 | 9.84 | 11.34 | 12.84 |
| Chi Square | 4 | 3.36 | 5.99 | 7.78 | 9.49 | 11.67 | 13.28 | 14.86 |
| Chi Square | 5 | 4.35 | 7.29 | 9.24 | 11.07 | 13.39 | 15.09 | 16.75 |
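As a rough check, the first three rows of the table can be reproduced with the Python standard library alone. This is only a sketch: the helper names are hypothetical, and the shortcuts for 1 and 2 degrees of freedom rely on standard identities that these notes do not state (Chi Square with 1 df is the square of a Standard Normal variable; Chi Square with 2 df is an exponential variable with mean 2). Rows with more degrees of freedom would need a statistics library.

```python
import math
from statistics import NormalDist

# Tail probabilities used as column headings in the table above.
TAIL_PROBS = [0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005]

def normal_two_tailed(p):
    """Value z such that |Z| > z with probability p for a Standard Normal Z."""
    return NormalDist().inv_cdf(1 - p / 2)

def chi_square_df1(p):
    """Assumed identity: Chi Square with 1 df is a squared Standard Normal."""
    return normal_two_tailed(p) ** 2

def chi_square_df2(p):
    """Assumed identity: with 2 df, P(X > x) = exp(-x / 2), so x = -2 ln(p)."""
    return -2 * math.log(p)

print("p       Normal  Chi2(1)  Chi2(2)")
for p in TAIL_PROBS:
    print(f"{p:<7} {normal_two_tailed(p):6.2f}  {chi_square_df1(p):7.2f}  {chi_square_df2(p):7.2f}")
```

The printed values should agree with the table up to rounding in the last digit.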
Historically, we expect 9 fatal accidents in any given week. Last week we had 15 fatal accidents. How likely is this?
The Poisson distribution predicts the following likelihood of accidents in a week if the average is 9:
| # Accidents | Probability | # Accidents | Probability |
|---|---|---|---|
| 0 | 0.0 % | 10 | 11.9 % |
| 1 | 0.1 % | 11 | 9.7 % |
| 2 | 0.5 % | 12 | 7.3 % |
| 3 | 1.5 % | 13 | 5.0 % |
| 4 | 3.4 % | 14 | 3.2 % |
| 5 | 6.1 % | 15 | 1.9 % |
| 6 | 9.1 % | 16 | 1.1 % |
| 7 | 11.7 % | 17 | 0.6 % |
| 8 | 13.2 % | 18 | 0.3 % |
| 9 | 13.2 % | 19 | 0.1 % |
The probability of getting 15 events is 1.9%.
But the probability of getting 15 or more events is 1.9 + 1.1 + 0.6 + 0.3 + 0.1, or about 4.0%.
But getting 6 fewer events than expected is just as unusual as getting 6 more events than expected.
The probability of getting 3 or fewer events is 1.5 + 0.5 + 0.1, or about 2.1%.
So we would expect a weekly fatal accident count as "surprising" as ours to occur about 4.0 + 2.1 = 6.1% of the time (roughly three weeks a year).
So, it might be due to chance (a borderline case).
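The two tail sums can also be computed without rounding each term first. In the sketch below (helper names are hypothetical), the upper tail includes counts above 19, which the hand calculation leaves out, so the result comes out slightly higher than the figure obtained by adding the rounded table entries.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k, mean):
    """Probability of k or fewer events."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

MEAN = 9
OBSERVED = 15
deviation = OBSERVED - MEAN                    # 6 more events than expected

upper = 1 - poisson_cdf(OBSERVED - 1, MEAN)    # P(15 or more events)
lower = poisson_cdf(MEAN - deviation, MEAN)    # P(3 or fewer events)

print(f"P(X >= {OBSERVED}) = {100 * upper:.1f} %")
print(f"P(X <= {MEAN - deviation}) = {100 * lower:.1f} %")
print(f"Both tails = {100 * (upper + lower):.1f} %")
```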
Use the Normal approximation to the Poisson distribution:
A Poisson variable with a mean of N is distributed approximately like
a Normal variable with a mean of N and a SD of Sqrt(N).
A Poisson variable with a mean of 9 is distributed like a Normal variable with a mean of 9 and a SD of 3.
15 is 6 larger than 9 (the mean).
6 is twice as much as 3 (the SD).
So having 15 events when you expect 9 is like drawing the number 2.0 from a Standard Normal distribution (m=0, sd=1).
A value at least this far from zero, in either direction, only happens about 5% of the time.
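The Normal approximation steps above can be sketched in a few lines using `statistics.NormalDist` from the Python standard library. The variable names are illustrative, and the probability counts both tails, matching the Normal row of the table at the top of this section.

```python
import math
from statistics import NormalDist

MEAN = 9        # expected number of events (also the Poisson variance)
OBSERVED = 15   # number of events actually seen

sd = math.sqrt(MEAN)            # SD of the approximating Normal: Sqrt(9) = 3
z = (OBSERVED - MEAN) / sd      # how many SDs the observation is from the mean

# Probability that a Standard Normal value is at least this far from 0,
# in either direction.
two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.1f}")
print(f"two-sided probability = {100 * two_sided:.1f} %")
```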