19 oct 2018

Contents

Independent factorial ANOVA

Two or more independent variables with two or more categories. One dependent variable.

Independent factorial ANOVA

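The independent factorial ANOVA analyses how much of the variance in one dependent variable is explained by multiple independent variables (factors), each with two or more categories, and by the interaction between those factors.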

Effects and interactions:

  • 1 dependent/outcome variable
  • 2 or more independent/predictor variables
    • 2 or more categories/levels

Assumptions

  • Continuous variable
  • Random sample
  • Normally distributed
    • Shapiro-Wilk test
  • Equal variance within groups
    • Levene's test
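
A minimal sketch of how the last two assumptions could be checked in base R. The data frame `d` and its columns `group` and `score` are simulated and purely hypothetical; they are not the data used in these slides. Levene's test is written out here as a one-way ANOVA on absolute deviations from the group means; packaged versions such as `car::leveneTest` default to deviations from the group medians instead, so their results can differ slightly.

```r
# Hypothetical data: 3 groups of 20 simulated scores (not the slide data)
set.seed(1)
d <- data.frame(
  group = factor(rep(c("none", "some", "much"), each = 20)),
  score = rnorm(60, mean = rep(c(2, 4, 6), each = 20), sd = 1)
)

# Normality: Shapiro-Wilk test on the residuals of the group-means model
fit <- lm(score ~ group, data = d)
sw  <- shapiro.test(residuals(fit))

# Equal variances: Levene's test is a one-way ANOVA on the absolute
# deviation of each score from its own group mean
d$abs.dev <- abs(d$score - ave(d$score, d$group))
lev       <- anova(lm(abs.dev ~ group, data = d))

sw$p.value          # p > .05: no evidence against normality
lev[["Pr(>F)"]][1]  # p > .05: no evidence against equal variances
```

For a factorial design the same idea applies per cell: the grouping variable would be the combination of both factors.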

Formulas

| Variance | Sum of squares | df | Mean squares | F-ratio |
|---|---|---|---|---|
| Model | \(\text{SS}_{\text{model}} = \sum{n_k(\bar{X}_k-\bar{X})^2}\) | \(k_{model}-1\) | \(\frac{\text{SS}_{\text{model}}}{\text{df}_{\text{model}}}\) | \(\frac{\text{MS}_{\text{model}}}{\text{MS}_{\text{error}}}\) |
| \(\hspace{2ex}A\) | \(\text{SS}_{\text{A}} = \sum{n_k(\bar{X}_k-\bar{X})^2}\) | \(k_A-1\) | \(\frac{\text{SS}_{\text{A}}}{\text{df}_{\text{A}}}\) | \(\frac{\text{MS}_{\text{A}}}{\text{MS}_{\text{error}}}\) |
| \(\hspace{2ex}B\) | \(\text{SS}_{\text{B}} = \sum{n_k(\bar{X}_k-\bar{X})^2}\) | \(k_B-1\) | \(\frac{\text{SS}_{\text{B}}}{\text{df}_{\text{B}}}\) | \(\frac{\text{MS}_{\text{B}}}{\text{MS}_{\text{error}}}\) |
| \(\hspace{2ex}A \times B\) | \(\text{SS}_{\text{A} \times \text{B}} = \text{SS}_{\text{model}} - \text{SS}_{\text{A}} - \text{SS}_{\text{B}}\) | \(\text{df}_{\text{A}} \times \text{df}_{\text{B}}\) | \(\frac{\text{SS}_{\text{A} \times \text{B}}}{\text{df}_{\text{A} \times \text{B}}}\) | \(\frac{\text{MS}_{\text{A} \times \text{B}}}{\text{MS}_{\text{error}}}\) |
| Error | \(\text{SS}_{\text{error}} = \sum{s_k^2(n_k-1)}\) | \(N-k_{model}\) | \(\frac{\text{SS}_{\text{error}}}{\text{df}_{\text{error}}}\) | |
| Total | \(\text{SS}_{\text{total}} = \text{SS}_{\text{model}} + \text{SS}_{\text{error}}\) | \(N-1\) | \(\frac{\text{SS}_{\text{total}}}{\text{df}_{\text{total}}}\) | |
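
As a rough illustration, the formulas can be worked through by hand in R and compared with the built-in ANOVA table. The data frame `d`, the factors `A` and `B`, the outcome `y` and the helper `ss.between` are hypothetical names on simulated data, not part of the original slides. The sketch assumes a balanced design, in which the hand-computed sums of squares coincide with those reported by `anova()`.

```r
# Hypothetical balanced 3 x 3 design, 20 simulated observations per cell
set.seed(42)
d <- expand.grid(rep = 1:20,
                 A = c("a1", "a2", "a3"),
                 B = c("b1", "b2", "b3"),
                 stringsAsFactors = TRUE)
d$y <- rnorm(nrow(d), mean = as.integer(d$A) + 2 * as.integer(d$B))

grand <- mean(d$y)
N     <- nrow(d)

# Sum over groups of n_k * (group mean - grand mean)^2.
# ave() repeats each group mean once per observation, which supplies n_k.
ss.between <- function(...) sum((ave(d$y, ...) - grand)^2)

ss.model <- ss.between(d$A, d$B)                 # all 9 cells
ss.A     <- ss.between(d$A)                      # main effect A
ss.B     <- ss.between(d$B)                      # main effect B
ss.AB    <- ss.model - ss.A - ss.B               # interaction
ss.error <- sum((d$y - ave(d$y, d$A, d$B))^2)    # within-cell variation
ss.total <- sum((d$y - grand)^2)

df.A     <- nlevels(d$A) - 1
df.B     <- nlevels(d$B) - 1
df.AB    <- df.A * df.B
df.error <- N - nlevels(d$A) * nlevels(d$B)

ms.error <- ss.error / df.error
F.A      <- (ss.A  / df.A)  / ms.error
F.B      <- (ss.B  / df.B)  / ms.error
F.AB     <- (ss.AB / df.AB) / ms.error

# Reference table from R for comparison
tab <- anova(lm(y ~ A * B, data = d))
tab
```

In this balanced case the rows `A`, `B`, `A:B` and `Residuals` of `tab` should match `ss.A`, `ss.B`, `ss.AB` and `ss.error`, and `ss.model + ss.error` should equal `ss.total`.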

Example

In this example we will look at the number of accidents in a car driving simulator while subjects were given varying doses of speed and alcohol.

  • Dependent variable
    • Accidents
  • Independent variables
    • Speed
      • None
      • Small
      • Large
    • Alcohol
      • None
      • Small
      • Large

person alcohol speed accidents
1 1 1 0
2 1 2 2
3 1 3 4
4 2 1 6
5 2 2 8
6 2 3 10
7 3 1 12
8 3 2 14
9 3 3 16
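
The nine rows of this design table are small enough to work through directly. The sketch below re-enters them as a data frame (the name `toy` is an assumption for illustration) and computes the marginal means and sums of squares. Because there is only one observation per cell, no error variance is left over, and in this toy table the two main effects happen to account for all of the total variation.

```r
# The nine rows of the design table above
toy <- data.frame(
  person    = 1:9,
  alcohol   = rep(1:3, each = 3),
  speed     = rep(1:3, times = 3),
  accidents = seq(0, 16, by = 2)
)

grand <- mean(toy$accidents)                        # 8
m.alc <- tapply(toy$accidents, toy$alcohol, mean)   # 2, 8, 14
m.spd <- tapply(toy$accidents, toy$speed, mean)     # 6, 8, 10

# Each marginal mean is based on 3 observations
ss.total   <- sum((toy$accidents - grand)^2)        # 240
ss.alcohol <- sum(3 * (m.alc - grand)^2)            # 216
ss.speed   <- sum(3 * (m.spd - grand)^2)            # 24

# What remains after both main effects is the interaction
ss.total - ss.alcohol - ss.speed                    # 0
```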

Data

SSmodel

| Variance | Sum of squares | df | Mean squares | F-ratio |
|---|---|---|---|---|
| Model | \(\text{SS}_{\text{model}} = \sum{n_k(\bar{X}_k-\bar{X})^2}\) | \(k_{model}-1\) | \(\frac{\text{SS}_{\text{model}}}{\text{df}_{\text{model}}}\) | \(\frac{\text{MS}_{\text{model}}}{\text{MS}_{\text{error}}}\) |


##   speed alcohol accidents  n
## 1  much    much    7.5720 20
## 2  none    much    5.2970 20
## 3  some    much    6.5125 20
## 4  much    none    3.8880 20
## 5  none    none    2.1060 20
## 6  some    none    2.9445 20
## 7  much    some    5.5790 20
## 8  none    some    3.4435 20
## 9  some    some    4.7625 20
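
The slides do not show how this table of cell means was produced. One possible way, sketched here on simulated data, is `aggregate()`. The data frame `sim` is a hypothetical stand-in for the real `data` object, so the numbers it yields will not match the output above.

```r
# Hypothetical stand-in for the slide data: 9 cells x 20 participants
set.seed(7)
lv  <- c("none", "some", "much")
sim <- expand.grid(rep = 1:20, speed = lv, alcohol = lv,
                   stringsAsFactors = FALSE)
sim$accidents <- round(rnorm(nrow(sim), mean = 4), 2)

# Mean accidents per speed x alcohol cell
cell.means <- aggregate(accidents ~ speed + alcohol, data = sim, FUN = mean)

# Number of participants per cell, added as column n
cell.means$n <- aggregate(accidents ~ speed + alcohol, data = sim,
                          FUN = length)$accidents
cell.means
```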

SS.model <- sum((exp.accidents - mean(data$accidents))^2); SS.model
## [1] 494.2205
# Mean number of accidents in each of the 9 speed x alcohol cells
m.k1 = mean(subset(data, speed == "none" & alcohol == "none", select = "accidents")$accidents)
m.k2 = mean(subset(data, speed == "none" & alcohol == "some", select = "accidents")$accidents)
m.k3 = mean(subset(data, speed == "none" & alcohol == "much", select = "accidents")$accidents)
m.k4 = mean(subset(data, speed == "some" & alcohol == "none", select = "accidents")$accidents)
m.k5 = mean(subset(data, speed == "some" & alcohol == "some", select = "accidents")$accidents)
m.k6 = mean(subset(data, speed == "some" & alcohol == "much", select = "accidents")$accidents)
m.k7 = mean(subset(data, speed == "much" & alcohol == "none", select = "accidents")$accidents)
m.k8 = mean(subset(data, speed == "much" & alcohol == "some", select = "accidents")$accidents)
m.k9 = mean(subset(data, speed == "much" & alcohol == "much", select = "accidents")$accidents)

# Balanced design: 20 participants in every cell
n.k1 = n.k2 = n.k3 = n.k4 = n.k5 = n.k6 = n.k7 = n.k8 = n.k9 = 20

# Grand mean over all participants
grand.mean = mean(data$accidents)

# Contribution of each cell: n_k * (cell mean - grand mean)^2
ss.m.k1 = n.k1 * (m.k1 - grand.mean)^2
ss.m.k2 = n.k2 * (m.k2 - grand.mean)^2
ss.m.k3 = n.k3 * (m.k3 - grand.mean)^2
ss.m.k4 = n.k4 * (m.k4 - grand.mean)^2
ss.m.k5 = n.k5 * (m.k5 - grand.mean)^2
ss.m.k6 = n.k6 * (m.k6 - grand.mean)^2
ss.m.k7 = n.k7 * (m.k7 - grand.mean)^2
ss.m.k8 = n.k8 * (m.k8 - grand.mean)^2
ss.m.k9 = n.k9 * (m.k9 - grand.mean)^2

# SS_model is the sum of the 9 cell contributions
ss.model = sum(ss.m.k1,ss.m.k2,ss.m.k3,ss.m.k4,ss.m.k5,ss.m.k6,ss.m.k7,ss.m.k8,ss.m.k9)
ss.model
## [1] 494.2205
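
Both routes above arrive at the same value, which is consistent with `exp.accidents` holding each participant's cell mean; the slides do not show its definition, so that is an assumption. Under that assumption, the nine-cell computation could be condensed as in the following sketch, again on a simulated data frame `sim` (hypothetical, not the slide data).

```r
# Hypothetical stand-in for the slide data: 9 cells x 20 participants
set.seed(7)
lv  <- c("none", "some", "much")
sim <- expand.grid(rep = 1:20, speed = lv, alcohol = lv,
                   stringsAsFactors = FALSE)
sim$accidents <- rnorm(nrow(sim), mean = 4)

grand <- mean(sim$accidents)

# Route 1: one term per participant.
# The predicted value is the mean of the participant's own cell.
exp.sim <- ave(sim$accidents, sim$speed, sim$alcohol)
ss.obs  <- sum((exp.sim - grand)^2)

# Route 2: one term per cell, n_k * (cell mean - grand mean)^2
cells   <- list(sim$speed, sim$alcohol)
m.k     <- tapply(sim$accidents, cells, mean)
n.k     <- tapply(sim$accidents, cells, length)
ss.cell <- sum(n.k * (m.k - grand)^2)

all.equal(ss.obs, ss.cell)
```

The two routes agree because every participant in a cell shares the same predicted value, so summing over participants repeats each cell's squared deviation `n.k` times.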

SSmodel visual

# Plot all data points
plot(accidents,
     xlab = 'participants')

# With the mean
lines(1:n,rep(mean(accidents),n),col='black',lwd=2)

# The black lines show the total variance: each score's deviation from the grand mean.
segments(1:n, mean(accidents), 1:n, accidents)

# The model's predicted accident scores
points(1:n,exp.accidents, col='red')

p <- recordPlot()

# The red lines show the model variance: each predicted score's deviation from the grand mean.
segments(1:n, exp.accidents, 1:n, mean(accidents), col = "red")

# Add legend to plot
legend("topleft",
       pch    = c("o"),
       col    = c("red"),
       legend = c("Full model") )