18 oct 2018

Inhoud

ANCOVA

ANCOVA

Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression. ANCOVA evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV) often called a treatment, while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CV).

WIKIPEDIA

ANCOVA

Determine main effect while correcting for covariate

  • 1 dependent variable
  • 1 or more independent variables
  • 1 or more covariates

A covariate is a variable can be a confounding variable biasing your results. By adding a covariate we reduce error/residual in the model.

Assumptions

  • Same as ANOVA
  • Independence of the covariate and treatment effect §12.3.1.
    • No difference on ANOVA with covar and independent variable
    • Matching experimental groups on the covariate
  • Homogeneity of regression slopes §12.3.2.
    • Visual: scatterplot dep var * covar per condition
    • Testing: interaction indep. var * covar

Data example

We want to test the difference in national extraversion but want to controll for openness to experience.

  • Dependent variable: Extraversion
  • Independent variabele: Nationality
    • Dutch
    • German
    • Belgian
  • Covariate: Openness to experience

Simulate data

# Simulate data
n = 20
k = 3
group       = round(runif(n,1,k),0)
mu.covar    = 8
sigma.covar = 1
covar       = round(rnorm(n,mu.covar,sigma.covar),2)

# Create dummy variables
dummy.1 <- ifelse(group == 2, 1, 0)
dummy.2 <- ifelse(group == 3, 1, 0)

# Set parameters
b.0 = 15 # initial value for group 1
b.1 = 3  # difference between group 1 and 2
b.2 = 4  # difference between group 1 and 3
b.3 = 3  # Weight for covariate

# Create error
error = rnorm(n,0,1)

Define the model

\[{outcome} = {model} + {error}\] \[{model} = {indvar} + {covariate} = {nationality} + {openness}\]

Formal model

\[b_0 + b_1 {dummy}_1 + b_2 {dummy}_2 + b_3 covar\]

# Define model
outcome = b.0 + b.1 * dummy.1 + b.2 * dummy.2 + b.3 * covar + error

Dummies

The data

Group means

aggregate(outcome ~ group, data, mean)
##   group  outcome
## 1     1 38.67000
## 2     2 42.16625
## 3     3 43.79800

Model fit no covar

What are the beta coëfficients based on the data without the covariate?

fit.group <- lm(outcome ~ factor(group), data); fit.group
## 
## Call:
## lm(formula = outcome ~ factor(group), data = data)
## 
## Coefficients:
##    (Intercept)  factor(group)2  factor(group)3  
##         38.670           3.496           5.128
fit.group$coefficients[2:3] + fit.group$coefficients[1]
## factor(group)2 factor(group)3 
##       42.16625       43.79800

\[{Dutch} = 38.67 \> {German} = 42.16625 \> {Belgian} = 43.798\]

Model fit only covar

What is the weight of only the covariate?

fit.covar <- lm(outcome ~ covar, data)
fit.covar
## 
## Call:
## lm(formula = outcome ~ covar, data = data)
## 
## Coefficients:
## (Intercept)        covar  
##      15.667        3.185

Model fit with covar

fit <- lm(outcome ~ factor(group) + covar, data); fit
## 
## Call:
## lm(formula = outcome ~ factor(group) + covar, data = data)
## 
## Coefficients:
##    (Intercept)  factor(group)2  factor(group)3           covar  
##         15.965           2.769           4.181           2.881
fit$coefficients[2:3] + fit$coefficients[1]
## factor(group)2 factor(group)3 
##       18.73401       20.14609

\[{Dutch} = 15.96 \> {German} = 18.73 \> {Belgian} = 20.14\]

Total variance

What is the total variance

\[{MS}_{total} = s^2_{outcome} = \frac{{SS}_{outcome}}{{df}_{outcome}}\]

ms.t = var(data$outcome); ms.t
## [1] 11.97756
ss.t = var(data$outcome) * (length(data$outcome) - 1); ss.t
## [1] 227.5737

The data

Total variance visual

plot <- ggplot(data, aes(x=n, y=outcome)) + geom_point(shape=1) +
        geom_hline(yintercept=mean(outcome)) +
        geom_segment(aes(x = n, y = grand.mean, xend = n, yend = outcome)) +
        ggtitle("Total variance")
plot