library(mosaic)
library(dplyr)
library(ggplot2)
#install.packages("lme4")
library(lme4)
We will use the hox
dataset that has the example from the slides. Each pupil has a popularity rating, popular
, and extroversion rating, extrav
, their gender, sex
, the class room their in, class
, and their teachers’ years of experience, texp
. We will be exploring popularity as a function of extroversion, sex, and teacher’s experience.
hox <- read.csv("/Users/randigarcia/Desktop/Data/hox.csv", header=TRUE)
glimpse(hox)
## Observations: 2,000
## Variables: 7
## $ pupil <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16...
## $ class <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ extrav <int> 5, 7, 4, 3, 5, 4, 5, 4, 5, 5, 5, 5, 5, 5, 5, 6, 4, 4,...
## $ sex <int> 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0,...
## $ texp <int> 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 2...
## $ popular <dbl> 6.3, 4.9, 5.3, 4.7, 6.0, 4.7, 5.9, 4.2, 5.2, 3.9, 5.7...
## $ popteach <int> 6, 5, 6, 5, 6, 5, 5, 5, 5, 3, 5, 5, 5, 6, 5, 5, 2, 3,...
Do some preliminary data analysis to get familiar with the data set.
popular
?distinct()
function might help)?#Exploratory data analysis
First let’s fit the most basic multilevel model, the random intercept model with no predictors.
mlm <- lmer(popular ~ 1 + (1 | class), data = hox)
summary(mlm)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ 1 + (1 | class)
## Data: hox
##
## REML criterion at convergence: 6330.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.5655 -0.6975 0.0020 0.6758 3.3175
##
## Random effects:
## Groups Name Variance Std.Dev.
## class (Intercept) 0.7021 0.8379
## Residual 1.2218 1.1053
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.07786 0.08739 58.1
Note that model objects have handy properties:
names(summary(mlm))
## [1] "methTitle" "objClass" "devcomp" "isLmer"
## [5] "useScale" "logLik" "family" "link"
## [9] "ngrps" "coefficients" "sigma" "vcov"
## [13] "varcor" "AICtab" "call" "residuals"
## [17] "fitMsgs" "optinfo"
#which do you think will confirm your number of groups from above?
The intraclass correlation (ICC) can be computed as:
ran_eff <- as.data.frame(summary(mlm)$varcor) %>%
select(-var1, -var2)
icc <- ran_eff[1,2]/(ran_eff[1,2] + ran_eff[2,2])
icc
## [1] 0.3649386
Does the pupil’s extroversion predict their popularity? Now, let’s add 1 level 1 predictor, the pupil’s extroversion score. We’ll ass the fixed component only, not the random piece.
mlm_pred1 <- lmer(popular ~ extrav + (1 | class), data = hox)
summary(mlm_pred1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ extrav + (1 | class)
## Data: hox
##
## REML criterion at convergence: 5832.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0644 -0.7267 0.0165 0.7088 3.3587
##
## Random effects:
## Groups Name Variance Std.Dev.
## class (Intercept) 0.8406 0.9168
## Residual 0.9304 0.9646
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.54214 0.14113 18.01
## extrav 0.48631 0.02015 24.13
##
## Correlation of Fixed Effects:
## (Intr)
## extrav -0.745
Spaghetti plot shows effects for different classrooms.
hox_small <- hox %>%
filter(class >= 1 & class <= 6)
ggplot(hox_small, aes(extrav, popular,
group = as.factor(class),
color = as.factor(class))) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
Try faceting by class instead.
#Alt spaghetti plot
For every 1 unit increase in extroversion we’d expect a 0.49 point increase in popularity. Does this size of this popularity effect differ across classrooms? Let’s ass the random component of the extroversion effect.
mlm_pred1_ran <- lmer(popular ~ extrav + (extrav | class), data = hox)
summary(mlm_pred1_ran)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ extrav + (extrav | class)
## Data: hox
##
## REML criterion at convergence: 5779.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1961 -0.7291 0.0146 0.6816 3.2217
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## class (Intercept) 2.99952 1.7319
## extrav 0.02599 0.1612 -0.97
## Residual 0.89492 0.9460
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.46102 0.20315 12.11
## extrav 0.49286 0.02546 19.36
##
## Correlation of Fixed Effects:
## (Intr)
## extrav -0.917
The standard deviation of the extroversion effect is 0.16. There is a negative correlation between the classroom’s intercept and the effect of extroversion.
Does the teacher’s experience have an effect on the pupil’s popularity? Let’s ass this level 2 predictor, it can only be fixed.
mlm_pred2 <- lmer(popular ~ texp + (1 | class), data = hox)
summary(mlm_pred2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ texp + (1 | class)
## Data: hox
##
## REML criterion at convergence: 6313.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.5326 -0.6963 -0.0005 0.6896 3.3455
##
## Random effects:
## Groups Name Variance Std.Dev.
## class (Intercept) 0.5427 0.7367
## Residual 1.2218 1.1053
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 4.19668 0.18609 22.552
## texp 0.06164 0.01183 5.212
##
## Correlation of Fixed Effects:
## (Intr)
## texp -0.909
Yes, more experienced teachers seem to have more popular pupils. Next we we can have level 1 and level 2 predictors, the level 1 predictor is random.
mlm_pred12_ran <- lmer(popular ~ extrav + texp + (extrav | class), data = hox)
summary(mlm_pred12_ran)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ extrav + texp + (extrav | class)
## Data: hox
##
## REML criterion at convergence: 5749.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0472 -0.7305 0.0110 0.6777 3.2032
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## class (Intercept) 1.55946 1.2488
## extrav 0.03534 0.1880 -0.87
## Residual 0.89059 0.9437
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.01683 0.23072 4.407
## extrav 0.50215 0.02763 18.173
## texp 0.09777 0.01026 9.530
##
## Correlation of Fixed Effects:
## (Intr) extrav
## extrav -0.719
## texp -0.702 0.103
Is the effect of extroversion stronger in classrooms with more experienced teachers? Finally, we can estimate the cross-level interaction of extroversion and teacher’s experience.
mlm_inter <- lmer(popular ~ extrav*texp + (extrav | class), data = hox)
summary(mlm_inter)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ extrav * texp + (extrav | class)
## Data: hox
##
## REML criterion at convergence: 5694.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2505 -0.7267 0.0162 0.6861 3.1895
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## class (Intercept) 0.5289833 0.72731
## extrav 0.0003337 0.01827 -1.00
## Residual 0.8898028 0.94329
## Number of obs: 2000, groups: class, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -1.210965 0.321040 -3.772
## extrav 0.891087 0.044954 19.822
## texp 0.251926 0.019746 12.758
## extrav:texp -0.027462 0.002872 -9.563
##
## Correlation of Fixed Effects:
## (Intr) extrav texp
## extrav -0.874
## texp -0.918 0.805
## extrav:texp 0.773 -0.898 -0.859
See if you can make a graph to get a sense of this interaction. Hint: first create a variable that is the median split of texp
.
#graph the interaction
Check the residuals.
#model diagnostics here.