4.3 Linear Mixed-effects Models. Why random effects matter

Let’s generate a new dataframe that we will use later on for our mixed models

We will use the faux package. It is available on CRAN, but it is possible that it is not available for your version of R (I used version 4.5). In this case, you need to install it via github. The package faux is a package that allows you to simulate data for your experiments. It is a very useful package to test your ideas and to assess the impact of particular predictors on your dependant variable.

To install it, you will first need to install the package devtools, and then install the package faux using the function devtools::install_github. Uncomment the code chunk below

# install.packages("devtools")
# devtools::install_github("debruine/faux")
# library(faux)

4.3.1 Dataframe (simulated)

Our experiment has the following structure: we asked 40 subjects to respond to 40 items in a fully crossed design. There were two IVs: Condition with congruent and incongruent and Age with young and old. The Condition is a within subject variable; age is a between subject. However, Condition and Age were both within item variables. The dependant variable was LookingTime.

This meant that we used items within each of the young and the old subjects in addition to items within each of the congurent and incongruent conditions.

Our research question is as follows: Age of subject will impact the Looking Time in the two conditions. Our hypothesis is: The older a subject is, the more the looking time it is to the incongruent condition.

We will use the package faux to simulate a dataframe. This step is crucial in assessing contributions of particular predictors and for testing ideas as we already know the distribution of the data and what to expect as an outcome when it comes to the fixed effect in question.

The simulated data has the following parameters

set.seed(42)
# define parameters
Subj_n = 40  # number of subjects
Item_n = 40  # number of items
b0 = 100       # intercept
b1 = 2.5 * b0   # fixed effect of condition
u0s_sd = 300   # random intercept SD for subjects
u0i_sd = 200   # random intercept SD for items
u1s_sd = 100   # random b1 slope SD for subjects
u1i_sd = 50   # random b1 slope SD for items
r01s = -0.3     # correlation between random effects 0 and 1 for subjects
r01i = 0.2     # correlation between random effects 0 and 1 for items
sigma_sd = 150 # error SD

# set up data structure
dataCong <- add_random(Subj = Subj_n, Item = Item_n) %>%
  # add within categorical variable for subject
  add_within("Subj", Cond = c("Congruent", "Incongruent")) %>%
  add_recode("Cond", "Cond.Incongruent", Congruent = 0, Incongruent = 1) %>%
  # add between categorical variable for subject
  add_between("Subj", Age = c("Young", "Old")) %>%
  add_recode("Age", "Age.Old", Young = 0, Old = 1) %>%
  # add random effects 
  add_ranef("Subj", u0s = u0s_sd, u1s = u1s_sd, .cors = r01s) %>%
  add_ranef("Item", u0i = u0i_sd, u1i = u1i_sd, .cors = r01i) %>%
  ##add_ranef(c("Subj", "Item"), u0si = u0s_sd + u0i_sd) %>%
  ##add_ranef(c("Subj", "Item"), u1si = u1s_sd + u1i_sd) %>%
  add_ranef(sigma = sigma_sd) %>%
  # calculate DV
  mutate(LookingTime = b0 + b1 + u0s + u0i + #u0si + u1si +
           (((b1 + u1s) + 0.5) * Cond.Incongruent) + (((b1 + u1s) + 0.9) * Age.Old) + # subject specific variation
           (((b1 + u1i) - 0.3) * Cond.Incongruent) + (((b1 + u1i) - 0.25) * Age.Old) + # item specific variation  
           sigma)
write.csv(dataCong, "data/dataCong.csv")

If you were not able to install the faux package, simply uncomment the following line of code below to import the dataset

#dataCong <- read.csv("dataCong.csv")[-1]

4.3.1.1 Counts

dataCong <- dataCong %>% 
  mutate(Subj = factor(Subj),
         Item = factor(Item))
dataCong
## # A tibble: 3,200 × 12
##    Subj   Item   Cond    Cond.Incongruent Age   Age.Old   u0s   u1s    u0i   u1i
##    <fct>  <fct>  <fct>              <dbl> <fct>   <dbl> <dbl> <dbl>  <dbl> <dbl>
##  1 Subj01 Item01 Congru…                0 Young       0 -413.  26.2 -306.  56.9 
##  2 Subj01 Item01 Incong…                1 Young       0 -413.  26.2 -306.  56.9 
##  3 Subj01 Item02 Congru…                0 Young       0 -413.  26.2  -55.4 69.1 
##  4 Subj01 Item02 Incong…                1 Young       0 -413.  26.2  -55.4 69.1 
##  5 Subj01 Item03 Congru…                0 Young       0 -413.  26.2  -17.4 -7.03
##  6 Subj01 Item03 Incong…                1 Young       0 -413.  26.2  -17.4 -7.03
##  7 Subj01 Item04 Congru…                0 Young       0 -413.  26.2   21.6 50.0 
##  8 Subj01 Item04 Incong…                1 Young       0 -413.  26.2   21.6 50.0 
##  9 Subj01 Item05 Congru…                0 Young       0 -413.  26.2  239.  12.8 
## 10 Subj01 Item05 Incong…                1 Young       0 -413.  26.2  239.  12.8 
## # ℹ 3,190 more rows
## # ℹ 2 more variables: sigma <dbl>, LookingTime <dbl>
4.3.1.1.1 Subjects
dataCong %>% 
  group_by(Cond, Age, Subj) %>% 
  summarise(count = n())
## # A tibble: 80 × 4
## # Groups:   Cond, Age [4]
##    Cond      Age   Subj   count
##    <fct>     <fct> <fct>  <int>
##  1 Congruent Young Subj01    40
##  2 Congruent Young Subj03    40
##  3 Congruent Young Subj05    40
##  4 Congruent Young Subj07    40
##  5 Congruent Young Subj09    40
##  6 Congruent Young Subj11    40
##  7 Congruent Young Subj13    40
##  8 Congruent Young Subj15    40
##  9 Congruent Young Subj17    40
## 10 Congruent Young Subj19    40
## # ℹ 70 more rows
4.3.1.1.2 Items
4.3.1.1.3 Age
dataCong %>% 
  group_by(Age, Item) %>% 
  summarise(count = n())
## # A tibble: 80 × 3
## # Groups:   Age [2]
##    Age   Item   count
##    <fct> <fct>  <int>
##  1 Young Item01    40
##  2 Young Item02    40
##  3 Young Item03    40
##  4 Young Item04    40
##  5 Young Item05    40
##  6 Young Item06    40
##  7 Young Item07    40
##  8 Young Item08    40
##  9 Young Item09    40
## 10 Young Item10    40
## # ℹ 70 more rows
4.3.1.1.4 Cond
dataCong %>% 
  group_by(Cond, Item) %>% 
  summarise(count = n())
## # A tibble: 80 × 3
## # Groups:   Cond [2]
##    Cond      Item   count
##    <fct>     <fct>  <int>
##  1 Congruent Item01    40
##  2 Congruent Item02    40
##  3 Congruent Item03    40
##  4 Congruent Item04    40
##  5 Congruent Item05    40
##  6 Congruent Item06    40
##  7 Congruent Item07    40
##  8 Congruent Item08    40
##  9 Congruent Item09    40
## 10 Congruent Item10    40
## # ℹ 70 more rows

4.3.1.2 Visualisations

4.3.1.2.1 Condition by Age

The figure below confirms this, where we see an increase in LookingTime in the incongruent condition and oevrall, older participants show an increase in LookingTime

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Age)) +
  theme_bw() + 
  geom_boxplot() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm")

4.3.1.2.2 Subject by Condition

This figure shows that subjects are variable in how they responded to this task

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Subj)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj))))

4.3.1.2.3 Subject by Age

This figure shows that subjects had an impact on the LookingTime in both age groups, simply due to their variable responses to the different items

dataCong %>% 
  ggplot(aes(x = Age,
             y = LookingTime,
             colour = Subj)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj))))

4.3.1.2.4 Item by Condition

This figure shows that items had an impact on the LookingTime in both conditions

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Item)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item))))

4.3.1.2.5 Subject by Age

This figure shows that items had an impact on the LookingTime in both age groups

dataCong %>% 
  ggplot(aes(x = Age,
             y = LookingTime,
             colour = Item)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item))))

4.3.2 Modelling strategy

We use an LMER model with a crossed random effect. To choose our optimal model, we start first by a simple model with only random intercepts, increasing complexity by accounting for random slopes for both subjects and items. It is clear from our visualisation above, that there is no interaction between the two predictors. However, for demonstration purposes, we do test for this

4.3.2.1 Simple Linear Model

mdl.lm <- dataCong %>% 
  lm(LookingTime ~ Cond + Age, data = .)
summary(mdl.lm)
## 
## Call:
## lm(formula = LookingTime ~ Cond + Age, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1215.59  -290.17   -21.78   264.89  1661.41 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       304.95      12.91   23.61   <2e-16 ***
## CondIncongruent   504.35      14.91   33.82   <2e-16 ***
## AgeOld            566.44      14.91   37.98   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 421.8 on 3197 degrees of freedom
## Multiple R-squared:  0.4472, Adjusted R-squared:  0.4469 
## F-statistic:  1293 on 2 and 3197 DF,  p-value: < 2.2e-16
hist(residuals(mdl.lm))

qqnorm(residuals(mdl.lm)); qqline(residuals(mdl.lm))

plot(fitted(mdl.lm), residuals(mdl.lm), cex = 4)

4.3.2.2 No interaction

4.3.2.2.1 Crossed random intercepts
xmdl.rand.Interc <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.2.2 Crossed random intercepts + By-speaker random slopes
xmdl.rand.Slope1 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.2.3 Crossed random intercepts + By-speaker and by-item random slopes
xmdl.rand.Slope2 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.2.4 Crossed random intercepts + By-speaker and by-item random slopes
xmdl.rand.Slope3 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond + Age| Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.3 With interaction

4.3.2.3.1 Crossed random intercepts + Interaction
xmdl.rand.Interc.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.3.2 Crossed random intercepts + By-speaker random slopes + Interaction
xmdl.rand.Slope1.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.3.3 Crossed random intercepts + By-speaker and by-item random slopes + Interaction
xmdl.rand.Slope2.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 + Cond | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.3.4 Crossed random intercepts + By-speaker and by-item random slopes
xmdl.rand.Slope3.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 + Cond * Age| Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.4 Model comparison

anova(xmdl.rand.Interc, xmdl.rand.Slope1, xmdl.rand.Slope2, xmdl.rand.Slope3, xmdl.rand.Interc.Int, xmdl.rand.Slope1.Int, xmdl.rand.Slope2.Int, xmdl.rand.Slope3.Int)
## Data: .
## Models:
## xmdl.rand.Interc: LookingTime ~ Cond + Age + (1 | Subj) + (1 | Item)
## xmdl.rand.Interc.Int: LookingTime ~ Cond * Age + (1 | Subj) + (1 | Item)
## xmdl.rand.Slope1: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 | Item)
## xmdl.rand.Slope1.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 | Item)
## xmdl.rand.Slope2: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond | Item)
## xmdl.rand.Slope2.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 + Cond | Item)
## xmdl.rand.Slope3: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond + Age | Item)
## xmdl.rand.Slope3.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 + Cond * Age | Item)
##                      npar   AIC   BIC logLik -2*log(L)    Chisq Df Pr(>Chisq)
## xmdl.rand.Interc        6 42074 42110 -21031     42062                       
## xmdl.rand.Interc.Int    7 42050 42093 -21018     42036  25.8359  1  3.717e-07
## xmdl.rand.Slope1        8 41834 41883 -20909     41818 217.8699  1  < 2.2e-16
## xmdl.rand.Slope1.Int    9 41833 41888 -20908     41815   3.0247  1     0.0820
## xmdl.rand.Slope2       10 41808 41869 -20894     41788  27.3253  1  1.719e-07
## xmdl.rand.Slope2.Int   11 41807 41874 -20892     41785   3.0149  1     0.0825
## xmdl.rand.Slope3       13 41780 41858 -20877     41754  31.2599  2  1.629e-07
## xmdl.rand.Slope3.Int   18 41786 41895 -20875     41750   3.3401  5     0.6477
##                         
## xmdl.rand.Interc        
## xmdl.rand.Interc.Int ***
## xmdl.rand.Slope1     ***
## xmdl.rand.Slope1.Int .  
## xmdl.rand.Slope2     ***
## xmdl.rand.Slope2.Int .  
## xmdl.rand.Slope3     ***
## xmdl.rand.Slope3.Int    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results above highlight that the model accounting for both by-subject and by-item random intercepts and random slopes for Condition are improving the model fit in comparison to a more complex model. We rerun the model with REML = False

4.3.2.5 Optimal model

xmdl.Optimal <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond + Age | Item), data = ., REML = TRUE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
4.3.2.5.1 Model criticism
4.3.2.5.1.1 Histogram
hist(residuals(xmdl.Optimal))

4.3.2.5.1.2 QQ-plot
qqnorm(residuals(xmdl.Optimal)); qqline(residuals(xmdl.Optimal))

4.3.2.5.1.3 Residuals vs Fitted
plot(fitted(xmdl.Optimal), residuals(xmdl.Optimal), cex = 4)

4.3.2.5.2 Summary
xmdl.Optimal %>% 
  summary()
## Linear mixed model fit by REML ['lmerMod']
## Formula: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond + Age |  
##     Item)
##    Data: .
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))
## 
## REML criterion at convergence: 41724.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5337 -0.6485 -0.0054  0.6647  3.5358 
## 
## Random effects:
##  Groups   Name            Variance Std.Dev. Corr       
##  Subj     (Intercept)     123480   351.40              
##           CondIncongruent  10746   103.66   -0.26      
##  Item     (Intercept)      38781   196.93              
##           CondIncongruent   1872    43.27    0.31      
##           AgeOld            1851    43.03   -0.13  0.69
##  Residual                  22613   150.38              
## Number of obs: 3200, groups:  Subj, 40; Item, 40
## 
## Fixed effects:
##                 Estimate Std. Error t value
## (Intercept)       315.04      83.42   3.777
## CondIncongruent   504.35      18.54  27.204
## AgeOld            546.28     107.70   5.072
## 
## Correlation of Fixed Effects:
##             (Intr) CndInc
## CndIncngrnt -0.120       
## AgeOld      -0.646  0.016
4.3.2.5.3 ANOVA
Anova(xmdl.Optimal)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: LookingTime
##        Chisq Df Pr(>Chisq)    
## Cond 740.067  1  < 2.2e-16 ***
## Age   25.729  1  3.928e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
4.3.2.5.4 Model’s fit
print(tab_model(xmdl.Optimal, file = paste0("outputs/xmdl.Optimal.html")))
htmltools::includeHTML("outputs/xmdl.Optimal.html")
  Looking Time
Predictors Estimates CI p
(Intercept) 315.04 151.48 – 478.60 <0.001
Cond [Incongruent] 504.35 468.00 – 540.70 <0.001
Age [Old] 546.28 335.12 – 757.44 <0.001
Random Effects
σ2 22612.64
τ00 Subj 123480.39
τ00 Item 38780.89
τ11 Subj.CondIncongruent 10745.65
τ11 Item.CondIncongruent 1872.31
τ11 Item.AgeOld 1851.23
ρ01 Subj -0.26
ρ01 Item.CondIncongruent 0.31
ρ01 Item.AgeOld -0.13
ICC 0.88
N Subj 40
N Item 40
Observations 3200
Marginal R2 / Conditional R2 0.428 / 0.930

4.3.3 Plotting model’s output

4.3.3.1 With ggstats

We use two functions from the package ggstats.

4.3.3.1.1 A plot
ggcoef_model(xmdl.Optimal)

4.3.3.2 A plot + a table + 95% CI

ggcoef_table(xmdl.Optimal)

4.3.3.3 With ggstatsplot

ggcoefstats(xmdl.Optimal, point.args = list(color = paletteer_c("grDevices::rainbow", 13), stats.label.color = paletteer_c("grDevices::rainbow", 13)))
## Warning in (function (mapping = NULL, data = NULL, stat = "identity", position
## = "identity", : Ignoring unknown parameters: `stats.label.colour`
## Number of labels is greater than default palette color count.
## • Select another color `palette` (and/or `package`).

4.3.4 Exploring random effects

4.3.4.1 Subject random effects

random_effects <- ranef(xmdl.Optimal) %>%
  pluck(1) %>%
  rownames_to_column() %>%
  rename(Subject = rowname, Intercept = "(Intercept)") 
 
random_effects %>%
  knitr::kable()
Subject Intercept CondIncongruent
Subj01 -404.45054 58.1953342
Subj02 164.05584 56.9126269
Subj03 -56.37438 -104.0600463
Subj04 -87.06837 43.7884718
Subj05 -55.65214 104.5323881
Subj06 -42.39999 -7.1394329
Subj07 -448.81496 124.6524511
Subj08 -171.75811 -105.4504177
Subj09 -587.74593 123.6910957
Subj10 -44.50407 -50.9618425
Subj11 -346.46288 1.6396952
Subj12 -550.49633 158.6524658
Subj13 417.66028 -169.1409656
Subj14 -11.45716 -30.0008247
Subj15 62.46660 -14.3520071
Subj16 -210.03390 -23.9077435
Subj17 124.19946 -122.7635143
Subj18 666.67147 -42.1836663
Subj19 815.78334 217.1615445
Subj20 -393.58521 26.1633645
Subj21 120.64645 37.9297095
Subj22 455.31816 -46.4925670
Subj23 42.42431 -54.4053541
Subj24 -496.29436 -31.7202401
Subj25 -552.91356 144.2714853
Subj26 -21.91509 -159.7897792
Subj27 65.14521 54.5212741
Subj28 373.40089 -166.0862214
Subj29 -132.82108 -21.8820278
Subj30 104.73419 -112.8914437
Subj31 -79.72259 125.0118676
Subj32 -182.30982 15.3898927
Subj33 -303.68384 -0.1957399
Subj34 215.98609 136.5078856
Subj35 -152.60396 76.5559793
Subj36 401.50547 -145.0045102
Subj37 274.61818 -114.4122269
Subj38 139.48472 -82.7958599
Subj39 733.87830 58.5575052
Subj40 155.08935 41.5013938

4.3.4.2 Items random effects

random_effects <- ranef(xmdl.Optimal) %>%
  pluck(2) %>%
  rownames_to_column() %>%
  rename(Items = rowname, Intercept = "(Intercept)") 
 
random_effects %>%
  knitr::kable()
Items Intercept CondIncongruent AgeOld
Item01 -247.414514 16.2991461 28.313005
Item02 -28.240458 20.6149251 37.267882
Item03 -27.548425 11.7099369 -3.218265
Item04 30.999038 32.2392567 32.548251
Item05 246.216865 38.0586065 2.963532
Item06 -135.583741 28.3151557 28.127365
Item07 75.461506 28.6715618 -2.095308
Item08 72.679382 69.1731012 73.479559
Item09 -180.071464 32.9237792 60.953098
Item10 -170.029336 8.4112556 23.865860
Item11 -278.857701 -39.6392973 -22.526615
Item12 114.001992 4.7652529 -15.353375
Item13 -137.533535 -49.5701288 -36.764434
Item14 -272.928909 -85.2576859 -45.340741
Item15 219.855561 -70.2475208 -83.121775
Item16 243.071756 13.0287971 -4.812881
Item17 210.792617 10.6898309 17.840021
Item18 328.737777 -36.9498370 -82.922723
Item19 -46.910053 28.2829542 22.930115
Item20 -107.870054 -25.5384342 -13.846115
Item21 -209.598742 -56.8339876 -29.482587
Item22 -190.253589 -3.0439753 20.502614
Item23 207.958653 34.4328054 33.373220
Item24 -371.211254 -46.1955955 1.751567
Item25 163.603920 15.0837807 -7.780891
Item26 -56.954216 -39.1972174 -59.652456
Item27 126.648407 7.5867286 -1.102687
Item28 48.176094 -0.4969084 1.358914
Item29 -76.699983 -27.4397737 -1.462458
Item30 -20.393155 33.1813808 36.286237
Item31 33.763416 -15.9993993 -14.400138
Item32 -15.979029 47.9237089 67.643425
Item33 64.451233 -8.2252876 -47.157862
Item34 102.286454 22.6554474 17.134677
Item35 319.882217 54.1470389 10.485239
Item36 115.886689 22.7596706 9.094455
Item37 93.471378 9.1514881 -17.490337
Item38 -523.293819 -46.1546008 -5.185295
Item39 287.858946 20.0462802 2.292923
Item40 -8.431925 -59.3622401 -34.495016

4.3.4.3 Plots

These plots below are produced with the sjPlot package.

4.3.4.3.1 Fixed effects
4.3.4.3.1.1 Condition
plot_model(xmdl.Optimal, type="pred", terms=c("Cond"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.1.2 Age
plot_model(xmdl.Optimal, type="pred", terms=c("Age"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.1.3 Condition by Age
plot_model(xmdl.Optimal, type="emm", terms=c("Cond", "Age"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2 Random effects
4.3.4.3.2.1 Subject
plot_model(xmdl.Optimal, type="pred", terms=c("Subj"), pred.type="re", ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2.2 Subject by Condition
plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Subj"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line() 

4.3.4.3.2.3 Subject by Age
plot_model(xmdl.Optimal, type="pred", terms=c("Age", "Subj"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.4 Item
plot_model(xmdl.Optimal, type="pred", terms=c("Item"), pred.type="re", ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2.5 Item by Cond
plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.6 Item by Age
plot_model(xmdl.Optimal, type="pred", terms=c("Age", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.7 Item by Cond facetted by Age
plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Item", "Age"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.8 Subject by Item
plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.9 Subject by Item facetted by Cond
plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item", "Cond"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.10 Subject by Item facetted by Age
plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item", "Age"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.3 With Lattice
4.3.4.3.3.1 Subject Intercepts
dotplot(ranef(xmdl.Optimal))$Subj[1]

4.3.4.3.3.2 Subject Slopes
dotplot(ranef(xmdl.Optimal), xlim = c(-350, 350))$Subj[2]

4.3.4.3.3.3 Item Intercepts
dotplot(ranef(xmdl.Optimal))$Item[1]

4.3.4.3.3.4 Item Slopes for Cond
dotplot(ranef(xmdl.Optimal), xlim = c(-150, 150))$Item[2]

4.3.4.3.3.5 Item Slopes for Age
dotplot(ranef(xmdl.Optimal), xlim = c(-150, 150))$Item[3]

I hope this tutorial helped you to uncover the role of participants and items and what they can tell us beyond the fixed effect!

4.3.5 Conclusion

In this simulated dataset, we showed how to use Linear Mixed-effects Models, and used a specific approach, named “maximal specification model”. This maximal specification model uses both random intercepts and random slopes. Usually, on any dataset, you are required to formally assess the need for Random slopes.

As a rule of thumb: Any within-subject (or within-item) should be tested for a potential inclusion as a random slope.

Fixed effects provide averages over all observations, even when using mixed effects regressions; we need to explore what random effects (intercepts and slopes) tell us.

In this example, we see that many subjects vary beyond the fixed effect; Standard Errors are not enough to quantify this type of variation. The same is true for items that are not explored routinely!

The approach above used a maximal specification model. This of course depends on the dataset you are working on and requires formal testing. See the package RePsychLing that is on github. The function RePCA is part of the package lme4. You can use it to verify if the model structure is appropriate. However, lme4 will give you directly a warning “is singular” that you can interpret as “over specified model”.