4.3 Linear Mixed-effects Models. Why random effects matter

Let’s generate a new dataframe that we will use later on for our mixed models

We will use the faux package. It is available on CRAN, but it is possible that it is not available for your version of R (I used version 4.5). In this case, you need to install it via github. The package faux is a package that allows you to simulate data for your experiments. It is a very useful package to test your ideas and to assess the impact of particular predictors on your dependant variable.

To install it, you will first need to install the package devtools, and then install the package faux using the function devtools::install_github. Uncomment the code chunk below

# install.packages("devtools")
# devtools::install_github("debruine/faux")
# library(faux)

4.3.1 Dataframe (simulated)

Our experiment has the following structure: we asked 40 subjects to respond to 40 items in a fully crossed design. There were two IVs: Condition with congruent and incongruent and Age with young and old. The Condition is a within subject variable; age is a between subject. However, Condition and Age were both within item variables. The dependant variable was LookingTime.

This meant that we used items within each of the young and the old subjects in addition to items within each of the congurent and incongruent conditions.

Our research question is as follows: Age of subject will impact the Looking Time in the two conditions. Our hypothesis is: The older a subject is, the more the looking time it is to the incongruent condition.

We will use the package faux to simulate a dataframe. This step is crucial in assessing contributions of particular predictors and for testing ideas as we already know the distribution of the data and what to expect as an outcome when it comes to the fixed effect in question.

The simulated data has the following parameters

set.seed(42)
# define parameters
Subj_n = 40  # number of subjects
Item_n = 40  # number of items
b0 = 100       # intercept
b1 = 2.5 * b0   # fixed effect of condition
u0s_sd = 300   # random intercept SD for subjects
u0i_sd = 200   # random intercept SD for items
u1s_sd = 100   # random b1 slope SD for subjects
u1i_sd = 50   # random b1 slope SD for items
r01s = -0.3     # correlation between random effects 0 and 1 for subjects
r01i = 0.2     # correlation between random effects 0 and 1 for items
sigma_sd = 150 # error SD

# set up data structure
dataCong <- add_random(Subj = Subj_n, Item = Item_n) %>%
  # add within categorical variable for subject
  add_within("Subj", Cond = c("Congruent", "Incongruent")) %>%
  add_recode("Cond", "Cond.Incongruent", Congruent = 0, Incongruent = 1) %>%
  # add between categorical variable for subject
  add_between("Subj", Age = c("Young", "Old")) %>%
  add_recode("Age", "Age.Old", Young = 0, Old = 1) %>%
  # add random effects 
  add_ranef("Subj", u0s = u0s_sd, u1s = u1s_sd, .cors = r01s) %>%
  add_ranef("Item", u0i = u0i_sd, u1i = u1i_sd, .cors = r01i) %>%
  ##add_ranef(c("Subj", "Item"), u0si = u0s_sd + u0i_sd) %>%
  ##add_ranef(c("Subj", "Item"), u1si = u1s_sd + u1i_sd) %>%
  add_ranef(sigma = sigma_sd) %>%
  # calculate DV
  mutate(LookingTime = b0 + b1 + u0s + u0i + #u0si + u1si +
           (((b1 + u1s) + 0.5) * Cond.Incongruent) + (((b1 + u1s) + 0.9) * Age.Old) + # subject specific variation
           (((b1 + u1i) - 0.3) * Cond.Incongruent) + (((b1 + u1i) - 0.25) * Age.Old) + # item specific variation  
           sigma)
write.csv(dataCong, "data/dataCong.csv")

If you were not able to install the faux package, simply uncomment the following line of code below to import the dataset

#dataCong <- read.csv("dataCong.csv")[-1]

4.3.1.1 Counts

dataCong <- dataCong %>% 
  mutate(Subj = factor(Subj),
         Item = factor(Item))
dataCong

## # A tibble: 3,200 × 12
##    Subj   Item   Cond    Cond.Incongruent Age   Age.Old   u0s   u1s    u0i   u1i
##    <fct>  <fct>  <fct>              <dbl> <fct>   <dbl> <dbl> <dbl>  <dbl> <dbl>
##  1 Subj01 Item01 Congru…                0 Young       0 -413.  26.2 -306.  56.9 
##  2 Subj01 Item01 Incong…                1 Young       0 -413.  26.2 -306.  56.9 
##  3 Subj01 Item02 Congru…                0 Young       0 -413.  26.2  -55.4 69.1 
##  4 Subj01 Item02 Incong…                1 Young       0 -413.  26.2  -55.4 69.1 
##  5 Subj01 Item03 Congru…                0 Young       0 -413.  26.2  -17.4 -7.03
##  6 Subj01 Item03 Incong…                1 Young       0 -413.  26.2  -17.4 -7.03
##  7 Subj01 Item04 Congru…                0 Young       0 -413.  26.2   21.6 50.0 
##  8 Subj01 Item04 Incong…                1 Young       0 -413.  26.2   21.6 50.0 
##  9 Subj01 Item05 Congru…                0 Young       0 -413.  26.2  239.  12.8 
## 10 Subj01 Item05 Incong…                1 Young       0 -413.  26.2  239.  12.8 
## # ℹ 3,190 more rows
## # ℹ 2 more variables: sigma <dbl>, LookingTime <dbl>

4.3.1.1.1 Subjects

dataCong %>% 
  group_by(Cond, Age, Subj) %>% 
  summarise(count = n())

## # A tibble: 80 × 4
## # Groups:   Cond, Age [4]
##    Cond      Age   Subj   count
##    <fct>     <fct> <fct>  <int>
##  1 Congruent Young Subj01    40
##  2 Congruent Young Subj03    40
##  3 Congruent Young Subj05    40
##  4 Congruent Young Subj07    40
##  5 Congruent Young Subj09    40
##  6 Congruent Young Subj11    40
##  7 Congruent Young Subj13    40
##  8 Congruent Young Subj15    40
##  9 Congruent Young Subj17    40
## 10 Congruent Young Subj19    40
## # ℹ 70 more rows

4.3.1.1.2 Items

4.3.1.1.3 Age

dataCong %>% 
  group_by(Age, Item) %>% 
  summarise(count = n())

## # A tibble: 80 × 3
## # Groups:   Age [2]
##    Age   Item   count
##    <fct> <fct>  <int>
##  1 Young Item01    40
##  2 Young Item02    40
##  3 Young Item03    40
##  4 Young Item04    40
##  5 Young Item05    40
##  6 Young Item06    40
##  7 Young Item07    40
##  8 Young Item08    40
##  9 Young Item09    40
## 10 Young Item10    40
## # ℹ 70 more rows

4.3.1.1.4 Cond

dataCong %>% 
  group_by(Cond, Item) %>% 
  summarise(count = n())

## # A tibble: 80 × 3
## # Groups:   Cond [2]
##    Cond      Item   count
##    <fct>     <fct>  <int>
##  1 Congruent Item01    40
##  2 Congruent Item02    40
##  3 Congruent Item03    40
##  4 Congruent Item04    40
##  5 Congruent Item05    40
##  6 Congruent Item06    40
##  7 Congruent Item07    40
##  8 Congruent Item08    40
##  9 Congruent Item09    40
## 10 Congruent Item10    40
## # ℹ 70 more rows

4.3.1.2 Visualisations

4.3.1.2.1 Condition by Age

The figure below confirms this, where we see an increase in LookingTime in the incongruent condition and oevrall, older participants show an increase in LookingTime

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Age)) +
  theme_bw() + 
  geom_boxplot() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm")

4.3.1.2.2 Subject by Condition

This figure shows that subjects are variable in how they responded to this task

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Subj)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj))))

4.3.1.2.3 Subject by Age

This figure shows that subjects had an impact on the LookingTime in both age groups, simply due to their variable responses to the different items

dataCong %>% 
  ggplot(aes(x = Age,
             y = LookingTime,
             colour = Subj)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj))))

4.3.1.2.4 Item by Condition

This figure shows that items had an impact on the LookingTime in both conditions

dataCong %>% 
  ggplot(aes(x = Cond,
             y = LookingTime,
             colour = Item)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item))))

4.3.1.2.5 Subject by Age

This figure shows that items had an impact on the LookingTime in both age groups

dataCong %>% 
  ggplot(aes(x = Age,
             y = LookingTime,
             colour = Item)) +
  theme_bw() + 
  geom_point() +
  geom_smooth(aes(as.numeric(Cond)), method = "lm", se = FALSE) +
  scale_colour_manual(values = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item))))

4.3.2 Modelling strategy

We use an LMER model with a crossed random effect. To choose our optimal model, we start first by a simple model with only random intercepts, increasing complexity by accounting for random slopes for both subjects and items. It is clear from our visualisation above, that there is no interaction between the two predictors. However, for demonstration purposes, we do test for this

4.3.2.1 Simple Linear Model

mdl.lm <- dataCong %>% 
  lm(LookingTime ~ Cond + Age, data = .)
summary(mdl.lm)

## 
## Call:
## lm(formula = LookingTime ~ Cond + Age, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1215.59  -290.17   -21.78   264.89  1661.41 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       304.95      12.91   23.61   <2e-16 ***
## CondIncongruent   504.35      14.91   33.82   <2e-16 ***
## AgeOld            566.44      14.91   37.98   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 421.8 on 3197 degrees of freedom
## Multiple R-squared:  0.4472, Adjusted R-squared:  0.4469 
## F-statistic:  1293 on 2 and 3197 DF,  p-value: < 2.2e-16

hist(residuals(mdl.lm))

qqnorm(residuals(mdl.lm)); qqline(residuals(mdl.lm))

plot(fitted(mdl.lm), residuals(mdl.lm), cex = 4)

4.3.2.2 No interaction

4.3.2.2.1 Crossed random intercepts

xmdl.rand.Interc <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.2.2 Crossed random intercepts + By-speaker random slopes

xmdl.rand.Slope1 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.2.3 Crossed random intercepts + By-speaker and by-item random slopes

xmdl.rand.Slope2 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.2.4 Crossed random intercepts + By-speaker and by-item random slopes

xmdl.rand.Slope3 <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond + Age| Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.3 With interaction

4.3.2.3.1 Crossed random intercepts + Interaction

xmdl.rand.Interc.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.3.2 Crossed random intercepts + By-speaker random slopes + Interaction

xmdl.rand.Slope1.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.3.3 Crossed random intercepts + By-speaker and by-item random slopes + Interaction

xmdl.rand.Slope2.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 + Cond | Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.3.4 Crossed random intercepts + By-speaker and by-item random slopes

xmdl.rand.Slope3.Int <- dataCong %>% 
  lmer(LookingTime ~ Cond * Age + 
         (1 + Cond | Subj) + 
         (1 + Cond * Age| Item), data = ., REML = FALSE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.4 Model comparison

anova(xmdl.rand.Interc, xmdl.rand.Slope1, xmdl.rand.Slope2, xmdl.rand.Slope3, xmdl.rand.Interc.Int, xmdl.rand.Slope1.Int, xmdl.rand.Slope2.Int, xmdl.rand.Slope3.Int)

## Data: .
## Models:
## xmdl.rand.Interc: LookingTime ~ Cond + Age + (1 | Subj) + (1 | Item)
## xmdl.rand.Interc.Int: LookingTime ~ Cond * Age + (1 | Subj) + (1 | Item)
## xmdl.rand.Slope1: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 | Item)
## xmdl.rand.Slope1.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 | Item)
## xmdl.rand.Slope2: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond | Item)
## xmdl.rand.Slope2.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 + Cond | Item)
## xmdl.rand.Slope3: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond + Age | Item)
## xmdl.rand.Slope3.Int: LookingTime ~ Cond * Age + (1 + Cond | Subj) + (1 + Cond * Age | Item)
##                      npar   AIC   BIC logLik -2*log(L)    Chisq Df Pr(>Chisq)
## xmdl.rand.Interc        6 42074 42110 -21031     42062                       
## xmdl.rand.Interc.Int    7 42050 42093 -21018     42036  25.8359  1  3.717e-07
## xmdl.rand.Slope1        8 41834 41883 -20909     41818 217.8699  1  < 2.2e-16
## xmdl.rand.Slope1.Int    9 41833 41888 -20908     41815   3.0247  1     0.0820
## xmdl.rand.Slope2       10 41808 41869 -20894     41788  27.3253  1  1.719e-07
## xmdl.rand.Slope2.Int   11 41807 41874 -20892     41785   3.0149  1     0.0825
## xmdl.rand.Slope3       13 41780 41858 -20877     41754  31.2599  2  1.629e-07
## xmdl.rand.Slope3.Int   18 41786 41895 -20875     41750   3.3401  5     0.6477
##                         
## xmdl.rand.Interc        
## xmdl.rand.Interc.Int ***
## xmdl.rand.Slope1     ***
## xmdl.rand.Slope1.Int .  
## xmdl.rand.Slope2     ***
## xmdl.rand.Slope2.Int .  
## xmdl.rand.Slope3     ***
## xmdl.rand.Slope3.Int    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results above highlight that the model accounting for both by-subject and by-item random intercepts and random slopes for Condition are improving the model fit in comparison to a more complex model. We rerun the model with REML = False

4.3.2.5 Optimal model

xmdl.Optimal <- dataCong %>% 
  lmer(LookingTime ~ Cond + Age + 
         (1 + Cond | Subj) + 
         (1 + Cond + Age | Item), data = ., REML = TRUE,
       control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))

4.3.2.5.1 Model criticism

4.3.2.5.1.1 Histogram

hist(residuals(xmdl.Optimal))

4.3.2.5.1.2 QQ-plot

qqnorm(residuals(xmdl.Optimal)); qqline(residuals(xmdl.Optimal))

4.3.2.5.1.3 Residuals vs Fitted

plot(fitted(xmdl.Optimal), residuals(xmdl.Optimal), cex = 4)

4.3.2.5.2 Summary

xmdl.Optimal %>% 
  summary()

## Linear mixed model fit by REML ['lmerMod']
## Formula: LookingTime ~ Cond + Age + (1 + Cond | Subj) + (1 + Cond + Age |  
##     Item)
##    Data: .
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))
## 
## REML criterion at convergence: 41724.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5337 -0.6485 -0.0054  0.6647  3.5358 
## 
## Random effects:
##  Groups   Name            Variance Std.Dev. Corr       
##  Subj     (Intercept)     123480   351.40              
##           CondIncongruent  10746   103.66   -0.26      
##  Item     (Intercept)      38781   196.93              
##           CondIncongruent   1872    43.27    0.31      
##           AgeOld            1851    43.03   -0.13  0.69
##  Residual                  22613   150.38              
## Number of obs: 3200, groups:  Subj, 40; Item, 40
## 
## Fixed effects:
##                 Estimate Std. Error t value
## (Intercept)       315.04      83.42   3.777
## CondIncongruent   504.35      18.54  27.204
## AgeOld            546.28     107.70   5.072
## 
## Correlation of Fixed Effects:
##             (Intr) CndInc
## CndIncngrnt -0.120       
## AgeOld      -0.646  0.016

4.3.2.5.3 ANOVA

Anova(xmdl.Optimal)

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: LookingTime
##        Chisq Df Pr(>Chisq)    
## Cond 740.067  1  < 2.2e-16 ***
## Age   25.729  1  3.928e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4.3.2.5.4 Model’s fit

print(tab_model(xmdl.Optimal, file = paste0("outputs/xmdl.Optimal.html")))
htmltools::includeHTML("outputs/xmdl.Optimal.html")

	Looking Time
Predictors	Estimates	CI	p
(Intercept)	315.04	151.48 – 478.60	<0.001
Cond [Incongruent]	504.35	468.00 – 540.70	<0.001
Age [Old]	546.28	335.12 – 757.44	<0.001
Random Effects
σ²	22612.64
τ₀₀ _Subj	123480.39
τ₀₀ _Item	38780.89
τ₁₁ _{Subj.CondIncongruent}	10745.65
τ₁₁ _{Item.CondIncongruent}	1872.31
τ₁₁ _Item.AgeOld	1851.23
ρ₀₁ _Subj	-0.26
ρ₀₁ _{Item.CondIncongruent}	0.31
ρ₀₁ _Item.AgeOld	-0.13
ICC	0.88
N _Subj	40
N _Item	40
Observations	3200
Marginal R² / Conditional R²	0.428 / 0.930

4.3.3 Plotting model’s output

4.3.3.1 With `ggstats`

We use two functions from the package ggstats.

4.3.3.1.1 A plot

ggcoef_model(xmdl.Optimal)

4.3.3.2 A plot + a table + 95% CI

ggcoef_table(xmdl.Optimal)

4.3.3.3 With `ggstatsplot`

ggcoefstats(xmdl.Optimal, point.args = list(color = paletteer_c("grDevices::rainbow", 13), stats.label.color = paletteer_c("grDevices::rainbow", 13)))

## Warning in (function (mapping = NULL, data = NULL, stat = "identity", position
## = "identity", : Ignoring unknown parameters: `stats.label.colour`

## Number of labels is greater than default palette color count.
## • Select another color `palette` (and/or `package`).

4.3.4 Exploring random effects

4.3.4.1 Subject random effects

random_effects <- ranef(xmdl.Optimal) %>%
  pluck(1) %>%
  rownames_to_column() %>%
  rename(Subject = rowname, Intercept = "(Intercept)") 
 
random_effects %>%
  knitr::kable()

Subject	Intercept	CondIncongruent
Subj01	-404.45054	58.1953342
Subj02	164.05584	56.9126269
Subj03	-56.37438	-104.0600463
Subj04	-87.06837	43.7884718
Subj05	-55.65214	104.5323881
Subj06	-42.39999	-7.1394329
Subj07	-448.81496	124.6524511
Subj08	-171.75811	-105.4504177
Subj09	-587.74593	123.6910957
Subj10	-44.50407	-50.9618425
Subj11	-346.46288	1.6396952
Subj12	-550.49633	158.6524658
Subj13	417.66028	-169.1409656
Subj14	-11.45716	-30.0008247
Subj15	62.46660	-14.3520071
Subj16	-210.03390	-23.9077435
Subj17	124.19946	-122.7635143
Subj18	666.67147	-42.1836663
Subj19	815.78334	217.1615445
Subj20	-393.58521	26.1633645
Subj21	120.64645	37.9297095
Subj22	455.31816	-46.4925670
Subj23	42.42431	-54.4053541
Subj24	-496.29436	-31.7202401
Subj25	-552.91356	144.2714853
Subj26	-21.91509	-159.7897792
Subj27	65.14521	54.5212741
Subj28	373.40089	-166.0862214
Subj29	-132.82108	-21.8820278
Subj30	104.73419	-112.8914437
Subj31	-79.72259	125.0118676
Subj32	-182.30982	15.3898927
Subj33	-303.68384	-0.1957399
Subj34	215.98609	136.5078856
Subj35	-152.60396	76.5559793
Subj36	401.50547	-145.0045102
Subj37	274.61818	-114.4122269
Subj38	139.48472	-82.7958599
Subj39	733.87830	58.5575052
Subj40	155.08935	41.5013938

4.3.4.2 Items random effects

random_effects <- ranef(xmdl.Optimal) %>%
  pluck(2) %>%
  rownames_to_column() %>%
  rename(Items = rowname, Intercept = "(Intercept)") 
 
random_effects %>%
  knitr::kable()

Items	Intercept	CondIncongruent	AgeOld
Item01	-247.414514	16.2991461	28.313005
Item02	-28.240458	20.6149251	37.267882
Item03	-27.548425	11.7099369	-3.218265
Item04	30.999038	32.2392567	32.548251
Item05	246.216865	38.0586065	2.963532
Item06	-135.583741	28.3151557	28.127365
Item07	75.461506	28.6715618	-2.095308
Item08	72.679382	69.1731012	73.479559
Item09	-180.071464	32.9237792	60.953098
Item10	-170.029336	8.4112556	23.865860
Item11	-278.857701	-39.6392973	-22.526615
Item12	114.001992	4.7652529	-15.353375
Item13	-137.533535	-49.5701288	-36.764434
Item14	-272.928909	-85.2576859	-45.340741
Item15	219.855561	-70.2475208	-83.121775
Item16	243.071756	13.0287971	-4.812881
Item17	210.792617	10.6898309	17.840021
Item18	328.737777	-36.9498370	-82.922723
Item19	-46.910053	28.2829542	22.930115
Item20	-107.870054	-25.5384342	-13.846115
Item21	-209.598742	-56.8339876	-29.482587
Item22	-190.253589	-3.0439753	20.502614
Item23	207.958653	34.4328054	33.373220
Item24	-371.211254	-46.1955955	1.751567
Item25	163.603920	15.0837807	-7.780891
Item26	-56.954216	-39.1972174	-59.652456
Item27	126.648407	7.5867286	-1.102687
Item28	48.176094	-0.4969084	1.358914
Item29	-76.699983	-27.4397737	-1.462458
Item30	-20.393155	33.1813808	36.286237
Item31	33.763416	-15.9993993	-14.400138
Item32	-15.979029	47.9237089	67.643425
Item33	64.451233	-8.2252876	-47.157862
Item34	102.286454	22.6554474	17.134677
Item35	319.882217	54.1470389	10.485239
Item36	115.886689	22.7596706	9.094455
Item37	93.471378	9.1514881	-17.490337
Item38	-523.293819	-46.1546008	-5.185295
Item39	287.858946	20.0462802	2.292923
Item40	-8.431925	-59.3622401	-34.495016

4.3.4.3 Plots

These plots below are produced with the sjPlot package.

4.3.4.3.1 Fixed effects

4.3.4.3.1.1 Condition

plot_model(xmdl.Optimal, type="pred", terms=c("Cond"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.1.2 Age

plot_model(xmdl.Optimal, type="pred", terms=c("Age"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.1.3 Condition by Age

plot_model(xmdl.Optimal, type="emm", terms=c("Cond", "Age"), ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2 Random effects

4.3.4.3.2.1 Subject

plot_model(xmdl.Optimal, type="pred", terms=c("Subj"), pred.type="re", ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2.2 Subject by Condition

plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Subj"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.3 Subject by Age

plot_model(xmdl.Optimal, type="pred", terms=c("Age", "Subj"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.4 Item

plot_model(xmdl.Optimal, type="pred", terms=c("Item"), pred.type="re", ci.lvl = NA, dodge = 0) + theme_bw() + geom_line()

4.3.4.3.2.5 Item by Cond

plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.6 Item by Age

plot_model(xmdl.Optimal, type="pred", terms=c("Age", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.7 Item by Cond facetted by Age

plot_model(xmdl.Optimal, type="pred", terms=c("Cond", "Item", "Age"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Item)))) + theme_bw() + geom_line()

4.3.4.3.2.8 Subject by Item

plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.9 Subject by Item facetted by Cond

plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item", "Cond"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.2.10 Subject by Item facetted by Age

plot_model(xmdl.Optimal, type="pred", terms=c("Subj", "Item", "Age"), pred.type="re", ci.lvl = NA, dodge = 0, colors = paletteer_c("grDevices::rainbow", length(unique(dataCong$Subj)))) + theme_bw() + geom_line()

4.3.4.3.3 With `Lattice`

4.3.4.3.3.1 Subject Intercepts

dotplot(ranef(xmdl.Optimal))$Subj[1]

4.3.4.3.3.2 Subject Slopes

dotplot(ranef(xmdl.Optimal), xlim = c(-350, 350))$Subj[2]

4.3.4.3.3.3 Item Intercepts

dotplot(ranef(xmdl.Optimal))$Item[1]

4.3.4.3.3.4 Item Slopes for Cond

dotplot(ranef(xmdl.Optimal), xlim = c(-150, 150))$Item[2]

4.3.4.3.3.5 Item Slopes for Age

dotplot(ranef(xmdl.Optimal), xlim = c(-150, 150))$Item[3]

I hope this tutorial helped you to uncover the role of participants and items and what they can tell us beyond the fixed effect!

4.3.5 Conclusion

In this simulated dataset, we showed how to use Linear Mixed-effects Models, and used a specific approach, named “maximal specification model”. This maximal specification model uses both random intercepts and random slopes. Usually, on any dataset, you are required to formally assess the need for Random slopes.

As a rule of thumb: Any within-subject (or within-item) should be tested for a potential inclusion as a random slope.

Fixed effects provide averages over all observations, even when using mixed effects regressions; we need to explore what random effects (intercepts and slopes) tell us.

In this example, we see that many subjects vary beyond the fixed effect; Standard Errors are not enough to quantify this type of variation. The same is true for items that are not explored routinely!

The approach above used a maximal specification model. This of course depends on the dataset you are working on and requires formal testing. See the package RePsychLing that is on github. The function RePCA is part of the package lme4. You can use it to verify if the model structure is appropriate. However, lme4 will give you directly a warning “is singular” that you can interpret as “over specified model”.