## Use the code below to check whether you have all required packages installed. Any packages that are not installed yet will be installed by this code. If you already have all packages installed, you can load them with the second code instead.
requiredPackages = c('tidyverse', 'languageR', 'Hmisc', 'corrplot', 'broom', 'knitr', 'xtable', 'ggsignif')
for(p in requiredPackages){
  if(!require(p,character.only = TRUE)) install.packages(p)
  library(p,character.only = TRUE)
}
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
Registered S3 method overwritten by 'data.table':
method from
print.data.table
Attaching package: ‘Hmisc’
The following objects are masked from ‘package:dplyr’:
src, summarize
The following objects are masked from ‘package:base’:
format.pval, units
Loading required package: corrplot
corrplot 0.90 loaded
Loading required package: broom
Loading required package: knitr
Loading required package: xtable
Attaching package: ‘xtable’
The following objects are masked from ‘package:Hmisc’:
label, label<-
Loading required package: ggsignif
What do we mean by inferential statistics? We want to evaluate if there are differences observed between two groups of datapoints and if these differences are statistically significant.
For this, we need to pay attention to the format of our outcome and predictors. An outcome is the response variable (or dependent variable); a predictor is our independent variable(s).
It is important to know the class of the outcome before doing any pre-data analyses or inferential statistics. Outcome classes can be one of:
Numeric: For example, the length or width of a leaf, the height of a mountain, or the fundamental frequency of the voice. These are true numbers, and we can use summaries, t-tests, linear models, etc. Integers are a subtype of numeric variables and can still be treated as normal numeric variables.
Categorical (Unordered): Observations that fall into two or more categories. For example, the gender of a speaker (male or female); responses to a True vs False perception test; colour categorisation, e.g., red, blue, yellow, orange, etc. For these we can use a Generalised Linear Model (binomial or multinomial) or a simple chi-square test. Count data are numbers of observations tied to a category; these should be analysed using a Poisson regression.
Categorical (Ordered): The outcome of a rating experiment, whether it is coded as numbers (i.e., 1, 2, 3, 4, 5) or as categories (i.e., disagree, neutral, agree). The numeric coding is NOT a true number: for the participant, these are categories. Cumulative Logit models (or a Generalised Linear Model with a cumulative link function) are used. The mean is meaningless here, and the median is the preferred descriptive statistic.
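As a quick way to check which of these classes an outcome belongs to, the sketch below builds a small toy data frame and inspects it. The data frame `toy` and its columns `f0`, `gender` and `rating` are invented for illustration and are not part of the dataset used in this tutorial.

```r
# Hypothetical toy data, invented only to illustrate the three outcome classes
toy <- data.frame(
  f0     = c(110.5, 212.3, 198.7, 125.0, 180.2),                     # numeric
  gender = factor(c("male", "female", "female", "male", "female")),  # unordered
  rating = factor(c("agree", "neutral", "disagree", "agree", "neutral"),
                  levels = c("disagree", "neutral", "agree"),
                  ordered = TRUE)                                     # ordered
)

# First class label of each column
sapply(toy, function(x) class(x)[1])

# Logical checks for the class of a single outcome
is.numeric(toy$f0)
is.ordered(toy$gender)
is.ordered(toy$rating)

# For an ordered outcome, take the median of the underlying level codes
# and map it back to its category label
median_code <- median(as.integer(toy$rating))
levels(toy$rating)[median_code]
```

With these five ratings the median category is "neutral". Going through the integer codes is one possible design choice; it works cleanly here because the number of ratings is odd, so the median falls exactly on one level.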
Let us start with a basic correlation test. We want to evaluate if two numeric variables are correlated with each other.
We use the function cor to obtain the Pearson correlation and cor.test to run a basic correlation test on our data with significance testing.
cor(english$RTlexdec, english$RTnaming, method = "pearson")
[1] 0.7587033
cor.test(english$RTlexdec, english$RTnaming)
Pearson's product-moment correlation
data: english$RTlexdec and english$RTnaming
t = 78.699, df = 4566, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.7461195 0.7707453
sample estimates:
cor
0.7587033
What are these results telling us? There is a positive correlation between RTlexdec and RTnaming. The correlation coefficient (Pearson's r) is 0.76 (r ranges between -1 and 1). This correlation is statistically significant, with a t value of 78.699, 4566 degrees of freedom and a p-value < 2.2e-16.
What are the degrees of freedom? These relate to the total number of observations minus the number of comparisons. Here we have 4568 observations in the dataset and two variables being compared, hence 4568 - 2 = 4566.
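To see how these numbers fit together, the sketch below recomputes the t value from the reported correlation and degrees of freedom. It uses the standard relation for a Pearson correlation test, t = r * sqrt(df) / sqrt(1 - r^2). The object names `r`, `n`, `df`, `t_value` and `p_value` are chosen here for illustration.

```r
# Values copied from the cor.test output reported above
r  <- 0.7587033   # Pearson correlation between RTlexdec and RTnaming
n  <- 4568        # number of observations
df <- n - 2       # degrees of freedom

# t statistic for testing whether the true correlation is 0
t_value <- r * sqrt(df) / sqrt(1 - r^2)
t_value   # close to the 78.699 reported by cor.test

# Two-sided p value from the t distribution
p_value <- 2 * pt(-abs(t_value), df)
p_value < 2.2e-16
```

Because r is entered with only seven decimals, the result may differ from the printed t value in the last digits.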
For the p value, there is a threshold we usually use: p = 0.05. A result is considered statistically significant when its p value falls below this threshold. The p value is the probability of observing a result at least as extreme as ours if there were in fact no correlation; with a 0.05 threshold, we accept a 5% risk of calling a result significant when there is no true effect. In our case, the p value is lower than 2.2e-16. How to interpret this number? It tells us to put 15 zeros after the decimal point before the 2, i.e., 0.00000000000000022. This probability is very (very!!) low. So we conclude that there is a statistically significant correlation between the two variables.
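If the scientific notation is hard to read, R can print the number in full. The sketch below is only illustrative: `p_reported` is a name chosen here, and it holds the printed bound rather than the exact p value. The bound itself is R's reporting floor, `.Machine$double.eps`, so "< 2.2e-16" means the p value is too small for R to print more precisely.

```r
# The bound printed by cor.test, not the exact p value
p_reported <- 2.2e-16

# Write the number out in fixed notation to count the zeros
format(p_reported, scientific = FALSE)

# Compare against the usual significance threshold
alpha <- 0.05
p_reported < alpha

# The floor below which R reports "< 2.2e-16"
.Machine$double.eps
```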
The formula to calculate the t value (shown here for a one-sample t-test) is below.
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where x̄ = sample mean, μ0 = population mean, s = sample standard deviation, n = sample size.
The p value is influenced by various factors: the number of observations, the strength of the difference, the mean values, etc. You should always be careful when interpreting p values, taking everything else into account.
corrplot
Above, we did a correlation test on two predictors. What if we want to obtain a nice plot of all numeric predictors and add significance levels?
corr <- english %>%
  select(where(is.numeric)) %>%
  cor()
print(corr)
RTlexdec RTnaming Familiarity
RTlexdec 1.000000000 0.758703280 -0.44440973
RTnaming 0.758703280 1.000000000 -0.09479307
Familiarity -0.444409734 -0.094793069 1.00000000
WrittenFrequency -0.434814982 -0.095994313 0.79125559
WrittenSpokenFrequencyRatio 0.039820007 0.036592754 -0.18989881
FamilySize -0.349595853 -0.088037010 0.59191973
DerivationalEntropy -0.161164620 -0.049456670 0.22071588
InflectionalEntropy -0.088418681 -0.022110376 0.10795420
NumberSimplexSynsets -0.309008140 -0.071900207 0.51065170
NumberComplexSynsets -0.328613209 -0.076384846 0.51001913
LengthInLetters 0.049747275 0.094497065 -0.08215272
Ncount -0.065726313 -0.094669618 0.09650461
MeanBigramFrequency 0.002633525 0.048459360 0.02962138
FrequencyInitialDiphone -0.074452719 -0.057216874 0.12847193
ConspelV -0.032867467 -0.025165185 0.07451417
ConspelN -0.107538023 -0.034801239 0.21437628
ConphonV -0.021747588 0.001175572 0.05408975
ConphonN -0.080930543 -0.014364896 0.16180440
ConfriendsV -0.025720833 -0.027741071 0.04584224
ConfriendsN -0.117532883 -0.044064997 0.20673977
ConffV -0.016494945 0.007924551 0.05610343
ConffN -0.005679088 0.011407182 0.02945722
ConfbV -0.022515482 0.019159417 0.05687343
ConfbN -0.018539499 0.017011731 0.05199653
NounFrequency -0.167189500 -0.043148572 0.38119070
VerbFrequency -0.076388309 -0.024593780 0.23817700
FrequencyInitialDiphoneWord -0.042640861 0.020488545 0.09106333
FrequencyInitialDiphoneSyllable -0.035503708 0.026897756 0.07354114
CorrectLexdec -0.253188184 0.151348043 0.52685458
WrittenFrequency
RTlexdec -0.43481498
RTnaming -0.09599431
Familiarity 0.79125559
WrittenFrequency 1.00000000
WrittenSpokenFrequencyRatio 0.07158067
FamilySize 0.66253864
DerivationalEntropy 0.25522889
InflectionalEntropy -0.04048005
NumberSimplexSynsets 0.55874958
NumberComplexSynsets 0.59105478
LengthInLetters -0.06663196
Ncount 0.10492564
MeanBigramFrequency 0.07758879
FrequencyInitialDiphone 0.16748670
ConspelV 0.05864228
ConspelN 0.28248908
ConphonV 0.08201255
ConphonN 0.22245283
ConfriendsV 0.02146455
ConfriendsN 0.26326498
ConffV 0.08162166
ConffN 0.05028101
ConfbV 0.11975724
ConfbN 0.10409106
NounFrequency 0.46955152
VerbFrequency 0.27879235
FrequencyInitialDiphoneWord 0.10827895
FrequencyInitialDiphoneSyllable 0.09111661
CorrectLexdec 0.45797185
WrittenSpokenFrequencyRatio
RTlexdec 0.039820007
RTnaming 0.036592754
Familiarity -0.189898811
WrittenFrequency 0.071580669
WrittenSpokenFrequencyRatio 1.000000000
FamilySize -0.108801543
DerivationalEntropy -0.010987756
InflectionalEntropy -0.118804044
NumberSimplexSynsets -0.085088573
NumberComplexSynsets -0.104598273
LengthInLetters 0.204091196
Ncount -0.188200595
MeanBigramFrequency 0.191513493
FrequencyInitialDiphone 0.020799272
ConspelV -0.157689304
ConspelN -0.057307387
ConphonV -0.034796879
ConphonN -0.025856126
ConfriendsV -0.136348207
ConfriendsN -0.055568525
ConffV -0.026731676
ConffN -0.003403134
ConfbV 0.093614167
ConfbN 0.070239322
NounFrequency 0.012482293
VerbFrequency -0.096554415
FrequencyInitialDiphoneWord 0.002627022
FrequencyInitialDiphoneSyllable 0.012872673
CorrectLexdec 0.008564774
FamilySize DerivationalEntropy
RTlexdec -0.349595853 -0.161164620
RTnaming -0.088037010 -0.049456670
Familiarity 0.591919733 0.220715880
WrittenFrequency 0.662538635 0.255228886
WrittenSpokenFrequencyRatio -0.108801543 -0.010987756
FamilySize 1.000000000 0.692088896
DerivationalEntropy 0.692088896 1.000000000
InflectionalEntropy 0.101743523 -0.050795034
NumberSimplexSynsets 0.590763556 0.223943027
NumberComplexSynsets 0.645411663 0.331924999
LengthInLetters -0.122995009 -0.104860729
Ncount 0.174107015 0.123827732
MeanBigramFrequency -0.001056468 -0.020837051
FrequencyInitialDiphone 0.126804334 0.083835479
ConspelV 0.110812602 0.046117273
ConspelN 0.249522442 0.137519738
ConphonV 0.050171973 -0.002754648
ConphonN 0.161531494 0.074468614
ConfriendsV 0.079271194 0.028929529
ConfriendsN 0.242711641 0.131756437
ConffV 0.080362377 0.041097365
ConffN 0.059199476 0.040363830
ConfbV 0.050020822 0.007743200
ConfbN 0.038457171 0.011462290
NounFrequency 0.417794301 0.172254519
VerbFrequency 0.114132925 -0.019725738
FrequencyInitialDiphoneWord 0.096705342 0.029479534
FrequencyInitialDiphoneSyllable 0.086426850 0.027755605
CorrectLexdec 0.360613035 0.188753214
InflectionalEntropy
RTlexdec -0.088418681
RTnaming -0.022110376
Familiarity 0.107954197
WrittenFrequency -0.040480046
WrittenSpokenFrequencyRatio -0.118804044
FamilySize 0.101743523
DerivationalEntropy -0.050795034
InflectionalEntropy 1.000000000
NumberSimplexSynsets 0.398736053
NumberComplexSynsets 0.005589502
LengthInLetters 0.052485031
Ncount -0.003252708
MeanBigramFrequency 0.024789643
FrequencyInitialDiphone -0.034461207
ConspelV 0.140798520
ConspelN 0.046826086
ConphonV 0.082962738
ConphonN 0.031410725
ConfriendsV 0.131247972
ConfriendsN 0.067675205
ConffV 0.014388810
ConffN 0.010758578
ConfbV 0.002863801
ConfbN -0.007583063
NounFrequency -0.114401007
VerbFrequency 0.094002603
FrequencyInitialDiphoneWord 0.052469468
FrequencyInitialDiphoneSyllable 0.050939450
CorrectLexdec 0.182065382
NumberSimplexSynsets
RTlexdec -0.3090081404
RTnaming -0.0719002065
Familiarity 0.5106516971
WrittenFrequency 0.5587495840
WrittenSpokenFrequencyRatio -0.0850885729
FamilySize 0.5907635556
DerivationalEntropy 0.2239430269
InflectionalEntropy 0.3987360535
NumberSimplexSynsets 1.0000000000
NumberComplexSynsets 0.5245365002
LengthInLetters -0.0063644110
Ncount 0.1129586209
MeanBigramFrequency 0.0539689516
FrequencyInitialDiphone 0.0571276406
ConspelV 0.1660590990
ConspelN 0.2275859556
ConphonV 0.0747186906
ConphonN 0.1443620165
ConfriendsV 0.1546554020
ConfriendsN 0.2524518522
ConffV 0.0362906719
ConffN 0.0227783558
ConfbV 0.0215807967
ConfbN 0.0002057108
NounFrequency 0.2380855612
VerbFrequency 0.1887961418
FrequencyInitialDiphoneWord 0.1270081956
FrequencyInitialDiphoneSyllable 0.1160585801
CorrectLexdec 0.3500774208
NumberComplexSynsets LengthInLetters
RTlexdec -0.328613209 0.049747275
RTnaming -0.076384846 0.094497065
Familiarity 0.510019126 -0.082152716
WrittenFrequency 0.591054783 -0.066631955
WrittenSpokenFrequencyRatio -0.104598273 0.204091196
FamilySize 0.645411663 -0.122995009
DerivationalEntropy 0.331924999 -0.104860729
InflectionalEntropy 0.005589502 0.052485031
NumberSimplexSynsets 0.524536500 -0.006364411
NumberComplexSynsets 1.000000000 -0.120445975
LengthInLetters -0.120445975 1.000000000
Ncount 0.137748482 -0.625129141
MeanBigramFrequency -0.023604116 0.790492091
FrequencyInitialDiphone 0.103684145 -0.060443836
ConspelV 0.071760775 -0.226416938
ConspelN 0.193733204 -0.170022083
ConphonV 0.047783270 -0.202368726
ConphonN 0.142498720 -0.205167896
ConfriendsV 0.037761099 -0.192199942
ConfriendsN 0.180629522 -0.156898314
ConffV 0.082102822 -0.019244458
ConffN 0.057010830 0.010765359
ConfbV 0.052172715 -0.040037290
ConfbN 0.049175194 -0.069985486
NounFrequency 0.349469930 -0.035331865
VerbFrequency 0.092248597 -0.083729951
FrequencyInitialDiphoneWord 0.058217178 0.155454553
FrequencyInitialDiphoneSyllable 0.047009152 0.150391668
CorrectLexdec 0.329011088 0.046317578
Ncount MeanBigramFrequency
RTlexdec -0.065726313 0.002633525
RTnaming -0.094669618 0.048459360
Familiarity 0.096504609 0.029621385
WrittenFrequency 0.104925644 0.077588795
WrittenSpokenFrequencyRatio -0.188200595 0.191513493
FamilySize 0.174107015 -0.001056468
DerivationalEntropy 0.123827732 -0.020837051
InflectionalEntropy -0.003252708 0.024789643
NumberSimplexSynsets 0.112958621 0.053968952
NumberComplexSynsets 0.137748482 -0.023604116
LengthInLetters -0.625129141 0.790492091
Ncount 1.000000000 -0.387546284
MeanBigramFrequency -0.387546284 1.000000000
FrequencyInitialDiphone 0.135888890 0.324815461
ConspelV 0.474710938 -0.091270605
ConspelN 0.346547943 0.060952203
ConphonV 0.210193628 -0.122457211
ConphonN 0.190732679 -0.064645303
ConfriendsV 0.436821402 -0.078215883
ConfriendsN 0.340831193 0.049075415
ConffV 0.076838593 0.072593437
ConffN 0.069850580 0.113704972
ConfbV -0.036034465 0.002954953
ConfbN -0.042132633 -0.019550727
NounFrequency 0.035870248 0.043361959
VerbFrequency 0.053361797 -0.045835069
FrequencyInitialDiphoneWord 0.007890710 0.214165942
FrequencyInitialDiphoneSyllable 0.021719522 0.201570329
CorrectLexdec 0.016048288 0.063566285
FrequencyInitialDiphone ConspelV
RTlexdec -0.0744527186 -0.03286747
RTnaming -0.0572168742 -0.02516518
Familiarity 0.1284719337 0.07451417
WrittenFrequency 0.1674867041 0.05864228
WrittenSpokenFrequencyRatio 0.0207992723 -0.15768930
FamilySize 0.1268043344 0.11081260
DerivationalEntropy 0.0838354786 0.04611727
InflectionalEntropy -0.0344612070 0.14079852
NumberSimplexSynsets 0.0571276406 0.16605910
NumberComplexSynsets 0.1036841450 0.07176077
LengthInLetters -0.0604438357 -0.22641694
Ncount 0.1358888899 0.47471094
MeanBigramFrequency 0.3248154611 -0.09127061
FrequencyInitialDiphone 1.0000000000 -0.05573050
ConspelV -0.0557304958 1.00000000
ConspelN 0.0309540623 0.64214341
ConphonV -0.0142352920 0.54021641
ConphonN 0.0104023606 0.41727634
ConfriendsV -0.0495263185 0.91949493
ConfriendsN 0.0362062483 0.62267751
ConffV -0.0001697975 0.23618800
ConffN 0.0021493798 0.16655427
ConfbV 0.0334157038 0.04946527
ConfbN 0.0191463198 0.02724069
NounFrequency 0.0985964971 -0.01696005
VerbFrequency 0.0557187478 0.06291967
FrequencyInitialDiphoneWord 0.1310981285 0.11861458
FrequencyInitialDiphoneSyllable 0.1188976490 0.12276477
CorrectLexdec 0.0486800603 0.04934274
ConspelN ConphonV ConphonN
RTlexdec -0.10753802 -0.021747588 -0.08093054
RTnaming -0.03480124 0.001175572 -0.01436490
Familiarity 0.21437628 0.054089750 0.16180440
WrittenFrequency 0.28248908 0.082012553 0.22245283
WrittenSpokenFrequencyRatio -0.05730739 -0.034796879 -0.02585613
FamilySize 0.24952244 0.050171973 0.16153149
DerivationalEntropy 0.13751974 -0.002754648 0.07446861
InflectionalEntropy 0.04682609 0.082962738 0.03141073
NumberSimplexSynsets 0.22758596 0.074718691 0.14436202
NumberComplexSynsets 0.19373320 0.047783270 0.14249872
LengthInLetters -0.17002208 -0.202368726 -0.20516790
Ncount 0.34654794 0.210193628 0.19073268
MeanBigramFrequency 0.06095220 -0.122457211 -0.06464530
FrequencyInitialDiphone 0.03095406 -0.014235292 0.01040236
ConspelV 0.64214341 0.540216414 0.41727634
ConspelN 1.00000000 0.380474673 0.65365781
ConphonV 0.38047467 1.000000000 0.66588359
ConphonN 0.65365781 0.665883587 1.00000000
ConfriendsV 0.55820343 0.533170763 0.38675454
ConfriendsN 0.88292615 0.378854310 0.65028040
ConffV 0.27788182 0.039617689 0.08270634
ConffN 0.34213915 0.060092579 0.09538521
ConfbV 0.14221895 0.741851531 0.56997101
ConfbN 0.14531495 0.609947163 0.66832337
NounFrequency 0.11924516 -0.008769463 0.08519330
VerbFrequency 0.12533768 0.064268066 0.10336888
FrequencyInitialDiphoneWord 0.11832011 0.029920110 0.05363076
FrequencyInitialDiphoneSyllable 0.11843351 0.033639918 0.05429738
CorrectLexdec 0.10432858 0.020750103 0.06878849
ConfriendsV ConfriendsN ConffV
RTlexdec -0.02572083 -0.117532883 -0.0164949450
RTnaming -0.02774107 -0.044064997 0.0079245511
Familiarity 0.04584224 0.206739773 0.0561034304
WrittenFrequency 0.02146455 0.263264980 0.0816216587
WrittenSpokenFrequencyRatio -0.13634821 -0.055568525 -0.0267316761
FamilySize 0.07927119 0.242711641 0.0803623769
DerivationalEntropy 0.02892953 0.131756437 0.0410973651
InflectionalEntropy 0.13124797 0.067675205 0.0143888100
NumberSimplexSynsets 0.15465540 0.252451852 0.0362906719
NumberComplexSynsets 0.03776110 0.180629522 0.0821028216
LengthInLetters -0.19219994 -0.156898314 -0.0192444580
Ncount 0.43682140 0.340831193 0.0768385931
MeanBigramFrequency -0.07821588 0.049075415 0.0725934375
FrequencyInitialDiphone -0.04952632 0.036206248 -0.0001697975
ConspelV 0.91949493 0.622677513 0.2361879962
ConspelN 0.55820343 0.882926151 0.2778818176
ConphonV 0.53317076 0.378854310 0.0396176894
ConphonN 0.38675454 0.650280396 0.0827063427
ConfriendsV 1.00000000 0.649018597 -0.1205452871
ConfriendsN 0.64901860 1.000000000 0.0065361208
ConffV -0.12054529 0.006536121 1.0000000000
ConffN -0.10390690 0.020956643 0.8241820547
ConfbV 0.01094067 0.083786596 0.0729283492
ConfbN -0.01072111 0.089220835 0.0683948055
NounFrequency -0.02244854 0.120108606 0.0367115079
VerbFrequency 0.00393842 0.118902818 0.1198303241
FrequencyInitialDiphoneWord 0.12811075 0.115161203 0.0058281749
FrequencyInitialDiphoneSyllable 0.13787711 0.121747881 -0.0110108598
CorrectLexdec 0.04575846 0.124420710 0.0072904730
ConffN ConfbV ConfbN
RTlexdec -0.005679088 -0.022515482 -0.0185394993
RTnaming 0.011407182 0.019159417 0.0170117309
Familiarity 0.029457215 0.056873430 0.0519965301
WrittenFrequency 0.050281007 0.119757242 0.1040910628
WrittenSpokenFrequencyRatio -0.003403134 0.093614167 0.0702393218
FamilySize 0.059199476 0.050020822 0.0384571715
DerivationalEntropy 0.040363830 0.007743200 0.0114622898
InflectionalEntropy 0.010758578 0.002863801 -0.0075830633
NumberSimplexSynsets 0.022778356 0.021580797 0.0002057108
NumberComplexSynsets 0.057010830 0.052172715 0.0491751942
LengthInLetters 0.010765359 -0.040037290 -0.0699854865
Ncount 0.069850580 -0.036034465 -0.0421326334
MeanBigramFrequency 0.113704972 0.002954953 -0.0195507270
FrequencyInitialDiphone 0.002149380 0.033415704 0.0191463198
ConspelV 0.166554266 0.049465271 0.0272406932
ConspelN 0.342139148 0.142218950 0.1453149511
ConphonV 0.060092579 0.741851531 0.6099471634
ConphonN 0.095385208 0.569971012 0.6683233736
ConfriendsV -0.103906902 0.010940674 -0.0107211109
ConfriendsN 0.020956643 0.083786596 0.0892208353
ConffV 0.824182055 0.072928349 0.0683948055
ConffN 1.000000000 0.114815595 0.0933643088
ConfbV 0.114815595 1.000000000 0.8424469664
ConfbN 0.093364309 0.842446966 1.0000000000
NounFrequency 0.010288796 0.021685118 0.0252276045
VerbFrequency 0.082163492 0.050519739 0.0329567506
FrequencyInitialDiphoneWord 0.003618182 -0.019915368 -0.0188858839
FrequencyInitialDiphoneSyllable -0.008732692 -0.027020380 -0.0244143624
CorrectLexdec -0.007205900 0.005393638 0.0039391579
NounFrequency VerbFrequency
RTlexdec -0.167189500 -0.076388309
RTnaming -0.043148572 -0.024593780
Familiarity 0.381190698 0.238176996
WrittenFrequency 0.469551521 0.278792355
WrittenSpokenFrequencyRatio 0.012482293 -0.096554415
FamilySize 0.417794301 0.114132925
DerivationalEntropy 0.172254519 -0.019725738
InflectionalEntropy -0.114401007 0.094002603
NumberSimplexSynsets 0.238085561 0.188796142
NumberComplexSynsets 0.349469930 0.092248597
LengthInLetters -0.035331865 -0.083729951
Ncount 0.035870248 0.053361797
MeanBigramFrequency 0.043361959 -0.045835069
FrequencyInitialDiphone 0.098596497 0.055718748
ConspelV -0.016960053 0.062919672
ConspelN 0.119245162 0.125337679
ConphonV -0.008769463 0.064268066
ConphonN 0.085193295 0.103368885
ConfriendsV -0.022448538 0.003938420
ConfriendsN 0.120108606 0.118902818
ConffV 0.036711508 0.119830324
ConffN 0.010288796 0.082163492
ConfbV 0.021685118 0.050519739
ConfbN 0.025227604 0.032956751
NounFrequency 1.000000000 -0.003117231
VerbFrequency -0.003117231 1.000000000
FrequencyInitialDiphoneWord 0.047626002 0.069596145
FrequencyInitialDiphoneSyllable 0.034300335 0.055821617
CorrectLexdec 0.128263251 0.050165423
FrequencyInitialDiphoneWord
RTlexdec -0.042640861
RTnaming 0.020488545
Familiarity 0.091063334
WrittenFrequency 0.108278953
WrittenSpokenFrequencyRatio 0.002627022
FamilySize 0.096705342
DerivationalEntropy 0.029479534
InflectionalEntropy 0.052469468
NumberSimplexSynsets 0.127008196
NumberComplexSynsets 0.058217178
LengthInLetters 0.155454553
Ncount 0.007890710
MeanBigramFrequency 0.214165942
FrequencyInitialDiphone 0.131098129
ConspelV 0.118614576
ConspelN 0.118320106
ConphonV 0.029920110
ConphonN 0.053630763
ConfriendsV 0.128110751
ConfriendsN 0.115161203
ConffV 0.005828175
ConffN 0.003618182
ConfbV -0.019915368
ConfbN -0.018885884
NounFrequency 0.047626002
VerbFrequency 0.069596145
FrequencyInitialDiphoneWord 1.000000000
FrequencyInitialDiphoneSyllable 0.978742189
CorrectLexdec 0.062039751
FrequencyInitialDiphoneSyllable
RTlexdec -0.035503708
RTnaming 0.026897756
Familiarity 0.073541144
WrittenFrequency 0.091116609
WrittenSpokenFrequencyRatio 0.012872673
FamilySize 0.086426850
DerivationalEntropy 0.027755605
InflectionalEntropy 0.050939450
NumberSimplexSynsets 0.116058580
NumberComplexSynsets 0.047009152
LengthInLetters 0.150391668
Ncount 0.021719522
MeanBigramFrequency 0.201570329
FrequencyInitialDiphone 0.118897649
ConspelV 0.122764768
ConspelN 0.118433514
ConphonV 0.033639918
ConphonN 0.054297378
ConfriendsV 0.137877114
ConfriendsN 0.121747881
ConffV -0.011010860
ConffN -0.008732692
ConfbV -0.027020380
ConfbN -0.024414362
NounFrequency 0.034300335
VerbFrequency 0.055821617
FrequencyInitialDiphoneWord 0.978742189
FrequencyInitialDiphoneSyllable 1.000000000
CorrectLexdec 0.057000795
CorrectLexdec
RTlexdec -0.253188184
RTnaming 0.151348043
Familiarity 0.526854585
WrittenFrequency 0.457971849
WrittenSpokenFrequencyRatio 0.008564774
FamilySize 0.360613035
DerivationalEntropy 0.188753214
InflectionalEntropy 0.182065382
NumberSimplexSynsets 0.350077421
NumberComplexSynsets 0.329011088
LengthInLetters 0.046317578
Ncount 0.016048288
MeanBigramFrequency 0.063566285
FrequencyInitialDiphone 0.048680060
ConspelV 0.049342737
ConspelN 0.104328581
ConphonV 0.020750103
ConphonN 0.068788492
ConfriendsV 0.045758455
ConfriendsN 0.124420710
ConffV 0.007290473
ConffN -0.007205900
ConfbV 0.005393638
ConfbN 0.003939158
NounFrequency 0.128263251
VerbFrequency 0.050165423
FrequencyInitialDiphoneWord 0.062039751
FrequencyInitialDiphoneSyllable 0.057000795
CorrectLexdec 1.000000000
corrplot(corr, method = 'ellipse', type = 'upper')
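A printed correlation matrix with this many variables is hard to scan. One option is to reshape the matrix into one row per pair and sort by strength. The sketch below does this in base R on the built-in `mtcars` data, used only so that the example runs without languageR. The names `corr_small` and `pairs_long` are chosen here for illustration.

```r
# Correlation matrix for a few numeric columns of a built-in dataset
corr_small <- cor(mtcars[, c("mpg", "wt", "hp", "qsec")])

# Keep each pair once by blanking the diagonal and the lower triangle
corr_small[lower.tri(corr_small, diag = TRUE)] <- NA

# Long format: one row per variable pair, coefficient in column "r"
pairs_long <- as.data.frame(as.table(corr_small), responseName = "r")
pairs_long <- pairs_long[!is.na(pairs_long$r), ]

# Sort from the strongest to the weakest correlation (by absolute value)
pairs_long <- pairs_long[order(-abs(pairs_long$r)), ]
pairs_long
```

The same steps should apply to any square correlation matrix returned by cor(), so the larger matrix computed on the numeric columns of english could be summarised in the same way.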
Let’s first compute the correlations between all numeric variables and plot these with the p values
## correlation using "corrplot"
## based on the function `rcorr` from the `Hmisc` package
## Need to change the dataframe into a matrix
corr <- english %>%
  select(where(is.numeric)) %>%
  as.matrix() %>%
  rcorr(type = "pearson")
print(corr)
RTlexdec RTnaming Familiarity
RTlexdec 1.00 0.76 -0.44
RTnaming 0.76 1.00 -0.09
Familiarity -0.44 -0.09 1.00
WrittenFrequency -0.43 -0.10 0.79
WrittenSpokenFrequencyRatio 0.04 0.04 -0.19
FamilySize -0.35 -0.09 0.59
DerivationalEntropy -0.16 -0.05 0.22
InflectionalEntropy -0.09 -0.02 0.11
NumberSimplexSynsets -0.31 -0.07 0.51
NumberComplexSynsets -0.33 -0.08 0.51
LengthInLetters 0.05 0.09 -0.08
Ncount -0.07 -0.09 0.10
MeanBigramFrequency 0.00 0.05 0.03
FrequencyInitialDiphone -0.07 -0.06 0.13
ConspelV -0.03 -0.03 0.07
ConspelN -0.11 -0.03 0.21
ConphonV -0.02 0.00 0.05
ConphonN -0.08 -0.01 0.16
ConfriendsV -0.03 -0.03 0.05
ConfriendsN -0.12 -0.04 0.21
ConffV -0.02 0.01 0.06
ConffN -0.01 0.01 0.03
ConfbV -0.02 0.02 0.06
ConfbN -0.02 0.02 0.05
NounFrequency -0.17 -0.04 0.38
VerbFrequency -0.08 -0.02 0.24
FrequencyInitialDiphoneWord -0.04 0.02 0.09
FrequencyInitialDiphoneSyllable -0.04 0.03 0.07
CorrectLexdec -0.25 0.15 0.53
WrittenFrequency
RTlexdec -0.43
RTnaming -0.10
Familiarity 0.79
WrittenFrequency 1.00
WrittenSpokenFrequencyRatio 0.07
FamilySize 0.66
DerivationalEntropy 0.26
InflectionalEntropy -0.04
NumberSimplexSynsets 0.56
NumberComplexSynsets 0.59
LengthInLetters -0.07
Ncount 0.10
MeanBigramFrequency 0.08
FrequencyInitialDiphone 0.17
ConspelV 0.06
ConspelN 0.28
ConphonV 0.08
ConphonN 0.22
ConfriendsV 0.02
ConfriendsN 0.26
ConffV 0.08
ConffN 0.05
ConfbV 0.12
ConfbN 0.10
NounFrequency 0.47
VerbFrequency 0.28
FrequencyInitialDiphoneWord 0.11
FrequencyInitialDiphoneSyllable 0.09
CorrectLexdec 0.46
WrittenSpokenFrequencyRatio FamilySize
RTlexdec 0.04 -0.35
RTnaming 0.04 -0.09
Familiarity -0.19 0.59
WrittenFrequency 0.07 0.66
WrittenSpokenFrequencyRatio 1.00 -0.11
FamilySize -0.11 1.00
DerivationalEntropy -0.01 0.69
InflectionalEntropy -0.12 0.10
NumberSimplexSynsets -0.09 0.59
NumberComplexSynsets -0.10 0.65
LengthInLetters 0.20 -0.12
Ncount -0.19 0.17
MeanBigramFrequency 0.19 0.00
FrequencyInitialDiphone 0.02 0.13
ConspelV -0.16 0.11
ConspelN -0.06 0.25
ConphonV -0.03 0.05
ConphonN -0.03 0.16
ConfriendsV -0.14 0.08
ConfriendsN -0.06 0.24
ConffV -0.03 0.08
ConffN 0.00 0.06
ConfbV 0.09 0.05
ConfbN 0.07 0.04
NounFrequency 0.01 0.42
VerbFrequency -0.10 0.11
FrequencyInitialDiphoneWord 0.00 0.10
FrequencyInitialDiphoneSyllable 0.01 0.09
CorrectLexdec 0.01 0.36
DerivationalEntropy InflectionalEntropy
RTlexdec -0.16 -0.09
RTnaming -0.05 -0.02
Familiarity 0.22 0.11
WrittenFrequency 0.26 -0.04
WrittenSpokenFrequencyRatio -0.01 -0.12
FamilySize 0.69 0.10
DerivationalEntropy 1.00 -0.05
InflectionalEntropy -0.05 1.00
NumberSimplexSynsets 0.22 0.40
NumberComplexSynsets 0.33 0.01
LengthInLetters -0.10 0.05
Ncount 0.12 0.00
MeanBigramFrequency -0.02 0.02
FrequencyInitialDiphone 0.08 -0.03
ConspelV 0.05 0.14
ConspelN 0.14 0.05
ConphonV 0.00 0.08
ConphonN 0.07 0.03
ConfriendsV 0.03 0.13
ConfriendsN 0.13 0.07
ConffV 0.04 0.01
ConffN 0.04 0.01
ConfbV 0.01 0.00
ConfbN 0.01 -0.01
NounFrequency 0.17 -0.11
VerbFrequency -0.02 0.09
FrequencyInitialDiphoneWord 0.03 0.05
FrequencyInitialDiphoneSyllable 0.03 0.05
CorrectLexdec 0.19 0.18
NumberSimplexSynsets
RTlexdec -0.31
RTnaming -0.07
Familiarity 0.51
WrittenFrequency 0.56
WrittenSpokenFrequencyRatio -0.09
FamilySize 0.59
DerivationalEntropy 0.22
InflectionalEntropy 0.40
NumberSimplexSynsets 1.00
NumberComplexSynsets 0.52
LengthInLetters -0.01
Ncount 0.11
MeanBigramFrequency 0.05
FrequencyInitialDiphone 0.06
ConspelV 0.17
ConspelN 0.23
ConphonV 0.07
ConphonN 0.14
ConfriendsV 0.15
ConfriendsN 0.25
ConffV 0.04
ConffN 0.02
ConfbV 0.02
ConfbN 0.00
NounFrequency 0.24
VerbFrequency 0.19
FrequencyInitialDiphoneWord 0.13
FrequencyInitialDiphoneSyllable 0.12
CorrectLexdec 0.35
NumberComplexSynsets LengthInLetters
RTlexdec -0.33 0.05
RTnaming -0.08 0.09
Familiarity 0.51 -0.08
WrittenFrequency 0.59 -0.07
WrittenSpokenFrequencyRatio -0.10 0.20
FamilySize 0.65 -0.12
DerivationalEntropy 0.33 -0.10
InflectionalEntropy 0.01 0.05
NumberSimplexSynsets 0.52 -0.01
NumberComplexSynsets 1.00 -0.12
LengthInLetters -0.12 1.00
Ncount 0.14 -0.63
MeanBigramFrequency -0.02 0.79
FrequencyInitialDiphone 0.10 -0.06
ConspelV 0.07 -0.23
ConspelN 0.19 -0.17
ConphonV 0.05 -0.20
ConphonN 0.14 -0.21
ConfriendsV 0.04 -0.19
ConfriendsN 0.18 -0.16
ConffV 0.08 -0.02
ConffN 0.06 0.01
ConfbV 0.05 -0.04
ConfbN 0.05 -0.07
NounFrequency 0.35 -0.04
VerbFrequency 0.09 -0.08
FrequencyInitialDiphoneWord 0.06 0.16
FrequencyInitialDiphoneSyllable 0.05 0.15
CorrectLexdec 0.33 0.05
Ncount MeanBigramFrequency
RTlexdec -0.07 0.00
RTnaming -0.09 0.05
Familiarity 0.10 0.03
WrittenFrequency 0.10 0.08
WrittenSpokenFrequencyRatio -0.19 0.19
FamilySize 0.17 0.00
DerivationalEntropy 0.12 -0.02
InflectionalEntropy 0.00 0.02
NumberSimplexSynsets 0.11 0.05
NumberComplexSynsets 0.14 -0.02
LengthInLetters -0.63 0.79
Ncount 1.00 -0.39
MeanBigramFrequency -0.39 1.00
FrequencyInitialDiphone 0.14 0.32
ConspelV 0.47 -0.09
ConspelN 0.35 0.06
ConphonV 0.21 -0.12
ConphonN 0.19 -0.06
ConfriendsV 0.44 -0.08
ConfriendsN 0.34 0.05
ConffV 0.08 0.07
ConffN 0.07 0.11
ConfbV -0.04 0.00
ConfbN -0.04 -0.02
NounFrequency 0.04 0.04
VerbFrequency 0.05 -0.05
FrequencyInitialDiphoneWord 0.01 0.21
FrequencyInitialDiphoneSyllable 0.02 0.20
CorrectLexdec 0.02 0.06
FrequencyInitialDiphone ConspelV
RTlexdec -0.07 -0.03
RTnaming -0.06 -0.03
Familiarity 0.13 0.07
WrittenFrequency 0.17 0.06
WrittenSpokenFrequencyRatio 0.02 -0.16
FamilySize 0.13 0.11
DerivationalEntropy 0.08 0.05
InflectionalEntropy -0.03 0.14
NumberSimplexSynsets 0.06 0.17
NumberComplexSynsets 0.10 0.07
LengthInLetters -0.06 -0.23
Ncount 0.14 0.47
MeanBigramFrequency 0.32 -0.09
FrequencyInitialDiphone 1.00 -0.06
ConspelV -0.06 1.00
ConspelN 0.03 0.64
ConphonV -0.01 0.54
ConphonN 0.01 0.42
ConfriendsV -0.05 0.92
ConfriendsN 0.04 0.62
ConffV 0.00 0.24
ConffN 0.00 0.17
ConfbV 0.03 0.05
ConfbN 0.02 0.03
NounFrequency 0.10 -0.02
VerbFrequency 0.06 0.06
FrequencyInitialDiphoneWord 0.13 0.12
FrequencyInitialDiphoneSyllable 0.12 0.12
CorrectLexdec 0.05 0.05
ConspelN ConphonV ConphonN ConfriendsV
RTlexdec -0.11 -0.02 -0.08 -0.03
RTnaming -0.03 0.00 -0.01 -0.03
Familiarity 0.21 0.05 0.16 0.05
WrittenFrequency 0.28 0.08 0.22 0.02
WrittenSpokenFrequencyRatio -0.06 -0.03 -0.03 -0.14
FamilySize 0.25 0.05 0.16 0.08
DerivationalEntropy 0.14 0.00 0.07 0.03
InflectionalEntropy 0.05 0.08 0.03 0.13
NumberSimplexSynsets 0.23 0.07 0.14 0.15
NumberComplexSynsets 0.19 0.05 0.14 0.04
LengthInLetters -0.17 -0.20 -0.21 -0.19
Ncount 0.35 0.21 0.19 0.44
MeanBigramFrequency 0.06 -0.12 -0.06 -0.08
FrequencyInitialDiphone 0.03 -0.01 0.01 -0.05
ConspelV 0.64 0.54 0.42 0.92
ConspelN 1.00 0.38 0.65 0.56
ConphonV 0.38 1.00 0.67 0.53
ConphonN 0.65 0.67 1.00 0.39
ConfriendsV 0.56 0.53 0.39 1.00
ConfriendsN 0.88 0.38 0.65 0.65
ConffV 0.28 0.04 0.08 -0.12
ConffN 0.34 0.06 0.10 -0.10
ConfbV 0.14 0.74 0.57 0.01
ConfbN 0.15 0.61 0.67 -0.01
NounFrequency 0.12 -0.01 0.09 -0.02
VerbFrequency 0.13 0.06 0.10 0.00
FrequencyInitialDiphoneWord 0.12 0.03 0.05 0.13
FrequencyInitialDiphoneSyllable 0.12 0.03 0.05 0.14
CorrectLexdec 0.10 0.02 0.07 0.05
ConfriendsN ConffV ConffN ConfbV ConfbN
RTlexdec -0.12 -0.02 -0.01 -0.02 -0.02
RTnaming -0.04 0.01 0.01 0.02 0.02
Familiarity 0.21 0.06 0.03 0.06 0.05
WrittenFrequency 0.26 0.08 0.05 0.12 0.10
WrittenSpokenFrequencyRatio -0.06 -0.03 0.00 0.09 0.07
FamilySize 0.24 0.08 0.06 0.05 0.04
DerivationalEntropy 0.13 0.04 0.04 0.01 0.01
InflectionalEntropy 0.07 0.01 0.01 0.00 -0.01
NumberSimplexSynsets 0.25 0.04 0.02 0.02 0.00
NumberComplexSynsets 0.18 0.08 0.06 0.05 0.05
LengthInLetters -0.16 -0.02 0.01 -0.04 -0.07
Ncount 0.34 0.08 0.07 -0.04 -0.04
MeanBigramFrequency 0.05 0.07 0.11 0.00 -0.02
FrequencyInitialDiphone 0.04 0.00 0.00 0.03 0.02
ConspelV 0.62 0.24 0.17 0.05 0.03
ConspelN 0.88 0.28 0.34 0.14 0.15
ConphonV 0.38 0.04 0.06 0.74 0.61
ConphonN 0.65 0.08 0.10 0.57 0.67
ConfriendsV 0.65 -0.12 -0.10 0.01 -0.01
ConfriendsN 1.00 0.01 0.02 0.08 0.09
ConffV 0.01 1.00 0.82 0.07 0.07
ConffN 0.02 0.82 1.00 0.11 0.09
ConfbV 0.08 0.07 0.11 1.00 0.84
ConfbN 0.09 0.07 0.09 0.84 1.00
NounFrequency 0.12 0.04 0.01 0.02 0.03
VerbFrequency 0.12 0.12 0.08 0.05 0.03
FrequencyInitialDiphoneWord 0.12 0.01 0.00 -0.02 -0.02
FrequencyInitialDiphoneSyllable 0.12 -0.01 -0.01 -0.03 -0.02
CorrectLexdec 0.12 0.01 -0.01 0.01 0.00
NounFrequency VerbFrequency
RTlexdec -0.17 -0.08
RTnaming -0.04 -0.02
Familiarity 0.38 0.24
WrittenFrequency 0.47 0.28
WrittenSpokenFrequencyRatio 0.01 -0.10
FamilySize 0.42 0.11
DerivationalEntropy 0.17 -0.02
InflectionalEntropy -0.11 0.09
NumberSimplexSynsets 0.24 0.19
NumberComplexSynsets 0.35 0.09
LengthInLetters -0.04 -0.08
Ncount 0.04 0.05
MeanBigramFrequency 0.04 -0.05
FrequencyInitialDiphone 0.10 0.06
ConspelV -0.02 0.06
ConspelN 0.12 0.13
ConphonV -0.01 0.06
ConphonN 0.09 0.10
ConfriendsV -0.02 0.00
ConfriendsN 0.12 0.12
ConffV 0.04 0.12
ConffN 0.01 0.08
ConfbV 0.02 0.05
ConfbN 0.03 0.03
NounFrequency 1.00 0.00
VerbFrequency 0.00 1.00
FrequencyInitialDiphoneWord 0.05 0.07
FrequencyInitialDiphoneSyllable 0.03 0.06
CorrectLexdec 0.13 0.05
FrequencyInitialDiphoneWord
RTlexdec -0.04
RTnaming 0.02
Familiarity 0.09
WrittenFrequency 0.11
WrittenSpokenFrequencyRatio 0.00
FamilySize 0.10
DerivationalEntropy 0.03
InflectionalEntropy 0.05
NumberSimplexSynsets 0.13
NumberComplexSynsets 0.06
LengthInLetters 0.16
Ncount 0.01
MeanBigramFrequency 0.21
FrequencyInitialDiphone 0.13
ConspelV 0.12
ConspelN 0.12
ConphonV 0.03
ConphonN 0.05
ConfriendsV 0.13
ConfriendsN 0.12
ConffV 0.01
ConffN 0.00
ConfbV -0.02
ConfbN -0.02
NounFrequency 0.05
VerbFrequency 0.07
FrequencyInitialDiphoneWord 1.00
FrequencyInitialDiphoneSyllable 0.98
CorrectLexdec 0.06
FrequencyInitialDiphoneSyllable
RTlexdec -0.04
RTnaming 0.03
Familiarity 0.07
WrittenFrequency 0.09
WrittenSpokenFrequencyRatio 0.01
FamilySize 0.09
DerivationalEntropy 0.03
InflectionalEntropy 0.05
NumberSimplexSynsets 0.12
NumberComplexSynsets 0.05
LengthInLetters 0.15
Ncount 0.02
MeanBigramFrequency 0.20
FrequencyInitialDiphone 0.12
ConspelV 0.12
ConspelN 0.12
ConphonV 0.03
ConphonN 0.05
ConfriendsV 0.14
ConfriendsN 0.12
ConffV -0.01
ConffN -0.01
ConfbV -0.03
ConfbN -0.02
NounFrequency 0.03
VerbFrequency 0.06
FrequencyInitialDiphoneWord 0.98
FrequencyInitialDiphoneSyllable 1.00
CorrectLexdec 0.06
CorrectLexdec
RTlexdec -0.25
RTnaming 0.15
Familiarity 0.53
WrittenFrequency 0.46
WrittenSpokenFrequencyRatio 0.01
FamilySize 0.36
DerivationalEntropy 0.19
InflectionalEntropy 0.18
NumberSimplexSynsets 0.35
NumberComplexSynsets 0.33
LengthInLetters 0.05
Ncount 0.02
MeanBigramFrequency 0.06
FrequencyInitialDiphone 0.05
ConspelV 0.05
ConspelN 0.10
ConphonV 0.02
ConphonN 0.07
ConfriendsV 0.05
ConfriendsN 0.12
ConffV 0.01
ConffN -0.01
ConfbV 0.01
ConfbN 0.00
NounFrequency 0.13
VerbFrequency 0.05
FrequencyInitialDiphoneWord 0.06
FrequencyInitialDiphoneSyllable 0.06
CorrectLexdec 1.00
n= 4568
P
RTlexdec RTnaming Familiarity
RTlexdec 0.0000 0.0000
RTnaming 0.0000 0.0000
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.0071 0.0134 0.0000
FamilySize 0.0000 0.0000 0.0000
DerivationalEntropy 0.0000 0.0008 0.0000
InflectionalEntropy 0.0000 0.1351 0.0000
NumberSimplexSynsets 0.0000 0.0000 0.0000
NumberComplexSynsets 0.0000 0.0000 0.0000
LengthInLetters 0.0008 0.0000 0.0000
Ncount 0.0000 0.0000 0.0000
MeanBigramFrequency 0.8588 0.0011 0.0453
FrequencyInitialDiphone 0.0000 0.0001 0.0000
ConspelV 0.0263 0.0890 0.0000
ConspelN 0.0000 0.0187 0.0000
ConphonV 0.1417 0.9367 0.0003
ConphonN 0.0000 0.3317 0.0000
ConfriendsV 0.0822 0.0608 0.0019
ConfriendsN 0.0000 0.0029 0.0000
ConffV 0.2650 0.5923 0.0001
ConffN 0.7012 0.4408 0.0465
ConfbV 0.1281 0.1954 0.0001
ConfbN 0.2103 0.2503 0.0004
NounFrequency 0.0000 0.0035 0.0000
VerbFrequency 0.0000 0.0965 0.0000
FrequencyInitialDiphoneWord 0.0039 0.1662 0.0000
FrequencyInitialDiphoneSyllable 0.0164 0.0691 0.0000
CorrectLexdec 0.0000 0.0000 0.0000
WrittenFrequency
RTlexdec 0.0000
RTnaming 0.0000
Familiarity 0.0000
WrittenFrequency
WrittenSpokenFrequencyRatio 0.0000
FamilySize 0.0000
DerivationalEntropy 0.0000
InflectionalEntropy 0.0062
NumberSimplexSynsets 0.0000
NumberComplexSynsets 0.0000
LengthInLetters 0.0000
Ncount 0.0000
MeanBigramFrequency 0.0000
FrequencyInitialDiphone 0.0000
ConspelV 0.0000
ConspelN 0.0000
ConphonV 0.0000
ConphonN 0.0000
ConfriendsV 0.1469
ConfriendsN 0.0000
ConffV 0.0000
ConffN 0.0007
ConfbV 0.0000
ConfbN 0.0000
NounFrequency 0.0000
VerbFrequency 0.0000
FrequencyInitialDiphoneWord 0.0000
FrequencyInitialDiphoneSyllable 0.0000
CorrectLexdec 0.0000
WrittenSpokenFrequencyRatio FamilySize
RTlexdec 0.0071 0.0000
RTnaming 0.0134 0.0000
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.0000
FamilySize 0.0000
DerivationalEntropy 0.4578 0.0000
InflectionalEntropy 0.0000 0.0000
NumberSimplexSynsets 0.0000 0.0000
NumberComplexSynsets 0.0000 0.0000
LengthInLetters 0.0000 0.0000
Ncount 0.0000 0.0000
MeanBigramFrequency 0.0000 0.9431
FrequencyInitialDiphone 0.1599 0.0000
ConspelV 0.0000 0.0000
ConspelN 0.0001 0.0000
ConphonV 0.0187 0.0007
ConphonN 0.0806 0.0000
ConfriendsV 0.0000 0.0000
ConfriendsN 0.0002 0.0000
ConffV 0.0708 0.0000
ConffN 0.8181 0.0000
ConfbV 0.0000 0.0007
ConfbN 0.0000 0.0093
NounFrequency 0.3990 0.0000
VerbFrequency 0.0000 0.0000
FrequencyInitialDiphoneWord 0.8591 0.0000
FrequencyInitialDiphoneSyllable 0.3844 0.0000
CorrectLexdec 0.5628 0.0000
DerivationalEntropy InflectionalEntropy
RTlexdec 0.0000 0.0000
RTnaming 0.0008 0.1351
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0062
WrittenSpokenFrequencyRatio 0.4578 0.0000
FamilySize 0.0000 0.0000
DerivationalEntropy 0.0006
InflectionalEntropy 0.0006
NumberSimplexSynsets 0.0000 0.0000
NumberComplexSynsets 0.0000 0.7057
LengthInLetters 0.0000 0.0004
Ncount 0.0000 0.8260
MeanBigramFrequency 0.1591 0.0939
FrequencyInitialDiphone 0.0000 0.0198
ConspelV 0.0018 0.0000
ConspelN 0.0000 0.0015
ConphonV 0.8523 0.0000
ConphonN 0.0000 0.0338
ConfriendsV 0.0506 0.0000
ConfriendsN 0.0000 0.0000
ConffV 0.0055 0.3309
ConffN 0.0064 0.4672
ConfbV 0.6008 0.8466
ConfbN 0.4386 0.6084
NounFrequency 0.0000 0.0000
VerbFrequency 0.1825 0.0000
FrequencyInitialDiphoneWord 0.0463 0.0004
FrequencyInitialDiphoneSyllable 0.0607 0.0006
CorrectLexdec 0.0000 0.0000
NumberSimplexSynsets
RTlexdec 0.0000
RTnaming 0.0000
Familiarity 0.0000
WrittenFrequency 0.0000
WrittenSpokenFrequencyRatio 0.0000
FamilySize 0.0000
DerivationalEntropy 0.0000
InflectionalEntropy 0.0000
NumberSimplexSynsets
NumberComplexSynsets 0.0000
LengthInLetters 0.6672
Ncount 0.0000
MeanBigramFrequency 0.0003
FrequencyInitialDiphone 0.0001
ConspelV 0.0000
ConspelN 0.0000
ConphonV 0.0000
ConphonN 0.0000
ConfriendsV 0.0000
ConfriendsN 0.0000
ConffV 0.0142
ConffN 0.1237
ConfbV 0.1447
ConfbN 0.9889
NounFrequency 0.0000
VerbFrequency 0.0000
FrequencyInitialDiphoneWord 0.0000
FrequencyInitialDiphoneSyllable 0.0000
CorrectLexdec 0.0000
NumberComplexSynsets LengthInLetters
RTlexdec 0.0000 0.0008
RTnaming 0.0000 0.0000
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.0000 0.0000
FamilySize 0.0000 0.0000
DerivationalEntropy 0.0000 0.0000
InflectionalEntropy 0.7057 0.0004
NumberSimplexSynsets 0.0000 0.6672
NumberComplexSynsets 0.0000
LengthInLetters 0.0000
Ncount 0.0000 0.0000
MeanBigramFrequency 0.1107 0.0000
FrequencyInitialDiphone 0.0000 0.0000
ConspelV 0.0000 0.0000
ConspelN 0.0000 0.0000
ConphonV 0.0012 0.0000
ConphonN 0.0000 0.0000
ConfriendsV 0.0107 0.0000
ConfriendsN 0.0000 0.0000
ConffV 0.0000 0.1935
ConffN 0.0001 0.4670
ConfbV 0.0004 0.0068
ConfbN 0.0009 0.0000
NounFrequency 0.0000 0.0169
VerbFrequency 0.0000 0.0000
FrequencyInitialDiphoneWord 0.0000 0.0000
FrequencyInitialDiphoneSyllable 0.0015 0.0000
CorrectLexdec 0.0000 0.0017
Ncount MeanBigramFrequency
RTlexdec 0.0000 0.8588
RTnaming 0.0000 0.0011
Familiarity 0.0000 0.0453
WrittenFrequency 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.0000 0.0000
FamilySize 0.0000 0.9431
DerivationalEntropy 0.0000 0.1591
InflectionalEntropy 0.8260 0.0939
NumberSimplexSynsets 0.0000 0.0003
NumberComplexSynsets 0.0000 0.1107
LengthInLetters 0.0000 0.0000
Ncount 0.0000
MeanBigramFrequency 0.0000
FrequencyInitialDiphone 0.0000 0.0000
ConspelV 0.0000 0.0000
ConspelN 0.0000 0.0000
ConphonV 0.0000 0.0000
ConphonN 0.0000 0.0000
ConfriendsV 0.0000 0.0000
ConfriendsN 0.0000 0.0009
ConffV 0.0000 0.0000
ConffN 0.0000 0.0000
ConfbV 0.0149 0.8417
ConfbN 0.0044 0.1865
NounFrequency 0.0153 0.0034
VerbFrequency 0.0003 0.0019
FrequencyInitialDiphoneWord 0.5939 0.0000
FrequencyInitialDiphoneSyllable 0.1422 0.0000
CorrectLexdec 0.2782 0.0000
FrequencyInitialDiphone ConspelV
RTlexdec 0.0000 0.0263
RTnaming 0.0001 0.0890
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.1599 0.0000
FamilySize 0.0000 0.0000
DerivationalEntropy 0.0000 0.0018
InflectionalEntropy 0.0198 0.0000
NumberSimplexSynsets 0.0001 0.0000
NumberComplexSynsets 0.0000 0.0000
LengthInLetters 0.0000 0.0000
Ncount 0.0000 0.0000
MeanBigramFrequency 0.0000 0.0000
FrequencyInitialDiphone 0.0002
ConspelV 0.0002
ConspelN 0.0364 0.0000
ConphonV 0.3361 0.0000
ConphonN 0.4821 0.0000
ConfriendsV 0.0008 0.0000
ConfriendsN 0.0144 0.0000
ConffV 0.9908 0.0000
ConffN 0.8845 0.0000
ConfbV 0.0239 0.0008
ConfbN 0.1957 0.0656
NounFrequency 0.0000 0.2518
VerbFrequency 0.0002 0.0000
FrequencyInitialDiphoneWord 0.0000 0.0000
FrequencyInitialDiphoneSyllable 0.0000 0.0000
CorrectLexdec 0.0010 0.0008
ConspelN ConphonV ConphonN ConfriendsV
RTlexdec 0.0000 0.1417 0.0000 0.0822
RTnaming 0.0187 0.9367 0.3317 0.0608
Familiarity 0.0000 0.0003 0.0000 0.0019
WrittenFrequency 0.0000 0.0000 0.0000 0.1469
WrittenSpokenFrequencyRatio 0.0001 0.0187 0.0806 0.0000
FamilySize 0.0000 0.0007 0.0000 0.0000
DerivationalEntropy 0.0000 0.8523 0.0000 0.0506
InflectionalEntropy 0.0015 0.0000 0.0338 0.0000
NumberSimplexSynsets 0.0000 0.0000 0.0000 0.0000
NumberComplexSynsets 0.0000 0.0012 0.0000 0.0107
LengthInLetters 0.0000 0.0000 0.0000 0.0000
Ncount 0.0000 0.0000 0.0000 0.0000
MeanBigramFrequency 0.0000 0.0000 0.0000 0.0000
FrequencyInitialDiphone 0.0364 0.3361 0.4821 0.0008
ConspelV 0.0000 0.0000 0.0000 0.0000
ConspelN 0.0000 0.0000 0.0000
ConphonV 0.0000 0.0000 0.0000
ConphonN 0.0000 0.0000 0.0000
ConfriendsV 0.0000 0.0000 0.0000
ConfriendsN 0.0000 0.0000 0.0000 0.0000
ConffV 0.0000 0.0074 0.0000 0.0000
ConffN 0.0000 0.0000 0.0000 0.0000
ConfbV 0.0000 0.0000 0.0000 0.4597
ConfbN 0.0000 0.0000 0.0000 0.4688
NounFrequency 0.0000 0.5535 0.0000 0.1293
VerbFrequency 0.0000 0.0000 0.0000 0.7902
FrequencyInitialDiphoneWord 0.0000 0.0432 0.0003 0.0000
FrequencyInitialDiphoneSyllable 0.0000 0.0230 0.0002 0.0000
CorrectLexdec 0.0000 0.1609 0.0000 0.0020
ConfriendsN ConffV ConffN ConfbV ConfbN
RTlexdec 0.0000 0.2650 0.7012 0.1281 0.2103
RTnaming 0.0029 0.5923 0.4408 0.1954 0.2503
Familiarity 0.0000 0.0001 0.0465 0.0001 0.0004
WrittenFrequency 0.0000 0.0000 0.0007 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.0002 0.0708 0.8181 0.0000 0.0000
FamilySize 0.0000 0.0000 0.0000 0.0007 0.0093
DerivationalEntropy 0.0000 0.0055 0.0064 0.6008 0.4386
InflectionalEntropy 0.0000 0.3309 0.4672 0.8466 0.6084
NumberSimplexSynsets 0.0000 0.0142 0.1237 0.1447 0.9889
NumberComplexSynsets 0.0000 0.0000 0.0001 0.0004 0.0009
LengthInLetters 0.0000 0.1935 0.4670 0.0068 0.0000
Ncount 0.0000 0.0000 0.0000 0.0149 0.0044
MeanBigramFrequency 0.0009 0.0000 0.0000 0.8417 0.1865
FrequencyInitialDiphone 0.0144 0.9908 0.8845 0.0239 0.1957
ConspelV 0.0000 0.0000 0.0000 0.0008 0.0656
ConspelN 0.0000 0.0000 0.0000 0.0000 0.0000
ConphonV 0.0000 0.0074 0.0000 0.0000 0.0000
ConphonN 0.0000 0.0000 0.0000 0.0000 0.0000
ConfriendsV 0.0000 0.0000 0.0000 0.4597 0.4688
ConfriendsN 0.6587 0.1567 0.0000 0.0000
ConffV 0.6587 0.0000 0.0000 0.0000
ConffN 0.1567 0.0000 0.0000 0.0000
ConfbV 0.0000 0.0000 0.0000 0.0000
ConfbN 0.0000 0.0000 0.0000 0.0000
NounFrequency 0.0000 0.0131 0.4869 0.1428 0.0882
VerbFrequency 0.0000 0.0000 0.0000 0.0006 0.0259
FrequencyInitialDiphoneWord 0.0000 0.6937 0.8069 0.1784 0.2019
FrequencyInitialDiphoneSyllable 0.0000 0.4569 0.5551 0.0678 0.0990
CorrectLexdec 0.0000 0.6223 0.6263 0.7155 0.7901
NounFrequency VerbFrequency
RTlexdec 0.0000 0.0000
RTnaming 0.0035 0.0965
Familiarity 0.0000 0.0000
WrittenFrequency 0.0000 0.0000
WrittenSpokenFrequencyRatio 0.3990 0.0000
FamilySize 0.0000 0.0000
DerivationalEntropy 0.0000 0.1825
InflectionalEntropy 0.0000 0.0000
NumberSimplexSynsets 0.0000 0.0000
NumberComplexSynsets 0.0000 0.0000
LengthInLetters 0.0169 0.0000
Ncount 0.0153 0.0003
MeanBigramFrequency 0.0034 0.0019
FrequencyInitialDiphone 0.0000 0.0002
ConspelV 0.2518 0.0000
ConspelN 0.0000 0.0000
ConphonV 0.5535 0.0000
ConphonN 0.0000 0.0000
ConfriendsV 0.1293 0.7902
ConfriendsN 0.0000 0.0000
ConffV 0.0131 0.0000
ConffN 0.4869 0.0000
ConfbV 0.1428 0.0006
ConfbN 0.0882 0.0259
NounFrequency 0.8332
VerbFrequency 0.8332
FrequencyInitialDiphoneWord 0.0013 0.0000
FrequencyInitialDiphoneSyllable 0.0204 0.0002
CorrectLexdec 0.0000 0.0007
FrequencyInitialDiphoneWord
RTlexdec 0.0039
RTnaming 0.1662
Familiarity 0.0000
WrittenFrequency 0.0000
WrittenSpokenFrequencyRatio 0.8591
FamilySize 0.0000
DerivationalEntropy 0.0463
InflectionalEntropy 0.0004
NumberSimplexSynsets 0.0000
NumberComplexSynsets 0.0000
LengthInLetters 0.0000
Ncount 0.5939
MeanBigramFrequency 0.0000
FrequencyInitialDiphone 0.0000
ConspelV 0.0000
ConspelN 0.0000
ConphonV 0.0432
ConphonN 0.0003
ConfriendsV 0.0000
ConfriendsN 0.0000
ConffV 0.6937
ConffN 0.8069
ConfbV 0.1784
ConfbN 0.2019
NounFrequency 0.0013
VerbFrequency 0.0000
FrequencyInitialDiphoneWord
FrequencyInitialDiphoneSyllable 0.0000
CorrectLexdec 0.0000
FrequencyInitialDiphoneSyllable
RTlexdec 0.0164
RTnaming 0.0691
Familiarity 0.0000
WrittenFrequency 0.0000
WrittenSpokenFrequencyRatio 0.3844
FamilySize 0.0000
DerivationalEntropy 0.0607
InflectionalEntropy 0.0006
NumberSimplexSynsets 0.0000
NumberComplexSynsets 0.0015
LengthInLetters 0.0000
Ncount 0.1422
MeanBigramFrequency 0.0000
FrequencyInitialDiphone 0.0000
ConspelV 0.0000
ConspelN 0.0000
ConphonV 0.0230
ConphonN 0.0002
ConfriendsV 0.0000
ConfriendsN 0.0000
ConffV 0.4569
ConffN 0.5551
ConfbV 0.0678
ConfbN 0.0990
NounFrequency 0.0204
VerbFrequency 0.0002
FrequencyInitialDiphoneWord 0.0000
FrequencyInitialDiphoneSyllable
CorrectLexdec 0.0001
CorrectLexdec
RTlexdec 0.0000
RTnaming 0.0000
Familiarity 0.0000
WrittenFrequency 0.0000
WrittenSpokenFrequencyRatio 0.5628
FamilySize 0.0000
DerivationalEntropy 0.0000
InflectionalEntropy 0.0000
NumberSimplexSynsets 0.0000
NumberComplexSynsets 0.0000
LengthInLetters 0.0017
Ncount 0.2782
MeanBigramFrequency 0.0000
FrequencyInitialDiphone 0.0010
ConspelV 0.0008
ConspelN 0.0000
ConphonV 0.1609
ConphonN 0.0000
ConfriendsV 0.0020
ConfriendsN 0.0000
ConffV 0.6223
ConffN 0.6263
ConfbV 0.7155
ConfbN 0.7901
NounFrequency 0.0000
VerbFrequency 0.0007
FrequencyInitialDiphoneWord 0.0000
FrequencyInitialDiphoneSyllable 0.0001
CorrectLexdec
# use corrplot to obtain a nice correlation plot!
corrplot(corr$r, p.mat = corr$P,
addCoef.col = "black", diag = FALSE, type = "upper", tl.srt = 55)
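To make the r and P matrices above less opaque, here is a minimal sketch of what a single cell contains, assuming the default Pearson correlation was used. It uses simulated toy data (not the `english` dataset) and base R only; the variable names are hypothetical.

```r
# Toy data (simulated for illustration only)
set.seed(1)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

# Pearson correlation and its significance test for one pair of variables
ct <- cor.test(x, y)
r <- unname(ct$estimate)
p <- ct$p.value

# The same p value by hand: a t statistic with n - 2 degrees of freedom
n <- length(x)
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)

round(c(r = r, p = p, p_manual = p_manual), 4)
```

With a sample as large as n = 4568, even very small correlations can reach p < 0.05, so the size of r is worth reading alongside the p value.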
Up to now, we have looked at descriptive statistics and evaluated summaries and correlations in the data (with p values).
We are now interested in looking at group differences.
A linear model runs a regression analysis on the data. We have an outcome (or dependent variable) and a predictor (or independent variable). The formula of a linear model is written as outcome ~ predictor,
which can be read as “outcome as a function of the predictor”. We can add “1” to specify an intercept explicitly, but it is added to the model by default.
english2 <- english %>%
  mutate(AgeSubject = factor(AgeSubject, levels = c("young", "old")))
mdl.lm <- english2 %>%
  lm(RTlexdec ~ AgeSubject, data = .)
#lm(RTlexdec ~ AgeSubject, data = english)
mdl.lm #also print(mdl.lm)
Call:
lm(formula = RTlexdec ~ AgeSubject, data = .)
Coefficients:
(Intercept) AgeSubjectold
6.4392 0.2217
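The point about the intercept being added by default can be checked with a small sketch. The data are simulated and the names (`d`, `group`, `y`) are hypothetical; only base R is used.

```r
# Toy data (simulated for illustration only)
set.seed(2)
d <- data.frame(group = factor(rep(c("a", "b"), each = 20)))
d$y <- ifelse(d$group == "a", 6.4, 6.6) + rnorm(40, sd = 0.1)

m_implicit <- lm(y ~ group, data = d)      # intercept added by default
m_explicit <- lm(y ~ 1 + group, data = d)  # intercept written out
m_none     <- lm(y ~ 0 + group, data = d)  # intercept removed

coef(m_implicit)  # intercept plus difference between groups
coef(m_explicit)  # identical to the implicit version
coef(m_none)      # one coefficient per group (the group means)
```

Removing the intercept with `0 +` changes what the coefficients mean, not how well the model fits the group means.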
summary(mdl.lm)
Call:
lm(formula = RTlexdec ~ AgeSubject, data = .)
Residuals:
Min 1Q Median 3Q Max
-0.25776 -0.08339 -0.01669 0.06921 0.52685
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.439237 0.002324 2771.03 <2e-16 ***
AgeSubjectold 0.221721 0.003286 67.47 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1111 on 4566 degrees of freedom
Multiple R-squared: 0.4992, Adjusted R-squared: 0.4991
F-statistic: 4552 on 1 and 4566 DF, p-value: < 2.2e-16
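The coefficient is labelled AgeSubjectold because “young” was set as the first (reference) level of the factor. A minimal sketch, on simulated data with hypothetical names, of how the level order changes the sign and label of the coefficient but not the fitted values:

```r
# Toy data (simulated for illustration only)
set.seed(3)
d <- data.frame(age = rep(c("old", "young"), each = 25))
d$rt <- ifelse(d$age == "old", 6.66, 6.44) + rnorm(50, sd = 0.1)

# Default: levels are sorted alphabetically, so "old" is the reference level
d$age_default <- factor(d$age)
# Explicit order: "young" becomes the reference level
d$age_young_first <- factor(d$age, levels = c("young", "old"))

m_default <- lm(rt ~ age_default, data = d)
m_young   <- lm(rt ~ age_young_first, data = d)

coef(m_default)  # second coefficient is named age_defaultyoung
coef(m_young)    # second coefficient is named age_young_firstold
```

The two second coefficients have the same size and opposite signs, since each is the difference from its own reference level.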
# from library(broom)
tidy(mdl.lm) %>%
  select(term, estimate) %>%
  mutate(estimate = round(estimate, 3))
mycoefE <- tidy(mdl.lm) %>% pull(estimate)
Obtaining mean values from our model
#young (the reference level, i.e. the intercept)
mycoefE[1]
[1] 6.439237
#old (intercept + AgeSubjectold)
mycoefE[1] + mycoefE[2]
[1] 6.660958
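That these two sums really are the group means can be verified directly. The sketch below uses simulated data with hypothetical names and compares the values derived from the coefficients with the means computed from the data.

```r
# Toy data (simulated for illustration only)
set.seed(4)
d <- data.frame(age = factor(rep(c("young", "old"), each = 30),
                             levels = c("young", "old")))
d$rt <- ifelse(d$age == "young", 6.44, 6.66) + rnorm(60, sd = 0.1)

m <- lm(rt ~ age, data = d)
b <- coef(m)

mean_young <- unname(b[1])         # intercept = mean of the reference level
mean_old   <- unname(b[1] + b[2])  # intercept + coefficient = mean of the other level

# Group means computed directly from the data, for comparison
group_means <- tapply(d$rt, d$age, mean)
group_means
c(young = mean_young, old = mean_old)
```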
We can also obtain a nice table of our model summary. We can use the package knitr or xtable.
kable(summary(mdl.lm)$coef, digits = 3)
| Estimate | Std. Error | t value | Pr(>|t|) |
---|---|---|---|---|
(Intercept) | 6.439 | 0.002 | 2771.027 | 0 |
AgeSubjectold | 0.222 | 0.003 | 67.468 | 0 |
tidy output
mdl.lmT <- tidy(mdl.lm)
kable(mdl.lmT, digits = 3)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 6.439 | 0.002 | 2771.027 | 0 |
AgeSubjectold | 0.222 | 0.003 | 67.468 | 0 |
Let us dissect the model. If you use “str”, you will be able to see what is stored in our linear model and which components you can access.
str(mdl.lm)
List of 13
$ coefficients : Named num [1:2] 6.439 0.222
..- attr(*, "names")= chr [1:2] "(Intercept)" "AgeSubjectold"
$ residuals : Named num [1:4568] 0.1045 -0.0416 -0.1343 -0.015 0.0114 ...
..- attr(*, "names")= chr [1:4568] "1" "2" "3" "4" ...
$ effects : Named num [1:4568] -442.7013 7.4927 -0.1352 -0.0159 0.0105 ...
..- attr(*, "names")= chr [1:4568] "(Intercept)" "AgeSubjectold" "" "" ...
$ rank : int 2
$ fitted.values: Named num [1:4568] 6.44 6.44 6.44 6.44 6.44 ...
..- attr(*, "names")= chr [1:4568] "1" "2" "3" "4" ...
$ assign : int [1:2] 0 1
$ qr :List of 5
..$ qr : num [1:4568, 1:2] -67.587 0.0148 0.0148 0.0148 0.0148 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4568] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:2] "(Intercept)" "AgeSubjectold"
.. ..- attr(*, "assign")= int [1:2] 0 1
.. ..- attr(*, "contrasts")=List of 1
.. .. ..$ AgeSubject: chr "contr.treatment"
..$ qraux: num [1:2] 1.01 1.01
..$ pivot: int [1:2] 1 2
..$ tol : num 1e-07
..$ rank : int 2
..- attr(*, "class")= chr "qr"
$ df.residual : int 4566
$ contrasts :List of 1
..$ AgeSubject: chr "contr.treatment"
$ xlevels :List of 1
..$ AgeSubject: chr [1:2] "young" "old"
$ call : language lm(formula = RTlexdec ~ AgeSubject, data = .)
$ terms :Classes 'terms', 'formula' language RTlexdec ~ AgeSubject
.. ..- attr(*, "variables")= language list(RTlexdec, AgeSubject)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "RTlexdec" "AgeSubject"
.. .. .. ..$ : chr "AgeSubject"
.. ..- attr(*, "term.labels")= chr "AgeSubject"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: 0x000001859f41a520>
.. ..- attr(*, "predvars")= language list(RTlexdec, AgeSubject)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. ..- attr(*, "names")= chr [1:2] "RTlexdec" "AgeSubject"
$ model :'data.frame': 4568 obs. of 2 variables:
..$ RTlexdec : num [1:4568] 6.54 6.4 6.3 6.42 6.45 ...
..$ AgeSubject: Factor w/ 2 levels "young","old": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "terms")=Classes 'terms', 'formula' language RTlexdec ~ AgeSubject
.. .. ..- attr(*, "variables")= language list(RTlexdec, AgeSubject)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "RTlexdec" "AgeSubject"
.. .. .. .. ..$ : chr "AgeSubject"
.. .. ..- attr(*, "term.labels")= chr "AgeSubject"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: 0x000001859f41a520>
.. .. ..- attr(*, "predvars")= language list(RTlexdec, AgeSubject)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. .. ..- attr(*, "names")= chr [1:2] "RTlexdec" "AgeSubject"
- attr(*, "class")= chr "lm"
coef(mdl.lm)
(Intercept) AgeSubjectold
6.4392366 0.2217215
## same as
## mdl.lm$coefficients
What if I want to obtain the “Intercept”? Or the coefficient for AgeSubject? What if I want the full row for AgeSubject?
coef(mdl.lm)[1] # same as mdl.lm$coefficients[1]
(Intercept)
6.439237
coef(mdl.lm)[2] # same as mdl.lm$coefficients[2]
AgeSubjectold
0.2217215
summary(mdl.lm)$coefficients[2, ] # full row
Estimate Std. Error t value Pr(>|t|)
0.22172146 0.00328631 67.46820211 0.00000000
summary(mdl.lm)$coefficients[2, 4] #for p value
[1] 0
What about residuals (the difference between each observed value and the value fitted by the model) and fitted values? Plotting them allows us to evaluate how closely the residuals follow a normal distribution.
hist(residuals(mdl.lm))
qqnorm(residuals(mdl.lm)); qqline(residuals(mdl.lm))
plot(fitted(mdl.lm), residuals(mdl.lm), cex = 4)
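As a minimal sketch of what these functions return, the code below recomputes the residuals by hand on simulated data (hypothetical names, base R only). The Shapiro-Wilk test at the end is one possible formal check of normality, added here as an illustration rather than as part of the original analysis.

```r
# Toy data (simulated for illustration only)
set.seed(5)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)
m <- lm(y ~ x)

# Residual = observed value minus fitted value
res_manual <- y - fitted(m)
all.equal(unname(res_manual), unname(residuals(m)))

# With an intercept in the model, least-squares residuals average to zero
mean(residuals(m))

# One possible formal check of normality (works for 3 to 5000 observations)
shapiro.test(residuals(m))
```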
AIC(mdl.lm) # Akaike's Information Criterion, lower values are better
[1] -7110.962
BIC(mdl.lm) # Bayesian Information Criterion, lower values are better
[1] -7091.682
logLik(mdl.lm) # log likelihood
'log Lik.' 3558.481 (df=3)
Or use the following from broom
glance(mdl.lm)
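AIC and BIC are both derived from the log likelihood and the number of estimated parameters. With the values reported above, -2 × 3558.481 + 2 × 3 = -7110.962, which matches the AIC. The sketch below shows the same arithmetic on simulated data (hypothetical names, base R only).

```r
# Toy data (simulated for illustration only)
set.seed(6)
x <- rnorm(80)
y <- 0.5 + 1.5 * x + rnorm(80)
m <- lm(y ~ x)

ll <- logLik(m)
k <- attr(ll, "df")  # estimated parameters: coefficients + residual variance
n <- nobs(m)         # number of observations

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

c(AIC = AIC(m), manual = aic_manual)
c(BIC = BIC(m), manual = bic_manual)
```

BIC penalises each extra parameter by log(n) instead of 2, so it favours simpler models more strongly as the sample grows.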
Are the above informative? Not directly: these values are mainly useful when comparing models. If we want to test for the overall significance of the model, we run a null model (aka intercept only) and compare the two models.
mdl.lm.Null <- english %>%
  lm(RTlexdec ~ 1, data = .)
mdl.comp <- anova(mdl.lm.Null, mdl.lm)
mdl.comp
Analysis of Variance Table
Model 1: RTlexdec ~ 1
Model 2: RTlexdec ~ AgeSubject
Res.Df RSS Df Sum of Sq F Pr(>F)
1 4567 112.456
2 4566 56.314 1 56.141 4552 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The results show that adding the variable “AgeSubject” improves the model fit. We can write this as follows: Model comparison showed that the addition of AgeSubject improved the model fit when compared with an intercept-only model (F(1, 4566) = 4552, p < 2.2e-16).
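The F value in the comparison table can be reproduced from the residual sums of squares: ((112.456 - 56.314) / 1) / (56.314 / 4566) is approximately 4552. A minimal sketch of the same calculation on simulated data (hypothetical names, base R only):

```r
# Toy data (simulated for illustration only)
set.seed(7)
d <- data.frame(g = factor(rep(c("a", "b"), each = 40)))
d$y <- ifelse(d$g == "a", 6.4, 6.6) + rnorm(80, sd = 0.1)

m0 <- lm(y ~ 1, data = d)  # null (intercept-only) model
m1 <- lm(y ~ g, data = d)  # model with the predictor

rss0 <- sum(residuals(m0)^2); df0 <- df.residual(m0)
rss1 <- sum(residuals(m1)^2); df1 <- df.residual(m1)

# F = (drop in RSS per extra parameter) / (residual variance of the larger model)
f_manual <- ((rss0 - rss1) / (df0 - df1)) / (rss1 / df1)
p_manual <- pf(f_manual, df0 - df1, df1, lower.tail = FALSE)

cmp <- anova(m0, m1)
c(F_anova = cmp$F[2], F_manual = f_manual)
```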
We can use the p values generated from our linear model to add significance levels on a plot. We use the code from above and add the significance level. We also add a trend line.
english %>%
  ggplot(aes(x = AgeSubject, y = RTlexdec))+
  geom_boxplot()+
  theme_bw() + theme(text = element_text(size = 15))+
  geom_smooth(aes(x = as.numeric(AgeSubject), y = predict(mdl.lm)), method = "lm", color = "blue") +
  geom_signif(comparisons = list(c("old", "young")),
              map_signif_level = TRUE, test = function(a, b) {
                list(p.value = summary(mdl.lm)$coefficients[2, 4])
              })
`geom_smooth()` using formula 'y ~ x'
Linear models require a numeric outcome, but the predictor can be either numeric or a factor. We can have more than one predictor; the only issue is that this complicates the interpretation of the results.
english %>%
  lm(RTlexdec ~ AgeSubject * WordCategory, data = .) %>%
  summary()
Call:
lm(formula = RTlexdec ~ AgeSubject * WordCategory, data = .)
Residuals:
Min 1Q Median 3Q Max
-0.25079 -0.08273 -0.01516 0.06940 0.52285
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.664955 0.002911 2289.950 <2e-16 ***
AgeSubjectyoung -0.220395 0.004116 -53.545 <2e-16 ***
WordCategoryV -0.010972 0.004822 -2.275 0.0229 *
AgeSubjectyoung:WordCategoryV -0.003642 0.006820 -0.534 0.5934
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1109 on 4564 degrees of freedom
Multiple R-squared: 0.5008, Adjusted R-squared: 0.5005
F-statistic: 1526 on 3 and 4564 DF, p-value: < 2.2e-16
And with an Anova
english %>%
  lm(RTlexdec ~ AgeSubject * WordCategory, data = .) %>%
  anova()
Analysis of Variance Table
Response: RTlexdec
Df Sum Sq Mean Sq F value Pr(>F)
AgeSubject 1 56.141 56.141 4564.2810 < 2.2e-16 ***
WordCategory 1 0.173 0.173 14.0756 0.0001778 ***
AgeSubject:WordCategory 1 0.004 0.004 0.2851 0.5933724
Residuals 4564 56.138 0.012
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The results above tell us that the main effects of AgeSubject and WordCategory are statistically significant, whereas their interaction is not (p = 0.59).
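To help with reading a model that contains an interaction, here is a minimal sketch on simulated data (hypothetical names, base R only). It shows that `*` in a formula is shorthand for both main effects plus their interaction, and how a cell mean is rebuilt from the coefficients.

```r
# Toy data (simulated for illustration only): 2 x 2 design, 25 values per cell
set.seed(8)
d <- expand.grid(age = c("old", "young"), wordcat = c("N", "V"), rep = 1:25)
d$y <- 6.66 - 0.22 * (d$age == "young") - 0.01 * (d$wordcat == "V") +
  rnorm(nrow(d), sd = 0.1)

m_star <- lm(y ~ age * wordcat, data = d)                # shorthand
m_long <- lm(y ~ age + wordcat + age:wordcat, data = d)  # written out in full

b <- coef(m_star)
b

# Mean of the young/V cell = intercept + both main effects + interaction
cell_young_V <- sum(b)
observed <- mean(d$y[d$age == "young" & d$wordcat == "V"])
c(from_model = cell_young_V, from_data = observed)
```

In a model with an interaction, each main-effect coefficient describes the effect at the reference level of the other factor, which is why the coefficient table and the Anova table can give different p values for the same term.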
This is the end of the fourth session. We continued working with the Tidyverse to obtain visualisations and added more complex specifications. We then looked at correlation tests and inferential statistics. We looked at a t-test, an anova and a linear model.
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] ggsignif_0.6.3 xtable_1.8-4 knitr_1.36 broom_0.7.10
[5] corrplot_0.90 Hmisc_4.6-0 Formula_1.2-4 survival_3.2-13
[9] lattice_0.20-45 languageR_1.5.0 forcats_0.5.1 stringr_1.4.0
[13] dplyr_1.0.7 purrr_0.3.4 readr_2.0.2 tidyr_1.1.4
[17] tibble_3.1.5 ggplot2_3.3.5 tidyverse_1.3.1
loaded via a namespace (and not attached):
[1] nlme_3.1-153 fs_1.5.0 lubridate_1.8.0
[4] RColorBrewer_1.1-2 httr_1.4.2 tools_4.1.2
[7] backports_1.3.0 bslib_0.3.1 utf8_1.2.2
[10] R6_2.5.1 rpart_4.1-15 DBI_1.1.1
[13] mgcv_1.8-38 colorspace_2.0-2 nnet_7.3-16
[16] withr_2.4.2 gridExtra_2.3 tidyselect_1.1.1
[19] compiler_4.1.2 cli_3.1.0 rvest_1.0.2
[22] htmlTable_2.3.0 xml2_1.3.2 labeling_0.4.2
[25] sass_0.4.0 checkmate_2.0.0 scales_1.1.1
[28] psycho_0.6.1 PresenceAbsence_1.1.9 digest_0.6.28
[31] foreign_0.8-81 minqa_1.2.4 rmarkdown_2.11
[34] base64enc_0.1-3 jpeg_0.1-9 pkgconfig_2.0.3
[37] htmltools_0.5.2 lme4_1.1-27.1 highr_0.9
[40] dbplyr_2.1.1 fastmap_1.1.0 htmlwidgets_1.5.4
[43] rlang_0.4.12 readxl_1.3.1 rstudioapi_0.13
[46] jquerylib_0.1.4 farver_2.1.0 generics_0.1.1
[49] jsonlite_1.7.2 magrittr_2.0.1 Matrix_1.3-4
[52] Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0
[55] lifecycle_1.0.1 stringi_1.7.5 yaml_2.2.1
[58] MASS_7.3-54 grid_4.1.2 crayon_1.4.2
[61] haven_2.4.3 splines_4.1.2 hms_1.1.1
[64] pillar_1.6.4 boot_1.3-28 reprex_2.0.1
[67] glue_1.4.2 evaluate_0.14 latticeExtra_0.6-29
[70] data.table_1.14.2 modelr_0.1.8 vctrs_0.3.8
[73] png_0.1-7 nloptr_1.2.2.2 tzdb_0.2.0
[76] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[79] xfun_0.27 cluster_2.1.2 ellipsis_0.3.2