bfKanaiCenter <- anovaBF(latency ~ stimulation*leg*direction + subject, data = data.frame(filter(latencyMedianBaseline, type == "center")), whichModels = "withmain", whichRandom = "subject", progress = FALSE, iterations = 100000) # compute Bayes factors
bfKanaiCenter <- sort(bfKanaiCenter, decreasing = TRUE) # sort such that the winning model is at the top
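Because the object is sorted, the winning model can be compared directly to the runner-up by dividing the two (an optional check, not part of the original output; dividing indexed BFBayesFactor objects is standard BayesFactor functionality):

bfKanaiCenter[1] / bfKanaiCenter[2] # Bayes factor of the best model over the second-best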
Interestingly, there is a lot more evidence across the board, especially for the models that feature Stimulation and Direction.
There seems to be a mismatch with the classical ANOVA, though: the effect of stimulation receives strong support here, whereas it was non-significant there. Conversely, the effect of direction was significant, but receives less strong support from the inclusion Bayes factor (although the evidence still points in the same direction).
Main effect of stimulation
latencyMedianBaseline %>%
  filter(type == "center") %>% # keep only center saccades
  group_by(subject, stimulation) %>% # for each subject and stimulation session
  summarise(latency = mean(latency)) %>% # average over all other variables
  ggplot(aes(stimulation, latency)) +
  geom_hline(yintercept = 0, linetype = "dashed") + # no change from baseline
  stat_summary(fun = mean, geom = "line", aes(group = 1), size = 2) + # group average
  geom_line(aes(colour = subject, group = subject)) # individual subjects
It is easy to see why this effect is non-significant: the average difference is tiny and there is a lot of between-subject variability. The plot does show one major outlier in terms of effect size (S01): let’s see what happens to the Bayes factors if we take their data out.
latencyNoS01 <- latencyMedianBaseline %>%
  filter(subject != "S01") %>% # exclude the outlying subject
  mutate(subject = factor(subject)) # refactor to drop the unused level
bfKanaiCenterNoS01 <- anovaBF(latency ~ stimulation*leg*direction + subject, data = data.frame(filter(latencyNoS01, type == "center")), whichModels = "withmain", whichRandom = "subject", progress = FALSE, iterations = 100000) # compute Bayes factors
bfKanaiCenterNoS01 <- sort(bfKanaiCenterNoS01, decreasing = TRUE) # sort such that the winning model is at the top
kable(select(extractBF(bfKanaiCenterNoS01), bf)) # show only the Bayes factors in a table
| model | bf |
|:------|-----:|
| stimulation + direction + subject | 1370.5021891 |
| direction + subject | 610.7330230 |
| stimulation + direction + leg + subject | 531.6196659 |
| stimulation + direction + stimulation:direction + subject | 352.9869220 |
| direction + leg + subject | 200.7493579 |
| stimulation + direction + stimulation:direction + leg + subject | 123.1974214 |
| stimulation + direction + leg + direction:leg + subject | 63.0498822 |
| stimulation + direction + leg + stimulation:leg + subject | 42.8446155 |
| direction + leg + direction:leg + subject | 24.4897141 |
| stimulation + direction + stimulation:direction + leg + direction:leg + subject | 14.9256981 |
| stimulation + direction + stimulation:direction + leg + stimulation:leg + subject | 8.3443542 |
| stimulation + direction + leg + stimulation:leg + direction:leg + subject | 4.0678602 |
| stimulation + subject | 1.9848227 |
| stimulation + direction + stimulation:direction + leg + stimulation:leg + direction:leg + subject | 1.0168697 |
| stimulation + leg + subject | 0.6129997 |
| leg + subject | 0.3006916 |
| stimulation + direction + stimulation:direction + leg + stimulation:leg + direction:leg + stimulation:direction:leg + subject | 0.1633244 |
| stimulation + leg + stimulation:leg + subject | 0.0422355 |
kable(inclusionBF(bfKanaiCenterNoS01, models = "matched"))
| effect | Inclusion BF |
|:-------|-------------:|
| stimulation | 2.3502132 |
| direction | 699.4732539 |
| stimulation:direction | 0.2487328 |
| leg | 0.3664546 |
| stimulation:leg | 0.0767869 |
| direction:leg | 0.1186097 |
| stimulation:direction:leg | 0.1606149 |
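As a sanity check, the matched inclusion Bayes factor can be computed by hand. The sketch below assumes equal prior odds across the models involved (which is what `inclusionBF()`, this analysis's own helper, is taken to assume) and reproduces the stimulation value in the table above:

bfTable <- extractBF(bfKanaiCenterNoS01) # BF of each model vs. the subject-only model
terms <- strsplit(rownames(bfTable), " \\+ ") # split model names into terms
hasStim <- sapply(terms, function(t) "stimulation" %in% t)
hasStimInt <- sapply(terms, function(t) any(grepl(":", t) & grepl("stimulation", t)))
keep <- hasStim & !hasStimInt # models with the main effect but none of its interactions
matched <- sapply(terms[keep], function(t) paste(setdiff(t, "stimulation"), collapse = " + ")) # the same models with the main effect stripped
bfWith <- bfTable$bf[keep]
bfWithout <- ifelse(matched == "subject", 1, bfTable$bf[match(matched, rownames(bfTable))]) # the subject-only model is the denominator, so its BF is 1
sum(bfWith) / sum(bfWithout) # inclusion BF for stimulation: ~2.35, as in the table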
Removing this subject completely abolishes the strong support for Stimulation and greatly strengthens the support for Direction, bringing the Bayesian and the classical ANOVAs more in line. At present it is unclear why the two analyses diverged in the first place: when simulating this case for normally distributed data, Bayes factors and p-values track each other nicely (see the discussion on the JASP/BayesFactor forum). This suggests some assumption is not met in this particular dataset, which causes the divergence between the analyses.
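For illustration, here is a minimal simulation along those lines (not part of the original analysis; the sample size mirrors this study's 26 subjects, the effect size is arbitrary):

set.seed(1)
sims <- replicate(100, {
  x <- rnorm(26, mean = 0.3, sd = 1) # one sample of 26 normally distributed change scores
  c(p = t.test(x)$p.value, bf = extractBF(ttestBF(x), onlybf = TRUE)) # frequentist and Bayesian test of the same data
})
cor(log(sims["p", ]), log(sims["bf", ])) # log p and log BF track each other closely (strong negative correlation)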
Still, let’s do some follow-up tests with all of the data (i.e. including this subject) to see whether the anodal or cathodal change scores are significantly different from 0 on their own.
Bayesian one-sample t-tests:
latencyMedianBaseline %>%
  filter(type == "center") %>% # keep only center saccades
  group_by(stimulation, subject) %>% # for each session and subject
  summarise(latency = mean(latency)) %>% # average over all other variables
  spread(stimulation, latency) %>% # make separate columns with test data
  summarise_if(is.numeric, ~ extractBF(ttestBF(.x), onlybf = TRUE)) %>% # run Bayesian t-test on each column, keeping only the BF
  gather(stimulation, BF, anodal, cathodal) %>% # make a row for each stimulation condition
  kable(.)
| stimulation | BF |
|:------------|----:|
| anodal | 0.3218926 |
| cathodal | 0.2840670 |
Frequentist one-sample t-tests:
latencyMedianBaseline %>%
  filter(type == "center") %>% # keep only center saccades
  group_by(stimulation, subject) %>% # for each session and subject
  summarise(latency = mean(latency)) %>% # average over all other variables (df is still grouped per stimulation)
  summarise_if(is.numeric, ~ list(tidy(t.test(.x)))) %>% # run one-sample t-test for each stimulation condition, return tidy data frames
  unnest(cols = latency) %>% # unpack the list-column with a data frame for each test
  kable(.)
| stimulation | estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
|:------------|---------:|----------:|--------:|----------:|---------:|----------:|:-------|:------------|
| anodal | -1.608974 | -0.9875463 | 0.3328370 | 25 | -4.964508 | 1.746559 | One Sample t-test | two.sided |
| cathodal | 1.096154 | 0.8335773 | 0.4124131 | 25 | -1.612139 | 3.804446 | One Sample t-test | two.sided |
So neither change score is significantly different from zero, nor does either have a Bayes factor that favors the alternative.
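In fact, both Bayes factors fall below 1, so the data are roughly three times more likely under the null. Inverting the BFs from the Bayesian t-test table above makes this explicit:

1 / c(anodal = 0.3218926, cathodal = 0.2840670) # BF01: evidence in favor of the null (~3.1 and ~3.5)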