Reaction Times

Margherita Calderan

“First, RT distributions are decidedly non-normal — they are almost always skewed to the right.

Second, this skew increases with task difficulty.

Third, the spread of the distribution increases with the mean.”

(Wagenmakers & Brown, 2007)

Reaction time distributions

  • Shifted-LogNormal
  • Inverse Gaussian (Wald)
  • Wiener (diffusion)
  • Gamma
  • Ex-Gaussian
  • Weibull

Code
library(brms)     # rshifted_lnorm(), rexgaussian(), rinv_gaussian(), rwiener()
library(ggplot2)

N = 10000
dat = data.frame(rt = c(rgamma(N, shape = 2.2, rate = 4.8) + .12,
                        rshifted_lnorm(N, meanlog = -.8, sdlog = .75, shift = .15),
                        rexgaussian(N, mu = 0.7, sigma = .07, beta = 0.4),
                        rinv_gaussian(N, mu = .56, shape = 1.1) + .12,
                        rwiener(N, alpha = 2, tau = .15, beta = 0.5, delta = 1.5, types = "q")),
                 Dist = factor(rep(c("ShiftGamma", "ShiftLogNorm", "ExGau", "ShiftInvGau", "Wiener"),
                                   each = N)))

ggplot(dat, aes(x = rt, fill = Dist)) +
  geom_density(alpha = .4, lwd = 1, adjust = 2) +
  scale_x_continuous(breaks = seq(0, 3, by = .5), limits = c(-0.5, 3.5)) +
  facet_wrap(~Dist)

Which one to choose?

  • Interpretability
  • Expertise
  • Feasibility (e.g., number of parameters)

Interpretability

Types of parameters - Jonas Kristoffer Lindeløv

  • Difficulty-like: influences both the mean and the SD. Ideally a value that reflects the center (the median?).
  • Onset/Shift: the earliest possible reaction time.
  • Scale: stretches or squeezes the distribution over and above difficulty, without (severely) changing the location (onset and center).

Difficulty-like

Influences both the mean and SD.

Code
N = 10000
dat = data.frame(rt = c(rshifted_lnorm(N, meanlog = -.85, sdlog = .7, shift = .2),
                        rshifted_lnorm(N, meanlog = -.5, sdlog = .7, shift = .2)),
                 Group = factor(rep(c("Easy","Hard"), each = N)))

ggplot(dat, aes(x = rt, fill = Group)) +
  geom_density(alpha = .4, lwd = 1, adjust = 2) +
  scale_x_continuous(breaks = seq(0, 3, by = .3), limits = c(0, 3))


Median (Easy): 0.631
Median (Hard): 0.804
Mean (Easy): 0.749
Mean (Hard): 0.972
SD (Easy): 0.430
SD (Hard): 0.623

Onset/Shift

Influences the earliest possible reaction time.

Code
N = 10000
dat = data.frame(rt = c(rshifted_lnorm(N, meanlog = -.85, sdlog = .7, shift = .2),
                        rshifted_lnorm(N, meanlog = -.85, sdlog = .7, shift = .9)),
                 Group = factor(rep(c("valid cue","invalid cue"), each = N)))

ggplot(dat, aes(x = rt, fill = Group)) +
  geom_density(alpha = .4, lwd = 1, adjust = 2) +
  scale_x_continuous(breaks = seq(0, 3, by = .3), limits = c(0, 3))


Median (val): 0.625
Median (inv): 1.324
SD (val): 0.421
SD (inv): 0.417

Scale

Stretches or squeezes the distribution, without (severely) changing the location (onset and center).

Code
N = 10000
dat = data.frame(rt = c(rshifted_lnorm(N, meanlog = -.85, sdlog = .7, shift = .2),
                        rshifted_lnorm(N, meanlog = -.85, sdlog = 2, shift = .2)),
                 Group = factor(rep(c("Control","Patient"), each = N)))

ggplot(dat, aes(x = rt, fill = Group)) +
  geom_density(alpha = .4, lwd = 1, adjust = 2) +
  scale_x_continuous(breaks = seq(0, 3, by = .3), limits = c(0.1, 3))


Median (Cont): 0.624
Median (Pat): 0.647
Mean (Cont): 0.750
Mean (Pat): 3.729
SD (Cont): 0.439
SD (Pat): 20.185

Ideally, we should use a distribution that can effectively separate these three key parameters: Difficulty, Onset, and Scale…

Expertise

lme4

Gamma vs. Inverse Gaussian

Gamma

  • α > 0 shape

  • λ > 0 rate

\begin{equation} f(x; \alpha, \lambda) = \frac{\lambda^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x} \end{equation}


Inverse Gaussian

  • μ > 0 mean

  • λ > 0 shape

\begin{equation} f(x; \mu, \lambda) = \sqrt{\tfrac{\lambda}{2\pi x^3}}\ \exp\left(-\tfrac{\lambda(x-\mu)^2}{2\mu^2 x}\right) \end{equation}


In the Gamma, α and λ are both difficulty-like parameters: $\mu = \alpha / \lambda$, $\sigma = \sqrt{\frac{\alpha}{\lambda^2}}$.

In the Inverse Gaussian, μ reflects difficulty (the mean is simply μ), while λ influences the scale. The larger μ and the smaller λ, the more variable the RTs: $\sigma = \sqrt{\frac{\mu^3}{\lambda}}$.
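
Both sets of formulas can be checked quickly by simulation (a minimal sketch; the parameter values are arbitrary, and rinv_gaussian() is the brms function used in the code above):

Code
library(brms)  # rinv_gaussian()

set.seed(123)
# Gamma: mean = alpha / lambda, SD = sqrt(alpha / lambda^2)
alpha = 2.2; lambda = 4.8
x_gam = rgamma(1e5, shape = alpha, rate = lambda)
round(c(emp_mean = mean(x_gam), theo_mean = alpha / lambda,
        emp_sd = sd(x_gam), theo_sd = sqrt(alpha / lambda^2)), 3)

# Inverse Gaussian: mean = mu, SD = sqrt(mu^3 / lambda)
mu = 0.56; lambda = 1.1
x_inv = rinv_gaussian(1e5, mu = mu, shape = lambda)
round(c(emp_mean = mean(x_inv), theo_mean = mu,
        emp_sd = sd(x_inv), theo_sd = sqrt(mu^3 / lambda)), 3)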

Gamma

Interpretability

  • α : the number of independent events.

  • λ : the rate of events per unit time.

The total time until all α events have occurred follows a Gamma distribution.
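
This waiting-time story is easy to verify by simulation (a sketch; α = 3 events at rate λ = 5 are arbitrary choices):

Code
set.seed(123)
alpha = 3; lambda = 5
# total time for alpha independent Exponential(lambda) events
total = rowSums(matrix(rexp(alpha * 1e5, rate = lambda), ncol = alpha))
gam   = rgamma(1e5, shape = alpha, rate = lambda)
round(c(mean(total), mean(gam)), 3)  # both close to alpha / lambda = 0.6
round(c(sd(total), sd(gam)), 3)      # both close to sqrt(alpha) / lambda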

Inverse Gaussian (Wald)

Interpretability

The name “inverse Gaussian” comes from its relation to Brownian motion (or Wiener process). It “describes the distribution of the time a Brownian motion with positive drift takes to reach a fixed positive level” [Wikipedia].

Code
set.seed(11)
rdm_path = function(drift, threshold, ndt, sp1=0,  noise_constant=1, dt=0.0001, max_rt=2) {
  max_tsteps = max_rt/dt
  
  # initialize the diffusion process
  tstep = 0
  x1 = c(sp1*threshold) # vector of accumulated evidence at t=tstep
  time = c(ndt)
  
  # start accumulating
  while (x1[tstep+1] < threshold  & tstep < max_tsteps) {
    x1 = c(x1, x1[tstep+1] + rnorm(mean=drift*dt, sd=noise_constant*sqrt(dt), n=1))
    time = c(time, dt*tstep + ndt)
    tstep = tstep + 1
  }
  return (data.frame(time=time, x=x1, accumulator=rep(1, length(x1))))
}

drift = 1; threshold = 2; ndt = .1

sim_path = rdm_path(drift = drift, threshold = threshold, ndt = ndt)

ggplot(data = sim_path, aes(x = time, y = x)) +
  geom_line(linewidth = 1, color = "blue4") +
  geom_hline(yintercept = threshold, linewidth = 1)
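
To connect the single path above to the Inverse Gaussian claim, we can simulate many first-passage times and compare their mean to rinv_gaussian() draws with mean = threshold/drift and shape = threshold²/noise². The vectorized helper sim_fpt() is not part of the original code, just a quick sketch on a coarse time grid:

Code
# Hypothetical helper (not from the slides): approximate first-passage times
# of a Brownian motion with drift, simulated on a dt grid for speed.
sim_fpt = function(n, drift, threshold, dt = 0.01, max_rt = 20) {
  steps = matrix(rnorm(n * (max_rt / dt), mean = drift * dt, sd = sqrt(dt)),
                 nrow = n)
  paths = t(apply(steps, 1, cumsum))
  # first time at which the accumulated evidence crosses the threshold
  dt * apply(paths >= threshold, 1, function(z) which(z)[1])
}

set.seed(11)
fpt = sim_fpt(2000, drift = 1, threshold = 2)
ig  = rinv_gaussian(2000, mu = 2 / 1, shape = 2^2 / 1^2)  # rinv_gaussian() from brms; mean = a/v, shape = a^2/s^2
round(c(mean(fpt, na.rm = TRUE), mean(ig)), 2)            # both should be near 2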

Reading time

High-frequency vs. low-frequency words, children 9-15 years old.

Inverse Gaussian vs. Gamma

Code
mInv = lme4::glmer(time ~ freq  + age + (1|ID), 
                    data = dat_e, family = inverse.gaussian(link = "log"))
sumInv = summary(mInv)

mGam = lme4::glmer(time ~ freq  + age + (1|ID), 
                    data = dat_e, family = Gamma(link = "log"))
sumGam = summary(mGam)

Inverse Gaussian

              Estimate Std. Error
(Intercept)  2.6436875 0.02579157
freq1       -0.3161442 0.01138232
age         -0.1090430 0.02600495

Gamma

              Estimate Std. Error
(Intercept)  2.6159104 0.02561623
freq1       -0.3286415 0.01210228
age         -0.1085667 0.02619190

Posterior predictions

Inverse Gaussian

Code
ppInv = performance::check_predictions(mInv, iterations = 100, 
                               type = "density")
time = rowMeans(ppInv)
dat_e_plot = rbind(dat_e[,c("freq","time")], cbind(dat_e[,"freq"], time))
dat_e_plot$type = as.factor(c(rep("real", nrow(dat_e)),rep("pred",length(time))))

ggplot(dat_e_plot, aes(x = time, fill = type))+
  geom_density(alpha = .5)+facet_wrap(~freq)+xlim(0,max(dat_e_plot$time))

Posterior predictions

Gamma

Code
ppGam = performance::check_predictions(mGam, iterations = 100, 
                               type = "density")
time = rowMeans(ppGam)
dat_e_plot = rbind(dat_e[,c("freq","time")], cbind(dat_e[,"freq"], time))
dat_e_plot$type = as.factor(c(rep("real", nrow(dat_e)),rep("pred",length(time))))

ggplot(dat_e_plot, aes(x = time, fill = type))+
  geom_density(alpha = .5)+facet_wrap(~freq)+xlim(0,max(dat_e_plot$time))

lme4

  • Inverse Gaussian: correlation between real and predicted = 0.771
  • Gamma: correlation between real and predicted = 0.773

Bayesian … :)

Inverse Gaussian (brms, non-informative priors)

Bayesian … :)

Gamma (brms, non-informative priors)
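
The calls behind these Bayesian fits could look like the sketch below (default brms priors; the object names fitInv and fitGam match the loo comparison later, everything else is assumed):

Code
library(brms)

# Bayesian counterparts of the lme4 models above, with brms default priors
fitInv = brm(time ~ freq + age + (1 | ID),
             data = dat_e, family = inverse.gaussian(link = "log"))

fitGam = brm(time ~ freq + age + (1 | ID),
             data = dat_e, family = Gamma(link = "log"))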

RSS (residual sum of squares)

Frequentist

Inverse Gaussian: 11772.59 
Gamma: 11063.1 
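
A sketch of how such RSS values can be computed from the glmer fits (assuming residuals on the response scale, conditional on the random effects):

Code
# residual sum of squares on the response scale
rss = function(m) sum((dat_e$time - fitted(m))^2)
c(InvGaussian = rss(mInv), Gamma = rss(mGam))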


Inverse Gaussian

              Estimate Std. Error
(Intercept)  2.6436875 0.02579157
freq1       -0.3161442 0.01138232
age         -0.1090430 0.02600495

Gamma

              Estimate Std. Error
(Intercept)  2.6159104 0.02561623
freq1       -0.3286415 0.01210228
age         -0.1085667 0.02619190

Bayesian

       elpd_diff se_diff
fitInv   0.0       0.0  
fitGam -33.8       8.1  
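
This table is the output of approximate leave-one-out cross-validation, e.g. (a sketch, given the fitInv and fitGam objects above):

Code
# compare expected log predictive density via PSIS-LOO
loo_compare(loo(fitInv), loo(fitGam))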


Inverse Gaussian

            Estimate Est.Error   Q2.5  Q97.5
b_Intercept    2.632     0.018  2.597  2.668
b_freq1       -0.318     0.015 -0.348 -0.290
b_age         -0.101     0.018 -0.137 -0.064

Gamma

            Estimate Est.Error   Q2.5  Q97.5
b_Intercept    2.633     0.018  2.597  2.669
b_freq1       -0.330     0.015 -0.360 -0.301
b_age         -0.111     0.019 -0.149 -0.074

Shifted LogNormal

Interpretability

If Y is log-normally distributed, then log(Y) follows a normal distribution. The distribution is defined directly for Y, not for a transformed version of Y!

  • $\mu \in (-\infty, +\infty)$ - influences difficulty. Median = $\theta + \exp(\mu)$.

  • $\sigma > 0$ scale - increases the mean $= \theta + \exp\left(\mu + \frac{\sigma^2}{2}\right)$ (but not the median).

  • $\theta > 0$ shift - the earliest possible response.

\begin{equation} f(x; \mu, \sigma, \theta) = \frac{1}{(x - \theta)\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x - \theta) - \mu}{\sigma}\right)^2\right] \end{equation}
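
The median and mean formulas can again be checked by simulation (a sketch, reusing rshifted_lnorm() from brms with the parameter values of the earlier examples):

Code
set.seed(123)
mu = -0.85; sigma = 0.7; theta = 0.2
x = rshifted_lnorm(1e5, meanlog = mu, sdlog = sigma, shift = theta)
round(c(median(x), theta + exp(mu)), 3)             # median = theta + exp(mu)
round(c(mean(x), theta + exp(mu + sigma^2 / 2)), 3) # mean also depends on sigma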

Shifted LogNormal (Race)

Shifted LogNormal

Shifted LogNormal

Median real RT: 868.212
Median predicted RT: 888.652
Error: -20.441

Median real RT: 868.212
Median predicted RT: 867.514
Error: 0.698

But Feasibility :(


\begin{aligned} \mu &\sim 1 + \text{age} + \text{freq} + (1 | \text{ID}) \\ \theta &\sim 1 + (1 | \text{ID}) \end{aligned}
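
A brms call matching this specification might look like the following sketch (the object name fitShift_reID is hypothetical; chains, iterations, and warmup are taken from the output below):

Code
library(brms)

fitShift_reID = brm(
  bf(time ~ freq + age + (1 | ID),   # mu
     ndt  ~ 1 + (1 | ID)),           # theta (shift), log link by default
  data = dat_e, family = shifted_lognormal(),
  chains = 4, iter = 6000, warmup = 3000
)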

Warning: There were 144 divergent transitions after warmup. Increasing
adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: shifted_lognormal 
  Links: mu = identity; sigma = identity; ndt = log 
Formula: time ~ freq + age + (1 | ID) 
         ndt ~ 1 + (1 | ID)
   Data: dat_e (Number of observations: 476) 
  Draws: 4 chains, each with iter = 6000; warmup = 3000; thin = 1;
         total post-warmup draws = 12000

Multilevel Hyperparameters:
~ID (Number of levels: 238) 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)         0.51      0.04     0.43     0.59 1.00     2310     5146
sd(ndt_Intercept)     0.16      0.03     0.09     0.22 1.01      539     1355

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         1.76      0.10     1.55     1.95 1.02      675     1355
ndt_Intercept     1.95      0.07     1.81     2.07 1.01      617     1389
freq1            -0.74      0.07    -0.89    -0.62 1.01      796     1754
age              -0.22      0.04    -0.31    -0.14 1.01     1225     2948

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.31      0.03     0.26     0.37 1.01     1454     3114

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).



\begin{aligned} \mu &\sim 1 + \text{age} + \text{freq} + (1 | \text{ID}) \\ \theta &\sim 1 \end{aligned}
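
The simpler model differs only in the shift formula (a sketch under the same assumptions; fitShift matches the model-performance table below):

Code
fitShift = brm(
  bf(time ~ freq + age + (1 | ID),
     ndt  ~ 1),                      # a single shift for everyone
  data = dat_e, family = shifted_lognormal(),
  chains = 4, iter = 6000, warmup = 3000
)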

 Family: shifted_lognormal 
  Links: mu = identity; sigma = identity; ndt = log 
Formula: time ~ freq + age + (1 | ID) 
         ndt ~ 1
   Data: dat_e (Number of observations: 476) 
  Draws: 4 chains, each with iter = 6000; warmup = 3000; thin = 1;
         total post-warmup draws = 12000

Multilevel Hyperparameters:
~ID (Number of levels: 238) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.44      0.03     0.39     0.50 1.00     4352     6720

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         2.01      0.05     1.92     2.12 1.00     5631     7417
ndt_Intercept     1.73      0.05     1.63     1.81 1.00     8241     7945
freq1            -0.59      0.03    -0.66    -0.52 1.00    10150    10023
age              -0.17      0.03    -0.23    -0.11 1.00     4218     6459

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.27      0.02     0.24     0.30 1.00     4938     9131

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Model performance

         elpd_diff se_diff
fitShift   0.0       0.0  
fitInv    -8.2       9.2  
fitGam   -42.0      13.0  

Correlation between real and predicted = 0.94

Posterior predictions

Shifted LogNormal

Feasibility (and theory ?)

Experiment: participants viewed a central prime (either a real word or a pseudo-word), followed by a spatial cue directing them to a target on the left or right, which they located by pressing a key.

label

cue validity

Shifted LogNormal


\begin{aligned} \mu &\sim 1 + \text{label} + (1 + \text{label} | \text{Exp\_Subject\_Id}) \\ \theta &\sim 1 + \text{cue} + (1 + \text{cue} | \text{Exp\_Subject\_Id}) \end{aligned}
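
A sketch of this specification in brms (the data name dat_prime and fit name fit_cue are hypothetical; priors and sampler settings omitted):

Code
fit_cue = brm(
  bf(time ~ label + (1 + label | Exp_Subject_Id),   # mu
     ndt  ~ cue + (1 + cue | Exp_Subject_Id)),      # theta (shift)
  data = dat_prime, family = shifted_lognormal()
)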


                Estimate Est.Error   Q2.5  Q97.5
b_Intercept        6.015     0.051  5.919  6.117
b_ndt_Intercept   41.017    18.883  2.156 75.566
b_label1           0.047     0.007  0.035  0.060
b_ndt_cue1        28.961     3.893 21.391 36.735

PP-Check

label

PP-Check

cue validity

Correlation check - be careful

Shiny app - Jonas Kristoffer Lindeløv