In the basic DDM introduced in the previous vignette, we assumed that core parameters like drift rate (v), starting point (z), and non-decision time (ter) remain constant across all trials within an experimental condition. While this simplification is useful for understanding the model’s basic mechanics, it’s unrealistic for modeling actual human behavior.
Consider what happens in a real experiment:

- Attention waxes and wanes, so the quality of evidence accumulation (the drift rate) fluctuates from trial to trial.
- Expectations and strategic adjustments shift, nudging the starting point toward one boundary or the other.
- Stimulus encoding and motor preparation take slightly different amounts of time on every trial, so non-decision time varies.

These natural fluctuations mean that the "true" parameter values vary from trial to trial, even within the same experimental condition.
The extended DDM incorporates this reality by allowing key parameters to vary across trials according to specified distributions. This produces several important benefits:

- RT distributions acquire the pronounced right skew universally observed in real data.
- The model can predict both fast errors (via starting point variability) and slow errors (via drift rate variability).
- Fits to empirical data improve, because real behavior shows RT patterns that fixed-parameter models cannot reproduce.
Instead of fixed values, parameters become random variables drawn from distributions:

\[
v_i \sim \mathcal{N}(\mu_v,\, \sigma_v), \qquad
z_i \sim \mathcal{U}\!\left(\mu_z - \tfrac{s_z}{2},\; \mu_z + \tfrac{s_z}{2}\right), \qquad
t_{er,i} \sim \mathcal{U}\!\left(\mu_{ter} - \tfrac{s_{ter}}{2},\; \mu_{ter} + \tfrac{s_{ter}}{2}\right)
\]

Where:

- \(\mu\) parameters are the means (what we estimated in the basic DDM)
- \(\sigma_v\) (sv) is the standard deviation of drift rate variability
- \(s_z\) (sz) is the range of starting point variability
- \(s_{ter}\) (st0) is the range of non-decision time variability
Our simulate_diffusion_trial_variable() function implements this by:

- drawing a trial-specific drift rate from a normal distribution with mean mean_v and standard deviation sv,
- drawing a trial-specific starting point from a uniform distribution of width sz centered on mean_z,
- drawing a trial-specific non-decision time from a uniform distribution of width st0 centered on mean_ter, and
- running an ordinary diffusion walk with those sampled values.
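The steps above can be sketched in a few lines of base R. This is an illustrative stand-in, not the vignette's actual implementation: the function name `simulate_trial_sketch` and its return format are ours, and the real `simulate_diffusion_trial_variable()` may differ in detail.

```r
# A minimal sketch of one variable-parameter trial (assumed interface;
# the vignette's actual function may differ in detail).
simulate_trial_sketch <- function(mean_v, a, mean_z, s, mean_ter,
                                  sv = 0, sz = 0, st0 = 0, dt = 0.001) {
  # 1. Sample this trial's parameters from their across-trial distributions
  v   <- rnorm(1, mean_v, sv)                              # drift ~ Normal
  z   <- runif(1, mean_z - sz / 2, mean_z + sz / 2)        # start ~ Uniform
  ter <- runif(1, mean_ter - st0 / 2, mean_ter + st0 / 2)  # ter   ~ Uniform

  # 2. Run an ordinary diffusion walk with the sampled values
  x <- z
  t <- 0
  while (x > 0 && x < a) {
    x <- x + v * dt + s * sqrt(dt) * rnorm(1)
    t <- t + dt
  }

  list(rt = t + ter,                    # decision time plus non-decision time
       choice = as.integer(x >= a),     # 1 = upper boundary, 0 = lower
       v_trial = v, z_trial = z, ter_trial = ter)
}
```

Setting `sv`, `sz`, and `st0` to 0 recovers the basic fixed-parameter DDM, which is exactly how the fixed-versus-variable comparison below works.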
Let’s see this in action!
We’ll start by comparing a basic DDM (fixed parameters) with a variable DDM using the same mean parameter values.
```r
# Shared simulation parameters
n_trials <- 3000  # Large enough to see distributional differences

base_params <- list(
  mean_v   = 0.15,  # Moderate positive drift
  a        = 0.7,   # Reasonable threshold
  mean_z   = 0.35,  # Unbiased start (a/2)
  s        = 0.2,   # Standard within-trial noise
  mean_ter = 0.08,  # Typical non-decision time
  dt       = 0.001  # Fine time resolution
)

# Fixed DDM parameters (variability = 0)
fixed_params <- c(base_params, list(sv = 0, sz = 0, st0 = 0))

# Variable DDM parameters (realistic variability)
variable_params <- c(base_params, list(
  sv  = 0.3,   # Moderate drift rate variability
  sz  = 0.1,   # Some starting point variability
  st0 = 0.04   # Small non-decision time variability
))

set.seed(123)
data_fixed <- do.call(simulate_diffusion_experiment_variable,
                      c(list(n_trials = n_trials), fixed_params))
data_fixed$model_type <- "Fixed Parameters"

set.seed(123)  # Same seed for fair comparison
data_variable <- do.call(simulate_diffusion_experiment_variable,
                         c(list(n_trials = n_trials), variable_params))
data_variable$model_type <- "Variable Parameters"

# Combine data for plotting
combined_data <- bind_rows(
  data_fixed %>% select(rt, choice, model_type),
  data_variable %>% select(rt, choice, model_type)
) %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Lower Boundary", "Upper Boundary")),
    model_type = factor(model_type,
                        levels = c("Fixed Parameters", "Variable Parameters"))
  )
```
```r
# Create comprehensive comparison plot
p1 <- ggplot(combined_data, aes(x = rt, fill = model_type)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.5) +
  facet_grid(model_type ~ choice_label, scales = "free_y") +
  scale_fill_manual(values = c("Fixed Parameters"    = "#E74C3C",
                               "Variable Parameters" = "#3498DB")) +
  labs(
    title = "The Dramatic Impact of Parameter Variability",
    subtitle = "Same mean parameters, but very different RT distributions",
    x = "Reaction Time (seconds)",
    y = "Density",
    fill = "Model Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold", size = 11),
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 12)
  )

print(p1)
```
```r
# Calculate and display summary statistics
summary_stats <- combined_data %>%
  group_by(model_type, choice_label) %>%
  summarise(
    N = n(),
    Mean_RT   = round(mean(rt), 3),
    Median_RT = round(median(rt), 3),
    SD_RT     = round(sd(rt), 3),
    Skewness  = round((mean(rt) - median(rt)) / sd(rt), 3),
    .groups = "drop"
  )

knitr::kable(summary_stats,
             caption = "Key Statistics: Fixed vs. Variable Parameter Models")
```

| model_type | choice_label | N | Mean_RT | Median_RT | SD_RT | Skewness |
|---|---|---|---|---|---|---|
| Fixed Parameters | Lower Boundary | 196 | 2.031 | 1.715 | 1.118 | 0.282 |
| Fixed Parameters | Upper Boundary | 2637 | 1.901 | 1.649 | 1.074 | 0.235 |
| Variable Parameters | Lower Boundary | 863 | 1.632 | 1.341 | 1.052 | 0.276 |
| Variable Parameters | Upper Boundary | 1988 | 1.391 | 1.071 | 0.978 | 0.327 |
- **Shape transformation**: Parameter variability changes the shape of the RT distributions, not just their location; with identical mean parameters, the variable model responds faster and makes far more lower-boundary responses (863 vs. 196).
- **Skewness**: The variable model produces the more pronounced positive skew for the dominant response (0.327 vs. 0.235) that is universally observed in real RT data.
- **Variability**: Real decisions show RT and accuracy patterns that fixed-parameter models systematically fail to predict.
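A note on the Skewness column reported in these tables: it is the simple nonparametric measure (mean − median) / SD, i.e., Pearson's second skewness coefficient without the conventional factor of 3. The helper below (our own, not part of the vignette's codebase) makes the computation explicit:

```r
# Nonparametric skewness: (mean - median) / SD.
# Positive values indicate a right-skewed (long right tail) distribution,
# which is the signature shape of empirical RT data.
np_skew <- function(x) (mean(x) - median(x)) / sd(x)

set.seed(42)
np_skew(rexp(10000, rate = 2))  # right-skewed data: clearly positive
np_skew(rnorm(10000))           # symmetric data: near zero
```

Because it only needs the mean, median, and SD, this measure is robust to the extreme outliers that can dominate the usual third-moment skewness estimate.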
Now let’s systematically examine how each type of parameter variability affects the model’s predictions.
Drift rate variability is often the most important variability parameter because it:

- Creates realistic right-skewed RT distributions
- Explains the relationship between correct and error RTs
- Accounts for trial-to-trial fluctuations in attention and processing
```r
# Parameters for drift rate variability exploration
sv_values <- c(0, 0.1, 0.25, 0.5)
n_trials_sv <- 1000

# Simulate across different sv values
sv_data_list <- list()
for (i in seq_along(sv_values)) {
  sv_val <- sv_values[i]
  set.seed(200 + i)
  params_sv <- c(base_params, list(sv = sv_val, sz = 0, st0 = 0))
  data_sv <- do.call(simulate_diffusion_experiment_variable,
                     c(list(n_trials = n_trials_sv), params_sv))
  data_sv$sv_condition <- paste0("sv = ", sv_val)
  data_sv$sv_value <- sv_val
  sv_data_list[[i]] <- data_sv
}

sv_combined <- bind_rows(sv_data_list) %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Lower Boundary", "Upper Boundary")),
    sv_condition = factor(sv_condition, levels = paste0("sv = ", sv_values))
  )
```
```r
# Create the visualization
p_sv_density <- ggplot(sv_combined, aes(x = rt, fill = choice_label)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.3) +
  facet_grid(sv_condition ~ choice_label) +
  scale_fill_manual(values = c("Lower Boundary" = "#E67E22",
                               "Upper Boundary" = "#2ECC71")) +
  labs(
    title = "Effect of Drift Rate Variability (sv) on RT Distributions",
    subtitle = "Higher sv creates more right-skewed distributions",
    x = "Reaction Time (seconds)",
    y = "Density",
    fill = "Response Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold"),
    plot.title = element_text(size = 14, face = "bold")
  )

# Calculate skewness measures
sv_skewness <- sv_combined %>%
  group_by(sv_condition, choice_label) %>%
  summarise(
    Mean_RT   = round(mean(rt), 3),
    Median_RT = round(median(rt), 3),
    Skewness  = round((mean(rt) - median(rt)) / sd(rt), 3),
    .groups = "drop"
  ) %>%
  select(sv_condition, choice_label, Skewness) %>%
  pivot_wider(names_from = choice_label, values_from = Skewness)

print(p_sv_density)
```
| sv_condition | Lower Boundary | Upper Boundary |
|---|---|---|
| sv = 0 | 0.377 | 0.263 |
| sv = 0.1 | 0.246 | 0.288 |
| sv = 0.25 | 0.228 | 0.366 |
| sv = 0.5 | 0.337 | 0.364 |
Key Insights about Drift Rate Variability:

- Errors arise naturally on trials where the sampled drift happens to be low or even negative, so higher sv produces more errors.
- Because low-drift trials are also slow, sv generates slow errors and stretches the right tail of the RT distribution.
- In the table above, skewness of the dominant upper-boundary responses grows steadily with sv (from 0.263 at sv = 0 to 0.364 at sv = 0.5).
Starting point variability primarily affects:

- The probability of fast errors (when you start close to the wrong boundary)
- Overall RT variance (but less dramatically than sv)
- Response bias patterns
```r
sz_values <- c(0, 0.1, 0.25, 0.4)
n_trials_sz <- 1000

# Simulate across different sz values
sz_data_list <- list()
for (i in seq_along(sz_values)) {
  sz_val <- sz_values[i]
  set.seed(300 + i)
  params_sz <- c(base_params, list(sv = 0.2, sz = sz_val, st0 = 0))
  data_sz <- do.call(simulate_diffusion_experiment_variable,
                     c(list(n_trials = n_trials_sz), params_sz))
  data_sz$sz_condition <- paste0("sz = ", sz_val)
  data_sz$sz_value <- sz_val
  sz_data_list[[i]] <- data_sz
}

sz_combined <- bind_rows(sz_data_list) %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Lower Boundary", "Upper Boundary")),
    sz_condition = factor(sz_condition, levels = paste0("sz = ", sz_values))
  )
```
```r
# Focus on fast errors - look at the fastest 20% of responses per condition
fast_errors_analysis <- sz_combined %>%
  group_by(sz_condition) %>%
  mutate(rt_percentile = ntile(rt, 5)) %>%
  filter(rt_percentile == 1) %>%  # Fastest 20%
  group_by(sz_condition, choice_label) %>%
  summarise(
    N_fast = n(),
    # Proportion relative to all trials in this sz condition
    Prop_fast = n() / nrow(filter(sz_combined, sz_condition == first(sz_condition))),
    .groups = "drop"
  )

# Visualization focusing on the leading edge
p_sz <- ggplot(sz_combined, aes(x = rt, fill = choice_label)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.3) +
  facet_grid(sz_condition ~ choice_label) +
  scale_fill_manual(values = c("Lower Boundary" = "#E67E22",
                               "Upper Boundary" = "#2ECC71")) +
  labs(
    title = "Effect of Starting Point Variability (sz) on RT Distributions",
    subtitle = "Higher sz increases fast errors and overall variance",
    x = "Reaction Time (seconds)",
    y = "Density",
    fill = "Response Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold"),
    plot.title = element_text(size = 14, face = "bold")
  )

print(p_sz)
```
| sz_condition | choice_label | N_fast | Prop_fast |
|---|---|---|---|
| sz = 0 | Lower Boundary | 32 | 0.0343348 |
| sz = 0 | Upper Boundary | 155 | 0.1663090 |
| sz = 0.1 | Lower Boundary | 25 | 0.0268240 |
| sz = 0.1 | Upper Boundary | 163 | 0.1748927 |
| sz = 0.25 | Lower Boundary | 29 | 0.0311159 |
| sz = 0.25 | Upper Boundary | 162 | 0.1738197 |
| sz = 0.4 | Lower Boundary | 34 | 0.0364807 |
| sz = 0.4 | Upper Boundary | 154 | 0.1652361 |
Non-decision time variability has a distinctive effect:

- "Smears out" the leading edge of RT distributions
- Makes the fastest responses more variable
- Has relatively subtle effects compared to sv and sz
```r
st0_values <- c(0, 0.02, 0.04, 0.06)
n_trials_st0 <- 2000

# Simulate across different st0 values
st0_data_list <- list()
for (i in seq_along(st0_values)) {
  st0_val <- st0_values[i]
  set.seed(400 + i)
  params_st0 <- c(base_params, list(sv = 0.1, sz = 0, st0 = st0_val))
  data_st0 <- do.call(simulate_diffusion_experiment_variable,
                      c(list(n_trials = n_trials_st0), params_st0))
  data_st0$st0_condition <- paste0("st0 = ", st0_val)
  data_st0$st0_value <- st0_val
  st0_data_list[[i]] <- data_st0
}

st0_combined <- bind_rows(st0_data_list) %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Lower Boundary", "Upper Boundary")),
    st0_condition = factor(st0_condition, levels = paste0("st0 = ", st0_values))
  )
```
```r
# Visualization with focus on the leading edge
p_st0 <- ggplot(st0_combined, aes(x = rt, fill = choice_label)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.3) +
  facet_grid(st0_condition ~ choice_label) +
  scale_fill_manual(values = c("Lower Boundary" = "#E67E22",
                               "Upper Boundary" = "#2ECC71")) +
  labs(
    title = "Effect of Non-Decision Time Variability (st0) on RT Distributions",
    subtitle = "Higher st0 smooths the leading edge of distributions",
    x = "Reaction Time (seconds)",
    y = "Density",
    fill = "Response Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold"),
    plot.title = element_text(size = 14, face = "bold")
  )

# Calculate leading edge statistics (10th and 25th percentiles)
leading_edge_stats <- st0_combined %>%
  group_by(st0_condition, choice_label) %>%
  summarise(
    Q10   = round(quantile(rt, 0.1), 3),
    Q25   = round(quantile(rt, 0.25), 3),
    SD_RT = round(sd(rt), 3),
    .groups = "drop"
  )

print(p_st0)
```
| st0_condition | choice_label | Q10 | Q25 | SD_RT |
|---|---|---|---|---|
| st0 = 0 | Lower Boundary | 0.739 | 1.136 | 1.222 |
| st0 = 0 | Upper Boundary | 0.695 | 0.987 | 1.073 |
| st0 = 0.02 | Lower Boundary | 0.739 | 1.162 | 1.163 |
| st0 = 0.02 | Upper Boundary | 0.683 | 0.956 | 1.051 |
| st0 = 0.04 | Lower Boundary | 0.918 | 1.217 | 1.156 |
| st0 = 0.04 | Upper Boundary | 0.656 | 0.938 | 1.031 |
| st0 = 0.06 | Lower Boundary | 0.669 | 1.014 | 1.210 |
| st0 = 0.06 | Upper Boundary | 0.654 | 0.948 | 1.075 |
Real behavior reflects the combined influence of all variability sources. Let’s simulate a “realistic” DDM with all three types of variability:
```r
# Realistic parameter set
realistic_params <- list(
  mean_v   = 0.2,   # Moderate drift rate
  a        = 0.8,   # Reasonable threshold
  mean_z   = 0.4,   # Unbiased start (a/2)
  s        = 0.2,   # Standard noise
  mean_ter = 0.18,  # Typical non-decision time
  sv       = 0.3,   # Substantial drift variability
  sz       = 0.1,   # Moderate starting point variability
  st0      = 0.06,  # Small non-decision time variability
  dt       = 0.001
)

# Simulate realistic DDM
set.seed(500)
realistic_data <- do.call(simulate_diffusion_experiment_variable,
                          c(list(n_trials = 3000), realistic_params))

# Also simulate a "textbook" version (no variability) for comparison
textbook_params <- realistic_params
textbook_params[c("sv", "sz", "st0")] <- 0

set.seed(500)
textbook_data <- do.call(simulate_diffusion_experiment_variable,
                         c(list(n_trials = 3000), textbook_params))
```
```r
# Combine and prepare data
comparison_data <- bind_rows(
  realistic_data %>% mutate(model = "Realistic DDM\n(with variability)"),
  textbook_data %>% mutate(model = "Textbook DDM\n(no variability)")
) %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Error Responses", "Correct Responses")),
    model = factor(model, levels = c("Textbook DDM\n(no variability)",
                                     "Realistic DDM\n(with variability)"))
  )

# Create comprehensive comparison
p_realistic <- ggplot(comparison_data, aes(x = rt, fill = model)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.5) +
  facet_grid(model ~ choice_label) +
  scale_fill_manual(values = c("Textbook DDM\n(no variability)"    = "#95A5A6",
                               "Realistic DDM\n(with variability)" = "#8E44AD")) +
  labs(
    title = "Realistic vs. Textbook DDM: The Full Picture",
    subtitle = "Parameter variability creates distributions that match real human data",
    x = "Reaction Time (seconds)",
    y = "Density",
    fill = "Model Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold", size = 11),
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 12)
  )

print(p_realistic)
```
```r
# Statistical comparison
model_comparison <- comparison_data %>%
  group_by(model, choice_label) %>%
  summarise(
    N = n(),
    Mean_RT   = round(mean(rt), 3),
    Median_RT = round(median(rt), 3),
    SD_RT     = round(sd(rt), 3),
    Skewness  = round((mean(rt) - median(rt)) / sd(rt), 3),
    Min_RT    = round(min(rt), 3),
    Q90_RT    = round(quantile(rt, 0.9), 3),
    .groups = "drop"
  )

knitr::kable(model_comparison,
             caption = "Comprehensive Model Comparison")
```

| model | choice_label | N | Mean_RT | Median_RT | SD_RT | Skewness | Min_RT | Q90_RT |
|---|---|---|---|---|---|---|---|---|
| Textbook DDM (no variability) | Error Responses | 46 | 2.082 | 1.788 | 1.008 | 0.292 | 0.660 | 3.495 |
| Textbook DDM (no variability) | Correct Responses | 2853 | 2.006 | 1.757 | 1.033 | 0.241 | 0.427 | 3.537 |
| Realistic DDM (with variability) | Error Responses | 717 | 2.012 | 1.815 | 1.079 | 0.182 | 0.326 | 3.523 |
| Realistic DDM (with variability) | Correct Responses | 2081 | 1.582 | 1.211 | 1.026 | 0.362 | 0.360 | 3.017 |
The realistic DDM captures several key features of real human RT data:

- Correct responses are faster and more right-skewed than in the textbook model (mean 1.582 s vs. 2.006 s; skewness 0.362 vs. 0.241).
- Errors occur at a realistic rate (717 responses rather than 46), instead of being vanishingly rare.
- Errors are slower than correct responses (mean 2.012 s vs. 1.582 s): the "slow error" pattern produced by drift rate variability.
Let’s look at the actual parameter values sampled on each trial to understand the variability process:
```r
# Use the realistic simulation data, which includes trial-specific parameters
param_analysis_data <- realistic_data %>%
  filter(!is.na(rt), !is.na(choice)) %>%
  mutate(
    choice_label = factor(choice, levels = c(0, 1),
                          labels = c("Error", "Correct")),
    rt_fast = rt < quantile(rt, 0.25, na.rm = TRUE),
    rt_slow = rt > quantile(rt, 0.75, na.rm = TRUE)
  )

# Create panel plots showing parameter distributions
p1_params <- param_analysis_data %>%
  select(v_trial, z_trial, ter_trial, choice_label) %>%
  pivot_longer(cols = c(v_trial, z_trial, ter_trial),
               names_to = "parameter", values_to = "value") %>%
  mutate(
    parameter = factor(parameter,
                       levels = c("v_trial", "z_trial", "ter_trial"),
                       labels = c("Drift Rate (v)", "Starting Point (z)",
                                  "Non-Decision Time (ter)"))
  ) %>%
  ggplot(aes(x = value, fill = choice_label)) +
  geom_density(alpha = 0.7, color = "black", linewidth = 0.3) +
  facet_grid(parameter ~ choice_label, scales = "free_y") +
  scale_fill_manual(values = c("Error" = "#E74C3C", "Correct" = "#27AE60")) +
  labs(
    title = "Distribution of Trial-Specific Parameters",
    subtitle = "How parameter values vary across trials",
    x = "Parameter Value",
    y = "Density",
    fill = "Response Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold")
  )

print(p1_params)
```
```r
# Examine parameter-RT relationships
rt_param_correlations <- param_analysis_data %>%
  summarise(
    cor_v_rt   = round(cor(v_trial, rt, use = "complete.obs"), 3),
    cor_z_rt   = round(cor(z_trial, rt, use = "complete.obs"), 3),
    cor_ter_rt = round(cor(ter_trial, rt, use = "complete.obs"), 3)
  )

cat("Correlations between trial parameters and RT:\n")
cat("v_trial - RT:", rt_param_correlations$cor_v_rt, "\n")
cat("z_trial - RT:", rt_param_correlations$cor_z_rt, "\n")
cat("ter_trial - RT:", rt_param_correlations$cor_ter_rt, "\n")
```

```
## Correlations between trial parameters and RT:
## v_trial - RT: -0.374
## z_trial - RT: -0.028
## ter_trial - RT: 0.01
```
```r
# Create scatter plots showing parameter-RT relationships
p_relationships <- param_analysis_data %>%
  sample_n(1000) %>%  # Sample for clearer visualization
  select(rt, v_trial, z_trial, ter_trial, choice_label) %>%
  pivot_longer(cols = c(v_trial, z_trial, ter_trial),
               names_to = "parameter", values_to = "value") %>%
  mutate(
    parameter = factor(parameter,
                       levels = c("v_trial", "z_trial", "ter_trial"),
                       labels = c("Drift Rate (v)", "Starting Point (z)",
                                  "Non-Decision Time (ter)"))
  ) %>%
  ggplot(aes(x = value, y = rt, color = choice_label)) +
  geom_point(alpha = 0.6, size = 0.8) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1) +
  facet_wrap(~ parameter, scales = "free_x") +
  scale_color_manual(values = c("Error" = "#E74C3C", "Correct" = "#27AE60")) +
  scale_y_continuous(limits = c(0, 1.5)) +
  labs(
    title = "How Trial-Specific Parameters Influence RT",
    subtitle = "Each point represents one trial",
    x = "Parameter Value",
    y = "Reaction Time (seconds)",
    color = "Response Type"
  ) +
  theme(
    legend.position = "bottom",
    strip.text = element_text(face = "bold")
  )

print(p_relationships)
```
Key Relationships:

- Drift rate shows the strongest relationship with RT (r = -0.374): trials with a higher sampled drift finish faster.
- Starting point and non-decision time show only weak linear correlations with overall RT here (r = -0.028 and 0.01); their influence is concentrated in which boundary is reached and in the leading edge of the RT distribution, respectively.
Always include parameter variability when:

- You want realistic RT distributions that match human data
- You're fitting the DDM to empirical data
- You need to explain error RT patterns (fast vs. slow errors)
- You have sufficient data (>1000 trials per condition)

Consider simplified models when:

- Teaching basic DDM concepts
- You have very limited data (<200 trials)
- Computational efficiency is critical
- You're doing theoretical explorations
Based on empirical DDM studies, typical variability parameter ranges are:
| Parameter | Typical Range | Common Value | Psychological Meaning |
|---|---|---|---|
| sv (drift variability) | 0.05 - 0.30 | 0.10 - 0.15 | Attention fluctuations, stimulus variability |
| sz (starting point variability) | 0.02 - 0.15 | 0.05 - 0.08 | Strategic adjustments, expectation changes |
| st0 (non-decision time variability) | 0.01 - 0.10 | 0.03 - 0.05 | Motor preparation variability |
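As a convenience, the common values from the table can be packaged as a default parameter set. The snippet below is illustrative: the name `typical_variability` and the midpoint choices are ours, and these defaults should be refined against your own data rather than treated as universally correct.

```r
# Midpoints of the "common value" ranges from the table above,
# as a starting default for the variability parameters.
typical_variability <- list(
  sv  = 0.125,  # midpoint of 0.10 - 0.15 (drift rate variability)
  sz  = 0.065,  # midpoint of 0.05 - 0.08 (starting point variability)
  st0 = 0.04    # midpoint of 0.03 - 0.05 (non-decision time variability)
)

# Merge onto a base parameter set before simulating:
params <- c(list(mean_v = 0.2, a = 0.8, mean_z = 0.4,
                 s = 0.2, mean_ter = 0.18, dt = 0.001),
            typical_variability)
```

Keeping the variability parameters in their own list makes it easy to toggle between a "textbook" model (`sv = sz = st0 = 0`) and a realistic one without touching the mean parameters.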
Parameter variability transforms the DDM from a theoretical tool into a realistic model of human decision-making. The key insights are:

- Drift rate variability (sv) produces right-skewed RT distributions and slow errors.
- Starting point variability (sz) produces fast errors and adds RT variance.
- Non-decision time variability (st0) smears out the leading edge of the RT distribution.
- Together, these sources let the model reproduce the full shape of empirical RT distributions, not just mean RT and accuracy.
Across-trial parameter variability is a critical extension to the basic DDM. By allowing drift rate, starting point, and non-decision time to fluctuate from trial to trial, the model can more accurately capture the richness and complexity of human decision-making behavior, particularly the full shape of RT distributions and the patterns of error responses.