Biomarker Discovery with TCGA Data
A Complete Workflow Using OmicSelector 2.0
OmicSelector Team
2025-12-21
Source: vignettes/tcga-tutorial.Rmd
Introduction
This tutorial demonstrates a complete biomarker discovery workflow using OmicSelector 2.0 with real data from The Cancer Genome Atlas (TCGA). We’ll use miRNA expression data to discover diagnostic biomarkers for pancreatic cancer.
Learning Objectives
By the end of this tutorial, you will be able to:
- Prepare TCGA data for biomarker discovery
- Configure and run feature selection with zero data leakage
- Compare multiple filter-model combinations using nested CV
- Analyze feature stability across cross-validation folds
- Select an optimal biomarker signature balancing performance and stability
- Train and save a final model for deployment
- Apply batch correction using FrozenComBat
- Assess and improve model calibration
- Generate a TRIPOD+AI compliant report
Prerequisites
# Install OmicSelector if needed
if (!require("OmicSelector")) {
remotes::install_github("kstawiski/OmicSelector")
}
# Load required packages
library(OmicSelector)
library(mlr3)
library(mlr3learners)
library(data.table)
library(ggplot2)
# For TCGA data download (optional)
# BiocManager::install("TCGAbiolinks")
Part 1: Data Preparation
1.1 Loading TCGA Data
OmicSelector includes pre-processed TCGA miRNA expression data for demonstration.
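The chunks in this tutorial work with a data frame named orginal_TCGA_data. As a minimal sketch, assuming the bundled object is exposed as a lazy-loaded dataset under that name (check the package documentation for the exact accessor):
# Load the bundled, pre-processed TCGA miRNA dataset (object name assumed from
# the chunks below; adjust if your installed version exposes it differently)
data("orginal_TCGA_data", package = "OmicSelector")
dim(orginal_TCGA_data)                 # samples x (clinical columns + miRNA features)
table(orginal_TCGA_data$sample_type)   # available sample types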
1.2 Selecting Cases and Controls
For a valid biomarker study, we need properly matched cases and controls. Using tumor vs. normal tissue from the same organ is critical.
# Select pancreatic cancer cases (primary tumors only)
pancreatic_cancer <- orginal_TCGA_data[
orginal_TCGA_data$primary_site == "Pancreas" &
orginal_TCGA_data$sample_type == "PrimaryTumor",
]
# Select pancreatic normal controls
pancreatic_normal <- orginal_TCGA_data[
orginal_TCGA_data$primary_site == "Pancreas" &
orginal_TCGA_data$sample_type == "SolidTissueNormal",
]
cat("Pancreatic cancer samples:", nrow(pancreatic_cancer), "\n")
cat("Pancreatic normal samples:", nrow(pancreatic_normal), "\n")Important: Never mix normal tissues from different organs as controls - this creates confounding that leads to inflated performance estimates.
1.3 Handling Limited Normal Samples
TCGA often has few matched normal samples. We can carefully add normals from related tissues.
⚠️ IMPORTANT WARNING: Adding normal tissues from different organs introduces biological confounding. Expression differences between organs may dominate tumor vs. normal differences, potentially leading to:
- Inflated performance estimates - You may be classifying “pancreas vs. colon” rather than “tumor vs. normal”
- Biologically meaningless biomarkers - Features distinguishing tissue types, not cancer status
- Failed external validation - Models trained this way often fail when tested on matched tumor-normal pairs
Recommendations:
1. Preferred: Use only matched tissue normals, even if sample size is small
2. Alternative: Use an independent normal cohort from the same tissue
3. Last resort: Add related tissue normals, but clearly report this limitation and interpret results with extreme caution
# If we have fewer than 30 pancreatic normals, add GI tract normals
# WARNING: This is a last resort that introduces biological confounding!
if (nrow(pancreatic_normal) < 30) {
warning("Adding GI tract normals - this introduces biological confounding!")
message("Adding GI tract normals to increase control sample size")
message("Results should be validated on matched tumor-normal pairs.")
# Add normal tissues from related GI organs
gi_organs <- c("Stomach", "Colon", "Liver and intrahepatic bile ducts")
gi_normals <- orginal_TCGA_data[
orginal_TCGA_data$primary_site %in% gi_organs &
orginal_TCGA_data$sample_type == "SolidTissueNormal",
]
# Sample to avoid overwhelming imbalance
set.seed(42)
n_to_add <- min(50, nrow(gi_normals))
gi_sample <- gi_normals[sample(nrow(gi_normals), n_to_add), ]
pancreatic_normal <- rbind(pancreatic_normal, gi_sample)
cat("Total controls after augmentation:", nrow(pancreatic_normal), "\n")
cat("WARNING: These results may be confounded by tissue-type differences.\n")
}
1.4 Creating the Analysis Dataset
# Add class labels
pancreatic_cancer$Class <- "Case"
pancreatic_normal$Class <- "Control"
# Combine into final dataset
dataset <- rbind(pancreatic_cancer, pancreatic_normal)
# Extract miRNA features (columns starting with "hsa")
mirna_cols <- grep("^hsa", names(dataset), value = TRUE)
cat("Number of miRNA features:", length(mirna_cols), "\n")
# Create analysis data.frame
# Handle both data.table and data.frame inputs
if (inherits(dataset, "data.table")) {
mirna_matrix <- as.data.frame(dataset[, mirna_cols, with = FALSE])
} else {
mirna_matrix <- dataset[, mirna_cols]
}
analysis_data <- data.frame(
mirna_matrix,
outcome = factor(dataset$Class, levels = c("Control", "Case"))
)
# Check class distribution
table(analysis_data$outcome)
1.5 Data Quality Checks
Before analysis, always perform quality checks:
# Check for missing values
na_count <- sum(is.na(analysis_data))
cat("Total missing values:", na_count, "\n")
# Check feature variance (use the feature columns actually present in
# analysis_data, in case data.frame() altered the miRNA column names)
feature_cols <- setdiff(names(analysis_data), "outcome")
feature_vars <- apply(analysis_data[, feature_cols], 2, var, na.rm = TRUE)
low_var_features <- sum(feature_vars < 0.01)
cat("Features with very low variance:", low_var_features, "\n")
# Remove zero-variance features
valid_features <- names(feature_vars[feature_vars > 0])
analysis_data <- analysis_data[, c(valid_features, "outcome")]
cat("Features after QC:", length(valid_features), "\n")Part 2: Building the Analysis Pipeline
2.1 Creating an OmicPipeline
# Create pipeline
pipeline <- OmicPipeline$new(
data = analysis_data,
target = "outcome",
positive = "Case" # Define which class is "positive" for AUC
)
# View pipeline summary
print(pipeline)
2.2 Understanding GraphLearners
OmicSelector uses mlr3pipelines GraphLearners to ensure all preprocessing happens inside cross-validation folds. The pipeline order is:
Impute → Scale → [Batch Correct (optional)] → [Oversample (optional)] → Filter → Model
# Create a basic GraphLearner
learner_basic <- pipeline$create_graph_learner(
filter = "anova", # ANOVA F-test for feature selection
model = "ranger", # Random Forest
n_features = 30, # Select top 30 features
impute_method = "median", # Impute missing values with median
scale = TRUE # Standardize features
)
# View the learner structure
print(learner_basic)
2.3 Available Options
Feature Selection Methods
| Filter | Description | Strengths |
|---|---|---|
| "anova" | ANOVA F-test | Good default, tests mean differences |
| "variance" | Variance filter | Removes uninformative features |
| "correlation" | Target correlation | Fast univariate filter |
| "mrmr" | Min Redundancy Max Relevance | Reduces redundant features |
| "information_gain" | Entropy-based | Good for mixed data types |
Part 3: Comparing Multiple Configurations
3.1 Defining a Comparison Grid
We’ll compare different filter-model combinations:
# Define configurations to compare
configs <- list(
list(filter = "anova", model = "ranger", n_features = 20, id = "anova_rf_20"),
list(filter = "anova", model = "ranger", n_features = 30, id = "anova_rf_30"),
list(filter = "anova", model = "glmnet", n_features = 30, id = "anova_glmnet_30"),
list(filter = "mrmr", model = "ranger", n_features = 20, id = "mrmr_rf_20"),
list(filter = "variance", model = "svm", n_features = 30, id = "var_svm_30")
)
cat("Testing", length(configs), "configurations\n")3.2 Setting Up the Benchmark
# Create benchmark service with nested cross-validation
benchmark <- BenchmarkService$new(
task = pipeline,
outer_folds = 5, # 5-fold outer CV for evaluation
inner_folds = 3, # 3-fold inner CV for feature selection
seed = 42 # For reproducibility
)
# Add all configurations
for (cfg in configs) {
learner <- pipeline$create_graph_learner(
filter = cfg$filter,
model = cfg$model,
n_features = cfg$n_features
)
benchmark$add_learner(learner, id = cfg$id)
}
print(benchmark)
3.4 Analyzing Results
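The chunks below assume the benchmark has already been executed and its output stored in result. The running step is not shown in this excerpt; the placeholder below only marks where it belongs (the method name is an assumption, not confirmed API - see ?BenchmarkService in your installed version):
# Hypothetical: execute the nested cross-validation benchmark
# result <- benchmark$run()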
# Get performance table
performance <- result$performance
print(performance)
# Compute standard errors from resampling (if available)
# The benchmark stores per-fold scores that we can use
bmr <- result$benchmark_result
fold_scores <- bmr$score(msr("classif.auc"))
# Aggregate mean and standard error per learner
library(data.table)
scores_dt <- as.data.table(fold_scores)
perf_summary <- scores_dt[, .(
mean_auc = mean(classif.auc, na.rm = TRUE),
se_auc = sd(classif.auc, na.rm = TRUE) / sqrt(.N)
), by = learner_id]
# Visualize performance with proper error bars
ggplot(perf_summary, aes(x = reorder(learner_id, mean_auc),
y = mean_auc)) +
geom_col(fill = "steelblue", alpha = 0.8) +
geom_errorbar(aes(ymin = mean_auc - se_auc, ymax = mean_auc + se_auc),
width = 0.2) +
coord_flip() +
labs(
title = "Performance Comparison: AUC",
subtitle = "Error bars show standard error across CV folds",
x = "Configuration",
y = "AUC (Cross-Validated)"
) +
theme_minimal()
Part 4: Feature Stability Analysis
4.1 Why Stability Matters
High performance with unstable features often indicates overfitting. The Nogueira Stability Index quantifies feature selection consistency:
- SI ≥ 0.9: Excellent - Very stable selections
- SI 0.7-0.9: Good - Reasonably stable
- SI 0.5-0.7: Moderate - Some instability
- SI < 0.5: Poor - Likely overfitting
4.2 Computing Stability
# Compute stability for the benchmark result
stability <- compute_stability_from_resample(result$benchmark_result)
print(stability)
# View interpretation
cat("\n", stability$interpretation, "\n")4.4 Visualizing Feature Stability
# Create selection frequency plot
freq_df <- data.frame(
feature = names(freq)[1:20],
frequency = freq[1:20]
)
freq_df$feature <- factor(freq_df$feature, levels = rev(freq_df$feature))
ggplot(freq_df, aes(x = frequency, y = feature)) +
geom_col(aes(fill = frequency >= 1.0), alpha = 0.8) +
scale_fill_manual(
values = c("FALSE" = "steelblue", "TRUE" = "darkgreen"),
labels = c("Variable", "Consensus"),
name = "Selection"
) +
geom_vline(xintercept = 1.0, linetype = "dashed", color = "red") +
labs(
title = "Top 20 Features by Selection Frequency",
x = "Selection Frequency (1.0 = all folds)",
y = NULL
) +
theme_minimal()
Part 5: Optimal Signature Selection
5.1 Multi-Objective Selection
OmicSelector selects signatures balancing:
- Performance (AUC)
- Stability (Nogueira Index)
- Parsimony (fewer features)
# Select best signature using weighted mode
best <- select_best_signature(
result,
mode = "weighted",
stability_weight = 0.3, # 30% weight on stability
parsimony_weight = 0.1 # 10% weight on fewer features
)
print(best)
5.2 Selection Modes
| Mode | Description | Use When |
|---|---|---|
| "best_auc" | Highest AUC only | Maximum predictive power needed |
| "best_stability" | Highest stability only | Reproducibility is critical |
| "weighted" | Weighted combination | Balanced selection (default) |
| "pareto" | Pareto-optimal with utopia point | Formal multi-objective |
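To make the weighted mode concrete, here is a conceptual sketch of how a composite score might combine the three criteria. The exact formula inside select_best_signature() may differ, and the candidate values below are toy numbers for illustration only:
# Toy weighted composite: reward AUC and stability, penalise larger signatures
candidates <- data.frame(
  id        = c("anova_rf_20", "anova_rf_30", "anova_glmnet_30"),
  auc       = c(0.91, 0.93, 0.92),   # toy values
  stability = c(0.82, 0.74, 0.88),   # toy values
  n_feat    = c(20, 30, 30)
)
w_stab <- 0.3
w_pars <- 0.1
candidates$score <- (1 - w_stab - w_pars) * candidates$auc +
  w_stab * candidates$stability +
  w_pars * (1 - candidates$n_feat / max(candidates$n_feat))
candidates[order(-candidates$score), ]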
5.3 Final Signature
# Extract the final signature
final_features <- best$features
cat("Final signature contains", length(final_features), "features:\n")
print(final_features)
# Summary metrics
cat("\nSignature Metrics:\n")
cat(" AUC:", round(best$auc, 3), "\n")
cat(" Stability:", round(best$stability, 3), "\n")
cat(" N Features:", length(final_features), "\n")Part 6: Training the Final Model
After selecting an optimal signature configuration, we train a final model on the full dataset for deployment or external validation.
6.1 Selecting the Best Learner
# From benchmarking, identify the best configuration
# Let's say "anova_rf_30" performed best (example)
best_config <- "anova_rf_30"
# Create the learner with the optimal configuration
final_learner <- pipeline$create_graph_learner(
filter = "anova",
model = "ranger",
n_features = 30,
scale = TRUE
)
# Set the learner ID
final_learner$id <- "final_model"
6.2 Training on Full Data
# Get the task from the pipeline
task <- pipeline$get_task()
# Train the final model on ALL data
# (Feature selection happens inside the pipeline, so no leakage)
final_learner$train(task)
cat("Final model trained successfully.\n")
cat("Model type:", final_learner$model$model, "\n")
# Extract selected features from the trained pipeline
# The filter step stores which features were selected
selected_features <- final_learner$state$model$filter$features
cat("Features selected:", length(selected_features), "\n")
print(head(selected_features, 10))
6.3 Generating Predictions
# Generate predictions on the training data (for calibration)
predictions <- final_learner$predict(task)
# Extract probabilities for the positive class
prob_positive <- predictions$prob[, "Case"]
# Extract true labels
true_labels <- as.integer(task$truth() == "Case")
# Summary statistics
cat("Prediction summary:\n")
cat(" Mean probability:", round(mean(prob_positive), 3), "\n")
cat(" Min:", round(min(prob_positive), 3), "\n")
cat(" Max:", round(max(prob_positive), 3), "\n")
cat(" AUC (resubstitution):",
round(predictions$score(msr("classif.auc")), 3), "\n")
Note: The resubstitution AUC is optimistically biased. The nested CV results from Part 3 provide unbiased performance estimates.
Part 7: Batch Correction with FrozenComBat
7.1 Why FrozenComBat?
When applying a model to external data with batch effects, we must:
1. Fit batch parameters on training data only
2. Apply the same transformation to new data
Standard ComBat re-fits on all data, causing leakage. FrozenComBat solves this by “freezing” parameters from training.
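The sketch below illustrates the freeze-then-apply idea with simple per-batch mean centering. It is not FrozenComBat itself (which also adjusts variances and applies empirical-Bayes shrinkage), the function names are illustrative rather than package API, and it assumes the test batches were also present in training:
# Freeze per-batch location parameters on the training data only
freeze_batch_params <- function(train_x, train_batch) {
  lapply(split(as.data.frame(train_x), train_batch), colMeans)
}
# Apply the frozen parameters to new data without re-estimating them
apply_frozen <- function(new_x, new_batch, frozen_means) {
  corrected <- as.matrix(new_x)
  for (b in unique(new_batch)) {
    idx <- new_batch == b
    corrected[idx, ] <- sweep(corrected[idx, , drop = FALSE], 2,
                              frozen_means[[b]], "-")
  }
  corrected
}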
7.3 Including Batch Correction in Pipeline
# Create pipeline with batch information
pipeline_bc <- OmicPipeline$new(
data = analysis_data,
target = "outcome",
positive = "Case",
batch = "batch" # Specify batch column
)
# Create learner with batch correction enabled
learner_bc <- pipeline_bc$create_graph_learner(
filter = "anova",
model = "ranger",
n_features = 30,
batch_correct = TRUE # Enable FrozenComBat
)
print(learner_bc)
7.4 Verifying No Batch Leakage
FrozenComBat prevents batch correction leakage by fitting parameters only on training data and applying frozen transformations to test data. The pipeline handles this automatically when batch_correct = TRUE.
To manually verify batch effect reduction, compare corrected vs original data:
# Manual verification of batch correction (after FrozenComBat)
# The correction should reduce batch-related variance while preserving
# biological signal. Check the mean shift between batches:
# Example: compute batch mean difference before/after correction
# If using frozen_combat_correct() directly:
# result <- frozen_combat_correct(train_data, train_batch, test_data, test_batch)
# Then compare batch means in result$corrected vs original
Part 8: Model Calibration
8.1 Why Calibrate?
Predicted probabilities may not reflect true event rates. Calibration
ensures P(outcome|prediction=0.7) ≈ 70%.
When to calibrate:
- When probability estimates (not just classifications) matter
- For risk stratification or clinical decision support
- When combining predictions with other probability-based models
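Section 8.2 uses the package function compute_ece(). Conceptually, ECE bins the predictions, then averages the absolute gap between the mean predicted probability and the observed event rate in each bin, weighted by bin size. A minimal sketch with equal-width bins (the package implementation may use different binning):
# Illustrative ECE computation: weighted mean |observed rate - mean prediction|
ece_sketch <- function(prob, y, n_bins = 10) {
  bins <- cut(prob, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  conf <- tapply(prob, bins, mean)   # mean predicted probability per bin
  obs  <- tapply(y, bins, mean)      # observed event rate per bin
  n    <- as.numeric(table(bins))
  sum(n / sum(n) * abs(obs - conf), na.rm = TRUE)
}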
8.2 Computing Calibration Metrics
Using the predictions from Part 6 (final model training):
# We already have prob_positive and true_labels from Part 6
# prob_positive <- predictions$prob[, "Case"]
# true_labels <- as.integer(task$truth() == "Case")
# Compute Expected Calibration Error (ECE)
# ECE measures how well predicted probabilities match observed frequencies
ece <- compute_ece(prob_positive, true_labels)
cat("Expected Calibration Error (ECE):", round(ece, 4), "\n")
# Interpretation:
# ECE < 0.05: Well-calibrated
# ECE 0.05-0.10: Acceptable
# ECE > 0.10: Consider recalibration
# Decompose Brier Score into reliability, resolution, uncertainty
brier <- decompose_brier(prob_positive, true_labels)
cat("\nBrier Score Decomposition:\n")
cat(" Total Brier Score:", round(brier$brier, 4), "\n")
cat(" Reliability (lower is better):", round(brier$reliability, 4), "\n")
cat(" Resolution (higher is better):", round(brier$resolution, 4), "\n")
cat(" Uncertainty (baseline):", round(brier$uncertainty, 4), "\n")8.3 Calibration Plot
# Create calibration curve data
n_bins <- 10
breaks <- seq(0, 1, length.out = n_bins + 1)
bins <- cut(prob_positive, breaks = breaks, include.lowest = TRUE)
cal_data <- data.frame(
bin = levels(bins),
predicted = tapply(prob_positive, bins, mean),
observed = tapply(true_labels, bins, mean),
n = as.numeric(table(bins))
)
cal_data <- cal_data[!is.na(cal_data$predicted), ]
# Plot calibration curve
ggplot(cal_data, aes(x = predicted, y = observed)) +
geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray50") +
geom_point(aes(size = n), color = "steelblue", alpha = 0.7) +
geom_smooth(method = "loess", se = FALSE, color = "darkblue") +
scale_size_continuous(name = "N samples", range = c(2, 8)) +
labs(
title = "Calibration Plot",
subtitle = paste("ECE =", round(ece, 3)),
x = "Mean Predicted Probability",
y = "Observed Frequency"
) +
coord_equal(xlim = c(0, 1), ylim = c(0, 1)) +
theme_minimal()
8.4 Applying Calibration
If calibration is poor (ECE > 0.10), apply recalibration:
# Option 1: Platt Scaling (logistic recalibration)
# Best for sigmoid-shaped miscalibration
calibrator_platt <- fit_platt_scaling(prob_positive, true_labels)
# Option 2: Isotonic Regression
# More flexible, best for non-monotonic miscalibration
calibrator_isotonic <- fit_isotonic_calibration(prob_positive, true_labels)
# Apply calibration to the same data (for demonstration)
# Note: fit_platt_scaling and fit_isotonic_calibration return functions
calibrated_platt <- calibrator_platt(prob_positive)
calibrated_isotonic <- calibrator_isotonic(prob_positive)
# Compare calibration errors
ece_original <- compute_ece(prob_positive, true_labels)
ece_platt <- compute_ece(calibrated_platt, true_labels)
ece_isotonic <- compute_ece(calibrated_isotonic, true_labels)
cat("Calibration Comparison:\n")
cat(" Original ECE:", round(ece_original, 4), "\n")
cat(" After Platt Scaling:", round(ece_platt, 4), "\n")
cat(" After Isotonic Regression:", round(ece_isotonic, 4), "\n")
# Save the calibrator for use with new predictions
saveRDS(calibrator_platt, "calibrator_platt.rds")
cat("\nCalibrator saved for future use.\n")Important: For rigorous calibration: - Use a held-out calibration set, not the training data - Or perform calibration within cross-validation folds
Part 9: Generating Reports
9.1 TRIPOD+AI Compliant Report
OmicSelector generates reports following the TRIPOD+AI checklist:
# Generate comprehensive report
generate_tripod_report(
results = result,
output_file = "tcga_pancreatic_report.html",
title = "Pancreatic Cancer miRNA Biomarker Discovery",
author = "OmicSelector Analysis"
)
9.2 Report Contents
The generated report includes:
- Executive Summary - Key metrics at a glance
- Methods - Data source, validation strategy, model details
- Discrimination - AUC, sensitivity, specificity
- Calibration - Brier score, calibration plot
- Stability - Nogueira index, consensus features
- TRIPOD Checklist - Compliance status
Part 10: Phase 5 Advanced Features
OmicSelector 2.0 includes cutting-edge methods validated on TCGA data.
10.1 GOF Filters for Sparse miRNA Data
miRNA data often shows zero-inflation (features “off” in one condition). GOF filters detect these patterns that ANOVA misses.
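Before using the package filters, a small base-R illustration of the phenomenon: two groups constructed to have (nearly) equal means but very different shapes, so a mean-based test sees little while a goodness-of-fit test such as Kolmogorov-Smirnov compares the full distributions:
# Zero-inflated feature: similar means in both groups, very different shapes
set.seed(1)
grp_a <- c(rep(0, 50), rnorm(50, mean = 10, sd = 1))  # half zeros, half high values
grp_b <- rnorm(100, mean = 5, sd = 1)                 # unimodal around the same mean
t.test(grp_a, grp_b)$p.value   # mean-based comparison (means ~equal by design)
ks.test(grp_a, grp_b)$p.value  # distribution-based comparison (shapes differ)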
# Load GOF filters
source(system.file("R", "FilterGOF.R", package = "OmicSelector"))
# Compare KS filter vs ANOVA
ks_filter <- FilterGOF_KS$new()
ks_filter$calculate(task)
ks_top <- names(sort(ks_filter$scores, decreasing = TRUE))[1:20]
anova_filter <- flt("anova")
anova_filter$calculate(task)
anova_top <- names(sort(anova_filter$scores, decreasing = TRUE))[1:20]
# Features unique to KS (potential discoveries)
unique_discoveries <- setdiff(ks_top, anova_top)
cat("KS-unique features:", length(unique_discoveries), "\n")On TCGA kidney cancer, KS filter discovered the miR-200/miR-141 family (EMT regulators) that ANOVA missed.
10.2 AutoXAI with Correlation Warnings
Get SHAP-based interpretability with automatic warnings for correlated features:
# Train a model
learner_rf <- lrn("classif.ranger",
predict_type = "prob",
num.trees = 200,
importance = "impurity"
)
learner_rf$train(task)
# Run XAI pipeline
xai_results <- xai_pipeline(
learner = learner_rf,
task = task,
top_k = 15,
cor_threshold = 0.7
)
# Check for correlation warnings
if (xai_results$correlations$n_warnings > 0) {
cat("WARNING:", xai_results$correlations$n_warnings,
"correlated feature pairs detected!\n")
cat("SHAP values may be unreliable for these features.\n")
}
# Visualize importance
plot_xai_importance(xai_results, top_n = 15)
10.3 Bootstrap Stability Ensemble
Find reproducible biomarkers across resamples:
# Create ensemble with multiple filters
ensemble <- create_stability_ensemble(
preset = "default",
n_bootstrap = 100,
n_features = 30
)
# Fit and extract stable features
ensemble$fit(task, seed = 42)
stable_features <- ensemble$get_feature_importance(30)
# Features with >90% selection are highly reproducible
very_stable <- names(stable_features[stable_features >= 0.9])
cat("Very stable biomarkers (>90%):", length(very_stable), "\n")
print(very_stable)
10.4 Bayesian Hyperparameter Tuning
Replace grid search with efficient Bayesian optimization:
# Create Bayesian-tuned autotuner
autotuner <- make_autotuner_glmnet(
task,
n_evals = 20,
inner_folds = 3
)
# Train (performs Bayesian optimization internally)
autotuner$train(task)
# Get optimal parameters
params <- get_optimal_params(autotuner)
cat("Optimal alpha:", params$alpha, "\n")
cat("Optimal lambda:", params$s, "\n")10.5 SMOTE for Class Imbalance
Balance classes properly inside CV folds:
# Check current class balance
cat("Original class distribution:\n")
print(table(task$truth()))
# Apply SMOTE
task_balanced <- smote_augment(task, ratio = 1.0, k = 5)
cat("\nBalanced class distribution:\n")
print(table(task_balanced$truth()))
# Validate synthetic data quality
validation <- validate_synthetic(
real = as.data.frame(task$data()),
synthetic = as.data.frame(task_balanced$data()),
target = "outcome",
features = task$feature_names
)
cat("\nSynthetic data quality score:", validation$quality_score, "\n")Summary
In this tutorial, we demonstrated:
- Data Preparation: Loading TCGA data, selecting proper controls
- Pipeline Creation: Using OmicPipeline with GraphLearners
- Benchmarking: Comparing multiple configurations with nested CV
- Stability Analysis: Computing and interpreting Nogueira Index
- Signature Selection: Multi-objective selection of optimal features
- Final Model Training: Training on full data for deployment
- Batch Correction: FrozenComBat for leakage-free batch correction
- Calibration: Ensuring reliable probability predictions
- Reporting: TRIPOD+AI compliant documentation
- Phase 5 Features: GOF filters, AutoXAI, stability ensembles, Bayesian tuning
Key Takeaways
- Always use nested cross-validation for unbiased performance estimates
- Check feature stability - high AUC with low stability = overfitting
- Use consensus features (selected in all folds) for robust signatures
- Apply FrozenComBat when external validation involves batch effects
- Generate TRIPOD+AI reports for publication-ready documentation
Session Info
sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices datasets utils methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 desc_1.4.3 R6_2.6.1 fastmap_1.2.0
#> [5] xfun_0.55 cachem_1.1.0 knitr_1.51 htmltools_0.5.9
#> [9] rmarkdown_2.30 lifecycle_1.0.4 cli_3.6.5 sass_0.4.10
#> [13] pkgdown_2.2.0 textshaping_1.0.4 jquerylib_0.1.4 renv_1.1.5
#> [17] systemfonts_1.3.1 compiler_4.5.2 tools_4.5.2 ragg_1.5.0
#> [21] bslib_0.9.0 evaluate_1.0.5 yaml_2.3.12 otel_0.2.0
#> [25] jsonlite_2.0.0 rlang_1.1.6 fs_1.6.6