This vignette demonstrates how to use the rtestim
package to estimate the reproduction number \(R_t\) for a disease based on observed case
counts and other relevant delay distributions. We will load in
pre-processed data, run the estimate_rt
model, and
visualize the results along with confidence bands.
To start, we will load the required data. The data includes daily reported cases, serial interval, incubation period, and reporting delay distributions.
library(rtestim)
library(ggplot2)
library(dplyr)
library(tidyverse)
library(summrt)
url <- "https://raw.githubusercontent.com/cmilando/RtEval/main/all_data.RDS"
all_data <- readRDS(url(url))
The data includes:
These will be used to estimate over the observation period.
Next, we will estimate the reproduction number \(R_t\) using the estimate_rt
function.
This function requires the observed daily case counts, the corresponding days, and the serial interval distribution as inputs. rr defines the range of rows from the dataset all_data$cases that will be included in the analysis.
rr <- 2:nrow(all_data$cases)
rtestim <- cv_estimate_rt(
observed_counts = all_data$cases$daily_reports[rr],
x = all_data$cases$day[rr],
delay_distn = all_data$serial$Px
)
summrt
The estimate_rt function uses the daily reports and the serial interval to produce time-varying estimates of the reproduction number \(R_t\)Calculating Confidence Bands
We can compute approximate confidence bands around the \(R_t\)estimates using the confband function. These confidence bands help provide uncertainty intervals for the estimates.
rtestim_cb <- confband(rtestim, lambda = "lambda.1se")
The lambda value, derived from cross-validation, controls the smoothness of the estimated curve. The function outputs the fit and credible intervals (e.g., 2.5% and 97.5% quantiles), which are used to visualize the uncertainty.
We first compute the shifts for incubation and reporting delay.
INCUBATION_SHIFT = round(weighted.mean(x = all_data$incubation$Day,
w = all_data$incubation$Px))
REPORTINGDELAY_SHIFT = round(weighted.mean(x = all_data$reporting_delay$Day,
w = all_data$reporting_delay$Px))
Adjust infection date based on incubation period and reporting delay period.
plot_rtestim <- data.frame(
package = "rtEstim",
date = all_data$cases$day[rr] - INCUBATION_SHIFT - REPORTINGDELAY_SHIFT,
Rt_median = rtestim_cb$fit,
Rt_lb = rtestim_cb$`2.5%`,
Rt_ub = rtestim_cb$`97.5%`
)
as_tibble(plot_rtestim) %>%
ggplot() +
geom_hline(yintercept = 1, linetype = "11") +
# *******
# this is the true r(t), back-calculated
geom_line(aes(x = Day, y = Rt_calc), data = all_data$rt) +
# *******
geom_ribbon(aes(x = date, ymin = Rt_lb, ymax = Rt_ub, fill = package), alpha = 0.25) +
geom_line(aes(x = date, y = Rt_median, color = package)) +
coord_cartesian(ylim = c(0, 5)) +
xlab("Days") +
ylab("Rt") +
theme(
axis.text = element_text(size = 10),
axis.title = element_text(size = 14)
)
outcome <- summarize_rtestimate(rtestim)