Introduction

This vignette demonstrates how to use the rtestim package to estimate the reproduction number \(R_t\) for a disease based on observed case counts and other relevant delay distributions. We will load in pre-processed data, run the estimate_rt model, and visualize the results along with confidence bands.

Load Data

To start, we will load the required data. The data includes daily reported cases, serial interval, incubation period, and reporting delay distributions.

library(rtestim)
library(ggplot2)
library(dplyr)
library(tidyverse)
library(summrt)
url <- "https://raw.githubusercontent.com/cmilando/RtEval/main/all_data.RDS"
all_data <- readRDS(url(url))

The data includes:

  • daily_reports: Daily reported case counts
  • serial: Distribution of the serial interval between cases
  • incubation: Incubation period distribution
  • reporting_delay: Delay in reporting case data

These will be used to estimate over the observation period.

Running the Rt Estimation Model

Next, we will estimate the reproduction number \(R_t\) using the estimate_rt function.

This function requires the observed daily case counts, the corresponding days, and the serial interval distribution as inputs. rr defines the range of rows from the dataset all_data$cases that will be included in the analysis.

rr <- 2:nrow(all_data$cases)

rtestim <- cv_estimate_rt(
  observed_counts = all_data$cases$daily_reports[rr],
  x = all_data$cases$day[rr],
  delay_distn = all_data$serial$Px
)

Extract the outomes using summrt

The estimate_rt function uses the daily reports and the serial interval to produce time-varying estimates of the reproduction number \(R_t\)Calculating Confidence Bands

We can compute approximate confidence bands around the \(R_t\)estimates using the confband function. These confidence bands help provide uncertainty intervals for the estimates.

rtestim_cb <- confband(rtestim, lambda = "lambda.1se")

The lambda value, derived from cross-validation, controls the smoothness of the estimated curve. The function outputs the fit and credible intervals (e.g., 2.5% and 97.5% quantiles), which are used to visualize the uncertainty.

Plotting output

We first compute the shifts for incubation and reporting delay.

INCUBATION_SHIFT = round(weighted.mean(x = all_data$incubation$Day,
                                       w = all_data$incubation$Px))

REPORTINGDELAY_SHIFT = round(weighted.mean(x = all_data$reporting_delay$Day,
                                           w = all_data$reporting_delay$Px))

create rtestim output

Adjust infection date based on incubation period and reporting delay period.

plot_rtestim <- data.frame(
  package = "rtEstim",
  date = all_data$cases$day[rr] - INCUBATION_SHIFT - REPORTINGDELAY_SHIFT,
  Rt_median = rtestim_cb$fit,
  Rt_lb = rtestim_cb$`2.5%`,
  Rt_ub = rtestim_cb$`97.5%`
)

plot format

as_tibble(plot_rtestim) %>%
  ggplot() +
  geom_hline(yintercept = 1, linetype = "11") +
  # *******
  # this is the true r(t), back-calculated
  geom_line(aes(x = Day, y = Rt_calc), data = all_data$rt) +
  # *******
  geom_ribbon(aes(x = date, ymin = Rt_lb, ymax = Rt_ub, fill = package), alpha = 0.25) +
  geom_line(aes(x = date, y = Rt_median, color = package)) +
  coord_cartesian(ylim = c(0, 5)) +
  xlab("Days") +
  ylab("Rt") +
  theme(
    axis.text = element_text(size = 10),
    axis.title = element_text(size = 14)
  )

outcome <- summarize_rtestimate(rtestim)