Introduction

This vignette provides an interactive example of 𝑅𝑡 estimation from incidence data using the EpiLPS package. EpiLPS utilizes Laplacian-P-Splines to provide rapid, bayesian estimates of the epidemic curve and instantaneous reproduction number among other functionalities.

In this case, we are focusing specifically on the estimation of 𝑅𝑡 and demonstrating the use of summrt to standardize outputs from EpiLPS and related models to a predictable, cleaned format for cross-validation and performance evaluation.

Prepare the environment

This vignette requires the following libraries:

Install dependencies

#install.packages('EpiLPS')
#install.packages("tidyverse")
#install.packages('summrt')

Import libraries

require(EpiLPS)
## Loading required package: EpiLPS
require(tidyverse)
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#require(summrt)

Load the data

The evaluation data set contains synthetic data describing a simulated oubtreak for epi model cross validation. The all_data components used in this example include:

all_data also contains additional attributes not directly usable by this model.

url <- "https://raw.githubusercontent.com/cmilando/RtEval/main/all_data.RDS"
all_data <- readRDS(url(url))

Prepare model data structures

In this section we convert the previously loaded input data into the formats expected by EpiLPS.

Extract the discrete distribution of the serial interval

si_spec <- Idist(probs = all_data$serial$Px)

Prepare incidence data

Here we zero fill NaN rows.

incidence = all_data$cases$daily_reports
which(is.na(incidence))
## [1] 1
incidence[1] <- 0

Run the model

Fit Rt from incidence and serial interval

In this section we use Laplacian-P-Splines to compute 𝑅𝑡 from the serial interval probability vector and daily incidence curve via the estimR method.

LPSfit <- estimR(incidence = incidence, si = si_spec$pvec)

Apply time shifts

Here we take the probability distributions for incubation time and reporting delays and aggregate and round them into the integer values expected by the model.

INCUBATION_SHIFT = round(weighted.mean(x = all_data$incubation$Day, w = all_data$incubation$Px))

REPORTINGDELAY_SHIFT = round(weighted.mean(x = all_data$reporting_delay$Day, w = all_data$reporting_delay$Px))

Plot model results

Prepare plot data structure

In this section we are extracting the percentile ranges from the modeled LPSfit and joining them into the output format expected by ggplot.

plot_data <- data.frame(
  package = "EpiLPS",
  date = all_data$cases$day - INCUBATION_SHIFT - REPORTINGDELAY_SHIFT,
  Rt_median = LPSfit$RLPS$Rq0.50,
  Rt_lb = LPSfit$RLPS$Rq0.025,
  Rt_ub = LPSfit$RLPS$Rq0.975
)

Create plot

Here we plot the results of the modeled 𝑅𝑡 curve and it’s distribution against the synthetic curve.

as_tibble(plot_data) %>%
  ggplot() +
  geom_hline(yintercept = 1, linetype = "11") +
  # *******
  # this is the true r(t), back-calculated
  geom_line(aes(x = Day, y = Rt_calc), data = all_data$rt) +
  # *******
  geom_ribbon(aes(x = date, ymin = Rt_lb, ymax = Rt_ub, fill = package),
              alpha = 0.25) +
  geom_line(aes(x = date, y = Rt_median, color = package)) +
  coord_cartesian(ylim = c(0, 5)) +
  xlab("Days") +
  ylab("Rt") +
  theme(
    axis.text = element_text(size = 10),
    axis.title = element_text(size = 14)
  )

Extract the outcomes using summrt

#outcomes <- summrt(LPSfit)