This vignette provides an interactive example of 𝑅𝑡 estimation from incidence data using the EpiLPS package. EpiLPS utilizes Laplacian-P-Splines to provide rapid, bayesian estimates of the epidemic curve and instantaneous reproduction number among other functionalities.
In this case, we are focusing specifically on the estimation of 𝑅𝑡 and demonstrating the use of summrt to standardize outputs from EpiLPS and related models to a predictable, cleaned format for cross-validation and performance evaluation.
This vignette requires the following libraries:
#install.packages('EpiLPS')
#install.packages("tidyverse")
#install.packages('summrt')
require(EpiLPS)
## Loading required package: EpiLPS
require(tidyverse)
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#require(summrt)
The evaluation data set contains synthetic data describing a simulated oubtreak for epi model cross validation. The all_data components used in this example include:
all_data also contains additional attributes not directly usable by this model.
url <- "https://raw.githubusercontent.com/cmilando/RtEval/main/all_data.RDS"
all_data <- readRDS(url(url))
In this section we convert the previously loaded input data into the formats expected by EpiLPS.
si_spec <- Idist(probs = all_data$serial$Px)
Here we zero fill NaN rows.
incidence = all_data$cases$daily_reports
which(is.na(incidence))
## [1] 1
incidence[1] <- 0
In this section we use Laplacian-P-Splines to compute 𝑅𝑡 from the serial interval probability vector and daily incidence curve via the estimR method.
LPSfit <- estimR(incidence = incidence, si = si_spec$pvec)
Here we take the probability distributions for incubation time and reporting delays and aggregate and round them into the integer values expected by the model.
INCUBATION_SHIFT = round(weighted.mean(x = all_data$incubation$Day, w = all_data$incubation$Px))
REPORTINGDELAY_SHIFT = round(weighted.mean(x = all_data$reporting_delay$Day, w = all_data$reporting_delay$Px))
In this section we are extracting the percentile ranges from the modeled LPSfit and joining them into the output format expected by ggplot.
plot_data <- data.frame(
package = "EpiLPS",
date = all_data$cases$day - INCUBATION_SHIFT - REPORTINGDELAY_SHIFT,
Rt_median = LPSfit$RLPS$Rq0.50,
Rt_lb = LPSfit$RLPS$Rq0.025,
Rt_ub = LPSfit$RLPS$Rq0.975
)
Here we plot the results of the modeled 𝑅𝑡 curve and it’s distribution against the synthetic curve.
as_tibble(plot_data) %>%
ggplot() +
geom_hline(yintercept = 1, linetype = "11") +
# *******
# this is the true r(t), back-calculated
geom_line(aes(x = Day, y = Rt_calc), data = all_data$rt) +
# *******
geom_ribbon(aes(x = date, ymin = Rt_lb, ymax = Rt_ub, fill = package),
alpha = 0.25) +
geom_line(aes(x = date, y = Rt_median, color = package)) +
coord_cartesian(ylim = c(0, 5)) +
xlab("Days") +
ylab("Rt") +
theme(
axis.text = element_text(size = 10),
axis.title = element_text(size = 14)
)
summrt
#outcomes <- summrt(LPSfit)