The LoadData function ensures that input data is
properly formatted, processed for analysis, and stored internally for
the session. The package includes the gripsYR1 dataset,
which contains the first year of recruitment data from the GRIPS study.
We will use the ScreenDt and Enrolled columns
to initialize our baseline.
## ScreenDt Enrolled
## 1 2019-06-19 1
## 2 2019-06-20 0
## 3 2019-06-21 0
## 4 2019-06-24 0
## 5 2019-06-25 0
## 6 2019-06-26 0
# Load the data into the package's internal memory
LoadData(data = gripsYR1, date = ScreenDt, enrolled = Enrolled)##
## Variables Enrolled and ScreenDt were successfully loaded
The GetWeekPredCI function predicts cumulative weekly
recruitment, providing median estimates and a 95% projection band. By
default, weight functions are generated using a Binomial (51, 0.5)
probability mass function, anchoring the peak of the curve at the same
calendar week in the empirical data to capture seasonal patterns. An s3
plot method is available to visualize the results.
set.seed(123)
res <- GetWeekPredCI()
# View the first 10 weeks of the prediction matrix
res$predCI[1:10, ]## 2.5% 50% 97.5%
## 0 0 0 0
## 1 0 0 2
## 2 0 1 3
## 3 0 1 4
## 4 0 2 5
## 5 0 2 5
## 6 0 2 6
## 7 0 2 6
## 8 0 3 7
## 9 0 3 7
There are instances where the empirical data contains prolonged
periods with zero enrollment, referred to as “gap weeks,” which are not
expected to recur (e.g., pandemic disruptions). Setting
fillGaps = TRUE replaces these with expected values. To
adjust for other anticipated changes, such as a 50% increase in
recruitment efficiency, use the efficiencyFactor
multiplier.
set.seed(123)
res_anomaly <- GetWeekPredCI(fillGaps = TRUE, efficiencyFactor = 1.5)
# View the first 10 weeks of the adjusted prediction matrix for comparison
res_anomaly$predCI[1:10, ]## 2.5% 50% 97.5%
## 0 0 0 0
## 1 0 0 3
## 2 0 2 4
## 3 0 2 6
## 4 0 3 8
## 5 0 3 8
## 6 0 3 9
## 7 0 3 9
## 8 0 4 10
## 9 0 4 10
The Time2Nsubjects function estimates the number of
weeks required to recruit a specified number of subjects based on the
historical recruitment data. The default target is 50 subjects.
## Enrolling 50 subjects requires 148 weeks
##
## 2.5% 50% 97.5%
## 107 148 197
The GetDistance function calculates the Euclidean
Distance (ED) between predicted and actual weekly recruitment to assess
model accuracy. It requires a target cumulative enrollment vector. Here,
we use the second year of recruitment (gripsYR2Weekly) as
our target benchmark.
We can evaluate how adjustments affect predictive accuracy by comparing the distance of a default model against an adjusted model.
# Set the target vector
target <- gripsYR2Weekly$enrolled
# Calculate Euclidean Distance using defaults
set.seed(123)
GetDistance(target = target)## 2.5% 50% 97.5%
## 66 104 138
# Calculate Euclidean Distance accounting for gap weeks
set.seed(123)
GetDistance(target = target, fillGaps = TRUE)## 2.5% 50% 97.5%
## 26 60 93
# Compare four different predictive scenarios visually
scenarios <- list(
sc1 = GetWeekPredCI(),
sc2 = GetWeekPredCI(cauchyWt = TRUE),
sc3 = GetWeekPredCI(fillGaps = TRUE),
sc4 = GetWeekPredCI(fillGaps = TRUE, efficiencyFactor = 1.5)
)
maxY <- sapply(scenarios, \(x) x$pargs$maxY) |> max()
defaultGraphicParams <- par(no.readonly = TRUE)
graphics::par(mfrow = c(2, 2), cex.main = 1)
for (x in scenarios) plot(x, yMax = maxY, main = x$call.)