diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 6576998..8a6cc20 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -91,14 +91,38 @@ Fix spelling, grammar, and other minor problems without asking the user. Label a Only report what you have changed. +## Programming rules for LLMs +1. mostly lower-case comments +2. no in-line comments +3. don't number comments +4. comments are followed by '----', such as "# dependencies----". nothing that takes a full line. +5. no unnecessary code changes beyond what is already done +6. unless I ask, don't print too much code at once +7. don't do unnecessary print statements within the code. +8. don't add unnecessary try/catch statements or make unnecessary validation checks. i'm working by myself, no need for these things -- I want to see the errors! +9. NO EMOJIS +10. in R code, 2 spaces per tab and base R pipe + ## Refactor rules -We're attemping to refactor many issues with this repositroy. In general, let's: -- make sure all examples are functional and not wrapped in dontrun{} + +### Per-function checklist + +When refactoring any estimator or internal function, apply all of these: + +1. Make runnable examples (no `\dontrun{}`) +2. Remove commented-out code +3. Remove TODOs +4. Add type checks (checkmate) +5. Factor out tidyverse (`filter` → `df[cond, ]`, `mutate` → direct assignment) +6. `<-` for assignment (not `=`) +7. Drop debug prints (`cat`, `print`, commented `# print(...)`) +8. Consistent variable naming without periods (e.g. `pi_S` not `pi.S`) +9. Drop backtick column access (`temp$\`piA\`` → `temp$piA`) + +### General rules + - make sure we have test cases for all functions -- remove @TODO blocks and other ugly code - remove magrittr pipe and replace with base R pipe -- remove tidyverse dependencies -- replace "=" with "<-" - make sure every function is documented - add validation to function inputs @@ -109,3 +133,150 @@ Run this one-liner to validate the package before committing: ``` Rscript -e "devtools::document()" && Rscript -e "styler::style_pkg()" && Rscript -e "spelling::spell_check_package()" && Rscript -e "lintr::lint_package()" && Rscript -e "devtools::check(vignettes = FALSE)" ``` + +## Changing the API + +### Goals + +1. **Better method constructors** — replace `setup_method_weighting(method_name="IPW", ...)` with `ec_ipw()`, etc. Each constructor carries its own estimation logic. +2. **Polymorphic dispatch** — `run_analysis()` calls a generic on the method object instead of an if/else tree. Adding a new method = writing one constructor. +3. **Merge bootstrap** — bootstrap is an inference option on the method, not a separate code path. +4. **(Future) Model formula interface** — `outcome ~ treatment | covariates` instead of column name args. + +### Current workflow (to deprecate) + +```r +method <- setup_method_weighting( + method_name = "IPW", + optimal_weight_flag = FALSE, + wt = 0, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" +) + +analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method +) + +res <- run_analysis(analysis) +``` + +### Desired workflow + +```r +method <- ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + weight = NULL, # NULL = optimal, 0 = no borrowing, 0.3 = fixed + bootstrap = 500, # NULL = sandwich SE only + bootstrap_ci_type = NULL # NULL defaults to "perc" when bootstrap is set +) + +analysis <- setup_analysis( + data = SyntheticData, + outcomes = c("y1", "y2"), + treatment = "A", + trial_status = "S", + covariates = c("x1", "x2", "x3", "x4", "x5"), + method = method +) + +res <- run_analysis(analysis) +``` + +### Method constructors + +| Constructor | Replaces | Phase | +|---|---|---| +| `ec_ipw()` | `setup_method_weighting(method_name="IPW", ...)` | Primary | +| `ec_aipw()` | `setup_method_weighting(method_name="AIPW", ...)` | Primary | +| `did_ec_ipw()` | `setup_method_DID(method_name="IPW", ...)` | OLE | +| `did_ec_aipw()` | `setup_method_DID(method_name="AIPW", ...)` | OLE | +| `did_ec_or()` | `setup_method_DID(method_name="OR", ...)` | OLE | +| `scm()` | `setup_method_SCM(...)` | OLE | + +Each constructor returns an S4 method object. The S4 class defines a generic `estimate()` that `run_analysis()` dispatches on — no if/else. + +### How dispatch works + +```r +# S4 generic +setGeneric("estimate", function(method, data, ...) standardGeneric("estimate")) + +# Each method class implements estimate() +setMethod("estimate", "ec_ipw_method", function(method, data, ...) { + # IPW estimation logic lives here +}) + +# run_analysis() becomes: +run_analysis <- function(analysis_obj) { + estimate(analysis_obj@method, data = analysis_obj@data, ...) +} +``` + +### Interim dispatch (during migration) + +While new method constructors coexist with the old if/else tree in `run_analysis()`, each new class gets an `else if` block at the end of `run_analysis()` that calls `estimate()`: + +```r +# In run_analysis.R, BEFORE the final } else { stop(...) }: +} else if (is(method, "ec_ipw_method")) { + res <- estimate(method, + data = data, + outcomes = outcome_col_name, + treatment = treatment_col_name, + trial_status = trial_status_col_name, + covariates = covariates_col_name, + alpha = alpha, + quiet = quiet + ) +} +``` + +This pattern is repeated for each new method class as it's created. The old if/else branches for the legacy classes remain untouched. Once all 6 methods are migrated, the entire if/else tree is replaced with a single `estimate()` call. + +New method objects inherit from `method_primary_obj` or `method_OLE_obj`, so they pass the existing `checkmate::assert_class(method, "method_primary_obj")` validation in `setup_analysis_primary()`. + +### Implementation order + +1. Create full-pipeline regression tests for all 6 methods (old API, locked numerical values) ✓ +2. Create all 6 method constructors (start with `ec_ipw()`) +3. Each constructor returns an S4 object with estimation logic via `estimate()` generic +4. For each new method, add an `else if (is(method, "xxx_method"))` to `run_analysis()` +5. Add new-API tests to each pipeline test file (same expected values) +6. Once all 6 are done: refactor `setup_analysis()` into a single function (merge `_primary`/`_OLE`, add `T_cross = NULL`) +7. Once all 6 are done: replace the entire if/else in `run_analysis()` with one `estimate()` call +8. Deprecate `setup_method_weighting`, `setup_method_DID`, `setup_method_SCM`, `setup_analysis_primary`, `setup_analysis_OLE` + +### Design decisions + +- **S4 classes for method objects** — keeps rigorous type definitions, consistent with existing package patterns. +- **`T_cross` goes in `setup_analysis()`** — it's a property of the study design, not the method. +- **`bootstrap_ci_type` is nullable** — defaults to `"perc"` when `bootstrap` is non-NULL, ignored otherwise. + +### Full-pipeline regression tests + +Before refactoring any estimator, we lock in its numerical outputs on `SyntheticData` so any code change that alters results is caught. Tests use the old API (setup_method → setup_analysis → run_analysis) with exact values at `tolerance = 1e-6`. As new API constructors are added, we add parallel assertions against the same expected values. + +Rename existing `test-vignette_results_*` files and split by method: + +| Test file | Method | Covers | +|---|---|---| +| `test-full_pipeline_ec_ipw.R` | EC-IPW | weight=0, optimal, fixed (0.3), bootstrap point estimates | +| `test-full_pipeline_ec_aipw.R` | EC-AIPW | weight=0, optimal, fixed (0.3), bootstrap point estimates | +| `test-full_pipeline_did_ec_ipw.R` | DID-EC-IPW | bootstrap CIs, point estimates | +| `test-full_pipeline_did_ec_aipw.R` | DID-EC-AIPW | bootstrap CIs, point estimates | +| `test-full_pipeline_did_ec_or.R` | DID-EC-OR | bootstrap CIs, point estimates | +| `test-full_pipeline_scm.R` | SCM | bootstrap CIs, point estimates | + +Each test file asserts: +- Point estimates (exact values) +- Standard errors / standard deviations (exact values) +- CI bounds for non-bootstrap (exact values) +- Bootstrap point estimates match non-bootstrap +- Borrow weight (where applicable) + +Simulation tests (`test-vignette_results_*_simulation.R`) stay separate — they test Monte Carlo properties, not individual estimator outputs. \ No newline at end of file diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml index cc87f35..8bc530c 100644 --- a/.github/workflows/R-CMD-check.yaml +++ b/.github/workflows/R-CMD-check.yaml @@ -47,5 +47,5 @@ jobs: - uses: r-lib/actions/check-r-package@v2 with: upload-snapshots: true - build_args: '"--no-manual"' - args: 'c("--no-manual", "--as-cran")' \ No newline at end of file + build_args: 'c("--no-manual", "--no-build-vignettes")' + args: 'c("--no-manual", "--ignore-vignettes", "--as-cran")' \ No newline at end of file diff --git a/DESCRIPTION b/DESCRIPTION index 8ca589b..afbb0fc 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -73,18 +73,10 @@ Suggests: testthat (>= 3.0.0) Config/testthat/edition: 3 Collate: - 'DID_EC_AIPW_bootstrap.R' - 'DID_EC_AIPW.R' - 'DID_EC_IPW_bootstrap.R' - 'DID_EC_IPW.R' - 'DID_EC_OR_bootstrap.R' - 'DID_EC_OR.R' 'EC_AIPW_OPT_bootstrap.R' 'EC_AIPW_OPT.R' 'EC_IPW_OPT_bootstrap.R' 'EC_IPW_OPT.R' - 'SCMboot.R' - 'SCM.R' 'bootstrap_class.R' 'method_class.R' 'analysis_class.R' @@ -94,10 +86,24 @@ Collate: 'method_weighting_class.R' 'analysis_primary_class.R' 'data.R' + 'ec_ipw.R' + 'did_ec_ipw.R' + 'did_ec_aipw.R' + 'did_ec_or.R' + 'ec_aipw.R' + 'legacy_DID_EC_AIPW_bootstrap.R' + 'legacy_DID_EC_AIPW.R' + 'legacy_DID_EC_IPW_bootstrap.R' + 'legacy_DID_EC_IPW.R' + 'legacy_DID_EC_OR_bootstrap.R' + 'legacy_DID_EC_OR.R' + 'legacy_SCMboot.R' + 'legacy_SCM.R' 'package.R' 'rdborrow-package.R' 'run_analysis.R' 'run_simulation.R' + 'scm.R' 'simulate_X_copula.R' 'simulate_X_dct_mvnorm.R' 'simulate_X_mixture.R' diff --git a/NAMESPACE b/NAMESPACE index 8ae08f6..cbbe409 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,7 +1,14 @@ # Generated by roxygen2: do not edit by hand +export(did_ec_aipw) +export(did_ec_ipw) +export(did_ec_or) +export(ec_aipw) +export(ec_ipw) +export(estimate) export(run_analysis) export(run_simulation) +export(scm) export(setup_analysis_OLE) export(setup_analysis_primary) export(setup_bootstrap) diff --git a/NOTES.md b/NEWS.md similarity index 100% rename from NOTES.md rename to NEWS.md diff --git a/R/did_ec_aipw.R b/R/did_ec_aipw.R new file mode 100644 index 0000000..e03ecd5 --- /dev/null +++ b/R/did_ec_aipw.R @@ -0,0 +1,279 @@ +#' @include did_ec_ipw.R +NULL + +# S4 class definition---- +.did_ec_aipw_method <- setClass( + "did_ec_aipw_method", + contains = "method_DID_obj", + slots = c( + ps_formula = "character", + trt_formula = "character", + outcome_formula = "character", + bootstrap = "numericOrNULL", + bootstrap_ci_type = "character" + ), + prototype = list( + method_name = "DID-EC-AIPW", + ps_formula = "", + trt_formula = "", + outcome_formula = "", + bootstrap = NULL, + bootstrap_ci_type = "perc" + ) +) + +#' DID-EC-AIPW method +#' +#' Creates a method object for difference-in-differences augmented IPW +#' estimation with external control borrowing for the open-label +#' extension phase (Zhou et al., 2024). +#' +#' @param ps_formula Formula string for the propensity score model +#' predicting trial participation. +#' @param trt_formula Formula string for the treatment assignment model, +#' or \code{""} for marginal probability. +#' @param outcome_formula Character vector of outcome model formulas, +#' one per time point. +#' @param bootstrap Number of bootstrap replicates. Defaults to 500. +#' @param bootstrap_ci_type Bootstrap CI type. Defaults to \code{"perc"}. +#' +#' @return An S4 object of class \code{did_ec_aipw_method}. +#' +#' @references +#' Zhou et al. (2024). Estimating treatment effect in randomized trial +#' after control to treatment crossover using external controls. +#' \emph{Journal of Biopharmaceutical Statistics}. +#' \doi{10.1080/10543406.2024.2444222} +#' +#' @export +#' +#' @examples +#' did_ec_aipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", +#' outcome_formula = c( +#' "y1 ~ x1 + x2 + x3 + x4 + x5", +#' "y2 ~ x1 + x2 + x3 + x4 + x5", +#' "y3 ~ x1 + x2 + x3 + x4 + x5", +#' "y4 ~ x1 + x2 + x3 + x4 + x5" +#' ), +#' bootstrap = 500 +#' ) +did_ec_aipw <- function(ps_formula, + trt_formula = "", + outcome_formula, + bootstrap = 500L, + bootstrap_ci_type = NULL) { + checkmate::assert_string(ps_formula) + checkmate::assert_string(trt_formula) + checkmate::assert_character(outcome_formula, min.len = 1) + checkmate::assert_count(bootstrap, positive = TRUE) + + if (is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + + .did_ec_aipw_method( + ps_formula = ps_formula, + trt_formula = trt_formula, + outcome_formula = outcome_formula, + bootstrap = bootstrap, + bootstrap_ci_type = bootstrap_ci_type, + method_name = "DID-EC-AIPW", + bootstrap_flag = TRUE, + bootstrap_obj = .bootstrap_obj( + replicates = bootstrap, + bootstrap_CI_type = bootstrap_ci_type + ) + ) +} + +#' @rdname estimate +setMethod("estimate", "did_ec_aipw_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE, + T_cross) { + Y <- as.matrix(data[, outcomes, drop = FALSE]) + S <- data[[trial_status]] + A <- data[[treatment]] + n_time <- ncol(Y) + N <- nrow(data) + n <- sum(S) + pi_S <- n / N + + ps_formula <- sub("^[^~]*~", paste0(trial_status, " ~"), method@ps_formula) + trt_formula <- method@trt_formula + if (trt_formula != "") { + trt_formula <- sub("^[^~]*~", paste0(treatment, " ~"), trt_formula) + } + + df <- data.frame(Y, S = S, A = A, data[, covariates, drop = FALSE]) + + if (!quiet) cat("Running DID-EC-AIPW estimator...\n") + + result <- .did_ec_aipw_estimate( + df, Y, S, A, n, N, pi_S, n_time, + T_cross, ps_formula, trt_formula, + method@outcome_formula + ) + tau <- result$tau + + if (!quiet) cat("Running bootstrap inference...\n") + + n_ole <- n_time - T_cross + boot_res <- .run_bootstrap( + df = df, statistic = .did_ec_aipw_statistic, + n_estimates = n_ole, bootstrap = method@bootstrap, + bootstrap_ci_type = method@bootstrap_ci_type, alpha = alpha, + outcomes = outcomes, ps_formula = ps_formula, + trt_formula = trt_formula, outcome_formula = method@outcome_formula, + T_cross = T_cross + ) + + data.frame( + point_estimates = tau, + lower_CI_boot = boot_res$lower_ci, + upper_CI_boot = boot_res$upper_ci, + row.names = paste0("tau", (T_cross + 1):n_time) + ) +}) + +# internal helpers---- + +#' DID-EC-AIPW point estimate. +#' augments DID-IPW with outcome regression: uses residuals Y-mu(X) +#' instead of raw outcomes for the weighted potential outcomes. +#' @param df internal data frame. +#' @param Y outcome matrix (N x T). +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param N total sample size. +#' @param pi_S marginal trial participation probability. +#' @param n_time number of time points. +#' @param T_cross crossover time point. +#' @param ps_formula propensity score formula. +#' @param trt_formula treatment assignment formula. +#' @param outcome_formula character vector of outcome model formulas. +#' @return list with tau vector. +#' @noRd +.did_ec_aipw_estimate <- function(df, Y, S, A, n, N, pi_S, n_time, + T_cross, ps_formula, trt_formula, + outcome_formula) { + ps_model <- glm(as.formula(ps_formula), data = df, family = "binomial") + pi_SX <- predict(ps_model, newdata = df, type = "response") + + if (trt_formula == "") { + pi_AX <- sum(A[S == 1]) / n + } else { + trt_model <- glm(as.formula(trt_formula), + data = df[S == 1, , drop = FALSE], family = "binomial" + ) + pi_AX <- predict(trt_model, newdata = df, type = "response") + } + + # outcome models on external controls + Y0_models <- lapply(outcome_formula, \(f) { + lm(as.formula(f), data = df[S == 0, , drop = FALSE]) + }) + Y0 <- vapply(seq_len(n_time), \(t) { + predict(Y0_models[[t]], newdata = df) + }, numeric(N)) + Yr <- Y - Y0 + + rx <- pi_SX * (1 - pi_S) / (1 - pi_SX) / pi_S + w11 <- 1 / pi_AX + w10 <- 1 / (1 - pi_AX) + w00 <- rx + + potential <- (S * A * w11 / sum(S * A * w11) + + S * (1 - A) * w10 / sum(S * (1 - A) * w10) + + (1 - S) * w00 / sum((1 - S) * w00)) * Yr + + T_pc <- T_cross + mu_S1A1 <- colSums( + potential[S == 1 & A == 1, (T_pc + 1):n_time, drop = FALSE] + ) + mu_S0A0 <- colSums( + potential[S == 0, (T_pc + 1):n_time, drop = FALSE] + ) + bias <- sum(rowMeans( + potential[S == 1 & A == 0, 1:T_pc, drop = FALSE] + )) - sum(rowMeans( + potential[S == 0, 1:T_pc, drop = FALSE] + )) + + tau <- mu_S1A1 - mu_S0A0 - bias + list(tau = tau) +} + +#' bootstrap statistic for DID-EC-AIPW. +#' refits PS, treatment, and outcome models on each resample. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param ps_formula propensity score formula. +#' @param trt_formula treatment assignment formula. +#' @param outcome_formula outcome model formulas. +#' @param T_cross crossover time point. +#' @return numeric vector of tau estimates. +#' @noRd +.did_ec_aipw_statistic <- function(data, indices, outcomes, ps_formula, + trt_formula, outcome_formula, T_cross) { + d <- data[indices, , drop = FALSE] + Y <- as.matrix(d[, outcomes, drop = FALSE]) + S <- d$S + A <- d$A + n <- sum(S) + N <- nrow(d) + n_time <- ncol(Y) + pi_S <- n / N + + ps_model <- glm(as.formula(ps_formula), data = d, family = "binomial") + pi_SX <- predict(ps_model, newdata = d, type = "response") + + if (trt_formula == "") { + pi_AX <- sum(A[S == 1]) / n + } else { + trt_model <- glm(as.formula(trt_formula), + data = d[S == 1, , drop = FALSE], family = "binomial" + ) + pi_AX <- predict(trt_model, newdata = d, type = "response") + } + + Y0_models <- lapply(outcome_formula, \(f) { + lm(as.formula(f), data = d[S == 0, , drop = FALSE]) + }) + Y0 <- vapply(seq_len(n_time), \(t) { + predict(Y0_models[[t]], newdata = d) + }, numeric(N)) + Yr <- Y - Y0 + + rx <- pi_SX * (1 - pi_S) / (1 - pi_SX) / pi_S + w11 <- 1 / pi_AX + w10 <- 1 / (1 - pi_AX) + w00 <- rx + + potential <- (S * A * w11 / sum(S * A * w11) + + S * (1 - A) * w10 / sum(S * (1 - A) * w10) + + (1 - S) * w00 / sum((1 - S) * w00)) * Yr + + T_pc <- T_cross + mu_S1A1 <- colSums( + potential[S == 1 & A == 1, (T_pc + 1):n_time, drop = FALSE] + ) + mu_S0A0 <- colSums( + potential[S == 0, (T_pc + 1):n_time, drop = FALSE] + ) + bias <- sum(rowMeans( + potential[S == 1 & A == 0, 1:T_pc, drop = FALSE] + )) - sum(rowMeans( + potential[S == 0, 1:T_pc, drop = FALSE] + )) + + mu_S1A1 - mu_S0A0 - bias +} diff --git a/R/did_ec_ipw.R b/R/did_ec_ipw.R new file mode 100644 index 0000000..c4d67da --- /dev/null +++ b/R/did_ec_ipw.R @@ -0,0 +1,245 @@ +#' @include ec_ipw.R +NULL + +# S4 class definition---- +.did_ec_ipw_method <- setClass( + "did_ec_ipw_method", + contains = "method_DID_obj", + slots = c( + ps_formula = "character", + trt_formula = "character", + bootstrap = "numericOrNULL", + bootstrap_ci_type = "character" + ), + prototype = list( + method_name = "DID-EC-IPW", + ps_formula = "", + trt_formula = "", + bootstrap = NULL, + bootstrap_ci_type = "perc" + ) +) + +#' DID-EC-IPW method +#' +#' Creates a method object for difference-in-differences IPW estimation +#' with external control borrowing for the open-label extension phase +#' (Zhou et al., 2024). +#' +#' @param ps_formula Formula string for the propensity score model +#' predicting trial participation. +#' @param trt_formula Formula string for the treatment assignment model, +#' or \code{""} for marginal probability. +#' @param bootstrap Number of bootstrap replicates (required for DID +#' methods). Defaults to 500. +#' @param bootstrap_ci_type Bootstrap CI type. Defaults to \code{"perc"}. +#' +#' @return An S4 object of class \code{did_ec_ipw_method}. +#' +#' @references +#' Zhou et al. (2024). Estimating treatment effect in randomized trial +#' after control to treatment crossover using external controls. +#' \emph{Journal of Biopharmaceutical Statistics}. +#' \doi{10.1080/10543406.2024.2444222} +#' +#' @export +#' +#' @examples +#' did_ec_ipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", +#' bootstrap = 500 +#' ) +did_ec_ipw <- function(ps_formula, + trt_formula = "", + bootstrap = 500L, + bootstrap_ci_type = NULL) { + checkmate::assert_string(ps_formula) + checkmate::assert_string(trt_formula) + checkmate::assert_count(bootstrap, positive = TRUE) + + if (is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + + .did_ec_ipw_method( + ps_formula = ps_formula, + trt_formula = trt_formula, + bootstrap = bootstrap, + bootstrap_ci_type = bootstrap_ci_type, + method_name = "DID-EC-IPW", + bootstrap_flag = TRUE, + bootstrap_obj = .bootstrap_obj( + replicates = bootstrap, + bootstrap_CI_type = bootstrap_ci_type + ) + ) +} + +#' @rdname estimate +setMethod("estimate", "did_ec_ipw_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE, + T_cross) { + Y <- as.matrix(data[, outcomes, drop = FALSE]) + S <- data[[trial_status]] + A <- data[[treatment]] + n_time <- ncol(Y) + N <- nrow(data) + n <- sum(S) + pi_S <- n / N + + ps_formula <- sub("^[^~]*~", paste0(trial_status, " ~"), method@ps_formula) + trt_formula <- method@trt_formula + if (trt_formula != "") { + trt_formula <- sub("^[^~]*~", paste0(treatment, " ~"), trt_formula) + } + + df <- data.frame(Y, S = S, A = A, data[, covariates, drop = FALSE]) + + if (!quiet) cat("Running DID-EC-IPW estimator...\n") + + result <- .did_ec_ipw_estimate( + df, Y, S, A, n, N, pi_S, n_time, + T_cross, ps_formula, trt_formula + ) + tau <- result$tau + + if (!quiet) cat("Running bootstrap inference...\n") + + n_ole <- n_time - T_cross + boot_res <- .run_bootstrap( + df = df, statistic = .did_ec_ipw_statistic, + n_estimates = n_ole, bootstrap = method@bootstrap, + bootstrap_ci_type = method@bootstrap_ci_type, alpha = alpha, + outcomes = outcomes, ps_formula = ps_formula, + trt_formula = trt_formula, T_cross = T_cross + ) + + data.frame( + point_estimates = tau, + lower_CI_boot = boot_res$lower_ci, + upper_CI_boot = boot_res$upper_ci, + row.names = paste0("tau", (T_cross + 1):n_time) + ) +}) + +#' DID-EC-IPW point estimate. +#' fits PS and treatment models, computes DID estimator: +#' tau = (treated OLE) - (EC OLE) - bias, where bias is the +#' pre-crossover difference between RCT control and EC. +#' @param df internal data frame. +#' @param Y outcome matrix (N x T). +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param N total sample size. +#' @param pi_S marginal trial participation probability. +#' @param n_time number of time points. +#' @param T_cross crossover time point. +#' @param ps_formula propensity score formula. +#' @param trt_formula treatment assignment formula. +#' @return list with tau vector. +#' @noRd +.did_ec_ipw_estimate <- function(df, Y, S, A, n, N, pi_S, n_time, + T_cross, ps_formula, trt_formula) { + ps_model <- glm(as.formula(ps_formula), data = df, family = "binomial") + pi_SX <- predict(ps_model, newdata = df, type = "response") + + if (trt_formula == "") { + pi_AX <- sum(A[S == 1]) / n + } else { + trt_model <- glm(as.formula(trt_formula), + data = df[S == 1, , drop = FALSE], family = "binomial" + ) + pi_AX <- predict(trt_model, newdata = df, type = "response") + } + + rx <- pi_SX * (1 - pi_S) / (1 - pi_SX) / pi_S + + w11 <- 1 / pi_AX + w10 <- 1 / (1 - pi_AX) + w00 <- rx + + Ys <- as.matrix(Y) + potential <- (S * A * w11 / sum(S * A * w11) + + S * (1 - A) * w10 / sum(S * (1 - A) * w10) + + (1 - S) * w00 / sum((1 - S) * w00)) * Ys + + T_pc <- T_cross + mu_S1A1 <- colSums( + potential[S == 1 & A == 1, (T_pc + 1):n_time, drop = FALSE] + ) + mu_S0A0 <- colSums( + potential[S == 0, (T_pc + 1):n_time, drop = FALSE] + ) + bias <- sum(rowMeans( + potential[S == 1 & A == 0, 1:T_pc, drop = FALSE] + )) - sum(rowMeans( + potential[S == 0, 1:T_pc, drop = FALSE] + )) + + tau <- mu_S1A1 - mu_S0A0 - bias + list(tau = tau) +} + +#' bootstrap statistic for DID-EC-IPW. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param ps_formula propensity score formula. +#' @param trt_formula treatment assignment formula. +#' @param T_cross crossover time point. +#' @return numeric vector of tau estimates. +#' @noRd +.did_ec_ipw_statistic <- function(data, indices, outcomes, ps_formula, + trt_formula, T_cross) { + d <- data[indices, , drop = FALSE] + Y <- as.matrix(d[, outcomes, drop = FALSE]) + S <- d$S + A <- d$A + n <- sum(S) + N <- nrow(d) + n_time <- ncol(Y) + pi_S <- n / N + + ps_model <- glm(as.formula(ps_formula), data = d, family = "binomial") + pi_SX <- predict(ps_model, newdata = d, type = "response") + + if (trt_formula == "") { + pi_AX <- sum(A[S == 1]) / n + } else { + trt_model <- glm(as.formula(trt_formula), + data = d[S == 1, , drop = FALSE], family = "binomial" + ) + pi_AX <- predict(trt_model, newdata = d, type = "response") + } + + rx <- pi_SX * (1 - pi_S) / (1 - pi_SX) / pi_S + w11 <- 1 / pi_AX + w10 <- 1 / (1 - pi_AX) + w00 <- rx + + potential <- (S * A * w11 / sum(S * A * w11) + + S * (1 - A) * w10 / sum(S * (1 - A) * w10) + + (1 - S) * w00 / sum((1 - S) * w00)) * Y + + T_pc <- T_cross + mu_S1A1 <- colSums( + potential[S == 1 & A == 1, (T_pc + 1):n_time, drop = FALSE] + ) + mu_S0A0 <- colSums( + potential[S == 0, (T_pc + 1):n_time, drop = FALSE] + ) + bias <- sum(rowMeans( + potential[S == 1 & A == 0, 1:T_pc, drop = FALSE] + )) - sum(rowMeans( + potential[S == 0, 1:T_pc, drop = FALSE] + )) + + mu_S1A1 - mu_S0A0 - bias +} diff --git a/R/did_ec_or.R b/R/did_ec_or.R new file mode 100644 index 0000000..723f8b0 --- /dev/null +++ b/R/did_ec_or.R @@ -0,0 +1,259 @@ +#' @include did_ec_ipw.R +NULL + +# s4 class definition---- +.did_ec_or_method <- setClass( + "did_ec_or_method", + contains = "method_DID_obj", + slots = c( + outcome_formula_ext = "character", + outcome_formula_rct_ctrl = "character", + outcome_formula_rct_trt = "character", + bootstrap = "numericOrNULL", + bootstrap_ci_type = "character" + ), + prototype = list( + method_name = "DID-EC-OR", + outcome_formula_ext = "", + outcome_formula_rct_ctrl = "", + outcome_formula_rct_trt = "", + bootstrap = NULL, + bootstrap_ci_type = "perc" + ) +) + +# constructor---- + +#' DID-EC-OR method constructor +#' +#' Creates a method object for difference-in-differences outcome +#' regression estimation with external control borrowing for the +#' open-label extension phase (Zhou et al., 2024). +#' +#' @param outcome_formula_ext Character vector of outcome model formulas +#' for external controls, one per time point. +#' @param outcome_formula_rct_ctrl Character vector of outcome model +#' formulas for RCT control subjects, one per time point. +#' @param outcome_formula_rct_trt Character vector of outcome model +#' formulas for RCT treated subjects, one per time point. +#' @param bootstrap Number of bootstrap replicates. Defaults to 500. +#' @param bootstrap_ci_type Bootstrap CI type. Defaults to \code{"perc"}. +#' +#' @return An S4 object of class \code{did_ec_or_method}. +#' +#' @references +#' Zhou et al. (2024). Estimating treatment effect in randomized trial +#' after control to treatment crossover using external controls. +#' \emph{Journal of Biopharmaceutical Statistics}. +#' \doi{10.1080/10543406.2024.2444222} +#' +#' @export +#' +#' @examples +#' model_forms <- c( +#' "y1 ~ x1 + x2 + x3 + x4 + x5", +#' "y2 ~ x1 + x2 + x3 + x4 + x5", +#' "y3 ~ x1 + x2 + x3 + x4 + x5", +#' "y4 ~ x1 + x2 + x3 + x4 + x5" +#' ) +#' did_ec_or( +#' outcome_formula_ext = model_forms, +#' outcome_formula_rct_ctrl = model_forms, +#' outcome_formula_rct_trt = model_forms, +#' bootstrap = 500 +#' ) +did_ec_or <- function(outcome_formula_ext, + outcome_formula_rct_ctrl, + outcome_formula_rct_trt, + bootstrap = 500L, + bootstrap_ci_type = NULL) { + checkmate::assert_character(outcome_formula_ext, min.len = 1) + checkmate::assert_character(outcome_formula_rct_ctrl, min.len = 1) + checkmate::assert_character(outcome_formula_rct_trt, min.len = 1) + checkmate::assert_count(bootstrap, positive = TRUE) + + if (is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + + .did_ec_or_method( + outcome_formula_ext = outcome_formula_ext, + outcome_formula_rct_ctrl = outcome_formula_rct_ctrl, + outcome_formula_rct_trt = outcome_formula_rct_trt, + bootstrap = bootstrap, + bootstrap_ci_type = bootstrap_ci_type, + method_name = "DID-EC-OR", + bootstrap_flag = TRUE, + bootstrap_obj = .bootstrap_obj( + replicates = bootstrap, + bootstrap_CI_type = bootstrap_ci_type + ) + ) +} + +# estimate() method---- + +#' @rdname estimate +setMethod("estimate", "did_ec_or_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE, + T_cross) { + Y <- as.matrix(data[, outcomes, drop = FALSE]) + S <- data[[trial_status]] + A <- data[[treatment]] + n_time <- ncol(Y) + N <- nrow(data) + n <- sum(S) + + df <- data.frame(Y, S = S, A = A, data[, covariates, drop = FALSE]) + + if (!quiet) cat("Running DID-EC-OR estimator...\n") + + result <- .did_ec_or_estimate( + df, S, A, n, n_time, T_cross, + method@outcome_formula_ext, + method@outcome_formula_rct_ctrl, + method@outcome_formula_rct_trt + ) + tau <- result$tau + + if (!quiet) cat("Running bootstrap inference...\n") + + n_ole <- n_time - T_cross + boot_res <- .run_bootstrap( + df = df, statistic = .did_ec_or_statistic, + n_estimates = n_ole, bootstrap = method@bootstrap, + bootstrap_ci_type = method@bootstrap_ci_type, alpha = alpha, + outcomes = outcomes, + outcome_formula_ext = method@outcome_formula_ext, + outcome_formula_rct_ctrl = method@outcome_formula_rct_ctrl, + outcome_formula_rct_trt = method@outcome_formula_rct_trt, + T_cross = T_cross + ) + + data.frame( + point_estimates = tau, + lower_CI_boot = boot_res$lower_ci, + upper_CI_boot = boot_res$upper_ci, + row.names = paste0("tau", (T_cross + 1):n_time) + ) +}) + +# internal helpers---- + +#' DID-EC-OR point estimate. +#' uses outcome regression only (no PS model). fits separate models for +#' external controls, RCT controls (pre-crossover), and RCT treated (OLE). +#' DID logic: tau = (RCT model OLE - RCT model pre) - (EC model OLE - EC model pre). +#' @param df internal data frame. +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param n_time number of time points. +#' @param T_cross crossover time point. +#' @param outcome_formula_ext formulas for external control outcome models. +#' @param outcome_formula_rct_ctrl formulas for RCT control outcome models. +#' @param outcome_formula_rct_trt formulas for RCT treated outcome models. +#' @return list with tau vector. +#' @noRd +.did_ec_or_estimate <- function(df, S, A, n, n_time, T_cross, + outcome_formula_ext, + outcome_formula_rct_ctrl, + outcome_formula_rct_trt) { + T_pc <- T_cross + + # external outcome models + model_list_ext <- lapply(seq_len(n_time), \(t) { + lm(as.formula(outcome_formula_ext[t]), data = df[S == 0, , drop = FALSE]) + }) + + # rct outcome models: control for placebo period, treated for OLE period + model_list_rct_pc <- lapply(seq_len(T_pc), \(t) { + lm(as.formula(outcome_formula_rct_ctrl[t]), + data = df[S == 1 & A == 0, , drop = FALSE] + ) + }) + model_list_rct_cr <- lapply((T_pc + 1):n_time, \(t) { + lm(as.formula(outcome_formula_rct_trt[t]), + data = df[S == 1 & A == 1, , drop = FALSE] + ) + }) + model_list_rct <- c(model_list_rct_pc, model_list_rct_cr) + + rct_data <- df[S == 1, , drop = FALSE] + + # predicted outcomes for RCT subjects from external models + mu_S0A0 <- vapply(seq_len(n_time), \(t) { + predict(model_list_ext[[t]], newdata = rct_data) + }, numeric(n)) + avg_S0A0 <- colMeans(mu_S0A0) + + # predicted outcomes for RCT subjects from RCT models + mu_S1 <- vapply(seq_len(n_time), \(t) { + predict(model_list_rct[[t]], newdata = rct_data) + }, numeric(n)) + avg_S1 <- colMeans(mu_S1) + + tau <- (avg_S1[(T_pc + 1):n_time] - mean(avg_S1[1:T_pc])) - + (avg_S0A0[(T_pc + 1):n_time] - mean(avg_S0A0[1:T_pc])) + + list(tau = tau) +} + +#' bootstrap statistic for DID-EC-OR. +#' refits all outcome models on each resample. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param outcome_formula_ext external control outcome formulas. +#' @param outcome_formula_rct_ctrl RCT control outcome formulas. +#' @param outcome_formula_rct_trt RCT treated outcome formulas. +#' @param T_cross crossover time point. +#' @return numeric vector of tau estimates. +#' @noRd +.did_ec_or_statistic <- function(data, indices, outcomes, + outcome_formula_ext, + outcome_formula_rct_ctrl, + outcome_formula_rct_trt, + T_cross) { + d <- data[indices, , drop = FALSE] + S <- d$S + A <- d$A + n <- sum(S) + n_time <- length(outcomes) + T_pc <- T_cross + + model_list_ext <- lapply(seq_len(n_time), \(t) { + lm(as.formula(outcome_formula_ext[t]), data = d[S == 0, , drop = FALSE]) + }) + model_list_rct_pc <- lapply(seq_len(T_pc), \(t) { + lm(as.formula(outcome_formula_rct_ctrl[t]), + data = d[S == 1 & A == 0, , drop = FALSE] + ) + }) + model_list_rct_cr <- lapply((T_pc + 1):n_time, \(t) { + lm(as.formula(outcome_formula_rct_trt[t]), + data = d[S == 1 & A == 1, , drop = FALSE] + ) + }) + model_list_rct <- c(model_list_rct_pc, model_list_rct_cr) + + rct_data <- d[S == 1, , drop = FALSE] + + mu_S0A0 <- vapply(seq_len(n_time), \(t) { + predict(model_list_ext[[t]], newdata = rct_data) + }, numeric(n)) + avg_S0A0 <- colMeans(mu_S0A0) + + mu_S1 <- vapply(seq_len(n_time), \(t) { + predict(model_list_rct[[t]], newdata = rct_data) + }, numeric(n)) + avg_S1 <- colMeans(mu_S1) + + (avg_S1[(T_pc + 1):n_time] - mean(avg_S1[1:T_pc])) - + (avg_S0A0[(T_pc + 1):n_time] - mean(avg_S0A0[1:T_pc])) +} diff --git a/R/ec_aipw.R b/R/ec_aipw.R new file mode 100644 index 0000000..3e368de --- /dev/null +++ b/R/ec_aipw.R @@ -0,0 +1,311 @@ +#' @include ec_ipw.R +NULL + +# S4 class definition---- +.ec_aipw_method <- setClass( + "ec_aipw_method", + contains = "method_weighting_obj", + slots = c( + ps_formula = "character", + outcome_formula = "character", + weight = "numericOrNULL", + bootstrap = "numericOrNULL", + bootstrap_ci_type = "characterOrNULL" + ), + prototype = list( + method_name = "EC-AIPW", + ps_formula = "", + outcome_formula = "", + weight = NULL, + bootstrap = NULL, + bootstrap_ci_type = NULL + ) +) + +#' EC-AIPW method +#' +#' Creates a method object for augmented IPW estimation with external +#' control borrowing (Zhou et al., 2024). Augments the IPW estimator +#' with an outcome regression model for improved efficiency. Pass to +#' \code{\link{setup_analysis_primary}} and \code{\link{run_analysis}}. +#' +#' @param ps_formula Formula string for the propensity score model +#' predicting trial participation. +#' @param outcome_formula Character vector of outcome model formulas, +#' one per time point (e.g., \code{c("y1 ~ x1 + x2", "y2 ~ x1 + x2")}). +#' @param weight Borrowing weight. \code{NULL} (default) for data-adaptive +#' optimal weight, \code{0} for RCT-only, or a value in (0, 1]. +#' @param bootstrap Number of bootstrap replicates, or \code{NULL} +#' (default) for sandwich variance with normal CIs. +#' @param bootstrap_ci_type Bootstrap CI type. Defaults to \code{"perc"}. +#' +#' @return An S4 object of class \code{ec_aipw_method}. +#' +#' @references +#' Zhou et al. (2024). Causal estimators for incorporating external +#' controls in randomized trials with longitudinal outcomes. +#' \emph{JRSS-A}. \doi{10.1093/jrsssa/qnae075} +#' +#' @export +#' +#' @examples +#' # optimal weight, sandwich SE +#' ec_aipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' outcome_formula = c( +#' "y1 ~ x1 + x2 + x3 + x4 + x5", +#' "y2 ~ x1 + x2 + x3 + x4 + x5" +#' ) +#' ) +#' +#' # no borrowing +#' ec_aipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' outcome_formula = c( +#' "y1 ~ x1 + x2 + x3 + x4 + x5", +#' "y2 ~ x1 + x2 + x3 + x4 + x5" +#' ), +#' weight = 0 +#' ) +#' +#' # fixed weight with bootstrap +#' ec_aipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' outcome_formula = c( +#' "y1 ~ x1 + x2 + x3 + x4 + x5", +#' "y2 ~ x1 + x2 + x3 + x4 + x5" +#' ), +#' weight = 0.3, +#' bootstrap = 500 +#' ) +ec_aipw <- function(ps_formula, + outcome_formula, + weight = NULL, + bootstrap = NULL, + bootstrap_ci_type = NULL) { + checkmate::assert_string(ps_formula) + checkmate::assert_character(outcome_formula, min.len = 1) + checkmate::assert_number(weight, lower = 0, upper = 1, null.ok = TRUE) + checkmate::assert_count(bootstrap, positive = TRUE, null.ok = TRUE) + + # bootstrap type + if (!is.null(bootstrap) && is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + if (!is.null(bootstrap_ci_type)) { + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + } + + .ec_aipw_method( + ps_formula = ps_formula, + outcome_formula = outcome_formula, + weight = weight, + bootstrap = bootstrap, + bootstrap_ci_type = bootstrap_ci_type, + method_name = "EC-AIPW", + bootstrap_flag = !is.null(bootstrap), + bootstrap_obj = .bootstrap_obj( + replicates = bootstrap %||% 500L, + bootstrap_CI_type = bootstrap_ci_type %||% "perc" + ) + ) +} + +#' @rdname estimate +setMethod("estimate", "ec_aipw_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE) { + + # unwrap formula + ps_formula <- sub("^[^~]*~", paste0(trial_status, " ~"), method@ps_formula) + df <- .build_analysis_df(data, outcomes, treatment, trial_status, covariates) + + if (!quiet) cat("Running EC-AIPW estimator...\n") + + core <- .ec_aipw_core(df, outcomes, ps_formula, + method@outcome_formula, method@weight) + sd_tau <- .ec_aipw_sandwich(df, core, length(outcomes), + method@outcome_formula) + + .format_primary_results( + tau = core$tau, sd_tau = sd_tau, + borrow_weight = core$borrow_weight, + n_time = length(core$tau), alpha = alpha, + method = method, df = df, quiet = quiet, + statistic = .ec_aipw_statistic, + outcomes = outcomes, covariates = covariates, + ps_formula = ps_formula, + outcome_formula = method@outcome_formula + ) +}) + +#' EC-AIPW point estimate (Def 2, Eq 7). +#' fits PS + outcome models, uses residuals Y-mu(X) in place of Y, +#' computes mu11/mu10/mu00 and optimal weight. +#' shared by estimate() and bootstrap statistic. +#' @param df internal data frame. +#' @param outcomes outcome column names. +#' @param ps_formula propensity score formula string. +#' @param outcome_formula character vector of outcome model formulas. +#' @param weight fixed weight or NULL for optimal. +#' @return list with tau, borrow_weight, and model intermediates. +#' @noRd +.ec_aipw_core <- function(df, outcomes, ps_formula, outcome_formula, weight) { + Y <- as.matrix(df[, outcomes, drop = FALSE]) + S <- df$S + A <- df$A + N <- nrow(df) + n <- sum(S) + n_time <- ncol(Y) + pi_S <- n / N + + # propensity score model + ps_model <- glm(as.formula(ps_formula), data = df, family = "binomial") + pi_SX <- predict(ps_model, newdata = df, type = "response") + pi_A <- sum(A[S == 1]) / n + rx <- (pi_SX / (1 - pi_SX)) * ((1 - pi_S) / pi_S) + + # outcome regression on controls, predict for all subjects + Y0_models <- lapply(outcome_formula, \(f) { + lm(as.formula(f), data = df[A == 0, , drop = FALSE]) + }) + Y0 <- vapply(seq_len(n_time), \(t) { + predict(Y0_models[[t]], newdata = df) + }, numeric(N)) + Yr <- Y - Y0 + + # weighted potential outcomes using residuals + w00 <- rx + potential <- (S * A / pi_A + S * (1 - A) / (1 - pi_A) + + (1 - S) * w00) * Yr + mu1 <- colSums(potential[S == 1 & A == 1, , drop = FALSE]) / n + mu10 <- colSums(potential[S == 1 & A == 0, , drop = FALSE]) / n + mu00 <- colSums(potential[S == 0, , drop = FALSE]) / sum((1 - S) * w00) + + # optimal borrowing weight + num <- sum(S * (1 - A) / (1 - pi_A)^2 / (sum(S * (1 - A) / (1 - pi_A)))^2) + denom <- sum((1 - S) * w00^2 / (sum((1 - S) * w00))^2) + w_opt <- num / (num + denom) + borrow_weight <- if (is.null(weight)) w_opt else weight + + # combine rct control and external control + mu0 <- (1 - borrow_weight) * mu10 + borrow_weight * mu00 + tau <- mu1 - mu0 + + list(tau = tau, borrow_weight = borrow_weight, + ps_model = ps_model, pi_SX = pi_SX, pi_A = pi_A, + pi_S = pi_S, rx = rx, w00 = w00, + Yr = Yr, mu1 = mu1, mu10 = mu10, mu00 = mu00) +} + +#' sandwich variance for EC-AIPW (Theorem 4, Eq 15-16). +#' extends the IPW sandwich with outcome model parameter blocks. +#' @param df internal data frame. +#' @param core output from .ec_aipw_core. +#' @param n_time number of time points. +#' @param outcome_formula character vector of outcome model formulas. +#' @return numeric vector of standard errors (length n_time). +#' @noRd +.ec_aipw_sandwich <- function(df, core, n_time, outcome_formula) { + S <- df$S + A <- df$A + N <- nrow(df) + + X_ps <- model.matrix(core$ps_model) + n_ps <- ncol(X_ps) + + # refit outcome models on full data (needed for sandwich, not for tau) + Y0_models_full <- lapply(outcome_formula, \(f) { + lm(as.formula(f), data = df) + }) + + # bread: ps block---- + A33 <- diag( + rep(-mean((1 - S) * core$w00 / (1 - core$pi_S)), n_time), + nrow = n_time + ) + A34 <- t((1 - S) * core$pi_SX / (core$pi_S * (1 - core$pi_SX)) * + sweep(core$Yr, 2, core$mu00)) %*% X_ps / N + A44 <- t(X_ps) %*% diag(-core$pi_SX * (1 - core$pi_SX)) %*% X_ps / N + + A0 <- as.matrix(Matrix::bdiag( + diag(-1, n_time), diag(-1, n_time), A33, A44 + )) + A0[(2 * n_time + 1):(3 * n_time), + (3 * n_time + 1):(3 * n_time + n_ps)] <- A34 + + # bread: outcome model blocks---- + Y0_model_mats <- lapply(Y0_models_full, model.matrix) + n_outcome <- sum(vapply(Y0_model_mats, ncol, integer(1))) + + Phi1_gamma <- as.matrix(Matrix::bdiag(lapply(seq_len(n_time), \(t) { + as.vector(-S * A / (core$pi_S * core$pi_A)) %*% Y0_model_mats[[t]] / N + }))) + Phi2_gamma <- as.matrix(Matrix::bdiag(lapply(seq_len(n_time), \(t) { + as.vector(-S * (1 - A) / (core$pi_S * (1 - core$pi_A))) %*% + Y0_model_mats[[t]] / N + }))) + Phi3_gamma <- as.matrix(Matrix::bdiag(lapply(seq_len(n_time), \(t) { + as.vector(-(1 - S) * core$rx / (1 - core$pi_S)) %*% + Y0_model_mats[[t]] / N + }))) + Y0_gamma <- as.matrix(Matrix::bdiag(lapply(seq_len(n_time), \(t) { + -t(Y0_model_mats[[t]]) %*% + diag((1 - A) / (1 - mean(A))) %*% Y0_model_mats[[t]] / N + }))) + + # assemble full bread matrix---- + A_left <- rbind(A0, matrix(0, nrow = n_outcome, ncol = 3 * n_time + n_ps)) + A_right <- rbind( + Phi1_gamma, Phi2_gamma, Phi3_gamma, + matrix(0, nrow = n_ps, ncol = n_outcome), + Y0_gamma + ) + A_mat <- cbind(A_left, A_right) + + # meat: influence functions---- + phi1 <- S * A * sweep(core$Yr, 2, core$mu1) / core$pi_A / core$pi_S + phi2 <- S * (1 - A) * sweep(core$Yr, 2, core$mu10) / + (1 - core$pi_A) / core$pi_S + phi3 <- (1 - S) * sweep(core$Yr, 2, core$mu00) * + core$w00 / (1 - core$pi_S) + phi_ps <- (S - core$pi_SX) * X_ps + phi_Y0 <- do.call(cbind, lapply(seq_len(n_time), \(t) { + ((1 - A) / (1 - mean(A))) * (core$Yr[, t] * Y0_model_mats[[t]]) + })) + + B <- crossprod(cbind(phi1, phi2, phi3, phi_ps, phi_Y0)) / N + + # sandwich: A^{-1} B A^{-T}, then extract tau variance---- + A_inv <- solve(A_mat) + sigma <- A_inv %*% B %*% t(A_inv) + + coef_mat <- cbind( + diag(n_time), + -(1 - core$borrow_weight) * diag(n_time), + -core$borrow_weight * diag(n_time), + matrix(0, nrow = n_time, ncol = n_ps + n_outcome) + ) + sqrt(diag(coef_mat %*% sigma %*% t(coef_mat) / N)) +} + +#' bootstrap statistic for EC-AIPW. called by boot::boot on each resample. +#' refits both PS and outcome models on the resampled data. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param covariates covariate column names. +#' @param ps_formula propensity score formula. +#' @param outcome_formula outcome model formulas. +#' @param borrow_wt pre-computed borrowing weight. +#' @return numeric vector of tau estimates. +#' @noRd +.ec_aipw_statistic <- function(data, indices, outcomes, covariates, + ps_formula, outcome_formula, borrow_wt) { + d <- data[indices, , drop = FALSE] + core <- .ec_aipw_core(d, outcomes, ps_formula, outcome_formula, borrow_wt) + core$tau +} diff --git a/R/ec_ipw.R b/R/ec_ipw.R new file mode 100644 index 0000000..e45550d --- /dev/null +++ b/R/ec_ipw.R @@ -0,0 +1,329 @@ +#' @include method_class.R +NULL + +# S4 class definition---- +.ec_ipw_method <- setClass( + "ec_ipw_method", + contains = "method_weighting_obj", + slots = c( + ps_formula = "character", + weight = "numericOrNULL", + bootstrap = "numericOrNULL", + bootstrap_ci_type = "characterOrNULL" + ), + prototype = list( + method_name = "EC-IPW", + ps_formula = "", + weight = NULL, + bootstrap = NULL, + bootstrap_ci_type = NULL + ) +) + +#' EC-IPW method constructor +#' +#' Creates a method object for IPW estimation with external control +#' borrowing (Zhou et al., 2024). Pass to \code{\link{setup_analysis_primary}} +#' and \code{\link{run_analysis}}. +#' +#' @param ps_formula Formula string for the propensity score model +#' predicting trial participation. The left-hand side is replaced +#' internally (e.g., \code{"S ~ x1 + x2 + x3"}). +#' @param weight Borrowing weight. \code{NULL} (default) for data-adaptive +#' optimal weight, \code{0} for RCT-only, or a value in (0, 1]. +#' @param bootstrap Number of bootstrap replicates, or \code{NULL} +#' (default) for sandwich variance with normal CIs. +#' @param bootstrap_ci_type Bootstrap CI type, or \code{NULL} (default) +#' which resolves to \code{"perc"} when \code{bootstrap} is set. One of +#' \code{"perc"}, \code{"bca"}, \code{"norm"}, \code{"basic"}, or +#' \code{"stud"}. +#' +#' @return An S4 object of class \code{ec_ipw_method}. +#' +#' @references +#' Zhou et al. (2024). Causal estimators for incorporating external +#' controls in randomized trials with longitudinal outcomes. +#' \emph{JRSS-A}. \doi{10.1093/jrsssa/qnae075} +#' +#' @export +#' +#' @examples +#' # optimal weight, sandwich SE +#' ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5") +#' +#' # no borrowing +#' ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", weight = 0) +#' +#' # fixed weight with bootstrap +#' ec_ipw( +#' ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", +#' weight = 0.3, +#' bootstrap = 500 +#' ) +ec_ipw <- function(ps_formula, + weight = NULL, + bootstrap = NULL, + bootstrap_ci_type = NULL) { + checkmate::assert_string(ps_formula) + checkmate::assert_number(weight, lower = 0, upper = 1, null.ok = TRUE) + checkmate::assert_count(bootstrap, positive = TRUE, null.ok = TRUE) + + # bootstrap type + if (!is.null(bootstrap) && is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + if (!is.null(bootstrap_ci_type)) { + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + } + + .ec_ipw_method( + ps_formula = ps_formula, + weight = weight, + bootstrap = bootstrap, + bootstrap_ci_type = bootstrap_ci_type, + method_name = "EC-IPW", + bootstrap_flag = !is.null(bootstrap), + bootstrap_obj = .bootstrap_obj( + replicates = bootstrap %||% 500L, + bootstrap_CI_type = bootstrap_ci_type %||% "perc" + ) + ) +} + +#' @rdname estimate +setMethod("estimate", "ec_ipw_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE) { + + # unwrap formula + ps_formula <- sub("^[^~]*~", paste0(trial_status, " ~"), method@ps_formula) + df <- .build_analysis_df(data, outcomes, treatment, trial_status, covariates) + n_time <- length(outcomes) + + if (!quiet) cat("Running EC-IPW estimator...\n") + + weight <- method@weight + use_borrowing <- is.null(weight) || weight > 0 + + # weight=0 uses rct-only path (no PS model needed) + if (!use_borrowing) { + Y <- as.matrix(df[, outcomes, drop = FALSE]) + result <- .ec_ipw_no_borrow(df, Y, df$S, df$A, sum(df$S), n_time) + borrow_weight <- 0 + } else { + # weight>0 or NULL: fit PS model, compute borrowing weight, sandwich SE + core <- .ec_ipw_borrow_core( + df, as.matrix(df[, outcomes, drop = FALSE]), + df$S, df$A, sum(df$S), nrow(df), + sum(df$S) / nrow(df), n_time, + ps_formula, weight + ) + result <- .ec_ipw_sandwich(df, core, n_time) + borrow_weight <- core$borrow_weight + } + + # format output and optionally run bootstrap + .format_primary_results( + tau = result$tau, sd_tau = result$sd_tau, + borrow_weight = borrow_weight, + n_time = n_time, alpha = alpha, + method = method, df = df, quiet = quiet, + statistic = .ec_ipw_statistic, + outcomes = outcomes, covariates = covariates, + ps_formula = ps_formula + ) +}) + +#' compute Hajek ATE using only RCT subjects (w=0, no PS model). +#' @param df internal data frame with S, A, outcome columns. +#' @param Y outcome matrix (N x T). +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param n_time number of time points. +#' @return list with tau and intermediates for sandwich. +#' @noRd +.ec_ipw_no_borrow_core <- function(df, Y, S, A, n, n_time) { + rct <- df[df$S == 1, , drop = FALSE] + Y_rct <- Y[S == 1, , drop = FALSE] + pi_A <- sum(rct$A) / n + + # normalized IPTW (Hajek estimator) -- Def 1 with w=0 + potential <- (rct$A / pi_A + (1 - rct$A) / (1 - pi_A)) * Y_rct + mu1 <- colSums(potential[rct$A == 1, , drop = FALSE]) / n + mu0 <- colSums(potential[rct$A == 0, , drop = FALSE]) / n + + list( + tau = mu1 - mu0, pi_A = pi_A, + mu1 = mu1, mu0 = mu0, + Y_rct = Y_rct, rct = rct + ) +} + +#' sandwich SE for the rct-only case (Theorem 3 with w=0). +#' @param df internal data frame. +#' @param Y outcome matrix. +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param n_time number of time points. +#' @return list with tau and sd_tau. +#' @noRd +.ec_ipw_no_borrow <- function(df, Y, S, A, n, n_time) { + core <- .ec_ipw_no_borrow_core(df, Y, S, A, n, n_time) + + # influence functions for treated and control + phi1 <- core$rct$A * sweep(core$Y_rct, 2, core$mu1) / core$pi_A + phi0 <- (1 - core$rct$A) * sweep(core$Y_rct, 2, core$mu0) / (1 - core$pi_A) + B <- crossprod(cbind(phi1, phi0)) / n + + # tau = mu1 - mu0, so coef_mat = [I, -I] + coef_mat <- cbind(diag(n_time), -diag(n_time)) + sd_tau <- sqrt(diag(coef_mat %*% B %*% t(coef_mat) / n)) + + list(tau = core$tau, sd_tau = sd_tau) +} + +#' EC-IPW point estimate with borrowing (Def 1, Eq 6). +#' fits PS model, computes density ratio weights W00 (Eq 4), +#' mu11/mu10/mu00, and optimal weight (Eq 11). +#' shared by estimate() and bootstrap statistic. +#' @param df internal data frame. +#' @param Y outcome matrix (N x T). +#' @param S trial participation vector. +#' @param A treatment vector. +#' @param n number of RCT subjects. +#' @param N total sample size. +#' @param pi_S marginal trial participation probability. +#' @param n_time number of time points. +#' @param ps_formula propensity score formula string. +#' @param weight fixed weight or NULL for optimal. +#' @return list with tau, borrow_weight, and model intermediates. +#' @noRd +.ec_ipw_borrow_core <- function(df, Y, S, A, n, N, pi_S, n_time, + ps_formula, weight) { + # propensity score model for trial participation + ps_model <- glm(as.formula(ps_formula), data = df, family = "binomial") + pi_SX <- predict(ps_model, newdata = df, type = "response") + pi_A <- sum(A[S == 1]) / n + + # density ratio weights W00 = pi_S(X)(1-pi_S) / (1-pi_S(X))pi_S (Eq 4) + rx <- (pi_SX / (1 - pi_SX)) * ((1 - pi_S) / pi_S) + + w11 <- pi_A + w10 <- 1 - pi_A + w00 <- rx + + # normalized weighted outcomes (Def 1, Eq 6) + potential <- (S * A / w11 + S * (1 - A) / w10 + (1 - S) * w00) * Y + mu1 <- colSums(potential[S == 1 & A == 1, , drop = FALSE]) / sum(S * A / w11) + mu10 <- colSums(potential[S == 1 & A == 0, , drop = FALSE]) / sum(S * (1 - A) / w10) + mu00 <- colSums(potential[S == 0, , drop = FALSE]) / sum((1 - S) * w00) + + # data-adaptive optimal weight (Eq 11, variance ratio approximation) + num <- sum(S * (1 - A) / w10^2 / (sum(S * (1 - A) / w10))^2) + denom <- sum((1 - S) * w00^2 / (sum((1 - S) * w00))^2) + w_opt <- num / (num + denom) + borrow_weight <- if (is.null(weight)) w_opt else weight + + # hybrid control: convex combination of trial and external controls + mu0 <- (1 - borrow_weight) * mu10 + borrow_weight * mu00 + tau <- mu1 - mu0 + + list( + tau = tau, borrow_weight = borrow_weight, + ps_model = ps_model, pi_SX = pi_SX, pi_A = pi_A, + w00 = w00, w10 = w10, + mu1 = mu1, mu10 = mu10, mu00 = mu00 + ) +} + +#' sandwich variance for EC-IPW with borrowing (Theorem 3, Eq 12-13). +#' constructs bread matrix A and meat matrix B from core intermediates. +#' @param df internal data frame. +#' @param core output from .ec_ipw_borrow_core. +#' @param n_time number of time points. +#' @return list with tau and sd_tau. +#' @noRd +.ec_ipw_sandwich <- function(df, core, n_time) { + Y <- as.matrix(df[, seq_len(n_time), drop = FALSE]) + S <- df$S + A <- df$A + N <- nrow(df) + pi_S <- sum(S) / N + + X_model <- model.matrix(core$ps_model) + n_ps <- ncol(X_model) + + # bread: A matrix blocks (Eq 12) + # A11 = -I_T, A22 = -I_T (for mu1, mu10) + # A33 = external control block + # A34 = cross-term linking mu00 to PS params + # A44 = PS model Hessian + A33 <- diag(rep(-mean((1 - S) * core$w00 / (1 - pi_S)), n_time), nrow = n_time) + A34 <- t((1 - S) * core$pi_SX / (pi_S * (1 - core$pi_SX)) * + sweep(Y, 2, core$mu00)) %*% X_model / N + A44 <- t(X_model) %*% diag(-core$pi_SX * (1 - core$pi_SX)) %*% X_model / N + + block_dim <- 3 * n_time + n_ps + A_mat <- matrix(0, nrow = block_dim, ncol = block_dim) + A_mat[1:n_time, 1:n_time] <- diag(-1, n_time) + A_mat[(n_time + 1):(2 * n_time), (n_time + 1):(2 * n_time)] <- diag(-1, n_time) + A_mat[(2 * n_time + 1):(3 * n_time), (2 * n_time + 1):(3 * n_time)] <- A33 + A_mat[(3 * n_time + 1):block_dim, (3 * n_time + 1):block_dim] <- A44 + A_mat[(2 * n_time + 1):(3 * n_time), (3 * n_time + 1):block_dim] <- A34 + + # meat: influence functions (Psi_1 through Psi_4 in Theorem 3) + phi1 <- S * A * sweep(Y, 2, core$mu1) / core$pi_A / pi_S + phi2 <- S * (1 - A) * sweep(Y, 2, core$mu10) / (1 - core$pi_A) / pi_S + phi3 <- (1 - S) * sweep(Y, 2, core$mu00) * core$w00 / (1 - pi_S) + phi_ps <- (S - core$pi_SX) * X_model + B <- crossprod(cbind(phi1, phi2, phi3, phi_ps)) / N + + # sandwich: Sigma = A^{-1} B A^{-T} (Eq 13) + A_inv <- solve(A_mat) + sigma <- A_inv %*% B %*% t(A_inv) + + # tau = mu1 - (1-w)*mu10 - w*mu00, extract variance via linear combination + coef_mat <- cbind( + diag(n_time), + -(1 - core$borrow_weight) * diag(n_time), + -core$borrow_weight * diag(n_time), + matrix(0, nrow = n_time, ncol = n_ps) + ) + sd_tau <- sqrt(diag(coef_mat %*% sigma %*% t(coef_mat) / N)) + + list(tau = core$tau, sd_tau = sd_tau) +} + +#' bootstrap statistic for EC-IPW. called by boot::boot on each resample. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param covariates covariate column names. +#' @param ps_formula propensity score formula. +#' @param borrow_wt pre-computed borrowing weight. +#' @return numeric vector of tau estimates. +#' @noRd +.ec_ipw_statistic <- function(data, indices, outcomes, covariates, + ps_formula, borrow_wt) { + d <- data[indices, , drop = FALSE] + Y <- as.matrix(d[, outcomes, drop = FALSE]) + S <- d$S + A <- d$A + n <- sum(S) + N <- nrow(d) + n_time <- ncol(Y) + + if (borrow_wt == 0) { + return(.ec_ipw_no_borrow_core(d, Y, S, A, n, n_time)$tau) + } + + pi_S <- n / N + core <- .ec_ipw_borrow_core(d, Y, S, A, n, N, pi_S, n_time, + ps_formula, borrow_wt) + core$tau +} diff --git a/R/DID_EC_AIPW.R b/R/legacy_DID_EC_AIPW.R similarity index 99% rename from R/DID_EC_AIPW.R rename to R/legacy_DID_EC_AIPW.R index 97c310c..d1f1b08 100644 --- a/R/DID_EC_AIPW.R +++ b/R/legacy_DID_EC_AIPW.R @@ -17,7 +17,7 @@ #' @param T_cross numeric. The time point that separates the placebo-control period and the follow-up period. #' @param quiet Logical. If \code{TRUE}, suppress printed output. #' -#' @include DID_EC_AIPW_bootstrap.R +#' @include legacy_DID_EC_AIPW_bootstrap.R #' @return tau and standard deviation #' @noRd #' diff --git a/R/DID_EC_AIPW_bootstrap.R b/R/legacy_DID_EC_AIPW_bootstrap.R similarity index 100% rename from R/DID_EC_AIPW_bootstrap.R rename to R/legacy_DID_EC_AIPW_bootstrap.R diff --git a/R/DID_EC_IPW.R b/R/legacy_DID_EC_IPW.R similarity index 99% rename from R/DID_EC_IPW.R rename to R/legacy_DID_EC_IPW.R index 5657811..845e40f 100644 --- a/R/DID_EC_IPW.R +++ b/R/legacy_DID_EC_IPW.R @@ -14,7 +14,7 @@ #' @param alpha Significance level. #' @param quiet Logical. If \code{TRUE}, suppress printed output. #' -#' @include DID_EC_IPW_bootstrap.R +#' @include legacy_DID_EC_IPW_bootstrap.R #' @return tau and standard deviation #' @noRd #' diff --git a/R/DID_EC_IPW_bootstrap.R b/R/legacy_DID_EC_IPW_bootstrap.R similarity index 100% rename from R/DID_EC_IPW_bootstrap.R rename to R/legacy_DID_EC_IPW_bootstrap.R diff --git a/R/DID_EC_OR.R b/R/legacy_DID_EC_OR.R similarity index 99% rename from R/DID_EC_OR.R rename to R/legacy_DID_EC_OR.R index 629343f..aabb244 100644 --- a/R/DID_EC_OR.R +++ b/R/legacy_DID_EC_OR.R @@ -15,7 +15,7 @@ #' @param alpha Significance level. #' @param quiet Logical. If \code{TRUE}, suppress printed output. #' -#' @include DID_EC_OR_bootstrap.R +#' @include legacy_DID_EC_OR_bootstrap.R #' @return tau and standard deviation #' @noRd #' diff --git a/R/DID_EC_OR_bootstrap.R b/R/legacy_DID_EC_OR_bootstrap.R similarity index 100% rename from R/DID_EC_OR_bootstrap.R rename to R/legacy_DID_EC_OR_bootstrap.R diff --git a/R/SCM.R b/R/legacy_SCM.R similarity index 99% rename from R/SCM.R rename to R/legacy_SCM.R index f58b684..f24e0c7 100644 --- a/R/SCM.R +++ b/R/legacy_SCM.R @@ -20,7 +20,7 @@ #' @param ncpus Integer. Number of CPUs for parallel bootstrap. #' @param quiet Logical. If \code{TRUE}, suppress printed output. #' -#' @include SCMboot.R +#' @include legacy_SCMboot.R #' @return A list contains: estimated ATE, SE, weight used, SE by Bootstrap #' and a 95% confidence interval for primary endpoint (only when #' Bootstrap=TRUE) diff --git a/R/SCMboot.R b/R/legacy_SCMboot.R similarity index 100% rename from R/SCMboot.R rename to R/legacy_SCMboot.R diff --git a/R/method_DID_class.R b/R/method_DID_class.R index a31dcdd..09b4ca0 100644 --- a/R/method_DID_class.R +++ b/R/method_DID_class.R @@ -63,7 +63,10 @@ setup_method_DID <- function(method_name = "IPW", model_form_piA = "", model_form_mu0_rct = "", model_form_mu1_rct = "") { - .validate_method_base(method_name, bootstrap_flag, bootstrap_obj) + .Deprecated("did_ec_ipw") + checkmate::assert_string(method_name) + checkmate::assert_flag(bootstrap_flag) + checkmate::assert_class(bootstrap_obj, "bootstrap_obj") checkmate::assert_string(model_form_piS) checkmate::assert_string(model_form_piA) checkmate::assert_character(model_form_mu0_ext) diff --git a/R/method_SCM_class.R b/R/method_SCM_class.R index 592b02f..203675b 100644 --- a/R/method_SCM_class.R +++ b/R/method_SCM_class.R @@ -54,7 +54,10 @@ setup_method_SCM <- function(method_name = "SCM", nlambda = 10, parallel = "no", ncpus = 1) { - .validate_method_base(method_name, bootstrap_flag, bootstrap_obj) + .Deprecated("scm") + checkmate::assert_string(method_name) + checkmate::assert_flag(bootstrap_flag) + checkmate::assert_class(bootstrap_obj, "bootstrap_obj") checkmate::assert_number(lambda.min) checkmate::assert_number(lambda.max) if (lambda.max < lambda.min) { diff --git a/R/method_class.R b/R/method_class.R index b1aaaa3..55e15de 100644 --- a/R/method_class.R +++ b/R/method_class.R @@ -1,4 +1,4 @@ -#' method class +#' Method classes #' #' @slot method_name character. #' @slot bootstrap_flag Logical indicating whether bootstrap inference is used. @@ -24,18 +24,88 @@ contains = "method_obj" ) -.validate_method_base <- function(method_name, bootstrap_flag, bootstrap_obj) { - checkmate::assert_string(method_name) - checkmate::assert_flag(bootstrap_flag) - checkmate::assert_class(bootstrap_obj, "bootstrap_obj") +#' Run estimation for a method object +#' +#' S4 generic that dispatches to the appropriate estimation logic +#' based on the method class. Each method subclass (e.g., +#' \code{ec_ipw_method}, \code{did_ec_ipw_method}) implements its own +#' \code{estimate()} method containing the full estimation pipeline. +#' +#' @param method An S4 method object (e.g., from \code{\link{ec_ipw}}). +#' @param data Data frame with all subjects (RCT + external controls). +#' @param outcomes Character vector of outcome column names. +#' @param treatment Name of the treatment column. +#' @param trial_status Name of the trial participation column. +#' @param covariates Character vector of covariate column names. +#' @param alpha Significance level (default 0.05). +#' @param quiet Logical. Suppress output (default TRUE). +#' @param T_cross Integer crossover time point (OLE methods only). +#' @param ... Additional method-specific arguments. +#' +#' @return A list with estimation results. +#' @export +setGeneric("estimate", function(method, ...) standardGeneric("estimate")) + +#' Run stratified bootstrap and extract confidence intervals. +#' Shared by all method classes that support bootstrap inference. +#' Stratifies by interaction(S, A) to preserve group proportions. +#' @param df data frame with columns S (trial status) and A (treatment). +#' @param statistic function with signature (data, indices, ...) returning +#' a numeric vector of point estimates. +#' @param n_estimates number of estimates returned by statistic (length of tau). +#' @param bootstrap number of bootstrap replicates. +#' @param bootstrap_ci_type short CI type name ("perc", "bca", etc.). +#' @param alpha significance level for CIs. +#' @param ... additional arguments passed through to statistic. +#' @return list with lower_ci, upper_ci (vectors), and sd_boot (vector). +#' @noRd +.run_bootstrap <- function(df, statistic, n_estimates, bootstrap, + bootstrap_ci_type, alpha, ...) { + group_id <- as.integer(interaction(df$S, df$A, drop = TRUE)) + + ci_type_long <- switch( + bootstrap_ci_type, + norm = "normal", + bca = "bca", + stud = "student", + perc = "percent", + basic = "basic" + ) + + boot_out <- boot::boot( + data = df, + statistic = statistic, + R = bootstrap, + strata = group_id, + ... + ) + + ci_bounds <- vapply(seq_len(n_estimates), \(i) { + ci <- boot::boot.ci(boot_out, conf = 1 - alpha, + type = bootstrap_ci_type, index = i) + ci[[ci_type_long]][4:5] + }, numeric(2)) + + sd_boot <- sqrt(diag(var(boot_out$t))) + + list(lower_ci = ci_bounds[1, ], upper_ci = ci_bounds[2, ], sd_boot = sd_boot) } +#' Create a base method_obj (internal, used only in tests). +#' Users should call ec_ipw(), ec_aipw(), did_ec_ipw(), etc. instead. +#' @param method_name character identifier for the method. +#' @param bootstrap_flag logical whether bootstrap is used. +#' @param bootstrap_obj bootstrap_obj with replicates and CI type. +#' @return a method_obj S4 instance. +#' @noRd setup_method <- function(method_name = "", bootstrap_flag = FALSE, bootstrap_obj = .bootstrap_obj()) { - .validate_method_base(method_name, bootstrap_flag, bootstrap_obj) + checkmate::assert_string(method_name) + checkmate::assert_flag(bootstrap_flag) + checkmate::assert_class(bootstrap_obj, "bootstrap_obj") - method_obj <- .method_obj( + .method_obj( method_name = method_name, bootstrap_flag = bootstrap_flag, bootstrap_obj = bootstrap_obj diff --git a/R/method_weighting_class.R b/R/method_weighting_class.R index 186a4b1..cee3e04 100644 --- a/R/method_weighting_class.R +++ b/R/method_weighting_class.R @@ -71,7 +71,10 @@ setup_method_weighting <- function(method_name = "IPW", model_form_piA = "", model_form_mu0_rct = "", model_form_mu1_rct = "") { - .validate_method_base(method_name, bootstrap_flag, bootstrap_obj) + .Deprecated("ec_ipw") + checkmate::assert_string(method_name) + checkmate::assert_flag(bootstrap_flag) + checkmate::assert_class(bootstrap_obj, "bootstrap_obj") checkmate::assert_flag(optimal_weight_flag) checkmate::assert_number(wt) checkmate::assert_string(model_form_piS) @@ -93,3 +96,73 @@ setup_method_weighting <- function(method_name = "IPW", model_form_mu1_rct = model_form_mu1_rct ) } + +#' Build the internal data frame for primary weighting estimators. +#' Combines outcome matrix Y, trial status S, treatment A, and covariates +#' into a single data frame used by all ec_ipw/ec_aipw internals. +#' @param data user-supplied data frame. +#' @param outcomes character vector of outcome column names. +#' @param treatment name of the treatment column. +#' @param trial_status name of the trial participation column. +#' @param covariates character vector of covariate column names. +#' @return data frame with columns: outcome cols, S, A, covariate cols. +#' @noRd +.build_analysis_df <- function(data, outcomes, treatment, trial_status, + covariates) { + Y <- as.matrix(data[, outcomes, drop = FALSE]) + data.frame(Y, S = data[[trial_status]], A = data[[treatment]], + data[, covariates, drop = FALSE]) +} + +#' Format estimation results for primary weighting methods. +#' If bootstrap is requested, runs .run_bootstrap and returns boot CIs. +#' Otherwise returns sandwich SE with normal CIs. +#' @param tau numeric vector of point estimates. +#' @param sd_tau numeric vector of sandwich standard errors. +#' @param borrow_weight numeric borrowing weight used. +#' @param n_time number of time points (length of tau). +#' @param alpha significance level. +#' @param method S4 method object (checked for bootstrap slot). +#' @param df internal data frame (passed to bootstrap statistic). +#' @param quiet logical suppress output. +#' @param statistic bootstrap statistic function. +#' @param ... additional args passed to statistic via .run_bootstrap. +#' @return list with results (data.frame) and borrow_weight. +#' @noRd +.format_primary_results <- function(tau, sd_tau, borrow_weight, n_time, + alpha, method, df, quiet, statistic, + ...) { + cutoff <- qnorm(1 - alpha / 2) + + if (!is.null(method@bootstrap)) { + if (!quiet) cat("Running bootstrap inference...\n") + boot_res <- .run_bootstrap( + df = df, statistic = statistic, + n_estimates = n_time, bootstrap = method@bootstrap, + bootstrap_ci_type = method@bootstrap_ci_type, alpha = alpha, + borrow_wt = borrow_weight, ... + ) + results <- data.frame( + point_estimates = tau, + standard_deviation = boot_res$sd_boot, + lower_CI_boot = boot_res$lower_ci, + upper_CI_boot = boot_res$upper_ci, + row.names = paste0("tau", seq_len(n_time)) + ) + } else { + results <- data.frame( + point_estimates = tau, + standard_deviation = sd_tau, + lower_CI_normal = tau - sd_tau * cutoff, + upper_CI_normal = tau + sd_tau * cutoff, + row.names = paste0("tau", seq_len(n_time)) + ) + } + + list(results = results, borrow_weight = borrow_weight) +} + +# class unions---- +setClassUnion("numericOrNULL", c("numeric", "NULL")) +setClassUnion("characterOrNULL", c("character", "NULL")) + diff --git a/R/run_analysis.R b/R/run_analysis.R index 80a56d7..db8f332 100644 --- a/R/run_analysis.R +++ b/R/run_analysis.R @@ -7,10 +7,10 @@ #' @return a list containing: tau (effect size), sd.tau (standard deviation), wt (weight) #' @include EC_IPW_OPT.R #' @include EC_AIPW_OPT.R -#' @include DID_EC_IPW.R -#' @include DID_EC_AIPW.R -#' @include DID_EC_OR.R -#' @include SCM.R +#' @include legacy_DID_EC_IPW.R +#' @include legacy_DID_EC_AIPW.R +#' @include legacy_DID_EC_OR.R +#' @include legacy_SCM.R #' #' @export #' @@ -184,6 +184,30 @@ run_analysis <- function(analysis_obj, quiet = TRUE) { } else { stop("No such method is defined!") } + } else if (is(method, "ec_ipw_method") || is(method, "ec_aipw_method")) { + res <- estimate(method, + data = data, + outcomes = outcome_col_name, + treatment = treatment_col_name, + trial_status = trial_status_col_name, + covariates = covariates_col_name, + alpha = alpha, + quiet = quiet + ) + } else if (is(method, "did_ec_ipw_method") || + is(method, "did_ec_aipw_method") || + is(method, "did_ec_or_method") || + is(method, "scm_method")) { + res <- estimate(method, + data = data, + outcomes = outcome_col_name, + treatment = treatment_col_name, + trial_status = trial_status_col_name, + covariates = covariates_col_name, + alpha = alpha, + quiet = quiet, + T_cross = T_cross + ) } else { stop("No such method type is defined!") } diff --git a/R/scm.R b/R/scm.R new file mode 100644 index 0000000..375d30d --- /dev/null +++ b/R/scm.R @@ -0,0 +1,285 @@ +#' @include did_ec_ipw.R +NULL + +# s4 class definition---- +.scm_method <- setClass( + "scm_method", + contains = "method_SCM_obj", + slots = c( + lambda_min = "numeric", + lambda_max = "numeric", + nlambda = "integer", + parallel = "character", + ncpus = "integer", + bootstrap = "numeric", + bootstrap_ci_type = "character" + ), + prototype = list( + method_name = "SCM", + lambda_min = 0, + lambda_max = 0.1, + nlambda = 2L, + parallel = "no", + ncpus = 1L, + bootstrap = 100L, + bootstrap_ci_type = "perc" + ) +) + +# constructor---- + +#' SCM method constructor +#' +#' Creates a method object for synthetic control estimation with +#' external control borrowing for the open-label extension phase +#' (Zhou et al., 2024). Constructs a weighted combination of external +#' controls matching each RCT control subject on covariates and +#' pre-crossover outcomes. +#' +#' @param lambda_min Minimum penalty parameter for LOOCV. +#' @param lambda_max Maximum penalty parameter for LOOCV. +#' @param nlambda Number of lambda values to evaluate in LOOCV. +#' @param parallel Parallelization type for bootstrap (\code{"no"}, +#' \code{"multicore"}, or \code{"snow"}). +#' @param ncpus Number of CPUs for parallel bootstrap. +#' @param bootstrap Number of bootstrap replicates. Defaults to 200. +#' @param bootstrap_ci_type Bootstrap CI type. Defaults to \code{"perc"}. +#' +#' @return An S4 object of class \code{scm_method}. +#' +#' @references +#' Zhou et al. (2024). Estimating treatment effect in randomized trial +#' after control to treatment crossover using external controls. +#' \emph{Journal of Biopharmaceutical Statistics}. +#' \doi{10.1080/10543406.2024.2444222} +#' +#' @export +#' +#' @examples +#' scm(lambda_min = 0, lambda_max = 0.001, nlambda = 2, bootstrap = 50) +scm <- function(lambda_min = 0, + lambda_max = 0.1, + nlambda = 2L, + parallel = "no", + ncpus = 1L, + bootstrap = 200L, + bootstrap_ci_type = NULL) { + checkmate::assert_number(lambda_min, lower = 0) + checkmate::assert_number(lambda_max, lower = lambda_min) + checkmate::assert_count(nlambda, positive = TRUE) + checkmate::assert_choice(parallel, c("no", "multicore", "snow")) + checkmate::assert_count(ncpus, positive = TRUE) + checkmate::assert_count(bootstrap, positive = TRUE) + + if (is.null(bootstrap_ci_type)) { + bootstrap_ci_type <- "perc" + } + checkmate::assert_choice( + bootstrap_ci_type, c("perc", "bca", "norm", "basic", "stud") + ) + + .scm_method( + lambda_min = lambda_min, + lambda_max = lambda_max, + nlambda = as.integer(nlambda), + parallel = parallel, + ncpus = as.integer(ncpus), + bootstrap = as.integer(bootstrap), + bootstrap_ci_type = bootstrap_ci_type, + method_name = "SCM", + bootstrap_flag = TRUE, + bootstrap_obj = .bootstrap_obj( + replicates = as.integer(bootstrap), + bootstrap_CI_type = bootstrap_ci_type + ) + ) +} + +# estimate() method---- + +#' @rdname estimate +setMethod("estimate", "scm_method", function(method, data, outcomes, + treatment, trial_status, + covariates, alpha = 0.05, + quiet = TRUE, + T_cross) { + Y <- as.matrix(data[, outcomes, drop = FALSE]) + S <- data[[trial_status]] + A <- data[[treatment]] + n_time <- length(outcomes) + N <- nrow(data) + + df <- data.frame(Y, S = S, A = A, data[, covariates, drop = FALSE]) + long_term_col_name <- outcomes[(T_cross + 1):n_time] + + if (!quiet) cat("Running the synthetic control method...\n") + + # build attribute matrices (covariates + all outcomes, transposed) + X10 <- t(as.matrix(df[S == 1 & A == 0, c(covariates, outcomes), drop = FALSE])) + X00 <- t(as.matrix(df[S == 0, c(covariates, outcomes), drop = FALSE])) + colnames(X10) <- NULL + colnames(X00) <- NULL + + n10 <- sum(S == 1 & A == 0) + + # find optimal lambda via LOOCV + if (!quiet) cat("Performing cross validation for tuning parameter selection...\n") + lambda <- .scm_lambdacv( + ec = X00, + long_term_col_name = long_term_col_name, + lambda_min = method@lambda_min, + lambda_max = method@lambda_max, + nlambda = method@nlambda + ) + + # construct synthetic controls for each RCT control subject + if (!quiet) cat("Constructing pseudo controls for internal data...\n") + res <- lapply(seq_len(n10), .scm_subject_sc, + X10 = X10, X00 = X00, + long_term_col_name = long_term_col_name, lambda = lambda + ) + y_est_mat <- do.call(rbind, lapply(res, \(x) as.vector(x[[2]]))) + + # aggregate: treated OLE outcomes minus synthetic control OLE outcomes + Y_trt <- colMeans(df[S == 1 & A == 1, long_term_col_name, drop = FALSE]) + tau <- Y_trt - colMeans(y_est_mat) + + # bootstrap inference + if (!quiet) cat("Performing bootstrap inference...\n") + group_id <- as.integer(interaction(df$S, df$A, drop = TRUE)) + + ci_type_long <- switch(method@bootstrap_ci_type, + norm = "normal", bca = "bca", stud = "student", + perc = "percent", basic = "basic" + ) + + boot_out <- boot::boot( + data = df, + statistic = .scm_statistic, + outcomes = outcomes, + covariates = covariates, + T_cross = T_cross, + lambda = lambda, + parallel = method@parallel, + ncpus = method@ncpus, + R = method@bootstrap, + strata = group_id + ) + + n_ole <- n_time - T_cross + ci_bounds <- vapply(seq_len(n_ole), \(i) { + ci <- boot::boot.ci(boot_out, conf = 1 - alpha, + type = method@bootstrap_ci_type, index = i) + ci[[ci_type_long]][4:5] + }, numeric(2)) + + data.frame( + point_estimates = tau, + lower_CI_boot = ci_bounds[1, ], + upper_CI_boot = ci_bounds[2, ], + row.names = paste0("tau", (T_cross + 1):n_time) + ) +}) + +# internal helpers---- + +#' solve penalized SC optimization for one RCT control subject. +#' finds convex weights over EC subjects minimizing squared distance +#' in covariates + pre-crossover outcomes, with penalty for dissimilarity. +#' @param subject integer index of the target RCT control subject. +#' @param X10 attribute matrix for RCT controls (features x subjects). +#' @param X00 attribute matrix for external controls (features x subjects). +#' @param long_term_col_name names of OLE outcome rows. +#' @param lambda penalty parameter. +#' @return list with (1) weight vector and (2) predicted OLE outcomes. +#' @noRd +.scm_subject_sc <- function(subject, X10, X00, long_term_col_name, lambda) { + x1 <- X10[-which(row.names(X10) %in% long_term_col_name), subject] + X0 <- X00[-which(row.names(X10) %in% long_term_col_name), ] + + w <- CVXR::Variable(dim(X00)[2]) + loss <- sum(((x1 - X0 %*% w))^2) + penal <- lambda * (sum(colSums((x1 - X0)^2) * w)) + obj <- loss + penal + constr <- list(sum(w) == 1, w >= 0) + prob <- CVXR::Problem(CVXR::Minimize(obj), constr) + result <- solve(prob, solver = "ECOS") + + wt_est <- result$getValue(w) + y_est <- X00[long_term_col_name, ] %*% wt_est + + list(wt_est, y_est) +} + +#' find optimal lambda via leave-one-out cross-validation on EC subjects. +#' for each candidate lambda, holds out one EC subject, constructs SC +#' from remaining ECs, and measures prediction error on OLE outcomes. +#' @param ec attribute matrix for external controls (features x subjects). +#' @param long_term_col_name names of OLE outcome rows. +#' @param lambda_min minimum lambda. +#' @param lambda_max maximum lambda. +#' @param nlambda number of lambda values to evaluate. +#' @return optimal lambda value. +#' @noRd +.scm_lambdacv <- function(ec, long_term_col_name, + lambda_min = 0, lambda_max = 0.1, nlambda = 10) { + lambda_vals <- seq(lambda_min, lambda_max, length.out = nlambda) + + mse_vals <- vapply(lambda_vals, \(lambda) { + res <- lapply(seq_len(dim(ec)[2]), \(loocv) { + x1 <- ec[-which(row.names(ec) %in% long_term_col_name), loocv] + X0 <- ec[-which(row.names(ec) %in% long_term_col_name), -loocv] + + w <- CVXR::Variable(dim(ec)[2] - 1) + loss <- sum(((x1 - X0 %*% w))^2) + penal <- lambda * (sum(colSums((x1 - X0)^2) * w)) + obj <- loss + penal + constr <- list(sum(w) == 1, w >= 0) + prob <- CVXR::Problem(CVXR::Minimize(obj), constr) + result <- solve(prob, solver = "ECOS") + + wt_est <- result$getValue(w) + y_est <- ec[long_term_col_name, -loocv] %*% wt_est + list(wt_est, y_est) + }) + y_est_mat <- do.call(rbind, lapply(res, \(x) as.vector(x[[2]]))) + mean((ec[long_term_col_name, ] - t(y_est_mat))^2) + }, numeric(1)) + + lambda_vals[which.min(mse_vals)] +} + +#' bootstrap statistic for SCM. +#' reconstructs synthetic controls on resampled data with pre-computed lambda. +#' @param data internal data frame. +#' @param indices bootstrap sample indices. +#' @param outcomes outcome column names. +#' @param covariates covariate column names. +#' @param T_cross crossover time point. +#' @param lambda pre-computed penalty parameter. +#' @return numeric vector of tau estimates. +#' @noRd +.scm_statistic <- function(data, indices, outcomes, covariates, + T_cross, lambda) { + d <- data[indices, , drop = FALSE] + S <- d$S + A <- d$A + n_time <- length(outcomes) + long_term_col_name <- outcomes[(T_cross + 1):n_time] + + X10 <- t(as.matrix(d[S == 1 & A == 0, c(covariates, outcomes), drop = FALSE])) + X00 <- t(as.matrix(d[S == 0, c(covariates, outcomes), drop = FALSE])) + colnames(X10) <- NULL + colnames(X00) <- NULL + + n10 <- sum(S == 1 & A == 0) + + res <- lapply(seq_len(n10), .scm_subject_sc, + X10 = X10, X00 = X00, + long_term_col_name = long_term_col_name, lambda = lambda + ) + y_est_mat <- do.call(rbind, lapply(res, \(x) as.vector(x[[2]]))) + + Y_trt <- colMeans(d[S == 1 & A == 1, long_term_col_name, drop = FALSE]) + Y_trt - colMeans(y_est_mat) +} diff --git a/_pkgdown.yml b/_pkgdown.yml index 5a998a1..299d43d 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -18,6 +18,9 @@ navbar: aria-label: GitHub articles: + - title: Getting Started + contents: + - introduction - title: Primary Analysis contents: - primary_analysis_workflow @@ -47,8 +50,19 @@ reference: - run_simulation - setup_simulation_report - - title: Method Configuration - desc: Configure estimation methods + - title: Estimators + desc: Method constructors for estimation + contents: + - ec_ipw + - ec_aipw + - did_ec_ipw + - did_ec_aipw + - did_ec_or + - scm + - estimate + + - title: Method Configuration (legacy) + desc: Configure estimation methods (deprecated) contents: - setup_method_weighting - setup_method_DID diff --git a/man/did_ec_aipw.Rd b/man/did_ec_aipw.Rd new file mode 100644 index 0000000..e93b85e --- /dev/null +++ b/man/did_ec_aipw.Rd @@ -0,0 +1,55 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/did_ec_aipw.R +\name{did_ec_aipw} +\alias{did_ec_aipw} +\title{DID-EC-AIPW method} +\usage{ +did_ec_aipw( + ps_formula, + trt_formula = "", + outcome_formula, + bootstrap = 500L, + bootstrap_ci_type = NULL +) +} +\arguments{ +\item{ps_formula}{Formula string for the propensity score model +predicting trial participation.} + +\item{trt_formula}{Formula string for the treatment assignment model, +or \code{""} for marginal probability.} + +\item{outcome_formula}{Character vector of outcome model formulas, +one per time point.} + +\item{bootstrap}{Number of bootstrap replicates. Defaults to 500.} + +\item{bootstrap_ci_type}{Bootstrap CI type. Defaults to \code{"perc"}.} +} +\value{ +An S4 object of class \code{did_ec_aipw_method}. +} +\description{ +Creates a method object for difference-in-differences augmented IPW +estimation with external control borrowing for the open-label +extension phase (Zhou et al., 2024). +} +\examples{ +did_ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ), + bootstrap = 500 +) +} +\references{ +Zhou et al. (2024). Estimating treatment effect in randomized trial +after control to treatment crossover using external controls. +\emph{Journal of Biopharmaceutical Statistics}. +\doi{10.1080/10543406.2024.2444222} +} diff --git a/man/did_ec_ipw.Rd b/man/did_ec_ipw.Rd new file mode 100644 index 0000000..faaba62 --- /dev/null +++ b/man/did_ec_ipw.Rd @@ -0,0 +1,46 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/did_ec_ipw.R +\name{did_ec_ipw} +\alias{did_ec_ipw} +\title{DID-EC-IPW method} +\usage{ +did_ec_ipw( + ps_formula, + trt_formula = "", + bootstrap = 500L, + bootstrap_ci_type = NULL +) +} +\arguments{ +\item{ps_formula}{Formula string for the propensity score model +predicting trial participation.} + +\item{trt_formula}{Formula string for the treatment assignment model, +or \code{""} for marginal probability.} + +\item{bootstrap}{Number of bootstrap replicates (required for DID +methods). Defaults to 500.} + +\item{bootstrap_ci_type}{Bootstrap CI type. Defaults to \code{"perc"}.} +} +\value{ +An S4 object of class \code{did_ec_ipw_method}. +} +\description{ +Creates a method object for difference-in-differences IPW estimation +with external control borrowing for the open-label extension phase +(Zhou et al., 2024). +} +\examples{ +did_ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + bootstrap = 500 +) +} +\references{ +Zhou et al. (2024). Estimating treatment effect in randomized trial +after control to treatment crossover using external controls. +\emph{Journal of Biopharmaceutical Statistics}. +\doi{10.1080/10543406.2024.2444222} +} diff --git a/man/did_ec_or.Rd b/man/did_ec_or.Rd new file mode 100644 index 0000000..0aef013 --- /dev/null +++ b/man/did_ec_or.Rd @@ -0,0 +1,56 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/did_ec_or.R +\name{did_ec_or} +\alias{did_ec_or} +\title{DID-EC-OR method constructor} +\usage{ +did_ec_or( + outcome_formula_ext, + outcome_formula_rct_ctrl, + outcome_formula_rct_trt, + bootstrap = 500L, + bootstrap_ci_type = NULL +) +} +\arguments{ +\item{outcome_formula_ext}{Character vector of outcome model formulas +for external controls, one per time point.} + +\item{outcome_formula_rct_ctrl}{Character vector of outcome model +formulas for RCT control subjects, one per time point.} + +\item{outcome_formula_rct_trt}{Character vector of outcome model +formulas for RCT treated subjects, one per time point.} + +\item{bootstrap}{Number of bootstrap replicates. Defaults to 500.} + +\item{bootstrap_ci_type}{Bootstrap CI type. Defaults to \code{"perc"}.} +} +\value{ +An S4 object of class \code{did_ec_or_method}. +} +\description{ +Creates a method object for difference-in-differences outcome +regression estimation with external control borrowing for the +open-label extension phase (Zhou et al., 2024). +} +\examples{ +model_forms <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" +) +did_ec_or( + outcome_formula_ext = model_forms, + outcome_formula_rct_ctrl = model_forms, + outcome_formula_rct_trt = model_forms, + bootstrap = 500 +) +} +\references{ +Zhou et al. (2024). Estimating treatment effect in randomized trial +after control to treatment crossover using external controls. +\emph{Journal of Biopharmaceutical Statistics}. +\doi{10.1080/10543406.2024.2444222} +} diff --git a/man/ec_aipw.Rd b/man/ec_aipw.Rd new file mode 100644 index 0000000..fd54cd6 --- /dev/null +++ b/man/ec_aipw.Rd @@ -0,0 +1,74 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ec_aipw.R +\name{ec_aipw} +\alias{ec_aipw} +\title{EC-AIPW method} +\usage{ +ec_aipw( + ps_formula, + outcome_formula, + weight = NULL, + bootstrap = NULL, + bootstrap_ci_type = NULL +) +} +\arguments{ +\item{ps_formula}{Formula string for the propensity score model +predicting trial participation.} + +\item{outcome_formula}{Character vector of outcome model formulas, +one per time point (e.g., \code{c("y1 ~ x1 + x2", "y2 ~ x1 + x2")}).} + +\item{weight}{Borrowing weight. \code{NULL} (default) for data-adaptive +optimal weight, \code{0} for RCT-only, or a value in (0, 1].} + +\item{bootstrap}{Number of bootstrap replicates, or \code{NULL} +(default) for sandwich variance with normal CIs.} + +\item{bootstrap_ci_type}{Bootstrap CI type. Defaults to \code{"perc"}.} +} +\value{ +An S4 object of class \code{ec_aipw_method}. +} +\description{ +Creates a method object for augmented IPW estimation with external +control borrowing (Zhou et al., 2024). Augments the IPW estimator +with an outcome regression model for improved efficiency. Pass to +\code{\link{setup_analysis_primary}} and \code{\link{run_analysis}}. +} +\examples{ +# optimal weight, sandwich SE +ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) +) + +# no borrowing +ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + weight = 0 +) + +# fixed weight with bootstrap +ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + weight = 0.3, + bootstrap = 500 +) +} +\references{ +Zhou et al. (2024). Causal estimators for incorporating external +controls in randomized trials with longitudinal outcomes. +\emph{JRSS-A}. \doi{10.1093/jrsssa/qnae075} +} diff --git a/man/ec_ipw.Rd b/man/ec_ipw.Rd new file mode 100644 index 0000000..884980f --- /dev/null +++ b/man/ec_ipw.Rd @@ -0,0 +1,51 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ec_ipw.R +\name{ec_ipw} +\alias{ec_ipw} +\title{EC-IPW method constructor} +\usage{ +ec_ipw(ps_formula, weight = NULL, bootstrap = NULL, bootstrap_ci_type = NULL) +} +\arguments{ +\item{ps_formula}{Formula string for the propensity score model +predicting trial participation. The left-hand side is replaced +internally (e.g., \code{"S ~ x1 + x2 + x3"}).} + +\item{weight}{Borrowing weight. \code{NULL} (default) for data-adaptive +optimal weight, \code{0} for RCT-only, or a value in (0, 1].} + +\item{bootstrap}{Number of bootstrap replicates, or \code{NULL} +(default) for sandwich variance with normal CIs.} + +\item{bootstrap_ci_type}{Bootstrap CI type, or \code{NULL} (default) +which resolves to \code{"perc"} when \code{bootstrap} is set. One of +\code{"perc"}, \code{"bca"}, \code{"norm"}, \code{"basic"}, or +\code{"stud"}.} +} +\value{ +An S4 object of class \code{ec_ipw_method}. +} +\description{ +Creates a method object for IPW estimation with external control +borrowing (Zhou et al., 2024). Pass to \code{\link{setup_analysis_primary}} +and \code{\link{run_analysis}}. +} +\examples{ +# optimal weight, sandwich SE +ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5") + +# no borrowing +ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", weight = 0) + +# fixed weight with bootstrap +ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + weight = 0.3, + bootstrap = 500 +) +} +\references{ +Zhou et al. (2024). Causal estimators for incorporating external +controls in randomized trials with longitudinal outcomes. +\emph{JRSS-A}. \doi{10.1093/jrsssa/qnae075} +} diff --git a/man/estimate.Rd b/man/estimate.Rd new file mode 100644 index 0000000..995d0f4 --- /dev/null +++ b/man/estimate.Rd @@ -0,0 +1,115 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/method_class.R, R/ec_ipw.R, R/did_ec_ipw.R, +% R/did_ec_aipw.R, R/did_ec_or.R, R/ec_aipw.R, R/scm.R +\name{estimate} +\alias{estimate} +\alias{estimate,ec_ipw_method-method} +\alias{estimate,did_ec_ipw_method-method} +\alias{estimate,did_ec_aipw_method-method} +\alias{estimate,did_ec_or_method-method} +\alias{estimate,ec_aipw_method-method} +\alias{estimate,scm_method-method} +\title{Run estimation for a method object} +\usage{ +estimate(method, ...) + +\S4method{estimate}{ec_ipw_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE +) + +\S4method{estimate}{did_ec_ipw_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE, + T_cross +) + +\S4method{estimate}{did_ec_aipw_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE, + T_cross +) + +\S4method{estimate}{did_ec_or_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE, + T_cross +) + +\S4method{estimate}{ec_aipw_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE +) + +\S4method{estimate}{scm_method}( + method, + data, + outcomes, + treatment, + trial_status, + covariates, + alpha = 0.05, + quiet = TRUE, + T_cross +) +} +\arguments{ +\item{method}{An S4 method object (e.g., from \code{\link{ec_ipw}}).} + +\item{...}{Additional method-specific arguments.} + +\item{data}{Data frame with all subjects (RCT + external controls).} + +\item{outcomes}{Character vector of outcome column names.} + +\item{treatment}{Name of the treatment column.} + +\item{trial_status}{Name of the trial participation column.} + +\item{covariates}{Character vector of covariate column names.} + +\item{alpha}{Significance level (default 0.05).} + +\item{quiet}{Logical. Suppress output (default TRUE).} + +\item{T_cross}{Integer crossover time point (OLE methods only).} +} +\value{ +A list with estimation results. +} +\description{ +S4 generic that dispatches to the appropriate estimation logic +based on the method class. Each method subclass (e.g., +\code{ec_ipw_method}, \code{did_ec_ipw_method}) implements its own +\code{estimate()} method containing the full estimation pipeline. +} diff --git a/man/method_obj-class.Rd b/man/method_obj-class.Rd index e79f94b..e209000 100644 --- a/man/method_obj-class.Rd +++ b/man/method_obj-class.Rd @@ -4,9 +4,9 @@ \name{method_obj-class} \alias{method_obj-class} \alias{.method_obj} -\title{method class} +\title{Method classes} \description{ -method class +Method classes } \section{Slots}{ diff --git a/man/scm.Rd b/man/scm.Rd new file mode 100644 index 0000000..37c9aa3 --- /dev/null +++ b/man/scm.Rd @@ -0,0 +1,51 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/scm.R +\name{scm} +\alias{scm} +\title{SCM method constructor} +\usage{ +scm( + lambda_min = 0, + lambda_max = 0.1, + nlambda = 2L, + parallel = "no", + ncpus = 1L, + bootstrap = 200L, + bootstrap_ci_type = NULL +) +} +\arguments{ +\item{lambda_min}{Minimum penalty parameter for LOOCV.} + +\item{lambda_max}{Maximum penalty parameter for LOOCV.} + +\item{nlambda}{Number of lambda values to evaluate in LOOCV.} + +\item{parallel}{Parallelization type for bootstrap (\code{"no"}, +\code{"multicore"}, or \code{"snow"}).} + +\item{ncpus}{Number of CPUs for parallel bootstrap.} + +\item{bootstrap}{Number of bootstrap replicates. Defaults to 200.} + +\item{bootstrap_ci_type}{Bootstrap CI type. Defaults to \code{"perc"}.} +} +\value{ +An S4 object of class \code{scm_method}. +} +\description{ +Creates a method object for synthetic control estimation with +external control borrowing for the open-label extension phase +(Zhou et al., 2024). Constructs a weighted combination of external +controls matching each RCT control subject on covariates and +pre-crossover outcomes. +} +\examples{ +scm(lambda_min = 0, lambda_max = 0.001, nlambda = 2, bootstrap = 50) +} +\references{ +Zhou et al. (2024). Estimating treatment effect in randomized trial +after control to treatment crossover using external controls. +\emph{Journal of Biopharmaceutical Statistics}. +\doi{10.1080/10543406.2024.2444222} +} diff --git a/tests/testthat/test-DID_EC_AIPW.R b/tests/testthat/test-DID_EC_AIPW.R new file mode 100644 index 0000000..9c6c96a --- /dev/null +++ b/tests/testthat/test-DID_EC_AIPW.R @@ -0,0 +1,49 @@ +test_that("DID_EC_AIPW returns expected structure", { + model_form_mu <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + res <- rdborrow:::DID_EC_AIPW( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = model_form_mu, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_s3_class(res, "data.frame") + expect_identical(nrow(res), 2L) +}) + +test_that("DID_EC_AIPW produces finite estimates with valid CIs", { + model_form_mu <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + res <- rdborrow:::DID_EC_AIPW( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = model_form_mu, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_all_true(is.finite(res$point_estimates)) + expect_all_true(res$lower_CI_boot < res$upper_CI_boot) +}) diff --git a/tests/testthat/test-DID_EC_IPW.R b/tests/testthat/test-DID_EC_IPW.R new file mode 100644 index 0000000..e09c3c4 --- /dev/null +++ b/tests/testthat/test-DID_EC_IPW.R @@ -0,0 +1,53 @@ +test_that("DID_EC_IPW returns expected structure", { + res <- rdborrow:::DID_EC_IPW( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_s3_class(res, "data.frame") + expect_identical(nrow(res), 2L) + expect_named(res, c("point_estimates", "lower_CI_boot", "upper_CI_boot")) +}) + +test_that("DID_EC_IPW produces finite estimates with valid CIs", { + res <- rdborrow:::DID_EC_IPW( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_all_true(is.finite(res$point_estimates)) + expect_all_true(res$lower_CI_boot < res$upper_CI_boot) +}) + +test_that("DID_EC_IPW estimates effects for OLE period only", { + res <- rdborrow:::DID_EC_IPW( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_identical(rownames(res), c("tau3", "tau4")) +}) diff --git a/tests/testthat/test-DID_EC_OR.R b/tests/testthat/test-DID_EC_OR.R new file mode 100644 index 0000000..504be94 --- /dev/null +++ b/tests/testthat/test-DID_EC_OR.R @@ -0,0 +1,73 @@ +test_that("DID_EC_OR returns expected structure", { + model_form_mu <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + res <- rdborrow:::DID_EC_OR( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_mu0_ext = model_form_mu, + model_form_mu0_rct = model_form_mu, + model_form_mu1_rct = model_form_mu, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_s3_class(res, "data.frame") + expect_identical(nrow(res), 2L) +}) + +test_that("DID_EC_OR produces finite estimates with valid CIs", { + model_form_mu <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + res <- rdborrow:::DID_EC_OR( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_mu0_ext = model_form_mu, + model_form_mu0_rct = model_form_mu, + model_form_mu1_rct = model_form_mu, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_all_true(is.finite(res$point_estimates)) + expect_all_true(res$lower_CI_boot < res$upper_CI_boot) +}) + +test_that("DID_EC_OR estimates effects for OLE period only", { + model_form_mu <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + res <- rdborrow:::DID_EC_OR( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + model_form_mu0_ext = model_form_mu, + model_form_mu0_rct = model_form_mu, + model_form_mu1_rct = model_form_mu, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_identical(rownames(res), c("tau3", "tau4")) +}) diff --git a/tests/testthat/test-EC_AIPW_OPT.R b/tests/testthat/test-EC_AIPW_OPT.R new file mode 100644 index 0000000..3039b92 --- /dev/null +++ b/tests/testthat/test-EC_AIPW_OPT.R @@ -0,0 +1,109 @@ +test_that("EC_AIPW_OPT returns expected structure", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + optimal_weight_flag = FALSE, + wt = 0 + ) + expect_type(res, "list") + expect_named(res, c("results", "borrow_weight")) + expect_s3_class(res$results, "data.frame") + expect_identical(nrow(res$results), 2L) +}) + +test_that("EC_AIPW_OPT with w=0 uses only trial data", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + optimal_weight_flag = FALSE, + wt = 0 + ) + expect_identical(res$borrow_weight, 0) + expect_all_true(is.finite(res$results$point_estimates)) + expect_all_true(res$results$standard_deviation > 0) +}) + +test_that("EC_AIPW_OPT with optimal weight borrows from external controls", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + optimal_weight_flag = TRUE + ) + expect_gt(res$borrow_weight, 0) + expect_lt(res$borrow_weight, 1) + expect_all_true(is.finite(res$results$point_estimates)) +}) + +test_that("EC_AIPW_OPT with fixed weight uses the specified weight", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + optimal_weight_flag = FALSE, + wt = 0.3 + ) + expect_equal(res$borrow_weight, 0.3) +}) + +test_that("EC_AIPW_OPT normal CIs contain the point estimates", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + optimal_weight_flag = TRUE + ) + expect_all_true(res$results$point_estimates >= res$results$lower_CI_normal) + expect_all_true(res$results$point_estimates <= res$results$upper_CI_normal) +}) + +test_that("EC_AIPW_OPT works with a single outcome", { + res <- rdborrow:::EC_AIPW_OPT( + data = SyntheticData, + outcome_col_name = "y1", + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = "y1 ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE + ) + expect_identical(nrow(res$results), 1L) +}) diff --git a/tests/testthat/test-EC_IPW_OPT.R b/tests/testthat/test-EC_IPW_OPT.R new file mode 100644 index 0000000..079e00e --- /dev/null +++ b/tests/testthat/test-EC_IPW_OPT.R @@ -0,0 +1,140 @@ +test_that("EC_IPW_OPT returns expected structure", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = FALSE, + wt = 0 + ) + expect_type(res, "list") + expect_named(res, c("results", "borrow_weight")) + expect_s3_class(res$results, "data.frame") + expect_identical(nrow(res$results), 2L) + expect_named(res$results, c( + "point_estimates", "standard_deviation", + "lower_CI_normal", "upper_CI_normal" + )) +}) + +test_that("EC_IPW_OPT with w=0 uses only trial data", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = FALSE, + wt = 0 + ) + expect_identical(res$borrow_weight, 0) + expect_all_true(is.finite(res$results$point_estimates)) + expect_all_true(res$results$standard_deviation > 0) + expect_all_true(res$results$lower_CI_normal < res$results$upper_CI_normal) +}) + +test_that("EC_IPW_OPT with optimal weight borrows from external controls", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE + ) + expect_gt(res$borrow_weight, 0) + expect_lt(res$borrow_weight, 1) + expect_all_true(is.finite(res$results$point_estimates)) + expect_all_true(res$results$standard_deviation > 0) +}) + +test_that("EC_IPW_OPT with fixed weight uses the specified weight", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = FALSE, + wt = 0.3 + ) + expect_equal(res$borrow_weight, 0.3) +}) + +test_that("EC_IPW_OPT normal CIs contain the point estimates", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE + ) + expect_all_true(res$results$point_estimates >= res$results$lower_CI_normal) + expect_all_true(res$results$point_estimates <= res$results$upper_CI_normal) +}) + +test_that("EC_IPW_OPT optimal weight produces smaller or equal SE than w=0", { + res_no <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = FALSE, + wt = 0 + ) + res_opt <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE + ) + expect_all_true( + res_opt$results$standard_deviation <= res_no$results$standard_deviation + ) +}) + +test_that("EC_IPW_OPT works with a single outcome", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = "y1", + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE + ) + expect_identical(nrow(res$results), 1L) + expect_gt(res$borrow_weight, 0) +}) + +test_that("EC_IPW_OPT bootstrap returns boot CIs", { + res <- rdborrow:::EC_IPW_OPT( + data = SyntheticData, + outcome_col_name = c("y1", "y2"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + optimal_weight_flag = TRUE, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc" + ) + expect_named(res$results, c( + "point_estimates", "standard_deviation", + "lower_CI_boot", "upper_CI_boot" + )) + expect_all_true(res$results$lower_CI_boot < res$results$upper_CI_boot) +}) diff --git a/tests/testthat/test-SCM.R b/tests/testthat/test-SCM.R new file mode 100644 index 0000000..e660b7d --- /dev/null +++ b/tests/testthat/test-SCM.R @@ -0,0 +1,58 @@ +test_that("SCM returns expected structure", { + res <- rdborrow:::SCM( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc", + lambda.min = 0, + lambda.max = 1e-3, + nlambda = 2, + parallel = "no" + ) + expect_s3_class(res, "data.frame") + expect_identical(nrow(res), 2L) +}) + +test_that("SCM produces finite estimates with valid CIs", { + res <- rdborrow:::SCM( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc", + lambda.min = 0, + lambda.max = 1e-3, + nlambda = 2, + parallel = "no" + ) + expect_all_true(is.finite(res$point_estimates)) + expect_all_true(res$lower_CI_boot < res$upper_CI_boot) +}) + +test_that("SCM estimates effects for OLE period only", { + res <- rdborrow:::SCM( + data = SyntheticData, + outcome_col_name = c("y1", "y2", "y3", "y4"), + trial_status_col_name = "S", + treatment_col_name = "A", + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + Bootstrap = TRUE, + R = 50, + bootstrap_CI_type = "perc", + lambda.min = 0, + lambda.max = 1e-3, + nlambda = 2, + parallel = "no" + ) + expect_identical(rownames(res), c("tau3", "tau4")) +}) diff --git a/tests/testthat/test-analysis_OLE_class.R b/tests/testthat/test-analysis_OLE_class.R index f257cb3..0d989db 100644 --- a/tests/testthat/test-analysis_OLE_class.R +++ b/tests/testthat/test-analysis_OLE_class.R @@ -5,7 +5,7 @@ test_that("setup_analysis_OLE returns valid object", { treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_OLE_obj = setup_method_DID(method_name = "IPW"), + method_OLE_obj = suppressWarnings(setup_method_DID(method_name = "IPW")), T_cross = 2 ) expect_s4_class(obj, "analysis_OLE_obj") @@ -21,7 +21,7 @@ test_that("setup_analysis_OLE inherits base validation", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_OLE_obj = setup_method_DID(), + method_OLE_obj = suppressWarnings(setup_method_DID()), T_cross = 2 )) expect_error(setup_analysis_OLE( @@ -30,7 +30,7 @@ test_that("setup_analysis_OLE inherits base validation", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_OLE_obj = setup_method_DID(), + method_OLE_obj = suppressWarnings(setup_method_DID()), T_cross = 2 )) }) @@ -42,7 +42,7 @@ test_that("setup_analysis_OLE validates method type", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_OLE_obj = setup_method_weighting(), + method_OLE_obj = suppressWarnings(setup_method_weighting()), T_cross = 2 )) }) @@ -54,7 +54,7 @@ test_that("setup_analysis_OLE validates T_cross", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_OLE_obj = setup_method_DID(), + method_OLE_obj = suppressWarnings(setup_method_DID()), T_cross = -1 )) }) @@ -66,7 +66,7 @@ test_that("show method prints without error", { treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_OLE_obj = setup_method_DID(method_name = "IPW"), + method_OLE_obj = suppressWarnings(setup_method_DID(method_name = "IPW")), T_cross = 2 ) expect_output(show(obj), "analysis_OLE_obj") diff --git a/tests/testthat/test-analysis_primary_class.R b/tests/testthat/test-analysis_primary_class.R index fc53c2e..0e08e37 100644 --- a/tests/testthat/test-analysis_primary_class.R +++ b/tests/testthat/test-analysis_primary_class.R @@ -5,7 +5,7 @@ test_that("setup_analysis_primary returns valid object", { treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = setup_method_weighting(method_name = "IPW") + method_weighting_obj = suppressWarnings(setup_method_weighting(method_name = "IPW")) ) expect_s4_class(obj, "analysis_primary_obj") expect_identical(obj@method_obj@method_name, "IPW") @@ -19,7 +19,7 @@ test_that("setup_analysis_primary inherits base validation", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_weighting_obj = setup_method_weighting() + method_weighting_obj = suppressWarnings(setup_method_weighting()) )) expect_error(setup_analysis_primary( data = SyntheticData, @@ -27,7 +27,7 @@ test_that("setup_analysis_primary inherits base validation", { treatment_col_name = "A", outcome_col_name = "y1", covariates_col_name = "x1", - method_weighting_obj = setup_method_weighting() + method_weighting_obj = suppressWarnings(setup_method_weighting()) )) }) @@ -49,7 +49,7 @@ test_that("show method prints without error", { treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = setup_method_weighting(method_name = "IPW") + method_weighting_obj = suppressWarnings(setup_method_weighting(method_name = "IPW")) ) expect_output(show(obj), "analysis_primary_obj") expect_output(show(obj), "IPW") diff --git a/tests/testthat/test-full_pipeline_did_ec_aipw.R b/tests/testthat/test-full_pipeline_did_ec_aipw.R new file mode 100644 index 0000000..69eda22 --- /dev/null +++ b/tests/testthat/test-full_pipeline_did_ec_aipw.R @@ -0,0 +1,68 @@ +tol <- 1e-6 + +test_that("DID-EC-AIPW point estimates and bootstrap CIs", { + bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") + method <- suppressWarnings(setup_method_DID( + method_name = "AIPW", + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + )) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 2.0417274627, tolerance = tol) + expect_equal(res$point_estimates[2], 4.0361177594, tolerance = tol) +}) + +# new API---- + +test_that("did_ec_aipw() matches old API", { + method <- did_ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ), + bootstrap = 50 + ) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 2.0417274627, tolerance = tol) + expect_equal(res$point_estimates[2], 4.0361177594, tolerance = tol) +}) diff --git a/tests/testthat/test-full_pipeline_did_ec_ipw.R b/tests/testthat/test-full_pipeline_did_ec_ipw.R new file mode 100644 index 0000000..76237a4 --- /dev/null +++ b/tests/testthat/test-full_pipeline_did_ec_ipw.R @@ -0,0 +1,56 @@ +tol <- 1e-6 + +test_that("DID-EC-IPW point estimates and bootstrap CIs", { + bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") + method <- suppressWarnings(setup_method_DID( + method_name = "IPW", + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5" + )) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 2.0759256334, tolerance = tol) + expect_equal(res$point_estimates[2], 4.3894380397, tolerance = tol) +}) + +# new API---- + +test_that("did_ec_ipw() matches old API", { + method <- did_ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + bootstrap = 50 + ) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 2.0759256334, tolerance = tol) + expect_equal(res$point_estimates[2], 4.3894380397, tolerance = tol) +}) diff --git a/tests/testthat/test-full_pipeline_did_ec_or.R b/tests/testthat/test-full_pipeline_did_ec_or.R new file mode 100644 index 0000000..ea2c5aa --- /dev/null +++ b/tests/testthat/test-full_pipeline_did_ec_or.R @@ -0,0 +1,70 @@ +tol <- 1e-6 + +test_that("DID-EC-OR point estimates and bootstrap CIs", { + bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") + model_forms <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + method <- suppressWarnings(setup_method_DID( + method_name = "OR", + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + model_form_mu0_ext = model_forms, + model_form_mu0_rct = model_forms, + model_form_mu1_rct = model_forms + )) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 1.5689465845, tolerance = tol) + expect_equal(res$point_estimates[2], 4.4078336543, tolerance = tol) +}) + +# new API---- + +test_that("did_ec_or() matches old API", { + model_forms <- c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5", + "y3 ~ x1 + x2 + x3 + x4 + x5", + "y4 ~ x1 + x2 + x3 + x4 + x5" + ) + method <- did_ec_or( + outcome_formula_ext = model_forms, + outcome_formula_rct_ctrl = model_forms, + outcome_formula_rct_trt = model_forms, + bootstrap = 50 + ) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 1.5689465845, tolerance = tol) + expect_equal(res$point_estimates[2], 4.4078336543, tolerance = tol) +}) diff --git a/tests/testthat/test-full_pipeline_ec_aipw.R b/tests/testthat/test-full_pipeline_ec_aipw.R new file mode 100644 index 0000000..2fc5a19 --- /dev/null +++ b/tests/testthat/test-full_pipeline_ec_aipw.R @@ -0,0 +1,239 @@ +tol <- 1e-6 + +test_that("EC-AIPW zero weight (no borrowing)", { + method <- suppressWarnings(setup_method_weighting( + method_name = "AIPW", + optimal_weight_flag = FALSE, + wt = 0, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0) + expect_equal(res$results$point_estimates[1], -0.4361151144, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4422248202, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5552064946, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5701633244, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.5242998477, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.6752747610, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 0.6520696190, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.5597244013, tolerance = tol) +}) + +test_that("EC-AIPW optimal weight", { + method <- suppressWarnings(setup_method_weighting( + method_name = "AIPW", + optimal_weight_flag = TRUE, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5305614087, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5543275065, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.5862068777, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.5462869900, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 0.4935556277, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.6266369066, tolerance = tol) +}) + +test_that("EC-AIPW fixed weight 0.3", { + method <- suppressWarnings(setup_method_weighting( + method_name = "AIPW", + optimal_weight_flag = FALSE, + wt = 0.3, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.3) + expect_equal(res$results$point_estimates[1], -0.6602422289, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.6414189052, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5263737407, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5832105340, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.6919158032, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.5016527368, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 0.3714313453, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.7844905472, tolerance = tol) +}) + +test_that("EC-AIPW bootstrap preserves point estimates", { + bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") + method <- suppressWarnings(setup_method_weighting( + method_name = "AIPW", + optimal_weight_flag = TRUE, + wt = 0, + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", + model_form_mu0_ext = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5305614087, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5543275065, tolerance = tol) + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_named( + res$results, + c("point_estimates", "standard_deviation", "lower_CI_boot", "upper_CI_boot") + ) +}) + +# new API---- + +test_that("ec_aipw() optimal weight matches old API", { + method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) + ) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5305614087, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5543275065, tolerance = tol) +}) + +test_that("ec_aipw() weight=0 matches old API", { + method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + weight = 0 + ) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0) + expect_equal(res$results$point_estimates[1], -0.4361151144, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4422248202, tolerance = tol) +}) + +test_that("ec_aipw() weight=0.3 matches old API", { + method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + weight = 0.3 + ) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.3) + expect_equal(res$results$point_estimates[1], -0.6602422289, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.6414189052, tolerance = tol) +}) + +test_that("ec_aipw() with bootstrap", { + method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ), + bootstrap = 50, + bootstrap_ci_type = "perc" + ) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_named( + res$results, + c("point_estimates", "standard_deviation", "lower_CI_boot", "upper_CI_boot") + ) +}) diff --git a/tests/testthat/test-full_pipeline_ec_ipw.R b/tests/testthat/test-full_pipeline_ec_ipw.R new file mode 100644 index 0000000..45d8934 --- /dev/null +++ b/tests/testthat/test-full_pipeline_ec_ipw.R @@ -0,0 +1,211 @@ +tol <- 1e-6 + +test_that("setup_method_weighting emits deprecation warning", { + expect_warning( + setup_method_weighting( + method_name = "IPW", + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" + ), + "deprecated" + ) +}) + +test_that("EC-IPW zero weight (no borrowing)", { + method <- suppressWarnings(setup_method_weighting( + method_name = "IPW", + optimal_weight_flag = FALSE, + wt = 0, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_type(res, "list") + expect_named(res, c("results", "borrow_weight")) + expect_equal(res$borrow_weight, 0) + expect_equal(res$results$point_estimates[1], -0.0280870419, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4095955812, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5367986594, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5625763260, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.0801930814, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.6930337563, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 1.0240189975, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.5122249187, tolerance = tol) +}) + +test_that("EC-IPW optimal weight", { + method <- suppressWarnings(setup_method_weighting( + method_name = "IPW", + optimal_weight_flag = TRUE, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5134017502, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5410007056, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.2034458325, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.5906210120, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 0.8090520473, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.5300627854, tolerance = tol) +}) + +test_that("EC-IPW fixed weight 0.3", { + method <- suppressWarnings(setup_method_weighting( + method_name = "IPW", + optimal_weight_flag = FALSE, + wt = 0.3, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.3) + expect_equal(res$results$point_estimates[1], -0.3719934684, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5318680501, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5148090221, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5657008041, tolerance = tol) + expect_equal(res$results$lower_CI_normal[1], -1.3810006106, tolerance = tol) + expect_equal(res$results$lower_CI_normal[2], -0.5768851520, tolerance = tol) + expect_equal(res$results$upper_CI_normal[1], 0.6370136739, tolerance = tol) + expect_equal(res$results$upper_CI_normal[2], 1.6406212522, tolerance = tol) +}) + +test_that("EC-IPW bootstrap preserves point estimates", { + bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") + method <- suppressWarnings(setup_method_weighting( + method_name = "IPW", + optimal_weight_flag = TRUE, + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + wt = 0, + model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" + )) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5134017502, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5410007056, tolerance = tol) + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_named( + res$results, + c("point_estimates", "standard_deviation", "lower_CI_boot", "upper_CI_boot") + ) +}) + +# new API---- + +test_that("ec_ipw() optimal weight matches old API", { + method <- ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5") + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) + expect_equal(res$results$standard_deviation[1], 0.5134017502, tolerance = tol) + expect_equal(res$results$standard_deviation[2], 0.5410007056, tolerance = tol) +}) + +test_that("ec_ipw() weight=0 matches old API", { + method <- ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", weight = 0) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0) + expect_equal(res$results$point_estimates[1], -0.0280870419, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4095955812, tolerance = tol) +}) + +test_that("ec_ipw() weight=0.3 matches old API", { + method <- ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", weight = 0.3) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + res <- run_analysis(analysis) + + expect_equal(res$borrow_weight, 0.3) + expect_equal(res$results$point_estimates[1], -0.3719934684, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.5318680501, tolerance = tol) +}) + +test_that("ec_ipw() with bootstrap", { + method <- ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + bootstrap = 50, + bootstrap_ci_type = "perc" + ) + analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method + ) + + set.seed(42) + res <- run_analysis(analysis) + + expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) + expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) + expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) + expect_named( + res$results, + c("point_estimates", "standard_deviation", "lower_CI_boot", "upper_CI_boot") + ) +}) diff --git a/tests/testthat/test-full_pipeline_scm.R b/tests/testthat/test-full_pipeline_scm.R new file mode 100644 index 0000000..7ab9a35 --- /dev/null +++ b/tests/testthat/test-full_pipeline_scm.R @@ -0,0 +1,67 @@ +tol <- 1e-6 + +test_that("SCM point estimates match expected values", { + skip_on_cran() + + bootstrap_obj <- setup_bootstrap(replicates = 5, bootstrap_CI_type = "perc") + method <- suppressWarnings(setup_method_SCM( + method_name = "SCM", + bootstrap_flag = TRUE, + bootstrap_obj = bootstrap_obj, + lambda.min = 0.0005, + lambda.max = 0.0005, + nlambda = 1, + parallel = "no", + ncpus = 1 + )) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- suppressWarnings(run_analysis(analysis)) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_named(res, c("point_estimates", "lower_CI_boot", "upper_CI_boot")) + expect_equal(res$point_estimates[1], 2.1383459186, tolerance = tol) + expect_equal(res$point_estimates[2], 3.8894184460, tolerance = tol) + expect_all_true(res$lower_CI_boot <= res$upper_CI_boot) +}) + +# new API---- + +test_that("scm() matches old API point estimates", { + skip_on_cran() + + method <- scm( + lambda_min = 0.0005, + lambda_max = 0.0005, + nlambda = 1, + bootstrap = 5, + bootstrap_ci_type = "perc" + ) + analysis <- setup_analysis_OLE( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2", "y3", "y4"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + T_cross = 2, + method_OLE_obj = method + ) + + set.seed(42) + res <- suppressWarnings(run_analysis(analysis)) + + expect_s3_class(res, "data.frame") + expect_equal(nrow(res), 2) + expect_equal(res$point_estimates[1], 2.1383459186, tolerance = tol) + expect_equal(res$point_estimates[2], 3.8894184460, tolerance = tol) +}) diff --git a/tests/testthat/test-vignette_results_OLE_simulation.R b/tests/testthat/test-full_pipeline_simulation_OLE.R similarity index 99% rename from tests/testthat/test-vignette_results_OLE_simulation.R rename to tests/testthat/test-full_pipeline_simulation_OLE.R index 06c5d45..969fbeb 100644 --- a/tests/testthat/test-vignette_results_OLE_simulation.R +++ b/tests/testthat/test-full_pipeline_simulation_OLE.R @@ -89,7 +89,7 @@ test_that("OLE simulation report matches vignette", { "y4 ~ x1 + x2 + x3 + x4 + x5" ) - method_obj_list <- list( + method_obj_list <- suppressWarnings(list( setup_method_DID( method_name = "IPW", bootstrap_flag = TRUE, bootstrap_obj = bootstrap_obj, model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", @@ -107,7 +107,7 @@ test_that("OLE simulation report matches vignette", { model_form_mu0_rct = model_form_mu, model_form_mu1_rct = model_form_mu ) - ) + )) sim_obj <- setup_simulation_OLE( data_matrix_list = sim_data, diff --git a/tests/testthat/test-vignette_results_primary_simulation.R b/tests/testthat/test-full_pipeline_simulation_primary.R similarity index 99% rename from tests/testthat/test-vignette_results_primary_simulation.R rename to tests/testthat/test-full_pipeline_simulation_primary.R index 065ca4d..e494017 100644 --- a/tests/testthat/test-vignette_results_primary_simulation.R +++ b/tests/testthat/test-full_pipeline_simulation_primary.R @@ -121,7 +121,7 @@ generate_primary_sim_data <- function() { test_that("Primary simulation (parametric) report matches vignette", { sim_data <- generate_primary_sim_data() - method_obj_list <- list( + method_obj_list <- suppressWarnings(list( setup_method_weighting( method_name = "IPW", optimal_weight_flag = TRUE, model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" @@ -146,7 +146,7 @@ test_that("Primary simulation (parametric) report matches vignette", { "y2 ~ x1 + x2 + x3 + x4 + x5" ) ) - ) + )) sim_obj <- setup_simulation_primary( data_matrix_list_null = sim_data$null, diff --git a/tests/testthat/test-method_DID_class.R b/tests/testthat/test-method_DID_class.R index b82b4e8..494d161 100644 --- a/tests/testthat/test-method_DID_class.R +++ b/tests/testthat/test-method_DID_class.R @@ -1,5 +1,5 @@ test_that("setup_method_DID returns default object", { - obj <- setup_method_DID() + obj <- suppressWarnings(setup_method_DID()) expect_s4_class(obj, "method_DID_obj") expect_identical(obj@method_name, "IPW") expect_identical(obj@bootstrap_flag, FALSE) @@ -8,26 +8,26 @@ test_that("setup_method_DID returns default object", { }) test_that("setup_method_DID accepts valid arguments", { - obj <- setup_method_DID( + obj <- suppressWarnings(setup_method_DID( method_name = "AIPW", bootstrap_flag = TRUE, bootstrap_obj = setup_bootstrap(), model_form_piS = "S ~ x1 + x2", model_form_piA = "A ~ x1 + x2" - ) + )) expect_identical(obj@method_name, "AIPW") expect_identical(obj@model_form_piS, "S ~ x1 + x2") expect_identical(obj@model_form_piA, "A ~ x1 + x2") }) test_that("setup_method_DID inherits base validation", { - expect_error(setup_method_DID(method_name = 123)) - expect_error(setup_method_DID(bootstrap_flag = "yes")) - expect_error(setup_method_DID(bootstrap_obj = list())) + expect_warning(expect_error(setup_method_DID(method_name = 123))) + expect_warning(expect_error(setup_method_DID(bootstrap_flag = "yes"))) + expect_warning(expect_error(setup_method_DID(bootstrap_obj = list()))) }) test_that("setup_method_DID validates DID-specific args", { - expect_error(setup_method_DID(model_form_piS = 123)) - expect_error(setup_method_DID(model_form_piA = NA)) - expect_error(setup_method_DID(model_form_mu0_ext = 123)) + expect_warning(expect_error(setup_method_DID(model_form_piS = 123))) + expect_warning(expect_error(setup_method_DID(model_form_piA = NA))) + expect_warning(expect_error(setup_method_DID(model_form_mu0_ext = 123))) }) diff --git a/tests/testthat/test-method_SCM_class.R b/tests/testthat/test-method_SCM_class.R index 9560643..e19ba7b 100644 --- a/tests/testthat/test-method_SCM_class.R +++ b/tests/testthat/test-method_SCM_class.R @@ -1,5 +1,5 @@ test_that("setup_method_SCM returns valid object", { - obj <- setup_method_SCM(lambda.min = 0, lambda.max = 1e-3) + obj <- suppressWarnings(setup_method_SCM(lambda.min = 0, lambda.max = 1e-3)) expect_s4_class(obj, "method_SCM_obj") expect_identical(obj@method_name, "SCM") expect_identical(obj@lambda.min, 0) @@ -10,27 +10,27 @@ test_that("setup_method_SCM returns valid object", { }) test_that("setup_method_SCM accepts valid arguments", { - obj <- setup_method_SCM( + obj <- suppressWarnings(setup_method_SCM( lambda.min = 0, lambda.max = 1, nlambda = 20, parallel = "multicore", ncpus = 4 - ) + )) expect_identical(obj@nlambda, 20) expect_identical(obj@parallel, "multicore") expect_identical(obj@ncpus, 4) }) test_that("setup_method_SCM inherits base validation", { - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, method_name = 123)) - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, bootstrap_flag = "yes")) - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, bootstrap_obj = list())) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, method_name = 123))) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, bootstrap_flag = "yes"))) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, bootstrap_obj = list()))) }) test_that("setup_method_SCM validates SCM-specific args", { - expect_error(setup_method_SCM(lambda.min = 1, lambda.max = 0)) - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, nlambda = 0)) - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, parallel = "unknown")) - expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, ncpus = 0)) + expect_warning(expect_error(setup_method_SCM(lambda.min = 1, lambda.max = 0))) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, nlambda = 0))) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, parallel = "unknown"))) + expect_warning(expect_error(setup_method_SCM(lambda.min = 0, lambda.max = 1, ncpus = 0))) }) diff --git a/tests/testthat/test-method_weighting_class.R b/tests/testthat/test-method_weighting_class.R index 8167124..16dabe1 100644 --- a/tests/testthat/test-method_weighting_class.R +++ b/tests/testthat/test-method_weighting_class.R @@ -1,5 +1,5 @@ test_that("setup_method_weighting returns default object", { - obj <- setup_method_weighting() + obj <- suppressWarnings(setup_method_weighting()) expect_s4_class(obj, "method_weighting_obj") expect_identical(obj@method_name, "IPW") expect_identical(obj@optimal_weight_flag, FALSE) @@ -8,14 +8,14 @@ test_that("setup_method_weighting returns default object", { }) test_that("setup_method_weighting accepts valid arguments", { - obj <- setup_method_weighting( + obj <- suppressWarnings(setup_method_weighting( method_name = "AIPW", optimal_weight_flag = TRUE, wt = 0.5, bootstrap_flag = TRUE, bootstrap_obj = setup_bootstrap(), model_form_piS = "S ~ x1 + x2" - ) + )) expect_identical(obj@method_name, "AIPW") expect_identical(obj@optimal_weight_flag, TRUE) expect_identical(obj@wt, 0.5) @@ -23,13 +23,13 @@ test_that("setup_method_weighting accepts valid arguments", { }) test_that("setup_method_weighting inherits base validation", { - expect_error(setup_method_weighting(method_name = 123)) - expect_error(setup_method_weighting(bootstrap_flag = "yes")) - expect_error(setup_method_weighting(bootstrap_obj = list())) + expect_warning(expect_error(setup_method_weighting(method_name = 123))) + expect_warning(expect_error(setup_method_weighting(bootstrap_flag = "yes"))) + expect_warning(expect_error(setup_method_weighting(bootstrap_obj = list()))) }) test_that("setup_method_weighting validates weighting-specific args", { - expect_error(setup_method_weighting(optimal_weight_flag = "yes")) - expect_error(setup_method_weighting(wt = "heavy")) - expect_error(setup_method_weighting(model_form_piS = 123)) + expect_warning(expect_error(setup_method_weighting(optimal_weight_flag = "yes"))) + expect_warning(expect_error(setup_method_weighting(wt = "heavy"))) + expect_warning(expect_error(setup_method_weighting(model_form_piS = 123))) }) diff --git a/tests/testthat/test-vignette_results_OLE_analysis.R b/tests/testthat/test-vignette_results_OLE_analysis.R deleted file mode 100644 index 6bc41c7..0000000 --- a/tests/testthat/test-vignette_results_OLE_analysis.R +++ /dev/null @@ -1,169 +0,0 @@ -# Regression tests for the OLE Analysis Workflow vignette. -# These lock in the exact numerical outputs produced by the current codebase -# on the built-in SyntheticData, so any code change that alters results is caught. -# -# All OLE methods require bootstrap for inference. Point estimates are -# deterministic (seed-independent); bootstrap CIs are tested with set.seed(42). -# -# Reference: vignettes/OLE_analysis_workflow.Rmd - -# ---------- helpers shared across all test blocks ---------------------------- -common_args_ole <- list( - data = SyntheticData, - trial_status_col_name = "S", - treatment_col_name = "A", - outcome_col_name = c("y1", "y2", "y3", "y4"), - covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - T_cross = 2 -) - -model_form_mu_ole <- c( - "y1 ~ x1 + x2 + x3 + x4 + x5", - "y2 ~ x1 + x2 + x3 + x4 + x5", - "y3 ~ x1 + x2 + x3 + x4 + x5", - "y4 ~ x1 + x2 + x3 + x4 + x5" -) - -bootstrap_obj_ole <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") - -tol <- 1e-6 - -# ============================================================================= -# Section 1.1 DID – IPW -# ============================================================================= -test_that("OLE DID-IPW point estimates match vignette", { - method_obj <- setup_method_DID( - method_name = "IPW", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj_ole, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5" - ) - - analysis_obj <- do.call( - setup_analysis_OLE, - c(common_args_ole, list(method_OLE_obj = method_obj)) - ) - - set.seed(42) - res <- run_analysis(analysis_obj) - - # Return structure - expect_s3_class(res, "data.frame") - expect_equal(nrow(res), 2) - expect_true(all(c("point_estimates", "lower_CI_boot", "upper_CI_boot") %in% names(res))) - - # Point estimates (deterministic, seed-independent) - expect_equal(res$point_estimates[1], 2.0759256334, tolerance = tol) - expect_equal(res$point_estimates[2], 4.3894380397, tolerance = tol) - - # Bootstrap CIs (seed=42, replicates=50) - expect_equal(res$lower_CI_boot[1], -0.1985438132, tolerance = tol) - expect_equal(res$lower_CI_boot[2], 1.7688904174, tolerance = tol) - expect_equal(res$upper_CI_boot[1], 4.0048311338, tolerance = tol) - expect_equal(res$upper_CI_boot[2], 8.2132873144, tolerance = tol) -}) - -# ============================================================================= -# Section 1.2 DID – AIPW -# ============================================================================= -test_that("OLE DID-AIPW point estimates match vignette", { - method_obj <- setup_method_DID( - method_name = "AIPW", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj_ole, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = model_form_mu_ole - ) - - analysis_obj <- do.call( - setup_analysis_OLE, - c(common_args_ole, list(method_OLE_obj = method_obj)) - ) - - set.seed(42) - res <- run_analysis(analysis_obj) - - # Point estimates (deterministic) - expect_equal(res$point_estimates[1], 2.0417274627, tolerance = tol) - expect_equal(res$point_estimates[2], 4.0361177594, tolerance = tol) - - # Bootstrap CIs (seed=42, replicates=50) - expect_equal(res$lower_CI_boot[1], -0.2867417425, tolerance = tol) - expect_equal(res$lower_CI_boot[2], 1.8145638636, tolerance = tol) - expect_equal(res$upper_CI_boot[1], 4.0728301932, tolerance = tol) - expect_equal(res$upper_CI_boot[2], 8.6316113430, tolerance = tol) -}) - -# ============================================================================= -# Section 1.3 DID – OR (Outcome Regression) -# ============================================================================= -test_that("OLE DID-OR point estimates match vignette", { - method_obj <- setup_method_DID( - method_name = "OR", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj_ole, - model_form_mu0_ext = model_form_mu_ole, - model_form_mu0_rct = model_form_mu_ole, - model_form_mu1_rct = model_form_mu_ole - ) - - analysis_obj <- do.call( - setup_analysis_OLE, - c(common_args_ole, list(method_OLE_obj = method_obj)) - ) - - set.seed(42) - res <- run_analysis(analysis_obj) - - # Point estimates (deterministic) - expect_equal(res$point_estimates[1], 1.5689465845, tolerance = tol) - expect_equal(res$point_estimates[2], 4.4078336543, tolerance = tol) - - # Bootstrap CIs (seed=42, replicates=50) - expect_equal(res$lower_CI_boot[1], -0.5122539446, tolerance = tol) - expect_equal(res$lower_CI_boot[2], 2.7815366451, tolerance = tol) - expect_equal(res$upper_CI_boot[1], 3.1261896871, tolerance = tol) - expect_equal(res$upper_CI_boot[2], 7.9430452413, tolerance = tol) -}) - -# ============================================================================= -# Section 2 SCM – Synthetic Control Method -# ============================================================================= -test_that("OLE SCM runs and returns valid results", { - skip_on_cran() # scm is computationally expensive (~1 min) - - bootstrap_obj_scm <- setup_bootstrap(replicates = 5, bootstrap_CI_type = "perc") - - method_obj <- setup_method_SCM( - method_name = "SCM", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj_scm, - lambda.min = 0.0005, - lambda.max = 0.0005, - nlambda = 1, - parallel = "no", - ncpus = 1 - ) - - analysis_obj <- do.call( - setup_analysis_OLE, - c(common_args_ole, list(method_OLE_obj = method_obj)) - ) - - # few bootstrap replicates triggers "extreme order statistics" warnings ---- - set.seed(42) - res <- suppressWarnings(run_analysis(analysis_obj)) - - # structure checks ---- - expect_s3_class(res, "data.frame") - expect_equal(nrow(res), 2) - expect_true(all(c("point_estimates", "lower_CI_boot", "upper_CI_boot") %in% names(res))) - - # ecos solver is platform-dependent, so pin structure not exact values ---- - expect_true(all(is.finite(res$point_estimates))) - expect_true(all(is.finite(res$lower_CI_boot))) - expect_true(all(is.finite(res$upper_CI_boot))) - expect_true(all(res$lower_CI_boot <= res$upper_CI_boot)) -}) diff --git a/tests/testthat/test-vignette_results_primary_analysis.R b/tests/testthat/test-vignette_results_primary_analysis.R deleted file mode 100644 index 2f05c87..0000000 --- a/tests/testthat/test-vignette_results_primary_analysis.R +++ /dev/null @@ -1,246 +0,0 @@ -# Regression tests for the Primary Analysis Workflow vignette. -# These lock in the exact numerical outputs produced by the current codebase -# on the built-in SyntheticData, so any code change that alters results is caught. -# -# Reference: vignettes/primary_analysis_workflow.Rmd - -# ---------- helpers shared across all test blocks ---------------------------- -common_args <- list( - data = SyntheticData, - trial_status_col_name = "S", - treatment_col_name = "A", - outcome_col_name = c("y1", "y2"), - covariates_col_name = c("x1", "x2", "x3", "x4", "x5") -) - -tol <- 1e-6 # tolerance for floating-point comparison - -# ============================================================================= -# Section 2.1 IPW – zero weight (no borrowing) -# ============================================================================= -test_that("Primary IPW zero-weight results match vignette", { - method_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = FALSE, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - res <- run_analysis(analysis_obj) - - # Return structure - - expect_type(res, "list") - expect_named(res, c("results", "borrow_weight")) - expect_s3_class(res$results, "data.frame") - expect_equal(nrow(res$results), 2) - - # Point estimates (tau1, tau2) - expect_equal(res$results$point_estimates[1], -0.0280870419, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.4095955812, tolerance = tol) - - # Standard deviations - expect_equal(res$results$standard_deviation[1], 0.5367986594, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5625763260, tolerance = tol) - - # Normal CIs - expect_equal(res$results$lower_CI_normal[1], -1.0801930814, tolerance = tol) - expect_equal(res$results$lower_CI_normal[2], -0.6930337563, tolerance = tol) - expect_equal(res$results$upper_CI_normal[1], 1.0240189975, tolerance = tol) - expect_equal(res$results$upper_CI_normal[2], 1.5122249187, tolerance = tol) - - # Borrow weight - expect_equal(res$borrow_weight, 0) -}) - -# ============================================================================= -# Section 2.1 IPW – data-adaptive optimal weight -# ============================================================================= -test_that("Primary IPW optimal-weight results match vignette", { - method_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = TRUE, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - res <- run_analysis(analysis_obj) - - # Point estimates - expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) - - # Standard deviations - expect_equal(res$results$standard_deviation[1], 0.5134017502, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5410007056, tolerance = tol) - - # Normal CIs - expect_equal(res$results$lower_CI_normal[1], -1.2034458325, tolerance = tol) - expect_equal(res$results$lower_CI_normal[2], -0.5906210120, tolerance = tol) - expect_equal(res$results$upper_CI_normal[1], 0.8090520473, tolerance = tol) - expect_equal(res$results$upper_CI_normal[2], 1.5300627854, tolerance = tol) - - # Borrow weight - expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) -}) - -# ============================================================================= -# Section 2.2 AIPW – zero weight (no borrowing) -# ============================================================================= -test_that("Primary AIPW zero-weight results match vignette", { - method_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = FALSE, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( - "y1 ~ x1 + x2 + x3 + x4 + x5", - "y2 ~ x1 + x2 + x3 + x4 + x5" - ) - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - res <- run_analysis(analysis_obj) - - # Point estimates - expect_equal(res$results$point_estimates[1], -0.4361151144, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.4422248202, tolerance = tol) - - # Standard deviations - expect_equal(res$results$standard_deviation[1], 0.5552064946, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5701633244, tolerance = tol) - - # Normal CIs - expect_equal(res$results$lower_CI_normal[1], -1.5242998477, tolerance = tol) - expect_equal(res$results$lower_CI_normal[2], -0.6752747610, tolerance = tol) - expect_equal(res$results$upper_CI_normal[1], 0.6520696190, tolerance = tol) - expect_equal(res$results$upper_CI_normal[2], 1.5597244013, tolerance = tol) - - # Borrow weight - expect_equal(res$borrow_weight, 0) -}) - -# ============================================================================= -# Section 2.2 AIPW – data-adaptive optimal weight -# ============================================================================= -test_that("Primary AIPW optimal-weight results match vignette", { - method_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = TRUE, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( - "y1 ~ x1 + x2 + x3 + x4 + x5", - "y2 ~ x1 + x2 + x3 + x4 + x5" - ) - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - res <- run_analysis(analysis_obj) - - # Point estimates - expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) - - # Standard deviations - expect_equal(res$results$standard_deviation[1], 0.5305614087, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5543275065, tolerance = tol) - - # Normal CIs - expect_equal(res$results$lower_CI_normal[1], -1.5862068777, tolerance = tol) - expect_equal(res$results$lower_CI_normal[2], -0.5462869900, tolerance = tol) - expect_equal(res$results$upper_CI_normal[1], 0.4935556277, tolerance = tol) - expect_equal(res$results$upper_CI_normal[2], 1.6266369066, tolerance = tol) - - # Borrow weight (same as IPW optimal since same propensity model) - expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) -}) - -# ============================================================================= -# Section 3 Bootstrap inference – point estimates remain identical -# ============================================================================= -test_that("Primary IPW bootstrap point estimates match non-bootstrap", { - bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") - - method_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = TRUE, - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - set.seed(42) - res <- run_analysis(analysis_obj) - - # Point estimates must match the non-bootstrap run exactly - expect_equal(res$results$point_estimates[1], -0.1971968926, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.4697208867, tolerance = tol) - - # Standard deviations must match - expect_equal(res$results$standard_deviation[1], 0.5134017502, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5410007056, tolerance = tol) - - # Borrow weight must match - expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) - - # Bootstrap-specific columns exist - expect_true("lower_CI_boot" %in% names(res$results)) - expect_true("upper_CI_boot" %in% names(res$results)) -}) - -test_that("Primary AIPW bootstrap point estimates match non-bootstrap", { - bootstrap_obj <- setup_bootstrap(replicates = 50, bootstrap_CI_type = "perc") - - method_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = TRUE, - wt = 0, - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( - "y1 ~ x1 + x2 + x3 + x4 + x5", - "y2 ~ x1 + x2 + x3 + x4 + x5" - ) - ) - - analysis_obj <- do.call( - setup_analysis_primary, - c(common_args, list(method_weighting_obj = method_obj)) - ) - - set.seed(42) - res <- run_analysis(analysis_obj) - - # Point estimates must match the non-bootstrap AIPW optimal run - expect_equal(res$results$point_estimates[1], -0.5463256250, tolerance = tol) - expect_equal(res$results$point_estimates[2], 0.5401749583, tolerance = tol) - - expect_equal(res$results$standard_deviation[1], 0.5305614087, tolerance = tol) - expect_equal(res$results$standard_deviation[2], 0.5543275065, tolerance = tol) - - expect_equal(res$borrow_weight, 0.1475196487, tolerance = tol) -}) diff --git a/vignettes/OLE_analysis_workflow.Rmd b/vignettes/OLE_analysis_workflow.Rmd index 44b1d58..04898e5 100644 --- a/vignettes/OLE_analysis_workflow.Rmd +++ b/vignettes/OLE_analysis_workflow.Rmd @@ -25,169 +25,107 @@ switches to treatment. ### 1 DID methods -#### 1.1 IPW +#### 1.1 DID-EC-IPW ```{r message=FALSE, warning=FALSE} -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" +method <- did_ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + bootstrap = 50 ) -method_DID_obj <- setup_method_DID( - method_name = "IPW", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5" -) - -analysis_OLE_obj <- setup_analysis_OLE( +analysis <- setup_analysis_OLE( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), T_cross = 2, - method_OLE_obj = method_DID_obj + method_OLE_obj = method ) -res <- run_analysis(analysis_OLE_obj) +run_analysis(analysis) ``` -#### 1.2 AIPW +#### 1.2 DID-EC-AIPW ```{r message=FALSE, warning=FALSE} -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -model_form_mu <- c( +model_forms <- c( "y1 ~ x1 + x2 + x3 + x4 + x5", "y2 ~ x1 + x2 + x3 + x4 + x5", "y3 ~ x1 + x2 + x3 + x4 + x5", "y4 ~ x1 + x2 + x3 + x4 + x5" ) -method_DID_obj <- setup_method_DID( - method_name = "AIPW", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_piA = "A ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = model_form_mu +method <- did_ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + trt_formula = "A ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = model_forms, + bootstrap = 50 ) -analysis_OLE_obj <- setup_analysis_OLE( +analysis <- setup_analysis_OLE( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), T_cross = 2, - method_OLE_obj = method_DID_obj + method_OLE_obj = method ) -res <- run_analysis(analysis_OLE_obj) +run_analysis(analysis) ``` - -#### 1.3 OR +#### 1.3 DID-EC-OR ```{r message=FALSE, warning=FALSE} -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -model_form_mu <- c( +model_forms <- c( "y1 ~ x1 + x2 + x3 + x4 + x5", "y2 ~ x1 + x2 + x3 + x4 + x5", "y3 ~ x1 + x2 + x3 + x4 + x5", "y4 ~ x1 + x2 + x3 + x4 + x5" ) -method_DID_obj <- setup_method_DID( - method_name = "OR", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - model_form_mu0_ext = model_form_mu, - model_form_mu0_rct = model_form_mu, - model_form_mu1_rct = model_form_mu -) - -analysis_OLE_obj <- setup_analysis_OLE( - data = SyntheticData, - trial_status_col_name = "S", - treatment_col_name = "A", - outcome_col_name = c("y1", "y2", "y3", "y4"), - covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - T_cross = 2, - method_OLE_obj = method_DID_obj -) - -res <- run_analysis(analysis_OLE_obj) -res -``` - - -### 2 SCM method: with parallel computing -```{r} -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -method_SCM_obj <- setup_method_SCM( - method_name = "SCM", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - lambda.min = 0, - lambda.max = 1e-3, - nlambda = 10, - parallel = "no", - ncpus = 1 +method <- did_ec_or( + outcome_formula_ext = model_forms, + outcome_formula_rct_ctrl = model_forms, + outcome_formula_rct_trt = model_forms, + bootstrap = 50 ) -analysis_OLE_obj <- setup_analysis_OLE( +analysis <- setup_analysis_OLE( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), T_cross = 2, - method_OLE_obj = method_SCM_obj + method_OLE_obj = method ) -run_analysis(analysis_OLE_obj) +run_analysis(analysis) ``` -### 3 SCM method: without parallel computing +### 2 Synthetic control method ```{r} -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -method_SCM_obj <- setup_method_SCM( - method_name = "SCM", - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - lambda.min = 0, - lambda.max = 1e-3, +method <- scm( + lambda_min = 0, + lambda_max = 1e-3, nlambda = 10, - parallel = "no" + bootstrap = 50, + bootstrap_ci_type = "perc" ) -analysis_OLE_obj <- setup_analysis_OLE( +analysis <- setup_analysis_OLE( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2", "y3", "y4"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), T_cross = 2, - method_OLE_obj = method_SCM_obj + method_OLE_obj = method ) -run_analysis(analysis_OLE_obj) +run_analysis(analysis) ``` ## References diff --git a/vignettes/introduction.Rmd b/vignettes/introduction.Rmd new file mode 100644 index 0000000..201746c --- /dev/null +++ b/vignettes/introduction.Rmd @@ -0,0 +1,101 @@ +--- +title: "Introduction to rdborrow" +author: "Matt Secrest" +date: "`r Sys.Date()`" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Introduction to rdborrow} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup, include=FALSE} +library(rdborrow) +``` + +## Overview + +**rdborrow** implements causal inference methods for incorporating external +controls in randomized controlled trials (RCTs) with longitudinal outcomes. +The package provides tools for both analysis and simulation, enabling +researchers to evaluate different borrowing strategies and design trials +that leverage external data. + +The methods are motivated by the SUNFISH trial for spinal muscular atrophy +(SMA), where external controls from the olesoxime trial augment the +randomized data to improve statistical efficiency. + +## Methods + +### Primary analysis (placebo-controlled phase) + +These methods estimate the average treatment effect (ATE) during the +placebo-controlled phase by borrowing from external controls. + +| Method | Function | Reference | Description | +|--------|----------|-----------|-------------| +| EC-IPW | `ec_ipw()` | Zhou et al. (2024b) | Inverse probability weighting with external control borrowing | +| EC-AIPW | `ec_aipw()` | Zhou et al. (2024b) | Augmented IPW (doubly robust) | + +Both methods support: + +- **No borrowing** (`weight = 0`): uses only RCT data +- **Optimal weight** (`weight = NULL`): data-adaptive weight minimizing variance +- **Fixed weight** (`weight = 0.3`): user-specified borrowing amount +- **Sandwich variance** or **bootstrap inference** + +See `vignette("primary_analysis_workflow")` for usage. + +### OLE phase analysis (open-label extension) + +After the placebo-controlled phase, control subjects cross over to +treatment. These methods estimate the long-term treatment effect using +external controls who remain untreated. + +| Method | Function | Reference | Description | +|--------|----------|-----------|-------------| +| DID-EC-IPW | `did_ec_ipw()` | Zhou et al. (2024a) | Difference-in-differences with IPW | +| DID-EC-AIPW | `did_ec_aipw()` | Zhou et al. (2024a) | DID with augmented IPW | +| DID-EC-OR | `did_ec_or()` | Zhou et al. (2024a) | DID with outcome regression | +| SCM | `scm()` | Zhou et al. (2024a) | Synthetic control method | + +All OLE methods use bootstrap for inference. + +See `vignette("OLE_analysis_workflow")` for usage. + +### Simulation + +The simulation module evaluates estimator performance via Monte Carlo: + +- `simulate_trial()` generates synthetic trial + external control data +- `setup_simulation_primary()` / `setup_simulation_OLE()` configure simulations +- `run_simulation()` runs Monte Carlo experiments and reports bias, variance, MSE, coverage, power + +See `vignette("primary_simulation_workflow")` and +`vignette("OLE_simulation_workflow")` for usage. + +## Quick start + +```{r} +# create an EC-IPW method with optimal borrowing weight +method <- ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5") + +# set up the analysis +analysis <- setup_analysis_primary( + data = SyntheticData, + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method +) + +# run +run_analysis(analysis) +``` + +## References + +- Zhou X, Zhu J, Drake C, Pang H (2024). "Causal estimators for incorporating external controls in randomized trials with longitudinal outcomes." *Journal of the Royal Statistical Society Series A: Statistics in Society*. doi: [10.1093/jrsssa/qnae075](https://doi.org/10.1093/jrsssa/qnae075). +- Zhou X, Pang H, Drake C, Burger HU, Zhu J (2024). "Estimating treatment effect in randomized trial after control to treatment crossover using external controls." *Journal of Biopharmaceutical Statistics*. doi: [10.1080/10543406.2024.2444222](https://doi.org/10.1080/10543406.2024.2444222). +- Shi L, Pang H, Chen C, Zhu J (2025). "rdborrow: an R package for causal inference incorporating external controls in randomized controlled trials with longitudinal outcomes." *Journal of Biopharmaceutical Statistics*, 35(6), 1043-1066. doi: [10.1080/10543406.2025.2489283](https://doi.org/10.1080/10543406.2025.2489283). diff --git a/vignettes/primary_analysis_workflow.Rmd b/vignettes/primary_analysis_workflow.Rmd index 081d49c..8ed824b 100644 --- a/vignettes/primary_analysis_workflow.Rmd +++ b/vignettes/primary_analysis_workflow.Rmd @@ -13,8 +13,6 @@ vignette: > library(rdborrow) ``` - - ## Primary analysis This vignette demonstrates the primary analysis workflow using the @@ -23,203 +21,146 @@ EC-IPW and EC-AIPW weighting estimators proposed in for incorporating external controls in randomized trials with longitudinal outcomes. -### 1 load and visualize data +### 1 Load data ```{r} -# load the simulated dataset head(SyntheticData) ``` +### 2 EC-IPW -### 2 Estimation and inference: - -#### 2.1 Inverse probability weighting (IPW) -1) IPW with zero weight (wt = 0): - +#### 2.1 No borrowing (weight = 0) ```{r} -# test: within trial -## Data argument + column names (coxph, glm) -method_weighting_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = FALSE, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" +method <- ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + weight = 0 ) -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + method_weighting_obj = method ) -res <- run_analysis(analysis_primary_obj) -res +run_analysis(analysis) ``` -2) IPW with data-adaptive weight: +#### 2.2 Optimal weight (data-adaptive) ```{r} -# test: within trial -method_weighting_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = TRUE, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" -) +method <- ec_ipw(ps_formula = "S ~ x1 + x2 + x3 + x4 + x5") -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + method_weighting_obj = method ) -res <- run_analysis(analysis_primary_obj) - -res$borrow_weight -res$results +run_analysis(analysis) ``` -#### 2.2 Augmented inverse probability weighting (AIPW) - -The second approach is AIPW, which also accommodates two external borrowing strategies. - -1) AIPW with zero weight (wt = 0): +#### 2.3 Bootstrap inference ```{r} -# test: AIPW with 0 weight, should be same as IPW with 0 weight -method_weighting_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = FALSE, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( - "y1 ~ x1 + x2 + x3 + x4 + x5", - "y2 ~ x1 + x2 + x3 + x4 + x5" - ) +method <- ec_ipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + bootstrap = 50, + bootstrap_ci_type = "perc" ) -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + method_weighting_obj = method ) -res <- run_analysis(analysis_primary_obj) +run_analysis(analysis) ``` +### 3 EC-AIPW +EC-AIPW augments the IPW estimator with an outcome regression model, +making it doubly robust: consistent if either the propensity score model +or the outcome model is correctly specified. -2) AIPW with data adaptive weight: +#### 3.1 No borrowing (weight = 0) ```{r} -# test: AIPW with given weight -# bootstrap as part of the method and analysis -method_weighting_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = TRUE, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( +method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( "y1 ~ x1 + x2 + x3 + x4 + x5", "y2 ~ x1 + x2 + x3 + x4 + x5" - ) + ), + weight = 0 ) -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + method_weighting_obj = method ) -res <- run_analysis(analysis_primary_obj) +run_analysis(analysis) ``` - - -### 3 Bootstrap inference - -In this section we present Bootstrap inference results. We report Bootstrap confidence intervals with adjusted quantile ranges. - -1) IPW with bootstrap CI +#### 3.2 Optimal weight (data-adaptive) ```{r} -# test: within trial -## Data argument + column names (coxph, glm) -## having a bootstrap_flag outside the class -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -method_weighting_obj <- setup_method_weighting( - method_name = "IPW", - optimal_weight_flag = TRUE, - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - wt = 0, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5" +method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( + "y1 ~ x1 + x2 + x3 + x4 + x5", + "y2 ~ x1 + x2 + x3 + x4 + x5" + ) ) -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, trial_status_col_name = "S", treatment_col_name = "A", outcome_col_name = c("y1", "y2"), covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + method_weighting_obj = method ) -res <- run_analysis(analysis_primary_obj) -res +run_analysis(analysis) ``` -2) AIPW with bootstrap CI - +#### 3.3 Bootstrap inference ```{r} -# test: with optimal weight -## Data argument + column names (coxph, glm) -bootstrap_obj <- setup_bootstrap( - replicates = 50, - bootstrap_CI_type = "perc" -) - -method_weighting_obj <- setup_method_weighting( - method_name = "AIPW", - optimal_weight_flag = TRUE, - wt = 0, - bootstrap_flag = TRUE, - bootstrap_obj = bootstrap_obj, - model_form_piS = "S ~ x1 + x2 + x3 + x4 + x5", - model_form_mu0_ext = c( +method <- ec_aipw( + ps_formula = "S ~ x1 + x2 + x3 + x4 + x5", + outcome_formula = c( "y1 ~ x1 + x2 + x3 + x4 + x5", "y2 ~ x1 + x2 + x3 + x4 + x5" - ) + ), + bootstrap = 50, + bootstrap_ci_type = "perc" ) -analysis_primary_obj <- setup_analysis_primary( +analysis <- setup_analysis_primary( data = SyntheticData, - trial_status = "S", - treatment = "A", - outcome = c("y1", "y2"), - covariates = c("x1", "x2", "x3", "x4", "x5"), - method_weighting_obj = method_weighting_obj + trial_status_col_name = "S", + treatment_col_name = "A", + outcome_col_name = c("y1", "y2"), + covariates_col_name = c("x1", "x2", "x3", "x4", "x5"), + method_weighting_obj = method ) - -res <- run_analysis(analysis_primary_obj) - -## best to just do the last one, but also have options -## time dependent way of effects and modeling +run_analysis(analysis) ``` ### 4 Notes -1. When there are missing values in the data, the suggestion we have for now is to preprocess the dataset (such as deletion, imputing, etc.) to obtain a dataset without missingness, then apply the package. For general methodology development regarding missing values, we save it for future research work. +1. When there are missing values in the data, preprocess the dataset (deletion, imputation, etc.) to obtain a complete dataset before applying the package. ## References