Helper Functions for Tabulating Biomarker Effects on Survival by Subgroup

Helper functions which are documented here separately to not confuse the user when reading about the user-facing functions.

Usage

h_surv_to_coxreg_variables(variables, biomarker)

h_coxreg_mult_cont_df(variables, data, control = control_coxreg())

h_tab_surv_one_biomarker(df, vars, time_unit)

Arguments

variables: (named list of string)
list of additional analysis variables.
biomarker: (string)
the name of the biomarker variable.
data: (data.frame)
the dataset containing the variables to summarize.
control: (list)
a list of parameters as returned by the helper function control_coxreg().
df: (data.frame)
results for a single biomarker, as part of what is returned by extract_survival_biomarkers() (it needs a couple of columns which are added by that high-level function relative to what is returned by h_coxreg_mult_cont_df(), see the example).
vars: (character)
the name of statistics to be reported among n_tot_events (total number of events per group), n_tot (total number of observations per group), median (median survival time), hr (hazard ratio), ci (confidence interval of hazard ratio) and pval (p value of the effect). Note, one of the statistics n_tot and n_tot_events, as well as both hr and ci are required.
time_unit: (string)
label with unit of median survival time. Default NULL skips displaying unit.

Functions

h_surv_to_coxreg_variables(): helps with converting the "survival" function variable list to the "Cox regression" variable list. The reason is that currently there is an inconsistency between the variable names accepted by extract_survival_subgroups() and fit_coxreg_multivar().
h_coxreg_mult_cont_df(): prepares estimates for number of events, patients and median survival times, as well as hazard ratio estimates, confidence intervals and p-values, for multiple biomarkers in a given single data set. variables corresponds to names of variables found in data, passed as a named list and requires elements tte, is_event, biomarkers (vector of continuous biomarker variables) and optionally subgroups and strat.
h_tab_surv_one_biomarker(): prepares a single sub-table given a df_sub containing the results for a single biomarker.

Examples

# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)

adtte <- synthetic_cdisc_data("latest")$adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte, fill = FALSE)

adtte_f <- adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c("AVALU" = adtte_labels[["AVALU"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels

# This is how the variable list is converted internally.
h_surv_to_coxreg_variables(
  variables = list(
    tte = "AVAL",
    is_event = "EVNT",
    covariates = c("A", "B"),
    strata = "D"
  ),
  biomarker = "AGE"
)
#> $time
#> [1] "AVAL"
#> 
#> $event
#> [1] "EVNT"
#> 
#> $arm
#> [1] "AGE"
#> 
#> $covariates
#> [1] "A" "B"
#> 
#> $strata
#> [1] "D"
#> 

# For a single population, estimate separately the effects
# of two biomarkers.
df <- h_coxreg_mult_cont_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    strata = c("STRATA1", "STRATA2")
  ),
  data = adtte_f
)
df
#>   biomarker              biomarker_label n_tot n_tot_events   median        hr
#> 1    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9855566
#> 2       AGE                          Age   400          282 680.9598 1.0080554
#>         lcl      ucl conf_level      pval     pval_label
#> 1 0.9507602 1.021627       0.95 0.4276044 p-value (Wald)
#> 2 0.9925602 1.023793       0.95 0.3100440 p-value (Wald)

# If the data set is empty, still the corresponding rows with missings are returned.
h_coxreg_mult_cont_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "REGION1",
    strata = c("STRATA1", "STRATA2")
  ),
  data = adtte_f[NULL, ]
)
#>   biomarker              biomarker_label n_tot n_tot_events median hr lcl ucl
#> 1    BMRKR1 Continuous Level Biomarker 1     0            0     NA NA  NA  NA
#> 2       AGE                          Age     0            0     NA NA  NA  NA
#>   conf_level pval     pval_label
#> 1       0.95   NA p-value (Wald)
#> 2       0.95   NA p-value (Wald)

# Starting from above `df`, zoom in on one biomarker and add required columns.
df1 <- df[1, ]
df1$subgroup <- "All patients"
df1$row_type <- "content"
df1$var <- "ALL"
df1$var_label <- "All patients"
h_tab_surv_one_biomarker(
  df1,
  vars = c("n_tot", "n_tot_events", "median", "hr", "ci", "pval"),
  time_unit = "days"
)
#>                Total n   Total Events   Median (days)   Hazard Ratio   95% Wald CI    p-value (Wald)
#> ————————————————————————————————————————————————————————————————————————————————————————————————————
#> All patients     400         282            681.0           0.99       (0.95, 1.02)       0.4276