Helper functions for tabulating biomarker effects on binary response by subgroup

These helper functions are documented separately so that users are not confused when reading about the user-facing functions.

Usage

h_rsp_to_logistic_variables(variables, biomarker)

h_logistic_mult_cont_df(variables, data, control = control_logistic())

h_tab_rsp_one_biomarker(df, vars, na_str = default_na_str(), .indent_mods = 0L)

Arguments

variables

(named list of string)
list of additional analysis variables.

biomarker

(string)
the name of the biomarker variable.

data

(data.frame)
the dataset containing the variables to summarize.

control

(named list)
list of controls for the response definition and the confidence level, as produced by control_logistic().

df

(data.frame)
results for a single biomarker, as part of what is returned by extract_rsp_biomarkers(). Compared with the output of h_logistic_mult_cont_df(), it needs some additional columns that are added by that high-level function (see the example).

vars

(character)
the names of statistics to be reported among:

n_tot: Total number of patients per group.
n_rsp: Total number of responses per group.
prop: Total response proportion per group.
or: Odds ratio.
ci: Confidence interval of odds ratio.
pval: p-value of the effect.

Note: the statistics n_tot, or, and ci are required.

na_str

(string)
string used to replace all NA or empty values in the output.

.indent_mods

(named integer)
indent modifiers for the labels. Defaults to 0, which corresponds to the unmodified default behavior. Can be negative.

Value

h_rsp_to_logistic_variables() returns a named list of elements response, arm, covariates, and strata.

h_logistic_mult_cont_df() returns a data.frame containing estimates and statistics for the selected biomarkers.
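The shape of such a result can be illustrated with a minimal base R sketch. It is not the package implementation: it assumes an unstratified model fitted with stats::glm(), a Wald-type confidence interval, and simulated data. The helper name sketch_logistic_one_biomarker() and the data set dat are hypothetical and exist only for this illustration.

```r
# Illustrative data only: simulated, not taken from tern_ex_adrs.
set.seed(1)
n <- 200
dat <- data.frame(
  rsp = rbinom(n, 1, 0.8) == 1,
  BMRKR1 = rexp(n, rate = 0.2),
  AGE = round(rnorm(n, mean = 35, sd = 7)),
  SEX = factor(sample(c("F", "M"), n, replace = TRUE))
)

# Hypothetical helper: one unstratified logistic fit for one biomarker.
sketch_logistic_one_biomarker <- function(biomarker, data, rsp = "rsp",
                                          covariates = NULL, conf_level = 0.95) {
  fml <- reformulate(c(biomarker, covariates), response = rsp)
  fit <- glm(fml, data = data, family = binomial())
  est <- coef(summary(fit))[biomarker, ]
  z <- qnorm((1 + conf_level) / 2)
  data.frame(
    biomarker = biomarker,
    n_tot = nrow(data),
    n_rsp = sum(data[[rsp]]),
    prop = mean(data[[rsp]]),
    # Odds ratio per unit increase of the biomarker, with a Wald interval.
    or = exp(est[["Estimate"]]),
    lcl = exp(est[["Estimate"]] - z * est[["Std. Error"]]),
    ucl = exp(est[["Estimate"]] + z * est[["Std. Error"]]),
    conf_level = conf_level,
    pval = est[["Pr(>|z|)"]],
    stringsAsFactors = FALSE
  )
}

# One row per biomarker, stacked into a single data.frame.
res <- do.call(
  rbind,
  lapply(c("BMRKR1", "AGE"), sketch_logistic_one_biomarker,
         data = dat, covariates = "SEX")
)
res
```

Each biomarker is fitted in its own model, so the rows of the result are independent estimates for the same population.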

h_tab_rsp_one_biomarker() returns an rtables table object with the given statistics arranged in columns.

Functions

h_rsp_to_logistic_variables(): helps with converting the "response" function variable list to the "logistic regression" variable list. This is needed because the variable names accepted by extract_rsp_subgroups() and fit_logistic() are currently inconsistent.
h_logistic_mult_cont_df(): prepares estimates of the number of responses, the number of patients and the overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers in a single data set. variables is a named list of names of variables found in data. It requires the elements rsp and biomarkers (vector of continuous biomarker variables), and optionally accepts covariates and strata.
h_tab_rsp_one_biomarker(): prepares a single sub-table given a df containing the results for a single biomarker.
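The variable list conversion described for h_rsp_to_logistic_variables() amounts to a renaming of list elements. The following is a minimal sketch inferred from the documented return value and the example output; sketch_rsp_to_logistic_variables() is a hypothetical stand-in, not the package source.

```r
# Hypothetical re-implementation for illustration; not the package source.
sketch_rsp_to_logistic_variables <- function(variables, biomarker) {
  stopifnot(is.list(variables), is.character(biomarker), length(biomarker) == 1)
  list(
    response = variables$rsp,         # "rsp" is renamed to "response"
    arm = biomarker,                  # the biomarker takes the place of the arm
    covariates = variables$covariates,
    strata = variables$strata
  )
}

out <- sketch_rsp_to_logistic_variables(
  variables = list(rsp = "RSP", covariates = c("A", "B"), strata = "D"),
  biomarker = "AGE"
)
str(out)
```

Placing the continuous biomarker in the arm slot is what lets the logistic regression machinery report its effect as the main term of interest.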

Examples

library(tern)
library(dplyr)
library(forcats)

adrs <- tern_ex_adrs
adrs_labels <- formatters::var_labels(adrs)

adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")

# This is how the variable list is converted internally.
h_rsp_to_logistic_variables(
  variables = list(
    rsp = "RSP",
    covariates = c("A", "B"),
    strata = "D"
  ),
  biomarker = "AGE"
)
#> $response
#> [1] "RSP"
#> 
#> $arm
#> [1] "AGE"
#> 
#> $covariates
#> [1] "A" "B"
#> 
#> $strata
#> [1] "D"
#> 

# For a single population, estimate separately the effects
# of two biomarkers.
df <- h_logistic_mult_cont_df(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX"
  ),
  data = adrs_f
)
df
#>   biomarker              biomarker_label n_tot n_rsp prop        or       lcl
#> 1    BMRKR1 Continuous Level Biomarker 1   200   164 0.82 0.9755036 0.8804862
#> 2       AGE                          Age   200   164 0.82 0.9952416 0.9462617
#>        ucl conf_level      pval     pval_label
#> 1 1.080775       0.95 0.6352602 p-value (Wald)
#> 2 1.046757       0.95 0.8530389 p-value (Wald)

# If the data set is empty, the corresponding rows are still returned,
# with the estimates set to missing values.
h_logistic_mult_cont_df(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    strata = "STRATA1"
  ),
  data = adrs_f[NULL, ]
)

# Starting from `df` above, zoom in on one biomarker and add the required columns.
df1 <- df[1, ]
df1$subgroup <- "All patients"
df1$row_type <- "content"
df1$var <- "ALL"
df1$var_label <- "All patients"

h_tab_rsp_one_biomarker(
  df1,
  vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval")
)
#>                Total n   Responders   Response (%)   Odds Ratio      95% CI      p-value (Wald)
#> ———————————————————————————————————————————————————————————————————————————————————————————————
#> All patients     200        164          82.0%          0.98      (0.88, 1.08)       0.6353
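The cell values in the table above can be reproduced from the first row of `df` with plain string formatting. This is only a sketch of how the data frame columns map to the displayed columns: the sprintf() formats are assumptions chosen to match the printed output, and the package itself formats cells through rtables. The names row1 and cells are hypothetical.

```r
# Values copied from the first row of `df` in the example output above.
row1 <- list(
  n_tot = 200L, n_rsp = 164L, prop = 0.82,
  or = 0.9755036, lcl = 0.8804862, ucl = 1.080775, pval = 0.6352602
)

# One formatted string per displayed column (formats are assumptions).
cells <- c(
  "Total n" = format(row1$n_tot),
  "Responders" = format(row1$n_rsp),
  "Response (%)" = sprintf("%.1f%%", 100 * row1$prop),
  "Odds Ratio" = sprintf("%.2f", row1$or),
  "95% CI" = sprintf("(%.2f, %.2f)", row1$lcl, row1$ucl),
  "p-value (Wald)" = sprintf("%.4f", row1$pval)
)
cells
```

The ci statistic combines the lcl and ucl columns into one cell, which is why vars uses the single name ci rather than two separate entries.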