Tabulate Biomarker Effects on Binary Response by Subgroup

Tabulate the estimated effects of multiple continuous biomarker variables on a binary response endpoint across population subgroups.

Usage

extract_rsp_biomarkers(
  variables,
  data,
  groups_lists = list(),
  control = control_logistic(),
  label_all = "All Patients"
)

tabulate_rsp_biomarkers(
  df,
  vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval")
)

Arguments

variables: (named list of string)
names of the analysis variables found in data, passed as a named list (see Functions for the required and optional elements).
data: (data.frame)
the dataset containing the variables to summarize.
groups_lists: (named list of list)
optionally contains, for each subgroups variable, a list that defines new group levels: the names of that list are the new levels, and each element is a character vector of the original levels that belong to it.
control: (named list)
controls for the response definition and the confidence level, as produced by control_logistic().
label_all: (string)
label for the total population analysis.
df: (data.frame)
containing all analysis variables, as returned by extract_rsp_biomarkers().
vars: (character)
the names of the statistics to report, chosen from n_tot (total number of patients per group), n_rsp (total number of responses per group), prop (total response proportion per group), or (odds ratio), ci (confidence interval of the odds ratio) and pval (p-value of the effect). Note that the statistics n_tot, or and ci are required.
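The structure expected by groups_lists can be hard to picture from the description alone. The following is a minimal base R sketch, assuming toy data and a hypothetical helper split_by_groups() that is not part of the package; it only illustrates how the list names act as new group labels and how groups may overlap.

```r
# Toy data; BMRKR2 mirrors the subgroup variable used in the Examples.
dat <- data.frame(
  id = 1:6,
  BMRKR2 = factor(
    c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH", "MEDIUM"),
    levels = c("LOW", "MEDIUM", "HIGH")
  )
)

# Names are the new group labels; values are the original levels in each group.
groups_lists <- list(
  BMRKR2 = list(
    "low" = "LOW",
    "low/medium" = c("LOW", "MEDIUM"),
    "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
  )
)

# Hypothetical helper (an assumption for illustration, not the package's code):
# take one subset of the data per group definition.
split_by_groups <- function(data, var, groups) {
  lapply(groups, function(lvls) data[data[[var]] %in% lvls, , drop = FALSE])
}

subsets <- split_by_groups(dat, "BMRKR2", groups_lists$BMRKR2)
group_sizes <- vapply(subsets, nrow, integer(1))
group_sizes
```

Because the groups overlap, a patient can contribute to more than one subgroup row, and a group that lists every level reproduces the total population.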

Details

These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.

Functions

extract_rsp_biomarkers(): prepares estimates for the number of responses, the number of patients and the overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables gives the names of variables found in data as a named list; it requires the elements rsp and biomarkers (vector of continuous biomarker variables) and optionally accepts covariates, subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
tabulate_rsp_biomarkers(): table-creating function. It takes the data frame returned by extract_rsp_biomarkers() and reports the statistics selected in vars.
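To make the reported statistics concrete, here is a minimal base R sketch of how one row (one biomarker in one population) could be derived from a logistic regression. It uses simulated data and assumes a Wald-type confidence interval on the log odds scale; the package's internal computation may differ, for example when strat is used.

```r
set.seed(1)
n <- 200

# Simulated data (assumption): one continuous biomarker and one covariate.
dat <- data.frame(
  BMRKR1 = rexp(n, rate = 0.2),
  SEX = factor(sample(c("F", "M"), n, replace = TRUE))
)
dat$rsp <- rbinom(n, size = 1, prob = plogis(0.5 + 0.05 * dat$BMRKR1)) == 1

# Logistic regression of the binary response on the biomarker and covariate.
fit <- glm(rsp ~ BMRKR1 + SEX, data = dat, family = binomial())
est <- coef(summary(fit))["BMRKR1", ]

conf_level <- 0.95
z <- qnorm((1 + conf_level) / 2)

# Odds ratio per unit increase in the biomarker, with Wald interval and p-value.
res <- data.frame(
  biomarker = "BMRKR1",
  n_tot = nrow(dat),
  n_rsp = sum(dat$rsp),
  prop = mean(dat$rsp),
  or = exp(est[["Estimate"]]),
  lcl = exp(est[["Estimate"]] - z * est[["Std. Error"]]),
  ucl = exp(est[["Estimate"]] + z * est[["Std. Error"]]),
  conf_level = conf_level,
  pval = est[["Pr(>|z|)"]]
)
res
```

The column names in res follow the data frame shown in the Examples; repeating this fit within each subgroup and for each biomarker would give one row per combination.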

Note

You can also specify a continuous variable in rsp and then use the response_definition control to convert that internally to a logical variable reflecting binary response.

In contrast to tabulate_rsp_subgroups(), this tabulation function does not start from an input layout lyt, because internally the table is created by combining multiple subtables.
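The internal conversion of a continuous rsp variable described in the first note can be sketched in base R as follows. The helper binarize() is hypothetical and the data are made up; the sketch assumes that the placeholder response in the definition string stands for the rsp variable.

```r
# Toy data (assumption): a continuous variable to be turned into a response.
dat <- data.frame(EOSDY = c(120, 501, 730, 500, 365))

# Same definition string as used in the Examples.
response_definition <- "I(response > 500)"

# Hypothetical helper: evaluate the definition with `response` bound to the
# chosen column, returning a plain logical vector.
binarize <- function(data, rsp, definition) {
  as.logical(eval(parse(text = definition), list(response = data[[rsp]])))
}

rsp_logical <- binarize(dat, "EOSDY", response_definition)
rsp_logical

# Equivalent view (assumption): substitute the variable name to obtain a term
# usable on the left-hand side of a model formula.
lhs <- sub("response", "EOSDY", response_definition, fixed = TRUE)
lhs
```

Note that the comparison is strict, so a value of exactly 500 does not count as a response under this definition.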

Examples

# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)

adrs <- synthetic_cdisc_data("latest")$adrs
adrs_labels <- formatters::var_labels(adrs)

adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")
# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in logistic regression models with one covariate `SEX`. The subgroups
# are defined by the levels of `BMRKR2`.
df <- extract_rsp_biomarkers(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adrs_f
)
df
#>   biomarker              biomarker_label n_tot n_rsp      prop        or
#> 1    BMRKR1 Continuous Level Biomarker 1   400   336 0.8400000 1.0573123
#> 2       AGE                          Age   400   336 0.8400000 0.9989522
#> 3    BMRKR1 Continuous Level Biomarker 1   135   120 0.8888889 1.0630022
#> 4       AGE                          Age   135   120 0.8888889 1.0414591
#> 5    BMRKR1 Continuous Level Biomarker 1   135   110 0.8148148 1.0224322
#> 6       AGE                          Age   135   110 0.8148148 1.0206702
#> 7    BMRKR1 Continuous Level Biomarker 1   130   106 0.8153846 1.1131370
#> 8       AGE                          Age   130   106 0.8153846 0.9688377
#>         lcl      ucl conf_level      pval     pval_label     subgroup    var
#> 1 0.9715084 1.150694       0.95 0.1968485 p-value (Wald) All Patients    ALL
#> 2 0.9634618 1.035750       0.95 0.9547035 p-value (Wald) All Patients    ALL
#> 3 0.8880719 1.272390       0.95 0.5054054 p-value (Wald)          LOW BMRKR2
#> 4 0.9606402 1.129077       0.95 0.3243039 p-value (Wald)          LOW BMRKR2
#> 5 0.9016818 1.159353       0.95 0.7293672 p-value (Wald)       MEDIUM BMRKR2
#> 6 0.9562410 1.089440       0.95 0.5385665 p-value (Wald)       MEDIUM BMRKR2
#> 7 0.9550957 1.297330       0.95 0.1700950 p-value (Wald)         HIGH BMRKR2
#> 8 0.9195637 1.020752       0.95 0.2345489 p-value (Wald)         HIGH BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis

# Here we group the levels of `BMRKR2` manually and add a stratification
# variable `STRATA1`. We also use a continuous variable `EOSDY` as the
# response, which is binarized internally (response is defined as this
# variable being larger than 500).
df_grouped <- extract_rsp_biomarkers(
  variables = list(
    rsp = "EOSDY",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2",
    strat = "STRATA1"
  ),
  data = adrs_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  ),
  control = control_logistic(
    response_definition = "I(response > 500)"
  )
)
df_grouped
#>   biomarker              biomarker_label n_tot n_rsp      prop        or
#> 1    BMRKR1 Continuous Level Biomarker 1   327   245 0.7492355 0.9773213
#> 2       AGE                          Age   327   245 0.7492355 1.0226967
#> 3    BMRKR1 Continuous Level Biomarker 1   108    82 0.7592593 0.9448599
#> 4       AGE                          Age   108    82 0.7592593 1.0319902
#> 5    BMRKR1 Continuous Level Biomarker 1   220   164 0.7454545 0.9398086
#> 6       AGE                          Age   220   164 0.7454545 1.0201601
#> 7    BMRKR1 Continuous Level Biomarker 1   327   245 0.7492355 0.9773213
#> 8       AGE                          Age   327   245 0.7492355 1.0226967
#>         lcl      ucl conf_level      pval     pval_label        subgroup    var
#> 1 0.9070459 1.053041       0.95 0.5468297 p-value (Wald)    All Patients    ALL
#> 2 0.9865104 1.060210       0.95 0.2220691 p-value (Wald)    All Patients    ALL
#> 3 0.8268817 1.079671       0.95 0.4045690 p-value (Wald)             low BMRKR2
#> 4 0.9674411 1.100846       0.95 0.3393094 p-value (Wald)             low BMRKR2
#> 5 0.8614063 1.025347       0.95 0.1624819 p-value (Wald)      low/medium BMRKR2
#> 6 0.9740974 1.068401       0.95 0.3971685 p-value (Wald)      low/medium BMRKR2
#> 7 0.9070459 1.053041       0.95 0.5468297 p-value (Wald) low/medium/high BMRKR2
#> 8 0.9865104 1.060210       0.95 0.2220691 p-value (Wald) low/medium/high BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
## Table with default columns.
# Uses `df` from the first `extract_rsp_biomarkers()` call above.
tabulate_rsp_biomarkers(df)
#>                                   Total n   Responders   Response (%)   Odds Ratio      95% CI      p-value (Wald)
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Age                                                                                                               
#>   All Patients                      400        336          84.0%          1.00      (0.96, 1.04)       0.9547    
#>   Categorical Level Biomarker 2                                                                                   
#>     LOW                             135        120          88.9%          1.04      (0.96, 1.13)       0.3243    
#>     MEDIUM                          135        110          81.5%          1.02      (0.96, 1.09)       0.5386    
#>     HIGH                            130        106          81.5%          0.97      (0.92, 1.02)       0.2345    
#> Continuous Level Biomarker 1                                                                                      
#>   All Patients                      400        336          84.0%          1.06      (0.97, 1.15)       0.1968    
#>   Categorical Level Biomarker 2                                                                                   
#>     LOW                             135        120          88.9%          1.06      (0.89, 1.27)       0.5054    
#>     MEDIUM                          135        110          81.5%          1.02      (0.90, 1.16)       0.7294    
#>     HIGH                            130        106          81.5%          1.11      (0.96, 1.30)       0.1701    

## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_rsp_biomarkers(
  df = df,
  vars = c("n_rsp", "ci", "n_tot", "prop", "or")
)

## Finally produce the forest plot.
if (FALSE) {
g_forest(tab, xlim = c(0.7, 1.4))
}