Tabulate Biomarker Effects on Binary Response by Subgroup
Source: R/response_biomarkers_subgroups.R
Tabulate the estimated effects of multiple continuous biomarker variables on a binary response endpoint across population subgroups.
Usage
extract_rsp_biomarkers(
  variables,
  data,
  groups_lists = list(),
  control = control_logistic(),
  label_all = "All Patients"
)
tabulate_rsp_biomarkers(
  df,
  vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval")
)
Arguments
- variables (named list of string)
  list of additional analysis variables.
- data (data.frame)
  the dataset containing the variables to summarize.
- groups_lists (named list of list)
  optionally contains, for each subgroups variable, a list whose names define the new group levels and whose elements are character vectors of the original levels that belong to each new group.
- control (named list)
  controls for the response definition and the confidence level, as produced by control_logistic().
- label_all (string)
  label for the total population analysis.
- df (data.frame)
  containing all analysis variables, as returned by extract_rsp_biomarkers().
- vars (character)
  the names of the statistics to be reported, among n_tot (total number of patients per group), n_rsp (total number of responses per group), prop (total response proportion per group), or (odds ratio), ci (confidence interval of odds ratio) and pval (p value of the effect). Note that the statistics n_tot, or and ci are required.
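The groups_lists mapping described above can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's internal code: the toy data frame and the helper objects subsets and group_sizes are hypothetical, and only the shape of groups_lists follows the argument description.

```r
# Hypothetical toy data; BMRKR2 plays the role of the subgroups variable.
toy <- data.frame(
  id = 1:6,
  BMRKR2 = factor(
    c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH", "MEDIUM"),
    levels = c("LOW", "MEDIUM", "HIGH")
  )
)

# Names are the new group levels, elements are the original levels in each group.
groups_lists <- list(
  BMRKR2 = list(
    "low" = "LOW",
    "low/medium" = c("LOW", "MEDIUM"),
    "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
  )
)

# For each new group level, keep the rows whose original level belongs to it.
subsets <- lapply(
  groups_lists$BMRKR2,
  function(lvls) toy[toy$BMRKR2 %in% lvls, , drop = FALSE]
)

# Number of rows contributing to each new group.
group_sizes <- vapply(subsets, nrow, integer(1))
group_sizes
```

Note that the new groups may overlap: a row with level LOW contributes to all three groups in this sketch.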
Details
These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.
Functions
- extract_rsp_biomarkers(): prepares estimates for number of responses, patients and overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements rsp and biomarkers (vector of continuous biomarker variables) and optionally covariates, subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
- tabulate_rsp_biomarkers(): table creating function.
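To make the reported statistics concrete, the following base R sketch computes the same kinds of quantities (n_tot, n_rsp, prop, or, confidence limits and a Wald p-value) for one continuous biomarker in one population. It is a minimal illustration on simulated data, assuming a plain logistic regression with Wald-type intervals and no stratification; the data and all object names are hypothetical, and it is not claimed to reproduce the package's internal fitting routine.

```r
set.seed(1)
n <- 200

# Simulated data: one continuous biomarker, one covariate, binary response.
toy <- data.frame(
  biomarker = rnorm(n, mean = 5, sd = 2),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
toy$rsp <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.2 * toy$biomarker)) == 1

# Logistic regression of response on the biomarker, adjusted for the covariate.
fit <- glm(rsp ~ biomarker + sex, data = toy, family = binomial())
coefs <- summary(fit)$coefficients

est <- coefs["biomarker", "Estimate"]
se <- coefs["biomarker", "Std. Error"]
conf_level <- 0.95
z <- qnorm((1 + conf_level) / 2)

# Odds ratio per unit increase of the biomarker, with Wald confidence limits.
result <- data.frame(
  n_tot = nrow(toy),
  n_rsp = sum(toy$rsp),
  prop = mean(toy$rsp),
  or = exp(est),
  lcl = exp(est - z * se),
  ucl = exp(est + z * se),
  conf_level = conf_level,
  pval = coefs["biomarker", "Pr(>|z|)"]
)
result
```

Repeating such a fit for every biomarker within every subgroup, and stacking the one-row results, gives a data frame with the same general shape as the one printed in the examples.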
Note
You can also specify a continuous variable in rsp and then use the
response_definition control to convert that internally to a logical
variable reflecting binary response.
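As a rough sketch of this idea, a response definition string such as "I(response > 500)" can be turned into the left-hand side of a model formula by substituting the placeholder response with the actual variable name. The code below uses base R only; the simulated data and the object names (rsp_var, lhs, model_formula) are hypothetical, and the substitution shown is an assumption about how such a definition could be applied, not a copy of the package internals.

```r
set.seed(2)

# Simulated data: a continuous endpoint and a continuous biomarker.
toy <- data.frame(
  EOSDY = round(runif(100, min = 100, max = 900)),
  BMRKR1 = rnorm(100, mean = 6, sd = 3)
)

response_definition <- "I(response > 500)"
rsp_var <- "EOSDY"

# Replace the placeholder "response" by the name of the response variable.
lhs <- gsub("response", rsp_var, response_definition, fixed = TRUE)

# Build the model formula and fit a logistic regression on the derived logical response.
model_formula <- as.formula(paste(lhs, "~ BMRKR1"))
fit <- glm(model_formula, data = toy, family = binomial())

# The derived binary response, for reference.
binary_rsp <- toy$EOSDY > 500
table(binary_rsp)
```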
In contrast to tabulate_rsp_subgroups() this tabulation function does
not start from an input layout lyt. This is because internally the table is
created by combining multiple subtables.
See also
h_logistic_mult_cont_df() which is used internally.
h_tab_rsp_one_biomarker() which is used internally.
Examples
# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)
adrs <- synthetic_cdisc_data("latest")$adrs
adrs_labels <- formatters::var_labels(adrs)
adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")
# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in logistic regression models with one covariate `SEX`. The subgroups
# are defined by the levels of `BMRKR2`.
df <- extract_rsp_biomarkers(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adrs_f
)
df
#>   biomarker              biomarker_label n_tot n_rsp      prop        or
#> 1    BMRKR1 Continuous Level Biomarker 1   400   336 0.8400000 1.0573123
#> 2       AGE                          Age   400   336 0.8400000 0.9989522
#> 3    BMRKR1 Continuous Level Biomarker 1   135   120 0.8888889 1.0630022
#> 4       AGE                          Age   135   120 0.8888889 1.0414591
#> 5    BMRKR1 Continuous Level Biomarker 1   135   110 0.8148148 1.0224322
#> 6       AGE                          Age   135   110 0.8148148 1.0206702
#> 7    BMRKR1 Continuous Level Biomarker 1   130   106 0.8153846 1.1131370
#> 8       AGE                          Age   130   106 0.8153846 0.9688377
#>         lcl      ucl conf_level      pval     pval_label     subgroup    var
#> 1 0.9715084 1.150694       0.95 0.1968485 p-value (Wald) All Patients    ALL
#> 2 0.9634618 1.035750       0.95 0.9547035 p-value (Wald) All Patients    ALL
#> 3 0.8880719 1.272390       0.95 0.5054054 p-value (Wald)          LOW BMRKR2
#> 4 0.9606402 1.129077       0.95 0.3243039 p-value (Wald)          LOW BMRKR2
#> 5 0.9016818 1.159353       0.95 0.7293672 p-value (Wald)       MEDIUM BMRKR2
#> 6 0.9562410 1.089440       0.95 0.5385665 p-value (Wald)       MEDIUM BMRKR2
#> 7 0.9550957 1.297330       0.95 0.1700950 p-value (Wald)         HIGH BMRKR2
#> 8 0.9195637 1.020752       0.95 0.2345489 p-value (Wald)         HIGH BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
# Here we group the levels of `BMRKR2` manually, and we add a stratification
# variable `STRATA1`. We also use a continuous variable `EOSDY` as the
# response, which is then binarized internally (response is defined as this
# variable being larger than 500).
df_grouped <- extract_rsp_biomarkers(
  variables = list(
    rsp = "EOSDY",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2",
    strat = "STRATA1"
  ),
  data = adrs_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  ),
  control = control_logistic(
    response_definition = "I(response > 500)"
  )
)
df_grouped
#>   biomarker              biomarker_label n_tot n_rsp      prop        or
#> 1    BMRKR1 Continuous Level Biomarker 1   327   245 0.7492355 0.9773213
#> 2       AGE                          Age   327   245 0.7492355 1.0226967
#> 3    BMRKR1 Continuous Level Biomarker 1   108    82 0.7592593 0.9448599
#> 4       AGE                          Age   108    82 0.7592593 1.0319902
#> 5    BMRKR1 Continuous Level Biomarker 1   220   164 0.7454545 0.9398086
#> 6       AGE                          Age   220   164 0.7454545 1.0201601
#> 7    BMRKR1 Continuous Level Biomarker 1   327   245 0.7492355 0.9773213
#> 8       AGE                          Age   327   245 0.7492355 1.0226967
#>         lcl      ucl conf_level      pval     pval_label        subgroup    var
#> 1 0.9070459 1.053041       0.95 0.5468297 p-value (Wald)    All Patients    ALL
#> 2 0.9865104 1.060210       0.95 0.2220691 p-value (Wald)    All Patients    ALL
#> 3 0.8268817 1.079671       0.95 0.4045690 p-value (Wald)             low BMRKR2
#> 4 0.9674411 1.100846       0.95 0.3393094 p-value (Wald)             low BMRKR2
#> 5 0.8614063 1.025347       0.95 0.1624819 p-value (Wald)      low/medium BMRKR2
#> 6 0.9740974 1.068401       0.95 0.3971685 p-value (Wald)      low/medium BMRKR2
#> 7 0.9070459 1.053041       0.95 0.5468297 p-value (Wald) low/medium/high BMRKR2
#> 8 0.9865104 1.060210       0.95 0.2220691 p-value (Wald) low/medium/high BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
## Table with default columns.
# `df` is the data frame returned by `extract_rsp_biomarkers()` above.
tabulate_rsp_biomarkers(df)
#>                                   Total n   Responders   Response (%)   Odds Ratio      95% CI      p-value (Wald)
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Age                                                                                                               
#>   All Patients                      400        336          84.0%          1.00      (0.96, 1.04)       0.9547    
#>   Categorical Level Biomarker 2                                                                                   
#>     LOW                             135        120          88.9%          1.04      (0.96, 1.13)       0.3243    
#>     MEDIUM                          135        110          81.5%          1.02      (0.96, 1.09)       0.5386    
#>     HIGH                            130        106          81.5%          0.97      (0.92, 1.02)       0.2345    
#> Continuous Level Biomarker 1                                                                                      
#>   All Patients                      400        336          84.0%          1.06      (0.97, 1.15)       0.1968    
#>   Categorical Level Biomarker 2                                                                                   
#>     LOW                             135        120          88.9%          1.06      (0.89, 1.27)       0.5054    
#>     MEDIUM                          135        110          81.5%          1.02      (0.90, 1.16)       0.7294    
#>     HIGH                            130        106          81.5%          1.11      (0.96, 1.30)       0.1701    
## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_rsp_biomarkers(
  df = df,
  vars = c("n_rsp", "ci", "n_tot", "prop", "or")
)
## Finally produce the forest plot.
if (FALSE) {
g_forest(tab, xlim = c(0.7, 1.4))
}