Tabulate Biomarker Effects on Survival by Subgroup
Source: R/survival_biomarkers_subgroups.R, survival_biomarkers_subgroups.Rd
Tabulate the estimated effects of multiple continuous biomarker variables across population subgroups.
Usage
extract_survival_biomarkers(
  variables,
  data,
  groups_lists = list(),
  control = control_coxreg(),
  label_all = "All Patients"
)
tabulate_survival_biomarkers(
  df,
  vars = c("n_tot", "n_tot_events", "median", "hr", "ci", "pval"),
  time_unit = NULL
)
Arguments
- variables (named list of string): list of additional analysis variables.
- data (data.frame): the dataset containing the variables to summarize.
- groups_lists (named list of list): optionally contains, for each subgroups variable, a list that specifies the new group levels via its names and, via the character vectors that are its elements, the original levels that belong to each new group.
- control (list): a list of parameters as returned by the helper function control_coxreg().
- label_all (string): label for the total population analysis.
- df (data.frame): containing all analysis variables, as returned by extract_survival_biomarkers().
- vars (character): the names of the statistics to report, chosen from n_tot_events (total number of events per group), n_tot (total number of observations per group), median (median survival time), hr (hazard ratio), ci (confidence interval of the hazard ratio) and pval (p-value of the effect). Note that at least one of n_tot and n_tot_events is required, as are both hr and ci.
- time_unit (string): label with the unit of the median survival time. The default NULL skips displaying the unit.
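The structure expected by groups_lists can be easier to see in a small base R sketch. The vector `bmrkr2` and the counting step below are illustrative assumptions only, not how the package processes the argument internally; the point is that each name defines a new group level, each character vector lists the original levels it collects, and the resulting groups may overlap.

```r
# Hypothetical subgroup variable with three original levels.
bmrkr2 <- factor(
  c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH"),
  levels = c("LOW", "MEDIUM", "HIGH")
)

# Same shape as the `groups_lists` argument: one entry per subgroup variable,
# with names giving the new group levels and values the original levels.
groups_lists <- list(
  BMRKR2 = list(
    "low" = "LOW",
    "low/medium" = c("LOW", "MEDIUM"),
    "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
  )
)

# Count how many observations fall into each (possibly overlapping) new group.
group_sizes <- vapply(
  groups_lists$BMRKR2,
  function(lvls) sum(bmrkr2 %in% lvls),
  integer(1)
)
group_sizes
# low = 2, low/medium = 3, low/medium/high = 5
```

Because the groups are nested here, an observation with level LOW is counted in all three groups, which is why subgroup totals need not add up to the overall total.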
Details
These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.
Functions
- extract_survival_biomarkers(): prepares estimates of the number of events, the number of patients and median survival times, as well as hazard ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event, biomarkers (vector of continuous biomarker variables) and optionally subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
- tabulate_survival_biomarkers(): table creating function.
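To make the reported statistics concrete, here is a minimal sketch of how a hazard ratio, its confidence interval and a Wald p-value could be obtained for each continuous biomarker with the survival package. It is not the package's internal implementation (which relies on h_coxreg_mult_cont_df()); the helper `fit_one_biomarker()`, the use of the `lung` dataset, and the choice of `age` and `ph.karno` as stand-in biomarkers are all assumptions made for illustration.

```r
library(survival)

# Hypothetical helper (not part of tern): fit one Cox model for a single
# continuous biomarker, adjusting for the covariate `sex`, and collect the
# statistics for that biomarker in a one-row data frame.
fit_one_biomarker <- function(biomarker, data) {
  form <- as.formula(paste("Surv(time, status == 2) ~", biomarker, "+ sex"))
  fit <- coxph(form, data = data)
  est <- summary(fit, conf.int = 0.95)
  data.frame(
    biomarker = biomarker,
    n_tot = fit$n,                # observations used in the fit
    n_tot_events = fit$nevent,    # events among those observations
    hr = est$conf.int[biomarker, "exp(coef)"],
    lcl = est$conf.int[biomarker, "lower .95"],
    ucl = est$conf.int[biomarker, "upper .95"],
    pval = est$coefficients[biomarker, "Pr(>|z|)"]  # Wald test p-value
  )
}

# One model per biomarker, stacked into a single data frame.
res <- do.call(
  rbind,
  lapply(c("age", "ph.karno"), fit_one_biomarker, data = lung)
)
res
```

Repeating such a fit within each subgroup of the data, and stacking the rows, gives a data frame of the same general shape as the one returned by extract_survival_biomarkers(). A stratification variable could be added to the formula with survival's strata() term.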
Note
In contrast to tabulate_survival_subgroups(), this tabulation function does
not start from an input layout lyt. This is because internally the table is
created by combining multiple subtables.
See also
h_coxreg_mult_cont_df() which is used internally.
h_tab_surv_one_biomarker() which is used internally.
Examples
# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)
adtte <- synthetic_cdisc_data("latest")$adtte
# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)
adtte_f <- adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c("AVALU" = adtte_labels[["AVALU"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels
# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in multiple regression models containing one covariate `SEX`,
# as well as one stratification variable `STRATA1`. The subgroups
# are defined by the levels of `BMRKR2`.
df <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adtte_f
)
df
#>   biomarker              biomarker_label n_tot n_tot_events   median        hr
#> 1    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 2       AGE                          Age   400          282 680.9598 1.0060610
#> 3    BMRKR1 Continuous Level Biomarker 1   135           95 647.7467 1.0155961
#> 4       AGE                          Age   135           95 647.7467 1.0095516
#> 5    BMRKR1 Continuous Level Biomarker 1   135           93 646.4069 0.9800800
#> 6       AGE                          Age   135           93 646.4069 1.0228066
#> 7    BMRKR1 Continuous Level Biomarker 1   130           94 761.2290 0.9437920
#> 8       AGE                          Age   130           94 761.2290 1.0035277
#>         lcl      ucl conf_level       pval     pval_label     subgroup    var
#> 1 0.9500735 1.018733       0.95 0.35898997 p-value (Wald) All Patients    ALL
#> 2 0.9908739 1.021481       0.95 0.43619694 p-value (Wald) All Patients    ALL
#> 3 0.9553432 1.079649       0.95 0.61993701 p-value (Wald)          LOW BMRKR2
#> 4 0.9801032 1.039885       0.95 0.52910110 p-value (Wald)          LOW BMRKR2
#> 5 0.9237871 1.039803       0.95 0.50496922 p-value (Wald)       MEDIUM BMRKR2
#> 6 0.9940130 1.052434       0.95 0.12167059 p-value (Wald)       MEDIUM BMRKR2
#> 7 0.8822742 1.009599       0.95 0.09253618 p-value (Wald)         HIGH BMRKR2
#> 8 0.9776074 1.030135       0.95 0.79197278 p-value (Wald)         HIGH BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
# Here we group the levels of `BMRKR2` manually.
df_grouped <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
df_grouped
#>   biomarker              biomarker_label n_tot n_tot_events   median        hr
#> 1    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 2       AGE                          Age   400          282 680.9598 1.0060610
#> 3    BMRKR1 Continuous Level Biomarker 1   135           95 647.7467 1.0155961
#> 4       AGE                          Age   135           95 647.7467 1.0095516
#> 5    BMRKR1 Continuous Level Biomarker 1   270          188 647.7467 0.9981993
#> 6       AGE                          Age   270          188 647.7467 1.0131282
#> 7    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 8       AGE                          Age   400          282 680.9598 1.0060610
#>         lcl      ucl conf_level      pval     pval_label        subgroup    var
#> 1 0.9500735 1.018733       0.95 0.3589900 p-value (Wald)    All Patients    ALL
#> 2 0.9908739 1.021481       0.95 0.4361969 p-value (Wald)    All Patients    ALL
#> 3 0.9553432 1.079649       0.95 0.6199370 p-value (Wald)             low BMRKR2
#> 4 0.9801032 1.039885       0.95 0.5291011 p-value (Wald)             low BMRKR2
#> 5 0.9572619 1.040887       0.95 0.9327722 p-value (Wald)      low/medium BMRKR2
#> 6 0.9927059 1.033971       0.95 0.2093518 p-value (Wald)      low/medium BMRKR2
#> 7 0.9500735 1.018733       0.95 0.3589900 p-value (Wald) low/medium/high BMRKR2
#> 8 0.9908739 1.021481       0.95 0.4361969 p-value (Wald) low/medium/high BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
## Table with default columns.
# Uses `df` as returned by the first `extract_survival_biomarkers()` call above.
tabulate_survival_biomarkers(df)
#>                                   Total n   Total Events   Median   Hazard Ratio   95% Wald CI    p-value (Wald)
#> ————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Age                                                                                                             
#>   All Patients                      400         282        681.0        1.01       (0.99, 1.02)       0.4362    
#>   Categorical Level Biomarker 2                                                                                 
#>     LOW                             135          95        647.7        1.01       (0.98, 1.04)       0.5291    
#>     MEDIUM                          135          93        646.4        1.02       (0.99, 1.05)       0.1217    
#>     HIGH                            130          94        761.2        1.00       (0.98, 1.03)       0.7920    
#> Continuous Level Biomarker 1                                                                                    
#>   All Patients                      400         282        681.0        0.98       (0.95, 1.02)       0.3590    
#>   Categorical Level Biomarker 2                                                                                 
#>     LOW                             135          95        647.7        1.02       (0.96, 1.08)       0.6199    
#>     MEDIUM                          135          93        646.4        0.98       (0.92, 1.04)       0.5050    
#>     HIGH                            130          94        761.2        0.94       (0.88, 1.01)       0.0925    
## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_survival_biomarkers(
  df = df,
  vars = c("n_tot_events", "ci", "n_tot", "median", "hr"),
  time_unit = as.character(adtte_f$AVALU[1])
)
## Finally produce the forest plot.
if (FALSE) {
g_forest(tab, xlim = c(0.8, 1.2))
}