Tabulate Biomarker Effects on Survival by Subgroup

Tabulate the estimated effects of multiple continuous biomarker variables across population subgroups.

Usage

extract_survival_biomarkers(
  variables,
  data,
  groups_lists = list(),
  control = control_coxreg(),
  label_all = "All Patients"
)

tabulate_survival_biomarkers(
  df,
  vars = c("n_tot", "n_tot_events", "median", "hr", "ci", "pval"),
  time_unit = NULL
)

Arguments

variables: (named list of string)
list of additional analysis variables.
data: (data.frame)
the dataset containing the variables to summarize.
groups_lists: (named list of list)
optionally contains, for each subgroups variable, a list that defines new group levels: the name of each element is the new level, and its character vector lists the original levels that belong to it.
control: (list)
a list of parameters as returned by the helper function control_coxreg().
label_all: (string)
label for the total population analysis.
df: (data.frame)
containing all analysis variables, as returned by extract_survival_biomarkers().
vars: (character)
the names of the statistics to be reported, chosen from n_tot_events (total number of events per group), n_tot (total number of observations per group), median (median survival time), hr (hazard ratio), ci (confidence interval of hazard ratio) and pval (p-value of the effect). Note that at least one of n_tot and n_tot_events is required, as are both hr and ci.
time_unit: (string)
label with unit of median survival time. Default NULL skips displaying unit.
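The regrouping described for groups_lists can be illustrated with a minimal base R sketch. It only shows how named elements map original levels to new, possibly overlapping groups; the toy vector `bmrkr2` and the counting step are assumptions made for illustration and are not the package's internal implementation.

```r
# Hypothetical toy subgroup variable (not the ADTTE data from the Examples).
bmrkr2 <- c("LOW", "MEDIUM", "HIGH", "LOW", "MEDIUM", "HIGH", "LOW")

# Same shape as one entry of `groups_lists`: names are the new levels,
# character vectors hold the original levels belonging to each new level.
groups <- list(
  "low" = "LOW",
  "low/medium" = c("LOW", "MEDIUM"),
  "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
)

# Count the observations that fall into each new level. Groups may overlap,
# so the counts do not need to sum to the total number of observations.
n_per_group <- vapply(groups, function(lvls) sum(bmrkr2 %in% lvls), integer(1))
n_per_group
```

Because the new levels may share original levels, an observation can contribute to several subgroup rows, which matches the nested "low", "low/medium" and "low/medium/high" rows in the grouped example output.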

Details

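These functions create a table starting from a data frame which contains the required statistics: extract_survival_biomarkers() prepares the data frame and tabulate_survival_biomarkers() turns it into a table. The tables are then typically used as input for forest plots.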
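The rule on which statistics must be present (see the vars argument) can be written down as a small check. The helper `check_vars()` below is hypothetical and not part of the package; it is only a sketch of the stated constraint, assuming the six statistic names listed under Arguments are the only valid ones.

```r
# Hypothetical helper, not a package function: encodes the documented rule
# that `vars` needs `n_tot` or `n_tot_events`, plus both `hr` and `ci`.
check_vars <- function(vars) {
  allowed <- c("n_tot", "n_tot_events", "median", "hr", "ci", "pval")
  stopifnot(
    is.character(vars),
    all(vars %in% allowed),
    any(c("n_tot", "n_tot_events") %in% vars),
    all(c("hr", "ci") %in% vars)
  )
  invisible(vars)
}

# A reordered selection without "pval" satisfies the rule.
check_vars(c("n_tot_events", "ci", "n_tot", "median", "hr"))

# A selection without counts and without "ci" does not.
ok <- tryCatch(
  {
    check_vars(c("median", "hr"))
    TRUE
  },
  error = function(e) FALSE
)
ok
```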

Functions

extract_survival_biomarkers(): prepares, in a single data frame, the number of events and patients, the median survival times, and the hazard ratio estimates, confidence intervals and p-values for multiple biomarkers across population subgroups. variables is a named list whose elements are names of variables found in data; it requires the elements tte, is_event and biomarkers (vector of continuous biomarker variables), and optionally accepts subgroups and strat. groups_lists optionally specifies groupings for the subgroups variables.
tabulate_survival_biomarkers(): creates the table from the data frame returned by extract_survival_biomarkers().
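To make the hr, lcl, ucl and pval columns more concrete, here is a hedged sketch of the kind of model each row summarizes, assuming a standard Cox proportional hazards fit with the survival package. It uses the `lung` dataset shipped with survival rather than the ADTTE data from the Examples, and the choice of `age` as the continuous biomarker, `sex` as covariate and `ph.ecog` as stratification variable is purely illustrative.

```r
library(survival)

# Toy data from the survival package; drop the row with missing strata.
lung_f <- lung[!is.na(lung$ph.ecog), ]

# One continuous "biomarker" (age), one covariate (sex) and one
# stratification variable (ph.ecog). In `lung`, status == 2 means death.
fit <- coxph(
  Surv(time, status == 2) ~ age + sex + strata(ph.ecog),
  data = lung_f
)

est <- summary(fit, conf.int = 0.95)

# Hazard ratio per unit increase of the biomarker, its Wald confidence
# interval, and the Wald p-value for the biomarker coefficient.
hr <- est$conf.int["age", "exp(coef)"]
lcl <- est$conf.int["age", "lower .95"]
ucl <- est$conf.int["age", "upper .95"]
pval <- est$coefficients["age", "Pr(>|z|)"]

c(hr = hr, lcl = lcl, ucl = ucl, pval = pval)
```

Repeating such a fit for each biomarker, once on all patients and once per subgroup level, gives one row per biomarker and subgroup, which is the layout of the data frame shown in the Examples.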

Note

In contrast to tabulate_survival_subgroups(), this tabulation function does not start from an input layout lyt, because internally the table is created by combining multiple subtables.

Examples

# Testing dataset.
library(tern)
library(scda)
library(dplyr)
library(forcats)
library(rtables)

adtte <- synthetic_cdisc_data("latest")$adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)

adtte_f <- adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c("AVALU" = adtte_labels[["AVALU"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels
# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in multiple regression models containing one covariate `SEX`,
# as well as one stratification variable `STRATA1`. The subgroups
# are defined by the levels of `BMRKR2`.
df <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adtte_f
)
df
#>   biomarker              biomarker_label n_tot n_tot_events   median        hr
#> 1    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 2       AGE                          Age   400          282 680.9598 1.0060610
#> 3    BMRKR1 Continuous Level Biomarker 1   135           95 647.7467 1.0155961
#> 4       AGE                          Age   135           95 647.7467 1.0095516
#> 5    BMRKR1 Continuous Level Biomarker 1   135           93 646.4069 0.9800800
#> 6       AGE                          Age   135           93 646.4069 1.0228066
#> 7    BMRKR1 Continuous Level Biomarker 1   130           94 761.2290 0.9437920
#> 8       AGE                          Age   130           94 761.2290 1.0035277
#>         lcl      ucl conf_level       pval     pval_label     subgroup    var
#> 1 0.9500735 1.018733       0.95 0.35898997 p-value (Wald) All Patients    ALL
#> 2 0.9908739 1.021481       0.95 0.43619694 p-value (Wald) All Patients    ALL
#> 3 0.9553432 1.079649       0.95 0.61993701 p-value (Wald)          LOW BMRKR2
#> 4 0.9801032 1.039885       0.95 0.52910110 p-value (Wald)          LOW BMRKR2
#> 5 0.9237871 1.039803       0.95 0.50496922 p-value (Wald)       MEDIUM BMRKR2
#> 6 0.9940130 1.052434       0.95 0.12167059 p-value (Wald)       MEDIUM BMRKR2
#> 7 0.8822742 1.009599       0.95 0.09253618 p-value (Wald)         HIGH BMRKR2
#> 8 0.9776074 1.030135       0.95 0.79197278 p-value (Wald)         HIGH BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis

# Here we group the levels of `BMRKR2` manually.
df_grouped <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
df_grouped
#>   biomarker              biomarker_label n_tot n_tot_events   median        hr
#> 1    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 2       AGE                          Age   400          282 680.9598 1.0060610
#> 3    BMRKR1 Continuous Level Biomarker 1   135           95 647.7467 1.0155961
#> 4       AGE                          Age   135           95 647.7467 1.0095516
#> 5    BMRKR1 Continuous Level Biomarker 1   270          188 647.7467 0.9981993
#> 6       AGE                          Age   270          188 647.7467 1.0131282
#> 7    BMRKR1 Continuous Level Biomarker 1   400          282 680.9598 0.9838045
#> 8       AGE                          Age   400          282 680.9598 1.0060610
#>         lcl      ucl conf_level      pval     pval_label        subgroup    var
#> 1 0.9500735 1.018733       0.95 0.3589900 p-value (Wald)    All Patients    ALL
#> 2 0.9908739 1.021481       0.95 0.4361969 p-value (Wald)    All Patients    ALL
#> 3 0.9553432 1.079649       0.95 0.6199370 p-value (Wald)             low BMRKR2
#> 4 0.9801032 1.039885       0.95 0.5291011 p-value (Wald)             low BMRKR2
#> 5 0.9572619 1.040887       0.95 0.9327722 p-value (Wald)      low/medium BMRKR2
#> 6 0.9927059 1.033971       0.95 0.2093518 p-value (Wald)      low/medium BMRKR2
#> 7 0.9500735 1.018733       0.95 0.3589900 p-value (Wald) low/medium/high BMRKR2
#> 8 0.9908739 1.021481       0.95 0.4361969 p-value (Wald) low/medium/high BMRKR2
#>                       var_label row_type
#> 1                  All Patients  content
#> 2                  All Patients  content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis

## Table with default columns.
# Uses `df` as returned by extract_survival_biomarkers() above.
tabulate_survival_biomarkers(df)
#>                                   Total n   Total Events   Median   Hazard Ratio   95% Wald CI    p-value (Wald)
#> ————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Age                                                                                                             
#>   All Patients                      400         282        681.0        1.01       (0.99, 1.02)       0.4362    
#>   Categorical Level Biomarker 2                                                                                 
#>     LOW                             135          95        647.7        1.01       (0.98, 1.04)       0.5291    
#>     MEDIUM                          135          93        646.4        1.02       (0.99, 1.05)       0.1217    
#>     HIGH                            130          94        761.2        1.00       (0.98, 1.03)       0.7920    
#> Continuous Level Biomarker 1                                                                                    
#>   All Patients                      400         282        681.0        0.98       (0.95, 1.02)       0.3590    
#>   Categorical Level Biomarker 2                                                                                 
#>     LOW                             135          95        647.7        1.02       (0.96, 1.08)       0.6199    
#>     MEDIUM                          135          93        646.4        0.98       (0.92, 1.04)       0.5050    
#>     HIGH                            130          94        761.2        0.94       (0.88, 1.01)       0.0925    

## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_survival_biomarkers(
  df = df,
  vars = c("n_tot_events", "ci", "n_tot", "median", "hr"),
  time_unit = as.character(adtte_f$AVALU[1])
)

## Finally produce the forest plot.
if (FALSE) {
  g_forest(tab, xlim = c(0.8, 1.2))
}