Helper Functions for Tabulating Survival Duration by Subgroup — h_survival_duration

Helper functions that tabulate statistics such as median survival time and hazard ratio for population subgroups in a data frame.

Usage

h_survtime_df(tte, is_event, arm)

h_survtime_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  label_all = "All Patients"
)

h_coxph_df(tte, is_event, arm, strata_data = NULL, control = control_coxph())

h_coxph_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  control = control_coxph(),
  label_all = "All Patients"
)

Arguments

tte

(numeric)
contains time-to-event duration values.

is_event

(logical)
TRUE if event, FALSE if time to event is censored.

arm

(factor)
the treatment group variable.

variables

(named list of string)
list of additional analysis variables.

data

(data.frame)
the dataset containing the variables to summarize.

groups_lists

(named list of list)
optionally contains, for each subgroups variable, a list that defines new group levels: each element name is a new group label, and each element is a character vector of the original levels that belong to that group.
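
The structure of one groups_lists entry can be illustrated in base R. This is a minimal sketch on made-up values (the bmrkr vector and the counting step are illustrative assumptions, not the package's internal code); it shows that the names become the new group labels and that groups may overlap, since one original level can belong to several groups.

```r
# Hypothetical biomarker column with three levels (made-up values).
bmrkr <- factor(
  c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH", "MEDIUM", "LOW"),
  levels = c("LOW", "MEDIUM", "HIGH")
)

# Names are the new group labels; values are the original levels they pool.
groups <- list(
  "low" = "LOW",
  "low/medium" = c("LOW", "MEDIUM"),
  "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
)

# Count the rows that fall into each (possibly overlapping) group.
group_sizes <- vapply(groups, function(lvls) sum(bmrkr %in% lvls), integer(1))
group_sizes
```

Because the groups overlap, their sizes do not add up to the number of rows; the last group here simply reproduces the full population.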

label_all

(string)
label for the total population analysis.

strata_data

(factor, data.frame or NULL)
required if stratified analysis is performed.
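
As a rough illustration of what a stratified analysis means here, the sketch below fits an unstratified and a stratified Cox model directly with the survival package. It assumes the veteran dataset shipped with survival, with trt standing in for arm and celltype for the stratification factor; it is not the helper's own implementation.

```r
library(survival)

# `veteran` ships with the survival package; `trt` stands in for `arm`
# and `celltype` for the stratification factor (illustrative choice only).
dat <- veteran
dat$arm <- factor(dat$trt, levels = c(1, 2), labels = c("standard", "test"))

# Unstratified model: one common baseline hazard.
unstratified <- coxph(Surv(time, status) ~ arm, data = dat)

# Stratified model: a separate baseline hazard per cell type.
stratified <- coxph(Surv(time, status) ~ arm + strata(celltype), data = dat)

# Treatment hazard ratios from both fits.
hr <- c(
  unstratified = unname(exp(coef(unstratified))),
  stratified = unname(exp(coef(stratified)))
)
hr
```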

control

(list)
parameters for comparison details, specified by using the helper function control_coxph(). Some possible parameter options are:

pval_method: (string)
p-value method for testing hazard ratio = 1. The default is "log-rank", which comes from survival::survdiff(); it can also be set to "wald" or "likelihood", which come from survival::coxph().
ties: (string)
method for handling ties. The default is "efron"; it can also be set to "breslow" or "exact". See survival::coxph() for details.
conf_level: (proportion)
confidence level of the interval for HR.
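
To make the three options more concrete, the sketch below computes the corresponding quantities directly with the survival package, following the description of pval_method, ties and conf_level above. It assumes the veteran dataset shipped with survival (trt stands in for arm) and is only an approximation of what the helpers report, not their actual code.

```r
library(survival)

# `veteran` ships with the survival package; `trt` stands in for `arm`.
dat <- veteran
dat$arm <- factor(dat$trt, levels = c(1, 2), labels = c("standard", "test"))

# pval_method = "log-rank": test from survival::survdiff().
lr <- survdiff(Surv(time, status) ~ arm, data = dat)
p_logrank <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)

# ties selects the tie-handling method of the Cox fit;
# conf_level sets the width of the interval for the hazard ratio.
conf_level <- 0.95
fit <- coxph(Surv(time, status) ~ arm, data = dat, ties = "efron")
s <- summary(fit, conf.int = conf_level)

# pval_method = "wald" or "likelihood": tests reported for the Cox fit.
p_wald <- s$waldtest[["pvalue"]]
p_likelihood <- s$logtest[["pvalue"]]

# Columns 1, 3 and 4 of conf.int hold exp(coef) and its lower/upper limits.
hr_ci <- unname(s$conf.int[1, c(1, 3, 4)])
names(hr_ci) <- c("hr", "lcl", "ucl")
hr_ci
```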

Details

The main purpose of these functions is to prepare data for use in a layout-creating function.

Functions

h_survtime_df(): helper to prepare a data frame of median survival times by arm.
h_survtime_subgroups_df(): summarizes median survival times by arm and across subgroups in a data frame. variables is a named list giving the names of variables found in data; it requires the elements tte, is_event and arm, and optionally subgroups. groups_lists optionally specifies groupings for the subgroups variables.
h_coxph_df(): helper to prepare a data frame with estimates of treatment hazard ratio.
h_coxph_subgroups_df(): summarizes estimates of the treatment hazard ratio across subgroups in a data frame. variables is a named list giving the names of variables found in data; it requires the elements tte, is_event and arm, and optionally subgroups and strat. groups_lists optionally specifies groupings for the subgroups variables.
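
The "by arm and across subgroups" tabulation described above can be sketched with the survival package alone. The example assumes the veteran dataset shipped with survival (trt stands in for arm, celltype for the subgroup variable), and median_by_arm() is a hypothetical helper written for this illustration; the real helpers may differ in columns and details.

```r
library(survival)

# `veteran` ships with the survival package; `trt` stands in for `arm`.
dat <- veteran
dat$arm <- factor(dat$trt, levels = c(1, 2), labels = c("standard", "test"))

# Hypothetical helper: one row per arm with patients, events and the
# Kaplan-Meier median. Assumes every arm level is present in `d`.
median_by_arm <- function(d) {
  fit <- survfit(Surv(time, status) ~ arm, data = d)
  data.frame(
    arm = levels(d$arm),
    n = as.vector(table(d$arm)),
    n_events = as.vector(tapply(d$status, d$arm, sum)),
    median = as.vector(quantile(fit, probs = 0.5)$quantile)
  )
}

# Overall population first, then the same summary within each subgroup level.
overall <- cbind(median_by_arm(dat), subgroup = "All Patients", var = "ALL")
by_celltype <- do.call(rbind, lapply(levels(dat$celltype), function(lv) {
  cbind(median_by_arm(dat[dat$celltype == lv, ]), subgroup = lv, var = "celltype")
}))

result <- rbind(overall, by_celltype)
result
```

Stacking the overall rows on top of the subgroup rows gives one long data frame, which is the shape a layout-creating function can then split by var and subgroup.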

Examples

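# Load the packages used in the examples.
# The helper functions documented here come from tern.
library(tern)
library(scda)
library(dplyr)
library(forcats)
library(rtables)

# Testing dataset.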

adtte <- synthetic_cdisc_data("latest")$adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)

adtte_f <- adtte %>%
  filter(
    PARAMCD == "OS",
    ARM %in% c("B: Placebo", "A: Drug X"),
    SEX %in% c("M", "F")
  ) %>%
  mutate(
    # Reorder levels of ARM to display reference arm before treatment arm.
    ARM = droplevels(fct_relevel(ARM, "B: Placebo")),
    SEX = droplevels(SEX),
    is_event = CNSR == 0
  )
labels <- c(
  "ARM" = adtte_labels[["ARM"]],
  "SEX" = adtte_labels[["SEX"]],
  "is_event" = "Event Flag"
)
formatters::var_labels(adtte_f)[names(labels)] <- labels

# Extract median survival time for one group.
h_survtime_df(
  tte = adtte_f$AVAL,
  is_event = adtte_f$is_event,
  arm = adtte_f$ARM
)
#>          arm   n n_events   median
#> 1 B: Placebo 134       87  837.428
#> 2  A: Drug X 134       79 1260.491

# Extract median survival time for multiple groups.
h_survtime_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)
#>           arm   n n_events    median     subgroup    var
#> 1  B: Placebo 134       87  837.4280 All Patients    ALL
#> 2   A: Drug X 134       79 1260.4905 All Patients    ALL
#> 3  B: Placebo  82       50  850.9208            F    SEX
#> 4   A: Drug X  79       45 1274.8047            F    SEX
#> 5  B: Placebo  52       37  527.6659            M    SEX
#> 6   A: Drug X  55       34  849.2976            M    SEX
#> 7  B: Placebo  45       30  751.4314          LOW BMRKR2
#> 8   A: Drug X  50       31 1160.6458          LOW BMRKR2
#> 9  B: Placebo  56       36  722.7926       MEDIUM BMRKR2
#> 10  A: Drug X  37       19 1269.4039       MEDIUM BMRKR2
#> 11 B: Placebo  33       21  848.2393         HIGH BMRKR2
#> 12  A: Drug X  47       29 1070.8022         HIGH BMRKR2
#>                        var_label row_type
#> 1                   All Patients  content
#> 2                   All Patients  content
#> 3                            Sex analysis
#> 4                            Sex analysis
#> 5                            Sex analysis
#> 6                            Sex analysis
#> 7  Categorical Level Biomarker 2 analysis
#> 8  Categorical Level Biomarker 2 analysis
#> 9  Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis

# Define groupings for BMRKR2 levels.
h_survtime_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
#>           arm   n n_events    median        subgroup    var
#> 1  B: Placebo 134       87  837.4280    All Patients    ALL
#> 2   A: Drug X 134       79 1260.4905    All Patients    ALL
#> 3  B: Placebo  82       50  850.9208               F    SEX
#> 4   A: Drug X  79       45 1274.8047               F    SEX
#> 5  B: Placebo  52       37  527.6659               M    SEX
#> 6   A: Drug X  55       34  849.2976               M    SEX
#> 7  B: Placebo  45       30  751.4314             low BMRKR2
#> 8   A: Drug X  50       31 1160.6458             low BMRKR2
#> 9  B: Placebo 101       66  741.8707      low/medium BMRKR2
#> 10  A: Drug X  87       50 1269.4039      low/medium BMRKR2
#> 11 B: Placebo 134       87  837.4280 low/medium/high BMRKR2
#> 12  A: Drug X 134       79 1260.4905 low/medium/high BMRKR2
#>                        var_label row_type
#> 1                   All Patients  content
#> 2                   All Patients  content
#> 3                            Sex analysis
#> 4                            Sex analysis
#> 5                            Sex analysis
#> 6                            Sex analysis
#> 7  Categorical Level Biomarker 2 analysis
#> 8  Categorical Level Biomarker 2 analysis
#> 9  Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis

# Extract hazard ratio for one group.
h_coxph_df(adtte_f$AVAL, adtte_f$is_event, adtte_f$ARM)
#>   arm n_tot n_tot_events        hr       lcl       ucl conf_level       pval
#> 1       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#>           pval_label
#> 1 p-value (log-rank)

# Extract hazard ratio for one group with stratification factor.
h_coxph_df(adtte_f$AVAL, adtte_f$is_event, adtte_f$ARM, strata_data = adtte_f$STRATA1)
#>   arm n_tot n_tot_events        hr       lcl      ucl conf_level       pval
#> 1       268          166 0.7343822 0.5376802 1.003045       0.95 0.05142933
#>           pval_label
#> 1 p-value (log-rank)

# Extract hazard ratio for multiple groups.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)
#>   arm n_tot n_tot_events        hr       lcl       ucl conf_level       pval
#> 1       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#> 2       161           95 0.6979693 0.4647812 1.0481517       0.95 0.08148174
#> 3       107           71 0.7836167 0.4873444 1.2600023       0.95 0.31318347
#> 4        95           61 0.7050730 0.4243655 1.1714617       0.95 0.17526198
#> 5        93           55 0.5728069 0.3244196 1.0113683       0.95 0.05174942
#> 6        80           50 0.9769002 0.5552002 1.7189005       0.95 0.93538927
#>           pval_label     subgroup    var                     var_label row_type
#> 1 p-value (log-rank) All Patients    ALL                  All Patients  content
#> 2 p-value (log-rank)            F    SEX                           Sex analysis
#> 3 p-value (log-rank)            M    SEX                           Sex analysis
#> 4 p-value (log-rank)          LOW BMRKR2 Categorical Level Biomarker 2 analysis
#> 5 p-value (log-rank)       MEDIUM BMRKR2 Categorical Level Biomarker 2 analysis
#> 6 p-value (log-rank)         HIGH BMRKR2 Categorical Level Biomarker 2 analysis

# Define groupings of BMRKR2 levels.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
#>   arm n_tot n_tot_events        hr       lcl       ucl conf_level       pval
#> 1       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#> 2       161           95 0.6979693 0.4647812 1.0481517       0.95 0.08148174
#> 3       107           71 0.7836167 0.4873444 1.2600023       0.95 0.31318347
#> 4        95           61 0.7050730 0.4243655 1.1714617       0.95 0.17526198
#> 5       188          116 0.6453648 0.4447544 0.9364622       0.95 0.02019120
#> 6       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#>           pval_label        subgroup    var                     var_label
#> 1 p-value (log-rank)    All Patients    ALL                  All Patients
#> 2 p-value (log-rank)               F    SEX                           Sex
#> 3 p-value (log-rank)               M    SEX                           Sex
#> 4 p-value (log-rank)             low BMRKR2 Categorical Level Biomarker 2
#> 5 p-value (log-rank)      low/medium BMRKR2 Categorical Level Biomarker 2
#> 6 p-value (log-rank) low/medium/high BMRKR2 Categorical Level Biomarker 2
#>   row_type
#> 1  content
#> 2 analysis
#> 3 analysis
#> 4 analysis
#> 5 analysis
#> 6 analysis

# Extract hazard ratio for multiple groups with stratification factors.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2"),
    strat = c("STRATA1", "STRATA2")
  ),
  data = adtte_f
)
#>   arm n_tot n_tot_events        hr       lcl      ucl conf_level       pval
#> 1       268          166 0.7412854 0.5390265 1.019438       0.95 0.06468801
#> 2       161           95 0.7328179 0.4794740 1.120023       0.95 0.14954837
#> 3       107           71 0.7277226 0.4270452 1.240103       0.95 0.24075763
#> 4        95           61 0.6717712 0.3834088 1.177011       0.95 0.16224377
#> 5        93           55 0.6161793 0.3394411 1.118535       0.95 0.10874603
#> 6        80           50 1.2479396 0.6657425 2.339273       0.95 0.48884004
#>           pval_label     subgroup    var                     var_label row_type
#> 1 p-value (log-rank) All Patients    ALL                  All Patients  content
#> 2 p-value (log-rank)            F    SEX                           Sex analysis
#> 3 p-value (log-rank)            M    SEX                           Sex analysis
#> 4 p-value (log-rank)          LOW BMRKR2 Categorical Level Biomarker 2 analysis
#> 5 p-value (log-rank)       MEDIUM BMRKR2 Categorical Level Biomarker 2 analysis
#> 6 p-value (log-rank)         HIGH BMRKR2 Categorical Level Biomarker 2 analysis