Tabulate Survival Duration by Subgroup
Source: R/survival_duration_subgroups.R
survival_duration_subgroups.Rd
Usage
extract_survival_subgroups(
  variables,
  data,
  groups_lists = list(),
  control = control_coxph(),
  label_all = "All Patients"
)

a_survival_subgroups(
  .formats = list(
    n = "xx", n_events = "xx", n_tot_events = "xx", median = "xx.x", n_tot = "xx",
    hr = list(format_extreme_values(2L)),
    ci = list(format_extreme_values_ci(2L)),
    pval = "x.xxxx | (<0.0001)"
  )
)

tabulate_survival_subgroups(
  lyt,
  df,
  vars = c("n_tot_events", "n_events", "median", "hr", "ci"),
  time_unit = NULL
)

Arguments
- variables
  (named list of string)
  list of additional analysis variables.
- data
  (data frame)
  the dataset containing the variables to summarize.
- groups_lists
  (named list of list)
  optionally contains, for each subgroups variable, a list that specifies the new group levels. The names of that list are the new levels, and each element is a character vector of the original levels that belong to the new level.
- control
  (list)
  parameters for comparison details, specified by using the helper function control_coxph(). Some possible parameter options are:
  - pval_method: (string) p-value method for testing hazard ratio = 1. Default method is "log-rank", which comes from survival::survdiff(); can also be set to "wald" or "likelihood", which come from survival::coxph().
  - ties: (string) method for tie handling. Default is "efron"; can also be set to "breslow" or "exact". See survival::coxph() for details.
  - conf_level: (proportion) confidence level of the interval for HR.
- label_all
  (string)
  label for the total population analysis.
- .formats
  (named character or list)
  formats for the statistics.
- lyt
  (layout)
  input layout where analyses will be added to.
- df
  (list)
  list of data frames containing all analysis variables. The list should be created using extract_survival_subgroups().
- vars
  (character)
  the names of the statistics to be reported, among n_tot_events (total number of events per group), n_events (number of events per group), n_tot (total number of observations per group), n (number of observations per group), median (median survival time), hr (hazard ratio), ci (confidence interval of hazard ratio) and pval (p-value of the effect). Note that one of the statistics n_tot and n_tot_events, as well as both hr and ci, are required.
- time_unit
  (string)
  label with unit of median survival time. Default NULL skips displaying the unit.
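The way groups_lists maps new group levels to original levels can be illustrated with a small base R sketch. This is not the package's internal implementation; the vector `x` and the list `groups` are made-up inputs, and only the shape of `groups` mirrors one element of groups_lists.

```r
# Hypothetical subgroup variable with three original levels.
x <- factor(
  c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH"),
  levels = c("LOW", "MEDIUM", "HIGH")
)

# Same shape as one element of `groups_lists`: names are the new levels,
# values are the original levels that belong to each new level.
groups <- list(
  "low" = "LOW",
  "low/medium" = c("LOW", "MEDIUM"),
  "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
)

# For each new level, count the observations whose original level is included.
n_per_group <- vapply(groups, function(lv) sum(x %in% lv), integer(1))
n_per_group
```

Because the new levels may overlap, one observation can contribute to several of them. This is consistent with the grouped output in the Examples, where the "low/medium" row for B: Placebo has n = 101, the sum of LOW (45) and MEDIUM (56).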
Details
These functions create a layout starting from a data frame which contains the required statistics. The resulting tables are typically used as part of a forest plot.
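The requirement on vars stated under Arguments (at least one of n_tot and n_tot_events, plus both hr and ci) can be written as a small check. The helper `check_subgroup_vars()` below is hypothetical and not part of the package; it only restates the documented rule in base R.

```r
# Hypothetical helper, not part of the package API: restates the documented
# rule for the `vars` argument of tabulate_survival_subgroups().
check_subgroup_vars <- function(vars) {
  allowed <- c("n_tot_events", "n_events", "n_tot", "n", "median", "hr", "ci", "pval")
  has_total <- any(c("n_tot", "n_tot_events") %in% vars)  # one total is required
  has_effect <- all(c("hr", "ci") %in% vars)              # both hr and ci are required
  all(vars %in% allowed) && has_total && has_effect
}

# The default `vars` satisfies the rule.
ok_default <- check_subgroup_vars(c("n_tot_events", "n_events", "median", "hr", "ci"))

# Missing total column and missing "ci": the rule is violated.
ok_partial <- check_subgroup_vars(c("n_events", "median", "hr"))
```

Here `ok_default` is TRUE and `ok_partial` is FALSE.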
Functions
- extract_survival_subgroups(): prepares estimates of median survival times and treatment hazard ratios for population subgroups in data frames. Simple wrapper for h_survtime_subgroups_df() and h_coxph_subgroups_df(). The result is a list of two data frames: survtime and hr. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event, arm and optionally subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
- a_survival_subgroups(): Formatted Analysis function used to format the results of extract_survival_subgroups(). Returns a list of Formatted Analysis functions with one element per statistic.
- tabulate_survival_subgroups(): table-creating function.
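To make the statistics in survtime and hr more concrete, the sketch below computes the same kinds of quantities for a single population directly with the survival package. It is not how the helper functions are implemented; it assumes the survival package is installed, uses its lung dataset rather than the data on this page, and treats sex as a stand-in for the arm variable.

```r
library(survival)

# Example data shipped with the survival package (not the data used on this page).
dat <- survival::lung
dat$is_event <- dat$status == 2                       # status: 1 = censored, 2 = dead
dat$arm <- factor(dat$sex, labels = c("ref", "trt"))  # sex used as a stand-in for arm

# Median survival time per arm (compare the `survtime` data frame).
km <- survfit(Surv(time, is_event) ~ arm, data = dat)
medians <- summary(km)$table[, "median"]

# Hazard ratio and Wald confidence interval (compare the `hr` data frame).
fit <- coxph(Surv(time, is_event) ~ arm, data = dat, ties = "efron")
hr <- unname(exp(coef(fit)))
ci <- unname(exp(confint(fit, level = 0.95))[1, ])

# Log-rank p-value from survdiff(), one degree of freedom for two arms.
lr <- survdiff(Surv(time, is_event) ~ arm, data = dat)
pval <- 1 - pchisq(lr$chisq, df = 1)

list(median = medians, hr = hr, lcl = ci[1], ucl = ci[2], pval = pval)
```

Repeating such a computation within each level of each subgroup variable gives one row per subgroup, which is the shape of the data frames shown in the Examples.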
Examples
# Testing dataset.
library(tern) # provides the functions documented on this page
library(scda)
library(dplyr)
library(forcats)
library(rtables)
adtte <- synthetic_cdisc_data("latest")$adtte
# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)
adtte_f <- adtte %>%
  filter(
    PARAMCD == "OS",
    ARM %in% c("B: Placebo", "A: Drug X"),
    SEX %in% c("M", "F")
  ) %>%
  mutate(
    # Reorder levels of ARM to display reference arm before treatment arm.
    ARM = droplevels(fct_relevel(ARM, "B: Placebo")),
    SEX = droplevels(SEX),
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c(
  "ARM" = adtte_labels[["ARM"]],
  "SEX" = adtte_labels[["SEX"]],
  "AVALU" = adtte_labels[["AVALU"]],
  "is_event" = "Event Flag"
)
formatters::var_labels(adtte_f)[names(labels)] <- labels
df <- extract_survival_subgroups(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM", subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)
df
#> $survtime
#> arm n n_events median subgroup var
#> 1 B: Placebo 134 87 837.4280 All Patients ALL
#> 2 A: Drug X 134 79 1260.4905 All Patients ALL
#> 3 B: Placebo 82 50 850.9208 F SEX
#> 4 A: Drug X 79 45 1274.8047 F SEX
#> 5 B: Placebo 52 37 527.6659 M SEX
#> 6 A: Drug X 55 34 849.2976 M SEX
#> 7 B: Placebo 45 30 751.4314 LOW BMRKR2
#> 8 A: Drug X 50 31 1160.6458 LOW BMRKR2
#> 9 B: Placebo 56 36 722.7926 MEDIUM BMRKR2
#> 10 A: Drug X 37 19 1269.4039 MEDIUM BMRKR2
#> 11 B: Placebo 33 21 848.2393 HIGH BMRKR2
#> 12 A: Drug X 47 29 1070.8022 HIGH BMRKR2
#> var_label row_type
#> 1 All Patients content
#> 2 All Patients content
#> 3 Sex analysis
#> 4 Sex analysis
#> 5 Sex analysis
#> 6 Sex analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
#> 9 Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis
#>
#> $hr
#> arm n_tot n_tot_events hr lcl ucl conf_level pval
#> 1 268 166 0.7173651 0.5275231 0.9755262 0.95 0.03340293
#> 2 161 95 0.6979693 0.4647812 1.0481517 0.95 0.08148174
#> 3 107 71 0.7836167 0.4873444 1.2600023 0.95 0.31318347
#> 4 95 61 0.7050730 0.4243655 1.1714617 0.95 0.17526198
#> 5 93 55 0.5728069 0.3244196 1.0113683 0.95 0.05174942
#> 6 80 50 0.9769002 0.5552002 1.7189005 0.95 0.93538927
#> pval_label subgroup var var_label row_type
#> 1 p-value (log-rank) All Patients ALL All Patients content
#> 2 p-value (log-rank) F SEX Sex analysis
#> 3 p-value (log-rank) M SEX Sex analysis
#> 4 p-value (log-rank) LOW BMRKR2 Categorical Level Biomarker 2 analysis
#> 5 p-value (log-rank) MEDIUM BMRKR2 Categorical Level Biomarker 2 analysis
#> 6 p-value (log-rank) HIGH BMRKR2 Categorical Level Biomarker 2 analysis
#>
df_grouped <- extract_survival_subgroups(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM", subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
df_grouped
#> $survtime
#> arm n n_events median subgroup var
#> 1 B: Placebo 134 87 837.4280 All Patients ALL
#> 2 A: Drug X 134 79 1260.4905 All Patients ALL
#> 3 B: Placebo 82 50 850.9208 F SEX
#> 4 A: Drug X 79 45 1274.8047 F SEX
#> 5 B: Placebo 52 37 527.6659 M SEX
#> 6 A: Drug X 55 34 849.2976 M SEX
#> 7 B: Placebo 45 30 751.4314 low BMRKR2
#> 8 A: Drug X 50 31 1160.6458 low BMRKR2
#> 9 B: Placebo 101 66 741.8707 low/medium BMRKR2
#> 10 A: Drug X 87 50 1269.4039 low/medium BMRKR2
#> 11 B: Placebo 134 87 837.4280 low/medium/high BMRKR2
#> 12 A: Drug X 134 79 1260.4905 low/medium/high BMRKR2
#> var_label row_type
#> 1 All Patients content
#> 2 All Patients content
#> 3 Sex analysis
#> 4 Sex analysis
#> 5 Sex analysis
#> 6 Sex analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
#> 9 Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis
#>
#> $hr
#> arm n_tot n_tot_events hr lcl ucl conf_level pval
#> 1 268 166 0.7173651 0.5275231 0.9755262 0.95 0.03340293
#> 2 161 95 0.6979693 0.4647812 1.0481517 0.95 0.08148174
#> 3 107 71 0.7836167 0.4873444 1.2600023 0.95 0.31318347
#> 4 95 61 0.7050730 0.4243655 1.1714617 0.95 0.17526198
#> 5 188 116 0.6453648 0.4447544 0.9364622 0.95 0.02019120
#> 6 268 166 0.7173651 0.5275231 0.9755262 0.95 0.03340293
#> pval_label subgroup var var_label
#> 1 p-value (log-rank) All Patients ALL All Patients
#> 2 p-value (log-rank) F SEX Sex
#> 3 p-value (log-rank) M SEX Sex
#> 4 p-value (log-rank) low BMRKR2 Categorical Level Biomarker 2
#> 5 p-value (log-rank) low/medium BMRKR2 Categorical Level Biomarker 2
#> 6 p-value (log-rank) low/medium/high BMRKR2 Categorical Level Biomarker 2
#> row_type
#> 1 content
#> 2 analysis
#> 3 analysis
#> 4 analysis
#> 5 analysis
#> 6 analysis
#>
# Internal function - a_survival_subgroups
if (FALSE) {
  a_survival_subgroups(.formats = list("n" = "xx", "median" = "xx.xx"))
}
## Table with default columns.
basic_table() %>%
  tabulate_survival_subgroups(df, time_unit = adtte_f$AVALU[1])
#> Baseline Risk Factors B: Placebo A: Drug X
#> Total Events Events Median (DAYS) Events Median (DAYS) Hazard Ratio 95% Wald CI
#> ————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> All Patients 166 87 837.4 79 1260.5 0.72 (0.53, 0.98)
#> Sex
#> F 95 50 850.9 45 1274.8 0.70 (0.46, 1.05)
#> M 71 37 527.7 34 849.3 0.78 (0.49, 1.26)
#> Categorical Level Biomarker 2
#> LOW 61 30 751.4 31 1160.6 0.71 (0.42, 1.17)
#> MEDIUM 55 36 722.8 19 1269.4 0.57 (0.32, 1.01)
#> HIGH 50 21 848.2 29 1070.8 0.98 (0.56, 1.72)
## Table with a manually chosen set of columns: adding "pval".
basic_table() %>%
  tabulate_survival_subgroups(
    df = df,
    vars = c("n_tot_events", "n_events", "median", "hr", "ci", "pval"),
    time_unit = adtte_f$AVALU[1]
  )
#> Baseline Risk Factors B: Placebo A: Drug X
#> Total Events Events Median (DAYS) Events Median (DAYS) Hazard Ratio 95% Wald CI p-value (log-rank)
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> All Patients 166 87 837.4 79 1260.5 0.72 (0.53, 0.98) 0.0334
#> Sex
#> F 95 50 850.9 45 1274.8 0.70 (0.46, 1.05) 0.0815
#> M 71 37 527.7 34 849.3 0.78 (0.49, 1.26) 0.3132
#> Categorical Level Biomarker 2
#> LOW 61 30 751.4 31 1160.6 0.71 (0.42, 1.17) 0.1753
#> MEDIUM 55 36 722.8 19 1269.4 0.57 (0.32, 1.01) 0.0517
#> HIGH 50 21 848.2 29 1070.8 0.98 (0.56, 1.72) 0.9354