Tabulate Survival Duration by Subgroup
Source: R/survival_duration_subgroups.R
Usage
extract_survival_subgroups(
  variables,
  data,
  groups_lists = list(),
  control = control_coxph(),
  label_all = "All Patients"
)
a_survival_subgroups(
  .formats = list(
    n = "xx", n_events = "xx", n_tot_events = "xx", median = "xx.x", n_tot = "xx",
    hr = list(format_extreme_values(2L)),
    ci = list(format_extreme_values_ci(2L)),
    pval = "x.xxxx | (<0.0001)"
  )
)
tabulate_survival_subgroups(
  lyt,
  df,
  vars = c("n_tot_events", "n_events", "median", "hr", "ci"),
  time_unit = NULL
)
Arguments
- variables (named list of string): list of additional analysis variables.
- data (data frame): the dataset containing the variables to summarize.
- groups_lists (named list of list): optionally contains, for each subgroups variable, a list that specifies the new group levels via its names and the original levels belonging to each new group via the character vectors that are its elements.
- control (list): parameters for comparison details, specified by using the helper function control_coxph(). Some possible parameter options are:
  - pval_method (string): p-value method for testing hazard ratio = 1. The default method is "log-rank", which comes from survival::survdiff(); it can also be set to "wald" or "likelihood", which come from survival::coxph().
  - ties (string): the method for tie handling. Default is "efron"; can also be set to "breslow" or "exact". See more in survival::coxph().
  - conf_level (proportion): confidence level of the interval for HR.
- label_all (string): label for the total population analysis.
- .formats (named character or list): formats for the statistics.
- lyt (layout): input layout where analyses will be added to.
- df (list): list of data frames containing all analysis variables. The list should be created using extract_survival_subgroups().
- vars (character): the names of the statistics to be reported, among n_tot_events (total number of events per group), n_events (number of events per group), n_tot (total number of observations per group), n (number of observations per group), median (median survival time), hr (hazard ratio), ci (confidence interval of hazard ratio) and pval (p value of the effect). Note, one of the statistics n_tot and n_tot_events, as well as both hr and ci, are required.
- time_unit (string): label with unit of median survival time. Default NULL skips displaying unit.
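The control options above map onto choices offered by the survival package. As a rough illustration only (this is not the package's internal code), the sketch below computes a hazard ratio, its confidence interval, and the three p-value variants directly with survival::coxph() and survival::survdiff(). The lung dataset, the use of sex as the arm variable, and all object names are assumptions made for this example.

```r
library(survival)

# Illustrative data (an assumption, not from this page): survival::lung,
# where status == 2 marks a death and sex plays the role of the arm.
dat <- survival::lung
dat$is_event <- dat$status == 2
dat$arm <- factor(dat$sex, levels = c(1, 2), labels = c("Male", "Female"))

conf_level <- 0.95

# Cox model with Efron tie handling (the `ties` option).
fit <- coxph(Surv(time, is_event) ~ arm, data = dat, ties = "efron")
hr <- unname(exp(coef(fit)))
ci <- unname(exp(confint(fit, level = conf_level))[1, ])

# "log-rank": test from survdiff(), chi-square on (number of groups - 1) df.
lr <- survdiff(Surv(time, is_event) ~ arm, data = dat)
pval_logrank <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# "wald" and "likelihood": tests reported by summary() of the Cox model.
fit_summary <- summary(fit)
pval_wald <- unname(fit_summary$waldtest["pvalue"])
pval_likelihood <- unname(fit_summary$logtest["pvalue"])

round(c(hr = hr, lcl = ci[1], ucl = ci[2]), 3)
round(c(logrank = pval_logrank, wald = pval_wald, likelihood = pval_likelihood), 4)
```

With two arms the three p-values are usually close to each other; they differ mainly in small samples or with heavy ties.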
Details
These functions create a layout starting from a data frame which contains the required statistics. The resulting tables are typically used as part of a forest plot.
Functions
- extract_survival_subgroups(): prepares estimates of median survival times and treatment hazard ratios for population subgroups in data frames. Simple wrapper for h_survtime_subgroups_df() and h_coxph_subgroups_df(). The result is a list of two data frames: survtime and hr. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event, arm and optionally subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
- a_survival_subgroups(): Formatted Analysis function used to format the results of extract_survival_subgroups(). Returns a list of Formatted Analysis functions with one element per statistic.
- tabulate_survival_subgroups(): table creating function.
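To make the shape of the extracted result more concrete, the following sketch rebuilds a simplified survtime and hr pair by hand with the survival package. It is not the implementation of h_survtime_subgroups_df() or h_coxph_subgroups_df(); the lung data, the age_grp subgroup variable, and the helper names survtime_one() and hr_one() are hypothetical and serve only as illustration.

```r
library(survival)

# Illustrative data (assumption): survival::lung, sex as the arm variable.
dat <- survival::lung
dat$is_event <- dat$status == 2
dat$arm <- factor(dat$sex, levels = c(1, 2), labels = c("Male", "Female"))
dat$age_grp <- factor(ifelse(dat$age >= 65, ">=65", "<65"))

# Hypothetical helper: one row per arm with counts and median survival time.
survtime_one <- function(d, subgroup) {
  rows <- lapply(levels(d$arm), function(a) {
    d_arm <- d[d$arm == a, ]
    fit <- survfit(Surv(time, is_event) ~ 1, data = d_arm)
    data.frame(
      arm = a,
      n = nrow(d_arm),
      n_events = sum(d_arm$is_event),
      median = unname(summary(fit)$table["median"]),
      subgroup = subgroup
    )
  })
  do.call(rbind, rows)
}

# Hypothetical helper: one row with the hazard ratio between the two arms.
hr_one <- function(d, subgroup) {
  fit <- coxph(Surv(time, is_event) ~ arm, data = d)
  ci <- exp(confint(fit, level = 0.95))
  data.frame(
    n_tot = nrow(d),
    n_tot_events = sum(d$is_event),
    hr = unname(exp(coef(fit))),
    lcl = ci[1, 1],
    ucl = ci[1, 2],
    subgroup = subgroup
  )
}

# Total population first, then one subset per subgroup level.
subsets <- c(list("All Patients" = dat), split(dat, dat$age_grp))
result <- list(
  survtime = do.call(rbind, Map(survtime_one, subsets, names(subsets))),
  hr = do.call(rbind, Map(hr_one, subsets, names(subsets)))
)
result
```

The survtime element has one row per arm and subset, while hr has one row per subset, which mirrors the two-data-frame structure described above.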
Examples
# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)
adtte <- synthetic_cdisc_data("latest")$adtte
# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)
adtte_f <- adtte %>%
  filter(
    PARAMCD == "OS",
    ARM %in% c("B: Placebo", "A: Drug X"),
    SEX %in% c("M", "F")
  ) %>%
  mutate(
    # Reorder levels of ARM to display reference arm before treatment arm.
    ARM = droplevels(fct_relevel(ARM, "B: Placebo")),
    SEX = droplevels(SEX),
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c(
  "ARM" = adtte_labels[["ARM"]],
  "SEX" = adtte_labels[["SEX"]],
  "AVALU" = adtte_labels[["AVALU"]],
  "is_event" = "Event Flag"
)
formatters::var_labels(adtte_f)[names(labels)] <- labels
df <- extract_survival_subgroups(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM", subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)
df
#> $survtime
#>           arm   n n_events    median     subgroup    var
#> 1  B: Placebo 134       87  837.4280 All Patients    ALL
#> 2   A: Drug X 134       79 1260.4905 All Patients    ALL
#> 3  B: Placebo  82       50  850.9208            F    SEX
#> 4   A: Drug X  79       45 1274.8047            F    SEX
#> 5  B: Placebo  52       37  527.6659            M    SEX
#> 6   A: Drug X  55       34  849.2976            M    SEX
#> 7  B: Placebo  45       30  751.4314          LOW BMRKR2
#> 8   A: Drug X  50       31 1160.6458          LOW BMRKR2
#> 9  B: Placebo  56       36  722.7926       MEDIUM BMRKR2
#> 10  A: Drug X  37       19 1269.4039       MEDIUM BMRKR2
#> 11 B: Placebo  33       21  848.2393         HIGH BMRKR2
#> 12  A: Drug X  47       29 1070.8022         HIGH BMRKR2
#>                        var_label row_type
#> 1                   All Patients  content
#> 2                   All Patients  content
#> 3                            Sex analysis
#> 4                            Sex analysis
#> 5                            Sex analysis
#> 6                            Sex analysis
#> 7  Categorical Level Biomarker 2 analysis
#> 8  Categorical Level Biomarker 2 analysis
#> 9  Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis
#>
#> $hr
#>   arm n_tot n_tot_events        hr       lcl       ucl conf_level       pval
#> 1       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#> 2       161           95 0.6979693 0.4647812 1.0481517       0.95 0.08148174
#> 3       107           71 0.7836167 0.4873444 1.2600023       0.95 0.31318347
#> 4        95           61 0.7050730 0.4243655 1.1714617       0.95 0.17526198
#> 5        93           55 0.5728069 0.3244196 1.0113683       0.95 0.05174942
#> 6        80           50 0.9769002 0.5552002 1.7189005       0.95 0.93538927
#>           pval_label     subgroup    var                     var_label row_type
#> 1 p-value (log-rank) All Patients    ALL                  All Patients  content
#> 2 p-value (log-rank)            F    SEX                           Sex analysis
#> 3 p-value (log-rank)            M    SEX                           Sex analysis
#> 4 p-value (log-rank)          LOW BMRKR2 Categorical Level Biomarker 2 analysis
#> 5 p-value (log-rank)       MEDIUM BMRKR2 Categorical Level Biomarker 2 analysis
#> 6 p-value (log-rank)         HIGH BMRKR2 Categorical Level Biomarker 2 analysis
#>
df_grouped <- extract_survival_subgroups(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM", subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
df_grouped
#> $survtime
#>           arm   n n_events    median        subgroup    var
#> 1  B: Placebo 134       87  837.4280    All Patients    ALL
#> 2   A: Drug X 134       79 1260.4905    All Patients    ALL
#> 3  B: Placebo  82       50  850.9208               F    SEX
#> 4   A: Drug X  79       45 1274.8047               F    SEX
#> 5  B: Placebo  52       37  527.6659               M    SEX
#> 6   A: Drug X  55       34  849.2976               M    SEX
#> 7  B: Placebo  45       30  751.4314             low BMRKR2
#> 8   A: Drug X  50       31 1160.6458             low BMRKR2
#> 9  B: Placebo 101       66  741.8707      low/medium BMRKR2
#> 10  A: Drug X  87       50 1269.4039      low/medium BMRKR2
#> 11 B: Placebo 134       87  837.4280 low/medium/high BMRKR2
#> 12  A: Drug X 134       79 1260.4905 low/medium/high BMRKR2
#>                        var_label row_type
#> 1                   All Patients  content
#> 2                   All Patients  content
#> 3                            Sex analysis
#> 4                            Sex analysis
#> 5                            Sex analysis
#> 6                            Sex analysis
#> 7  Categorical Level Biomarker 2 analysis
#> 8  Categorical Level Biomarker 2 analysis
#> 9  Categorical Level Biomarker 2 analysis
#> 10 Categorical Level Biomarker 2 analysis
#> 11 Categorical Level Biomarker 2 analysis
#> 12 Categorical Level Biomarker 2 analysis
#>
#> $hr
#>   arm n_tot n_tot_events        hr       lcl       ucl conf_level       pval
#> 1       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#> 2       161           95 0.6979693 0.4647812 1.0481517       0.95 0.08148174
#> 3       107           71 0.7836167 0.4873444 1.2600023       0.95 0.31318347
#> 4        95           61 0.7050730 0.4243655 1.1714617       0.95 0.17526198
#> 5       188          116 0.6453648 0.4447544 0.9364622       0.95 0.02019120
#> 6       268          166 0.7173651 0.5275231 0.9755262       0.95 0.03340293
#>           pval_label        subgroup    var                     var_label
#> 1 p-value (log-rank)    All Patients    ALL                  All Patients
#> 2 p-value (log-rank)               F    SEX                           Sex
#> 3 p-value (log-rank)               M    SEX                           Sex
#> 4 p-value (log-rank)             low BMRKR2 Categorical Level Biomarker 2
#> 5 p-value (log-rank)      low/medium BMRKR2 Categorical Level Biomarker 2
#> 6 p-value (log-rank) low/medium/high BMRKR2 Categorical Level Biomarker 2
#>   row_type
#> 1  content
#> 2 analysis
#> 3 analysis
#> 4 analysis
#> 5 analysis
#> 6 analysis
#>
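The grouped result above shows that groups defined in groups_lists may overlap: "low/medium" pools the LOW and MEDIUM levels, so its n_tot of 188 equals the 95 + 93 reported for LOW and MEDIUM in the ungrouped result. A minimal base R sketch of that mapping follows; it uses a small made-up vector rather than the ADTTE data, and the object names bmrkr2, groups and group_sizes are hypothetical.

```r
# Made-up biomarker values (assumption, for illustration only).
bmrkr2 <- factor(
  c("LOW", "MEDIUM", "HIGH", "LOW", "MEDIUM", "HIGH", "LOW"),
  levels = c("LOW", "MEDIUM", "HIGH")
)

# Same structure as the groups_lists entry for BMRKR2 in the example above.
groups <- list(
  "low" = "LOW",
  "low/medium" = c("LOW", "MEDIUM"),
  "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
)

# Each new group collects every observation whose level is listed for it,
# so one observation can contribute to several groups.
group_sizes <- vapply(groups, function(lv) sum(bmrkr2 %in% lv), integer(1))
group_sizes
```

Because the groups are cumulative here, the last one covers every level and therefore reproduces the total population row.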
# Internal function - a_survival_subgroups
if (FALSE) {
  a_survival_subgroups(.formats = list("n" = "xx", "median" = "xx.xx"))
}
## Table with default columns.
basic_table() %>%
  tabulate_survival_subgroups(df, time_unit = adtte_f$AVALU[1])
#> Baseline Risk Factors                                B: Placebo               A: Drug X
#>                                 Total Events   Events   Median (DAYS)   Events   Median (DAYS)   Hazard Ratio   95% Wald CI
#> ————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> All Patients                        166          87         837.4         79        1260.5           0.72       (0.53, 0.98)
#> Sex
#>   F                                  95          50         850.9         45        1274.8           0.70       (0.46, 1.05)
#>   M                                  71          37         527.7         34         849.3           0.78       (0.49, 1.26)
#> Categorical Level Biomarker 2
#>   LOW                                61          30         751.4         31        1160.6           0.71       (0.42, 1.17)
#>   MEDIUM                             55          36         722.8         19        1269.4           0.57       (0.32, 1.01)
#>   HIGH                               50          21         848.2         29        1070.8           0.98       (0.56, 1.72)
## Table with a manually chosen set of columns: adding "pval".
basic_table() %>%
  tabulate_survival_subgroups(
    df = df,
    vars = c("n_tot_events", "n_events", "median", "hr", "ci", "pval"),
    time_unit = adtte_f$AVALU[1]
  )
#> Baseline Risk Factors                                B: Placebo               A: Drug X
#>                                 Total Events   Events   Median (DAYS)   Events   Median (DAYS)   Hazard Ratio   95% Wald CI    p-value (log-rank)
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> All Patients                        166          87         837.4         79        1260.5           0.72       (0.53, 0.98)         0.0334
#> Sex
#>   F                                  95          50         850.9         45        1274.8           0.70       (0.46, 1.05)         0.0815
#>   M                                  71          37         527.7         34         849.3           0.78       (0.49, 1.26)         0.3132
#> Categorical Level Biomarker 2
#>   LOW                                61          30         751.4         31        1160.6           0.71       (0.42, 1.17)         0.1753
#>   MEDIUM                             55          36         722.8         19        1269.4           0.57       (0.32, 1.01)         0.0517
#>   HIGH                               50          21         848.2         29        1070.8           0.98       (0.56, 1.72)         0.9354