Tabulate Biomarker Effects on Binary Response by Subgroup
Source: R/response_biomarkers_subgroups.R
response_biomarkers_subgroups.Rd
Tabulate the estimated effects of multiple continuous biomarker variables on a binary response endpoint across population subgroups.
Usage
extract_rsp_biomarkers(
  variables,
  data,
  groups_lists = list(),
  control = control_logistic(),
  label_all = "All Patients"
)

tabulate_rsp_biomarkers(
  df,
  vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval")
)
Arguments
- variables
  (named list of string) list of additional analysis variables.
- data
  (data.frame) the dataset containing the variables to summarize.
- groups_lists
  (named list of list) optionally contains, for each subgroups variable, a list whose names define the new group levels and whose elements are character vectors giving the original levels that belong to each new group.
- control
  (named list) controls for the response definition and the confidence level, as produced by control_logistic().
- label_all
  (string) label for the total population analysis.
- df
  (data.frame) containing all analysis variables, as returned by extract_rsp_biomarkers().
- vars
  (character) the names of the statistics to be reported, chosen among n_tot (total number of patients per group), n_rsp (total number of responses per group), prop (total response proportion per group), or (odds ratio), ci (confidence interval of odds ratio) and pval (p value of the effect). Note that the statistics n_tot, or and ci are required.
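The shape of groups_lists can be illustrated with a small base R sketch. The toy data frame and all object names below are made up for illustration, and this is not the package's internal implementation; it only shows how the names of the inner list act as new group labels and how the character vectors select the original levels, so that groups may overlap.

```r
# Hypothetical toy data; `toy` and its values are invented for illustration.
toy <- data.frame(
  id = 1:6,
  BMRKR2 = factor(
    c("LOW", "MEDIUM", "HIGH", "LOW", "HIGH", "MEDIUM"),
    levels = c("LOW", "MEDIUM", "HIGH")
  )
)

# Same shape as the groups_lists argument: one entry per subgroup variable,
# each a named list mapping a new group label to the original levels.
groups_lists <- list(
  BMRKR2 = list(
    "low" = "LOW",
    "low/medium" = c("LOW", "MEDIUM"),
    "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
  )
)

# For each new group label, keep the rows whose level belongs to that group.
# Because groups can share levels, one row may appear in several groups.
subsets <- lapply(groups_lists$BMRKR2, function(lvls) {
  toy[toy$BMRKR2 %in% lvls, , drop = FALSE]
})

# Number of rows per new group, named by the new group labels.
group_sizes <- vapply(subsets, nrow, integer(1))
group_sizes
```

In this sketch the last group contains every level, so it covers the same rows as the total population.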
Details
These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.
Functions
- extract_rsp_biomarkers(): prepares estimates for number of responses, patients and overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables corresponds to the names of variables found in data, passed as a named list and requires elements rsp and biomarkers (vector of continuous biomarker variables) and optionally covariates, subgroups and strat. groups_lists optionally specifies groupings for subgroups variables.
- tabulate_rsp_biomarkers(): table creating function.
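To make the reported statistics concrete, the following is a minimal base R sketch of how an odds ratio, a Wald-type confidence interval and a Wald p-value for one continuous biomarker could be obtained from a logistic regression with one covariate. It uses simulated data and stats::glm(); the data, the object names and the choice of a Wald interval are assumptions for illustration and do not claim to reproduce what extract_rsp_biomarkers() does internally.

```r
# Simulated toy data (hypothetical; not the dataset used in the Examples).
set.seed(1)
n <- 200
toy <- data.frame(
  biomarker = rnorm(n, mean = 5, sd = 2),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
toy$rsp <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.2 * toy$biomarker)) == 1

# Logistic regression of the binary response on the biomarker and a covariate.
fit <- glm(rsp ~ biomarker + sex, data = toy, family = binomial())
coefs <- summary(fit)$coefficients

# Log odds ratio per unit increase of the biomarker and its standard error.
est <- coefs["biomarker", "Estimate"]
se <- coefs["biomarker", "Std. Error"]
z <- qnorm(0.975) # two-sided 95% confidence level

# One row with the same kind of columns as the statistics described above.
result <- data.frame(
  n_tot = n,
  n_rsp = sum(toy$rsp),
  prop = mean(toy$rsp),
  or = exp(est),
  lcl = exp(est - z * se),
  ucl = exp(est + z * se),
  pval = coefs["biomarker", "Pr(>|z|)"]
)
result
```

The odds ratio here is per one-unit increase of the biomarker, which is why values close to 1 are common for variables measured on a fine scale.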
Note
You can also specify a continuous variable in rsp and then use the response_definition control to convert that internally to a logical variable reflecting binary response.
In contrast to tabulate_rsp_subgroups(), this tabulation function does not start from an input layout lyt. This is because internally the table is created by combining multiple subtables.
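The binarization described in the first note can be sketched in base R. The sketch assumes that the word response in the response_definition string is a placeholder for the variable named in rsp; that substitution step, the toy data and the object names are illustrative assumptions rather than a description of the package internals.

```r
# Hypothetical toy data with a continuous variable to be binarized.
toy <- data.frame(EOSDY = c(120, 501, 730, 499, 610))

# Definition string in the same form as used with control_logistic().
response_definition <- "I(response > 500)"

# Assumption: "response" stands in for the variable given as rsp.
expr_text <- sub("response", "EOSDY", response_definition, fixed = TRUE)

# Evaluate the expression within the data to get a logical response vector.
is_responder <- as.logical(eval(parse(text = expr_text), envir = toy))
is_responder
```

Rows with a value above 500 are then treated as responders, and all others as non-responders.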
See also
h_logistic_mult_cont_df() which is used internally.
h_tab_rsp_one_biomarker() which is used internally.
Examples
# Testing dataset.
library(scda)
library(dplyr)
library(forcats)
library(rtables)
library(tern) # assumed to be the package that provides the functions documented here
adrs <- synthetic_cdisc_data("latest")$adrs
adrs_labels <- formatters::var_labels(adrs)
adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")
# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in logistic regression models with one covariate `SEX`. The subgroups
# are defined by the levels of `BMRKR2`.
df <- extract_rsp_biomarkers(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adrs_f
)
df
#> biomarker biomarker_label n_tot n_rsp prop or
#> 1 BMRKR1 Continuous Level Biomarker 1 400 336 0.8400000 1.0573123
#> 2 AGE Age 400 336 0.8400000 0.9989522
#> 3 BMRKR1 Continuous Level Biomarker 1 135 120 0.8888889 1.0630022
#> 4 AGE Age 135 120 0.8888889 1.0414591
#> 5 BMRKR1 Continuous Level Biomarker 1 135 110 0.8148148 1.0224322
#> 6 AGE Age 135 110 0.8148148 1.0206702
#> 7 BMRKR1 Continuous Level Biomarker 1 130 106 0.8153846 1.1131370
#> 8 AGE Age 130 106 0.8153846 0.9688377
#> lcl ucl conf_level pval pval_label subgroup var
#> 1 0.9715084 1.150694 0.95 0.1968485 p-value (Wald) All Patients ALL
#> 2 0.9634618 1.035750 0.95 0.9547035 p-value (Wald) All Patients ALL
#> 3 0.8880719 1.272390 0.95 0.5054054 p-value (Wald) LOW BMRKR2
#> 4 0.9606402 1.129077 0.95 0.3243039 p-value (Wald) LOW BMRKR2
#> 5 0.9016818 1.159353 0.95 0.7293672 p-value (Wald) MEDIUM BMRKR2
#> 6 0.9562410 1.089440 0.95 0.5385665 p-value (Wald) MEDIUM BMRKR2
#> 7 0.9550957 1.297330 0.95 0.1700950 p-value (Wald) HIGH BMRKR2
#> 8 0.9195637 1.020752 0.95 0.2345489 p-value (Wald) HIGH BMRKR2
#> var_label row_type
#> 1 All Patients content
#> 2 All Patients content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
# Here we group the levels of `BMRKR2` manually and add a stratification
# variable `STRATA1`. We also use a continuous variable `EOSDY` as the
# response, which is binarized internally (response is defined as this
# variable being larger than 500).
df_grouped <- extract_rsp_biomarkers(
  variables = list(
    rsp = "EOSDY",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2",
    strat = "STRATA1"
  ),
  data = adrs_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  ),
  control = control_logistic(
    response_definition = "I(response > 500)"
  )
)
df_grouped
#> biomarker biomarker_label n_tot n_rsp prop or
#> 1 BMRKR1 Continuous Level Biomarker 1 327 245 0.7492355 0.9773213
#> 2 AGE Age 327 245 0.7492355 1.0226967
#> 3 BMRKR1 Continuous Level Biomarker 1 108 82 0.7592593 0.9448599
#> 4 AGE Age 108 82 0.7592593 1.0319902
#> 5 BMRKR1 Continuous Level Biomarker 1 220 164 0.7454545 0.9398086
#> 6 AGE Age 220 164 0.7454545 1.0201601
#> 7 BMRKR1 Continuous Level Biomarker 1 327 245 0.7492355 0.9773213
#> 8 AGE Age 327 245 0.7492355 1.0226967
#> lcl ucl conf_level pval pval_label subgroup var
#> 1 0.9070459 1.053041 0.95 0.5468297 p-value (Wald) All Patients ALL
#> 2 0.9865104 1.060210 0.95 0.2220691 p-value (Wald) All Patients ALL
#> 3 0.8268817 1.079671 0.95 0.4045690 p-value (Wald) low BMRKR2
#> 4 0.9674411 1.100846 0.95 0.3393094 p-value (Wald) low BMRKR2
#> 5 0.8614063 1.025347 0.95 0.1624819 p-value (Wald) low/medium BMRKR2
#> 6 0.9740974 1.068401 0.95 0.3971685 p-value (Wald) low/medium BMRKR2
#> 7 0.9070459 1.053041 0.95 0.5468297 p-value (Wald) low/medium/high BMRKR2
#> 8 0.9865104 1.060210 0.95 0.2220691 p-value (Wald) low/medium/high BMRKR2
#> var_label row_type
#> 1 All Patients content
#> 2 All Patients content
#> 3 Categorical Level Biomarker 2 analysis
#> 4 Categorical Level Biomarker 2 analysis
#> 5 Categorical Level Biomarker 2 analysis
#> 6 Categorical Level Biomarker 2 analysis
#> 7 Categorical Level Biomarker 2 analysis
#> 8 Categorical Level Biomarker 2 analysis
## Table with default columns.
# df <- <need_data_input_to_work>
tabulate_rsp_biomarkers(df)
#> Total n Responders Response (%) Odds Ratio 95% CI p-value (Wald)
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Age
#> All Patients 400 336 84.0% 1.00 (0.96, 1.04) 0.9547
#> Categorical Level Biomarker 2
#> LOW 135 120 88.9% 1.04 (0.96, 1.13) 0.3243
#> MEDIUM 135 110 81.5% 1.02 (0.96, 1.09) 0.5386
#> HIGH 130 106 81.5% 0.97 (0.92, 1.02) 0.2345
#> Continuous Level Biomarker 1
#> All Patients 400 336 84.0% 1.06 (0.97, 1.15) 0.1968
#> Categorical Level Biomarker 2
#> LOW 135 120 88.9% 1.06 (0.89, 1.27) 0.5054
#> MEDIUM 135 110 81.5% 1.02 (0.90, 1.16) 0.7294
#> HIGH 130 106 81.5% 1.11 (0.96, 1.30) 0.1701
## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_rsp_biomarkers(
  df = df,
  vars = c("n_rsp", "ci", "n_tot", "prop", "or")
)
## Finally produce the forest plot.
if (FALSE) {
  g_forest(tab, xlim = c(0.7, 1.4))
}