Logistic regression for a binary outcome with categorical and/or continuous covariates in the model statement. For each covariate category (if categorical) or specified value (if continuous), present the degrees of freedom, the regression parameter estimate, and its standard error (SE) relative to the reference group or category. Report odds ratios for each covariate category or specified value together with the corresponding Wald confidence intervals; the default confidence level is 0.95, and the user can specify other levels. Report the p-value of the Wald chi-square test of the null hypothesis that the covariate has no effect on the response in a model containing all specified covariates. Optionally, one two-way interaction can be included, with similar output presented for each interaction degree of freedom. Note: For the formula, the variable names need to be standard data frame column names without special characters.
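The quantities described above can be illustrated with base R alone. The following is a minimal sketch on simulated data, not the package's implementation: the data set, the variable names (arm, age, resp), and the helper wald_test() are all assumptions made for illustration. It computes per-coefficient odds ratios with Wald confidence intervals and a joint Wald chi-square test for a covariate with more than one degree of freedom.

```r
# Simulated data; all variable names here are illustrative assumptions.
set.seed(1)
n <- 300
dat <- data.frame(
  arm = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  age = round(rnorm(n, mean = 50, sd = 10))
)
lin_pred <- -0.5 + 0.6 * (dat$arm == "B") + 0.2 * (dat$arm == "C") +
  0.01 * (dat$age - 50)
dat$resp <- rbinom(n, size = 1, prob = plogis(lin_pred))

fit <- glm(resp ~ arm + age, data = dat, family = binomial())

# Estimate, SE, odds ratio and Wald confidence interval per coefficient.
conf_level <- 0.95
z <- qnorm(1 - (1 - conf_level) / 2)
est <- coef(fit)
se <- sqrt(diag(vcov(fit)))
wald <- data.frame(
  estimate = est,
  std_error = se,
  odds_ratio = exp(est),
  lcl = exp(est - z * se),
  ucl = exp(est + z * se)
)[-1, ] # drop the intercept row

# Hypothetical helper: joint Wald chi-square test that all coefficients
# belonging to one covariate are zero (df = number of coefficients).
wald_test <- function(fit, coef_names) {
  b <- coef(fit)[coef_names]
  v <- vcov(fit)[coef_names, coef_names, drop = FALSE]
  stat <- as.numeric(t(b) %*% solve(v) %*% b)
  df <- length(coef_names)
  c(statistic = stat, df = df, pvalue = pchisq(stat, df = df, lower.tail = FALSE))
}

wald
arm_test <- wald_test(fit, c("armB", "armC"))
arm_test
```

For a single coefficient the joint test reduces to the squared z statistic, so the same helper covers both continuous covariates (1 df) and categorical covariates (number of levels minus 1 df).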
Usage
fit_logistic(
  data,
  variables = list(
    response = "Response", arm = "ARMCD", covariates = NULL,
    interaction = NULL, strata = NULL
  ),
  response_definition = "response"
)

h_get_interaction_vars(fit_glm)

h_interaction_coef_name(
  interaction_vars,
  first_var_with_level,
  second_var_with_level
)

h_or_cat_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  conf_level = 0.95
)

h_or_cont_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_or_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_simple_term_labels(terms, table)

h_interaction_term_labels(terms1, terms2, table, any = FALSE)

h_glm_simple_term_extract(x, fit_glm)

h_glm_interaction_extract(x, fit_glm)

h_glm_inter_term_extract(odds_ratio_var, interaction_var, fit_glm, ...)

h_logistic_simple_terms(x, fit_glm, conf_level = 0.95)

h_logistic_inter_terms(x, fit_glm, conf_level = 0.95, at = NULL)

# S3 method for glm
tidy(fit_glm, conf_level = 0.95, at = NULL)

logistic_regression_cols(lyt, conf_level = 0.95)

logistic_summary_by_flag(flag_var)

summarize_logistic(lyt, conf_level, drop_and_remove_str = "")
Arguments
- data (data frame): the data frame on which the model was fit.
- variables (named list of string): list of additional analysis variables.
- response_definition (string): the definition of what an event is in terms of response. This will be used when fitting the (conditional) logistic regression model on the left hand side of the formula.
- fit_glm: logistic regression model fitted by stats::glm() with "binomial" family.
- interaction_vars (character of length 2): interaction variable names.
- first_var_with_level (character of length 2): the first variable name with the interaction level.
- second_var_with_level (character of length 2): the second variable name with the interaction level.
- odds_ratio_var (string): the odds ratio variable.
- interaction_var (string): the interaction variable.
- conf_level (proportion): confidence level of the interval.
- at (NULL or numeric): optional values for the interaction variable. Otherwise the median is used.
- terms (character): simple terms.
- table (table): table containing numbers for terms.
- terms1 (character): terms for first dimension (rows).
- terms2 (character): terms for second dimension (columns).
- any (flag): whether any of terms1 and terms2 can be fulfilled to count the number of patients. In that case they can only be scalar (strings).
- x (string or character): a variable or interaction term in fit_glm (depending on the helper function).
- ...: additional arguments for the lower level functions.
- lyt (layout): input layout where analyses will be added to.
- flag_var (string): variable name identifying which row should be used in this content function.
- drop_and_remove_str (string): string to be dropped and removed.
Details
Note this function may hang or error for certain datasets when an old version of the survival package (< 3.2-13) is used.
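One way to guard against the issue above is to check the installed survival version before fitting. This is a small sketch only; the helper name survival_is_recent() is hypothetical and not part of the package.

```r
# Hypothetical helper: TRUE if the installed survival package is at least
# `min_version`, FALSE if it is older, NA if survival is not installed.
survival_is_recent <- function(min_version = "3.2-13") {
  if (!requireNamespace("survival", quietly = TRUE)) {
    return(NA)
  }
  utils::packageVersion("survival") >= min_version
}

survival_is_recent()
```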
Functions
- fit_logistic(): Fit a (conditional) logistic regression model.
- h_get_interaction_vars(): Helper function to extract interaction variable names from a fitted model assuming only one interaction term.
- h_interaction_coef_name(): Helper function to get the right coefficient name from the interaction variable names and the given levels. The main value here is that the order of first and second variable is checked in the interaction_vars input.
- h_or_cat_interaction(): Helper function to calculate the odds ratio estimates for the case when both the odds ratio and the interaction variable are categorical.
- h_or_cont_interaction(): Helper function to calculate the odds ratio estimates for the case when either the odds ratio or the interaction variable is continuous.
- h_or_interaction(): Helper function to calculate the odds ratio estimates in case of an interaction. This is a wrapper for h_or_cont_interaction() and h_or_cat_interaction().
- h_simple_term_labels(): Helper function to construct term labels from simple terms and the table of numbers of patients.
- h_interaction_term_labels(): Helper function to construct term labels from interaction terms and the table of numbers of patients.
- h_glm_simple_term_extract(): Helper function to tabulate the main effect results of a (conditional) logistic regression model.
- h_glm_interaction_extract(): Helper function to tabulate the interaction term results of a logistic regression model.
- h_glm_inter_term_extract(): Helper function to tabulate the interaction results of a logistic regression model. This basically is a wrapper for h_or_interaction() and h_glm_simple_term_extract() which puts the results in the right data frame format.
- h_logistic_simple_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of simple terms.
- h_logistic_inter_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of interaction terms.
- tidy(glm): Helper method (for broom::tidy()) to prepare a data frame from a glm object with binomial family.
- logistic_regression_cols(): Layout creating function for a multi-variable column layout summarizing logistic regression results.
- logistic_summary_by_flag(): Constructor for content functions to be used to summarize logistic regression results.
- summarize_logistic(): Layout creating function which summarizes a logistic variable regression.
Note
We don't provide a function for the case when both variables are continuous because this does not arise in this table, as the treatment arm variable will always be involved and categorical.
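When the interaction variable is continuous, the odds ratio between two arms depends on the value at which the continuous variable is evaluated, which is what the at argument controls (the median is used when it is not given). The sketch below shows the underlying arithmetic in base R on simulated data. It is an illustration under stated assumptions, not the package's code: or_at() is a hypothetical helper, and the coefficient names "armB" and "armB:age" come from the simulated model.

```r
# Simulated data with an arm-by-age interaction; names are illustrative.
set.seed(2)
n <- 400
dat <- data.frame(
  arm = factor(sample(c("A", "B"), n, replace = TRUE)),
  age = round(rnorm(n, mean = 50, sd = 10))
)
lin_pred <- -0.3 + 0.5 * (dat$arm == "B") + 0.02 * (dat$age - 50) -
  0.03 * (dat$arm == "B") * (dat$age - 50)
dat$resp <- rbinom(n, size = 1, prob = plogis(lin_pred))

fit <- glm(resp ~ arm * age, data = dat, family = binomial())

# Hypothetical helper: odds ratio of arm B vs. A at given ages.
# log(OR) = beta_main + at * beta_inter, with the Wald variance
# var(beta_main) + at^2 * var(beta_inter) + 2 * at * cov(beta_main, beta_inter).
or_at <- function(fit, main, inter, at, conf_level = 0.95) {
  b <- coef(fit)
  v <- vcov(fit)
  z <- qnorm(1 - (1 - conf_level) / 2)
  log_or <- b[[main]] + at * b[[inter]]
  se <- sqrt(v[main, main] + at^2 * v[inter, inter] + 2 * at * v[main, inter])
  data.frame(
    at = at,
    odds_ratio = exp(log_or),
    lcl = exp(log_or - z * se),
    ucl = exp(log_or + z * se)
  )
}

# Evaluate at the median (the documented default) and at chosen values.
or_at(fit, main = "armB", inter = "armB:age", at = median(dat$age))
or_at(fit, main = "armB", inter = "armB:age", at = c(40, 50, 60))
```

The covariance term matters: ignoring it would treat the main effect and interaction estimates as independent, which they generally are not.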
Model Specification
The variables list needs to include the following elements:
- arm: the usual treatment arm variable name.
- response: the response variable name. Usually this is a 0/1 variable.
- covariates: this is either NULL (no covariates) or a character vector of covariate variable names.
- interaction: this is either NULL (no interaction) or a string of a single covariate variable name already included in covariates. Then the interaction with the treatment arm is included in the model.
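To make the mapping from the variables list to a model formula concrete, here is a minimal sketch. The helper build_formula() is hypothetical and is not the package's internal code; in particular, substituting the response variable name into response_definition is an assumption about how the left hand side is formed, and strata is ignored.

```r
# Hypothetical helper: build a glm formula from a `variables` list shaped
# like the one described above. Not the package's implementation.
build_formula <- function(variables, response_definition = "response") {
  all_vars <- unlist(variables[c("response", "arm", "covariates", "interaction")])
  # Names must be syntactically valid R names to be pasted into a formula.
  bad <- all_vars[all_vars != make.names(all_vars)]
  if (length(bad) > 0) {
    stop("Non-standard column names: ", paste(bad, collapse = ", "))
  }
  if (!is.null(variables$interaction) &&
    !variables$interaction %in% variables$covariates) {
    stop("`interaction` must also be listed in `covariates`.")
  }
  # Assumption: "response" in the definition is replaced by the variable name.
  lhs <- gsub("response", variables$response, response_definition, fixed = TRUE)
  rhs <- c(variables$arm, variables$covariates)
  if (!is.null(variables$interaction)) {
    rhs <- c(rhs, paste0(variables$arm, ":", variables$interaction))
  }
  stats::as.formula(paste(lhs, "~", paste(rhs, collapse = " + ")))
}

build_formula(list(
  response = "Response",
  arm = "ARMCD",
  covariates = c("AGE", "RACE"),
  interaction = "AGE"
))
```

Validating names with make.names() up front reflects the note that variable names must be standard data frame column names without special characters.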
Examples
library(tern)
library(scda)
library(dplyr)
library(rtables)
adrs <- synthetic_cdisc_data("latest")$adrs
adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
  mutate(
    Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
    RACE = factor(RACE),
    SEX = factor(SEX)
  )
formatters::var_labels(adrs_f) <- c(formatters::var_labels(adrs), Response = "Response")
mod1 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE")
  )
)
mod2 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE"),
    interaction = "AGE"
  )
)
h_glm_simple_term_extract("AGE", mod1)
#> variable variable_label term term_label interaction interaction_label
#> 1 AGE Age AGE Age
#> reference reference_label estimate std_error df pvalue
#> 1 0.06746155 0.05343377 1 0.2067594
#> is_variable_summary is_term_summary
#> 1 FALSE TRUE
h_glm_simple_term_extract("ARMCD", mod1)
#> variable variable_label term term_label interaction
#> 1 ARMCD Planned Arm Code ARM A Reference ARM A, n = 126
#> 2 ARMCD Planned Arm Code ARM B ARM B, n = 121
#> 3 ARMCD Planned Arm Code ARM C ARM C, n = 126
#> interaction_label reference reference_label estimate std_error df
#> 1 2
#> 2 -2.154975 1.083761 1
#> 3 -0.1274995 1.424796 1
#> pvalue is_variable_summary is_term_summary
#> 1 0.03677249 TRUE FALSE
#> 2 0.04676489 FALSE TRUE
#> 3 0.9286955 FALSE TRUE
h_glm_interaction_extract("ARMCD:AGE", mod2)
#> variable variable_label term
#> 1 ARMCD:AGE Interaction of Planned Arm Code * Age ARM A
#> 2 ARMCD:AGE Interaction of Planned Arm Code * Age ARM B
#> 3 ARMCD:AGE Interaction of Planned Arm Code * Age ARM C
#> term_label interaction interaction_label reference
#> 1 Reference ARM A, n = 126
#> 2 ARM B, n = 121
#> 3 ARM C, n = 126
#> reference_label estimate std_error df pvalue is_variable_summary
#> 1 2 0.2306204 TRUE
#> 2 -0.8682313 0.6298379 1 0.1680491 FALSE
#> 3 -0.617569 0.6714418 1 0.3576953 FALSE
#> is_term_summary
#> 1 FALSE
#> 2 TRUE
#> 3 TRUE
h_glm_inter_term_extract("AGE", "ARMCD", mod2)
#> variable variable_label term term_label interaction interaction_label
#> 1 AGE Age AGE Age
#> 2 AGE Age AGE Age ARMCD Planned Arm Code
#> 3 AGE Age AGE Age ARMCD Planned Arm Code
#> 4 AGE Age AGE Age ARMCD Planned Arm Code
#> reference reference_label estimate std_error odds_ratio lcl ucl
#> 1 0.8959838 0.6294275 NA NA NA
#> 2 ARM A ARM A NA NA 2.449745 0.7134297 8.411829
#> 3 ARM B ARM B NA NA 1.028141 0.9286094 1.138341
#> 4 ARM C ARM C NA NA 1.321034 0.8307684 2.100622
#> df pvalue is_variable_summary is_term_summary is_reference_summary
#> 1 1 0.1545941 FALSE TRUE FALSE
#> 2 NA NA FALSE FALSE TRUE
#> 3 NA NA FALSE FALSE TRUE
#> 4 NA NA FALSE FALSE TRUE
h_logistic_simple_terms("AGE", mod1)
#> variable variable_label term term_label interaction interaction_label
#> 1 AGE Age AGE Age
#> reference reference_label estimate std_error df pvalue
#> 1 0.06746155 0.05343377 1 0.2067594
#> is_variable_summary is_term_summary odds_ratio lcl ucl
#> 1 FALSE TRUE 1.069789 0.9634191 1.187903
#> ci
#> 1 0.9634191, 1.1879033
h_logistic_inter_terms(c("RACE", "AGE", "ARMCD", "AGE:ARMCD"), mod2)
#> variable variable_label term
#> 1 RACE Race ASIAN
#> 2 RACE Race BLACK OR AFRICAN AMERICAN
#> 3 RACE Race WHITE
#> 13 ARMCD Planned Arm Code ARM A
#> 23 ARMCD Planned Arm Code ARM B
#> ARM B ARMCD Planned Arm Code ARM B
#> 33 ARMCD Planned Arm Code ARM C
#> ARM C ARMCD Planned Arm Code ARM C
#> 11 AGE Age AGE
#> 21 AGE Age AGE
#> 31 AGE Age AGE
#> 4 AGE Age AGE
#> 12 AGE:ARMCD Interaction of Planned Arm Code * Age ARM A
#> 22 AGE:ARMCD Interaction of Planned Arm Code * Age ARM B
#> 32 AGE:ARMCD Interaction of Planned Arm Code * Age ARM C
#> term_label interaction interaction_label reference
#> 1 Reference ASIAN, n = 208
#> 2 BLACK OR AFRICAN AMERICAN, n = 91
#> 3 WHITE, n = 74
#> 13 Reference ARM A, n = 126
#> 23 ARM B, n = 121
#> ARM B ARM B, n = 121 AGE Age 34
#> 33 ARM C, n = 126
#> ARM C ARM C, n = 126 AGE Age 34
#> 11 Age
#> 21 Age ARMCD Planned Arm Code ARM A
#> 31 Age ARMCD Planned Arm Code ARM B
#> 4 Age ARMCD Planned Arm Code ARM C
#> 12 Reference ARM A, n = 126
#> 22 ARM B, n = 121
#> 32 ARM C, n = 126
#> reference_label estimate std_error df pvalue odds_ratio
#> 1 2 0.6027443
#> 2 1.078888 1.136031 1 0.342265 2.941406
#> 3 0.4554458 0.8997603 1 0.6127263 1.576876
#> 13 2 0.2890228 NA
#> 23 20.23523 14.7874 1 0.1711835 NA
#> ARM B 34 NA NA NA NA 9.284028e-05
#> 33 14.71317 16.08665 1 0.360391 NA
#> ARM C 34 NA NA NA NA 0.001865597
#> 11 0.8959838 0.6294275 1 0.1545941 NA
#> 21 ARM A NA NA NA NA 2.449745
#> 31 ARM B NA NA NA NA 1.028141
#> 4 ARM C NA NA NA NA 1.321034
#> 12 NA NA NA
#> 22 -0.8682313 0.6298379 1 0.1680491 NA
#> 32 -0.617569 0.6714418 1 0.3576953 NA
#> lcl ucl is_variable_summary is_term_summary
#> 1 TRUE FALSE
#> 2 0.3173685 27.26127 FALSE TRUE
#> 3 0.2703462 9.19761 FALSE TRUE
#> 13 NA NA TRUE FALSE
#> 23 NA NA FALSE TRUE
#> ARM B 1.483171e-10 58.11413 FALSE FALSE
#> 33 NA NA FALSE TRUE
#> ARM C 1.831536e-09 1900.293 FALSE FALSE
#> 11 NA NA FALSE TRUE
#> 21 0.7134297 8.411829 FALSE FALSE
#> 31 0.9286094 1.138341 FALSE FALSE
#> 4 0.8307684 2.100622 FALSE FALSE
#> 12 NA NA TRUE FALSE
#> 22 NA NA FALSE TRUE
#> 32 NA NA FALSE TRUE
#> is_reference_summary ci
#> 1 FALSE
#> 2 FALSE 0.3173685, 27.2612702
#> 3 FALSE 0.2703462, 9.1976100
#> 13 FALSE NA, NA
#> 23 FALSE NA, NA
#> ARM B TRUE 1.483171e-10, 5.811413e+01
#> 33 FALSE NA, NA
#> ARM C TRUE 1.831536e-09, 1.900293e+03
#> 11 FALSE NA, NA
#> 21 TRUE 0.7134297, 8.4118289
#> 31 TRUE 0.9286094, 1.1383411
#> 4 TRUE 0.8307684, 2.1006225
#> 12 FALSE NA, NA
#> 22 FALSE NA, NA
#> 32 FALSE NA, NA
library(broom)
df <- tidy(mod1, conf_level = 0.99)
df2 <- tidy(mod2, conf_level = 0.99)
# Internal function - replace_emptys_with_na
if (FALSE) {
  # flagging empty strings with "_"
  df <- replace_emptys_with_na(df, rep_str = "_")
  df2 <- replace_emptys_with_na(df2, rep_str = "_")

  result1 <- basic_table() %>%
    summarize_logistic(
      conf_level = 0.95,
      drop_and_remove_str = "_"
    ) %>%
    build_table(df = df)
  result1

  result2 <- basic_table() %>%
    summarize_logistic(
      conf_level = 0.95,
      drop_and_remove_str = "_"
    ) %>%
    build_table(df = df2)
  result2
}