Multi-variable logistic regression table — logistic

Fits a logistic regression for a binary outcome with categorical and/or continuous covariates in the model statement. For each covariate category (if categorical) or specified value (if continuous), the table presents the degrees of freedom, the regression parameter estimate, and its standard error (SE) relative to the reference group or category. Odds ratios are reported for each covariate category or specified value together with the corresponding Wald confidence intervals; the default confidence level is 95%, and the user can specify another level. The p-value of the Wald chi-square test is reported for the null hypothesis that the covariate has no effect on the response in a model containing all specified covariates. Optionally, one two-way interaction can be included, in which case similar output is presented for each interaction degree of freedom. Note: Variable names used in the formula must be standard data frame column names without special characters.
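The quantities described above can be sketched with base R alone. This is a minimal illustration on simulated data, not the code used by the functions on this page: the data set, the column names (arm, age, resp) and the way the joint Wald test is computed are all assumptions made for the example.

```r
# Simulated data; every name here is illustrative only.
set.seed(42)
n <- 300
dat <- data.frame(
  arm = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  age = round(rnorm(n, mean = 50, sd = 10))
)
lin_pred <- -0.5 + 0.8 * (dat$arm == "B") + 0.3 * (dat$arm == "C") +
  0.02 * (dat$age - 50)
dat$resp <- rbinom(n, size = 1, prob = plogis(lin_pred))

# Logistic regression with one categorical and one continuous covariate.
fit <- glm(resp ~ arm + age, data = dat, family = binomial())

# Estimates and standard errors on the log-odds scale.
coefs <- summary(fit)$coefficients
est <- coefs[, "Estimate"]
se <- coefs[, "Std. Error"]

# Wald confidence limits, exponentiated to the odds ratio scale.
conf_level <- 0.95
z <- qnorm((1 + conf_level) / 2)
or_table <- data.frame(
  estimate = est,
  std_error = se,
  odds_ratio = exp(est),
  lcl = exp(est - z * se),
  ucl = exp(est + z * se)
)[-1, ] # drop the intercept row

# Joint Wald chi-square test for the 2-df arm effect: b' V^{-1} b.
idx <- grep("^arm", names(coef(fit)))
b <- coef(fit)[idx]
V <- vcov(fit)[idx, idx]
wald_stat <- as.numeric(t(b) %*% solve(V) %*% b)
wald_p <- pchisq(wald_stat, df = length(idx), lower.tail = FALSE)
```

Changing conf_level changes only the quantile z, which is why other confidence levels are cheap to support once the model has been fitted.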

Usage

fit_logistic(
  data,
  variables = list(response = "Response", arm = "ARMCD", covariates = NULL, interaction =
    NULL, strata = NULL),
  response_definition = "response"
)

h_get_interaction_vars(fit_glm)

h_interaction_coef_name(
  interaction_vars,
  first_var_with_level,
  second_var_with_level
)

h_or_cat_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  conf_level = 0.95
)

h_or_cont_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_or_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_simple_term_labels(terms, table)

h_interaction_term_labels(terms1, terms2, table, any = FALSE)

h_glm_simple_term_extract(x, fit_glm)

h_glm_interaction_extract(x, fit_glm)

h_glm_inter_term_extract(odds_ratio_var, interaction_var, fit_glm, ...)

h_logistic_simple_terms(x, fit_glm, conf_level = 0.95)

h_logistic_inter_terms(x, fit_glm, conf_level = 0.95, at = NULL)

# S3 method for glm
tidy(fit_glm, conf_level = 0.95, at = NULL)

logistic_regression_cols(lyt, conf_level = 0.95)

logistic_summary_by_flag(flag_var)

summarize_logistic(lyt, conf_level, drop_and_remove_str = "")

Arguments

data: (data frame)
the data frame on which the model was fit.
variables: (named list of string)
list of additional analysis variables.
response_definition: (string)
the definition of what an event is in terms of response. This will be used when fitting the (conditional) logistic regression model on the left hand side of the formula.
fit_glm: logistic regression model fitted by stats::glm() with "binomial" family.
interaction_vars: (character of length 2)
interaction variable names.
first_var_with_level: (character of length 2)
the first variable name with the interaction level.
second_var_with_level: (character of length 2)
the second variable name with the interaction level.
odds_ratio_var: (string)
the odds ratio variable.
interaction_var: (string)
the interaction variable.
conf_level: (proportion)
confidence level of the interval.
at: (NULL or numeric)
optional values for the interaction variable. Otherwise the median is used.
terms: (character)
simple terms.
table: (table)
table containing numbers for terms.
terms1: (character)
terms for first dimension (rows).
terms2: (character)
terms for second dimension (columns).
any: (flag)
whether any of terms1 and terms2 can be fulfilled to count the number of patients. In that case they can only be scalar (strings).
x: (string or character)
a variable or interaction term in fit_glm (depending on the helper function).
...: additional arguments for the lower level functions.
lyt: (layout)
input layout where analyses will be added to.
flag_var: (string)
variable name identifying which row should be used in this content function.
drop_and_remove_str: (string)
string to be dropped and removed.
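As a rough illustration of the response_definition argument, the sketch below assumes that the word response inside the string acts as a placeholder that is replaced by the response column name before the string becomes the left hand side of the formula. That mechanism is an assumption made for this example, and build_lhs_formula() is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not part of the package): substitutes the response
# column name into the definition string and builds a model formula from it.
build_lhs_formula <- function(response_definition, response, rhs_terms) {
  lhs <- sub("response", response, response_definition, fixed = TRUE)
  stats::as.formula(paste(lhs, "~", paste(rhs_terms, collapse = " + ")))
}

# Default definition: an event is a response.
f1 <- build_lhs_formula("response", "Response", c("ARMCD", "AGE"))

# Alternative definition: an event is a non-response.
f2 <- build_lhs_formula("1 - response", "Response", c("ARMCD", "AGE"))
```

Under this reading, "1 - response" would model the probability of non-response without recoding the column in the data.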

Details

Note that this function may hang or raise an error for certain datasets when an old version of the survival package (< 3.2-13) is used.

Functions

fit_logistic(): Fit a (conditional) logistic regression model.
h_get_interaction_vars(): Helper function to extract interaction variable names from a fitted model assuming only one interaction term.
h_interaction_coef_name(): Helper function to get the right coefficient name from the interaction variable names and the given levels. Its main value is that it checks the order of the first and second variables against the interaction_vars input.
h_or_cat_interaction(): Helper function to calculate the odds ratio estimates for the case when both the odds ratio and the interaction variable are categorical.
h_or_cont_interaction(): Helper function to calculate the odds ratio estimates for the case when either the odds ratio or the interaction variable is continuous.
h_or_interaction(): Helper function to calculate the odds ratio estimates in case of an interaction. This is a wrapper for h_or_cont_interaction() and h_or_cat_interaction().
h_simple_term_labels(): Helper function to construct term labels from simple terms and the table of numbers of patients.
h_interaction_term_labels(): Helper function to construct term labels from interaction terms and the table of numbers of patients.
h_glm_simple_term_extract(): Helper function to tabulate the main effect results of a (conditional) logistic regression model.
h_glm_interaction_extract(): Helper function to tabulate the interaction term results of a logistic regression model.
h_glm_inter_term_extract(): Helper function to tabulate the interaction results of a logistic regression model. This is essentially a wrapper for h_or_interaction() and h_glm_simple_term_extract() which puts the results into the right data frame format.
h_logistic_simple_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of simple terms.
h_logistic_inter_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of interaction terms.
tidy(glm): Helper method (for broom::tidy()) to prepare a data frame from a glm object with binomial family.
logistic_regression_cols(): Layout creating function for a multi-variable column layout summarizing logistic regression results.
logistic_summary_by_flag(): Constructor for content functions to be used to summarize logistic regression results.
summarize_logistic(): Layout creating function which summarizes a multi-variable logistic regression.
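To make the interaction helpers more concrete, the following base R sketch shows one way an odds ratio for a categorical variable can be evaluated at a chosen value of a continuous interaction variable, falling back to the median when no value is given (as described for the at argument). The simulated data, the column names and the helper or_arm_at_age() are hypothetical; this is not the package's implementation.

```r
# Simulated data with a two-level arm and a continuous covariate.
set.seed(1)
n <- 400
dat <- data.frame(
  arm = factor(sample(c("A", "B"), n, replace = TRUE)),
  age = round(runif(n, min = 20, max = 70))
)
eta <- -1 + 0.5 * (dat$arm == "B") + 0.01 * dat$age
dat$resp <- rbinom(n, size = 1, prob = plogis(eta))

# Model with the arm-by-age interaction.
fit <- glm(resp ~ arm * age, data = dat, family = binomial())

# Hypothetical helper: odds ratio of arm B vs. arm A at a given age.
or_arm_at_age <- function(fit, at = NULL, conf_level = 0.95) {
  # Use the median of the interaction variable when `at` is not supplied.
  if (is.null(at)) at <- median(fit$model$age)
  b <- coef(fit)
  V <- vcov(fit)
  main <- "armB"
  inter <- "armB:age"
  # Log odds ratio is the main effect plus `at` times the interaction effect.
  log_or <- b[[main]] + at * b[[inter]]
  # Variance of that linear combination of two coefficients.
  se <- sqrt(V[main, main] + at^2 * V[inter, inter] + 2 * at * V[main, inter])
  z <- qnorm((1 + conf_level) / 2)
  c(
    at = at,
    odds_ratio = exp(log_or),
    lcl = exp(log_or - z * se),
    ucl = exp(log_or + z * se)
  )
}

res_median <- or_arm_at_age(fit)
res_30 <- or_arm_at_age(fit, at = 30)
```

Because the odds ratio depends on the value of the interaction variable, a single number is only meaningful together with the value at which it was evaluated, which is why at is carried along in the result.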

Note

We don't provide a function for the case when both variables are continuous because this does not arise in this table, as the treatment arm variable will always be involved and categorical.

Model Specification

The variables list needs to include the following elements:

arm: usual treatment arm variable name.
response: the response variable name. Usually this is a 0/1 variable.
covariates: this is either NULL (no covariates) or a character vector of covariate variable names.
interaction: this is either NULL (no interaction) or a string of a single covariate variable name already included in covariates. Then the interaction with the treatment arm is included in the model.
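The elements listed above can be read as a recipe for the model formula. The sketch below is one plausible translation using base R only; variables_to_formula() is a hypothetical helper written for this illustration, it ignores the strata element, and it is not claimed to match how fit_logistic() builds its formula internally.

```r
# Hypothetical helper: turn a `variables` list into a model formula.
variables_to_formula <- function(variables) {
  stopifnot(
    is.character(variables$arm), length(variables$arm) == 1,
    is.character(variables$response), length(variables$response) == 1
  )
  if (!is.null(variables$interaction)) {
    # The interaction must be a single covariate that is already in the model.
    stopifnot(
      length(variables$interaction) == 1,
      variables$interaction %in% variables$covariates
    )
  }
  rhs <- c(variables$arm, variables$covariates)
  if (!is.null(variables$interaction)) {
    # Add the interaction of the treatment arm with the chosen covariate.
    rhs <- c(rhs, paste(variables$arm, variables$interaction, sep = ":"))
  }
  stats::reformulate(rhs, response = variables$response)
}

f <- variables_to_formula(list(
  response = "Response",
  arm = "ARMCD",
  covariates = c("AGE", "RACE"),
  interaction = "AGE"
))
```

With these inputs the result is a formula with main effects for ARMCD, AGE and RACE plus the ARMCD:AGE interaction term.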

Examples

library(tern)
library(scda)
library(dplyr)
library(rtables)

adrs <- synthetic_cdisc_data("latest")$adrs
adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
  mutate(
    Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
    RACE = factor(RACE),
    SEX = factor(SEX)
  )
formatters::var_labels(adrs_f) <- c(formatters::var_labels(adrs), Response = "Response")
mod1 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE")
  )
)
mod2 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE"),
    interaction = "AGE"
  )
)
h_glm_simple_term_extract("AGE", mod1)
#>   variable variable_label term term_label interaction interaction_label
#> 1      AGE            Age  AGE        Age                              
#>   reference reference_label   estimate  std_error df    pvalue
#> 1                           0.06746155 0.05343377  1 0.2067594
#>   is_variable_summary is_term_summary
#> 1               FALSE            TRUE
h_glm_simple_term_extract("ARMCD", mod1)
#>   variable   variable_label  term               term_label interaction
#> 1    ARMCD Planned Arm Code ARM A Reference ARM A, n = 126            
#> 2    ARMCD Planned Arm Code ARM B           ARM B, n = 121            
#> 3    ARMCD Planned Arm Code ARM C           ARM C, n = 126            
#>   interaction_label reference reference_label   estimate std_error df
#> 1                                                                   2
#> 2                                              -2.154975  1.083761  1
#> 3                                             -0.1274995  1.424796  1
#>       pvalue is_variable_summary is_term_summary
#> 1 0.03677249                TRUE           FALSE
#> 2 0.04676489               FALSE            TRUE
#> 3  0.9286955               FALSE            TRUE

h_glm_interaction_extract("ARMCD:AGE", mod2)
#>    variable                        variable_label  term
#> 1 ARMCD:AGE Interaction of Planned Arm Code * Age ARM A
#> 2 ARMCD:AGE Interaction of Planned Arm Code * Age ARM B
#> 3 ARMCD:AGE Interaction of Planned Arm Code * Age ARM C
#>                 term_label interaction interaction_label reference
#> 1 Reference ARM A, n = 126                                        
#> 2           ARM B, n = 121                                        
#> 3           ARM C, n = 126                                        
#>   reference_label   estimate std_error df    pvalue is_variable_summary
#> 1                                       2 0.2306204                TRUE
#> 2                 -0.8682313 0.6298379  1 0.1680491               FALSE
#> 3                  -0.617569 0.6714418  1 0.3576953               FALSE
#>   is_term_summary
#> 1           FALSE
#> 2            TRUE
#> 3            TRUE

h_glm_inter_term_extract("AGE", "ARMCD", mod2)
#>   variable variable_label term term_label interaction interaction_label
#> 1      AGE            Age  AGE        Age                              
#> 2      AGE            Age  AGE        Age       ARMCD  Planned Arm Code
#> 3      AGE            Age  AGE        Age       ARMCD  Planned Arm Code
#> 4      AGE            Age  AGE        Age       ARMCD  Planned Arm Code
#>   reference reference_label  estimate std_error odds_ratio       lcl      ucl
#> 1                           0.8959838 0.6294275         NA        NA       NA
#> 2     ARM A           ARM A        NA        NA   2.449745 0.7134297 8.411829
#> 3     ARM B           ARM B        NA        NA   1.028141 0.9286094 1.138341
#> 4     ARM C           ARM C        NA        NA   1.321034 0.8307684 2.100622
#>   df    pvalue is_variable_summary is_term_summary is_reference_summary
#> 1  1 0.1545941               FALSE            TRUE                FALSE
#> 2 NA        NA               FALSE           FALSE                 TRUE
#> 3 NA        NA               FALSE           FALSE                 TRUE
#> 4 NA        NA               FALSE           FALSE                 TRUE

h_logistic_simple_terms("AGE", mod1)
#>   variable variable_label term term_label interaction interaction_label
#> 1      AGE            Age  AGE        Age                              
#>   reference reference_label   estimate  std_error df    pvalue
#> 1                           0.06746155 0.05343377  1 0.2067594
#>   is_variable_summary is_term_summary odds_ratio       lcl      ucl
#> 1               FALSE            TRUE   1.069789 0.9634191 1.187903
#>                     ci
#> 1 0.9634191, 1.1879033
h_logistic_inter_terms(c("RACE", "AGE", "ARMCD", "AGE:ARMCD"), mod2)
#>        variable                        variable_label                      term
#> 1          RACE                                  Race                     ASIAN
#> 2          RACE                                  Race BLACK OR AFRICAN AMERICAN
#> 3          RACE                                  Race                     WHITE
#> 13        ARMCD                      Planned Arm Code                     ARM A
#> 23        ARMCD                      Planned Arm Code                     ARM B
#> ARM B     ARMCD                      Planned Arm Code                     ARM B
#> 33        ARMCD                      Planned Arm Code                     ARM C
#> ARM C     ARMCD                      Planned Arm Code                     ARM C
#> 11          AGE                                   Age                       AGE
#> 21          AGE                                   Age                       AGE
#> 31          AGE                                   Age                       AGE
#> 4           AGE                                   Age                       AGE
#> 12    AGE:ARMCD Interaction of Planned Arm Code * Age                     ARM A
#> 22    AGE:ARMCD Interaction of Planned Arm Code * Age                     ARM B
#> 32    AGE:ARMCD Interaction of Planned Arm Code * Age                     ARM C
#>                              term_label interaction interaction_label reference
#> 1              Reference ASIAN, n = 208                                        
#> 2     BLACK OR AFRICAN AMERICAN, n = 91                                        
#> 3                         WHITE, n = 74                                        
#> 13             Reference ARM A, n = 126                                        
#> 23                       ARM B, n = 121                                        
#> ARM B                    ARM B, n = 121         AGE               Age        34
#> 33                       ARM C, n = 126                                        
#> ARM C                    ARM C, n = 126         AGE               Age        34
#> 11                                  Age                                        
#> 21                                  Age       ARMCD  Planned Arm Code     ARM A
#> 31                                  Age       ARMCD  Planned Arm Code     ARM B
#> 4                                   Age       ARMCD  Planned Arm Code     ARM C
#> 12             Reference ARM A, n = 126                                        
#> 22                       ARM B, n = 121                                        
#> 32                       ARM C, n = 126                                        
#>       reference_label   estimate std_error df    pvalue   odds_ratio
#> 1                                           2 0.6027443             
#> 2                       1.078888  1.136031  1  0.342265     2.941406
#> 3                      0.4554458 0.8997603  1 0.6127263     1.576876
#> 13                                          2 0.2890228           NA
#> 23                      20.23523   14.7874  1 0.1711835           NA
#> ARM B              34         NA        NA NA        NA 9.284028e-05
#> 33                      14.71317  16.08665  1  0.360391           NA
#> ARM C              34         NA        NA NA        NA  0.001865597
#> 11                     0.8959838 0.6294275  1 0.1545941           NA
#> 21              ARM A         NA        NA NA        NA     2.449745
#> 31              ARM B         NA        NA NA        NA     1.028141
#> 4               ARM C         NA        NA NA        NA     1.321034
#> 12                                         NA        NA           NA
#> 22                    -0.8682313 0.6298379  1 0.1680491           NA
#> 32                     -0.617569 0.6714418  1 0.3576953           NA
#>                lcl      ucl is_variable_summary is_term_summary
#> 1                                          TRUE           FALSE
#> 2        0.3173685 27.26127               FALSE            TRUE
#> 3        0.2703462  9.19761               FALSE            TRUE
#> 13              NA       NA                TRUE           FALSE
#> 23              NA       NA               FALSE            TRUE
#> ARM B 1.483171e-10 58.11413               FALSE           FALSE
#> 33              NA       NA               FALSE            TRUE
#> ARM C 1.831536e-09 1900.293               FALSE           FALSE
#> 11              NA       NA               FALSE            TRUE
#> 21       0.7134297 8.411829               FALSE           FALSE
#> 31       0.9286094 1.138341               FALSE           FALSE
#> 4        0.8307684 2.100622               FALSE           FALSE
#> 12              NA       NA                TRUE           FALSE
#> 22              NA       NA               FALSE            TRUE
#> 32              NA       NA               FALSE            TRUE
#>       is_reference_summary                         ci
#> 1                    FALSE                           
#> 2                    FALSE      0.3173685, 27.2612702
#> 3                    FALSE       0.2703462, 9.1976100
#> 13                   FALSE                     NA, NA
#> 23                   FALSE                     NA, NA
#> ARM B                 TRUE 1.483171e-10, 5.811413e+01
#> 33                   FALSE                     NA, NA
#> ARM C                 TRUE 1.831536e-09, 1.900293e+03
#> 11                   FALSE                     NA, NA
#> 21                    TRUE       0.7134297, 8.4118289
#> 31                    TRUE       0.9286094, 1.1383411
#> 4                     TRUE       0.8307684, 2.1006225
#> 12                   FALSE                     NA, NA
#> 22                   FALSE                     NA, NA
#> 32                   FALSE                     NA, NA

library(broom)
df <- tidy(mod1, conf_level = 0.99)
df2 <- tidy(mod2, conf_level = 0.99)

# Internal function - replace_emptys_with_na
if (FALSE) {
# flagging empty strings with "_"
df <- replace_emptys_with_na(df, rep_str = "_")
df2 <- replace_emptys_with_na(df2, rep_str = "_")

result1 <- basic_table() %>%
  summarize_logistic(
    conf_level = 0.95,
    drop_and_remove_str = "_"
  ) %>%
  build_table(df = df)
result1

result2 <- basic_table() %>%
  summarize_logistic(
    conf_level = 0.95,
    drop_and_remove_str = "_"
  ) %>%
  build_table(df = df2)
result2
}