Usage
s_compare(x, .ref_group, .in_ref_col, ...)

# S3 method for numeric
s_compare(x, .ref_group, .in_ref_col, ...)

# S3 method for factor
s_compare(
  x,
  .ref_group,
  .in_ref_col,
  denom = "n",
  na.rm = TRUE,
  na_level = "<Missing>",
  ...
)

# S3 method for character
s_compare(
  x,
  .ref_group,
  .in_ref_col,
  denom = "n",
  na.rm = TRUE,
  na_level = "<Missing>",
  .var,
  verbose = TRUE,
  ...
)

# S3 method for logical
s_compare(x, .ref_group, .in_ref_col, na.rm = TRUE, denom = "n", ...)

a_compare(x, .ref_group, .in_ref_col, ..., .var)

# S3 method for numeric
a_compare(x, .ref_group, .in_ref_col, ...)

# S3 method for factor
a_compare(
  x,
  .ref_group,
  .in_ref_col,
  denom = "n",
  na.rm = TRUE,
  na_level = "<Missing>",
  ...
)

# S3 method for character
a_compare(
  x,
  .ref_group,
  .in_ref_col,
  denom = "n",
  na.rm = TRUE,
  na_level = "<Missing>",
  .var,
  verbose = TRUE,
  ...
)

# S3 method for logical
a_compare(x, .ref_group, .in_ref_col, na.rm = TRUE, denom = "n", ...)

compare_vars(
  lyt,
  vars,
  var_labels = vars,
  nested = TRUE,
  ...,
  show_labels = "default",
  table_names = vars,
  .stats = c("n", "mean_sd", "count_fraction", "pval"),
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)
Arguments
- x
(numeric) vector of numbers we want to analyze.
- .ref_group
(data.frame or vector) the data corresponding to the reference group.
- .in_ref_col
(logical) TRUE when working with the reference level, FALSE otherwise.
- ...
arguments passed to s_compare().
- denom
(string) choice of denominator for factor proportions; can only be n (number of values in this row and column intersection).
- na.rm
(flag) whether NA values should be removed from x prior to analysis.
- na_level
(string) used to replace all NA or empty values in factors with a custom string.
- .var
(string) single variable name that is passed by rtables when requested by a statistics function.
- verbose
(logical) whether warnings and messages should be printed. Mainly used to print out information about factor casting. Defaults to TRUE.
- lyt
(layout) input layout where analyses will be added to.
- vars
(character) variable names for the primary analysis variable to be iterated over.
- var_labels
(character) labels for the variables in vars; defaults to vars.
- nested
(flag) whether this layout instruction should be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.
- show_labels
(string) label visibility: one of "default", "visible" and "hidden".
- table_names
(character) this can be customized in case the same vars are analyzed multiple times, to avoid warnings from rtables.
- .stats
(character) statistics to select for the table.
- .formats
(named character or list) formats for the statistics.
- .labels
(named character) labels for the statistics (without indent).
- .indent_mods
(named integer) indent modifiers for the labels.
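The denom = "n" choice described for factor proportions can be illustrated with a small base R sketch. It is only an illustration of the arithmetic (count divided by the number of values in the cell), not the package's implementation; the data and the helper names counts and count_fraction are made up for the example.

```r
# Hypothetical data: the values of a factor variable in one table cell.
x <- factor(c("a", "a", "b", "c", "a"))

# denom = "n": the denominator is the number of values in this cell.
n <- length(x)

# Count per level, then pair each count with its fraction of n.
counts <- vapply(levels(x), function(lvl) sum(x == lvl), integer(1))
count_fraction <- lapply(counts, function(k) c(k, k / n))

# Count and fraction for level "a": 3 out of 5 values.
count_fraction$a
```

The result for level a is the pair (3, 0.6), the same count and fraction reported under $count_fraction$a in the factor example further down.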
Value
s_compare() returns output of s_summary() and comparisons versus the reference group in the form of p-values.
a_compare() returns the corresponding list with formatted rtables::CellValue().
compare_vars() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_compare() to the table layout.
Functions
- s_compare(): S3 generic function to produce a comparison summary.
- s_compare(numeric): Method for numeric class. This uses the standard t-test to calculate the p-value.
- s_compare(factor): Method for factor class. This uses the chi-squared test to calculate the p-value.
- s_compare(character): Method for character class. This makes an automatic conversion to factor (with a warning) and then forwards to the method for factors.
- s_compare(logical): Method for logical class. A chi-squared test is used. If missing values are not removed, then they are counted as FALSE.
- a_compare(): Formatted analysis function which is used as afun in compare_vars().
- a_compare(numeric): Formatted analysis function method for numeric class.
- a_compare(factor): Formatted analysis function method for factor class.
- a_compare(character): Formatted analysis function method for character class.
- a_compare(logical): Formatted analysis function method for logical class.
- compare_vars(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
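The tests named in the method descriptions can be sketched in base R. This is a stand-in under stated assumptions, not the package's own code: it assumes the default Welch variant of stats::t.test() for numeric data, a chi-squared test on the table of counts per group for factors, and no continuity correction for the 2 x 2 logical case. For the factor and logical inputs, the sketch gives the same p-values as those printed in the Examples for the same data, which is the only support for these assumptions on this page.

```r
# Base R stand-ins for the tests named above; not the package's own code.

# numeric: two-sample t-test (stats::t.test() defaults to the Welch variant).
set.seed(1)
x_num <- rnorm(10, mean = 5, sd = 1)
ref_num <- rnorm(5, mean = -5, sd = 1)
p_num <- stats::t.test(x_num, ref_num)$p.value

# factor: chi-squared test on the 2 x k table of counts per group.
x_fct <- factor(c("a", "a", "b", "c", "a"))
ref_fct <- factor(c("a", "b", "c"))
tab_fct <- rbind(x = table(x_fct), ref = table(ref_fct))
# Small expected counts trigger an approximation warning, silenced here.
p_fct <- suppressWarnings(stats::chisq.test(tab_fct)$p.value)

# logical: with na.rm = FALSE, NA is counted as FALSE before tabulating.
count_lgl <- function(v) {
  v[is.na(v)] <- FALSE
  c(`FALSE` = sum(!v), `TRUE` = sum(v))
}
tab_lgl <- rbind(
  x = count_lgl(c(NA, TRUE, FALSE)),
  ref = count_lgl(c(NA, NA, NA, NA, FALSE))
)
# Assumption: no continuity correction on the 2 x 2 table.
p_lgl <- suppressWarnings(
  stats::chisq.test(tab_lgl, correct = FALSE)$p.value
)
```

Here p_fct is about 0.7659 and p_lgl about 0.1675, in line with the s_compare.factor basic usage example and the s_compare.logical example with na.rm = FALSE.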
Note
- For factor variables, denom for factor proportions can only be n, since the purpose is to compare proportions between columns and a row-based proportion would therefore not make sense. A proportion based on N_col would be difficult since we use counts for the chi-squared test statistic, so missing values should be accounted for as explicit factor levels.
- For character variables, automatic conversion to factor does not guarantee that the table will be generated correctly. In particular, this is very likely to fail for sparse tables. It is therefore always better to manually convert character variables to factors during pre-processing.
- For compare_vars(), the column split must define a reference group via ref_group so that the comparison is well defined.
- When factor variables contain NA, it is expected that NA values have been conveyed to na_level appropriately beforehand via df_explicit_na().
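The pre-processing recommended in these notes can be sketched in base R: convert a character variable to a factor with a fixed set of levels and turn NA into an explicit level. This is a minimal stand-in for illustration, not a replacement for df_explicit_na(); the vector raw and the level set are hypothetical.

```r
# Hypothetical raw character column with one missing value; level "c" is
# valid but happens not to occur in this subset (a sparse cell).
raw <- c("a", "a", "b", NA)
na_level <- "<Missing>"

# Make missingness explicit before converting to factor.
raw[is.na(raw)] <- na_level

# Fix the full level set up front so every group shares the same levels.
f <- factor(raw, levels = c("a", "b", "c", na_level))

table(f)
```

Fixing the levels during pre-processing keeps the unobserved level c in the table with a count of zero, instead of leaving the level set to whatever values happen to appear in each group.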
See also
Relevant constructor function create_afun_compare(), and s_summary() which is used internally to compute a summary within s_compare().
Examples
# `s_compare.numeric`
## Usual case where both this and the reference group vector have more than 1 value.
s_compare(rnorm(10, 5, 1), .ref_group = rnorm(5, -5, 1), .in_ref_col = FALSE)
#> $n
#> n
#> 10
#>
#> $sum
#> sum
#> 48.72609
#>
#> $mean
#> mean
#> 4.872609
#>
#> $sd
#> sd
#> 1.197012
#>
#> $se
#> se
#> 0.3785283
#>
#> $mean_sd
#> mean sd
#> 4.872609 1.197012
#>
#> $mean_se
#> mean se
#> 4.8726087 0.3785283
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> 4.016318 5.728899
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> 4.494080 5.251137
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> 3.675597 6.069620
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> 4.225699e-07
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> 4.670033
#>
#> $mad
#> mad
#> 4.440892e-16
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> 3.926495 6.356511
#> attr(,"conf_level")
#> [1] 0.9785156
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> 4.012884 5.887465
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> 1.874581
#>
#> $range
#> min max
#> 2.933284 6.649337
#>
#> $min
#> min
#> 2.933284
#>
#> $max
#> max
#> 6.649337
#>
#> $median_range
#> median min max
#> 4.670033 2.933284 6.649337
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> 24.56614
#>
#> $geom_mean
#> geom_mean
#> 4.735802
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> 3.945209 5.684826
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> 25.95445
#>
#> $pval
#> [1] 2.574683e-06
#>
## If one group has no more than one value, the p-value is not calculated.
s_compare(rnorm(10, 5, 1), .ref_group = 1, .in_ref_col = FALSE)
#> $n
#> n
#> 10
#>
#> $sum
#> sum
#> 47.26309
#>
#> $mean
#> mean
#> 4.726309
#>
#> $sd
#> sd
#> 1.806277
#>
#> $se
#> se
#> 0.571195
#>
#> $mean_sd
#> mean sd
#> 4.726309 1.806277
#>
#> $mean_se
#> mean se
#> 4.726309 0.571195
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> 3.434176 6.018441
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> 4.155114 5.297504
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> 2.920031 6.532586
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> 1.688957e-05
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> 4.721137
#>
#> $mad
#> mad
#> 0
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> 3.575542 5.772262
#> attr(,"conf_level")
#> [1] 0.9785156
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> 4.042933 5.380980
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> 1.338047
#>
#> $range
#> min max
#> 1.238997 8.354703
#>
#> $min
#> min
#> 1.238997
#>
#> $max
#> max
#> 8.354703
#>
#> $median_range
#> median min max
#> 4.721137 1.238997 8.354703
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> 38.2175
#>
#> $geom_mean
#> geom_mean
#> 4.322709
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> 3.027874 6.171264
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> 53.01524
#>
#> $pval
#> character(0)
#>
## Empty numeric does not fail, it returns NA-filled items and no p-value.
s_compare(numeric(), .ref_group = numeric(), .in_ref_col = FALSE)
#> $n
#> n
#> 0
#>
#> $sum
#> sum
#> NA
#>
#> $mean
#> mean
#> NA
#>
#> $sd
#> sd
#> NA
#>
#> $se
#> se
#> NA
#>
#> $mean_sd
#> mean sd
#> NA NA
#>
#> $mean_se
#> mean se
#> NA NA
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> NA
#>
#> $mad
#> mad
#> NA
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> NA NA
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> NA
#>
#> $range
#> min max
#> NA NA
#>
#> $min
#> min
#> NA
#>
#> $max
#> max
#> NA
#>
#> $median_range
#> median min max
#> NA NA NA
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> NA
#>
#> $geom_mean
#> geom_mean
#> NaN
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> NA
#>
#> $pval
#> character(0)
#>
# `s_compare.factor`
## Basic usage:
x <- factor(c("a", "a", "b", "c", "a"))
y <- factor(c("a", "b", "c"))
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.7659283
#>
## Management of NA values.
x <- explicit_na(factor(c("a", "a", "b", "c", "a", NA, NA)))
y <- explicit_na(factor(c("a", "b", "c", NA)))
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE, na.rm = TRUE)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.7659283
#>
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE, na.rm = FALSE)
#> $n
#> [1] 7
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#> $count$`<Missing>`
#> [1] 2
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0000000 0.4285714
#>
#> $count_fraction$b
#> [1] 1.0000000 0.1428571
#>
#> $count_fraction$c
#> [1] 1.0000000 0.1428571
#>
#> $count_fraction$`<Missing>`
#> [1] 2.0000000 0.2857143
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.9063036
#>
# `s_compare.character`
## Basic usage:
x <- c("a", "a", "b", "c", "a")
y <- c("a", "b", "c")
s_compare(x, .ref_group = y, .in_ref_col = FALSE, .var = "x", verbose = FALSE)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.7659283
#>
## Note that missing values handling can make a large difference:
x <- c("a", "a", "b", "c", "a", NA)
y <- c("a", "b", "c", rep(NA, 20))
s_compare(x,
  .ref_group = y, .in_ref_col = FALSE,
  .var = "x", verbose = FALSE
)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.7659283
#>
s_compare(x,
  .ref_group = y, .in_ref_col = FALSE, .var = "x",
  na.rm = FALSE, verbose = FALSE
)
#> $n
#> [1] 6
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#> $count$`<Missing>`
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.5
#>
#> $count_fraction$b
#> [1] 1.0000000 0.1666667
#>
#> $count_fraction$c
#> [1] 1.0000000 0.1666667
#>
#> $count_fraction$`<Missing>`
#> [1] 1.0000000 0.1666667
#>
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.005768471
#>
# `s_compare.logical`
## Basic usage:
x <- c(TRUE, FALSE, TRUE, TRUE)
y <- c(FALSE, FALSE, TRUE)
s_compare(x, .ref_group = y, .in_ref_col = FALSE)
#> $n
#> [1] 4
#>
#> $count
#> [1] 3
#>
#> $count_fraction
#> [1] 3.00 0.75
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.2702894
#>
## Management of NA values.
x <- c(NA, TRUE, FALSE)
y <- c(NA, NA, NA, NA, FALSE)
s_compare(x, .ref_group = y, .in_ref_col = FALSE, na.rm = TRUE)
#> $n
#> [1] 2
#>
#> $count
#> [1] 1
#>
#> $count_fraction
#> [1] 1.0 0.5
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.3864762
#>
s_compare(x, .ref_group = y, .in_ref_col = FALSE, na.rm = FALSE)
#> $n
#> [1] 3
#>
#> $count
#> [1] 1
#>
#> $count_fraction
#> [1] 1.0000000 0.3333333
#>
#> $n_blq
#> [1] 0
#>
#> $pval
#> [1] 0.1675463
#>
# `a_compare.numeric`
a_compare(
  rnorm(10, 5, 1),
  .ref_group = rnorm(20, -5, 1),
  .in_ref_col = FALSE,
  .var = "bla"
)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod
#> 1 n 10 0
#> 2 sum 45.8 0
#> 3 mean 4.6 0
#> 4 sd 0.9 0
#> 5 se 0.3 0
#> 6 mean_sd 4.6 (0.9) 0
#> 7 mean_se 4.6 (0.3) 0
#> 8 mean_ci (3.92, 5.23) 0
#> 9 mean_sei (4.28, 4.87) 0
#> 10 mean_sdi (3.66, 5.49) 0
#> 11 mean_pval 0.00 0
#> 12 median 4.5 0
#> 13 mad 0.0 0
#> 14 median_ci (3.93, 5.53) 0
#> 15 quantiles 4.1 - 4.9 0
#> 16 iqr 0.8 0
#> 17 range 2.8 - 6.2 0
#> 18 min 2.8 0
#> 19 max 6.2 0
#> 20 median_range 4.5 (2.8 - 6.2) 0
#> 21 cv 20.1 0
#> 22 geom_mean 4.5 0
#> 23 geom_mean_ci 3.85510950976518, 5.222938736613 0
#> 24 geom_cv 21.5 0
#> 25 pval <0.0001 0
#> row_label
#> 1 n
#> 2 Sum
#> 3 Mean
#> 4 SD
#> 5 SE
#> 6 Mean (SD)
#> 7 Mean (SE)
#> 8 Mean 95% CI
#> 9 Mean -/+ 1xSE
#> 10 Mean -/+ 1xSD
#> 11 Mean p-value (H0: mean = 0)
#> 12 Median
#> 13 Median Absolute Deviation
#> 14 Median 95% CI
#> 15 25% and 75%-ile
#> 16 IQR
#> 17 Min - Max
#> 18 Minimum
#> 19 Maximum
#> 20 Median (Min - Max)
#> 21 CV (%)
#> 22 Geometric Mean
#> 23 Geometric Mean 95% CI
#> 24 CV % Geometric Mean
#> 25 p-value (t-test)
# `a_compare.factor`
# We need to ungroup `count` and `count_fraction` first so that the `rtables` formatting
# functions can be applied correctly.
afun <- make_afun(
  getS3method("a_compare", "factor"),
  .ungroup_stats = c("count", "count_fraction")
)
x <- factor(c("a", "a", "b", "c", "a"))
y <- factor(c("a", "a", "b", "c"))
afun(x, .ref_group = y, .in_ref_col = FALSE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 5 0 n
#> 2 a 3 0 a
#> 3 b 1 0 b
#> 4 c 1 0 c
#> 5 a 3 (60%) 0 a
#> 6 b 1 (20%) 0 b
#> 7 c 1 (20%) 0 c
#> 8 n_blq 0 0 n_blq
#> 9 pval 0.9560 0 p-value (chi-squared test)
# `a_compare.character`
afun <- make_afun(
  getS3method("a_compare", "character"),
  .ungroup_stats = c("count", "count_fraction")
)
x <- c("A", "B", "A", "C")
y <- c("B", "A", "C")
afun(x, .ref_group = y, .in_ref_col = FALSE, .var = "x", verbose = FALSE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 4 0 n
#> 2 A 2 0 A
#> 3 B 1 0 B
#> 4 C 1 0 C
#> 5 A 2 (50%) 0 A
#> 6 B 1 (25%) 0 B
#> 7 C 1 (25%) 0 C
#> 8 n_blq 0 0 n_blq
#> 9 pval 0.9074 0 p-value (chi-squared test)
# `a_compare.logical`
afun <- make_afun(
  getS3method("a_compare", "logical")
)
x <- c(TRUE, FALSE, FALSE, TRUE, TRUE)
y <- c(TRUE, FALSE)
afun(x, .ref_group = y, .in_ref_col = FALSE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 5 0 n
#> 2 count 3 0 count
#> 3 count_fraction 3 (60%) 0 count_fraction
#> 4 n_blq 0 0 n_blq
#> 5 pval 0.8091 0 p-value (chi-squared test)
# `compare_vars()` in `rtables` pipelines
## Default output within a `rtables` pipeline.
lyt <- basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM B") %>%
  compare_vars(c("AGE", "SEX"))
build_table(lyt, tern_ex_adsl)
#>                                  ARM B        ARM A        ARM C
#> ———————————————————————————————————————————————————————————————————
#> AGE
#>   n                                73           69           58
#>   Mean (SD)                    35.8 (7.1)   34.1 (6.8)   36.1 (7.4)
#>   p-value (t-test)                            0.1446       0.8212
#> SEX
#>   n                                73           69           58
#>   F                            40 (54.8%)   38 (55.1%)   32 (55.2%)
#>   M                            33 (45.2%)   31 (44.9%)   26 (44.8%)
#>   p-value (chi-squared test)                  1.0000       1.0000
## Select and format statistics output.
lyt <- basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM C") %>%
  compare_vars(
    vars = "AGE",
    .stats = c("mean_sd", "pval"),
    .formats = c(mean_sd = "xx.x, xx.x"),
    .labels = c(mean_sd = "Mean, SD")
  )
build_table(lyt, df = tern_ex_adsl)
#>                      ARM C       ARM A       ARM B
#> ————————————————————————————————————————————————————
#> Mean, SD           36.1, 7.4   34.1, 6.8   35.8, 7.1
#> p-value (t-test)                0.1176      0.8212