Get default statistical methods and their associated formats, labels, and indent modifiers
Source: R/utils_default_stats_formats_labels.R
default_stats_formats_labels.Rd
Utility functions to get valid statistic methods for different method groups (.stats) and their associated formats (.formats), labels (.labels), and indent modifiers (.indent_mods). This utility is used across tern, but some of its working principles can be seen in analyze_vars(). See notes to understand why this is experimental.
Usage
get_stats(
method_groups = "analyze_vars_numeric",
stats_in = NULL,
add_pval = FALSE
)
get_formats_from_stats(stats, formats_in = NULL)
get_labels_from_stats(stats, labels_in = NULL, row_nms = NULL)
get_indents_from_stats(stats, indents_in = NULL, row_nms = NULL)
tern_default_stats
tern_default_formats
tern_default_labels
summary_formats(type = "numeric", include_pval = FALSE)
summary_labels(type = "numeric", include_pval = FALSE)
Format
tern_default_stats is a named list of available statistics, with each element named for its corresponding statistical method group.
tern_default_formats is a named vector of available default formats, with each element named for its corresponding statistic.
tern_default_labels is a named character vector of available default labels, with each element named for its corresponding statistic.
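As a quick orientation, the three exported objects can be inspected directly. The sketch below assumes the tern version documented here is attached; the statistic names used for lookup (n, mean_sd) are taken from the example output on this page.

```r
library(tern)

# Method groups for which default statistics are registered
names(tern_default_stats)

# Default statistics of one method group
tern_default_stats[["analyze_vars_numeric"]]

# Default format and label of single statistics, looked up by name
tern_default_formats[["n"]]
tern_default_labels[["mean_sd"]]
```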
Arguments
- method_groups (character)
  indicates the statistical method group (tern analyze function) to retrieve default statistics for. A character vector can be used to specify more than one statistical method group.
- stats_in (character)
  statistics to retrieve for the selected method group.
- add_pval (flag)
  should "pval" (or "pval_counts" if method_groups contains "analyze_vars_counts") be added to the statistical methods?
- stats (character)
  statistical methods to get defaults for.
- formats_in (named vector)
  inserted formats to replace defaults. It can be a character vector from formatters::list_valid_format_labels() or a custom format function.
- labels_in (named character)
  inserted labels to replace defaults.
- row_nms (character)
  row names. Levels of a factor or character variable, each of which the statistics in .stats will be calculated for. If this parameter is set, these variable levels will be used as the defaults, and the names of the given custom values should correspond to levels (or have format statistic.level) instead of statistics. Can also be variable names if rows correspond to different variables instead of levels. Defaults to NULL.
- indents_in (named vector)
  inserted indent modifiers to replace defaults (default is 0L).
- type (string)
  "numeric" or "counts".
- include_pval (flag)
  same as the add_pval argument in get_stats().
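To make the interplay of method_groups, stats_in, and add_pval concrete, here is a minimal sketch. It assumes the signatures shown under Usage; the selected statistics (n, mean_sd) are only illustrative picks from the numeric defaults.

```r
library(tern)

# Restrict the numeric defaults to a few statistics
sel <- get_stats("analyze_vars_numeric", stats_in = c("n", "mean_sd"))

# Append the p-value statistic that matches the method group
num_with_pval <- get_stats("analyze_vars_numeric", add_pval = TRUE)
cnt_with_pval <- get_stats("analyze_vars_counts", add_pval = TRUE)

# Combine several method groups in one call
both <- get_stats(c("count_occurrences", "analyze_vars_counts"))
```

Printing num_with_pval and cnt_with_pval side by side shows which p-value name is appended for each method group.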
Value
get_stats() returns a character vector of statistical methods.
get_formats_from_stats() returns a named vector of formats (if present in either tern_default_formats or formats_in, otherwise NULL). Values can be taken from formatters::list_valid_format_labels() or a custom function (e.g. formatting_functions).
get_labels_from_stats() returns a named character vector of labels (if present in either tern_default_labels or labels_in, otherwise NULL).
get_indents_from_stats() returns a single indent modifier value to apply to all rows or a named numeric vector of indent modifiers (if present, otherwise NULL).
summary_formats() returns a named vector of default statistic formats for the given data type.
summary_labels() returns a named vector of default statistic labels for the given data type.
Details
Current choices for type are counts and numeric for analyze_vars() and affect get_stats().
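The relationship between type and the method group can be checked directly. Judging from the example output on this page, summary_formats() and summary_labels() return the same values as the get_*_from_stats() helpers applied to the matching analyze_vars_* group; the sketch below relies on that observed equivalence rather than on documented internals.

```r
library(tern)

num_stats <- get_stats("analyze_vars_numeric")

# type = "numeric" corresponds to the "analyze_vars_numeric" method group
same_formats <- identical(
  summary_formats(type = "numeric"),
  get_formats_from_stats(num_stats)
)
same_labels <- identical(
  summary_labels(type = "numeric"),
  get_labels_from_stats(num_stats)
)

# type = "counts" with include_pval = TRUE adds the counts p-value statistic
cnt_label_names <- names(summary_labels(type = "counts", include_pval = TRUE))
```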
Functions
- get_stats(): Get statistics available for a given method group (analyze function).
- get_formats_from_stats(): Get formats corresponding to a list of statistics.
- get_labels_from_stats(): Get labels corresponding to a list of statistics.
- get_indents_from_stats(): Format indent modifiers for a given vector/list of statistics.
- tern_default_stats: Named list of available statistics by method group for tern.
- tern_default_formats: Named vector of default formats for tern.
- tern_default_labels: Named character vector of default labels for tern.
- summary_formats(): Quick function to retrieve default formats for summary statistics: analyze_vars() and analyze_vars_in_cols() principally.
- summary_labels(): Quick function to retrieve default labels for summary statistics. Returns labels of descriptive statistics which are understood by rtables. Similar to summary_formats().
Note
These defaults are experimental because we use the names of functions to retrieve the default statistics. This should be generalized in groups of methods according to more reasonable groupings.
Formats in tern and rtables can be functions that take in the table cell value and return a string. This is well documented in vignette("custom_appearance", package = "rtables").
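A custom format function in this sense might look like the sketch below. The name fmt_two_dp is hypothetical and not part of tern or rtables; it only follows the convention described above of taking the cell value (plus ...) and returning a string. Passing it through formats_in as a named list is assumed to work in the same way as the list-based examples on this page.

```r
library(tern)

# Hypothetical custom format: render a single numeric value with two decimals
fmt_two_dp <- function(x, ...) {
  sprintf("%.2f", x)
}

# Replace the default format of "mean"; other statistics keep their defaults
fmts <- get_formats_from_stats(
  c("n", "mean"),
  formats_in = list(mean = fmt_two_dp)
)

fmts$n
fmt_two_dp(3.14159)
```

A list is used for formats_in here because a function cannot be stored in a character vector.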
Examples
# analyze_vars is numeric
num_stats <- get_stats("analyze_vars_numeric") # also the default
# Other type
cnt_stats <- get_stats("analyze_vars_counts")
# Taking only the p-value from count_occurrences (unusual, but possible)
only_pval <- get_stats("count_occurrences", add_pval = TRUE, stats_in = "pval")
# All count_occurrences
all_cnt_occ <- get_stats("count_occurrences")
# Multiple
get_stats(c("count_occurrences", "analyze_vars_counts"))
#> [1] "count" "count_fraction"
#> [3] "count_fraction_fixed_dp" "fraction"
#> [5] "n" "n_blq"
# Default formats
get_formats_from_stats(num_stats)
#> $n
#> [1] "xx."
#>
#> $sum
#> [1] "xx.x"
#>
#> $mean
#> [1] "xx.x"
#>
#> $sd
#> [1] "xx.x"
#>
#> $se
#> [1] "xx.x"
#>
#> $mean_sd
#> [1] "xx.x (xx.x)"
#>
#> $mean_se
#> [1] "xx.x (xx.x)"
#>
#> $mean_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_sei
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_sdi
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_pval
#> [1] "x.xxxx | (<0.0001)"
#>
#> $median
#> [1] "xx.x"
#>
#> $mad
#> [1] "xx.x"
#>
#> $median_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $quantiles
#> [1] "xx.x - xx.x"
#>
#> $iqr
#> [1] "xx.x"
#>
#> $range
#> [1] "xx.x - xx.x"
#>
#> $min
#> [1] "xx.x"
#>
#> $max
#> [1] "xx.x"
#>
#> $median_range
#> [1] "xx.x (xx.x - xx.x)"
#>
#> $cv
#> [1] "xx.x"
#>
#> $geom_mean
#> [1] "xx.x"
#>
#> $geom_mean_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $geom_cv
#> [1] "xx.x"
#>
get_formats_from_stats(cnt_stats)
#> $n
#> [1] "xx."
#>
#> $count
#> [1] "xx."
#>
#> $count_fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else {
#> paste0(x[1], " (", round(x[2] * 100, 1), "%)")
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $count_fraction_fixed_dp
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else if (.is_equal_float(x[2], 1)) {
#> sprintf("%d (100%%)", x[1])
#> } else {
#> sprintf("%d (%.1f%%)", x[1], x[2] * 100)
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $n_blq
#> [1] "xx."
#>
get_formats_from_stats(only_pval)
#> $pval
#> [1] "x.xxxx | (<0.0001)"
#>
get_formats_from_stats(all_cnt_occ)
#> $count
#> [1] "xx."
#>
#> $count_fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else {
#> paste0(x[1], " (", round(x[2] * 100, 1), "%)")
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $count_fraction_fixed_dp
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else if (.is_equal_float(x[2], 1)) {
#> sprintf("%d (100%%)", x[1])
#> } else {
#> sprintf("%d (%.1f%%)", x[1], x[2] * 100)
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#> checkmate::assert_vector(x)
#> checkmate::assert_count(x["num"])
#> checkmate::assert_count(x["denom"])
#>
#> result <- if (x["num"] == 0) {
#> paste0(x["num"], "/", x["denom"])
#> } else {
#> paste0(
#> x["num"], "/", x["denom"],
#> " (", sprintf("%.1f", round(x["num"] / x["denom"] * 100, 1)), "%)"
#> )
#> }
#> return(result)
#> }
#> <environment: namespace:tern>
#>
# Adding custom formats
get_formats_from_stats(all_cnt_occ, formats_in = c("fraction" = c("xx")))
#> $count
#> [1] "xx."
#>
#> $count_fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else {
#> paste0(x[1], " (", round(x[2] * 100, 1), "%)")
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $count_fraction_fixed_dp
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else if (.is_equal_float(x[2], 1)) {
#> sprintf("%d (100%%)", x[1])
#> } else {
#> sprintf("%d (%.1f%%)", x[1], x[2] * 100)
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $fraction
#> [1] "xx"
#>
get_formats_from_stats(all_cnt_occ, formats_in = list("fraction" = c("xx.xx", "xx")))
#> $count
#> [1] "xx."
#>
#> $count_fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else {
#> paste0(x[1], " (", round(x[2] * 100, 1), "%)")
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $count_fraction_fixed_dp
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else if (.is_equal_float(x[2], 1)) {
#> sprintf("%d (100%%)", x[1])
#> } else {
#> sprintf("%d (%.1f%%)", x[1], x[2] * 100)
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $fraction
#> [1] "xx.xx" "xx"
#>
# Default labels
get_labels_from_stats(num_stats)
#> n sum
#> "n" "Sum"
#> mean sd
#> "Mean" "SD"
#> se mean_sd
#> "SE" "Mean (SD)"
#> mean_se mean_ci
#> "Mean (SE)" "Mean 95% CI"
#> mean_sei mean_sdi
#> "Mean -/+ 1xSE" "Mean -/+ 1xSD"
#> mean_pval median
#> "Mean p-value (H0: mean = 0)" "Median"
#> mad median_ci
#> "Median Absolute Deviation" "Median 95% CI"
#> quantiles iqr
#> "25% and 75%-ile" "IQR"
#> range min
#> "Min - Max" "Minimum"
#> max median_range
#> "Maximum" "Median (Min - Max)"
#> cv geom_mean
#> "CV (%)" "Geometric Mean"
#> geom_mean_ci geom_cv
#> "Geometric Mean 95% CI" "CV % Geometric Mean"
get_labels_from_stats(cnt_stats)
#> n count count_fraction
#> "n" "count" "count_fraction"
#> count_fraction_fixed_dp n_blq
#> "count_fraction" "n_blq"
get_labels_from_stats(only_pval)
#> pval
#> "p-value (t-test)"
get_labels_from_stats(all_cnt_occ)
#> count count_fraction count_fraction_fixed_dp
#> "count" "count_fraction" "count_fraction"
#> fraction
#> "fraction"
# Adding custom labels
get_labels_from_stats(all_cnt_occ, labels_in = c("fraction" = "Fraction"))
#> count count_fraction count_fraction_fixed_dp
#> "count" "count_fraction" "count_fraction"
#> fraction
#> "Fraction"
get_labels_from_stats(all_cnt_occ, labels_in = list("fraction" = c("Some more fractions")))
#> $count
#> [1] "count"
#>
#> $count_fraction
#> [1] "count_fraction"
#>
#> $count_fraction_fixed_dp
#> [1] "count_fraction"
#>
#> $fraction
#> [1] "Some more fractions"
#>
get_indents_from_stats(all_cnt_occ, indents_in = 3L)
#> [1] 3 3 3 3
get_indents_from_stats(all_cnt_occ, indents_in = list(count = 2L, count_fraction = 5L))
#> $count
#> [1] 2
#>
#> $count_fraction
#> [1] 5
#>
#> $count_fraction_fixed_dp
#> [1] 0
#>
#> $fraction
#> [1] 0
#>
get_indents_from_stats(
all_cnt_occ,
indents_in = list(a = 2L, count.a = 1L, count.b = 5L), row_nms = c("a", "b")
)
#> $count.a
#> [1] 1
#>
#> $count.b
#> [1] 5
#>
#> $count_fraction.a
#> [1] 2
#>
#> $count_fraction.b
#> [1] 0
#>
#> $count_fraction_fixed_dp.a
#> [1] 2
#>
#> $count_fraction_fixed_dp.b
#> [1] 0
#>
#> $fraction.a
#> [1] 2
#>
#> $fraction.b
#> [1] 0
#>
summary_formats()
#> $n
#> [1] "xx."
#>
#> $sum
#> [1] "xx.x"
#>
#> $mean
#> [1] "xx.x"
#>
#> $sd
#> [1] "xx.x"
#>
#> $se
#> [1] "xx.x"
#>
#> $mean_sd
#> [1] "xx.x (xx.x)"
#>
#> $mean_se
#> [1] "xx.x (xx.x)"
#>
#> $mean_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_sei
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_sdi
#> [1] "(xx.xx, xx.xx)"
#>
#> $mean_pval
#> [1] "x.xxxx | (<0.0001)"
#>
#> $median
#> [1] "xx.x"
#>
#> $mad
#> [1] "xx.x"
#>
#> $median_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $quantiles
#> [1] "xx.x - xx.x"
#>
#> $iqr
#> [1] "xx.x"
#>
#> $range
#> [1] "xx.x - xx.x"
#>
#> $min
#> [1] "xx.x"
#>
#> $max
#> [1] "xx.x"
#>
#> $median_range
#> [1] "xx.x (xx.x - xx.x)"
#>
#> $cv
#> [1] "xx.x"
#>
#> $geom_mean
#> [1] "xx.x"
#>
#> $geom_mean_ci
#> [1] "(xx.xx, xx.xx)"
#>
#> $geom_cv
#> [1] "xx.x"
#>
summary_formats(type = "counts", include_pval = TRUE)
#> $n
#> [1] "xx."
#>
#> $count
#> [1] "xx."
#>
#> $count_fraction
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else {
#> paste0(x[1], " (", round(x[2] * 100, 1), "%)")
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $count_fraction_fixed_dp
#> function(x, ...) {
#> attr(x, "label") <- NULL
#>
#> if (any(is.na(x))) {
#> return("NA")
#> }
#>
#> checkmate::assert_vector(x)
#> checkmate::assert_integerish(x[1])
#> assert_proportion_value(x[2], include_boundaries = TRUE)
#>
#> result <- if (x[1] == 0) {
#> "0"
#> } else if (.is_equal_float(x[2], 1)) {
#> sprintf("%d (100%%)", x[1])
#> } else {
#> sprintf("%d (%.1f%%)", x[1], x[2] * 100)
#> }
#>
#> return(result)
#> }
#> <environment: namespace:tern>
#>
#> $n_blq
#> [1] "xx."
#>
#> $pval_counts
#> [1] "x.xxxx | (<0.0001)"
#>
summary_labels()
#> n sum
#> "n" "Sum"
#> mean sd
#> "Mean" "SD"
#> se mean_sd
#> "SE" "Mean (SD)"
#> mean_se mean_ci
#> "Mean (SE)" "Mean 95% CI"
#> mean_sei mean_sdi
#> "Mean -/+ 1xSE" "Mean -/+ 1xSD"
#> mean_pval median
#> "Mean p-value (H0: mean = 0)" "Median"
#> mad median_ci
#> "Median Absolute Deviation" "Median 95% CI"
#> quantiles iqr
#> "25% and 75%-ile" "IQR"
#> range min
#> "Min - Max" "Minimum"
#> max median_range
#> "Maximum" "Median (Min - Max)"
#> cv geom_mean
#> "CV (%)" "Geometric Mean"
#> geom_mean_ci geom_cv
#> "Geometric Mean 95% CI" "CV % Geometric Mean"
summary_labels(type = "counts", include_pval = TRUE)
#> n count
#> "n" "count"
#> count_fraction count_fraction_fixed_dp
#> "count_fraction" "count_fraction"
#> n_blq pval_counts
#> "n_blq" "p-value (chi-squared test)"