The S3 generic function s_summary() implements summaries for different classes of x. It
is used as a statistics function in combination with the analyze function analyze_vars().
A deprecation cycle has started for summarize_vars(), which is being renamed to
analyze_vars(). The intention is to better reflect the core underlying rtables
function, in this case rtables::analyze().
Usage
s_summary(x, na.rm = TRUE, denom, .N_row, .N_col, .var, ...)

# S3 method for numeric
s_summary(
  x,
  na.rm = TRUE,
  denom,
  .N_row,
  .N_col,
  .var,
  control = control_analyze_vars(),
  ...
)

# S3 method for factor
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  ...
)

# S3 method for character
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  .var,
  verbose = TRUE,
  ...
)

# S3 method for logical
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  ...
)

a_summary(
  x,
  .N_col,
  .N_row,
  .var = NULL,
  .df_row = NULL,
  .ref_group = NULL,
  .in_ref_col = FALSE,
  compare = FALSE,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na.rm = TRUE,
  na_level = lifecycle::deprecated(),
  na_str = NA_character_,
  ...
)

analyze_vars(
  lyt,
  vars,
  var_labels = vars,
  na_level = lifecycle::deprecated(),
  na_str = NA_character_,
  nested = TRUE,
  ...,
  na.rm = TRUE,
  show_labels = "default",
  table_names = vars,
  section_div = NA_character_,
  .stats = c("n", "mean_sd", "median", "range", "count_fraction"),
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

summarize_vars(...)
Arguments
- x
  (numeric) vector of numbers we want to analyze.
- na.rm
  (flag) whether NA values should be removed from x prior to analysis.
- denom
  (string) choice of denominator for proportion. Options are:
    - n: number of values in this row and column intersection.
    - N_row: total number of values in this row across columns.
    - N_col: total number of values in this column across rows.
- .N_row
  (integer) row-wise N (row group count) for the group of observations being analyzed (i.e. with no column-based subsetting) that is typically passed by rtables.
- .N_col
  (integer) column-wise N (column count) for the full column being analyzed that is typically passed by rtables.
- .var
  (string) single variable name that is passed by rtables when requested by a statistics function.
- ...
  arguments passed to s_summary().
- control
  (list) parameters for descriptive statistics details, specified by using the helper function control_analyze_vars(). Some possible parameter options are:
    - conf_level (proportion): confidence level of the interval for mean and median.
    - quantiles (numeric): vector of length two to specify the quantiles.
    - quantile_type (numeric): between 1 and 9 selecting quantile algorithms to be used. See more about type in stats::quantile().
    - test_mean (numeric): value to test against the mean under the null hypothesis when calculating p-value.
- verbose
  (logical) defaults to TRUE, which prints out warnings and messages. It is mainly used to print out information about factor casting.
- .df_row
  (data.frame) data frame across all of the columns for the given row split.
- .ref_group
  (data.frame or vector) the data corresponding to the reference group.
- .in_ref_col
  (logical) TRUE when working with the reference level, FALSE otherwise.
- compare
  (logical) whether comparison statistics should be analyzed instead of summary statistics (compare = TRUE adds pval statistic comparing against reference group).
- .stats
  (character) statistics to select for the table.
- .formats
  (named character or list) formats for the statistics. See Details in analyze_vars for more information on the "auto" setting.
- .labels
  (named character) labels for the statistics (without indent).
- .indent_mods
  (named vector of integer) indent modifiers for the labels. Each element of the vector should be a name-value pair with name corresponding to a statistic specified in .stats and value the indentation for that statistic's row label.
- na_level
- na_str
  (string) string used to replace all NA or empty values in the output.
- lyt
  (layout) input layout where analyses will be added to.
- vars
  (character) variable names for the primary analysis variable to be iterated over.
- var_labels
  (character) character for label.
- nested
  (flag) whether this layout instruction should be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.
- show_labels
  (string) label visibility: one of "default", "visible" and "hidden".
- table_names
  (character) this can be customized in case that the same vars are analyzed multiple times, to avoid warnings from rtables.
- section_div
  (string) string which should be repeated as a section divider after each group defined by this split instruction, or NA_character_ (the default) for no section divider.
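The three denom options can be illustrated with a short base R sketch. It is not the code used by s_summary(); fraction_by() is a hypothetical helper introduced only for this illustration, and the values 10L and 20L for .N_row and .N_col are arbitrary.

```r
# Base R sketch of the three denominators for a factor (illustration only).
x <- factor(c("a", "a", "b", "c", "a"))
tab <- table(x)
counts <- setNames(as.integer(tab), names(tab))

# Hypothetical helper: divides the per-level counts by the chosen denominator.
fraction_by <- function(counts, denom = c("n", "N_row", "N_col"),
                        .N_row = NA, .N_col = NA) {
  denom <- match.arg(denom)
  dn <- switch(denom,
    n = sum(counts),   # values in this row and column intersection
    N_row = .N_row,    # total for the row, supplied by the caller
    N_col = .N_col     # total for the column, supplied by the caller
  )
  counts / dn
}

frac_n <- fraction_by(counts, "n")
frac_row <- fraction_by(counts, "N_row", .N_row = 10L)
frac_col <- fraction_by(counts, "N_col", .N_col = 20L)
print(rbind(n = frac_n, N_row = frac_row, N_col = frac_col))
```

Only the denominator changes between the three calls; the counts per level stay the same.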
Value
s_summary() returns different statistics depending on the class of x.
- If x is of class numeric, returns a list with the following named numeric items:
    - n: The length() of x.
    - sum: The sum() of x.
    - mean: The mean() of x.
    - sd: The stats::sd() of x.
    - se: The standard error of the mean of x, i.e.: (sd(x) / sqrt(length(x))).
    - mean_sd: The mean() and stats::sd() of x.
    - mean_se: The mean() of x and its standard error (see above).
    - mean_ci: The CI for the mean of x (from stat_mean_ci()).
    - mean_sei: The SE interval for the mean of x, i.e.: (mean() -/+ stats::sd() / sqrt()).
    - mean_sdi: The SD interval for the mean of x, i.e.: (mean() -/+ stats::sd()).
    - mean_pval: The two-sided p-value of the mean of x (from stat_mean_pval()).
    - median: The stats::median() of x.
    - mad: The median absolute deviation of x, i.e.: (stats::median() of xc, where xc = x - stats::median()).
    - median_ci: The CI for the median of x (from stat_median_ci()).
    - quantiles: Two sample quantiles of x (from stats::quantile()).
    - iqr: The stats::IQR() of x.
    - range: The range_noinf() of x.
    - min: The min() of x.
    - max: The max() of x.
    - median_range: The median() and range_noinf() of x.
    - cv: The coefficient of variation of x, i.e.: (stats::sd() / mean() * 100).
    - geom_mean: The geometric mean of x, i.e.: (exp(mean(log(x)))).
    - geom_cv: The geometric coefficient of variation of x, i.e.: (sqrt(exp(sd(log(x)) ^ 2) - 1) * 100).
- If x is of class factor or converted from character, returns a list with named numeric items:
    - n: The length() of x.
    - count: A list with the number of cases for each level of the factor x.
    - count_fraction: Similar to count but also includes the proportion of cases for each level of the factor x relative to the denominator, or NA if the denominator is zero.
- If x is of class logical, returns a list with named numeric items:
    - n: The length() of x (possibly after removing NAs).
    - count: Count of TRUE in x.
    - count_fraction: Count and proportion of TRUE in x relative to the denominator, or NA if the denominator is zero. Note that NAs in x are never counted and never lead to NA here.
a_summary() returns the corresponding list with formatted rtables::CellValue().
analyze_vars() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtables layout will add formatted rows containing the statistics from s_summary() to the table layout.
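Several of the formulas listed for numeric x can be checked by hand in base R. The sketch below is an illustration only, not the code used by s_summary(). The t-based interval for mean_ci is an assumption: it reproduces the values printed in Examples for x = c(NA, 1, 2), but it is not taken from stat_mean_ci().

```r
# Base R sketch of some numeric statistics (illustration only).
x <- c(NA_real_, 1, 2)
x <- x[!is.na(x)]                              # what na.rm = TRUE implies

n <- length(x)
se <- stats::sd(x) / sqrt(n)                   # standard error of the mean
mean_sei <- mean(x) + c(-1, 1) * se            # mean -/+ 1 x SE
mean_sdi <- mean(x) + c(-1, 1) * stats::sd(x)  # mean -/+ 1 x SD
cv <- stats::sd(x) / mean(x) * 100             # coefficient of variation (%)
geom_mean <- exp(mean(log(x)))                 # geometric mean
geom_cv <- sqrt(exp(stats::sd(log(x))^2) - 1) * 100

# Assumption: two-sided t interval at the 95% level.
mean_ci <- mean(x) + c(-1, 1) * stats::qt(0.975, df = n - 1) * se

print(round(c(se = se, cv = cv, geom_mean = geom_mean, geom_cv = geom_cv), 5))
print(mean_ci)
```

With only two observations the t quantile is very large, which explains the wide mean_ci shown in Examples.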
Details
The "auto" format setting can be used in analyze_vars() for a subset of methods. It uses
format_auto() to determine the number of digits automatically from the analyzed variable
(.vars), but only from the current row data (.df_row[[.var]], see
?rtables::additional_fun_params), not from the whole dataset. Column splits are also not considered.
Functions
- s_summary(): S3 generic function to produce a variable summary.
- s_summary(numeric): Method for numeric class.
- s_summary(factor): Method for factor class.
- s_summary(character): Method for character class. This makes an automatic conversion to factor (with a warning) and then forwards to the method for factors.
- s_summary(logical): Method for logical class.
- a_summary(): Formatted analysis function which is used as afun in analyze_vars() and compare_vars() and as cfun in summarize_colvars().
- analyze_vars(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
Note
- If x is an empty vector, NA is returned. This is the expected feature so as to return rcell content in rtables when the intersection of a column and a row delimits an empty data selection.
- When the mean function is applied to an empty vector, NA will be returned instead of NaN, the latter being standard behavior in R.
- If x is an empty factor, a list is still returned for counts with one element per factor level. If there are no levels in x, the function fails.
- If factor variables contain NA, these NA values are excluded by default. To include NA values set na.rm = FALSE and missing values will be displayed as an NA level. Alternatively, an explicit factor level can be defined for NA values during pre-processing via df_explicit_na(). The default na_level ("<Missing>") will also be excluded when na.rm is set to TRUE.
- Automatic conversion of character to factor does not guarantee that the table can be generated correctly. In particular for sparse tables this very likely can fail. It is therefore better to always pre-process the dataset such that factors are manually created from character variables before passing the dataset to rtables::build_table().
- To use for comparison (with additional p-value statistic), parameter compare must be set to TRUE.
- Ensure that either all NA values are converted to an explicit NA level or all NA values are left as is.
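The base R behavior that these notes refer to can be seen in a few lines. This is a sketch only: mean_or_na() is a hypothetical guard that mirrors the documented NA-for-empty-input behavior and is not the function used inside s_summary().

```r
# Base R: the mean of an empty vector is NaN.
base_mean <- mean(numeric())

# Hypothetical guard returning NA instead of NaN for empty input.
mean_or_na <- function(x) if (length(x) == 0) NA_real_ else mean(x)
guarded_mean <- mean_or_na(numeric())

# An empty factor that still has levels yields one count per level.
empty_counts <- table(factor(levels = c("a", "b", "c")))

# NA values in a factor are dropped by table() unless requested explicitly.
x <- factor(c(NA, "Female"))
counts_without_na <- table(x)
counts_with_na <- table(x, useNA = "ifany")

print(base_mean)
print(guarded_mean)
print(empty_counts)
print(counts_with_na)
```

The last two tables differ only in whether the missing value gets its own cell, which is the same distinction na.rm controls for factors.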
Examples
# `s_summary.numeric`
## Basic usage: empty numeric returns NA-filled items.
s_summary(numeric())
#> $n
#> n
#> 0
#>
#> $sum
#> sum
#> NA
#>
#> $mean
#> mean
#> NA
#>
#> $sd
#> sd
#> NA
#>
#> $se
#> se
#> NA
#>
#> $mean_sd
#> mean sd
#> NA NA
#>
#> $mean_se
#> mean se
#> NA NA
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> NA
#>
#> $mad
#> mad
#> NA
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> NA NA
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> NA
#>
#> $range
#> min max
#> NA NA
#>
#> $min
#> min
#> NA
#>
#> $max
#> max
#> NA
#>
#> $median_range
#> median min max
#> NA NA NA
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> NA
#>
#> $geom_mean
#> geom_mean
#> NaN
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> NA
#>
## Management of NA values.
x <- c(NA_real_, 1)
s_summary(x, na.rm = TRUE)
#> $n
#> n
#> 1
#>
#> $sum
#> sum
#> 1
#>
#> $mean
#> mean
#> 1
#>
#> $sd
#> sd
#> NA
#>
#> $se
#> se
#> NA
#>
#> $mean_sd
#> mean sd
#> 1 NA
#>
#> $mean_se
#> mean se
#> 1 NA
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> 1
#>
#> $mad
#> mad
#> 0
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> 1 1
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> 0
#>
#> $range
#> min max
#> 1 1
#>
#> $min
#> min
#> 1
#>
#> $max
#> max
#> 1
#>
#> $median_range
#> median min max
#> 1 1 1
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> NA
#>
#> $geom_mean
#> geom_mean
#> 1
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> NA
#>
s_summary(x, na.rm = FALSE)
#> $n
#> n
#> 2
#>
#> $sum
#> sum
#> NA
#>
#> $mean
#> mean
#> NA
#>
#> $sd
#> sd
#> NA
#>
#> $se
#> se
#> NA
#>
#> $mean_sd
#> mean sd
#> NA NA
#>
#> $mean_se
#> mean se
#> NA NA
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> NA
#>
#> $mad
#> mad
#> NA
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> NA NA
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> NA
#>
#> $range
#> min max
#> NA NA
#>
#> $min
#> min
#> NA
#>
#> $max
#> max
#> NA
#>
#> $median_range
#> median min max
#> NA NA NA
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> NA
#>
#> $geom_mean
#> geom_mean
#> NA
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> NA
#>
x <- c(NA_real_, 1, 2)
s_summary(x, stats = NULL)
#> $n
#> n
#> 2
#>
#> $sum
#> sum
#> 3
#>
#> $mean
#> mean
#> 1.5
#>
#> $sd
#> sd
#> 0.7071068
#>
#> $se
#> se
#> 0.5
#>
#> $mean_sd
#> mean sd
#> 1.5000000 0.7071068
#>
#> $mean_se
#> mean se
#> 1.5 0.5
#>
#> $mean_ci
#> mean_ci_lwr mean_ci_upr
#> -4.853102 7.853102
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $mean_sei
#> mean_sei_lwr mean_sei_upr
#> 1 2
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> 0.7928932 2.2071068
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $mean_pval
#> p_value
#> 0.2048328
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $median
#> median
#> 1.5
#>
#> $mad
#> mad
#> 0
#>
#> $median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $quantiles
#> quantile_0.25 quantile_0.75
#> 1 2
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $iqr
#> iqr
#> 1
#>
#> $range
#> min max
#> 1 2
#>
#> $min
#> min
#> 1
#>
#> $max
#> max
#> 2
#>
#> $median_range
#> median min max
#> 1.5 1.0 2.0
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $cv
#> cv
#> 47.14045
#>
#> $geom_mean
#> geom_mean
#> 1.414214
#>
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> 0.01729978 115.60839614
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $geom_cv
#> geom_cv
#> 52.10922
#>
## Benefits in `rtables` constructions:
require(rtables)
dta_test <- data.frame(
  Group = rep(LETTERS[1:3], each = 2),
  sub_group = rep(letters[1:2], each = 3),
  x = 1:6
)
## The summary obtained with `rtables`:
basic_table() %>%
  split_cols_by(var = "Group") %>%
  split_rows_by(var = "sub_group") %>%
  analyze(vars = "x", afun = s_summary) %>%
  build_table(df = dta_test)
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> A B C
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> a
#> n 2 1 0
#> sum 3 3 NA
#> mean 1.5 3 NA
#> sd 0.707106781186548 NA NA
#> se 0.5 NA NA
#> mean_sd 1.5, 0.707106781186548 3, NA NA
#> mean_se 1.5, 0.5 3, NA NA
#> Mean 95% CI -4.85310236808735, 7.85310236808735 NA NA
#> Mean -/+ 1xSE 1, 2 NA NA
#> Mean -/+ 1xSD 0.792893218813452, 2.20710678118655 NA NA
#> Mean p-value (H0: mean = 0) 0.204832764699133 NA NA
#> median 1.5 3 NA
#> mad 0 0 NA
#> Median 95% CI NA NA NA
#> 25% and 75%-ile 1, 2 3, 3 NA
#> iqr 1 0 NA
#> range 1, 2 3, 3 NA
#> min 1 3 NA
#> max 2 3 NA
#> Median (Min - Max) 1.5, 1, 2 3, 3, 3 NA
#> cv 47.1404520791032 NA NA
#> geom_mean 1.41421356237309 3 NA
#> Geometric Mean 95% CI 0.0172997815631007, 115.608396135236 NA NA
#> geom_cv 52.1092246837487 NA NA
#> b
#> n 0 1 2
#> sum NA 4 11
#> mean NA 4 5.5
#> sd NA NA 0.707106781186548
#> se NA NA 0.5
#> mean_sd NA 4, NA 5.5, 0.707106781186548
#> mean_se NA 4, NA 5.5, 0.5
#> Mean 95% CI NA NA -0.853102368087347, 11.8531023680873
#> Mean -/+ 1xSE NA NA 5, 6
#> Mean -/+ 1xSD NA NA 4.79289321881345, 6.20710678118655
#> Mean p-value (H0: mean = 0) NA NA 0.0577158767526089
#> median NA 4 5.5
#> mad NA 0 0
#> Median 95% CI NA NA NA
#> 25% and 75%-ile NA 4, 4 5, 6
#> iqr NA 0 1
#> range NA 4, 4 5, 6
#> min NA 4 5
#> max NA 4 6
#> Median (Min - Max) NA 4, 4, 4 5.5, 5, 6
#> cv NA NA 12.8564869306645
#> geom_mean NA 4 5.47722557505166
#> Geometric Mean 95% CI NA NA 1.71994304449266, 17.4424380482025
#> geom_cv NA NA 12.945835316564
## By comparison with `lapply`:
X <- split(dta_test, f = with(dta_test, interaction(Group, sub_group)))
lapply(X, function(x) s_summary(x$x))
#> $A.a
#> $A.a$n
#> n
#> 2
#>
#> $A.a$sum
#> sum
#> 3
#>
#> $A.a$mean
#> mean
#> 1.5
#>
#> $A.a$sd
#> sd
#> 0.7071068
#>
#> $A.a$se
#> se
#> 0.5
#>
#> $A.a$mean_sd
#> mean sd
#> 1.5000000 0.7071068
#>
#> $A.a$mean_se
#> mean se
#> 1.5 0.5
#>
#> $A.a$mean_ci
#> mean_ci_lwr mean_ci_upr
#> -4.853102 7.853102
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $A.a$mean_sei
#> mean_sei_lwr mean_sei_upr
#> 1 2
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $A.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> 0.7928932 2.2071068
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $A.a$mean_pval
#> p_value
#> 0.2048328
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $A.a$median
#> median
#> 1.5
#>
#> $A.a$mad
#> mad
#> 0
#>
#> $A.a$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $A.a$quantiles
#> quantile_0.25 quantile_0.75
#> 1 2
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $A.a$iqr
#> iqr
#> 1
#>
#> $A.a$range
#> min max
#> 1 2
#>
#> $A.a$min
#> min
#> 1
#>
#> $A.a$max
#> max
#> 2
#>
#> $A.a$median_range
#> median min max
#> 1.5 1.0 2.0
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $A.a$cv
#> cv
#> 47.14045
#>
#> $A.a$geom_mean
#> geom_mean
#> 1.414214
#>
#> $A.a$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> 0.01729978 115.60839614
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $A.a$geom_cv
#> geom_cv
#> 52.10922
#>
#>
#> $B.a
#> $B.a$n
#> n
#> 1
#>
#> $B.a$sum
#> sum
#> 3
#>
#> $B.a$mean
#> mean
#> 3
#>
#> $B.a$sd
#> sd
#> NA
#>
#> $B.a$se
#> se
#> NA
#>
#> $B.a$mean_sd
#> mean sd
#> 3 NA
#>
#> $B.a$mean_se
#> mean se
#> 3 NA
#>
#> $B.a$mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $B.a$mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $B.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $B.a$mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $B.a$median
#> median
#> 3
#>
#> $B.a$mad
#> mad
#> 0
#>
#> $B.a$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $B.a$quantiles
#> quantile_0.25 quantile_0.75
#> 3 3
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $B.a$iqr
#> iqr
#> 0
#>
#> $B.a$range
#> min max
#> 3 3
#>
#> $B.a$min
#> min
#> 3
#>
#> $B.a$max
#> max
#> 3
#>
#> $B.a$median_range
#> median min max
#> 3 3 3
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $B.a$cv
#> cv
#> NA
#>
#> $B.a$geom_mean
#> geom_mean
#> 3
#>
#> $B.a$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $B.a$geom_cv
#> geom_cv
#> NA
#>
#>
#> $C.a
#> $C.a$n
#> n
#> 0
#>
#> $C.a$sum
#> sum
#> NA
#>
#> $C.a$mean
#> mean
#> NA
#>
#> $C.a$sd
#> sd
#> NA
#>
#> $C.a$se
#> se
#> NA
#>
#> $C.a$mean_sd
#> mean sd
#> NA NA
#>
#> $C.a$mean_se
#> mean se
#> NA NA
#>
#> $C.a$mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $C.a$mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $C.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $C.a$mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $C.a$median
#> median
#> NA
#>
#> $C.a$mad
#> mad
#> NA
#>
#> $C.a$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $C.a$quantiles
#> quantile_0.25 quantile_0.75
#> NA NA
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $C.a$iqr
#> iqr
#> NA
#>
#> $C.a$range
#> min max
#> NA NA
#>
#> $C.a$min
#> min
#> NA
#>
#> $C.a$max
#> max
#> NA
#>
#> $C.a$median_range
#> median min max
#> NA NA NA
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $C.a$cv
#> cv
#> NA
#>
#> $C.a$geom_mean
#> geom_mean
#> NaN
#>
#> $C.a$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $C.a$geom_cv
#> geom_cv
#> NA
#>
#>
#> $A.b
#> $A.b$n
#> n
#> 0
#>
#> $A.b$sum
#> sum
#> NA
#>
#> $A.b$mean
#> mean
#> NA
#>
#> $A.b$sd
#> sd
#> NA
#>
#> $A.b$se
#> se
#> NA
#>
#> $A.b$mean_sd
#> mean sd
#> NA NA
#>
#> $A.b$mean_se
#> mean se
#> NA NA
#>
#> $A.b$mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $A.b$mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $A.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $A.b$mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $A.b$median
#> median
#> NA
#>
#> $A.b$mad
#> mad
#> NA
#>
#> $A.b$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $A.b$quantiles
#> quantile_0.25 quantile_0.75
#> NA NA
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $A.b$iqr
#> iqr
#> NA
#>
#> $A.b$range
#> min max
#> NA NA
#>
#> $A.b$min
#> min
#> NA
#>
#> $A.b$max
#> max
#> NA
#>
#> $A.b$median_range
#> median min max
#> NA NA NA
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $A.b$cv
#> cv
#> NA
#>
#> $A.b$geom_mean
#> geom_mean
#> NaN
#>
#> $A.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $A.b$geom_cv
#> geom_cv
#> NA
#>
#>
#> $B.b
#> $B.b$n
#> n
#> 1
#>
#> $B.b$sum
#> sum
#> 4
#>
#> $B.b$mean
#> mean
#> 4
#>
#> $B.b$sd
#> sd
#> NA
#>
#> $B.b$se
#> se
#> NA
#>
#> $B.b$mean_sd
#> mean sd
#> 4 NA
#>
#> $B.b$mean_se
#> mean se
#> 4 NA
#>
#> $B.b$mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $B.b$mean_sei
#> mean_sei_lwr mean_sei_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $B.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> NA NA
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $B.b$mean_pval
#> p_value
#> NA
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $B.b$median
#> median
#> 4
#>
#> $B.b$mad
#> mad
#> 0
#>
#> $B.b$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $B.b$quantiles
#> quantile_0.25 quantile_0.75
#> 4 4
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $B.b$iqr
#> iqr
#> 0
#>
#> $B.b$range
#> min max
#> 4 4
#>
#> $B.b$min
#> min
#> 4
#>
#> $B.b$max
#> max
#> 4
#>
#> $B.b$median_range
#> median min max
#> 4 4 4
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $B.b$cv
#> cv
#> NA
#>
#> $B.b$geom_mean
#> geom_mean
#> 4
#>
#> $B.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> NA NA
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $B.b$geom_cv
#> geom_cv
#> NA
#>
#>
#> $C.b
#> $C.b$n
#> n
#> 2
#>
#> $C.b$sum
#> sum
#> 11
#>
#> $C.b$mean
#> mean
#> 5.5
#>
#> $C.b$sd
#> sd
#> 0.7071068
#>
#> $C.b$se
#> se
#> 0.5
#>
#> $C.b$mean_sd
#> mean sd
#> 5.5000000 0.7071068
#>
#> $C.b$mean_se
#> mean se
#> 5.5 0.5
#>
#> $C.b$mean_ci
#> mean_ci_lwr mean_ci_upr
#> -0.8531024 11.8531024
#> attr(,"label")
#> [1] "Mean 95% CI"
#>
#> $C.b$mean_sei
#> mean_sei_lwr mean_sei_upr
#> 5 6
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#>
#> $C.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr
#> 4.792893 6.207107
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#>
#> $C.b$mean_pval
#> p_value
#> 0.05771588
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#>
#> $C.b$median
#> median
#> 5.5
#>
#> $C.b$mad
#> mad
#> 0
#>
#> $C.b$median_ci
#> median_ci_lwr median_ci_upr
#> NA NA
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#>
#> $C.b$quantiles
#> quantile_0.25 quantile_0.75
#> 5 6
#> attr(,"label")
#> [1] "25% and 75%-ile"
#>
#> $C.b$iqr
#> iqr
#> 1
#>
#> $C.b$range
#> min max
#> 5 6
#>
#> $C.b$min
#> min
#> 5
#>
#> $C.b$max
#> max
#> 6
#>
#> $C.b$median_range
#> median min max
#> 5.5 5.0 6.0
#> attr(,"label")
#> [1] "Median (Min - Max)"
#>
#> $C.b$cv
#> cv
#> 12.85649
#>
#> $C.b$geom_mean
#> geom_mean
#> 5.477226
#>
#> $C.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr
#> 1.719943 17.442438
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#>
#> $C.b$geom_cv
#> geom_cv
#> 12.94584
#>
#>
# `s_summary.factor`
## Basic usage:
s_summary(factor(c("a", "a", "b", "c", "a")))
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
# Empty factor returns zero-filled items.
s_summary(factor(levels = c("a", "b", "c")))
#> $n
#> [1] 0
#>
#> $count
#> $count$a
#> [1] 0
#>
#> $count$b
#> [1] 0
#>
#> $count$c
#> [1] 0
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 0 0
#>
#> $count_fraction$b
#> [1] 0 0
#>
#> $count_fraction$c
#> [1] 0 0
#>
#>
#> $n_blq
#> [1] 0
#>
## Management of NA values.
x <- factor(c(NA, "Female"))
x <- explicit_na(x)
s_summary(x, na.rm = TRUE)
#> $n
#> [1] 1
#>
#> $count
#> $count$Female
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$Female
#> [1] 1 1
#>
#>
#> $n_blq
#> [1] 0
#>
s_summary(x, na.rm = FALSE)
#> $n
#> [1] 2
#>
#> $count
#> $count$Female
#> [1] 1
#>
#> $count$`<Missing>`
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$Female
#> [1] 1.0 0.5
#>
#> $count_fraction$`<Missing>`
#> [1] 1.0 0.5
#>
#>
#> $n_blq
#> [1] 0
#>
## Different denominators.
x <- factor(c("a", "a", "b", "c", "a"))
s_summary(x, denom = "N_row", .N_row = 10L)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.3
#>
#> $count_fraction$b
#> [1] 1.0 0.1
#>
#> $count_fraction$c
#> [1] 1.0 0.1
#>
#>
#> $n_blq
#> [1] 0
#>
s_summary(x, denom = "N_col", .N_col = 20L)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.00 0.15
#>
#> $count_fraction$b
#> [1] 1.00 0.05
#>
#> $count_fraction$c
#> [1] 1.00 0.05
#>
#>
#> $n_blq
#> [1] 0
#>
# `s_summary.character`
## Basic usage:
s_summary(c("a", "a", "b", "c", "a"), .var = "x", verbose = FALSE)
#> $n
#> [1] 5
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#>
#> $count_fraction$b
#> [1] 1.0 0.2
#>
#> $count_fraction$c
#> [1] 1.0 0.2
#>
#>
#> $n_blq
#> [1] 0
#>
s_summary(c("a", "a", "b", "c", "a", ""), .var = "x", na.rm = FALSE, verbose = FALSE)
#> $n
#> [1] 6
#>
#> $count
#> $count$a
#> [1] 3
#>
#> $count$b
#> [1] 1
#>
#> $count$c
#> [1] 1
#>
#> $count$`NA`
#> [1] 1
#>
#>
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.5
#>
#> $count_fraction$b
#> [1] 1.0000000 0.1666667
#>
#> $count_fraction$c
#> [1] 1.0000000 0.1666667
#>
#> $count_fraction$`NA`
#> [1] 1.0000000 0.1666667
#>
#>
#> $n_blq
#> [1] 0
#>
# `s_summary.logical`
## Basic usage:
s_summary(c(TRUE, FALSE, TRUE, TRUE))
#> $n
#> [1] 4
#>
#> $count
#> [1] 3
#>
#> $count_fraction
#> [1] 3.00 0.75
#>
#> $n_blq
#> [1] 0
#>
# Empty logical vector returns zero-filled items.
s_summary(as.logical(c()))
#> $n
#> [1] 0
#>
#> $count
#> [1] 0
#>
#> $count_fraction
#> [1] 0 0
#>
#> $n_blq
#> [1] 0
#>
## Management of NA values.
x <- c(NA, TRUE, FALSE)
s_summary(x, na.rm = TRUE)
#> $n
#> [1] 2
#>
#> $count
#> [1] 1
#>
#> $count_fraction
#> [1] 1.0 0.5
#>
#> $n_blq
#> [1] 0
#>
s_summary(x, na.rm = FALSE)
#> $n
#> [1] 3
#>
#> $count
#> [1] 1
#>
#> $count_fraction
#> [1] 1.0000000 0.3333333
#>
#> $n_blq
#> [1] 0
#>
## Different denominators.
x <- c(TRUE, FALSE, TRUE, TRUE)
s_summary(x, denom = "N_row", .N_row = 10L)
#> $n
#> [1] 4
#>
#> $count
#> [1] 3
#>
#> $count_fraction
#> [1] 3.0 0.3
#>
#> $n_blq
#> [1] 0
#>
s_summary(x, denom = "N_col", .N_col = 20L)
#> $n
#> [1] 4
#>
#> $count
#> [1] 3
#>
#> $count_fraction
#> [1] 3.00 0.15
#>
#> $n_blq
#> [1] 0
#>
a_summary(factor(c("a", "a", "b", "c", "a")), .N_row = 10, .N_col = 10)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 5 0 n
#> 2 a 3 0 a
#> 3 b 1 0 b
#> 4 c 1 0 c
#> 5 a 3 (60%) 0 a
#> 6 b 1 (20%) 0 b
#> 7 c 1 (20%) 0 c
#> 8 n_blq 0 0 n_blq
a_summary(
  factor(c("a", "a", "b", "c", "a")),
  .ref_group = factor(c("a", "a", "b", "c")), compare = TRUE
)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod
#> 1 n 5 0
#> 2 a 3 0
#> 3 b 1 0
#> 4 c 1 0
#> 5 a 3 (60%) 0
#> 6 b 1 (20%) 0
#> 7 c 1 (20%) 0
#> 8 n_blq 0 0
#> 9 p-value (chi-squared test) 0.9560 0
#> row_label
#> 1 n
#> 2 a
#> 3 b
#> 4 c
#> 5 a
#> 6 b
#> 7 c
#> 8 n_blq
#> 9 p-value (chi-squared test)
a_summary(c("A", "B", "A", "C"), .var = "x", .N_col = 10, .N_row = 10, verbose = FALSE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 4 0 n
#> 2 A 2 0 A
#> 3 B 1 0 B
#> 4 C 1 0 C
#> 5 A 2 (50%) 0 A
#> 6 B 1 (25%) 0 B
#> 7 C 1 (25%) 0 C
#> 8 n_blq 0 0 n_blq
a_summary(
  c("A", "B", "A", "C"),
  .ref_group = c("B", "A", "C"), .var = "x", compare = TRUE, verbose = FALSE
)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod
#> 1 n 4 0
#> 2 A 2 0
#> 3 B 1 0
#> 4 C 1 0
#> 5 A 2 (50%) 0
#> 6 B 1 (25%) 0
#> 7 C 1 (25%) 0
#> 8 n_blq 0 0
#> 9 p-value (chi-squared test) 0.9074 0
#> row_label
#> 1 n
#> 2 A
#> 3 B
#> 4 C
#> 5 A
#> 6 B
#> 7 C
#> 8 n_blq
#> 9 p-value (chi-squared test)
a_summary(c(TRUE, FALSE, FALSE, TRUE, TRUE), .N_row = 10, .N_col = 10)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod row_label
#> 1 n 5 0 n
#> 2 count 3 0 count
#> 3 count_fraction 3 (60%) 0 count_fraction
#> 4 n_blq 0 0 n_blq
a_summary(
  c(TRUE, FALSE, FALSE, TRUE, TRUE),
  .ref_group = c(TRUE, FALSE), .in_ref_col = TRUE, compare = TRUE
)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#> row_name formatted_cell indent_mod
#> 1 n 5 0
#> 2 count 3 0
#> 3 count_fraction 3 (60%) 0
#> 4 n_blq 0 0
#> 5 p-value (chi-squared test) 0
#> row_label
#> 1 n
#> 2 count
#> 3 count_fraction
#> 4 n_blq
#> 5 p-value (chi-squared test)
a_summary(rnorm(10), .N_col = 10, .N_row = 20, .var = "bla")
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>                       row_name   formatted_cell indent_mod
#> 1                            n               10          0
#> 2                          Sum              3.4          0
#> 3                         Mean              0.3          0
#> 4                           SD              0.7          0
#> 5                           SE              0.2          0
#> 6                    Mean (SD)        0.3 (0.7)          0
#> 7                    Mean (SE)        0.3 (0.2)          0
#> 8                  Mean 95% CI    (-0.18, 0.85)          0
#> 9                Mean -/+ 1xSE     (0.11, 0.56)          0
#> 10               Mean -/+ 1xSD    (-0.38, 1.05)          0
#> 11 Mean p-value (H0: mean = 0)             0.17          0
#> 12                      Median              0.4          0
#> 13   Median Absolute Deviation              0.0          0
#> 14               Median 95% CI    (-0.48, 0.80)          0
#> 15             25% and 75%-ile       -0.3 - 0.8          0
#> 16                         IQR              1.0          0
#> 17                   Min - Max       -0.5 - 1.7          0
#> 18                     Minimum             -0.5          0
#> 19                     Maximum              1.7          0
#> 20          Median (Min - Max) 0.4 (-0.5 - 1.7)          0
#> 21                      CV (%)            214.2          0
#> 22              Geometric Mean               NA          0
#> 23       Geometric Mean 95% CI               NA          0
#> 24         CV % Geometric Mean               NA          0
#>                      row_label
#> 1                            n
#> 2                          Sum
#> 3                         Mean
#> 4                           SD
#> 5                           SE
#> 6                    Mean (SD)
#> 7                    Mean (SE)
#> 8                  Mean 95% CI
#> 9                Mean -/+ 1xSE
#> 10               Mean -/+ 1xSD
#> 11 Mean p-value (H0: mean = 0)
#> 12                      Median
#> 13   Median Absolute Deviation
#> 14               Median 95% CI
#> 15             25% and 75%-ile
#> 16                         IQR
#> 17                   Min - Max
#> 18                     Minimum
#> 19                     Maximum
#> 20          Median (Min - Max)
#> 21                      CV (%)
#> 22              Geometric Mean
#> 23       Geometric Mean 95% CI
#> 24         CV % Geometric Mean
a_summary(rnorm(10, 5, 1), .ref_group = rnorm(20, -5, 1), .var = "bla", compare = TRUE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>                       row_name  formatted_cell indent_mod
#> 1                            n              10          0
#> 2                          Sum            52.0          0
#> 3                         Mean             5.2          0
#> 4                           SD             1.1          0
#> 5                           SE             0.3          0
#> 6                    Mean (SD)       5.2 (1.1)          0
#> 7                    Mean (SE)       5.2 (0.3)          0
#> 8                  Mean 95% CI    (4.42, 5.99)          0
#> 9                Mean -/+ 1xSE    (4.86, 5.55)          0
#> 10               Mean -/+ 1xSD    (4.11, 6.30)          0
#> 11 Mean p-value (H0: mean = 0)            0.00          0
#> 12                      Median             5.5          0
#> 13   Median Absolute Deviation             0.0          0
#> 14               Median 95% CI    (4.36, 6.04)          0
#> 15             25% and 75%-ile       4.7 - 5.9          0
#> 16                         IQR             1.2          0
#> 17                   Min - Max       2.6 - 6.4          0
#> 18                     Minimum             2.6          0
#> 19                     Maximum             6.4          0
#> 20          Median (Min - Max) 5.5 (2.6 - 6.4)          0
#> 21                      CV (%)            21.1          0
#> 22              Geometric Mean             5.1          0
#> 23       Geometric Mean 95% CI    (4.20, 6.11)          0
#> 24         CV % Geometric Mean            26.8          0
#> 25            p-value (t-test)         <0.0001          0
#>                      row_label
#> 1                            n
#> 2                          Sum
#> 3                         Mean
#> 4                           SD
#> 5                           SE
#> 6                    Mean (SD)
#> 7                    Mean (SE)
#> 8                  Mean 95% CI
#> 9                Mean -/+ 1xSE
#> 10               Mean -/+ 1xSD
#> 11 Mean p-value (H0: mean = 0)
#> 12                      Median
#> 13   Median Absolute Deviation
#> 14               Median 95% CI
#> 15             25% and 75%-ile
#> 16                         IQR
#> 17                   Min - Max
#> 18                     Minimum
#> 19                     Maximum
#> 20          Median (Min - Max)
#> 21                      CV (%)
#> 22              Geometric Mean
#> 23       Geometric Mean 95% CI
#> 24         CV % Geometric Mean
#> 25            p-value (t-test)
## Fabricated dataset.
dta_test <- data.frame(
  USUBJID = rep(1:6, each = 3),
  PARAMCD = rep("lab", 6 * 3),
  AVISIT = rep(paste0("V", 1:3), 6),
  ARM = rep(LETTERS[1:3], rep(6, 3)),
  AVAL = c(9:1, rep(NA, 9))
)
# `analyze_vars()` in `rtables` pipelines
## Default output within a `rtables` pipeline.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(vars = "AVAL")
build_table(l, df = dta_test)
#>                   A           B       C
#> ————————————————————————————————————————
#> V1
#>   n               2           1       0
#>   Mean (SD)   7.5 (2.1)   3.0 (NA)    NA
#>   Median         7.5         3.0      NA
#>   Min - Max   6.0 - 9.0   3.0 - 3.0   NA
#> V2
#>   n               2           1       0
#>   Mean (SD)   6.5 (2.1)   2.0 (NA)    NA
#>   Median         6.5         2.0      NA
#>   Min - Max   5.0 - 8.0   2.0 - 2.0   NA
#> V3
#>   n               2           1       0
#>   Mean (SD)   5.5 (2.1)   1.0 (NA)    NA
#>   Median         5.5         1.0      NA
#>   Min - Max   4.0 - 7.0   1.0 - 1.0   NA
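The `NA` standard deviations in column B come from the data, not from a formatting problem. After `NA` removal only one `AVAL` value remains per visit in arm B, and the sample standard deviation of a single observation is undefined. A minimal base R sketch follows, with the two V1 vectors copied by hand from `dta_test` (the names `aval_a_v1` and `aval_b_v1` are illustrative):

```r
# AVAL values at visit V1, copied by hand from the fabricated dataset.
aval_a_v1 <- c(9, 6)   # arm A: two observed values
aval_b_v1 <- c(3, NA)  # arm B: one observed value, one missing

mean(aval_a_v1)              # 7.5
round(sd(aval_a_v1), 1)      # 2.1
sum(!is.na(aval_b_v1))       # 1, the "n" shown for arm B
sd(aval_b_v1, na.rm = TRUE)  # NA: SD is undefined for a single value
```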
## Select and format statistics output.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(
    vars = "AVAL",
    .stats = c("n", "mean_sd", "quantiles"),
    .formats = c("mean_sd" = "xx.x, xx.x"),
    .labels = c(n = "n", mean_sd = "Mean, SD", quantiles = c("Q1 - Q3"))
  )
build_table(l, df = dta_test)
#>                  A           B       C
#> ———————————————————————————————————————
#> V1
#>   n              2           1       0
#>   Mean, SD   7.5, 2.1     3.0, NA    NA
#>   Q1 - Q3    6.0 - 9.0   3.0 - 3.0   NA
#> V2
#>   n              2           1       0
#>   Mean, SD   6.5, 2.1     2.0, NA    NA
#>   Q1 - Q3    5.0 - 8.0   2.0 - 2.0   NA
#> V3
#>   n              2           1       0
#>   Mean, SD   5.5, 2.1     1.0, NA    NA
#>   Q1 - Q3    4.0 - 7.0   1.0 - 1.0   NA
## Use arguments interpreted by `s_summary`.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(vars = "AVAL", na.rm = FALSE)
build_table(l, df = dta_test)
#>                   A       B    C
#> —————————————————————————————————
#> V1
#>   n               2       2    2
#>   Mean (SD)   7.5 (2.1)   NA   NA
#>   Median         7.5      NA   NA
#>   Min - Max   6.0 - 9.0   NA   NA
#> V2
#>   n               2       2    2
#>   Mean (SD)   6.5 (2.1)   NA   NA
#>   Median         6.5      NA   NA
#>   Min - Max   5.0 - 8.0   NA   NA
#> V3
#>   n               2       2    2
#>   Mean (SD)   5.5 (2.1)   NA   NA
#>   Median         5.5      NA   NA
#>   Min - Max   4.0 - 7.0   NA   NA
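With `na.rm = FALSE` the missing values stay in each cell, which is why arm B now reports `n` as 2 and `NA` for every other statistic. The base R sketch below mirrors that behaviour for the arm B, visit V1 cell. It illustrates the arithmetic only and makes no claim about the internals of `s_summary()`; the name `aval_b_v1_raw` is illustrative.

```r
# AVAL values for arm B at visit V1, with the missing value kept in place.
aval_b_v1_raw <- c(3, NA)

length(aval_b_v1_raw)              # 2: the NA is counted
mean(aval_b_v1_raw)                # NA: the NA propagates
median(aval_b_v1_raw)              # NA
mean(aval_b_v1_raw, na.rm = TRUE)  # 3: the value seen when NAs are removed
```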
## Handle `NA` levels first when summarizing factors.
dta_test$AVISIT <- NA_character_
dta_test <- df_explicit_na(dta_test)
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  analyze_vars(vars = "AVISIT", na.rm = FALSE)
build_table(l, df = dta_test)
#>                A          B          C
#> ——————————————————————————————————————————
#> n              6          6          6
#> <Missing>   6 (100%)   6 (100%)   6 (100%)
## Use the `"auto"` format for selected statistics.
dt <- data.frame("VAR" = c(0.001, 0.2, 0.0011000, 3, 4))
basic_table() %>%
  analyze_vars(
    vars = "VAR",
    .stats = c("n", "mean", "mean_sd", "range"),
    .formats = c("mean_sd" = "auto", "range" = "auto")
  ) %>%
  build_table(dt)
#>                  all obs
#> —————————————————————————————
#> n                   5
#> Mean               1.4
#> Mean (SD)   1.44042 (1.91481)
#> Min - Max    0.0010 - 4.0000