tern
Formatting Functions Overview
The tern
R package provides functions to create common
analyses from clinical trials in R
and these functions have
default formatting arguments for displaying the values in the output a
specific way.
tern
formatting differs compared to the formatting
available in the formatters
package as tern
formats are capable of handling logical statements, allowing for more
fine-tuning of the output displayed. Depending on what type of value is
being displayed, and what that value is, the format of the output will
change. Whereas when using the formatters
package, the
specified format is applied regardless of the value.
To see the available formatting functions available in
tern
see ?formatting_functions
. To see the
available format strings available in formatters
see
formatters::list_valid_format_labels()
.
Comparing tern
& formatters
Formats
The packages used in this vignette are:
The example below demonstrates the use of tern
formatting in the count_abnormal()
function. The example
“low” category has a non-zero numerator value so both a fraction and a
percentage value are displayed, while the “high” value has a numerator
value of zero and so the fraction value is displayed without also
displaying the redundant zero percentage value.
df2 <- data.frame(
ID = as.character(c(1, 1, 2, 2)),
RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
ONTRTFL = c("", "Y", "", "Y"),
stringsAsFactors = FALSE
)
df2 <- df2 %>%
filter(ONTRTFL == "Y")
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = format_fraction)
) %>%
build_table(df2)
#> all obs
#> —————————————————
#> low 2/2 (100%)
#> high 0/2
In the following example the count_abnormal()
function
is utilized again. This time both “low” values and “high” values have a
non-zero numerator and so both show a percentage.
df2 <- data.frame(
ID = as.character(c(1, 1, 2, 2)),
RANGE = factor(c("NORMAL", "LOW", "HIGH", "HIGH")),
BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
ONTRTFL = c("", "Y", "", "Y"),
stringsAsFactors = FALSE
)
df2 <- df2 %>%
filter(ONTRTFL == "Y")
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = format_fraction)
) %>%
build_table(df2)
#> all obs
#> ————————————————
#> low 1/2 (50%)
#> high 1/2 (50%)
The following example demonstrates the difference when
formatters
is used instead to format the output. Here we
choose to use "xx / xx"
as our value format. The “high”
value has a zero numerator value and the “low” value has a non-zero
numerator, yet both are displayed in the same format.
df2 <- data.frame(
ID = as.character(c(1, 1, 2, 2)),
RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
ONTRTFL = c("", "Y", "", "Y"),
stringsAsFactors = FALSE
)
df2 <- df2 %>%
filter(ONTRTFL == "Y")
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = "xx / xx")
) %>%
build_table(df2)
#> all obs
#> ——————————————
#> low 2 / 2
#> high 0 / 2
The same concept occurs when using any of the available formats from
the formatters
package. The following example displays the
same result using the "xx.x / xx.x"
format instead. Use
formatters::list_valid_format_labels()
to see the full list
of available formats in formatters
.
df2 <- data.frame(
ID = as.character(c(1, 1, 2, 2)),
RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
ONTRTFL = c("", "Y", "", "Y"),
stringsAsFactors = FALSE
)
df2 <- df2 %>%
filter(ONTRTFL == "Y")
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = "xx.x / xx.x")
) %>%
build_table(df2)
#> all obs
#> ————————————————
#> low 2.0 / 2.0
#> high 0.0 / 2.0
Formatting Function Basics
Current tern
formatting functions consider some of the
following aspects when setting custom behaviors:
- Missing values - a custom value or string can be set to display for
missing values instead of
NA
. - 0’s - if a cell value is zero,
tern
fraction formatting functions will exclude the accompanying percentage value. - Number of decimal places to display - the number of decimal places can be fixed if needed.
- Value thresholds - a different format or value can be displayed depending on whether the value is within a certain threshold.
Number of Decimal Places to Display
Two functions that set a fixed number of decimal places (specifically
1) are format_fraction_fixed_dp()
and
format_count_fraction_fixed_dp()
. By default, formatting
functions will remove trailing zeros, but these two functions will
always have one decimal place in their percentage, even if the digit is
a zero. See the following example:
format_fraction_fixed_dp(x = c(num = 1L, denom = 3L))
#> [1] "1/3 (33.3%)"
format_fraction_fixed_dp(x = c(num = 1L, denom = 2L))
#> [1] "1/2 (50.0%)"
format_count_fraction_fixed_dp(x = c(2, 0.6667))
#> [1] "2 (66.7%)"
format_count_fraction_fixed_dp(x = c(2, 0.25))
#> [1] "2 (25.0%)"
Value Thresholds
Functions that set custom values according to a certain threshold
include format_extreme_values()
,
format_extreme_values_ci()
, and
format_fraction_threshold()
. The extreme value formats work
similarly to allow the user to specify the maximum number of digits to
include, and very large or very small values are given a special string
value. For example:
extreme_format <- format_extreme_values(digits = 2)
extreme_format(0.235)
#> [1] "0.23"
extreme_format(0.001)
#> [1] "<0.01"
extreme_format(Inf)
#> [1] ">999.99"
The format_fraction_threshold()
function allows the user
to specify a lower percentage threshold, below which values are instead
assigned a special string value. For example:
fraction_format <- format_fraction_threshold(0.05)
fraction_format(x = c(20, 0.1))
#> [1] 10
fraction_format(x = c(2, 0.01))
#> [1] "<5"
See the documentation on each function for specific details on their behavior and how to customize them.
Creating Custom Formatting Functions
If your table requires customized output that cannot be displayed
using one of the pre-existing tern
formatting functions,
you may want to consider creating a new formatting function. When
creating your own formatting function it is important to consider the
aspects listed in the Formatting Function Customization section
above.
In this section we will create a custom formatting function derived
from the format_fraction_fixed_dp()
function. First we will
take a look at this function in detail and then we will customize
it.
# First we will see how the format_fraction_fixed_dp code works and displays the outputs
format_fraction_fixed_dp <- function(x, ...) {
attr(x, "label") <- NULL
checkmate::assert_vector(x)
checkmate::assert_count(x["num"])
checkmate::assert_count(x["denom"])
result <- if (x["num"] == 0) {
paste0(x["num"], "/", x["denom"])
} else {
paste0(
x["num"], "/", x["denom"],
" (", sprintf("%.1f", round(x["num"] / x["denom"] * 100, 1)), "%)"
)
}
return(result)
}
Here we see that if the numerator value is greater than 0, the fraction and percentage is displayed. If the numerator is 0, only the fraction is shown. Percent values always display 1 decimal place. Below we will create a dummy dataset and then observe the output value behavior when this formatting function is applied.
df2 <- data.frame(
ID = as.character(c(1, 1, 2, 2)),
RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
ONTRTFL = c("", "Y", "", "Y"),
stringsAsFactors = FALSE
) %>%
filter(ONTRTFL == "Y")
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = format_fraction_fixed_dp)
) %>%
build_table(df2)
#> all obs
#> ———————————————————
#> low 2/2 (100.0%)
#> high 0/2
Now we will modify this function to make our custom formatting
function, custom_format
. We want to display 3 decimal
places in the percent value, and if the numerator value is 0 we only
want to display a 0 value (without the denominator).
custom_format <- function(x, ...) {
attr(x, "label") <- NULL
checkmate::assert_vector(x)
checkmate::assert_count(x["num"])
checkmate::assert_count(x["denom"])
result <- if (x["num"] == 0) {
paste0(x["num"]) # We remove the denominator on this line so that only a 0 is displayed
} else {
paste0(
x["num"], "/", x["denom"],
" (", sprintf("%.3f", round(x["num"] / x["denom"] * 100, 1)), "%)" # We include 3 decimal places with %.3f
)
}
return(result)
}
basic_table() %>%
count_abnormal(
var = "RANGE",
abnormal = list(low = "LOW", high = "HIGH"),
variables = list(id = "ID", baseline = "BL_RANGE"),
exclude_base_abn = FALSE,
.formats = list(fraction = custom_format) # Here we implement our new custom_format function
) %>%
build_table(df2)
#> all obs
#> —————————————————————
#> low 2/2 (100.000%)
#> high 0
Summary
Each tern
analysis function has pre-specified default
format functions to implement when generating output, some of which are
taken from the formatters
package and some of which are
custom formatting functions stored in tern
. These
tern
functions differ compared to those from
formatters
in that logical statements can be used to set
value-dependent customized formats. If you would like to create your own
custom formatting function to use with tern
, be sure to
carefully consider which rules you want to implement to handle different
input values.