This function displays descriptive and inferential results for binary, continuous, and survival data in the format of a table stratified by exposure and, if requested, by effect modifiers.
This function is intended only for tabulations of final results. Model diagnostics for regression models need to be conducted separately.
Usage
rifttable(
  design,
  data,
  layout = "rows",
  factor = 1000,
  risk_percent = FALSE,
  risk_digits = dplyr::if_else(risk_percent == TRUE, true = 0, false = 2),
  diff_digits = 2,
  ratio_digits = 2,
  ratio_digits_decrease = c(`2.995` = -1, `9.95` = -2),
  rate_digits = 1,
  to = ", ",
  reference = "(reference)",
  type2_layout = "rows",
  overall = FALSE,
  exposure_levels = c("noempty", "nona", "all")
)
Arguments
- design
Design matrix (data frame) that sets up the table. See Details. Must be provided.
- data
Dataset to be used for all analyses. Must be provided unless the design was generated by table1_design.
- layout
Optional. "rows" uses the design as rows and exposure categories as columns. "cols" is the opposite: design as columns and exposure categories as rows. Defaults to "rows".
- factor
Optional. Used for type = "rate": Factor to multiply events per person-time by. Defaults to 1000.
- risk_percent
Optional. Show risk and risk difference estimates in percentage points instead of proportions. Defaults to FALSE unless the design was generated by table1_design. In this latter case, if risk_percent is not provided, it will default to TRUE.
- risk_digits
Optional. Number of decimal digits to show for risks/cumulative incidence. Defaults to 2 for risk_percent = FALSE and to 0 for risk_percent = TRUE. Can override for each line in type.
- diff_digits
Optional. Number of decimal digits to show for means and mean difference estimates. Defaults to 2.
- ratio_digits
Optional. Number of decimal digits to show for ratio estimates. Defaults to 2. Can override for each line in type.
- ratio_digits_decrease
Optional. Lower limits of ratios above which fewer digits should be shown. Provide a named vector of the format c(`3` = -1, `10` = -2) to reduce the number of rounding digits by 1 digit for ratios greater than 3 and by 2 digits for ratios greater than 10. The default shown under Usage, c(`2.995` = -1, `9.95` = -2), uses limits just below 3 and 10. To disable, set to NULL.
- rate_digits
Optional. Number of decimal digits to show for rates. Defaults to 1. Can override for each line in type.
- to
Optional. Separator between the lower and the upper bound of the 95% confidence interval (and interquartile range for medians). Defaults to ", ".
- reference
Optional. Defaults to "(reference)". Alternative label for the reference category.
- type2_layout
Optional. If a second estimate is requested via type2 in the design matrix, display it in rows below ("rows") or in columns to the right ("columns"). Defaults to "rows".
- overall
Optional. Defaults to FALSE. Add a first column with unstratified estimates to an exposure-stratified table? Elements will be shown only for absolute estimates (e.g., type = "mean") and blank for comparative estimates (e.g., mean difference via type = "diff").
- exposure_levels
Optional. Defaults to "noempty". Show only exposure levels that exist in the data or are "NA" ("noempty"); show only exposure levels that are neither "NA" nor empty ("nona"); or show all exposure levels ("all"), even if "NA" or a factor level that does not exist in the data.
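As a rough illustration of how ratio_digits and ratio_digits_decrease could interact, the sketch below picks the number of decimal digits for a given ratio. The helper digits_for_ratio() is hypothetical and not part of rifttable; the strict "greater than" comparison and the floor at zero digits are assumptions rather than the package's confirmed internals.

```r
# Hypothetical helper (not part of rifttable): choose the number of
# decimal digits for a ratio, given lower limits and digit reductions.
digits_for_ratio <- function(ratio,
                             ratio_digits = 2,
                             decrease = c(`2.995` = -1, `9.95` = -2)) {
  # NULL disables the reduction, as described for ratio_digits_decrease
  if (is.null(decrease)) {
    return(ratio_digits)
  }
  limits <- as.numeric(names(decrease))
  # Reductions whose lower limit the ratio exceeds (assumed strict ">")
  passed <- decrease[ratio > limits]
  if (length(passed) == 0) {
    return(ratio_digits)
  }
  # Apply the largest reduction; never go below zero digits (assumption)
  max(0, ratio_digits + min(passed))
}

ratios <- c(0.73, 3.2, 12)
digits <- vapply(ratios, digits_for_ratio, numeric(1))

# Format each ratio with its own number of decimal digits
mapply(function(r, d) formatC(r, digits = d, format = "f"), ratios, digits)
```

Under these assumptions, small ratios keep two decimal digits, ratios above roughly 3 keep one, and ratios above roughly 10 are shown without decimals.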
Value
Tibble. Get formatted output as a gt table by passing on to rt_gt.
Details
The main input parameter is the dataset design. Always required are the columns type (the type of requested statistic, see below), as well as outcome for binary or continuous outcomes or time and event for survival outcomes:
- label
A label for each row (or column). If missing, type will be used as the label.
- exposure
Optional. The exposure variable. Must be categorical (factor or logical). If missing (NA), then an unstratified table with absolute estimates only will be returned.
- outcome
The outcome variable for non-survival data (i.e., whenever event and time are not used). For risk/prevalence data, this variable must be 0/1 or FALSE/TRUE.
- time
The time variable for survival data. Needed for, e.g., type = "hr" and type = "rate" (i.e., whenever outcome is not used).
- time2
The second time variable for late entry models. Only used in conjunction with time. If provided, time will become the entry time and time2 the exit time, following conventions of Surv.
- event
The event variable for survival data. Events are typically 1, censored observations 0. If competing events are present, censoring should be the first-ordered level, e.g., of a factor, and the level corresponding to the event of interest should be supplied as event = "event_variable@Recurrence" if "Recurrence" is the event of interest. The event variable is needed for, e.g., type = "hr" and type = "rate", i.e., whenever outcome is not used.
- trend
Optional. For regression models, a continuous representation of the exposure, for which a slope per one unit increase ("trend") will be estimated. Must be a numeric variable. If joint models for exposure and effect_modifier are requested, trends are still reported within each stratum of the effect_modifier. Use NA to leave blank.
- effect_modifier
Optional. A categorical effect modifier variable. Use NA to leave blank.
- stratum
Optional. A stratum of the effect modifier. Use NULL to leave blank. NA will evaluate observations with missing data for the effect_modifier.
- confounders
Optional. A string in the format "+ var1 + var2" that will be substituted into formula = exposure + confounders. Use NA or "" (empty string) to leave blank; the default. For Cox models, can add "+ strata(site)" to obtain models with stratification by, e.g., site. For Poisson models, can add "+ offset(log(persontime))" to define, e.g., persontime as the offset.
- weights
Optional. Variable with weights, for example inverse-probability weights. Used by comparative survival estimators (e.g., type = "hr" and type = "cumincdiff") as well as type = "cuminc" and type = "surv". They are ignored by other estimators.
- type
The statistic requested (case-insensitive):
Comparative estimates with 95% confidence intervals:
  - "hr": Hazard ratio from Cox proportional hazards regression.
  - "irr": Incidence rate ratio for count outcomes from Poisson regression model.
  - "irrrob": Ratio for other outcomes from Poisson regression model with robust (sandwich) standard errors.
  - "rr": Risk ratio (or prevalence ratio) from riskratio. Can request specific model fitting approach and, for marginal standardization only, the number of bootstrap repeats. Examples: "rrglm_start" or "rrmargstd 2000".
  - "rd": Risk difference (or prevalence difference) from riskdiff. Can request model fitting approach and bootstrap repeats as for "rr".
  - "diff": Mean difference from linear model.
  - "quantreg": Quantile difference from quantile regression using rq with method = "fn". By default, this is the difference in medians. For a different quantile, e.g., the 75th percentile, use "quantreg 0.75".
  - "fold": Fold change from generalized linear model with log link (i.e., ratio of arithmetic means).
  - "foldlog": Fold change from linear model after log transformation of the outcome (i.e., ratio of geometric means).
  - "or": Odds ratio from logistic regression.
  - "survdiff": Difference in survival from Kaplan-Meier estimator. Provide time horizon, e.g., "survdiff 2.5" to evaluate differences in survival at 2.5 years. Uses survdiff_ci.
  - "cumincdiff": Difference in cumulative incidence from the Kaplan-Meier estimator or, if competing risks are present, its generalized form, the Aalen-Johansen estimator. Provide time horizon, e.g., "cumincdiff 2.5" to evaluate differences in cumulative incidence at 2.5 years. Uses survdiff_ci.
  - "survratio": Ratio in survival from Kaplan-Meier estimator. Provide time horizon, e.g., "survratio 2.5" to evaluate the ratio of survival at 2.5 years. Uses survdiff_ci.
  - "cumincratio": Ratio in cumulative incidence from the Kaplan-Meier estimator or, if competing risks are present, its generalized form, the Aalen-Johansen estimator. Provide time horizon, e.g., "cumincratio 2.5" to evaluate the 2.5-year risk ratio. Uses survdiff_ci.
Absolute estimates per exposure category:
  - "events": Event count.
  - "time": Person-time.
  - "outcomes": Outcome count.
  - "total": Number of observations.
  - "events/time": Events slash person-time.
  - "events/total": Events slash number of observations.
  - "cases/controls": Cases and non-cases (events and non-events); useful for case-control studies.
  - "risk": Risk (or prevalence), calculated as a proportion, i.e., outcomes divided by number of observations. Change between display as proportion or percent using the parameter risk_percent.
  - "risk (ci)": Risk with 95% confidence interval (Wilson score interval for binomial proportions, see scoreci).
  - "cuminc": Cumulative incidence ("risk") from the Kaplan-Meier estimator or, if competing risks are present, its generalized form, the Aalen-Johansen estimator. Provide time point (e.g., 1.5-year cumulative incidence) using "cuminc 1.5". If no time point is provided, the cumulative incidence at end of follow-up is returned. Change between display as proportion or percent using the parameter risk_percent.
  - "cuminc (ci)": Cumulative incidence ("risk"), as above, with 95% confidence intervals (Greenwood standard errors with log transformation, the default of the survival package/survfit). Provide time point as in "cuminc".
  - "surv": Survival from the Kaplan-Meier estimator. Provide time point (e.g., 1.5-year survival) using "surv 1.5". If no time point is provided, returns survival at end of follow-up. Change between display as proportion or percent using the parameter risk_percent.
  - "surv (ci)": Survival from the Kaplan-Meier estimator with 95% confidence interval (Greenwood standard errors with log transformation, the default of the survival package/survfit). Provide time point as in "surv".
  - "rate": Event rate: event count divided by person-time, multiplied by factor.
  - "rate (ci)": Event rate with 95% confidence interval (Poisson-type large-sample interval).
  - "outcomes (risk)": A combination: Outcomes followed by risk in parentheses.
  - "outcomes/total (risk)": A combination: Outcomes slash total followed by risk in parentheses.
  - "events/time (rate)": A combination: Events slash time followed by rate in parentheses.
  - "medsurv": Median survival.
  - "medsurv (ci)": Median survival with 95% confidence interval.
  - "medfu": Median follow-up (reverse Kaplan-Meier), i.e., the median survival time when censoring is treated as the event.
  - "medfu (iqr)": Median and interquartile range for follow-up.
  - "maxfu": Maximum follow-up time.
  - "mean": Mean (arithmetic mean).
  - "mean (ci)": Mean and 95% CI.
  - "mean (sd)": Mean and standard deviation.
  - "geomean": Geometric mean.
  - "median": Median.
  - "median (iqr)": Median and interquartile range.
  - "range": Range: Minimum to maximum value.
  - "blank" or "": An empty line.
Custom: A custom function that must be available under the name estimate_my_function in order to be callable as type = "my_function".
By default, regression models will be fit separately for each stratum of the effect_modifier. Append "_joint" to "hr", "rr", "rd", "irr", "irrrob", "diff", "fold", "foldlog", "quantreg", or "or" to obtain "joint" models for exposure and effect modifier that have a single reference category. Example: type = "hr_joint". The reference categories for exposure and effect modifier are their first factor levels, which can be changed using fct_relevel from the forcats package. Note that the joint model will be fit across all non-missing (non-NA) strata of the effect modifier, even if the design table does not request all strata be shown.
- type2
Optional. A second statistic that is added in an adjacent row or column (global option type2_layout defaults to "rows" and can alternatively be set to "columns"). For example, use type = "events/time", type2 = "hr" to get both event counts/person-time and hazard ratios for the same data, exposure, stratum, confounders, and outcome.
- digits
Optional. The number of digits for rounding an individual line. Defaults to NA, where the number of digits will be determined based on rifttable's arguments risk_percent, risk_digits, diff_digits, ratio_digits, or rate_digits, as applicable.
- digits2
Optional. As digits, for the second estimate (type2).
- nmin
Optional. Suppress estimates with "--" if a cell defined by exposure, and possibly the effect modifier, contains fewer observations or, for survival analyses, fewer events than nmin. Defaults to NA, i.e., to print all estimates.
- na_rm
Optional. Exclude observations with missing outcome. Defaults to FALSE. Use with caution.
- ci
Optional. Confidence level. Defaults to 0.95.
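The "risk (ci)" type listed above reports a Wilson score interval. A minimal base R sketch of that interval follows, assuming a two-sided interval at the given confidence level. wilson_ci() is a hypothetical helper written for illustration; it is not the scoreci function referenced above and may differ from it in details.

```r
# Hypothetical helper: Wilson score interval for a binomial proportion
wilson_ci <- function(outcomes, total, ci = 0.95) {
  z <- stats::qnorm(1 - (1 - ci) / 2)
  p <- outcomes / total
  denom <- 1 + z^2 / total
  # Center and half-width of the score interval
  center <- (p + z^2 / (2 * total)) / denom
  half <- z * sqrt(p * (1 - p) / total + z^2 / (4 * total^2)) / denom
  c(lower = center - half, upper = center + half)
}

# 112 outcomes among 138 men, as in Example 1 below
round(wilson_ci(outcomes = 112, total = 138), 2)
# lower 0.74, upper 0.87
```

These bounds agree with the "Risk (CI)" row for men in Example 1.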
Use tibble, tribble, and mutate to construct the design dataset, especially variables that are used repeatedly (e.g., exposure, time, event, or outcome). See examples.
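A small sketch of this pattern, using the per-line digits and nmin columns described above. The labels and the specific values for digits and nmin are illustrative assumptions; the variable names (sex, status, time, age) are those of the prepared cancer data in the Examples.

```r
library(dplyr)  # provides %>% and mutate()

design_sketch <- tibble::tribble(
  # Elements that vary by row:
  ~label,                ~type,         ~confounders, ~digits, ~nmin,
  "Events/person-years", "events/time", "",           NA,      NA,
  "Rate (95% CI)",       "rate (ci)",   "",           1,       NA,
  "Unadjusted HR",       "hr",          "",           NA,      5,
  "Age-adjusted HR",     "hr",          "+ age",      3,       5
) %>%
  # Elements that are the same for all rows:
  dplyr::mutate(
    exposure = "sex",
    event = "status",
    time = "time"
  )

design_sketch
```

Such a design would then be passed on as rifttable(design = design_sketch, data = cancer), with cancer prepared as in the Examples.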
If regression models cannot provide estimates in a stratum, e.g., because there are no events, then "--" will be printed. Accompanying warnings need to be suppressed manually, if appropriate, using suppressWarnings(rifttable(...)).
References
Greenland S, Rothman KJ (2008). Introduction to Categorical Statistics. In: Rothman KJ, Greenland S, Lash TL. Modern Epidemiology, 3rd edition. Philadelphia, PA: Lippincott Williams & Wilkins. Page 242. (Poisson/large-sample approximation for variance of incidence rates)
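The large-sample approximation cited above can be sketched in base R as follows. This assumes the interval is built on the log scale with a standard error of 1 / sqrt(events). That assumption appears consistent with the "rate (ci)" output in Example 2, where 82 events give limits about 1.24-fold below and above the rate, but it is not confirmed as the package's exact implementation. rate_ci() is a hypothetical helper, and the person-time in the call is illustrative.

```r
# Hypothetical helper: event rate with a large-sample confidence interval,
# computed on the log scale with SE(log rate) = 1 / sqrt(events)
rate_ci <- function(events, time, factor = 1000, ci = 0.95) {
  z <- stats::qnorm(1 - (1 - ci) / 2)
  rate <- events / time * factor
  # Multiplicative margin around the rate
  margin <- exp(z / sqrt(events))
  c(rate = rate, lower = rate / margin, upper = rate * margin)
}

# 82 events over an illustrative 70.4 person-years, per 1000 person-years
rate_ci(events = 82, time = 70.4, factor = 1000)
```

Because the margin depends only on the event count, the relative width of the interval stays the same regardless of the value chosen for factor.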
Examples
# Load 'cancer' dataset from survival package (Used in all examples)
data(cancer, package = "survival")
# The exposure (here, 'sex') must be categorical
cancer <- cancer %>%
  tibble::as_tibble() %>%
  dplyr::mutate(
    sex = factor(
      sex,
      levels = 1:2,
      labels = c("Male", "Female")),
    time = time / 365.25,
    status = status - 1)
# Example 1: Binary outcomes (use 'outcome' variable)
# Set table design
design1 <- tibble::tibble(
  label = c(
    "Outcomes",
    "Total",
    "Outcomes/Total",
    "Risk",
    "Risk (CI)",
    "Outcomes (Risk)",
    "Outcomes/Total (Risk)",
    "RR",
    "RD")) %>%
  dplyr::mutate(
    type = label,
    exposure = "sex",
    outcome = "status")
# Generate rifttable
rifttable(
  design = design1,
  data = cancer)
#> # A tibble: 9 × 3
#> sex Male Female
#> <chr> <chr> <chr>
#> 1 Outcomes 112 53
#> 2 Total 138 90
#> 3 Outcomes/Total 112/138 53/90
#> 4 Risk 0.81 0.59
#> 5 Risk (CI) 0.81 (0.74, 0.87) 0.59 (0.49, 0.68)
#> 6 Outcomes (Risk) 112 (0.81) 53 (0.59)
#> 7 Outcomes/Total (Risk) 112/138 (0.81) 53/90 (0.59)
#> 8 RR 1 (reference) 0.73 (0.60, 0.88)
#> 9 RD 0 (reference) -0.22 (-0.34, -0.10)
# Use 'design' as columns (selecting RR and RD only)
rifttable(
  design = design1 %>%
    dplyr::filter(label %in% c("RR", "RD")),
  data = cancer,
  layout = "cols")
#> # A tibble: 2 × 3
#> sex RR RD
#> <chr> <chr> <chr>
#> 1 Male 1 (reference) 0 (reference)
#> 2 Female 0.73 (0.60, 0.88) -0.22 (-0.34, -0.10)
# Example 2: Survival outcomes (use 'time' and 'event'),
# with an effect modifier and a confounder
# Set table design
design2 <- tibble::tribble(
  # Elements that vary by row:
  ~label,                      ~stratum, ~confounders, ~type,
  "**Overall**",               NULL,     "",           "blank",
  " Events",                   NULL,     "",           "events",
  " Person-years",             NULL,     "",           "time",
  " Rate/1000 py (95% CI)",    NULL,     "",           "rate (ci)",
  " Unadjusted HR (95% CI)",   NULL,     "",           "hr",
  " Age-adjusted HR (95% CI)", NULL,     "+ age",      "hr",
  "",                          NULL,     "",           "blank",
  "**Stratified models**",     NULL,     "",           "",
  "*ECOG PS1* (events/N)",     1,        "",           "events/total",
  " Unadjusted",               1,        "",           "hr",
  " Age-adjusted",             1,        "+ age",      "hr",
  "*ECOG PS2* (events/N)",     2,        "",           "events/total",
  " Unadjusted",               2,        "",           "hr",
  " Age-adjusted",             2,        "+ age",      "hr",
  "",                          NULL,     "",           "",
  "**Joint model**, age-adj.", NULL,     "",           "",
  " ECOG PS1",                 1,        "+ age",      "hr_joint",
  " ECOG PS2",                 2,        "+ age",      "hr_joint") %>%
  # Elements that are the same for all rows:
  dplyr::mutate(
    exposure = "sex",
    event = "status",
    time = "time",
    effect_modifier = "ph.ecog")
# Generate rifttable
rifttable(
  design = design2,
  data = cancer %>%
    dplyr::filter(ph.ecog %in% 1:2))
#> # A tibble: 18 × 3
#> sex Male Female
#> <chr> <chr> <chr>
#> 1 "**Overall**" "" ""
#> 2 " Events" "82" "44"
#> 3 " Person-years" "70" "59"
#> 4 " Rate/1000 py (95% CI)" "1164.8 (938.1, 1446.3)" "746.7 (555.7, 1003.4)"
#> 5 " Unadjusted HR (95% CI)" "1 (reference)" "0.60 (0.41, 0.86)"
#> 6 " Age-adjusted HR (95% CI)" "1 (reference)" "0.60 (0.41, 0.86)"
#> 7 "" "" ""
#> 8 "**Stratified models**" "" ""
#> 9 "*ECOG PS1* (events/N)" "54/71" "28/42"
#> 10 " Unadjusted" "1 (reference)" "0.53 (0.33, 0.85)"
#> 11 " Age-adjusted" "1 (reference)" "0.53 (0.33, 0.85)"
#> 12 "*ECOG PS2* (events/N)" "28/29" "16/21"
#> 13 " Unadjusted" "1 (reference)" "0.70 (0.37, 1.30)"
#> 14 " Age-adjusted" "1 (reference)" "0.68 (0.34, 1.36)"
#> 15 "" "" ""
#> 16 "**Joint model**, age-adj." "" ""
#> 17 " ECOG PS1" "1 (reference)" "0.55 (0.35, 0.88)"
#> 18 " ECOG PS2" "1.54 (0.98, 2.44)" "1.10 (0.62, 1.98)"
# Example 3: Get two estimates using 'type' and 'type2'
design3 <- tibble::tribble(
  ~label,     ~stratum, ~type,          ~type2,
  "ECOG PS1", 1,        "events/total", "hr",
  "ECOG PS2", 2,        "events/total", "hr") %>%
  dplyr::mutate(
    exposure = "sex",
    event = "status",
    time = "time",
    confounders = "+ age",
    effect_modifier = "ph.ecog")
rifttable(
  design = design3,
  data = cancer %>%
    dplyr::filter(ph.ecog %in% 1:2))
#> # A tibble: 4 × 3
#> sex Male Female
#> <chr> <chr> <chr>
#> 1 "ECOG PS1" 54/71 28/42
#> 2 "" 1 (reference) 0.53 (0.33, 0.85)
#> 3 "ECOG PS2" 28/29 16/21
#> 4 "" 1 (reference) 0.68 (0.34, 1.36)
rifttable(
  design = design3,
  data = cancer %>%
    dplyr::filter(ph.ecog %in% 1:2),
  layout = "cols",
  type2_layout = "cols")
#> # A tibble: 2 × 5
#> sex `ECOG PS1` `ECOG PS1 ` `ECOG PS2` `ECOG PS2 `
#> <chr> <chr> <chr> <chr> <chr>
#> 1 Male 54/71 1 (reference) 28/29 1 (reference)
#> 2 Female 28/42 0.53 (0.33, 0.85) 16/21 0.68 (0.34, 1.36)
# Example 4: Continuous outcomes (use 'outcome' variable);
# request rounding to 1 decimal digit in some cases;
# add continuous trend (slope per one unit of the 'trend' variable)
tibble::tribble(
  ~label,                   ~stratum, ~type,        ~digits,
  "Marginal mean (95% CI)", NULL,     "mean (ci)",  1,
  " Male",                  "Male",   "mean",       NA,
  " Female",                "Female", "mean",       NA,
  "",                       NULL,     "",           NA,
  "Stratified model",       NULL,     "",           NA,
  " Male",                  "Male",   "diff",       1,
  " Female",                "Female", "diff",       1,
  "",                       NULL,     "",           NA,
  "Joint model",            NULL,     "",           NA,
  " Male",                  "Male",   "diff_joint", NA,
  " Female",                "Female", "diff_joint", NA) %>%
  dplyr::mutate(
    exposure = "ph.ecog_factor",
    trend = "ph.ecog",
    outcome = "age",
    effect_modifier = "sex") %>%
  rifttable(
    data = cancer %>%
      dplyr::filter(ph.ecog < 3) %>%
      dplyr::mutate(ph.ecog_factor = factor(ph.ecog)))
#> # A tibble: 11 × 5
#> ph.ecog_factor `0` `1` `2` Trend
#> <chr> <chr> <chr> <chr> <chr>
#> 1 "Marginal mean (95% CI)" "61.2 (58.8, 63.5)" "61.5 (59.8, 63.1… "66.… ""
#> 2 " Male" "63.00" "62.79" "65.… ""
#> 3 " Female" "58.70" "59.19" "67.… ""
#> 4 "" "" "" "" ""
#> 5 "Stratified model" "" "" "" ""
#> 6 " Male" "0 (reference)" "-0.2 (-3.9, 3.5)" "2.0… "0.9…
#> 7 " Female" "0 (reference)" "0.5 (-3.5, 4.5)" "9.2… "4.4…
#> 8 "" "" "" "" ""
#> 9 "Joint model" "" "" "" ""
#> 10 " Male" "0 (reference)" "-0.21 (-3.75, 3.… "2.0… "0.9…
#> 11 " Female" "-4.30 (-8.70, 0.11)" "-3.81 (-7.74, 0.… "4.9… "4.3…
# Example 5: Get formatted output for Example 2 (see above)
if (FALSE) {
  rifttable(
    design = design2,
    data = cancer %>% dplyr::filter(ph.ecog %in% 1:2)) %>%
    rt_gt(md = 1) # get markdown formatting in first column ('label')
}