Calculate descriptive summary statistics for the numeric variables in a dataset. Optionally, the output can be stratified by one or more categorical variables.
tsummary(data, ..., by = NULL, na.rm = TRUE)
data
Data frame (tibble).
...
Optional. Variables to summarize. If not provided, all numeric variables will be summarized. Supports tidy evaluation; see examples.
by
Optional. Categorical variable(s) to stratify results by.
na.rm
Optional. Drop missing values before computing summary statistics? If set to FALSE, summary statistics may be missing in the presence of missing values. Defaults to TRUE.
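The effect of dropping missing values can be illustrated with base R alone. This is a minimal sketch of the general behavior, not of tsummary() internals; the vector x is a made-up example.

```r
# Hypothetical vector with one missing value
x <- c(10.4, NA, 21.5)

# Without dropping missing values, the statistic itself is missing
with_na <- mean(x, na.rm = FALSE)

# With missing values dropped, the statistic uses the remaining values
without_na <- mean(x, na.rm = TRUE)

# Count of non-missing observations, analogous to the obs column
n_obs <- sum(!is.na(x))

print(c(with_na = with_na, without_na = without_na, n_obs = n_obs))
```

Keeping the default of TRUE avoids statistics that are NA solely because a few observations are missing, while obs still reports how many values were actually used.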
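Tibble, possibly grouped, with one row per summarized variable (named in the variable column) and per stratum when by is used, and the following columns: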
rows
Row count
obs
Count of non-missing observations
distin
Count of distinct values
min
Minimum value
q25
25th percentile
median
Median, 50th percentile
q75
75th percentile
max
Maximum value
mean
Mean
sd
Standard deviation
sum
Sum of all values
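The columns listed above can be approximated by hand for a single variable. The sketch below uses base R on mtcars$mpg and assumes the default quantile method and that missing values are excluded from the distinct count; the actual implementation of tsummary() may differ.

```r
# Base R sketch of the documented columns for one variable (mpg).
# Assumptions: default quantile() type, NA excluded from 'distin'.
x <- mtcars$mpg

mpg_stats <- data.frame(
  variable = "mpg",
  rows     = length(x),                        # row count
  obs      = sum(!is.na(x)),                   # non-missing observations
  distin   = length(unique(x[!is.na(x)])),     # distinct values
  min      = min(x, na.rm = TRUE),
  q25      = unname(quantile(x, 0.25, na.rm = TRUE)),
  median   = median(x, na.rm = TRUE),
  q75      = unname(quantile(x, 0.75, na.rm = TRUE)),
  max      = max(x, na.rm = TRUE),
  mean     = mean(x, na.rm = TRUE),
  sd       = sd(x, na.rm = TRUE),
  sum      = sum(x, na.rm = TRUE)
)

print(mpg_stats)
```

The resulting values can be compared against the mpg row of the tsummary() output in the examples.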
data(mtcars)
mtcars %>%
  tsummary()
#> # A tibble: 11 × 12
#>    variable  rows   obs distin   min    q25 median    q75    max    mean      sd
#>    <chr>    <int> <int>  <int> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>   <dbl>
#>  1 am          32    32      2  0      0      0      1      1      0.406   0.499
#>  2 carb        32    32      6  1      2      2      4      8      2.81    1.62
#>  3 cyl         32    32      3  4      4      6      8      8      6.19    1.79
#>  4 disp        32    32     27 71.1  121.   196.   326    472    231.    124.
#>  5 drat        32    32     22  2.76   3.08   3.70   3.92   4.93   3.60    0.535
#>  6 gear        32    32      3  3      3      4      4      5      3.69    0.738
#>  7 hp          32    32     22 52     96.5  123    180    335    147.     68.6
#>  8 mpg         32    32     25 10.4   15.4   19.2   22.8   33.9   20.1     6.03
#>  9 qsec        32    32     30 14.5   16.9   17.7   18.9   22.9   17.8     1.79
#> 10 vs          32    32      2  0      0      0      1      1      0.438   0.504
#> 11 wt          32    32     29  1.51   2.58   3.32   3.61   5.42   3.22    0.978
#> # ℹ 1 more variable: sum <dbl>
# Select specific variables and
# remove some summary statistics:
mtcars %>%
  tsummary(mpg, cyl, hp, am, gear, carb) %>%
  dplyr::select(-mean, -sd, -sum)
#> # A tibble: 6 × 9
#>    variable  rows   obs distin   min   q25 median   q75   max
#>    <chr>    <int> <int>  <int> <dbl> <dbl>  <dbl> <dbl> <dbl>
#>  1 am          32    32      2   0     0      0     1     1
#>  2 carb        32    32      6   1     2      2     4     8
#>  3 cyl         32    32      3   4     4      6     8     8
#>  4 gear        32    32      3   3     3      4     4     5
#>  5 hp          32    32     22  52    96.5  123   180   335
#>  6 mpg         32    32     25  10.4  15.4   19.2  22.8  33.9
# Stratify by 'gear':
mtcars %>%
  tsummary(mpg, hp, carb, by = gear)
#> # A tibble: 9 × 13
#> # Groups:   variable [3]
#>   variable  gear  rows   obs distin   min   q25 median   q75   max   mean     sd
#>   <chr>    <dbl> <int> <int>  <int> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl>
#> 1 carb         3    15    15      4   1     2      3     4     4     2.67   1.18
#> 2 carb         4    12    12      3   1     1      2     4     4     2.33   1.30
#> 3 carb         5     5     5      4   2     2      4     6     8     4.4    2.61
#> 4 hp           3    15    15     10  97   150    180   210   245   176.    47.7
#> 5 hp           4    12    12      9  52    65.8   94   110   123    89.5   25.9
#> 6 hp           5     5     5      5  91   113    175   264   335   196.   103.
#> 7 mpg          3    15    15     13  10.4  14.5   15.5  18.4  21.5  16.1    3.37
#> 8 mpg          4    12    12     10  17.8  21     22.8  28.1  33.9  24.5    5.28
#> 9 mpg          5     5     5      5  15    15.8   19.7  26    30.4  21.4    6.66
#> # ℹ 1 more variable: sum <dbl>
# Stratify by 'gear' and 'am':
mtcars %>%
  tsummary(mpg, hp, carb, by = c(am, gear))
#> # A tibble: 12 × 14
#> # Groups:   variable, am [6]
#>    variable    am  gear  rows   obs distin   min   q25 median   q75   max   mean
#>    <chr>    <dbl> <dbl> <int> <int>  <int> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>
#>  1 carb         0     3    15    15      4   1     2      3     4     4     2.67
#>  2 carb         0     4     4     4      2   2     2      3     4     4     3
#>  3 carb         1     4     8     8      3   1     1      1.5   2.5   4     2
#>  4 carb         1     5     5     5      4   2     2      4     6     8     4.4
#>  5 hp           0     3    15    15     10  97   150    180   210   245   176.
#>  6 hp           0     4     4     4      3  62    86.8  109   123   123   101.
#>  7 hp           1     4     8     8      6  52    65.8   79.5 109.  110    83.9
#>  8 hp           1     5     5     5      5  91   113    175   264   335   196.
#>  9 mpg          0     3    15    15     13  10.4  14.5   15.5  18.4  21.5  16.1
#> 10 mpg          0     4     4     4      4  17.8  18.8   21    23.2  24.4  21.0
#> 11 mpg          1     4     8     8      7  21    21.3   25.0  30.9  33.9  26.3
#> 12 mpg          1     5     5     5      5  15    15.8   19.7  26    30.4  21.4
#> # ℹ 2 more variables: sd <dbl>, sum <dbl>
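As a rough cross-check of the stratified examples, the group means can be recomputed with base R. This sketch uses aggregate() rather than tsummary(), so it only illustrates what stratification means and is not the package's own method.

```r
# Base R cross-check of stratified means (tsummary() is not used here).

# Mean mpg per level of gear; compare with the mean column for mpg
# in the 'by = gear' example.
by_gear <- aggregate(mpg ~ gear, data = mtcars, FUN = mean)
print(by_gear)

# Mean mpg per observed combination of am and gear; compare with the
# 'by = c(am, gear)' example. Only combinations present in the data appear.
by_am_gear <- aggregate(mpg ~ am + gear, data = mtcars, FUN = mean)
print(by_am_gear)
```

Each stratum yields one row per summarized variable, which is why three variables across three gear levels give nine rows, and across the four observed am and gear combinations give twelve.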