risks: Estimating risk ratios and risk differences using regression

Installation

The risks package can be installed from CRAN:

install.packages("risks")

Development versions can be installed from GitHub using:

remotes::install_github("stopsack/risks")

Summary

The risks package fits regression models for risk ratios (RR) and risk differences (RD). The package refers to “risk,” but “prevalence” can be substituted throughout.

What is the association between an exposure (smoker/nonsmoker, age in years, or underweight/lean/overweight/obese) and the risk of a binary outcome (dead/alive, disease/healthy), perhaps adjusting for confounders (men/women, years of education)? For such questions, many studies default to reporting odds ratios, which may exaggerate associations when the outcome is common. Odds ratios are often used because they are easily obtained from logistic regression models. Obtaining risk ratios or risk differences, especially when adjusting for confounders, has typically required more advanced biostatistics and programming skills, including in R.
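
To make the point about common outcomes concrete, here is a small worked example in base R. The counts are hypothetical, chosen only for illustration and not taken from any study: 60 of 100 exposed and 30 of 100 unexposed individuals experience the outcome.

```r
# Hypothetical 2x2 counts, for illustration only
events_exposed   <- 60
n_exposed        <- 100
events_unexposed <- 30
n_unexposed      <- 100

risk_exposed   <- events_exposed / n_exposed       # 0.6
risk_unexposed <- events_unexposed / n_unexposed   # 0.3

risk_ratio      <- risk_exposed / risk_unexposed   # 2
risk_difference <- risk_exposed - risk_unexposed   # 0.3

# Odds = risk / (1 - risk); the odds ratio compares the two odds
odds_ratio <- (risk_exposed / (1 - risk_exposed)) /
  (risk_unexposed / (1 - risk_unexposed))          # 3.5

round(c(RR = risk_ratio, RD = risk_difference, OR = odds_ratio), 2)
```

With an outcome this common, the odds ratio (3.5) is considerably larger than the risk ratio (2), although both describe the same data.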

The risks package makes estimating adjusted risk ratios and risk differences as simple as fitting a logistic regression model. No advanced programming or biostatistics skills are required. Risk ratios or risk differences are returned whenever the data would allow for fitting a logistic model.

Basic example

The example data stem from a cohort of women with breast cancer. The categorical exposure is stage, the binary outcome is death, and the binary confounder is receptor.

Fit a risk difference model:

library(risks)  # provides riskratio(), riskdiff(), postestimation functions
fit <- riskdiff(formula = death ~ stage + receptor, data = breastcancer)

Fitted objects work with the usual commands for generalized linear models, such as:

summary(fit)
#> 
#> Risk difference model, fitted via marginal standardization of a logistic model with delta method (margstd_delta).
#> Call:
#> stats::glm(formula = death ~ stage + receptor, family = binomial(link = "logit"), 
#>     data = breastcancer, start = "(no starting values)")
#> 
#> Deviance Residuals: 
#>    Min      1Q  Median      3Q     Max  
#>                                         
#> 
#> Coefficients: (3 not defined because of singularities)
#>                Estimate Std. Error z value Pr(>|z|)    
#> stageStage I    0.00000    0.00000     NaN      NaN    
#> stageStage II   0.16303    0.05964   2.734  0.00626 ** 
#> stageStage III  0.57097    0.09962   5.732 9.95e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 228.15  on 191  degrees of freedom
#> Residual deviance: 185.88  on 188  degrees of freedom
#> AIC: 193.88
#> 
#> Number of Fisher Scoring iterations: 4
#> 
#> Confidence intervals for coefficients: (delta method)
#>                     2.5 %    97.5 %
#> stageStage I   0.00000000 0.0000000
#> stageStage II  0.04614515 0.2799187
#> stageStage III 0.37571719 0.7662158
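
The header of the summary output names the estimation approach: marginal standardization of a logistic model. As a rough illustration of that general idea only, and not of the package's internal code, the base R sketch below fits a logistic model to simulated data and averages the predicted risks with everyone set to exposed and then to unexposed. All variable names and effect sizes are hypothetical, and the delta-method confidence intervals are omitted.

```r
set.seed(1)
n <- 500

# Simulated data; variable names and effect sizes are hypothetical
confounder <- rbinom(n, size = 1, prob = 0.5)
exposure   <- rbinom(n, size = 1, prob = 0.3 + 0.2 * confounder)
outcome    <- rbinom(n, size = 1,
                     prob = plogis(-1 + 0.8 * exposure + 0.5 * confounder))
dat <- data.frame(outcome, exposure, confounder)

# Step 1: ordinary logistic regression
logit_fit <- glm(outcome ~ exposure + confounder,
                 family = binomial(link = "logit"), data = dat)

# Step 2: predict each individual's risk under both exposure levels,
# keeping the observed confounder values, then average over the sample
risk_if_exposed <- mean(predict(logit_fit,
                                newdata = transform(dat, exposure = 1),
                                type = "response"))
risk_if_unexposed <- mean(predict(logit_fit,
                                  newdata = transform(dat, exposure = 0),
                                  type = "response"))

# Step 3: contrast the standardized risks
c(RD = risk_if_exposed - risk_if_unexposed,
  RR = risk_if_exposed / risk_if_unexposed)
```

Because the contrasts are built from averaged predicted risks, they are risk differences and risk ratios even though the underlying model is logistic.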

tidy() from the broom package provides easy access to coefficients:

broom::tidy(fit)
#> # A tibble: 3 × 8
#>   term           estimate std.error statistic   p.value conf.low conf.high model
#>   <chr>             <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl> <chr>
#> 1 stageStage I      0        0         NaN    NaN         0          0     marg…
#> 2 stageStage II     0.163    0.0596      2.73   6.26e-3   0.0461     0.280 marg…
#> 3 stageStage III    0.571    0.0996      5.73   9.95e-9   0.376      0.766 marg…
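
The same workflow should carry over to risk ratios. The sketch below assumes that riskratio(), named in the library(risks) comment above, takes the same formula and data arguments as riskdiff(), and that its coefficients are on the log scale, as in a log-link generalized linear model. Check the package documentation before relying on either assumption.

```r
library(risks)

# Assumption: riskratio() mirrors the riskdiff() call shown earlier
fit_rr <- riskratio(formula = death ~ stage + receptor, data = breastcancer)

# Assumption: coefficients are log risk ratios, so exponentiate them
exp(coef(fit_rr))
```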
