Deidentifies the dataset for analysis and sharing:

  • Drops all dates, keeping time intervals only.

  • Rounds age (in years) and PSA to 1 decimal place each.

  • Replaces the patient ID with a sequential index number that still allows merging the pts, smp, and trt datasets.

  • Removes free text with potentially identifying information (treatments).
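
The four steps above can be illustrated on toy data. This is a minimal base R sketch of the general technique, not the package's actual implementation; every column name except ptid (dob, dxdate, psa, smpdate, trt_start, trt_name), the derived variables, and the helper drop_dates are hypothetical.

```r
# Toy stand-ins for the pts, smp, and trt datasets (hypothetical columns)
pts <- data.frame(
  ptid   = c("MRN-0042", "MRN-0007"),
  dob    = as.Date(c("1950-03-15", "1946-11-02")),
  dxdate = as.Date(c("2015-06-01", "2012-01-20")),
  psa    = c(7.846, 12.31),
  stringsAsFactors = FALSE
)
smp <- data.frame(
  ptid    = c("MRN-0007", "MRN-0042", "MRN-0042"),
  smpdate = as.Date(c("2012-02-19", "2015-06-11", "2016-06-01")),
  stringsAsFactors = FALSE
)
trt <- data.frame(
  ptid      = "MRN-0042",
  trt_start = as.Date("2015-07-01"),
  trt_name  = "ADT, started at Dr. Smith's clinic",
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop every column of class Date
drop_dates <- function(df) {
  df[, !vapply(df, inherits, logical(1), what = "Date"), drop = FALSE]
}

# Look up each row's diagnosis date while the original IDs are still present
dx_smp <- pts$dxdate[match(smp$ptid, pts$ptid)]
dx_trt <- pts$dxdate[match(trt$ptid, pts$ptid)]

# 1. Keep time intervals (computed before the dates are dropped)
smp$dx_to_smp_days <- as.numeric(smp$smpdate - dx_smp)
trt$dx_to_trt_days <- as.numeric(trt$trt_start - dx_trt)

# 2. Round age (years) and PSA to 1 decimal
pts$age_dx <- round(as.numeric(pts$dxdate - pts$dob) / 365.25, 1)
pts$psa    <- round(pts$psa, 1)

# 3. Replace the patient ID with a sequential index shared by all datasets
orig_ids <- pts$ptid
pts$ptid <- match(pts$ptid, orig_ids)
smp$ptid <- match(smp$ptid, orig_ids)
trt$ptid <- match(trt$ptid, orig_ids)

# 4. Remove free text, then drop all remaining date columns
trt$trt_name <- NULL
pts <- drop_dates(pts)
smp <- drop_dates(smp)
trt <- drop_dates(trt)
```

Because the same lookup vector is used for all three data frames, a merge such as merge(pts, smp, by = "ptid") still links each sample to the right patient after the original IDs are gone.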

By default, load_prostate_redcap already calls this function automatically.

deidentify_prostate_redcap(data, trt_freetext = TRUE, extra = NULL)

Arguments

data

List with elements pts, smp, and trt, as generated by load_prostate_redcap.

trt_freetext

Optional. Remove free-text treatment names? Defaults to TRUE.

extra

Optional. Tibble with an additional data set, or list of additional data sets, in which a ptid variable of type character should be replaced with the same sequential ID as in pts. Any other deidentification of these extra data sets (for example, removing dates or free text) is not performed and must be done separately if necessary. Defaults to NULL (none).
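
The ID replacement described for extra can be pictured as a lookup against the original patient IDs. The following base R sketch only illustrates that idea under assumptions; pts_ids, the biomarker column, and the toy values are hypothetical, and how the function itself treats IDs that do not occur in pts is not documented here (in this sketch they become NA).

```r
# Original character IDs, in the order that defines the sequential index
# (hypothetical values)
pts_ids <- c("MRN-0042", "MRN-0007")

# An additional data set with a character ptid column
extra <- data.frame(
  ptid      = c("MRN-0007", "MRN-0042", "MRN-0007", "MRN-9999"),
  biomarker = c(0.4, 1.2, 0.9, 2.1),
  stringsAsFactors = FALSE
)

# Replace each character ID with its position in pts_ids;
# IDs absent from pts_ids yield NA in this sketch
extra$ptid <- match(extra$ptid, pts_ids)
```

Only ptid changes; the biomarker column is left untouched, which mirrors the note that other deidentification steps for extra data sets are the caller's responsibility.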

Value

List:

  • pts: Deidentified patient-level data.

  • smp: Deidentified sample-level data.

  • trt: Deidentified treatment data.

  • ext: List of additional data sets with patient IDs replaced, if extra was provided; otherwise empty (NULL).