Adoption of Electronic Health Record (EHR) systems is growing rapidly in the U.S. As a result, very large quantities of patient clinical data are becoming available in electronic format, which brings tremendous potential along with growing concern about breaches of patient confidentiality. Secondary use of clinical data is essential to fulfill the potential for personalized healthcare, improved healthcare management, and effective clinical research. De-identification of patient data has been proposed as a way both to facilitate secondary use of clinical data and to protect patient confidentiality. Most clinical data in the EHR is recorded as narrative clinical notes, and de-identifying clinical text manually is tedious and costly. Automated approaches based on Natural Language Processing have been implemented and evaluated; they allow much faster de-identification than manual approaches, with comparable protection. Despite these advances, health care providers have been reluctant to use automated de-identification to share clinical data for secondary use. On the one hand, Institutional Review Boards (IRBs) and privacy officers are wary of the possible risks; on the other hand, researchers are concerned that the de-identification process may obscure critical medical information.

Learning Objective 1: Contrast characteristics and challenges of clinical text de-identification.

Learning Objective 2: Share experiences and ideas for improved quality, acceptance, and use of text de-identification.


Stephane Meystre (Presenter)
Medical University of South Carolina

David Carrell (Presenter)
Kaiser Permanente Washington Health Research Institute

Lynette Hirschman (Presenter)
The MITRE Corporation

John Aberdeen (Presenter)
The MITRE Corporation

Paul Fearn (Presenter)
Surveillance Research Program, National Cancer Institute

Valentina Petkov (Presenter)
Surveillance Research Program, National Cancer Institute

Jonathan Silverstein (Presenter)
University of Pittsburgh School of Medicine
