5PSQ-139 Using a text-mining approach to identify the context variables language barrier, living alone, cognitive frailty and non-adherence from electronic health records (EHRs)
S Ten Hoope¹, K Welvaars², M Everaars-Klok¹, S Van Schaik³, F Karapinar¹

¹OLVG Hospital, Clinical Pharmacy, Amsterdam, The Netherlands
²OLVG Hospital, Data Science, Amsterdam, The Netherlands
³OLVG Hospital, Neurology, Amsterdam, The Netherlands


Background and Importance Electronic Health Records (EHRs) contain free text fields such as clinical notes. These text fields frequently contain valuable information about the context of patients. Nevertheless, this information is often unused as text fields are time-consuming to read. The context variables language barrier, living alone, cognitive frailty and non-adherence are associated with unplanned hospital readmissions. Previous studies have not explored whether text-mining could help to identify these variables from free text.

Aim and Objectives The primary aim of this study was to identify the four context variables language barrier, living alone, cognitive frailty, and non-adherence from the EHRs using text-mining.

Material and Methods The study population was drawn from a database of n = 1,120 unplanned 30-day hospital readmissions at OLVG hospital. A manual standard was created by extracting information from clinical notes and categorising each patient for each variable (in duplicate). For the simple terms language barrier and living alone, a rule-based algorithm was used (see figure 1). For the more complex terms cognitive frailty and non-adherence, a Named Entity Recognition (NER) algorithm was used (see figure 2). Each algorithm was validated against the manual standard in up to five iterations, stopping once a high percentage agreement was achieved. The primary outcome was the percentage agreement and kappa value between the manual standard and the algorithm. Descriptive data analysis was used.
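The two outcome measures named above, percentage agreement and kappa, can be sketched as follows. This is a minimal illustration that assumes Cohen's kappa for two raters (the manual standard and the algorithm) and one label per patient; the abstract does not specify the kappa variant or the software used, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def percent_agreement(manual, predicted):
    """Fraction of patients for whom the two label lists agree."""
    if len(manual) != len(predicted) or len(manual) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(m == p for m, p in zip(manual, predicted))
    return matches / len(manual)


def cohens_kappa(manual, predicted):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(manual)
    p_observed = percent_agreement(manual, predicted)

    # Chance agreement from the marginal label frequencies of each rater.
    manual_counts = Counter(manual)
    predicted_counts = Counter(predicted)
    labels = set(manual_counts) | set(predicted_counts)
    p_expected = sum(manual_counts[l] * predicted_counts[l] for l in labels) / (n * n)

    if p_expected == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels (1 = variable present, 0 = absent), not study data.
manual_standard = [1, 1, 0, 0]
algorithm_output = [1, 0, 0, 0]
print(percent_agreement(manual_standard, algorithm_output))
print(cohens_kappa(manual_standard, algorithm_output))
```

Because kappa subtracts the agreement expected by chance, it can be much lower than the raw percentage agreement when one label dominates, which is why the two measures are reported together.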

Results The rule-based algorithm for language barrier had a percentage agreement of 96.8% and a kappa of 0.90. For living alone, the percentage agreement was 76.8% and the kappa 0.53. The NER model for cognitive frailty had a percentage agreement of 95.1% and a kappa of 0.83; for non-adherence, the percentage agreement was 91.9% and the kappa 0.37. Generally, the models overestimated the number of patients with a context variable (e.g. by flagging a family member with a language barrier rather than the patient themselves).
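The kind of overestimation described above can be illustrated with a toy rule-based matcher. This sketch is not the authors' algorithm (which is shown only in figure 1 of the original): the keyword lists, the English phrasing and both function names are hypothetical, and the sentence-level check for relatives is just one possible heuristic.

```python
import re

# Hypothetical keyword patterns; the study's actual rules are not given in the abstract.
LANGUAGE_BARRIER = re.compile(
    r"\b(language barrier|interpreter needed|speaks no dutch)\b", re.IGNORECASE
)
RELATIVE = re.compile(
    r"\b(daughter|son|wife|husband|partner|family member|mother|father)\b",
    re.IGNORECASE,
)


def naive_rule(note):
    """Flag the patient if a keyword appears anywhere in the note."""
    return bool(LANGUAGE_BARRIER.search(note))


def subject_aware_rule(note):
    """Flag only when a keyword appears in a sentence that mentions no relative."""
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        if LANGUAGE_BARRIER.search(sentence) and not RELATIVE.search(sentence):
            return True
    return False


# Hypothetical example notes, not taken from any EHR.
patient_note = "Patient admitted with dyspnoea. Language barrier, interpreter needed."
relative_note = "Daughter has a language barrier. Patient communicates well."

print(naive_rule(patient_note), subject_aware_rule(patient_note))
print(naive_rule(relative_note), subject_aware_rule(relative_note))
```

The naive rule flags both notes, whereas the subject-aware variant skips the sentence about the daughter. The heuristic has its own cost: a sentence such as "Patient has a language barrier, daughter translates" would be missed, so any such filter would need validation against the manual standard.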

Conclusion and Relevance In this study, text-mining was able to identify context variables from EHRs, with a good kappa for the variables language barrier and cognitive frailty. Future studies should explore how overestimation in text-mining could be reduced. In the future, text-mining could help healthcare professionals anticipate patient context and optimise care.

Conflict of Interest No conflict of interest.
