TY - JOUR
T1 - Machine learning functional impairment classification with electronic health record data
AU - Pavon, Juliessa M.
AU - Previll, Laura
AU - Woo, Myung
AU - Henao, Ricardo
AU - Solomon, Mary
AU - Rogers, Ursula
AU - Olson, Andrew
AU - Fischer, Jonathan
AU - Leo, Christopher
AU - Fillenbaum, Gerda
AU - Hoenig, Helen
AU - Casarett, David
N1 - Generated from Scopus record by KAUST IRTS on 2023-09-25
PY - 2023/9/1
Y1 - 2023/9/1
AB - Background: Poor functional status is a key marker of morbidity, yet is not routinely captured in clinical encounters. We developed and evaluated the accuracy of a machine learning algorithm that leveraged electronic health record (EHR) data to provide a scalable process for identification of functional impairment. Methods: We identified a cohort of patients with an electronically captured screening measure of functional status (Older Americans Resources and Services ADL/IADL) between 2018 and 2020 (N = 6484). Patients were classified using unsupervised learning k-means and t-distributed Stochastic Neighbor Embedding into normal function (NF), mild to moderate functional impairment (MFI), and severe functional impairment (SFI) states. Using 11 EHR clinical variable domains (832 variable input features), we trained an Extreme Gradient Boosting supervised machine learning algorithm to distinguish functional status states, and measured prediction accuracies. Data were randomly split into training (80%) and test (20%) sets. The SHapley Additive Explanations (SHAP) feature importance analysis was used to list the EHR features in rank order of their contribution to the outcome. Results: Median age was 75.3 years, 62% female, 60% White. Patients were classified as 53% NF (n = 3453), 30% MFI (n = 1947), and 17% SFI (n = 1084). Summary of model performance for identifying functional status state (NF, MFI, SFI) was AUROC (area under the receiver operating characteristic curve) 0.92, 0.89, and 0.87, respectively. Age, falls, hospitalization, home health use, labs (e.g., albumin), comorbidities (e.g., dementia, heart failure, chronic kidney disease, chronic pain), and social determinants of health (e.g., alcohol use) were highly ranked features in predicting functional status states. Conclusion: A machine learning algorithm run on EHR clinical data has potential utility for differentiating functional status in the clinical setting. Through further validation and refinement, such algorithms can complement traditional screening methods and result in a population-based strategy for identifying patients with poor functional status who need additional health resources.
UR - https://agsjournals.onlinelibrary.wiley.com/doi/10.1111/jgs.18383
UR - http://www.scopus.com/inward/record.url?scp=85159628939&partnerID=8YFLogxK
U2 - 10.1111/jgs.18383
DO - 10.1111/jgs.18383
M3 - Article
C2 - 37195174
SN - 1532-5415
VL - 71
SP - 2822
EP - 2833
JO - Journal of the American Geriatrics Society
JF - Journal of the American Geriatrics Society
IS - 9
ER -