May 17, 2024

Development, evaluation and validation of machine learning models to predict hospitalizations of patients with coronary artery disease within the next 12 months

Andrey D. Ermak, Denis V. Gavrilov, Roman E. Novitskiy, Alexander V. Gusev, Anna E. Andreychenko

Abstract

Improved survival after acute coronary syndromes, population growth, and rising life expectancy have substantially increased the proportion of patients with stable coronary artery disease (CAD), placing a heavy burden on the entire healthcare system. The disease often progresses and gives rise to many complications, which significantly increases the likelihood of hospitalization. A machine learning (ML) model that predicts inpatient hospitalizations of patients with CAD would allow close monitoring of high-risk patients, early preventive interventions, and optimized medical care.

Aims: To develop and externally validate personalized models for predicting preventable hospitalizations of patients with stable CAD and its complications, using ML algorithms and real-world clinical practice data.

Methods: A total of 135,873 depersonalized electronic health records of 49,103 patients with stable CAD were included in the study. Anthropometric measurements, physical examination results, and laboratory, instrumental, anamnestic, and socio-demographic data widely used in routine medical practice were considered as potential predictors (73 features in total). Logistic regression, decision tree-based methods including gradient boosting (AdaBoost, LightGBM, XGBoost, CatBoost) and bagging (RandomForest and ExtraTrees), discriminant analysis (LinearDiscriminant, QuadraticDiscriminant), and a naive Bayes classifier were compared. External validation was performed on data from a separate region.

Results: The CatBoost model showed the best results and the greatest stability on external validation data, with an AUC of 0.875 (95% CI 0.865–0.885) for the internal testing and 0.872 (95% CI 0.856–0.886) for the external validation. The best model performed well in terms of AUROC, Brier score, and standardized net benefit (at the target NPV threshold) on the validation dataset, which was only slightly similar to the training data.
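
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how AUROC, the Brier score, standardized net benefit, and a percentile bootstrap confidence interval can be computed from predicted probabilities. It is not the authors' code: the function names, the toy data, the decision threshold of 0.3, and the choice of a percentile bootstrap for the interval are all assumptions made for illustration, and standardized net benefit is taken here in its common form (net benefit divided by outcome prevalence).

```python
import random


def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def standardized_net_benefit(y_true, y_prob, threshold):
    """Net benefit at a probability threshold, divided by outcome prevalence."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    prevalence = sum(y_true) / n
    net_benefit = tp / n - (fp / n) * threshold / (1 - threshold)
    return net_benefit / prevalence


def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval; resamples lacking one class are redrawn."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(metric(ys, [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical, not taken from the study)
y = [0, 0, 1, 1, 0, 1]
p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print("AUROC:", auroc(y, p))
print("Brier:", brier_score(y, p))
print("sNB @ 0.3:", standardized_net_benefit(y, p, 0.3))
print("AUROC 95% CI:", bootstrap_ci(auroc, y, p, n_boot=200))
```

With only six toy observations the bootstrap interval is very wide; on a dataset of the size described in the Methods it would be far narrower.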

Conclusion: The metrics of the best model were superior to those reported in previously published studies. External validation demonstrated that the model is relatively stable on new data from another region, which supports the possibility of applying the model in real clinical practice.

Andrey D. Ermak, Denis V. Gavrilov, Roman E. Novitskiy, Alexander V. Gusev, Anna E. Andreychenko. Development, evaluation and validation of machine learning models to predict hospitalizations of patients with coronary artery disease within the next 12 months. International Journal of Medical Informatics. 2024, Volume 187, 105476. https://doi.org/10.1016/j.ijmedinf.2024.105476.
