Andreychenko A.E., Luchinin A.S., Ivshin A.A., Ermak A.D., Novitskiy R.E., Gusev A.V.
Background: Preeclampsia (PE) is a life-threatening, difficult-to-predict complication of pregnancy characterized by multi-organ dysfunction. PE affects 2–8% of all pregnancies and is one of the leading causes of perinatal and maternal mortality, especially in cases of early-onset PE.
Objective: To develop models for predicting total and early-onset PE in the first trimester of pregnancy using machine learning (ML) based on real-world clinical data.
Materials and methods: We analyzed 21,092 records obtained from electronic medical records through the Webiomed platform, corresponding to 12,434 unique pregnancies at up to 16 weeks of gestation in 12,283 women aged 11 to 60 years. A total of 53 variables commonly used in routine medical practice (anamnestic, constitutional, clinical, instrumental, and laboratory data) were selected as potential predictors of PE. To build the models, we used logistic regression (LR), gradient boosting methods (LightGBM, XGBoost, CatBoost), and ensembles of decision trees (RandomForest and ExtraTrees).
Results: The ExtraTrees model performed best in predicting PE, with an area under the curve (AUC) of 0.858 (95% CI 0.827–0.890). Its overall accuracy was 0.634 (95% CI 0.616–0.652), sensitivity was 0.897 (95% CI 0.837–0.953), and specificity was 0.624 (95% CI 0.605–0.643). Among the models for assessing the risk of early-onset PE, the RandomForest algorithm yielded the most promising results: the AUC after validation was 0.848 (95% CI 0.785–0.904), with an accuracy of 0.813 (95% CI 0.798–0.828), sensitivity of 0.733 (95% CI 0.565–0.885), and specificity of 0.814 (95% CI 0.799–0.828).
Conclusion: The performance metrics of the final models are comparable to those of previously published models. External validation showed that the models remain relatively stable on new data, indicating their potential applicability in real clinical practice. This is our first experience in predicting complex pregnancy complications from real-world clinical data. The quality of a predictive model depends directly on the data and the statistical algorithms used, and we intend to refine both in future studies.
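To make the reported metrics concrete, the following is a minimal sketch of how sensitivity, specificity, and accuracy with 95% confidence intervals could be computed from binary predictions. The abstract does not state how the intervals were obtained; the percentile bootstrap used here is an assumption, the function names (`metrics`, `bootstrap_ci`) are hypothetical, and the labels are synthetic toy data, not the study's records.

```python
import random

def metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = PE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

def bootstrap_ci(y_true, y_pred, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for each metric (an assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    samples = {"sensitivity": [], "specificity": [], "accuracy": []}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        if sum(yt) in (0, n):
            continue  # resample lacks one class; metrics undefined
        for name, value in metrics(yt, yp).items():
            samples[name].append(value)
    ci = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int((alpha / 2) * (len(values) - 1))]
        upper = values[int((1 - alpha / 2) * (len(values) - 1))]
        ci[name] = (lower, upper)
    return ci

# Synthetic toy data: 40 PE cases (36 flagged), 960 non-cases (600 cleared).
y_true = [1] * 40 + [0] * 960
y_pred = [1] * 36 + [0] * 4 + [0] * 600 + [1] * 360

point = metrics(y_true, y_pred)
ci = bootstrap_ci(y_true, y_pred)
for name in point:
    print(f"{name}: {point[name]:.3f} (95% CI {ci[name][0]:.3f}-{ci[name][1]:.3f})")
```

Because accuracy equals prevalence × sensitivity + (1 − prevalence) × specificity, a rare outcome makes overall accuracy track specificity closely. This is why a model with high sensitivity can still show modest accuracy, and why the sensitivity interval, which rests on the few positive cases, is the widest.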
Andreychenko A.E., Luchinin A.S., Ivshin A.A., Ermak A.D., Novitskiy R.E., Gusev A.V. Development and validation of models to predict total and early-onset preeclampsia in the first trimester of pregnancy using machine learning algorithms. Akusherstvo i Ginekologiya/Obstetrics and Gynecology. 2023; (10): 94-107 (in Russian) https://dx.doi.org/10.18565/aig.2023.101