Data-driven Machine Learning Models for Risk Stratification and Prediction of Emergence Delirium in Pediatric Patients Underwent Tonsillectomy/Adenotonsillectomy
Alessandro Simonini 1, Jeevitha Murugan 2, Alessandro Vittori 3, Roberta Pallotto 1, Elena Bignami 4, Maria Calevo 5, Ornella Piazza 6, Marco Cascella 6
1 Department of Pediatric Anaesthesia and Intensive Care, S.C. SOD Anestesia e Rianimazione Pediatrica, Ospedale G. Salesi, 60123 Ancona, Italy
2 BTech - Artificial Intelligence and Data Science, St Joseph's College of Engineering, 600119 Chennai, India
3 Department of Anesthesia and Critical Care, ARCO Roma Ospedale Pediatrico Bambino Gesù IRCCS, 00165 Rome, Italy
4 Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
5 Epidemiology and Biostatistic Unit, Scientific Directorate, IRCCS Istituto Giannini Gaslini, 16147 Genoa, Italy
6 Anesthesia and Pain Medicine, Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana", University of Salerno, 84081 Baronissi, Italy
Ann. Ital. Chir., 2024, 95(5), 100132; https://doi.org/10.62713/aic.3485
Published: 20 Oct 2024
Copyright © 2024 The Author(s).
Abstract
AIM: In the pediatric surgical population, Emergence Delirium (ED) poses a significant challenge. This study aims to develop and validate machine learning (ML) models to identify key features associated with ED and predict its occurrence in children undergoing tonsillectomy or adenotonsillectomy.
METHODS: The analysis involved data cleaning, exploratory data analysis (EDA), supervised predictive modeling, and unsupervised learning on a medical dataset (n = 423). After preliminary data cleaning, EDA consisted of plotting histograms, boxplots, pairplots, and correlation heatmaps to understand variable distributions and relationships. Four predictive models were trained: logistic regression (LR), random forest (RF), support vector machine (SVM), and gradient boosting (XGBoost). The models were evaluated and compared using the area under the receiver operating characteristic curve (AUC-ROC), precision, recall, and feature importance. The RF model performed best on the validation set (AUC-ROC 0.96, precision 1.00, recall 0.92) and was therefore used for the test. K-means clustering was applied to find groups within the data. The elbow method and silhouette scores were used to determine the optimal number of clusters. The resulting clusters were analyzed by aggregating features to understand the characteristics of each cluster.
RESULTS: EDA revealed significant positive correlations of age, weight, American Society of Anesthesiologists (ASA) health score, and surgery duration with the risk of developing ED. Among the ML models, RF achieved the highest performance. Key predictive variables, based on the model's feature importance, included delirium screening scales, extubation time, and time to regain consciousness. Unsupervised K-means clustering indicated that 2–3 clusters were optimal. These clusters represented distinct patient subgroups: younger, healthier, low-risk individuals (cluster 0), and older patients with increasing chronic disease burden, higher delirium screening scores, and consequently higher post-operative delirium risk (clusters 1 and 2).
CONCLUSIONS: ML techniques are valuable tools for extracting insights and making accurate predictions from healthcare data. High-performing algorithm-based models can be implemented in clinical decision support systems, facilitating early identification of and intervention for ED in pediatric patients. By investigating multiple variables, it is possible to assess risk and implement preventive measures effectively. Furthermore, unsupervised clustering reveals distinct patient subgroups, enabling personalized perioperative management strategies and enhancing overall patient care.
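For readers less familiar with the evaluation metrics named in the abstract, the following is a minimal sketch of how ROC AUC, precision, and recall can be computed from a classifier's predicted scores. It is not the authors' code. The function names (`roc_auc`, `precision_recall`), the 0.5 decision threshold, and the ten labels and scores are all assumptions made up for illustration and are not taken from the study dataset.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    This is the Mann-Whitney formulation: a positive scored above a negative
    counts as 1, a tie counts as 0.5, and anything else counts as 0.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall after thresholding scores (threshold is an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 1 = emergence delirium, 0 = no delirium.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1, 0.05]

auc = roc_auc(y_true, y_score)
precision, recall = precision_recall(y_true, y_score, threshold=0.5)
print(f"AUC={auc:.3f} precision={precision:.2f} recall={recall:.2f}")
# prints "AUC=0.875 precision=0.75 recall=0.75"
```

In this toy example, 21 of the 24 positive-negative pairs are ranked correctly, giving an AUC of 0.875. At the assumed threshold of 0.5, three of the four positive cases are flagged along with one false positive, so precision and recall are both 0.75. AUC does not depend on the threshold, whereas precision and recall do, which is why all three are commonly reported together.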