CLINICAL RESEARCH
Early screening of autism spectrum disorder in toddlers
using machine learning
Department of Nursing, College of Nursing and Health Sciences, Jazan University,
Jazan, Saudi Arabia
Submission date: 2026-01-09
Final revision date: 2026-02-05
Acceptance date: 2026-02-24
Online publication date: 2026-03-20
Corresponding author
Darin Mansor Mathkor
Department of Nursing
College of Nursing and Health Sciences
Jazan University
Jazan-82911, Saudi Arabia
ABSTRACT
Introduction:
Autism spectrum disorder (ASD) is a neurodevelopmental condition
characterized by difficulties with social interaction and communication,
as well as repetitive behaviors. Early ASD screening is vital for prompt
intervention, yet existing diagnostic methods scale poorly in
resource-constrained settings. Herein, an explainable machine learning
(ML)-based predictive model was developed and evaluated for early screening
of ASD in toddlers using Q-CHAT-10 behavioral, demographic, and clinical
features.
Material and methods:
A total of 1054 toddler records from the Autism Screening for Toddlers
dataset, freely available on Kaggle, were retrospectively analyzed. Data
preprocessing, statistical feature selection, and dimensionality reduction
were performed. Several classifiers were trained on Q-CHAT-10 behavioral
features combined with demographic and clinical variables, including
Logistic Regression, Random Forest, Gradient Boosting, and Multilayer
Perceptron, with k-fold cross-validation used for model selection. SHAP
(SHapley Additive exPlanations) analysis was employed to explain
individual predictions.
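To illustrate the cross-validation step, the following is a minimal sketch of how records could be partitioned into k folds. It uses only the Python standard library. The abstract does not state the value of k or whether the folds were stratified by class, so `k=5`, the stratification, and the function name `stratified_kfold` are assumptions made for illustration, and the labels are toy values rather than study data.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that keep class ratios similar in every fold.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so that each fold receives a near-equal share of every class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy labels (1 = ASD traits, 0 = no ASD traits); assumed values, not the study data.
labels = [1] * 15 + [0] * 5
for train_idx, test_idx in stratified_kfold(labels, k=5, seed=42):
    positives = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), positives)
```

In a model-selection loop, each candidate classifier would be fitted on `train_idx` and scored on `test_idx`, and the scores averaged across folds before the held-out test set is touched.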
Results:
Model performance was evaluated using accuracy, sensitivity, specificity,
and ROC-AUC. Feature importance was examined to identify the most
predictive items. The Gradient Boosting classifier achieved the best
performance, with an accuracy of 0.89 (95% CI: 0.85–0.93), sensitivity of
0.91, specificity of 0.87, and ROC-AUC of 0.94 on the held-out test set.
SHAP analysis revealed total Q-CHAT-10 score, response to name, pointing
to share interest, and pretend play as the most influential predictors.
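For readers less familiar with these screening metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts, and how a 95% confidence interval for a proportion can be computed. The abstract does not say which interval method was used, so the Wilson score interval here is an assumption. The counts are hypothetical and are not the study's confusion matrix, and the function names are illustrative.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of true ASD cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-ASD cases that were cleared
        "accuracy": (tp + tn) / total,  # share of all cases classified correctly
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical test-set counts, chosen only to make the arithmetic easy to follow.
tp, fn, tn, fp = 90, 10, 80, 20
metrics = screening_metrics(tp=tp, fp=fp, tn=tn, fn=fn)
low, high = wilson_interval(tp + tn, tp + fp + tn + fn)
print(metrics)
print(round(low, 3), round(high, 3))
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two, which gives a quick consistency check on any reported set of these metrics.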
Conclusions:
This ML framework accurately detects ASD traits in toddlers,
highlighting the potential of a scalable, low-cost screening tool to enable
early ASD detection and improve equitable access to pediatric care.
However, external validation across diverse populations with larger samples
is warranted before clinical application can be recommended.