OBSTETRICS AND GYNAECOLOGY / RESEARCH PAPER
MicroRNA-Based Machine Learning Classifier Accurately Predicts Molecular Subtypes in Endometrial Carcinoma
1 Department of Gynecology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, China
2 School of Clinical Medicine, Medical College of Yangzhou University, China
Submission date: 2025-11-20
Final revision date: 2026-02-12
Acceptance date: 2026-03-31
Online publication date: 2026-06-04
Corresponding author
Miao Li
Department of Gynecology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, China
ABSTRACT
Introduction:
Molecular classification of endometrial carcinoma (EC) has revolutionized prognostic stratification. MicroRNAs (miRNAs) are a potentially cost-effective alternative for molecular subtyping. This study presents a proof-of-concept computational analysis that evaluates whether a miRNA-based machine learning classifier can discriminate the four EC molecular subtypes.
Material and methods:
We analyzed 232 EC samples from the TCGA-UCEC cohort with complete molecular annotations. To confirm the clinical relevance of the ground-truth labels used for modeling, survival analyses (Kaplan-Meier and cumulative incidence function) were first performed on the established TCGA subtypes. Feature selection was conducted exclusively on the training set and identified a signature of 20 differentially expressed miRNAs. Seven machine learning algorithms were then benchmarked. A pilot external validation was performed on 15 independent clinical samples using quantitative reverse transcription PCR (qRT-PCR), with ground-truth subtypes confirmed by POLE sequencing and immunohistochemistry (IHC) according to the ProMisE criteria.
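The point of restricting feature selection to the training set is that held-out samples never influence which miRNAs are chosen. The sketch below illustrates that idea on synthetic data only. It is not the authors' pipeline: the abstract does not state the differential-expression test or the split ratio, so the one-way ANOVA F statistic, the 75/25 stratified split, and the helper names (`anova_f`, `stratified_split`, `select_top_k`) are all assumptions made for illustration.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic for a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(len(idx) * test_fraction)
        test += idx[:cut]
        train += idx[cut:]
    return sorted(train), sorted(test)


def select_top_k(X, labels, train_idx, k):
    """Rank features by F statistic computed on training rows only."""
    y_train = [labels[i] for i in train_idx]
    scores = []
    for j in range(len(X[0])):
        column = [X[i][j] for i in train_idx]  # test rows are never read here
        scores.append((anova_f(column, y_train), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


# Synthetic stand-in for an expression matrix (hypothetical values, not TCGA data).
# The abstract names only POLE and CN-high; the other two labels follow the
# usual TCGA naming and are an assumption here.
rng = random.Random(0)
subtypes = ["POLE", "MSI", "CN-low", "CN-high"]
labels = [s for s in subtypes for _ in range(20)]
# Features 0 and 1 shift with subtype; features 2..9 are pure noise.
X = [
    [rng.gauss(3.0 * subtypes.index(y) if j < 2 else 0.0, 1.0) for j in range(10)]
    for y in labels
]

train_idx, test_idx = stratified_split(labels, 0.25, rng)
selected = select_top_k(X, labels, train_idx, k=2)
print("selected feature indices:", selected)
```

Any classifier fitted afterwards would see only the `selected` columns of the training rows, and the test rows would be used once, for the final evaluation.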
Results:
Survival analysis confirmed that the ground-truth TCGA subtypes exhibited statistically significant prognostic differences (p = 0.0011 for overall survival, OS). In the computational benchmark, the LASSO logistic regression model performed best on the independent test set (multiclass AUC = 0.850). In the pilot external validation cohort, the classifier achieved an accuracy of 86.7% (95% CI: 59.5%–98.3%), correctly identifying all POLE and CN-high cases.
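The width of the reported interval follows directly from the cohort size. An accuracy of 86.7% on 15 samples corresponds to 13 correct calls, and the reported bounds are consistent with an exact (Clopper-Pearson) binomial interval. The abstract does not name the interval method, so treating it as Clopper-Pearson is an assumption. A minimal standard-library sketch of that calculation:

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, iters=100):
    """Bisection root on [0, 1] of a function g that increases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# 13 correct out of 15 is what 86.7% accuracy on 15 samples implies.
lower, upper = clopper_pearson(13, 15)
print(f"accuracy = {13 / 15:.1%}, 95% CI = {lower:.1%} to {upper:.1%}")
```

With only 15 samples the interval spans almost 40 percentage points, which is why the external validation is described as a pilot.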
Conclusions:
This study demonstrates the computational feasibility of using a 20-miRNA signature to classify EC molecular subtypes, particularly for the prognostically distinct POLE and CN-high groups. While the survival analysis reaffirms the prognostic value of the molecular subtypes themselves, our findings regarding the classifier represent a preliminary proof-of-concept.