Uncertainty-Aware Genomic Classification of Alzheimer's Disease: A Transformer-Based Ensemble Approach with Monte Carlo Dropout
Abstract
INTRODUCTION: Alzheimer's disease (AD) is genetically complex, which complicates robust classification from genomic data. METHODS: We developed a transformer-based ensemble model (TrUE-Net) that uses Monte Carlo Dropout to estimate uncertainty in AD classification from whole-genome sequencing (WGS) data. The ensemble combines a transformer that preserves single-nucleotide polymorphism (SNP) sequence structure with a concurrent random forest trained on flattened genotypes. An uncertainty threshold separated samples into an uncertain (high-variance) group and a more certain (low-variance) group. RESULTS: We analyzed 1050 individuals, holding out half for testing. Overall accuracy and area under the receiver operating characteristic (ROC) curve (AUC) were 0.6514 and 0.6636, respectively. Excluding the uncertain group improved accuracy from 0.6263 to 0.7287 (an increase of 10.24 percentage points) and F1 from 0.5843 to 0.8205 (an increase of 23.62 percentage points). DISCUSSION: Monte Carlo Dropout-driven uncertainty helps identify ambiguous cases that may require further clinical evaluation, thereby improving the reliability of AD genomic classification.
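The uncertainty split described in METHODS can be illustrated with a minimal sketch. It is not the TrUE-Net implementation: the tiny two-layer network, the random weights, the function names (`mc_dropout_predict`, `split_by_uncertainty`), the dropout rate, the number of passes, and the median-based threshold are all assumptions chosen for illustration. The abstract does not specify how the threshold was set. The sketch shows only the general idea: keep dropout active at inference, repeat the forward pass, and treat the per-sample variance of the predicted probability as the uncertainty score.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, n_passes=50, drop_rate=0.5, seed=0):
    """Run n_passes stochastic forward passes with dropout kept active.

    Returns the per-sample mean and variance of the predicted probability.
    """
    rng = np.random.default_rng(seed)
    hidden = np.maximum(x @ w1, 0.0)           # ReLU hidden layer
    keep = 1.0 - drop_rate
    probs = np.empty((n_passes, x.shape[0]))
    for t in range(n_passes):
        mask = rng.random(hidden.shape) < keep  # fresh dropout mask each pass
        logits = (hidden * mask / keep) @ w2    # inverted-dropout scaling
        probs[t] = 1.0 / (1.0 + np.exp(-logits))
    return probs.mean(axis=0), probs.var(axis=0)

def split_by_uncertainty(variance, threshold):
    """Boolean masks for the certain (low-variance) and uncertain groups."""
    certain = variance <= threshold
    return certain, ~certain

# Toy data standing in for encoded genotypes (hypothetical shapes and values).
rng = np.random.default_rng(42)
x = rng.normal(size=(8, 20))
w1 = 0.1 * rng.normal(size=(20, 16))
w2 = 0.1 * rng.normal(size=16)

mean_prob, variance = mc_dropout_predict(x, w1, w2)
threshold = np.median(variance)                # assumed threshold rule
certain, uncertain = split_by_uncertainty(variance, threshold)
predictions = (mean_prob >= 0.5).astype(int)   # class labels from mean probability
```

Under these assumptions, metrics such as accuracy and F1 would then be computed separately on `predictions[certain]` and `predictions[uncertain]`, while samples in the uncertain group are the ones flagged for further evaluation.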