A Statistical Test for Comparing the Linkage and Admixture Model Based on Central Limit Theorems
Abstract
In the Admixture Model, the probability that an individual carries a certain allele at a specific marker depends on the allele frequencies in $K$ ancestral populations and on the proportions of the individual's genome originating from these populations. The markers are assumed to be independent. The Linkage Model is a Hidden Markov Model (HMM) that extends the Admixture Model by incorporating linkage between neighboring loci. This study investigates the consistency and central limit behavior of maximum likelihood estimators (MLEs) for individual ancestry in the Linkage Model, complementing earlier results by \citet{pfaff2004information, pfaffelhuber2022central, heinzel2025consistency} for the Admixture Model. We use these results to establish properties of a statistical test for model selection between the Admixture Model and the Linkage Model. Finally, we demonstrate the practical relevance of our results by applying the test to real-world data from \cite{10002015global}.
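
For orientation, the dependence described for the Admixture Model is commonly written as a mixture over ancestral populations. The following is a sketch in assumed notation: the symbols $q_k$ (the proportion of the individual's genome originating from population $k$), $p_{k,m}$ (the frequency of the allele in question in population $k$ at marker $m$), and $X_m$ (the indicator that a single allele copy at marker $m$ is of that allele) are introduced here for illustration and are not fixed by the text above.
\[
  \Pr(X_m = 1) \;=\; \sum_{k=1}^{K} q_k \, p_{k,m},
  \qquad q_k \ge 0, \quad \sum_{k=1}^{K} q_k = 1 .
\]
Under the independence assumption, the likelihood of an individual's data factorizes over markers $m$. In the Linkage Model, the ancestry of neighboring loci is instead coupled through a hidden Markov chain.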