$F_{ST}$ and kinship for arbitrary population structures I: Generalized definitions


$F_{ST}$ is a fundamental measure of genetic differentiation and population structure, currently defined for subdivided populations. In practice, $F_{ST}$ is typically applied under the 'island model', in which subpopulations have evolved independently from their last common ancestral population. In this work, we generalize the $F_{ST}$ definition to arbitrary population structures, where individuals may be related in arbitrary ways. Our definitions are built on identity-by-descent (IBD) probabilities that relate individuals through inbreeding and kinship coefficients. We generalize $F_{ST}$ as the mean inbreeding coefficient of the individuals' local populations relative to their last common ancestral population. This generalized $F_{ST}$ naturally yields a useful pairwise $F_{ST}$ between individuals. We show that our generalized definition agrees with both Wright's original definition and the island model definition as special cases. We define a novel coancestry model based on 'individual-specific allele frequencies' and prove that its parameters correspond to probabilistic kinship coefficients. Lastly, we study and extend the Pritchard-Stephens-Donnelly admixture model in the context of our coancestry model and calculate its $F_{ST}$. Our probabilistic framework provides a theoretical foundation that extends $F_{ST}$, in terms of inbreeding and kinship coefficients, to arbitrary population structures.
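
As an illustrative sketch of the generalized definition, the Python snippet below computes $F_{ST}$ as a weighted mean of per-individual local inbreeding coefficients. The function name `generalized_fst`, the optional `weights` argument (uniform by default), and the input values are assumptions made here for illustration. The text above states only that $F_{ST}$ is a mean of these coefficients, and the sketch takes them as given rather than estimating them from genotype data.

```python
from typing import Optional, Sequence


def generalized_fst(
    local_inbreeding: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted mean of local-population inbreeding coefficients.

    local_inbreeding: one coefficient per individual, each the inbreeding
        coefficient of that individual's local population relative to the
        last common ancestral population (assumed known here).
    weights: hypothetical per-individual weights that are non-negative and
        sum to one; uniform weights are used when omitted.
    """
    n = len(local_inbreeding)
    if n == 0:
        raise ValueError("need at least one individual")
    if weights is None:
        # Assumption: a plain (unweighted) mean over individuals.
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and local_inbreeding must have equal length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    if any(not 0.0 <= f <= 1.0 for f in local_inbreeding):
        # Inbreeding coefficients are IBD probabilities.
        raise ValueError("inbreeding coefficients must lie in [0, 1]")
    return sum(w * f for w, f in zip(weights, local_inbreeding))


# Made-up coefficients for three individuals, purely for illustration.
example_fst = generalized_fst([0.05, 0.10, 0.15])
```

When every individual has the same local inbreeding coefficient, this mean reduces to that common value regardless of the weights chosen.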

bioRxiv, doi:10.1101/083915