Estimating Species Trees from Quartet Gene Tree Distributions under the Coalescent Model
Abstract: In this article we propose a new method, which we name 'quartet neighbor joining', or 'quartet-NJ', to infer an unrooted species tree on a given set of taxa T from empirical distributions of unrooted quartet gene trees on all four-taxon subsets of T. In particular, quartet-NJ can be used to estimate a species tree on T from distributions of gene trees on T. The quartet-NJ algorithm is conceptually very similar to classical neighbor joining, and its statistical consistency under the multispecies coalescent model is proved by a variant of the classical 'cherry picking' theorem. To demonstrate the suitability of quartet-NJ, coalescent processes on two different species trees (on five and nine taxa, respectively) were simulated, and quartet-NJ was applied to the simulated gene tree distributions. Further, quartet-NJ was applied to quartet distributions obtained from multiple sequence alignments of 28 proteins of nine prokaryotes.
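The input described in the abstract, an empirical distribution of unrooted quartet gene trees for every four-taxon subset of T, can be illustrated with a minimal sketch. This is not the quartet-NJ algorithm itself, only a plausible way to tabulate its input from a sample of gene trees on T. The representation of each gene tree as a list of nontrivial splits (one side of each internal edge, as a frozenset of taxon names) and the function names `quartet_topology` and `quartet_distributions` are assumptions made for illustration; they do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def quartet_topology(splits, quartet):
    """Return the quartet topology induced by a gene tree, or None if unresolved.

    `splits` holds one side of each nontrivial split of the tree (assumed
    representation). The topology ab|cd is encoded as a frozenset of two
    frozensets, {{a, b}, {c, d}}.
    """
    q = frozenset(quartet)
    for side in splits:
        inside = q & side
        # A split that puts exactly two of the four taxa on one side
        # separates the quartet into two pairs.
        if len(inside) == 2:
            return frozenset({inside, q - inside})
    return None


def quartet_distributions(gene_trees, taxa):
    """Relative frequencies of induced quartet topologies per four-taxon subset."""
    dist = {}
    for quartet in combinations(sorted(taxa), 4):
        counts = Counter()
        for splits in gene_trees:
            topo = quartet_topology(splits, quartet)
            if topo is not None:  # skip trees that leave the quartet unresolved
                counts[topo] += 1
        total = sum(counts.values())
        dist[quartet] = {t: n / total for t, n in counts.items()} if total else {}
    return dist


# Toy sample of three gene trees on five taxa (invented data, for illustration).
taxa = ["a", "b", "c", "d", "e"]
gene_trees = [
    [frozenset("ab"), frozenset("de")],  # ((a,b),c,(d,e))
    [frozenset("ac"), frozenset("de")],  # ((a,c),b,(d,e))
    [frozenset("ab"), frozenset("de")],  # ((a,b),c,(d,e))
]
dist = quartet_distributions(gene_trees, taxa)
ab_cd = frozenset({frozenset("ab"), frozenset("cd")})
print(dist[("a", "b", "c", "d")][ab_cd])
```

In this toy sample, two of the three gene trees induce ab|cd on the subset {a, b, c, d} and one induces ac|bd. A table of this kind, one distribution per four-taxon subset, is the sort of input the abstract says quartet-NJ works from.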