Parameter identifiability for a profile mixture model of protein evolution
Abstract: A Profile Mixture Model is a model of protein evolution, describing sequence data in which sites are assumed to follow many related substitution processes on a single evolutionary tree. The processes depend in part on different amino acid distributions, or profiles, varying over sites in aligned sequences. A fundamental question for any stochastic model, which must be answered positively to justify model-based inference, is whether the parameters are identifiable from the probability distribution they determine. Here we show that a Profile Mixture Model has identifiable parameters under circumstances in which it is likely to be used for empirical analyses. In particular, for a tree relating 9 or more taxa, both the tree topology and all numerical parameters are generically identifiable when the number of profiles is less than 74.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.