Reconstructing ancestral characters and traits along a phylogenetic tree is central to evolutionary biology. It is the key to understanding morphological changes among species, inferring ancestral biochemical properties of life, or recovering migration routes in phylogeography. The goal is twofold: to reconstruct the character state at the tree root (e.g. the region of origin of some species), and to understand the process of state changes along the tree (e.g. species flow between countries). We deal here with discrete characters, which are ‘unique’, as opposed to sequence characters (nucleotides or amino acids), where we assume the same model for all the characters (or for large classes of characters with site-dependent models) and thus benefit from multiple information sources. In this framework, we use mathematics and simulations to demonstrate that although each goal can be achieved with high accuracy individually, it is generally impossible to accurately estimate both the root state and the rates of state changes along the tree branches from the observed data at the tips of the tree. This is because the global rates of state changes along the branches that are optimal for the two estimation tasks have opposite trends, leading to a fundamental trade-off in accuracy. This inherent ‘Darwinian uncertainty principle’ concerning the simultaneous estimation of ‘patterns’ and ‘processes’ governs ancestral reconstructions in biology. For certain tree shapes (typically speciation trees), the uncertainty of simultaneous estimation is reduced when more tips are present; for other tree shapes (e.g. coalescent trees used in population genetics), it is not.
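The opposite trends described above can be seen in a toy setting. The sketch below is not the analysis from the paper: it assumes a star tree (all tips attached directly to the root by branches of equal length `t`), a symmetric two-state Markov model with change rate `rate`, and a simple majority rule for guessing the root state. All function names are hypothetical and chosen for illustration only.

```python
import math


def p_differs(rate: float, t: float) -> float:
    """Probability that a tip differs from the root after time t under a
    symmetric two-state Markov model (assumed model, rate of change `rate`)."""
    return 0.5 * (1.0 - math.exp(-2.0 * rate * t))


def root_accuracy_majority(n_tips: int, rate: float, t: float) -> float:
    """Probability that the majority state among the tips equals the root
    state on a star tree (assumed tree shape). Use an odd n_tips to avoid ties."""
    p = p_differs(rate, t)
    return sum(
        math.comb(n_tips, k) * (1.0 - p) ** k * p ** (n_tips - k)
        for k in range(n_tips // 2 + 1, n_tips + 1)
    )


def expected_changed_tips(n_tips: int, rate: float, t: float) -> float:
    """Expected number of tips whose state differs from the root state."""
    return n_tips * p_differs(rate, t)


if __name__ == "__main__":
    n_tips, t = 11, 1.0
    for rate in (0.01, 0.1, 1.0, 10.0):
        acc = root_accuracy_majority(n_tips, rate, t)
        changed = expected_changed_tips(n_tips, rate, t)
        print(f"rate={rate:<5} root accuracy={acc:.3f} expected changed tips={changed:.2f}")
```

Under these assumptions, a low rate makes the root easy to recover but leaves very few observed changes from which to learn about the rate, whereas a high rate produces many changes but drives the majority-rule accuracy towards a coin flip.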
Link to the article: https://doi.org/10.1093/sysbio/syz054