LUTZONI, FRANCOIS*, PETER WAGNER, AND VALERIE REEB. Department of Botany, The Field Museum, Chicago, IL 60605. - Integrating ambiguously aligned regions of DNA sequences in phylogenetic analyses without violating positional homology.
Phylogenetic analyses of non-protein coding nucleotide sequences such
as ribosomal RNA genes, internal transcribed spacers (ITS) and introns
are often impeded by regions of the alignments that are ambiguously
aligned. These regions are characterized by gaps whose positions
remain uncertain no matter which optimization criteria are
used. This problem is particularly acute in large-scale phylogenetic
studies and when aligning highly diverged sequences. Molecular
systematists and evolutionists have handled these regions, where
positional homology is likely to be violated, very differently in
phylogenetic analyses, with approaches ranging from the total
exclusion of these regions to the inclusion of every position
regardless of ambiguity in the alignment. We present a new method
that allows the inclusion of ambiguously aligned regions without
violating homology. The procedure has three steps. First, homologous
regions of the alignment containing ambiguously aligned sequences are
delimited. Second, each ambiguously aligned region is
unequivocally coded as a new character that replaces its respective
ambiguous region. Third, each of these coded characters is subjected
to a specific step matrix to account for the differential number of
changes (summing substitutions and indels) needed to transform one
sequence to another. The optimal number of steps included in the step
matrix is the one derived from the pairwise alignment with the highest
similarity and the lowest number of steps. In addition to potentially
enhancing phylogenetic resolution and support by integrating
previously inaccessible characters without violating positional
homology, this new approach can improve branch length estimates
under parsimony.
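
The second and third steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that substitutions and single-base indels each cost one step, that gaps are written as "-", and that the lowest-step pairwise alignment can be approximated by a unit-cost edit distance. The function names (edit_distance, code_region) and the example taxa are hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of substitutions plus single-base indels (unit costs).

    Assumption: equal weighting of substitutions and indels; the abstract
    does not specify the weighting scheme.
    """
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # delete base_a
                curr[j - 1] + 1,                   # insert base_b
                prev[j - 1] + (base_a != base_b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def code_region(region_by_taxon):
    """Recode one delimited ambiguous region as a single multistate character.

    Returns (state per taxon, list of distinct sequences, step matrix).
    """
    # Remove gaps so that identical sequences receive the same state.
    ungapped = {t: s.replace("-", "") for t, s in region_by_taxon.items()}
    states = sorted(set(ungapped.values()))
    state_of = {seq: k for k, seq in enumerate(states)}
    coded = {taxon: state_of[seq] for taxon, seq in ungapped.items()}
    # Step matrix: cost of transforming each state into every other state.
    matrix = [[edit_distance(x, y) for y in states] for x in states]
    return coded, states, matrix


# Hypothetical ambiguous region for four taxa.
region = {"taxonA": "AC-GT", "taxonB": "ACGGT", "taxonC": "A--GT", "taxonD": "ACGT"}
coded, states, matrix = code_region(region)
```

Here taxonA and taxonD share a state because their ungapped sequences are identical, and the resulting matrix is symmetric with zeros on the diagonal. A parsimony program that accepts user-defined step matrices could then treat the coded character in place of the excluded region.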
Key words: ambiguous nucleotide sequence alignment, gaps, indels, phylogenetics, positional homology