OP-24 Lineage-specific gene loss following mitochondrial endosymbiosis and its potential for function prediction in eukaryotes
Toni Gabaldón (1), Martijn A. Huynen (1)
1) NCMLS, Nijmegen Center for Molecular Life Sciences / CMBI, Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen
The endosymbiotic origin of mitochondria has resulted in a massive horizontal transfer of genetic material from an alphaproteobacterium to the early eukaryotes. Using large-scale phylogenetic analysis we have previously identified 630 orthologous groups of proteins derived from this event. Here we show that this proto-mitochondrial protein set has undergone extensive lineage-specific gene loss in the eukaryotes, with an average of three losses per orthologous group in a phylogeny of nine species. This gene loss has resulted in a high variability of the alphaproteobacterial-derived gene content of present-day eukaryotic genomes that might reflect functional adaptation to different environments. Proteins functioning in the same biochemical pathway tend to have a similar history of gene loss events, and we use this property to predict functional interactions among proteins.
Supplementary Information: Sequences and trees for the selected orthologous groups can be accessed at the following address: http://www.cmbi.kun.nl/~jagabald/table_groupsb.html
Contact: T.Gabaldon@cmbi.ru.nl
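The idea that proteins with similar loss histories are likely to interact can be illustrated with a minimal sketch. The abstract does not state which similarity measure was used, so the presence/absence encoding, the Hamming distance, the `max_distance` threshold, and the group names and profile values below are all assumptions made for illustration; the original analysis works with loss events on a phylogeny, which carries more information than a flat presence/absence vector.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) profiles of four orthologous
# groups across nine species. All values are invented for illustration.
profiles = {
    "groupA": (1, 1, 0, 1, 0, 1, 1, 0, 1),
    "groupB": (1, 1, 0, 1, 0, 1, 1, 0, 1),
    "groupC": (1, 0, 1, 1, 1, 0, 1, 1, 0),
    "groupD": (1, 1, 0, 1, 0, 1, 1, 1, 1),
}


def profile_distance(p, q):
    """Count the species in which two groups differ in presence/absence."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return sum(a != b for a, b in zip(p, q))


def candidate_pairs(profiles, max_distance=1):
    """Return (group, group, distance) for pairs with near-identical profiles.

    The threshold is an arbitrary choice for this sketch, not a value
    taken from the abstract.
    """
    pairs = []
    for (name1, p1), (name2, p2) in combinations(sorted(profiles.items()), 2):
        d = profile_distance(p1, p2)
        if d <= max_distance:
            pairs.append((name1, name2, d))
    # Most similar pairs first; sorted() is stable, so ties keep name order.
    return sorted(pairs, key=lambda item: item[2])


if __name__ == "__main__":
    for name1, name2, d in candidate_pairs(profiles):
        print(name1, name2, d)
```

Pairs at the top of the list would be the candidates for a shared pathway; in practice a score of this kind would need to be corrected for the shared phylogeny of the species.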
OP-25 A Gamma mixture model better accounts for among site rate heterogeneity
Tal Pupko (1), Nir Friedman (2), Itay Mayrose (1)
1) Tel-Aviv University, 2) Hebrew University
Motivation: Variation of substitution rates across nucleotide and amino acid sites has long been recognized as a characteristic of molecular sequence evolution. Evolutionary models that account for this rate heterogeneity usually use a gamma density function to model the rate distribution across sites. This density function, however, may not fit real datasets, especially when there is a multimodal distribution of rates. Here, we present a novel evolutionary model based on a mixture of gamma density functions. This model better describes the among-site rate variation characteristic of molecular sequence evolution. The use of this model may improve the accuracy of various phylogenetic methods such as reconstructing phylogenetic trees, dating divergence events, inferring ancestral sequences, and detecting conserved sites in proteins.
Results: Using diverse sets of protein sequences we show that the gamma-mixture model better describes the stochastic process underlying protein evolution. We show that the proposed gamma-mixture model fits protein datasets significantly better than the single-gamma model in nine out of ten datasets tested. We further show that using the gamma-mixture model improves the accuracy of model-based prediction of conserved residues in proteins.
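As a rough illustration of what a mixture of gamma densities looks like, the sketch below evaluates a weighted sum of gamma densities at a given rate. The abstract does not give the parameterization, so the shape/rate form, the function names, and the example weights and parameters are assumptions; this is not the authors' implementation.

```python
import math


def gamma_pdf(r, shape, rate):
    """Gamma density at r > 0, using the shape/rate parameterization.

    Computed in log space to stay numerically stable for large shapes.
    """
    if r <= 0 or shape <= 0 or rate <= 0:
        raise ValueError("r, shape and rate must all be positive")
    log_density = (
        shape * math.log(rate)
        + (shape - 1.0) * math.log(r)
        - rate * r
        - math.lgamma(shape)
    )
    return math.exp(log_density)


def gamma_mixture_pdf(r, components):
    """Density of a gamma mixture at r.

    `components` is a list of (weight, shape, rate) triples whose
    weights must sum to one.
    """
    total_weight = sum(w for w, _, _ in components)
    if not math.isclose(total_weight, 1.0):
        raise ValueError("mixture weights must sum to 1")
    return sum(w * gamma_pdf(r, a, b) for w, a, b in components)


if __name__ == "__main__":
    # Invented two-component example: a slow class and a fast class of sites.
    components = [(0.6, 2.0, 8.0), (0.4, 20.0, 10.0)]
    for r in (0.1, 0.5, 1.0, 2.0, 3.0):
        print(r, gamma_mixture_pdf(r, components))
```

With a single component of weight one the mixture reduces to the usual single-gamma model, which is why the two models can be compared as nested alternatives.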
OP-26 Computing Recombination Networks from Binary Sequences
Daniel Huson (1), Tobias Kloepper (1)
1) Center for Bioinformatics, Tuebingen University
Motivation: Phylogenetic networks are becoming an important tool in molecular evolution, as the evolutionary role of reticulate events such as hybridization, horizontal gene transfer and recombination is becoming more evident, and as the available data is dramatically increasing in quantity and quality.
Results: This paper addresses the problem of computing a most parsimonious recombination network for an alignment of binary sequences that are assumed to have arisen under the "infinite sites" model of evolution with recombinations. Using the concept of a splits network as the underlying data structure, this paper shows how a recent method designed for the computation of hybridization networks can be extended to also compute recombination networks. A robust implementation of the approach is provided and is illustrated using a number of real biological datasets.
Availability: Upon publication, this approach will be made freely available as a part of the SplitsTree4 software, downloadable from: www.splitstree.org.
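Why binary sequences under the infinite sites model can call for a network rather than a tree can be seen with the classical four-gamete test: without recombination, no pair of sites can show all four combinations 00, 01, 10 and 11. The sketch below applies that test to a small alignment. It is a standard compatibility check shown only as background, not the splits-network method of the paper, and the function name and example alignments are invented.

```python
from itertools import combinations


def incompatible_site_pairs(sequences):
    """Return the pairs of site indices that show all four gametes.

    `sequences` is a list of equal-length strings over the alphabet {0, 1}.
    Any returned pair cannot be explained by a single tree under the
    infinite sites model, so at least one recombination is needed.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    for seq in sequences:
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        if set(seq) - {"0", "1"}:
            raise ValueError("sequences must be binary")
    pairs = []
    for i, j in combinations(range(length), 2):
        # Collect the distinct two-site patterns observed at sites i and j.
        gametes = {(seq[i], seq[j]) for seq in sequences}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    tree_like = ["000", "100", "110", "111"]
    reticulate = ["000", "010", "100", "110"]
    print(incompatible_site_pairs(tree_like))
    print(incompatible_site_pairs(reticulate))
```

An empty result means the sites are pairwise compatible with a tree; a non-empty result only signals that recombination is required and says nothing yet about the most parsimonious network.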
OP-27 Discriminating between rate heterogeneity and interspecific recombination in DNA sequence alignments with phylogenetic factorial HMMs
Dirk Husmeier, Biomathematics and Statistics Scotland
Motivation: A recently proposed method for detecting recombination in DNA sequence alignments is based on the combination of hidden Markov models (HMMs) with phylogenetic trees. While this method was found to detect breakpoints of recombinant regions more accurately than most existing techniques, it inherently fails to distinguish between recombination and rate variation. In the present paper, we propose to marry the phylogenetic tree to a factorial HMM (FHMM). The states of the first hidden chain represent tree topologies, while the states of the second independent hidden chain represent different global scaling factors of the branch lengths. Inference is done in terms of a hierarchical Bayesian model, where parameters and hidden states are sampled from the posterior distribution with Gibbs sampling.
Results: We have tested the proposed model on various synthetic and real-world DNA sequence alignments. The simulation results suggest that as opposed to the standard phylogenetic HMM, the phylogenetic FHMM clearly distinguishes between recombination and rate heterogeneity and thereby avoids the prediction of spurious recombinant regions.
Availability: The proposed method has been implemented in a MATLAB Package that extends Kevin Murphy's HMM toolbox (Murphy, 2002). Software and data used in our study are available from http://www.bioss.sari.ac.uk/~dirk/Supplements
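The structure of the factorial HMM described above can be sketched as follows: each site has a joint hidden state made of a tree topology and a branch-length scaling factor, and because the two chains are independent a priori, the joint transition probability is the product of the per-chain probabilities. The sketch is in Python rather than MATLAB, and the transition values, the choice of two scaling states, and the function name are assumptions for illustration; it does not reproduce the Gibbs sampler or any part of the published package.

```python
from itertools import product

# Hypothetical transition matrices. Three states for the topology chain
# (four taxa have three unrooted topologies) and two invented states for
# the global branch-length scaling chain.
TOPOLOGY_TRANS = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
]
RATE_TRANS = [
    [0.9, 0.1],
    [0.1, 0.9],
]


def joint_transition(topology_trans, rate_trans):
    """Build the joint state list and transition matrix of the two chains.

    Each joint state is a (topology index, rate index) pair; independence
    of the chains makes each joint entry a product of two entries.
    """
    states = list(product(range(len(topology_trans)), range(len(rate_trans))))
    matrix = [
        [topology_trans[t1][t2] * rate_trans[r1][r2] for (t2, r2) in states]
        for (t1, r1) in states
    ]
    return states, matrix


if __name__ == "__main__":
    states, matrix = joint_transition(TOPOLOGY_TRANS, RATE_TRANS)
    for state, row in zip(states, matrix):
        print(state, [round(p, 4) for p in row])
```

Keeping the rate state separate from the topology state is what lets a change in evolutionary rate be absorbed by the second chain instead of being reported as a change of topology.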