A Rate-independent Technique for Analysis of Nucleic Acid Sequences: Evolutionary Parsimony¹

James A. Lake

Molecular Biology Institute and Department of Biology,

University of California, Los Angeles

The method of evolutionary parsimony, or operator invariants, is a technique of nucleic acid sequence analysis related to parsimony analysis and explicitly designed for determining evolutionary relationships among four distantly related taxa.

The method is independent of substitution rates because it is derived from consideration of the group properties of substitution operators rather than from an analysis of the probabilities of substitution in branches of a tree. In both parsimony and evolutionary parsimony, three patterns of nucleotide substitution are associated one-to-one with the three topologically linked trees for four taxa. In evolutionary parsimony, the three quantities are operator invariants. These invariants are the remnants of substitutions that have occurred in the interior branch of the tree and are analogous to the substitutions assigned to the central branch by parsimony.

The two invariants associated with the incorrect trees must equal zero (statistically), whereas only the correct tree can have a nonzero invariant. The χ²-test is used to ascertain the nonzero invariant and the statistically favored tree. Examples, obtained using data calculated with evolutionary rates and branchings designed to camouflage the true tree, show that the method accurately predicts the tree, even when substitution rates differ greatly in neighboring peripheral branches (conditions under which parsimony will consistently fail). As the number of substitutions in peripheral branches becomes fewer, the parsimony and the evolutionary-parsimony solutions converge. The method is robust and easy to use.

Introduction

Parsimony analysis is one of the most widely used and generally accepted methods of phylogenetic analysis (Fitch 1977). It is characterized by both intellectual and operational simplicity. Yet, under conditions of unequal rates of substitution, parsimony can select an incorrect tree. Parsimony, as a successful method of phylogenetic determination, represents a baseline against which other methods can be measured.

The best-understood instance in which parsimony can incorrectly predict an unrooted tree occurs when sequences in neighboring peripheral branches of a tree evolve at greatly different rates. Felsenstein (1978), investigating a two-state model, showed that when highly different rates occur among four sequences the most parsimonious unrooted tree places the two most highly substituted peripheral branches on one side of the tree and the two least substituted peripheral branches together on the other side. This tree will be chosen no matter what the topology of the true tree.

1. Key words: parsimony, phylogeny.

Address for correspondence and reprints: Dr. James A. Lake, Molecular Biology Institute, University of California, Los Angeles, California 90024.

Mol. Biol. Evol. 4(2):167-191. 1987.

© 1987 by The University of Chicago. All rights reserved.

0737-4038/87/0402-4207$02.00

In this paper I propose a method of phylogenetic analysis, related to parsimony analysis and called evolutionary parsimony or the method of operator invariants, that can predict the correct tree even when rates of nucleotide substitution differ by an order of magnitude in adjacent branches of the unrooted tree. The method is robust, and it is easy to calculate. In it, three quantities named “operator invariants” are calculated from four aligned nucleic acid sequences. The invariants are remnants of substitutions that have occurred in the interior branch of the tree and are analogous to the substitutions assigned by parsimony to the interior branch. Both the operator invariants and the parsimony terms are derived by analysis of patterns present in the aligned sequences. These three operator invariants are then used to predict the statistically significant dendrogram. The evolutionary-parsimony method is investigated by using data calculated from a tree of known topology and is shown to accurately predict the initial tree under a variety of conditions, particularly within the zone in which the Felsenstein conditions prevail.


Determining the unrooted evolutionary tree that best reconstructs the evolution of four taxa requires the discrimination of a single tree from a set of three alternative tree topologies. Hence, the problem for four taxa serves as the simplest case model for developing a method to discriminate among topologically distinct dendrograms.

Parsimony and evolutionary parsimony have related but differing criteria. Parsimony selects the tree that requires the minimum number of substitutions. In contrast, evolutionary parsimony selects the tree that requires the minimum number of consistent substitutions (“consistent” is used to imply consistency with evolution in the peripheral branches of the tree). In the limit where the number of substitutions in the peripheral branches of the tree becomes small relative to those in the central branch, all substitutions become consistent ones and the parsimony solution converges to the evolutionary-parsimony solution.

Two simple examples serve to illustrate the differences between substitutions and consistent substitutions. Consider the trees in figure 1. The initial tree in 1a refers to the tree used to calculate the sequences, and the most parsimonious tree is the tree inferred from analysis of the calculated sequences. In these examples, the probability of substitution is equal for all bases. Thus, for an RNA sequence (nucleotides C, U, A, or G), an A would be replaced with equal likelihood by U, C, or G. When there is a high probability of nucleotide substitution in the central branch of the initial tree and low probabilities in the other branches, one finds the pattern xxyy at most positions.

This is the informative pattern for parsimony and identifies the tree that positions taxa 1 and 2 together and 3 and 4 together as being the most parsimonious. The absence of other patterns, except for xxxx, indicates that most substitutions are consistent ones. In this example, parsimony correctly predicts the initial tree.
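The pattern notation used above (xxyy, xxxx, and so on) can be made concrete with a short sketch. The function names and the toy alignment below are hypothetical and are not the sequences of figure 1; the sketch only assumes that a pattern is obtained by relabeling the four nucleotides at one aligned position in order of first appearance.

```python
from collections import Counter

def column_pattern(column: str) -> str:
    """Relabel one aligned position (four nucleotides) by order of first appearance.

    For example 'CCGG' -> 'xxyy', 'CGCG' -> 'xyxy', 'CGCA' -> 'xyxz'.
    """
    labels = "xyzw"
    seen = {}
    out = []
    for base in column:
        if base not in seen:
            seen[base] = labels[len(seen)]
        out.append(seen[base])
    return "".join(out)

def pattern_counts(seqs):
    """Count patterns over all positions of four aligned, equal-length sequences."""
    if len(seqs) != 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("expected four aligned sequences of equal length")
    return Counter(column_pattern("".join(col)) for col in zip(*seqs))

# Hypothetical toy alignment for taxa 1-4 (illustration only).
toy = ["CCAGU", "CCAGA", "GUAGU", "GUAGC"]
counts = pattern_counts(toy)
```

In this toy alignment xxyy is the only pattern that groups the taxa two by two, so a parsimony analysis would place taxa 1 and 2 together and taxa 3 and 4 together.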

FIG. 1. Examples illustrating when parsimony correctly selects a tree and when it fails. Branch lengths (of either 0 or ~0.8) represent the relative probabilities of a nucleotide difference at any one position. The patterns (Cavender 1981) observed in the aligned sequences and the number of their occurrences are shown adjacent to the sequences. In 1a parsimony correctly predicts the true tree. In 1b the tree predicted by parsimony places taxa 1 and 3 and taxa 2 and 4 together in a different topological arrangement from that found in the true tree.

In the second example, figure 1b, the probability of nucleotide substitutions is very large in the peripheral branches leading to taxa 1 and 3 and small in the branches leading to taxa 2 and 4 and in the central branch. Typical sequences are shown in the panel below the true tree, but the expected pattern xxyy that is diagnostic for the true tree is not present. Contrary to one's expectations, the informative pattern for parsimony that is present is xyxy. (For this example calculations show that, in the limit of infinite substitution in branches 1 and 3, the xyxy pattern should occur at fully 3/16 of the positions.) Hence, the most parsimonious, or minimum-substitution, tree in figure 1b is not the initial tree but is the tree that connects taxa 1 and 3 and connects 2 and 4. In this example, parsimony has picked an incorrect tree because substitutions inserted in peripheral branches of the tree have mimicked the pattern normally produced by substitutions in the central branch of an alternative tree topology. Those substitutions that mimic an incorrect pattern are described as inconsistent substitutions.

The presence of a second type of pattern (xyxz) indicates, however, that xyxy might represent inconsistent substitutions. In the following sections, explicit definitions of both consistent and inconsistent substitutions are detailed and a parsimony-like procedure for determining trees using consistent substitutions is presented.
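The 3/16 figure quoted for the second example can be checked by direct enumeration. The sketch below rests on a simplified limiting model that is an assumption here: taxa 2 and 4 carry the same nucleotide (no substitution in their branches or in the central branch), while taxa 1 and 3 are each independently and uniformly random over the four nucleotides (the limit of infinite substitution). The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGU"

def xyxy_fraction(shared: str = "A") -> Fraction:
    """Fraction of positions showing pattern xyxy in the assumed limiting model.

    `shared` is the nucleotide carried by both taxon 2 and taxon 4.
    Taxa 1 and 3 are enumerated over all 16 equally likely combinations.
    """
    hits = 0
    total = 0
    for b1, b3 in product(BASES, repeat=2):
        total += 1
        # xyxy requires taxa 1 and 3 to agree with each other
        # and to differ from the nucleotide shared by taxa 2 and 4.
        if b1 == b3 and b1 != shared:
            hits += 1
    return Fraction(hits, total)

fraction_xyxy = xyxy_fraction()  # Fraction(3, 16)
```

Three of the sixteen equally likely combinations give xyxy, one for each nucleotide that differs from the shared one, which reproduces the quoted 3/16.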

A Vector Representation

Descriptions of operator invariants and of both consistent and inconsistent substitutions are facilitated by using a vector representation of sequences. In this representation a set of four aligned sequences, each of length n, is represented as the vector sum of n vectors. Thus, in figure 2a, each of the 256 (or 4^4) possible combinations of nucleotides represents a direction in a 256-dimensional sequence space. For example, the vector CGGC, present at the third position along the sequence, is one of 20 subvectors that make up the sequence vector, S.
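One way to picture the sequence vector is as a table of counts indexed by the nucleotide combination found at each aligned position. The following sketch is an illustration under that reading; the function name and the three-position toy alignment are hypothetical and are not the sequences of figure 2a.

```python
from collections import Counter

def sequence_vector(seqs):
    """Sum the per-position vectors of four aligned sequences.

    Each position contributes one unit in the direction named by its four
    nucleotides (for example 'CGGC'); the sum is stored sparsely as a Counter.
    """
    if len(seqs) != 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("expected four aligned sequences of equal length")
    return Counter("".join(col) for col in zip(*seqs))

# Hypothetical alignment of length n = 3 for taxa 1-4.
toy = ["CUC", "GGG", "GGG", "CGC"]
vec = sequence_vector(toy)

n_directions = 4 ** 4  # 256 possible nucleotide combinations per position
```

The counts always sum to the sequence length n, since every position contributes exactly one subvector.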

This 256-dimensional space can be considerably simplified if one includes information about the (molecular-biological) details of the substitution process. Because DNA copying and repair mechanisms distinguish most readily between the larger purines and the smaller pyrimidines, exchanges that substitute one purine for another or one pyrimidine for another (transitions) occur much more frequently than those that interchange a pyrimidine and a purine (transversions). Wilson and co-workers (Brown et al. 1982), for example, have shown that, for mitochondrial DNAs, transitions occur an order of magnitude more frequently than transversions. This difference is applied to the definition of basis vectors in the following paragraph.

Distinguishing between transitions and transversions allows one to reduce the number of basis vectors from 256 to 36. This simpler representation, which replaces each of the nucleotide letter symbols with the numbers one through four, is shown in figure 2b. Since the representation of the nucleotide in position one in a vector is arbitrary, a “1” is assigned to represent it and all others of the same type. Any nucleotide related to the nucleotide in position one by a transition is assigned a “2” to represent it. The first nucleotide (if any) that is related to the nucleotide in position one by a transversion (and all others of the same type) is represented by a “3.” Finally, any nucleotide related by a transition to the type represented by a “3” is represented by a “4.” With this notation, any combination of four nucleotides can be represented by one of 36 types.
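The recoding rules in the preceding paragraph can be sketched as a small function. The names below are hypothetical; the sketch assumes an RNA or DNA alphabet with purines A and G and pyrimidines C and U (or T), and it enumerates all 256 nucleotide combinations to confirm that exactly 36 numeric types arise.

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}

def base_class(base: str) -> str:
    """Return 'R' for a purine and 'Y' for a pyrimidine."""
    if base in PURINES:
        return "R"
    if base in PYRIMIDINES:
        return "Y"
    raise ValueError(f"unexpected nucleotide: {base!r}")

def encode_column(column: str) -> str:
    """Recode four aligned nucleotides as digits 1-4 following the rules in the text."""
    first = column[0]
    first_class = base_class(first)
    third = None  # first nucleotide seen that differs from position one by a transversion
    digits = []
    for base in column:
        if base_class(base) == first_class:
            # Same nucleotide as position one -> 1; its transition partner -> 2.
            digits.append("1" if base == first else "2")
        else:
            if third is None:
                third = base
            # First transversion type -> 3; its transition partner -> 4.
            digits.append("3" if base == third else "4")
    return "".join(digits)

# Enumerate every combination of four nucleotides and collect the distinct codes.
all_codes = {encode_column("".join(c)) for c in product("ACGU", repeat=4)}
```

With these rules the combination CGGC is recoded as 1331 and UGGG as 1333, and `len(all_codes)` comes to 36.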

To simplify this further, a shorthand, one-letter, notation is introduced (table 1). In the example in figure 2a, position CGGC becomes 1331 in 2b and is abbreviated as vector component G; UGGG becomes 1333 and is abbreviated as component A. Thus the set of four aligned sequences can be represented by the single line of components shown in figure 2c. Similarly, a unit vector pointing in the G direction will be represented as $\hat{G}$ in the one-letter code.

This notation allows one to describe four aligned sequences either as spectral components of the aligned sequences or as a sequence vector. In the example in figure 2 the vector component G (1331) occurs four times, and the value of the G spectral component is listed as G = 4. Similarly, the sequence vector S, corresponding to spectral components a, A, b, B, etc. and unit vectors $\hat{a}$, $\hat{A}$, etc., is written as

$$S = a\hat{a} + A\hat{A} + b\hat{b} + B\hat{B} + \cdots \qquad (1)$$

Operator-Invariant Analysis

The spectral components from the previous example are used to illustrate parsimony analysis and evolutionary-parsimony analysis (fig. 3). In this example and throughout this paper, the parsimony analysis will consider only transversion substitutions in the central branch of the tree. Thus, each of the three spectral components E, F, and G is most parsimoniously associated with one of three possible evolutionary trees called the E, the F, and the G trees, respectively. These trees are shown in figure 4. The tree associated with the largest component is most parsimonious, i.e., requires the minimal number of transversion substitutions. Parsimony analysis of the spectral components in figure 3 identifies the G tree as being most parsimonious.
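The parsimony step described above amounts to comparing three counts. The sketch below is an illustration only: the text gives the numeric code for G (1331), but table 1 is not reproduced here, so the codes used for E and F (1133 and 1313) are assumptions made by analogy, and the list of coded positions is hypothetical rather than the data of figure 2.

```python
from collections import Counter

# G = 1331 is stated in the text; the codes for E and F are ASSUMED by analogy.
PARSIMONY_CODES = {"E": "1133", "F": "1313", "G": "1331"}

def most_parsimonious(codes):
    """Return the tree whose parsimony component is largest, plus all three counts.

    `codes` is an iterable of four-digit numeric codes, one per aligned position.
    Ties are broken in favor of the first tree in PARSIMONY_CODES.
    """
    spectrum = Counter(codes)
    support = {tree: spectrum[code] for tree, code in PARSIMONY_CODES.items()}
    best = max(support, key=support.get)
    return best, support

# Hypothetical coded positions (illustration only).
toy_codes = ["1331", "1333", "1331", "1133", "1111", "1331", "1331"]
best, support = most_parsimonious(toy_codes)
```

Here the G component is the largest of the three, so the G tree would be reported as most parsimonious; positions such as 1333 and 1111 do not enter the comparison.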

FIG. 3. The operator spectral components derived in fig. 2 analyzed using both the parsimony method and the method of evolutionary parsimony. In this example, evolutionary parsimony selects the correct E tree even though tree G is most parsimonious.

The method of evolutionary parsimony is similar to parsimony but uses additional spectral components to determine consistent substitutions. As shown in figure 3 the operator invariants are linear combinations of four spectral components. As with parsimony, each invariant (X, Y, or Z) is associated with a tree (the E, F, and G trees, respectively). The evolutionary interpretation of the operator invariants is that they are the remnants of transversion substitutions made in the central branch of the tree.

Only the historically correct tree has contributed consistent substitutions to the sequences, and only it can have a nonzero invariant. The two incorrect trees cannot have remnants and thus will be associated with (statistically) zero invariants. In the example in figure 3, only the X invariant (the E tree) is found to be significantly greater than zero when the invariants are analyzed by the χ²-test (see Statistical Tests and Tree Selection below). The observation that the Z invariant is approximately equal to zero, even though the G tree is the most parsimonious, indicates that many of the substitutions supporting the G tree are inconsistent substitutions.

Each operator invariant (X, Y, or Z) has three types of spectral components measuring different aspects of the evolutionary process: a parsimony term (E, F, or G), two peripheral-branch terms (H and J, L and N, or Q and S), and a compensatory term (u, v, or w). As a guide to understanding the invariants, examples of the functioning of their components are given below.

Under conditions of low substitution rates in peripheral tree branches, the peripheral-branch terms and the compensatory term will be small and only the parsimony term will be large. This is the reason that, in this limit, parsimony and evolutionary parsimony predict the same tree.

When transversion substitutions in peripheral branches of the tree are frequent, this can artifactually increase the parsimony term associated with the incorrect trees.
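How the four components of an invariant might offset one another can be sketched numerically. The text above states only that each invariant is a linear combination of a parsimony term, two peripheral-branch terms, and a compensatory term; the sign pattern used below (parsimony term minus the two peripheral-branch terms plus the compensatory term) is an assumption, and all counts are hypothetical.

```python
def operator_invariant(parsimony: int, peripheral_a: int,
                       peripheral_b: int, compensatory: int) -> int:
    """Combine four spectral components into one invariant.

    ASSUMED form: parsimony - peripheral_a - peripheral_b + compensatory.
    The signs are an assumption and are not confirmed by the text.
    """
    return parsimony - peripheral_a - peripheral_b + compensatory

# Hypothetical counts: few peripheral substitutions feed this tree's pattern,
# so most of its parsimony term survives in the invariant.
x = operator_invariant(parsimony=12, peripheral_a=2, peripheral_b=1, compensatory=1)

# Hypothetical counts: a larger parsimony term that is almost entirely
# accounted for by peripheral-branch terms, leaving an invariant of zero.
z = operator_invariant(parsimony=20, peripheral_a=11, peripheral_b=10, compensatory=1)
```

Under this assumed form the second tree has the larger parsimony term (20 against 12) yet a zero invariant, which is the kind of situation the text describes when frequent peripheral transversions inflate the parsimony term of an incorrect tree.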
