Comparative Sequence Analysis of 23S rRNA from Proteobacteria

July 14, 2017 | Author: Nina Springer | Category: Evolutionary Biology, Microbiology, Comparative sequence analysis, Higher Order Thinking

System. Appl. Microbiol. 18, 164-188 (1995) © Gustav Fischer Verlag, Stuttgart · Jena · New York

Comparative Sequence Analysis of 23S rRNA from Proteobacteria

WOLFGANG LUDWIG 1 *, RAMON ROSSELLÓ-MORA 1, ROSA AZNAR 1, SABINE KLUGBAUER 1, STEFAN SPRING 1, KONSTANTIN REETZ 1, CLAUDIA BEIMFOHR 1, ELKE BROCKMANN 1, GUDRUN KIRCHHOF 1, SILVIA DORN 1, MARIANNE BACHLEITNER 1, NORBERT KLUGBAUER 1, NINA SPRINGER 1, DAVID LANE 2, RAYMOND NIETUPSKI 2, MICHAEL WEIZENEGGER 1, and KARL-HEINZ SCHLEIFER 1

1 Lehrstuhl für Mikrobiologie der Technischen Universität München, D-80290 München, FRG
2 GENE-TRAK Systems, Framingham, MA, USA

Received March 29, 1995

Summary

23S rRNA genes of 17 strains representing the α, β, γ, δ and ε subclasses of the Proteobacteria were completely sequenced. The sequences were aligned to about 120 published as well as unpublished complete or almost complete primary structures of 23S rRNAs from other members of the domain Bacteria representing all known phyla. Primary and higher order structure analyses revealed remarkable differences among the predicted 23S rRNA structures from members of the different subclasses. A phylogenetic tree reflecting the relationships among proteobacteria was reconstructed based on 23S rRNA sequence comparison. The topology of the tree is similar to that of a tree based on an equivalent 16S rRNA sequence data set.

Key words: 16S rRNA - 23S rRNA - Higher order structure - Phylogeny - Proteobacteria

* Corresponding author

Introduction

Comparative 16S rRNA sequence analysis, as originally introduced by Carl Woese during the 1970s, is currently the most reliable and widely used method for the reconstruction of major (micro-)organismal phylogenies. The rapidly progressing elucidation of phylogenies has had, and increasingly has, a major impact on the restructuring and improvement of taxonomy, especially that of microorganisms, towards a classification system based on natural relationships (Woese, 1987; Woese et al., 1990; Woese, 1992; Stackebrandt, 1992; Olsen and Woese, 1993; Ludwig and Schleifer, 1994; Woese, 1994). However, the impact of the rRNA technologies on the applied disciplines of biology, such as medical, food, and environmental microbiology and microbial ecology, probably has to be rated even higher. Comparative sequencing and specific probe technology (DeLong et al., 1989; Schleifer et al., 1993; Amann and Ludwig, 1994; Amann et al., 1995), or the combination of both, are most valuable tools for the rapid identification of microorganisms at and above the species level. The progress with respect to sensitivity and routine applicability of rRNA technologies which resulted from the establishment of polymerase chain reaction based in vitro amplification techniques can hardly be overestimated. Probably the most momentous aspect of rRNA technology development during the past few years is the accessibility of non-cultured microorganisms for phylogenetic analyses and for in situ detection and identification using the whole cell probing approach (Amann et al., 1991; Spring et al., 1992; Spring et al., 1993; Amann and Ludwig, 1994; Amann et al., 1995).

The 16S rRNA molecules have been proven to fulfill the requirements of phylogenetic marker molecules (Woese, 1987; Woese et al., 1990; Olsen and Woese, 1993; Ludwig and Schleifer, 1994) and to provide useful target sites for hybridization probes diagnostic for various levels of phylogenetic relatedness (Stahl and Amann, 1993; Schleifer et al., 1993; Amann and Ludwig, 1994; Amann et al., 1995). Initially, 16S rRNA was chosen as the object of phylogenetic analyses because it balances molecular complexity, and hence phylogenetic information content, against the experimental expenditure needed to determine its primary structure. The limitations of sequence analyses of large molecules have been overcome by modern rapid techniques for manual or automated rRNA (gene) sequencing. Consequently, a large data set of partial and almost complete 16S rRNA sequences is currently available in public databases (Maidak et al., 1994; Van de Peer et al., 1994), covering the majority of lines of descent among the so far culturable part of the microbial world and a rapidly increasing number of non-cultured environmental organisms. The availability of this large number of primary structures, providing a comprehensive selection of reference data, is a prerequisite for reliable phylogenetic analyses and for sequence or probe based organismal identification, and has thus reinforced the focus of research interest on the small subunit rRNAs.

However, the phylogenetic information content stored in the sequence of (in the case of bacteria) 1500 or so nucleotides of 16S-like rRNAs is limited, as is the number of potential probe target sites. The large subunit rRNAs contain about twice as many monomers as their small subunit counterparts, while the overall primary structure picture is the same: an alternating succession of evolutionarily highly and less conserved regions which are diagnostic for different levels of relationships. Thus, in comparison with 16S(-like) rRNAs, 23S(-like) rRNAs may provide information on additional phylogenetic levels or, given a higher number of characters supporting a given phylogenetic grouping, a (statistically) more confident definition of the respective group (Ludwig and Schleifer, 1994). The first complete 23S rRNA primary structure, that of Escherichia coli, was published in 1980 (Brosius et al., 1980), only two years later than the corresponding 16S rRNA sequence data (Brosius et al., 1978).
However, since then, the large subunit rRNAs have not attracted major interest as phylogenetic markers or as a reservoir of probe target sites. This is reflected by the rather limited number (73) of complete or at least 80% complete bacterial 23S rRNA sequences available in public databases (Table 1; Maidak et al., 1994; De Rijk et al., 1994). The only bacterial line of descent which has been investigated more thoroughly with respect to 23S rRNA so far is that of the Gram-positive bacteria with a low DNA G+C content, for which 35 complete or at least 80% complete sequences are available (Ludwig et al., 1992; Ash and Collins, 1992; Campbell et al., 1993; Harland et al., 1993; Martinez-Murcia et al., 1993; Van der Mer et al., 1993).

In the present study the available data set of 23S rRNA sequences from the Proteobacteria, another major phylogenetic group, was substantially extended. The proteobacteria comprise five subclasses (α-ε) as defined upon 16S rRNA data (Stackebrandt et al., 1988; Stackebrandt, 1992). This phylum contains a phenotypically diverse and phylogenetically intertwined mixture of bacteria. The vast majority of non-cyanobacterial phototrophic bacteria are dispersed over the α-, β-, and γ-subclasses, often closely related to non-phototrophs. Autotrophic, methylotrophic, and magnetotactic bacteria are dispersed over part of the subclasses, although they are not found exclusively among the proteobacteria. Many nitrate and nitrite reducing bacteria are members of the β-subclass, but are also present in other lineages. Most sulfur-dependent proteobacteria and all myxobacteria are found within the δ-subclass. Presently, more than 800 almost complete 16S rRNA primary structures of proteobacteria representing all subclasses are available in public databases (Maidak et al., 1994; Van de Peer et al., 1994). Eighteen 23S rRNA sequences had been available for members of the α-, β-, γ- and ε-subclasses (Maidak et al., 1994; De Rijk et al., 1994), and seventeen additional sequences were determined in the course of the present study.

Materials and Methods

Organisms and culture conditions. The strains used for 23S rRNA sequence determination are specified in Table 1. Acinetobacter calcoaceticus, Aeromonas hydrophila, Alcaligenes faecalis, and Zoogloea ramigera were grown aerobically at 30°C in a medium containing (g/l) tryptone 10; yeast extract 5; glucose 5; NaCl 5; pH 7.5. The medium for the aerobic growth of "Rhizobium lupini" was yeast extract mannitol broth (Vincent, 1970). Cells of Klebsiella pneumoniae were grown aerobically on Trypticase Soy Agar supplemented with 0.7% (w/v) yeast extract at 37°C. Nutrient agar C1B (Oxoid, Wesel, FRG) was used for the aerobic cultivation of Paracoccus denitrificans at 25°C. Rhodospirillum rubrum was cultivated anaerobically at 25°C in the light in the Rhodospirillaceae medium described in the catalogue of strains of the German Collection of Microorganisms and Cell Cultures. Strains of Vibrio vulnificus were grown in Brain Heart Infusion (BHI, Difco) supplemented with 0.5% NaCl (w/v) at 25°C. Cells of Leucothrix mucor, Nannocystis exedens and Stigmatella aurantiaca were kindly provided by H. Reichenbach (Braunschweig, Germany), those of Thiobacillus cuprinus by K. O. Stetter (Regensburg, Germany).

Nucleic acid purification. Genomic DNA was purified according to Marmur (1961) or by using the Qiagen purification kit (Diagen, Hilden, Germany).
Purified genomic DNA of Wolinella succinogenes was a gift of R. Amann (München, Germany).

Northern hybridization. Northern hybridizations of purified proteobacterial RNAs to a 23S rRNA directed oligonucleotide probe (5'-GTfBCCCCATTCGG-3'; Escherichia coli positions 115-127) were performed as described previously (Roller et al., 1992).

Cloning of rRNA genes. Ribosomal RNA genes or gene fragments were obtained by restriction of genomic DNA or by in vitro amplification. Identification, purification and cloning were done as described earlier (Ludwig et al., 1992).

Sequence determination. The sequence analyses of in vitro amplified or cloned rDNA were performed as described previously (Ludwig et al., 1992; Springer et al., 1993 b) using rRNA gene specific primers (Ludwig et al., 1992; Roller et al., 1992).

Data analysis. The 23S rRNA sequences were added to an alignment of about 120 homologous sequences from bacteria using the program package ARB (Strunk et al., in prep.). The tool ARB_EDIT was used for semi-automated sequence alignment according to primary structure similarities, for higher order structure prediction, and for further improvement of the alignment taking into account predicted higher order structure similarity. Finally, the alignment was evaluated by eye and corrected manually. Conservation profiles, masks for alignment column selection according to positional variability (number of different residues per column, or fraction of the most frequent base[s]), similarity and distance matrices as well as distance corrections (Jukes and Cantor, 1969) were established by using ARB_PHYL. Consensus sequences were derived by analyzing the respective selection of aligned sequences with the tool ARB_CONSENSE. Phylogenetic trees were reconstructed applying distance matrix and maximum parsimony methods as implemented in the corresponding tools of the ARB, GDE (Maidak et al., 1994) and PHYLIP (Felsenstein, 1989) program packages. Maximum likelihood analyses were performed using Olsen's program fastDNAml (Maidak et al., 1994).

A variety of different but always almost equivalent data sets of 16S and 23S rRNA primary structures were used for tree reconstructions. The data sets varied with respect to the inclusion of taxa and alignment positions. The most comprehensive data sets contained 16S or 23S rRNA primary structures from all Bacteria and Archaea for which 23S rRNA sequences are available. Subsets always comprised all available 23S rRNA sequences from proteobacteria, or the corresponding 16S rRNA data, and a varying selection of outgroup reference sequences from members of the other bacterial phyla. To recognize and minimize treeing artifacts which may result from the inclusion of highly variable sequence positions (Ludwig and Schleifer, 1994), the data sets were modified by deleting alignment columns according to conservation profiles established for the bacteria, proteobacteria or subclasses, respectively. Distance matrix and maximum parsimony analyses were performed upon the complete set and upon subsets of bacterial and archaeal rRNA sequences. Maximum likelihood based tree reconstructions were performed using the different subsets of proteobacterial sequences, each including one outgroup reference from a non-proteobacterial phylum. The significance of tree topologies was tested by performing bootstrapped distance and parsimony analyses using the respective tools of the PHYLIP package.
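The column selection and distance correction described under Data analysis can be illustrated with a minimal Python sketch. It is not the ARB_PHYL implementation: the function names, the gap symbol "-", and the 0.9 threshold in the usage note are assumptions made for illustration only. The sketch computes a conservation profile as the fraction of the most frequent base per column, derives a mask from it, and applies the Jukes and Cantor (1969) correction d = -3/4 ln(1 - 4p/3) to the fraction p of differing positions.

```python
import math
from collections import Counter

GAP = "-"  # assumed gap symbol; real alignment files may use other characters


def conservation_profile(alignment):
    """Return, per column, the fraction of the most frequent base (gaps ignored)."""
    profile = []
    for column in zip(*alignment):
        bases = [b for b in column if b != GAP]
        if not bases:
            profile.append(0.0)  # column holds gaps only
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        profile.append(top_count / len(bases))
    return profile


def column_mask(profile, min_fraction):
    """Keep only columns whose most frequent base reaches min_fraction."""
    return [fraction >= min_fraction for fraction in profile]


def jukes_cantor_distance(seq_a, seq_b, mask=None):
    """Jukes-Cantor corrected distance over the unmasked, gap-free columns."""
    compared = 0
    differing = 0
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if mask is not None and not mask[i]:
            continue  # column excluded by the mask
        if a == GAP or b == GAP:
            continue  # positions with a gap are not counted
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = differing / compared
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For a toy alignment such as `["ACGU", "ACGA", "ACCU"]`, `column_mask(conservation_profile(alignment), 0.9)` keeps only the two invariant columns, so distances computed with that mask ignore the variable positions that the text identifies as a possible source of treeing artifacts.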

Results

23S rRNA primary structures

Complete 23S rRNA gene sequences were determined for 17 bacterial species representing the five (α-ε) known subclasses of the Proteobacteria phylum (Table 1). A conventionally cloned 23S rRNA gene was sequenced only in the case of Klebsiella pneumoniae (Table 2), whereas in vitro amplified rRNA encoding DNA fragments (rDNA) were used for the corresponding analyses of the other bacteria. The rDNA fragments were either sequenced directly or after cloning (Table 2). To avoid incorrect sequence data, which may result from cloning artifacts, and to minimize errors introduced by the DNA polymerases used for in vitro amplification, direct sequencing of amplified rDNA was preferred. Furthermore, the identities of the recombinant plasmids were verified by direct sequencing of selected (evolutionarily only moderately conserved) parts of the corresponding rDNA fragments. The sequence data have been deposited at the European Bioinformatics Institute (EBI) databases (the former EMBL Data Library; Emmert et al., 1994; for accession numbers see Table 1). The 16S rRNA sequences of Aeromonas hydrophila DSM 30187T, Bradyrhizobium japonicum DSM 30131T, Brevundimonas diminuta DSM 1635, Burkholderia cepacia DSM 50181, Klebsiella pneumoniae DSM 30104T, Leucothrix mucor DSM 2157T, "Rhizobium lupini" DSM 30140, Rhodopseudomonas palustris DSM 126, and Rhodospirillum rubrum DSM 107 were determined to provide an almost equivalent data set of ribosomal small subunit RNA sequences including published ones. The corresponding EBI databases accession numbers are X87271 to X87279.

The new 23S rRNA primary structures derived from the gene sequences were aligned and compared with about 120 complete or at least 80% complete (with respect to the Escherichia coli 23S rRNA) homologous sequences from representatives of Bacteria. The reference sequences are unpublished own data or were taken from public databases (De Rijk et al., 1994; Emmert et al., 1994; Maidak et al., 1994). The G+C contents and lengths of the sequences are given in Table 1. For length calculation, the 5' and 3' termini were assumed in analogy to those of the homologous Escherichia coli sequence. Comparative analyses of the aligned primary structures revealed remarkable insertions and deletions at defined sites, especially in the sequences from representatives of the α-subclass. (The terms insertion and deletion are used to indicate stretches of additional or missing nucleotides using the Escherichia coli 23S rRNA as a standard.) The corresponding sites are indicated in the secondary structure model depicted in Figure 1 and shown in more detail in Figures 2-11.

Positional variabilities in terms of number and character of base differences at homologous positions are depicted in the secondary structure model of Figure 1. This model is based on those published previously (Höpfl et al., 1989; Larsen, 1992; Ludwig et al., 1992; Gutell et al., 1993) and reflects the higher order structure of the Escherichia coli sequence. The sequence shown in the model is a consensus sequence derived from an alignment of all available complete 23S rRNA primary structures from Proteobacteria. Invariant nucleotides are indicated by the corresponding (upper case) symbols (A, C, G, and U). The positional variabilities are shown by using the corresponding IUB codes indicating what combinations of two, three or four different bases are found at the particular alignment positions.
Lower case letters indicate that the particular position is not present in one or more proteobacterial sequences. Only positions that are occupied by a residue in the Escherichia coli sequence are shown in the secondary structure model of Figure 1. Positions, characters and structures of insertions or deletions of more than three nucleotides are shown in Figures 2-11.
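The notation used for the consensus sequence in Figure 1 can be reproduced on a toy alignment. The following Python sketch is an illustration under stated assumptions, not the ARB_CONSENSE tool: it assumes upper case RNA sequences over A, C, G and U with "-" as the gap symbol, and the function name `iub_consensus` is hypothetical. Invariant columns yield the base itself, variable columns yield the IUB ambiguity code for the set of bases observed, and the symbol is written in lower case when the position is missing from at least one sequence.

```python
# IUB ambiguity codes for RNA, keyed by the set of bases observed in a column.
IUB_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("U"): "U",
    frozenset("AG"): "R", frozenset("CU"): "Y",
    frozenset("CG"): "S", frozenset("AU"): "W",
    frozenset("GU"): "K", frozenset("AC"): "M",
    frozenset("CGU"): "B", frozenset("AGU"): "D",
    frozenset("ACU"): "H", frozenset("ACG"): "V",
    frozenset("ACGU"): "N",
}

GAP = "-"  # assumed gap symbol


def iub_consensus(alignment):
    """Build a consensus string from equally long, aligned RNA sequences."""
    symbols = []
    for column in zip(*alignment):
        bases = frozenset(b for b in column if b != GAP)
        if not bases:
            symbols.append(GAP)  # no sequence has a residue here
            continue
        symbol = IUB_CODES[bases]  # raises KeyError for unexpected characters
        if GAP in column:
            symbol = symbol.lower()  # position absent from one or more sequences
        symbols.append(symbol)
    return "".join(symbols)


if __name__ == "__main__":
    toy_alignment = ["ACGU-", "AUGUA", "AUAUA"]
    print(iub_consensus(toy_alignment))  # prints "AYRUa"
```

In the toy alignment the second column contains C and U (code Y), the third contains G and A (code R), and the last column is written as a lower case "a" because the first sequence has no residue there.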

Higher order structure

Secondary and higher order structure predictions were performed for the new 23S rRNA sequences by evaluating the sequences for base pairing according to the widely accepted higher order structure models (Höpfl et al., 1989; Larsen, 1992; Ludwig et al., 1992; Gutell et al., 1993) and by searching for compensating base changes (Gutell and Fox, 1988; Höpfl et al., 1989). The conserved core structure is indicated in the secondary structure model of Figure 1. Invariant higher order structure elements that can be folded in the same manner by all available 23S rRNA primary structures of Proteobacteria are indicated by bold and italicized helix numbers. The different types of variable structure elements are shown in Figures 2-11.
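The search for compensating base changes can be sketched as a simple check on two alignment columns that are predicted to pair. The Python example below is a simplified illustration and not the procedure used by the authors: the function name `pair_support`, the four result labels, and the choice to accept G-U pairs alongside Watson-Crick pairs are all assumptions. A predicted pair counts as supported by a compensating change when every sequence can form a pair at the two positions and at least two sequences differ at both positions.

```python
# Pairs treated as compatible with a helix: Watson-Crick plus G-U wobble (assumption).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

GAP = "-"  # assumed gap symbol


def pair_support(alignment, i, j):
    """Classify alignment columns i and j (0-based) as a candidate base pair."""
    pairs = {
        (seq[i], seq[j])
        for seq in alignment
        if seq[i] != GAP and seq[j] != GAP
    }
    if not pairs:
        return "no data"
    if not pairs <= ALLOWED_PAIRS:
        return "mismatch"  # at least one sequence cannot pair here
    if len(pairs) == 1:
        return "invariant"  # pairing possible, but no change to compare
    for first in pairs:
        for second in pairs:
            # A compensating change alters both partners and keeps the pair.
            if first[0] != second[0] and first[1] != second[1]:
                return "compensated"
    return "single-sided"  # only one partner varies, e.g. G-C versus G-U


if __name__ == "__main__":
    helix_columns = ["GAAAC", "AAAAU", "GAAAU"]
    print(pair_support(helix_columns, 0, 4))  # prints "compensated"
```

In the usage example the first and last columns hold G-C, A-U and G-U; the change from G-C to A-U alters both partners while pairing is retained, which is the pattern taken as evidence for a helix.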

Table 1. Proteobacteria for which complete 23S rRNA primary structures were determined in the present study or for which complete or at least 80% complete sequences are available in public databases (Maidak et al., 1994; Van de Peer et al., 1994)

Columns: Organism; Strain; Phylogeny (subclass); Accession; Length; G+C content (%) of rDNA and of genome; Reference.

[Table body truncated and garbled in this copy. Surviving accession numbers: X87280, X87281, X67946, X87282, X70370.]