pico-PLAZA, a genome database of microbial photosynthetic eukaryotes

June 16, 2017 | Author: Bram Verhelst | Category: Microbiology, Environmental microbiology, Genomics, Diatoms, Plant Genome Project, Genetic variation, Chlorophyta


pico-PLAZA, a genome database of microbial photosynthetic eukaryotes ARTICLE in ENVIRONMENTAL MICROBIOLOGY · JULY 2013 Impact Factor: 6.2 · DOI: 10.1111/1462-2920.12174 · Source: PubMed


Environmental Microbiology (2013) 15(8), 2147–2153

doi:10.1111/1462-2920.12174

Genomics update pico-PLAZA, a genome database of microbial photosynthetic eukaryotes

Klaas Vandepoele,1,2*† Michiel Van Bel,1,2† Guilhem Richard,3,4 Sofie Van Landeghem,1,2 Bram Verhelst,1,2 Hervé Moreau,3,4 Yves Van de Peer,1,2 Nigel Grimsley3,4 and Gwenael Piganeau3,4 1 Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Gent, Belgium. 2 Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052 Gent, Belgium. 3 CNRS, UMR 7232, Observatoire Océanologique, Banyuls-sur-Mer, France. 4 UPMC Univ Paris 06, UMR 7232, Observatoire Océanologique, Banyuls-sur-Mer, France.

Summary

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains.

*For correspondence. E-mail [email protected]; Tel. (+32) 9 33 13822; Fax (+32) 9 33 13809. †Both authors contributed equally.

© 2013 John Wiley & Sons Ltd and Society for Applied Microbiology

Algal genomics comes of age

One decade ago, sequencing of the 18S rRNA gene from environmental samples of the ocean surface unveiled an astounding diversity of planktonic microbial eukaryotes (Diez et al., 2001; Moon-van der Staay et al., 2001; Guillou et al., 2008; Massana and Pedros-Alio, 2008). Metabarcoding approaches (Tautz and Domazet-Loso, 2011) have repeatedly enabled the identification of new pico-eukaryotic (cell size < 2 μm) lineages with no cultured representatives (Goodstein et al., 2012; Not et al., 2012). These approaches also revealed the ecological importance of microbial algae in coastal waters and the ecophysiological parameters responsible for their global distribution (Demir-Hilton et al., 2011). However, the limits of using a few barcoding genes to estimate diversity became increasingly apparent with the availability of complete genomes. The comparison of the first pair of genomes in the Ostreococcus genus, O. tauri (Derelle et al., 2006) and O. lucimarinus (Palenik et al., 2007), disclosed an unexpected divergence of their genome sequences, with over 15% of species-specific genes and high levels of protein divergence, despite a 99.8% identity over the complete 18S rRNA sequence (Piganeau et al., 2011). The comparative analysis of complete genome sequences will yield alternative, less constrained genes that more accurately represent species diversity in microbial eukaryotes (Slapeta et al., 2006; Piganeau et al., 2011), and thus provide a better understanding of their ecology.

The microalgal genomic era started with the publication of the genome sequence of the red alga Cyanidioschyzon merolae in 2004 (Matsuzaki et al., 2004), followed by projects focusing on the diatom Thalassiosira pseudonana (Armbrust et al., 2004) and the green alga O. tauri (Derelle et al., 2006), which complemented large-scale expressed sequence tag (EST) projects (for a review, see Tirichine and Bowler, 2011). Fuelled by comparative genomics, recent sequencing initiatives have provided significant new insights into secondary endosymbiosis (Moustafa et al., 2009; Deschamps and Moreira, 2012), genome organization and compaction (Derelle et al., 2006; Palenik et al., 2007), intron evolution (Worden et al., 2009) and horizontal gene transfer (Bowler et al., 2008; Moreau et al., 2012) in different unicellular eukaryotic species. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species remains a major challenge. Consequently, performing evolutionary analyses using genome sequences generated by different labs or consortia requires a centralized infrastructure where all information is integrated, in combination with advanced, user-friendly methods for data mining.

pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analyses and study gene functions (for a complete tool overview and data types, see Table S1). Based on 16 algal genome sequences including 1 red, 1 brown and 10 green algae, as well as 5 stramenopile species (Fig. 1), detailed information is available describing genes, homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology (GO), InterPro and text-mining functional annotations (see Supporting Information Text and Figs S1 and S2). Furthermore, different interactive viewers are available to study genome organization using information on gene collinearity and synteny (conservation of relative gene order between species) (Figs S3 and S4). Various search functions, documentation pages, different export options and an extensive glossary are available to guide non-expert scientists during sequence analysis. Finally, pico-PLAZA’s Workbench, a user-specific analysis environment, allows the efficient characterization of large user-defined gene sets or sequences. As such, it provides a means to use the integrated genome information as a reference knowledge base for the analysis of new sequence information such as transcriptomes from non-model species lacking complete genome sequences or functional information. Below, we highlight three examples illustrating how pico-PLAZA can be used to identify species-specific genes, functionally characterize large-scale EST or RNA-Seq data sets, and identify new marker genes to study the diversity and distribution of algal species.

Fig. 1. Content of the pico-PLAZA database. An overview of the genome sequence information available in the pico-PLAZA database (http://bioinformatics.psb.ugent.be/pico-plaza/). The phylogenetic tree is based on the NCBI Taxonomy database. Footnotes: (1) phylum; (2) year of publication in parenthesis; (3) AnnoMine is a novel homology-based text-mining approach to functionally annotate genes; ‘n.d.’ indicates not determined because gene descriptions were assigned by original data provider.

Disentangling the core genome and species-specific gene families

Searching gene families through phylogenetic profiles (i.e. the presence or absence of a gene family in a species) is important to understand gene family dynamics and shed light on both ancestral gene content (Merchant et al., 2007) and new gene birth (Tautz and Domazet-Loso, 2011). The pico-PLAZA Gene Family Finder tool allows the number of genes assigned to families in all available algal genomes to be compared, and can reveal large differences for individual green algal species. Whereas O. tauri contains 6893 genes grouped into families, Volvox carteri has more than 13 800 genes. The identification of orphan genes (genes lacking paralogs or homologous genes in any other species present in the database), gene family sizes (single-copy versus multi-gene) and species content (species-specific or genes sharing homologues in other species) provides a general overview of gene distributions in the different species (Fig. 2A). The Ostreococcus species have the most streamlined genomes, characterized by a large number of single-copy gene families and a low number of multicopy families and orphans. In contrast, the largest number of genes in multicopy families is observed in V. carteri and Chlamydomonas reinhardtii. There are many species-specific gene families, and 1827 gene families are shared by and unique to the Chlorophyceae (Table 1). These results explain the overall high number of genes present in these two species. Counting duplicated genes for all species reveals an important role for tandem duplication (between 3% and 15% of all genes for different species) as a molecular mechanism involved in gene family expansions (Fig. S5). In contrast to species-specific features, determining the number of core genes (i.e. genes shared in all species from a clade or specific set of species) within green algae revealed that 2078 core families are shared between all 10 species (Fig. 2B). When including brown/red algae or higher plants, this number decreases to 1494 and 1089, respectively. Considering both the large number of new pan families (i.e. new families not observed in a specific set of organisms) as well as clade-specific families (Table 1), it is possible to weigh the importance of the acquisition of new gene functions and the expansion of specific gene families in the relationship between genotypic diversity and algal phenotypes (Blanc et al., 2012; Moreau et al., 2012). Examples of expanded functional categories include proteins with ankyrin repeat-containing domains in Ectocarpus siliculosus and Bathycoccus prasinos, protein kinases in C. reinhardtii, and tetratricopeptide-like helical proteins in E. siliculosus and Aureococcus anophagefferens.

Efficient exploration of gene function diversity in large-scale expression data sets

Apart from browsing individual genes or functional categories, pico-PLAZA can also be applied as a data warehouse to analyse large gene sets or characterize new sequences. To demonstrate this feature, we performed a functional and comparative analysis of a set of > 10 000 EST sequences from Phaeodactylum tricornutum using the Workbench. Based on a large-scale expression data set of > 120 000 sequenced cDNAs from 16 different libraries (Maheswari et al., 2010), we created two Workbench experiments for each library. One experiment comprises all sequences expressed in that condition independent of their expression in other conditions (called condition_all), while the second experiment covers sequences uniquely expressed in that condition (called condition_specific). The 16 libraries explore the responses of P. tricornutum to a range of growth conditions, including different nutrient regimes of Si, N, Fe, and dissolved inorganic carbon, stress (hyposalinity and low temperature), and blue light. Results on associated genes, families and functional GO enrichment analysis for all 32 experiments are summarized in Tables S2 and S3. We further present a detailed analysis of sequences from the ‘urea adapted (ua)’ library. After mapping all 3436 ‘ua’ sequences to the genome annotation of P. tricornutum (BLASTN against annotated transcripts; E-value < 1e-05 via the Workbench), a total of 2863 gene models were tagged with one or more EST sequences. Ninety-four per cent of these genes are associated with 1954 pico-PLAZA multigene families, and a detailed analysis of the phylogenetic family profiles reveals that 69 (4%) and 441 (23%) families are specific to P. tricornutum and diatoms respectively. Interestingly, the latter includes a family of S-adenosylmethionine decarboxylases (HOM004619) involved in spermidine biosynthesis that was putatively acquired through horizontal gene transfer from a bacterial donor (Maheswari et al., 2010). GO enrichment analysis (Fig. S6) of the ‘ua_all’ gene set reveals an overrepresentation of genes involved in nitrogen (405 genes), amino acid (117 genes) and organic acid metabolism (132 genes), confirming that diatoms can use urea as a nitrogen source (Armbrust et al., 2004; Maheswari et al., 2010). Interestingly, five ‘ua_specific’ gene families have homologues only in diatoms and therefore comprise diatom-specific genes playing a role in urea-mediated signalling. These functional enrichment results offer a molecular view on the adaptation of P. tricornutum to different environments of ecological relevance. Furthermore, the possibility of combining diatom-specific gene families with specific transcriptional responses under different nutrient, stress and light regimes provides an entry point to link currently unknown genes with unique phenotypic features (Gollery et al., 2006).

Fig. 2. Overview of gene content in different algal genomes. A. Fraction of protein-coding genes assigned to different categories based on homologues in other species and copy number. B. Pan and core genome plots. Starting from the first species (Volvox carteri, left), the number of genes with homologues in other species (right) is scored as ‘core’ whereas the number of new genes without homologues in the included species (left from current species) are scored as ‘new pan genes’. ‘Pan genes’ indicates the sum of all ‘new pan genes’ based on the species already included. The number of core genes at specific taxonomic levels is indicated. The left Y-axis covers the number of core and pan genes, the right Y-axis reports the number of new pan genes.

Table 1. Clade-specific gene families.

Clade-specific core families(a)    # gene families
Land plants (3)                    974
Green algae (10)                   37
Chlorophyceae (2)                  1827
Trebouxiophyceae (2)               139
Mamiellophyceae (6)                449
Diatoms (3)                        1035

a. Numbers in parenthesis indicate the number of included species.

Sieving complete genomes for new barcoding genes

Based on the different integrated genomes, precomputed gene families and detailed gene orthology information (Fig. S2), pico-PLAZA enables a systematic screen of the gene content from complete genomes of microbial photosynthetic eukaryotes. This permits the identification of alternative barcoding genes to screen metagenomes and to address questions about the diversity and distribution of microbial algae. These candidate marker genes should preferably be single-copy genes with a scalable phylogenetic spread from the genus to the order and phylum level. Although this case study is currently restricted to the lineages represented in pico-PLAZA, the number of available genomes will rapidly increase because of future genome projects of microbial eukaryotes and large-scale sequencing initiatives such as the Tara Oceans protist sequencing project (Karsenti et al., 2011) and CAMERA (Sun et al., 2011).
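The single-copy, clade-restricted criterion for candidate marker genes can be sketched as a simple filter over a table of gene counts per family and species. This is a minimal illustration only: the family identifiers, species codes and counts below are invented, and the table layout is an assumption, not the pico-PLAZA export format or the Gene Family Finder implementation.

```python
from typing import Dict, Iterable, List

# Hypothetical copy-number table: family id -> {species code: gene count}.
# Family ids, species codes and counts are invented for illustration.
FamilyCounts = Dict[str, Dict[str, int]]


def single_copy_clade_markers(
    counts: FamilyCounts,
    clade: Iterable[str],
    outgroup: Iterable[str],
) -> List[str]:
    """Return families with exactly one gene in every clade species
    and no gene in any outgroup species."""
    clade = list(clade)
    outgroup = list(outgroup)
    markers = []
    for family, per_species in counts.items():
        in_clade = all(per_species.get(sp, 0) == 1 for sp in clade)
        absent_outside = all(per_species.get(sp, 0) == 0 for sp in outgroup)
        if in_clade and absent_outside:
            markers.append(family)
    return sorted(markers)


example_counts: FamilyCounts = {
    "FAM_A": {"ota": 1, "olu": 1, "ors": 1},            # single copy, clade-specific
    "FAM_B": {"ota": 1, "olu": 2, "ors": 1},            # duplicated in one species
    "FAM_C": {"ota": 1, "olu": 1, "ors": 1, "mpu": 1},  # also present in the outgroup
    "FAM_D": {"ota": 1, "olu": 1},                      # missing from one clade species
}

clade_species = ["ota", "olu", "ors"]   # assumed codes for the clade of interest
outgroup_species = ["mpu", "bpr"]       # assumed codes for species outside it
candidates = single_copy_clade_markers(example_counts, clade_species, outgroup_species)
print(candidates)
```

Widening or narrowing the `clade_species` list is one way to express the "scalable phylogenetic spread" from genus to order level; families that pass at a broader level are candidates for broader markers.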
To identify lineage-specific genes for environmental monitoring, the Gene Family Finder tool can be used to find species or clade-specific gene families and identify putative gene markers. For example, 442 protein-coding genes are single copy in all three Ostreococcus species (option ‘Clade selection: Ostreococcus’) and absent in Micromonas and Bathycoccus (Table S4). The single-copy feature of a candidate barcoding gene is essential to avoid spurious diversity overestimation from multiple gene copies within a genome. Performing a query on single-copy genes in the order Mamiellales leads to the retrieval of 328 gene families. For each of these gene families, visual inspection of the amino-acid alignment using the Jalview editor (University of Dundee, Dundee, Scotland, UK) (Waterhouse et al., 2009) enables the identification of conserved motifs for polymerase chain reaction (PCR) primer design. This two-step protocol provides a practical approach for the detection of genes that can be used to investigate the prevalence of Ostreococcus (or Mamiellophyceae) and their distribution in the ocean. Protein-coding gene markers may enable intraspecific and interspecific diversity to be investigated alongside one another by using appropriate constraints on synonymous coding positions.

As a second example, we demonstrate how pico-PLAZA can be used to identify intraspecific markers based on multispecies collinearity. The level of nucleotide polymorphism at neutrally evolving sites is a fundamental parameter in molecular evolution, as it is informative about the mutation rate and the effective population size of a species. The proportion of neutrally evolving sites is expected to be lower in protein-coding genes than in intergenic regions. In Ostreococcus, intergenic regions flanked by two stop codons (called ‘tail-to-tail’ intergenic regions) have the highest proportion of neutrally evolving sites (Piganeau et al., 2009). Using the GenomeView genome browser (Broad Institute of MIT and Harvard, Cambridge, MA, USA) (Abeel et al., 2012), pico-PLAZA enables tail-to-tail intergenic regions to be identified rapidly in each genome. Furthermore, cross-species collinearity information (Fig. S4) provides detailed information about conserved intergenic regions that are flanked by orthologous genes. These regions are good candidates for the estimation of intraspecific diversity from environmental strains.
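As a rough illustration of what a tail-to-tail intergenic region is, the sketch below scans gene coordinates on a single chromosome and reports the gaps between adjacent convergent genes, i.e. a forward-strand gene followed by a reverse-strand gene, so that both stop codons face the gap. The `Gene` record, the coordinates and the function name are hypothetical; this is not how GenomeView or pico-PLAZA compute these regions, and it ignores nested genes and collinearity with other species.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    # Hypothetical minimal gene record; coordinates are 1-based, inclusive.
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def tail_to_tail_regions(genes: List[Gene]) -> List[Tuple[str, str, int, int]]:
    """Return (left gene, right gene, region start, region end) for every
    intergenic region between adjacent convergent genes on one chromosome."""
    ordered = sorted(genes, key=lambda g: g.start)
    regions = []
    for left, right in zip(ordered, ordered[1:]):
        gap_start, gap_end = left.end + 1, right.start - 1
        if gap_start > gap_end:
            continue  # overlapping or abutting genes leave no intergenic region
        # A "+" gene ends with its stop codon on the right; a "-" gene has
        # its stop codon on the left, so "+" followed by "-" is tail-to-tail.
        if left.strand == "+" and right.strand == "-":
            regions.append((left.name, right.name, gap_start, gap_end))
    return regions


toy_chromosome = [
    Gene("g1", 100, 900, "+"),
    Gene("g2", 1100, 2000, "-"),  # g1 and g2 converge: tail-to-tail
    Gene("g3", 2300, 3100, "+"),  # g2 and g3 diverge: head-to-head
    Gene("g4", 3300, 4000, "+"),  # g3 and g4 point the same way: tandem
]
regions = tail_to_tail_regions(toy_chromosome)
print(regions)
```

In practice the resulting regions would still need to be checked against cross-species collinearity, keeping only those flanked by orthologous genes.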
The application of this approach to the genomes of Ostreococcus guided the choice of eight tail-to-tail intergenic regions for use as markers to estimate the level of nucleotide polymorphism in O. tauri in the Northwest Mediterranean (Grimsley et al., 2010). The spectrum of polymorphism observed in these sequences provided indirect evidence for meiotic recombination, a key process of adaptation in natural populations.

Conclusions

pico-PLAZA provides an unparalleled set of data types and tools for comparative genomics and data mining in algae. Future efforts will be made to extend the number of available algal species and to include novel data types to study gene function and regulation. Overall, pico-PLAZA represents a useful toolkit to aid researchers in the exploration of the diversity and evolution of algal genomes through a comprehensible web-based research interface.

Acknowledgements

We would like to thank Pierre Rouzé, Stephane Rombauts and Evelyne Derelle for general feedback and Sebastian Proost for technical i-ADHoRe support. SVL would like to thank the Research Foundation Flanders (FWO) for funding her research. This work was supported by the Multidisciplinary Research Partnership ‘Bioinformatics: from nucleotides to networks’ Project (no. 01MR0410W) of Ghent University and Agence Nationale de la Recherche grant PHYTADAPT n° NT09_567009.

References

Abeel, T., Van Parys, T., Saeys, Y., Galagan, J., and Van de Peer, Y. (2012) GenomeView: a next-generation genome browser. Nucleic Acids Res 40: e12.
Armbrust, E.V., Berges, J.A., Bowler, C., Green, B.R., Martinez, D., Putnam, N.H., et al. (2004) The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science 306: 79–86.
Blanc, G., Agarkova, I., Grimwood, J., Kuo, A., Brueggeman, A., Dunigan, D., et al. (2012) The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation. Genome Biol 13: R39.
Bowler, C., Allen, A.E., Badger, J.H., Grimwood, J., Jabbari, K., Kuo, A., et al. (2008) The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature 456: 239–244.
Demir-Hilton, E., Sudek, S., Cuvelier, M.L., Gentemann, C.L., Zehr, J.P., and Worden, A.Z. (2011) Global distribution patterns of distinct clades of the photosynthetic picoeukaryote Ostreococcus. ISME J 5: 1095–1107.
Derelle, E., Ferraz, C., Rombauts, S., Rouze, P., Worden, A.Z., Robbens, S., et al. (2006) Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc Natl Acad Sci USA 103: 11647–11652.
Deschamps, P., and Moreira, D. (2012) Reevaluating the green contribution to diatom genomes. Genome Biol Evol 4: 683–688.
Diez, B., Pedros-Alio, C., and Massana, R. (2001) Study of genetic diversity of eukaryotic picoplankton in different oceanic regions by small-subunit rRNA gene cloning and sequencing. Appl Environ Microbiol 67: 2932–2941.
Gollery, M., Harper, J., Cushman, J., Mittler, T., Girke, T., Zhu, J.K., et al. (2006) What makes species unique? The contribution of proteins with obscure features. Genome Biol 7: R57.
Goodstein, D.M., Shu, S., Howson, R., Neupane, R., Hayes, R.D., Fazo, J., et al. (2012) Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 40: D1178–D1186.
Grimsley, N., Pequin, B., Bachy, C., Moreau, H., and Piganeau, G. (2010) Cryptic sex in the smallest eukaryotic marine green alga. Mol Biol Evol 27: 47–54.
Guillou, L., Viprey, M., Chambouvet, A., Welsh, R.M., Kirkham, A.R., Massana, R., et al. (2008) Widespread occurrence and genetic diversity of marine parasitoids belonging to Syndiniales (Alveolata). Environ Microbiol 10: 3349–3365.
Karsenti, E., Acinas, S.G., Bork, P., Bowler, C., De Vargas, C., Raes, J., et al. (2011) A holistic approach to marine eco-systems biology. PLoS Biol 9: e1001177.
Maheswari, U., Jabbari, K., Petit, J.L., Porcel, B.M., Allen, A.E., Cadoret, J.P., et al. (2010) Digital expression profiling of novel diatom transcripts provides insight into their biological functions. Genome Biol 11: R85.
Massana, R., and Pedros-Alio, C. (2008) Unveiling new microbial eukaryotes in the surface ocean. Curr Opin Microbiol 11: 213–218.
Matsuzaki, M., Misumi, O., Shin, I.T., Maruyama, S., Takahara, M., Miyagishima, S.Y., et al. (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428: 653–657.
Merchant, S.S., Prochnik, S.E., Vallon, O., Harris, E.H., Karpowicz, S.J., Witman, G.B., et al. (2007) The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318: 245–250.
Moon-van der Staay, S.Y., De Wachter, R., and Vaulot, D. (2001) Oceanic 18S rDNA sequences from picoplankton reveal unsuspected eukaryotic diversity. Nature 409: 607–610.
Moreau, H., Verhelst, B., Couloux, A., Derelle, E., Rombauts, S., Grimsley, N., et al. (2012) Gene functionalities and genome structure in Bathycoccus prasinos reflect cellular specializations at the base of the green lineage. Genome Biol 13: R74.
Moustafa, A., Beszteri, B., Maier, U.G., Bowler, C., Valentin, K., and Bhattacharya, D. (2009) Genomic footprints of a cryptic plastid endosymbiosis in diatoms. Science 324: 1724–1726.
Not, F., Siano, R., Kooistra, W., Simon, N., Vaulot, D., and Probert, I. (2012) Diversity and ecology of eukaryotic marine phytoplankton. Adv Bot Res 64: 1–53.
Palenik, B., Grimwood, J., Aerts, A., Rouze, P., Salamov, A., Putnam, N., et al. (2007) The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc Natl Acad Sci USA 104: 7705–7710.
Piganeau, G., Vandepoele, K., Gourbiere, S., Van de Peer, Y., and Moreau, H. (2009) Unravelling cis-regulatory elements in the genome of the smallest photosynthetic eukaryote: phylogenetic footprinting in Ostreococcus. J Mol Evol 69: 249–259.
Piganeau, G., Eyre-Walker, A., Jancek, S., Grimsley, N., and Moreau, H. (2011) How and why DNA barcodes underestimate the diversity of microbial eukaryotes. PLoS ONE 6: e16342.
Slapeta, J., Lopez-Garcia, P., and Moreira, D. (2006) Global dispersal and ancient cryptic species in the smallest marine eukaryotes. Mol Biol Evol 23: 23–29.
Sun, S., Chen, J., Li, W., Altintas, I., Lin, A., Peltier, S., et al. (2011) Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource. Nucleic Acids Res 39: D546–D551.
Tautz, D., and Domazet-Loso, T. (2011) The evolutionary origin of orphan genes. Nat Rev Genet 12: 692–702.
Tirichine, L., and Bowler, C. (2011) Decoding algal genomes: tracing back the history of photosynthetic life on Earth. Plant J 66: 45–57.
Waterhouse, A.M., Procter, J.B., Martin, D.M., Clamp, M., and Barton, G.J. (2009) Jalview Version 2 – a multiple sequence alignment editor and analysis workbench. Bioinformatics 25: 1189–1191.
Worden, A.Z., Lee, J.H., Mock, T., Rouze, P., Simmons, M.P., Aerts, A.L., et al. (2009) Green evolution and dynamic adaptations revealed by genomes of the marine picoeukaryotes Micromonas. Science 324: 268–272.

Supporting information

Additional Supporting Information may be found in the online version of this article at the publisher’s web-site:

Fig. S1. Integration of protein domain and gene structure conservation using phylogenetic trees.
Fig. S2. Integrative Orthology Viewer.
Fig. S3. Exploring genome-wide collinearity and synteny.
Fig. S4. From pairwise to multispecies genome collinearity.
Fig. S5. Overview of gene duplicates per species.
Fig. S6. GO enrichment output for P. tricornutum dataset urea_adapted_all.
Table S1. Overview of tools in the pico-PLAZA platform.
Table S2. Transcript count summary results case study ‘Efficient exploration of gene function diversity in large-scale expression data sets’.
Table S3. GO enrichment summary results case study ‘Efficient exploration of gene function diversity in large-scale expression data sets’.
Table S4. Gene family results Ostreococcus case study ‘Application of pico-PLAZA for metagenomic screens’.
