Characterizing DNA preservation in degraded specimens of Amara alpina (Carabidae: Coleoptera)


Molecular Ecology Resources (2014) 14, 606–615

doi: 10.1111/1755-0998.12205

PETER D. HEINTZMAN,*1 SCOTT A. ELIAS,† KAREN MOORE,‡ KONRAD PASZKIEWICZ‡ and IAN BARNES*2

*School of Biological Sciences, Royal Holloway University of London, Egham, TW20 0EX, UK, †Department of Geography, Royal Holloway University of London, Egham, TW20 0EX, UK, ‡Exeter Sequencing Service, Biosciences, College of Life & Environmental Sciences, University of Exeter, Exeter, EX4 4QD, UK

Abstract

DNA preserved in degraded beetle (Coleoptera) specimens, including those derived from dry-stored museum and ancient permafrost-preserved environments, could provide a valuable resource for researchers interested in species and population histories over timescales from decades to millennia. However, the potential of these samples as genetic resources is currently unassessed. Here, using Sanger and Illumina shotgun sequence data, we explored DNA preservation in specimens of the ground beetle Amara alpina, from both museum and ancient environments. Nearly all museum specimens had amplifiable DNA, with the maximum amplifiable fragment length decreasing with age. Amplification of DNA was only possible in 45% of ancient specimens. Preserved mitochondrial DNA fragments were significantly longer than those of nuclear DNA in both museum and ancient specimens. Metagenomic characterization of extracted DNA demonstrated that parasite-derived sequences, including Wolbachia and Spiroplasma, are recoverable from museum beetle specimens. Ancient DNA extracts contained beetle DNA in amounts comparable to museum specimens. Overall, our data demonstrate that there is great potential for both museum and ancient specimens of beetles in future genetic studies, and we see no reason why this would not be the case for other orders of insect.

Keywords: ancient DNA, Coleoptera, DNA preservation, metagenomics, museum DNA, shotgun sequencing

Received 20 August 2013; revision received 15 November 2013; accepted 19 November 2013

Correspondence: Peter D. Heintzman, Fax: 1-831-459-5353; E-mail: [email protected]
1 Present address: Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
2 Present address: Earth Sciences Department, Natural History Museum, Cromwell Road, London, SW7 5BD, UK

© 2013 John Wiley & Sons Ltd

Introduction

The beetles (Insecta: Coleoptera) are the most speciose insect order (40% of insect species; >350 000 species) (Gullan & Cranston 2010) and are found in nearly all terrestrial ecosystems, fulfilling a great variety of niches (Grove & Stork 2000; New 2007). They are therefore a major focus of biological investigation, in which a genetic approach is often required. To conduct genetic studies, specimens need to be collected. However, due to the inhospitable nature of some regions, such as the Arctic, there may be limited access and opportunity for collecting fresh specimens. Due to more than two centuries of collection effort, hundreds of millions of insect specimens have been deposited in museum collections (Arino 2010), thereby reducing the need to sample directly from difficult regions (Schaefer et al. 2009). In addition, these specimens can add a temporal perspective, on a decadal to centennial timescale (Rowe et al. 2011), to genetic studies (Harper et al. 2006; Hartley et al. 2006).

However, DNA from museum specimens is highly degraded (Wandeler et al. 2007), which has no doubt dissuaded researchers from utilizing them. Previous genetic studies have generally been small in scale and limited in scope, with no standardized data that characterize the DNA present in these remains (Goldstein & Desalle 2003; Gilbert et al. 2007a; Thomsen et al. 2009; Castalanelli et al. 2010; Gibson et al. 2012). A large-scale assessment of the proportion and preservation of endogenous DNA is therefore required. This would inform researchers keen to exploit these genetic resources, as well as museum curators who may be concerned about potentially destructive genetic analyses (Mandrioli 2008).

Ancient beetle specimens, which are routinely recovered from permafrost and other environments (Elias 2010), are also potential genetic resources that have the capacity to further extend the temporal aspect of beetle DNA studies to millennial timescales. Recent proof-of-concept studies have demonstrated the presence of beetle ancient (a)DNA from a variety of environments and ages (Reiss 2006; Willerslev et al. 2007; King et al. 2009; Thomsen et al. 2009), with permafrost considered to have the greatest potential (Reiss 2006). The next step in beetle aDNA research is to characterize the DNA present in permafrost-preserved remains, which would assess their potential as a future genetic resource.

The aim of this study therefore was to characterize DNA preservation in beetle remains from both museum and ancient environments. In characterizing DNA preservation from these two environments, we wished to estimate the potential for recovery of DNA from very small samples, the overall success rate of PCR amplification and the retrievable fragment length using both PCR and Illumina shotgun sequencing. We also used the shotgun data to examine the extent and nature of contamination and the potential recovery of the metagenome. Altogether, these data provide an overview of DNA preservation in both dry-stored museum and ancient permafrost-preserved beetle remains.

Amara alpina (Carabidae: Coleoptera) was selected as the study taxon, as it is well represented in permafrost deposits, especially those of Beringia (Elias et al. 2000), from which aDNA of a multitude of taxa has been recovered (Shapiro & Cooper 2003), and is well represented in the museum collections of Europe and North America. Additionally, A. alpina is the most cold-adapted ground beetle (Bennike et al. 2000) and is important for palaeoclimatic and palaeoenvironmental reconstruction (Elias 2010).

Materials and methods

Specimen provenance, age and collection

Specimens were divided into three age classes: modern (<10 years old), museum (>10 years old) and ancient (permafrost deposits). All specimens, including age and provenance, are listed in Table S1 (Supporting information). Modern specimens (n = 6) were collected from four Beringian (Chukotka, Alaska, Yukon) localities between 2002 and 2004. Museum specimens (n = 133), which had either been pinned or glued to a card mount, were gathered from the Canadian National Collection of Insects, Arachnids and Nematodes (CNC) in April 2010 and the Swedish Museum of Natural History (NRM) in May 2011. These specimens were originally collected between 9 and 136 years before DNA extraction was conducted and originated from 89 localities in Scandinavia (n = 28), Russia (n = 29) and North America (n = 76). A single hind leg (including femur, tibia, tarsi), which is not informative for morphological identification, was removed for each analysis.

Ancient specimens (n = 148), consisting of complete or broken isolated sclerites, were collected from 15 Beringian sites between 2003 and 2006 (Table S2, Supporting information). Specimens ranged in age from >560 000 to 5900 calendar years old, based on either radiocarbon dating of associated plant remains, relative tephra dating of layers above or below beetle-bearing sediments, or rough stratigraphic correlation. Specimens were isolated from the sediment and stored at room temperature (Elias 1994). Prior to DNA extraction, ancient specimens were stored at −20 °C to reduce further degradation (Reiss 2006).

DNA extraction

Degraded specimens (museum, ancient) were extracted, and PCR reactions were prepared, following standardized aDNA handling protocols (Cooper & Poinar 2000; Hofreiter et al. 2001; Gilbert et al. 2005; Wandeler et al. 2007). These included using sterile, dedicated ancient DNA laboratories (Royal Holloway, NRM) and conducting independent replication (Data S1, Supporting information). DNA was extracted using the QIAamp DNA Micro kit (Qiagen), with a modified version of the tissues protocol. The modifications included using carrier RNA and conducting the final elution step twice (50 µL each). Overnight lysis was conducted for 16 to 20 hours. Tween-20 was added to DNA extracts (final concentration of 0.05%) to prevent DNA adhering to the plastic tube surface (Faulds et al. 2004). Extraction controls were used in a ratio of one control to five samples. Specimens were not disintegrated prior to extraction, as initial investigation indicated that this did not affect the likelihood of DNA recovery. Modern DNA was extracted by Mack (2008) using the same method, with the exception that complete specimens were digested instead of a single hind leg [following Gilbert et al. (2007a)]. Two specimens (one modern, one ancient) were extracted in a previously published study (Thomsen et al. 2009).

PCR and Sanger sequencing

Three markers were targeted: mitochondrial COI, and multicopy nuclear 28S and ITS1. Novel overlapping species-specific primer sets were designed using Oligo v6.8 (Rychlik & Rychlik 2005). Primer design templates were based on data from Gilbert et al. (2007a), Mack (2008), Thomsen et al. (2009) and GenBank. PCR amplification conditions, primer sequences, sets and annealing temperatures are listed in Data S1 and Table S3 (Supporting information). Purification and sequencing of PCR products followed Brace et al. (2012). Sequencing was also conducted at the NRM using an ABI 3100 genetic analyser and BigDye Terminator chemistry (v1.1). Sequence data were quality-checked manually in Sequencher v4.7 (Gene Codes) and compared against the Basic Local Alignment Search Tool (BLAST) database and A. alpina reference sequences (GenBank: KF695187–KF695189) to ensure that the correct target had been amplified. To detect any potential cross-sample contamination, sequence data from overlapping fragments were compared for each specimen.

Due to the small sample size of the modern data set, data from modern and museum specimens were combined and are hereafter referred to as museum. All stated fragment lengths include primer sequence. Specimens were considered to yield mitochondrial (mt)DNA or nuclear (nu)DNA if one or more fragments were recovered for each data type. All specimens that failed to yield any DNA were tested with a minimum of three primer sets for mtDNA and two for nuDNA. The use of species-specific primers could impede amplification if specimens had been taxonomically misidentified prior to PCR, leading to DNA recovery rates being underestimated. Therefore, to assess primer set specificity, primers were also tested on a specimen from a congeneric species (A. glacialis). One ancient specimen that yielded mtDNA was excluded from the analysis of DNA recovery rates, due to extract exhaustion prior to testing for the presence of nuDNA. All mtDNA PCR products were sequence verified as A. alpina. A total of 142 nuDNA PCR products were sequenced, with at least one nuDNA fragment being sequence verified from 71 of the 187 specimens reported to have yielded nuDNA (museum: 45/136, ancient: 26/51).

Amplification success was defined as the number of mtDNA (COI) fragments recovered from eight overlapping PCR primer sets, which had products of between 124 and 196 bp. Different combinations of these primers, which amplified longer fragments (287 to 514 bp), were tested on the museum specimens (Table S3, Supporting information). A combination that would have amplified four overlapping fragments was treated as four fragments retrieved. For museum specimens, age was determined using either the collection date or other data (e.g. specimen collected on the same collecting trip as a specimen with an explicit collection date) stated on the specimen label (Table S1, Supporting information). Dated museum specimens were binned into 50-year time intervals. Ancient specimens were binned into 20 thousand year (ka) time intervals, with age determined by calibrated dates (Table S2, Supporting information). Specimens binned as >60 ka ranged in age from >100 to >560 ka. To assess whether older specimens yielded fewer DNA fragments, two-tailed Kruskal–Wallis tests were employed in SPSS v19. Specimens that were undated or imprecisely dated (e.g. Late Pleistocene, ‘LP’), as well as museum specimens that did not yield mtDNA, were excluded from Kruskal–Wallis tests.

To assess the longest amplifiable mtDNA fragment, museum specimens were initially tested for a 446 bp fragment. Successfully amplified specimens were tested for longer fragments, and those that failed to amplify a product were tested for sequentially shorter fragments. A linear regression was performed in SPSS to test for a relationship between the longest amplifiable fragment and age. The specimens excluded from the Kruskal–Wallis tests were also excluded from the regression analysis.
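The age binning and rank-based comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the specimen records are invented for the example, the function names are hypothetical, and the first bin is assumed to start in 1851 to match the 50-year intervals. Only the H statistic is computed; a P-value would come from a chi-squared distribution with (number of groups − 1) degrees of freedom, for which a library routine such as `scipy.stats.kruskal` would normally be used.

```python
from collections import Counter


def bin_collection_year(year, start=1851, width=50):
    """Return the (first, last) years of the fixed-width bin containing `year`."""
    if year < start:
        raise ValueError("year precedes the first bin")
    first = start + ((year - start) // width) * width
    return (first, first + width - 1)


def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, corrected for ties, for a list of groups."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    position = 0
    for group in groups:
        rank_sum = sum(ranks[position:position + len(group)])
        total += rank_sum ** 2 / len(group)
        position += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    # If every value is identical the statistic is undefined; report 0.
    return h / correction if correction > 0 else 0.0


# Hypothetical records: (collection year, number of the eight COI fragments recovered).
specimens = [(1880, 2), (1895, 0), (1899, 5), (1920, 8), (1935, 6),
             (1948, 7), (1975, 8), (1990, 8), (1999, 8)]

by_bin = {}
for year, fragments in specimens:
    by_bin.setdefault(bin_collection_year(year), []).append(fragments)

h = kruskal_wallis_h(list(by_bin.values()))
print(f"H = {h:.3f} with {len(by_bin) - 1} degrees of freedom")
```

With three age bins the test has two degrees of freedom, which matches the d.f. reported for the museum specimens.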

Illumina library preparation, shotgun sequencing and analysis

Six samples (two from each age class), with excellent DNA recovery based on Sanger sequencing, were used for shotgun sequencing (Table S1, Supporting information). DNA libraries were constructed using a modified version of the Meyer & Kircher (2010) protocol, without the initial DNA fragmentation step, and using QIAquick spin columns (Qiagen) for all purification steps. These modifications are listed in Data S1 (Supporting information). Amplified and indexed libraries were quantified using a spectrophotometer and gel electrophoresis. These were then diluted to equimolar concentrations, pooled and sequenced on the Illumina HiSeq-2000 platform at the Exeter Sequencing Facility, using a single lane of 2 × 100 cycles on a paired-end flow cell, following the manufacturer’s instructions.

Raw HiSeq data were filtered to remove poor-quality reads and sequencing artefacts using the standard (Blankenberg et al. 2010) and FASTX toolkits (all v1.0.0 unless otherwise stated), respectively, on the Galaxy server (Goecks et al. 2010). Reads were binned by sample using the FASTX Barcode Splitter, with a single base mismatch permitted. For each data set, forward and reverse reads were merged using SeqPrep, which combined the two reads if overlap was detected and removed adapter sequence. SeqPrep output consisted of the merged and remaining unmerged reads, with between 87.2 and 99.7% of filtered reads being successfully merged. A detailed methodology and pipeline of shotgun data quality control is available in Data S1 and Fig. S1 (Supporting information).

To characterize the fragment length distribution of endogenous DNA from A. alpina, merged reads were aligned to 15 short reference sequences using Bowtie2 (Langmead & Salzberg 2012). These references included eight mtDNA (658 to 16823 bp) and seven nuDNA (183 to 1043 bp) sequences that were either downloaded from GenBank or originated from the Sanger data of this study (Table S4, Supporting information). At least one representative sequence was chosen for each locus available for Amara, as well as the three available carabid mitogenomes (as of 14/09/2012). A reference genome was not used here, as the two available Coleoptera genomes belong to taxa [Tribolium castaneum, Dendroctonus ponderosae (Richards et al. 2008; Keeling et al. 2013)] that diverged from A. alpina more than 250 million years ago (McKenna & Farrell 2009) and were therefore deemed insufficiently similar to act as reference sequences. Duplicate sequences were not removed from BAM files, due to biases observed during duplicate removal (Data S2 and Table S5, Supporting information). To ensure the reasonable assignment of reads to reference sequences, BAM files were indexed and visually inspected using Tablet v1.12.09.03 (Milne et al. 2010).

Fragment length information was extracted from aligned reads in the BAM files and pooled into two categories for each sample (mtDNA and nuDNA). Distributions were plotted using a five-point centred moving average to smooth length distributions. Descriptive statistics were calculated, and comparisons of fragment length distributions between DNA categories were performed using t-tests in SPSS. Appropriate t-statistics were selected using the results of Levene’s test for equality of variances.

To assess the metagenome of degraded specimens, merged and remaining unmerged read files were concatenated, and the ‘Collapse’ tool in Galaxy was used to remove PCR duplicates. Reads were assembled into contigs using a de novo assembly tool (clc_novo_assemble) in the CLC assembly cell v4.0.6-beta, with the minimum contig size set to 40 bp. Contigs were produced in this way to increase the likelihood of robust taxonomic assignment (Prufer et al. 2010) and to further collapse PCR duplicates. Contig sequences were identified taxonomically using BLAST v2.2.25 (database downloaded on 23/08/2012 and gi_taxid_nucl.bin downloaded on 03/09/2012). Resulting taxonomic assignments were visualized in MEGAN v4.70.3 (Huson et al. 2011), with parameters set to minScore = 50, minComplexity = 0.44, topPercent = 10, winScore = 0 and minSupport = 5. These parameters ensured that contigs with low-quality BLAST hits were excluded. Taxonomic information for each of the specimens was combined, normalized and collapsed to the rank of Class where possible. The twelve most abundant classes were scrutinized further by assessing the major composite genera, which each comprised >2% of the identifiable contigs across all specimens.
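For the fragment length analysis described in this section, the per-length proportion of reads and the five-point centred moving average can be sketched as below. This is an illustrative sketch only: the function names and the example lengths are hypothetical, and the treatment of the two positions at each end of the distribution (left undefined here) is an assumption, since the source does not state how the edges were handled.

```python
from collections import Counter


def length_proportions(lengths, max_length):
    """Proportion of reads at each fragment length from 0 to max_length bp."""
    counts = Counter(lengths)
    total = len(lengths)
    return [counts.get(n, 0) / total for n in range(max_length + 1)]


def centred_moving_average(values, window=5):
    """Centred moving average; positions without a full window are None."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centred")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            smoothed.append(None)  # assumption: edges are left undefined
        else:
            smoothed.append(sum(values[i - half:i + half + 1]) / window)
    return smoothed


# Hypothetical fragment lengths (bp) for aligned reads from one sample.
example_lengths = [62, 63, 63, 64, 64, 64, 65, 65, 66, 70]
proportions = length_proportions(example_lengths, max_length=72)
smoothed = centred_moving_average(proportions)
print(smoothed[64])
```

Dividing by the total number of reads in the sample, rather than plotting raw counts, keeps samples with very different read depths on a comparable scale.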

Results and discussion

DNA recovery rate

Nearly all museum specimens (96%) yielded both mtDNA and nuDNA (Table 1). Previous dry-stored museum beetle DNA studies have reported a 46 to 100% recovery rate (Goldstein & Desalle 2003; Gilbert et al. 2007a; Thomsen et al. 2009), which demonstrates the relative effectiveness of the methods employed here. The three specimens that did not yield DNA were all from the CNC, collected from the same locality, in the same year, by the same collector, and represent all the tested specimens from this locality. We speculate that DNA-degrading substances, such as ethyl acetate, may have been used to kill and/or temporarily store these specimens (Reiss et al. 1995; Dillon et al. 1996; Gilbert et al. 2007a).

In ancient specimens, 26% yielded both mtDNA and nuDNA, whereas 54% yielded neither, and ~10% each yielded mtDNA only or nuDNA only (Table 1). This is the first reported recovery of nuDNA from a permafrost-preserved invertebrate and the second report of mtDNA recovered from permafrost-preserved invertebrate macrofossils [following Thomsen et al. (2009)]. MtDNA and nuDNA were recovered from localities aged up to 55 000 and 41 000 radiocarbon years before present, respectively, which represents the oldest ancient invertebrate DNA recovered from macrofossils for both DNA types (King et al. 2009; Thomsen et al. 2009). These recovery rates are low compared with recovery rates from permafrost-preserved bone, which range between 22 and 80% (Barnes et al. 2002, 2007; Shapiro et al. 2004; Campos et al. 2010a,b). This may be due to the very small size of beetle sclerites (< [...] magnitude smaller than the material used in bone-based studies (Barnes et al. 2002, 2007; Shapiro et al. 2004; Campos et al. 2010a,b), therefore resulting in fewer template molecules available for amplification. In addition, bones are often ground into a fine powder to increase the DNA yield. However, due to the potential loss of specimen material, this was not conducted here. Alternatively, thermal age analysis indicates that aDNA preservation may be an order of magnitude poorer in beetle remains when compared to bone (King et al. 2009), which may be due to apatite reducing the fragmentation rate of DNA in bone (Lindahl 1993).

Misidentification was not detected in ancient specimens that yielded DNA (n = 68). It is unlikely that species-specific primer sets would have prevented any potentially misidentified specimens from yielding DNA, as primer sets were also successfully tested on a congeneric species (A. glacialis). Therefore, the assertion made by Quaternary entomologists that even single broken sclerites can be reliably identified (Coope 2004; Elias 2010) is supported by the genetic data.

DNA preservation by age

All eight mtDNA fragments were successfully amplified from 80.6% of museum specimens, whereas 3.6% yielded none (Fig. 1a). The five specimens that did not yield mtDNA were two of the oldest specimens from the NRM and the three CNC specimens that did not yield any DNA. The 15.8% of specimens that amplified one to seven fragments were almost exclusively >100 years old. The relationship between museum specimen age and amplification success was highly significant (two-tailed Kruskal–Wallis test: χ² = 58.995, d.f. = 2, P < 0.001). This may indicate that the concentration of amplifiable DNA decreases with age, due to the continuing occurrence of [...]

[...] Specimens aged >60 ka all failed to yield aDNA, but the sample size of these specimens was small (n = 5). Reduced amplification success with increasing specimen age in nonfrozen sediment-preserved beetles has been observed elsewhere (King et al. 2009).

Fig. 1 Amplification success of mitochondrial DNA from (a) museum and (b) ancient specimens of A. alpina based on the number of amplifiable fragments, and (c) the longest amplifiable fragment retrieved from museum specimens by collection year. Panels (a) and (b) plot specimens (%) against fragments retrieved (0 to 8); panel (c) plots longest fragment (bp) against collection year. Colours indicate specimen age: (a) collection year, blue: 1951–2004; red: 1901–1950; off-white: 1851–1900; green: no date, and (b) calibrated age, red: 1–20 ka; orange: 21–40 ka; green: 41–60 ka; blue: >61 ka; grey: no date. In (c), the trend line equation is y = 1.946x − 3410.417, and the dotted lines illustrate 95% confidence intervals.

Age is a highly significant predictor of the longest amplifiable fragment (Fig. 1c) (linear regression: R² = 0.441, F = 98.612, d.f. = 1, 125, P < 0.001). This suggests that strand breaks and/or interstrand cross-links are still occurring for decades after the specimen has been desiccated and stored. However, the regression analysis may have been affected by several limitations in the data set, including potential differences in efficiency between primer sets and the noncontinuous nature of the fragment lengths targeted. Comparable relationships between age and maximum retrievable fragment length have been shown in specimens of other insect orders (Strange et al. 2009; Ugelvig et al. 2011). Intriguingly, these studies targeted microsatellites, suggesting that this observation may be applicable to mtDNA and nuDNA, as well as a variety of different insect orders. The regression analysis suggests that shorter mtDNA fragments, of a length viable for both PCR and next-generation sequencing-based approaches (70 bp), may be retrievable from museum specimens of late 18th century age. This is earlier than the oldest recovered museum insect mtDNA (1820) (Thomsen et al. 2009) and encompasses the vast majority of entomological specimens housed in collections, indicating their potential utility as genetic resources.
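The extrapolation to late 18th century specimens can be checked with a short worked calculation. The sketch below assumes the Fig. 1c trend line is y = 1.946x − 3410.417, with y the longest amplifiable fragment in bp and x the collection year; the negative intercept is an assumption based on the plotted axis ranges. It also checks that the reported R² agrees with the reported F statistic when the regression has 1 and 125 degrees of freedom, using the standard identity R² = F·df1 / (F·df1 + df2). Function names are hypothetical.

```python
SLOPE = 1.946          # bp per collection year, from the Fig. 1c trend line
INTERCEPT = -3410.417  # bp; the negative sign is an assumption (see lead-in)


def predicted_longest_fragment(year):
    """Longest amplifiable fragment (bp) predicted for a collection year."""
    return SLOPE * year + INTERCEPT


def year_for_fragment_length(length_bp):
    """Collection year at which the trend line predicts the given length."""
    return (length_bp - INTERCEPT) / SLOPE


def r_squared_from_f(f_statistic, df_model, df_residual):
    """Recover R^2 from an F statistic and its degrees of freedom."""
    return f_statistic * df_model / (f_statistic * df_model + df_residual)


year_70bp = year_for_fragment_length(70)
print(f"70 bp predicted for collection year ~{year_70bp:.0f}")
print(f"R^2 implied by F = 98.612, d.f. = 1, 125: {r_squared_from_f(98.612, 1, 125):.3f}")
```

Under these assumptions the 70 bp threshold falls in the late 1780s, which agrees with the late 18th century estimate given in the text. The extrapolation runs beyond the sampled collection years, so it should be read as indicative only.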

Fragment length distributions

A total of 43.3 million paired-end reads were obtained from the HiSeq run, of which 38 million passed quality control measures and were assigned to a sample. Of these, 64 424 reads (0.17% of filtered reads) were identified as originating from A. alpina (Table S5, Supporting information). However, due to the use of short reference sequences, these reads represent a tiny fraction of the total number of reads that originated from A. alpina. The length distributions of the identified reads indicate that nuDNA mean fragment length (79 to 109 bp) is significantly shorter than mtDNA (61 to 101 bp; Fig. 2; t-tests: P < 0.001; Table S6, Supporting information). These length distributions and mean length values are similar between all samples and comparable to other museum and ancient DNA data sets (Gilbert et al. 2007b, 2008; Miller et al. 2009; Rasmussen et al. 2011; Kircher 2012). The disparity between mtDNA and nuDNA mean fragment length is in agreement with recent studies of vertebrates from permafrost and sediment deposits (Schwarz et al. 2009; Allentoft et al. 2012), which suggest that nuDNA degrades at a faster rate than mtDNA. This may be due to the circular configuration of mtDNA making it less accessible to exonucleases (Allentoft et al. 2012), the double membrane of the mitochondrion offering additional protection (Schwarz et al. 2009) or the interaction of nuDNA and histones facilitating strand breaks (Binladen et al. 2006).

Fig. 2 Fragment length distributions of DNA from three age classes (modern, museum, ancient) of A. alpina. Panels: Modern-1, Modern-2, Museum-1, Museum-2, Ancient-1 and Ancient-2; each plots the proportion of reads against fragment length (0 to 200 bp). The proportion of reads is the number of reads for a given fragment length divided by the total number of reads in that sample. Fitted lines are based on a five-point centred moving average. Solid line: nuclear DNA, dotted line: mitochondrial DNA.

Metagenomic analysis

Contigs from modern and museum samples ranged in length from 40 to 7935 bp, with N50 values of between 125 and 184 bp, of which 0.4 to 0.5% could be taxonomically identified (Table S7, Supporting information). The taxonomic profiles of these samples are comparable (Fig. 3), even considering the large age range of these samples (9 to 138 years) and their independent storage histories in separate museums (Table S1, Supporting information). This suggests that the museum metagenome of historical dried beetle, and perhaps insect, specimens may be fairly consistent, regardless of age or storage collection.
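The N50 values quoted for the contig sets summarize assembly contiguity: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. A minimal sketch of that calculation is given below. The function name and the example contig lengths are hypothetical, and the 40 bp filter mirrors the minimum contig size stated in the methods rather than any output of the CLC assembler.

```python
def n50(contig_lengths, min_length=40):
    """N50 of contigs at least `min_length` bp long."""
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs at or above the minimum length")
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        # The first contig that takes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp); the 35 bp contig is dropped by the filter.
example_contigs = [35, 40, 60, 90, 125, 150, 184, 400]
print(n50(example_contigs))
```

Because degraded DNA yields many short contigs, N50 values of a few hundred base pairs, as reported here, are expected even when a handful of contigs reach several kilobases.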

Fig. 3 Metagenomic composition of DNA from three age classes (modern, museum, ancient) of A. alpina. The plot shows the proportion of identified contigs (%) for each sample (Modern-1, Modern-2, Museum-1, Museum-2, Ancient-1, Ancient-2). The twelve most abundant classes and phyla across all data sets are illustrated, in descending order: Insecta, Mollicutes, Betaproteobacteria, Actinopterygii, Actinobacteria, Eudicotyledons, Alphaproteobacteria, Saccharomycetes, Mammalia, Flavobacteriia, Gammaproteobacteria and Arachnida. Invertebrates: red, vertebrates: orange, plants and fungi: yellow, Proteobacteria: green, other bacteria: blue.

The Insecta made up 38 to 49% of identifiable contigs (Fig. 3; Table S7, Supporting information), which were classified as belonging to model species in the Lepidoptera, Hymenoptera, Diptera, Hemiptera and Coleoptera (Table S8, Supporting information). These model insects have their nuclear genomes in the BLAST database and are therefore biased towards during taxonomic assignment. An exception is Abax, which does not have a reference genome, but is taxonomically close to A. alpina (both Carabidae: Harpalinae). The number and variety of component taxa further demonstrates the issues associated with the lack of a suitable reference genome, which, together with the very low proportion of contigs that could be assigned, would suggest that a substantial number of A. alpina contigs were not identified (Prufer et al. 2010). The single-leg extracted museum samples contain comparable proportions of insect DNA to the wholespecimen extracted modern samples, which suggests that a single leg is sufficient for yielding museum DNA from insects. Parasites and commensals constitute the majority of the bacterial contigs in the modern and museum samples. Alphaproteobacterial contigs were mainly composed of Wolbachia. The Wolbachia contigs for modern-1 were assigned to the wPip strain, whereas contigs for the other modern and museum samples were assigned to wRi. These strains belong to different

Wolbachia supergroups [wRi: A, wPip: B; (Klasson et al. 2009)], which would therefore suggest two separate infection histories in this species. Contigs identified to the bacterial class Mollicutes, which comprise 19% in modern-2 and are found in very low proportions in the museum samples, are made up of Spiroplasma. These are arthropod commensals, which can be pathogenic (Regassa & Gasparich 2006). As a commensal, Spiroplasma are found in the arthropod gut and become pathogenic when they enter the hemolymph (Regassa & Gasparich 2006). Given that the modern samples had far higher proportions of Spiroplasma compared with the museum samples and that DNA was extracted from whole specimens rather than legs in the former, it is inferred that Spiroplasma detected here were commensals. Another major bacterial parasite of beetles, Rickettsia (Duron et al. 2008), was not detected in any of the samples analysed. We see no obvious reason why this metagenomic approach would not be applicable to the investigation of parasitism in museum specimens of other insect orders, as well as other arthropod groups. The remaining identifiable contigs in the modern and museum samples were assigned to the Mammalia (9 to 17%; Homo, Mus), Actinopterygii (Danio), Eudicotyledons and Saccharomycetes. The origins of these contaminant contigs are speculated to be from human handling (Homo), the museum environment (Eudicotyledons,

© 2013 John Wiley & Sons Ltd

D N A P R E S E R V A T I O N I N D E G R A D E D B E E T L E S P E C I M E N S 613 Saccharomycetes, Mus) and potential misassignments in the BLAST database. The two ancient samples comprise contigs that range in length from 40 to 11617 bp, with N50 values of between 174 and 214 bp, of which 16 to 19% could be taxonomically identified (Table S7, Supporting information). These samples have profoundly different taxonomic profiles to the modern and museum samples, with 0.2 to 0.3% of identified contigs assigned as Insecta (Fig. 3; Table S7, Supporting information). There were contigs assigned to Amara in both ancient samples. Given the large taxonomic diversity of insects and the limited amount of genetic information available for Amara, we would not expect any contigs to be falsely identified to the target genus, if the insect contigs were an artefact of background contamination. Mammalian contamination was low in the ancient samples (0.2 to 2%). The vast majority of identified contigs from the ancient samples were assigned to the Proteobacteria and Actinobacteria. The ratios of these bacterial groups differ, with the Alpha- and Betaproteobacteria (30 to 38% of identified contigs) and Actinobacteria (38%) dominating in ancient-1 and ancient-2, respectively (Fig. 3). These bacterial compositions are consistent with the provenance of these samples (permafrost); the component genera suggest the samples were from glacial or periglacial soils/sediments, near aquatic sources (Table S8, Supporting information). This was inferred from Polaromonas [glacial and periglacial deposits (Darcy et al. 2011)], Caulobacter [aquatic or semi aquatic habitats (Laub et al. 2007)] and the remaining genera being typical of a soil/ sediment environment (Janssen 2006; Philippot et al. 2007; Doughari et al. 2011). 
This complements the locality information for these samples (Goldbottom Creek and Titaluk River, in permanently frozen sediment) and opens up the possibility of using the bacterial metagenome of ancient samples to assess unknown or dubious provenance information. However, the diversity and ubiquity of many environmental bacteria would only allow for general inferences to be made.

Although the proportion of endogenous DNA from A. alpina in these chitinous remains seems to be around two orders of magnitude lower than in other permafrost or cold-preserved tissues, such as bone and hair (Poinar et al. 2006; Gilbert et al. 2008; Miller et al. 2008; Lindqvist et al. 2010), this is due to BLAST bias, which resulted from the lack of a suitable reference genome. Based on the proportion of contigs identified, modern and museum samples have an average of 165 times more insect DNA than ancient samples. However, based on the proportion of all contigs, this ratio decreases from 165:1 to 4:1, with the disparity between these two ratios being due to a far greater proportion of contigs identified in the ancient samples (Table S7, Supporting information). This suggests that the amount of insect DNA in permafrost-preserved remains is comparable to the amount found in museum specimens, although direct comparison between the museum and the ancient samples may be problematic if insect DNA in the ancient samples was outcompeted by environmental DNA during shotgun sequencing.
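The contig statistics discussed above can be illustrated with a minimal Python sketch. It is not the workflow used in this study: the function names are hypothetical, and the contig counts are invented for illustration rather than taken from Table S7. The sketch shows how an N50 value is obtained from a list of contig lengths, and how the apparent excess of insect DNA in museum samples shrinks when insect contigs are expressed as a proportion of all contigs instead of as a proportion of taxonomically identified contigs.

```python
from typing import Sequence, Tuple


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are taken from longest to shortest; the N50 is the length of the
    contig at which the running total first reaches half of all assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def insect_proportions(insect: int, identified: int, total: int) -> Tuple[float, float]:
    """Insect contigs as a proportion of (identified contigs, all contigs)."""
    return insect / identified, insect / total


# Hypothetical counts chosen for illustration only (NOT the values in Table S7).
# The museum sample has few identified contigs, most of them insect; the ancient
# sample has many identified contigs, almost all of them bacterial.
museum = insect_proportions(insect=500, identified=1_000, total=100_000)
ancient = insect_proportions(insect=50, identified=20_000, total=100_000)

# Fold difference in insect DNA under the two denominators.
ratio_identified = museum[0] / ancient[0]
ratio_all = museum[1] / ancient[1]

print(f"relative to identified contigs: {ratio_identified:.0f}:1")
print(f"relative to all contigs: {ratio_all:.0f}:1")
```

With these invented counts, the museum sample appears to hold 200 times more insect DNA when only identified contigs are considered, but 10 times more when all contigs are considered. The gap between the two ratios arises solely because a larger share of the ancient contigs could be identified, which is the effect described in the text.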

Conclusions

Using museum and ancient specimens of the ground beetle A. alpina, this study has explored DNA preservation in degraded beetle specimens and assessed their potential utility for future genetic studies. Museum specimens have great potential, with nearly all specimens containing amplifiable endogenous DNA. Furthermore, it was possible to recover parasite sequences, which indicates that museum specimens could be routinely used to study host–parasite associations over decadal to centennial timescales (Tsangarasa & Greenwood 2012). Amplifiable endogenous DNA could only be recovered from 45% of ancient specimens, which suggests that large numbers of specimens would be required for large-scale analyses. The vast majority of DNA extracted from ancient specimens was identified as originating from environmental bacteria, although this was probably due to BLAST bias. The major hurdle to future degraded insect DNA research will be the availability of appropriate reference genome sequences for maximum exploitation of recovered sequence data. International initiatives, such as the 5000 insect genomes project (i5k), are anticipated to provide such reference genomes over the next 5 years.

Acknowledgements

The authors thank Svetlana Kuzmina and Philip Thomsen for specimens and DNA extracts; Yves Bousquet, Anthony Davies (both CNC) and Johannes Bergsten (NRM) for access to the museum specimens; Selina Brace, Jessica Thomas, Edwin van Leeuwen, Roland Preece, Aurelien Ginolhac, Matthias Meyer and Martin Kircher for technical advice and assistance; and Tom Gilbert, Stephen Rossiter and three anonymous reviewers for comments on earlier drafts of this work. PDH acknowledges funding from a RHUL Reid Scholarship, the RHUL Research Strategy Fund and SYNTHESYS (SE-TAF-1185; SYNTHESYS receives funding from the European Community–Research Infrastructure Action under FP7 ‘Capacities’ Specific Programme). Details on the i5k initiative can be found at: http://www.arthropodgenomes.org/wiki/i5K.

References

Allentoft ME, Collins M, Harker D et al. (2012) The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proceedings of the Royal Society of London Series B-Biological Sciences, 279, 4724–4733.
Andersen JC, Mills NJ (2012) DNA extraction from museum specimens of parasitic hymenoptera. PLoS ONE, 7, e45549.
Arino AH (2010) Approaches to estimating the universe of natural history collections data. Biodiversity Informatics, 7, 81–92.
Barnes I, Matheus P, Shapiro B, Jensen D, Cooper A (2002) Dynamics of Pleistocene population extinctions in Beringian brown bears. Science, 295, 2267–2270.
Barnes I, Shapiro B, Lister A et al. (2007) Genetic structure and extinction of the woolly mammoth, Mammuthus primigenius. Current Biology, 17, 1072–1075.
Bennike O, Björck S, Böcher J, Walker IR (2000) The Quaternary arthropod fauna of Greenland: a review with new data. Bulletin of the Geological Society of Denmark, 47, 111–134.
Binladen J, Wiuf C, Gilbert MTP et al. (2006) Assessing the fidelity of ancient DNA sequences amplified from nuclear genes. Genetics, 172, 733–741.
Blankenberg D, Gordon A, Von Kuster G et al. (2010) Manipulation of FASTQ data with Galaxy. Bioinformatics, 26, 1783–1785.
Brace S, Barnes I, Powell A et al. (2012) Population history of the Hispaniolan hutia Plagiodontia aedium (Rodentia: Capromyidae): testing the model of ancient differentiation on a geotectonically complex Caribbean island. Molecular Ecology, 21, 2239–2253.
Campos PF, Kristensen T, Orlando L et al. (2010a) Ancient DNA sequences point to a large loss of mitochondrial genetic diversity in the saiga antelope (Saiga tatarica) since the Pleistocene. Molecular Ecology, 19, 4863–4875.
Campos PF, Willerslev E, Sher A et al. (2010b) Ancient DNA analyses exclude humans as the driving force behind late Pleistocene musk ox (Ovibos moschatus) population dynamics. Proceedings of the National Academy of Sciences of the United States of America, 107, 5675–5680.
Castalanelli MA, Severtson DL, Brumley CJ et al. (2010) A rapid nondestructive DNA extraction method for insects and other arthropods. Journal of Asia-Pacific Entomology, 13, 243–248.
Coope GR (2004) Several million years of stability among insect species because of, or in spite of, Ice Age climatic instability? Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 359, 209–214.
Cooper A, Poinar HN (2000) Ancient DNA: do it right or not at all. Science, 289, 1139.
Darcy JL, Lynch RC, King AJ, Robeson MS, Schmidt SK (2011) Global distribution of Polaromonas phylotypes–evidence for a highly successful dispersal capacity. PLoS ONE, 6, e23742.
Dillon N, Austin AD, Bartowsky E (1996) Comparison of preservation techniques for DNA extraction from hymenopterous insects. Insect Molecular Biology, 5, 21–24.
Doughari HJ, Ndakidemi PA, Human IS, Benade S (2011) The ecology, biology and pathogenesis of Acinetobacter spp.: an overview. Microbes and Environments, 26, 101–112.
Duron O, Bouchon D, Boutin S et al. (2008) The diversity of reproductive parasites among arthropods: Wolbachia do not walk alone. BMC Biology, 6, 27.
Elias SA (1994) Quaternary Insects and Their Environments. Smithsonian Institute, Washington, District of Columbia.
Elias SA (2010) Advances in Quaternary Entomology. Elsevier, Oxford, UK.
Elias SA, Berman D, Alfimov A (2000) Late Pleistocene beetle faunas of Beringia: where east met west. Journal of Biogeography, 27, 1349–1363.
Faulds K, Smith WE, Graham D (2004) Evaluation of surface-enhanced resonance Raman scattering for quantitative DNA analysis. Analytical Chemistry, 76, 412–417.
Gibson CM, Kao RH, Blevins KK, Travers PD (2012) Integrative taxonomy for continental-scale terrestrial insect observations. PLoS ONE, 7, e37528.
Gilbert MTP, Bandelt HJ, Hofreiter M, Barnes I (2005) Assessing ancient DNA studies. Trends in Ecology and Evolution, 20, 541–544.
Gilbert MTP, Moore W, Melchior L, Worobey M (2007a) DNA extraction from dry museum beetles without conferring external morphological damage. PLoS ONE, 2, e272.
Gilbert MTP, Tomsho LP, Rendulic S et al. (2007b) Whole-genome shotgun sequencing of mitochondria from ancient hair shafts. Science, 317, 1927–1930.
Gilbert MTP, Kivisild T, Gronnow B et al. (2008) Paleo-Eskimo mtDNA genome reveals matrilineal discontinuity in Greenland. Science, 320, 1787–1789.
Goecks J, Nekrutenko A, Taylor J, Team G (2010) Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biology, 11, 1–13.
Goldstein PZ, Desalle R (2003) Calibrating phylogenetic species formation in a threatened insect using DNA from historical specimens. Molecular Ecology, 12, 1993–1998.
Grove SJ, Stork NE (2000) An inordinate fondness for beetles. Invertebrate Taxonomy, 14, 733–739.
Gullan PJ, Cranston PS (2010) The Insects: An Outline of Entomology, 4th edn. Wiley-Blackwell, Chichester, UK.
Hansen AJ, Mitchell DL, Wiuf C et al. (2006) Crosslinks rather than strand breaks determine access to ancient DNA sequences from frozen sediments. Genetics, 173, 1175–1179.
Harper GL, Maclean N, Goulson D (2006) Analysis of museum specimens suggests extreme genetic drift in the adonis blue butterfly (Polyommatus bellargus). Biological Journal of the Linnean Society, 88, 447–452.
Hartley CJ, Newcomb RD, Russell RJ et al. (2006) Amplification of DNA from preserved specimens shows blowflies were preadapted for the rapid evolution of insecticide resistance. Proceedings of the National Academy of Sciences of the United States of America, 103, 8757–8762.
Hofreiter M, Serre D, Poinar HN, Kuch M, Paabo S (2001) Ancient DNA. Nature Reviews Genetics, 2, 353–359.
van Houdt JK, Breman FC, Virgilio M, de Meyer M (2010) Recovering full DNA barcodes from natural history collections of Tephritid fruitflies (Tephritidae, Diptera) using mini barcodes. Molecular Ecology Resources, 10, 459–465.
Huson DH, Mitra S, Ruscheweyh HJ, Weber N, Schuster SC (2011) Integrative analysis of environmental sequences using MEGAN4. Genome Research, 21, 1552–1560.
Janssen PH (2006) Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes. Applied and Environmental Microbiology, 72, 1719–1728.
Keeling CI, Yuen MM, Liao NY et al. (2013) Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest. Genome Biology, 14, R27.
King GA, Gilbert MTP, Willerslev E, Collins MJ, Kenward H (2009) Recovery of DNA from archaeological insect remains: first results, problems and potential. Journal of Archaeological Science, 36, 1179–1183.
Kircher M (2012) Analysis of high-throughput ancient DNA sequencing data. In: Ancient DNA: Methods and Protocols, Methods in Molecular Biology (eds Shapiro B & Hofreiter M), pp. 197–228. Springer, New York.
Klasson L, Westberg J, Sapountzis P et al. (2009) The mosaic genome structure of the Wolbachia wRi strain infecting Drosophila simulans. Proceedings of the National Academy of Sciences of the United States of America, 106, 5725–5730.
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nature Methods, 9, 357–359.
Laub MT, Shapiro L, McAdams HH (2007) Systems biology of Caulobacter. Annual Review of Genetics, 41, 429–441.
Lindahl T (1993) Instability and Decay of the Primary Structure of DNA. Nature, 362, 709–715.
Lindqvist C, Schuster SC, Sun Y et al. (2010) Complete mitochondrial genome of a Pleistocene jawbone unveils the origin of polar bear. Proceedings of the National Academy of Sciences of the United States of America, 107, 5053–5057.
Mack L (2008) Investigating mitochondrial DNA phylogenies of Arctic and European beetle taxa. MSc thesis, Mainz University.
Mandrioli M (2008) Insect collections and DNA analyses: how to manage collections? Museum Management and Curatorship, 23, 193–199.
McKenna DD, Farrell BD (2009) Beetles (Coleoptera). In: The Timetree of Life (eds Hedges SB & Kumar S), pp. 278–289. Oxford University Press, Oxford, UK.
Meyer M, Kircher M (2010) Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harbor Protocols, 2010, pdb.prot5448.
Miller W, Drautz DI, Ratan A et al. (2008) Sequencing the nuclear genome of the extinct woolly mammoth. Nature, 456, 387–390.
Miller W, Drautz DI, Janecka JE et al. (2009) The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus). Genome Research, 19, 213–220.
Milne I, Bayer M, Cardle L et al. (2010) Tablet–next generation sequence assembly visualization. Bioinformatics, 26, 401–402.
New TR (2007) Beetles and conservation. Journal of Insect Conservation, 11, 1–4.
Philippot L, Hallin S, Schloter M (2007) Ecology of denitrifying prokaryotes in agricultural soil. Advances in Agronomy, 96, 249–305.
Poinar HN, Schwarz C, Qi J et al. (2006) Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA. Science, 311, 392–394.
Prufer K, Stenzel U, Hofreiter M et al. (2010) Computational challenges in the analysis of ancient DNA. Genome Biology, 11, R47.
Rasmussen M, Guo X, Wang Y et al. (2011) An Aboriginal Australian genome reveals separate human dispersals into Asia. Science, 334, 94–98.
Regassa LB, Gasparich GE (2006) Spiroplasmas: evolutionary relationships and biodiversity. Frontiers in Bioscience, 11, 2983–3002.
Reiss RA (2006) Ancient DNA from ice age insects: proceed with caution. Quaternary Science Reviews, 25, 1877–1893.
Reiss RA, Schwert DP, Ashworth AC (1995) Field preservation of Coleoptera for molecular genetic analyses. Environmental Entomology, 24, 716–719.
Richards S, Gibbs RA, Weinstock GM et al. (2008) The genome of the model beetle and pest Tribolium castaneum. Nature, 452, 949–955.
Rowe KC, Singhal S, Macmanes MD et al. (2011) Museum genomics: low-cost and high-accuracy genetic data from historical specimens. Molecular Ecology Resources, 11, 1082–1092.
Rychlik W, Rychlik P (2005) Oligo: Primer Analysis Software. Molecular Biology Insights Inc, Cascade, Colorado, USA.
Schaefer H, Heibl C, Renner SS (2009) Gourds afloat: a dated phylogeny reveals an Asian origin of the gourd family (Cucurbitaceae) and numerous oversea dispersal events. Proceedings of the Royal Society of London Series B-Biological Sciences, 276, 843–851.
Schwarz C, Debruyne R, Kuch M et al. (2009) New insights from old bones: DNA preservation and degradation in permafrost preserved mammoth remains. Nucleic Acids Research, 37, 3215–3229.
Shapiro B, Cooper A (2003) Beringia as an Ice Age genetic museum. Quaternary Research, 60, 94–100.
Shapiro B, Drummond AJ, Rambaut A et al. (2004) Rise and fall of the Beringian steppe bison. Science, 306, 1561–1565.
Strange JP, Knoblett J, Griswold T (2009) DNA amplification from pin-mounted bumble bees (Bombus) in a museum collection: effects of fragment size and specimen age on successful PCR. Apidologie, 40, 134–139.
Thomsen PF, Elias S, Gilbert MT et al. (2009) Non-destructive sampling of ancient insect DNA. PLoS ONE, 4, e5048.
Tsangarasa K, Greenwood AD (2012) Museums and disease: using tissue archive and museum samples to study pathogens. Annals of Anatomy, 194, 58–73.
Ugelvig LV, Nielsen PS, Boomsma JJ, Nash DR (2011) Reconstructing eight decades of genetic variation in an isolated Danish population of the large blue butterfly Maculinea arion. BMC Evolutionary Biology, 11, 201.
Wandeler P, Hoeck PE, Keller LF (2007) Back to the future: museum specimens in population genetics. Trends in Ecology and Evolution, 22, 634–642.
Watts PC, Thompson DJ, Allen KA, Kemp SJ (2007) How useful is DNA extracted from the legs of archived insects for microsatellite-based population genetic analyses? Journal of Insect Conservation, 11, 195–198.
Willerslev E, Cappellini E, Boomsma W et al. (2007) Ancient biomolecules from deep ice cores reveal a forested Southern Greenland. Science, 317, 111–114.

P.D.H., I.B. and S.A.E. designed the research. I.B. and S.A.E. oversaw the project. S.A.E. provided ancient samples. P.D.H. and S.A.E. performed the research. K.M. and K.P. performed the Illumina sequencing. P.D.H. analysed the data. P.D.H. wrote the paper with input from all other authors.

Data accessibility

Representative mitochondrial and nuclear DNA sequences have been deposited in GenBank with Accession nos. KF695187-KF695189. BAM files and metagenomic data sets are in the DRYAD database at doi:10.5061/dryad.0179t. Sample data can be found in the online supplementary material.

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Data S1 Supplementary methods.
Data S2 Duplicate removal bias.
Fig. S1 Customised workflow for analysis of Illumina paired-end DNA sequence data.
Table S1 Data on all Amara specimens used in this study.
Table S2 Locality data for ancient specimens of A. alpina.
Table S3 Primers used in this study.
Table S4 Details of the 15 reference sequences used to retrieve A. alpina sequences from the merged read files.
Table S5 The proportion of reads per sample (a) aligned to the 15 reference sequences listed in Table S4, and (b) affected by the duplicate removal bias toward nuclear DNA in modern and museum samples.
Table S6 Mean fragment length differences between mitochondrial and nuclear DNA.
Table S7 Details of the contigs used for metagenomic assessment, and the proportion that were taxonomically assigned, including those for the Insecta.
Table S8 Major taxonomic components of the groups identified in Fig. 3, and their inferred origin.
