Micropia-Dm2, the nucleotide sequence of a rearranged retrotransposon from Drosophila melanogaster


Nucleic Acids Research, Vol. 18, No. 14 4265

© 1990 Oxford University Press

Dirk-Henner Lankenau and Wolfgang Hennig

Department of Molecular and Developmental Genetics, Katholic University, Toernooiveld, NL-6525 ED Nijmegen, The Netherlands

EMBL accession no. X14173

Submitted June 7, 1990

Micropia-Dm2 belongs to a group of four sequenced micropia elements. It was isolated from a lambda genomic clone bank of Drosophila melanogaster screened with micropia-DhMiF2A as a radioactive probe (2, 3). Its sequence features are very similar to those of micropia-Dm11; however, some major rearrangements have taken place. A general description of all micropia elements has been published (4). Here we present the complete nucleotide sequence of micropia-Dm2 and about 500 bp of 5' flanking sequence. The internal sequences of the copia insertion have been partially sequenced (4), but these data are not shown here. Forty-two sequence features are indicated within the sequence. The numbers at the left border of the sequence refer to these features, which are described below.

Description of important sequence features:

1. EcoRI site, end of micropia insert
2. TATA-box
3. Cap-site
4. Polyadenylation signal
5. RNA termination signal
6. Simple tandem repeat
7. Inverted repeat of 3' end of 5' LTR
8. Leu tRNA primer binding site
9. Initiation codon Met of MHC-class I-like (similar) protein gene
10. Insertion of copia, 5' target site duplication
11. Inverted repeat of 5' end of 5' LTR of copia
12. Copia position 4768 (Emori et al. 1985)
13. Purine rich region, plus strand primer binding site of copia
14. 5' inverted repeat of 3' LTR of copia
15 & 16. Promoter: 5' start sites of cellular 2- and 5-kb RNA, shown on 3' LTR of copia
17. Polyadenylation signal of copia
18. Simple tandem repeat in copia
19. 3' inverted repeat of 3' LTR of copia
20. Insertion site of copia, 3' target site duplication, sequence belongs to micropia MHC-like protein gene
21. CCHC-finger motif of retroelements, two repeats
22. Amino terminal end of putative protease
23. Carboxy terminal end of putative protease
24. Insertion site of retroposon, 5' target site duplication
25. 5' end of retroposon, inserted into micropia
26. 3' oligo(A) tail of retroposon insertion
27. Insertion site of retroposon, 3' target site duplication, sequence belongs to micropia. This region between the protease and reverse transcriptase has been described as 'undefined' peptide (3). Recently a significant homology to this region has been identified in the vaccinia virus genome (5).
28. Amino terminal end of putative reverse transcriptase coding region
29. Carboxy terminal end of reverse transcriptase
30. 5' end of so called 'tether'
31. 3' end of 'tether'
32. Amino terminal end of integrase
33. Carboxy terminal end of RNaseH
34. Amino terminal end of putative integrase
35. Deletion in micropia-Dm11
36. Deletion in micropia-DhMiF2A
37. 3' end of open reading frame and 5' end of 5' non-protein coding region
38. 3' conserved tandem repeats, 5 units
39. Putative 5' (+ strand) primer binding site
40. 5' inverted repeat of 3' LTR
41. First tandem repeat unit and second truncated unit of LTR, potential ecdysteroid receptor binding sites are indicated by gray shading; micropia-Dm2 abruptly ends at this site
42. 5' flanking sequence of micropia
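Several of the feature types listed above (restriction sites, inverted repeats, target site duplications) can be checked with simple string operations. The following is a minimal sketch, not the authors' annotation method. The helper names and the toy insertion sequence are hypothetical; the only sequence taken from the listing is its first ten bases as they appear in this copy, and GAATTC is the standard EcoRI recognition site.

```python
# Illustrative helpers for simple sequence features. All function names and
# the toy sequences are assumptions made for this sketch.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Return 1-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(motif, start + 1)
    return hits


def is_inverted_repeat(left, right):
    """True if `right` is the reverse complement of `left`."""
    return left.upper() == reverse_complement(right)


def target_site_duplication(seq, ins_start, ins_end, max_len=20):
    """Longest direct repeat flanking an insertion.

    `ins_start` and `ins_end` are 0-based, end-exclusive coordinates of the
    inserted element. Compares the bases just 5' of the insertion with the
    bases just 3' of it and returns the longest identical stretch.
    """
    seq = seq.upper()
    longest = min(max_len, ins_start, len(seq) - ins_end)
    for k in range(longest, 0, -1):
        if seq[ins_start - k:ins_start] == seq[ins_end:ins_end + k]:
            return seq[ins_start - k:ins_start]
    return ""


# First ten bases of the listing as printed in this copy (feature 1).
fragment = "GAATTCATGG"
ecori_hits = find_motif(fragment, "GAATTC")

# Hypothetical insertion: host flank + element + duplicated host bases.
toy = "AAACGTAC" + "TTTGGGTTT" + "GTACCCAA"
toy_tsd = target_site_duplication(toy, 8, 17)
```

The same `target_site_duplication` call would apply to any insertion whose boundaries are known, such as a copia or retroposon insertion, once a verified sequence is available.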

REFERENCES

1. Emori,Y., Shiba,T., Kanaya,S., Inouye,S., Yuki,S. and Saigo,K. (1985) Nature 315, 773-776.
2. Huijser,P., Kirchhoff,C., Lankenau,D.-H. and Hennig,W. (1988) J. Mol. Biol. 203, 689-698.
3. Lankenau,D.-H., Huijser,P., Jansen,E., Miedema,K. and Hennig,W. (1988) J. Mol. Biol. 204, 233-246.
4. Lankenau,D.-H., Huijser,P., Jansen,E., Miedema,K. and Hennig,W. (1990) Chromosoma 99, 111-117.
5. Slabaugh,M.B. and Roseman,N.A. (1989) Proc. Natl. Acad. Sci. USA 86, 4152-4155.

RETROTRANSPOSON micropia-Dm2

[Sequence figure: the annotated nucleotide sequence of micropia-Dm2 and its flanking sequence, printed in rows of 60 nucleotides with coordinates (60 to 6000) at the right margin and the feature numbers 1 to 42 at the left margin. The interior of the copia insertion, not shown, is labelled "internal sequence of copia" and represented by a run of R. The listing is not legibly reproduced in this copy; the sequence is deposited under EMBL accession no. X14173.]
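The layout used for the listing (rows of 60 nucleotides, split into blocks of ten, with a running coordinate at the right margin) can be regenerated from a plain sequence string. The sketch below is only an illustration: the function name `format_listing` and the toy input are hypothetical, and the coordinate printed is that of the last base actually present in each row.

```python
def format_listing(seq, row=60, block=10):
    """Format a sequence in rows of `row` bases, in blocks of `block`,
    each row followed by the 1-based coordinate of its last base."""
    lines = []
    for start in range(0, len(seq), row):
        chunk = seq[start:start + row]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        end = start + len(chunk)
        lines.append(" ".join(blocks) + " " + str(end))
    return "\n".join(lines)


# Toy 80-nt input, not taken from micropia-Dm2.
listing = format_listing("ACGT" * 20)
print(listing)
```

Applied to a verified copy of the sequence, for example one retrieved under the EMBL accession number given above, this yields rows that can be compared directly against the printed coordinates.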
