Complete fiber structures of complex trimeric autotransporter adhesins conserved in enterobacteria


Marcus D. Hartmann, Iwan Grin, Stanislaw Dunin-Horkawicz, Silvia Deiss, Dirk Linke, Andrei N. Lupas1, and Birte Hernandez Alvarez

Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany

Trimeric autotransporter adhesins (TAAs) are modular, highly repetitive surface proteins that mediate adhesion to host cells in a broad range of Gram-negative pathogens. Although their sizes may differ by more than one order of magnitude, they all follow the same basic head-stalk-anchor architecture, where the head mediates adhesion and autoagglutination, the stalk projects the head from the bacterial surface, and the anchor provides the export function and attaches the adhesin to the bacterial outer membrane after export is complete. In complex adhesins, head and stalk domains may alternate several times before the anchor is reached. Despite extensive sequence divergence, the structures of TAA domains are highly constrained, due to the tight interleaving of their constituent polypeptide chains. We have therefore taken a “domain dictionary” approach to characterize representatives for each domain type by X-ray crystallography and use these structures to reconstruct complete TAA fibers. With SadA from Salmonella enterica, EhaG from enterohemorrhagic Escherichia coli (EHEC), and UpaG from uropathogenic E. coli (UPEC), we present three representative structures of a complex adhesin that occurs in a conserved genomic context in Enterobacteria and is essential in the infection process of uropathogenic E. coli. Our work proves the applicability of the dictionary approach to understanding the structure of a class of proteins that are otherwise poorly tractable by high-resolution methods and provides a basis for the rapid and detailed annotation of newly identified TAAs.

coiled coil | β-layer

Gram-negative pathogenic bacteria express numerous different virulence factors to overcome host defenses, to mobilize and take up nutrients within the host, to invade cells or tissues, and to adhere to host cells (1). Adhesion to host cells is a key event during the onset of infection; the mediators of this process, called adhesins, are a heterogeneous group of bacterial surface proteins, which vary in architecture, domain content, and mode of binding. One distinct class of adhesins are the trimeric autotransporter adhesins (TAAs) (2, 3), also referred to as type Vc secretion systems (4). TAAs are important virulence factors of many well-studied pathogens: Examples include YadA of Yersinia enterocolitica, a species causing enteritis, mesenteric lymphadenitis, and reactive arthritis (5); NadA of Neisseria meningitidis (6), an agent of meningitis and sepsis; BadA of Bartonella henselae (7), which is the agent of cat scratch disease; UspA1 and A2 of Moraxella catarrhalis (8), a prominent species in respiratory tract infections; and Hia of Haemophilus influenzae (9), an organism causing meningitis and respiratory tract infections. Despite their role in the context of unrelated diseases, these TAAs always fulfill similar functions: adhesion to host cells, autoagglutination, and biofilm formation (3). All TAAs display the same basic architecture: The N-terminal head typically mediates molecular interactions such as autoagglutination or binding to extracellular matrix proteins. It is followed by an extended and typically coiled-coil rich stalk, which projects the head from the bacterium and often provides binding sites for host serum factors (10, 11). The protein ends in a membrane anchor (2). In architecturally complex adhesins, head and stalk segments may alternate several times before the anchor is reached (12). Whereas head and stalk are assembled from an array of analogous domains (13), the anchor is homologous in all TAAs and represents the defining element of this protein family (2). It trimerizes in the outer membrane to form a 12-stranded β-barrel pore (14), through which the head and the stalk exit the periplasm, thus giving rise to the name “autotransporter.” The C-terminal end of the folded stalk occludes the pore after export is completed.

A number of partial TAA structures were solved recently. Several head structures, from YadA (15), Hia (16), BadA (13), and BpaA (17), revealed different trimeric complexes with novel folds. Partial stalk structures from UspA1 (18), SadA (19), and YadA (20) substantiated earlier predictions that coiled coils are the dominant structural motif of TAA stalks, albeit sometimes with noncanonical properties such as unusual periodicities or ion binding sites in their core. Finally, one structure of a TAA membrane anchor could also be determined, from Hia (14), showing a size and architecture similar to that of single-chain autotransporters, albeit built of three chains rather than a single one.

Despite their strong sequence divergence, structures of homologous TAA domains are so closely conserved that one structure can be used to solve the next one by molecular replacement (13, 21); this characteristic, and the fact that the domains can be predicted from sequence using state-of-the-art homology detection methods (12), prompted us to suggest a dictionary approach to understand the structure of TAAs, given that their flexibility and extreme length otherwise preclude their analysis by high-resolution methods. We proposed to solve representatives for all TAA domains defined from sequence analysis, which could then be used to model full TAA fibers from fragments (13, 22).
Strictly speaking, the term “domain,” which has been developed on globular proteins to denote independently folding units, does not fully describe the structural elements of TAAs or of oligomeric fibers in general. We therefore use the term here for a unique and complete TAA building block, defined evolutionarily as a segment with a specific structure that can be shuffled in TAAs with few constraints from adjacent segments. By this definition, not all TAA domains are independently folding units.

During our work on a web-based annotation platform for TAAs (12), we identified a chromosomally encoded TAA from Enterobacteriaceae as an excellent model system, due to its domain complexity and the genetic tractability of its parent organisms. This protein is called SadA in Salmonella enterica (22), EhaG in enterohemorrhagic Escherichia coli (EHEC), and UpaG in uropathogenic E. coli (UPEC). UpaG has been found to be

Author contributions: M.D.H., D.L., A.N.L., and B.H.A. designed research; M.D.H., I.G., S.D.-H., S.D., and B.H.A. performed research; M.D.H., A.N.L., and B.H.A. analyzed data; and M.D.H., D.L., A.N.L., and B.H.A. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. W.I.W. is a guest editor invited by the Editorial Board.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org [PDB ID codes 2YNY (SadAK1), 2YNZ (SadAK5), 2YO0 (SadAK9cfI), 2YO1 (SadAK9cfII), 2YO2 (SadAK12), and 2YO3 (SadAK14)].

1 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1211872110/-/DCSupplemental.

PNAS | December 18, 2012 | vol. 109 | no. 51 | 20907–20912

BIOCHEMISTRY

Edited by William I. Weis, Stanford University School of Medicine, Stanford, CA, and accepted by the Editorial Board November 8, 2012 (received for review July 15, 2012)

Fig. 1. Electron microscopy of SadA. Transmission (A and C) and scanning electron micrographs (B and D) of E. coli Top10 carrying an overexpression vector for Salmonella typhimurium SadA (C and D) or an empty vector as control (A and B). Cells were grown in liquid culture and immobilized on polylysine-coated coverslips. In A and C, cells were labeled after immobilization with an antibody raised against SadA and affinity purified against SadAK9, and were chemically fixed with glutaraldehyde. (Scale bars: 1 μm.)

essential for the colonization of the urinary tract by UPEC (23), whereas EhaG mediates binding of EHEC specifically to colorectal epithelium (24); SadA promotes biofilm formation and host cell adherence in Salmonella (25). Here, we reconstruct the full SadA, UpaG, and EhaG fibers from the structures of representative SadA fragments, describing in the process a number of unusual structural motifs with functional implications. These motifs include a coiled coil elaborated by a “collar” of three-stranded β-meanders, which may provide stiffness to the stalk; a β-layer motif, which acts as a universal adaptor for transitions between α-helical and β-stranded domains; and a connector with an intrinsic rotational flexibility of up to 25°, which offers an attractive mechanism for fine-tuning the relative orientation of consecutive binding sites along the fiber. Our work proves the domain dictionary approach and provides a basis for the rapid and detailed annotation of newly identified TAAs.

fragments, SadAK1–16, each covering at least one copy of the domains and motifs found in the protein. With one exception (SadAK5), the constructs were designed to start and end with coiled-coil segments, so that they could be extended by adaptors derived from the trimeric form of the GCN4 leucine zipper, GCN4pII, to increase their stability (22). The constructs SadAK1, K3, K5, K9, K12, and K14, which together cover all desired domain types (Fig. 2 and Fig. S2), yielded crystals that diffracted to high resolution. We obtained their structures by molecular replacement in a hierarchical fashion: The shortest constructs, K1 and K3, were solved by using the trimeric GCN4 structure 1GCM as a search model and each further construct was solved with parts of the previous ones: K12 with K1, K5 with K12, K14 with K5, and K9 with K14. Whereas K1, K12, K3, and K14 were solved in their entirety, the C-terminal 43 residues of K5 were not visible and the N-terminal 41 residues of K9 were recognizable, but not traceable, in the electron density. We obtained structures for K9 from two different crystal forms, K9cfI and K9cfII, neither of which allowed us to trace the N-terminal part. All constructs were solved from crystals containing the full trimers in the asymmetric unit and display more or less pronounced asymmetries; only K12 and one K9 structure (K9cfI) were built by crystallographic symmetry and are therefore fully symmetrical. On the basis of these structures, we reconstructed the whole fibers of SadA, UpaG, and EhaG according to the domain annotation in Figs. S2–S4. To yield straight fibers, the modeling

Results and Discussion

Chromosomally Encoded Enterobacterial Adhesin. Enterobacteria of

the genera Escherichia, Salmonella, and Shigella contain a chromosomal TAA located between the mtl-operon, which encodes genes responsible for mannitol metabolism, and the lld-operon, which contains genes for the uptake and metabolism of L-lactate (Fig. S1). The TAAs of pathogenic organisms are characterized by a high genetic turnover (2), probably due to selective pressure by the host immune system; correspondingly, in some strains, this adhesin may be disrupted by frame shifts, internal stop codons, or insertion elements. This observation is made only rarely in Escherichia and Salmonella species, but frequently in Shigella. In the nonpathogenic laboratory strain E. coli K12, this genomic region is lost entirely and the L-lactate operon is located directly downstream of the mannitol operon. Early on in the project, we settled on the Salmonella ortholog, SadA, as a model system, because the laboratory strain LT2 contains this protein whereas the Escherichia laboratory strain K12 does not. We resorted to recombinant expression of SadA in E. coli. In scanning electron micrographs, SadA-expressing cells displayed a rough cell surface densely covered with knobs, in contrast to uninduced control cells. Transmission electron micrographs of sections labeled with anti-SadA antibodies confirmed the insertion of SadA into the outer membrane and its exposure to the cell surface (Fig. 1). SadA is a protein of 1,461 residues with a complex domain composition. It includes a signal sequence of 54 residues, four YadA-like head domains and extended stalk regions, multiply segmented by FGG, HANS, DALL, and neck motifs (for domain and motif definitions; ref. 12). Whereas the domain architecture of its N-terminal half is quite variable between orthologs in different species, its C-terminal half, encompassing three of the four YadA-like heads and the membrane anchor, is highly conserved. 
Overall, SadA illustrates very well the modularity and the combination of conserved and variable regions characteristic of TAAs.

Structure Determination of SadA Fragments and Reconstruction of Full Fibers. To gain sufficient structural information for reconstruction of the full fiber, we designed a redundant set of SadA

Fig. 2. Crystal structures of fragments and reconstructed full fibers of SadA, UpaG, and EhaG. The three chains of the SadA trimers are colored individually; the coiled-coil adaptors fused to the fragments are shown in gray. Only one of the two structures obtained for SadAK9 (K9cfI) is shown. The reconstructed fibers are compared with a prototypical simple trimeric adhesin, YadA. Indicated lengths are measured on the atomic coordinates of the models and include the membrane anchor.

Hartmann et al.

process was conducted with symmetry constraints; the membrane anchor was modeled by using the Hia membrane anchor domain (PDB ID code 2GR8) as a template. With a length of ∼115 nm, the model of UpaG is slightly longer than the ∼108-nm-long SadA and the ∼101-nm-long EhaG model. In the following, we will describe the structure of the individual domains and motifs.
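The hierarchical molecular-replacement scheme described for the SadA constructs (K1 and K3 solved with the trimeric GCN4 structure 1GCM, then K12 with K1, K5 with K12, K14 with K5, and K9 with K14) can be viewed as a dependency chain. The following is a minimal sketch of that bookkeeping only; the function name `solve_order` and the dictionary layout are illustrative assumptions and are not part of the original work, and no crystallographic computation is performed.

```python
# Search model used for each construct, as stated in the text.
# "1GCM" is the trimeric GCN4 structure, available from the start.
SEARCH_MODEL = {
    "K1": "1GCM",
    "K3": "1GCM",
    "K12": "K1",
    "K5": "K12",
    "K14": "K5",
    "K9": "K14",
}


def solve_order(search_model, available=("1GCM",)):
    """Return constructs in an order in which every search model
    has already been solved (hypothetical helper for illustration)."""
    solved = set(available)
    pending = dict(search_model)
    order = []
    while pending:
        # Constructs whose search model is already at hand.
        ready = sorted(c for c, m in pending.items() if m in solved)
        if not ready:
            raise ValueError(f"no solvable construct among {sorted(pending)}")
        for construct in ready:
            order.append(construct)
            solved.add(construct)
            del pending[construct]
    return order


if __name__ == "__main__":
    print(solve_order(SEARCH_MODEL))
```

Each pass solves every construct whose search model is already available, so the two shortest constructs come first and the remaining ones follow one per pass.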

Fig. 4. The DALL domain. (A) Structure of the DALL1 domain in SadAK14 together with the downstream neck domain. One chain is colored yellow; the DALL1 domain is drawn in thick lines. The central water molecule of the DALL1 and the water molecules interacting mostly with the yellow chain are drawn in red, the central water of the neck in black, and others in gray. Notably, the N-terminal coiled coil is kinked with respect to the trimer axis of the DALL1-neck tandem. (B) Close-up of the β-sheet on one face of the DALL1-neck tandem. Compared with DALL2 in E, the β-sheet of the DALL1 is invaded by bridging water molecules. (C) Top view of the DALL1 domain, highlighting the β-layer interactions. Although the backbone interactions of the central β-layer residues are almost undisturbed, the coordination of the central water molecule is asymmetric because of the kink of the coiled coil. (D) Structure of the DALL2 domain in SadAK12 together with the downstream neck domain. One chain is colored green; the DALL2 domain is drawn in thick lines. The central water molecule and the water interacting with the green chain are shown in red, the central water molecule of the neck in black, and others in gray. Superposed in thin black is the structure of the DALL2-neck tandem in SadAK5. (E) Close-up of the continuous β-sheet formed between the DALL2 and neck domain. (F) Top view of the DALL2 domain in SadAK12, highlighting the interchain interaction between the conserved tryptophan and histidine and the β-layer interactions. (G) Superposition of the DALL1 structure of SadAK14 and the DALL2 structure of SadAK12 on the C-terminal halves. Because of the waters invading the β-sheet in DALL1, the spacing of the β-strands differs between the two structures. The N-terminal coiled coil of DALL1 in SadAK14 is kinked by 10° with respect to the symmetric DALL2 in SadAK12. (H) Sequence alignment of DALL1 and DALL2 domains, highlighting the central β-layer residues and the conserved residues in interchain interactions.


Fig. 3. Neck domains in context with different upstream domains. (A–C ) Neck domains in SadA in context with different upstream domains. One chain of each trimer is colored individually. The neck (thick lines) forms a continuous β-sheet with the last β-strand of the respective upstream domain (transparently thick). Structurally invariant water molecules are in black. (A) Long neck following a DALL2 domain in SadAK5. (B) Short neck following a Ylhead domain in SadAK14. (C) Short neck following a HIM3 domain in SadAK9. (D) Superposition of nine known neck structures as indicated in H, solved in context with their native upstream domain, colored by trimer. The insertion sequence in the neck of Hia (1S7M) was cut for clarity. (E) Top view of D. (F) Head insert motifs HIM2 and HIM3 fold tightly around their respective neck: Superimposition of B and C (HIM3) with a HIM2-neck domain, highlighting the head insert motif (thick lines). (G) Close up of the β-layer in the neck of SadAK5, highlighting the hydrogen bonding network that involves the central water molecule. (H) Sequence alignment of neck sequences, highlighting the central β-layer residue (mostly valine) from the DAVN consensus sequence.

Fig. 5. β-Layers as universal adaptors. The structure of β-layers in three different contexts, highlighting the β-interactions of the three chains around the central water molecule. (A) Side and top views are shown for the β-layer in a DALL domain (Left; DALL2 from SadAK12), leading from alpha to beta; a β-layer in a neck domain (Right; from SadAK12), leading from beta to alpha; and a β-layer separating two coiled-coil segments, leading from alpha to alpha, from structure 2BA2 (Center). B and C show a superposition of the three different β-layers in side and top view, respectively. The superposition is based on the central β-layer residues and the central water molecule. With a total of 8 DALL and 12 neck domains, the whole SadA fiber comprises 20 β-layers.

YadA-Like Head Domains. SadA contains four YadA-like head domains (Ylheads), of which we solved the third in construct K9 and the fourth in K9 and K14. Ylheads are trimers of single-stranded, left-handed β-helices with a repetitive substructure. The general repeat length is 14 residues, and each repeat consists of an outer and an inner β-strand, running perpendicularly to the fiber axis. The repeats stack to form continuous β-sheets running along the inner and outer faces of the β-helices (15). Like the heptads of coiled coils, individual repeats have a structure that is independent of their sequence, but short insertions and deletions of one or two residues are common and, in contrast to coiled coils, do not affect the overall structure. In the four head domains of SadA, 20 of 24 repeats are 14 residues long and three others have 12, 15, and 16 residues, respectively. Longer insertions occur on occasion and are almost invariably found in the last repeat of a head segment. These head insert motifs (HIMs) can be assigned to individual subtypes on the basis of their sequence properties (12). We identified two HIM subtypes previously, HIM1 and HIM2, and here observe a third type, HIM3, in the structure of the third Ylhead of SadA. C-terminally, Ylheads always merge into neck domains, continuing the inner β-sheet into the β-sheet of the latter. In the case of the third Ylhead of SadA (construct K9; Fig. 3C), a HIM3 forms additional interactions, folding tightly around the succeeding neck domain. Comparable interaction patterns can be observed for a HIM2 in the partial Burkholderia pseudomallei TAA head structure 3LAA (17) (Fig. 3F).

Connectors from Beta to Alpha: Neck Domain. Neck domains mediate the transition from a variety of β-stranded domains into downstream coiled-coil regions in TAAs (Fig. 3). They are represented by multiple structures in the Protein Data Bank in context with different upstream domains, to which we add another five exemplars from SadA. Necks can be subdivided into short necks, long necks, and necks with an insertion sequence (ISneck). They differ only in the length of the loop that connects their two β-strands. A superposition shows an invariant structure for all three subclasses, mediating a perfectly straight and symmetric transition into the downstream coiled coil (Fig. 3 D and E). Moreover, the interactions with the upstream domains are conserved and independent of the nature of the latter. The last β-strand of the first chain of the upstream domain forms a continuous β-sheet with the first β-strand of the second chain and the second β-strand of the third chain of the neck, yielding a highly intertwined structure. This β-sheet is, apart from a common hydrophobic core, the only interaction the neck forms with its upstream domain, with the exception of the Ylhead/HIM domains described above. Notably, HIM3 and HIM2 are always associated with short and long necks, respectively. The crux of the neck is, however, located directly at the transition to the coiled coil, where the three chains of the trimer cross each other and form a tight symmetric network of hydrogen bonds. Here, the valine of the consensus sequence DAVN from one chain forms β-sheet interactions with the valines of the other chains, and these three valines together coordinate a central water molecule on the trimer axis (Fig. 3G). This structural motif constitutes the core of the transition from the β-stranded structure of the neck to the α-helical structure of the downstream coiled coil, forming a plane of β-sheet-like interactions perpendicular to the coiled-coil axis. We refer to this motif as a β-layer and call the valine the central β-layer residue (see Fig. 5).

Fig. 6. The HANS domain. (A) Side view of the last HANS domain in SadA from structure K9cfI, second in the alignment in G. The dashed line indicates where the N-terminal coiled coil breaks into a second, distorted segment. (B) Top view of the same HANS domain, showing the side chains of the second β-strand. (C and D) Superposition of all structures of the last HANS domain of SadA (K9cfI black, K9cfII gray, K14 light blue) with the HANS from 3LAA (salmon) on the KYFHANS consensus motif, illustrating the conformational flexibility of the N-terminal helical segment. C shows a side view of a superposition of all individual chains, and D shows the top view of the full trimers. (E and F) Superposition of two different conformations of the last HANS domain in SadA (from K9cfI and from K9cfII) onto the coiled coil above the distorted segment (indicated by the dashed line). Between the two conformations, the C-terminal Ylhead is rotated by 25° with respect to the N-terminal coiled-coil. (G) Sequence alignment of HANS domains.


Special Coiled-Coil Motifs of the Stalk. SadA contains a single copy of a FGG domain after the N-terminal head, which we covered in the structures of K1 and K12. This domain is a stalk variant that connects two coiled-coil segments (Fig. 7 A–D). At the FGG sequence motif, the individual chains break the N-terminal coiled coil to form a β-hairpin that packs against the N-terminal coiled-coil segment, continue into a second β-hairpin that packs against the C-terminal coiled-coil segment, and finally return to where they departed from the coiled coil to fold into the C-terminal coiled-coil segment. In doing so, the three chains undergo a perfect 120° rotation around the trimer axis so that the helices of the N-terminal segment appear to pass seamlessly into the helices of the neighboring chain of the C-terminal segment: At the junction of the N- and C-terminal helices, the backbone hydrogen bonding pattern of α-helices is continued between the N-terminal helix of one and the C-terminal helix of the other chain (Fig. 7B). In a superposition with the only other available FGG structure in 3LAA (ref. 17; Fig. 7A), the structures only deviate in the loops of the β-hairpins where they differ in length. By forming a tight collar around them, the FGG domain presumably provides rigidity to the adjacent coiled-coil segments. For instance, the β-hairpins of the FGG domain in 3LAA fold

Connectors from Alpha to Beta. DALL domain. SadA comprises eight DALL domains, one of type 1 and seven of type 2, which always occur in tandems with neck domains; K5 and K12 cover two different DALL2-neck tandems, K14 the DALL1-neck tandem. DALL domains mediate the transition from preceding coiled-coil segments to the β-stranded structure of their respective neck domains and form continuous β-sheets with them. Architecturally, DALL2 is simpler than DALL1 (Fig. 4). DALL2 forms a β-sheet between the first and second β-strand of the same chain that continues into the neck domain. A conserved tryptophan in the first and a conserved histidine in the second strand form a π-stacking interaction between the chains. In DALL1, which carries an insertion between the first and second strand, the β-sheet is invaded by bridging water molecules (Fig. 4B). Consequently, the two strands have a wider spacing so that DALL1 and DALL2 can be structurally superimposed only on either their N- or C-terminal halves (Fig. 4G). Compared with the neck domain, the transition mediated by DALL domains has a considerable degree of conformational flexibility; the coiled-coil axis preceding the DALL1 domain in the structure of K14 is kinked ∼10° with respect to the trimer axis of the DALL1-neck tandem. DALL domains can be viewed as upside-down neck domains. The transition from the coiled coil to the β-stranded structure of the domain is mediated by a β-layer in the same way as in the neck, except that the transition leads from an α-helical to a β-stranded structure; the central β-layer residue is the first leucine of the DALL consensus motif. Moreover, β-layers are not limited to transitions from alpha to beta and beta to alpha, and not to TAAs, underscoring the universality of β-layers: In the structure 2BA2 of a DUF16 domain from Mycoplasma pneumoniae, a β-layer is found that directly connects two trimeric coiled-coil segments, thus forming a transition from alpha to alpha.
In a superposition of β-layers of all three types (from a DALL domain, a neck domain, and the one from 2BA2), both the central β-layer residues and the central water molecule match perfectly (Fig. 5). HANS domain. Three transitions from α-helical to β-stranded domains in SadA are mediated without a β-layer by a structurally simpler connector, the HANS domain. It precedes the last three heads of the enterobacterial TAAs. Although our construct K9 comprises the last two HANS domains of SadA, in both crystal forms, K9cfI and K9cfII, only the last HANS is fully ordered. This last HANS is also captured in the structure of K14, and in all three structures it assumes different conformations. In the HANS connector, each helix of the coiled coil passes seamlessly into a β-sheet via a divergent β-turn (Fig. 6). This turn is formed by the residue preceding the KYFHANS consensus sequence and the phenylalanine therein. Starting with the tyrosine of the motif, the sheet passes into the inner β-sheet of the neighboring chain of the downstream Ylhead domain. The conformational flexibility of the most C-terminal HANS, observed when comparing the three crystal structures, resides in the coiled-coil before the KYFHANS motif: Approximately 10 residues before reaching the β-turn, the coiled coil breaks into a second, distorted segment with a steeper crossing angle of the helices. This distorted segment assumes different conformations in all three structures and entirely accounts for the different HANS conformations (Fig. 6). Most striking is the difference between the two structures of K9: When superimposed on the upstream coiled coil, the downstream Ylhead is rotated by 25° around the trimer axis between the two HANS conformations (Fig. 6 E and F). Additionally, although the transition at this HANS is straight or almost straight in K9cfI or K14, it is kinked by ∼6° in K9cfII. 
This conformational freedom is further underlined by the N-terminal HANS domain in the two structures of the K9 construct. In both, the whole N-terminal coiled coil down to the KYFHANS consensus motif is poorly ordered. Because faint overall electron density for the N-terminal coiled coil is visible, albeit not interpretable, we can conclude that it is in fact present and folded.
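The rotation between two HANS conformations is reported as an angle around the trimer axis after superposition on the upstream coiled coil. As a minimal sketch of how such an angle can be read from a rigid-body transform, the following assumes that the superposition has already been carried out by some external fitting procedure and that the remaining transform of the downstream head is a pure rotation given as a 3×3 matrix. The function names are illustrative, the trimer axis is placed along z only for the example, and this is not a description of the procedure used in the paper.

```python
import math


def rotation_about_z(angle_deg):
    """Build a 3x3 rotation matrix about the z axis (example input only)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def rotation_angle_deg(matrix):
    """Rotation angle of a proper rotation matrix, from trace = 1 + 2*cos(theta)."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # A head rotated by 25 degrees about the axis, as an illustrative input.
    print(round(rotation_angle_deg(rotation_about_z(25.0)), 6))
```

The trace formula returns the unsigned angle in the range 0° to 180°; the sense of rotation would have to be taken from the rotation axis.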

Fig. 7. FGG domain and NxYTD motif. Side (A) and top (C) view of the FGG domain. The β-hairpins inserted between the coiled-coil segments fold around the bundle like a collar and cause a 120° rotation of the helical bundle. (B) Directly at the junction of the two coiled-coil segments, the N-terminal helices appear to pass seamlessly into the C-terminal helices. (D) Sequence alignment of FGG domains. (E and F) Side and top view of the NxYTD motif in SadAK14, highlighting hydrogen bonding interactions and the invariant water network. Before reaching the anchor, many TAAs contain a right-handed coiled-coil segment in the stalk that contains a YTD sequence motif at the transition to the final left-handed segment; as described recently for YadA (20), this motif forms an elaborate network of hydrogen bonds: The threonines of the three chains form hydrogen bonds with each other in the core via their hydroxyl groups, whereas the tyrosines and aspartates form interchain hydrogen bonds around the bundle. Careful analysis, however, shows that the extent of the network is even larger. In more than half of all occurrences, YTD is preceded by an asparagine two residues before the tyrosine, yielding the motif NxYTD. These asparagines, together with the backbone oxygen of the residue preceding them and the threonines, coordinate three water molecules in the core of the bundle. Inspection of the YadA structures and their experimental data reveals that the three water molecules are also present in the two NxYTD motifs in YadA, albeit not modeled consistently.
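The distinction between a plain YTD motif and the extended NxYTD motif (an asparagine two residues before the tyrosine) is a simple positional rule on the sequence. The sketch below shows one way to flag it with a regular expression; the function name and the toy sequence are made up for illustration and are not taken from SadA or from the annotation platform cited in the text.

```python
import re

# Zero-width lookahead so that every YTD start position is reported.
YTD_SITE = re.compile(r"(?=YTD)")


def classify_ytd_sites(sequence):
    """Return (index_of_Y, has_N_two_residues_upstream) for each YTD site.

    Indices are 0-based positions in the one-letter amino acid sequence.
    """
    sites = []
    for match in YTD_SITE.finditer(sequence):
        y_index = match.start()
        has_asparagine = y_index >= 2 and sequence[y_index - 2] == "N"
        sites.append((y_index, has_asparagine))
    return sites


if __name__ == "__main__":
    toy_sequence = "AANLYTDKLAAGYTDAA"  # hypothetical, not a real TAA sequence
    for y_index, has_asparagine in classify_ytd_sites(toy_sequence):
        label = "NxYTD" if has_asparagine else "YTD only"
        print(y_index, label)
```

In the toy sequence the first site is preceded by an asparagine at the required spacing and the second is not, so the two cases are reported separately.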


around the whole α-helical half of an adjacent HANS domain, so that the latter is perfectly well ordered and forms a straight transition. It may therefore play a role in the fine tuning of the flexibility of the fiber. The other two stalk variants found in SadA are polar core motifs within the coiled coil. One is the NxYTD motif, with one occurrence in SadA, which is described in detail in Fig. 7. The other motif, N@d, is found throughout the whole SadA fiber. As we reported (19), this motif sequesters ions to the core of the coiled coil and has functional implications for the autotransport process. Implications for the Adhesion Process. In the light of the adhesion process, the DALL and HANS connector domains are of special functional importance: Both confer well-defined degrees of flexibility to the fiber. DALL domains can act as hinges that allow local bending of the fiber, whereas the hinge point is localized to its β-layer; the maximum bending angle we observed in crystalline state was 10° for a single DALL domain. Because some complex adhesins often comprise a large number of DALL domains, the total overall possible bending angle can sum up considerably. The HANS domain, also allowing for local bending, additionally provides rotational freedom around the trimer axis by partial unwinding of its helices. We observed rotation angles of up to 25° between different crystal structures of the same HANS domain—if this angle was the maximum deflection achievable by one HANS domain, only the three HANS domains in SadA would already allow for overall rotations of 75° around the trimer axis. Both these rotational degrees of freedom, perpendicular to and around the trimer axis, are conceivably facilitating the adjustment of the relative and overall orientation of the substrate binding domains of the fiber. For further fine tuning, the flexibility conferred by HANS may be delimited by FGG domains. 
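The flexibility estimate above is simple arithmetic on per-domain maxima, and it can be made explicit. The sketch below takes the values given in the text (a maximum observed kink of 10° per DALL domain, a maximum observed rotation of 25° per HANS domain, eight DALL and three HANS domains in SadA) and assumes, as the text does for HANS, that contributions add up linearly. The resulting bending figure is only an illustrative upper bound under that assumption and is not a value reported in the paper.

```python
# Maximum per-domain deflections observed in the crystal structures (degrees).
MAX_BEND_PER_DALL_DEG = 10.0
MAX_TWIST_PER_HANS_DEG = 25.0

# Domain counts for SadA as given in the text.
N_DALL_IN_SADA = 8
N_HANS_IN_SADA = 3


def cumulative_range_deg(n_domains, max_per_domain_deg):
    """Upper bound if all domains deflect fully and in the same sense
    (an additive assumption made for this illustration)."""
    return n_domains * max_per_domain_deg


if __name__ == "__main__":
    twist = cumulative_range_deg(N_HANS_IN_SADA, MAX_TWIST_PER_HANS_DEG)
    bend = cumulative_range_deg(N_DALL_IN_SADA, MAX_BEND_PER_DALL_DEG)
    print(f"rotation around the trimer axis: up to {twist:.0f} degrees")
    print(f"bending (additive upper bound):  up to {bend:.0f} degrees")
```

The rotation value reproduces the 75° stated for the three HANS domains; the bending value follows from the same reasoning applied to the eight DALL domains.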
The importance of flexibility for function has been pinpointed in detail for different TAAs, including the Moraxella catarrhalis adhesin UspA1 (26), Haemophilus influenzae Hia (21), and E. coli EibD (11). All of these adhesins bind to large receptors or matrix proteins on the surface of host cells, which rules out a rigid, straight fiber.

1. Gerlach RG, Hensel M (2007) Protein secretion systems and adhesins: The molecular armory of Gram-negative pathogens. Int J Med Microbiol 297(6):401–415.
2. Hoiczyk E, Roggenkamp A, Reichenbecher M, Lupas A, Heesemann J (2000) Structure and sequence analysis of Yersinia YadA and Moraxella UspAs reveal a novel class of adhesins. EMBO J 19(22):5989–5999.
3. Linke D, Riess T, Autenrieth IB, Lupas A, Kempf VA (2006) Trimeric autotransporter adhesins: Variable structure, common function. Trends Microbiol 14(6):264–270.
4. Leo JC, Grin I, Linke D (2012) Type V secretion: Mechanism(s) of autotransport through the bacterial outer membrane. Philos Trans R Soc Lond B Biol Sci 367(1592):1088–1101.
5. Bölin I, Norlander L, Wolf-Watz H (1982) Temperature-inducible outer membrane protein of Yersinia pseudotuberculosis and Yersinia enterocolitica is associated with the virulence plasmid. Infect Immun 37(2):506–512.
6. Comanducci M, et al. (2002) NadA, a novel vaccine candidate of Neisseria meningitidis. J Exp Med 195(11):1445–1454.
7. Riess T, et al. (2004) Bartonella adhesin A mediates a proangiogenic host cell response. J Exp Med 200(10):1267–1278.
8. Lafontaine ER, et al. (2000) The UspA1 protein and a second type of UspA2 protein mediate adherence of Moraxella catarrhalis to human epithelial cells in vitro. J Bacteriol 182(5):1364–1373.
9. St Geme JW, 3rd, Cutter D (2000) The Haemophilus influenzae Hia adhesin is an autotransporter protein that remains uncleaved at the C terminus and fully cell associated. J Bacteriol 182(21):6005–6013.
10. Biedzka-Sarek M, et al. (2008) Functional mapping of YadA- and Ail-mediated binding of human factor H to Yersinia enterocolitica serotype O:3. Infect Immun 76(11):5016–5027.
11. Leo JC, et al. (2011) The structure of E. coli IgG-binding protein D suggests a general model for bending and binding in trimeric autotransporter adhesins. Structure 19(7):1021–1030.
12. Szczesny P, Lupas A (2008) Domain annotation of trimeric autotransporter adhesins—daTAA. Bioinformatics 24(10):1251–1256.
13. Szczesny P, et al. (2008) Structure of the head of the Bartonella adhesin BadA. PLoS Pathog 4(8):e1000119.

20912 | www.pnas.org/cgi/doi/10.1073/pnas.1211872110

Conclusions

In our quest to obtain the complete molecular structures of the chromosomally encoded TAAs of Enterobacteriaceae, we have characterized various hitherto undescribed domains and elements. We describe a universal structural motif for the transition from α-helical to β-stranded structures that we name the β-layer. This recurrent motif facilitates the domain shuffling that is the hallmark of these highly repetitive, fibrous proteins, but also occurs outside of the TAA family. In addition, we describe the structure and dynamics of connector domains with well-defined intrinsic rotational degrees of freedom that confer structural flexibility to the fiber that is highly relevant for the adhesion process. This work completes the list of structural templates for frequently occurring TAA domains and proves both the robustness and the applicability of the dictionary approach to understanding TAA structure. Despite low sequence identity, domains of the same type are structurally nearly identical: crystal structures of new fragments could be solved by molecular replacement using search models of homologous domains. With the structures reported here, we were able to reconstruct the full fibers of SadA, UpaG, and EhaG; likewise, we can now reconstruct other TAA fibers either completely or to a large extent, including many important pathogenicity factors, using standard molecular modeling techniques.

Materials and Methods

The protein used for structure determination is SadA of Salmonella enterica subsp. enterica serovar Typhimurium strain LT2 (GenBank accession no. NP_462591). The other two proteins modeled are UpaG (NP_756286) and EhaG (NP_290185). Methodological details about sequence analysis, cloning, expression, protein and antibody purification, electron microscopy, crystallization, crystal structure solution, and molecular modeling are described in SI Materials and Methods. Crystallization conditions are listed in Table S1. Data collection and refinement statistics are summarized in Table S2.

ACKNOWLEDGMENTS. We thank Ines Wanke, Reinhard Albrecht, and Kerstin Baer for setting up the crystallization experiments and the staff of beamline PXII/Swiss Light Source for excellent technical support. This work was supported by institutional funds from the Max Planck Society and by German Science Foundation Grants FOR449/LU1165 and SFB766/B4.

14. Meng G, Surana NK, St Geme JW, 3rd, Waksman G (2006) Structure of the outer membrane translocator domain of the Haemophilus influenzae Hia trimeric autotransporter. EMBO J 25(11):2297–2304.
15. Nummelin H, et al. (2004) The Yersinia adhesin YadA collagen-binding domain structure is a novel left-handed parallel beta-roll. EMBO J 23(4):701–711.
16. Yeo HJ, et al. (2004) Structural basis for host recognition by the Haemophilus influenzae Hia autotransporter. EMBO J 23(6):1245–1256.
17. Edwards TE, et al. (2010) Structure of a Burkholderia pseudomallei trimeric autotransporter adhesin head. PLoS ONE 5(9):e12803.
18. Conners R, et al. (2008) The Moraxella adhesin UspA1 binds to its human CEACAM1 receptor by a deformable trimeric coiled-coil. EMBO J 27(12):1779–1789.
19. Hartmann MD, et al. (2009) A coiled-coil motif that sequesters ions to the hydrophobic core. Proc Natl Acad Sci USA 106(40):16950–16955.
20. Alvarez BH, et al. (2010) A transition from strong right-handed to canonical left-handed supercoiling in a conserved coiled-coil segment of trimeric autotransporter adhesins. J Struct Biol 170(2):236–245.
21. Meng G, St Geme JW, 3rd, Waksman G (2008) Repetitive architecture of the Haemophilus influenzae Hia trimeric autotransporter. J Mol Biol 384(4):824–836.
22. Hernandez Alvarez B, et al. (2008) A new expression system for protein crystallization using trimeric coiled-coil adaptors. Protein Eng Des Sel 21(1):11–18.
23. Valle J, et al. (2008) UpaG, a new member of the trimeric autotransporter family of adhesins in uropathogenic Escherichia coli. J Bacteriol 190(12):4147–4161.
24. Totsika M, et al. (2012) Molecular characterization of the EhaG and UpaG trimeric autotransporter proteins from pathogenic Escherichia coli. Appl Environ Microbiol 78(7):2179–2189.
25. Raghunathan D, et al. (2011) SadA, a trimeric autotransporter from Salmonella enterica serovar Typhimurium, can promote biofilm formation and provides limited protection against infection. Infect Immun 79(11):4342–4352.
26. Agnew C, et al. (2011) Correlation of in situ mechanosensitive responses of the Moraxella catarrhalis adhesin UspA1 with fibronectin and receptor CEACAM1 binding. Proc Natl Acad Sci USA 108(37):15174–15178.


Supporting Information
Hartmann et al. 10.1073/pnas.1211872110

SI Materials and Methods

Sequence Analysis. The protein used in this study is SadA of Salmonella enterica subsp. enterica serovar Typhimurium strain LT2 (GenBank accession no. NP_462591). To determine its genomic localization in enterobacteria, homologs were detected by BLAST against the nonredundant database at the National Center for Biotechnology Information. For selected hits, the genomic context was extracted and visualized by using GCView (1). The domain composition of the trimeric autotransporters from different Enterobacteria was evaluated by using daTAA (2).

Cloning. The full-length SadA gene was amplified by PCR from genomic DNA of S. enterica subsp. enterica serovar Typhimurium strain LT2 by using primers SP1 (5′-GGAACCTTTCTAGATAACGAGGGCAAAAAATGAATAGAATATTTAAAGTCCTCTGGAATGCC) and SP2 (5′-CCAAGGTTAAGCTTATTACCACTGGAAGCCCGCGCC). The obtained 4.4-kbp fragment was digested with HindIII and XbaI and cloned in pASK-IBA2 (IBA BioTAGnology). The resulting clone pSadA served as a template for amplification of shorter SadA constructs by PCR using primers SK5fw (5′-GACCATGGTCTCCGATTTATGAAACCAACCAGAAGGTGGATC) and SK5rev (5′-GACCATGGTCTCCTCATTCAGCCGTTACCCGTTGCGTATGCATC) for SadAK5, SK9fw (5′-GACCATGGTCTCCGATTCAAAATGCCATTGGTGCGGTCAC) and SK9rev (5′-GACCATGGTCTCCTCATTTGCGCCACATTAACCGCGTCAGTG) for SadAK9, SK12fw (5′-GACCATGGTCTCCGATTTATTCTTTAAGTCAATCCGTCGCCGACCGACTCGGCGG) and SK12rev (5′-GACCATGGTCTCCTCATCTGAGAGCCGTTAACGGCATCGGTGCTGTCCGCAGCCAGG) for SadAK12, and SK14fw (5′-GACCATGGTCTCCGATTAAAGTAACGGACGCGCAGGTTTCC) and SK14rev (5′-GACCATGGTCTCCTCATCTTGTTTTCTACGCCTTTGATTTTGC) for SadAK14. The resulting fragments of the expected size were digested with BsaI and ligated into vector pIBA-GCN4tri-His (3). Correctness of the clones was verified by DNA sequencing.

Protein Expression and Purification. The purification of SadAK1 and SadAK3 has been described (3).
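The restriction enzymes named in the cloning paragraph above (XbaI and HindIII for the full-length gene, BsaI for the shorter constructs) can be matched to their primers by searching for the recognition sequences. The sketch below is illustrative only: it uses the standard recognition sites TCTAGA (XbaI), AAGCTT (HindIII), and GGTCTC (BsaI), includes three of the primers with any extraction spaces removed, and the helper name `sites_in` is hypothetical.

```python
# Recognition sequences (5'→3') of the enzymes used for cloning.
SITES = {"XbaI": "TCTAGA", "HindIII": "AAGCTT", "BsaI": "GGTCTC"}

# A subset of the primers listed in the Cloning paragraph.
PRIMERS = {
    "SP1": "GGAACCTTTCTAGATAACGAGGGCAAAAAATGAATAGAATATTTAAAGTCCTCTGGAATGCC",
    "SP2": "CCAAGGTTAAGCTTATTACCACTGGAAGCCCGCGCC",
    "SK5fw": "GACCATGGTCTCCGATTTATGAAACCAACCAGAAGGTGGATC",
}

def sites_in(primer):
    """Return the sorted names of all listed enzymes whose site occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)

for name, seq in PRIMERS.items():
    print(name, sites_in(seq))
```

Only the forward strand is searched here, which is sufficient for primers written 5′ to 3′ with the site in their 5′ extension; a general tool would also scan the reverse complement.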
SadAK5, which contains amino acids 823–947 of SadA, was expressed as a fusion with only an N-terminal GCN4pII adaptor. This was achieved by insertion of a stop codon in primer SK5rev immediately before the C-terminal GCN4pII adaptor and the following (His)6 linker. SadAK5 was overexpressed in Escherichia coli TOP10 as described above. Pelleted cells were ruptured by using a French press in 20 mM Tris·HCl, 40 mM NaCl, 4 mM MgCl2 at pH 7.4 containing a protease inhibitor mix (Roche), PMSF, and DNase I. After a centrifugation step (140,000 × g, 40 min at 4 °C), the supernatant was diluted 1:5 with 0.5 M Mes at pH 5.5 and applied onto a cation exchange column (SP Sepharose; GE Healthcare) equilibrated with 20 mM Mes and 40 mM NaCl at pH 5.5. SadAK5 bound to the column and was eluted with a gradient of 0–1 M KCl. The final step was a size exclusion chromatography step (S75; GE Healthcare) in 20 mM Mops and 150 mM NaCl at pH 7.2. The other SadA constructs comprise SadA residues 1049–1304 in SadAK9, 255–358 in SadAK12, and 1185–1386 in SadAK14, each fused to GCN4pII adaptors on both the N and C termini. The C-terminal adaptor is followed by a (His)6 linker to simplify purification. The proteins were expressed and purified under denaturing conditions by using a NiNTA column (GE Healthcare) as described for His-tagged SadAK3 (3). Refolding was performed at 4 °C by dialysis against the following buffers: 20 mM Mops, 450 mM NaCl, and 10% glycerol at pH 7.2 for SadAK9; 20 mM Mops and 150 mM NaCl at pH 7.2 for SadAK12; and 20 mM Tris·HCl and 150 mM NaCl at pH 7.4 for SadAK14. For SadAK9, an additional size exclusion chromatography step on a Superdex 200 column (GE Healthcare) equilibrated with the appropriate refolding buffer was necessary to obtain pure protein.

Antibody Purification. For antibody production, full-length SadA was expressed in E. coli BL21 Omp8 DE3 cells (4) at 25 °C in LB medium overnight, supplemented with 100 μg/mL ampicillin. Outer membrane fractions were isolated essentially as described, using the method of differential membrane solubilization with Sarkosyl (5). The outer membranes were then solubilized in 3% octylpolyoxyethylene (C8POE), 50 mM EDTA, 150 mM NaCl, and 20 mM Tris·HCl at pH 8.0 and were diluted after 2 h to a final detergent concentration of 1%. The solution was cleared of debris by centrifugation and was subjected to phase separation by using ice-cold 20% (NH4)2SO4 (6). The detergent-rich phase was collected, dialyzed against 1% C8POE and 20 mM Tris·HCl at pH 8.0, and subjected to anion exchange chromatography with a MonoQ column (GE Healthcare) by using a 0–1 M NaCl gradient to remove residual lipopolysaccharides. Finally, the fractions containing SadA were pooled, precipitated with 90% ice-cold acetone, and resuspended as a slurry in PBS to a concentration of approximately 0.5 mg/mL. The purified protein was used to raise rabbit anti-SadA antibodies. After testing the specificities of several bleeds against SadA, the polyclonal antiserum was purified by using SadAK9 as bait. SadAK9 was coupled to a 1-mL HiTrap NHS-activated HP column (GE Healthcare) according to the manufacturer's instructions. One milliliter of anti-SadA antiserum was diluted 1:20 with 50 mM Mops and 150 mM NaCl at pH 7.2 and loaded on the column equilibrated with the same buffer.
Bound antibodies were eluted with a gradient of 0–4 M MgCl2 and tested for specificity in Western blotting.

Electron Microscopy. E. coli Top10 cells carrying either an empty pASK-IBA3 or pASK-IBA2-SadA were grown at 37 °C until OD600 = 0.6–0.8. Overproduction was induced by the addition of anhydrotetracycline, and cells were grown for another 4 h, spun down gently to concentrate cells, and immobilized on poly-L-lysine–coated coverslips. For immunolabeling, immobilized cells were blocked with 0.5% (wt/vol) BSA and 0.2% (wt/vol) gelatin in PBS and labeled with affinity-purified SadA-specific rabbit IgG. Bound antibodies were detected with Nanogold coupled to goat IgG anti-rabbit IgG (no. 2003; Nanoprobes). After washing (2× blocking buffer, 3× PBS; 5 min each) the antigen–antibody–marker complexes were stabilized with 0.5% glutaraldehyde (GA) in PBS for 5 min and washed six times with H2O (total 25 min). Gold markers were enhanced for 35 min with silver lactate, hydroquinone, and gum arabic in citrate buffer at pH 3.8 according to Danscher (7). For scanning electron microscopy, colonies were postfixed with 1% osmium tetroxide in PBS for 1 h on ice, dehydrated in ethanol, and critical-point-dried from CO2. The samples were sputter-coated with 7 nm gold-palladium and examined at 20 kV accelerating voltage in a Hitachi S-800 field emission scanning electron microscope.

X-Ray Crystallography. Crystallization trials were set up by using the Honey bee 961 robot from Genomic Solutions, mixing 400 nL of protein solution with 400 nL of reservoir solution in 96-well sitting-drop Corning 3550 plates. The reservoir volume was 75 μL. The crystallization conditions for all crystals used in the diffraction experiments are listed in Table S1, together with soaking conditions for cryoprotection where applicable. All crystals were loop mounted and flash frozen in liquid nitrogen, and all data were collected at beamline X10SA (PXII) at the SLS (Paul Scherrer Institute, Villigen, Switzerland) under cryo conditions at 100 K by using a mar225 CCD detector (Marresearch). Diffraction images were processed and scaled by using the XDS program suite (8). All structures were solved by molecular replacement using MOLREP (9), in a hierarchical fashion: The shortest construct SadAK1, comprising an FGG domain, was solved as described for SadAK3 (10), using the trimeric GCN4 structure 1GCM as a search model. K12, comprising an FGG domain and a DALL/Neck tandem, was solved by using the K1 structure as a search model; the resulting structure of the DALL/Neck tandem from K12 was subsequently used to solve K5. K14 was solved by using K5, and K9 by using K14. In both crystal forms obtained for K9, K9cfI and K9cfII, the N-terminal 41 residues that should form a continuous coiled coil are not traceable, although overall electron density resembling the expected shape of the coiled coil is visible. As described in the main text, we explain this phenomenon with the intrinsic flexibility of the HANS connector following the coiled coil, giving rise to the elevated R factors of the two structures. In K5, the C-terminal 43 residues are not traceable and presumably unstructured; ending the construct directly after a HANS domain was an unfortunate choice, as we learned after obtaining the first structure of a HANS domain in K9.
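The hierarchical molecular replacement strategy described above is a simple dependency chain: each construct is solved with a previously solved structure as the search model. As a minimal sketch (an illustration of the ordering only, not of MOLREP itself), the chain can be written as a mapping and resolved into a solution order; the function name `solution_order` is hypothetical.

```python
# Search model used for each construct, as stated in the text.
# "1GCM" is the external trimeric GCN4 structure and has no entry of its own.
SEARCH_MODEL = {"K1": "1GCM", "K12": "K1", "K5": "K12", "K14": "K5", "K9": "K14"}

def solution_order(search_model):
    """Order constructs so that each appears after the structure used to solve it."""
    order = []

    def visit(name):
        if name in order or name not in search_model:
            return  # already placed, or an external model that needs no solving
        visit(search_model[name])
        order.append(name)

    for name in search_model:
        visit(name)
    return order

print(solution_order(SEARCH_MODEL))  # → ['K1', 'K12', 'K5', 'K14', 'K9']
```

The same order is returned regardless of how the dictionary entries are arranged, because each construct is placed only after its search model.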
K12 shows an interesting crystallographic peculiarity: In the crystal packing in P6322, cavities of 32-symmetry are formed along the c-direction, which can accommodate full K12 trimers. In the later stages of refinement, weak electron density characteristic of the rod-shaped K12 trimer emerged in these cavities. This density can be explained by an average of two K12 trimers in opposite orientations, related by the twofold axes of the crystal system. Refinement of an additional protein chain in these cavities was tried in space group P6322 with an occupancy of 50% and also after expansion to lower symmetry, including P1. In all cases, the refinement resulted in only faint electron density in the resulting 2Fo-Fc maps for this chain, which, even in lower symmetry, was always best interpreted as an average over both orientations with low occupancy. Because none of these trials led to improved R factors (but to more complicated models), we decided to refine and deposit the final model in P6322 without interpreting the density in the cavities. We assume that, throughout the crystal, approximately 50% of these cavities are stochastically filled with trimers, randomly in one of the two orientations, which might be of importance for crystal integrity.

For K1, K5, K12, and K14, ARP/wARP (11) was used for automated rebuilding. All structures were completed in cyclic manual modeling with Coot (12) and refinement with REFMAC5 (13). For K9cfII and K14, the refinement was carried out by using NCS restraints. In all structures, the nature of structural solvent molecules, such as the central water molecules in β-layers and chloride ions in N@d layers, was identified on the basis of their coordination distances to the protein, refined B-factors, and fit to the electron density maps. Analysis with Procheck (14) showed good geometries for all structures. Data collection and refinement statistics are summarized in Table S2 together with PDB accession codes. All molecular depictions were prepared by using MolScript (15) and Raster3D (16).

1. Grin I, Linke D (2011) GCView: The genomic context viewer for protein homology searches. Nucleic Acids Res 39(Web Server issue):W353–W356.
2. Szczesny P, Lupas A (2008) Domain annotation of trimeric autotransporter adhesins—daTAA. Bioinformatics 24(10):1251–1256.
3. Hernandez Alvarez B, et al. (2008) A new expression system for protein crystallization using trimeric coiled-coil adaptors. Protein Eng Des Sel 21(1):11–18.
4. Prilipov A, Phale PS, Van Gelder P, Rosenbusch JP, Koebnik R (1998) Coupling site-directed mutagenesis with high-level expression: large scale production of mutant porins from E. coli. FEMS Microbiol Lett 163(1):65–72.
5. Arnold T, Linke D (2008) The use of detergents to purify membrane proteins. Curr Protoc Protein Sci, Chapter 4:Unit 4.8.1–4.8.30.
6. Arnold T, Linke D (2007) Phase separation in the isolation and purification of membrane proteins. Biotechniques 43(4):427–430, 432, 434 passim.
7. Danscher G (1981) Histochemical demonstration of heavy metals. A revised version of the sulphide silver method suitable for both light and electronmicroscopy. Histochemistry 71(1):1–16.
8. Kabsch W (1993) Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J Appl Cryst 26:795–800.
9. Vagin A, Teplyakov A (2000) An approach to multi-copy search in molecular replacement. Acta Crystallogr D Biol Crystallogr 56(Pt 12):1622–1624.
10. Hartmann MD, et al. (2009) A coiled-coil motif that sequesters ions to the hydrophobic core. Proc Natl Acad Sci USA 106(40):16950–16955.
11. Perrakis A, Morris R, Lamzin VS (1999) Automated protein model building combined with iterative structure refinement. Nat Struct Biol 6(5):458–463.
12. Emsley P, Cowtan K (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1):2126–2132.
13. Murshudov GN, Vagin AA, Lebedev A, Wilson KS, Dodson EJ (1999) Efficient anisotropic refinement of macromolecular structures using FFT. Acta Crystallogr D Biol Crystallogr 55(Pt 1):247–255.
14. Laskowski RA, Macarthur MW, Moss DS, Thornton JM (1993) Procheck - A program to check the stereochemical quality of protein structures. J Appl Cryst 26:283–291.
15. Kraulis PJ (1991) Molscript - A program to produce both detailed and schematic plots of protein structures. J Appl Cryst 24:946–950.
16. Merritt EA, Bacon DJ (1997) Raster3D: Photorealistic molecular graphics. Methods Enzymol 277:505–524.

Hartmann et al. www.pnas.org/cgi/content/short/1211872110

Modeling of Full-Length Fibers. The models of full-length SadA, UpaG, and EhaG proteins were computed in Modeler by using the structure of the YadA head (PDB ID code 1P9H), the reported SadA structures, and the structure of the Hia membrane anchor domain (PDB ID code 2GR8) as templates. The modeling process was conducted with symmetry constraints.
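The threefold symmetry that such constraints maintain in a trimeric fiber can be illustrated geometrically. The sketch below does not reproduce the modeling software or its restraint terms; it only shows, under the assumption that the fiber axis coincides with the z axis, how one chain position maps onto its two symmetry mates by rotations of 120°. The function names are hypothetical.

```python
import math

def rotate_about_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def trimer_copies(point):
    """Return the point and its two symmetry mates in a C3-symmetric trimer."""
    return [rotate_about_z(point, k * 120.0) for k in range(3)]

# A hypothetical atom 1 Å from the fiber axis at a height of 5 Å.
for copy in trimer_copies((1.0, 0.0, 5.0)):
    print(tuple(round(c, 3) for c in copy))
```

In a perfectly symmetric trimer the three copies share the same height along the axis and their x and y coordinates average to the axis position, which is a quick sanity check for a symmetric model.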


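[Fig. S2 graphic: construct bars SadAK1, SadAK5, SadAK9, and SadAK14 drawn alongside the sequence, with residue boundary markers 255, 302, 358, 479, 519, 823, 947, 1049, 1185, 1304, and 1386.]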

MNRIFKVLWNAATGTFVVTSETAKSRGKKNGRRKLAVSALIGLSSIMVSADALA NAGNDTGDGVTPTGTQTGGKGWIAIGTDATANTYTNVDGA SAAMGYKASAMGKW STAIGSYSQSTGDS SLALGVKSVSAGDR AIAMGASSSASGSY SMAMGVYANSSGAK SVALGYKSVASGAT SSALGYQATASGDD SAAFGNGAKAIGTN SVALGSGSVAQEDN SVAVGNSTTQRQ ITYVAKGDINSTSTDAVTGAQ IYSLSQSVADRLGGGASVNSDGTVNAPLYEVGTGIYNNVGSALSALNTS--------------ITNTEASVAGLA EDALLWDESISAFSASHTGNAS-K ITNLAAGTLAADSTDAVNGSQ LFDTNEKVDKNTADIATNTGSINQNTADITANTDSINQNTTDIAANTTSINQNTTDIATNTTNINSLSDSVTTLT DDALLWDAASGAFSAKHNGSDS-K ITNLAAGTLAADSTDAVNGSQ LFDTNEKVD----------------------------QNTADITTNTNSINQNTTDIATNTTNINNLSDSITTLT DDALLWDAASGAFSANHNGSAS-K ITNLAAGTLAADSTDAVNGSQ LFATNENVS----------------------------QNTADITTNTNSINQNTTDIATNTTSINNLSDSITTLT DDALLWDAASGTFSASRSGSAS-K ITNLAAGTLAADSTDAVNGSQ LYETNQKVD----------------------------QNTS---------------------AIADINTSITNLS SDNLSWNETTSSFSASHGSSTTNK ITNVAAGELSEESTDAVNGSQ LFETNEKVD----------------------------QNTTDIAANTTNITQNSTAIENLNTSVSDINTSITGLT DNALLWDEDTGAFSANHGGSTS-K ITNVAAGALSEDSTDAVNGSQ LYETNQKVD----------------------------QNTS---------------------AIADINTSITNLG TDALSWDDEEGAFSASHGTSGTNK ITNVAAGEIASDSTDAVNGSQ LYETNMLISQYNESISQLAGDTSETYITENGTGV KYIRTNDNGLEGQDAYATGNG ATAVGYDAVASGAG lefthanded (heptad) coiled coil SLALGQNSSSSIEG hydrophobic core residues SIALGS--GSTSNR AITTGIRETSATSD polar core residues GVVIGYNTTDRELLG ALSLGTDGESYRQ righthanded (pentadecad) coiled coil ITNVADGSEAQDAVTVRQL YTD motif QNAIGAVTTTPT KYYHANSTEEDSLAVGTD SLAMGAKTIVNADA GIGIGLNTLVMADAIN GIAIGSNARANHAN SIAMGNGSQTTRGAQTDYTAYNMDTPQNSVG EFSVGSEDGQRQ ITNVAAGSADTDAVNVGQL KVTDAQVSRNTQSITNLNTQVSNLDTRVTNIENGIGDIVTTGST KYFKTNTDGADANAQGAD SVAIGSGSIAAAEN SVALGTNSVADEAN TVSVGSSTQQRR ITNVAAGVNNTDAVNVAQLKAS EAGSVRYETNADGSVNYSVLNLGDGSGGTTR IGNVSAAVNDTDAVNYAQL KRSVEEANTYTDQK MGEMNSKIKGVENKMSGGIASAMAMAGL PQAYAPGANMTSIAGGTFNGESAVAIGVSMVSESGGWVYKLQGTSNSQGDYSAAIGAGFQW

[Fig. S2 graphic, continued: construct bars SadAK3 and SadAK12; residue boundary marker 255.]

Fig. S1. Genomic context of the enterobacterial adhesins. Genomic region of E. coli and S. typhimurium strains containing SadA or its homologs (UpaG in E. coli CFT 073, EhaG in E. coli O157:H7). The adhesin is located between the mtl-operon (mannitol metabolism) and the lld-operon (L-lactate metabolism). This genomic structure is conserved in all strains of E. coli and Salmonella spp except for E. coli K-12 and derivatives that appear to have a 5-kb deletion at this location.

Domain color key: Anchor; Neck; FGG motif; DALL 1/2; HANS motif; YadA-like Head repeats; Head Insert Motif; Signal sequence.

Fig. S2. Sequence of SadA and constructs used in this study. Domains and constructs are marked individually and colored as indicated.


MNKIFKVIWNPATGSYTVASETAKSRGKKSGRSKLLISALVAGGL LSSFGASADNYTGQPTDYGDGSAGDGWVAIGKGAKANTFMNTSGA STALGYDAIAEGEY SSAIGSKTLATGGA SMAFGVSAKAMGDR SVALGASSVANGDR SMAFGRYAKTNGFT SLAIGDSSLADGEK TIALGNTAKAYEIM SIALGDNANASKEY AMALGASSKAGGAD SLAFGRKSTANSTG SLAIGADSSSSNDN AIAIGNKTQALGVN SMALGNASQASGES SIALGNTSEASEQN AIALGQGSIASKVN SIALGSNSLSSGEN AIALGEGSAAGGSN SLAFGSQSRANGND SVAIGVGAAAATDN SVAIGAGSTTDASN TVSVGNSATKRK IVNMAAGAISNTSTDAINGSQ LYTISDSVAKRLGGGATVGSDGTVTAVSYALRSGTYNNVGDALSGID NNTLQWNKTAGAFSANHGANATNK ITNVAKGTVSATSTDVVNGSQ LYDLQ QDALLWNGTAFSAAHGTEATSK ITNVTAGNLTAGSTDAVNGSQ LKTTNDNVTTNTTNIATNTTNITNLTDAVNGLG DDSLLWNKAAGAFSAAHGTEATSK ITNVTAGNLTAGSTDAVNGSQ LKTTNDNVTTNTTNIATNTTNITNLTDAVNGLG DDSLLWNKTAGAFSAAHGTDATSK ITNVTAGNLTAGSTDAVNGSQ LKTTNDNVTTNTTNIATNTTNITNLTDAVNGLG DDSLLWNKTAGAFSAAHGTDATSK ITNVKAGDLTAGSTDAVNGSQ LKTTNDNVSTNTTNITN-------LTDAVNGLG DDSLLWNKTAGAFSAAHGTDATSK ITNVKAGDLTAGSTDAVNGSQ LKTTNDNVSTNTTNITN-------LTDSVGDLK DDSLLWNKAAGAFSAAHGTEATSK ITNLLAGKISSNSTDAINGSQ LYGVADSFTSYLGGGADISDTGVLSGPTYTIGGTDYTNVGDALAAINTSFSTSL GDALLWDATAGKFSAKHGINNAPSV ITDVANGAVSSTSSDAINGSQ LYGVSDYIADALGGNAVVNTDGSITTPTYAIAGGSYNNVGDALEAIDTTL DDALLWDTTANGGNGAFSAAHGKDKTASV ITNVANGAVSATSNDAINGSQ LYSTNKYIADALGGDAEVNADGTITAPTYTIANTDYNNVGEALDALD NNALLWDEDAGAYNASHDGNASK ITNVAAGDLSTTSTDAVNGSQ LNATNILVTQNSQMINQLAGNTSETYIEENGAGI NYVRTNDSGLAFNDASASGIG ATAVGYNAVASHAS SVAIGQDSISEVDT lefthanded (heptad) coiled coil GIALGS--SSVSSR hydrophobic core residues VIVKGTRNTSVSEE polar core residues GVVIGYDTTDGELLG ALSIGDDGKYRQ IINVADGSEAHDAVTVRQL righthanded (pentadecad) coiled QNAIGAVATTPT KYYHANSTAEDSLAVGED YTD motif SLAMGAKTIVNGNA GIGIGLNTLVLADAIN GIAIGSNARANHAD SIAMGNGSQTTRGAQTNYTAYNMDAPQNSVG EFSVGSEDGQRQ ITNVAAGSADTDAVNVGQL KVTDAQVSQNTQSITNLNTQVTNLDTRVTNIENGIGDIVTTGST KYFKTNTDGADANAQGKD SVAIGSGSIAAADN SVALGTGSVADEEN TISVGSSTNQRR ITNVAAGVNATDAVNVSQLKSS EAGGVRYDTKADGSIDYSNITLGGGNSGTTR ISNVSAGVNNNDAVNYAQL KQSVQETKQYTDQR MVEMDNKLSKTESKLSGGIASAMAMTGL PQAYTPGASMASIGGGTYNGESAVALGVSMVSANGRWVYKLQGSTNSQGEYSAALGAGIQW

Domain color key: Anchor; Neck; FGG motif; DALL 1/2; HANS motif; YadA-like Head repeats; Head Insert Motif; Signal sequence.

Fig. S3. Sequence of UpaG and symbolic domain arrangement. Domains are marked individually and colored as indicated.


MNKIFKVIWNPATGNYTVTSETAKSRGKKSGRSKLLISALVAGGML SSFGALANAGNDNGQGVDYGSGSAGDGWVAIGKGAKANTFMNTSGS STAVGYDAIAEGQY SSAIGSKTHAIGGA SMAFGVSAISEGDR SIALGASSYSLGQY SMALGRYSKALGKL SIAMGDSSKAEGAN AIALGNATKATEIM SIALGDTANASKAY SMALGASSVASEEN AIAIGAE-TEAAEN ATAIGNNAKAKGTN SMAMGFGSLADKVN TIALGNGSQALADN AIAIGQGNKADGVD AIALGNGSQSRGLN TIALGTASNATGDK SLALGSNSSANGIN SVALGADSIADLDN TVSVGNSSLKRK IVNVKNGAIKSDSYDAINGSQ LYAISDSVAKRLGGGAAVDVDDGTVTAPTYNLKNGSKNNVGAALAVLD ENTLQWDQTKGKYSAAHGTSSPTASV ITDVADGTISASSKDAVNGSQ LKATNDDVEANTANIATNTSNIATNTANIATNTTNITNLTDSVGDLQ ADALLWNETKKAFSAAHGQDTTSK ITNVKDADLTADSTDAVNGSQ LKTTNDAVATNTTNIANNTSNIATNTTN-------ISNLTETVTNLG EDALKWDKDNGVFTAAHGTETTSK ITNVKDGDLTTGSTDAVNGSQ LKTTNDAVATNTTNIATNTTN--------------ISNLTETVTNLG EDALKWDKDNGVFTAAHGNNTASK ITNILDGTVTATSSDAINGSQ LYDLSSNIATYFGGNASVNTD-GVFTGPTYKIGETNYYNVGDALAAINSSFSTSL GDALLWDATAGKFSAKHGTNGDASV ITDVADGEISDSSSDAVNGSQ LHGVSSYVVDALGGGAEVNAD-GTITAPTYTIANADYDNVGDALNAIDTTL DDALLWDADAGENGAFSAAHGKDKTASV ITNVANGAISAASSDAINGSQ LYTTNKYIADALGGDAEVNAD-GTITAPTYTIANAEYNNVGDALDALD DNALLWDETANGGAGAYNASHDGKASI ITNVANGSISEDSTDAVNGSQ LNATNMMIEQNTQIINQLAGNTDATYIQENGAGI NYVRTNDDGLAFNDASAQGVG ATAIGYNSVAKGDS SVAIGQGSYSDVDT lefthanded (heptad) coiled coil GIALGS--SSVSSR hydrophobic core residues VIAKGSRDTSITEN polar core residues GVVIGYDTTDGELLG ALSIGDDGKYRQI INVADGSEAHDAVTVRQL righthanded (pentadecad) coiled coil QNAIGAVATTPT YTD motif KYFHANSTEEDSLAVGTD SLAMGAKTIVNGDK GIGIGYGAYVDANALN GIAIGSNAQVIHVN SIAIGNGSTTTRGAQTNYTAYNMDAPQNSVG EFSVGSADGQRQ ITNVAAGSADTDAVNVGQL KVTDAQVSQNTQSITNLDNRVTNLDSRVTNIENGIGDIVTTGST KYFKTNTDGVDASAQGKD SVAIGSGSIAAADN SVALGTGSVATEENTI SVGSSTNQRR ITNVAAGKNATDAVNVAQLKSS EAGGVRYDTKADGSIDYSNITLGGGNGGTTR ISNVSAGVNNNDVVNYAQL KQSVQETKQYTDQR MVEMDNKLSKTESKLSGGIASAMAMTGL PQAYTPGASMASIGGGTYNGESAVALGVSMVSANGRWVYKLQGSTNSQGEYSAALGAGIQW

Domain color key: Anchor; Neck; FGG motif; DALL 1/2; HANS motif; YadA-like Head repeats; Head Insert Motif; Signal sequence.

Fig. S4. Sequence of EhaG and symbolic domain arrangement. Domains are marked individually and colored as indicated.


Table S1. Crystallization conditions and cryo protection

Structure | Protein solution | Concentration, mg/mL | Reservoir solution (RS) | Cryo solution
K1 | 20 mM Mops at pH 7.2, 150 mM NaCl | 4 | 16% (wt/vol) PEG 4000, 80 mM sodium acetate, 100 mM Hepes at pH 7.5 | RS+15% (vol/vol) PEG 400
K5 | 20 mM Mops at pH 7.2, 150 mM NaCl | 17 | 10% (wt/vol) PEG 10000, 200 mM magnesium nitrate | RS+20% (vol/vol) PEG 400
K9cfI | 20 mM Tris at pH 7.4, 150 mM NaCl, 10% (vol/vol) glycerol | 8.5 | 20% (vol/vol) butanediol, 100 mM sodium acetate at pH 4.5 | —
K9cfII | 20 mM Mops at pH 7.2, 150 mM NaCl, 10% (vol/vol) glycerol | 4 | 15% (vol/vol) butanediol, 100 mM sodium acetate at pH 4.2 |
K12 | 20 mM Mops at pH 7.2, 150 mM NaCl | 10 | 500 mM ammonium tartrate, 100 mM sodium acetate at pH 5.0 | RS+25% (vol/vol) EG
K14 | 20 mM Tris at pH 7.4, 150 mM NaCl | 12 | 12% (wt/vol) PEG 8000, 100 mM magnesium acetate, 100 mM Tris at pH 8.5 | RS+10% (vol/vol) PEG 400

Crystallization conditions for SadAK3 have been published (10).

Table S2. Data collection and refinement statistics

Structure | SadAK1 | SadAK5 | SadAK9cfI | SadAK9cfII | SadAK12 | SadAK14 | SadAK3
PDB ID code | 2YNY | 2YNZ | 2YO0 | 2YO1 | 2YO2 | 2YO3 | 2WPQ
Monomers/AU | 3 | 3 | 1 | 3 | 1 | 3 | 3
Space group | P21 | P212121 | P63 (*) | P21 | P6322 | C2 | P21
a, Å | 34.8 | 44.6 | 54.0 | 81.9 | 48.6 | 189.3 | 26.0
b, Å | 40.4 | 60.3 | 54.0 | 48.7 | 48.6 | 46.2 | 37.0
c, Å | 98.6 | 135.6 | 306.7 | 135.6 | 366.0 | 103.8 | 178.4
β, ° | 93.5 | 90 | 90 | 105.1 | 90 | 98.0 | 92.7
Resolution range, Å | 38.0–1.35 (1.43–1.35) | 19.9–1.40 (1.49–1.40) | 37.2–2.80 (2.97–2.80) | 39.1–3.10 (3.28–3.10) | 38.3–2.00 (2.12–2.00) | 37.3–2.00 (2.12–2.00) | 34.1–1.85 (1.96–1.85)
Completeness, % | 96.0 (89.6) | 98.8 (97.9) | 99.5 (99.4) | 96.9 (94.6) | 99.5 (98.6) | 98.3 (96.8) | 98.3 (94.0)
Redundancy | 4.30 (3.22) | 4.38 (4.14) | 3.25 (3.22) | 2.31 (2.35) | 7.30 (6.01) | 3.91 (3.60) | 3.97 (3.82)
I/σ(I) | 13.9 (1.98) | 12.3 (1.97) | 8.59 (2.07) | 8.93 (2.35) | 13.8 (2.29) | 12.4 (2.32) | 10.1 (1.91)
Rmerge, % | 5.5 (62.8) | 6.9 (67.9) | 14.0 (59.8) | 10.0 (42.3) | 8.7 (63.7) | 7.1 (61.0) | 7.7 (65.0)
Rcryst/Rfree, % | 13.3/19.8 | 17.6/23.2 | 22.1/29.7 | 26.5/32.0 | 24.2/28.4 | 20.0/25.3 | 22.1/28.8
Bond length/angle rmsd, Å/° | 0.021/1.68 | 0.011/1.10 | 0.010/1.14 | 0.014/1.32 | 0.015/1.29 | 0.011/1.10 | 0.007/0.84
Ramachandran plot statistics, % | 98.9/1.1/0/0 | 97.7/2.3/0/0 | 72.6/26.2/1.2/0 | 77.3/20.7/1.0/1.1 | 97.3/2.7/0/0 | 93.6/6.1/0.3/0 | 100/0/0/0

Values in parentheses refer to the highest resolution shell. Statistics for SadAK3, which have been published (10), are shown for the sake of completeness. The Ramachandran plot statistics show the percentage of residues in the most favored/additionally allowed/generously allowed/disallowed regions, respectively, as defined and determined by using the program Procheck (14).
*The crystals of SadAK9cfI were hemihedrally twinned with apparent 622 symmetry, twinning operator -H-K,K,-L and a twin fraction of 43%.
