Modeling pilus structures from sparse data

Journal of Structural Biology 173 (2011) 436–444

Journal of Structural Biology journal homepage: www.elsevier.com/locate/yjsbi

Manuel Campos a, Olivera Francetic a, Michael Nilges b,⇑

a Institut Pasteur, Unité de Génétique moléculaire, CNRS URA 2172, Département de Microbiologie, 25 rue du docteur Roux, F-75015 Paris, France
b Institut Pasteur, Unité de Bio-informatique structurale, CNRS URA 2185, Département de Biologie Structurale et Chimie, 25 rue du docteur Roux, F-75015 Paris, France

Article info

Article history: Available online 27 November 2010

Keywords: Molecular modeling; Pilus assembly; Protein secretion; Cysteine cross-linking

Abstract

Bacterial type II secretion systems (T2SS) and type IV pili (T4P) biogenesis machineries share the ability to assemble thin filaments from pilin protein subunits in the plasma membrane. Here we describe in detail the calculation strategy that served to determine a detailed atomic model of the T2SS pilus from Klebsiella oxytoca (Campos et al., PNAS 2010). The strategy is based on molecular modeling with generalized distance restraints and experimental validation (salt bridge charge inversion; double cysteine substitution and crosslinking). It does not require directly fitting structures into an envelope obtained from electron microscopy, but relies on lower resolution information, in particular the symmetry parameters of the helix forming the pilus. We validate the strategy with T4P where either a higher resolution structure is available (for the gonococcal (GC) pilus from Neisseria gonorrhoeae), or where we can compare our results to additional experimental data (for Vibrio cholerae TCP). The models are of sufficient precision to compare the architecture of the different pili in detail. © 2010 Elsevier Inc. All rights reserved.

⇑ Corresponding author. E-mail address: [email protected] (M. Nilges). doi:10.1016/j.jsb.2010.11.015

1. Introduction

Type IV pili (T4P) and type II secretion systems (T2SS) are members of a superfamily of filament biogenesis machineries that share a common origin with archaeal flagella and pili (Ng et al., 2006; Hansen and Forest, 2006). They assemble thin flexible filaments using a highly conserved basic machinery consisting of a polytopic inner membrane protein, a cytoplasmic ATPase and a prepilin peptidase enzyme involved in pilin maturation. T2SS and T4P pilin subunits typically have a conserved N-terminal amino acid sequence in an extended α-helix that mediates filament assembly (Craig et al., 2003). The amino acid sequences of the C-terminal globular domains are much less well conserved among T4P and T2SS pilins and are responsible for the pilus functions. The molecular structures of these filaments are of great interest due to their central role in bacterial pathogenesis and their attractiveness as targets for vaccines and therapeutics. However, they are difficult to study at atomic resolution, because the pilin protomers are insoluble and readily assemble into heterogeneous and flexible pilus filaments. The molecular structures of these pili could provide important insights into the mechanisms of their assembly and function. The method of choice to study the structure of the pili at atomic detail would be high resolution structure determination of the protomers (the pilins) combined with cryo-electron microscopy of the assembly. Several high resolution structures of T4P or T2SS pilin subunits

or subunit fragments have been obtained by X-ray crystallography and nuclear magnetic resonance spectroscopy (Audette et al., 2004; Craig et al., 2003, 2006; Hazes et al., 2000; Keizer et al., 2001; Parge et al., 1995; Ramboarina et al., 2005; Xu et al., 2004; Korotkov et al., 2009; Köhler et al., 2004). Only one structure of an intact type IV pilus filament has been determined to date by combined X-ray crystallography and cryo-electron microscopy (cryo-EM) reconstruction: the gonococcal (GC) pilus from Neisseria gonorrhoeae (Craig et al., 2006). The cryo-EM study of the GC pilus resulted in a 12.5 Å resolution map of the pilus (Craig et al., 2006), a resolution much too low to obtain atomic detail. However, by fitting an atomic model of the individual components into the EM reconstruction, a pseudo-atomic model can be obtained, thereby effectively bridging the different resolution ranges (Rossmann et al., 2005; Volkmann and Hanein, 2003). In the case of the GC pilus (Craig et al., 2006), symmetry reduced the number of possible arrangements within the EM surface sufficiently to obtain a unique arrangement of protomers. The structure is based on a computational docking procedure using rigid body operations only, assuming that the pilin structure is influenced neither by the packing in the crystal nor by the packing in the pilus. Considering the overall nonglobular structure of the pilin (a globular head with a long N-terminal hydrophobic helix), this is a strong assumption. We describe here an alternative strategy, which is not based on a cryo-EM map of the molecular complex, but on quantities that can be obtained with fairly high confidence from lower resolution EM studies (the symmetry of the helix), on conformational restraints and on molecular modeling. The restraints can either serve

to bias the conformational docking and to limit the conformational space, or to evaluate and select models from unbiased calculations. In the current study, the restraints are assumed to be derived from mutation experiments (salt bridge charge inversion; double cysteine substitution and crosslinking), but distance restraints could come from a variety of other experiments that could be applied to this non-soluble system (FRET, solid state NMR). The helical pili are highly symmetrical, and all distance data present an ambiguity similar to that observed for NMR data of symmetric systems in liquid or solid state (Nilges et al., 2010). That is, since it is unknown how the subunits are arranged in the pilus before its assembly is determined, it is a priori impossible to assign an interaction specifically to two subunits. For NOE derived distance restraints for small symmetric assemblies, we developed a concept that allows the use of ambiguous distance data, the ambiguous distance restraint (ADR) (Nilges, 1993; Bardiaux et al., 2009). This restraint can be used in the same way as standard distance restraints during a structure calculation. Its use circumvents the necessity to explicitly assign the restraint to two unique atoms. The principal use of ambiguous distance restraints is automated structure determination from NMR data (see Nilges and O’Donoghue (1998), Altieri and Byrd (2004), and Güntert (2009) for reviews). Several software packages (Linge et al., 2001; Herrmann et al., 2002) use the concept of ambiguous distance restraints for automated assignment of ambiguous distance data in combination with an iterative assignment strategy, alternating structure calculation and assignment. The concept of ADRs is not limited to the treatment of ambiguities in NOEs but is useful for a variety of other interactions (Nilges, 1995; Bardiaux et al., 2009). 
In particular, data from NMR interface mapping experiments can be converted into ambiguous distance restraints (Nilges and O’Donoghue, 1998; Dominguez et al., 2003), but other contact relationships are ambiguous as well (Alber et al., 2007). In the present paper, we use it in particular for restraints derived from charge inversion and from double cysteine substitutions and crosslinking experiments. In the Klebsiella oxytoca T2S pilus and the Vibrio cholerae toxin-coregulated pilus (TCP), inter-molecular salt bridges could be demonstrated by single charge inversions that abolished pilus assembly, and by their combinations that restored it (Li et al., 2008; Campos et al., 2010). In addition, double cysteine substitutions in the hydrophobic segment of the major pilin PulG in the T2S pilus led to position-specific cross-linking of protomers in the assembled pili, providing evidence for interactions in the core of the filament of the T2S pilus. For the GC pilus, we compare our models to the published structure (Craig et al., 2006) and discuss differences due to the different computational strategies (rigid docking versus flexible docking). For the V. cholerae TCP, we compare our model to the deuterium exchange data obtained by mass spectrometry (Li et al., 2008). The present study is a special case of using proteomics type data for integrative modeling from sparse conformational data, a field that has been reviewed extensively recently (Lasker et al., 2010; Karaca et al., 2010; Das and Baker, 2008).

2. Materials and methods

2.1. Modeling of the pilins

A complete structure of the pilin including its N-terminal α-helix is available only for the GC pilus (Craig et al., 2006) (PDB code 2HIL). For the TcpA and the PulG pilins, the N-terminus is missing in the X-ray crystal structures (PDB codes 1OQV and 1T92, respectively).
For PulG, we used the structure of the N-terminal helix taken from the Pseudomonas aeruginosa T4 pilin PilA full-length structure (Craig et al., 2003) (pdb 1OQW) to complete the structure. In particular, a proline at position 22 conserved in the GC PilE,

K. oxytoca PulG and the P. aeruginosa PilA, all members of the group a T4 pilins, induces a kink in the helix at this position in the 3D structures of PilA and PilE. A mutation of this proline in PulG strongly reduces pilus generation in K. oxytoca, an indication that the kink or additional flexibility induced by the proline at this position is important for the proper assembly of the pilus. Despite the high overall sequence similarity between the N-terminal helices of different T2SS and T4P pilins, the TcpA pilin, a member of group b T4P pilins, lacks this proline residue at position 22. In contrast to a preceding study on the TCP (Li et al., 2008), we therefore extended the C-terminal helix as a straight continuation of the helix. For the PulG pilin, the 20 C-terminal residues were modeled by exploiting the close homology to GspG from enterohaemorrhagic Escherichia coli (PDB code 3G20), to include the calcium binding site in the β2–β3 loop (Korotkov et al., 2009), which is absent in the crystal structure of PulG (Köhler et al., 2004). For the homology modeling, we used ad hoc scripts written for CNS (Brunger et al., 1998) to superpose structures and replace residues. The N-terminal methyl group was included in the model of all three pilins (Strom and Lory, 1991).

2.2. Use of conformational restraints

The calculation strategy described in this paper relies in part on conformational restraints. We used different types of conformational restraints in different stages of the calculation protocol described further below.

1. The initial structures of the pilins were maintained in a flexible and adaptive way using log-harmonic distance restraints and automated weighting (Nilges et al., 2008) on the alpha carbons, apart from the two N-terminal residues (and the two C-terminal residues for the PulG pilin, for which no structural template was available).
A log-harmonic distance restraint potential is derived from the log-normal distribution (Rieping et al., 2005) and depends on the square of the difference of the logarithms of the distances (Nilges et al., 2008):

$$E_{\mathrm{log}} = k \left[\log\left(\frac{d}{d_0}\right)\right]^2, \qquad (1)$$

where d is the instantaneous distance in the structure, d0 is the target distance (in this case, the distance in the original structure), and k is a weighting factor. The method bears some similarity to methods based on the elastic network model recently described for X-ray crystal refinement of low resolution structures (Schröder et al., 2007; Schröder et al., 2010), with important differences: (i) We use the log-harmonic potential, which differs from a harmonic potential in that it is more tolerant of larger deviations. Allowing for changes in the structure is an advantage, since there may be important differences between the structure in the crystal and the structure in the pilus: the external packing forces present in the two situations may be different. (ii) Rather than updating the network, we combined the ‘‘log-elastic’’ network with an automatic reclassification of restraints depending on their deviation from the initial distance (corresponding to a Bayesian mixture model), and automated weighting. We used a total of four classes of restraints, with limits between classes set to multiples of the overall logarithmic standard deviation.

2. In the first stage of modeling, a single linear restraint (i.e., the force is independent of the distance to the pilus axis) was introduced between the center of mass of the helix and the origin, to maintain the protomers close to the pilus axis. The origin is fixed in space and also serves as the center of the coordinate system for the symmetry operations. The center of mass of the helix is determined ‘‘on the fly’’ at each molecular dynamics or minimization step.
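As an illustration of Eq. (1), the following minimal Python sketch evaluates the log-harmonic restraint energy next to an ordinary harmonic one, so that the greater tolerance to large deviations can be seen numerically. The function names and the example distances are assumptions made for this illustration; the sketch omits the reclassification and automated weighting of restraints and is not the CNS implementation used in the study.

```python
import math


def log_harmonic_energy(d, d0, k=1.0):
    """Log-harmonic restraint of Eq. (1): k * (log(d / d0))**2."""
    return k * math.log(d / d0) ** 2


def log_harmonic_force(d, d0, k=1.0):
    """Derivative dE/dd of the log-harmonic restraint: 2 k log(d/d0) / d."""
    return 2.0 * k * math.log(d / d0) / d


def harmonic_energy(d, d0, k=1.0):
    """Ordinary harmonic restraint, k * (d - d0)**2, shown for comparison."""
    return k * (d - d0) ** 2


if __name__ == "__main__":
    d0 = 5.0  # hypothetical target distance in Angstrom
    for d in (5.0, 6.0, 10.0, 20.0):
        print(d, log_harmonic_energy(d, d0), harmonic_energy(d, d0))
```

With the same weight k, the log-harmonic energy grows only logarithmically once d is far from d0, which is why larger structural changes between crystal and pilus are penalized less severely than with a harmonic term.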

3. The experimental distance information on salt bridges obtained by double charge inversion (experimental for the T2S and TCP; predicted for GC) was introduced as a distance restraint in some calculations. The symmetry of the pilus leads to an ambiguity in this contact information: it is not known between which two protomers the saltbridge exists without knowing the arrangements of the protomers in the pilus. This type of ambiguity can be readily treated by an ambiguous distance restraint. A quantity with the dimension of a distance can be calculated for N distances:

$$D = \left(\sum_{n=1}^{N} d_n^{-m}\right)^{-1/m}, \qquad (2)$$

where each of the N distances $d_n$ is calculated from the first residue of the salt bridge to the second residue on one of the N neighbors (in our case, N = 30, since a total of 30 neighbors are simulated), and where the exponent m determines the ‘‘selectivity’’ of the restraint: the higher the exponent, the more selective, with $\lim_{m\to\infty} D = \min(d_n;\ n = 1, \ldots, N)$. For NOE derived distances, the natural choice for m is 6, because of the distance dependency of the NOE. For the current application we have chosen m = 20. The advantage of using an ambiguous distance restraint rather than the minimum distance is that the restraint is differentiable everywhere, which makes minimization and molecular dynamics more stable. The symmetry in the distances is imposed using the symmetry operators directly in the calculation of the non-bonded interaction of one monomer with the images and in the calculation of the different contributions in Eq. (2).

4. In some of the calculations (see below for a description of the different calculations performed), a linear restraint (force independent of distance) was used to incorporate the (experimental or predicted) cysteine crosslinking data, also using ambiguous restraints. This restraint was active in stages 1 and 2 of the minimization protocol only. We preferred a linear distance dependency for this restraint since it is difficult to convert the crosslinking efficiency directly into an upper limit for the distance. The ambiguity was treated as for the salt bridge restraints.

5. To improve the interactions of the charged side chains, in particular in the first two stages of the calculation where no or only approximate electrostatic interactions were present, an adaptive log-harmonic ambiguous restraint was used as an attractive intermolecular interaction between a charged residue and all residues of opposite charge on any protomer in the pilus.
For example, for a lysine residue, the restraint connects the lysine Nζ to all Oδ atoms of aspartates, all Oε atoms of glutamates, and the C-terminal oxygens, on any of the neighbors. This restraint was motivated by the observation that charges are very well compensated in the interior of the pilus. These restraints were checked, re-classified and weighted at regular intervals in a similar way as the conformational restraints on the initial structures.

2.3. Search strategy for pilus modeling

The full-length structures of the pilins served to generate molecular models of the pili by a multi-stage minimization and molecular dynamics procedure, identical for all three pili. We ran calculations with different combinations of conformational restraints. The biophysical properties derived from electron microscopy measurements were always used, throughout the calculation: for the T2S pilus, rise per unit: 10.4 Å, rotation angle per unit: 84.71° (Köhler et al., 2004); for the GC pilus, rise per unit: 10.5 Å, rotation angle per unit: 104° (Craig et al., 2006); and for the TCP, rise per unit:

7.5 Å, rotation per unit: 140° (Li et al., 2008). The helical sense was imposed to be right-handed for T2S and GC, left-handed for TCP, but we also tested the effect of using the wrong handedness (see Section 3). The symmetry of the pili was imposed by modeling only one primary protomer explicitly and by calculating the interactions of the primary protomer with the rest of the pilus (force field and distance restraints) by using symmetry operators. Ideal symmetry of the system is thus maintained throughout the whole calculation. We used a total of 31 protomers (15 neighbors of the primary protomer in each direction of the helix axis). This may seem excessive, but adding protomer units in this way comes at a negligible computational cost. Also, when longer cutoffs are used (up to 12 Å during the modeling for the electrostatic interaction, and up to 25 Å in the analysis), more partners than those in direct contact need to be considered. All symmetry operators required to reconstruct the entire system were explicitly specified, using the ‘‘NCS strict’’ option in CNS (Brunger et al., 1998). These are rotations in multiples of the rotation per unit (e.g., left-handed rotation of 140° for the TCP), and multiples of the translation per unit along the helix axis (e.g., 7.5 Å for the TCP). The primary monomer is free to rotate and translate (i.e., the distance from the symmetry axis and the orientation with respect to the symmetry axis can vary). To include the symmetry-related ambiguity in the restraints, we modified the NOE distance routine in CNS to use the symmetry operators and calculate summed distances over all images of the primary protomer specified by the symmetry operators. A flag allows the intra-protomer contribution to be either included or excluded. In the beginning of the calculation, a monomer was placed close to the pilus axis, with the α-helix approximately parallel to the pilus axis.
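The helical symmetry operators described above (a rotation by a multiple of the rotation per unit combined with a translation by the same multiple of the rise per unit) can be sketched in a few lines of Python. This is an illustration only, not the CNS ‘‘NCS strict’’ setup used in the study: the function names are hypothetical, the helix axis is assumed to coincide with the z axis, and a counter-clockwise rotation for increasing z is taken as the right-handed convention.

```python
import math


def helical_image(xyz, n, rise, rotation_deg, right_handed=True):
    """Map a point of the primary protomer onto its n-th helical image.

    The helix axis is assumed to be the z axis; n may be negative.
    """
    sign = 1.0 if right_handed else -1.0
    phi = math.radians(sign * n * rotation_deg)
    x, y, z = xyz
    return (
        x * math.cos(phi) - y * math.sin(phi),
        x * math.sin(phi) + y * math.cos(phi),
        z + n * rise,
    )


def neighbor_images(xyz, rise, rotation_deg, n_each_side=15, right_handed=True):
    """Images n = -15..-1 and 1..15: the 30 neighbors of the primary protomer."""
    return [
        helical_image(xyz, n, rise, rotation_deg, right_handed)
        for n in range(-n_each_side, n_each_side + 1)
        if n != 0
    ]


if __name__ == "__main__":
    # TCP parameters quoted in the text: rise 7.5 A, rotation 140 degrees, left-handed.
    atom = (10.0, 0.0, 0.0)  # hypothetical atom position, 10 A from the axis
    for image in neighbor_images(atom, 7.5, 140.0, right_handed=False)[:3]:
        print(image)
```

Because every image is generated from the primary protomer, the distance of each atom from the axis is preserved and the assembly stays ideally symmetric however the primary protomer moves.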
The orientation around the N-terminal α-helix was chosen randomly within 180° (for the GC and T2S pili) or 360° (for the TCP). The orientation around the two axes perpendicular to the pilus axis was randomized within 20°. The minimization then proceeded in a three-stage protocol. The scripts are available as supplementary material.

2.4. Stage 1: Rapid packing optimization

The first stage consisted of a fast optimization of the geometry and the packing, with a simplified non-bonded interaction (repulsive van der Waals only, no electrostatics). A constant force (linear) restraint was used to attract the protomer to the pilus axis. In addition, a non-specific salt-bridge restraint favored interactions between oppositely charged residues.

2.5. Stage 2: Vacuum refinement

In the second stage, the structures were refined in vacuo with a full force field (the CHARMM Param19 extended atom force field (Brooks et al., 1983)), with adapted non-bonded parameters similar to previous work (a distance-dependent dielectric of 1.41, a switching function between 2 and 9 Å, and a non-bonded cut-off of 10 Å (Blondel et al., 1999)). All the restraints from stage 1 remained active. This stage served to further optimize the packing between protomers.

2.6. Stage 3: Clustering and solvent refinement

After the second stage, the pilus structures were clustered based on the RMS difference of their Cα positions. To calculate a matrix of RMS distances, the pilins were not superimposed but rotated around and shifted along the pilus axis to an equivalent position within the pilus (i.e., the distance from the pilus axis and the orientation of the protomer remained unchanged). In this way, the RMS difference between different models still includes differences

due to rigid body translations and rotations in addition to differences in the structure of the protomer. The clustering algorithm (Daura et al., 1999) counts the number of neighbors of a structure using a user-defined cutoff (see Table 1) and takes the structure with the largest number of neighbors, together with all its neighbors, as a cluster. It then eliminates all structures in the cluster from the pool of structures. We ran the algorithm in ‘‘full linkage’’ mode, which adds a structure to a cluster when its distance to any element of the cluster is less than the cutoff. Two structures within a cluster can thus differ by more than the cut-off. The third refinement stage, usually performed only on the first cluster, consisted of a short refinement in explicit water, similar to the one used in NMR structure determination (Linge et al., 2003). We used a water layer of 10 Å thickness and a non-bonded cutoff of 12 Å, and the TIP3P water model, with the CHARMM Param19 extended atom force field and standard nonbonded parameters, with electrostatic force-shifting (Steinbach and Brooks, 1994). No conformational restraints were used during this stage.

2.7. Structure validation and energy evaluation

Structures of the pili were validated by calculating the overall energy with the CHARMM19 force field and with PROCHECK (Laskowski et al., 1993). No bad non-bonded contacts were identified. The final 10 ps of molecular dynamics of the 20 lowest energy models were used for more detailed energetic analysis. The energy of the total structure and the energetic contributions of single residues were evaluated using the generalized Born electrostatic options in CNS, modified for symmetric systems (Moulinier et al., 2003). We used the CHARMM Param19 force field with the ACE generalised Born implementation (Calimet et al., 2001), with an internal dielectric constant of 4 and an external dielectric constant of 80.
The energetic contribution of a residue to the pilus stability was evaluated as the total non-bonded energy (vdW, electrostatic,

generalized Born) between the residue and the rest of the system, evaluated once in the pilus, from which we subtracted the energy evaluated with the pilin protomer placed at 100 Å from the pilus axis (no interaction between protomers). Entropic contributions were neglected.

3. Results

3.1. Overview of the calculations

Between 400 and 1000 pilus structures were calculated for each system and setup, and clustered with respect to structural differences, position and orientation. We performed stage 1 (rapid packing optimization) and stage 2 (molecular dynamics refinement with distance-dependent dielectric) for all structures. In all three systems, there were many more structures in the first cluster than in any of the other clusters, and therefore only the first cluster was refined in water and analysed further. In contrast to the modeling procedure used for the published models of the GC pilus and TCP, the sidechains and the entire two N-terminal residues were kept completely flexible, in addition to an overall flexibility of the backbone, allowing the protomers to explore conformations that would be missed in a procedure based on rigid body motions only. For all systems, we ran a calculation with only the overall packing restraint (see Table 1). For the T2S pilus, we ran calculations with restraints derived from cysteine crosslinking and salt bridge charge reversal experiments; for the TCP, with a restraint derived from the charge reversal experiment only. For the GC pilus, we predicted close distances from the cryo-EM derived model, for those charge pairs or Cβ–Cβ distances that could potentially be determined experimentally. For some calculations, we used the incorrect hand. Table 1 gives an overview of all calculations performed. In the cryo-EM structure (Craig et al., 2006), two inter-molecular salt bridges are clearly present, with distances between the charged atoms of less than 3 Å (Glu 49/Arg 30, and Asp 153/Lys 76). Two

Table 1. Overview over calculations.

| Calculation (a) | Pilus | Hand (b) | Salt bridge restraints (c) | Crosslink restraints (c) | 1st cluster (%) (d) | Cutoff (Å) (e) |
|---|---|---|---|---|---|---|
| PG-L0 | T2S | Left | – | – | 17.7 | 1.75 |
| PG-LSB1C | T2S | Left | Asp 48–Arg 87 | Val 10–Leu 16 | 37.8 | 1.75 |
| PG-R0 | T2S | Right | – | – | 23.5 | 1.75 |
| PG-RSB1C | T2S | Right | Asp 48–Arg 87 | Ile 10–Leu 16 | 66.1 | 1.75 |
| PG-RSB1CC | T2S | Right | Asp 48–Arg 87 | Val 10–Leu 16; Ile 6–Leu 13; Val 9–Leu 16 | 47.7 | 1.75 |
| GC-L0 | GC | Left | – | – | 4.0 | 1.75 |
| GC-LSB4C | GC | Left | Asp 113–Lys 74; Glu 49–Arg 30; Asp 153–Lys 76 | Val 9–Leu 16 | 9.0 | 1.75 |
| GC-R0 | GC | Right | – | – | 35.6 | 1.75 |
| GC-RSB1C | GC | Right | Glu 113–Lys 74 | Val 9–Leu 16 | 65.8 | 1.75 |
| GC-RSB2C | GC | Right | Glu 49–Arg 30 | Val 9–Leu 16 | 46.5 | 1.75 |
| GC-RSB3C | GC | Right | Asp 153–Lys 76 | Val 9–Leu 16 | 57.3 | 1.75 |
| GC-RSB4C | GC | Right | Asp 113–Lys 74; Glu 49–Arg 30; Asp 153–Lys 76 | Val 9–Leu 16 | 71.8 | 1.75 |
| GC-RSB4CC | GC | Right | Glu 113–Lys 74; Glu 49–Arg 30; Asp 153–Lys 76 | Val 9–Leu 16; Leu 6–Ile 12; Leu 6–Leu 16 | 26.7 | 1.75 |
| TC-L0 | TCP | Left | – | – | 12.7 | 2.5 |
| TC-LSB1 | TCP | Left | Glu 83–Arg 26 | – | 16.5 | 2.5 |
| TC-R0 | TCP | Right | – | – | 8.9 | 2.5 |
| TC-RSB1 | TCP | Right | Glu 83–Arg 26 | – | 9.9 | 2.5 |

(a) Name of the calculation in the text and in the figure legends. PG: T2S pilus; GC: GC pilus; and TC: TCP.
(b) Left-handed is correct for TCP and right-handed for PulG and GC.
(c) Restraint applied in the calculation. 4 Å upper limit for salt bridges, linear restraint for crosslinks.
(d) Percentage of structures in the first (largest) cluster.
(e) Cutoff distance applied in cluster calculation. Any member of a cluster is separated from another cluster member maximally by this distance.
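The ‘‘1st cluster’’ percentages in Table 1 come from the neighbor-counting clustering described in Section 2.6. A minimal Python sketch of that basic step is given here for orientation: it takes a precomputed matrix of pairwise RMS differences, repeatedly picks the structure with the most neighbors within the cutoff, and removes that structure and its neighbors from the pool. The function names are hypothetical, the ‘‘full linkage’’ mode mentioned in the text is not implemented, and the toy distance matrix is invented for illustration.

```python
def neighbor_count_clusters(dist, cutoff):
    """Cluster structures from a symmetric matrix of pairwise RMS differences.

    At each round the structure with the most neighbors within `cutoff`
    (among the structures still in the pool) is taken with all its
    neighbors as one cluster, which is then removed from the pool.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        neighbors = {
            i: [j for j in remaining if j != i and dist[i][j] <= cutoff]
            for i in remaining
        }
        # Ties are broken in favor of the lowest index.
        center = max(sorted(remaining), key=lambda i: len(neighbors[i]))
        members = sorted([center] + neighbors[center])
        clusters.append(members)
        remaining -= set(members)
    return clusters


def first_cluster_percentage(dist, cutoff):
    """Percentage of all structures that fall into the first (largest) cluster."""
    clusters = neighbor_count_clusters(dist, cutoff)
    return 100.0 * len(clusters[0]) / len(dist)


if __name__ == "__main__":
    # Toy example: five "structures" on a line, distances are |xi - xj|.
    positions = [0.0, 1.0, 2.0, 10.0, 10.5]
    dist = [[abs(a - b) for b in positions] for a in positions]
    print(neighbor_count_clusters(dist, 1.5))
    print(first_cluster_percentage(dist, 1.5))
```

In the toy example, structures 0 and 2 are both neighbors of structure 1 but lie 2.0 apart, which is more than the cutoff of 1.5; members of one cluster are only guaranteed to be within the cutoff of the cluster center.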

more distances (between Glu 5 and the N-terminal nitrogen, and between Glu 113 and Lys 74) are between 5 and 6 Å. In the calculations with restraints, we imposed a uniform upper limit of 4 Å on the distance between the charged side-chains, using a restraint ambiguous for the two oxygens in Glu and Asp, and the two nitrogens in Arg, in addition to the ambiguity due to the symmetry of the pilus. Hence, in total, for a salt bridge between a Glu and an Arg residue, there are 2 × 2 × 30 = 120 possibilities in the restraint. Interestingly, the closest contacts between β carbons in the N-terminal helix in the GC pilus differ slightly from the experimentally established closest contacts in the PulG pilus (Campos et al., in press). In the GC pilus, we observe the hierarchy 6–16 < 6–12 < 6–15 < 9–16 < 6–13 < 7–12 < 10–16, whereas in the PulG system, the probability to form a disulfide bridge indicated a hierarchy 9–16 < 6–13 < 11–16 < 11–17, 10–17 (Campos et al., 2010). For the double cysteine mutants at positions 11–17 and 10–17, virtually no crosslinking had been observed (Campos et al., 2010). The distance range for Cβ–Cβ distances observed in disulfide bridges in X-ray crystal structures is between 2.9 and 4.6 Å (Hazes and Dijkstra, 1988). Since it is difficult to establish an exact quantitative relationship between the probability of forming a disulfide bridge and the distance between the β carbons, we chose a linear restraint, where the force is independent of the distance, to incorporate the crosslinking results in the calculations.
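To make the ambiguous salt bridge restraint concrete, the following minimal Python sketch evaluates the effective distance D of Eq. (2) over all 120 candidate atom pairs of a Glu/Arg salt bridge and compares it with the 4 Å upper limit. The function names and the example distances are assumptions for illustration only; the actual restraint is evaluated inside the modified CNS NOE routine.

```python
def ambiguous_distance(distances, m=20):
    """Effective distance D = (sum_n d_n**(-m))**(-1/m) of Eq. (2)."""
    return sum(d ** (-m) for d in distances) ** (-1.0 / m)


def upper_limit_violation(distances, upper=4.0, m=20):
    """Amount by which D exceeds the upper limit (0.0 if the restraint is satisfied)."""
    return max(0.0, ambiguous_distance(distances, m) - upper)


if __name__ == "__main__":
    # Hypothetical Glu/Arg salt bridge: 2 oxygens x 2 nitrogens x 30 neighbors
    # gives 120 candidate distances; one pair is close, all others are far away.
    candidates = [3.5] + [12.0] * 119
    print(ambiguous_distance(candidates, m=20))  # dominated by the 3.5 A pair
    print(ambiguous_distance(candidates, m=6))   # less selective exponent
    print(upper_limit_violation(candidates))
```

A single close pair among the 120 candidates is enough to satisfy the restraint, and D remains a smooth function of all candidate distances, which is the property the text cites as the advantage over taking the minimum directly.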

3.2. Convergence

The convergence is dominated by overall packing. In all three systems, the pilin protomers readily rotate by up to 90° (for the PulG and GC pili) or 180° (in the case of the TCP) around the N-terminal α-helix to the same orientation. Table 1 indicates the fraction of calculations converging to the largest cluster. Without any conformational restraints, the convergence is best for the GC pilus, followed by the T2S pilus and the TCP. The performance is comparable for TCP and T2S, since the rotation angle sampled in the starting structure is twice as large for TCP as for the T2S pilus (see Section 2), albeit with a larger cluster cutoff for TCP. Conformational restraints have a strong influence on the convergence to the largest cluster. In the GC pilus, it nearly doubles if one salt bridge restraint (Glu 113–Lys 74) and one crosslink restraint (Val 9–Leu 16) are applied (calculation GC-RSB1C). Not all salt bridges have the same influence on convergence (compare calculations GC-RSB1C, GC-RSB2C and GC-RSB3C), and, surprisingly, convergence deteriorates when more than one cysteine crosslink restraint is applied (calculation GC-RSB4CC). The restraints have a similar effect in the T2S pilus (cf. calculations PG-R0 and PG-RSB1C). Surprisingly, here too the convergence is reduced if more than one cysteine crosslink restraint is applied (calculation PG-RSB1CC). In the TCP, applying the one experimentally validated salt bridge (Glu 83–Arg 26) increases the convergence only slightly. This indicates the importance of the cysteine crosslink restraint for the correct organisation of the N-terminal helices in the center of the pilus. Only for the GC pilus can we compare the results of the procedure with the structure of a pilus obtained with EM data of much higher resolution (Craig et al., 2006). It is therefore important to find criteria, in addition to the size of the cluster, to identify the most likely model without a reference structure. The total energy (CHARMM19 force field with ACE generalized Born) cannot distinguish between incorrect models and the correct model, and does not show any correlation with the atomic RMS difference to the cryo-EM model (Fig. 1, left). The restraint energy, in contrast, shows very good correlation with the similarity to the cryo-EM model. For the models from the unrestrained calculation, the restraints identify structures between 1 and 5 Å away from the cryo-EM model, with an average of around 2.5 Å (Fig. 1, right). As explained in Section 2 (in the section on clustering and solvent refinement), this number is obtained by calculating the RMS difference of atomic positions without superposition and contains contributions from differences in the structure, distance from the center of the pilus and orientation of the protomer. The average RMS difference of the calculated structures is very similar to the estimated error in atomic position in the cryo-EM GC pilus model (around 2.5 Å) (Craig et al., 2006).

3.3. Calculations with the incorrect handedness

The fact that the original handedness of the T2S pilus had been proposed incorrectly (Köhler et al., 2004) is indicative of the potential difficulties in establishing the handedness experimentally from EM data. We therefore performed some calculations with the incorrect handedness, in order to see if the correct handedness could be chosen from the result of our calculations. In terms of convergence to the largest cluster, there is a more or less marked difference: 17.7% vs. 23.5% in T2S; 4.0% vs. 35.6% for the GC pilus, and 8.9% vs. 12.7% for the TCP (see Table 1). Hence, this criterion (number of structures in top cluster) alone would allow a clear distinction between correct and incorrect handedness only in the GC pilus. The variation of non-bonded energy (van der Waals and electrostatic with implicit solvent) is much bigger within the top cluster in each calculation than the difference between calculations with correct and incorrect handedness (not shown), and it is therefore not possible to use this criterion either. The situation changes if the experimentally established contacts are used in the analysis. The difference becomes clear-cut when the restraint energy is evaluated for structures calculated without restraints (Fig. 2, red and green bars). There are structures without

Fig. 1. Analysis of the unrestraint calculation GC-R0. Left: Atomic RMS differences to the cryo-EM model as a function of energy. Right: RMS to the cryo-EM structure against restraint energy.

M. Campos et al. / Journal of Structural Biology 173 (2011) 436–444

441

Fig. 2. Histograms of restraint energy for the PilE and the PulG systems. Left: GC-R0 (red), GC-RSB1C (black), GC-L0 (green), and GC-LSB1C (blue). Right: PG-R0 (red), PG-RSB1C (black), PG-L0 (green), and PG-LSB1C (blue).

any restraint violations for the correct, right-handed pilus, for PilE and PulG. In contrast, there are no structures that can satisfy the restraints completely for the left-handed structures for these two systems. When we biased the calculation by the application of structural restraints, we found that structures with the wrong hand could be obtained that satisﬁed the restraints. The ratio of convergence to the largest cluster remains roughly the same for the GC and T2S pili. However, comparing the results with left- and right-handed structures allowed in both cases to clearly identify the correct handedness, since the right-handed structures satisfy the restraints better (Fig. 2, black and blue bars): the distribution becomes much sharper and its maximum shifts to the left (lower RMS difference to restraints). 3.4. Geometry of the pili and water accessibility Despite differences in the structure of the pilins, in particular the size of the globular head, and in the measured helical repeats, the structures of the T2SS and T4P pili modeled in this study have common features (Fig. 3). The outer diameter is similar; it varies between approximately 65 Å for the PulG and the GC pili and 80 Å for the TCP. This outer diameter was not restraint directly in any of the calculations, it is a consequence of the packing of the pilin protomers in the pili. There is a very small or vanishing central cavity in all three pili. Profound grooves separate the strands of the helices in the pili, making the N-terminal a-helices in the core of the pili accessible to water. For the TCP, this is shown in Fig. 4, which maps the accessibility measured by mass spectrometry (Li et al., 2008) onto the surface. It is difﬁcult to make a strict quantitative correlation between the mass spectrometry data and a three dimensional structure, since many factors inﬂuence hydrogen exchange (Best and Vendruscolo, 2006). 
The regions showing fastest hydrogen exchange are clearly water accessible in our model. The interactions between protomers are similar in our model and in the model directly based on the hydrogen exchange data (Li et al., 2008). However, the "hole" giving access to the central helix is narrower in our model, and is more focused on residues 13–23, which show the fastest hydrogen exchange. Other parts of the N-terminal helix are barely accessible to water in our model, in agreement with the mass spectrometry data.

3.5. Protomer packing

The tight packing within the pilus produces numerous specific interactions between neighboring protomers in all pili, such that each protomer (P) interacts directly with several neighboring subunits. Whereas the packing is similar in the GC and PulG pili, we observed some differences for the TCP. In the PulG and the GC pilus, most interactions are with four upper (P+4, P+3, P+1, P+7) and four lower (P−4, P−3, P−1, P−7) protomers in the filament; in the TCP, with four upper (P+3, P+5, P+2, P+8) and four lower (P−3, P−5, P−2, P−8) protomers. Some direct contacts extend to protomers P+10/P−10 for the TCP and protomers P+8/P−8 for the GC pilus. The P+7 interactions in PulG are electrostatic interactions between the N-terminus and the side chain of Asp53. Different models were proposed for the GC and the T2S pilus assembly (Craig et al., 2006; Campos et al., 2010). However, the packing order and the assembly machinery of these systems are similar (Sauvonnet et al., 2000). For the PulG pilus, the P/P+1 interactions are crucial for assembly and secretion function: there is a clear correlation between the formation of two conserved salt bridges at the P/P+1 interface and function (Campos et al., 2010).

We also analysed the number of protomers within a 25 Å range of the principal protomer in the pilus, the cutoff chosen for the electrostatic interactions. We observed the maximum number of interacting protomers for the TCP, up to positions P+13/P−13, indicating that the 15 neighbors on each side included in the modeling were indeed sufficient for calculating the electrostatic interactions but not excessive.

3.6. Role of charged and polar interactions

There are about twice as many charged residues in PulG as in PilE or TcpA (seven negatively charged and six positively charged residues in PulG, including the N-terminal Phe; three and four, respectively, in PilE; and two and four, respectively, in TcpA). Apart from the strictly conserved charged positions at the N-terminus and at Glu 5, there are other positions (approximately) conserved for charged or polar residues: position 29/30; position 35; and position 44. Intra- or inter-subunit salt bridges neutralize each charge. Fig. 4 shows a close-up view of the region around residues Arg26 and Lys35 in an ensemble of 20 structures for the TCP, which provides a rather dynamic view of these polar interactions, similar to what we had observed in the T2S pilus (Campos et al., 2010).

3.7. Energetic contributions of each residue

Fig. 5 shows the average energetic contribution of each residue to the stability of the pilus in all three systems, estimated by MM-GBSA calculations. A few dominant residues are predicted to add substantially to the stability of the pilus. The exact location of the minima depends on the exact sequence. A common feature is that the predicted contributions to stability are scattered over the whole sequence and not only concentrated in the N-terminal helix.
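The averaging behind a per-residue profile such as the one in Fig. 5 can be illustrated with a minimal Python sketch. It assumes that a per-residue energy decomposition is already available for every structure of an ensemble; the MM-GBSA calculation itself is not reproduced here. The function names, the threshold and the toy numbers are hypothetical and are not values from this study.

```python
from statistics import mean


def average_per_residue(energies_by_structure):
    """Average each residue's energy term over an ensemble.

    energies_by_structure: list of dicts {residue_number: energy},
    one dict per structure; all dicts must share the same keys.
    """
    residues = sorted(energies_by_structure[0])
    return {r: mean(s[r] for s in energies_by_structure) for r in residues}


def dominant_residues(avg, threshold):
    """Residues whose mean contribution is at or below `threshold`
    (more negative is taken to mean more stabilising)."""
    return sorted(r for r, e in avg.items() if e <= threshold)


# Toy numbers, invented for illustration only (not data from Fig. 5).
ensemble = [
    {1: -4.0, 5: -6.0, 44: -3.0, 87: -0.5},
    {1: -5.0, 5: -4.0, 44: -1.0, 87: -1.5},
]
avg = average_per_residue(ensemble)
print(avg)
print(dominant_residues(avg, threshold=-3.0))
```

Averaging over the ensemble rather than reading values from a single structure reduces the weight of side-chain arrangements that occur in only a few models.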


Fig. 3. Ensembles of the 20 lowest energy structures of the first cluster of calculations without restraints. Colouring ranges from blue (10 Å from the pilus axis) to red (maximum distance). Top row: side view. Bottom row: top view. Left: TCP. Middle: T2S pilus. Right: GC pilus.
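The filament geometry in Fig. 3 results from applying one helical symmetry operation repeatedly to a single protomer. The following Python sketch illustrates this operation and the distance from the pilus axis used for the colouring. It is an illustration only, not the CNS protocol used for the models. The rise (7.5 Å) and rotation (140°, left-handed) are the TCP values quoted in this paper; the function names, the choice of z as the pilus axis, the sign convention for handedness and the single pseudo-atom at 30 Å are assumptions made for the example.

```python
import math


def helical_copy(xyz, k, rise, twist_deg):
    """Coordinates of protomer P+k generated from those of protomer P.

    Rotates by k * twist_deg about the z axis (assumed to be the pilus
    axis) and shifts by k * rise along it. In this sketch a positive
    twist gives a right-handed helix and a negative twist a left-handed
    one.
    """
    a = math.radians(k * twist_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z + k * rise) for x, y, z in xyz]


def radial_distance(point):
    """Distance of a point from the z axis."""
    x, y, _ = point
    return math.hypot(x, y)


# Hypothetical protomer reduced to one pseudo-atom 30 Å from the axis.
protomer = [(30.0, 0.0, 0.0)]

# Principal protomer plus 15 neighbours on each side, TCP-like parameters.
filament = [
    helical_copy(protomer, k, rise=7.5, twist_deg=-140.0)
    for k in range(-15, 16)
]

# Outer diameter as twice the largest distance from the axis.
outer_diameter = 2 * max(radial_distance(p) for copy in filament for p in copy)
print(round(outer_diameter, 3))
```

Because the symmetry operation is a rotation about the axis combined with a shift along it, the distance of every atom from the axis is the same in all copies, so the outer diameter is fixed by the protomer placement alone.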

Fig. 4. Left: Close-up of a "hole" in the TCP filament surface, which exposes part of the N-terminal α-helix (residues 13–23). One protomer is colored to approximately represent deuterium exchange as measured in Li et al. (2008) (blue: less than 20%, green: around 40%, and orange: more than 50%). A part of the N-terminal helix is clearly visible (red). To illustrate the accessibility of the helix to water through this channel, some water molecules introduced during the modeling are shown. Right: Close-up of the interface between neighboring subunits in the TCP, formed between the αβ-loop of one subunit and the N-terminal α-helix of a neighboring subunit. The interface is shown as if looking out from the interior of the filament. An ensemble of charged side chains is shown involving residues Arg26, Lys35, Asp82 and Glu83 (in red and blue for negatively and positively charged residues, respectively). Ensembles of other charged side chains in the neighborhood are also shown (in pink for acidic and in pale blue for basic side chains). The coloring of the backbone is as in Fig. 3.


Fig. 5. By-residue energy contribution to the overall stability of the pili. Top: PulG pilus. Middle: GC pilus. Bottom: TCP.

In PulG, two pairs, Asp48–Arg87 and Glu44–Arg88, were consistently predicted to form intermolecular salt bridges. These interactions were validated by site-directed mutagenesis, via replacement of residues Glu44, Asp48, Arg87 and Arg88 individually and in pairs by residues of the opposite charge.
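One simple way to quantify how consistently a salt bridge is predicted is to count the fraction of ensemble structures in which the two charged groups are in contact. The Python sketch below illustrates this under stated assumptions: the 4 Å cutoff, the atom labels and the toy coordinates are hypothetical choices for the example, not the criterion or data used in this study.

```python
import math


def salt_bridge_fraction(ensemble, acid_atom, base_atom, cutoff=4.0):
    """Fraction of structures in which two charged-group atoms lie
    within `cutoff` Å of each other.

    ensemble: list of dicts mapping an atom label to (x, y, z) in Å.
    """
    hits = sum(
        1 for s in ensemble
        if math.dist(s[acid_atom], s[base_atom]) <= cutoff
    )
    return hits / len(ensemble)


# Toy ensemble of three structures with invented coordinates; the
# labels name a carboxylate oxygen and a guanidinium nitrogen on
# neighbouring protomers P and P+1.
ACID = "P:Asp48:OD1"
BASE = "P+1:Arg87:NH1"
ensemble = [
    {ACID: (0.0, 0.0, 0.0), BASE: (3.0, 0.0, 0.0)},
    {ACID: (0.0, 0.0, 0.0), BASE: (0.0, 3.5, 0.0)},
    {ACID: (0.0, 0.0, 0.0), BASE: (0.0, 0.0, 6.0)},
]
fraction = salt_bridge_fraction(ensemble, ACID, BASE)
print(f"{fraction:.2f}")
```

A pair that is in contact in most structures of the top cluster is a natural candidate for charge-inversion mutagenesis of the kind described above.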


4. Discussion

In this paper, we describe a strategy to obtain models with pseudo-atomic detail from sparse distance data and low resolution electron microscopy data. We show that with this strategy we can obtain reliable models for pili of the T2SS, T4aP and T4bP (TCP) families. Electron microscopy data as used here are available for other pili in this family, and a more atomic picture of these pili could be obtained in a straightforward way. The strategy also makes it possible to test the influence of structural restraints that could be obtained by crosslinking and mass spectrometry or by site-directed mutagenesis. The strategy may be more generally applicable to structures showing helical symmetry.

We have deliberately not based our calculation protocol on very long molecular dynamics trajectories. The principal driving force is derived from experimental data: from the helical parameters of the pilus derived from EM data, from conformational restraints from mutation experiments, and from compactness considerations. The capability of CNS to handle many different sources of experimental restraints and strict symmetry is therefore most important, in addition to the possibility of writing elaborate refinement scripts. For generating the structures, the force field is in a way of secondary importance, and we can use a somewhat dated force field such as PARAM19. To assess convergence, we have used the size of the cluster and energetic criteria. The energetic criteria alone were not sufficient to identify the correct structures. We are currently modifying our scripts to use a more recent force field (than PARAM19) and a more elaborate implicit solvent implementation (than ACE).

How sparse can the data be to still get meaningful structures? This is difficult to answer in general. The whole project started with calculating structures without any restraints (Campos et al., in press), in order to determine different arrangements of the pilins in the pilus compatible with the overall symmetry, and to identify inter-molecular contacts that distinguish the different conformations and could be used for experimental validation. In this respect, the calculations are useful even without any restraints derived from mutation experiments. The only data necessary consist of an overall description of the helical symmetry (rise per protomer and rotation angle per protomer).

The use of a computational script allowed us to construct models of the three pili in identical ways and to compare the obtained models in detail. T2SS and T4aP are more closely related (short N-terminal presequence, Pro22, right handed) to each other than to TcpA (type IVb group pilin), consistent with the fact that the assembly machinery is different (Pelicic, 2008). There are differences in the size of the pilin protomers and in the exact helical parameters (from 7.5 Å rise along the axis in TCP to 10.5 Å in the GC pilus; from 84.71° rotation per protomer in the T2S pilus to 140° in TCP; left-handed helix for TCP and right-handed helices for GC and T2S pili). These differences in the overall parameters are likely due to the different sizes of the globular heads and the spacing of the charged residues that make inter-molecular salt bridges. The organisation is dictated by the packing of the hydrophobic N-terminal helices close to the pilus axis and the packing of the much larger globular C-terminal heads. In consequence, in spite of some similarities, there are also differences in the organisation of the protomers in the pili, making it important to have tools for building reliable models from data that are relatively easy to obtain.

Acknowledgement

This work was partially supported by the Institut Pasteur transversal project Grant No. 339.

References

Alber, F., Dokudovskaya, S., Veenhoff, L.M., Zhang, W., Kipper, J., Devos, D., Suprapto, A., Karni-Schmidt, O., Williams, R., Chait, B.T., Rout, M.P., Sali, A., 2007. Determining the architectures of macromolecular assemblies. Nature 450, 683–694.
Altieri, A.S., Byrd, R.A., 2004. Automation of NMR structure determination of proteins. Curr. Opin. Struct. Biol. 14, 547–553.
Audette, G.F., Irvin, R.T., Hazes, B., 2004. Crystallographic analysis of the Pseudomonas aeruginosa strain K122-4 monomeric pilin reveals a conserved receptor-binding architecture. Biochemistry 43, 11427–11435.
Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, T.E., Nilges, M., 2009. Influence of different assignment conditions on the determination of symmetric homodimeric structures with ARIA. Proteins 75, 569–585.
Best, R.B., Vendruscolo, M., 2006. Structural interpretation of hydrogen exchange protection factors in proteins: characterization of the native state fluctuations of CI2. Structure 14, 97–106.
Blondel, A., Renaud, J.P., Fischer, S., Moras, D., Karplus, M., 1999. Retinoic acid receptor: a simulation analysis of retinoic acid binding and the resulting conformational changes. J. Mol. Biol. 291, 101–115.
Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., Karplus, M., 1983. CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187–217.
Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T., Warren, G.L., 1998. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D: Biol. Crystallogr. 54, 905–921.
Calimet, N., Schaefer, M., Simonson, T., 2001. Protein molecular dynamics with the Generalized Born/ACE solvent model. Proteins 45, 144–158.
Campos, M., Nilges, M., Cisneros, D.A., Francetic, O., 2010. Detailed structural and assembly model of the type II secretion pilus from sparse data. Proc. Natl. Acad. Sci. USA 107, 13081–13086.
Craig, L., Taylor, R.K., Pique, M.E., Adair, B.D., Arvai, A.S., Singh, M., Lloyd, S.J., Shin, D.S., Getzoff, E.D., Yeager, M., Forest, K.T., Tainer, J.A., 2003. Type IV pilin structure and assembly: X-ray and EM analyses of Vibrio cholerae toxin-coregulated pilus and Pseudomonas aeruginosa PAK pilin. Mol. Cell 11, 1139–1150.
Craig, L., Volkmann, N., Arvai, A.S., Pique, M.E., Yeager, M., Egelman, E.H., Tainer, J.A., 2006. Type IV pilus structure by cryo-electron microscopy and crystallography: implications for pilus assembly and functions. Mol. Cell 23, 651–662.
Das, R., Baker, D., 2008. Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382.
Daura, X., Gademann, K., Jaun, B., Seebach, D., van Gunsteren, W., Mark, A., 1999. Peptide folding: when simulation meets experiment. Angew. Chem., Int. Ed. 38, 236–240.
Dominguez, C., Boelens, R., Bonvin, A.M.J.J., 2003. HADDOCK: a protein–protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc. 125, 1731–1737.
Güntert, P., 2009. Automated structure determination from NMR spectra. Eur. Biophys. J. 38, 129–143.
Hansen, J.K., Forest, K.T., 2006. Type IV pilin structures: insights on shared architecture, fiber assembly, receptor binding and type II secretion. J. Mol. Microbiol. Biotechnol. 11, 192–207.
Hazes, B., Dijkstra, B.W., 1988. Model building of disulfide bonds in proteins with known three-dimensional structure. Protein Eng. 2, 119–125.
Hazes, B., Sastry, P.A., Hayakawa, K., Read, R.J., Irvin, R.T., 2000. Crystal structure of Pseudomonas aeruginosa PAK pilin suggests a main-chain-dominated mode of receptor binding. J. Mol. Biol. 299, 1005–1017.
Herrmann, T., Güntert, P., Wüthrich, K., 2002. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J. Mol. Biol. 319, 209–227.
Karaca, E., Melquiond, A.S.J., de Vries, S.J., Kastritis, P.L., Bonvin, A.M.J.J., 2010. Building macromolecular assemblies by information-driven docking. Mol. Cell Proteomics 9, 1784–1794.
Keizer, D.W., Slupsky, C.M., Kalisiak, M., Campbell, A.P., Crump, M.P., Sastry, P.A., Hazes, B., Irvin, R.T., Sykes, B.D., 2001. Structure of a pilin monomer from Pseudomonas aeruginosa: implications for the assembly of pili. J. Biol. Chem. 276, 24186–24193.
Köhler, R., Schäfer, K., Müller, S., Vignon, G., Diederichs, K., Philippsen, A., Ringler, P., Pugsley, A.P., Engel, A., Welte, W., 2004. Structure and assembly of the pseudopilin PulG. Mol. Microbiol. 54, 647–664.
Korotkov, K.V., Gray, M.D., Kreger, A., Turley, S., Sandkvist, M., Hol, W.G.J., 2009. Calcium is essential for the major pseudopilin in the type 2 secretion system. J. Biol. Chem. 284, 25466–25470.
Lasker, K., Phillips, J.L., Russel, D., Velazquez-Muriel, J., Schneidman-Duhovny, D., Tjioe, E., Webb, B., Schlessinger, A., Sali, A., 2010. Integrative structure modeling of macromolecular assemblies from proteomics data. Mol. Cell Proteomics 9, 1689–1702.
Laskowski, R.A., MacArthur, M.W., Moss, D.S., Thornton, J.M., 1993. PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291.
Li, J., Lim, M.S., Li, S., Brock, M., Pique, M.E., Woods, V.L., Craig, L., 2008. Vibrio cholerae toxin-coregulated pilus structure analyzed by hydrogen/deuterium exchange mass spectrometry. Structure 16, 137–148.
Linge, J., O'Donoghue, S.I., Nilges, M., 2001. Automated assignment of ambiguous nuclear Overhauser effects with ARIA. Meth. Enzymol. 339, 71–90.
Linge, J., Williams, M.A., Spronk, C.A.E.M., Bonvin, A.M.J.J., Nilges, M., 2003. Refinement of protein structures in explicit solvent. Proteins 50, 496–506.
Moulinier, L., Case, D.A., Simonson, T., 2003. Reintroducing electrostatics into protein X-ray structure refinement: bulk solvent treated as a dielectric continuum. Acta Crystallogr. D: Biol. Crystallogr. 59, 2094–2103.
Ng, S.Y.M., Chaban, B., Jarrell, K.F., 2006. Archaeal flagella, bacterial flagella and type IV pili: a comparison of genes and posttranslational modifications. J. Mol. Microbiol. Biotechnol. 11, 167–191.
Nilges, M., 1993. A calculation strategy for the structure determination of symmetric dimers by 1H NMR. Proteins 17, 297–309.
Nilges, M., 1995. Calculation of protein structures with ambiguous distance restraints: automated assignment of ambiguous NOE crosspeaks and disulphide connectivities. J. Mol. Biol. 245, 645–660.
Nilges, M., O'Donoghue, S.I., 1998. Ambiguous NOEs and automated NOE assignment. Progr. Nucl. Magn. Reson. Spectrosc. 32, 107–139.
Nilges, M., Bernard, A., Bardiaux, B., Malliavin, T., Habeck, M., Rieping, W., 2008. Accurate NMR structures through minimization of an extended hybrid energy. Structure 16, 1305–1312.
Nilges, M., Malliavin, T., Bardiaux, B., 2010. Protein structure calculation using ambiguous restraints. In: Encyclopedia of Magnetic Resonance. John Wiley & Sons, Ltd. doi:10.1002/9780470034590.emrstm1156.
Parge, H.E., Forest, K.T., Hickey, M.J., Christensen, D.A., Getzoff, E.D., Tainer, J.A., 1995. Structure of the fibre-forming protein pilin at 2.6 Å resolution. Nature 378, 32–38.
Pelicic, V., 2008. Type IV pili: e pluribus unum? Mol. Microbiol. 68, 827–837.
Ramboarina, S., Fernandes, P.J., Daniell, S., Islam, S., Simpson, P., Frankel, G., Booy, F., Donnenberg, M.S., Matthews, S., 2005. Structure of the bundle-forming pilus from enteropathogenic Escherichia coli. J. Biol. Chem. 280, 40252–40260.
Rieping, W., Habeck, M., Nilges, M., 2005. Modeling errors in NOE data with a lognormal distribution improves the quality of NMR structures. J. Am. Chem. Soc. 127, 16026–16027.
Rossmann, M.G., Morais, M.C., Leiman, P.G., Zhang, W., 2005. Combining X-ray crystallography and electron microscopy. Structure 13, 355–362.
Sauvonnet, N., Vignon, G., Pugsley, A.P., Gounon, P., 2000. Pilus formation and protein secretion by the same machinery in Escherichia coli. EMBO J. 19, 2221–2228.
Schröder, G.F., Brunger, A.T., Levitt, M., 2007. Combining efficient conformational sampling with a deformable elastic network model facilitates structure refinement at low resolution. Structure 15, 1630–1641.
Schröder, G.F., Levitt, M., Brunger, A.T., 2010. Super-resolution biomolecular crystallography with low-resolution data. Nature 464, 1218–1222.
Steinbach, P., Brooks, B., 1994. New spherical cutoff methods for long-range forces in macromolecular simulation. J. Comput. Chem. 15, 667–683.
Strom, M.S., Lory, S., 1991. Amino acid substitutions in pilin of Pseudomonas aeruginosa: effect on leader peptide cleavage, amino-terminal methylation, and pilus assembly. J. Biol. Chem. 266, 1656–1664.
Volkmann, N., Hanein, D., 2003. Docking of atomic models into reconstructions from electron microscopy. Meth. Enzymol. 374, 204–225.
Xu, X.-F., Tan, Y.-W., Lam, L., Hackett, J., Zhang, M., Mok, Y.-K., 2004. NMR structure of a type IVb pilin from Salmonella typhi and its assembly into pilus. J. Biol. Chem. 279, 31599–31605.
