In Chapter 5, we introduced tweaking and the
mathematics behind our algorithm for aligning separated
β-strands. In
this chapter, we will present examples of tweaking at work and how it
affects the energetics of our protein folds. We will also gather some
data regarding the efficacy of tweaking, the number of times tweaking
is applied, and how successful those attempts are. We will then fold
some random sequences utilizing a scoring function that promotes the
production of β-sheets. We will finish with a prediction for a short
fifty-six amino acid protein, protein G (1PGB) [30].
As part of our effort to debug tweaking and generate a large number of
tweaking instances, we added a debugging module to tweaking. This
module generates proteins with short to medium length stretches of
Gly. We restrict the Gly amino acids to only a single β-sheet torsion
angle pair (φ, ψ), and we then utilize a
simplified scoring function that attempts to maximize the number of
tweaking instances and the distance between the mating amino acids
within a single protein. While this is not a good setup to solve for
actual protein structures, it does demonstrate the capabilities of our
tweaking work quite clearly. The examples from this chapter use this
module unless otherwise noted.
Tweaking is able to align both parallel and anti-parallel strands. We will select some examples that illustrate the variety of behaviors one can see in tweaking.
In Figure 7.1, we show a pair of successful tweaking
events. The first is a short turn region that is tweaked so that the
two β-strands are able to form an anti-parallel β-sheet. The two amino
acids aligned here are close to each other on the backbone (amino acids
8 and 12). In order to achieve this alignment, we tweaked nine backbone
angles on three amino acids (9-11). The sizes of the changes are given
in Table 7.1.
On average, the three φ angles were tweaked , the ψ angles , and the
ω angles .
Since we were being as permissive as possible in the construction of
our proteins, we allowed the ω torsion angles to vary just as
easily as the non-ω torsion angles. It is clear that a
naturally occurring protein would not have an ω angle of . It is a
relatively simple matter to freeze those angles during a tweaking
event.
The second example demonstrates a tweaking event where many of the
amino acids are frozen. This tweaking event leads to a parallel
β-sheet. The two bonding amino acids (2 and 32) are far apart on the
backbone. In order to align these two atoms, we tweak a total of 27
backbone bond angles in 9 amino acids (out of a total of 29 possible).
The untweaked amino acids are not changed because they are part of a
β-strand, and tweaking always respects existing secondary structure.
We won't present all 27 changed angles here, but we will summarize the
results.
On average, the nine φ angles were tweaked , the ψ angles , and the
ω angles .
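The residue and angle counts quoted in the two examples can be checked with a short sketch. It is illustrative only: the function name `tweakable_angles`, its parameters, and the idea of passing frozen residues as a set are assumptions made for this example and are not taken from the HOPS implementation. The function lists the backbone angles lying strictly between two target residues, skipping residues marked as frozen and, optionally, every ω angle.

```python
from typing import Iterable, List, Tuple

BACKBONE_ANGLES = ("phi", "psi", "omega")


def tweakable_angles(
    first: int,
    second: int,
    frozen: Iterable[int] = (),
    freeze_omega: bool = False,
) -> List[Tuple[int, str]]:
    """Return (residue, angle) pairs strictly between two target residues.

    Hypothetical helper: residues in `frozen` (for example, those already
    assigned to a strand) contribute no angles, and omega can be excluded.
    """
    lo, hi = sorted((first, second))
    frozen_set = set(frozen)
    names = [a for a in BACKBONE_ANGLES if not (freeze_omega and a == "omega")]
    return [
        (residue, name)
        for residue in range(lo + 1, hi)
        if residue not in frozen_set
        for name in names
    ]


# First example from the text: targets 8 and 12, nothing frozen.
example_one = tweakable_angles(8, 12)

# Second example: count the residues lying between targets 2 and 32.
intervening = len({residue for residue, _ in tweakable_angles(2, 32)})
```

With nothing frozen, the first example yields nine angles on residues 9 to 11, and 29 residues lie between targets 2 and 32, which agrees with the counts given in the text. Passing `freeze_omega=True` leaves six angles in the first example.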
If you consider the figures presented (Figure 7.1), the anti-parallel case has much less slack in its alignment. This requires the tweaking algorithm to wrench the protein around into alignment. The parallel case has more opportunity to alter the intervening angles, so it is able to effect more subtle changes to bring the two strands into alignment.
We can also use our scoring function to determine if these alignments
are energetically advantageous for the protein. We would expect that
they are, as the alignment creates a greater number of hydrogen bonds
and reduces the accessible surface area of the fully extended protein.
To effectively measure this, we have extended the second β-strand by
two additional amino acids to create a β-sheet from the now aligned
β-strands. As Figure 7.2 shows, the score of the tweaked protein
is far better than that of the misaligned protein. This also points
out why tweaking events are less likely to occur when the optimism
(see Section 3.3) of our search is low. While we see
a dramatic improvement from a tweaking event, the energy function will
view the partial fold shown in Figure 7.2 poorly and
would possibly prune that partial solution when optimism is low.
Figure 7.3 displays another capability of tweaking.
In the figure, amino acids 17 and 73 are aligned. In between those
two amino acids, we already have alignments between amino acids 2 and
32, 33 and 40, 43 and 52, and 58 and 63. Tweaking maintained each of
these prior alignments and did not destroy pre-determined structure.
This feature allows us to bring global β-strands into alignment without
threatening more local structures.
Using our specialized tweaking environment, we are able to generate
protein folds with a large number of β-sheets. While this ``solution''
(see Figure 7.4) does not represent an actual
protein, it does show that β-sheets are possible in the HOPS folding
environment. When given the proper inducement, HOPS will locate a
good deal of sheet behavior.
While the examples above show that tweaking can align two β-strands, it
does not imply that all attempts at alignment are successful. In
fact, the vast majority of alignment attempts are unsuccessful. In
most cases, the two atoms tweaking is working to align are too far
apart from each other to align. If there is an insufficient number of
tweakable amino acids between the two targets, there will be no way to
bring the amino acids together even if they are relatively close to
each other.
Even if we can determine a sequence of torsion angles that align the two targets, this contortion will often introduce steric clashes which disqualify the tweaked conformation from consideration (see Figure 7.5).
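A steric filter of this kind can be sketched as a pairwise distance test. The version below is a minimal illustration under stated assumptions, not the criterion used by HOPS: it takes one coordinate per residue, uses a single hypothetical cutoff of 3.0 Å, and ignores residues that are adjacent on the backbone. The names `has_steric_clash`, `min_distance`, and `skip_neighbors` are invented for this example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def has_steric_clash(
    coords: Sequence[Point],
    min_distance: float = 3.0,
    skip_neighbors: int = 1,
) -> bool:
    """Return True if any two non-adjacent points are closer than min_distance.

    Assumptions for illustration: one point per residue and a single
    hypothetical cutoff in angstroms.
    """
    n = len(coords)
    for i in range(n):
        # Start past the backbone neighbors, which are always close together.
        for j in range(i + skip_neighbors + 1, n):
            if math.dist(coords[i], coords[j]) < min_distance:
                return True
    return False


# A fully extended toy chain, 3.8 angstroms between consecutive points.
extended = [(3.8 * k, 0.0, 0.0) for k in range(5)]

# The same chain with the last point folded back onto the second one.
folded_back = extended[:4] + [(3.8, 1.0, 0.0)]

extended_clashes = has_steric_clash(extended)
folded_clashes = has_steric_clash(folded_back)
```

In this toy input the extended chain passes the filter, while the folded-back chain is rejected because its last point sits 1.0 Å from the second point. A tweaked conformation that aligns the two targets but fails such a test would be discarded.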
We have run a series of tests to determine what fraction of
attempted tweaks successfully align the two strands and what
percentage of those then survive without incurring an immediate steric
clash. In the first test, we used our tweak-conducive environment and
ran the algorithm ten times, creating random proteins 20 amino acids in
length. In the second test, we again ran HOPS with tweaking ten times,
creating random proteins 20 amino acids in length. However, this time
we used standard torsion angles and did not use a scoring function
that directly promotes the creation of β-sheets. Instead, we used our
standard scoring function.
In the first test, we attempted to construct a tweak 2,680,746 times,
but were successful only 3,176 times (approximately 0.12% of all
attempts). Of those 3,176 successful alignments, only 711 did not
create steric clashes. In the second test, we were much more
successful. A total of 54,133,738 tweaks were attempted, and 1,453,486
were successful (approximately 2.7%). Of the successful tweaks, 36,274
did not result in a steric clash. This means that approximately 0.07%
of all tweaks in a realistic situation were successful and free of
steric clashes. While this may seem discouraging, a successful
tweaking event does have a significant benefit energetically.
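The success rates follow directly from the counts reported for the two tests. The short calculation below recomputes them; the dictionary keys and the helper `percent` are names chosen for this example only.

```python
def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Counts as reported in the text for the two tests.
tests = {
    "tweak-conducive": {"attempted": 2_680_746, "aligned": 3_176, "clash_free": 711},
    "standard": {"attempted": 54_133_738, "aligned": 1_453_486, "clash_free": 36_274},
}

rates = {}
for name, counts in tests.items():
    rates[name] = {
        # Share of attempts that aligned the two strands.
        "aligned_pct": percent(counts["aligned"], counts["attempted"]),
        # Share of aligned tweaks that introduced no steric clash.
        "clash_free_of_aligned_pct": percent(counts["clash_free"], counts["aligned"]),
        # Share of all attempts that were both aligned and clash-free.
        "clash_free_overall_pct": percent(counts["clash_free"], counts["attempted"]),
    }

for name, values in rates.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```

Separating the alignment rate from the clash-free rate shows where attempts are lost: in both tests most attempts fail at the alignment step, and most of the aligned tweaks are then rejected for steric clashes.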
Naturally, the goal of tweaking is the ab initio prediction
of β-sheet structure in real proteins. While we have established that
tweaking will work under idealized conditions, we will now attempt to
predict the structure of a protein whose structure is already known,
protein G. Protein G was mentioned earlier in Chapter 2. It is a small
protein (56 amino acids) which has a pair of β-sheets with a helix in
between. The two anti-parallel β-sheets then come together to form a
larger sheet. This combination of sheet and helix structure makes it
a quality target for our technique.
The standards of success in ab initio structure prediction
are difficult to define. There are only a few ab initio
predictions in each CASP that can be considered accurate predictions
of a protein's structure. While we do not at the moment have a
program that we would enter into blind tests such as CASP, we are
moving in that direction. As an example of progress in this
direction, we have run tests on protein G with a single caveat. For input
torsion angles, we have utilized the actual angles found in the
protein and a standard sheet angle. This does not tilt things as
dramatically into our favor as it might seem. Only 13 of the 56 amino
acids had between two and four angle choices in this test and there
were 12 possible torsion angles for each of the 11 Thr amino
acids. This guarantees that our discrete search space still contains
well over
possible conformations.
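A lower bound on the size of this discrete search space can be worked out from the counts in the paragraph above. The calculation assumes that the 13 residues with two to four choices are distinct from the 11 Thr residues and that every remaining residue has at least one choice; both are assumptions of this sketch, and the variable names are illustrative.

```python
import math

few_choice_residues = 13  # residues with between two and four angle choices
min_choices = 2           # lower end of that range
max_choices = 4           # upper end of that range
thr_residues = 11         # Thr residues in the sequence
thr_choices = 12          # torsion angle choices per Thr

# Product of per-residue choices, taking the fewest and the most choices
# for the 13 variable residues in turn.
lower_bound = min_choices ** few_choice_residues * thr_choices ** thr_residues
upper_bound = max_choices ** few_choice_residues * thr_choices ** thr_residues

print(f"lower bound: about 10^{math.log10(lower_bound):.1f}")
print(f"upper bound: about 10^{math.log10(upper_bound):.1f}")
```

Under these assumptions the space holds at least 2^13 × 12^11 conformations, roughly 6 × 10^15, so exhaustive enumeration is out of reach even with most residues fixed to their native angles.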
For the results presented here, we ran a test for 80 hours on a
network of 8-10 Pentium computers (varying in speed from 1GHz to
75MHz). All computers utilize the Linux operating system. The best
solution we gathered is not guaranteed to be the optimum as we did not
run to completion. The results are still encouraging, as HOPS
correctly chose the torsion angles for the central helix and also
created β-strands and a pair of β-sheets. HOPS and tweaking brought out
a good deal of the sheet structure found in the native fold, which
contains a pair of anti-parallel β-sheets that then fold together to
form a parallel β-sheet. We have the first anti-parallel β-sheet;
however, the bonding pattern is off slightly, as we have matched amino
acids 7 and 24, compared to 7 and 15 in the native fold. We also match
amino acids 5 and 42 in a parallel β-sheet, compared to amino acids 6
and 52 in the native fold. A printout of the difference in torsion angles is
presented in Appendix D. The actual and
predicted structures are side-by-side in Figure 7.6.