Changes: Diderot's Suggested Method

Revision as of 15:47, 8 July 2010

Diderot's Suggested Method

Note: This page is under construction and will change suddenly and rapidly over the next few days.

Preliminaries

First, try to develop an intuitive sense of what a good fold looks like. Fold.it mostly lacks the computational power of the big, distributed folding programs. It hopes to make up for that lack with plain old human intuition. This is not as far-fetched as it sounds; in the game of Go, for example, even moderately skilled humans can, as of this writing, still routinely beat the best computers -- because there are far too many possible positions for a computer to search exhaustively. Humans win by intuition.

Protein folding appears to be a similarly complex problem. To develop your intuition, you should spend a lot of time looking at successfully solved protein folds. Ubiquitin is a work of art, as is insulin. Scientific journals, the protein data bank, Rosetta@Home and even Wikipedia are excellent sources for these successful folds. Learn from them.

Strive for an aesthetically pleasing shape. Beta sheets flow. They are virtually never in perfectly straight lines. (If they were, the computational math would be a lot easier, and fold.it might not be necessary.) Proteins are not like Le Corbusier's "efficient" buildings. They're not square, straight, and blocky. They're more like Frank Gehry's twisty and curvaceous architecture. And yet there is a logic and a beauty to that flow. A folder is an artist.

This aesthetic sense is what you're going to aim at in every puzzle, and from what I've been able to tell, coming up with a good general idea of what a given protein ought to look like is the single most important thing you can do to raise your score. Having a bad general shape and then making minor adjustments to it may raise your score incrementally, but to really break through requires a sense of the large, important stuff. It requires a sense of the way proteins move, and of the way that they tend to look in their finished form.

While You're Folding

Step One: Alignment and Secondary Structure

The Alignment Tool is very powerful, and I wish we'd had it back when we were doing CASP 8. (Can you imagine CASP without alignment???) This tool should be where you start on every single protein that doesn't already come with a suggested alignment.

That said, do not be a slave to the results of the alignment tool. Even a "successful" alignment is basically a piece of crap. It usually scores deep in the negative tens of thousands, and many, many elements of the protein are likely to be flat-out wrong -- wrong folding, wrong Secondary Structure, wrong Sidechain Position, wrong everything.

Once you get an alignment that is acceptable to the program, take a look at the protein's secondary structures. Don't even try to fold the whole thing globally yet. Instead, go through the entire protein, amino by amino, and make sure you feel comfortable with the secondary structure that the Alignment Tool has assigned to each one.

For example, if you find a Proline or a Glycine in the middle of a Beta Sheet or an Alpha Helix, be suspicious. They are probably not rightly assigned. Consider making them into part of a Loop.
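
The proline/glycine check above can be sketched outside the game as a small Python function. This is only an illustration under stated assumptions: the sequence is given as one-letter amino acid codes, the secondary structure as a string of the same length using 'H' for helix, 'E' for sheet, and 'L' for loop, and "in the middle" is taken to mean that both neighbours share the same structure. The function name and the encoding are hypothetical and are not part of the fold.it client.

```python
def misassigned_breakers(sequence, structure):
    """Flag prolines and glycines sitting inside a helix or sheet.

    sequence  -- one-letter amino acid codes, e.g. "MKPLG"
    structure -- same length; 'H' helix, 'E' sheet, 'L' loop (assumed encoding)
    Returns a list of (index, amino, structure_code) tuples, 0-based.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    flagged = []
    # Skip the first and last residue: they cannot be "in the middle" of anything.
    for i in range(1, len(sequence) - 1):
        inside = (structure[i] in "HE"
                  and structure[i - 1] == structure[i] == structure[i + 1])
        if sequence[i] in "PG" and inside:
            flagged.append((i, sequence[i], structure[i]))
    return flagged


print(misassigned_breakers("MALEKPALEKGG", "HHHHHHHHHLLL"))
# prints [(5, 'P', 'H')]
```

Each flagged position is a candidate for reassignment to a loop, not a verdict; the final call is still a matter of judgment.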

If you find several aminos from the MALEK group (methionine, alanine, leucine, glutamate, and lysine) in the same vicinity, the chances are that they want to be a helix. Your helix could include aminos not in the MALEK group, but most of the time a helix will be noticeably heavier on these five.

Assign them that structure -- including any aminos in between your MALEKs -- and then use the Rebuild tool on them. It will gradually turn this section into a more or less well-formed helix.
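
One way to make "a bunch of them roughly near each other" concrete is a sliding window over the sequence. The sketch below is a rough heuristic, not how the fold.it client assigns structure: the window length of 7 and the minimum count of 5 are arbitrary assumptions chosen for illustration, and the function name is hypothetical.

```python
HELIX_FORMERS = set("MALEK")  # the five one-letter codes named above


def helix_candidates(sequence, window=7, min_count=5):
    """Return (start, end) ranges, end exclusive and 0-based, rich in MALEK aminos.

    window and min_count are illustrative assumptions, not tuned values.
    """
    ranges = []
    for start in range(len(sequence) - window + 1):
        count = sum(1 for aa in sequence[start:start + window] if aa in HELIX_FORMERS)
        if count >= min_count:
            end = start + window
            if ranges and start <= ranges[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((start, end))

    # Trim each range so that it begins and ends on a MALEK amino,
    # keeping any other aminos that fall in between.
    trimmed = []
    for start, end in ranges:
        while start < end and sequence[start] not in HELIX_FORMERS:
            start += 1
        while end > start and sequence[end - 1] not in HELIX_FORMERS:
            end -= 1
        if start < end:
            trimmed.append((start, end))
    return trimmed


print(helix_candidates("GSPGMALEKLAEMKGSPGS"))
# prints [(4, 14)]
```

The returned ranges include the non-MALEK aminos sitting between MALEKs, which matches the advice to assign the whole stretch as a helix before rebuilding it.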

For most alignments generated by the fold.it client, some aminos will be in beta sheets. These aren't terribly difficult to straighten out, as long as you don't have any prolines or glycines in them. Figuring out which aminos need to be in beta sheets is actually much harder, because almost all aminos can do it. A lot of this work will come in step two, and you might often change your mind about which aminos go in and which go out of your betas.

Step Two: Imagining the Tertiary

At this point your alignment will still look like crap and still score very low. Don't worry about it.

Take a bunch of Bands and attach them to the alignment so that they pull everything a bit apart -- you want a better look at the whole thing. Wiggle, and let the bands do their work. Take note of how things seem inclined to move. See which parts of the protein might make good betas, and how these might be brought around to hydrogen bond with one another.

Remember that at this point you are called on to be the most creative and the most thoughtful. This is likely what makes or breaks your fold.

Take your betas and get them to hydrogen bond with one another. You may need to rebuild some of them, and you will almost certainly Tweak them a lot. Consider which sides of these betas are hydrophilic or hydrophobic. A beta sheet with a very hydrophilic side will almost certainly be placed on the outside of the protein, with the hydrophilic side pointing out. Hydrophobic sides of your betas will likely face the inside. A beta sheet that is hydrophobic on both sides will probably lie on the inside of the protein, surrounded by helices and loops, or by other betas.
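
Because sidechains along a beta strand point alternately to one side and then the other, the two "sides" of a strand can be approximated by splitting its aminos into even and odd positions. The sketch below applies the placement rules from the paragraph above under loud assumptions: the hydrophobic set is one common choice rather than the game's own classification, and the 0.6 and 0.4 thresholds are arbitrary. All names are hypothetical.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumed set; classifications vary between sources


def strand_faces(strand):
    """Return the hydrophobic fraction of each face of a beta strand.

    Sidechains alternate sides along a strand, so even positions form
    one face and odd positions the other.
    """
    def fraction(face):
        if not face:
            return 0.0
        return sum(1 for aa in face if aa in HYDROPHOBIC) / len(face)

    return fraction(strand[0::2]), fraction(strand[1::2])


def suggest_placement(strand, high=0.6, low=0.4):
    """Guess where a strand belongs; high and low are illustrative thresholds."""
    even_face, odd_face = strand_faces(strand)
    if even_face >= high and odd_face >= high:
        return "interior"   # hydrophobic on both sides
    if even_face <= low or odd_face <= low:
        return "exterior"   # at least one clearly hydrophilic side, which points out
    return "unclear"


print(strand_faces("VKVTVEV"))       # prints (1.0, 0.0)
print(suggest_placement("VKVTVEV"))  # prints exterior
print(suggest_placement("VIVLVAV"))  # prints interior
```

An "unclear" result is the common case in practice, and it is exactly where looking at the protein and trusting your intuition matters most.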

Drag things around by brute force, making liberal use of rubber bands, wiggles, and tweaks. Only rarely shake the sidechains, and avoid Local wiggles -- at this point you don't want to get too much Mojo. You want to still explore what's out there, and see what possibilities there are -- not to get caught in a local minimum.

When you are satisfied that you have a roughly good framework, remove your rubber bands, unfreeze any sections you've frozen, and wiggle and shake the whole thing into place. If you've done well, your score will rise dramatically, and your protein will stabilize at something like what you meant it to look like.

If it doesn't, then take stock of the situation. It sometimes happens that you get an even better configuration than the one you imagined. And it sometimes happens that you end up with worse. This is where you need to use your intuition again, possibly go back to the drawing board, or consider some rebuilding to make your idea work. Much judgment is required at this stage as well. Repeat this step until you are satisfied and you have a stable, compact overall shape.

Step Three: Unhappy Aminos and Other Fixable Problems

One of the first things you'll notice is that many amino acids are still low-scoring, "unhappy" parts of the fold. You'll want to address these next. It could be that they don't really belong in the secondary structure you've given them. Or it could be that the structure itself needs to be rebuilt. Or you might not even be able to say quite what's wrong with them.

There are a variety of ways to deal with this, but I find that small, highly constrained rebuilds are often the best. Whichever approach you choose, now is the time to do it. Be creative and go for it.
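
To pick targets for small, constrained rebuilds, it can help to group the unhappy aminos into short contiguous windows. The sketch below assumes you have a list of per-amino scores from somewhere (the numbers in the example are made up), treats anything below a cutoff of 0 as "unhappy", and pads each one by a single neighbour on either side. The cutoff, the padding, and the function name are all illustrative assumptions.

```python
def rebuild_windows(scores, cutoff=0.0, pad=1):
    """Return merged (start, end) ranges, end exclusive and 0-based.

    Each range covers the aminos scoring below cutoff plus `pad`
    neighbours on each side; overlapping or touching ranges are merged.
    """
    windows = []
    n = len(scores)
    for i, score in enumerate(scores):
        if score < cutoff:
            start = max(0, i - pad)
            end = min(n, i + pad + 1)
            if windows and start <= windows[-1][1]:
                # Overlaps or touches the previous window: extend it.
                windows[-1] = (windows[-1][0], max(windows[-1][1], end))
            else:
                windows.append((start, end))
    return windows


# Made-up per-amino scores for a nine-amino stretch.
example_scores = [12.0, 8.5, -20.0, -3.0, 15.0, 9.0, 7.0, -11.0, 6.0]
print(rebuild_windows(example_scores))
# prints [(1, 5), (6, 9)]
```

Keeping `pad` small is the point: the aim is a handful of tight windows to rebuild one at a time, not one large region that disturbs the rest of the fold.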

Hydrophobicity will likely remain a problem at this stage, and you will need to address it as well.... (yes, more is needed here....)

New structural ideas may emerge. Don't be afraid to rethink your structure.

Mojo can often be your friend here... (explain why)

After every change in this step, you will want to shake the sidechains, either before or after you wiggle the change into place. (I usually try both.) Sometimes, you'll want to move a sidechain manually and then wiggle, especially if you are trying to fill a Void.

Step Four: Finishing with Scripts

Once you've fixed your lower-scoring aminos, your protein should be mostly green. It's at this point that you will want to consider local wiggle strategy, various tweaking scripts (I like Pletsch's Acid Tweaker), and other Endgame scripting tools. Each will raise your mojo to the point where relatively few of your conscious changes will result in score increases, so you do want to be careful here. Do not script too soon, and do not over-script. One thing I like about Pletsch's Acid Tweaker is that it doesn't seem to do this to the protein quite so much as others I've tried.

If You Get Stuck

If you become dissatisfied with the fold you've created, do one of two things: Either work on a different puzzle, or go back to the beginning and try a different alignment. Either way, save your work, because you never know when an idea will occur to you about a structure that formerly looked unpromising.

@@ Line 1: / Line 1: @@
+== Diderot's Suggested Method ==
-== Update ==
+Note:  This page is under construction and will change suddenly and rapidly over the next few days.
-I read this over just now and realized that this isn't really how I play FoldIt anymore.  My approach to the game is very different from what it was a few weeks ago.  I'll be updating this page soon. [[User:FI Diderot|FI Diderot]] 15:10, 27 September 2008 (UTC)
+===Preliminaries===
-== Diderot's Suggested Method ==
+First, try to develop an intuitive ''sense'' of what a good fold looks like.  Fold.it mostly lacks the computational power of the big, distributed folding programs.  It hopes to make up for that lack with plain old human intuition.  This is not as far-fetched as it sounds; in the game of go, for example, even moderately skilled humans still routinely beat the best computers -- because there is just too much math for the computers to do.  Humans win by intuition.
+Protein folding appears to be a similarly complex problem.  To develop your intuition, you should spend a lot of time looking at successfully solved protein folds.  [[Ubiquitin]] is a work of art, as is [[insulin]].  Scientific journals, the [http://www.pdb.org/pdb/home/home.do protein data bank], [[Rosetta@Home]] and even Wikipedia are excellent sources for these successful folds.  Learn from them.
+Strive for an [[aesthetics | aesthetically pleasing]] shape.  [[Beta sheets]] flow.  They are virtually never in perfectly straight lines.  (If they were, the computational math would be a lot easier, and fold.it might not be necessary.)  Proteins are not like Le Corbusier's "efficient" buildings.  They're not square, straight, and blocky.  They're more like Frank Gehry's twisty and curvaceous architecture.  And yet there is a logic and a beauty to that flow.  A folder is an artist.
+This aesthetic sense is what you're going to aim at in every puzzle, and from what I've been able to tell, coming up with a good general idea of what a given protein ought to look like is the single most important thing you can do to raise your score.  Having a bad general shape and then making minor adjustments to it may raise your score incrementally, but to really break through requires a sense of the large, important stuff.  It requires a sense of the way proteins move, and of the way that they tend to look in their finished form.
+===While You're Folding===
+====Step One:  Alignment and Secondary Structure====
+The [[Alignment Tool]] is very powerful, and I wish we'd had it back when we were doing CASP 8.  (Can you imagine CASP without alignment???)  This tool should be where you start on every single protein that doesn't already come with a suggested alignment.
+That said, do not be a slave to the results of the alignment tool.  Even a "successful" alignment is basically a piece of crap.  It usually scores deep in the negative tens of thousands, and many, many elements of the protein are likely to be flat-out wrong -- wrong folding, wrong [[Secondary Structure]], wrong [[Sidechain Position]], wrong everything.
+Once you get an alignment that is acceptable to the program, a look at the protein's secondary structures.  Don't even try to fold the whole thing globally yet.  Instead, go through the entire protein, amino by amino, and make sure you feel comfortable with the secondary structure that the Alignment Tool has assigned to each one.
+For example, if you find a [[Proline]] or a [[Glycine]] in the middle of a [[Beta Sheet]] or an [[Alpha Helix]], be suspicious.  They are probably not rightly assigned.  Consider making them into part of a [[Loop]].
+If you find several aminos from the [[MALEK]] group in the same vicinity, consider making them into a helix.  Your helix could include aminos not in the MALEK group, but most of the time a helix will be noticeably heavier on these five.
+If you see a bunch of them roughly near each other, then the chances are that they want to be a helix.  Assign them that structure -- including any aminos in between your MALEKs -- and then use the [[Rebuild]] tool on them.  It will gradually turn this section into a more or less well-formed helix.
+For most alignments generated by the fold.it client, some aminos will be in beta sheets.  These aren't terribly difficult to straighten out, as long as you don't have any prolines or glycines in them.  Figuring out which aminos need to be in beta sheets is actually much harder, because almost all aminos can do it.  A lot of this work will come in step two, and you might often change your mind about which aminos go in and which go out of your betas.
+====Step Two:  Imagining the Tertiary====
+At this point your alignment will still look like crap and still score very low.  Don't worry about it.
+Take a bunch of [[Band]]s and attach them to the alignment so that they pull everything a bit apart -- you want a better look at the whole thing.  [[Wiggle]], and let the bands do their work.  Take note of how things seem inclined to move.  See which parts of the protein might make good betas, and how these might be brought around to [[hydrogen bond]] with one another.
+Remember that at this point you are called on to be the most creative and the most thoughtful.  This is likely what makes or breaks your fold.
+Take your betas and get them to hydrogen bond with one another.  You may need to rebuild some of them, and you will almost certainly [[Tweak]] them a lot.  Consider which sides of these betas are [[Hydrophobicity | hydrophilic or hydrophobic]].  A beta sheet with a very hydrophilic side will almost certainly be placed on the outside of the protein, with the hydrophilic side pointing out.  Hydrophobic sides of your betas will likely face the inside.  A beta sheet that is hydrophobic on both sides will probably lie on the inside of the protein, surrounded by helices and loops, or by other betas.
+Drag things around by brute force, making liberal use of rubber bands, wiggles, and tweaks.  Only rarely shake the sidechains, and avoid [[Local wiggle]]s -- at this point you don't want to get too much [[Mojo]].  You want to still explore what's out there, and see what possibilities there are -- not to get caught in a local minimum.
+When you are satisfied that you have a roughly good framework, remove your rubber bands, unfreeze any sections you've frozen, and wiggle and shake the whole thing into place.  If you've done well, your score will rise dramatically, and your protein will stabilize at something like what you meant it to look like.
+If it doesn't, then take stock of the situation.  It sometimes happens that you get an even better configuration than the one you imagined.  And it sometimes happens that you end up with worse.  This is where you need to use your intuition again, possibly go back to the drawing board, or consider some [[Rebuild | rebuilding]] to make your idea work.  Much judgment is required at this stage as well.  Repeat it until you are satisfied and you have a stable, [[Compactness | compact]] overall shape.
+====Step Three:  Unhappy Aminos and Other Fixable Problems====
-Each folder is going to come up with his own style of folding, most likely.  This is natural and only to be expected.  Diderot posted the following several weeks ago, but some of it is not completely up to date.  Edits would be welcome!
+One of the first things you'll notice is that many amino acids are still low-scoring, "unhappy" parts of the fold.  You'll want to address these next.  It could be that they don't really belong in the secondary structure you've given them.  Or it could be that the structure itself needs to be rebuilt.  Or you might not even be able to say quite what's wrong with them.
-What follows is a sort of loose algorithm: It's a set of procedures that you may, and perhaps should, deviate from at any time. In folding I may start or re-start from any one of these suggestions, or I may do something not listed. Still, this is how I generally try to do it.
+There are a variety of ways to deal with this, but I find that small, highly constrained rebuilds are often the best.  There are a nearly infinite number of ways of dealing with these aminos, but the fact is, now is the time to do it.  Be creative and go for it.
-First, consider the [[beta sheets]]. These are the "flat" pieces. A good fold tends to see many of them aligned with one another and sharing one or more [[hydrogen bonds]]. I've read that beta sheets sharing hydrogen bonds tend to have a higher bond strength (and thus presumably higher scores) when the sheets are on the inside of the protein among the hydrophobic bits, but I have not found this to be the case in fold.it proteins so far -- more often, the beta sheets seem to be on the outside. Even when I have put them on the inside on purpose, they seem to migrate outward as I adjust things for score increases.
+Hydrophobicity will likely remain a problem at this stage, and you will need to address it as well.... (yes, more is needed here....)
-You can't just h-bond the betas however you like, though. Above all, try to arrange them so that their hydrophilic stuff is more outside than in. Note that the easiest way to manipulate the shape of a beta is simply to grab the things it's attached to, and pull gently. Pulling the beta itself doesn't work well, because -- apparently -- there isn't the leverage given the tools we now have. The shape of a beta depends almost entirely on the configuration of things attached to and around it.
+New structural ideas may emerge.  Don't be afraid to rethink your structure.
-Strive for an [[aesthetics | aesthetically pleasing]] shape (I'm serious about this). Beta sheets flow. They're not like Le Corbusier's "efficient" buildings, but more like Frank Gehry's twisty and curvaceous architecture. Don't even try ironing them out. Once your beta sheets have an appropriately pleasing look, you'll want to try to fold everything else into a compact mass around them, with the hydrophobic elements facing inward. Do this slowly and gradually, and keep your hydrogen bonds intact by rubber banding them together while you work on other parts, otherwise the h-bonds will fall apart.
+Mojo can often be your friend here...  (explain why)
-If you're totally mystified about beta sheet aesthetics, take a look at some proteins whose structures are already known. [[Ubiquitin]] is a work of art, as is [[insulin]]. Browse wikipedia for others. [[Rosetta@Home]] is another excellent source of inspiration.
+After every change in this step, you will want to shake the sidechains, either before or after you wiggle the change into place.  (I usually try both.)  Sometimes, you'll want to move a sidechain manually and then wiggle, especially if you are trying to fill a [[Void]].
-After you've dealt with beta sheet aesthetics and reached a decently compact form for everything else, shake the side chains. I may be wrong here, but I have found that shaking is not terribly useful, EXCEPT in two cases: after you've radically rejiggered the entire structure, or after you've improved your overall score. After little adjustments that either decrease your score or leave it the same, you almost never gain anything from shaking. Don't waste your time on it.
+====Step Four:  Finishing with Scripts====
-If you've got a good build, you will know it when you [[shake sidechains | shake]] -- the score will go up a lot, sometimes by 100 points, sometimes even more. If this happens, try wiggling again. Sometimes the score will increase yet again, but there tend to be diminishing returns here. Once your protein reaches this point, reassess the aesthetics, the [[Hydrophobicity | hydrophilic/hydrophobic]] areas, and the [[compactness]]. Adjust as needed. Repeat until you get stuck. Then move to the next step...
+Once you've fixed your lower-scoring aminos, your protein should be mostly green.  It's at this point that you will want to consider local wiggle strategy, various tweaking scripts (I like Pletsch's Acid Tweaker), and other [[Endgame]] scripting tools.  Each will  raise your mojo to the point where relatively few of your conscious changes will result in score increases, so you do want to be careful here.  Do not script too soon, and do not over-script.  One thing I like about Pletsch's Acid Tweaker is that it doesn't seem to do this to the protein quite so much as others I've tried.
-Turn on the "[[show voids]]" option. Look at where the [[voids]] are, look at where you've got stuff dangling out, and try to make the one meet the other. Or else try to push the same element that's dangling in such a way that another bit will pop into a void. Whenever you achieve this, your score will go up. Be sure to shake the sidechains after every score gain you earn from this and other methods.
-I seldom rotate the sidechains, because I find that 1) I don't gain much from it and 2) I can't easily predict what a rotate-[[wiggle]] combo will do. Indeed, it's pretty much a shot in the dark. Others disagree with me on this and rotate-wiggle as a way of moving the [[backbone]] elements. Me, if I want to move the backbone, left-click and drag it where I want it.
+====If You Get Stuck====
-If you become dissatisfied with the fold you've created, do one of two things: Either work on a different puzzle, or go back to the beginning and try a different configuration. Either way, save your work, because you never know when an idea will occur to you about a structure that formerly looked unpromising.
+If you become dissatisfied with the fold you've created, do one of two things: Either work on a different puzzle, or go back to the beginning and try a different alignment. Either way, save your work, because you never know when an idea will occur to you about a structure that formerly looked unpromising.
-Learning the interface is important in accomplishing all of these tasks. Here is what to do if you consistently have trouble making the protein do what you want it to do, even when you're not working in close quarters: Disregard the score and set yourself a goal for the protein's appearance. Use the tools to make a smiley face, or a letter, or a geometric form. Doing this for a while will teach you better how to use the interface to manipulate a protein, and then you'll be ready for the harder stuff.