An approach to creating Hydrogen Bond Networks.
Creating Hydrogen Bond Networks from scratch can present problems. You might end up with one by chance in the normal course of protein manipulation, which is gratifying when it happens, but this is rather unlikely. One approach is to try to get a construct like a serine triangle. Some amino acids (tyrosine, threonine, serine) have one hydrogen bond donor and one acceptor at the same position, and these can combine to produce a network: many examples can be found in the wiki. They have a couple of big advantages: they're relatively easy to create and they're 100% satisfied in terms of polar atoms, so they score well, although you've got to get the geometry right. (Note: their status is a bit unclear right now.) More complex networks present their own set of challenges: this page shows one way of going about it.
1) Create the monomer.
It's best first to get the monomer scoring well and reasonably stable. This means not using endgame scripts like Local Wiggle (you can run these at the end as usual) but does mean trying to get the maximum score (or close to it) on all the filters. Life will be a lot easier later on if you can manage this.
The basic structure is created using the Blueprint Palette. Remember to turn View Symmetric Chains off while dragging and dropping elements from the Building Blocks (thanks to Vakobo). After an initial Mutate/wiggle, etc., my favourite script sequence goes something like this: Idealize, then Local Mutate, then Helix Twister (and/or Jolter for sheet-heavy monomers). I don't like to run these scripts to exhaustion (except perhaps Idealize) but rather cancel them after ten or fifteen minutes. At intervals, particularly during the early stages, it's a good idea to eyeball the protein and see if there is anything "wrong" with it. The word wrong here covers a multitude of problems, but it often means that the scripts are pushing the protein into a pose different from what was originally intended, or perhaps that the helices are getting a bit ragged-looking. In such a case it's almost always best to do a bit of hand-folding, or perhaps make judicious use of the IdealizeSS tool, to get the protein back on track. You'll lose a few points in the process but more often than not will rapidly get them back, plus a few more, when rerunning scripts.
Rebuild scripts don't seem all that productive for this kind of puzzle, perhaps because you're building it out of ideal elements. Still, running them relatively late in the process, perhaps with remix lengths of 4 or 5, can be productive. Cut and Wiggle later in the game is also a favourite of mine.
Dealing with unsatisfied filters.
Most of the time the filter scores will go up in the natural course of running scripts and hand-folding, and you don't have to worry about them too much. Sometimes, though, you'll get stubborn situations where the filter penalties refuse to go away. In such cases there are various ways of dealing with the problem.
a) Core Existence filter.
If this is unsatisfied then the protein needs to be packed tighter. It may seem natural to use a Quake-style script to do this but it's often quicker to do it manually. For example, if the protein is a helix bundle you can make a cutpoint at one of the helix-helix turns, then carefully rotate the protein around one of the new endpoints while watching the Core Existence filter score. When it's at its maximum (or close to it) you can close the cutpoint, wiggle the sidechains, mutate, wiggle, etc. and hope the Core Existence filter value stays stable. The helix-helix loop may have changed slightly, but there seems to be some tolerance in the Ideal Loops filter, so a drop in the Ideal Loops filter score is unlikely.
b) Ideal Loops filter.
You can try repeating drag and drop in Blueprint to recreate the original ideal loop, but all too often this seems to make too big a change to the protein. A better way is to invoke Remix on the offending loop. If Remix finds no replacement (as often seems to be the case with the small loops defined by the Building Blocks tool), then change one or more of the adjacent residues to loop and try Remix again. Flip through the list of new structures generated by Remix while watching the Ideal Loops filter score for any jump. Select the likeliest-looking gainer and mutate/wiggle, etc.
c) Buried Unsats filter.
Still a discovery in progress, but try mutating the residue with the buried unsatisfied polar to something non-polar. If that doesn't work, try mutating neighbouring residues to something non-polar (this may disrupt the environment enough to pacify the filter algorithm), or perhaps to aspartate or asparagine to create a bond to the unsatisfied polar.
Small point: sometimes you end up with buried polars on the inside of helix-helix turns. It seems that the next-to-top loop (GBB in ABEGO colours) in the Building Blocks menu avoids these best, but it's early days...
2) Create the trimer
Drag the units of the monomer together in whatever way pleases you and mutate, wiggle etc. until things are stable. One point here: you get better results later on if the secondary structure elements are at a bit of an angle to one another. It may seem natural to have all the helices in a trimer co-axial with one another but this seems to make it harder to ultimately create networks, presumably because of geometric constraints.
Then run Local Mutate for a few iterations: the script tends to produce its best gains early on. After a bit, change the position of the monomer slightly and repeat: sometimes small changes in position or orientation end up making a big difference in score.
3) Create some skeleton networks.
This is where scripts come in (I've found it almost impossible otherwise). Specifically HNetwork Probe or HNetwork Probe 2: these are kludgy scripts but they do the job for now. Changes appear likely in the way HNetworks are treated in Foldit and this is a bit of a disincentive to tidy things up.
The idea is to select a bunch of amino acids in the protein, mutate them and wiggle the sidechains. If by chance they form a network then great: save the pose in a quicksave slot for future examination and try again. If not, mutate something and try again.
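The loop described above can be sketched in a few lines of Python. This is only a toy model of the idea, not the actual HNetwork Probe code (which is a Foldit recipe): the names `probe_networks`, `toy_bonus` and `POLAR_CANDIDATES` are made up for illustration, the candidate letters are an assumed set of polar amino acids rather than the script's real default list, the bonus function is a stand-in for Foldit's network scoring, and stopping when the slots run out is an assumption too.

```python
import random

# Assumption: an illustrative set of polar one-letter codes,
# not the script's actual default replacement list.
POLAR_CANDIDATES = "STYNQDEHW"


def probe_networks(sequence, positions, network_bonus,
                   first_slot, last_slot, attempts=1000, seed=0):
    """Randomly mutate the selected positions and keep every pose that
    gains network bonus, one pose per (hypothetical) quicksave slot."""
    rng = random.Random(seed)
    baseline = network_bonus(sequence)
    slots = {}          # slot number -> (gain, mutated sequence)
    seen = set()        # avoid storing the same sequence twice
    next_slot = first_slot
    for _ in range(attempts):
        if next_slot > last_slot:
            break       # assumption: stop once every slot is used
        trial = list(sequence)
        for pos in positions:
            trial[pos] = rng.choice(POLAR_CANDIDATES)
        trial = "".join(trial)
        gain = network_bonus(trial) - baseline
        if gain > 0 and trial not in seen:
            seen.add(trial)
            slots[next_slot] = (gain, trial)
            next_slot += 1
    return slots


def toy_bonus(seq):
    # Toy rule: pretend a network forms when positions 2 and 5 are both serine.
    return 600 if seq[2] == "S" and seq[5] == "S" else 0


saved = probe_networks("AAAAAAAA", [2, 5], toy_bonus,
                       first_slot=13, last_slot=18, attempts=5000)
for slot, (gain, seq) in sorted(saved.items()):
    print(f"Found network ( gain {gain} ) storing in qs {slot}")
```

The sketch also hints at why a pure trial-and-error search gets slow as more residues are selected: every extra position multiplies the number of possible combinations by the number of candidate amino acids.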
Before running HNetwork Probe, it's necessary to select the sidechains to be mutated as above. In practice, you'll want to select those residues that lie near the interface between chains. You can list the sidechains in the script or, more conveniently, use the "Select unfrozen" option: this is enabled by default in version 2 of the script. Freeze all the sidechains (shift double-click on the sidechains is helpful here: there are also scripts such as FreezeSome that can be used). Then unfreeze manually those sidechains that look as if they might potentially form networks. Because of the inefficient way the script operates you'll probably want to limit the number of residues to 20 or so.
At this point the protein will look something like this:
Now run HNetwork Probe 2
HNetwork Probe 2 dialog.
If you've set up the protein with partially frozen sidechains as described and illustrated above you're pretty much ready to go: the default settings work quite well. For the sake of completeness though here's a list of the settings:
Select unfrozen: if checked, the script mutates the unfrozen sidechains, as described above. If not, see the next setting.
Relevant only if Select unfrozen above is not selected. List the residues to mutate, separated by commas (ranges are allowed), in the way already used by many other scripts, e.g. 34,36,41-44.
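For anyone unfamiliar with the comma-and-range notation, here is a minimal Python sketch of how a list such as 34,36,41-44 expands into individual residue numbers. The function name `parse_residue_list` is made up for illustration, and the script's own parsing may differ in details such as how it treats duplicates or spaces.

```python
def parse_residue_list(text):
    """Expand a string like '34,36,41-44' into a sorted list of residue numbers."""
    residues = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range is backwards: {part!r}")
            residues.update(range(start, end + 1))  # ranges are inclusive
        else:
            residues.add(int(part))
    return sorted(residues)


print(parse_residue_list("34,36,41-44"))
```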
The list of possible replacements. All amino acids that contain a donor or acceptor in their sidechain are included except lysine (K) and arginine (R); these have a lot of polar atoms and it's hard to satisfy them all. Furthermore, you don't want them in the interior of the protein, where they're likely to score terribly. Still, maybe they're worth adding.
When mutating, sets a lower bound for the sidechain partial score. There's little point getting nice networks if the sidechains are too badly positioned. (The script attempts to keep to this criterion but doesn't always succeed.)
Start and end quick save slots.
The default values are chosen only so that you don't erase commonly used slots and don't end up with too many networks to examine.
Once the script starts, best to go to the library for a couple of hours. When you return, and assuming you have done nothing recently to offend the folding gods, you'll see something like this in the script output:
Init HNet score 0
Found network ( gain 600 ) storing in qs 13
Found network ( gain 450 ) storing in qs 14
Found network ( gain 450 ) storing in qs 15
Found network ( gain 1000 ) storing in qs 16
Found network ( gain 900 ) storing in qs 17
Found network ( gain 600 ) storing in qs 18
Cancel the script: the quicksave slots containing new networks will then be loaded into the undo buffer for convenient reference.
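If you copy the script output into a text file, a few lines of Python will list the slots in order of gain so you know which to look at first. This is just a convenience sketch: the function names are made up for illustration, and the pattern is based only on the sample output shown above, so it may need adjusting if the script's wording changes.

```python
import re

# Sample output copied from the listing above.
LOG = """Init HNet score 0
Found network ( gain 600 ) storing in qs 13
Found network ( gain 450 ) storing in qs 14
Found network ( gain 450 ) storing in qs 15
Found network ( gain 1000 ) storing in qs 16
Found network ( gain 900 ) storing in qs 17
Found network ( gain 600 ) storing in qs 18
"""

FOUND = re.compile(r"Found network \(\s*gain\s+(-?\d+)\s*\)\s*storing in qs\s+(\d+)")


def parse_probe_output(text):
    """Return (slot, gain) pairs in the order they appear in the output."""
    hits = []
    for line in text.splitlines():
        match = FOUND.search(line)
        if match:
            hits.append((int(match.group(2)), int(match.group(1))))
    return hits


def best_first(hits):
    """Sort by gain, highest first; ties keep their original order."""
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


for slot, gain in best_first(parse_probe_output(LOG)):
    print(f"qs {slot}: gain {gain}")
```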
4) Examine the networks
Make sure Show Hydrogen Bond Network is checked (and Show Sidechains is unchecked to avoid clutter) and scroll back through the Undo buffer to take a look at the networks. When evaluating them, some of the things you might want to think about are:
1) The Network bonus. Important of course and if you happen to have the maximum bonus (1800 in the current puzzles) there's little need to look further. More often than not though this is a secondary consideration.
2) Percent polars satisfied. 100% is best of course, but networks like this can be hard to extend. You might want to select one that is less than 100% satisfied, particularly if it looks like there's scope for extending it.
3) Networks that have badly positioned sidechains, such as aromatic rings pressing against a helix, might well be discarded. Still, before doing so, it might be worth going through the procedure described at the bottom of this section to see if there's any improvement.
4) Aesthetics. There's something very appealing about a well-formed network that spans a good fraction of the interface between the monomers and even if these may not score as well as others they're still often worth sharing with scientists.
A quick look at the networks produced above and some random thoughts on each:
Quicksave 13: a nice-looking glutamine triangle. The good: it's 100% satisfied and you have bonding to the backbone: it looks a little strange but the scientists find this sort of thing interesting. The bad: the sidechain score for the glutamine is -265. Ouch! Maybe that'll go away in the process of manipulation (described below), but if it doesn't it's a bit of a showstopper.
Quicksave 14: a tyrosine triangle. The good: it's 100% satisfied and the tyrosine looks well placed. The bad: the scientists think there's an issue with networks composed of 3 hydroxyl groups. Serine and threonine triangles are mentioned at the end of Foldit Lab report 11 but tyrosines aren't, so the status of tyrosine triangles seems a little unclear. Update: tyrosine triangles, along with serine and threonine triangles, are now frowned upon, so this will probably have to be discarded.
Quicksave 15: a rather unattractive network with a couple of tryptophans and a glutamate. Tryptophans are quite useful in the sense that they have a single donor atom only, so if you can get a bond to that it'll act as a "terminating node" to the network and will also contribute to the 100% polars satisfied goal. However, tryptophans are bulky and hydrophobic and really like to be on the inside of the protein. The good: if you could get some extra bonds to the glutamate in the middle you might be able to extend the network. The bad: it's only 75% satisfied and those tryptophans look badly positioned.
Quicksave 16: an interesting-looking network with 2 histidines and an aspartate. The good: it scores well (1000 points and 83% satisfied). The bad: it looks like there may be an issue extending the network. Histidine has two nitrogens in its sidechain ring that, depending on pH, can act as donors or acceptors, and only one is shown. It's possible that they are both close enough to the outside to make bonds with water, in which case the fact that the network is not 100% satisfied doesn't matter from a science perspective, only in a Foldit scoring sense.
Quicksave 17: interesting, with two tyrosines, a histidine and an aspartate forming a network (score 900, 75% satisfied). The good: it's interesting, has a decent score, and if you could get more polars satisfied you could end up with a really nice-looking network. The bad: it's not easy in this sort of setup to see which are the unsatisfied polars, which makes progressing a bit hit or miss. Also the histidine looks as if it might be clashing with an adjacent helix. This might be a good network to manipulate manually.
Quicksave 18: composed of a couple of tryptophans and an aspartate, similar to 15. The good: unlike 15 this one is 100% satisfied, and it also looks as if the tryptophans are better positioned as regards the inside of the protein. The bad: it's possible the tryptophans may turn out to be too exposed. Other than that it looks nice. However, it scores 600 and can't be extended readily as it's 100% satisfied: you'd have to create another network elsewhere in the protein to get the full bonus.
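The bonus and percent-satisfied figures quoted above can be put into a rough triage. The Python sketch below is only one way of sorting candidates, using the first two criteria from the list (bonus and percent polars satisfied); it can't judge sidechain placement or aesthetics. The names `Candidate` and `triage` are made up for illustration, and the slot numbers are an inference from the order of the script output (13 to 18), so treat them as an assumption.

```python
from dataclasses import dataclass

MAX_BONUS = 1800  # maximum network bonus quoted in the text for current puzzles


@dataclass(frozen=True)
class Candidate:
    slot: int               # quicksave slot (inferred from the output order)
    bonus: int              # network bonus reported by the script
    percent_satisfied: int  # percent polars satisfied


def triage(candidates):
    """Split candidates into finished, extendable and closed groups.

    finished:   already at the maximum bonus, little need to look further
    extendable: below 100% satisfied, so there may be scope to extend
    closed:     100% satisfied, so hard to extend
    """
    by_bonus = sorted(candidates, key=lambda c: c.bonus, reverse=True)
    finished = [c for c in by_bonus if c.bonus >= MAX_BONUS]
    rest = [c for c in by_bonus if c.bonus < MAX_BONUS]
    extendable = [c for c in rest if c.percent_satisfied < 100]
    closed = [c for c in rest if c.percent_satisfied == 100]
    return finished, extendable, closed


NETWORKS = [
    Candidate(13, 600, 100), Candidate(14, 450, 100), Candidate(15, 450, 75),
    Candidate(16, 1000, 83), Candidate(17, 900, 75), Candidate(18, 600, 100),
]

finished, extendable, closed = triage(NETWORKS)
print("extendable:", [c.slot for c in extendable])
print("closed:", [c.slot for c in closed])
```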
The best way I've found of progressing from these quicksave poses is as follows: unfreeze all, freeze the sidechains that make up the network, Mutate, Wiggle All (hoping the network hangs together), then Mutate again. If you happen to notice the HNet bonus dropping during the Mutate step, you might want to back up and check Show Sidechain bonds. The scoring function for HNetworks is a little arcane and sometimes sidechains that don't appear as part of the network do in fact make a contribution: such sidechains also need to be frozen.
Once you have a network, it's often a good idea to keep the sidechains comprising it frozen.
5) Extending the network.
Once you actually have a network to play with, it's worth playing around a bit with manual mutations in or near the network. You might try substituting threonine for serine perhaps, or aspartate for asparagine, or try mutating neighbouring residues to satisfy any unsatisfied polars, perhaps using the Pick Sidechain tool to help. The results can be surprising and not always intuitive.
It's also possible to use HNetwork Probe to extend the network. Freeze all the sidechains, including those that make up the network, and then unfreeze only those in the immediate neighbourhood of the network.
6) Further material
The original blog post is here: https://fold.it/portal/node/2000666
From bkoep's monthly video updates:
Foldit Lab report #3 (around 4:20)
Foldit Lab report #4 (around 2:50)
Foldit Lab report #5 (around 2:40)
Foldit Lab report #6 (around 7:40)
Foldit Lab report #10 (around 4:10)
Foldit Lab report #11 (around 3:40)