spdenne's solution for T0743 from puzzle 602.

Seal myoglobin appears in Foldit as a contest. Currently, players must be invited to join a contest. Contests are often used as classroom assignments.

As discussed in detail below, it's not clear that the protein in the seal myoglobin contest actually comes from a seal. The Protein Data Bank (PDB) entries that match the protein are associated with Ruminococcus gnavus, a species of bacteria. The name "seal myoglobin" is used here because that's how it appears in Foldit.

Seal myoglobin is presented as a multistart puzzle, with eight different starting poses. The initial poses were generated by automated protein structure prediction tools.

In a multistart puzzle, one of the poses is initially selected at random. Each time Reset Puzzle is performed, the protein is switched to the next starting pose.
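The reset behavior can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not Foldit code: the class name `MultistartPuzzle` and its methods are hypothetical, and wrapping around to the first pose after the last one is an assumption.

```python
import random


class MultistartPuzzle:
    """Toy model of a multistart puzzle (hypothetical, not part of Foldit)."""

    def __init__(self, poses, rng=None):
        self.poses = list(poses)
        if not self.poses:
            raise ValueError("a multistart puzzle needs at least one pose")
        rng = rng or random.Random()
        # The first pose shown is chosen at random.
        self.index = rng.randrange(len(self.poses))

    @property
    def current_pose(self):
        return self.poses[self.index]

    def reset_puzzle(self):
        """Switch to the next starting pose and return it.

        Wrapping back to the first pose after the last one is an assumption.
        """
        self.index = (self.index + 1) % len(self.poses)
        return self.current_pose


# Eight starting poses, numbered 1 to 8 for illustration.
puzzle = MultistartPuzzle(range(1, 9))
first = puzzle.current_pose
visited = [first] + [puzzle.reset_puzzle() for _ in range(7)]
```

Under these assumptions, seven resets after the random first pose are enough to see all eight starting poses once.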

The contest also enables the alignment tool, which lets players mix and match the different starting poses. The shape of part of one starting pose can be applied to part of the protein, and another pose can be applied to a different part of the protein. This process is referred to as "threading" the protein onto a model, or "partial threading" when only certain segments are used.
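The idea of partial threading can be illustrated with a simplified Python sketch. It is not how the alignment tool is implemented: each pose is reduced to a list with one placeholder "shape" per segment, and the function name `thread_segments` and its arguments are hypothetical.

```python
def thread_segments(current, templates, assignments):
    """Return a copy of `current` with segment ranges copied from templates.

    current     -- list with one shape entry per segment
    templates   -- dict mapping a pose name to a list of the same length
    assignments -- iterable of (pose_name, first, last), 1-based inclusive
    """
    result = list(current)
    for pose_name, first, last in assignments:
        template = templates[pose_name]
        if len(template) != len(result):
            raise ValueError("template length must match the protein")
        if not 1 <= first <= last <= len(result):
            raise ValueError("segment range is outside the protein")
        # Copy only the chosen segments; the rest keep their current shape.
        result[first - 1:last] = template[first - 1:last]
    return result


# Six-segment toy protein: "c" is the current shape of each segment.
current = ["c"] * 6
templates = {"pose_a": ["a"] * 6, "pose_b": ["b"] * 6}
mixed = thread_segments(current, templates,
                        [("pose_a", 1, 2), ("pose_b", 5, 6)])
```

Here segments 1 and 2 take their shape from one pose, segments 5 and 6 from another, and segments 3 and 4 are left unchanged.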

Some recipes attempt to select the best pose for a multistart puzzle. See Rav3n_pl SBS v1.2 for an example from 2011. (The "SBS" in the recipe name means "select best start".)

Hovering over a segment and hitting the tab key displays the segment information window, which shows the current structure.

For seal myoglobin, the eight structures are identified as follows:

  • Structure 1: k1
  • Structure 2: k2
  • Structure 3: k3
  • Structure 4: k5
  • Structure 5: g2
  • Structure 6: g3
  • Structure 7: g4
  • Structure 8: g5

The seal myoglobin contest shows "Server models for T0743 Contest" under the score at the top of the Foldit window. The T0743 indicates this protein was a CASP target, specifically during CASP 10 in 2012.

In Foldit, T0743 was presented in Puzzle 602: Server models for T0743. See images for Puzzle 602, which include player-submitted images and images from the PDB. The German version of the Foldit wiki also has T0743 results. There's also an entry for "Target T0743" in CASP 10 Preliminary Results, showing how well Foldit solutions fared in the competition.

The CASP results mention "Zhang" and "Quark" structures, which are apparently different predictions generated by the Zhang lab and its Quark protein structure prediction server.

The protein has 159 segments with the primary structure:


This sequence is a partial match for two PDB entries:

  • 2MC8 - chain A, offset +35
  • 4HYZ - chains A and B, offset +35

4HYZ shows a dimer, with two copies of the protein.

The offset of +35 for 2MC8 and 4HYZ means that segments 1 to 35 of the Foldit protein aren't found in these PDB entries. As a result, segment 1 of the PDB entries is segment 36 of the protein in Foldit. The PDB entries do cover all the rest of the Foldit protein, however.
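The offset arithmetic can be written out as a small Python sketch. The function names are hypothetical, and "PDB position" here means the position counted from the start of the PDB sequence as described above, which may differ from the residue numbers used inside the PDB files themselves.

```python
PDB_OFFSET = 35       # Foldit segments 1 to 35 are absent from the PDB entries
FOLDIT_LENGTH = 159   # number of segments in the Foldit protein


def foldit_to_pdb(segment, offset=PDB_OFFSET, length=FOLDIT_LENGTH):
    """Map a Foldit segment number to a PDB position, or None if not covered."""
    if not 1 <= segment <= length:
        raise ValueError("segment is outside the Foldit protein")
    if segment <= offset:
        return None  # this segment has no counterpart in the PDB entries
    return segment - offset


def pdb_to_foldit(position, offset=PDB_OFFSET, length=FOLDIT_LENGTH):
    """Map a PDB position back to the matching Foldit segment number."""
    segment = position + offset
    if position < 1 or segment > length:
        raise ValueError("position is outside the matched region")
    return segment


first_covered = foldit_to_pdb(36)    # Foldit segment 36 is PDB position 1
last_covered = foldit_to_pdb(159)    # last Foldit segment is PDB position 124
not_covered = foldit_to_pdb(35)      # None: segment 35 is not in the PDB entries
```

With an offset of 35 and 159 segments, the matched region spans 124 positions.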

As mentioned above, both of these PDB entries mention Ruminococcus gnavus, a species of bacteria, as the source of the protein. It appears likely that the "seal myoglobin" designation is an error. The PDB entries describe the protein as "uncharacterized", which likely means the protein was identified via DNA, but its exact function is unknown.