The key secret to finding best structures: Exploration

Started July 16, 2009 by zoran

zoran Lv 1

July 16, 2009

This is a story about Puzzle 138 (Rosetta Decoy 2). It is fascinating because it shows where we as developers should be focusing our design energy, and it also indicates that a change of game-playing strategy may benefit all results. In short: pay close attention to solutions that have a relatively good score (though perhaps not the best) but appear noticeably different in structure from other solutions.

To demonstrate the point, here is the plot of the initial solutions of Puzzle 138. The vertical axis is the Foldit score (expressed as a Rosetta energy function, where lower is better). The horizontal axis is the root mean square distance of the backbone atoms, so the smaller it is, the closer we are to the solution in the sense of per-atom distance. The blue dot is the native structure from X-ray crystallography, so it could still have some slight errors, but it's the closest we know how to get to the real thing. The graph shows a large group of solutions that are slight modifications of the starting solution; that is, the atoms don't really move much. Still, we see that to get to the native, a lot of movement needs to happen (towards the left on the graph). In fact, the Foldit solutions are hitting a virtual wall, where the score keeps dropping without approaching the solution. You will also notice that a lone explorer has a solution that is not the best according to the Foldit score, but on the graph appears much closer to the solution.
Solution of Puzzle 138
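
For readers unfamiliar with the horizontal axis, here is a minimal sketch of a root mean square distance calculation. It assumes the two structures are already superimposed and that each is given as a list of (x, y, z) backbone coordinates; the function name `ca_rmsd` and the sample coordinates are hypothetical, and this is not necessarily how the plot above was produced.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root mean square distance between two equally long lists of (x, y, z) points.

    Assumption: the structures are already aligned; no superposition is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the matching pair of atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical three-atom example: the model is the native shifted by 1 unit along x.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(ca_rmsd(native, model))  # 1.0
```

A value of 0 means the two structures coincide atom for atom; larger values mean the backbone has to move further to match.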

The question in our mind was this: if we started from this lone island discovered by Foldit players, could the players do much better? So we posed another puzzle with the starting point shown in black in the second image. The answer is emphatically yes! The cyan points on the graph show the solutions after the second puzzle, which effectively show that the solution was found.

This finding holds a key to all successful strategies. It's good to explore, and to actually move the protein in significant ways. You may get more points with hard-to-notice shakes and wiggles, but to truly explore better solutions one must venture into the unknown. It also shows that although a solution may not be good right away, after a bit more game play a significantly better solution may be found. So, exploration is the key.

In the near future we will focus on ways to encourage exploration within the game play. For Foldit players, I would suggest paying close attention to solutions that are relatively close in score to the best known solution (and likely somewhat worse), but that are significantly different in shape. Every time one of these solutions is found, it should be carefully explored. Chances are the true native solution is nearby.
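
The suggestion above can be sketched as a simple filter. This is only an illustration under stated assumptions: each solution carries a score (lower is better, as on the plot) and a precomputed structural distance to the best-scoring solution. The field names, the two thresholds, and the sample numbers are all hypothetical and do not come from Foldit.

```python
def exploration_candidates(solutions, score_margin, min_rmsd_gap):
    """Return names of solutions that score nearly as well as the best one
    but differ noticeably from it in shape.

    solutions: list of dicts with 'name', 'score' (lower is better) and
    'rmsd_to_best' (distance to the best-scoring solution). Hypothetical layout.
    """
    best_score = min(s["score"] for s in solutions)
    return [
        s["name"]
        for s in solutions
        if s["score"] <= best_score + score_margin   # close in score
        and s["rmsd_to_best"] >= min_rmsd_gap        # different in shape
    ]

# Hypothetical data: one "island" that is a bit worse in score but far away in shape.
solutions = [
    {"name": "best", "score": -250.0, "rmsd_to_best": 0.0},
    {"name": "tweak", "score": -248.0, "rmsd_to_best": 0.6},
    {"name": "island", "score": -240.0, "rmsd_to_best": 5.2},
    {"name": "poor", "score": -150.0, "rmsd_to_best": 6.0},
]
print(exploration_candidates(solutions, score_margin=15.0, min_rmsd_gap=3.0))  # ['island']
```

Both thresholds are judgment calls; a wider score margin or a smaller shape gap would flag more solutions for a closer look.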

LennStar Lv 1

July 17, 2009

Could you post the solutions? The start1, the best score1, the best "native1"(=start2), the best native2 and best score2?

I'd like to see if the players can "see" the better structure. Our only indicator is the score, so if you are saying we should not concentrate on the score too much, we need another measurement.

dap Lv 1

July 17, 2009

The plots of the scores are interesting, with one group being those that worked on tweaking the starting point and the other, higher-scoring group being those that restructured and essentially created their own starting point. I think it would be interesting to see some plots of these scores versus time. Run it through a filter to smooth it out a bit. I would expect the higher scorers to initially have a dip in their score and then a sharp ascent, and the lower-scoring group to have a gradual ascent from the get-go up to their final score.
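
As one possible reading of "run it through a filter", here is a minimal sketch of a trailing moving average over a score history. It assumes the scores are sampled at regular intervals; the function name, the window size, and the sample values are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; the first few points average over fewer samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the sample that just fell out of the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed

# Hypothetical score history sampled at regular intervals.
print(moving_average([1, 2, 3, 4, 5], window=3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

A larger window gives a smoother curve but also blurs the sharp dip and ascent the plot is meant to reveal.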

Brad Taylor Lv 1

July 23, 2009

I have to ask - if the native form is the blue dot why isn't the Ca rms = 0?

Also it seems that the Rosetta score is lowest at points that do not correspond to the native form. How common is this and is this something that we can expect to improve over time?

vakobo Lv 1

July 27, 2009

Is it possible to add such graphs in Foldit?

admin Staff Lv 1

August 03, 2009

Ca rms is not zero because the native was arrived at by taking the structure from crystallography (which will not be perfectly natural, as the protein is crystallised) and performing a series of shakes and wiggles on it.