Foldit

Andrew Leaver-Fay Lv 1

January 09, 2014

Under the hood, Foldit is undergoing a subtle modification to its energy function. The energy function (the scoring function) is what decides whether one structure is better than another, and is meant to capture the underlying physics of proteins in water. The changes that are going into the energy function represent years' worth of work; although we have long known about several problems in Foldit's previous energy function, it turns out that fixing them was really tough!

There are a few things that you will notice about the new energy function. First, the hydrogen bond term is now a bit different. Some hydrogen bond acceptors – in particular, (red) oxygens that don't have attached hydrogens, which are the vast majority of protein oxygens – now prefer their donor hydrogen atoms to be "in the plane" that the acceptors lie in. For instance, in the side chain of aspartate (ASP, or D) there are two oxygen atoms (OD1 and OD2) that lie in the same plane as two carbons (CG and CB), giving this side chain a flat appearance (see figure). These "sp2 hybridized" oxygens prefer hydrogen atoms to lie in the same plane, and this preference is readily visible in known protein structures. Until now, however, this preference was not captured by Foldit's energy function. This new term brings Foldit's energy function closer to the quantum-mechanically understood electronic character of oxygens.
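To make the "in the plane" idea concrete, here is a minimal sketch of how one could measure how far a hydrogen sits out of the plane of an aspartate-like side chain. It is only the geometry, not Foldit's actual hydrogen bond term; the function name and the coordinates in the usage note are made up for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def out_of_plane_angle(acceptor, base, base2, hydrogen):
    """Angle in degrees between the acceptor->hydrogen vector and the plane
    through three atoms (for example OD1, CG and CB of an aspartate).
    0 means the hydrogen lies in the plane; 90 means it sits directly above it.
    Hypothetical helper for illustration only."""
    normal = _cross(_sub(base, acceptor), _sub(base2, acceptor))
    to_h = _sub(hydrogen, acceptor)
    sine = abs(_dot(normal, to_h)) / (_norm(normal) * _norm(to_h))
    return math.degrees(math.asin(min(1.0, sine)))
```

For example, with an oxygen at (0, 0, 0), CG at (1.25, 0, 0) and CB at (2.0, 1.3, 0), a hydrogen at (-1, 1, 0) lies in the plane, while a hydrogen at (0, 0, 2) sits straight above it. A term that prefers in-plane hydrogens would favor small values of this angle.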

Second, Foldit now contains an explicit (but short-ranged) Coulombic interaction term that describes how charged atoms interact. This basically says "opposite charges want to be close, and like charges want to be far." Until now, Foldit relied on a low-resolution term that tried to capture this rather fundamental property in a crude way between protein side chains, but it failed to capture the interactions that involved the backbone. Why would we not have included this term until now? Well, it is a little slower to evaluate this interaction energy and we weren't sure until recently how useful this energy was in describing protein energetics. We already had an explicit hydrogen bonding term describing very-close charge/charge interactions and the crude term prevented too many like-charged side chains from getting too close. For quite some time, the data we had seemed to tell us that combination was good enough. However, very recent data have convinced us that a Coulombic term is genuinely useful in telling the difference between correctly folded models and incorrectly folded ones. This term will make Foldit a bit slower, but, better to take an extra few minutes to design a protein than to spend several weeks (and $$) trying to synthesize a dud!
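As a rough illustration of what a short-ranged Coulombic term looks like, here is a minimal sketch of a shifted Coulomb energy that goes smoothly to zero at a cutoff distance. The dielectric constant, the cutoff, and the shifting scheme are all assumptions chosen for the example; the post does not say which values or functional form Foldit uses.

```python
# Approximate Coulomb constant in kcal * Angstrom / (mol * e^2).
COULOMB_CONSTANT = 332.06
# Assumed values for illustration only; not Foldit's actual parameters.
DIELECTRIC = 10.0
CUTOFF = 5.5  # Angstroms

def short_range_coulomb(q1, q2, r):
    """Shifted Coulomb energy between two point charges (in units of e)
    separated by r Angstroms. Returns 0 at and beyond CUTOFF, so only
    nearby atom pairs contribute."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    if r >= CUTOFF:
        return 0.0
    # Subtracting 1/CUTOFF makes the energy continuous at the cutoff.
    return COULOMB_CONSTANT * q1 * q2 / DIELECTRIC * (1.0 / r - 1.0 / CUTOFF)
```

With this sketch, opposite charges give a negative (favorable) energy that gets more favorable as they approach, like charges give a positive (unfavorable) energy, and any pair farther apart than the cutoff contributes nothing.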

Finally, you might notice that the side chain energy is a bit smoother for several amino acids. You'll notice this difference for the "flat" amino acids (the ones with sp2 hybridization) which are almost half of them: ASP, GLU, PHE, HIS, ASN, GLN, TRP, and TYR! Before, the energy function would jump around when crossing the artificially-defined boundaries between angle bins and would frustrate the minimization algorithm (wiggle) whenever it would step across those boundaries. With the new side-chain energy term, wiggle should be more effective.
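The bin-boundary problem can be seen in a toy example. The sketch below compares a lookup that returns one energy per angle bin with a version that interpolates linearly between bin centers. The bin width and energies are invented, and linear interpolation is just one simple way to remove the jumps; the post does not describe how the new side-chain term is actually smoothed.

```python
import math

BIN_WIDTH = 30.0  # degrees; made-up value for illustration
# Made-up energies, one per 30-degree bin covering 0..360.
BIN_ENERGIES = [0.0, 0.8, 2.0, 3.1, 2.4, 1.5, 0.9, 1.7, 2.8, 3.5, 2.2, 0.6]

def binned_energy(angle):
    """Energy of whichever bin the angle falls in: jumps at bin boundaries."""
    index = int((angle % 360.0) // BIN_WIDTH)
    return BIN_ENERGIES[index]

def interpolated_energy(angle):
    """Linear interpolation between bin centers, wrapping around at 360
    degrees: continuous everywhere, so small steps give small changes."""
    n = len(BIN_ENERGIES)
    position = (angle % 360.0) / BIN_WIDTH - 0.5  # 0.0 at the first bin center
    lower = math.floor(position)
    fraction = position - lower
    return (BIN_ENERGIES[lower % n] * (1.0 - fraction)
            + BIN_ENERGIES[(lower + 1) % n] * fraction)
```

Stepping from 29.999 to 30.001 degrees changes the binned energy by the full difference between two neighboring bins, while the interpolated energy barely moves. A minimizer like wiggle, which takes small steps and watches how the score responds, behaves much better on the second kind of function.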

There are a handful of other subtle changes that are going into Foldit's energy function that will probably not be noticeable. The Lennard-Jones term is now evaluated analytically, instead of through table interpolation, which will make wiggle more efficient. The ideal bond angles and lengths are being corrected for several hydrogens and a few CB atoms. Finally, the term describing the bond geometries for disulfides (CYS-CYS chemical bonds) has been refined.
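To show the difference between the two ways of evaluating a Lennard-Jones term, here is a minimal sketch using the standard 12-6 form. The well depth, optimal distance, and table spacing are made-up values, and the table layout is an assumption; this is not Foldit's code.

```python
# Made-up parameters for illustration only.
WELL_DEPTH = 0.1    # kcal/mol
R_MIN = 4.0         # Angstroms, distance of the energy minimum
TABLE_START = 2.0   # Angstroms
TABLE_STEP = 0.05   # Angstroms
TABLE_SIZE = 121    # covers 2.0 .. 8.0 Angstroms

def lj_energy(r):
    """Analytic 12-6 Lennard-Jones energy; the minimum is -WELL_DEPTH at R_MIN."""
    x6 = (R_MIN / r) ** 6
    return WELL_DEPTH * (x6 * x6 - 2.0 * x6)

def lj_derivative(r):
    """Exact derivative dE/dr, available because the energy is a formula."""
    x6 = (R_MIN / r) ** 6
    return -12.0 * WELL_DEPTH / r * (x6 * x6 - x6)

# Precomputed table of energies on a regular grid of distances.
LJ_TABLE = [lj_energy(TABLE_START + i * TABLE_STEP) for i in range(TABLE_SIZE)]

def lj_energy_from_table(r):
    """Linear interpolation between the two nearest table entries."""
    position = (r - TABLE_START) / TABLE_STEP
    index = min(max(int(position), 0), TABLE_SIZE - 2)
    fraction = position - index
    return LJ_TABLE[index] * (1.0 - fraction) + LJ_TABLE[index + 1] * fraction
```

The table version is only approximate between grid points, and its slope changes abruptly at every grid point. The analytic version gives an exact energy and an exact derivative at any distance, which is what a gradient-based minimizer wants.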

We are in the process of writing up the changes to the energy function in a paper. For those that are interested, I'll post a link to the paper once it is complete.

In addition to these fundamental changes to scoring, the density term will be normalized to make scoring more consistent across different density maps. There will also be a bonding term that forces the protein geometry to be more ideal. Look for these changes coming soon.

Figure: Aspartate (right) prefers polar hydrogen atoms to lie in the plane that contains its two side-chain oxygens (red) and its two side-chain carbons (green). Here you can see a backbone nitrogen (left) putting its hydrogen in the plane of an aspartate's side chain in order to form a hydrogen bond.

wisky Lv 1

January 09, 2014

When do you expect these changes to take effect in game?
I really like these changes! :) :) :)

beta_helix Staff Lv 1

January 09, 2014

I assume these changes will be going into Rosetta@home? And I'm curious what happened to the alpha test Boinc project Ralph: it seems to be a thing of the past.

Mike Cassidy Lv 1

January 09, 2014

Wiggle by "w" seems fine

auntdeen Lv 1

January 09, 2014

Please see my comment here:

http://fold.it/portal/node/996588

spvincent Lv 1