Foldit

LociOiling Lv 1

May 07, 2025

The recipes print protein, AA Edit, SS Edit, and SelectoPro are now at version 3.0. Version 3.0 corrects and streamlines chain detection, particularly for DNA and RNA.

AA Edit, SS Edit, and SelectoPro should also be faster, since they don't retrieve as much information as before. However, print protein will still be slow on large puzzles, since it gets every possible bit of information about each segment.

The most noticeable change in version 3.0 is for DNA. Previous versions didn't handle a DNA double helix correctly, reporting it as a single chain. The new recipes report a double helix as two chains. (Also, SelectoPro v1.3 didn't handle DNA or RNA correctly, but it's now consistent with the other recipes.)

There are several mostly minor changes to the scriptlog output. The most noticeable change is for a DNA double helix, where there are now two chains instead of one. The more minor changes, which apply to all four recipes, include:

messages about N-terminals and C-terminals no longer appear
messages no longer mention "mutables" if no mutables are present (some beginner puzzles still have mutables)
chain information is presented in a consistent format which includes the chain type (protein, DNA, RNA, ligand)

For print protein, there are some additional scriptlog changes:

for puzzles with disulfide bridges, the message reporting the pairings appears in a different spot
the "sequence if hydrophobic" section isn't included for non-protein chains
when conditions/bonuses are listed, the "satisfied" flag is no longer reported, since it doesn't seem meaningful

Chain detection

Previous versions of these recipes looked for N-terminals and C-terminals of proteins to determine chain ends. This method was not reliable, since the terminals aren't always located in an experiment. And of course, this method only worked for protein chains. The somewhat complex logic for finding N-terminals and C-terminals based on atom counts has been removed.

The new version of chain detection uses the distance between adjacent alpha carbons for protein, and between adjacent phosphorus atoms for DNA/RNA. For proteins, a cutoff of 4 Angstroms is used; for DNA/RNA, it's 8 Angstroms. Anything over those distances means a new chain has started.

The 4/8 Angstrom distances have been determined through testing on a large number of Foldit puzzles.
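The distance rule above can be sketched in a few lines. This is only an illustration in Python, not the recipes' actual code (Foldit recipes are written in Lua). The function name split_chains, the constant names, and the coordinates are all made up for the example; only the 4 and 8 Angstrom cutoffs come from the description above.

```python
import math

# Cutoffs described in the post (Angstroms).
PROTEIN_CUTOFF = 4.0   # between adjacent alpha carbons
NUCLEIC_CUTOFF = 8.0   # between adjacent phosphorus atoms (DNA/RNA)


def split_chains(positions, cutoff):
    """Group consecutive segments into chains by a distance cutoff.

    positions: list of (x, y, z) coordinates, one reference atom per
    segment (hypothetical input; a real recipe would get distances
    from the game instead).
    Returns a list of (first, last) 1-based segment ranges.
    """
    chains = []
    start = 1
    for i in range(1, len(positions)):
        # A gap larger than the cutoff means segment i + 1 starts a new chain.
        if math.dist(positions[i - 1], positions[i]) > cutoff:
            chains.append((start, i))
            start = i + 1
    if positions:
        chains.append((start, len(positions)))
    return chains


# Made-up alpha carbon positions: three segments 3.8 Angstroms apart,
# then a large jump, then two more segments.
ca = [(0.0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (30.0, 0, 0), (33.8, 0, 0)]
print(split_chains(ca, PROTEIN_CUTOFF))  # [(1, 3), (4, 5)]
```

With these made-up coordinates, the jump between segments 3 and 4 is well over 4 Angstroms, so the sketch reports two chains.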

Chain detection may still have issues in some cases.

If DNA or RNA is stretched into a straight line, the distances between adjacent phosphorus atoms may exceed 8 Angstroms.

For proteins, the puzzle-making machinery sometimes connects the two sides of a spot where residues are missing. This results in cases where the alpha carbon distance exceeds 4 Angstroms.

Both of these issues result in additional chains being reported. In most cases, the alpha carbon/phosphorus distances should return to normal once the usual Foldit tools are applied.

(A version of this message appears as a comment for each of the four recipes.)