PDB import and rendering issues - puckers and straws

Started by LociOiling

LociOiling Lv 1

I've been looking back at the ED recon puzzles, trying to identify the ones that had problems which might affect the value of the results.

A number of the ED recons had puckers or straws, where Foldit thinks there is a peptide bond between distant residues. Straws are easy to spot just by looking at the puzzle, but puckers may require looking into the details, like the ideality subscores. Puckers tend to happen where there are missing residues.

The starting pose of puzzle 2323 has an obvious straw at 65-66. The corresponding residues in 3HRY are A 65 and A 66. So there are no missing residues, but A 65 and A 66 shouldn't be bonded.

On the PDB site, the Mol* viewer shows a dotted line between A 65 and A 66. Jmol simply shows a large gap between the two. Either way, they're clearly not connected.

Foldit thinks that 65 and 66 are connected, as shown here:

Foldit standalone produces the same structure from the 3HRY PDB file. Standalone also throws in the popup message "trunc variants detected" upon loading the PDB.

I used bands to measure the distances between the atoms of segment 65 and segment 66. The mean distance for successful bands was 27.33 Angstroms. For some reason, bands failed between atoms 1 and 4 of both segments, which I'm still looking into.

Since a PDB file doesn't usually include explicit bonds, it's up to the rendering software to figure out where the bonds should go. Foldit/Rosetta seems to do this differently from other viewers. Bonding atoms that are around 27 Angstroms apart doesn't seem reasonable.
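As a rough illustration of the kind of sanity check a viewer could apply, here is a minimal Python sketch that accepts a peptide bond only when the backbone C of one residue is close to the backbone N of the next. The 2.0 Angstrom cutoff and the function names are assumptions for illustration, not what Foldit, Rosetta, Mol*, or Jmol actually do, and the coordinates are made up.

```python
import math

# A typical C-N peptide bond is about 1.33 Angstroms long. The cutoff below
# is an illustrative assumption, not a value taken from Foldit or Rosetta.
MAX_PEPTIDE_BOND = 2.0

def is_plausible_peptide_bond(c_prev, n_next, cutoff=MAX_PEPTIDE_BOND):
    """True if the C atom of one residue is within cutoff of the next residue's N."""
    return math.dist(c_prev, n_next) <= cutoff

# Made-up coordinates: one normal bond, one "straw" of about 27 Angstroms.
print(is_plausible_peptide_bond((0.0, 0.0, 0.0), (1.33, 0.0, 0.0)))   # True
print(is_plausible_peptide_bond((0.0, 0.0, 0.0), (27.33, 0.0, 0.0)))  # False
```

With a check like this, a 27 Angstrom separation would be treated as a chain break rather than a bond.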

PDB 3HRY does have a lot of oddities. There's a missing atom record for the final "OG" oxygen of residue A 65. Serine A 65 has multiple positions, so there are "ASER" and "BSER" records, but not for every atom. Similar "A" and "B" records appear in several other spots. Threonine A 66 has "B" records, but no "A" records.
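Oddities like these can be listed mechanically. The following is a minimal Python sketch that groups ATOM records by residue and alternate-location letter, using the fixed column positions of the PDB format. The function names are hypothetical, and the sample records are generated with dummy coordinates rather than copied from 3HRY.

```python
from collections import defaultdict

def altloc_summary(pdb_lines):
    """Map (chain, resSeq, resName) -> {altLoc: [atom names]} for ATOM records."""
    summary = defaultdict(lambda: defaultdict(list))
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()    # columns 13-16
        alt_loc = line[16].strip() or "-"  # column 17; "-" means no altLoc
        res_name = line[17:20].strip()     # columns 18-20
        chain = line[21]                   # column 22
        res_seq = int(line[22:26])         # columns 23-26
        summary[(chain, res_seq, res_name)][alt_loc].append(atom_name)
    return summary

def make_atom_line(serial, name, alt, res, chain, seq):
    """Build a fixed-column ATOM record with dummy coordinates (sample data only)."""
    return "ATOM  {:5d} {:<4s}{:1s}{:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
        serial, name, alt, res, chain, seq, 0.0, 0.0, 0.0)

# Made-up records that mimic the pattern described above.
lines = [
    make_atom_line(1, " N  ", "A", "SER", "A", 65),
    make_atom_line(2, " CA ", "A", "SER", "A", 65),
    make_atom_line(3, " CA ", "B", "SER", "A", 65),
    make_atom_line(4, " CA ", "B", "THR", "A", 66),
]
for key, by_alt in altloc_summary(lines).items():
    print(key, dict(by_alt))
```

A residue whose summary has a "B" entry but no "A" entry, like the threonine in the sample, is the kind of case worth a closer look.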

I'll look at 2383 next, which had lots of puckers and a few straws. So far, I think all those problems are due to missing residues.

I'll also look at some of the puzzles which ended up without puckers or straws.

See the bug report "puzzle 2323 issues with missing densities" for an earlier look at 2323.

LociOiling Lv 1

A quick look at Puzzle 2383 reveals lots of puckers and straws, 15 according to my recipe. The images show how the gaps at E 42-E 55, E 82-E 96, and E 149-E 160 appear in Foldit and Jmol. In all three cases, Foldit has the edges of the gap bonded, while Jmol shows empty space. I've added measurements (dotted lines) in Jmol to show the distances involved.

An inspection of the PDB 3GD1 file shows there are missing residues, as expected. There are no missing atoms involved in the three gaps shown here, and no "A" and "B" positions, both issues seen in 3HRY for puzzle 2323.
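One simple way to spot missing residues is to look for jumps in the residue numbering within a chain. The Python sketch below does that for ATOM records. It is only an approximation: it ignores insertion codes, the function names are hypothetical, and the sample lines are generated, not taken from 3GD1.

```python
def find_numbering_gaps(pdb_lines, chain_id):
    """Return (last_seen, next_seen) pairs where resSeq jumps by more than 1."""
    seen = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[21] == chain_id:
            res_seq = int(line[22:26])  # columns 23-26
            if not seen or seen[-1] != res_seq:
                seen.append(res_seq)
    return [(a, b) for a, b in zip(seen, seen[1:]) if b - a > 1]

def ca_line(chain, seq):
    """Build a fixed-column CA record with dummy coordinates (sample data only)."""
    return "ATOM  {:5d}  CA  ALA {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
        1, chain, seq, 0.0, 0.0, 0.0)

# Generated sample with a gap between residues 42 and 55 in chain E.
lines = [ca_line("E", n) for n in (40, 41, 42, 55, 56)]
print(find_numbering_gaps(lines, "E"))  # [(42, 55)]
```

Any pair reported here marks two residues that should not be joined by a peptide bond.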

In Foldit standalone, 3GD1 also produces a "trunc variants detected" popup. The log file contains these messages:

standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 1
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 345
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 346
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 645
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 646
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 1003
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 1004
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 1011
standalone.application.StandaloneApplication: {0} Trunc variants detected!

All the residue numbers cited in the "trunc" messages seem to be in chain C, so they're not involved in the chain E issues shown here.
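To make the "trunc" messages easier to read, they can be paired up into ranges. This Python sketch assumes each LOWER message is followed by its matching UPPER message, which is only a guess about what the log means. The regular expression is based on the log lines quoted above, and the function name is hypothetical.

```python
import re

TRUNC_RE = re.compile(r"FOUND (LOWER|UPPER) TRUNC AT RESIDUE (\d+)")

def trunc_ranges(log_lines):
    """Pair each LOWER message with the next UPPER message as (lower, upper)."""
    ranges = []
    lower = None
    for line in log_lines:
        m = TRUNC_RE.search(line)
        if not m:
            continue
        kind, residue = m.group(1), int(m.group(2))
        if kind == "LOWER":
            lower = residue
        elif lower is not None:
            ranges.append((lower, residue))
            lower = None
    return ranges

log = """\
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 1
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 345
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 346
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 645
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 646
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 1003
standalone.application.boinc.Puzzle: {0} FOUND LOWER TRUNC AT RESIDUE 1004
standalone.application.boinc.Puzzle: {0} FOUND UPPER TRUNC AT RESIDUE 1011
standalone.application.StandaloneApplication: {0} Trunc variants detected!
""".splitlines()

print(trunc_ranges(log))
# [(1, 345), (346, 645), (646, 1003), (1004, 1011)]
```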


LociOiling Lv 1

Just another observation: in PDB 3HRY, residue A 65 has only five atom records in the "A" position. These are the backbone heavy atoms and the beta carbon. The "OG" sidechain oxygen is missing, so this serine looks just like alanine as far as the PDB goes. Foldit/Rosetta fills in the OG atom and adds all the hydrogens, so you get 11 atoms for segment 65.
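The serine-versus-alanine comparison can be written as a small check. This Python sketch compares the atom names found for a residue against the standard heavy-atom names for its type. Only two residue types are included, and the function name is hypothetical.

```python
# Standard PDB heavy (non-hydrogen) atom names for two residue types.
HEAVY_ATOMS = {
    "ALA": ["N", "CA", "C", "O", "CB"],
    "SER": ["N", "CA", "C", "O", "CB", "OG"],
}

def missing_heavy_atoms(res_name, atoms_found):
    """Return the expected heavy atoms that have no record for this residue."""
    found = set(atoms_found)
    return [name for name in HEAVY_ATOMS[res_name] if name not in found]

# The five records described above: backbone heavy atoms plus the beta carbon.
found = ["N", "CA", "C", "O", "CB"]
print(missing_heavy_atoms("SER", found))  # ['OG']
print(missing_heavy_atoms("ALA", found))  # []
```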

LociOiling Lv 1

On the positive side, here are two puzzles which correctly handled missing residues:

Looking at the PDB files for those two, I don't see any obvious structural differences compared to PDB 3HRY and PDB 3GD1, from puzzles 2323 and 2383, which had straws and puckers where there were missing residues.

When I open 2O42 in Foldit standalone, I get the "trunc variants" popup, and it loads as just two chains, with an obvious straw at 150-151. Ideality is about 14,000 for each of those segments, which is very bad in standalone. Foldit V44-20250213-cae7350b02-win_x64-devprev handles the gap by showing 139-150 as a separate chain.

For 3KU7, the standalone results are similar. The "trunc variants" popup appears, and it shows just two chains, each with an obvious straw. Puzzle 2427 instead shows four chains, reflecting missing residues in both of the PDB chains.

I tried exporting using the Foldit standalone "export PDB" function, then importing again. No change.

One item to note: the "show backbone issues" view option does a nice job of highlighting the problem areas.

My standalone release is 20240928-462640d328-win_x64-INTERNAL, which appears to be the latest version.

LociOiling Lv 1

The recipe Pucker Picker 3.0 RC1 is now available. It detects potential trouble spots by looking at segment distances and ideality.
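For readers who want the general idea of the distance half of that test, here is a minimal Python sketch. It is not the recipe's code (Foldit recipes are written in Lua) and it skips the ideality check entirely. It flags adjacent segments whose alpha carbons are much farther apart than the roughly 3.8 Angstroms expected across a normal peptide bond. The tolerance and the function name are assumptions, and the coordinates are made up.

```python
import math

IDEAL_CA_CA = 3.8     # approximate CA-CA distance across a trans peptide bond
DISTANCE_SLACK = 0.5  # assumed tolerance, chosen only for illustration

def flag_suspect_joins(ca_coords, max_distance=IDEAL_CA_CA + DISTANCE_SLACK):
    """Return (segment, next_segment, distance) for adjacent pairs too far apart.

    Segments are numbered from 1, following the order of ca_coords.
    """
    flagged = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if d > max_distance:
            flagged.append((i + 1, i + 2, round(d, 2)))
    return flagged

# Made-up coordinates: segments 1-3 are normally spaced, segment 4 is far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
for seg, nxt, dist in flag_suspect_joins(coords):
    print(f"possible pucker or straw between {seg} and {nxt}: {dist} Angstroms")
```

A distance test alone would also flag legitimate chain breaks, which is one reason to combine it with a second signal such as ideality.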

The category Pucker on the wiki lists the puzzles that seem to have missing residue puckers. There are only eight puzzles in that category at the moment.

There are a few puzzles that appeared puckered at the start, but turn out to be fine. For Puzzle 2421, Pucker Picker reports problems in both the protein and the DNA sections at the start, but there are no missing residues or bases. It seems like the starting pose just needs work.

On Puzzle 2409, the recipe sees two puckers at the start, but the segments aren't connected in the puzzle, just really close together. The display for PDB 4PQP shows dotted lines in these spots.

There are a few other edge cases like these two. The current version of the recipe is "RC1" (release candidate 1), in case further updates are needed.

beta_helix Staff Lv 1

Thanks Loci.
Have you noticed any pucker issues in recent ED puzzles?

We want to make sure that we did indeed fix it properly last year after these problems were reported.

LociOiling Lv 1

The problem did seem to improve, but then 2596: Electron Density Reconstruction 114 had several puckers.

Before that, 2400: Electron Density Reconstruction 72 was the last one that had a pucker due to missing residues.

2421: Electron Density Reconstruction 79 appeared to have puckers in both the protein and the DNA at the start, but it was just a really bad starting pose, no missing residues.

2427: Electron Density Reconstruction 81 and 2430: Electron Density Reconstruction 82 had missing residues in the PDB, but no puckers in Foldit. The missing residues caused the start of a new chain.

I guess puckers aren't necessarily the end. PDB 7CTO had a small pucker in two Foldit puzzles. It nevertheless made it into PDB-REDO on 2460: Refine Density Reconstruction 2. It was only missing two residues out of 816.

beta_helix Staff Lv 1

Thank you for reporting this, LociOiling… we believe we found the issue that caused 2596 to pucker up.

Please let us know if you notice any puckers in the future (maybe we need to run your script as soon as a puzzle is posted!), so that we can address those problems as early as possible.