
2282: Electron Density Reconstruction 33

Closed almost 3 years ago

Novice Overall Prediction Electron Density

Summary


Created
March 27, 2023
Expires
Max points
100
Description

The structure of this protein has already been solved and published, but close inspection suggests that there are some problems with the published solution. We'd like to see if Foldit players can use the same electron density data to reconstruct a better model. There are three copies of the same protein here, but not all the segments are visible. It's pretty big, so this is one where the Trim tool might come in handy. One other note: if you happen to glance at the PDB entry, you might recognize the names of some of the authors for their involvement in Foldit. Not everything is perfect the first time around...

Sequence
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHHMGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHHMGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH

Top groups


  1. Go Science 100 pts. 58,596
  2. Anthropic Dreams 70 pts. 58,423
  3. L'Alliance Francophone 47 pts. 58,171
  4. Contenders 30 pts. 58,108
  5. FamilyBarmettler 19 pts. 57,363
  6. Marvin's bunch 11 pts. 56,721
  7. Australia 7 pts. 56,390
  8. BOINC@Poland 4 pts. 56,026
  9. Firesign 2 pts. 54,910
  10. VeFold 1 pt. 53,853

  21. WBarme1234 Lv 1 24 pts. 57,363
  22. drumpeter18yrs9yrs Lv 1 22 pts. 57,249
  23. alcor29 Lv 1 20 pts. 57,101
  24. akaaka Lv 1 18 pts. 57,054
  25. BootsMcGraw Lv 1 17 pts. 56,941
  26. guineapig Lv 1 15 pts. 56,908
  27. Anfinsen_slept_here Lv 1 14 pts. 56,845
  28. Idiotboy Lv 1 13 pts. 56,797
  29. Trajan464 Lv 1 12 pts. 56,751
  30. fpc Lv 1 11 pts. 56,721

Comments


Bruno Kestemont Lv 1

This reminds me of a smaller de novo symmetric puzzle (a pentamer, I think) that I designed several years ago with the same concept ;)

LociOiling Lv 1

Yes, this one has three chains, 160 residues each. The chains are not 100% identical, so it may not be considered a true symmetry puzzle.

It's a match for 4KYZ and 4KY3, both mentioning "designed protein OR327".

For some reason, the sequence for 4KYZ shows four identical chains. The 3D viewer on rcsb.org only shows chain A. I'm not sure if there's an option to see all the chains.

When I open 4KYZ in Jmol, I see four chains, but they're in a totally different configuration than what we're seeing in 2282. This seems to be a versatile protein, sometimes a trimer, sometimes a tetramer.

Here are the chains detected by the latest version of "print protein", which I hope to release soon:

A: diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
B: iqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegqlg
C: diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql

This version of "print protein" relies on the distance between alpha carbons (from structure.GetDistance) to detect where chains begin and end. That's useful when atom counts don't tell the whole story.
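
The distance-based chain detection described above can be sketched roughly as follows. This is not the actual "print protein" recipe (Foldit recipes are written in Lua); it is a minimal Python illustration. The `ca_distance` callback stands in for structure.GetDistance, and the 4.5 Å cutoff is an assumption chosen for the example: consecutive alpha carbons within one chain sit about 3.8 Å apart, so a noticeably larger gap suggests a chain break.

```python
from typing import Callable, List, Tuple


def find_chains(
    segment_count: int,
    ca_distance: Callable[[int, int], float],
    max_gap: float = 4.5,  # assumed cutoff in angstroms, not taken from the recipe
) -> List[Tuple[int, int]]:
    """Return 1-based (first, last) segment ranges, one per detected chain."""
    if segment_count < 1:
        return []
    chains = []
    start = 1
    for i in range(1, segment_count):
        # A large jump between neighbouring alpha carbons marks a chain break.
        if ca_distance(i, i + 1) > max_gap:
            chains.append((start, i))
            start = i + 1
    chains.append((start, segment_count))
    return chains


# Toy data: alpha-carbon positions along a line (made up for illustration).
positions = [0.0, 3.8, 7.6, 20.0, 23.8, 40.0, 43.8]


def toy_distance(a: int, b: int) -> float:
    return abs(positions[b - 1] - positions[a - 1])


print(find_chains(len(positions), toy_distance))
# [(1, 3), (4, 5), (6, 7)]
```

Passing the distance lookup in as a function keeps the splitting logic independent of where the distances come from.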

LociOiling Lv 1

I just noticed the "not all the segments are visible" comment in the official notes.

Each of the three chains is missing the MG at the beginning and the expression tag, GSLEHHHHHH, at the end.

However, chain B is also missing the D at the beginning, and then picks up a G from the expression tag at the end, so the length is still 160.

The PDB file for 4KYZ doesn't really explain all this too well, but it does have "REMARK 465" records for some of the missing residues.

Another interesting point is that the PDB file calls for selenomethionine (MSE) instead of plain methionine (MET). Selenomethionine substitutes a selenium atom in place of the sulfur atom found in methionine. Foldit apparently doesn't work with selenomethionine, so we're stuck with plain old methionine.
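
As a rough illustration of treating MSE as plain methionine, here is a small Python sketch that converts three-letter residue names to one-letter codes and maps MSE to MET on the way. The function name and the sample input are made up for the example; this is not how Foldit itself loads the structure.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Modified residues mapped back to their parent residue.
# Only selenomethionine is handled in this sketch.
MODIFIED_TO_PARENT = {"MSE": "MET"}


def to_one_letter(residue_names):
    """Convert three-letter residue names to a one-letter sequence string."""
    letters = []
    for name in residue_names:
        parent = MODIFIED_TO_PARENT.get(name, name)
        letters.append(THREE_TO_ONE[parent])
    return "".join(letters)


# Illustrative input only, not copied from the 4KYZ file.
print(to_one_letter(["GLU", "LEU", "MSE", "ASP", "TYR"]))
# ELMDY
```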

Here's the Foldit sequence, in lower case, aligned with the stated sequence in upper case, with everything split into chains:

                                                                                                     1         1         1         1         1         1         1
           1         2         3         4         5         6         7         8         9         0         1         2         3         4         5         6
  1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
  diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
   iqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegqlg
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
  diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
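
The alignment above can also be checked programmatically. The following Python sketch (the function and variable names are made up for this example) finds where a visible chain sits inside the stated per-chain sequence and reports which residues are missing at each end. It assumes the visible chain is one contiguous, gap-free stretch of the stated sequence, which appears to hold for this puzzle.

```python
# Stated sequence for one chain, taken from the puzzle's Sequence field.
STATED = (
    "MG"
    "DIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKK"
    "QGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRK"
    "MTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDL"
    "EITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQL"
    "GSLEHHHHHH"
)

# Chains A and B as reported by "print protein" (chain C matches chain A).
CHAIN_A = (
    "diqvqvniddngknfdytytvtteselqkvlnelmdyikk"
    "qgakrvrisitarsskeaykflailakvfaelgyndinrk"
    "mtvrfrgddlealekalkemirqarkfagtvtytldgndl"
    "eititgvprqvleelakeaerlakefnitititvtvegql"
)
CHAIN_B = (
    "iqvqvniddngknfdytytvtteselqkvlnelmdyikk"
    "qgakrvrisitarsskeaykflailakvfaelgyndinrk"
    "mtvrfrgddlealekalkemirqarkfagtvtytldgndl"
    "eititgvprqvleelakeaerlakefnitititvtvegql"
    "g"
)


def locate(stated, visible):
    """Return (1-based start, missing prefix, missing suffix) for a visible chain."""
    start = stated.find(visible.upper())
    if start < 0:
        raise ValueError("visible chain is not a contiguous part of the stated sequence")
    end = start + len(visible)
    return start + 1, stated[:start], stated[end:]


print(locate(STATED, CHAIN_A))  # (3, 'MG', 'GSLEHHHHHH')
print(locate(STATED, CHAIN_B))  # (4, 'MGD', 'SLEHHHHHH')
```

The output matches the description in the comment: chains A and C lose MG and the full expression tag, while chain B also loses the leading D and keeps the first G of the tag.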