
2282: Electron Density Reconstruction 33

Closed almost 3 years ago

Novice Overall Prediction Electron Density

Summary


Created
March 27, 2023
Expires
Max points
100
Description

The structure of this protein has already been solved and published, but close inspection suggests that there are some problems with the published solution. We'd like to see if Foldit players can use the same electron density data to reconstruct a better model. There are three copies of the same protein here, but not all the segments are visible. It's pretty big, so this is one where the Trim tool might come in handy. One other note: if you happen to glance at the PDB entry, you might recognize the names of some of the authors for their involvement in Foldit. Not everything is perfect the first time around...

Sequence
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHHMGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHHMGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH

Top groups


  1. Go Science 100 pts. 58,596
  2. Anthropic Dreams 70 pts. 58,423
  3. L'Alliance Francophone 47 pts. 58,171
  4. Contenders 30 pts. 58,108
  5. FamilyBarmettler 19 pts. 57,363
  6. Marvin's bunch 11 pts. 56,721
  7. Australia 7 pts. 56,390
  8. BOINC@Poland 4 pts. 56,026
  9. Firesign 2 pts. 54,910
  10. VeFold 1 pt. 53,853

  61. alyssa_d_V2.0 Lv 1 1 pt. 53,175
  62. Deleted player 1 pt. 53,084
  63. Swapper242 Lv 1 1 pt. 53,047
  64. apetrides Lv 1 1 pt. 53,018
  65. kitsoune Lv 1 1 pt. 52,577
  66. pruneau_44 Lv 1 1 pt. 51,713
  67. Endermenace31 Lv 1 1 pt. 51,142
  68. hada Lv 1 1 pt. 51,042
  69. furi0us Lv 1 1 pt. 50,344
  70. Sammy3c2b1a0 Lv 1 1 pt. 50,042

Comments


Bruno Kestemont Lv 1

This reminds me of a smaller de novo symmetric puzzle (a pentamer, I think) that I designed several years ago with the same concept ;)

LociOiling Lv 1

Yes, this one has three chains, 160 residues each. The chains are not 100% identical, so it may not be considered a true symmetry puzzle.

It's a match for 4KYZ and 4KY3, both mentioning "designed protein OR327".

For some reason, the sequence for 4KYZ shows four identical chains. The 3D viewer on rcsb.org only shows chain A. I'm not sure if there's an option to see all the chains.

When I open 4KYZ in Jmol, I see four chains, but they're in a totally different configuration than what we're seeing in 2282. This seems to be a versatile protein, sometimes a trimer, sometimes a tetramer.

Here are the chains detected by the latest version of "print protein", which I hope to release soon:

A: diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
B: iqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegqlg
C: diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql

This version of "print protein" relies on the distance between alpha carbons (from structure.GetDistance) to detect where chains begin and end. That's useful when atom counts don't tell the whole story.
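The distance idea can be sketched roughly as follows. This is Python rather than a Foldit Lua recipe, and it is not the actual "print protein" code: the function names, the toy coordinates, and the 4.5 Å cutoff are all assumptions made for illustration. Consecutive alpha carbons within a chain sit about 3.8 Å apart, so a noticeably larger gap between neighbouring segments is taken as a chain boundary. In a recipe, the distances would come from structure.GetDistance instead of from coordinates.

```python
from math import dist  # Python 3.8+

def find_chain_breaks(ca_coords, max_ca_gap=4.5):
    """Return 1-based segment numbers after which a new chain starts.

    ca_coords  -- list of (x, y, z) alpha-carbon positions, one per segment
    max_ca_gap -- assumed cutoff in angstroms; bonded neighbours are ~3.8
    """
    breaks = []
    for i in range(len(ca_coords) - 1):
        if dist(ca_coords[i], ca_coords[i + 1]) > max_ca_gap:
            breaks.append(i + 1)  # segment i+1 (1-based) is the last of its chain
    return breaks

def split_chains(sequence, ca_coords, max_ca_gap=4.5):
    """Split a one-letter sequence into chains at the detected breaks."""
    chains, start = [], 0
    for end in find_chain_breaks(ca_coords, max_ca_gap):
        chains.append(sequence[start:end])
        start = end
    chains.append(sequence[start:])
    return chains

# Toy data (made up): six segments on a line, with a 20 A jump after the third.
toy_coords = [(0.0, 0, 0), (3.8, 0, 0), (7.6, 0, 0),
              (27.6, 0, 0), (31.4, 0, 0), (35.2, 0, 0)]
toy_chains = split_chains("diqiqv", toy_coords)
print(toy_chains)
```

A fixed cutoff is the simplest choice. It would miss a break if two chain ends happened to sit within the cutoff of each other, so a real recipe may need extra checks.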

LociOiling Lv 1

I just noticed the "not all the segments are visible" comment in the official notes.

Each of the three chains is missing the MG at the beginning and the expression tag, GSLEHHHHHH, at the end.

However, chain B is also missing the D at the beginning, and then picks up a G from the expression tag at the end, so the length is still 160.

The PDB file for 4KYZ doesn't explain all this very well, but it does have "REMARK 465" records for some of the missing residues.
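For anyone who wants to pull those records out of a PDB file, here is a small Python sketch. It assumes the usual REMARK 465 layout, where each data line ends with the residue name, chain ID, and sequence number. The function name is my own, and the sample lines are made up for illustration, not copied from 4KYZ.

```python
def parse_remark_465(pdb_lines):
    """Collect (chain_id, seq_num, res_name) for residues listed as missing.

    Header and explanatory REMARK 465 lines are skipped because their last
    field is not an integer. Residue numbers carrying an insertion code
    (for example "12A") are also skipped by this simple version.
    """
    missing = []
    for line in pdb_lines:
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        if len(fields) < 3:
            continue
        res_name, chain_id, seq_text = fields[-3:]
        try:
            seq_num = int(seq_text)
        except ValueError:
            continue  # column header or free text
        if len(res_name) == 3 and len(chain_id) == 1:
            missing.append((chain_id, seq_num, res_name))
    return missing

# Illustrative lines only (hypothetical, NOT taken from the 4KYZ file).
sample = [
    "REMARK 465 MISSING RESIDUES",
    "REMARK 465   M RES C SSSEQI",
    "REMARK 465     MSE A     1",
    "REMARK 465     GLY A     2",
    "REMARK 465     HIS B   172",
    "REMARK 470 SOME OTHER REMARK",
]
missing = parse_remark_465(sample)
print(missing)
```

To use it on a real entry, pass the lines of the downloaded file, for example `parse_remark_465(open(path))`, where `path` is wherever the PDB file was saved.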

Another interesting point is that the PDB file calls for selenomethionine (MSE) instead of plain methionine (MET). Selenomethionine has a selenium atom in place of the sulfur atom found in methionine. Foldit apparently doesn't work with selenomethionine, so we're stuck with plain old methionine.

Here's the Foldit sequence, in lower case, aligned with the stated sequence in upper case, with everything split into chains:

                                                                                                     1         1         1         1         1         1         1
           1         2         3         4         5         6         7         8         9         0         1         2         3         4         5         6
  1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
  diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
   iqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegqlg
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
  diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql
MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH
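The alignment above can be checked mechanically. The Python sketch below takes the stated per-chain sequence and the Foldit chains as listed earlier, and reports where each chain starts and which residues are missing at either end. The `locate` helper is a name I made up for this example. It relies on a plain substring search, which only works because the visible chains match the stated sequence exactly, with no gaps or substitutions.

```python
# Stated sequence for one chain (upper case), as given in the puzzle notes.
STATED = ("MGDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARSSKEAYKFLAILAKVFAELGYNDINRKMTVRFRGDDLEALEKALKEMIRQARKFAGTVTYTLDGNDLEITITGVPRQVLEELAKEAERLAKEFNITITITVTVEGQLGSLEHHHHHH")

# Chains as seen in Foldit (lower case).
CHAINS = {
    "A": "diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql",
    "B": "iqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegqlg",
    "C": "diqvqvniddngknfdytytvtteselqkvlnelmdyikkqgakrvrisitarsskeaykflailakvfaelgyndinrkmtvrfrgddlealekalkemirqarkfagtvtytldgndleititgvprqvleelakeaerlakefnitititvtvegql",
}

def locate(chain, stated):
    """Return (0-based offset, missing N-terminal part, missing C-terminal part)."""
    start = stated.find(chain.upper())
    if start < 0:
        raise ValueError("chain is not an exact substring of the stated sequence")
    end = start + len(chain)
    return start, stated[:start], stated[end:]

results = {name: locate(chain, STATED) for name, chain in CHAINS.items()}
for name, (start, n_missing, c_missing) in results.items():
    print(f"{name}: length {len(CHAINS[name])}, offset {start}, "
          f"missing start '{n_missing}', missing end '{c_missing}'")
```

If the chains differed from the stated sequence by more than trimmed ends, a proper pairwise alignment would be needed instead of `str.find`.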