Foldit

1258: Tuberculosis Challenge - Phase 1

Closed almost 10 years ago

Intermediate Overall Prediction

Summary

Created: July 11, 2016
Expires: July 19, 2016 at 23:00 UTC
Max points: 100

Description

This puzzle starts with an unfolded sequence with secondary structure assigned from PSIPRED. The target protein is LepB, which is currently being investigated as a drug discovery target against tuberculosis (TB). TB is caused by the bacillus Mycobacterium tuberculosis and killed more than 1.5 million people in 2014. Right now, no crystal structure exists for this target. Models created by Foldit players will be used to help solve the structure when crystals become available.

Top groups

1. Contenders 100 pts. 9,670
2. Anthropic Dreams 73 pts. 9,563
3. Go Science 52 pts. 9,497
4. Gargleblasters 36 pts. 9,362
5. Deleted group pts. 9,235
6. L'Alliance Francophone 16 pts. 9,218
7. Void Crushers 10 pts. 9,063
8. Beta Folders 6 pts. 9,058
9. Italiani Al Lavoro 4 pts. 8,255
10. HMT heritage 2 pts. 7,801

1. gitwut Lv 1 100 pts. 9,615
2. Galaxie Lv 1 98 pts. 9,547
3. Susume Lv 1 96 pts. 9,517
4. Bruno Kestemont Lv 1 94 pts. 9,494
5. Deleted player pts. 9,403
6. dembones Lv 1 90 pts. 9,379
7. MurloW Lv 1 88 pts. 9,329
8. brow42 Lv 1 86 pts. 9,309
9. tokens Lv 1 84 pts. 9,272
10. grogar7 Lv 1 82 pts. 9,261

Comments

free_radical Lv 1

July 15, 2016

I do not know…I will look into this.

Susume Lv 1

July 15, 2016

1258 has 7 prolines in residues 117-148 (I think this is the area brow is looking at), in the motif
PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP. It would certainly be interesting if that area could form a proline-rich helix.

Of the 865 homologs that jpred found for our sequence, only 20 have 4 or more prolines in those 7 spots, and only 5 have 5 or more of them, so it is not a common motif. (Note these are not homologs with solved structures, just known sequences.)
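For anyone who wants to repeat this kind of count, here is a minimal Python sketch. It only assumes the motif string and the 117-148 window quoted above. The function names are made up for illustration, and the homolog windows you feed in would have to come from your own alignment (nothing here talks to jpred).

```python
# Motif and window start as quoted in the comment above (1-based residue numbers).
MOTIF = "PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP"
START = 117


def proline_positions(motif=MOTIF, start=START):
    """Return the 1-based residue numbers of the P positions in the motif."""
    return [start + i for i, c in enumerate(motif) if c == "P"]


def count_prolines_at(window, motif=MOTIF):
    """Count how many of the motif's P positions are also P in an aligned window.

    `window` is a hypothetical input: the stretch of a homolog sequence that
    aligns to residues 117-148, same length as the motif.
    """
    return sum(1 for m, r in zip(motif, window) if m == "P" and r == "P")
```

Calling proline_positions() lists the seven spots, and count_prolines_at() applied to each homolog window gives the per-homolog tally that the "4 or more" and "5 or more" counts are based on.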

Susume Lv 1

July 16, 2016

Residues 23-26 look like they may be an insertion, with only 119 of 855 homologs having all four residues present. Of the 33 homologs having C at residue 26, all of them also have C at residue 23. Of the 36 homologs having C at residue 23, 33 of them have C at residue 26. So this might be a candidate for a disulfide, since the C's strongly tend to occur together or not at all. Not sure what use a disulfide would be on residues so close together, though.

The number of homologs having C in the other places is:
Residue 105: 131 have C
Residue 137: 38 have C
Residue 169: 12 have C
Residue 173: 1 has C
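
The "occur together or not at all" point can be put as two conditional fractions. Below is a small Python sketch of that arithmetic, using only the counts given above (36 homologs with C at 23, 33 with C at 26, 33 with both). The helper names and the column-string input format are assumptions for illustration, not anything jpred produces directly.

```python
def pair_stats(col_a, col_b, residue="C"):
    """Count a residue at two alignment columns.

    col_a and col_b are hypothetical inputs: one character per homolog,
    taken from the two alignment columns being compared.
    Returns (count in column a, count in column b, count in both).
    """
    n_a = sum(1 for a in col_a if a == residue)
    n_b = sum(1 for b in col_b if b == residue)
    n_both = sum(1 for a, b in zip(col_a, col_b) if a == residue and b == residue)
    return n_a, n_b, n_both


def conditional(n_both, n_given):
    """Fraction of homologs with the residue at one site that also have it at the other."""
    return n_both / n_given


# Counts from the comment above.
c26_given_c23 = conditional(33, 36)  # about 0.92
c23_given_c26 = conditional(33, 33)  # exactly 1.0
```

Both fractions being near 1 is what makes the pair look coupled. For comparison, the other cysteine sites listed above would need the same two-column counts before anything could be said about them.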

Susume Lv 1

July 16, 2016

Ser94, Ser96, and Lys174 in the article correspond to 13 S, 15 S, and 93 K in our protein. If they form a dyad, does that mean they are in contact? And on the outside of the protein, so they can interact with other proteins?
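
A quick check of the numbering: all three pairs above differ by the same amount, which the Python sketch below makes explicit. The mapping is taken straight from the comment. The idea that one constant offset holds for the rest of the sequence is only an assumption, since it is confirmed for these three residues and nothing else.

```python
# Article residue number -> puzzle residue number, as given in the comment above.
ARTICLE_TO_PUZZLE = {94: 13, 96: 15, 174: 93}

# Set of distinct differences; a single element means the offset is constant
# for these three residues.
OFFSETS = {article - puzzle for article, puzzle in ARTICLE_TO_PUZZLE.items()}


def to_puzzle(article_residue, offset=81):
    """Convert article numbering to puzzle numbering, ASSUMING a constant offset."""
    return article_residue - offset


def to_article(puzzle_residue, offset=81):
    """Convert puzzle numbering to article numbering, under the same assumption."""
    return puzzle_residue + offset
```

If the offset does hold throughout, to_article() gives a way to look up any puzzle residue in the article's numbering, but treat the result as a guess anywhere outside the three checked positions.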

Susume Lv 1

July 16, 2016

Apparently in E. coli there are proteins in the periplasm (the stuff between the inner and outer membranes) with CXXC motif disulfides whose job is to aid in the formation of correct disulfides in other proteins that don't finish folding until they have been transported through the inner membrane and into the periplasm. LepB seems to clip the N-terminal off of proteins entering the periplasm; maybe with its conserved CXXC motif at 23-26 it also assists in the formation of disulfides.

jeff101 Lv 1

July 16, 2016

In http://jb.asm.org/content/194/10/2614.full.pdf cited above,
references 5, 25, 27, and 36 sounded interesting to me. I was
able to download them all from home. The next trick is to read
them. Perhaps some of you will have time to read them before I do:

(5) http://jb.asm.org/content/175/16/4957.full.pdf

(25) http://www.nature.com/nature/journal/v396/n6712/pdf/396707b0.pdf

(27) http://www.sfu.ca/~mpaetzel/publications/Paetzel_SPase_Review_ChemRev_2002.pdf

(36) https://www.researchgate.net/publication/7634944_Type_I_signal_peptidase_An_overview

The following I found while searching for the others:

jeff101 Lv 1

July 16, 2016

I only knew about the http://jb.asm.org/content/194/10/2614.full.pdf
article above because jmbrownlee333 gave a link to it in Veteran Chat
a few days ago. Thanks to jmbrownlee333 for sharing this link!

jeff101 Lv 1

July 16, 2016

Is there any evidence that there are NO disulfide bonds in LepB?
Also, how does being gram+ or gram- affect disulfide content?
Finally, is TB gram+ or gram-?

Bruno Kestemont Lv 1

July 17, 2016

This whole discussion seems very complicated (but interesting) to me. Frankly, I just tried designs from "artistic" inspiration, using idealSS and remixes to help the protein show me which direction to move. That gives relatively good scores, but I suppose there is 1 chance in 100,000,000 that my design is the right one!
I suppose that is the purpose of this puzzle: many "non-expert" ("out of the box") solutions from the crowd which, who knows, could inspire the biochemists for some parts of the protein.

BTW, a question: our best scores seem very small compared to the number of segments.

Does Nature always find the minimum energy, or are some natural proteins in a kind of "local minimum"?

Susume Lv 1

July 17, 2016

There is a group of mostly blue sidechains that is extremely conserved across homolog sequences: 156-166, GDNRxxSxDSR. Do we have any idea what these are for, or any clue where they should go?

They are even more strongly conserved than the serine-lysine catalytic dyad. In about a quarter of the homologs, half of that dyad (the lysine) sits in a different part of the sequence than it does in the TB protein (this includes, by the way, the related E. coli protein whose structure is solved).

The Nature article by Paetzel et al that jeff101 posted a link to, which is about the E. coli protein, describes two pockets that help the E. coli protein stick to its targets (which are the same kind of targets our TB protein wants to stick to) - but those are both hydrophobic pockets. Could this mostly blue part of the protein also be involved in sticking to targets? It must do something both important and very specific, or the amino acids would vary more across homologs than they do.
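
For checking where this motif falls in other sequences, here is a minimal Python sketch using a regular expression. It assumes the lowercase x in GDNRxxSxDSR means "any residue" and reports 1-based positions. The function name is made up, and any sequence you pass in is your own input (the real LepB sequence is not included here).

```python
import re

# GDNRxxSxDSR with x treated as a wildcard for any single residue.
MOTIF_RE = re.compile(r"GDNR..S.DSR")


def find_motif(seq):
    """Return (first, last) 1-based residue numbers for each motif match in seq."""
    return [(m.start() + 1, m.end()) for m in MOTIF_RE.finditer(seq)]
```

Run on the puzzle sequence, this should report 156-166 if the motif is where the comment says it is. Run on homolog sequences, it shows whether the same stretch turns up and at what position.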