
1258: Tuberculosis Challenge - Phase 1

Closed over 9 years ago

Intermediate Overall Prediction

Summary


Created
July 11, 2016
Expires
Max points
100
Description

This puzzle starts with an unfolded sequence with secondary structure assigned from PSIPRED. The target protein is LepB, which is currently being investigated for drug discovery against tuberculosis (TB). TB is caused by the bacillus Mycobacterium tuberculosis and killed more than 1.5 million people in 2014. Right now, no crystal structure exists for this target. Models created by Foldit players will be used to help solve the structure when crystals become available.

Top groups


  1. Contenders 100 pts. 9,670
  2. Anthropic Dreams 73 pts. 9,563
  3. Go Science 52 pts. 9,497
  4. Gargleblasters 36 pts. 9,362
  5. Deleted group pts. 9,235
  6. L'Alliance Francophone 16 pts. 9,218
  7. Void Crushers 10 pts. 9,063
  8. Beta Folders 6 pts. 9,058
  9. Italiani Al Lavoro 4 pts. 8,255
  10. HMT heritage 2 pts. 7,801

  1. gitwut Lv 1 100 pts. 9,615
  2. Galaxie Lv 1 98 pts. 9,547
  3. Susume Lv 1 96 pts. 9,517
  4. Bruno Kestemont Lv 1 94 pts. 9,494
  5. Deleted player pts. 9,403
  6. dembones Lv 1 90 pts. 9,379
  7. MurloW Lv 1 88 pts. 9,329
  8. brow42 Lv 1 86 pts. 9,309
  9. tokens Lv 1 84 pts. 9,272
  10. grogar7 Lv 1 82 pts. 9,261

Comments


Susume Lv 1

1258 has 7 prolines in residues 117-148 (I think this is the area brow is looking at), in the motif PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP. It would certainly be interesting if that area could form a proline-rich helix.

Of the 865 homologs that jpred found for our sequence, only 20 have 4 or more prolines in those 7 spots, and only 5 have 5 or more of them, so it is not a common motif. (Note these are not homologs with solved structures, just known sequences.)
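
For anyone who wants to repeat this kind of count, here is a minimal sketch. It assumes each homolog has already been aligned to our sequence and cut down to the 32-residue window covering residues 117-148; the function names and the toy windows are made up for illustration and are not jpred output.

```python
MOTIF = "PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP"  # residues 117-148, as quoted above
MOTIF_START = 117


def proline_sites(motif=MOTIF, start=MOTIF_START):
    """Return the 1-based residue numbers of the P positions in the motif."""
    return [start + i for i, aa in enumerate(motif) if aa == "P"]


def count_homologs_with_prolines(windows, min_prolines, motif=MOTIF):
    """Count aligned windows with at least min_prolines P's at the motif's P columns."""
    offsets = [i for i, aa in enumerate(motif) if aa == "P"]
    total = 0
    for window in windows:
        hits = sum(1 for i in offsets if i < len(window) and window[i] == "P")
        if hits >= min_prolines:
            total += 1
    return total


# Toy windows (hypothetical), built from the motif so the columns line up.
all_seven = MOTIF.replace("X", "A")                           # P at all 7 sites
first_four = all_seven[:19] + all_seven[19:].replace("P", "G")  # P at first 4 sites only
no_prolines = all_seven.replace("P", "G")                     # no P at any site
toy_windows = [all_seven, first_four, no_prolines]

print(proline_sites())
print(count_homologs_with_prolines(toy_windows, 4))
print(count_homologs_with_prolines(toy_windows, 5))
```

With real data, the windows would come from the alignment columns that match residues 117-148 of our sequence, gaps included.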

Susume Lv 1

Residues 23-26 look like they may be an insertion, with only 119 of 855 homologs having all four residues present. Of the 33 homologs having C at residue 26, all of them also have C at residue 23. Of the 36 homologs having C at residue 23, 33 of them have C at residue 26. So this might be a candidate for a disulfide, since the C's strongly tend to occur together or not at all. Not sure what use a disulfide would be on residues so close together, though.

The number of homologs having C in the other places is:
Residue 105: 131 have C
Residue 137: 38 have C
Residue 169: 12 have C
Residue 173: 1 has C
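
The "together or not at all" pattern at 23 and 26 can be checked with a simple tally. This is only a sketch: it assumes the homologs are aligned so that string position i-1 is residue i of our sequence, and `cooccurrence` and the toy sequences are hypothetical. The fractions at the end use only the counts quoted above.

```python
def cooccurrence(seqs, pos_a, pos_b, residue="C"):
    """Tally a residue at two 1-based alignment positions.

    Returns (both, only_a, only_b, neither).
    """
    both = only_a = only_b = neither = 0
    for s in seqs:
        has_a = len(s) >= pos_a and s[pos_a - 1] == residue
        has_b = len(s) >= pos_b and s[pos_b - 1] == residue
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


# Toy aligned sequences (hypothetical): 22 filler residues, then positions 23-26.
toy = [
    "A" * 22 + "CGGC",  # C at 23 and 26
    "A" * 22 + "CGGA",  # C at 23 only
    "A" * 22 + "AGGA",  # no C
    "A" * 22 + "----",  # gap where the possible insertion would be
]
print(cooccurrence(toy, 23, 26))

# Counts quoted in the post: 36 homologs with C at 23, 33 with C at 26, 33 with both.
c23, c26, c_both = 36, 33, 33
frac_26_given_23 = c_both / c23  # share of C23 homologs that also have C26
frac_23_given_26 = c_both / c26  # share of C26 homologs that also have C23
print(round(frac_26_given_23, 3), frac_23_given_26)
```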

Susume Lv 1

Ser94, Ser96, and Lys174 in the article correspond to 13 S, 15 S, and 93 K in our protein. If they form a dyad, does that mean they are in contact? And on the outside of the protein, so they can interact with other proteins?
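
The three pairs above all differ by the same amount, which is easy to verify. A small sketch, assuming the offset stays constant only across stretches with no insertions or deletions between the two numbering schemes; the helper names are invented for illustration.

```python
# (residue number in the article, residue number in our puzzle sequence), from the post
PAIRS = [(94, 13), (96, 15), (174, 93)]


def numbering_offset(pairs):
    """Return the single article-minus-puzzle offset, or raise if the pairs disagree."""
    offsets = {article - puzzle for article, puzzle in pairs}
    if len(offsets) != 1:
        raise ValueError(f"pairs imply more than one offset: {sorted(offsets)}")
    return offsets.pop()


def article_to_puzzle(article_number, offset):
    """Convert an article residue number to puzzle numbering."""
    return article_number - offset


offset = numbering_offset(PAIRS)
print(offset)
print([article_to_puzzle(n, offset) for n in (94, 96, 174)])
```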

Susume Lv 1

Apparently E. coli has proteins in the periplasm (the stuff between the inner and outer membranes) with CXXC motif disulfides. Their job is to aid the formation of correct disulfides in other proteins that don't finish folding until they have been transported through the inner membrane and into the periplasm. LepB seems to clip the N-terminus off of proteins entering the periplasm; maybe with its conserved CXXC motif at 23-26 it also assists in the formation of disulfides.
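
Scanning a sequence for CXXC motifs takes only a few lines. This is a sketch using Python's standard `re` module; `find_cxxc` and the toy sequences are hypothetical, not the real LepB sequence.

```python
import re


def find_cxxc(sequence):
    """Return 1-based start positions of every C-X-X-C motif, overlaps included."""
    # The lookahead lets overlapping motifs (e.g. CAACAAC) all be reported.
    # [A-Z] requires real residues, so alignment gaps ('-') do not count as X.
    return [m.start() + 1 for m in re.finditer(r"(?=C[A-Z]{2}C)", sequence)]


toy_sequence = "A" * 22 + "CGGC" + "AAA"  # hypothetical: motif placed at 23-26
print(find_cxxc(toy_sequence))
print(find_cxxc("CAACAAC"))
```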

jeff101 Lv 1

In http://jb.asm.org/content/194/10/2614.full.pdf cited above,
references 5, 25, 27, and 36 sounded interesting to me. I was
able to download them all from home. The next trick is to read
them. Perhaps some of you will have time to read them before I do:

(5) http://jb.asm.org/content/175/16/4957.full.pdf

(25) http://www.nature.com/nature/journal/v396/n6712/pdf/396707b0.pdf

(27) http://www.sfu.ca/~mpaetzel/publications/Paetzel_SPase_Review_ChemRev_2002.pdf

(36) https://www.researchgate.net/publication/7634944_Type_I_signal_peptidase_An_overview

The following I found while searching for the others:

jeff101 Lv 1

Is there any evidence that there are NO disulfide bonds in LepB?
Also, how does being gram+ or gram- affect disulfide content?
Finally, is TB gram+ or gram-?

Bruno Kestemont Lv 1

All your discussion seems very complicated (but interesting) to me. Frankly, I just tried designs from "artistic" inspiration, using idealSS and remixes to help the protein show me some direction to move. It gives relatively good scores, but I suppose there is 1 chance in 100,000,000 that my design is the right one!
I suppose that this is the purpose of this puzzle: many "non-expert" crowd solutions ("out of the box") and, who knows, this could inspire the biochemists for some parts of the protein.

BTW, a question: our best scores seem very small compared to the number of segments.

Does Nature always find the minimum energy, or are some natural proteins in a kind of "local minimum"?

Susume Lv 1

There is a group of mostly blue sidechains that is extremely conserved across homolog sequences: 156-166, GDNRxxSxDSR. Do we have any idea what these are for, or any clue where they should go?

They are even more strongly conserved than the serine-lysine catalytic dyad, half of which (the lysine) sits in a different part of the sequence than it does in the TB protein in about a quarter of the homologs (including, by the way, the related E. coli protein whose structure is solved).

The Nature article by Paetzel et al. that jeff101 posted a link to, which is about the E. coli protein, describes two pockets that help the E. coli protein stick to its targets (which are the same kind of targets our TB protein wants to stick to), but those are both hydrophobic pockets. Could this mostly blue part of the protein also be involved in sticking to targets? It must do something both important and very specific, or the amino acids would vary more across homologs than they do.
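
One rough way to put a number on "how conserved" is the share of homologs carrying the most common residue in each alignment column. The sketch below assumes equal-length, already-aligned windows over residues 156-166; the function name and the four toy sequences are hypothetical and only mimic the GDNRxxSxDSR pattern.

```python
from collections import Counter


def column_conservation(seqs):
    """For each alignment column, return (most common residue, fraction of sequences with it)."""
    n = len(seqs)
    result = []
    # zip(*seqs) walks the alignment column by column (it stops at the shortest sequence).
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, count / n))
    return result


# Toy aligned windows (hypothetical), 11 residues each.
toy_windows = [
    "GDNRAASADSR",
    "GDNRKLSTDSR",
    "GDNRQESVDSR",
    "GDNKQESVDSR",
]
for number, (residue, fraction) in enumerate(column_conservation(toy_windows), start=156):
    print(number, residue, fraction)
```

Columns near 1.0 are the strongly conserved ones; the x positions of the motif should come out lower.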