1258: Tuberculosis Challenge - Phase 1

Closed since over 9 years ago

Intermediate Overall Prediction

Summary


Created
July 11, 2016
Expires
Max points
100
Description

This puzzle starts with an unfolded sequence with secondary structure assigned from PSIPRED. The target protein is LepB, which is currently being investigated for drug discovery against tuberculosis (TB). TB is caused by the bacillus Mycobacterium tuberculosis and killed more than 1.5 million people in 2014. Right now, no crystal structure exists for this target. Models created by Foldit players will be used to help solve the structure when crystals become available.

Top groups


  11. SETI.Germany 1 pt. 7,637
  12. Deleted group pts. 7,170
  13. Bad Monkey 1 pt. 6,123
  14. Russian team 1 pt. 4,111
  15. Team Mexico 1 pt. 3,025
  16. Rechenkraft.net 1 pt. 0
  17. DSN @ Home 1 pt. 0

  1. gitwut Lv 1 100 pts. 9,615
  2. Galaxie Lv 1 98 pts. 9,547
  3. Susume Lv 1 96 pts. 9,517
  4. Bruno Kestemont Lv 1 94 pts. 9,494
  5. Deleted player pts. 9,403
  6. dembones Lv 1 90 pts. 9,379
  7. MurloW Lv 1 88 pts. 9,329
  8. brow42 Lv 1 86 pts. 9,309
  9. tokens Lv 1 84 pts. 9,272
  10. grogar7 Lv 1 82 pts. 9,261

Comments


Susume Lv 1

Primary sequence is:
MLTFVARPYLIPSESMEPTLHGCSTCVGDRIMVDKLSYRFGSPQPGDVIV
FRGPPSWNVGYKSIRSHNVAVRWVQNALSFIGFVPPDENDLVKRVIAVGG
QTVQCRSDTGLTVNGRPLKEPYLDPATMMADPSIYPCLGSEFGPVTVPPG
RVWVMGDNRTHSADSRAHCPLLCTDDPLPGTVPVANVIGKARLIVWPPSR
WGVVRSVNPQQGR

Bautho Lv 1

Do the scientists know whether we have to form disulfide bridges for this protein?
It would be really helpful to know, because there are 6 cysteines in the sequence.
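
The cysteine count can be checked directly against the sequence posted above. The following is only a minimal Python sketch: the variable names are illustrative, and positions are 1-based within the puzzle sequence, not the full-length protein.

```python
# Puzzle 1258 sequence as posted in the comments above.
SEQUENCE = (
    "MLTFVARPYLIPSESMEPTLHGCSTCVGDRIMVDKLSYRFGSPQPGDVIV"
    "FRGPPSWNVGYKSIRSHNVAVRWVQNALSFIGFVPPDENDLVKRVIAVGG"
    "QTVQCRSDTGLTVNGRPLKEPYLDPATMMADPSIYPCLGSEFGPVTVPPG"
    "RVWVMGDNRTHSADSRAHCPLLCTDDPLPGTVPVANVIGKARLIVWPPSR"
    "WGVVRSVNPQQGR"
)

# 1-based positions of every cysteine (one-letter code "C").
cysteine_positions = [i for i, aa in enumerate(SEQUENCE, start=1) if aa == "C"]

print(len(SEQUENCE), "residues")
print(len(cysteine_positions), "cysteines at positions", cysteine_positions)
```

Knowing where the cysteines sit only says which disulfides are possible in principle; it says nothing about whether any of them actually form.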

spvincent Lv 1

I think a puzzle this size is quite intractable, particularly when the secondary structure is so ill-defined. How about a few starting structures in the alignment palette?

free_radical Lv 1

Hi Batz,

There is no evidence that there are disulfide bonds in LepB. One of the scientists working on this problem responded:

To predict whether a protein forms disulfide bonds, scientists look at which cellular compartment the protein resides in and at that compartment's oxidative properties.

  1. Check whether the TB bacterium is Gram-positive or Gram-negative
  2. Predict the protein's compartment
  3. Check for the presence of disulfides

jeff101 Lv 1

With 6 cysteines, there is
1 way to have no disulfides,
15 ways to have 1 disulfide,
45 ways to have 2 disulfides, and
15 ways to have 3 disulfides.

If we number the cysteines 1-6
so that 12,34 means 2 disulfides
(one between cysteines 1 & 2 and
another between cysteines 3 & 4),
below are all the different ways:

1 disulfide (15 ways): 
12 13 14 15 16 
23 24 25 26
34 35 36
45 46
56

2 disulfides (45 ways):
12,34 12,35 12,36 12,45 12,46 12,56
13,24 13,25 13,26 13,45 13,46 13,56
14,23 14,25 14,26 14,35 14,36 14,56
15,23 15,24 15,26 15,34 15,36 15,46
16,23 16,24 16,25 16,34 16,35 16,45
23,45 23,46 23,56
24,35 24,36 24,56
25,34 25,36 25,46
26,34 26,35 26,45
34,56 
35,46
36,45

3 disulfides (15 ways):
12,34,56 12,35,46 12,36,45
13,24,56 13,25,46 13,26,45
14,23,56 14,25,36 14,26,35
15,23,46 15,24,36 15,26,34
16,23,45 16,24,35 16,25,34
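
The counts above can be reproduced by brute force. Here is a small Python sketch, assuming only that each cysteine can take part in at most one disulfide; the function name `disulfide_patterns` is made up for this illustration.

```python
from itertools import combinations


def disulfide_patterns(cysteines, n_bonds):
    """Return every set of n_bonds disulfides in which no cysteine is used twice."""
    pairs = list(combinations(cysteines, 2))  # all 15 possible single bonds for 6 cysteines
    patterns = []
    for bonds in combinations(pairs, n_bonds):
        used = [c for bond in bonds for c in bond]
        if len(used) == len(set(used)):  # keep only patterns with no shared cysteine
            patterns.append(bonds)
    return patterns


cysteines = range(1, 7)  # cysteines numbered 1-6, as in the lists above
for n in range(4):
    patterns = disulfide_patterns(cysteines, n)
    print(n, "disulfide(s):", len(patterns), "ways")
    # Print each pattern in the same "12,34" notation used above.
    print(" ".join(",".join(f"{a}{b}" for a, b in p) for p in patterns))
```

The same numbers follow from picking 2k of the 6 cysteines and pairing them up: C(6,2k) x (2k-1)!! gives 1, 15, 45, and 15 for k = 0, 1, 2, 3.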

bandsomeSS (https://fold.it/portal/recipe/101275)
is a Recipe for banding disulfide bonds.

bandsome (https://fold.it/portal/recipe/43861)
has a web page with discussion and links about
disulfide bonds. It says that more disulfide
bonds form in an oxidizing environment (like in
the blood, spinal fluid, extracellular medium,
lumen of the rough endoplasmic reticulum,
mitochondrial intermembrane space, secretory
proteins, lysosomal proteins, exoplasmic domains
of membrane proteins, hair, and feathers) than
in a reducing environment (like in the cytosol
and most cellular compartments).

jeff101 Lv 1

This puzzle has about twice the usual number of amino acids in it.
Why not give us about twice the usual amount of time to work on it?
Also, please let us load this puzzle's solutions into future puzzles.

Thanks!

free_radical Lv 1

Hi Spvincent,

I think you are hitting upon one of the reasons why this specific target is so difficult. This target is bound to the cell membrane of Mycobacterium tuberculosis through an 80-residue linker. For this puzzle, we have already removed that 80-residue linker to focus on the fold of the protein. Additionally, the protein is only ~25% identical to the closest homolog with a crystal structure.

For the first phase, I would like to see what types of interesting ideas come from Foldit. Why? Because this is a difficult problem and Foldit players think about the puzzles in a different way than I or the other scientists working on this puzzle do. This is a great asset, especially in a field where ideas have become a little stagnant. I value the models and ideas from Foldit players.

Phase 2 will incorporate folded proteins from phase 1 along with a homology model that I created and a couple of template proteins (remember, the templates are bad because of the low homology).

free_radical Lv 1

Hi Jeff101,

I have no problem extending the puzzle for a week. Phase 2 will include models from phase 1 to work on. I know this is a hard puzzle (I have been playing it too, and for the life of me I can't get my score that high…).

jeff101 Lv 1

Several recent De-novo puzzles (1252,1243,1237,1231,1224) have been followed by
Predicted Contacts puzzles (1255,1246,1240/1240b,1234,1227), where Contact Maps
are predicted using co-evolution data. Will there be a Predicted Contacts puzzle
for 1258's protein as well?

jeff101 Lv 1

The above sequence of 213 amino acids seems to be residues 82-294 in Fig.4 of
http://jb.asm.org/content/194/10/2614.full.pdf that includes boxes B, C, D, and E.
Box B contains Ser94 & Ser96 while Box D contains Lys174.

p.2617 of the above article says "Amino acid alignments of LepB with SPaseI
of other bacterial species identified a short intracellular domain and a large
extracellular domain containing the conserved regions boxes B, C, D, and E. The
predicted catalytically active residues of the characteristic serine-lysine dyad
are located in box B and box D." p.2617 also says "These results indicate that
the Ser94, Ser96, and Lys174 residues are essential for LepB function."

p.2618 says "we hypothesize that Ser94 and Lys174 form the catalytic center of the
protein, while Ser96 likely stabilizes the interaction with the preprotein and the
catalytic serine residue". pp.2618-9 say "the active site is located on the outside
of the cytoplasmic membrane, making it relatively accessible for small molecules."

Finally, p.2614 says "Stepwise translocation of the preprotein across the membrane
is driven by SecA-mediated ATP hydrolysis. After translocation, LepB cleaves the
signal peptide from the preprotein, releasing the mature protein into the periplasm."

All these things make me think the protein in Puzzle 1258 is an extracellular one.
They also make me think Ser94 and Lys174 should be close to each other.
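
To find these residues in the puzzle itself, the paper's numbering has to be shifted. The Python sketch below assumes, as guessed above, that puzzle residue 1 corresponds to residue 82 of full-length LepB; that offset is an inference from the sequence length, not something the scientists confirmed, and the helper name `puzzle_index` is made up for this illustration.

```python
# Puzzle 1258 sequence as posted earlier in the comments.
SEQUENCE = (
    "MLTFVARPYLIPSESMEPTLHGCSTCVGDRIMVDKLSYRFGSPQPGDVIV"
    "FRGPPSWNVGYKSIRSHNVAVRWVQNALSFIGFVPPDENDLVKRVIAVGG"
    "QTVQCRSDTGLTVNGRPLKEPYLDPATMMADPSIYPCLGSEFGPVTVPPG"
    "RVWVMGDNRTHSADSRAHCPLLCTDDPLPGTVPVANVIGKARLIVWPPSR"
    "WGVVRSVNPQQGR"
)

# ASSUMPTION: the puzzle covers residues 82-294 of full-length LepB,
# so puzzle residue 1 is full-length residue 82.
FIRST_FULL_LENGTH_RESIDUE = 82


def puzzle_index(full_length_number):
    """Convert a full-length LepB residue number to a 1-based puzzle residue number."""
    return full_length_number - FIRST_FULL_LENGTH_RESIDUE + 1


for name, number in [("Ser94", 94), ("Ser96", 96), ("Lys174", 174)]:
    idx = puzzle_index(number)
    # SEQUENCE is 0-based in Python, so subtract 1 to read the residue letter.
    print(name, "-> puzzle residue", idx, "which is", SEQUENCE[idx - 1])
```

With this offset the three positions come out as S, S, and K in the puzzle sequence, which is consistent with the 82-294 assignment.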