free_radical Lv 1
I do not know…I will look into this.
Closed since over 9 years ago
Intermediate Overall Prediction
This puzzle starts with an unfolded sequence with secondary structure assigned from PSIPRED. The target protein is LepB, which is currently being investigated for drug discovery against tuberculosis (TB). TB is caused by the bacillus Mycobacterium tuberculosis and killed more than 1.5 million people in 2014. Right now, no crystal structure exists for this target. Models created by Foldit players will be used to help solve the structure when crystals become available.
1258 has 7 prolines in residues 117-148 (I think this is the area brow is looking at), in the motif
PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP. It would certainly be interesting if that area could form a proline-rich helix.
Of the 865 homologs that jpred found for our sequence, only 20 have 4 or more prolines in those 7 spots, and only 5 have 5 or more of them, so it is not a common motif. (Note these are not homologs with solved structures, just known sequences.)
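For anyone who wants to repeat this kind of tally, here is a minimal Python sketch. It assumes the homologs have already been aligned and trimmed to the 32-column window covering residues 117-148. The function names and the toy windows are made up for illustration and are not jpred output.

```python
# Motif quoted above; P marks the seven proline spots, X is any residue.
MOTIF = "PXXXPXXXPXXXXXXPXXXPXXXXXXXPXXXP"

# 0-based offsets of the proline spots inside the 32-residue window.
PRO_OFFSETS = [i for i, ch in enumerate(MOTIF) if ch == "P"]


def prolines_in_spots(window: str) -> int:
    """Count how many of the seven motif spots hold a proline in one aligned window."""
    return sum(1 for i in PRO_OFFSETS if i < len(window) and window[i] == "P")


def count_at_least(windows, threshold: int) -> int:
    """Number of aligned windows with at least `threshold` prolines in the motif spots."""
    return sum(1 for w in windows if prolines_in_spots(w) >= threshold)


# Toy aligned windows (hypothetical, for illustration only).
toy_windows = [
    MOTIF.replace("X", "A"),          # all seven spots are proline
    "A" * 32,                         # no prolines at all
    "PAAAPAAAPAAAAAAP" + "A" * 16,    # prolines in four of the seven spots
]
print(count_at_least(toy_windows, 4))
print(count_at_least(toy_windows, 5))
```

Run over the real aligned windows, count_at_least(windows, 4) and count_at_least(windows, 5) would be the two numbers to compare against the 20 and 5 quoted above.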
Residues 23-26 look like they may be an insertion, with only 119 of 855 homologs having all four residues present. Of the 33 homologs having C at residue 26, all of them also have C at residue 23. Of the 36 homologs having C at residue 23, 33 of them have C at residue 26. So this might be a candidate for a disulfide, since the C's strongly tend to occur together or not at all. Not sure what use a disulfide would be on residues so close together, though.
The number of homologs having C in the other places is:
Residue 105: 131 have C
Residue 137: 38 have C
Residue 169: 12 have C
Residue 173: 1 has C
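The pairing of the cysteines at residues 23 and 26 can be put into numbers with a short Python sketch. The counts below are the ones quoted above (36 with C at 23, 33 with C at 26, 33 with both). The helper pair_counts and the toy columns are hypothetical, and assume you have the two alignment columns as strings with one character per homolog.

```python
def pair_counts(col_a: str, col_b: str, residue: str = "C"):
    """Return (both, only_a, only_b) counts for `residue` across two alignment columns."""
    both = only_a = only_b = 0
    for a, b in zip(col_a, col_b):
        if a == residue and b == residue:
            both += 1
        elif a == residue:
            only_a += 1
        elif b == residue:
            only_b += 1
    return both, only_a, only_b


# Counts quoted in the post: 36 homologs have C at residue 23, 33 have C at
# residue 26, and all 33 of the latter also have C at residue 23.
both, only_23, only_26 = 33, 36 - 33, 33 - 33

frac_26_given_23 = both / (both + only_23)  # share of C23 homologs that also have C26
frac_23_given_26 = both / (both + only_26)  # share of C26 homologs that also have C23
print(f"{frac_26_given_23:.3f} {frac_23_given_26:.3f}")

# Toy columns (hypothetical): each character is one homolog's residue at that position.
toy_col_23 = "CCCC-A"
toy_col_26 = "CCC--S"
print(pair_counts(toy_col_23, toy_col_26))
```

The same helper could be pointed at any other pair of columns (105, 137, 169, 173) to see whether those cysteines also travel together.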
Ser94, Ser96, and Lys174 in the article correspond to 13 S, 15 S, and 93 K in our protein. If they form a dyad, does that mean they are in contact? And on the outside of the protein, so they can interact with other proteins?
Apparently in E. coli there are proteins in the periplasm (the stuff between the inner and outer membranes) with CXXC motif disulfides. Their job is to aid in the formation of correct disulfides in other proteins that don't finish folding until they have been transported through the inner membrane and into the periplasm. LepB seems to clip the N-terminus off of proteins entering the periplasm; maybe, with its conserved CXXC motif at 23-26, it also assists in the formation of disulfides.
In http://jb.asm.org/content/194/10/2614.full.pdf cited above,
references 5, 25, 27, and 36 sounded interesting to me. I was
able to download them all from home. The next trick is to read
them. Perhaps some of you will have time to read them before I do:
(5) http://jb.asm.org/content/175/16/4957.full.pdf
(25) http://www.nature.com/nature/journal/v396/n6712/pdf/396707b0.pdf
(27) http://www.sfu.ca/~mpaetzel/publications/Paetzel_SPase_Review_ChemRev_2002.pdf
(36) https://www.researchgate.net/publication/7634944_Type_I_signal_peptidase_An_overview
The following I found while searching for the others:
I only knew about the http://jb.asm.org/content/194/10/2614.full.pdf
article above because jmbrownlee333 gave a link to it in Veteran Chat
a few days ago. Thanks to jmbrownlee333 for sharing this link!
Is there any evidence that there are NO disulfide bonds in LepB?
Also, how does being gram+ or gram- affect disulfide content?
Finally, is TB gram+ or gram-?
All your discussion seems very complicated (but interesting) to me. Frankly, I just tried designs from "artistic" inspiration, using idealSS and remixes to help the protein show me some direction to move in. It gives relatively good scores, but I suppose there is 1 chance in 100,000,000 that my design is the right one!
I suppose that this is the purpose of this puzzle: many "non-expert" ("out of the box") solutions from the crowd that, who knows, could inspire the biochemists for some parts of the protein.
BTW, a question: our best scores seem very small as compared to the number of segments.
Does Nature always find the minimum energy, or are some natural proteins in a kind of "local minimum"?
There is a group of mostly blue sidechains that is extremely conserved across homolog sequences: 156-166, GDNRxxSxDSR. Do we have any idea what these are for, or any clue where they should go?
They are even more strongly conserved than the serine-lysine catalytic dyad: in about a quarter of the homologs (including, by the way, the related E. coli protein whose structure is solved), the lysine half of the dyad sits in a different part of the sequence than it does in the TB protein.
The Nature article by Paetzel et al. that jeff101 posted a link to, which is about the E. coli protein, describes two pockets that help the E. coli protein stick to its targets (which are the same kind of targets our TB protein wants to stick to) - but those are both hydrophobic pockets. Could this mostly blue part of the protein also be involved in sticking to targets? It must do something both important and very specific, or the amino acids would vary more across homologs than they do.
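If anyone wants to check where the GDNRxxSxDSR stretch falls in other sequences, a regular expression is enough. This is a minimal Python sketch, assuming an ungapped one-letter sequence string. The function name and the toy sequence are hypothetical, and the toy sequence is not the real LepB sequence.

```python
import re

# GDNRxxSxDSR from the post, with each x written as '.' (any single residue).
MOTIF_RE = re.compile(r"GDNR..S.DSR")


def find_motif(sequence: str):
    """Return 1-based start positions of every motif match in an ungapped sequence."""
    return [m.start() + 1 for m in MOTIF_RE.finditer(sequence)]


# Toy sequence (hypothetical, for illustration only).
toy = "AAAGDNRAKSLDSRAAA"
print(find_motif(toy))
```

On our puzzle sequence this should report a start at 156 if the motif sits at 156-166 as described above.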