Feedback on the last Marburg design puzzle

Started July 01, 2015 by v_mulligan

v_mulligan Lv 1

July 01, 2015

Happy Canada Day to our Canadian players!

We're about to give you a Marburg design puzzle, but I first wanted to provide a little bit of feedback on the last Marburg puzzle that we ran (puzzle 1073: Marburg Binder Design with Disulfides). We got some pretty good-looking designs back, but curiously, all of the best designs were ones that were shared with us using the "Share with Scientists" feature; they were not the top-scoring designs. This reflects one of the big challenges, for us, in giving these puzzles to the Foldit community: it's hard for us to figure out how to translate our qualitative idea of what a "good" design will look like into a set of quantitative rules that allow a computer to assign a higher score to a better design, and we didn't quite get it right on that puzzle.

We're going to revisit the 37-residue Marburg binder design in the new puzzle, but this time, we'll tweak the scoring and the filters to try to promote the features that we want to see. In particular, we'd like to see:

– Compactness. Our favourite designs were the ones that were not so elongated. It's good to try to imagine how the peptide would look in the absence of the Marburg glycoprotein. If its structure depends on its interaction with the glycoprotein, and if it's floppy and unstructured in isolation, this means that the binding event must confer order on the peptide. This is entropically unfavourable (it's hard to increase the order of a physical system), and the price of ordering the peptide must be paid by a reduction in binding affinity. So nice, compact peptides that are likely to retain their structure in isolation are best for binding. We'll be including a core existence filter in the new puzzle to help to promote this.
– Long-range disulfides. Disulfide bonds are a great way of enforcing order in a peptide, but this only works if they're holding together parts that would otherwise be inclined to move apart. A disulfide between two amino acids that are close together in linear sequence doesn't do a whole lot. Often, the best disulfide-bonding patterns (and the ones frequently seen in structured natural peptides) are "leapfrog" patterns. For example, you might link residue 1 to residue 20, and residue 10 to residue 30. This maximizes the linear separation between disulfide-bonded cysteine residues while minimizing the length of stretches of amino acids that contain no disulfide.
– Lots of secondary structure. Many of the best designs had at least one helix lying across a sheet of at least two strands.
– Lots of proline residues and few glycine residues. Proline helps to rigidify a peptide in a desired conformation. Glycine, on the other hand, is too flexible, and increases the number of possible alternative conformations, making folding into a single unique conformation more difficult. Although glycine is sometimes necessary (there are sometimes loops in which one residue has to be in an unusual conformation that no other amino acid type could access), we'd like to see few glycine residues if possible. In the new puzzle, we'll be increasing the penalty for glycine and increasing the bonus for proline.
– Void-free packing. We recommend turning on the cavity or void display in the view options. This draws red spheres wherever there is an empty space in the core of the peptide being designed or at the interface with the Marburg glycoprotein. These red spheres are bad – there shouldn't be empty spaces! If you can eliminate them, please do so (you should find that it helps your score, too).
– Additional interactions with the glycoprotein. Although much of the interaction with the glycoprotein comes from the loop taken from the antibody crystal structure, the peptide is likely to have more affinity and specificity for its target if you can design additional interactions with the surface of the glycoprotein. Shape and charge complementarity beyond the antibody loop are good things! Now, this comes with a caveat: don't go for additional interactions at the expense of ordered structure. A peptide that lies entirely along the glycoprotein surface, forming lots of interactions with the surface and no interactions with itself, is likely to be too disordered to bind. If you can design in additional interactions with the surface that come off of well-ordered secondary structure elements that pack well with the rest of the peptide, though, that would be terrific.
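
As a rough illustration of the "leapfrog" idea in the long-range disulfides point above, here is a minimal Python sketch that compares disulfide patterns. It is not part of Foldit or of the puzzle scoring. The function name `disulfide_pattern_stats` is hypothetical, and it assumes one interpretation of "stretches that contain no disulfide": the longest run of residues between consecutive cysteines (or between a cysteine and a chain end).

```python
def disulfide_pattern_stats(pairs, n_residues):
    """Summarize a disulfide pattern (hypothetical helper, not a Foldit function).

    pairs      -- list of (i, j) residue numbers (1-based) joined by a disulfide
    n_residues -- total length of the peptide

    Returns (min_separation, longest_gap):
      min_separation -- smallest sequence distance between bonded cysteines
      longest_gap    -- longest run of residues with no cysteine in it
                        (an assumed reading of "stretch with no disulfide")
    """
    # Smallest linear separation spanned by any one disulfide.
    min_separation = min(abs(j - i) for i, j in pairs)

    # All cysteine positions, with virtual boundaries just outside the chain.
    cysteines = sorted(pos for pair in pairs for pos in pair)
    bounds = [0] + cysteines + [n_residues + 1]

    # Residues strictly between neighbouring boundaries carry no disulfide.
    longest_gap = max(b - a - 1 for a, b in zip(bounds, bounds[1:]))
    return min_separation, longest_gap


# The leapfrog example from the post, on a 37-residue peptide.
leapfrog = disulfide_pattern_stats([(1, 20), (10, 30)], 37)

# A made-up pattern with two short-range disulfides, for comparison.
local = disulfide_pattern_stats([(1, 5), (30, 34)], 37)

print("leapfrog:", leapfrog)
print("local:   ", local)
```

Under these assumptions, the leapfrog pattern gives a much larger minimum separation and a much shorter longest gap than the short-range pattern, which is the trade-off described above.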

Here are some of the solutions that we particularly liked:

retiredmichael of the Beta Folders had a nice design in which an alpha helix lies across one face of a three-stranded beta-sheet. This design had lots of secondary structure, a pretty good hydrophobic core (in which most of the secondary structure elements contribute), a nice disulfide bonding pattern, and relatively few voids in the interior of the peptide. The only criticism I'd have of this design is that one strand in the sheet does not contribute to the hydrophobic core; aside from this, this design looks very good.
Design by retiredmichael of the Beta Folders.

mimi of the Contenders created a very interesting topology with helices sandwiching a three-stranded beta-sheet. This creates a lot of opportunities for interaction with the surface of the Marburg glycoprotein. The one thing to watch out for, here, is that this design has a bit less of a hydrophobic core, which might hinder its folding.
Design by mimi of the Contenders.

MurloW of Anthropic Dreams made a design with a topology somewhat similar to retiredmichael's, albeit with the helix in a different orientation and a fourth strand on the beta-sheet. This also opens up possibilities for interactions with the glycoprotein – but be careful about those voids at the interface!
Design by MurloW of Anthropic Dreams.

eusair's design put the helix on the opposite face of the beta-sheet to that chosen by retiredmichael or MurloW, and this allowed eusair to stick a valine very nicely into a cleft created by two prolines on the glycoprotein surface. I like this design quite a bit – but be careful about those edge strands, which don't always contribute to the hydrophobic core.
Design by eusair.

Congratulations to all of these players, and good luck to everyone on the next puzzle!

Bruno Kestemont Lv 1

July 02, 2015

Your comments are useful: they encourage the quoted players, they encourage the rest of us to share with scientists, and they should help us make better designs (even if the scoring does not reflect everything you prefer).

spmm Lv 1

July 08, 2015

Why such a small fragment? Only 37 residues?

spmm Lv 1

July 08, 2015

For the players who are not protein scientists: in Foldit, 'strands' are usually just called sheets, so the discussion of strands above may be a bit confusing.
A 'fourth strand on the beta-sheet', for example.

http://proteinstructures.com/Structure/Structure/secondary-sructure.html This may or may not help.

v_mulligan Lv 1

July 08, 2015

Thanks! It's hard to avoid jargon sometimes. Yes, the individual pieces of a sheet are called "strands"; many strands make a sheet.

v_mulligan Lv 1

July 08, 2015

We'd ultimately like to go as small as we can. In protein design, there are some trade-offs. If we make a giant protein, it's easy to design for thermal stability (the protein won't spontaneously unfold), and to create a giant binding interface that recognizes the viral protein with terrific specificity and affinity. Unfortunately, a giant protein is hard to make (especially in large amounts), can be more prone to oxidative damage and degradation over time, and is hard to get into the body. (It won't cross the gut-blood barrier, for example, so it would have to be injected, which limits its use as a drug, particularly since organizations like Doctors Without Borders like to avoid sharp objects like needles that can puncture protective clothing when caring for Ebola and Marburg patients.) The other big problem with big proteins is that our immune system is very good at recognizing large foreign proteins. Normally, a foreign protein in the bloodstream would be from a pathogen, so it's something that our immune system has evolved to recognize and clear.

Smaller peptides are better at evading the immune system, can be more shelf-stable, and can cross barriers more effectively, allowing easier administration; they're also easier to produce in large amounts. There's another advantage, too: although we have to rely on bacteria or yeast to produce large proteins for us, small peptides can be synthesized chemically. This means that we can include unnatural amino acids, covalent cross-links, or a terminal peptide bond (to make a cyclic peptide), all of which can increase the resistance of the peptide to degradation or confer other desired properties. The downside, though, is that it's harder to engineer specificity and affinity in something small.

We'll be posting additional puzzles with even fewer residues, so the challenge will continue!

blivens Lv 1

July 17, 2015

Interesting point about there being an entropic penalty for folding upon binding. I've seen estimates that ~25% of proteins have large disordered regions, and at least some of these must fold upon binding. So maybe the entropic penalty is actually cancelled out by faster identification of the binding site or some other functional advantage.

Not that we should try to design intrinsically disordered proteins, but it's interesting to think about.
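
To put a rough number on the entropic penalty discussed in this thread, here is a small Python sketch using the standard thermodynamic relation between free energy and the dissociation constant, Kd = exp(ΔG / RT). The function name `affinity_loss_factor` and the example penalty values are illustrative assumptions, not measurements for any Marburg binder.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def affinity_loss_factor(penalty_kcal_per_mol, temperature_k=298.15):
    """Fold-increase in Kd (i.e. fold-loss in affinity) caused by an extra
    free-energy cost, such as the cost of ordering a floppy peptide on binding.

    Hypothetical helper: applies Kd_new / Kd_old = exp(penalty / RT).
    """
    return math.exp(penalty_kcal_per_mol / (R_KCAL * temperature_k))


# Illustrative penalties only; real values depend on the peptide.
for penalty in (0.5, 1.4, 2.8):
    factor = affinity_loss_factor(penalty)
    print(f"{penalty:.1f} kcal/mol penalty -> Kd worse by about {factor:.0f}x")
```

At room temperature, each 1.4 kcal/mol or so of ordering cost weakens binding roughly ten-fold, which is why a peptide that is already structured in isolation is expected to bind more tightly.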