Experiment results for IL-2R binders

Started by bkoep

bkoep Staff Lv 1

We have lab results from our IL-2R binder experiments! In late 2021, we challenged Foldit players to design a protein binder for the IL-2 receptor, as a strategy to reduce the side effects of cancer immunotherapy. We sourced Foldit solutions to put together a pool of 1997 designs to test for IL-2R binding in the wet lab.

In short, we did not see any strong binders for the IL-2R target.

The data

We tested the binder designs at the UW Institute for Protein Design, using fluorescence activated cell sorting (FACS). You can read more about the FACS technique in this previous blogpost. To recap, a FACS experiment lets us quickly sort through thousands and thousands of designs, which are displayed on yeast cells and tagged with fluorescent markers.

Below is a preview of the raw experiment results. You can download the data for all 1997 designs here.

pdb_id         	counts1 counts2 counts3 counts4 counts5 counts6 counts7
2011731_c0001  	31  	0   	0   	0   	0   	0   	0
2011731_c0007  	125 	0   	0   	0   	0   	0   	0
2011731_c0008  	112 	0   	0   	0   	0   	0   	0
2011731_c0013  	162 	0   	0   	0   	0   	0   	0
2011731_c0018  	270 	0   	0   	0   	0   	0   	0
2011731_c0019  	97  	0   	0   	0   	0   	0   	0
2011731_c0024  	292 	0   	0   	0   	0   	0   	0
2011731_c0026  	146 	0   	0   	0   	0   	0   	0
2011731_c0029  	2   	0   	0   	0   	0   	0   	0
...

The seven “counts” columns correspond to seven different FACS sorts, according to the following schedule:

Sort schedule

  1. Expression
  2. Enrichment at 1000 nM target
  3. Enrichment at 1000 nM target
  4. Binding at 1000 nM target
  5. Binding at 100 nM target
  6. Binding at 10 nM target
  7. Binding at 1 nM target
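
For anyone who wants to poke at the table, here is a minimal Python sketch of how the preview rows could be read and matched to the sort schedule above. It assumes the downloadable file uses the same whitespace-separated layout as the preview; the names `PREVIEW`, `parse_counts` and `binding_hits` are made up for illustration and are not part of any Foldit or IPD tooling.

```python
# Labels copied from the sort schedule; keys match the preview's column names.
SORTS = {
    "counts1": "Expression",
    "counts2": "Enrichment at 1000 nM target",
    "counts3": "Enrichment at 1000 nM target",
    "counts4": "Binding at 1000 nM target",
    "counts5": "Binding at 100 nM target",
    "counts6": "Binding at 10 nM target",
    "counts7": "Binding at 1 nM target",
}

# The nine preview rows from the post (assumed layout: whitespace-separated).
PREVIEW = """\
pdb_id counts1 counts2 counts3 counts4 counts5 counts6 counts7
2011731_c0001 31 0 0 0 0 0 0
2011731_c0007 125 0 0 0 0 0 0
2011731_c0008 112 0 0 0 0 0 0
2011731_c0013 162 0 0 0 0 0 0
2011731_c0018 270 0 0 0 0 0 0
2011731_c0019 97 0 0 0 0 0 0
2011731_c0024 292 0 0 0 0 0 0
2011731_c0026 146 0 0 0 0 0 0
2011731_c0029 2 0 0 0 0 0 0
"""


def parse_counts(text):
    """Return {pdb_id: {column_name: count}} from whitespace-separated text."""
    lines = text.strip().splitlines()
    columns = lines[0].split()[1:]  # drop the pdb_id header
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        rows[fields[0]] = dict(zip(columns, (int(f) for f in fields[1:])))
    return rows


def binding_hits(rows):
    """List designs with at least one read in any binding sort (sorts 4 to 7)."""
    binding_cols = [c for c, label in SORTS.items() if label.startswith("Binding")]
    return [pdb_id for pdb_id, counts in rows.items()
            if any(counts[c] > 0 for c in binding_cols)]


rows = parse_counts(PREVIEW)
print(len(rows), "designs parsed;", len(binding_hits(rows)), "with binding reads")
```

In the preview rows every design has reads only in the expression sort, so the list of designs with binding reads comes back empty.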

Designs by IPD scientists

Alongside the 1997 Foldit designs, we also tested 30,000 designs created by IPD scientists using an automated design method. From the 30,000 IPD designs, we detected 77 binder hits for the IL-2R target.

By and large, these hits reinforce what we already know about binder design. The 77 hits had AlphaFold confidence ranging 80-97% with an average confidence of 92% (vs. 88% among Foldit designs). And the hits all had high Contact Surface, ranging 400-600 with an average of 506 (vs. 432 among Foldit designs).

That's a good sign! Every week we see Foldit players design proteins with similar metrics. And now we're working on hard targets that are difficult for automated design, like the TGF receptor and CD22.

IPD researchers will be following up on the 77 hits to more precisely measure binding mode and affinity, and see if they can be improved for tighter and more specific binding. In the meantime, we'll continue to challenge Foldit players with binder targets! Players can look forward to more design tools, like the recent Neural Net Objective, to help us design ever better binders.

Thank you to all the Foldit players who participated in the IL-2R binder design puzzles. Keep up the great work, and happy folding!

ichwilldiesennamen Lv 1

It is good to hear that some sticky binders were found, even though they did not come from Foldit designs. I know that there is only a small chance that a Foldit design would actually work because we are lacking "throughput".
But it makes me wonder whether something is still profoundly incorrectly modelled in Foldit, which leads to discrepancies between Foldit results and actual realizations. To me, this is the treatment of BUNs in Foldit. If I think back to when the result of an actual, working COVID binder was shown to us in Foldit, the interface between the binder and the target was lit up like a Christmas tree due to the many BUNs present. And that seemed to work in reality.
In current binder puzzles we are encouraged to have almost no BUNs at all. Typically this can be achieved by pushing the protein away from the target. But in most cases this significantly counteracts the CS performance and therefore decreases "stickiness". Since you write in the article that CS is of very high, if not the highest, importance, wouldn't it be a good idea to relax the BUNs objective significantly in favor of CS? Maybe by significantly reducing the radius within which an interfering object triggers a BUN? This would give far more headroom for better CS values.
Could the data from these working binders be used to "train" this radius? BUNs in these solutions seem to be unproblematic in reality, so they should be unproblematic for us in Foldit as well. Maybe this will get us scientifically more relevant solutions in Foldit?!
The other important aspect, getting good AF values, was already addressed with the AF server and the NN tool. So it is possible to get good results in the region of 90, which should be well suited.
Anyway, I would be really happy to see these working solutions in Foldit. There should be a lot to learn from them. Could you make some or all of them available in a sandbox puzzle or similar to flip through?

bkoep Staff Lv 1

Yes, there is some discord about the treatment of BUNS in protein design.

In principle, most of what we know from physics and solved protein structures tells us that real BUNS are exceedingly rare. In practice, it seems that we can get away with design models that include BUNS but still function as binders. The catch is that the model BUNS are inaccurate; in reality our design folds or binds slightly differently from the model, so that all the BUNS are satisfied.

As an example, we can look more closely at the COVID binder. You're right that Foldit flags a lot of BUNS atoms in the LCB1 design model, even though we know that this design tightly binds the SARS-CoV-2 spike. However, in the solved structure of this protein, it seems that all of these interface BUNS are actually satisfied. So the BUNS in the design model are inaccurate; the inaccuracies may come from the approximations that Foldit uses to speed up BUNS calculations, or they may come from slight errors in the design model.

We do not want to disregard BUNS entirely. Maybe some BUNS are inaccurate, but any real BUNS are sure to prevent binding. Without a way to tell which BUNS are inaccurate, our safest course is to avoid all BUNS as best we can. If you are right that players are sacrificing Contact Area to avoid BUNS, then maybe we can improve our success rates by relaxing the BUNS Objective. Perhaps this is worth some experimentation… (Although, I'm not sure that decreasing the solvent "probe radius" is the best way to go about this.)

Unfortunately, we are not currently able to share the successful IL-2R binders that were designed by IPD scientists. I do agree it could be instructive to look at successful binder designs in Foldit. We may be able to put together something like that for some other targets.

ichwilldiesennamen Lv 1

If I understand this correctly, then the folds in Foldit are still quite far from being accurate in reality. Because, as you write, not only is the forming of H-bonds inaccurate (bonds that would eliminate BUNs do not form properly), but the actual structure of the Foldit fold (twist angles, distances etc.) may be off compared to the equivalent real fold? But shouldn't the current optimization towards good AF values eliminate the second issue to the largest extent? Even so, we still have to hope that the H-bonds will form EXACTLY as we design them, because, if I understand you correctly, if even 1 BUN forms in reality then the whole design will probably be ruined? This would explain to me why there is probably only a minute chance that a Foldit binder will actually work.
But it raises the question for me: how much value do the Foldit solutions then actually have for scientific research? And what is the main "screw" to fiddle with in order to improve Foldit in this regard? Are there, for example, recipes we should avoid or use more in order to improve the scientific value of designs? Or can we hope that very good AF values will solve this problem automatically?
Regarding the COVID binder, this would mean that the structure of the thing is 100% as in reality (because you know exactly how it folds), but it may not be positioned and bonded correctly? If so, then there must be some position where all bonds form as they should, with 0 BUNs? That is something I would really like to see.
Don't get me wrong, I like Foldit and the challenges in the puzzles very much, but it would be good to know "where we are now" in terms of the scientific value of the solutions, and what could be improved to get better. I like the prospect of a relaxed BUNs objective. It might be worth a try and could improve CS.
So can I deduce from this that even if we could be sure that the monomer would fold in Foldit EXACTLY as in reality, I can still dump a design if there is a region in the interface where some buried polar sidechain HAS NO CHANCE of finding an H-bond partner to get fully satisfied? Because this would create a BUN for sure. So I would be forced to mutate this sidechain to something nonpolar to safely avoid this (which may not be possible for binder designs). Is the 0-BUNs requirement really that stringent?

bkoep Staff Lv 1

Apologies, I should clarify that when I say "model errors" in my comment above, I am talking about extremely small errors in atomic coordinates, less than 1 Å RMSD. This kind of error usually does not change the protein H-bonds that are modeled. However, these tiny errors can have an outsized effect on solvent accessible surface area, and this is where the BUNS inaccuracies tend to come from. In other words, inaccurate BUNS are usually due to errors in "buriedness" – not errors in H-bonds.
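
To make the "buriedness" point concrete, here is a toy Python sketch (not Foldit's BUNS calculation) of a Shrake-Rupley-style accessibility test: sample points on an atom's probe-expanded sphere and count how many are not blocked by neighboring atoms. The geometry is entirely hypothetical: one central atom surrounded by six neighbors on the coordinate axes, with illustrative radii and the commonly used 1.4 Å water probe. Moving the neighbors outward by 0.5 Å is enough to flip the central atom from fully buried to partly solvent-accessible.

```python
import math

PROBE = 1.4  # water probe radius in Å (common convention; an assumption here)


def sphere_points(n):
    """Roughly uniform points on the unit sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def accessible_fraction(center, radius, neighbors, n_points=5000):
    """Fraction of the probe-expanded surface of one atom left unblocked.

    neighbors is a list of ((x, y, z), radius) tuples.
    """
    expanded = radius + PROBE
    exposed = 0
    for ux, uy, uz in sphere_points(n_points):
        p = (center[0] + expanded * ux,
             center[1] + expanded * uy,
             center[2] + expanded * uz)
        # A sample point is blocked if a probe centered there overlaps a neighbor.
        if not any(math.dist(p, c) < r + PROBE for c, r in neighbors):
            exposed += 1
    return exposed / n_points


def octahedral_shell(d, radius=1.7):
    """Six hypothetical neighbor atoms at distance d along +/- x, y, z."""
    return [((d, 0, 0), radius), ((-d, 0, 0), radius),
            ((0, d, 0), radius), ((0, -d, 0), radius),
            ((0, 0, d), radius), ((0, 0, -d), radius)]


origin = (0.0, 0.0, 0.0)
tight = accessible_fraction(origin, 1.5, octahedral_shell(3.5))
loose = accessible_fraction(origin, 1.5, octahedral_shell(4.0))
print(f"neighbors at 3.5 Å: accessible fraction = {tight:.4f}")
print(f"neighbors at 4.0 Å: accessible fraction = {loose:.4f}")
```

With the neighbors at 3.5 Å every sample point is blocked, so the central atom counts as buried; at 4.0 Å a small fraction of points opens up. A method that calls an atom "buried" below some accessibility cutoff can therefore change its verdict on sub-ångström coordinate differences, even when the modeled H-bonds stay the same.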

Rest assured that proteins designed in Foldit are just as accurate as the best in the field. In the 2019 Foldit protein design paper, we solved structures of 4 Foldit designs, with backbone errors ranging 0.9-1.8 Å RMSD. Compare, for example, against this recent paper from Huang et al. at USTC, with several solved structures from a brand new deep learning design method; the backbone errors in their models range 0.9-2.4 Å RMSD.

This level of accuracy should be sufficient for binder design. (Other applications, like enzyme design, may require greater accuracy.) So we are not currently worried about the accuracy of proteins designed in Foldit. Our biggest hurdle right now is creating models that meet stringent Objectives: high Contact Surface, high AlphaFold confidence, and few BUNS. Unfortunately I can't tell you how to do that, or "which screw to fiddle with" – we are hoping Foldit players can help us figure it out!

The difficulty with BUNS is that the Foldit BUNS Objective is particularly sensitive, and the BUNS Objective is possibly more stringent than it needs to be. For the sake of design quality, we prefer it to be too stringent rather than too lenient. But, of course, extra stringency makes Foldit more challenging (and frustrating) from the player's perspective.

ichwilldiesennamen Lv 1

It is good to hear that the model differences are related to backbone offsets/SASA rather than failing H-bonds. And it is especially good that the accuracy of the Foldit proteins is close to excellent. I didn't know that it was THAT good. Still, I do not fully understand how this can so profoundly influence BUNs. Is my understanding correct that as long as an unsatisfied atom can be "reached" by water, it will not be flagged as a BUN? So "false" BUNs (meaning BUNs that are flagged in Foldit but do not occur in reality) cannot occur for buried atoms that are fully H-bonded in Foldit, but maybe for some unburied ones in or at the interface rim, which are on the verge of becoming a BUN but, since this is not correctly modelled/detected in Foldit, actually form a BUN in reality and therefore prevent "sticking"? Because, as I understand it, the only problematic BUNs are those that do not show in Foldit but actually occur in reality, since Foldit should be the more stringent one here. Still, this would mean that for a successful binder all open atoms in or around the interface (also on the target!) would have to be either fully bonded or reachable by water. I can hardly believe that this was the case for the COVID binder (no matter what Foldit showed). I guess I will have to study it more closely. I will keep in mind that probably not all of the BUNs shown in Foldit will actually become BUNs in reality.
Thanks for trying a relaxed BUNs objective in the current binder puzzle. It is worth a try, and it should enable better values in the other objectives. It should then be possible to bring the binder closer to the target and thereby improve CS, with an acceptable BUNs penalty and an acceptable number of (possibly false) BUNs.
This whole discussion is motivated from my side by the numbers you mentioned. From 30k designs there were 77 successful ones. So if the Foldit designs had had the same "quality", we should have seen about 2k/30k*77 ≈ 5 successful binders among them. But we did not. That made me wonder about the "quality".
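
The back-of-the-envelope estimate above can be written out as a short Python sketch. It uses only the numbers from the post (1997 Foldit designs, 30,000 IPD designs, 77 hits) and adds one assumption that nobody in the thread has claimed: that every design is an independent trial with the same hit probability as the IPD pool. Under that assumption it also gives the chance of seeing zero hits purely by bad luck.

```python
# Numbers quoted in the post.
foldit_designs = 1997
ipd_designs = 30_000
ipd_hits = 77

# Hit rate observed in the IPD pool.
hit_rate = ipd_hits / ipd_designs

# Expected number of Foldit hits if Foldit designs had the same hit rate.
expected_foldit_hits = foldit_designs * hit_rate

# Assumption: independent designs with identical hit probability (binomial model).
# Probability that none of the Foldit designs would bind under that model.
p_zero_hits = (1.0 - hit_rate) ** foldit_designs

print(f"IPD hit rate:          {hit_rate:.4%}")
print(f"Expected Foldit hits:  {expected_foldit_hits:.2f}")
print(f"P(zero Foldit hits):   {p_zero_hits:.4f}")
```

Under this simple model the expected count is about 5, and the probability of zero hits comes out below 1%, so the absence of Foldit hits is unlikely to be chance alone if the two pools really had the same per-design success rate.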

nspc Lv 1

Maybe we need a new tool (not an objective, a tool in Foldit) that computes BUNS with very high precision, even if it takes something like 2 minutes, for example.

We could use it only occasionally, as a "protein analysis tool".

It could help us find real BUNS to fix before sharing with the scientists (or afterwards, if we want a variant), and help detect all the false positives.

We could keep the new scoring system that reduces the penalty for BUNS, because it helps to get better CS.

ichwilldiesennamen Lv 1

If something like this were possible, it would be just great! It would surely be far more helpful to get information only about BUNs that will really form, so one could act on them and try to avoid them. If such an accurate, non-realtime determination of BUNs is possible at all, then this could be the way to go, I guess.