Foldit

bkoep Staff Lv 1

June 30, 2020

The results from our IL6R binder experiment are back! This experiment tested 100 Foldit designs from the first two rounds of our Coronavirus Anti-inflammatory puzzles, to see if any of them bind to the IL6R target.

In short, we did not see any successful binding from the Foldit designs. This is unfortunate, but we should not be too discouraged! Read on for more details about the experiment, and what these results mean for Foldit (hint: more binder design puzzles!).

This is a long blog post, broken into a few different sections. First, we’ll explain some background about DNA libraries and fluorescence-activated cell sorting techniques that were used for this experiment. Then we’ll go over the experiment results for protein expression and target binding. Finally, we’ll close out with some discussion about these results, and thoughts about what’s next for Foldit.

DNA libraries

In order to test lots of proteins at once, we order a custom DNA library. A DNA library is a mixed pool containing thousands of different DNA genes that encode our designed proteins.

In this experiment, the library includes genes for 100 Foldit player designs and thousands of designs from IPD researchers. All of these designs are intended to bind to the IL6R target.

We insert this mixture of genes into a yeast culture so that each yeast cell gets a gene for just one binder design.

We insert our designed gene alongside a companion gene that encodes a yeast membrane protein. When these genes are decoded, our designed protein is linked to the companion membrane protein. The yeast cell exports these to the cell membrane, so that our designed binder is displayed on the outside of the yeast cell, but is still tethered to the companion protein embedded in the membrane.

Although we expect the yeast cell to have lots of binders on the surface, those binders should all be identical since they came from the same gene.

Figure 1. A DNA library is a mixture with DNA genes encoding thousands of protein designs. The genes are inserted into yeast cells so that the yeast cells can decode the genes and express the designed proteins. The yeast cells export the designed proteins to the cell membrane so that they are displayed on the yeast surface.

Now we have a culture with millions and millions of yeast cells, which are displaying our library with thousands of different binder designs. Each yeast cell displays only one of the designs from the library, but there may be many identical yeast cells that each display the same design.

Fluorescence-activated cell sorting (FACS)

Now that our designed protein is displayed on the yeast surface, we tag the protein with a fluorescent molecule that emits green light. The intensity of green fluorescence corresponds to the amount of protein displayed on the yeast surface (higher intensity = more protein).

In a separate tube, our target protein (IL6R) is free-floating in solution, and we tag it with a different fluorescent molecule that emits red light.

Then we mix the free-floating target IL6R with our yeast cells. We expect the target will stick to binders that are displayed on the yeast surface. However, if one of our designed proteins does not bind the target, then no target molecules will stick to that yeast cell.

Now we'd like to measure how much target is stuck to each yeast cell. We use a microfluidics device to pass yeast cells, one at a time, in front of a sensitive photometer, which measures the intensity of green and red fluorescence in two separate measurements.

These two measurements are typically plotted as a scatter plot. Each point represents one yeast cell, where the x-axis is intensity of green fluorescence (the amount of displayed protein), and the y-axis is intensity of red fluorescence (the amount of bound target).

Figure 2. (A) Green-tagged designs are tethered to the yeast surface, while red-tagged target is free-floating. If a design successfully binds the target, then a yeast cell will have high-intensity green and red fluorescence. (B) FACS scatter plot of yeast fluorescence measurements. Each point is a yeast cell, with green fluorescence (expression) on the x-axis, and red fluorescence (binding) on the y-axis. Points in the top right corner represent cells with both red and green fluorescence, indicating good expression and binding. (Note that the colors in the plot represent point density; for example, the patch of red near the center of the plot means there are lots of overlapping points in this region.)

After taking these measurements, the cell sorter can redirect each individual yeast cell to one of two buckets (“select” or “reject”), based on their fluorescence. Normally, we are looking for cells that have strong expression (intense green) and strong binding (intense red). So we want to select the top right quadrant of the scatter plot, and reject everything else.
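As a rough illustration of the select/reject decision described above, here is a minimal Python sketch. The gate values, the `Cell` class, and the example intensities are all hypothetical; the post does not give the actual thresholds used on the cell sorter.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    green: float  # expression signal (arbitrary fluorescence units)
    red: float    # binding signal (arbitrary fluorescence units)


# Hypothetical gate positions; the real gates are not stated in the post.
GREEN_GATE = 1000.0
RED_GATE = 1000.0


def sort_cell(cell: Cell, green_gate: float = GREEN_GATE, red_gate: float = RED_GATE) -> str:
    """Return 'select' for cells in the top right quadrant, 'reject' otherwise."""
    if cell.green >= green_gate and cell.red >= red_gate:
        return "select"
    return "reject"


# Made-up example cells.
cells = [
    Cell(green=5000.0, red=4000.0),  # expresses well and binds the target
    Cell(green=5000.0, red=50.0),    # expresses well but does not bind
    Cell(green=80.0, red=30.0),      # poor expression
]
buckets = [sort_cell(c) for c in cells]
print(buckets)
```

Only the first example cell lands in the "select" bucket, because it is the only one above both gates.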

After sorting, we end up with a “select” bucket of all the yeast cells displaying successful binders (these were cells with intense red and green fluorescence, indicating that they express well and stick to the target).

The last step of this experiment is to figure out which proteins were displayed on those cells. There were thousands of designs in our library; which ones stick to the target?

For this, we use DNA sequencing to read the genes of everything in our “select” bucket. If we read a gene encoding one of our designs, then we know that a yeast cell displaying our design was sorted into the select bucket, and so it must have had strong red and green fluorescence.

The final output of our experiment is a list of genes that were found in the "select" bucket, and the number of times we read each gene. If our bucket contains multiple, identical yeast cells with the same gene, then we expect to see multiple reads of that gene.
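The counting step can be sketched in a few lines of Python. This is only an illustration: the gene sequences, the design names, and the exact-match lookup are assumptions made for the example, and a real sequencing pipeline would need to handle read errors and much longer genes.

```python
from collections import Counter

# Hypothetical library: gene sequence -> design ID (sequences are made up).
library = {
    "ATGGCTAAA": "design_A",
    "ATGCCGTTT": "design_B",
    "ATGGGAACC": "design_C",
}

# Hypothetical sequencing reads from the "select" bucket.
reads = ["ATGGCTAAA", "ATGGCTAAA", "ATGCCGTTT", "ATGGCTAAA", "TTTTTTTTT"]


def count_reads(reads, library):
    """Count reads per design, ignoring reads that match no library gene."""
    counts = Counter({design: 0 for design in library.values()})
    for read in reads:
        design = library.get(read)
        if design is not None:
            counts[design] += 1
    return counts


counts = count_reads(reads, library)
for design, n in sorted(counts.items()):
    print(design, n)
```

A design that never appears in the select bucket still gets a row with a count of zero, which is how most of the binding columns look in the table below.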

The data

Below is a preview of the data from this experiment. You can download the data for all 100 Foldit designs here.

design_id       counts1 counts2 counts3 counts4 counts5 counts6 DDG     SASA        SC      BUNS
2009432_c0003   21      0       0       0       0       0       -26.908 946.664     0.600   9
2009432_c0004   57      3       0       0       0       0       -35.443 1198.221    0.669   8
2009432_c0006   29      0       3       0       0       0       -40.365 1386.322    0.647   10
2009432_c0007   17      0       5       0       1       0       -53.948 1635.076    0.679   15
2009432_c0009   67      0       0       0       0       0       -31.730 1032.899    0.665   6
2009432_c0010   94      0       0       0       0       0       -31.894 1267.798    0.672   10
2009432_c0011   57      0       0       0       0       0       -30.796 1122.379    0.553   9
2009432_c0012   111     1       0       0       0       0       -37.067 1340.479    0.641   10
2009432_c0014   5       0       0       0       0       0       -44.323 1378.069    0.554   13
2009432_c0016   16      0       0       0       0       0       -39.257 1460.892    0.649   10
...

In the table above, you can see that each design has six “counts” columns. These correspond to six different FACS experiments with the IL6R binder library, which we'll describe below:

1. Expression
2. Binding at 1000 nM
3. Binding at 100 nM
4. Binding at 10 nM
5. Binding at 1 nM
6. Binding at 0.1 nM
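For readers who want to work with the table themselves, here is a minimal Python sketch for parsing it. It assumes the file is whitespace-separated like the preview, and that the columns counts1 through counts6 follow the order of the list above; the names `PREVIEW`, `EXPERIMENTS`, and `parse_table` are made up for this example.

```python
# Assumed mapping: counts1..counts6 follow the order of the experiment list.
EXPERIMENTS = {
    "counts1": "expression",
    "counts2": "binding at 1000 nM",
    "counts3": "binding at 100 nM",
    "counts4": "binding at 10 nM",
    "counts5": "binding at 1 nM",
    "counts6": "binding at 0.1 nM",
}

# Two rows copied from the preview table.
PREVIEW = """\
design_id       counts1 counts2 counts3 counts4 counts5 counts6 DDG     SASA        SC      BUNS
2009432_c0004   57      3       0       0       0       0       -35.443 1198.221    0.669   8
2009432_c0007   17      0       5       0       1       0       -53.948 1635.076    0.679   15
"""


def parse_table(text):
    """Parse a whitespace-separated table into a list of row dictionaries."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        for key in header[1:]:  # every column except design_id is numeric
            row[key] = float(row[key])
        rows.append(row)
    return rows


rows = parse_table(PREVIEW)
for row in rows:
    for column, label in EXPERIMENTS.items():
        print(row["design_id"], label, int(row[column]))
```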

Sorting for expression

In experiment #1, we try to measure how well the yeast can express and display our designed proteins. We don’t mix the target IL6R protein with our yeast and we don’t measure red fluorescence for binding. We only select yeast with strong green fluorescence, collecting cells that have lots of designed protein displayed on their surface.

The expression experiment is a helpful control for the later binding experiments, but it can also tell us something about how well our proteins behave. Stable, well folded proteins are easily displayed by the yeast, and these yeast will have strong green fluorescence. In contrast, unstable, poorly folded proteins are less likely to be displayed, and will show weaker fluorescence.

For many of the Foldit designs, the sequencing counts from experiment #1 are a little low. The median expression count for a design in this entire library was about 50, and only a third of the Foldit designs met this threshold. This suggests that some of these protein designs are not folding very well.

This is in line with our expectations. When Foldit players design monomer proteins from scratch, we see about a 50% success rate for good folding in the lab (50% is very good by protein design standards!). Binder design is harder than bare monomer design, because we generally have to sacrifice folding stability to optimize binding. So we should expect that <50% of binder designs will fold properly.
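The comparison against the library median can be sketched as below. The counts are the ten counts1 values from the preview table, and 50 is the approximate median quoted above. Note that these ten rows are only a preview, so the fraction computed here is not the same as the figure for all 100 Foldit designs.

```python
# counts1 (expression) for the ten designs shown in the preview table.
preview_counts1 = {
    "2009432_c0003": 21,
    "2009432_c0004": 57,
    "2009432_c0006": 29,
    "2009432_c0007": 17,
    "2009432_c0009": 67,
    "2009432_c0010": 94,
    "2009432_c0011": 57,
    "2009432_c0012": 111,
    "2009432_c0014": 5,
    "2009432_c0016": 16,
}

# Approximate library-wide median expression count quoted in the post.
LIBRARY_MEDIAN = 50

# Designs whose expression count meets or exceeds the library median.
passing = [d for d, c in preview_counts1.items() if c >= LIBRARY_MEDIAN]
fraction = len(passing) / len(preview_counts1)

print(f"{len(passing)} of {len(preview_counts1)} preview designs meet the median")
print(f"fraction: {fraction:.2f}")
```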

Sorting for binding

After selecting for expression, we can start selecting designs from our library based on binding.

This time we mix our yeast cells with red-tagged target IL6R that is free in solution. In the early experiments we mix with a high concentration of the target (1000 nM).

A binding measurement at high concentrations of target is a lenient test for binding. There are lots of target molecules floating around, so even weak binders are likely to have some target stuck to them.

After letting the yeast cells equilibrate with the target in solution, we pass the yeast through the cell sorter and measure the intensity of both red and green light. If a cell lights up for both expression and binding (in the top right quadrant), then we send it to the select bucket for sequencing.

Figure 3. FACS scatter plots. (A) The fluorescence measurements from expression experiment #1. We see two clusters of cells in the bottom left and bottom right quadrants, representing cells with poor expression and high expression, respectively. We select everything in the bottom right quadrant. Note that this experiment does not include any IL6R target, so there is no red fluorescent signal for binding (there are no cells in the top left or top right quadrants). (B) The fluorescence measurements from binding experiment #2. After incubating the yeast cells with target IL6R, we see that some cells have both green and red fluorescence (the top right quadrant). This indicates both strong expression and also strong binding.

We typically repeat the binding experiment, reducing the concentration of target each time. Binding measurements at low concentrations of target provide a stringent test for binding. At 0.1 nM target concentration, we are likely to see binder and target stuck together only if they bind very tightly.
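To see why low concentrations are a stringent test, it can help to look at a simple equilibrium model. The sketch below assumes simple one-to-one binding with the target in large excess, so that the fraction of displayed binder occupied by target is C / (C + Kd), where C is the target concentration and Kd is the dissociation constant. This model and the two Kd values are assumptions for illustration; they are not measurements from this experiment.

```python
# Target concentrations used in the five binding experiments (nM).
CONCENTRATIONS_NM = [1000.0, 100.0, 10.0, 1.0, 0.1]


def fraction_bound(target_nm: float, kd_nm: float) -> float:
    """Fraction of binder occupied at equilibrium for simple 1:1 binding."""
    return target_nm / (target_nm + kd_nm)


# Hypothetical binders: the Kd values are chosen only for illustration.
for label, kd in [("tight binder", 1.0), ("weak binder", 1000.0)]:
    row = ", ".join(
        f"{c:g} nM: {fraction_bound(c, kd):.4f}" for c in CONCENTRATIONS_NM
    )
    print(f"{label} (Kd = {kd:g} nM) -> {row}")
```

Under this model, a binder is half occupied when the target concentration equals its Kd. A hypothetical weak binder with a Kd of 1000 nM is half occupied in the 1000 nM experiment, but almost empty at 0.1 nM.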

We see very low sequencing counts for all of the Foldit designs, even at high concentration of target, which indicates zero binders. Some designs show a couple of reads in one or two of the binding experiments, but this is within the range of noise that we would expect for zero binders.

Why didn't the Foldit designs bind to the target?

These results are slightly disappointing, but we should not be too discouraged!

Although none of our Foldit designs bound to the IL6R target, we did see a few binders from the designs by IPD researchers. Below are the counts from the tightest IPD binder:

design_id      counts1 counts2 counts3 counts4 counts5 counts6 DDG     SASA        SC      BUNS
IPD_design     144     38      69      56      13      52      -39.114 1720.442    0.640   9

Figure 4. An IPD-designed protein binder with exceptional binder metrics, which appears to bind IL6R. The IL6R library included thousands of proteins designed by IPD researchers with highly optimized binder metrics. Only a handful of designs successfully bound to the target.

Why did we see binding from IPD designs but not from Foldit designs? The IPD designs had exceptional binder metrics. Recall from our previous blogpost that certain metrics seem to correlate with good binding (DDG, SASA, BUNS, shape complementarity). If we rank the tested designs using these metrics, we find that this IPD design outranks all but three of our Foldit designs.
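The post does not say exactly how the designs were ranked, so the sketch below shows just one simple way to compare a design against the IPD binder: count the metrics on which it does better. It assumes that lower DDG, higher SASA, higher SC, and lower BUNS are better, and it uses strict comparisons. The values are copied from the tables in this post; the function and variable names are made up for the example.

```python
# (metric name, True if a higher value is assumed to be better)
METRICS = [("DDG", False), ("SASA", True), ("SC", True), ("BUNS", False)]

# Metrics of the tightest IPD binder, from the table above.
IPD = {"DDG": -39.114, "SASA": 1720.442, "SC": 0.640, "BUNS": 9}

# Three Foldit designs from the preview table.
FOLDIT = {
    "2009432_c0004": {"DDG": -35.443, "SASA": 1198.221, "SC": 0.669, "BUNS": 8},
    "2009432_c0007": {"DDG": -53.948, "SASA": 1635.076, "SC": 0.679, "BUNS": 15},
    "2009432_c0014": {"DDG": -44.323, "SASA": 1378.069, "SC": 0.554, "BUNS": 13},
}


def metrics_beaten(design, reference):
    """List the metrics on which `design` is strictly better than `reference`."""
    beaten = []
    for name, higher_is_better in METRICS:
        if higher_is_better:
            better = design[name] > reference[name]
        else:
            better = design[name] < reference[name]
        if better:
            beaten.append(name)
    return beaten


wins = {design_id: metrics_beaten(m, IPD) for design_id, m in FOLDIT.items()}
for design_id, beaten in wins.items():
    print(design_id, len(beaten), beaten)
```

None of these three designs beats the IPD binder on more than two of the four metrics at once.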

In order to design successful protein binders in Foldit, we will need to focus on these binder metrics. If we can make these metrics available in Foldit puzzles, we are confident that Foldit players will be able to optimize them just as well as IPD researchers. To that end, the Foldit team has been working to add new Objectives that can compute all of these metrics in Foldit. We should be able to release the first prototype Objectives in an update very soon!

Another important consideration here is the sheer number of IPD designs tested. The library for this experiment included thousands of IPD designs, and all of them had top-tier binding metrics like the one above. Even with those thousands of designs, we only got a few binder hits out of the library.

Unfortunately, such high failure rates are typical for protein binder experiments. We have to remember that protein design is a difficult challenge with many pitfalls, and our understanding of protein folding and binding is imperfect. To succeed in protein binder design, we will need to generate lots of designs to test.

What's next for Foldit?

The Foldit designs in this experiment came from just the first two rounds of the anti-inflammatory puzzles, back in April. Since then, we’ve seen even more great designs from Foldit players, and we’ll continue to run binder design puzzles as we work to improve the Foldit tools.

Soon Foldit will have prototype Objectives for calculating DDG, SASA, and shape complementarity. Already, it seems that players have been able to use the new BUNS Objective to improve designs in recent weeks.

We’re excited to keep pressing on the problem of protein binder design! We are used to tackling hard problems in Foldit, and we tend to learn a lot about proteins in the process. We think that Foldit players have a lot to contribute in this arena, and we’ll be looking to tackle new (and harder) targets in the coming months.

Remember that we also have an experiment under way to test Foldit-designed binders for the coronavirus spike protein, and we should have results from that experiment soon. So stay tuned for more, and happy folding!

Franco Padelletti Lv 1

July 01, 2020

Wow! A fascinating article!
How many things we can still learn at 72 years of age, by means of this "game", which I began many years ago as a pure pastime during the holidays.

We are constantly torn between the search for a high score for the ranking (the passion for the race) and the desire to do something for the search for new treatments. In a world that changes so quickly. … but less than it would take.

Soon! Hurry up and prepare new tools that take into account "binder metrics". We like to be at the top of the ranking … but we also like to know that "our" protein has some chance of getting the result we expect.

I forgot: THANK YOU!

Bruno Kestemont Lv 1

July 01, 2020

Great pedagogy and wonderful illustrations!

This kind of blog is quite useful to motivate us to persevere. Design puzzles would seem a little bit repetitive to us if we didn't understand why and how we could do better next time.

Like champions who repeat the same movements 1000 times in the pursuit of excellence.

Bruno Kestemont Lv 1

July 01, 2020

I went into the list of binder metrics for all 100 Foldit designs.

Many of them did better than the top IPD design for (only) one of the metrics, but no one passed IPD for more than 2 metrics together (out of the 5 "correlated" metrics counts1, DDG, SASA, SC and BUNS).

And logically, no one passed any counts2-6 metrics.

49% outperformed best IPD for SC (>= 0.64)
38% for BUNS (<= 9 BUNS)
11% for DDG (< -39.1)
2% for SASA (> 1720)
1% for counts1 (only ! what a shame, we should be good in there !)

23% passed IPD for 2 metrics.

Go go Foldit players (and algorithm team)! We want to do better than the Scientists! ;)

nspc Lv 1

July 01, 2020

Good article ^^.

The score system is one of the most important problems.

Because when we use a recipe, it increases the score, but sometimes it optimises in a bad way.

The Rosetta score is designed for a single protein (I think?), so I tested this in the coronavirus beginner binder design puzzle (with the real human protein):

I ran a recipe on this real protein. The score increased a lot, but this removed a lot of bonds with the target and replaced them with hydrophobic sidechains.
I think this replacement is good for a single protein, with contact between 2 helices.
(But here it is not a single protein.)

Objectives need to be very accurate, or take more account of bonds with the target. Maybe we need a weighting for the score of residues that bond with the target.

The weighting has to be accurate, so maybe we can extract a value from a database that compares known proteins that bind.

Maybe we can add a weighting in the BUNS score too (from the residue score part), instead of a fixed score per resolved BUNS.

Franco Padelletti Lv 1

July 01, 2020

"The score system is one of the most important problems." I agree.

I agree, but I believe that it is not a trivial task to perform.
And not because of the skills and quality of the coders. … I don't believe that the programming team is sufficiently funded.
Do the funders really believe that there is a future (for revenue, I mean) for the "applications in real life" of this "game"? … mumble, mumble!

I also agree with all the other considerations and suggestions in your post. I saw one of your designs and noted that it had too few h-bonds with the target (while mine had many more of them … but a worse score. What a shame!) ;)

Franco Padelletti Lv 1

July 01, 2020

I did this "sort" this morning
… A bit frustrating, but a protein looks to bind. … Or not! … Maybe it does not fold …

I have too many things to study yet.

pdb_id          counts1 counts2 counts3 counts4 counts5 counts6 DDG     SASA        SC      BUNS
2009432_c0437   90      2       0       0       1       4       -31.306 1378.731    0.54    11
2009565_c0043   48      2       3       0       0       2       -29.427 1247.023    0.666   14
2009432_c0007   17      0       5       0       1       0       -53.948 1635.076    0.679   15
2009565_c0096   26      0       0       0       1       0       -1.047  1593.63     0.593   14
2009565_c0044   91      2       7       4       0       0       -26.689 1226.616    0.58    8
2009432_c0017   67      4       1       2       0       0       -45.715 1555.507    0.656   13
2009432_c0341   36      0       6       0       0       0       -28.998 1535.019    0.36    7

nspc Lv 1

July 01, 2020

Yes, maybe what we need from scientists is a new Rosetta score system, but for 2 proteins that bind.

The score could increase not only with good shape or good bonds.

We could have:
- Whether the proteins stay stuck together.
- Whether bonds can be formed easily.
- Whether there are too many problems with water, etc.

The first idea could be: just add a weighting to some existing score parts when there is a contact with the target protein.
But maybe we can introduce new ideas to get this kind of new score.

(But I am not a biochemist; maybe it already exists?)

spvincent Lv 1

July 02, 2020

Thanks for the experimental details.

Can't help noticing that the IPD binder is a triple helix, just like most of the Foldit submissions.

Anfinsen_slept_here Lv 1

July 30, 2020

I am just curious what software gets used to calculate DDG? Do tell.