bkoep Staff Lv 1
The experimental results are in for Foldit players’ 99 binders against the coronavirus spike protein! If you’ve been following along, you know this experiment was planned for earlier this summer, but got held up by some technical problems with our DNA supplier. Well, we found a workaround, got new materials, and ran the binding experiment to test whether any of the 99 Foldit designs bind to the SARS-CoV-2 spike protein.
Unfortunately, we did not see appreciable binding from any of the 99 Foldit designs. Below we’ll walk through the details of the experiment, and we’ll also discuss some exciting news about a successful binder designed by IPD scientists.
The experiment
Our binding experiment uses two techniques called yeast display and fluorescence activated cell sorting (FACS). You can read more about those techniques in a previous blog post.
In short, we put custom DNA into hundreds of thousands of yeast cells, which then display our protein designs on their surface. After mixing our yeast with fluorescent target protein, we can quickly sort through the yeast cells and pick out those that bind to the target.

Figure 1. (A) Schematic of FACS experiment and (B) example scatter plot of fluorescence from a FACS sort. Each point is a yeast cell, with green fluorescence (expression) on the x-axis, and red fluorescence (binding) on the y-axis. Points in the top right corner represent cells with both red and green fluorescence, indicating good expression and binding.
After each sort, we sequence the DNA of just the collected cells (i.e. the cells that showed expression and binding signal). These DNA sequences can be mapped back to the protein designs that were displayed on the yeast cells.
We count how many times we read each design in the sequencing data. A design with a high number of sequencing counts means that a lot of yeast cells displaying this design were collected, and indicates a successful binder.
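As a rough illustration of the counting step, here is a minimal Python sketch. It assumes each read can be matched exactly to one design's DNA sequence; the sequences, design names, and the `count_designs` helper are hypothetical, and a real sequencing pipeline would also need to handle read errors and partial matches.

```python
from collections import Counter


def count_designs(reads, design_by_sequence):
    """Count how many sequencing reads map to each design.

    Reads that match no known design are skipped (exact matching is
    an assumption made for this sketch).
    """
    counts = Counter()
    for read in reads:
        design = design_by_sequence.get(read)
        if design is not None:
            counts[design] += 1
    return counts


# Hypothetical toy data: two designs and five reads from one sort.
design_by_sequence = {"ATGGCT": "design_A", "ATGCCA": "design_B"}
reads = ["ATGGCT", "ATGGCT", "ATGCCA", "TTTTTT", "ATGGCT"]

counts = count_designs(reads, design_by_sequence)
print(counts.most_common())
```

In this toy example, design_A is read three times and design_B once, so design_A would be the more heavily collected design in that sort.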
The data
Below is a preview of the data. You can download the data for all 99 designs here.
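| pdb_id | counts1 | counts2 | counts3* | counts4 | counts5 | counts6 | BUNS | DDG | SASA | SC |
|---|---|---|---|---|---|---|---|---|---|---|
| 2008926_c0022 | 10 | 0 | 0 | 0 | 0 | 0 | 7 | -33.546 | 1314.890 | 0.661 |
| 2008926_c0023 | 21 | 1 | 0 | 0 | 0 | 0 | 8 | -33.030 | 1391.938 | 0.663 |
| 2008926_c0026 | 30 | 13 | 0 | 0 | 0 | 0 | 12 | -37.822 | 1621.635 | 0.584 |
| 2008926_c0034 | 1073 | 2357 | 0 | 0 | 0 | 1 | 12 | -44.100 | 1656.985 | 0.648 |
| 2008926_c0036 | 3 | 3 | 0 | 0 | 0 | 0 | 9 | -46.865 | 1574.854 | 0.648 |
| 2008926_c0037 | 590 | 4026 | 0 | 45 | 52 | 144 | 7 | -36.222 | 1633.888 | 0.569 |
| 2008926_c0040 | 343 | 323 | 1 | 0 | 0 | 0 | 10 | -35.853 | 1568.804 | 0.645 |
| 2008926_c0042 | 57 | 199 | 0 | 0 | 0 | 0 | 6 | -31.511 | 1407.946 | 0.490 |
| 2008926_c0052 | 2 | 0 | 0 | 0 | 0 | 0 | 6 | -31.936 | 1445.994 | 0.555 |

...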
*Note: There was a sequencing error for sort #3, which is why the counts are mostly zeros in the counts3 column. The counts3 numbers do not represent the actual collected fraction from sort #3, and we should disregard those numbers. Fortunately, since sort #3 was an enrichment sort and we have good data for later sorts, we don’t need those counts to interpret the experiment results.
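If you want to work with the counts yourself, one option is to drop the counts3 column when loading the data. The sketch below parses the preview rows shown above, assuming a whitespace-separated layout; the layout of the full download may differ, and `load_counts` is a hypothetical helper, not part of any Foldit tool.

```python
# Preview rows copied from the table above (whitespace-separated layout assumed).
PREVIEW = """\
pdb_id counts1 counts2 counts3* counts4 counts5 counts6 BUNS DDG SASA SC
2008926_c0022 10 0 0 0 0 0 7 -33.546 1314.890 0.661
2008926_c0023 21 1 0 0 0 0 8 -33.030 1391.938 0.663
2008926_c0026 30 13 0 0 0 0 12 -37.822 1621.635 0.584
2008926_c0034 1073 2357 0 0 0 1 12 -44.100 1656.985 0.648
2008926_c0036 3 3 0 0 0 0 9 -46.865 1574.854 0.648
2008926_c0037 590 4026 0 45 52 144 7 -36.222 1633.888 0.569
2008926_c0040 343 323 1 0 0 0 10 -35.853 1568.804 0.645
2008926_c0042 57 199 0 0 0 0 6 -31.511 1407.946 0.490
2008926_c0052 2 0 0 0 0 0 6 -31.936 1445.994 0.555
"""


def load_counts(text, skip=("counts3*",)):
    """Return {pdb_id: {count_column: reads}}, leaving out the skipped columns."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    # Keep only the sequencing-count columns that are not flagged as unreliable.
    count_cols = [c for c in header if c.startswith("counts") and c not in skip]
    table = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        table[row["pdb_id"]] = {c: int(row[c]) for c in count_cols}
    return table


table = load_counts(PREVIEW)
print(table["2008926_c0037"])
```

Skipping the column at load time, rather than treating its zeros as data, avoids accidentally reading sort #3 as a round in which every design dropped out.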
The details
We used a different sorting schedule here than we did in the previous IL6R experiment. In the IL6R experiment, Foldit designs were pooled with a number of IPD designs and were sorted together at the same time. We screened that entire pool against a range of binding conditions (target concentrations from 0.1 to 1000 nM).
In this spike binder experiment, we were able to purify the starting pool so that it was made up almost entirely of Foldit designs. We also took some extra steps to enrich the starting pool, and we only screened against high concentrations of target after enrichment.
Sort schedule
- Expression
- Enrichment at 1000 nM target
- Enrichment at 1000 nM target
- Enrichment at 1000 nM target
- Binding at 1000 nM target
- Binding at 100 nM target
Instead of going directly from the starting pool into binding sorts at different concentrations of target, we first carried out several rounds of enrichment sorting in order to amplify any potential binders. An enrichment sort is very similar to a binding sort: in both, we select yeast cells that have both expression and binding signal. The difference is that the experimental conditions during an enrichment sort are a little more lenient for binding.
The important part of enrichment is that the selected fraction of each enrichment sort provides the input for the following sort. If we do this several times in a row, we can drastically enrich the composition of the pool to favor anything that binds even a little bit. This is a way to increase the presence of any weak binders, and helps to ensure we don’t miss anything that was underrepresented in the starting pool.

Figure 2. Diagram of sort procedure. Each bar represents a pool of cells that undergoes sorting. In sort #1, we collect only cells that show high expression (green fluorescence), and these cells become the input for sort #2. Sorts #2-4 are enrichment sorts which should exponentially increase the presence of any binders in the pool. After enrichment, sorts #5 and #6 screen for cells that show binding signal at different concentrations of target.
For each of the sorts in the figure above, we've also noted the percentage of cells that were collected from the sort. In expression sort #1, we collected cells based only on whether they display any protein on their surface (green fluorescence). In sorts #2-6, we collected cells based on whether they bind to the target (red fluorescence).
If there are any successful binders in the starting pool, their prevalence should increase exponentially during enrichment sorts. After a few rounds of enrichment, successful binders will grow to dominate the pool so that the majority of cells show binding.
Unfortunately, after three rounds of enrichment, we still see that <5% of cells show any binding signal at 1000 nM target concentration. This is a clear sign that nothing in the pool binds significantly at 1000 nM target ("easy" binding conditions).
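To see why a low fraction after three rounds is so telling, here is a toy model of enrichment in Python. It assumes a single binder that starts as 1 of 99 equally represented designs, and that each sort enriches the binder tenfold relative to non-binders. Both numbers are hypothetical choices for illustration, not measurements from this experiment.

```python
def binder_fraction(start_fraction, enrichment_factor, rounds):
    """Fraction of the pool made up of binders after repeated enrichment sorts.

    Each round multiplies the binder-to-non-binder odds by enrichment_factor.
    """
    odds = start_fraction / (1.0 - start_fraction)
    odds *= enrichment_factor ** rounds
    return odds / (1.0 + odds)


start = 1 / 99  # hypothetical: one binder among 99 equally represented designs
for n in range(4):
    # Tenfold enrichment per round is an assumed value for this sketch.
    print(n, round(binder_fraction(start, 10, n), 3))
```

Under these assumptions the binder makes up about half the pool after two rounds and roughly 91% after three, which is far from the <5% we actually observed.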

Figure 3. FACS data for Foldit spike binders. Each point represents a single yeast cell displaying a Foldit binder on its surface. The x-axis is intensity of green fluorescence (how much binder is expressed on the cell surface) and the y-axis is red fluorescence (how much target is bound at the cell surface). If there were any successful binders in the pool, we would expect to see a large population in the top right corner of each plot.
Looking at the sequencing counts, we see that a handful of designs did become more prominent during enrichment and show up consistently in the final binding sorts. This does indicate that these designs tend to stick to the target somewhat more than other designs in the pool. However, these low numbers are consistent with what we could expect from unfolded, non-specific binding, or from very weak binding. It is unlikely these designs are folding and sticking to the target as intended, and we cannot expect to improve them by optimization.
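One simple way to pick out designs that "show up consistently" is to keep those with reads in both binding sorts (#5 and #6). The sketch below does this for the nine preview rows only; the `min_reads` threshold and the `persistent_designs` helper are hypothetical, and appearing in this list says nothing about whether a design is folded or binding as intended.

```python
# (counts5, counts6) for the nine designs in the preview table above.
final_sorts = {
    "2008926_c0022": (0, 0),
    "2008926_c0023": (0, 0),
    "2008926_c0026": (0, 0),
    "2008926_c0034": (0, 1),
    "2008926_c0036": (0, 0),
    "2008926_c0037": (52, 144),
    "2008926_c0040": (0, 0),
    "2008926_c0042": (0, 0),
    "2008926_c0052": (0, 0),
}


def persistent_designs(final_counts, min_reads=1):
    """Designs with at least min_reads reads in every listed sort."""
    return sorted(
        design
        for design, counts in final_counts.items()
        if all(c >= min_reads for c in counts)
    )


persistent = persistent_designs(final_sorts)
print(persistent)
```

Among the preview rows, only 2008926_c0037 has reads in both binding sorts; 2008926_c0034 appears once in sort #6 but not in sort #5.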

