bkoep
Friday, March 26 was the last day of our Influenza HA binder design competition! After Puzzle 1968 closed, we collected all of the solutions that were shared with scientists and tallied the valid submissions from each player.
The final rankings
LociOiling - 43 designs
CharlieFortsConscience - 32 designs
ucad - 21 designs
Dudit - 20 designs
spvincent - 10 designs
Bruno Kestemont - 10 designs
nspc - 8 designs
BootsMcGraw - 7 designs
silent gene - 7 designs
ichwilldiesennamen - 6 designs
akaaka - 5 designs
Enzyme - 5 designs
Galaxie - 5 designs
robgee - 3 designs
dcrwheeler - 3 designs
zippyc137 - 3 designs
Anfinsen_slept_here - 2 designs
OWM3 - 2 designs
irk-ele - 2 designs
NinjaGreg - 1 design
georg137 - 1 design
martinzblavy - 1 design
Jpilkington - 1 design
grogar7 - 1 design
alcor29 - 1 design
stomjoh - 1 design
Blipperman - 1 design
Norrjane - 1 design
phi16 - 1 design
infjamc - 1 design
sgeldhof - 1 design
blazegeek - 1 design
Congratulations to LociOiling, who submitted an astounding 43 designed binders for influenza HA!
What did we learn from this competition?
To recap, the aim of this competition was to trial an experimental reward system that encourages players to create the greatest number of quality designs, rather than focus on creating the single highest-scoring design (as in normal Foldit puzzles).
We think this could be a way to make Foldit more effective for protein design research problems, because Foldit is currently limited by design throughput (not by the quality of top-scoring designs). Optimizing for the highest Foldit score works well for protein prediction problems, but the problem of protein design is not so straightforward; a higher-scoring design is not always better. In addition, there is a secondary concern that competitive players tend to optimize solutions so tenaciously that late-game refinement pushes past the point where our score function can reliably distinguish better designs from worse ones.
The competition puzzle was set up to mirror the previous Puzzle 1962: Influenza HA Binder Design: Round 3. Both puzzles used the same score function and Objectives. The only differences between the two puzzles were a scoring offset of 7,500 points (so a 10,000-point competition solution is equivalent to a 17,500-point solution in Puzzle 1962) and the puzzle duration: the competition puzzle ran for two weeks instead of just one. Using Puzzle 1962 as a control, we can look at the competition results to answer the two big questions about our experimental reward system:
1. Does the competition reward system actually increase throughput?
2. Are competition submissions still high-quality solutions?
Let’s start with question #2.
Are competition submissions still high-quality solutions?
Yes, competition designs appear just as promising as designs from regular puzzles.
This was largely enforced by rule #1 of the competition, which set a threshold of at least 10,000 points for all valid submissions. Foldit scientists chose this threshold based on the results of the previous Puzzle 1962, where it seemed that 10,000 points could be achieved only by satisfying most of the Objectives while also attaining a reasonable base score.
Note that 10,000 points is still a very high bar for this puzzle, and most of the soloists in Puzzle 1968 were unable to reach this score. All of the players who reached this level have been playing Foldit for at least 6 months, and many of them are experienced veterans. (Bravo to akaaka, who joined Foldit in September 2020, making them the “youngest” Foldit player to submit a valid competition solution!)
We should also clarify that many solutions below the 10,000 point threshold are still scientifically valuable and will be analyzed by Foldit scientists as possible candidates for lab testing. The 10,000 point threshold does not represent a cutoff for “scientifically useful” solutions. Rather, past this threshold we think further optimization is not very helpful, and a player could contribute more to research by working on another solution.
So, we know that all of the valid submissions scored at least 10,000 points, which should correspond to promising designs. But let's spot check a couple of values to be certain they are reasonable…
Among valid solutions, the worst DDG value was -32.4 kcal/mol, and the worst Contact Surface value was 336. While these values do fall short of their targets (DDG < -40; Contact Surface > 400), these are still promising numbers that could indicate a successful binder. The majority of submissions met the targets for both of these difficult binder design Objectives.
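For readers curious what such a spot check looks like in practice, here is a minimal Python sketch. The target values are the Objective targets quoted above; the function itself is illustrative, not part of Foldit's actual analysis pipeline:

```python
# Objective targets quoted above; both are difficult to satisfy.
DDG_TARGET = -40.0            # kcal/mol; more negative indicates tighter binding
CONTACT_SURFACE_TARGET = 400  # higher indicates a larger binding interface

def meets_binder_objectives(ddg, contact_surface):
    """Return True if a solution hits both binder design Objective targets."""
    return ddg < DDG_TARGET and contact_surface > CONTACT_SURFACE_TARGET

# The worst values among valid submissions fall short of the targets,
# even though they may still indicate a successful binder:
print(meets_binder_objectives(ddg=-32.4, contact_surface=336))  # False
```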
This gives us confidence that the 10,000 point threshold was stringent enough to ensure that all submissions were high quality designs. Note that Foldit scientists will still run additional analyses on these solutions before selecting designs for lab testing.
Does the competition reward system actually increase throughput?
Yes, players created quality designs at almost triple the rate of a normal puzzle.
After any Foldit puzzle closes, we comb through all the puzzle solutions to pull out distinct designs, using protein sequence and structural alignment to sort out duplicate and unfinished solutions. After the competition puzzle ran for two weeks, we identified 242 distinct solutions with at least 10,000 points (this includes solutions from players who opted out of the competition and played Puzzle 1968 normally). By contrast, in one week our “control” Puzzle 1962 yielded 43 distinct protein designs above the equivalent score threshold. Accounting for the difference in puzzle duration, that is 121 designs per week versus 43 designs per week, a rate increase of 2.8x.
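For the curious, the rate arithmetic (along with a much-simplified version of the duplicate filtering) can be sketched in a few lines of Python. Collapsing duplicates by exact sequence identity is our simplification here; the real analysis also uses structural alignment:

```python
def count_distinct_designs(solutions, score_threshold=10_000):
    """Count distinct designs above a score threshold.

    `solutions` is a list of (sequence, score) pairs. Duplicates are
    collapsed by exact sequence identity, a simplification of the
    sequence and structural alignment used in the real analysis.
    """
    return len({seq for seq, score in solutions if score >= score_threshold})

# Observed rates, in distinct designs per week:
competition_rate = 242 / 2  # two-week competition puzzle
control_rate = 43 / 1       # one-week "control" Puzzle 1962
print(f"{competition_rate / control_rate:.1f}x")  # -> 2.8x
```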
This is a good sign! It indicates that Foldit does have the capacity for greater design throughput, and that a tweak to our reward system could make Foldit more effective for research in protein design. However, the experimental system used here may still need some adjustments…
Was the “puzzle reset” rule effective against duplicated work?
Mostly. But there were several instances where a player, after submitting a solution, restarted the puzzle and rebuilt almost the exact same solution from scratch!
The puzzle reset rule was intended to force players to make multiple distinct designs. Without this rule, we were afraid that each player would make only a single 10,000-point solution and then repeatedly submit it with trivial changes. In effect, this would boost their competition standing without actually making a meaningful scientific contribution.
Nevertheless, there were some cases where a player submitted two valid solutions with almost exactly the same sequence and structure, even though they were designed completely independently after a puzzle reset. This strategy follows the letter of the puzzle reset rule while defeating its purpose. If we want a reward system that accurately reflects the scientific contribution of each player, we will need to make some changes to the system used in this competition.
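Flagging such near-duplicates automatically is straightforward in principle. Here is a minimal sketch, assuming equal-length designed sequences and a simple per-position identity measure; a real check would use proper alignment and structural comparison, and the 95% threshold is an arbitrary choice of ours:

```python
def sequence_identity(seq_a, seq_b):
    """Fraction of matching positions between two equal-length sequences."""
    assert len(seq_a) == len(seq_b), "this sketch assumes equal-length designs"
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def is_near_duplicate(seq_a, seq_b, threshold=0.95):
    """Flag a pair of submissions whose sequences are nearly identical."""
    return sequence_identity(seq_a, seq_b) >= threshold

# Two 20-residue designs differing at a single position would be flagged:
print(is_near_duplicate("MKTAYIAKQRQISFVKSHFS",
                        "MKTAYIAKQRQISFVKSHFA"))  # True (95% identity)
```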
A successful experiment
Congratulations again to our champion LociOiling and all of the players who participated in the competition!
One thing that is still missing from this analysis is player feedback. We invite all players (participants and observers) to leave a comment below with your thoughts about this competition. Was gameplay significantly different from that of normal puzzles? Did you enjoy it more or less? Do you have suggestions that would make this kind of competition more fun, or more productive?
Keep up the great folding, and practice your binder design skills in the latest Puzzle 1973: Tie2 Binder Design: Round 1!