Competition results for influenza HA binder design

Started by bkoep

bkoep Staff Lv 1

Friday, March 26 was the last day of our Influenza HA binder design competition! After Puzzle 1968 closed, we collected all of the solutions that were shared with scientists and tallied the valid submissions from each player.

The final rankings

LociOiling - 43 designs
CharlieFortsConscience - 32 designs
ucad - 21 designs
Dudit - 20 designs
spvincent - 10 designs
Bruno Kestemont - 10 designs
nspc - 8 designs
BootsMcGraw - 7 designs
silent gene - 7 designs
ichwilldiesennamen - 6 designs
akaaka - 5 designs
Enzyme - 5 designs
Galaxie - 5 designs
robgee - 3 designs
dcrwheeler - 3 designs
zippyc137 - 3 designs
Anfinsen_slept_here - 2 designs
OWM3 - 2 designs
irk-ele - 2 designs
NinjaGreg - 1 design
georg137 - 1 design
martinzblavy - 1 design
Jpilkington - 1 design
grogar7 - 1 design
alcor29 - 1 design
stomjoh - 1 design
Blipperman - 1 design
Norrjane - 1 design
phi16 - 1 design
infjamc - 1 design
sgeldhof - 1 design
blazegeek - 1 design

Congratulations to LociOiling, who submitted an astounding 43 designed binders for influenza HA!

What did we learn from this competition?

To recap, the aim of this competition was to trial an experimental reward system that encourages players to create the greatest number of quality designs, rather than focus on creating the single highest-scoring design (as in normal Foldit puzzles).

We think this could be a way to make Foldit more effective for protein design research problems, because Foldit is currently limited by design throughput (not by the quality of top-scoring designs). Optimizing for the highest Foldit score works well for protein prediction problems, but the problem of protein design is not so straightforward; a higher-scoring design is not always better. In addition, there is a secondary concern that competitive players tend to optimize solutions so tenaciously that late-game refinement exceeds the limits of our score function.

The competition puzzle was set up to mirror the previous Puzzle 1962: Influenza HA Binder Design: Round 3. Both puzzles used the same score function and Objectives. The only difference between the two puzzles was a scoring offset of 7,500 points (so a 10,000 point competition solution is equivalent to a 17,500 point solution in Puzzle 1962), and the competition puzzle ran for two weeks instead of just one. Using Puzzle 1962 as a control, we can look at the competition results to answer the two big questions about our experimental reward system:

1. Does the competition reward system actually increase throughput?
2. Are competition submissions still high-quality solutions?

Let’s start with question #2.

Are competition submissions still high-quality solutions?

Yes, competition designs appear just as promising as designs from regular puzzles.

This was largely enforced by rule #1 of the competition, which set a threshold of at least 10,000 points for all valid submissions. Foldit scientists chose this threshold based on the results of the previous Puzzle 1962. It seemed 10,000 points could be achieved only if you were able to satisfy most of the Objectives and also attain a reasonable base score.

Note that 10,000 points is still a very high bar for this puzzle, and most of the soloists in Puzzle 1968 were unable to reach this score. All of the players who reached this level have been playing Foldit for at least 6 months, and many of them are experienced veterans. (Bravo to akaaka, who joined Foldit in September 2020, the "youngest" Foldit player to submit a valid competition solution!)

We should also clarify that many solutions below the 10,000 point threshold are still scientifically valuable and will be analyzed by Foldit scientists as possible candidates for lab testing. The 10,000 point threshold does not represent a cutoff for “scientifically useful” solutions. Rather, past this threshold we think further optimization is not very helpful, and a player could contribute more to research by working on another solution.

So, we know that all of the valid submissions scored at least 10,000 points, which should correspond to promising designs. But let's spot check a couple of values to be certain they are reasonable…

Among valid solutions, the worst DDG value was -32.4 kcal/mol, and the worst Contact Surface value was 336. While these values do fall short of their targets (DDG < -40; Contact Surface > 400), these are still promising numbers that could indicate a successful binder. The majority of submissions met the targets for both of these difficult binder design Objectives.
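As a rough illustration, the validity criteria above could be sketched as follows (a hypothetical check based only on the thresholds quoted in this post; the function names are made up and this is not actual Foldit code):

```python
# Hypothetical sketch of the competition's validity criteria, using the
# thresholds described in this post. Not actual Foldit code.

def meets_objective_targets(ddg: float, contact_surface: float) -> bool:
    """Check the two difficult binder design Objective targets."""
    return ddg < -40.0 and contact_surface > 400.0

def is_valid_submission(score: float) -> bool:
    """Rule #1: a valid submission needs at least 10,000 points."""
    return score >= 10_000

# The worst values among valid submissions mentioned in the post
# fall short of the Objective targets, but still passed rule #1:
print(meets_objective_targets(-32.4, 336.0))  # → False
print(is_valid_submission(10_000))            # → True
```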

This gives us confidence that the 10,000 point threshold was stringent enough to ensure that all submissions were high quality designs. Note that Foldit scientists will still run additional analyses on these solutions before selecting designs for lab testing.

Does the competition reward system actually increase throughput?

Yes, players created quality designs at almost triple the rate of a normal puzzle.

After any Foldit puzzle closes, we comb through all the puzzle solutions to pull out distinct designs, using protein sequence and structural alignment to sort out duplicate and unfinished solutions. After the competition puzzle ran for two weeks, we identified 242 distinct solutions with at least 10,000 points (this includes solutions from players who opted out of the competition and played Puzzle 1968 normally). By contrast, in one week our “control” Puzzle 1962 yielded 43 distinct protein designs above the equivalent score threshold. Accounting for the difference in puzzle duration, this works out to a rate increase by a factor of 2.8x.
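For reference, the 2.8x figure is simply the ratio of weekly design rates, using the numbers above (a back-of-the-envelope calculation):

```python
# Throughput comparison from the numbers reported in this post.
competition_designs = 242   # distinct 10,000+ point solutions over 2 weeks
competition_weeks = 2
control_designs = 43        # distinct designs in "control" Puzzle 1962, 1 week
control_weeks = 1

competition_rate = competition_designs / competition_weeks  # 121 per week
control_rate = control_designs / control_weeks              # 43 per week

print(round(competition_rate / control_rate, 1))  # → 2.8
```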

This is a good sign! It indicates that Foldit does have the capacity for greater design throughput, and that a tweak to our reward system could make Foldit more effective for research in protein design. However, the experimental system used here may still need some adjustments…

Was the “puzzle reset” rule effective against duplicated work?

Mostly. But there were several instances where a player, after submitting a solution, restarted the puzzle and rebuilt almost the exact same solution from scratch!

The puzzle reset rule was intended to force players to make multiple distinct designs. Without this rule, we were afraid that each player would make only a single 10,000 point solution, and then repeatedly submit it with trivial changes. In effect, this would boost their competition standing without actually making a meaningful scientific contribution.

Nevertheless, there were some cases where a player submitted two valid solutions with almost the exact same sequence and structure, even though they were designed completely independently after a puzzle reset. This strategy circumvents the purpose of the puzzle reset rule. If we want a reward system that accurately reflects the scientific contribution of each player, we will need to make some changes to the system used in this competition.
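For illustration, one very simple way to flag near-duplicate designs is fractional sequence identity (a minimal sketch with hypothetical example sequences; the actual analysis described above also uses structural alignment):

```python
def sequence_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of matching residues between two equal-length sequences.

    A minimal sketch for spotting near-duplicate designs; real duplicate
    detection, as described above, also relies on structural alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("this simple check requires equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)

# Two hypothetical 10-residue designs differing at a single position:
print(sequence_identity("MKTAYIAKQR", "MKTAYIAKQK"))  # → 0.9
```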

A successful experiment

Congratulations again to our champion LociOiling and all of the players who participated in the competition!

One thing that is still missing from this analysis is player feedback. We invite all players (participants and observers) to leave a comment below with your thoughts about this competition. Was gameplay significantly different than in normal puzzles? Did you enjoy it more or less? Do you have suggestions that would make this kind of competition more fun, or more productive?

Keep up the great folding, and practice your binder design skills in the latest Puzzle 1973: Tie2 Binder Design: Round 1!

BootsMcGraw Lv 1

"…there were some cases where a player submitted two valid solutions with almost the exact same sequence and structure, even though they were designed completely independently after a puzzle reset."

Considering that I submitted eight valid solutions and was credited for only seven, I am going to guess this was the case.

Does anyone have a script that compares two solutions to see how much they have in common? I might have spent that entire evening developing another design, had I known.

BootsMcGraw Lv 1

I made at least thirteen attempts to meet the 10K minimum score; not all were successful. The ONLY solutions I had that met the minimum score for submission had the full DDG and SC and one or fewer BUNS and one or fewer bad loops (but not both BUNS and bad loops).

The criteria were challenging, and only mildly frustrating. 8/10 would play, again.

robgee Lv 1

10k was a good target.
Took me a week to get 1st solution.
11 attempts for 3 solutions.
Gameplay difference:

endgamed for way less time <2hrs.
Enjoyment :
more fun 'cause it was a challenge,
also more frustrating but in a good way.
More productive:
Lol! You got 2.8x more solutions,
how many more do you want!! :p
Summary:
Challenging but fun, would play again.

spvincent Lv 1

I enjoyed this puzzle: I think the format is preferable to the limited-move style of puzzle we had previously, where there was something of a feeling of being "rushed" (not in a time sense clearly, but rather the move limit acted as a disincentive to backtracking when midway through a puzzle).

I thought I'd submitted 13 solutions. Turns out I forgot to upload one (oh well) but I was wondering about the other two, which maybe were flagged as invalid for some reason. Failure to reset properly perhaps, although I thought I was pretty careful about that.

Look forward to more puzzles like this.

CharlieFortsConscience Lv 1

I liked this puzzle a lot, as it gave us all the benefit of time. So often, I've wished 'if only I had longer…' so when this puzzle was posted, I was instantly drawn. It ended up feeling like a competitively meaningful sandbox puzzle. It took me a week to post 4 solutions, but I started to get a feel for what worked and what didn't, which guided my approach and allowed me to refine the process so that I was able to churn out 3 or 4 solutions a day.

The vast majority of my 33 were tri-helical bundles. I had a couple of quad helices, and a couple of tri-sheet with bi-helicals nestled underneath. I began to notice a sweet spot that appeared to satisfy the DDG and Contact bonuses repeatedly, so I concentrated on preparing solutions that consistently held that successful helix in place and then varied the other 2 helices by one or 2 sidechains to make them different enough to qualify. And you can see the results here -

https://imgflip.com/gif/53nad5

And I, too, inadvertently created a duplicate, by mishandling the blueprint setup at the beginning of the process and not realising it at the time. Interestingly, those 2 solutions ended up differing by one single residue by the end.

And I also want to send congrats to LociOiling. I thought I was in with a shout with 32, but it was not to be. Well played sir.

Bruno Kestemont Lv 1

Hahaha " But there were several instances where a player, after submitting a solution, restarted the puzzle and rebuilt almost the exact same solution from scratch!"

LOL we players will always try to find a way to "gain" a competition even if it's not scientifically interesting. Just for fun and/or if it can save us time.

In the end, I developed a kind of "industrial" design production, repeating the same succession of strategies/actions/recipes.

I feel I'm favoured with a (new) multicore computer. This kind of competition might disfavour owners of old computers (with few clients).

Suggestion: you could try to "correct" this competitive advantage by only considering (for competition, not for science) a maximum of 1 design per day.

Gaming suggestion: is there a way to reward the valid shares to scientists? For example, by giving a +1 final bonus point for each valid share, as a separate "puzzle" named "bonus for puzzle x". In this case, LociOiling would now gain 43 global points etc.

LociOiling soloist score, and mine, would change as follows:
LociOiling 3538+43=3581
Bruno Kestemont 3528+10=3538

Or a built-in bonus system that would "recognise" good shares and immediately reward points in the puzzle score. It would be amazing to discover afterwards that a winning player actually didn't find the best-scoring solution but "only" gained a lot of sharing rewards. That would make the competition strategies more elaborate than simply trying to get the highest score.

ucad Lv 1

Does or should Foldit mine for amino acid sequences during gameplay? Recipes used to mutate the higher-scoring solutions must be generating many unique sequences still over 10k.

Perhaps a few endgame recipes/features that work based off amino acid conservation and point loss thresholds would be useful. Ones that generate a cloud of unique mutated solutions rather than grind away without mutating.

NinjaGreg Lv 1

Up until now, I liked having three puzzles going at once, so that when one got to the endgame I could give it less attention; by that point I was less interested, except for the score. With this puzzle, it was fun to try different initial designs out.

I like Bruno's suggestion of only considering one submission per day, so those of us that have slower computers can still compete.

I did try three designs. The second one couldn't score high enough, the third I think I forgot to submit.

Count me as a favorable response!