
2470: CASP16 Ligand Puzzle L4000 Round 2

Closed since over 1 year ago

Intermediate Overall Small Molecule Design

Summary


Created
June 14, 2024
Expires
Max points
100
Description

Use the Ligand Queue (default hotkey 7) and the Reaction Design Panel (no default hotkey) to explore how different ligands bind to the protein.

This puzzle is part of the CASP16 competition. Foldit players are participating to see how well they can predict how small molecules bind to proteins. Note that in contrast to prior drug design puzzles, we're not just interested in the top scoring small molecule, but instead are interested in getting good structures for each of the provided ligand compounds. It's worth dividing your time across all the compounds, rather than concentrating on a particular one.

The protein target is the SARS-CoV-2 Mpro protease. There are over a thousand structures of this protein in the Protein Data Bank, many of them bound to ligands. There are 25 ligand structures of interest in this competition. The starting small molecule is one of the co-crystallized ligands, and is provided just to indicate the likely binding site. It's not one of the molecules in the competition - you'll need to use the Ligand Queue or the Reaction Design tool in order to load one of the other ligands.

Since the goal is to predict the structure of the protein-ligand complex, we've allowed full backbone and sidechain flexibility on this puzzle. That said, all of the bound structures are highly similar to each other (and thus to the starting structure). The backbone is very unlikely to change at all from the starting conformation, and there are restraints (unchangeable bands) to the starting conformation; these will show up as red lines if you move the backbone too far.

Compared with the first round, we're starting with a different protein backbone and initial ligand structure. We've also enabled the Reaction Design panel as an alternate way of navigating the various compounds available, but with only a single "reaction" - all the compounds should be available as "reactants" to this single reaction. For this round we're focusing on a subset of 20 compounds which still need additional attention. All the ligands to test should be accessible directly from the Reaction Design panel or the Ligand Queue.

Sequence
SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCSGVTFQ

Top groups


  1. Contenders 100 pts. 22,111
  2. Anthropic Dreams 68 pts. 22,039
  3. Go Science 44 pts. 21,407
  4. L'Alliance Francophone 27 pts. 21,030
  5. Void Crushers 16 pts. 20,594
  6. Australia 9 pts. 20,154
  7. VeFold 5 pts. 19,828
  8. Russian team 3 pts. 19,623
  9. FamilyBarmettler 1 pt. 19,463
  10. Gargleblasters 1 pt. 18,789

  41. Steven Pletsch Lv 1 1 pt. 13,214
  42. frostschutz Lv 1 1 pt. 13,153
  43. hada Lv 1 1 pt. 12,894
  44. zbp Lv 1 1 pt. 12,717
  45. Dr.Sillem Lv 1 1 pt. 12,524
  46. rinze Lv 1 1 pt. 12,506
  47. Swapper242 Lv 1 1 pt. 12,303
  48. abiogenesis Lv 1 1 pt. 12,267
  49. vybi Lv 1 1 pt. 12,168
  50. Kimdonghyeon Lv 1 1 pt. 12,155

Comments


rmoretti Staff Lv 1

Objectives

Maximum bonus: +1 000

Torsion Quality (max +1000)
Keeps bond rotations in a good range. Using Wiggle or Tweak Ligand can fix bad torsions. (Show highlights torsions to be rotated.)

LociOiling Lv 1

Thanks to @HuubR, the list of CASP16 L4000 compounds now links to a molecule viewer for each compound. Just click on the compound ID in the first column to open the link.

A quick check of round 2 shows that compounds L4001, L4013, L4010, L4017, and L4027 aren't available in the ligand queue or reaction design tools. These compounds were available in round 1.

Despite the puzzle comments, a number of compounds aren't available via reaction design, although they do appear in the ligand queue:

  • L4002
  • L4006
  • L4007
  • L4008
  • L4010
  • L4018
  • L4019
  • L4020
  • L4025
  • L4026

That leaves only the 20 compounds in the ligand queue.

The reaction design tool seems to produce a number of duplicate compounds, compounds where all the small molecule properties shown in Foldit are the same. This is similar to the results from the reaction design tool in last week's L3000 puzzle.

(Edit: added L4010 to the list of unavailable compounds.)

jeff101 Lv 1

Since round 2 has just 20 of the 25 compounds from round 1,
I take it we explored 5 of the compounds in round 1 enough.
How did you decide we had explored these 5 compounds enough?

spvincent Lv 1

Why is it expected that any change from the starting conformation would be highly unlikely? Those big loops such as 137-147 look as if they ought to be pretty flexible.

rmoretti Staff Lv 1

@LociOiling The ligand queue and Reaction Panel reactants should have exactly the same list of compounds. I'm not sure why you're seeing a discrepancy: that might point to a bug. Thanks for doing the collation.

@jeff101 I took a look at all the protein-ligand complexes which were sent back, and counted the number of times each compound was seen, as well as the number of players who submitted poses for each compound. There were some clear outliers, but the cutoff was set such that we got 20 compounds (which is where both the ligand queue and reaction design panel effectively max out).
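As a rough illustration of that kind of tally, here is a minimal Python sketch. The exact rule used to pick the 20 compounds is not stated above, so everything here is an assumption: the input format (one `(player, compound_id)` pair per submitted pose), the function names `rank_compounds` and `needs_attention`, and the ranking rule (fewest distinct players first, then fewest poses) are all hypothetical.

```python
from collections import Counter, defaultdict


def rank_compounds(submissions):
    """Rank compounds from least to most explored.

    `submissions` is an iterable of (player, compound_id) pairs, one pair
    per submitted pose. This input format is an assumption.
    """
    pose_counts = Counter()        # compound -> number of poses seen
    players = defaultdict(set)     # compound -> distinct players who submitted it
    for player, compound in submissions:
        pose_counts[compound] += 1
        players[compound].add(player)
    # Hypothetical ranking rule: fewest distinct players first, then fewest
    # poses, with the compound ID as a deterministic tie-breaker.
    return sorted(pose_counts,
                  key=lambda c: (len(players[c]), pose_counts[c], c))


def needs_attention(submissions, keep):
    """Return the `keep` least-explored compounds under the rule above."""
    return rank_compounds(submissions)[:keep]
```

Note that a compound with no submissions at all never appears in the tally, so a real analysis would need to start from the full compound list rather than from the submissions alone.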

@spvincent In looking at the various crystal structures bound to a range of different compounds, there's actually very little movement of the loops around the binding site. I think it's one of those situations where despite a region being "loopy" in secondary structure, the interactions it's making with the rest of the protein are well defined and localized, which means that the structure (while not "regular") is held pretty tightly into its conformation.

rmoretti Staff Lv 1

@LociOiling Update on the "missing" compounds. The panel is set up to randomize the reagent list. However, the approach it currently takes when randomizing for 20 or more compounds is sampling with replacement, which means that you have a good chance of getting duplicate compounds when the number of reagents is close to 20. But if you re-randomize the compounds (e.g. by using the circular arrows in the lower left), that should change which subset of compounds you get.
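To see why sampling with replacement hides so many compounds, here is a small Python sketch. It assumes the panel fills 20 slots by drawing uniformly at random from 20 compounds; the uniform draw and the slot count are assumptions for illustration, not a description of Foldit's actual code. Under those assumptions, the expected number of distinct compounds after k draws from n is n(1 - (1 - 1/n)^k), which for n = k = 20 is about 12.8, so roughly 7 of the 20 compounds would be missing from any one randomization.

```python
import random


def expected_distinct(n_items: int, n_draws: int) -> float:
    """Expected count of distinct items after uniform draws with replacement."""
    return n_items * (1.0 - (1.0 - 1.0 / n_items) ** n_draws)


def simulate_distinct(n_items: int, n_draws: int,
                      trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each trial draws n_draws items with replacement and counts the
        # distinct ones by collecting them into a set.
        total += len({rng.randrange(n_items) for _ in range(n_draws)})
    return total / trials


if __name__ == "__main__":
    print(f"closed form: {expected_distinct(20, 20):.2f}")
    print(f"simulated:   {simulate_distinct(20, 20):.2f}")
```

Re-randomizing draws a fresh sample each time, which is why repeated use of the circular arrows should eventually surface every compound.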