2458: CASP16 Ligand Puzzle L4000

Closed since almost 2 years ago

Intermediate Overall Small Molecule Design

Summary


Created
May 17, 2024
Expires
Max points
100
Description

Use the Ligand Queue (default hotkey 7) to explore how different ligands bind to the protein.

This puzzle is part of the CASP16 competition. Foldit players are participating to see how well they can predict how small molecules bind to proteins. Note that in contrast to prior drug design puzzles, we're not just interested in the top-scoring small molecule, but in getting good structures for each of the provided ligand compounds. It's worth dividing your time across all the compounds, rather than concentrating on a particular one.

The protein target is the SARS-CoV-2 Mpro protease. There are over a thousand structures of this protein in the Protein Data Bank, many of them bound to ligands. There are 25 ligand structures of interest in this competition. The starting small molecule is one of the co-crystallized ligands, and is provided just to indicate the likely binding site. It's not one of the molecules in the competition; you'll need to use the Ligand Queue tool to load one of the other ligands.

Since the goal is to predict the structure of the protein-ligand complex, we've allowed full backbone and sidechain flexibility on this puzzle. That said, all of the bound structures are highly similar to each other (and thus to the starting structure). The backbone is very unlikely to change at all from the starting conformation, and there are restraints (unchangeable bands) to the starting conformation. These will show up as red lines if you move the backbone too far.

Sequence
SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCSGVTFQ

Top groups


  1. Contenders 100 pts. 23,861
  2. Go Science 65 pts. 22,953
  3. Anthropic Dreams 41 pts. 21,197
  4. L'Alliance Francophone 24 pts. 21,061
  5. VeFold 14 pts. 20,180
  6. Marvin's bunch 7 pts. 20,064
  7. Australia 4 pts. 19,979
  8. FamilyBarmettler 2 pts. 19,830
  9. Street Smarts 1 pt. 17,395
  10. Foldit Staff 1 pt. 15,686

  31. hada Lv 1 6 pts. 19,716
  32. georg137 Lv 1 5 pts. 19,712
  33. Merf Lv 1 5 pts. 19,700
  34. Arne Heessels Lv 1 4 pts. 19,688
  35. DScott Lv 1 4 pts. 19,653
  36. AlphaFold2 Lv 1 3 pts. 19,645
  37. Steven Pletsch Lv 1 3 pts. 19,630
  38. Vinara Lv 1 2 pts. 19,628
  39. abiogenesis Lv 1 2 pts. 19,619
  40. BarrySampson Lv 1 2 pts. 19,574

Comments


rmoretti Staff Lv 1

Objectives

Maximum bonus: +1000

Torsion Quality (max +1000)
Keeps bond rotations in a good range. Using Wiggle or Tweak Ligand can fix bad torsions. (Show highlights torsions to be rotated.)

rmoretti Staff Lv 1

What happened to L3000? It's a large protein with a large number of ligands. We're waiting on some fixes that should be making their way to devprev before we run that one, so we've skipped ahead.

Eagle-eyed players will notice that the ligands in the Ligand Queue go up to L4028 and that L4005, L4012 & L4021 are missing. That's not a mistake. The CASP organizers omitted those numbers for some reason.
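
As a quick sanity check on that numbering, here is a minimal sketch that enumerates the ligand IDs. It assumes the IDs start at L4001, which the post does not state (only the upper end, L4028, is given). Under that assumption, dropping the three omitted numbers leaves 25 IDs, which matches the 25 ligands mentioned in the puzzle description.

```python
# Assumption: IDs run from L4001 to L4028; only the upper end is confirmed above.
FIRST, LAST = 4001, 4028
OMITTED = {4005, 4012, 4021}  # numbers the CASP organizers skipped

# Build the list of IDs expected in the Ligand Queue.
ligand_ids = [f"L{n}" for n in range(FIRST, LAST + 1) if n not in OMITTED]

print(len(ligand_ids))  # 25
print(ligand_ids[:5])   # ['L4001', 'L4002', 'L4003', 'L4004', 'L4006']
```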

Also, the CASP organizers' descriptions note that several compounds (L4003, L4013, L4019, & L4023) are covalently bound to a cysteine in the pocket. They also note that several of the compounds have multiple ligands and other co-crystallized molecules in the active site. Due to technical limitations of the Foldit client, we're not dealing with that at the moment; right now we're just asking you to dock the single ligand in the pocket.
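
For anyone wondering where the cysteines sit, the sketch below lists the 1-based positions of every cysteine in the sequence given in the puzzle description. The post does not say which cysteine the covalent compounds attach to. The catalytic cysteine of Mpro is usually reported as Cys145, and position 145 in this string is a cysteine, but whether Foldit's segment numbering lines up with these positions is an assumption.

```python
# Sequence copied from the puzzle's "Sequence" field, split into 60-residue chunks.
SEQUENCE = (
    "SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIR"
    "KSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNG"
    "SPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGN"
    "FYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYE"
    "PLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQC"
    "SGVTFQ"
)

# 1-based positions of cysteine ("C") residues in the string.
cysteine_positions = [i for i, aa in enumerate(SEQUENCE, start=1) if aa == "C"]

print(len(SEQUENCE))
print(cysteine_positions)
```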

LociOiling Lv 1

The restraints mentioned in the post are also known as "constraints". They're what the "show constraints" view option shows or hides. It's been a while since a puzzle had any constraints, but "show constraints" is checked by default.

alcor29 Lv 1

Is there a link to the listings of all 25 ligands and their characteristics like we had for the first CASP puzzle with 17 ligands?

LociOiling Lv 1

Here are the CASP 16 L4000 compounds as seen in the ligand queue in this puzzle. The info was transcribed by hand, so errors are possible. Corrections welcomed.

Update: L4003 and L4027 were wrong in the initial list. These entries have been totally replaced in the current version. There were also minor changes to a couple of other entries.

I didn't find entries for L4005, L4012, and L4021 in the ligand queue. That's different to what rmoretti mentions above. I'll see if the CASP site has an official list.

jeff101 Lv 1

https://predictioncenter.org/casp16/targetlist.cgi has a list of targets posted so far.
#'s 5-8 in this list are L1000 L2000 L3000 & L4000. If you click on their names, you get:
https://predictioncenter.org/casp16/target.cgi?id=59&view=all for L1000
https://predictioncenter.org/casp16/target.cgi?id=58&view=all for L2000
https://predictioncenter.org/casp16/target.cgi?id=57&view=all for L3000
https://predictioncenter.org/casp16/target.cgi?id=56&view=all for L4000
On the page listed above for L4000, you can scroll down to find a compressed
tar.gz archive file of SMILES info for L4000's ligands:
https://predictioncenter.org/download_area/CASP16/extra_experiments/L4000.tar.gz
I haven't downloaded or looked inside this file yet, but
https://predictioncenter.org/casp16/target.cgi?id=56&view=all
has some useful trivia about L4000's protein and ligands.
Does anyone itch to use https://zinc20.docking.org/substances/home/
to convert the ligand SMILES codes into images for the Foldit wiki?
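
For anyone who wants to peek inside that archive, here is a minimal sketch using only the Python standard library. It downloads the file from the URL above and lists the member names. The function names are made up for illustration, and nothing is assumed about the file names or format inside the archive, since its contents aren't described here. It also assumes the URL is still reachable.

```python
import io
import tarfile
import urllib.request

# URL taken from the post above; availability is not guaranteed.
L4000_URL = "https://predictioncenter.org/download_area/CASP16/extra_experiments/L4000.tar.gz"


def fetch(url: str) -> bytes:
    """Download a URL and return the raw bytes (no retries or caching)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def list_archive_members(data: bytes) -> list[str]:
    """Return the member names of a gzip-compressed tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()


# Usage (requires network access):
#   for name in list_archive_members(fetch(L4000_URL)):
#       print(name)
```

Listing the names first, rather than extracting straight away, is a cautious way to see what the archive holds before writing anything to disk.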

L3000 sounds like a monster, so I am glad we are putting it off.
L1000-L4000 all have expiration-dates/deadlines from July 1 to July 15.
Perhaps this means we can revisit ones like L1000 with its 17 ligands
before its CASP16 deadline.

The page below is also helpful. It seems to list only CASP16 ligand targets:
https://predictioncenter.org/casp16/targetlist.cgi?view=ligand