2207: SARS-CoV-2 helicase CACHE Challenge: Round 2

Closed over 3 years ago

Intermediate Overall Small Molecule Design

Summary


Created
September 30, 2022
Expires
Max points
100
Description

Compete in a challenge to design a drug targeting the SARS-CoV-2 helicase. Use the small molecule design tools and the compound library panel to find library compounds which bind to the active site of the enzyme.

Note: To get the most out of the small molecule design tools, we recommend changing your view settings to the Small Molecule Design Preset.

This puzzle is part of Foldit's participation in the CACHE Challenge. From the set of all compounds submitted in the multiple rounds of puzzles, Foldit scientists will select up to 100 compounds based on the CACHE-provided criteria. Only compounds which are in a commercially available library will be selected, so it's beneficial to make use of the Compound Library panel to search for library compounds similar to your current design. But don't limit yourself to the compound library. You're more likely to get good results by alternating: optimize the molecule with the small molecule design tools, find the closest library compound, then refine it further with the design tools.
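The alternating workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page does not say how the Compound Library panel measures similarity, so Tanimoto similarity over sets of made-up feature labels stands in for it here, and `LIBRARY`, `closest_library_compound`, `alternate`, and the `refine` callback are all hypothetical names, not part of Foldit.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_library_compound(design, library):
    """Return (name, similarity) of the library entry most similar to design."""
    name = max(library, key=lambda n: tanimoto(design, library[n]))
    return name, tanimoto(design, library[name])


def alternate(design, library, refine, rounds=3):
    """Alternate a refinement step with snapping to the nearest library entry.

    `refine` is a hypothetical stand-in for manual work with the design tools.
    """
    history = []
    for _ in range(rounds):
        design = refine(design)  # hand-optimize the current molecule
        name, sim = closest_library_compound(design, library)
        history.append((name, sim))
        design = library[name]  # continue from the library compound
    return history


# Hypothetical toy library: each compound is a set of invented feature labels.
LIBRARY = {
    "lib-A": frozenset({"amide", "phenyl", "chloro"}),
    "lib-B": frozenset({"amide", "pyridine", "hydroxyl"}),
    "lib-C": frozenset({"sulfonamide", "phenyl", "fluoro"}),
}

design = frozenset({"amide", "pyridine", "hydroxyl", "methyl"})
print(closest_library_compound(design, LIBRARY))
```

A rising similarity across rounds in the returned history would correspond to a design that is moving toward a region where library compounds resemble it; a flat or falling value suggests the hand edits are moving away from the library.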

The changes from the first round are a more stringent Bad Groups objective and the addition of the Synthetic Accessibility objective. We hope this will reduce the number of high-scoring, far-from-library compounds we're seeing.

Participation in CACHE puzzles is subject to the CACHE Terms of Participation, in particular “the Challenge IP [including Challenge Compounds] will be made freely available in the public domain pursuant to Creative Commons Attribution Only (CC-BY 4.0 or subsequent versions) licensing terms, with the intent that such Challenge IP may be Used and practiced by Users for any purpose”.

Top groups


  1. Anthropic Dreams 100 pts. 17,094
  2. Contenders 65 pts. 16,989
  3. Go Science 41 pts. 16,754
  4. Gargleblasters 24 pts. 16,734
  5. Marvin's bunch 14 pts. 16,363
  6. Russian team 7 pts. 16,193
  7. FamilyBarmettler 4 pts. 16,150
  8. L'Alliance Francophone 2 pts. 16,009
  9. Australia 1 pt. 15,968
  10. Foldit Staff 1 pt. 15,822

  1. gmn Lv 1 100 pts. 17,067
  2. Aubade01 Lv 1 95 pts. 16,970
  3. spvincent Lv 1 90 pts. 16,945
  4. drjr Lv 1 85 pts. 16,939
  5. Sandrix72 Lv 1 80 pts. 16,754
  6. HuubR Lv 1 76 pts. 16,734
  7. ucad Lv 1 71 pts. 16,706
  8. Bruno Kestemont Lv 1 67 pts. 16,618
  9. LociOiling Lv 1 63 pts. 16,532
  10. maithra Lv 1 60 pts. 16,515

Comments


rmoretti Staff Lv 1

Objectives

Objectives in this puzzle are driven primarily by the evaluation criteria used by CACHE.

Maximum bonus: +7,000

Torsion Quality (max +1000)
Keeps bond rotations in a good range. Using Wiggle or Tweak Ligand can fix bad torsions. (Show highlights torsions to be rotated.)

Number of Rotatable Bonds (max +1000)
Intended to keep the ligand from getting too big and floppy. You can reduce rotatable bonds by deleting groups or forming rings. (Show highlights rotatable bonds.)

Ligand TPSA (max +1000)
Topological Polar Surface Area - Keeps the polar surface area (including buried polar surface) low. To improve, try removing oxygens and nitrogens. (Show highlights atoms contributing to higher TPSA.)

Ligand cLogP (max +1000)
A measure of lipophilicity (calculated logP) - Keeps the molecule from getting too hydrophobic. To improve, try adding polar oxygens and nitrogens. (Show highlights atoms contributing to higher cLogP.)

Bad Groups (max +1000)
Gives a bonus for avoiding groups that interfere with assays, or which are far from the compounds in the library. (Show highlights groups at issue.)

Molecular Weight (max +1000)
Keeps the ligand a reasonable size.

Synthetic Accessibility (max +1000)
Keeps the ligand from going too far from the compounds in the library. (Show highlights parts of the molecule at issue.)
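As a rough illustration of how the seven objectives relate to the maximum bonus, the sketch below assumes the per-objective bonuses simply add up, which is consistent with seven caps of +1000 giving +7,000 but is not stated explicitly. The function names and the example scores are hypothetical.

```python
# Per-objective caps as listed above.
OBJECTIVE_MAX = {
    "Torsion Quality": 1000,
    "Number of Rotatable Bonds": 1000,
    "Ligand TPSA": 1000,
    "Ligand cLogP": 1000,
    "Bad Groups": 1000,
    "Molecular Weight": 1000,
    "Synthetic Accessibility": 1000,
}


def total_bonus(earned):
    """Sum per-objective bonuses, capping each at its listed maximum.

    Assumption: bonuses are additive and a missing objective earns 0.
    """
    return sum(min(earned.get(name, 0), cap) for name, cap in OBJECTIVE_MAX.items())


def shortfalls(earned):
    """Return the objectives that are below their cap, with the points missing."""
    gaps = {}
    for name, cap in OBJECTIVE_MAX.items():
        missing = cap - min(earned.get(name, 0), cap)
        if missing > 0:
            gaps[name] = missing
    return gaps


# Hypothetical ligand: full marks everywhere except cLogP and Bad Groups.
example = dict(OBJECTIVE_MAX)
example["Ligand cLogP"] = 750
example["Bad Groups"] = 750

print(total_bonus(example))
print(shortfalls(example))
```

Listing the shortfalls this way mirrors what the Show highlights do in the client: it points at which objectives are still costing points, not how to fix them.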

spvincent Lv 1

I don't understand how the requirement that only compounds in a commercially available library will be submitted will work. The best score in Foldit will almost certainly come from a "hand crafted" design: submitting this to get a library of similar designs results in a list of compounds that score nowhere near as well as the original (that's my experience anyway). I'm getting lists of 25 compounds all of which contain halogens and most of which contain sulphur as well. On accepting these compounds into Foldit they invariably have penalties from the cLogP or Bad Group objectives (you'd have thought that by definition anything in the compound library would score perfectly as far as the Bad Group objective is concerned). It doesn't seem practical to work with these suggested compounds: by the time all the objectives have been fixed the compound is changed beyond recognition.

rmoretti Staff Lv 1

The restriction to testing only library compounds is a practical one. We really don't have the resources to custom synthesize the "hand crafted" designs. The hope is that people will come up with hand-crafted designs which are close to compounds in the library. Those library designs might not score quite as well as some more hand-tailored designs, but we'd hope they'd score decently. We'll look through the designs to find those library compounds which work best.

To some extent, this is an experiment to see how well this approach (manual adjustment iterated with library searching) works. If you have recommendations about how to make it work better (given the constraint that the set of library compounds is the set of library compounds), we're receptive. For right now, my best suggestion is to work with the Compound Library iteratively. Get the results, fix them up, then re-search the library. Hopefully you'll eventually walk to a region of chemical space where the library results are more similar to your designed/optimized compounds.

Regarding the Bad Group objective, right now we're passing along the results from the database search as-is, and we don't have any pre-filters for compound properties. On the round 2 puzzle in particular, the Bad Group objective may trigger due to the fact that we greatly increased its stringency.

jeff101 Lv 1

In the puzzle description it says the goal is to "find library compounds which bind to the active site of the enzyme." Where exactly is the active site of this enzyme? Is it the region that is full of voids? Can you list for us some segment numbers that are thought to be in the active site of this enzyme? If, for example, the active site is on helix X over here and strand Z over there, should our compounds try to bind both helix X and strand Z? Would it be enough to bind to helix X only? Would it be enough to bind to strand Z only? Do our Foldit scores reflect at all how well our compound binds to the active site of the enzyme? Is it possible to get a high Foldit score for bonding to a region of the enzyme far from the active site? Is the active site mostly inside the enzyme, or is it more on the surface of the enzyme? Thanks in advance.

rmoretti Staff Lv 1

The helicase enzyme is actually a rather large structure, so the starting protein we've given has been trimmed considerably. It should be just the "cup" around the active site of interest. (There's the greatest number of voids in that pocket, but there are some voids elsewhere, as well.) The starting ligand should be in roughly the place where we want things to bind. (Anywhere within that pocket should be fine.) I'm not sure if we have a good sense of whether it's better to bind tightly to one side of the pocket (e.g. just helix X), or to try to bridge the gap (bind both helix X and strand Z). As long as you don't move the ligand to the "outside"/"back", you should be fine.

If you're interested in more details about the target site, the CACHE website has more info, in particular this presentation. Click on the slides tab and "Channel site sticks". We're attempting to block the binding of the yellow RNA molecule by filling in the location where it overlaps with the blue channel. For orientation, the Y515 and the H554 there are the Tyrosine and Histidine that are contacting the starting ligand (Segments 136 and 175 in Foldit. See also the PDB# in the TAB Segment information box).

Foldit scores should include how well the ligand binds. It's theoretically possible to get decent scores when binding outside of the pocket, but at this point we think it unlikely. Looking at the results from Round 1, I didn't really see many good-scoring compounds that would be excluded by being outside the pocket. If that starts to change, we can try changing the objectives.

jeff101 Lv 1

Thanks for your reply, rmoretti. I have been taking snapshots like below for my puzzle 2207 results. I use the same View settings with a full-screen GUI each time, and I press Home right before I take each snapshot. I would like to return to puzzle 2204 to make similar snapshots for my results from there, but none of my clients are letting me access puzzle 2204 anymore. How can I fix this?

Anyway, could you draw circles on the snapshot above to show where we should try to have our ligands bind?

Most solutions I've seen have had ligands bind in the region with the most red voids. Nevertheless, one that got a score of 15522.683 +6600 had its ligand near segments phe20 and thr24 on the top edge of the top-most helix shown. This helix extends from glu15 at its left end to ser35 at its right end. For the solution with score 14683.101 +7000, there was a void on this helix near glu23. Nevertheless, for the solution with 15522.683 +6600, that void was shifted to the right to leu25-lys26. I have shared both of these 2207 solutions with scientists. Would you consider the 15522.683 +6600 solution a realistic one, or should we try to avoid making ones like it?

jeff101 Lv 1

Another issue is that sometimes when I set up a Foldit GUI to display things as above, its client becomes frustratingly slow. I suspect this is related to having both the Objectives and the Small Molecule Properties visible simultaneously, so most of the time when I run recipes, I hide the protein, Objectives, and Small Molecule Properties and then minimize the Foldit GUI. Why do you think the clients slow down so much? What would you recommend I do to fix this problem?

Thanks again,
Jeff