SARS-CoV-2 helicase CACHE Challenge preliminary results

Started by rmoretti

rmoretti Staff Lv 1

We have preliminary results to share from the CACHE challenge!

The CACHE Challenge

About a year ago we launched a puzzle series as part of the CACHE Challenge. CACHE is like CASP for small molecule drug design – it’s an independent, blind prediction task to see if computationalists can do a good job of predicting which small molecules bind to a given protein structure. The first Foldit CACHE puzzle series (for CACHE Challenge #2) was to design a compound that binds to the RNA binding site of SARS-CoV-2 helicase.

Since that puzzle series ended, we’ve taken all the designs Foldit players created over the eight rounds and filtered them against the CACHE-provided quality metrics (e.g. molecular weight, hydrophobicity, fraction of sp3 carbons). To maximize the number of compounds we would be able to order, we also limited the selection to compounds that were in the compound library. (We did also search the top off-library compounds for similar on-library compounds, but that didn’t produce any new compounds.)
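
As a rough illustration of that kind of property filter, here is a minimal RDKit sketch. The thresholds below are placeholders for illustration only, not the actual CACHE cutoffs, and the real filtering pipeline may differ in its details.

```python
# Minimal sketch of a descriptor-based quality filter (illustrative thresholds,
# not the actual CACHE cutoffs).
from rdkit import Chem
from rdkit.Chem import Descriptors, rdMolDescriptors

def passes_quality_filter(smiles,
                          max_mol_wt=500.0,     # placeholder threshold
                          max_logp=5.0,         # placeholder threshold
                          min_frac_csp3=0.2):   # placeholder threshold
    """Return True if a compound clears the example property cutoffs."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:                              # unparseable SMILES -> reject
        return False
    mol_wt = Descriptors.MolWt(mol)                     # molecular weight
    logp = Descriptors.MolLogP(mol)                     # hydrophobicity proxy
    frac_sp3 = rdMolDescriptors.CalcFractionCSP3(mol)   # fraction of sp3 carbons
    return mol_wt <= max_mol_wt and logp <= max_logp and frac_sp3 >= min_frac_csp3

# Example: check one of the eventual hits
print(passes_quality_filter("Cc1ccccc1c1nc(C(O)=O)c2CCCn12"))
```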

The compounds which passed those filters were then redocked into the protein to make sure that the designed binding mode was specific. The compounds were then ranked for predicted binding energy. We removed a number of very similar compounds to get more diversity in the set and then sent the list to the CACHE organizers, who consulted with Enamine and with us to reduce the set to those compounds which could be ordered and would still be within budget.
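
The diversity step can be thought of as a greedy similarity prune over the ranked list. Here is a minimal sketch of that idea using Morgan fingerprint Tanimoto similarity; the 0.8 cutoff is illustrative, and this is not necessarily the exact procedure we used.

```python
# Hedged sketch: keep the best-ranked member of each group of highly similar
# molecules, dropping near-duplicates further down the ranking.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def prune_similar(ranked_smiles, sim_cutoff=0.8):
    """ranked_smiles: list of SMILES, best predicted binding energy first."""
    kept_fps, kept = [], []
    for smi in ranked_smiles:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
        # Skip compounds too similar to anything already kept
        if any(DataStructs.TanimotoSimilarity(fp, k) >= sim_cutoff for k in kept_fps):
            continue
        kept_fps.append(fp)
        kept.append(smi)
    return kept
```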

Results

Since then, CACHE has ordered and tested the compounds. Due to the difficulties of chemical synthesis, not all of the compounds we submitted could be tested, but 76 of them were! The CACHE organizers and their collaborators ran a number of different types of assays to make sure that the compounds were binding to the helicase, weren’t binding nonspecifically, didn’t show odd aggregation, and weren’t binding to the ATP pocket of the helicase.

And Foldit players designed two compounds which passed this initial screen! Congratulations to Aubade01 and an anonymous player for their designs. (If you wish your username to be mentioned in future Foldit blog posts and papers, you can go to https://fold.it/profile/edit or click the gear icon in the upper right when logged in to change the “Foldit can share my username” setting.) Both successful compounds were from Round 7, though they weren’t the top scoring compounds from that round.

Hit 1: SMILES Cc1cc(c(C#N)cn1)N1C[C@@H](CO)[C@@H](C1)c1cccc(c1)F

Hit 2: SMILES Cc1ccccc1c1nc(C(O)=O)c2CCCn12
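
For anyone who wants to poke at the hits themselves, both SMILES strings can be loaded directly with RDKit, e.g.:

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

hits = {
    "Hit 1": "Cc1cc(c(C#N)cn1)N1C[C@@H](CO)[C@@H](C1)c1cccc(c1)F",
    "Hit 2": "Cc1ccccc1c1nc(C(O)=O)c2CCCn12",
}
for name, smiles in hits.items():
    mol = Chem.MolFromSmiles(smiles)
    print(name, round(Descriptors.MolWt(mol), 1), "Da,",
          mol.GetNumHeavyAtoms(), "heavy atoms")
```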

Foldit was moderately successful in its submission - 10/23 participating groups had two or more compounds selected for being advanced to the next phase, for a total of 46 compounds being advanced. Our best binding compound had an affinity of 33 µM, which falls in the middle of the range of compounds being advanced (5 µM to 250 µM, with lower being better).

On to the next phase

Since we have compounds which passed preliminary screening, we’ve been invited to participate in the next phase! In this phase, we’re asked to explore the “structure activity relationship” of the compounds we had success with. That is, can we find compounds which are similar to the compounds we submitted, but which have better binding affinity?

Similar to the first phase, only compounds which are in the compound library will be considered. Additionally, we need to submit compounds which are “close enough” to our hit compounds. There isn’t a hard threshold on this, but the intent is to make the hit compounds better, rather than come up with completely novel compounds. Also keep in mind that we don’t have experimental structures of the protein-ligand complex, so the starting location of the compound may not be where or how it actually binds.
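
There is no official similarity cutoff, but if you want a rough yardstick for how close a candidate is to one of the hits, a fingerprint similarity check is one common approach. The candidate below is a hypothetical analog made up for illustration, and any numeric cutoff you pick is your own choice, not a CACHE rule.

```python
# Sketch: Morgan-fingerprint Tanimoto similarity between a candidate and a hit.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def similarity_to_hit(candidate_smiles, hit_smiles):
    fps = []
    for smi in (candidate_smiles, hit_smiles):
        mol = Chem.MolFromSmiles(smi)
        fps.append(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048))
    return DataStructs.TanimotoSimilarity(fps[0], fps[1])

hit2 = "Cc1ccccc1c1nc(C(O)=O)c2CCCn12"
candidate = "Cc1ccccc1c1nc(C(N)=O)c2CCCn12"   # hypothetical amide analog of Hit 2
print(round(similarity_to_hit(candidate, hit2), 2))
```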

Note: due to the deadline for CACHE submission, we’re interrupting the CCHFV puzzles - they’ll be back once we’ve gotten what we need for CACHE.

Participation in CACHE puzzles is subject to the CACHE Terms of Participation, in particular “the Challenge IP [including Challenge Compounds] will be made freely available in the public domain pursuant to Creative Commons Attribution Only (CC-BY 4.0 or subsequent versions) licensing terms, with the intent that such Challenge IP may be Used and practiced by Users for any purpose”.

jeff101 Lv 1

Some quotes from above bring up questions for me:

"The compounds which passed those filters were then redocked into the protein to make sure that the designed binding mode was specific. The compounds were then ranked for predicted binding energy."

"Also keep in mind that we don’t have experimental structures of the protein-ligand complex, so the starting location of the compound may not be where or how it actually binds."

In puzzles like 2342, does the starting ligand begin in a position found from your redocking calculations? From your redocking calculations, do you know which atoms on the starting ligand should hydrogen-bond to which atoms on the protein? (If so, why not tell us these atoms?) It seems like the starting ligand can score higher in puzzle 2342 if you let it move a bit away from its starting position. Should we trust where Foldit's scoring within puzzle 2342 leads us, or should we keep the starting ligand where it begins in puzzle 2342 because this starting position comes from a more trustworthy calculation?

Finally, if you were to estimate how far the starting ligand's true binding site might be from the ligand's starting position in puzzle 2342, what would you estimate for this distance? Less than 5 Angstroms? Less than 10 Angstroms?

Thanks!

jeff101 Lv 1

Below is a figure from an old post about Foldit puzzle 1029:

For more details, see https://fold.it/forum/blog/player-designs-enter-the-wet-lab

Do your redocking calculations give results like the right-hand figure above? That is, if you took the ligand shape and position from the lowest Rosetta energy result and calculated the rms deviation between its ligand atom positions and the ligand atom positions for all other Rosetta energy results in your redocking calculations, would they look like the red and green dots in the right-hand figure above? These show that as the Rosetta energy gets lower, the ligand atom positions get closer to their positions for the lowest Rosetta energy. If you plotted Foldit scores instead of Rosetta energies, the green dots would be in the upper left of the plot rather than the lower left, giving the largest Foldit scores instead of the lowest Rosetta energies at the lowest rms values.
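
For what it's worth, here is a minimal sketch of the kind of funnel plot I'm describing, assuming you already have a list of docked poses with energies and ligand coordinates in a consistent atom order. This is just an illustration, not the actual Foldit/Rosetta analysis script.

```python
# Funnel plot sketch: RMSD of each docked ligand pose to the lowest-energy pose,
# plotted against its energy. For Foldit scores (higher is better) the funnel
# would point toward the upper left instead.
import numpy as np
import matplotlib.pyplot as plt

def ligand_rmsd(coords_a, coords_b):
    """RMS deviation between two (N_atoms, 3) coordinate arrays, same atom order."""
    diff = coords_a - coords_b
    return float(np.sqrt((diff * diff).sum(axis=1).mean()))

def funnel_plot(poses):
    """poses: list of (energy, coords) tuples; coords is an (N_atoms, 3) numpy array."""
    energies = np.array([e for e, _ in poses])
    ref_coords = poses[int(np.argmin(energies))][1]   # lowest-energy pose as reference
    rmsds = np.array([ligand_rmsd(c, ref_coords) for _, c in poses])
    plt.scatter(rmsds, energies, s=12)
    plt.xlabel("ligand RMSD to lowest-energy pose (Angstroms)")
    plt.ylabel("Rosetta energy (lower is better)")
    plt.show()
```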

In a plot of your redocking results for the starting ligand in puzzle 2342, what is the range of ligand atom rms values within, say, the lowest 10 units of Rosetta energy? What is the range of Foldit scores over this same range of lowest Rosetta energies? Does your plot give a smooth funnel shape like the right-hand figure above? Does our other top ligand from CACHE Challenge #2 give a similar plot with similar ranges and behaviors?

Thanks again!

rosie4loop Lv 1

"Foldit was moderately successful in its submission - 10/23 participating groups had two or more compounds selected for being advanced to the next phase, for a total of 46 compounds being advanced."

Actually pretty impressive, I think, considering most of the other groups basically screened the whole database of over 1 billion compounds and evaluated it with various methods. From a quick glance at the computational methods, the majority seem to pair some pre-screening strategy with automated binding prediction (roughly 11/23 used AutoDock/AutoDock Vina and variants, a few used other free codes or in-house programs, and fewer used commercial programs like Schrödinger (including Glide), MOE, or ICM), combined with AI-assisted scoring and/or QM- or MM-based energy evaluation.

Foldit's throughput must be much lower - I'd guess fewer than 500 thousand compounds across all rounds combined, even counting structures not recorded by the server or not in the library.

And the molecular weights of the hits are among the lowest, which leaves more room for optimization, so the so-called "ligand efficiency" should be high. I think that's understandable if most of the others rely heavily on Vina-based methods, since Vina is known to be biased toward bigger molecules.
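
For reference, "ligand efficiency" is often estimated as the binding free energy per heavy atom, roughly 1.37 * pKd / (heavy atom count) in kcal/mol per atom at around room temperature. A quick back-of-the-envelope sketch follows; pairing the reported 33 µM affinity with Hit 1 is my own assumption for illustration, since the post doesn't say which hit that value belongs to.

```python
# Back-of-the-envelope ligand efficiency: LE ~ 1.37 * pKd / N_heavy
# (kcal/mol per heavy atom at ~300 K). The 33 uM / Hit 1 pairing is illustrative only.
import math
from rdkit import Chem

def ligand_efficiency(smiles, kd_molar):
    n_heavy = Chem.MolFromSmiles(smiles).GetNumHeavyAtoms()
    pkd = -math.log10(kd_molar)
    return 1.37 * pkd / n_heavy

hit1 = "Cc1cc(c(C#N)cn1)N1C[C@@H](CO)[C@@H](C1)c1cccc(c1)F"
print(round(ligand_efficiency(hit1, 33e-6), 2))
```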

rmoretti Staff Lv 1

@jeff101 The current 2342 puzzle should use the player-designed conformation. We're planning on running a puzzle with the redocked conformation later. The difference in absolute (center of mass) positioning is low (less than 5 Ang), but the compound is flipped around a bit.

When we submitted compounds, we did indeed make plots like the one you show, and only selected compounds which had decent binding funnels (as the energy gets better, the poses get closer to the player-designed conformation). Though they weren't all quite as sharp as the one you've posted.

rosie4loop Lv 1

Well, the stated limits of AlphaFold are real and sound valid based on the evidence provided, but be careful when making use of these eye-catching statements. AlphaFold is one AI tool, but it's not the only AI tool.

As with any new technology, be an educated user: apply it in the right direction and be aware of its limits. When you get a new mobile phone, you may need a SIM card to call a friend; you shouldn't blame the phone for not working if the SIM card isn't inserted.

AlphaFold is for protein modelling, and it's well known that there are critical pitfalls in the generated models. Educated and responsible use of the tool is recommended, rather than blindly using the technology without knowing its limits. Indeed, using AlphaFold without being aware of those limits, e.g. trying to predict the structure of intrinsically disordered regions, will lead you in the wrong direction; it simply cannot do that.

Since I work on molecular modelling, friends from experimental groups approach me and ask about AlphaFold. Can you use AlphaFold to predict RNA structure? No, not as of 2023; there are other programs for that, so I suggest an alternative. Can you use AlphaFold to rationalize this event I observed in experiments? To answer that, I review the literature if I'm not familiar with that use of the tool; if it can, I review the pros and cons and comment on whether it's appropriate for that specific purpose. Even when it's not AlphaFold, I do the same for other software: be an educated and responsible user.

About the use of AI in CACHE and drug design

Just check CACHE's list of computational methods: there are groups that combine AI and physics-based methods, and they were more "successful" in the challenge, with 2 or even 4 times as many hits as Foldit. Does that mean AI beat humans? I don't think so, and I don't think it's fair to judge simply on this basis.

On the other hand, using AI to assist in scoring ligand designs is a popular area. Other groups on the CACHE list combine AI with physics-based methods. It's also common to use AI to predict a drug's properties after the design stage, e.g. solubility, half-life in the body, etc.

Are they accurate? I don't know; I can only judge based on experimental evidence. Science is full of exceptional cases. Be educated and be responsible when you try out a new technique, and be curious about the reason why if you get something unexpected.

(Disclaimer: I'm not connected to any of the CACHE participants, nor am I a developer of any AI tools; I'm a user of modelling software. Personally, I'd encourage the educated, cautious use of any modelling tool, whether AI- or physics-based.)

spvincent Lv 1

Thanks for your comments: always good to hear from someone in the field.

The point that hit home for me in that blog article was the sheer number of possible bonding types in ligands compared to those that exist in proteins. I recall reading an estimate somewhere for the number of possible organic compounds of molecular weight less than 500 that could be represented using normal bonding rules (obviously a great many would be chemically implausible). It was something like 10^60! I’m going on memory here, so that may be wrong by quite a bit, but even so it gives a sense of how big chemical space is and how difficult the task of finding ligands for any specific binding site is.

The point about educated and responsible use of any technology being important is most certainly a valid one, but I’m not sure if it really warrants being displayed in a 24pt font size.

rosie4loop Lv 1

I was trying to add a subheading with ## but for some reason the line disappeared. Originally it was meant to be a markdown heading, "## On the use of AI in CACHE". Thanks for reminding me; I've restored the original heading.

rosie4loop Lv 1

I assume you're talking about de novo ligand design by AI, like what the recent publications from Baker's lab do for protein design (i.e. RFdiffusion = RoseTTAFold diffusion + ProteinMPNN + AlphaFold). De novo ligand design is a hot topic, with at least several papers in high-impact journals every month, for example predicting a SMILES string as a binder starting from a protein sequence. But it's indeed challenging in practice; recent benchmarking studies still question the practical usefulness of some of these methods (apologies, I can't find the paper at the moment; it's either an ACS, RSC, or Springer publication).

I just wanted to say that other CACHE participants are using AI in their workflows, and from the numbers they seem to be getting more hits. (See: https://cache-challenge.org/challenge-2/computational-methods) AI is already playing a role in drug design even when the designing itself isn't done by AI. But I don't think it's fair to simply conclude with a statement like "xxx is more useful than yyy".

For example, let's look at the method used by a group that had 8 hits proceed to round two. They start by selecting a set of structurally diverse compounds from ZINC, predict their binding affinity with physics-based scoring in free software (AutoDock Vina), use those scores to train an AI model, and then use the AI model to screen the whole ZINC database (steps 1-5 in their computational method). Another AI model is also used in step 7 for rescoring docking results.
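
As a very rough sketch of that "dock a subset, train a surrogate, then screen the rest" idea (not their actual code; the fingerprint featurization and regressor below are placeholders of my own choosing):

```python
# Toy surrogate-model screen: featurize molecules with Morgan fingerprints, fit a
# regressor to docking scores for a small docked subset, then predict scores for
# the rest of the library. Featurization and model choice are illustrative only.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestRegressor

def featurize(smiles_list, n_bits=1024):
    feats = np.zeros((len(smiles_list), n_bits), dtype=np.float32)
    for i, smi in enumerate(smiles_list):
        mol = Chem.MolFromSmiles(smi)
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits)
        feats[i, list(fp.GetOnBits())] = 1.0   # set the bits that are on
    return feats

def train_surrogate(docked_smiles, docking_scores):
    """Fit a regressor mapping fingerprints to (e.g. Vina-style) docking scores."""
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(featurize(docked_smiles), docking_scores)
    return model

def screen(model, library_smiles, top_n=1000):
    """Rank the library by predicted score (more negative = better predicted binding)."""
    preds = model.predict(featurize(library_smiles))
    order = np.argsort(preds)
    return [library_smiles[i] for i in order[:top_n]]
```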

For another group that had 5 hits: they modified the physics-based scoring/pose prediction in an open-source docking program to use AI for those tasks (the grandparent is Vina again; the AI-modified version is gnina), and used that program as one of the tools in their prediction.

Many others also integrated AI into their workflows but got similar or fewer hits than Foldit. In the round 1 challenge, some AI methods performed better than Foldit in terms of prediction power, while others may not be as good as binding prediction by humans.

My personal opinion is that those "successful" groups use ultra-high-throughput virtual screening to evaluate nearly all of the billions of compounds in the database. Foldit uses physics-based methods to help humans design; the throughput is much lower, yet it still gets comparable results.

Is it possible that this time the AI was just lucky? Maybe. The world is full of exceptional cases, and the complexity of drug binding, which depends on more than just the binding pose, makes rational design perpetually challenging.

A personal comment here: as a researcher, I don't like using AI in research unless it's necessary, because people in other fields rarely understand its limits and sometimes ask to apply it in the wrong direction (they are doing useful, possibly high-impact research, but using the wrong tool would damage it). With the hype around AI, it's time-consuming to cite enough papers to convince them to use a more appropriate tool. But the reality is that AI-based drug design is developing rapidly.

(Edited to fix typos, clarify, and add a link to the CACHE #2 computational methods.)