VHL Ligand Design Updates

Started by rmoretti

rmoretti Staff Lv 1

In January we completed a series of 10 puzzle rounds with the goal of redesigning a small molecule that binds to the von Hippel-Lindau (VHL) E3 ubiquitin ligase. (Blog Post) The hope was that Foldit players would be able to make novel changes to the core of the molecule while simultaneously improving properties that would make it a better oral (by-mouth) drug.

From all your design efforts, we received over 6500 unique molecules. Some were kinda funky, but there were a number of promising ones in the mix.

Our collaborators at Boehringer Ingelheim (BI) were interested in seeing everything Foldit players were able to produce, so we sent all 6500+ compounds to BI to be evaluated. The first step was to run some automated filtering to get rid of compounds with major issues. This removed compounds which extended too far out of the binding pocket (~150 excluded), as well as compounds which were too large (~1000) or too small (~150), fell outside the acceptable range for the number of atoms of various elements (~1000), or had too many rotatable bonds or rings (~1000). Compounds were also removed if they fell outside the desired range for the number of hydrogen bond donors and acceptors (~500) or for clogP (~50). Finally, substructure searches were performed for groups which are reactive or would otherwise cause issues in a drug (~500), as well as for other problematic groups such as long hydrophobic chains (e.g. aliphatic alkanes, aliphatic ethers, and aliphatic alkenes; ~500).
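As a rough illustration of how a filter cascade like this can be organized, here is a minimal Python sketch. It assumes each molecule already comes with precomputed properties in a plain dictionary. The property names (`heavy_atoms`, `rotatable_bonds`, `hbd`, `hba`, `clogp`) and every cutoff value are hypothetical, since BI's actual thresholds and tooling are not described here. Each molecule is counted only under the first filter it fails, which may not match how the approximate counts above were tallied.

```python
from collections import Counter

# Hypothetical cutoffs, chosen only for illustration. The real limits used
# by BI are not given in this post.
FILTERS = [
    ("too large", lambda m: m["heavy_atoms"] > 50),
    ("too small", lambda m: m["heavy_atoms"] < 15),
    ("too many rotatable bonds", lambda m: m["rotatable_bonds"] > 10),
    ("H-bond donors/acceptors out of range", lambda m: m["hbd"] > 5 or m["hba"] > 10),
    ("clogP out of range", lambda m: not (0.0 <= m["clogp"] <= 5.0)),
]


def apply_filters(molecules):
    """Return (kept molecules, Counter of first-failed filter names)."""
    kept, rejected = [], Counter()
    for mol in molecules:
        # Name of the first filter this molecule fails, or None if it passes all.
        reason = next((name for name, fails in FILTERS if fails(mol)), None)
        if reason is None:
            kept.append(mol)
        else:
            rejected[reason] += 1
    return kept, rejected


# Made-up example molecules (not taken from the Foldit data set).
molecules = [
    {"id": "a", "heavy_atoms": 30, "rotatable_bonds": 6, "hbd": 2, "hba": 6, "clogp": 2.1},
    {"id": "b", "heavy_atoms": 62, "rotatable_bonds": 8, "hbd": 3, "hba": 7, "clogp": 3.0},
    {"id": "c", "heavy_atoms": 28, "rotatable_bonds": 14, "hbd": 1, "hba": 4, "clogp": 1.5},
]

kept, rejected = apply_filters(molecules)
print([m["id"] for m in kept])
print(dict(rejected))
```

Keeping the filters in an ordered list of (name, test) pairs makes it easy to report how many compounds each step removed, and to go back later and re-check molecules rejected by one specific filter.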

After all that filtering, approximately 1000 compounds were taken forward into more detailed evaluation. In this round of evaluation, the collaborators attempted to identify compounds which changed the central binding motif of the ligand (the hydroxyproline ring) while improving properties associated with better oral drugs, most notably the TPSA (topological polar surface area). The collaborators also double-checked all of the original 6500 molecules, to make sure that the automated filtering didn’t accidentally throw out a promising molecule which could be easily fixed.

When evaluating the molecules, one major issue was noticed. A large number of molecules had good binding scores in Foldit but had strained bond rotations (torsions). It’s unlikely that these molecules would ever actually bind in that position, given the energetic strain they were under. The Foldit score wasn’t capturing this energetic strain, but luckily there are methods developed by the Rarey group at the University of Hamburg which allow BI to evaluate how “abnormal” the torsions are. (We have now incorporated these methods into Foldit; see the recent release.)

Using a combination of the torsion strain evaluation, the TPSA predictions, and other metrics that predict how promising a compound may be as an oral drug, our collaborators selected about 250 compounds. These included both compounds directly from Foldit players and compounds which took their core idea from Foldit compounds but had been modified by BI chemists to improve certain properties or to fix issues such as synthesizability. These compounds were then redocked into the protein to see if they would easily find the designed binding mode. Those compounds which could be redocked were further evaluated with more computationally involved binding energy prediction methods.

Finally, the promising compounds were evaluated and ranked by a number of experienced medicinal chemists at Boehringer Ingelheim, looking at how good a potential drug they might be, as well as how easy they might be to synthesize. From that evaluation, they came up with a ranked list of 19 compounds which were sent off to be synthesized. Congratulations to Bruno Kestemont, fiendish_ghoul, equilibria, NeLikomSheet and 5 other anonymous users (user name sharing form), whose work formed the basis of these molecules.

Synthesis of these compounds is now underway. Chemical synthesis is hard, so some of these molecules might not actually be made in the end. But early news is promising, and all of the compounds which are successfully synthesized will be submitted for testing in BI’s internal assays, both for their ability to bind to the VHL E3 ligase protein and for how well they perform in efflux and permeability (the properties which affect how good an oral drug this might make, and what we were hoping to gauge through the TPSA measure).

As was mentioned at the start, all data generated from this project will be released publicly, with no restrictions on subsequent use. We don’t have the experimental assay data yet, but people interested in the full set of 6500 compounds which Foldit players have generated can download an SDF-formatted file. (You should be able to open an SDF file in PyMOL, Chimera or other structure viewing programs. The molecule coordinates should be positioned for binding into the puzzle starting structure.)

jeff101 Lv 1

As long as you're giving out an sdf formatted file of all 6500+ compounds designed by Foldit players, would you also give out something like a spreadsheet that lists all 6500+ compounds, their various Foldit scores (clogP, TPSA, etc.), and what filters/tests each one has passed? This spreadsheet could also include columns listing which Foldit player(s) (soloist + evolvers) designed each compound, which of the 6500+ compounds were the 19 chosen for synthesis, and for the 19, what each one's ranking was. For that matter, why not make a sandbox-type puzzle that lets Foldit players inspect any of the 19 compounds chosen for synthesis?

rosie4loop Lv 1

(Edit 2: note that the crazy 160K Foldit scores in the SDF file were due to an issue in round 5 of the VHL puzzles, as reported in this post. From the record it seems that round 5 was closed early because of this issue and re-opened as round 5b. These glitched molecules were kept in the list and may cause some lag in the data visualizer.)

Maybe a bit late for me to post it here. Information such as the Foldit score or clogP can be found in the SDF file, e.g. if you open it in a plain text editor.
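If you would rather pull those values out with a script, here is a minimal Python sketch that reads the data fields of each SDF record and writes them as CSV. It relies only on the general SDF layout (a `> <FieldName>` header line, the value on the following lines, a blank line, and `$$$$` at the end of each record). The field names `FolditScore` and `clogP` in the sample are hypothetical stand-ins, so check the actual file for the names it really uses. The sketch ignores the atom and bond block entirely and skips records that have no data fields.

```python
import csv
import io


def read_sdf_properties(lines):
    """Yield one dict of data fields per SDF record (structure block is ignored)."""
    record, field = {}, None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("$$$$"):
            # End of record; records without data fields are skipped.
            if record:
                yield record
            record, field = {}, None
        elif line.startswith(">"):
            # Data header such as "> <clogP>": take the name between < and >.
            start, end = line.find("<"), line.rfind(">")
            field = line[start + 1:end] if 0 <= start < end else None
            if field is not None:
                record[field] = ""
        elif field is not None:
            if line == "":
                field = None  # a blank line ends the current data item
            else:
                record[field] = f"{record[field]}\n{line}" if record[field] else line
    if record:
        yield record


def to_csv(records, out):
    """Write records as CSV, using the union of all field names as columns."""
    records = list(records)
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(records)


# Tiny made-up sample; field names and values are illustrative only.
sample = """mol_1
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <FolditScore>
12345.6

> <clogP>
2.1

$$$$
mol_2
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <FolditScore>
160000.0

$$$$
"""

records = list(read_sdf_properties(sample.splitlines()))
buffer = io.StringIO()
to_csv(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For the real file you would pass an open file handle instead of `sample.splitlines()`, for example `read_sdf_properties(open(path))` with `path` pointing at your downloaded copy. Once the values are in a table, it is also straightforward to drop the glitched round 5 entries by filtering on the score column.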

To view all the compounds in a spreadsheet, you can use freely available chemical data visualization programs such as DataWarrior or ICM Browser, or similar tools. Below is a screenshot of the file displayed in DataWarrior.