(FolditStandalone) Test of setting up custom puzzles with non-protein/non-standard residues

Started by rosie4loop

rosie4loop Lv 1

I am posting my notes here for anyone who wants the detailed steps for creating custom Foldit puzzles with FolditStandalone and sharing them privately, e.g. for training students with in-house data that is not yet published, or if you have concerns about asking students to create a game account.

You may also want to read the following resources first for a brief overview of using Foldit in class:

  1. Official Foldit forum post for educators, which provides detailed information on how to create the puzzle config. files (although it lacks instructions on more recent tools and puzzle types)
  2. Dsilva, L., Mittal, S., Koepnick, B., Flatten, J., Cooper, S., and Horowitz, S. (2019) Creating custom Foldit puzzles for teaching biochemistry. Biochemistry and Molecular Biology Education 47, 133–139. [link]

The following notes outline the steps for creating a minimal custom small-molecule design puzzle with FolditStandalone, without a .puzzle_setup file for advanced configuration. Please refer to the official instructions for how to create the puzzle configuration file.

Software version I used:

  • FolditStandalone: 20230109-b54c9a359c-win_x64-INTERNAL on Windows 10
  • Rosetta 3.12 bundle compiled on Linux Mint 19.1

This note has two parts. In Part A I tested a more "standard" method (not the optimal method!). Part B is a workaround for those without the scripts from standard Rosetta.


A. Standard method

Requirements
• “molfile_to_params.py” from the standard Rosetta package (requires Python)
• Input files
    • .pdb with the protein and the starting ligand
    • .params of the ligand, containing the chemical info about the ligand
• Note: an earlier version of these notes said the .pdb and .params must share the same prefix to be loaded in the same session; this turned out not to be necessary (see the Apr 20, 2023 update)

Preface
• In this example we use the protein-ligand complex of lysozyme L99A/M102Q from the RCSB PDB (3HTB), which contains a 2-propylphenol ligand with the residue name “JZ4”.
• In this example it seems to be OK to use the file from the PDB directly and let FolditStandalone clean the protein file. However, for entries with more than one molecule in the asymmetric unit or with many non-standard residues, it is better to clean the file before loading it into Foldit to avoid errors.

• NOTE: an earlier version of this method had an issue where the original ligand coordinates in the .pdb could not be loaded directly. This has been solved (see the Apr 7, 2023 update).

Preparation of protein
• Download the complex pdb file from the RCSB PDB (https://www.rcsb.org/structure/3HTB)

• Load the raw PDB into FolditStandalone and export PDB to clean up the file. Check the file carefully to make sure it contains everything you want.
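
One way to do the "check the file carefully" step is a quick script that lists the chains and hetero residue names found in the exported file. The sketch below is a minimal Python example, assuming standard fixed-column PDB records; the function name `summarize_pdb` is made up for illustration and is not part of Foldit or Rosetta.

```python
from collections import Counter


def summarize_pdb(pdb_text):
    """Return (sorted chain IDs, {hetero residue name: atom count}).

    Assumes standard fixed-column PDB records: residue name in
    columns 18-20 and chain ID in column 22.
    """
    chains = set()
    hetero = Counter()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        # Skip everything that is not a coordinate record.
        if record not in ("ATOM", "HETATM") or len(line) < 22:
            continue
        chains.add(line[21])
        if record == "HETATM":
            hetero[line[17:20].strip()] += 1
    return sorted(chains), dict(hetero)
```

Calling it on the text of the exported file shows which chains and which hetero residues (ligand, waters, ions) are present, so you can decide what still needs cleaning.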

Preparation of ligand
• On the same RCSB entry page, scroll down to get the ligand .sdf file

• It is better to add hydrogens to the ligand before generation of the params file, but for a quick demo this time we proceeded without the step.

• Process the ligand sdf with the following command (replace PATH/TO/ROSETTA with your actual path to rosetta):
python /PATH/TO/ROSETTA/source/scripts/python/public/molfile_to_params.py -n JZ4 3htb_D_JZ4.sdf
• This step generates a "JZ4.params" and a "JZ4_0001.pdb"
• (Optional) Rename JZ4.params to 3htb.params.

Preparation of the complex
• Open "JZ4_0001.pdb" and the cleaned protein PDB (exported in the protein preparation step above) in a text editor.
• Copy the contents of "JZ4_0001.pdb" and paste them right after the "TER" record of the protein, near the end of the cleaned protein file.

• Save the result as "3htb.pdb".
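
The copy-and-paste step above can also be scripted. This is a minimal Python sketch, not an official tool: `merge_ligand_after_ter` is a made-up name, and it assumes that only the ATOM/HETATM records of the ligand file need to be copied and that the ligand belongs after the last TER record of the protein file.

```python
def merge_ligand_after_ter(protein_text, ligand_text):
    """Insert the ligand coordinate records right after the protein's last TER."""
    protein_lines = protein_text.splitlines()
    # Assumption: only the coordinate records of the ligand file are copied.
    ligand_lines = [
        line for line in ligand_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    ]
    ter_positions = [
        i for i, line in enumerate(protein_lines) if line.startswith("TER")
    ]
    if not ter_positions:
        raise ValueError("no TER record found in the protein file")
    cut = ter_positions[-1] + 1
    merged = protein_lines[:cut] + ligand_lines + protein_lines[cut:]
    return "\n".join(merged) + "\n"
```

To use it, read the cleaned protein file and "JZ4_0001.pdb" as text, pass both to the function, and write the returned string to "3htb.pdb". It is still worth opening the result in a text editor to confirm the ligand ended up where you expect.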

Start the FolditStandalone client.
• Load the two files (3htb.pdb and the ligand .params).


• Wait for a few seconds for it to load.

• Adjust the view to find the ligand. It also helps to change the visualization settings for a cleaner view. After you click on the ligand, the ligand design panel is shown in color.

• To add the hydrogens, use the ligand modification panel: e.g. click on a carbon and change it back to C.


• You can now modify the ligand as you like. Run some Wiggle and Shake to optimize the position. (In this case the protein is flexible since we didn’t add a restraint.)
• The session can be saved for later use or for sharing. This time I called it “3htb-jz4_opt0”.
• To recover a previous session, simply load the saved “.ir_session”. This automatically generates a folder named after your session, which contains the session files.

[Extras] An additional test, just for fun. This is only a proof of concept; this section still has a lot of issues!
• Exporting the complex PDB also allows comparing the designed pose with docking output from an external source. For example, using some scripts or manually:
    1. remove the auxiliary data from the Foldit-generated PDB file;
    2. extract the ligand;
    3. re-dock it with smina using the autobox option, with output in PDB format;
    4. manually put the top-scoring pose into the PDB file, fix the residue name/ID and restore the TER record;
    5. import the file(s) back into FolditStandalone. Update: load the file that contains the ligand SMILES at the end of the PDB file together with the manually added, docked ligand coordinates.
  The initial test of this step had some issues (it showed two ligands, one of them with a messed-up structure; I haven't checked whether that one is the docked ligand or the params ligand), but in theory it should work after some adjustment.
• Loading the docked complex together with the original .params results in two ligands. I then ran MMFF Wiggle to fix the messed-up ligand.
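
The "remove auxiliary data" and "extract the ligand" steps could be scripted roughly as follows. This is only a sketch under assumptions: it treats every record other than ATOM, HETATM and TER as auxiliary data to drop, and the name `split_complex` is hypothetical.

```python
def split_complex(pdb_text, ligand_resname):
    """Return (receptor_lines, ligand_lines) from a complex PDB.

    Only ATOM/HETATM/TER records are kept; every other record
    (assumed here to be auxiliary data) is dropped.
    """
    receptor, ligand = [], []
    current = receptor
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # Residue name sits in columns 18-20 of a PDB coordinate record.
            is_ligand = line[17:20].strip() == ligand_resname
            current = ligand if is_ligand else receptor
            current.append(line)
        elif record == "TER":
            # A TER stays with whichever group preceded it.
            current.append(line)
    return receptor, ligand
```

The two lists can then be written to separate files and used as the receptor and the reference ligand for the re-docking step.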

B. Workaround for those who don’t have standard Rosetta

WARNING: the original ligand coordinates are lost with this method. Instead, a dummy ligand with messed-up coordinates is placed at the binding site; my guess is that it is aligned to the original ligand.

• Copy the "example_ligand.params" from https://fold.it/dist/external/standalone/quickstart.html to your working directory and (optionally) rename it to 3htb.params for clarity (or use whatever basename your files have).
• Open the params file with a TEXT EDITOR. The residue name “D2N” appears near the top of the file.

• Check the residue name of the ligand in the PDB file, and replace the “D2N” in the params file with the ligand’s residue name. In this case it is JZ4.
• Save the modified params file.
• Load both files into FolditStandalone.
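
If you prefer not to edit the params file by hand, the renaming step can be sketched in Python as below. The function name `rename_params_residue` is made up, and the sketch assumes the old residue name only ever appears as a standalone token in the file.

```python
import re


def rename_params_residue(params_text, old_name, new_name):
    """Replace every standalone occurrence of old_name with new_name."""
    pattern = re.compile(r"\b" + re.escape(old_name) + r"\b")
    # A function replacement avoids any backslash handling in new_name.
    new_text, count = pattern.subn(lambda match: new_name, params_text)
    if count == 0:
        raise ValueError("residue name %r not found" % old_name)
    return new_text
```

For this example the call would be `rename_params_residue(text, "D2N", "JZ4")`, where `text` is the content of the copied params file; write the result back to 3htb.params before loading it.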


• After loading, a messed-up ligand appears which cannot be fixed by MMFF Wiggle.

• Export this structure as PDB. Update: since the saved ligand differs from the originally loaded PDB ligand, exporting the PDB from FolditStandalone once adds "IRDATA SMILES" lines near the end of the file, so Foldit can now read the ligand automatically.
• Load the new set of files. Now you can fix the messy ligand with MMFF Wiggle, or modify it.
• After MMFF Wiggle the ligand is still messy, but easier to fix than before.

• In my test I then changed the ligand color, deleted most ligand atoms until only a methane was left, added a benzene, and did some wiggling.

• You can now save the puzzle or do more editing.

Reference

  1. Previous post by LociOiling
  2. Foldit Standalone Quickstart Guide
  3. Rosetta tutorial on preparing ligand params file
  4. Dsilva, L., Mittal, S., Koepnick, B., Flatten, J., Cooper, S., and Horowitz, S. (2019) Creating custom Foldit puzzles for teaching biochemistry. Biochemistry and Molecular Biology Education 47, 133–139. [article][example files]
  5. Official Foldit forum post for educators, with the instructions and examples.

rosie4loop Lv 1

To set up a reaction design puzzle, I tested with the files from the Campaign dataset, which can be found in the resource folder of both the game and the standalone version.

In this test, I observed the following:

  1. For a first-time setup, a .pdb, a ligand .params and a .reaction_set file are required.
  2. EXCEPT when the protein + ligand complex was exported from FolditStandalone and contains "IRDATA SMILES" lines near the end of the PDB file; in that case only the .pdb and .reaction_set are necessary.

    This also applies to a simple small-molecule design puzzle, where only the exported PDB is needed.
  3. Unfortunately, the reaction_set file is encrypted, so I cannot create my own set and can only load the provided file.
    • I tested several lines from "58_robust_reactions.smirks", taken from the Foldit/FolditStandalone folder "\cmp-database-xxx\database\ligand_fragments", but they did not work.
    • I assume the "reaction_set" could be in smarts/smi format, but I haven't successfully created a puzzle with a new reaction library yet. It would be nice if the Devs could provide more information regarding the file format!
  4. After importing, the working saved session generates a folder of session files, in which the reaction_set file remains encrypted.
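
The file requirements observed above can be summarized as a small check. This is a sketch of my reading of those observations, not documented Foldit behavior: it assumes the "IRDATA SMILES" marker sits at the start of a line, and the function names are made up.

```python
def has_embedded_ligand(pdb_text):
    """True if the PDB text contains an "IRDATA SMILES" line."""
    # Assumption: the marker appears at the start of a line.
    return any(
        line.startswith("IRDATA SMILES") for line in pdb_text.splitlines()
    )


def files_needed(pdb_text, reaction_design=False):
    """List the file types needed to set up the puzzle."""
    needed = [".pdb"]
    if not has_embedded_ligand(pdb_text):
        # A raw PDB needs a separate ligand params file.
        needed.append(".params")
    if reaction_design:
        needed.append(".reaction_set")
    return needed
```

For a PDB already exported from FolditStandalone, `files_needed(text, reaction_design=True)` gives only the .pdb and .reaction_set entries, matching item 2 above.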

rosie4loop Lv 1

Placeholder for a more complicated design puzzle: metalloprotein / non-standard residues / aptamer design.


Prelim. notes on metalloproteins:
Currently the support for ions seems to be limited:
Foldit protonates metal-binding cysteines. I am ignoring this issue for now, but will probably need to adjust the topology further to fix it. It may be better to skip this kind of puzzle at this stage.

Tested with the 4-coordinating tetrahedral Zn params provided in:
Dsilva, L., Mittal, S., Koepnick, B., Flatten, J., Cooper, S., and Horowitz, S. (2019) Creating custom Foldit puzzles for teaching biochemistry. Biochemistry and Molecular Biology Education 47, 133–139. [article][example files]

Metal ions such as Zn2+ play an important role in many biological functions, so for the design of metal-coordinating ligands the ion should be included as part of the enzyme. Setting up this kind of puzzle requires the params files of both the starting ligand and the zinc ion.
For zinc with a different coordination number or coordination geometry, or for other metal ions, you will need to prepare another params file.


rosie4loop Lv 1

[Apr 22, 2023] Updates and TBD

  1. Updated info on the motivation of creating in-house private puzzle.
  2. Added links to the Official Foldit forum post for educators, which provides detailed information on how to create the puzzle config. files (although it lacks instructions on more recent tools and puzzle types)
  3. TBD fix format issues.

[Apr 20, 2023] Updates and TBD

  1. I figured out that it's not necessary to use the same basename for the files to be imported into the same session, and updated these notes accordingly. Thanks to the example files provided with Dsilva et al.'s publication.
  2. Updated reference to include Dsilva et al's publication.
  3. Updated link to the electron density puzzle notes.
  4. Placeholder of the more complicated design puzzle(s).

[Apr 11, 2023] TODO list and work-in-progress

  1. adding notes on puzzle config (limiting tools for students, locking protein)
  2. reorganizing this topic, post 1 preface and TOC link to related puzzle notes
  3. design puzzle with non-standard residues (metal or modified AA/NA) or aptamer (DNA/RNA)
  4. (In a separate topic) Notes on some proof-of-concept electron density educational puzzles: standard density (working on: density + alignment, density + ligand)

[Apr 7, 2023] Update to the notes on small molecule design puzzle:

  1. Fixing the ligand loading issue in standard method, adding a quick cleaning step.
  2. Removing some of the problematic steps.