
1440: Aflatoxin Challenge: Round 1

Closed since over 8 years ago

Intermediate Overall Design

Summary


Created
October 13, 2017
Expires
Max points
100
Description

Aflatoxins are a class of poisonous compounds that contaminate a significant portion of the global food supply. In this puzzle, players are challenged to redesign an enzyme that could break down aflatoxin molecules. The majority of the protein is frozen, with the aflatoxin ligand fixed in a binding pocket. Surrounding the binding pocket are a number of loops that might be redesigned without affecting the folding stability of the protein. In these loops, players may manipulate the protein backbone and mutate the residue sidechains. Redesign the loops of this protein to better bind the aflatoxin ligand!



This is the first puzzle of our Aflatoxin Challenge, sponsored by Mars Inc. and Thermo Fisher Scientific. Promising designs will be tested by the Siegel Lab at UC Davis. By participating in the challenge/game, the players agree that all player designs will be available permanently in the public domain, and the players will not seek intellectual property protection over the designs created as part of the challenge/game.



Note: Due to special interest in this puzzle, the deadline has been extended by one week, to October 31 at 23:00 UTC.

Top groups


  1. Beta Folders 100 pts. 16,065
  2. Gargleblasters 85 pts. 15,921
  3. Go Science 72 pts. 15,899
  4. Void Crushers 61 pts. 15,894
  5. Anthropic Dreams 51 pts. 15,890
  6. Contenders 42 pts. 15,850
  7. Italiani Al Lavoro 35 pts. 15,759
  8. HMT heritage 28 pts. 15,717
  9. Marvin's bunch 23 pts. 15,677
  10. Russian team 19 pts. 15,612

  1. LociOiling Lv 1 100 pts. 16,065
  2. reefyrob Lv 1 99 pts. 15,990
  3. Timo van der Laan Lv 1 97 pts. 15,894
  4. Enzyme Lv 1 95 pts. 15,875
  5. eusair Lv 1 93 pts. 15,860
  6. spvincent Lv 1 91 pts. 15,849
  7. Mike Cassidy Lv 1 90 pts. 15,842
  8. Bruno Kestemont Lv 1 88 pts. 15,825
  9. johnmitch Lv 1 86 pts. 15,816
  10. retiredmichael Lv 1 85 pts. 15,794

Comments


Aubade01 Lv 1

Public Domain research, funded by government grants and published in research journals, is used by private companies doing corporate research on commercial for-profit products. There is no way to stop this unless the research is kept private and a commercial secret. Even then, if a new product using this new idea is sold publicly, the product can be reverse engineered by a competing company so they can make one like it. Patents can help protect a novel product, but in the case of very big commercial hits they can be invalidated on narrow technical legal grounds. Universities with a big commercial success idea will certainly make money. Graduate students (and on-line ones like ourselves) may get some recognition, but rarely financial rewards. Consider it a gift towards the public good.

Aubade00 / 01

Susume Lv 1

In the View menu, turn on Show bonds (sidechain), Show bonds (non-protein), and Show bondable atoms. This will let you see the oxygen atoms (red or purple) on the ligand (the aflatoxin) that can form H bonds, and see if there are any bonds formed. It also lets you see the red, blue, or purple atoms of the protein that can form bonds to the ligand. Forming bonds to the ligand helps the protein grab it and break it apart, and also helps your score. If your View menu does not have the Show options listed above, click menu, options, advanced GUI to get the additional view options.

If your game is running slowly during wiggles, you may want to set View Sidechains to Don't Show, just during wiggles or scripts. The hotkeys for this are Shift-D (Don't Show sidechains) and Shift-A (Show All sidechains).

In addition to H bonds, the protein can grab the ligand by forming pi stacks with it. This is when the hexagon or pentagon of a sidechain lines up neatly with a hexagon or pentagon of the ligand. This Wikipedia article has a nice diagram of three different shapes that pi stacks can take.
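
One rough way to put a number on "lines up neatly" is to compare the two rings' centers and the tilt between their planes. The Python sketch below is only an illustration: the ring coordinates are made-up regular polygons, not atoms from puzzle 1440, and the helper names are hypothetical. The thread doesn't say what distance or angle counts as a good stack, so no cutoff is applied here.

```python
import math

def centroid(points):
    """Average position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def sub(a, b):
    return tuple(a[i] - b[i] for i in range(3))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def ring_normal(points):
    """Unit vector perpendicular to the ring, from two center-to-atom vectors."""
    c = centroid(points)
    n = cross(sub(points[0], c), sub(points[1], c))
    length = norm(n)
    return tuple(x / length for x in n)

def ring_alignment(ring_a, ring_b):
    """Return (center-to-center distance, angle between ring planes in degrees)."""
    distance = norm(sub(centroid(ring_a), centroid(ring_b)))
    na, nb = ring_normal(ring_a), ring_normal(ring_b)
    # abs() because a normal may point either way out of the ring plane
    cos_angle = min(1.0, abs(sum(na[i] * nb[i] for i in range(3))))
    return distance, math.degrees(math.acos(cos_angle))

def regular_ring(n_atoms, radius, z):
    """Made-up flat ring in a plane of constant z (illustration only)."""
    return [(radius * math.cos(2 * math.pi * k / n_atoms),
             radius * math.sin(2 * math.pi * k / n_atoms),
             z) for k in range(n_atoms)]

# Assumed test geometry: a hexagon and a pentagon stacked face to face.
hexagon = regular_ring(6, 1.4, 0.0)
pentagon = regular_ring(5, 1.2, 3.5)
distance, angle = ring_alignment(hexagon, pentagon)
print(round(distance, 3), round(angle, 3))
```

The same measurement works for a pentagon against a hexagon, since only the ring centers and planes are compared. Whether such a pair actually stacks well is a chemistry question this sketch doesn't answer.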

LociOiling Lv 1

In addition to Susume's tips, the Foldit aflatoxin launch blog post is a pretty good place to start. It's pretty long and somewhat technical, but there's a lot of good information. Here are some nuts-and-bolts details that may help translate the science into Foldit terms.

In puzzle 1440, the aflatoxin is the small molecule that appears as segment 303. It's not connected to the rest of the protein, making it a "ligand". You can use "X-ray tunnel for ligand" in the advanced view options to help spot the aflatoxin, but the tunnel view doesn't always center the ligand. Rotating the protein with the X-ray tunnel open makes it easier to see the ligand. Unlike the protein, the ligand shows lots of double bonds, another guide to spotting it.

The aflatoxin in 1440 is locked, so you won't be able to use any of the small molecule design tools on it. Instead, you're allowed to design selected parts of the protein. You can use all the usual protein design and refinement tools, including mutate, on the unlocked part of the puzzle.

The starting protein in puzzle 1440 is a "gluconolactonase", specifically the one in PDB id 3DR2. PDB is the protein data bank, available at rcsb.org. (More on that in another post.)

Gluconolactonase is an enzyme which works on ring structures called lactones. The version of aflatoxin in puzzle 1440 has a lactone group. According to the blog post, degrading that lactone is "demonstrated to decrease aflatoxin toxicity by more than 20-fold". Unfortunately, the puzzle 1440 protein doesn't work on aflatoxin. The goal is to fix that. No one knows what changes to the protein might get the job done.

The protein in 3DR2 doesn't include aflatoxin. 3DR2 is actually a dimer, meaning there are two copies of the protein. There's only one copy of the protein in puzzle 1440. There are some calcium ions in 3DR2 that aren't included with the puzzle 1440 protein.

This post is the first in a series. More on some other slight differences between PDB 3DR2 and what you see in Foldit in the next post.

(Edit: couple of minor edits. Need either more sleep or more coffee.)

LociOiling Lv 1

In puzzle 1440, the starting protein is based on a published experimental solution, which you can look at online. It's not clear how much this will help, since the goal is to change the structure and function of the protein, but the additional information in the Protein Data Bank (PDB) might at least provide inspiration.

The PDB website is at rcsb.org. You can look up proteins by their PDB id. Puzzle 1440 is based on PDB id 3DR2. A given PDB entry contains an abstract of the article describing the experiment, and lots of technical detail about the protein. A given protein may have multiple PDB entries, reflecting different experiments.

The PDB website includes the JSMol 3D viewer (JavaScript), and you can also use standalone viewers like JMol and PyMol (Java- and Python-based, respectively). They're all similar to Foldit, allowing you to rotate and zoom the protein.

You can also look at the actual PDB data file, which, among many things, includes the XYZ coordinates of each atom in the protein. (The PDB files are available under the "Display Files" and "Download Files" dropdowns on a protein's page at rcsb.org.)
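
For anyone who wants to pull those coordinates out of a PDB file themselves, here's a minimal Python sketch. It relies on the fixed-width column layout of ATOM and HETATM records in the PDB format. The two sample records are made up for illustration (they are not real 3DR2 data), and the helper names are hypothetical.

```python
def parse_atom_line(line):
    """Pick fields out of one ATOM/HETATM record by column position (0-based slices)."""
    return {
        "serial": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

def read_atoms(lines, chain=None):
    """Parse all atom records, optionally keeping only one chain."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            atom = parse_atom_line(line)
            if chain is None or atom["chain"] == chain:
                atoms.append(atom)
    return atoms

def make_line(serial, name, res, chain, seq, x, y, z):
    """Build a correctly aligned sample record (illustration only)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res, chain, seq, x, y, z)

# Made-up records: one atom on chain A, one on chain B.
sample = [
    make_line(1, "N", "GLY", "A", 4, 10.0, 20.5, -3.25),
    make_line(2, "CA", "GLY", "B", 5, 1.0, 2.0, 3.0),
    "REMARK this line is ignored",
]

atoms = read_atoms(sample, chain="A")
for atom in atoms:
    print(atom["res_seq"], atom["atom_name"], atom["xyz"])
```

With a real download you would pass the file's lines instead of `sample`, for example `read_atoms(open(path), chain="A")`, where `path` is wherever you saved the file.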

In the "Mol" viewers, all you see initially is a cloud of atoms. You can right-click on the background, then select Style -> Scheme -> Cartoon to switch to a view similar to the Foldit cartoon view. If you right-click again, and select Style -> Bonds -> 0.15 Å, you get something similar to showing all sidechains in Foldit. These options are the same in JSMol and JMol, but I haven't verified them in PyMol.

One nice thing about the "Mol" cartoon view is that helixes and sheets have arrows (or rockets) pointing in the direction of the higher-numbered segments. This feature makes it easier to tell which end of the protein is which and which way a structure is aligned.

The "Mol" viewers also have true (simulated) 3D options under Style -> Stereographic. Lots of fun if you "forgot" to return those red-and-blue glasses after the movie.

If you're looking at 3DR2, you need to make a few adjustments to get to what you see in Foldit, especially at the segment-by-segment level.

3DR2 is a dimer, meaning it has two copies of the protein. In this case, the copies are not quite identical, due to problems with the experiment.

In the "Mol" viewers, you can limit the view to chain A, the first of the two copies. Right-click on the background and select "Console". In the console window, enter these commands:

select :A
restrict selected

These commands select the first copy of the protein, and limit the display to that copy. (Close the console window to see the protein again.)

The sequence for 3DR2 shows three additional segments at the beginning of both chains. The experiment didn't find these segments for chain A, so the model starts at segment 4. (For chain B, the first four segments somehow went missing.)

So, looking at chain A of 3DR2, segment 4 is segment 1 in puzzle 1440. You see the segment numbers (or "residue" numbers) when you hover over the segment in a "Mol" viewer, similar to shift-tab in Foldit.

Overall, segments 1 to 302 in Foldit are segments 4 to 305 of chain A of 3DR2 in the PDB. Segment 303 in Foldit is the aflatoxin ligand, which isn't present in 3DR2.

A few segments in the middle of 3DR2 also didn't show up in the results. For the A chain, segments 210, 211, and 212 are missing. The puzzle designers filled in the corresponding segments, 207 to 209, in puzzle 1440. These formerly missing segments are included in the ones which can be designed in 1440.
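
The numbering shift described above is easy to get wrong by hand, so here's a small Python sketch of it. It only encodes what's stated in this post: Foldit segments 1 to 302 correspond to residues 4 to 305 of chain A, segment 303 is the ligand, and residues 210 to 212 are missing from chain A in the PDB model. The function names are hypothetical.

```python
OFFSET = 3                        # Foldit segment 1 is residue 4 of 3DR2 chain A
LAST_PROTEIN_SEGMENT = 302        # last protein segment in puzzle 1440
LIGAND_SEGMENT = 303              # aflatoxin ligand, not present in 3DR2
MISSING_IN_PDB = {210, 211, 212}  # chain A residues absent from the PDB model

def foldit_to_pdb(segment):
    """Foldit segment number -> 3DR2 chain A residue number (None for the ligand)."""
    if segment == LIGAND_SEGMENT:
        return None
    if not 1 <= segment <= LAST_PROTEIN_SEGMENT:
        raise ValueError("segment out of range: %d" % segment)
    return segment + OFFSET

def pdb_to_foldit(residue):
    """3DR2 chain A residue number -> Foldit segment number."""
    segment = residue - OFFSET
    if not 1 <= segment <= LAST_PROTEIN_SEGMENT:
        raise ValueError("residue has no Foldit segment: %d" % residue)
    return segment

def is_modeled_in_pdb(segment):
    """True if the segment has coordinates in chain A of 3DR2."""
    residue = foldit_to_pdb(segment)
    return residue is not None and residue not in MISSING_IN_PDB

print(foldit_to_pdb(1), foldit_to_pdb(302))          # ends of the protein
print([pdb_to_foldit(r) for r in sorted(MISSING_IN_PDB)])  # filled-in segments
```

So a segment like 208 exists in Foldit but has nothing to compare against in chain A of the PDB entry.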

(Second in a series, collect them all!)

LociOiling Lv 1

As Susume has noted, the protein in puzzle 1440 has a "beta propeller" shape.

This complicated shape no doubt is key to how the protein works as an enzyme. The abstract for PDB entry 3DR2 describes it in technical terms, saying it

...forms a novel disulfide-bonded clamshell dimer comprising two doughnut-shaped six-bladed beta-propeller domains, yet with an exceptionally long N-terminal subdomain forming an extra helix and four additional beta-strands to enclose half of the outermost beta-strands of each propeller.

At least the "clamshell" and "doughnut" parts are easy. Some of the other stuff requires more explanation.

A "dimer" is just a protein built in two parts. Both parts may be identical, as in 3DR2. (OK, mostly identical.)

The "disulfide-bonded" part refers to disulfide bridges that would connect the two halves of the dimer. Since one half is missing, these bridges aren't present in puzzle 1440. The cysteines at segments 5 and 157 in Foldit would join the two halves of the dimer. (Segment 5 on the A chain would be bridged to segment 157 on the B chain, and vice-versa, so two disulfide bridges.)

There is a bridge between segments 242 and 279 in 1440, but both these segments are locked, so there's no chance to do anything with it. Segments 15, 99, 258, and 295 also start out as cysteine, and are unlocked. You might be able to mutate other segments to cysteine to form bridges with them, or you can also mutate them to be something other than cysteine. The PDB for 3DR2 doesn't even list the bridge between 242 and 279, so you're on your own if you want to add more.

Continuing on in the abstract, "domains" and "sub-domains" are really just "parts" or "chunks".

The "N-terminal" of the protein means the part that starts at segment 1, the amiNo end. The other end is the "C-terminal", the aCid end. N-terminal refers to the nitrogen in the "amino group" of the amino acid. Segment 1 is the first amino acid added to the protein, and its nitrogen isn't bonded to anything.

"Beta-strands" are what we call "sheets" in Foldit. Technically, each flat zigzag part in the cartoon view would be a "beta strand", and you could only call it a "beta sheet" when two or more separate strands are bonded together. For Foldit, dropping the "beta" part of the name and just calling a strand a sheet works well enough.

The "beta" name came about because sheets/strands were discovered second back in the early days of X-ray crystallography. Helixes were discovered first, so you got alpha-helixes.

It's easy to find three additional sheets ("beta-strands") in the N-terminal part of 3DR2 and puzzle 1440. Each of these three sheets is bonded to a different but adjacent blade of the propeller. We'll look for a fourth "additional" sheet in yet another post, which will hopefully include some pictures.

bkoep Staff Lv 1

All the organizations involved (Mars Inc., Thermo Fisher, UW, etc.) have agreed that no one will pursue IP protecting the direct results from Foldit players in these Aflatoxin puzzles.

As Aubade01 noted below, there is nothing we can do about protecting derivatives of work in the public domain. Anyone (including the sponsors, their competitors, and Foldit players themselves) will have access to the Foldit player designs from these puzzles, and will be free to use the results to inform their own original research.

We nonetheless consider this an opportunity for Foldit players to do constructive science and make significant, positive contributions to aflatoxin research. However, I understand your misgivings, and recognize that any Foldit player may abstain from Aflatoxin Challenge puzzles if he or she so chooses.

bkoep Staff Lv 1

We expect that the best designs will make additional interactions with the ligand.

These interactions can be hydrogen bonds with polar atoms on the ligand; but they can also be nonpolar residues that pack nicely against the ligand, or sidechains that occlude voids nearby. Good designs may also include improvements that indirectly favor ligand-binding, by stabilizing protein residues near the binding pocket or elsewhere. We hope that (if the puzzle is set up well) the score will reflect all of these factors.

LociOiling Lv 1

Here's the aflatoxin ligand from puzzle 1440, segment 303. It's a little different than the structures you see for aflatoxin B1 (AFB1) and other common varieties of aflatoxin. For example, atom 24, at the top of the image, is calcium. Everything else is either carbon (light blue, using EnzDes coloring here), oxygen (red), or hydrogen (white). It can be a little hard to tell, but the hydrogens are atoms 26 through 38.

The scientists will have to explain exactly what's going on here.

The atoms were identified by using structure.GetAtomCount, then drawing a band between the ligand and a nearby segment of the protein. Atom 24 was a little difficult to pin down, but it was traced through the config files 0002004322.ir_puzzle.params and database/chemical/atom_type_sets/fa_standard/atom_properties.txt.

[Image: Puzzle 1440, segment 303, aflatoxin ligand with atom numbers.]

Susume Lv 1

Thanks Loci! That makes it much easier to talk about specific atoms in the ligand!

The pentagon that hangs directly off the lactone ring (atoms 5, 7, 8) has two hydrogens each on carbons 7 and 8 (hydrogens 26, 27, 28, 29) so the hydrogens don't lie in the plane of the pentagon. Can that pentagon form a pi sandwich with an aromatic sidechain (or two), or do the hydrogens get in the way?

Also, can a pentagon and a hexagon form a pi sandwich with each other, or do they have to be the same shape?

LociOiling Lv 1

Here's aflatoxin B1 as seen in JMol. It's quite a bit different than the ligand at segment 303 in puzzle 1440.

[Image: Aflatoxin B1 seen in JMol.]