Icon representing a recipe

Recipe: DDG Calculatr 1.1.1

created by donuts554

Profile


Name
DDG Calculatr 1.1.1
ID
103728
Shared with
Public
Parent
None
Children
None
Created on
August 10, 2020 at 06:21 AM UTC
Updated on
August 10, 2020 at 06:21 AM UTC
Description

This recipe calculates the DDG Binding Energy of the complex of the locked and unlocked segments of the binder design puzzle in kCal/mol (kilocalorie per mole) and e^2/Angstrom (elementary charge squared per angstrom). This recipe also calculates the three most attractive and repulsive amino acid pairs in the locked and unlocked segments.
The more negative the DDG score, the more likely your protein is to bind to the locked receptor. It is somewhat similar to and based on how DDG is formally calculated in Rosetta. Instead of the various Rosetta subscores, this recipe takes into consideration the potential energy of the complex of the two segments, as determined by the force between every pair of amino acids that have one on the unlocked segment, and one on the locked segment of the puzzle, as determined by Coulomb's inverse-square law.
But this time in version 1.1.1, I fixed a problem with the ranking of amino acid pairs, and I made the recipe more accurate by adding a band if the distance between the amino acids is below a threshold.

Note: This version of the recipe does not take Clashing or Hydrophobic hiding or voids into consideration, so it seems that it would be best to use the Foldit score for your protein in the case that your protein has many clashes or exposeds or voids. I recommend using this recipe with no clashes, few exposeds and a little amount of voids in the unlocked segment, and to have the unlocked segment close to the locked segment of the binder puzzle so that the score that it would give would be more accurate.

Best for


Code


distanceBand = 0 versionString = "1.1.1" -- Hello! Welcome to the recipe code of DDG Calculatr 1.1.1! Feel free to read through the code. Please PM me if you have any questions or comments. -- This recipe calculates the DDG Binding Energy of the complex of the locked and unlocked segments of the puzzle in kCal/mol (kilocalorie per mole) and e^2/Angstrom (elementary charge squared per angstrom). This recipe also calculates the three most attractive and repulsive amino acids in the locked and unlocked segments. -- It is somewhat similar to and based on how DDG is formally calculated in Rosetta. -- Instead of the various Rosetta subscores, this recipe takes into consideration the potential energy of the complex of the two segments as determined by the force between every pair of amino acids that have one on the unlocked segment and one on the locked segment of the puzzle as determined by Coulomb's inverse-square law. -- But this time in version 1.1.1, I fixed a problem with the ranking of amino acid pairs, and I made the recipe more accurate by adding a band if the distance between the amino acids is below a threshold. -- -- Thank you for reading through this message! function hasNoChargedAtoms(AA) return (AA == 'c') or (AA == 'u') or (AA == 'g') or (AA == 'p') or (AA == 'a') or (AA == 'v') or (AA == 'i') or (AA == 'l') or (AA == 'm') or (AA == 'f') end function hasAChargedAtom(AA) return (AA == 'k') or (AA == 's') or (AA == 't') or (AA == 'y') or (AA == 'w') end function hasTwoChargedAtoms(AA) return (AA == 'h') or (AA == 'd') or (AA == 'e') or (AA == 'n') or (AA == 'q') end function getBondableAtoms(AA) if(hasAChargedAtom(AA)) then if (AA == 'k') or (AA == 'w') then --The amino acid has a donor atom return "Donor" else --The amino acid has a purple atom return "Donoracceptor" end else if(hasTwoChargedAtoms(AA)) then if (AA == 'd') or (AA == 'e') then --The amino acid has two acceptor atoms return "Diacceptor" else --The amino acid has a donor and acceptor atom return "Donoracceptor" end end end end function main() print("Hello " .. user.GetPlayerName() .. "! Welcome to DDG Calculatr " .. versionString .. ", by donuts554!") print("For reference, in the context of this recipe, kCal/mol = kilocalorie per mole, e = elementary charge, Ang. = Angstrom, e^2/Ang. = elementary charge squared per angstrom, seg. = segment, and AA = amino acid.") if not (structure.IsLocked(1)) then errorDialog = dialog.CreateDialog("Error") errorDialog.Text = dialog.AddLabel("There are no locked segments in this puzzle.") errorDialog.OKButton = dialog.AddButton("OK", 1) dialog.Show(errorDialog) return end lastUnlockedSeg = structure.GetCount() if structure.IsLocked(lastUnlockedSeg) then errorDialog = dialog.CreateDialog("Error") errorDialog.Text = dialog.AddLabel("There are locked segments that have indexes after the unlocked segment indexes.") errorDialog.OKButton = dialog.AddButton("OK", 1) dialog.Show(errorDialog) return end print("There are locked segments in this puzzle.") print("Finding the index of the first unlocked segment...") incrementMax = -9999999 incrementMax2 = -9999998 incrementMax3 = -9999997 incrementMin = 9999999 incrementMin2 = 9999998 incrementMin3 = 9999997 distMax = "" distMax2 = "" distMax3 = "" distMin = "" distMin2 = "" distMin3 = "" seg = 1 while structure.IsLocked(seg) do seg = seg + 1 end firstUnlockedSeg = seg print("The index of the first unlocked segment has been found!") --Amino acids with sulfur aren't polar in this case print("Calculating DDG...") lockedAA = "" unlockedAA = "" DDG = 0 -- firstUnlockedSeg-1 is the last locked segment for i = 1, firstUnlockedSeg-1, 1 do print("Seg. " .. i .. "/" .. firstUnlockedSeg-1 .. " of locked seg. - DDG (e^2/Ang.) so far: " .. DDG .. ".") lockedAA = structure.GetAminoAcid(i) --print(lockedAA) if hasNoChargedAtoms(lockedAA) then --print("Segment " .. i .. " has no charged atoms in its sidechain.") else for j = firstUnlockedSeg, lastUnlockedSeg, 1 do --print(i) --print("Calculating index " .. j-firstUnlockedSeg+1 .. "/" .. lastUnlockedSeg-firstUnlockedSeg+1 .. " of the unlocked segment - The DDG so far is " .. DDG*structure.GetDistance(i,j) .. ".") unlockedAA = structure.GetAminoAcid(j) if hasNoChargedAtoms(unlockedAA) then --print("Segment " .. j .. " has no charged atoms in its sidechain.") else lockedBondables = getBondableAtoms(lockedAA) unlockedBondables = getBondableAtoms(unlockedAA) if(lockedBondables == "Donor") then if(unlockedBondables == "Donor") then updateDDG(1, i ,j) else if(unlockedBondables == "Donoracceptor") then updateDDG(-1, i ,j) else if(unlockedBondables == "Diacceptor") then updateDDG(-2, i ,j) else --The unlocked amino acid is Arginine updateDDG(3, i ,j) end end end else if(lockedBondables == "Donoracceptor") then if(unlockedBondables == "Donor") then updateDDG(-1, i ,j) else if(unlockedBondables == "Donoracceptor") then updateDDG(-2, i ,j) else if(unlockedBondables == "Diacceptor") then updateDDG(-2, i ,j) else --The unlocked amino acid is Arginine updateDDG(-3, i ,j) end end end else if(lockedBondables == "Diacceptor") then if(unlockedBondables == "Donor") then updateDDG(-2, i ,j) else if(unlockedBondables == "Donoracceptor") then updateDDG(-2, i ,j) else if(unlockedBondables == "Diacceptor") then updateDDG(4, i ,j) else --The unlocked amino acid is Arginine updateDDG(-6, i ,j) end end end else --The locked amino acid is Arginine if(unlockedBondables == "Donor") then updateDDG(3, i ,j) else if(unlockedBondables == "Donoracceptor") then updateDDG(-3, i ,j) else if(unlockedBondables == "Diacceptor") then updateDDG(-6, i ,j) else --The locked amino acid is Arginine updateDDG(9, i, j) end end end end end end end end end end --This is the net conversion based on the conversion from Angstrom to Meter, Coloumb's Law (in Joules), and the conversion from Joule to Calorie to Kilocalorie. kCalDDG = DDG * 0.214807880367 print("DDG is calculated!") print("DDG = " .. DDG .. " e^2/Angstrom, which is " .. kCalDDG .. " kCal/mol.") print("Most repulsive AA pairs (" .. incrementMax * 0.214807880367 .." kCal/mol): " .. distMax .. ", " .. distMax2 .. ", & " .. distMax3) print("Most attractive AA pairs (" .. incrementMin * 0.214807880367 .." kCal/mol): " .. distMin .. ", " .. distMin2 .. ", & " .. distMin3) DDGDialog = dialog.CreateDialog("Calculation Results") DDGDialog.TextEAngstrom = dialog.AddLabel("DDG = " .. DDG .. " e^2/Angstrom") DDGDialog.TextKcalMole = dialog.AddLabel("Which is " .. kCalDDG .. " kCal/mol") DDGDialog.TextRepulsiveHead = dialog.AddLabel("Repulsive AA pairs (" .. incrementMax .. " kCal/mol) ") DDGDialog.TextRepulsiveList = dialog.AddLabel(distMax .. ", " .. distMax2 .. ", & " .. distMax3) DDGDialog.TextAttractive = dialog.AddLabel("Attractive AA pairs (" .. incrementMin .. " kCal/mol) ") DDGDialog.TextAttractiveList = dialog.AddLabel(distMin .. ", " .. distMin2 .. ", & " .. distMin3) DDGDialog.OKButton = dialog.AddButton("OK", 1) dialog.Show(DDGDialog) end function updateDDG(chargeProduct, i, j) distance = structure.GetDistance(i, j) -- 9.36 Angstroms is the distance between the endpoints of a straightforward zigzag 6 (not 7 because the hydrogen atom can be in a hydrogen bond [Hydrogen atom can be intersected in a hydrogen bond]) atoms long (lysine [pointing straight outwards] - alpha carbon to nitrogen), doubled. -- Note: The path from the alpha carbon of Tryptophan to the farthest atom on the sidechain of that amino acid is six atoms long, but is shorter due to the path not being a straightforward zigzag. -- The reason why this distance is chosen is because it is the closest distance where the length of the biggest sidechain, doubled, can fit in, and that if the distance is shorter, then the number of possible ways the sidechains on both sides can be in gets significantly more limited. This limitation characterizes the significance of the positions of the sidechains, and so the minimum distance where this limitation is unsignificant is used. -- I would like to point out that I have considered the thought that two lysine sidechains pointing straight at each other with their alpha carbons 9.36 Angstroms apart would cause a clash, and so that this configuration would not be possible. However, this is only just one known impossible configuration, so the limitation is not significant. If the distance becomes more shorter, then the much larger number of configurations only possible with the alpha carbons 9.36 Angstroms apart would be impossible, and the muc larger number of impossible configurations is what makes the limitation significant. if (distance < 9.36) then if((lockedBondables == "Donor") or (lockedAA == 'r')) then atomIndex1 = 9 else if((lockedBondables == "Diacceptor") or (lockedAA == 'q')) then atomIndex1 = 7 else if ((lockedAA == 's') or (lockedAA == 't') or (lockedAA == 'n')) then atomIndex1 = 6 else if(lockedAA == 'h') then atomIndex1 = 9 else -- lockedAA = Tyrosine atomIndex1 = 12 end end end end if((unlockedBondables == "Donor") or (unlockedAA == 'r')) then atomIndex2 = 9 else if((unlockedBondables == "Diacceptor") or (unlockedAA == 'q')) then atomIndex2 = 7 else if ((unlockedAA == 's') or (unlockedAA == 't') or (unlockedAA == 'n')) then atomIndex2 = 6 else if(unlockedAA == 'h') then atomIndex2 = 9 else -- lockedAA = Tyrosine atomIndex2 = 12 end end end end distanceBand = band.AddBetweenSegments(i, j, atomIndex1, atomIndex2) if(not(distanceBand == 0)) then distance = band.GetLength(distanceBand) band.Delete(distanceBand) distanceBand = 0 end end increment = chargeProduct/distance if(increment > incrementMax) then distMax3 = distMax2 distMax2 = distMax distMax = i .. "&" .. j incrementMax = increment else if(increment > incrementMax2) then distMax3 = distMax2 distMax2 = i .. "&" .. j incrementMax2 = increment else if(increment > incrementMax3) then distMax3 = i .. "&" .. j incrementMax3 = increment end end end if(increment < incrementMin) then distMin3 = distMin2 distMin2 = distMin distMin = i .. "&" .. j incrementMin = increment else if(increment < incrementMin2) then distMin3 = distMin2 distMin2 = i .. "&" .. j incrementMin2 = increment else if(increment < incrementMin3) then distMin3 = i .. "&" .. j incrementMin3 = increment end end end DDG = DDG + increment end function handleError(err) if(not(distanceBand == 0)) then band.Delete(distanceBand) end if string.find(err, "Cancelled")~=nil then print("The recipe was cancelled.") else print("Sadly, there is an error.") print(err) end end status = xpcall(main, handleError)

Comments


Susume Lv 1

This is a really interesting idea! A couple of things you might want to consider for a future version: I notice you are dividing the charge product by the distance between charges, but Coulomb's law actually uses the square of the distance - this is important because the attractive or repulsive force decreases very rapidly with distance.

The structure.GetDistance function that foldit Lua provides gives you the distance between the alpha carbons. You might get better accuracy by locating the polar atoms themselves and adding a band between them, then asking for the length of the band. That would definitely make the recipe much slower, though. One way to limit the time increase would be to query alpha carbon distance first (which is quite fast), and only use a band to get a more accurate distance if the alpha carbon distance is below some threshold (close enough that the difference between alpha carbon distance and sidechain to sidechain distance is expected to be significant).

I hope you'll keep working on this!

jeff101 Lv 1

Coulomb's law for the force between 2 charges does go 
as 1/r^2, where r is the distance between the 2 charges,
but the potential energy due to this force goes as 1/r.

The gravitational force between 2 masses also goes
as 1/r^2 while the gravitational potential energy
goes as 1/r. 

The equations for electrostatic force & potential energy
are very similar to the equations for gravitational force &
potential energy. One difference is that the electrostatic
ones go as the product of 2 charges while the gravitational
ones go as the product of 2 masses.

Because of these things, I think using 1/r when calculating
the Binding Energy DDG makes more sense than using 1/r^2.

https://en.wikipedia.org/wiki/Inverse-square_law#Electrostatics
https://en.wikipedia.org/wiki/Inverse-square_law#Gravitation
https://en.wikipedia.org/wiki/Gravitational_energy#Newtonian_mechanics
https://en.wikipedia.org/wiki/Electric_potential_energy#Electrostatic_potential_energy_of_one_point_charge

Susume Lv 1

I notice the recipe attempts to keep track of the top three pairs contributing to repulsion, and the top three pairs contributing to attraction. When a new pair exceeds the #1 spot in these lists, it gets inserted at the top of the list and the others are pushed down. However, if a new pair falls between #1 and #2, or between #2 and #3, it is not getting added into the list as it should.

The recipe is using the following values for the point charges (measured in e, elementary charge) for each amino acid:
0 a, c, f, g, i, l, m, p, v
1 k, w
3 r
-1 h, n, q, s, t, y
-2 d, e
I assume these are based on the number of polar atoms in the sidechain. D and E are commonly considered charged, while I guess the polar atoms on the others could be considered to have a partial charge. Is there a table somewhere of how charged the various amino acids are? On those that have both a partial positive and a partial negative, should these point charges be considered separately rather than aggregated?

jeff101 Lv 1

More arguments favoring 1/r instead of 1/r^2 are below:

Electrostatic forces (which go as the product of 2 charges 
divided by r^2, the square of the distance between the 
charges) are vectors with both magnitude and direction. 
Being vectors in 3D space gives x y and z components in 
cartesian coordinates. This makes it more complicated to 
calculate the overall force on a single charge due to a 
group of nearby charges.

Electrostatic energy (which goes as the product of 2 charges
divided by r, the distance between the charges) is a scalar 
with no direction or x y z components to worry about. This
makes it easier to calculate the overall energy of a group
of nearby charges.

donuts554 Lv 1

I will try to fix the issue of when a new pair falls in between #1 & #2 or #2 & #3 in the next version of this recipe, and I will try to make the recipe more accurate by adding a band if the distance between the amino acids is below a threshold. Thank you for the suggestions!

I think there is a table somewhere on how charged the various amino acids are, at http://thinkpeptides.com/aminoacidproperties.html . On the table, it seems that the isoelectric point and sidechain acidity/basicity of each amino acid can show how charged the various amino acids are.

On the amino acids that have both a partial positive and a partial negative, I think the point charges should be considered separately rather than aggregated, because the partial charges do play a significant role in the binding mechanism of two proteins, as either of them can attract any charged amino acid. If they are aggregated, then those amino acids wouldn't be polar and their partial positive and negative charges wouldn't be taken into consideration.

I would like to point out that in this recipe, the amino acids h, n, q, s, t, and y are assigned either 1 or -1 elementary charge depending on the corresponding amino acid pair it is compared with, as both of the polar atoms can be bonded to.