Recipe: print protein 2.9.4

created by LociOiling

Profile


Name
print protein 2.9.4
ID
108276
Shared with
Public
Parent
print protein 2.9.3
Children
Created on
May 08, 2023 at 21:36 UTC
Updated on
May 08, 2023 at 22:13 UTC
Description

Update of "print protein lua2 V0" by marie_s.

Version 2.9.4 has improved chain detection. The main dialog has been restructured to prevent conditions where it's "too tall" to fit the Foldit window.
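
The chain detection mentioned above combines atom counts with a distance-based backup. The following is a minimal sketch of that idea, not the recipe's actual code: the function name `findChains` and the plain input tables are hypothetical, standing in for values the recipe reads through `structure.GetAminoAcid`, `structure.GetAtomCount`, and `structure.GetDistance`. The reference atom counts and the 4.0 distance threshold are taken from the recipe's `AminoAcids` table and `ACRF` constant.

```lua
-- Simplified sketch of two-stage C-terminal detection (hypothetical helper).
local ACREF = { a = 10, g = 7, s = 11 }   -- mid-chain atom counts
local ACRF = 4.0                          -- alpha carbon reference distance

local function findChains ( aa, atoms, dist )
    local chains = {}
    local start = 1
    for ii = 1, #aa do
        local ref = ACREF [ aa [ ii ] ]
        -- stage 1: a C-terminal residue has one atom more than mid-chain
        local cterm = ( atoms [ ii ] == ref + 1 )
        -- stage 2 (backup): last segment, or a large gap to the next segment
        if not cterm and ( ii == #aa or dist [ ii ] > ACRF ) then
            cterm = true
        end
        if cterm then
            chains [ #chains + 1 ] = { start = start, stop = ii }
            start = ii + 1
        end
    end
    return chains
end

-- two chains, g-a-s and a-g, whose terminal atoms are missing,
-- so only the distance backup can separate them
local aa     = { "g", "a", "s", "a", "g" }
local atoms  = { 7, 10, 11, 10, 7 }
local dist   = { 3.8, 3.8, 12.0, 3.8 }    -- dist [ ii ] = segment ii to ii + 1
local chains = findChains ( aa, atoms, dist )
for _, ch in ipairs ( chains ) do
    print ( ch.start .. "-" .. ch.stop )
end
```

The sketch only looks for chain ends; the recipe also checks N-terminals (two extra atoms, three for proline) and treats cysteines, ligands, RNA, and DNA separately.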

Best for


Code


--[[
    print protein

    info on the protein - Concatenation of recipes, part of recipes or functions by:
    Tlaloc, spvincent, seagate, John McLeod, Crashguard303, Gary Forbis and authors on wiki

    Now with code from Timo van der Laan and more code from spvincent.

    Borrowed HerobrinesArmy's idea for copy-and-paste on the segment score table,
    plus option for including atom and rotamer counts.

    Intended use:

    1. Run
    2. Open the script log (scriptlog.default.xml) in a text editor.
    3. Strip off the start and end lines and save it as a text file with another name.
    4. Import that into Excel as a comma-delimited text file.

    Alternately:

    3. Select and copy the lines containing the score detail.
    4. Paste into a spreadsheet or other program that accepts CSV
       (comma-separated values) format.

    Yet another alternative:

    3. Select and copy the "score report" from the new "copy-and-paste" dialog.
    4. Paste into the spreadsheet of your choice.

    version history
    ---------------

    print protein lua v0 - 2011/08/15 - marie_s (Marie Suchard)

    print protein 2.0 - 2015/05/13 - LociOiling
     + add dialog
     + use active subscores
     + restrict scope of most variables
     + convert amino acid table to keyed format
     + added kludges specific to puzzle 879 (hope they are never needed)
     + add adjustable rounding, eliminate existing "no trunc" logic
     + made tab the default delimiter, with comma or semicolon as alternates
     + add detailed scoring information
     + add "modifiable sections" report, make original "mutable" report optional
     + make "mini contact table" optional
     + in subscores report, add option for atom and rotamer counts,
       make hydropathy index optional
     + add warnings for unknown amino acid or secondary structure codes,
       subtotal mismatches, suspect ligands
     + add copy-and-paste dialog for subscores, primary and secondary structure

    print protein 2.1 - 2016/03/23 - LociOiling
     + add ruler

    print protein 2.2 - 2016/04/18 - LociOiling
     + fix crash in normal ligand case
     + add mean score to subscore display

    print protein 2.3 - 2016/10/27 - LociOiling
     + add density analysis for ED puzzles
       * density by amino acid
       * density by aromatic vs. non-aromatic
       * density by aliphatic vs. non-aliphatic
       * density by hydrophobic vs. hydrophilic
       * density deviation, showing whether each segment is
         above or below mean density for the segment's amino acid type
     + move less-used parms to second screen to avoid screen overflow

    print protein 2.4 - 2016/11/15 - LociOiling
     + move ED items to separate dialog to avoid textbox bug
     + add "active" flag to active subscores table to fix
       totally messed up selection logic; ripple change to GetScore

    print protein 2.5 - 2017/05/19 - LociOiling
     + fix rulers, split long lines

    print protein 2.6 - 2017/08/06 - LociOiling
     + save and restore to conserve moves

    print protein 2.7 - 2017/11/02 - LociOiling
     + consolidate duplicated segment subscore calls
     + add info from scoreboard and user functions

    print protein 2.8 - 2017/12/15 - LociOiling
     + handle RNA, make educated guess for DNA
     + include ligands in report, treat each ligand independently
     + add new type column, P for protein, R for RNA, D for DNA,
       M for molecule (ligand)
     + if locked segments found, include "locked" column
     + check for unlocked sidechains of locked segments

    print protein 2.8.1 - 2018/04/27 - LociOiling
     + short lines
     + chain detection
     + filter reporting
     + eliminate internal segment table, use protNfo
     + convert AminoAcids table to use named values (not indexed)
     + disulfide analysis
     + additional density reports suggested by jeff101
     + unreleased, density changes not finished, especially with
       regard to terminal regions, but several new comparisons added

    print protein 2.9 - 2020/03/18 - LociOiling
     + revived
     + refine filter reporting
     + update to SLT module
     + use Lua table format for modifiable sections
     + separate locked backbone and locked sidechain into two columns
     + use better separators in scriptlog
     + minor formatting tweaks

    print protein 2.9.1 - 2020/04/11 - LociOiling
     + handle proline at N-terminal correctly
     + account for cysteines as terminals in locked, zero-score positions
     + use multiple-pass, phased approach
     + remove scoreboard.GetScore, gets Rosetta for best overall, not this protein

    print protein 2.9.2 - 2020/07/01 - LociOiling
     + figure out disulfide bridges even in zero-score sections
     + correct DNA code for thymine, "dt", not "du" (d'oh!)
       (as seen in Foldit Standalone)
     + fix stupid crash when there's a ligand

    print protein 2.9.3 - 2022/08/28 - LociOiling
     + fix crash when binder target lacks N- or C-terminals

    print protein 2.9.4 - 2023/05/08 - LociOiling
     + use structure.IsLocked for sidechains as well as backbone
     + backup detection of N- and C-terminals based on distance
     + limit main dialog height
     + move subscores to separate dialog
     + convert structure, ligand, and density tables to named columns
     + standardize chain reporting
     + link to pageable chains dialog from main dialog
     + add "end of report" lines for the sake of copy/paste
]]--

--
--  globals section
--
Recipe = "print protein"
Version = "2.9.4"
ReVersion = Recipe .. " " .. Version
WAYBACK = 99        -- save and restore
isLigand = false
subScores = {}      -- subscore table
gTotal = 0          -- grand total all subscores
segScoreCache = {}  -- current.GetSegmentEnergyScore results saved here by FindActiveSubscores
subScoreCache = {}  -- current.GetSegmentEnergySubscore results saved here by FindActiveSubscores

BoxScore = {
    tSubScores = 0,     -- total of active subscores, all segments
    tSegScores = 0,     -- total of segment scores
    tScoreFilt = 0,     -- total score, filters on
    tScoreFOff = 0,     -- total score, filters off
    tScoreNrgy = 0,     -- total energy score
    tScoreBonus = 0,    -- total filter bonus
    tScoreForm = 0,     -- subscores + filter bonus + 8000
    tScoreDark = 0,     -- total "dark" score
}

PPX = {
    kHydro = false,     -- include hydropathy index
    kAtom = false,      -- include atom count
    kRotamer = false,   -- include rotamer count
    kLongnm = false,    -- include full name of amino acid or nucleobase
    kAbbrev = false,    -- include abbreviation
    kRound = 3,         -- number of digits for rounding
    kFax = 10 ^ -3,     -- initial rounding factor
    jMaxL = 100,        -- max line length for strings
    dtab = true,
    delim = "\t" ,      -- default delimiter is tab character
    dcomma = false,     -- allow user to select comma
    dsemic = false,     -- allow user to select semicolon
    kMutDet = false,    -- detailed mutable report
    kContact = false,   -- contact table
    kDensity = false,   -- density report
}

jDensity = false        -- true if density component present

Ctypes = {
    P = "protein",
    D = "DNA",
    R = "RNA",
    M = "ligand",
}
--
--  AminoAcids
--
--  names and key properties of all known amino acids and nucleobases
--
--  Notes:
--
--  * commented entries (at the end) are not in Foldit
--  * one-letter amino acid code is the table key
--  * two-letter RNA and DNA nucleotides are also valid
--  * the fields in this table are now referenced by name
--  * the "unk" and "x" codes are considered protein, unless the segment is marked as
--    ligand in the secondary structure ( code "M" )
--  * acref is atom count mid-chain, used to detect multiple peptide chains
--
AminoAcids = {
    a = { code = "a", ctype = "P", acref = 10, short = "Ala", long = "Alanine", hydrop = 1.8 },
    c = { code = "c", ctype = "P", acref = 11, short = "Cys", long = "Cysteine", hydrop = 2.5 },
    d = { code = "d", ctype = "P", acref = 12, short = "Asp", long = "Aspartate", hydrop = -3.5 },
    e = { code = "e", ctype = "P", acref = 15, short = "Glu", long = "Glutamate", hydrop = -3.5 },
    f = { code = "f", ctype = "P", acref = 20, short = "Phe", long = "Phenylalanine", hydrop = 2.8 },
    g = { code = "g", ctype = "P", acref = 7, short = "Gly", long = "Glycine", hydrop = -0.4 },
    h = { code = "h", ctype = "P", acref = 17, short = "His", long = "Histidine", hydrop = -3.2 },
    i = { code = "i", ctype = "P", acref = 19, short = "Ile", long = "Isoleucine", hydrop = 4.5 },
    k = { code = "k", ctype = "P", acref = 22, short = "Lys", long = "Lysine", hydrop = -3.9 },
    l = { code = "l", ctype = "P", acref = 19, short = "Leu", long = "Leucine", hydrop = 3.8 },
    m = { code = "m", ctype = "P", acref = 17, short = "Met", long = "Methionine", hydrop = 1.9 },
    n = { code = "n", ctype = "P", acref = 14, short = "Asn", long = "Asparagine", hydrop = -3.5 },
    p = { code = "p", ctype = "P", acref = 15, short = "Pro", long = "Proline", hydrop = -1.6 },
    q = { code = "q", ctype = "P", acref = 17, short = "Gln", long = "Glutamine", hydrop = -3.5 },
    r = { code = "r", ctype = "P", acref = 24, short = "Arg", long = "Arginine", hydrop = -4.5 },
    s = { code = "s", ctype = "P", acref = 11, short = "Ser", long = "Serine", hydrop = -0.8 },
    t = { code = "t", ctype = "P", acref = 14, short = "Thr", long = "Threonine", hydrop = -0.7 },
    v = { code = "v", ctype = "P", acref = 16, short = "Val", long = "Valine", hydrop = 4.2 },
    w = { code = "w", ctype = "P", acref = 24, short = "Trp", long = "Tryptophan", hydrop = -0.9 },
    y = { code = "y", ctype = "P", acref = 21, short = "Tyr", long = "Tyrosine", hydrop = -1.3 },
    --
    -- codes for ligands or modified amino acids
    --
    x = { code = "x", ctype = "P", acref = 0, short = "Xaa", long = "Unknown", hydrop = 0 },
    unk = { code = "x", ctype = "P", acref = 0, short = "Xaa", long = "Unknown", hydrop = 0 },
    --
    -- bonus! RNA nucleotides
    --
    ra = { code = "a", ctype = "R", acref = 33, short = "a", long = "Adenine", hydrop = 0, },
    rc = { code = "c", ctype = "R", acref = 31, short = "c", long = "Cytosine", hydrop = 0, },
    rg = { code = "g", ctype = "R", acref = 34, short = "g", long = "Guanine", hydrop = 0, },
    ru = { code = "u", ctype = "R", acref = 30, short = "u", long = "Uracil", hydrop = 0, },
    --
    -- bonus! DNA nucleotides
    --
    da = { code = "a", ctype = "D", acref = 0, short = "a", long = "Adenine", hydrop = 0, },
    dc = { code = "c", ctype = "D", acref = 0, short = "c", long = "Cytosine", hydrop = 0, },
    dg = { code = "g", ctype = "D", acref = 0, short = "g", long = "Guanine", hydrop = 0, },
    dt = { code = "t", ctype = "D", acref = 0, short = "t", long = "Thymine", hydrop = 0, },
    --
    -- dusty attic! musty cellar! jumbled boxroom!
    -- can't bear to part with these treasures
    --
    -- b = { code = "b", ctype = "P", acref = 10, short = "Asx", long = "Asparagine/Aspartic acid", hydrop = 0 },
    -- j = { code = "j", ctype = "P", acref = 10, short = "Xle", long = "Leucine/Isoleucine", hydrop = 0 },
    -- o = { code = "o", ctype = "P", acref = 10, short = "Pyl", long = "Pyrrolysine", hydrop = 0 },
    -- u = { code = "u", ctype = "P", acref = 10, short = "Sec", long = "Selenocysteine", hydrop = 0 },
    -- z = { code = "z", ctype = "P", acref = 10, short = "Glx", long = "Glutamine or glutamic acid", hydrop = 0 } ,
}
--
-- amino acid types
--
Aromatic = {
    f = { "phenylalanine", },
    h = { "histidine", },
    w = { "tryptophan", },
    y = { "tyrosine", },
}
Aliphatic = {
    i = { "isoleucine", },
    l = { "leucine", },
    v = { "valine", },
}
Hydrophobic = {
    a = { "alanine", },
    c = { "cysteine", },
    f = { "phenylalanine", },
    i = { "isoleucine", },
    l = { "leucine", },
    m = { "methionine", },
    p = { "proline", },
    v = { "valine", },
    w = { "tryptophan", },
    y = { "tyrosine", },
}
--
--  end of globals
--  section
--

--
--  begin protNfo Beta package version 0.5
--
--  version 0.4 is packaged as a pseudo-class or pseudo-module
--  containing a mix of data fields and functions
--
--  all entries must be terminated with a comma to keep Lua happy
--
--  the commas aren't necessary if only function definitions are present
--
--  version 0.3 merges in the chain-detection logic developed in
--  AA Edit 2.0
--
--  version 0.4 merges in the ligand logic from GetSeCount
--
--  version 0.5 is still a work in progress
--
--  TODO: still depends heavily on the external AminoAcids table
--
--
protNfo = {
    PROTEIN = "P",
    LIGAND = "M",
    RNA = "R",
    DNA = "D",
    UNKNOWN_AA = "x",
    UNKNOWN_BASE = "xx",
    CYSTEINE_AA = "c",
    CYS_MID_BRIDGE = 10,    -- atom count for midchain disulfide bridge
    CYS_AMBIGUOUS1 = 11,    -- atom count for C terminal bridge or midchain, no bridge
    CYS_AMBIGUOUS2 = 12,    -- atom count for N terminal bridge or C terminal, no bridge
    CYS_N_NO_BRIDGE = 13,   -- atom count for N terminal, no disulfide bridge
    PROLINE_AA = "p",
    HELIX = "H",
    SHEET = "E",
    LOOP = "L",             -- Foldit's loop code is "L"; "E" is sheet
    segCnt = 0,     -- unadjusted segment count
    segCnt2 = 0,    -- segment count adjusted for terminal ligands
    aa = {},        -- amino acid codes
    ss = {},        -- secondary structure codes
    ACRF = 4.0,     -- alpha carbon reference distance
    acdx = {},      -- alpha carbon distance
    atom = {},      -- atom counts
    rot = {},       -- rotamer counts
    phobe = {},     -- hydrophobics
    lock = {},      -- locked segments
    slck = {},      -- locked sidechains
    mute = {},      -- mutable segments
    ctype = {},     -- segment type - P, M, R, D
    first = {},     -- true if segment is first in chain
    last = {},      -- true if segment is last in chain
    nterm = {},     -- true if protein and if n-terminal
    cterm = {},     -- true if protein and if c-terminal
    fastac = {},    -- external code for FASTA-style output
    short = {},     -- short name
    long = {},      -- long name
    hydrop = {},    -- hydropathy index
    chainid = {},   -- chain id
    chainpos = {},  -- position in chain
    cysteine = {},  -- cysteines
    chains = {},    -- summary of chains
    ligands = {},   -- ligand table

    getChains = function ( self )
        --
        --  getChains - build a table of the chains found
        --
        --  Most Foldit puzzles contain only a single protein (peptide) chain.
        --  A few puzzles contain ligands, and some puzzles have had two
        --  protein chains. Foldit puzzles may also contain RNA or DNA.
        --
        --  For proteins, the atom count can be used to identify the first
        --  (N terminal) and last (C terminal) ends of the chain. The AminoAcids
        --  table has the mid-chain atom counts for each amino acid.
        --
        --  Cysteine is a special case, since the presence of a disulfide
        --  bridge also changes the atom count.
        --
        --  For DNA and RNA, the beginning and end of the chain is determined
        --  by context at present. For example, if the previous segment was protein
        --  and this segment is DNA, it's the start of a chain.
        --
        --  Each ligand is treated as a chain of its own, with a length of 1.
        --
        --  chain table entries
        --  -------------------
        --
        --  ctype   - chain type - "P" for protein, "M" for ligand, "R" for RNA, "D" for DNA
        --  fasta   - FASTA-format sequence, single-letter codes (does not include FASTA header)
        --  start   - Foldit segment number of sequence start
        --  stop    - Foldit segment number of sequence end
        --  len     - length of sequence
        --  chainid - chain id assigned to entry, "A", "B", "C", and so on
        --
        --  For DNA and RNA, fasta contains single-letter codes, so "a" for adenine.
        --  The codes overlap the amino acid codes (for example, "a" for alanine).
        --  The DNA and RNA codes must be converted to the appropriate two-letter codes Foldit
        --  uses internally, for example "ra" for RNA adenine and "da" for DNA adenine.
        --
        --
        --  we're assuming Foldit won't ever have more chains
        --
        local chainid = {
            "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
            "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"
        }
        local chainz = {}
        local chindx = 0
        local curchn = nil
        for ii = 1, self.segCnt do
            if self.first [ ii ] then
                chindx = chindx + 1
                chainz [ chindx ] = {}
                curchn = chainz [ chindx ]
                curchn.ctype = self.ctype [ ii ]
                curchn.fasta = ""
                curchn.ss = ""
                curchn.start = ii
                curchn.chainid = chainid [ chindx ]
                curchn.mute = 0
                curchn.len = 0
            end
            curchn.fasta = curchn.fasta .. self.fastac [ ii ]
            curchn.ss = curchn.ss .. self.ss [ ii ]
            self.chainid [ #self.chainid + 1 ] = curchn.chainid
            self.chainpos [ #self.chainpos + 1 ] = ii - curchn.start + 1
            if self.mute [ ii ] then
                curchn.mute = curchn.mute + 1
            end
            if self.last [ ii ] then
                curchn.stop = ii
                curchn.len = curchn.stop - ( curchn.start - 1 )
            end
        end
        return chainz
    end,

    getLigands = function ( self )
        --
        --  ultra-paranoid method for detecting ligands
        --
        --  each ligand segment is treated separately in this version
        --
        local ligandz = {}
        for ii = 1, self.segCnt do
            if protNfo.ss [ ii ] == "M" then
                local atoms = protNfo.atom [ ii ]
                local rots = protNfo.rot [ ii ]
                local sscor = current.GetSegmentEnergyScore ( ii )
                ligandz [ #ligandz + 1 ] = { seg = ii, atom = atoms, rot = rots, score = sscor }
            end
        end
        print ( #ligandz .. " ligands" )
        for jj = 1, #ligandz do
            print ( "ligand # " .. jj ..
                    ", segment = " .. ligandz [ jj ].seg ..
                    ", atoms = " .. ligandz [ jj ].atom ..
                    ", rotamers = " .. ligandz [ jj ].rot ..
                    ", score = " .. round ( ligandz [ jj ].score ) )
            if ligandz [ jj ].seg < self.segCnt2 then
                print ( "WARNING: non-standard ligand at segment " ..
                        ligandz [ jj ].seg ..
                        ", most ligand-aware recipes won't work properly" )
            end
        end
        return ligandz
    end,

    setNfo = function ( self )
        self.segCnt = structure.GetCount()
        --
        --  standard ligand adjustment
        --
        --  self.ss is not filled in until the initial scan below,
        --  so ask Foldit for the secondary structure directly
        --
        self.segCnt2 = self.segCnt
        while structure.GetSecondaryStructure ( self.segCnt2 ) == "M" do
            self.segCnt2 = self.segCnt2 - 1
        end
        if self.segCnt2 == self.segCnt then
            print ( "segment count = " .. self.segCnt )
        else
            print ( "original segment count = " .. self.segCnt )
            print ( "adjusted segment count = " .. self.segCnt2 )
        end
        --
        --  initial scan - retrieve basic info from Foldit and AminoAcids table
        --
        for ii = 1, self.segCnt do
            self.aa [ #self.aa + 1 ] = structure.GetAminoAcid ( ii )
            self.ss [ #self.ss + 1 ] = structure.GetSecondaryStructure ( ii )
            if ii < self.segCnt then
                self.acdx [ #self.acdx + 1 ] = structure.GetDistance ( ii, ii + 1 )
            else
                self.acdx [ #self.acdx + 1 ] = 10000
            end
            self.atom [ #self.atom + 1 ] = structure.GetAtomCount ( ii )
            self.rot [ #self.rot + 1 ] = rotamer.GetCount ( ii )
            self.phobe [ #self.phobe + 1 ] = structure.IsHydrophobic ( ii )
            local lock, slck = structure.IsLocked ( ii ) -- structure.IsLocked returns backbone, sidechain
            self.lock [ #self.lock + 1 ] = lock -- backbone
            self.slck [ #self.slck + 1 ] = slck -- sidechain
            self.mute [ #self.mute + 1 ] = structure.IsMutable ( ii )
            --
            --  look it up
            --
            local aatab = AminoAcids [ self.aa [ ii ] ]
            if aatab ~= nil then
                self.ctype [ #self.ctype + 1 ] = aatab.ctype
                --
                --  even the codes 'x' or 'unk' are considered protein
                --  unless the secondary structure is "M"
                --
                --  this handles glycosylated amino acids
                --  in puzzles 879, 1378b, and similar
                --
                --  segment 134 in puzzle 879 is the example,
                --  it's no longer asparagine, but it is part of
                --  the peptide chain
                --
                if self.ss [ ii ] == self.LIGAND then
                    self.ctype [ ii ] = self.LIGAND
                end
                --
                --  other info
                --
            else
                --
                --  special case: unknown code - mark it as ligand
                --
                --  this should not occur, but just in case
                --
                self.ctype [ #self.ctype + 1 ] = self.LIGAND
                local aa = self.UNKNOWN_AA -- a known unknown
                aatab =
                    AminoAcids [ aa ]
            end
            self.short [ #self.short + 1 ] = aatab.short
            self.long [ #self.long + 1 ] = aatab.long
            self.hydrop [ #self.hydrop + 1 ] = aatab.hydrop
            self.fastac [ #self.fastac + 1 ] = aatab.code
            self.nterm [ #self.nterm + 1 ] = false
            self.cterm [ #self.cterm + 1 ] = false
            --
            --  build table of cysteines
            --
            if self.aa [ ii ] == self.CYSTEINE_AA then
                local ds = current.GetSegmentEnergySubscore ( ii, "Disulfides" )
                self.cysteine [ #self.cysteine + 1 ] = {
                    seg1 = ii,
                    seg2 = nil,
                    bonded = false,
                    score = ds,
                }
            end
        end -- end of initial scan
        --
        --  analyze cysteines, don't rely on disulfide subscore
        --
        if #self.cysteine > 0 then
            local bridges = 0
            local cstring = ""
            local cref = AminoAcids [ self.CYSTEINE_AA ].acref -- reference atom count
            for ii = 1, #self.cysteine do
                local c1 = self.cysteine [ ii ].seg1
                for jj = ii + 1, #self.cysteine do
                    local c2 = self.cysteine [ jj ].seg1
                    if c1 ~= c2 then
                        --
                        --  measure the distance between tip sulfurs
                        --
                        local a1 = 6
                        local a2 = 6
                        if self.cterm [ c1 ] then a1 = a1 + 1 end
                        if self.cterm [ c2 ] then a2 = a2 + 1 end
                        local bnd = band.AddBetweenSegments ( c1, c2, a1, a2 )
                        if bnd ~= nil and bnd ~= 0 then
                            local dst = band.GetLength ( bnd )
                            local diff = dst - 2.0
                            if math.abs ( diff ) < 0.2 then
                                --[[
                                print ( "disulfide bridge " .. c1 .. "-" .. c2 ..
                                        ", tip distance = " .. round ( dst ) )
                                ]]--
                                cstring = cstring .. c1 .. "," .. c2 .. " "
                                bridges = bridges + 1
                                self.cysteine [ ii ].bonded = true
                                self.cysteine [ ii ].seg2 = c2
                                self.cysteine [ jj ].bonded = true
                                self.cysteine [ jj ].seg2 = c1
                            end
                            band.Delete ( bnd )
                        end
                    end
                end
            end
            if bridges > 0 then
                print ( bridges .. " disulfide bridges found, segment pairs = " )
                print ( cstring )
            end
        end
        --
        --  rescan to determine first and last in chain for all types
        --  it's necessary to "peek" at neighbors for DNA and RNA
        --
        for ii = 1, self.segCnt do
            local nterm = self.nterm [ ii ]
            local cterm = self.cterm [ ii ]
            local first = false
            local last = false
            if ii == 1 then first = true end
            if ii == self.segCnt then last = true end
            --
            --  for proteins, determine n-terminal and c-terminal
            --  based on atom count
            --
            if self.ctype [ ii ] == self.PROTEIN then
                local ttyp = ""
                local notable = false -- was misspelled "noteable", leaving "notable" as a global
                local aatab = AminoAcids [ self.aa [ ii ] ]
                local ac = self.atom [ ii ] -- actual atom count
                local act = aatab.acref -- reference mid-chain atom count
                if ac ~= act or ( self.aa [ ii ] == self.CYSTEINE_AA and ac == act ) then
                    ttyp = "non-standard amino acid"
                    if ac == act + 2 then
                        ttyp = "N-terminal"
                        nterm = true
                        notable = true
                    elseif ac == act + 1 then
                        ttyp = "C-terminal"
                        cterm = true
                        notable = true
                    elseif self.aa [ ii ] == self.PROLINE_AA and ac == act + 3 then
                        ttyp = "N-terminal"
                        nterm = true
                        notable = true
                    end
                    if self.aa [ ii ] == self.CYSTEINE_AA then
                        local ds = current.GetSegmentEnergySubscore ( ii, "Disulfides" )
                        local bonded = false
                        --print ( "cysteine at " .. ii .. ", disulfides score = " .. ds )
                        if ds ~= 0 and math.abs ( ds ) > 0.01 then
                            bonded = true
                            nterm = false
                            cterm = false
                            ttyp = "disulfide bridge"
                            notable = false
                            if ac == act + 1 then
                                ttyp = "N-terminal"
                                nterm = true
                                notable = true
                            elseif ac == act then
                                ttyp = "C-terminal"
                                cterm = true
                                notable = true
                            end
                        else
                            for xx = 1, self.segCnt do
                                local aax = structure.GetAminoAcid ( xx ) -- duplicated effort!
                                if xx ~= ii and aax == self.CYSTEINE_AA then
                                    local bx = band.AddBetweenSegments ( ii, xx, 6, 6 )
                                    if bx ~= nil and bx ~= 0 then
                                        local bl = band.GetLength ( bx )
                                        if bl >= 1.9 and bl <= 2.2 then
                                            bonded = true
                                            if ac == act + 1 then
                                                ttyp = "N-terminal"
                                                nterm = true
                                                notable = true
                                            elseif ac == act then
                                                ttyp = "C-terminal"
                                                cterm = true
                                                notable = true
                                            end
                                        end
                                        band.Delete ( bx )
                                    end
                                end
                                if bonded then break end
                            end
                            if not bonded then
                                ttyp = "unpaired cysteine"
                                notable = false
                            end
                        end
                        self.cysteine [ #self.cysteine + 1 ] = { ii, bonded, {}, }
                    end
                    if notable then
                        print ( ttyp .. " detected at segment " .. ii ..
                                ", amino acid = \'" .. self.aa [ ii ] ..
                                "\', atom count = " .. ac ..
                                ", reference count = " .. act ..
                                ", secondary structure = " .. self.ss [ ii ] )
                    end
                end
            end
            --
            --  set results from first pass
            --
            self.nterm [ ii ] = nterm
            self.cterm [ ii ] = cterm
            --
            --  time for some distance magic
            --
            if not nterm then
                if ii == 1 then
                    nterm = true
                    self.nterm [ ii ] = nterm
                else
                    if self.acdx [ ii - 1 ] > self.ACRF then
                        nterm = true
                        self.nterm [ ii ] = nterm
                    end
                end
            end
            if not cterm then
                if self.acdx [ ii ] > self.ACRF then
                    cterm = true
                    self.cterm [ ii ] = cterm
                end
            end
            --
            --  second-guessing
            --
            if self.ctype [ ii ] == self.PROTEIN then
                if self.nterm [ ii ] then first = true end
                if self.cterm [ ii ] then last = true end
                --
                --  kludge for cases where binder target doesn't
                --  have an identifiable C terminal
                --
                --  note: inside this branch the test below is always true,
                --  so "last" is set for every protein segment before the
                --  final one; getChains keeps the most recent "last" it
                --  sees before the next "first" as the end of the chain
                --
                if ii < self.segCnt then
                    if self.ctype [ ii ] == self.PROTEIN
                    or ( self.ctype [ ii ] == self.PROTEIN and self.nterm [ ii + 1 ] ) then
                        last = true
                    end
                end
                --
                --  special cases for puzzles 879, 1378b, and similar
                --
                --  if modified AA ends or begins a chain, mark
                --  it as C-terminal or N-terminal
                --
                --  hypothetical: no way to test so far!
                --
                --  the table is referenced by field name (.code); the old
                --  AACODE index is undefined, so this test never matched
                --
                if AminoAcids [ self.aa [ ii ] ].code == self.UNKNOWN_AA then
                    if ii > 1 then
                        --
                        --  if previous segment was not protein, this segment is an n-terminal
                        --
                        if self.ctype [ ii - 1 ] ~= self.ctype [ ii ] then
                            first = true
                            self.nterm [ ii ] = true
                            print ( "non-standard amino acid at segment " .. ii ..
                                    " marked as N-terminal (previous segment non-protein)" )
                        --
                        --  if previous segment is protein and a c-terminal,
                        --  this segment is an n-terminal
                        --
                        elseif ii > 1 and self.cterm [ ii - 1 ] then
                            first = true
                            self.nterm [ ii ] = true
                            print ( "non-standard amino acid at segment " .. ii ..
                                    " marked as N-terminal (previous segment C-terminal)" )
                        end
                    end
                    if ii < self.segCnt then
                        --
                        --  if next segment is not protein, this is a c-terminal
                        --
                        if self.ctype [ ii + 1 ] ~= self.ctype [ ii ] then
                            last = true
                            self.cterm [ ii ] = true
                            print ( "non-standard amino acid at segment " .. ii ..
                                    " marked as C-terminal (next segment non-protein)" )
                        --
                        --  if next segment is protein and an n-terminal,
                        --  this segment is a c-terminal
                        --
                        elseif ii < self.segCnt and self.nterm [ ii + 1 ] then
                            last = true
                            self.cterm [ ii ] = true
                            print ( "non-standard amino acid at segment " .. ii ..
                                    " marked as C-terminal (next segment N-terminal)" )
                        end
                    end
                end
            elseif self.ctype [ ii ] == self.DNA or self.ctype [ ii ] == self.RNA then
                if ii > 1 and self.ctype [ ii - 1 ] ~= self.ctype [ ii ] then
                    first = true
                end
                if ii < self.segCnt and self.ctype [ ii + 1 ] ~= self.ctype [ ii ] then
                    last = true
                end
            else -- ligand
                first = true
                last = true
            end
            self.first [ #self.first + 1 ] = first
            self.last [ #self.last + 1 ] = last
        end
        --
        --  summarize the chain info
        --
        self.chains = self:getChains ()
        --
        --  get the ligand info
        --
        self.ligands = self:getLigands ()
    end,
}
--
--  end protNfo Beta package version 0.5
--

--
--  function print score by spvincent
--
function round ( x )
    if x == nil then
        return "nil"
    end
    return x - x % PPX.kFax
end

SLT = { -- SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT--
    --[[
        SLT - Segment set, list, and type module v0.5.pp (special for print protein)

        Includes the segment set and list module and the segment type module
        developed by Timo van der Laan.

        The following Foldit recipes contain the original code for these modules:

          * Tvdl enhanced DRW 3.1.1 - https://fold.it/portal/recipe/102840
          * TvdL DRemixW 3.1.2 - https://fold.it/portal/recipe/102398

        The "set and list" module performs logical operations and transformations
        on tables containing ranges of segments.

        The segment type module finds lists and sets of segments with various
        properties, such as selected or frozen.

        A "list" is a one-dimensional table containing segment numbers.

        A "set" is a two-dimensional table containing segment number ranges.

        For example, given a list of segments:

            list = { 1, 2, 3, 7, 8, 11, 13, 14, 15 }

        the corresponding set is:

            set = { { 1, 3 }, { 7, 8 }, { 11, 11 }, { 13, 15 } }

        Most functions assume that the sets are well-formed, meaning they are
        ordered and have no overlaps.

        As an example, the method FindUnlocked returns a set of all the unlocked
        segments in a puzzle.
The method can be called as follows: funlocked = SLT:FindUnlocked () The return value funlocked is a two-dimensional table containing ranges of unlocked segments. In source format, the table might look like this: funlocked = { { 27, 35, }, { 47, 62, }, { 78, 89, }, } The code to use this table would look like: -- -- for each range of segments -- for ii = 1, #funlocked do -- -- for each segment in the range, so something -- for jj = funlocked [ ii ] [ 1 ], funlocked [ ii ] [ 2 ] do ... something ... end end This psuedo-module is a table containing a mix of data fields and methods. This wiki article explains the packaging technique: https://foldit.fandom.com/wiki/Lua_packaging_for_Foldit Authorship ---------- Original by Timo van der Laan: 02-05-2012 TvdL Free to use for non commercial purposes French comments by Bruno Kestemont and perhaps others. v0.1 - LociOiling + extract and reformat code v0.2 - LociOiling - 2017/11/03 + add primary FindUnlocked function v0.3 - LociOiling + add FindRotamers function v0.4 - LociOiling - 2019/10/29 + package as table + remove dependencies on segCnt and segCnt2 v0.5 - LociOiling - 2019/12/17 + convert functions to methods, update internal references v0.5.pp - LociOiling - 2020/03/16 + special version for print protein ]]-- -- -- variables -- segCnt = nil, -- segment count, not adjusted for ligands segCnt2 = nil, -- segment count, not including terminal ligands -- -- initializer - can be called externally, but invoked inline if segCnt or segCnt2 are nil -- Init = function ( self ) self.segCnt = structure.GetCount () self.segCnt2 = self.segCnt while structure.GetSecondaryStructure ( self.segCnt2 ) == "M" do self.segCnt2 = self.segCnt2 - 1 end end, -- -- segment set and list functions -- SegmentListToSet = function ( self, list ) -- retirer doublons local result = {} local ff = 0 local ll = -1 table.sort ( list ) for ii = 1, #list do if list [ ii ] ~= ll + 1 and list [ ii ] ~= ll then -- note: duplicates are removed if ll > 0 then 
result [ #result + 1 ] = { ff, ll } end ff = list [ ii ] end ll = list [ ii ] end if ll > 0 then result [ #result + 1 ] = { ff, ll } end return result end, SegmentSetToList = function ( self, set ) -- faire une liste a partir d'une zone local result = {} for ii = 1, #set do for kk = set [ ii ] [ 1 ], set [ ii ] [ 2 ] do result [ #result + 1 ] = kk end end return result end, SegmentCleanSet = function ( self, set ) -- Makes it well formed return self:SegmentListToSet ( self:SegmentSetToList ( set ) ) end, SegmentInvertSet = function ( self, set, maxseg ) -- -- Gives back all segments not in the set -- maxseg is added for ligand -- local result={} if maxseg == nil then maxseg = structure.GetCount () end if #set == 0 then return { { 1, maxseg } } end if set [ 1 ] [ 1 ] ~= 1 then result [ 1 ] = { 1, set [ 1 ] [ 1 ] - 1 } end for ii = 2, #set do result [ #result + 1 ] = { set [ ii - 1 ] [ 2 ] + 1, set [ ii ] [ 1 ] - 1, } end if set [ #set ] [ 2 ] ~= maxseg then result [ #result + 1 ] = { set [ #set ] [ 2 ] + 1, maxseg } end return result end, SegmentInvertList = function ( self, list ) if self.segCnt2 == nil then self:Init () end table.sort ( list ) local result = {} for ii = 1, #list - 1 do for jj = list [ ii ] + 1, list [ ii + 1 ] - 1 do result [ #result + 1 ] = jj end end for jj = list [ #list ] + 1, self.segCnt2 do result [ #result + 1 ] = jj end return result end, SegmentInList = function ( self, seg, list ) -- verifier si segment est dans la liste table.sort ( list ) for ii = 1, #list do if list [ ii ] == seg then return true elseif list [ ii ] > seg then return false end end return false end, SegmentInSet = function ( self, set, seg ) --verifie si segment est dans la zone for ii = 1, #set do if seg >= set [ ii ] [ 1 ] and seg <= set [ ii ] [ 2 ] then return true elseif seg < set [ ii ] [ 1 ] then return false end end return false end, SegmentJoinList = function ( self, list1, list2 ) -- fusionner 2 listes de segments local result = list1 if result == nil then 
return list2 end for ii = 1, #list2 do result [ #result + 1 ] = list2 [ ii ] end table.sort ( result ) return result end, SegmentJoinSet = function ( self, set1, set2 ) --fusionner (ajouter) 2 zones return self:SegmentListToSet ( self:SegmentJoinList ( self:SegmentSetToList ( set1 ), self:SegmentSetToList ( set2 ) ) ) end, SegmentCommList = function ( self, list1, list2 ) -- chercher intersection de 2 listes local result = {} table.sort ( list1 ) table.sort ( list2 ) if #list2 == 0 then return result end local jj = 1 for ii = 1, #list1 do while list2 [ jj ] < list1 [ ii ] do jj = jj + 1 if jj > #list2 then return result end end if list1 [ ii ] == list2 [ jj ] then result [ #result + 1 ] = list1 [ ii ] end end return result end, SegmentCommSet = function ( self, set1, set2 ) -- intersection de 2 zones return self:SegmentListToSet ( self:SegmentCommList ( self:SegmentSetToList ( set1 ), self:SegmentSetToList ( set2 ) ) ) end, SegmentSetMinus = function ( self, set1, set2 ) return self:SegmentCommSet ( set1, self:SegmentInvertSet ( set2 ) ) end, SegmentPrintSet = function ( self, set ) print ( self:SegmentSetToString ( set ) ) end, SegmentSetToString = function ( self, set ) -- pour pouvoir imprimer local line = "" for ii = 1, #set do if ii ~= 1 then line = line .. ", " end line = line .. set [ ii ] [ 1 ] .. "-" .. 
                set [ ii ] [ 2 ]
        end
        return line
    end,
    SegmentSetInSet = function ( self, set, sub )
        if sub == nil then
            return true
        end
        --
        -- Checks if sub is a subset of set
        --
        for ii = 1, #sub do
            if not self:SegmentRangeInSet ( set, sub [ ii ] ) then
                return false
            end
        end
        return true
    end,
    SegmentRangeInSet = function ( self, set, range ) -- check whether the range is in the set
        if range == nil or #range == 0 then
            return true
        end
        local bb = range [ 1 ]
        local ee = range [ 2 ]
        for ii = 1, #set do
            if bb >= set [ ii ] [ 1 ] and bb <= set [ ii ] [ 2 ] then
                return ( ee <= set [ ii ] [ 2 ] )
            elseif ee <= set [ ii ] [ 1 ] then
                return false
            end
        end
        return false
    end,
    SegmentSetToBool = function ( self, set ) -- true or false for each segment, usable or not
        local result = {}
        for ii = 1, structure.GetCount () do
            result [ ii ] = self:SegmentInSet ( set, ii )
        end
        return result
    end,
    --
    -- End of Segment Set module
    --
    --
    -- Module Find Segment Types
    --
    FindMutablesList = function ( self )
        if self.segCnt2 == nil then
            self:Init ()
        end
        local result = {}
        for ii = 1, self.segCnt2 do
            if structure.IsMutable ( ii ) then
                result [ #result + 1 ] = ii
            end
        end
        return result
    end,
    FindMutables = function ( self )
        return self:SegmentListToSet ( self:FindMutablesList () )
    end,
    FindFrozenList = function ( self )
        if self.segCnt2 == nil then
            self:Init ()
        end
        local result = {}
        for ii = 1, self.segCnt2 do
            if freeze.IsFrozen ( ii ) then
                result [ #result + 1 ] = ii
            end
        end
        return result
    end,
    FindFrozen = function ( self )
        return self:SegmentListToSet ( self:FindFrozenList () )
    end,
    FindLockedList = function ( self )
        if self.segCnt2 == nil then
            self:Init ()
        end
        local result = {}
        for ii = 1, self.segCnt2 do
            if structure.IsLocked ( ii ) then
                result [ #result + 1 ] = ii
            end
        end
        return result
    end,
    FindLocked = function ( self )
        return self:SegmentListToSet ( self:FindLockedList () )
    end,
    FindUnlockedList = function ( self )
        if self.segCnt2 == nil then
            self:Init ()
        end
        local result = {}
        for ii = 1, self.segCnt2 do
            if not
structure.IsLocked ( ii ) then result [ #result + 1 ] = ii end end return result end, FindUnlocked = function ( self ) return self:SegmentListToSet ( self:FindUnlockedList () ) end, FindSLockedList = function ( self ) local result = {} for ii = 1, self.segCnt do -- -- special mod for print protein: use slck -- if protNfo.slck [ ii ] then result [ #result + 1 ] = ii end end return result end, FindSLocked = function ( self ) return SLT:SegmentListToSet ( SLT:FindSLockedList () ) end, FindZeroScoreList = function ( self ) if self.segCnt == nil then self:Init () end local result = {} for ii = 1, self.segCnt do local sub = 0 for jj = 1, #subScores do -- sub = sub + current.GetSegmentEnergySubscore ( ii, subScores [ jj ] [ 1 ] ) -- -- special mod for print protein: use subScoreCache -- sub = sub + subScoreCache [ subScores [ jj ] [ 1 ] ] [ ii ] end if sub == 0 then result [ #result + 1 ] = ii end end return result end, FindZeroScore = function ( self ) return self:SegmentListToSet ( self:FindZeroScoreList () ) end, FindRotamersList = function ( self ) if self.segCnt == nil then self:Init () end local result = {} for ii = 1, self.segCnt do local rots = rotamer.GetCount ( ii ) if rots > 1 then result [ #result + 1 ] = ii end end return result end, FindRotamers = function ( self ) return self:SegmentListToSet ( self:FindRotamersList () ) end, FindSelectedList = function ( self ) if self.segCnt == nil then self:Init () end local result = {} for ii = 1, self.segCnt do if selection.IsSelected ( ii ) then result [ #result + 1 ] = ii end end return result end, FindSelected = function ( self ) return self:SegmentListToSet ( self:FindSelectedList () ) end, FindAAtypeList = function ( self, aa ) if self.segCnt2 == nil then self:Init () end local result = {} for ii = 1, self.segCnt2 do if structure.GetSecondaryStructure ( ii ) == aa then result [ #result + 1 ] = ii end end return result end, FindAAtype = function ( self, aa ) return self:SegmentListToSet ( self:FindAAtypeList ( aa ) 
) end, FindAminotype = function ( self, at ) --NOTE: only this one gives a list not a set if self.segCnt2 == nil then self:Init () end local result={} for ii = 1, self.segCnt2 do if structure.GetAminoAcid ( ii ) == at then result [ #result + 1 ] = ii end end return result end, }-- SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT--SLT-- function divline () print ( "========================================" ) end function makeruler ( sBeg, sEnd, title, inverted, last ) if title == nil then title = "" end if inverted == nil then inverted = false end if last == nil then last = false end local function tenpart ( ff ) if ff >= 100 then ff = ff % 100 end return ( ff - ( ff % 10 ) ) / 10 end local function hunpart ( ff ) if ff >= 1000 then ff = ff % 1000 end return ( ff - ( ff % 100 ) ) / 100 end local onez = "" local tenz = "" local hunz = "" local numz = sBeg % 10 for ii = sBeg, sEnd do onez = onez .. numz % 10 if ii % 10 == 0 then tenz = tenz .. tenpart ( ii ) if ii >= 100 then hunz = hunz .. hunpart ( ii ) else hunz = hunz .. " " end else if ii == sBeg and ii > 1 then tenz = tenz .. tenpart ( ii ) hunz = hunz .. hunpart ( ii ) else tenz = tenz .. " " hunz = hunz .. " " end end numz = numz + 1 if numz > 10 then numz = 1 end end if last then tenz = tenz:sub ( 1, tenz:len () - 1 ) .. tenpart ( sEnd ) hunz = hunz:sub ( 1, hunz:len () - 1 ) .. hunpart ( sEnd ) end local ruler = "" if not inverted and sEnd >= 100 then ruler = ruler .. hunz .. "\n" end if not inverted and sEnd >= 10 then ruler = ruler .. tenz .. "\n" end ruler = ruler .. onez ruler = ruler .. " " .. title .. " " .. sBeg .. "-" .. sEnd if inverted then ruler = ruler .. "\n" end if inverted and sEnd >= 10 then ruler = ruler .. tenz .. "\n" end if inverted and sEnd >= 100 then ruler = ruler .. 
hunz end return ruler end function linotype ( line, seg ) -- -- pp 2.9 - new optional "seg" for Foldit segment #, -- adds second ruler -- local title1 = "segment/residue" local title2 = "segment" if seg == nil then seg = 1 end local invert = false if seg ~= 1 then invert = true title1 = "residue" end local len = line:len () local maxL = math.min ( len, PPX.jMaxL ) if line:len () > maxL then print ( line ) print ( "" ) end for ii = 1, len, maxL do local lastseg = math.min ( ii + PPX.jMaxL - 1, len ) local linelen = lastseg - ii + 1 local last = false if lastseg >= len then last = true end print ( makeruler ( ii, lastseg, title1, false, last ) ) if invert then print ( "" ) end print ( line:sub ( ii, lastseg ) ) if invert then print ( "" ) end if invert then print ( makeruler ( seg, seg + linelen - 1, title2, true, last ) ) seg = seg + maxL end print ( "" ) end end -- -- tlaloc functions to print sequence of letter -- function BuildSequence ( start, stop ) local seqstring = "" local hydrostring = "" local strucstring = "" local stypestring = "" local lockstring = "" local slockstring = "" local nonp = 0 local locks = 0 local slocks = 0 for ii = start, stop do local aac = protNfo.fastac [ ii ] if protNfo.ctype [ ii ] ~= protNfo.PROTEIN then nonp = nonp + 1 end seqstring = seqstring .. aac if protNfo.phobe [ ii ] then hydrostring = hydrostring .. "i" else hydrostring = hydrostring .. "e" end strucstring = strucstring .. protNfo.ss [ ii ] stypestring = stypestring .. protNfo.ctype [ ii ] local locked = "U" if protNfo.lock [ ii ] then locked = "L" locks = locks + 1 end lockstring = lockstring .. locked local slocked = "U" if protNfo.slck [ ii ] then slocked = "L" slocks = slocks + 1 end slockstring = slockstring .. 
            slocked
    end
    --
    -- return values
    --
    -- primary structure
    -- hydro
    -- secondary structure
    -- type
    -- lock
    -- sidechain lock
    -- #non-protein segments
    -- #locked segments
    -- #locked sidechains
    --
    return seqstring, hydrostring, strucstring, stypestring, lockstring, slockstring, nonp, locks, slocks
end
--
-- find modifiable sections
--
-- pp 2.9 - use Lua table format, report segment ranges
--
function FindModifiable ()
    local function prtrange ( rname, range )
        if range == nil or #range == 0 then
            return
        end
        print ( "--" )
        print ( #range .. " " .. rname .. " sections" )
        rname = rname:gsub ( " ", "_" )
        rname = rname:gsub ( ",", "" )
        print ( rname .. " = {" )
        for kk = 1, #range do
            print ( " { " .. range [ kk ] [ 1 ] .. ", " .. range [ kk ] [ 2 ] .. ", }," )
        end
        print ( "}" )
    end
    --
    local flocked = SLT:FindLocked ()
    prtrange ( "locked", flocked )
    --
    local funlocked = SLT:SegmentInvertSet ( flocked )
    prtrange ( "unlocked", funlocked )
    --
    local fslocked = SLT:FindSLocked ()
    prtrange ( "locked sidechain", fslocked )
    --
    local fsunlocked = SLT:SegmentInvertSet ( fslocked )
    prtrange ( "unlocked sidechain", fsunlocked )
    --
    local ulckblks = SLT:SegmentCommSet ( funlocked, fslocked )
    prtrange ( "unlocked backbone, locked sidechain", ulckblks )
    --
    local zeroScore = SLT:FindZeroScore ()
    prtrange ( "zero score", zeroScore )
    --
    if #flocked == 1 and #zeroScore > 0 then
        local LockedNZ = SLT:SegmentInvertSet ( zeroScore, flocked [ 1 ] [ 2 ] )
        prtrange ( "locked, non-zero score", LockedNZ )
    end
    --
    local mutables = SLT:FindMutables ()
    prtrange ( "mutable", mutables )
    --
    local lockmut = SLT:SegmentCommSet ( flocked, mutables )
    prtrange ( "locked, mutable", lockmut )
end
--
-- find mutable segments
--
function FindMutable ( )
    local mutable = {}
    local mutablestring = ''
    for ii = 1, protNfo.segCnt2 do
        if protNfo.mute [ ii ] == true then
            mutable [ #mutable + 1 ] = ii
        end
    end
    print ( #mutable .. " mutables found" )
    if #mutable > 0 then
        print ( "n" .. PPX.delim .. "segment" .. PPX.delim .. "aacode" .. PPX.delim .. "aaname" )
    end
    for ii = 1, #mutable do
        print ( ii .. PPX.delim .. mutable [ ii ] .. PPX.delim .. protNfo.aa [ mutable [ ii ] ] .. PPX.delim .. protNfo.long [ mutable [ ii ] ] )
        mutablestring = mutablestring .. "'" .. protNfo.aa [ mutable [ ii ] ] .. "',"
    end
    print ( mutablestring ) --- for copy paste on other recipe
    return mutable
end
--
-- function FindActiveSubscores adapted from EDRW by Timo van der Laan
--
-- This function should be called first
-- it saves segment scores and subscores
-- in segScoreCache and subScoreCache.
--
function FindActiveSubscores ()
    local result = {}
    local gTotal = 0 -- grand total all subscores
    local gTotalS = 0 -- grand total all segment scores
    local Subs = puzzle.GetPuzzleSubscoreNames ()
    local soot = ""
    for ii = 1, #Subs do
        subScoreCache [ Subs [ ii ] ] = {} -- save for later
        local total = 0
        local abstotal = 0
        local part
        for jj = 1, protNfo.segCnt do
            part = current.GetSegmentEnergySubscore ( jj, Subs [ ii ] )
            subScoreCache [ Subs [ ii ] ] [ jj ] = part
            total = total + part
            abstotal = abstotal + math.abs ( part )
        end
        if abstotal > 10 then
            result [ #result + 1 ] = { Subs [ ii ], total, true }
            gTotal = gTotal + total
            soot = soot .. "#" .. #result .. ": " .. Subs [ ii ] .. ", total = " .. round ( total ) .. "\n"
        end
    end
    for ii = 1, protNfo.segCnt do
        segScoreCache [ #segScoreCache + 1 ] = current.GetSegmentEnergyScore ( ii )
        gTotalS = gTotalS + segScoreCache [ ii ]
    end
    --
    -- create pseudo scoreparts
    --
    -- in effect:
    --
    -- Subs [ #Subs + 1 ] = "Subtotal" -- total of all scoreparts per segment
    -- Subs [ #Subs + 1 ] = "Unknown" -- overall segment score minus total of all scoreparts
    --
    -- but nothing actually added to Subs table, just subScoreCache
    --
    subScoreCache [ "Subtotal" ] = {}
    subScoreCache [ "Unknown" ] = {}
    local gstot = 0
    local gunk = 0
    for jj = 1, protNfo.segCnt do
        local stot = 0
        local unk = 0
        for ii = 1, #Subs do
            stot = stot + subScoreCache [ Subs [ ii ] ] [ jj ]
        end
        subScoreCache [ "Subtotal" ] [ jj ] = stot
        gstot = gstot + stot
        unk = segScoreCache [ jj ] - stot
        subScoreCache [ "Unknown" ] [ jj ] = unk
        gunk = gunk + unk
    end
    result [ #result + 1 ] = { "Subtotal", gstot, true }
    result [ #result + 1 ] = { "Unknown", gunk, true }
    divline ()
    print ( #result .. " active subscores" )
    print ( soot )
    print ( "total of all subscores: " .. round ( gTotal ) )
    print ( "total of all segment scores: " .. round ( gTotalS ) )
    if ( round ( gTotal ) ~= round ( gTotalS ) ) then
        print ( "WARNING: total subscores " .. round ( gTotal ) .. " not equal total segment scores " .. round ( gTotalS ) )
    end
    return result, gTotal, gTotalS
end
function FindActiveFilters ()
    filter.EnableAll()
    local fnames = filter.GetNames ()
    print ( #fnames .. " conditions or filters" )
    local tbonus = 0
    for ii = 1, #fnames do
        local hasbonus = filter.HasBonus ( fnames [ ii ] )
        local bonus = nil -- local, so it no longer leaks into the global scope
        if hasbonus then
            bonus = filter.GetBonus ( fnames [ ii ] )
            tbonus = tbonus + bonus
        end
        local satisfied = filter.ConditionSatisfied ( fnames [ ii ] )
        local oot = "#" .. ii .. ": " .. fnames [ ii ] .. ", satisfied = " .. tostring ( satisfied ) .. ", has bonus = " .. tostring ( hasbonus )
        if hasbonus then
            oot = oot .. ", bonus = " .. round ( bonus )
        end
        print ( oot )
    end
    print ( "total of individual filter bonuses = " .. round ( tbonus ) )
end
-- print segment scores in spreadsheet format
--
-- subScores - selected subscores
-- nonprot - number of non-protein segments
-- locked - number of locked segments
-- slocked - number of locked sidechains
-- chains - number of chains
--
function SegScores ( subScores, nonprot, locked, slocked, chains )
    local tReport = ""
    local headStr = "seg" .. PPX.delim
    if chains > 1 then
        headStr = headStr .. "chain" .. PPX.delim
        headStr = headStr .. "residue" .. PPX.delim
    end
    headStr = headStr .. "ID" .. PPX.delim .. "SS" .. PPX.delim
    if nonprot > 0 then
        headStr = headStr .. "type" .. PPX.delim
    end
    if PPX.kLongnm then
        headStr = headStr .. "name" .. PPX.delim
    end
    if PPX.kAbbrev then
        headStr = headStr .. "abbrev" .. PPX.delim
    end
    if locked > 0 or slocked > 0 then
        headStr = headStr .. "bb_lock" .. PPX.delim
        headStr = headStr .. "sc_lock" .. PPX.delim
    end
    if PPX.kHydro then
        headStr = headStr .. "Hyd" .. PPX.delim
    end
    if PPX.kAtom then
        headStr = headStr .. "atoms" .. PPX.delim
    end
    if PPX.kRotamer then
        headStr = headStr .. "rotamers" .. PPX.delim
    end
    headStr = headStr .. "score" .. PPX.delim
    for ii = 1, #subScores do
        if subScores [ ii ] [ 3 ] then
            headStr = headStr .. subScores [ ii ] [ 1 ] .. PPX.delim
        end
    end
    divline ()
    print ( "\"segment scores\"" )
    print ( headStr )
    tReport = tReport .. "\"segment scores\"\n" .. headStr .. "\n"
    local tSegEnergy = 0
    for ii = 1, protNfo.segCnt do
        local segEnergy = segScoreCache [ ii ]
        tSegEnergy = tSegEnergy + segEnergy
        local segScore = ii .. PPX.delim
        if chains > 1 then
            segScore = segScore .. protNfo.chainid [ ii ] .. PPX.delim
            segScore = segScore .. protNfo.chainpos [ ii ] .. PPX.delim
        end
        segScore = segScore .. protNfo.fastac [ ii ] .. PPX.delim .. protNfo.ss [ ii ] .. PPX.delim
        if nonprot > 0 then
            segScore = segScore .. protNfo.ctype [ ii ] .. PPX.delim
        end
        if PPX.kLongnm then
            segScore = segScore .. protNfo.long [ ii ] ..
                PPX.delim
        end
        if PPX.kAbbrev then
            segScore = segScore .. protNfo.short [ ii ] .. PPX.delim
        end
        if locked > 0 or slocked > 0 then
            local ll = "U"
            if protNfo.lock [ ii ] then
                ll = "L"
            end
            segScore = segScore .. ll .. PPX.delim
            if protNfo.slck [ ii ] then
                ll = "L"
            else
                ll = "U"
            end
            segScore = segScore .. ll .. PPX.delim
        end
        if PPX.kHydro then
            segScore = segScore .. round ( protNfo.hydrop [ ii ] ) .. PPX.delim
        end
        if PPX.kAtom then
            segScore = segScore .. protNfo.atom [ ii ] .. PPX.delim
        end
        if PPX.kRotamer then
            segScore = segScore .. protNfo.rot [ ii ] .. PPX.delim
        end
        segScore = segScore .. round ( segEnergy ) .. PPX.delim
        for jj = 1, #subScores do
            if subScores [ jj ] [ 3 ] then
                segScore = segScore .. round ( subScoreCache [ subScores [ jj ] [ 1 ] ] [ ii ] ) .. PPX.delim
            end
        end
        print ( segScore )
        tReport = tReport .. segScore .. "\n"
    end
    local footstr = "totals" .. PPX.delim
    if chains > 1 then
        footstr = footstr .. PPX.delim .. PPX.delim -- no totals for chain and residue
    end
    footstr = footstr .. "" .. PPX.delim .. "" .. PPX.delim
    if nonprot > 0 then
        footstr = footstr .. PPX.delim -- no totals for type
    end
    if PPX.kLongnm then
        footstr = footstr .. PPX.delim -- no totals for long name
    end
    if PPX.kAbbrev then
        footstr = footstr .. PPX.delim -- no totals for abbreviation
    end
    if locked > 0 or slocked > 0 then
        footstr = footstr .. PPX.delim .. PPX.delim -- no totals for backbone, sidechain locked
    end
    if PPX.kHydro then
        footstr = footstr .. PPX.delim -- no totals for hydropathy
    end
    if PPX.kAtom then
        footstr = footstr .. PPX.delim -- no totals for atoms
    end
    if PPX.kRotamer then
        footstr = footstr .. PPX.delim -- no totals for rotamers
    end
    footstr = footstr .. round ( tSegEnergy ) .. PPX.delim
    for ii = 1, #subScores do
        if subScores [ ii ] [ 3 ] then
            footstr = footstr .. round ( subScores [ ii ] [ 2 ] ) .. PPX.delim
        end
    end
    print ( footstr )
    tReport = tReport .. footstr .. "\n" .. "\"end of report\"\n"
    return tReport
end
--
-- print density analysis
--
function DensityRat ( )
    local tReport = ""
    local headStr = "\"AA code\"" .. PPX.delim ..
                    "\"AA name\"" .. PPX.delim ..
                    "\"segment count\"" .. PPX.delim ..
                    "\"total density\"" .. PPX.delim ..
                    "\"% total density\"" .. PPX.delim ..
                    "\"mean density\"" .. PPX.delim ..
                    "\"worst density\"" .. PPX.delim ..
                    "\"worst density seg\"" .. PPX.delim ..
                    "\"best density\"" .. PPX.delim ..
                    "\"best density seg\"" .. PPX.delim
    --
    -- density by amino acid
    --
    local tAA = {}
    --
    -- binary (true/false) tables for density by AA type
    --
    --
    -- density of aromatics - true => aromatic
    --
    local tAromatic = {}
    tAromatic [ true ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    tAromatic [ false ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    --
    -- density of aliphatics - true => aliphatic
    --
    local tAliphatic = {}
    tAliphatic [ true ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    tAliphatic [ false ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    --
    -- density of hydrophobics - true => hydrophobic
    --
    local tHydrophobic = {}
    tHydrophobic [ true ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    tHydrophobic [ false ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    --
    -- density of helixes - true => helix
    --
    local tHelix = {}
    tHelix [ true ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    tHelix [ false ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    --
    -- density of sheets - true => sheets
    --
    local tSheet = {}
    tSheet [ true ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
    tSheet [ false ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, }
local function denUpdate ( tDen, segDensity, segindx ) tDen.count = tDen.count + 1 tDen.total = tDen.total + math.abs ( segDensity ) if segDensity > tDen.best then tDen.best = segDensity tDen.besty = segindx end if segDensity < tDen.worst then tDen.worst = segDensity tDen.worsty = segindx end end local tSegDensity = 0 for ii = 1, protNfo.segCnt do local aaCode = protNfo.aa [ ii ] if tAA [ aaCode ] == nil then tAA [ aaCode ] = { count = 0, total = 0, mean = 0, best = -999, besty = 0, worst = 999, worsty = 0, } end -- -- update table of density by amino acid -- local segDensity = subScoreCache [ "Density" ] [ ii ] tSegDensity = tSegDensity + math.abs ( segDensity ) local aaDen = tAA [ aaCode ] if aaDen ~= nil then denUpdate ( aaDen, segDensity, ii ) else print ( "ERROR: AA density table entry for " .. aaCode .. " is nil" ) end -- -- update table of density by aromatic vs. non-aromatic -- local aromDen = tAromatic [ Aromatic [ aaCode ] ~= nil ] if aromDen ~= nil then denUpdate ( aromDen, segDensity, ii ) else print ( "ERROR: Aromatic density table entry for " .. aaCode .. " is nil" ) end -- -- update table of density by aliphatic vs. non-aliphatic -- local alipDen = tAliphatic [ Aliphatic [ aaCode ] ~= nil ] if alipDen ~= nil then denUpdate ( alipDen, segDensity, ii ) else print ( "ERROR: Aliphatic density table entry for " .. aaCode .. " is nil" ) end -- -- update table of density by hydrophobic vs. non-hydrophobic -- local phobDen = tHydrophobic [ Hydrophobic [ aaCode ] ~= nil ] if phobDen ~= nil then denUpdate ( phobDen, segDensity, ii ) else print ( "ERROR: hydrophobic density table entry for " .. aaCode .. 
" is nil" ) end -- -- update table of density by helix -- local helixDen = tHelix [ protNfo.ss [ ii ] == protNfo.HELIX ] if helixDen ~= nil then denUpdate ( helixDen, segDensity, ii ) end -- -- update table of density by sheet -- local sheetDen = tSheet [ protNfo.ss [ ii ] == protNfo.SHEET ] if sheetDen ~= nil then denUpdate ( sheetDen, segDensity, ii ) end end divline () print ( "\"density by AA\"" ) print ( headStr ) tReport = tReport .. "\"density by AA\"\n" .. headStr .. "\n" for aac, aaDen in pairs ( tAA ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = aac .. PPX.delim .. AminoAcids [ aac ].long .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end local footstr = "totals" .. PPX.delim .. protNfo.segCnt .. PPX.delim .. round ( tSegDensity ) .. PPX.delim .. PPX.delim .. round ( tSegDensity / protNfo.segCnt ) print ( footstr ) tReport = tReport .. footstr .. "\n" headStr = "\"aromatic AA\"" .. PPX.delim .. "" .. PPX.delim .. "\"segment count\"" .. PPX.delim .. "\"total density\"" .. PPX.delim .. "\"% total density\"" .. PPX.delim .. "\"mean density\"" .. PPX.delim .. "\"worst density\"" .. PPX.delim .. "\"worst density seg\"" .. PPX.delim .. "\"best density\"" .. PPX.delim .. "\"best density seg\"" .. PPX.delim divline () print ( "\"density by aromatic vs. non-aromatic\"" ) print ( headStr ) tReport = tReport .. "\"density by aromatic vs. non-aromatic\"\n" .. headStr .. "\n" for aac, aaDen in pairs ( tAromatic ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = tostring ( aac ) .. PPX.delim .. "" .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. 
round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end headStr = "\"aliphatic AA\"" .. PPX.delim .. "" .. PPX.delim .. "\"segment count\"" .. PPX.delim .. "\"total density\"" .. PPX.delim .. "\"% total density\"" .. PPX.delim .. "\"mean density\"" .. PPX.delim .. "\"worst density\"" .. PPX.delim .. "\"worst density seg\"" .. PPX.delim .. "\"best density\"" .. PPX.delim .. "\"best density seg\"" .. PPX.delim divline () print ( "\"density by aliphatic vs. non-aliphatic\"" ) print ( headStr ) tReport = tReport .. "\"density by aliphatic vs. non-aliphatic\"\n" .. headStr .. "\n" for aac, aaDen in pairs ( tAliphatic ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = tostring ( aac ) .. PPX.delim .. "" .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end headStr = "\"hydrophobic AA\"" .. PPX.delim .. "" .. PPX.delim .. "\"segment count\"" .. PPX.delim .. "\"total density\"" .. PPX.delim .. "\"% total density\"" .. PPX.delim .. "\"mean density\"" .. PPX.delim .. "\"worst density\"" .. PPX.delim .. "\"worst density seg\"" .. PPX.delim .. "\"best density\"" .. PPX.delim .. "\"best density seg\"" .. PPX.delim divline () print ( "\"density by hydrophobic vs. non-hydrophobic\"" ) print ( headStr ) tReport = tReport .. "\"density by hydrophobic vs. non-hydrophobic\"\n" .. headStr .. 
"\n" for aac, aaDen in pairs ( tHydrophobic ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = tostring ( aac ) .. PPX.delim .. "" .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end headStr = "\"helix\"" .. PPX.delim .. "" .. PPX.delim .. "\"segment count\"" .. PPX.delim .. "\"total density\"" .. PPX.delim .. "\"% total density\"" .. PPX.delim .. "\"mean density\"" .. PPX.delim .. "\"worst density\"" .. PPX.delim .. "\"worst density seg\"" .. PPX.delim .. "\"best density\"" .. PPX.delim .. "\"best density seg\"" .. PPX.delim divline () print ( "\"density by helix vs. non-helix\"" ) print ( headStr ) tReport = tReport .. "\"density by helix vs. non-helix\"\n" .. headStr .. "\n" for aac, aaDen in pairs ( tHelix ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = tostring ( aac ) .. PPX.delim .. "" .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end headStr = "\"sheet\"" .. PPX.delim .. "" .. PPX.delim .. "\"segment count\"" .. PPX.delim .. "\"total density\"" .. PPX.delim .. "\"% total density\"" .. PPX.delim .. "\"mean density\"" .. PPX.delim .. "\"worst density\"" .. PPX.delim .. "\"worst density seg\"" .. PPX.delim .. "\"best density\"" .. PPX.delim .. "\"best density seg\"" .. PPX.delim divline () print ( "\"density by sheet vs. non-sheet\"" ) print ( headStr ) tReport = tReport .. 
"\"density by sheet vs. non-sheet\"\n" .. headStr .. "\n" for aac, aaDen in pairs ( tSheet ) do if aaDen.count > 0 then aaDen.mean = aaDen.total / aaDen.count end local denline = tostring ( aac ) .. PPX.delim .. "" .. PPX.delim .. aaDen.count .. PPX.delim .. round ( aaDen.total ) .. PPX.delim .. round ( ( aaDen.total / tSegDensity ) * 100 ) .. PPX.delim .. round ( aaDen.mean ) .. PPX.delim .. round ( aaDen.worst ) .. PPX.delim .. aaDen.worsty .. PPX.delim .. round ( aaDen.best ) .. PPX.delim .. aaDen.besty print ( denline ) tReport = tReport .. denline .. "\n" end tReport = tReport .. "\"end of report\"\n" divline () print ( "\"density deviation (above/below mean density by AA)\"" ) local dendeviate = "" for ii = 1, protNfo.segCnt do local aaCode = protNfo.aa [ ii ] local segDensity = subScoreCache [ "Density" ] [ ii ] local jDenMean = tAA [ aaCode ].mean if round ( segDensity ) == round ( jDenMean ) or tAA [ aaCode ].count == 1 then dendeviate = dendeviate .. "=" elseif segDensity < jDenMean then dendeviate = dendeviate .. "-" else dendeviate = dendeviate .. 
"+" end end linotype ( dendeviate ) return tReport, dendeviate end -- -- GetStruct -- return a list of structures of a specified type -- -- adapted from spvincent's Helix Rebuild -- function GetStruct ( structT ) local within_struct = false local structList = {} local structStart = 0 local structLast = 0 local structScr = 0 for ii = 1, protNfo.segCnt do if ( protNfo.ss [ ii ] == structT ) then if ( within_struct == false ) then -- start of a new struct within_struct = true structStart = ii structScr = 0 end structLast = ii if ii <= #segScoreCache then structScr = structScr + segScoreCache [ ii ] else structScr = structScr + current.GetSegmentEnergyScore ( ii ) end elseif ( within_struct == true ) then -- end of a struct within_struct = false structList [ #structList + 1 ] = { type = structT, first = structStart, last = structLast, use = false, score = structScr } end end if ( within_struct == true ) then structList [ #structList + 1 ] = { type = structT, first = structStart, last = structLast, use = false, score = structScr } end return structList end -- -- function to print a little contact table -- function contact ( structs ) local head = PPX.delim .. PPX.delim local first = true for s = 1, #structs do if structs [ s ].type == "H" or structs [ s ].type == "E" then local string = structs [ s ].type .. PPX.delim .. structs [ s ].first .. PPX.delim .. structs [ s ].last for s2 = 1, #structs do if structs [ s2 ].type == "H" or structs [ s2 ].type == "E" then if first then head = head .. PPX.delim .. 
                            structs [ s2 ].type
                    end
                    local mean = 0
                    local nb = 0
                    for i = structs [ s ].first, structs [ s ].last do
                        local min = 999999
                        for j = structs [ s2 ].first, structs [ s2 ].last do
                            local dist = structure.GetDistance ( i, j )
                            if dist < min then
                                min = dist
                            end
                        end
                        mean = mean + min
                        nb = nb + 1
                    end
                    mean = mean / nb
                    local c = " "
                    if structure.GetDistance ( structs [ s ].first, structs [ s2 ].last ) < structure.GetDistance ( structs [ s ].first, structs [ s2 ].first ) then
                        -- TODO: verify logic here, comparing first segs?
                        if mean < 5 then
                            c = 'X'
                        elseif mean < 10 then
                            c = 'x'
                        end
                    else
                        if mean < 5 then
                            c = 'O'
                        elseif mean < 10 then
                            c = 'o'
                        end
                    end
                    string = string .. PPX.delim .. c
                end
            end
            if first then
                divline ()
                print ( "\"mini contact table\"" )
                print ( head )
                first = false
            end
            print ( string )
        end
    end
end
function ShowReport ( tReport, chnz )
    local ask = dialog.CreateDialog ( ReVersion .. " - Subscore Report" )
    ask.l15 = dialog.AddLabel ( "Click inside one of the text boxes, then" )
    ask.l16 = dialog.AddLabel ( "use ctrl+a (command+a on Mac) to select all," )
    ask.l20 = dialog.AddLabel ( "and control+c or command+c to copy, then" )
    ask.l30 = dialog.AddLabel ( "paste into spreadsheet" )
    ask.l10 = dialog.AddLabel ( "---- segment subscores report ----" )
    ask.rep = dialog.AddTextbox ( "subscores:", tReport )
    ask.OK = dialog.AddButton ( "OK", 1 )
    ask.chains = dialog.AddButton ( "Chains", 2 )
    repeat
        local rc = dialog.Show ( ask )
        if rc == 2 then
            ShowChains ( chnz, 1 )
        end
    until rc == 1
end
function ShowChains ( chnz, chndx )
    if chndx == nil then
        chndx = 1
    end
    local CHPAGE = 4 -- chains / page
    local rc = 0
    local ask = dialog.CreateDialog ( ReVersion .. " - Chain Report" )
    local chmax = math.min ( #chnz, chndx + CHPAGE - 1 )
    ask.CHDisp = dialog.AddLabel ( "displaying " .. chndx .. " - " .. chmax .. " of " .. #chnz .. " chains" )
--[[
    ask.l15 = dialog.AddLabel ( "Click inside one of the text boxes, then" )
    ask.l16 = dialog.AddLabel ( "use ctrl+a (command+a on Mac) to select all," )
    ask.l20 = dialog.AddLabel ( "and control+c or command+c to copy, then" )
    ask.l30 = dialog.AddLabel ( "paste into spreadsheet or other tool" )
]]--
    ask.SEQN = dialog.AddLabel ( "---- primary and secondary structure ----" )
    local cmin = math.min ( #chnz, 4 )
    for ii = chndx, chmax do
        local chain = chnz [ ii ]
        if chain.ctype ~= protNfo.LIGAND then
            ask [ "chn" .. ii .. "l1" ] = dialog.AddLabel ( "Chain " .. chain.chainid .. " (" .. Ctypes [ chnz [ ii ].ctype ] .. ")" )
            ask [ "chn" .. ii .. "l2" ] = dialog.AddLabel ( "segments " .. chain.start .. "-" .. chain.stop .. ", length = " .. chain.len .. ", mutables = " .. chain.mute )
            ask [ "chn" .. ii .. "ps" ] = dialog.AddTextbox ( "primary", chain.fasta )
            ask [ "chn" .. ii .. "ss" ] = dialog.AddTextbox ( "secondary", chain.ss )
        end
    end
    ask.OK = dialog.AddButton ( "OK", 1 )
    if chndx > 1 then
        ask.prev = dialog.AddButton ( "Prev", 2 )
    end
    if chmax < #chnz then
        ask.next = dialog.AddButton ( "Next", 3 )
    end
    ask.Cancel = dialog.AddButton ( "Cancel", 0 )
    repeat
        -- assign the outer rc (no "local") so the value returned below is the button pressed
        rc = dialog.Show ( ask )
        if rc == 2 then
            rc = ShowChains ( chnz, chndx - CHPAGE )
        end
        if rc == 3 then
            rc = ShowChains ( chnz, chndx + CHPAGE )
        end
    until rc < 2
    return rc
end
function ShowDensityReport ( dReport, dendev )
    local ask = dialog.CreateDialog ( ReVersion .. " - Density Report" )
    ask.l15 = dialog.AddLabel ( "Click inside one of the text boxes, then" )
    ask.l16 = dialog.AddLabel ( "use ctrl+a (command+a on Mac) to select all," )
    ask.l20 = dialog.AddLabel ( "and control+c or command+c to copy, then" )
    ask.l30 = dialog.AddLabel ( "paste into spreadsheet" )
    ask.l50 = dialog.AddLabel ( "---- density analysis ----" )
    ask.dens = dialog.AddTextbox ( "density:", dReport )
    ask.l70 = dialog.AddLabel ( "---- density deviation from mean AA density ----" )
    ask.dev = dialog.AddTextbox ( "deviation:", dendev )
    ask.OK = dialog.AddButton ( "OK", 1 )
    dialog.Show ( ask )
end
function GetParms ( scrParts, ligands, chnz )
    local ask = dialog.CreateDialog ( ReVersion )
    local kRet = 0
    repeat
        -- field names are case-sensitive: segCnt / segCnt2, not segcnt / segcnt2
        if protNfo.segCnt == protNfo.segCnt2 then
            ask.segcnt = dialog.AddLabel ( protNfo.segCnt2 .. " segments" )
        else
            ask.segcnt = dialog.AddLabel ( protNfo.segCnt .. " segments" )
            ask.segcnt2 = dialog.AddLabel ( protNfo.segCnt2 .. " segments, adjusted for ligands" )
        end
        if #chnz > 0 then
            ask.chnnn = dialog.AddLabel ( #chnz .. " chains" )
            local cmin = math.min ( #chnz, 4 )
            for ii = 1, cmin do
                local chain = chnz [ ii ]
                if chain.ctype ~= protNfo.LIGAND then
                    ask [ "chn" .. ii .. "l1" ] = dialog.AddLabel ( "Chain " .. chain.chainid .. " (" .. Ctypes [ chnz [ ii ].ctype ] .. ")" )
                    ask [ "chn" .. ii .. "l2" ] = dialog.AddLabel ( "segments " .. chain.start .. "-" .. chain.stop .. ", length = " .. chain.len .. ", mutables = " .. chain.mute )
                end
            end
            if cmin < #chnz then
                ask.chzzz = dialog.AddLabel ( "(partial list, see \"Chains\" for complete chain info)" )
            end
        end
        if #ligands > 0 then
            ask.ligands = dialog.AddLabel ( #ligands ..
" ligand section(s), see scriptlog for details" ) end ask.NRGE = dialog.AddLabel ( "---- score information ----" ) --[[ BoxScore = { tSubScores = 0, -- total of active subscores, all segments tSegScores = 0, -- total of segment scores tScoreFilt = 0, -- total score, filters on tScoreFOff = 0, -- total score, filters off tScoreNrgy = 0, -- total energy score tScoreBonus = 0, -- total filter bonus tScoreForm = 0, -- subscores + filter bonus + 8000 tScoreDark = 0, -- total "dark" score } ]]-- ask.active = dialog.AddLabel ( #scrParts .. " active subscores" ) ask.tSubScores = dialog.AddLabel ( "total of all subscores = " .. round ( BoxScore.tSubScores ) ) ask.tScoreFilt = dialog.AddLabel ( "current score = " .. round ( BoxScore.tScoreFilt ) ) if BoxScore.tScoreBonus ~= 0 then ask.tScoreBonus = dialog.AddLabel ( "filter bonus = " .. round ( BoxScore.tScoreBonus ) ) else ask.tScoreBonus = dialog.AddLabel ( "no filter bonus" ) end ask.tScoreForm = dialog.AddLabel ( "subscores + filter bonus + 8000 = " .. round ( BoxScore.tScoreForm ) ) ask.tScoreDark = dialog.AddLabel ( "\"dark\" points = " .. round ( BoxScore.tScoreDark ) ) if jDensity then local tdense = 0 for ii = 1, protNfo.segCnt2 do tdense = tdense + subScoreCache [ "Density" ] [ ii ] end ask.tScoreDensity = dialog.AddLabel ( "density score = " .. 
                round ( tdense ) )
        end
        ask.SECT = dialog.AddLabel ( "---- report sections ----" )
        ask.kContact = dialog.AddCheckbox ( "mini contact table", PPX.kContact )
        ask.kMutDet = dialog.AddCheckbox ( "mutable details", PPX.kMutDet )
        if jDensity then
            ask.kDensity = dialog.AddCheckbox ( "density analysis", PPX.kDensity )
        end
        ask.OK = dialog.AddButton ( "OK", 1 )
        ask.chains = dialog.AddButton ( "Chains", 2 )
        ask.scores = dialog.AddButton ( "Subscores", 3 )
        ask.more = dialog.AddButton ( "Format", 4 )
        ask.Cancel = dialog.AddButton ( "Cancel", 0 )
        kRet = dialog.Show ( ask )
        if kRet > 0 then
            PPX.kContact = ask.kContact.value
            PPX.kMutDet = ask.kMutDet.value
            if jDensity then
                PPX.kDensity = ask.kDensity.value
            end
            if kRet == 2 then
                ShowChains ( chnz, 1 )
            end
            if kRet == 3 then
                local sbs = false
                sbs, scrParts = GetSubscoreParms ( scrParts )
            end
            if kRet == 4 then
                GetFormatParms ()
            end
        end
    until kRet < 2
    if kRet == 1 then
        return true, scrParts
    end
    return false, scrParts
end

function GetFormatParms ()
    local ask = dialog.CreateDialog ( ReVersion ..
        " - Format Options" )
    ask.DETAIL = dialog.AddLabel ( "---- additional columns ----" )
    ask.kLongnm = dialog.AddCheckbox ( "long name", PPX.kLongnm )
    ask.kAbbrev = dialog.AddCheckbox ( "abbreviation", PPX.kAbbrev )
    ask.kHydro = dialog.AddCheckbox ( "hydropathy index", PPX.kHydro )
    ask.kAtom = dialog.AddCheckbox ( "atom count", PPX.kAtom )
    ask.kRotamer = dialog.AddCheckbox ( "rotamer count", PPX.kRotamer )
    ask.FORMAT = dialog.AddLabel ( "---- formatting options ----" )
    ask.jMaxL = dialog.AddSlider ( "line length:", PPX.jMaxL, 40, 250, 0 )
    ask.kRound = dialog.AddSlider ( "decimal places:", PPX.kRound, 1, 8, 0 )
    ask.ddlm = dialog.AddLabel ( "delimiters (last selected wins)" )
    ask.dtab = dialog.AddCheckbox ( "tab", PPX.dtab )
    ask.dcomma = dialog.AddCheckbox ( "comma", PPX.dcomma )
    ask.dsemic = dialog.AddCheckbox ( "semicolon", PPX.dsemic )
    ask.OK = dialog.AddButton ( "OK", 1 )
    ask.Cancel = dialog.AddButton ( "Cancel", 0 )
    local kRet = dialog.Show ( ask )
    if kRet > 0 then
        PPX.kLongnm = ask.kLongnm.value
        PPX.kAbbrev = ask.kAbbrev.value
        PPX.kHydro = ask.kHydro.value
        PPX.kAtom = ask.kAtom.value
        PPX.kRotamer = ask.kRotamer.value
        PPX.jMaxL = ask.jMaxL.value
        PPX.kRound = ask.kRound.value
        PPX.kFax = 10 ^ -PPX.kRound
        PPX.dtab = ask.dtab.value
        if PPX.dtab then
            PPX.delim = "\t"
        end
        PPX.dcomma = ask.dcomma.value
        if PPX.dcomma then
            PPX.delim = ","
        end
        PPX.dsemic = ask.dsemic.value
        if PPX.dsemic then
            PPX.delim = ";"
        end
        return true
    end
    return false
end

function GetSubscoreParms ( scrParts )
    local ask = dialog.CreateDialog ( ReVersion )
    local kRet = 0
    repeat
        ask.DETAIL = dialog.AddLabel ( "---- subscore reporting options ----" )
        for ii = 1, #scrParts do
            ask [ scrParts [ ii ] [ 1 ] ] = dialog.AddCheckbox (
                scrParts [ ii ] [ 1 ] .. ": " ..
                round ( scrParts [ ii ] [ 2 ] ) ..
                " (" ..
                round ( scrParts [ ii ] [ 2 ] / protNfo.segCnt ) ..
                " / seg) ", scrParts [ ii ] [ 3 ] )
        end
        ask.OK = dialog.AddButton ( "OK", 1 )
        ask.Cancel = dialog.AddButton ( "Cancel", 0 )
        kRet = dialog.Show ( ask )
        if kRet > 0 then
            -- count only the subscores that are actually checked
            local aPart = 0
            for ii = 1, #scrParts do
                scrParts [ ii ] [ 3 ] = ask [ scrParts [ ii ] [ 1 ] ].value
                if scrParts [ ii ] [ 3 ] then
                    aPart = aPart + 1
                end
            end
            --
            -- select all if nothing selected
            --
            if aPart == 0 then
                for ii = 1, #scrParts do
                    scrParts [ ii ] [ 3 ] = true
                end
            end
        end
    until kRet < 2
    if kRet == 1 then
        return true, scrParts
    end
    return false, scrParts
end
--
--  Ident - print identifying information at beginning and end of recipe
--
--  slugline - first line to print - normally recipe name and version
--
function Ident ( slugline )
    print ( slugline )
    print ( "Puzzle: " .. puzzle.GetName () .. " (" .. puzzle.GetPuzzleID () .. ")" )
    local trk = ui.GetTrackName ()
    if trk ~= "default" then
        print ( "Track: " .. trk )
    end
    local scoretype = scoreboard.GetScoreType ()
    local scort = ""
    if scoretype == 0 then
        scort = "soloist"
    elseif scoretype == 1 then
        scort = "evolver"
    elseif scoretype == 2 then
        scort = "all hands"
    elseif scoretype == 3 then
        scort = "no score"
    else
        scort = "unknown/error"
    end
    print ( "User: " .. user.GetPlayerName () ..
        " (" .. scort .. " #" .. scoreboard.GetRank ( scoretype ) .. ")" )
end

function main ()
    save.Quicksave ( WAYBACK )
    divline ()
    Ident ( ReVersion )
    --
    -- get protein info
    --
    divline ()
    print ( "collecting protein info, please wait" )
    protNfo:setNfo ()
    local chainz = protNfo.chains
    local chw = " chain"
    if #chainz > 1 then
        chw = chw .. "s"
    end
    print ( #chainz .. chw )
    for cc = 1, #chainz do
        print ( "chain " .. chainz [ cc ].chainid ..
            ", type = " .. Ctypes [ chainz [ cc ].ctype ] ..
            ", segments = " .. chainz [ cc ].start .. "-" .. chainz [ cc ].stop ..
            ", length = " .. chainz [ cc ].len ..
            ", mutables = " ..
            chainz [ cc ].mute )
    end
    --
    -- determine which subscores are active
    --
--[[
    BoxScore = {
        tSubScores = 0,     -- total of active subscores, all segments
        tSegScores = 0,     -- total of segment scores
        tScoreFilt = 0,     -- total score, filters on
        tScoreFOff = 0,     -- total score, filters off
        tScoreNrgy = 0,     -- total energy score
        tScoreBonus = 0,    -- total filter bonus
        tScoreForm = 0,     -- subscores + filter bonus + 8000
        tScoreDark = 0,     -- total "dark" score
    }
]]--
    --
    -- FindActiveSubscores should be called first, to build subScoreCache
    --
    subScores, BoxScore.tSubScores, BoxScore.tSegScores = FindActiveSubscores ()
    --
    -- special check for "Density" subscore
    --
    for ii = 1, #subScores do
        if subScores [ ii ] [ 1 ] == "Density" then
            jDensity = true         -- density present
            PPX.kDensity = true     -- select density report by default
            break
        end
    end
    --
    -- report conditions/filters
    --
    divline ()
    behavior.SetFiltersDisabled ( false )
    BoxScore.tScoreFilt = current.GetEnergyScore ()
    behavior.SetFiltersDisabled ( true )
    BoxScore.tScoreFOff = current.GetEnergyScore ()
    BoxScore.tScoreBonus = BoxScore.tScoreFilt - BoxScore.tScoreFOff
    if BoxScore.tScoreBonus ~= 0 then
        print ( "current filter bonus = " .. round ( BoxScore.tScoreBonus ) )
    end
    behavior.SetFiltersDisabled ( false )
    --
    -- report on individual filters
    --
    FindActiveFilters ()
    BoxScore.tScoreForm = BoxScore.tSubScores + BoxScore.tScoreBonus + 8000
    print ( "subscores + filter bonus + 8000 = " .. round ( BoxScore.tScoreForm ) )
    BoxScore.tScoreDark = BoxScore.tScoreFilt - BoxScore.tScoreForm
    print ( "current score = " .. round ( BoxScore.tScoreFilt ) )
    print ( "\"dark\" points = " .. round ( BoxScore.tScoreDark ) )
--[[
    the calculation in this Rosetta score logic is correct,
    but scoreboard.GetScore returns the Rosetta score of
    the best overall solo or evolver pose, which is not
    necessarily *this* pose...

    local sRosetta = scoreboard.GetScore ()
    print ( "Rosetta energy score = " ..
        round ( sRosetta ) )
    local sRosecon = 10 * ( 800 - sRosetta )
    print ( "converted Rosetta score = " .. round ( sRosecon ) )
]]--
    --
    -- get chains
    --
    local chainz = protNfo.chains
    --
    -- get details for subscore report
    --
    local go, selScores = GetParms ( subScores, protNfo.ligands, chainz )
    if go then
        for cc = 1, #chainz do
            divline ()
            print ( "chain " .. chainz [ cc ].chainid ..
                ", type = " .. Ctypes [ chainz [ cc ].ctype ] ..
                ", segments = " .. chainz [ cc ].start .. "-" .. chainz [ cc ].stop ..
                ", length = " .. chainz [ cc ].len ..
                ", mutables = " .. chainz [ cc ].mute )
            print ( "" )
            if chainz [ cc ].ctype ~= protNfo.LIGAND then
                --
                -- sequence of letters, hydrophobes, structures, locks
                --
                local seq, hydro, struc, types, locks, slocks, locked, slocked =
                    BuildSequence ( chainz [ cc ].start, chainz [ cc ].stop )
                --
                -- print the sequences
                --
                print ( "primary structure sequence (single-letter amino acid codes for searching PDB)" )
                print ( "" )
                linotype ( seq, chainz [ cc ].start )
--[[
                print ( "type of each segment (P = protein, R = RNA, D = DNA, M = molecule (ligand))" )
                linotype ( types )
]]--
                print ( "sequence with i if hydrophobic" )
                print ( "" )
                linotype ( hydro, chainz [ cc ].start )
                print ( "secondary structure" )
                print ( "" )
                linotype ( struc, chainz [ cc ].start )
                if locked > 0 then
                    print ( "locked backbone segments" )
                    print ( "" )
                    linotype ( locks, chainz [ cc ].start )
                end
                if slocked > 0 then
                    print ( "locked sidechain segments" )
                    print ( "" )
                    linotype ( slocks, chainz [ cc ].start )
                end
            end
        end
        --
        -- get overall counts for detail report
        -- TODO: duplicate effort here counting locked and slocked, resolve in next version
        --
        local nonprot = 0
        local locked = 0
        local slocked = 0
        for ll = 1, protNfo.segCnt do
            if protNfo.ctype [ ll ] ~= protNfo.PROTEIN then
                nonprot = nonprot + 1
            end
            if protNfo.lock [ ll ] then
                locked = locked + 1
            end
            if protNfo.slck [ ll ] then
                slocked = slocked + 1
            end
        end
        --
        -- find modifiable sections
        --
        divline ()
        print ( "modifiable sections - results in Lua table format using segment numbers" )
        FindModifiable ()
        --
        -- print segment scores
        --
        local tReport = SegScores ( selScores, nonprot, locked, slocked, #chainz )
        --
        -- print density analysis
        --
        local dReport = nil
        local dendev = nil
        if jDensity then
            dReport, dendev = DensityRat ()
        end
        if PPX.kContact then
            --
            -- detect structures
            --
            local helixList = GetStruct ( "H" )
            local sheetList = GetStruct ( "E" )
            local structList = {}
            for ii = 1, #helixList do
                structList [ #structList + 1 ] = helixList [ ii ]
            end
            for ii = 1, #sheetList do
                structList [ #structList + 1 ] = sheetList [ ii ]
            end
            --
            -- mini contact table
            --
            contact ( structList )
        end
        --
        -- find mutable segments and print them
        --
        if PPX.kMutDet then
            divline ()
            print ( "mutable segments" )
            FindMutable ( )
        end
        --
        -- show reports for copy and paste
        --
        ShowReport ( tReport, chainz )
        if PPX.kDensity then
            ShowDensityReport ( dReport, dendev )
        end
    end
    --
    -- exit via the cleanup function
    --
    cleanup ()
end

function cleanup ( errmsg )
    if CLEANUPENTRY ~= nil then
        return
    end
    CLEANUPENTRY = true
    print ( "---" )
    local reason
    local start, stop, line, msg
    if errmsg == nil then
        reason = "complete"
    else
        start, stop, line, msg = errmsg:find ( ":(%d+):%s()" )
        if msg ~= nil then
            errmsg = errmsg:sub ( msg, #errmsg )
        end
        if errmsg:find ( "Cancelled" ) ~= nil then
            reason = "cancelled"
        else
            reason = "error"
        end
    end
    Ident ( ReVersion )
    if reason == "error" then
        print ( "Unexpected error detected" )
        -- line is nil when the message carries no ":<number>:" prefix
        print ( "Error line: " .. tostring ( line ) )
        print ( "Error: \"" .. errmsg .. "\"" )
    end
    save.Quickload ( WAYBACK )
end

xpcall ( main, cleanup )
--- end of recipe

Comments


LociOiling Lv 1

print protein 2.9.4 now uses the distance between segments to determine where chains start and end. This supplements the original strategy, which involved looking at atom counts to determine whether a segment was an N-terminal, a C-terminal, or just the middle of a chain.

Many of the recent electron density puzzles have multiple chains, and ED experiments often fail to recover the terminal segments (residues). So atom counts were not reliable on these puzzles. While distance is more reliable, it may still be incorrect where mid-chain residues were not found in the ED results. In these cases, print protein will incorrectly report extra chains.
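The distance idea can be sketched roughly as follows. This is a minimal illustration, not the recipe's actual logic: FindChainsByDistance, stubDistance, and the MAXGAP cutoff are hypothetical names and values. In Foldit, structure.GetDistance could be passed in as the distance function. The stub also shows the failure mode described above: a gap left by missing mid-chain residues is reported as a chain break.

```lua
-- Sketch only: split a protein into chains wherever two neighboring
-- segments are farther apart than a cutoff.
-- MAXGAP is a hypothetical cutoff in Angstroms, not the recipe's value.
-- (Adjacent alpha carbons in one chain are normally about 3.8 Angstroms apart.)
MAXGAP = 4.5

-- segCnt      - number of segments
-- getDistance - function ( seg1, seg2 ) returning a distance (assumed interface)
-- maxGap      - optional cutoff, defaults to MAXGAP
function FindChainsByDistance ( segCnt, getDistance, maxGap )
    maxGap = maxGap or MAXGAP
    local chains = {}
    if segCnt < 1 then
        return chains
    end
    local start = 1
    for ii = 1, segCnt - 1 do
        if getDistance ( ii, ii + 1 ) > maxGap then
            -- gap found: close the current chain at ii
            chains [ #chains + 1 ] = { start = start, stop = ii, len = ii - start + 1 }
            start = ii + 1
        end
    end
    -- close the final chain
    chains [ #chains + 1 ] = { start = start, stop = segCnt, len = segCnt - start + 1 }
    return chains
end

-- stub distance function for testing outside Foldit:
-- a large gap between segments 3 and 4, normal spacing elsewhere
function stubDistance ( seg1, seg2 )
    if seg1 == 3 and seg2 == 4 then
        return 12.0
    end
    return 3.8
end

local demo = FindChainsByDistance ( 6, stubDistance )
for cc = 1, #demo do
    print ( "chain " .. cc .. ": segments " .. demo [ cc ].start .. "-" .. demo [ cc ].stop )
end
```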

Aside from the new chain detection method, the main print protein dialog has been broken up to prevent it from getting "too tall" to fit in the Foldit window.

The main dialog now reports summary information for up to four chains. The sequence information is not included in this display, but a "Chains" button opens a dialog which displays the primary and secondary structure of each chain. The structure information can be copied and pasted. The Chains dialog displays up to four chains at a time, with a "Next" button if there are more.
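The paging behavior of the Chains dialog can be sketched as below. This is an illustration under assumptions: ChainPage and PAGESIZE are hypothetical names, with the page size of four taken from the description above.

```lua
-- Sketch only: work out which chains belong on one page of a paged dialog,
-- and whether "Prev" and "Next" buttons are needed.
PAGESIZE = 4 -- assumed page size, per the description above

-- chainCnt - total number of chains
-- first    - index of the first chain on this page
-- pageSize - optional page size, defaults to PAGESIZE
function ChainPage ( chainCnt, first, pageSize )
    pageSize = pageSize or PAGESIZE
    local last = math.min ( first + pageSize - 1, chainCnt )
    return {
        first = first,
        last = last,
        hasPrev = first > 1,        -- show "Prev" unless on the first page
        hasNext = last < chainCnt,  -- show "Next" if chains remain
    }
end

-- ten chains, second page
local page = ChainPage ( 10, 5 )
print ( "chains " .. page.first .. "-" .. page.last )
```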

The main dialog also no longer displays the totals and means for each subscore. The "Subscores" button now displays this information in a separate dialog. As previously, all subscores are checked by default, meaning they are included in the main report. Uncheck any unwanted subscores.

There's now a "Format" button, which replaces the "More" button in previous versions. The resulting dialog is the same as in previous versions, and allows fine-tuning the report and spreadsheet output.
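One of the Format options is the delimiter, where the dialog notes that the "last selected wins". A minimal sketch of that rule, using hypothetical helper names (PickDelimiter and JoinRow are not functions from the recipe) and assuming tab as the fallback:

```lua
-- Sketch only: later checkboxes override earlier ones,
-- so the order of the tests decides which delimiter wins.
function PickDelimiter ( useTab, useComma, useSemicolon )
    local delim = "\t" -- assumed fallback when nothing is checked
    if useTab then
        delim = "\t"
    end
    if useComma then
        delim = ","
    end
    if useSemicolon then
        delim = ";"
    end
    return delim
end

-- join the fields of one report row with the chosen delimiter
function JoinRow ( fields, delim )
    return table.concat ( fields, delim )
end

-- tab and comma both checked: comma wins
print ( JoinRow ( { "1", "M", "-12.5" }, PickDelimiter ( true, true, false ) ) )
```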

Clicking "Cancel" on the main dialog skips producing the segment-by-segment subscores report. As before, clicking "OK" on the main dialog produces the detail report, and presents a dialog which allows copying the results for pasting into a spreadsheet. An "end of report" line has been added to address problems with the last line getting truncated. The "Chains" button can be used to open the chains dialog again.

For electron density puzzles, a separate copy-and-paste dialog appears after the main dialog is dismissed. As with the main report, the electron density analysis can be copied and pasted. Again, an "end of report" line has been added to help avoid truncation.

Aside from these changes, there are several minor formatting tweaks. The number of mutables is now reported wherever chain information appears. For electron density, the total electron density score is reported after the "dark points". Normally, the ED score accounts for the "dark" points. (Dark points may still show up in other puzzle types.)
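For reference, the "dark" points are whatever is left of the current score after subtracting the subscore total, the filter bonus, and the 8000-point base, as in the recipe's "subscores + filter bonus + 8000" line. A small worked sketch, where DarkPoints is a hypothetical helper and the input numbers are made up for illustration:

```lua
-- Sketch only: "dark" points = current score - ( subscores + filter bonus + 8000 )
-- currentScore  - score with filters on
-- subScoreTotal - total of active subscores over all segments
-- filterBonus   - score with filters on minus score with filters off
function DarkPoints ( currentScore, subScoreTotal, filterBonus )
    local formula = subScoreTotal + filterBonus + 8000
    return currentScore - formula, formula
end

-- made-up numbers: 1200 + 250 + 8000 = 9450, leaving 50 unexplained points
local dark, formula = DarkPoints ( 9500, 1200, 250 )
print ( "subscores + filter bonus + 8000 = " .. formula )
print ( "\"dark\" points = " .. dark )
```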

Internally, the detection of locked sidechains is now updated to use structure.IsLocked, which now includes this information. The new method should be marginally faster.

LociOiling Lv 1

AA Edit 2.1 and SS Edit 2.1 have been released. All three recipes now have identical chain detection logic (with some minor exceptions).

In addition to missing residues in ED puzzles, some protein design puzzles are a problem for the updated recipes. For example, in the Mpox binder puzzles, the protein target consisted of small fragments of a larger protein. The new logic reports each fragment as a separate chain. That's not necessarily wrong, but having an option to combine fragments might make sense.

A future release will resolve the remaining differences between AA Edit, SS Edit, and print protein. Various options for combining broken chains are also on the drawing board.