
Recipe: print protein 2.8

created by LociOiling

Profile


Name
print protein 2.8
ID
102599
Shared with
Public
Parent
print protein 2.7
Children
Created on
January 11, 2018 at 23:14 PM UTC
Updated on
January 11, 2018 at 23:14 PM UTC
Description

Update of "print protein lua2 V0" by marie_s. Version 2.5 fixes rulers and breaks up long lines. Version 2.6 saves and restores to conserve moves for sketchbook. Version 2.7 speeds things up. Version 2.8 adds features suggested by recent puzzles.
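The line splitting mentioned for version 2.5 can be illustrated in isolation. The sketch below is not taken from the recipe: `chunkLine` is a hypothetical helper, and the width of 100 is an assumption that mirrors the `jMaxL` setting in the code. It only shows the general idea of cutting a one-character-per-segment string into fixed-width pieces while remembering where each piece starts.

```lua
-- Hypothetical helper (not part of the recipe): split a string that holds
-- one character per segment into chunks of at most `width` characters.
-- Each chunk records the 1-based segment index at which it starts.
local function chunkLine ( line, width )
    local chunks = {}
    for first = 1, #line, width do
        local last = math.min ( first + width - 1, #line )
        chunks [ #chunks + 1 ] = { start = first, text = line:sub ( first, last ) }
    end
    return chunks
end

-- Example: a 250-segment sequence split into pieces of up to 100 characters
local parts = chunkLine ( string.rep ( "a", 250 ), 100 )
for _, part in ipairs ( parts ) do
    print ( part.start, #part.text )
end
```

Keeping the starting index with each chunk is what would let a ruler be printed above every piece with the correct segment numbers.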

Best for


Code


--[[

    print protein

    info on the protein

    - Concatenation of recipes, parts of recipes, or functions by:
      Tlaloc, spvincent, seagate, John McLeod, Crashguard303,
      Gary Forbis, and authors on the wiki

    Now with code from Timo van der Laan and more code from spvincent.

    Borrowed HerobrinesArmy's idea for copy-and-paste on the segment
    score table, plus an option for including atom and rotamer counts.

    Intended use:

    1. Run
    2. Open the script log (scriptlog.default.xml) in a text editor.
    3. Strip off the start and end lines and save it as a text file
       with another name.
    4. Import that into Excel as a comma-delimited text file.

    Alternately:

    3. Select and copy the lines containing the score detail.
    4. Paste into a spreadsheet or other program that accepts
       CSV (comma-separated values) format.

    Yet another alternative:

    3. Select and copy the "score report" from the new "copy-and-paste" dialog.
    4. Paste into the spreadsheet of your choice.

    version history
    ---------------
    print protein lua v0 - 2011/08/15 - marie_s (Marie Suchard)

    print protein 2.0 - 2015/05/13 - LociOiling
     + add dialog
     + use active subscores
     + restrict scope of most variables
     + convert amino acid table to keyed format
     + added kludges specific to puzzle 879 (hope they are never needed)
     + add adjustable rounding, eliminate existing "no trunc" logic
     + made tab the default delimiter, with comma or semicolon as alternates
     + add detailed scoring information
     + add "modifiable sections" report, make original "mutable" report optional
     + make "mini contact table" optional
     + in subscores report, add option for atom and rotamer counts,
       make hydropathy index optional
     + add warnings for unknown amino acid or secondary structure codes,
       subtotal mismatches, suspect ligands
     + add copy-and-paste dialog for subscores, primary and secondary structure

    print protein 2.1 - 2016/03/23 - LociOiling
     + add ruler

    print protein 2.2 - 2016/04/18 - LociOiling
     + fix crash in normal ligand case
     + add mean score to subscore display

    print protein 2.3 - 2016/10/27 - LociOiling
     + add density analysis for ED puzzles
       * density by amino acid
       * density by aromatic vs. non-aromatic
       * density by aliphatic vs. non-aliphatic
       * density by hydrophobic vs. hydrophilic
       * density deviation, showing whether each segment is above or below
         mean density for the segment's amino acid type
     + move less-used parms to second screen to avoid screen overflow

    print protein 2.4 - 2016/11/15 - LociOiling
     + move ED items to separate dialog to avoid textbox bug
     + add "active" flag to active subscores table to fix totally
       messed up selection logic; ripple change to GetScore

    print protein 2.5 - 2017/05/19 - LociOiling
     + fix rulers, split long lines

    print protein 2.6 - 2017/08/06 - LociOiling
     + save and restore to conserve moves

    print protein 2.7 - 2017/11/02 - LociOiling
     + consolidate duplicated segment subscore calls
     + add info from scoreboard and user functions

    print protein 2.8 - 2017/12/15 - LociOiling
     + handle RNA, make educated guess for DNA
     + include ligands in report, treat each ligand independently
     + add new type column, P for protein, R for RNA, D for DNA,
       M for molecule (ligand)
     + if locked segments found, include "locked" column
     + check for unlocked sidechains of locked segments

]]--

--
-- globals section
--
Recipe = "print protein"
Version = "2.8"
ReVersion = Recipe .. " " ..
    Version

WAYBACK = 99 -- save and restore

segCnt = structure.GetCount()
segCnt2 = segCnt
isLigand = false

subScores = {} -- subscore table
gTotal = 0 -- grand total all subscores

segScoreCache = {} -- current.GetSegmentEnergyScore results saved here by FindActiveSubscores
subScoreCache = {} -- current.GetSegmentEnergySubscore results saved here by FindActiveSubscores

BoxScore = {
    tSubScores = 0, -- total of active subscores, all segments
    tSegScores = 0, -- total of segment scores
    tScoreFilt = 0, -- total score, filters on
    tScoreFOff = 0, -- total score, filters off
    tScoreNrgy = 0, -- total energy score
    tScoreBonus = 0, -- total filter bonus
    tScoreForm = 0, -- subscores + filter bonus + 8000
    tScoreDark = 0, -- total "dark" score
}

kHydro = false -- include hydropathy index
kAtom = false -- include atom count
kRotamer = false -- include rotamer count
kLongnm = false -- include full name of amino acid or nucleobase
kAbbrev = false -- include abbreviation
kRound = 3 -- number of digits for rounding
kFax = 10 ^ -kRound -- initial rounding factor
jMaxL = 100 -- max line length for strings

dtab = true
delim = "\t" -- default delimiter is tab character
dcomma = false -- allow user to select comma
dsemic = false -- allow user to select semicolon

kMutDet = false -- detailed mutable report
kContact = false -- contact table
kDensity = false -- density report
jDensity = false -- true if density component present
--
-- indexes for helixList, sheetList, structList
--
STRCTTYP = 1 --structure type
STRCTSTR = 2 --starting segment
STRCTEND = 3 --ending segment
STRCTUSE = 4 --use flag (boolean)
STRCTSCR = 5 --score
--
-- ligand list offsets
--
LLSEG = 1
LLATM = 2
LLROT = 3
LLSCR = 4
--
-- indexes for amino acid table
--
AASHORT = 1 -- three-letter code
AALONG = 2 -- full name
AAPOLARITY = 3 -- polarity (text)
AAACIDITY = 4 -- acidity (text)
AAHYDROPATHY = 5 -- hydropathy index
--
-- indexes for segment table
--
TSAA = 1 -- amino acid - one letter for protein, two letters for RNA/DNA
TSAS = 2 -- amino acid short name
TSAL = 3 -- amino acid long name
TSHY = 4 -- hydropathy index
TSSS = 5 -- secondary structure
TSLO = 6 -- locked - true or false
TSSL = 7 -- sidechain locked - true or false
TSTY = 8 -- segment type - P for protein, R for RNA, D for DNA, M for molecule (ligand)
--
-- residues with -- in front (commented) are not in Foldit (as of Nov 15, 2010)
-- one-letter amino acid code is the table key
-- two-letter RNA and DNA nucleotides are also valid
--
AminoAcids = {
    a = { "Ala", "Alanine", "nonpolar", "neutral", 1.8 },
--  b = { "Asx", "Asparagine or Aspartic acid" },
    c = { "Cys", "Cysteine", "nonpolar", "neutral", 2.5 },
    d = { "Asp", "Aspartate", "polar", "negative", -3.5 },
    e = { "Glu", "Glutamate", "polar", "negative", -3.5 },
    f = { "Phe", "Phenylalanine", "nonpolar", "neutral", 2.8 },
    g = { "Gly", "Glycine", "nonpolar", "neutral", -0.4 },
    h = { "His", "Histidine", "polar", "neutral", -3.2 },
    i = { "Ile", "Isoleucine", "nonpolar", "neutral", 4.5 },
--  j = { "Xle", "Leucine or Isoleucine" },
    k = { "Lys", "Lysine", "polar", "positive", -3.9 },
    l = { "Leu", "Leucine", "nonpolar", "neutral", 3.8 },
    m = { "Met", "Methionine", "nonpolar", "neutral", 1.9 },
    n = { "Asn", "Asparagine", "polar", "neutral", -3.5 },
--  o = { "Pyl", "Pyrrolysine" },
    p = { "Pro", "Proline", "nonpolar", "neutral", -1.6 },
    q = { "Gln", "Glutamine", "polar", "neutral", -3.5 },
    r = { "Arg", "Arginine", "polar", "positive", -4.5 },
    s = { "Ser", "Serine", "polar", "neutral", -0.8 },
    t = { "Thr", "Threonine", "polar", "neutral", -0.7 },
--  u = { "Sec", "Selenocysteine" },
    v = { "Val", "Valine", "nonpolar", "neutral", 4.2 },
    w = { "Trp", "Tryptophan", "nonpolar", "neutral", -0.9 },
    x = { "Xaa", "Unspecified/unknown", "", "", 0 }, -- kludge for puzzle 879
    y = { "Tyr", "Tyrosine", "polar", "neutral", -1.3 },
--  z = { "Glx", "Glutamine or glutamic acid" } ,
--
--  bonus! RNA nucleotides
--
    ra = { "a", "Adenine", "", "", 0, },
    rc = { "c", "Cytosine", "", "", 0, },
    rg = { "g", "Guanine", "", "", 0, },
    ru = { "u", "Uracil", "", "", 0, },
--
--  bonus! DNA nucleotides (as seen in PDB, not confirmed for Foldit)
--
    da = { "a", "Adenine", "", "", 0, },
    dc = { "c", "Cytosine", "", "", 0, },
    dg = { "g", "Guanine", "", "", 0, },
    dt = { "t", "Thymine", "", "", 0, },
}
--
-- separate tables for easy type check
--
RNAcodes = {
    ra = { "adenine", },
    rc = { "cytosine", },
    rg = { "guanine", },
    ru = { "uracil", },
}
DNAcodes = {
    da = { "adenine", },
    dc = { "cytosine", },
    dg = { "guanine", },
    dt = { "thymine", },
}
--
-- amino acid types
--
Aromatic = {
    f = { "phenylalanine", },
    h = { "histidine", },
    w = { "tryptophan", },
    y = { "tyrosine", },
}
Aliphatic = {
    i = { "isoleucine", },
    l = { "leucine", },
    v = { "valine", },
}
Hydrophobic = {
    a = { "alanine", },
    c = { "cysteine", },
    f = { "phenylalanine", },
    i = { "isoleucine", },
    l = { "leucine", },
    m = { "methionine", },
    p = { "proline", },
    v = { "valine", },
    w = { "tryptophan", },
    y = { "tyrosine", },
}
--
-- amino acids which normally have one rotamer
-- if other AAs report one rotamer, it may
-- indicate a locked sidechain
--
OneRotamer = {
    a = { "alanine", },
    g = { "glycine", },
}
--
-- end of globals section
--

--
-- begin protNfo Beta package version 0.1
--
-- version 0.1 is packaged as a pseudo-class or pseudo-module
-- containing a mix of data fields and functions
--
-- all entries must be terminated with a comma to keep Lua happy
--
-- the commas aren't necessary if only function definitions are present
--
--
protNfo = {
    aa = {}, -- amino acid codes
    ss = {}, -- secondary structure codes
    atom = {}, -- atom counts
    rot = {}, -- rotamer counts
    phobe = {}, -- hydrophobics
    lock = {}, -- locked segments
    slck = {}, -- locked sidechains
    mute = {}, -- mutable segments

    setNfo = function ()
        for ii = 1, structure.GetCount () do
            protNfo.aa [ #protNfo.aa + 1 ] = structure.GetAminoAcid ( ii )
            protNfo.ss [
                #protNfo.ss + 1 ] = structure.GetSecondaryStructure ( ii )
            protNfo.atom [ #protNfo.atom + 1 ] = structure.GetAtomCount ( ii )
            protNfo.rot [ #protNfo.rot + 1 ] = rotamer.GetCount ( ii )
            protNfo.phobe [ #protNfo.phobe + 1 ] = structure.IsHydrophobic ( ii )
            protNfo.lock [ #protNfo.lock + 1 ] = structure.IsLocked ( ii )
            protNfo.mute [ #protNfo.mute + 1 ] = structure.IsMutable ( ii )
            --
            -- take a stab at those locked sidechains
            --
            local slk = false
            if OneRotamer [ protNfo.aa [ ii ] ] == nil and protNfo.rot [ ii ] <= 1 then
                slk = true
            end
            protNfo.slck [ #protNfo.slck + 1 ] = slk
        end
    end,
}
--
-- end protNfo Beta package version 0.1
--

--
-- function print score by spvincent
--
function round ( x )
    if x == nil then
        return "nil"
    end
    return x - x % kFax
end
--
-- Segment set and list module
-- Notice that most functions assume that the sets are well formed
-- (=ordered and no overlaps)
-- 02-05-2012 TvdL Free to use for non commercial purposes
--
function SegmentListToSet ( list ) -- remove duplicates
    local result = {}
    local f = 0
    local l = -1
    table.sort ( list )
    for ii = 1, #list do
        if list [ ii ] ~= l + 1 and list [ ii ] ~= l then
            -- note: duplicates are removed
            if l > 0 then
                result [ #result + 1 ] = { f, l }
            end
            f = list [ ii ]
        end
        l = list [ ii ]
    end
    if l > 0 then
        result [ #result + 1 ] = { f, l }
    end
    return result
end

function SegmentSetToList ( set ) -- build a list from a set
    local result = {}
    for ii = 1, #set do
        for k = set [ ii ] [ 1 ], set [ ii ] [ 2 ] do
            result [ #result + 1 ] = k
        end
    end
    return result
end

function SegmentCleanSet ( set )
    -- Makes it well formed
    return SegmentListToSet ( SegmentSetToList ( set ) )
end

function SegmentInvertSet ( set, maxseg )
    -- Gives back all segments not in the set
    -- maxseg is added for ligand
    local result={}
    if maxseg == nil then
        maxseg = structure.GetCount ()
    end
    if #set == 0 then
        return { { 1, maxseg } }
    end
    if set [ 1 ] [ 1 ] ~= 1 then
        result [ 1 ] = { 1, set [ 1 ] [ 1 ] - 1 }
    end
    for i = 2, #set do
        result [ #result + 1 ] = { set [ i - 1 ] [ 2 ] + 1, set [ i ] [ 1 ] - 1}
    end
    if set [ #set ] [ 2 ] ~= maxseg then
        result [ #result + 1 ] = { set [ #set ] [ 2 ] + 1, maxseg }
    end
    return result
end

function SegmentInList ( s, list ) -- check whether the segment is in the list
    table.sort ( list )
    for ii = 1, #list do
        if list [ ii ] == s then
            return true
        elseif list [ ii ] > s then
            return false
        end
    end
    return false
end

function SegmentInSet ( set, s ) -- check whether the segment is in the set
    for ii = 1, #set do
        if s >= set [ ii ] [ 1 ] and s <= set [ ii ] [ 2 ] then
            return true
        elseif s < set [ ii ] [ 1 ] then
            return false
        end
    end
    return false
end

function SegmentJoinList ( list1, list2 ) -- merge two segment lists
    local result = list1
    if result == nil then
        return list2
    end
    for ii = 1, #list2 do
        result [ #result + 1 ] = list2 [ ii ]
    end
    table.sort ( result )
    return result
end

function SegmentJoinSet ( set1, set2 ) -- merge (add) two sets
    return SegmentListToSet ( SegmentJoinList ( SegmentSetToList ( set1 ), SegmentSetToList ( set2 ) ) )
end

function SegmentCommList ( list1, list2 ) -- find the intersection of two lists
    local result = {}
    table.sort ( list1 )
    table.sort ( list2 )
    if #list2 == 0 then
        return result
    end
    local j = 1
    for ii = 1, #list1 do
        while list2 [ j ] < list1 [ ii ] do
            j = j + 1
            if j > #list2 then
                return result
            end
        end
        if list1 [ ii ] == list2 [ j ] then
            result [ #result + 1 ] = list1 [ ii ]
        end
    end
    return result
end

function SegmentCommSet ( set1, set2 ) -- intersection of two sets
    return SegmentListToSet ( SegmentCommList ( SegmentSetToList ( set1 ), SegmentSetToList ( set2 ) ) )
end

function SegmentSetMinus ( set1, set2 )
    return SegmentCommSet ( set1, SegmentInvertSet ( set2 ) )
end

function SegmentPrintSet ( set )
    print ( SegmentSetToString ( set ) )
end

function SegmentSetToString ( set ) -- so the set can be printed
    local line = ""
    for ii = 1, #set do
        if ii ~= 1 then
            line = line .. ", "
        end
        line = line .. set [ ii ] [ 1 ] .. "-" ..
            set [ ii ] [ 2 ]
    end
    return line
end

function SegmentSetInSet ( set, sub )
    if sub == nil then
        return true
    end
    -- Checks if sub is a proper subset of set
    for ii = 1, #sub do
        if not SegmentRangeInSet ( set, sub [ ii ] ) then
            return false
        end
    end
    return true
end

function SegmentRangeInSet ( set, range ) -- check whether the range is in the set
    if range == nil or #range == 0 then
        return true
    end
    local bb = range [ 1 ]
    local ee = range [ 2 ]
    for ii = 1, #set do
        if bb >= set [ ii ] [ 1 ] and bb <= set [ ii ] [ 2 ] then
            return ( ee <= set [ ii ] [ 2 ] )
        elseif ee <= set [ ii ] [ 1 ] then
            return false
        end
    end
    return false
end

function SegmentSetToBool ( set ) -- true or false for each segment, usable or not
    local result = {}
    for ii = 1, structure.GetCount () do
        result [ ii ] = SegmentInSet ( set, ii )
    end
    return result
end
--- End of Segment Set module

-- Module Find Segment Types
function FindMutablesList ()
    local result = {}
    for ii = 1, segCnt do
        if protNfo.mute [ ii ] then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindMutables()
    return SegmentListToSet ( FindMutablesList () )
end

function FindFrozenList ()
    local result = {}
    for ii = 1, segCnt2 do
        if freeze.IsFrozen ( ii ) then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindFrozen ()
    return SegmentListToSet ( FindFrozenList () )
end

function FindLockedList ()
    local result = {}
    for ii = 1, segCnt do
        if protNfo.lock [ ii ] then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindLocked ()
    return SegmentListToSet ( FindLockedList () )
end

function FindSLockedList ()
    local result = {}
    for ii = 1, segCnt do
        if protNfo.slck [ ii ] then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindSLocked ()
    return SegmentListToSet ( FindSLockedList () )
end

function FindZeroScoreList ()
    local result = {}
    for ii = 1, segCnt do
        local sub = 0
        for jj = 1, #subScores do
            -- sub = sub + current.GetSegmentEnergySubscore ( ii, subScores [ jj ] [ 1 ] )
            --
            -- special mod for print protein: use subScoreCache
            --
            sub = sub + subScoreCache [ subScores [ jj ] [ 1 ] ] [ ii ]
        end
        if sub == 0 then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindZeroScore ()
    return SegmentListToSet ( FindZeroScoreList () )
end

function FindSelectedList ()
    local result = {}
    for ii = 1, segCnt do
        if selection.IsSelected ( ii ) then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindSelected()
    return SegmentListToSet ( FindSelectedList () )
end

function FindAAtypeList ( aa )
    local result = {}
    for ii = 1, segCnt do
        if protNfo.ss [ ii ] == aa then
            result [ #result + 1 ] = ii
        end
    end
    return result
end

function FindAAtype ( aa )
    return SegmentListToSet ( FindAAtypeList ( aa ) )
end

function FindAminotype ( at ) --NOTE: only this one gives a list not a set
    local result={}
    for ii = 1, segCnt do
        if protNfo.aa [ ii ] == at then
            result [ #result + 1 ] = ii
        end
    end
    return result
end
--
-- end Module Find Segment Types
--

--
-- count segments, check for ligand
--
function GetSeCount ()
    segCnt = structure.GetCount()
    --
    -- standard ligand adjustment
    --
    segCnt2 = segCnt
    while protNfo.ss [ segCnt2 ] == "M" do
        segCnt2 = segCnt2 - 1
    end
    if segCnt2 == segCnt then
        print ( "segment count = " .. segCnt )
    else
        print ( "original segment count = " .. segCnt )
        print ( "adjusted segment count = " .. segCnt2 )
    end
    --
    -- ultra-paranoid method for detecting ligands
    --
    -- each ligand segment is treated separately in this version
    --
    ligandList = {}
    for ii = 1, segCnt do
        if protNfo.ss [ ii ] == "M" then
            local atoms = protNfo.atom [ ii ]
            local rots = protNfo.rot [ ii ]
            local sscor = current.GetSegmentEnergyScore ( ii )
            ligandList [ #ligandList + 1 ] = { ii, atoms, rots, sscor }
        end
    end
    print ( #ligandList .. " ligands" )
    for jj = 1, #ligandList do
        print ( "ligand # " .. jj ..
                ", segment = " .. ligandList [ jj ] [ LLSEG ] ..
                ", atoms = " .. ligandList [ jj ] [ LLATM ] ..
                ", rotamers = " .. ligandList [ jj ] [ LLROT ] ..
                ", score = " ..
                round ( ligandList [ jj ] [ LLSCR ] ) )
        if ligandList [ jj ] [ LLSEG ] < segCnt2 then
            print ( "WARNING: non-standard ligand at segment " ..
                    ligandList [ jj ] [ LLSEG ] ..
                    ", most ligand-aware recipes won't work properly" )
        end
    end
    return ligandList
end
--
-- build segments table
--
function tablesegment ()
    local table = { {} }
    for ii = 1, segCnt do
        table [ ii ] = {}
        local aac = protNfo.aa [ ii ]
        if AminoAcids [ aac ] == nil then
            print ( "WARNING: unknown amino acid \'" .. aac ..
                    "\' at segment " .. ii ..
                    ", code \'x\' substituted" )
            aac = "x"
        end
        table [ ii ] [ TSAA ] = aac
        table [ ii ] [ TSAS ] = AminoAcids [ aac ] [ AASHORT ]
        table [ ii ] [ TSAL ] = AminoAcids [ aac ] [ AALONG ]
        table [ ii ] [ TSHY ] = AminoAcids [ aac ] [ AAHYDROPATHY ]
        table [ ii ] [ TSSS ] = protNfo.ss [ ii ]
        if table [ ii ] [ TSSS ] ~= "H"
        and table [ ii ] [ TSSS ] ~= "E"
        and table [ ii ] [ TSSS ] ~= "L"
        and table [ ii ] [ TSSS ] ~= "M" then
            print ( "WARNING: unknown secondary structure code \'" ..
                    table [ ii ] [ TSSS ] ..
                    "\' at segment " .. ii )
        end
        --
        -- get locked status
        --
        table [ ii ] [ TSLO ] = protNfo.lock [ ii ]
        --
        -- get sidechain locked status
        --
        table [ ii ] [ TSSL ] = protNfo.slck [ ii ]
        --
        -- determine type of segment
        --
        local stype = "P" -- protein (default)
        if RNAcodes [ aac ] ~= nil then
            stype = "R" -- RNA
        elseif DNAcodes [ aac ] ~= nil then
            stype = "D" -- DNA
        elseif table [ ii ] [ TSSS ] == "M" then
            stype = "M" -- other molecule
        end
        table [ ii ] [ TSTY ] = stype
    end
    return table
end

function makeruler ( sBeg, sEnd )
    local function tenpart ( ff )
        if ff >= 100 then
            ff = ff % 100
        end
        return ( ff - ( ff % 10 ) ) / 10
    end
    local function hunpart ( ff )
        if ff >= 1000 then
            ff = ff % 1000
        end
        return ( ff - ( ff % 100 ) ) / 100
    end

    local onez = ""
    local tenz = ""
    local hunz = ""
    local numz = 1
    for ii = sBeg, sEnd do
        onez = onez .. numz % 10
        if ii % 10 == 0 then
            tenz = tenz .. tenpart ( ii )
            if ii >= 100 then
                hunz = hunz .. hunpart ( ii )
            else
                hunz = hunz ..
                    " "
            end
        else
            if ii == sBeg and ii > 1 then
                tenz = tenz .. tenpart ( ii )
                hunz = hunz .. hunpart ( ii )
            else
                tenz = tenz .. " "
                hunz = hunz .. " "
            end
        end
        if ii == segCnt2 then
            tenz = tenz:sub ( 1, tenz:len () - 1 ) .. tenpart ( ii )
            hunz = hunz:sub ( 1, hunz:len () - 1 ) .. hunpart ( ii )
        end
        numz = numz + 1
        if numz > 10 then
            numz = 1
        end
    end
    local ruler = ""
    if sEnd >= 100 then
        ruler = hunz .. "\n"
    end
    if sEnd >= 10 then
        ruler = ruler .. tenz .. "\n"
    end
    ruler = ruler .. onez
    return ruler
end

function linotype ( line )
    if line:len () > jMaxL then
        print ( line )
        print ( "" )
    end
    for ii = 1, segCnt, jMaxL do
        local lastseg = math.min ( ii + jMaxL - 1, segCnt )
        print ( makeruler ( ii, lastseg ) )
        print ( line:sub ( ii, lastseg ) )
        print ( "" )
    end
end
--
-- tlaloc functions to print sequence of letter
--
function BuildSequence ( table )
    local seqstring = ""
    local hydrostring = ""
    local strucstring = ""
    local stypestring = ""
    local lockstring = ""
    local slockstring = ""
    local nonp = 0
    local locks = 0
    local slocks = 0

    for ii = 1, segCnt do
        local aac = table [ ii ] [ TSAA ]
        if aac:len () > 1 then
            nonp = nonp + 1
            aac = aac:sub ( aac:len () )
        end
        seqstring = seqstring .. aac
        if protNfo.phobe [ ii ] then
            hydrostring = hydrostring .. "i"
        else
            hydrostring = hydrostring .. "e"
        end
        strucstring = strucstring .. table [ ii ] [ TSSS ]
        if table [ ii ] [ TSSS ] == "M" then
            nonp = nonp + 1
        end
        stypestring = stypestring .. table [ ii ] [ TSTY ]
        local locked = "U"
        if table [ ii ] [ TSLO ] then
            locked = "L"
            locks = locks + 1
        end
        lockstring = lockstring .. locked
        local slocked = "U"
        if table [ ii ] [ TSSL ] then
            slocked = "L"
            slocks = slocks + 1
        end
        slockstring = slockstring ..
            slocked
    end
    print ( "primary structure sequence (single-letter amino acid codes for searching PDB)" )
    linotype ( seqstring )
    if nonp > 0 then
        print ( "CAUTION: non-protein entries found, consult types below" )
        print ( "" )
    end
    print ( "type of each segment (P = protein, R = RNA, D = DNA, M = molecule (ligand))" )
    linotype ( stypestring )
    print ( "sequence with i if hydrophobic" )
    linotype ( hydrostring )
    print ( "secondary structure sequence" )
    linotype ( strucstring )
    if locks > 0 then
        print ( "locked backbone segments" )
        linotype ( lockstring )
    end
    if slocks > 0 then
        print ( "locked sidechain segments" )
        linotype ( slockstring )
    end
    print ( "--" )
    return seqstring, hydrostring, strucstring, stypestring,
           lockstring, slockstring, nonp, locks, slocks
end
--
-- find modifiable sections
--
function FindModifiable ()
    --
    local flocked = FindLocked ()
    --print ( #flocked .. " locked sections" )
    for kk = 1, #flocked do
        print ( "locked section " .. kk .. ": " ..
                flocked [ kk ] [ 1 ] .. "-" .. flocked [ kk ] [ 2 ] )
    end
    --
    local funlocked = SegmentInvertSet ( flocked )
    --print ( #funlocked .. " unlocked sections" )
    for kk = 1, #funlocked do
        print ( "unlocked section " .. kk .. ": " ..
                funlocked [ kk ] [ 1 ] .. "-" .. funlocked [ kk ] [ 2 ] )
    end
    --
    local fslocked = FindSLocked ()
    --print ( #fslocked .. " locked sidechain sections" )
    for kk = 1, #fslocked do
        print ( "locked sidechain section " .. kk .. ": " ..
                fslocked [ kk ] [ 1 ] .. "-" .. fslocked [ kk ] [ 2 ] )
    end
    --
    local fsunlocked = SegmentInvertSet ( fslocked )
    --print ( #fsunlocked .. " unlocked sidechain sections" )
    for kk = 1, #fsunlocked do
        print ( "unlocked sidechain section " .. kk .. ": " ..
                fsunlocked [ kk ] [ 1 ] .. "-" .. fsunlocked [ kk ] [ 2 ] )
    end
    --
    local ulckblcks = SegmentCommSet ( funlocked, fslocked )
    --print ( #ulckblcks .. " unlocked backbone, locked sidechain" )
    for kk = 1, #ulckblcks do
        print ( "unlocked backbone, locked sidechain section " .. kk .. ": " ..
                ulckblcks [ kk ] [ 1 ] .. "-" ..
                ulckblcks [ kk ] [ 2 ] )
    end
    --
    local zeroScore = FindZeroScore ()
    --print ( #zeroScore .. " zero score sections" )
    for kk = 1, #zeroScore do
        print ( "zero score section " .. kk .. ": " ..
                zeroScore [ kk ] [ 1 ] .. "-" .. zeroScore [ kk ] [ 2 ] )
    end
    --
    if #flocked == 1 and #zeroScore > 0 then
        local LockedNZ = SegmentInvertSet ( zeroScore, flocked [ 1 ] [ 2 ] )
        for kk = 1, #LockedNZ do
            print ( "locked, non-zero score section " .. kk .. ": " ..
                    LockedNZ [ kk ] [ 1 ] .. "-" .. LockedNZ [ kk ] [ 2 ] )
        end
        if flocked [ 1 ] [ 1 ] == 1 then
            segStart = math.min ( flocked [ 1 ] [ 2 ] + 1, segCnt2 )
        end
    end
    --
    local mutables = FindMutables ()
    --print ( #mutables .. " mutable sections" )
    for kk = 1, #mutables do
        print ( "mutable section " .. kk .. ": " ..
                mutables [ kk ] [ 1 ] .. "-" .. mutables [ kk ] [ 2 ] )
    end
    --
    local lockmut = SegmentCommSet ( flocked, mutables )
    --print ( #lockmut .. " locked, mutable sections" )
    for kk = 1, #lockmut do
        print ( "locked, mutable section " .. kk .. ": " ..
                lockmut [ kk ] [ 1 ] .. "-" .. lockmut [ kk ] [ 2 ] )
    end
end
--
-- find mutable segments
--
function FindMutable ( table )
    local mutable = {}
    local mutablestring = ''
    for ii = 1, segCnt2 do
        if protNfo.mute [ ii ] == true then
            mutable [ #mutable + 1 ] = ii
        end
    end
    print ( #mutable .. " mutables found" )
    if #mutable > 0 then
        print ( "n" .. delim .. "segment" .. delim .. "aacode" .. delim .. "aaname" )
    end
    for ii = 1, #mutable do
        print ( ii .. delim ..
                mutable[ii] .. delim ..
                table [ mutable [ ii ] ] [ 1 ] .. delim ..
                table [ mutable [ ii ] ] [ 3 ] )
        mutablestring = mutablestring .. "'" .. table [ mutable [ ii ] ] [ 1 ] .. "',"
    end
    print ( mutablestring ) --- for copy paste on other recipe
    return mutable
end
--
-- function FindActiveSubscores adapted from EDRW by Timo van der Laan
--
-- This function should be called first
-- it saves segment scores and subscores
-- in segScoreCache and subScoreCache.
--
function FindActiveSubscores ( show )
    local result = {}
    local gTotal = 0 -- grand total all subscores
    local gTotalS = 0 -- grand total all segment subscores

    local Subs = puzzle.GetPuzzleSubscoreNames ()
    for ii = 1, #Subs do
        subScoreCache [ Subs [ ii ] ] = {} -- save for later
        local total = 0
        local abstotal = 0
        local part
        for jj = 1, segCnt do
            part = current.GetSegmentEnergySubscore ( jj, Subs [ ii ] )
            subScoreCache [ Subs [ ii ] ] [ jj ] = part
            total = total + part
            abstotal = abstotal + math.abs ( part )
        end
        if abstotal > 10 then
            result [ #result + 1 ] = { Subs [ ii ], total, true }
            gTotal = gTotal + total
            if show then
                print ( "active subscore " .. #result .. ": " ..
                        Subs [ ii ] .. ", total = " .. round ( total ) )
            end
        end
    end
    for ii = 1, segCnt do
        segScoreCache [ #segScoreCache + 1 ] = current.GetSegmentEnergyScore ( ii )
        gTotalS = gTotalS + segScoreCache [ ii ]
    end
    if show then
        print ( #result .. " active subscores" )
    end
    if show then
        print ( "total of all subscores: " .. round ( gTotal ) )
    end
    if show then
        print ( "total of all segment scores: " .. round ( gTotalS ) )
    end
    if ( round ( gTotal ) ~= round ( gTotalS ) ) then
        print ( "WARNING: total subscores " .. round ( gTotal ) ..
                " not equal total segment scores " .. round ( gTotalS ) )
    end
    return result, gTotal, gTotalS
end
--
-- print segment scores in spreadsheet format
--
-- table - segment table
-- subScores - selected subscores
-- nonprot - number of non-protein segments
-- locked - number of locked segments
-- slocked - number of locked sidechains
--
function SegScores ( table, subScores, nonprot, locked, slocked )
    local tReport = ""
    local headStr = "n" .. delim .. "ID" .. delim .. "SS" .. delim
    if nonprot > 0 then
        headStr = headStr .. "type" .. delim
    end
    if kLongnm then
        headStr = headStr .. "name" .. delim
    end
    if kAbbrev then
        headStr = headStr .. "abbrev" .. delim
    end
    if locked > 0 or slocked > 0 then
        headStr = headStr .. "lock" .. delim
    end
    if kHydro then
        headStr = headStr .. "Hyd" ..
            delim
    end
    if kAtom then
        headStr = headStr .. "atoms" .. delim
    end
    if kRotamer then
        headStr = headStr .. "rotamers" .. delim
    end
    headStr = headStr .. "score" .. delim
    for ii = 1, #subScores do
        if subScores [ ii ] [ 3 ] then
            headStr = headStr .. subScores [ ii ] [ 1 ] .. delim
        end
    end
    print ( "--" )
    print ( "\"segment scores\"" )
    print ( headStr )
    tReport = tReport .. "\"segment scores\"\n" .. headStr .. "\n"

    local tSegEnergy = 0
    for ii = 1, segCnt do
        local segEnergy = segScoreCache [ ii ]
        tSegEnergy = tSegEnergy + segEnergy
        local segScore = ii .. delim ..
                         table [ ii ] [ TSAA ] .. delim ..
                         table [ ii ] [ TSSS ] .. delim
        if nonprot > 0 then
            segScore = segScore .. table [ ii ] [ TSTY ] .. delim
        end
        if kLongnm then
            segScore = segScore .. table [ ii ] [ TSAL ] .. delim
        end
        if kAbbrev then
            segScore = segScore .. table [ ii ] [ TSAS ] .. delim
        end
        if locked > 0 or slocked > 0 then
            local ll = "U"
            if table [ ii ] [ TSLO ] then
                ll = "L"
            end
            if table [ ii ] [ TSSL ] then
                ll = ll .. " L"
            else
                ll = ll .. " U"
            end
            segScore = segScore .. ll .. delim
        end
        if kHydro then
            segScore = segScore .. round ( table [ ii ] [ TSHY ] ) .. delim
        end
        if kAtom then
            segScore = segScore .. protNfo.atom [ ii ] .. delim
        end
        if kRotamer then
            segScore = segScore .. protNfo.rot [ ii ] .. delim
        end
        segScore = segScore .. round ( segEnergy ) .. delim
        for jj = 1, #subScores do
            if subScores [ jj ] [ 3 ] then
                segScore = segScore .. round ( subScoreCache [ subScores [ jj ] [ 1 ] ] [ ii ] ) .. delim
            end
        end
        print ( segScore )
        tReport = tReport .. segScore .. "\n"
    end

    local footstr = "totals" .. delim .. "" .. delim .. "" .. delim
    if nonprot > 0 then
        footstr = footstr .. delim -- no totals for type
    end
    if kLongnm then
        footstr = footstr .. delim -- no totals for long name
    end
    if kAbbrev then
        footstr = footstr .. delim -- no totals for abbreviation
    end
    if locked > 0 or slocked > 0 then
        footstr = footstr .. delim -- no totals for locked
    end
    if kHydro then
        footstr = footstr ..
            delim -- no totals for hydropathy
    end
    if kAtom then
        footstr = footstr .. delim -- no totals for atoms
    end
    if kRotamer then
        footstr = footstr .. delim -- no totals for rotamers
    end
    footstr = footstr .. round ( tSegEnergy ) .. delim
    for ii = 1, #subScores do
        if subScores [ ii ] [ 3 ] then
            footstr = footstr .. round ( subScores [ ii ] [ 2 ] ) .. delim
        end
    end
    print ( footstr )
    tReport = tReport .. footstr .. "\n"
    return tReport
end
--
-- print density analysis
--
function DensityRat ( table )
    local tReport = ""
    local headStr = "\"AA code\"" .. delim ..
                    "\"AA name\"" .. delim ..
                    "\"segment count\"" .. delim ..
                    "\"total density\"" .. delim ..
                    "\"% total density\"" .. delim ..
                    "\"mean density\"" .. delim ..
                    "\"worst density\"" .. delim ..
                    "\"worst density seg\"" .. delim ..
                    "\"best density\"" .. delim ..
                    "\"best density seg\"" .. delim
    --
    -- AA density table - keyed by AA code
    --
    DENCOUNT = 1
    DENTOTAL = 2
    DENMEAN = 3
    DENBEST = 4
    DENBESTS = 5
    DENWORST = 6
    DENWORSTS = 7
    --
    -- density by amino acid
    --
    local tAA = {}
    --
    -- binary (true/false) tables for density by AA type
    --
    --
    -- density of aromatics - true => aromatic
    --
    local tAromatic = {}
    tAromatic [ true ] = { 0, 0, 0, -999, 0, 999, 0, }
    tAromatic [ false ] = { 0, 0, 0, -999, 0, 999, 0, }
    --
    -- density of aliphatics - true => aliphatic
    --
    local tAliphatic = {}
    tAliphatic [ true ] = { 0, 0, 0, -999, 0, 999, 0, }
    tAliphatic [ false ] = { 0, 0, 0, -999, 0, 999, 0, }
    --
    -- density of hydrophobics - true => hydrophobic
    --
    local tHydrophobic = {}
    tHydrophobic [ true ] = { 0, 0, 0, -999, 0, 999, 0, }
    tHydrophobic [ false ] = { 0, 0, 0, -999, 0, 999, 0, }

    local function denUpdate ( tDen, segDensity, segindx )
        tDen [ DENCOUNT ] = tDen [ DENCOUNT ] + 1
        tDen [ DENTOTAL ] = tDen [ DENTOTAL ] + segDensity
        if segDensity > tDen [ DENBEST ] then
            tDen [ DENBEST ] = segDensity
            tDen [ DENBESTS ] = segindx
        end
        if segDensity < tDen [ DENWORST ] then
            tDen [ DENWORST ] = segDensity
            tDen [ DENWORSTS ] = segindx
        end
    end

    local tSegDensity = 0
    for ii = 1, segCnt do
        local aaCode = table [ ii ] [ 1 ]
        if tAA [ aaCode ] == nil then
            tAA [ aaCode ] = { 0, 0, 0, -999, 0, 999, 0, }
        end
        --
        -- update table of density by amino acid
        --
        local segDensity = subScoreCache [ "Density" ] [ ii ]
        tSegDensity = tSegDensity + segDensity
        local aaDen = tAA [ aaCode ]
        if aaDen ~= nil then
            denUpdate ( aaDen, segDensity, ii )
        else
            print ( "ERROR: AA density table entry for " .. aaCode .. " is nil" )
        end
        --
        -- update table of density by aromatic vs. non-aromatic
        --
        local aromDen = tAromatic [ Aromatic [ aaCode ] ~= nil ]
        if aromDen ~= nil then
            denUpdate ( aromDen, segDensity, ii )
        else
            print ( "ERROR: Aromatic density table entry for " .. aaCode .. " is nil" )
        end
        --
        -- update table of density by aliphatic vs. non-aliphatic
        --
        local alipDen = tAliphatic [ Aliphatic [ aaCode ] ~= nil ]
        if alipDen ~= nil then
            denUpdate ( alipDen, segDensity, ii )
        else
            print ( "ERROR: Aliphatic density table entry for " .. aaCode .. " is nil" )
        end
        --
        -- update table of density by hydrophobic vs. non-hydrophobic
        --
        local phobDen = tHydrophobic [ Hydrophobic [ aaCode ] ~= nil ]
        if phobDen ~= nil then
            denUpdate ( phobDen, segDensity, ii )
        else
            print ( "ERROR: hydrophobic density table entry for " .. aaCode .. " is nil" )
        end
    end

    print ( "--" )
    print ( "\"density by AA\"" )
    print ( headStr )
    tReport = tReport .. "\"density by AA\"\n" .. headStr .. "\n"
    for aac, aaDen in pairs ( tAA ) do
        if aaDen [ DENCOUNT ] > 0 then
            aaDen [ DENMEAN ] = aaDen [ DENTOTAL ] / aaDen [ DENCOUNT ]
        end
        local denline = aac .. delim ..
                        AminoAcids [ aac ] [ AALONG ] .. delim ..
                        aaDen [ DENCOUNT ] .. delim ..
                        round ( aaDen [ DENTOTAL ] ) .. delim ..
                        round ( ( aaDen [ DENTOTAL ] / tSegDensity ) * 100 ) .. delim ..
                        round ( aaDen [ DENMEAN ] ) .. delim ..
                        round ( aaDen [ DENWORST ] ) .. delim ..
                        aaDen [ DENWORSTS ] .. delim ..
                        round ( aaDen [ DENBEST ] ) .. delim ..
                        aaDen [ DENBESTS ]
        print ( denline )
        tReport = tReport .. denline .. "\n"
    end
    local footstr = "totals" .. delim ..
                    segCnt .. delim ..
                    round ( tSegDensity ) .. delim .. delim ..
                    round ( tSegDensity / segCnt )
    print ( footstr )
    tReport = tReport .. footstr .. "\n"

    headStr = "\"aromatic AA\"" .. delim ..
              "" .. delim ..
              "\"segment count\"" .. delim ..
              "\"total density\"" .. delim ..
              "\"% total density\"" .. delim ..
              "\"mean density\"" .. delim ..
              "\"worst density\"" .. delim ..
              "\"worst density seg\"" .. delim ..
              "\"best density\"" .. delim ..
              "\"best density seg\"" .. delim
    print ( "--" )
    print ( "\"density by aromatic vs. non-aromatic\"" )
    print ( headStr )
    tReport = tReport .. "\"density by aromatic vs. non-aromatic\"\n" .. headStr .. "\n"
    for aac, aaDen in pairs ( tAromatic ) do
        if aaDen [ DENCOUNT ] > 0 then
            aaDen [ DENMEAN ] = aaDen [ DENTOTAL ] / aaDen [ DENCOUNT ]
        end
        local denline = tostring ( aac ) .. delim ..
                        "" .. delim ..
                        aaDen [ DENCOUNT ] .. delim ..
                        round ( aaDen [ DENTOTAL ] ) .. delim ..
                        round ( ( aaDen [ DENTOTAL ] / tSegDensity ) * 100 ) .. delim ..
                        round ( aaDen [ DENMEAN ] ) .. delim ..
                        round ( aaDen [ DENWORST ] ) .. delim ..
                        aaDen [ DENWORSTS ] .. delim ..
                        round ( aaDen [ DENBEST ] ) .. delim ..
                        aaDen [ DENBESTS ]
        print ( denline )
        tReport = tReport .. denline .. "\n"
    end

    headStr = "\"aliphatic AA\"" .. delim ..
              "" .. delim ..
              "\"segment count\"" .. delim ..
              "\"total density\"" .. delim ..
              "\"% total density\"" .. delim ..
              "\"mean density\"" .. delim ..
              "\"worst density\"" .. delim ..
              "\"worst density seg\"" .. delim ..
              "\"best density\"" .. delim ..
              "\"best density seg\"" .. delim
    print ( "--" )
    print ( "\"density by aliphatic vs. non-aliphatic\"" )
    print ( headStr )
    tReport = tReport .. "\"density by aliphatic vs. non-aliphatic\"\n" .. headStr .. "\n"
    for aac, aaDen in pairs ( tAliphatic ) do
        if aaDen [ DENCOUNT ] > 0 then
            aaDen [ DENMEAN ] = aaDen [ DENTOTAL ] / aaDen [ DENCOUNT ]
        end
        local denline = tostring ( aac ) .. delim ..
                        "" .. delim ..
                        aaDen [ DENCOUNT ] .. delim ..
                        round ( aaDen [ DENTOTAL ] ) .. delim ..
                        round ( ( aaDen [ DENTOTAL ] / tSegDensity ) * 100 ) .. delim ..
                        round ( aaDen [ DENMEAN ] ) .. delim ..
                        round ( aaDen [ DENWORST ] ) .. delim ..
                        aaDen [ DENWORSTS ] .. delim ..
                        round ( aaDen [ DENBEST ] ) .. delim ..
                        aaDen [ DENBESTS ]
        print ( denline )
        tReport = tReport .. denline .. "\n"
    end

    headStr = "\"hydrophobic AA\"" .. delim ..
              "" .. delim ..
              "\"segment count\"" .. delim ..
              "\"total density\"" .. delim ..
              "\"% total density\"" .. delim ..
              "\"mean density\"" .. delim ..
              "\"worst density\"" .. delim ..
              "\"worst density seg\"" .. delim ..
              "\"best density\"" .. delim ..
              "\"best density seg\"" .. delim
    print ( "--" )
    print ( "\"density by hydrophobic vs. non-hydrophobic\"" )
    print ( headStr )
    tReport = tReport .. "\"density by hydrophobic vs. non-hydrophobic\"\n" .. headStr .. "\n"
    for aac, aaDen in pairs ( tHydrophobic ) do
        if aaDen [ DENCOUNT ] > 0 then
            aaDen [ DENMEAN ] = aaDen [ DENTOTAL ] / aaDen [ DENCOUNT ]
        end
        local denline = tostring ( aac ) .. delim ..
                        "" .. delim ..
                        aaDen [ DENCOUNT ] .. delim ..
                        round ( aaDen [ DENTOTAL ] ) .. delim ..
                        round ( ( aaDen [ DENTOTAL ] / tSegDensity ) * 100 ) .. delim ..
                        round ( aaDen [ DENMEAN ] ) .. delim ..
                        round ( aaDen [ DENWORST ] ) .. delim ..
                        aaDen [ DENWORSTS ] .. delim ..
                        round ( aaDen [ DENBEST ] ) .. delim ..
                        aaDen [ DENBESTS ]
        print ( denline )
        tReport = tReport .. denline .. "\n"
    end

    print ( "--" )
    print ( "\"density deviation (above/below mean density by AA)\"" )
    local dendeviate = ""
    for ii = 1, segCnt do
        local aaCode = table [ ii ] [ 1 ]
        local segDensity = subScoreCache [ "Density" ] [ ii ]
        local jDenMean = tAA [ aaCode ] [ DENMEAN ]
        if round ( segDensity ) == round ( jDenMean ) or tAA [ aaCode ] [ DENCOUNT ] == 1 then
            dendeviate = dendeviate .. "="
        elseif segDensity < jDenMean then
            dendeviate = dendeviate .. "-"
        else
            dendeviate = dendeviate ..
"+" end end linotype ( dendeviate ) return tReport, dendeviate end -- -- GetStruct -- return a list of structures of a specified type -- -- adapted from spvincent's Helix Rebuild -- function GetStruct ( structT ) local within_struct = false local structList = {} local structStart = 0 local structLast = 0 local structScr = 0 for ii = 1, segCnt do if ( protNfo.ss [ ii ] == structT ) then if ( within_struct == false ) then -- start of a new struct within_struct = true structStart = ii structScr = 0 end structLast = ii if ii <= #segScoreCache then structScr = structScr + segScoreCache [ ii ] else structScr = structScr + current.GetSegmentEnergyScore ( ii ) end elseif ( within_struct == true ) then -- end of a struct within_struct = false structList [ #structList + 1 ] = { structT, structStart, structLast, false, structScr } end end if ( within_struct == true ) then structList [ #structList + 1 ] = { structT, structStart, structLast, false, structScr } end return structList end -- -- function to print a little contact table -- function contact ( structs ) local head = delim .. delim local first = true for s = 1, #structs do if structs [ s ] [ STRCTTYP ] == "H" or structs [ s ] [ STRCTTYP ] == "E" then local string = structs [ s ] [ STRCTTYP ] .. delim .. structs [ s ] [ STRCTSTR ] .. delim .. structs [ s ] [ STRCTEND ] for s2 = 1, #structs do if structs [ s2 ] [ STRCTTYP ] == "H" or structs [ s2 ] [ STRCTTYP ] == "E" then if first then head = head .. delim .. 
structs [ s2 ] [ STRCTTYP ] end local mean = 0 local nb = 0 for i = structs [ s ] [ STRCTSTR ], structs [ s ] [ STRCTEND ] do local min = 999999 for j = structs [ s2 ] [ STRCTSTR ], structs [ s2 ] [ STRCTEND ] do dist = structure.GetDistance ( i, j ) if dist < min then min = dist end end mean = mean + min nb = nb + 1 end mean = mean / nb local c = " " if structure.GetDistance ( structs [ s ] [ STRCTSTR ], structs [ s2 ] [ STRCTEND ] ) < structure.GetDistance ( structs [ s ] [ STRCTSTR ], structs [ s2 ] [ STRCTSTR ] ) then if mean < 5 then c = 'X' elseif mean < 10 then c = 'x' end else if mean < 5 then c = 'O' elseif mean < 10 then c = 'o' end end string = string .. delim .. c end end if first then print ( "--" ) print ( "\"mini contact table\"" ) print ( head ) first = false end print ( string ) end end end function ShowReport ( tReport, seq, struc ) local ask = dialog.CreateDialog ( ReVersion .. " copy-and-paste" ) ask.l15 = dialog.AddLabel ( "Click inside the one of the text boxes, then" ) ask.l16 = dialog.AddLabel ( "use ctrl+a (command+a on Mac) to select all," ) ask.l20 = dialog.AddLabel ( "and control+c or command+c to copy, then" ) ask.l30 = dialog.AddLabel ( "paste into spreadsheet" ) ask.l10 = dialog.AddLabel ( "---- segment subscores report ----" ) ask.rep = dialog.AddTextbox ( "subscores:", tReport ) ask.SEQN = dialog.AddLabel ( "---- primary (sequence) and secondary structure detail ----" ) ask.seq = dialog.AddTextbox ( "AA sequence:", seq ) ask.struc = dialog.AddTextbox ( "secondary:", struc ) ask.OK = dialog.AddButton ( "OK", 1 ) dialog.Show ( ask ) end function ShowDensityReport ( dReport, dendev ) local ask = dialog.CreateDialog ( ReVersion .. 
" copy-and-paste" ) ask.l15 = dialog.AddLabel ( "Click inside the one of the text boxes, then" ) ask.l16 = dialog.AddLabel ( "use ctrl+a (command+a on Mac) to select all," ) ask.l20 = dialog.AddLabel ( "and control+c or command+c to copy, then" ) ask.l30 = dialog.AddLabel ( "paste into spreadsheet" ) ask.l50 = dialog.AddLabel ( "---- density analysis ----" ) ask.dens = dialog.AddTextbox ( "density:", dReport ) ask.l70 = dialog.AddLabel ( "---- density deviation from mean AA density ----" ) ask.dev = dialog.AddTextbox ( "deviation:", dendev ) ask.OK = dialog.AddButton ( "OK", 1 ) dialog.Show ( ask ) end function GetParms ( scrParts, ligands ) local ask = dialog.CreateDialog ( ReVersion ) local kRet = 0 repeat if segcnt == segcnt2 then ask.segcnt = dialog.AddLabel ( segCnt2 .. " segments" ) else ask.segcnt = dialog.AddLabel ( segCnt .. " segments" ) ask.segcnt2 = dialog.AddLabel ( segCnt2 .. " segments, adjusted for ligands" ) end if #ligands > 0 then ask.ligands = dialog.AddLabel ( #ligands .. " ligand section(s), see scriptlog for details" ) end ask.NRGE = dialog.AddLabel ( "---- score information ----" ) --[[ BoxScore = { tSubScores = 0, -- total of active subscores, all segments tSegScores = 0, -- total of segment scores tScoreFilt = 0, -- total score, filters on tScoreFOff = 0, -- total score, filters off tScoreNrgy = 0, -- total energy score tScoreBonus = 0, -- total filter bonus tScoreForm = 0, -- subscores + filter bonus + 8000 tScoreDark = 0, -- total "dark" score } ]]-- ask.active = dialog.AddLabel ( #scrParts .. " active subscores" ) ask.tSubScores = dialog.AddLabel ( "total of all subscores = " .. round ( BoxScore.tSubScores ) ) ask.tScoreFilt = dialog.AddLabel ( "current score = " .. round ( BoxScore.tScoreFilt ) ) if BoxScore.tScoreBonus ~= 0 then ask.tScoreBonus = dialog.AddLabel ( "filter bonus = " .. 
round ( BoxScore.tScoreBonus ) ) else ask.tScoreBonus = dialog.AddLabel ( "no filter bonus" ) end ask.tScoreForm = dialog.AddLabel ( "subscores + filter bonus + 8000 = " .. round ( BoxScore.tScoreForm ) ) ask.tScoreDark = dialog.AddLabel ( "\"dark\" points = " .. round ( BoxScore.tScoreDark ) ) ask.DETAIL = dialog.AddLabel ( "---- subscore reporting options ----" ) for ii = 1, #scrParts do ask [ scrParts [ ii ] [ 1 ] ] = dialog.AddCheckbox ( scrParts [ ii ] [ 1 ] .. ": " .. round ( scrParts [ ii ] [ 2 ] ) .. " (" .. round ( scrParts [ ii ] [ 2 ] / segCnt ) .. " / seg) ", scrParts [ ii ] [ 3 ] ) end ask.SECT = dialog.AddLabel ( "---- report sections ----" ) ask.kContact = dialog.AddCheckbox ( "mini contact table", kContact ) ask.kMutDet = dialog.AddCheckbox ( "mutable details", kMutDet ) if jDensity then ask.kDensity = dialog.AddCheckbox ( "density analysis", kDensity ) end ask.OK = dialog.AddButton ( "OK", 1 ) ask.more = dialog.AddButton ( "More", 2 ) ask.Cancel = dialog.AddButton ( "Cancel", 0 ) kRet = dialog.Show ( ask ) if kRet > 0 then local aPart = 0 for ii = 1, #scrParts do scrParts [ ii ] [ 3 ] = ask [ scrParts [ ii ] [ 1 ] ].value aPart = aPart + 1 end -- -- select all if nothing selected -- if aPart == 0 then for ii = 1, #scrParts do scrParts [ ii ] [ 3 ] = true end end kContact = ask.kContact.value kMutDet = ask.kMutDet.value if jDensity then kDensity = ask.kDensity.value end if kRet == 2 then GetMoreParms () end end until kRet < 2 if kRet == 1 then return true, scrParts end return false, scrParts end function GetMoreParms ( ) local ask = dialog.CreateDialog ( ReVersion ) ask.DETAIL = dialog.AddLabel ( "---- additional columns ----" ) ask.kLongnm = dialog.AddCheckbox ( "long name", kLongnm ) ask.kAbbrev = dialog.AddCheckbox ( "abbreviation", kAbbrev ) ask.kHydro = dialog.AddCheckbox ( "hydropathy index", kHydro ) ask.kAtom = dialog.AddCheckbox ( "atom count", kAtom ) ask.kRotamer = dialog.AddCheckbox ( "rotamer count", kRotamer ) ask.FORMAT = 
dialog.AddLabel ( "---- formatting options ----" ) ask.kRound = dialog.AddSlider ( "decimal places:", kRound, 1, 8, 0 ) ask.ddlm = dialog.AddLabel ( "delimiters (last selected wins)" ) ask.dtab = dialog.AddCheckbox ( "tab", dtab ) ask.dcomma = dialog.AddCheckbox ( "comma", dcomma ) ask.dsemic = dialog.AddCheckbox ( "semicolon", dsemic ) ask.OK = dialog.AddButton ( "OK", 1 ) ask.Cancel = dialog.AddButton ( "Cancel", 0 ) local kRet = dialog.Show ( ask ) if kRet > 0 then kLongnm = ask.kLongnm.value kAbbrev = ask.kAbbrev.value kHydro = ask.kHydro.value kAtom = ask.kAtom.value kRotamer = ask.kRotamer.value kRound = ask.kRound.value kFax = 10 ^ -kRound dtab = ask.dtab.value if dtab then delim = "\t" end dcomma = ask.dcomma.value if dcomma then delim = "," end dsemic = ask.dsemic.value if dsemic then delim = ";" end return true end return false end -- -- Ident - print identifying information at beginning and end of recipe -- -- slugline - first line to print - normally recipe name and version -- function Ident ( slugline ) local function round ( ii ) return ii - ii % 0.001 end print ( slugline ) print ( "Puzzle: " .. puzzle.GetName () .. " (" .. puzzle.GetPuzzleID () .. ")" ) print ( "Track: " .. ui.GetTrackName () ) gname = user.GetGroupName () if gname ~= nil then gname = " (" .. gname .. ")" else gname = "" end print ( "User: " .. user.GetPlayerName () .. gname ) local scoretype = scoreboard.GetScoreType () local scort = "" if scoretype == 0 then scort = "soloist" elseif scoretype == 1 then scort = "evolver" elseif scoretype == 2 then scort = "all hands" elseif scoretype == 3 then scort = "no score" else scort = "unknown/error" end print ( "Rank: " .. scoreboard.GetRank ( scoretype ) .. " (" .. scort .. ")" ) local sGroup = scoreboard.GetGroupScore () if sGroup ~= nil then print ( "Group rank / score: " .. scoreboard.GetGroupRank () .. " / " .. 
round ( 10 * ( 800 - sGroup ) ) ) end end function main () save.Quicksave ( WAYBACK ) print ( "--" ) Ident ( ReVersion ) -- -- get protein info -- print ( "collecting protein info, please wait" ) protNfo.setNfo () print ( "--" ) print ( "collecting score info, please wait" ) print ( "--" ) -- -- count segments and search for ligand -- local ligands = GetSeCount () -- -- determine which subscores are active -- print ( "--" ) print ( "score information" ) --[[ BoxScore = { tSubScores = 0, -- total of active subscores, all segments tSegScores = 0, -- total of segment scores tScoreFilt = 0, -- total score, filters on tScoreFOff = 0, -- total score, filters off tScoreNrgy = 0, -- total energy score tScoreBonus = 0, -- total filter bonus tScoreForm = 0, -- subscores + filter bonus + 8000 tScoreDark = 0, -- total "dark" score } ]]-- -- -- FindActiveSubscores should be called first, to build subScoreCache -- subScores, BoxScore.tSubScores, BoxScore.tSegScores = FindActiveSubscores ( true ) -- -- special check for "Density" subscore -- for ii = 1, #subScores do if subScores [ ii ] [ 1 ] == "Density" then jDensity = true -- density present kDensity = true -- select density report by default break end end behavior.SetFiltersDisabled ( false ) BoxScore.tScoreFilt = current.GetEnergyScore () behavior.SetFiltersDisabled ( true ) BoxScore.tScoreFOff = current.GetEnergyScore () BoxScore.tScoreBonus = BoxScore.tScoreFilt - BoxScore.tScoreFOff if fBonus ~= 0 then print ( "current filter bonus = " .. round ( BoxScore.tScoreBonus ) ) end behavior.SetFiltersDisabled ( false ) BoxScore.tScoreForm = BoxScore.tSubScores + BoxScore.tScoreBonus + 8000 print ( "subscores + filter bonus + 8000 = " .. round ( BoxScore.tScoreForm ) ) BoxScore.tScoreDark = BoxScore.tScoreFilt - BoxScore.tScoreForm print ( "current score = " .. round ( BoxScore.tScoreFilt ) ) print ( "\"dark\" points = " .. round ( BoxScore.tScoreDark ) ) local sRosetta = scoreboard.GetScore () print ( "Rosetta energy score = ".. 
round ( sRosetta ) ) local sRosecon = 10 * ( 800 - sRosetta ) print ( "converted Rosetta score = " .. round ( sRosecon ) ) print ( "--" ) print ( "sequence information" ) -- -- build table of non-ligand segments -- local segTable = tablesegment () -- -- sequence of letters, hydrophobes, structures, locks -- local seq, hydro, struc, types, locks, slocks, nonprot, locked, slocked = BuildSequence ( segTable ) -- -- find modifiable sections -- print ( "--" ) print ( "modifiable sections" ) FindModifiable () -- -- get details for subscore report -- local go, selScores = GetParms ( subScores, ligands ) if go then -- -- print segment scores -- local tReport = SegScores ( segTable, selScores, nonprot, locked, slocked ) -- -- print density analysis -- local dReport = nil local dendev = nil if jDensity then dReport, dendev = DensityRat ( segTable ) end -- -- detect structures -- local helixList = GetStruct ( "H" ) local sheetList = GetStruct ( "E" ) local structList = {} for ii = 1, #helixList do structList [ #structList + 1 ] = helixList [ ii ] end for ii = 1, #sheetList do structList [ #structList + 1 ] = sheetList [ ii ] end if kContact then -- -- mini contact table -- contact ( structList ) end -- -- find mutable segments and print them -- if kMutDet then print ( "--" ) print ( "mutable segments" ) FindMutable ( segTable ) end -- -- show reports for copy and paste -- ShowReport ( tReport, seq, struc ) if kDensity then ShowDensityReport ( dReport, dendev ) end end -- -- exit via the cleanup function -- cleanup () end function cleanup ( errmsg ) if CLEANUPENTRY ~= nil then return end CLEANUPENTRY = true print ( "---" ) -- -- model 100 - print recipe name, puzzle, track, time, score, and gain -- local reason local start, stop, line, msg if errmsg == nil then reason = "complete" else -- -- model 120 - civilized error reporting, -- thanks to Bruno K. 
and Jean-Bob -- start, stop, line, msg = errmsg:find ( ":(%d+):%s()" ) if msg ~= nil then errmsg = errmsg:sub ( msg, #errmsg ) end if errmsg:find ( "Cancelled" ) ~= nil then reason = "cancelled" else reason = "error" end end Ident ( ReVersion ) if reason == "error" then print ( "Unexpected error detected" ) print ( "Error line: " .. line ) print ( "Error: \"" .. errmsg .. "\"" ) end save.Quickload ( WAYBACK ) end xpcall ( main, cleanup ) --- end of recipe

Comments


LociOiling Lv 1

Version 2.8 adds several new features that deal with puzzle features we've seen recently. This post summarizes the changes; use the "parent" link above to see more on the existing features.

Some of the puzzles have had a complex mix of locked and unlocked segments. Some have had large ligands, and in at least one case, the ligand had multiple rotamers. One puzzle was made entirely of RNA.

Ligands

Ligand reporting has been completely revamped. Each ligand segment is now reported separately. Ligands are no longer excluded from the score table and most other functions.

Ligands appear in the amino acid string as "x", and since they're now included, you may want to remove them if you use JPred or some other prediction service.

RNA/DNA

For RNA, the base codes reported by Foldit ("ra", "rc", "rg", and "ru") are reported in the score table. They appear in the amino acid string as single-character codes, so again, you may want to be selective. (Similar logic is included for the DNA codes, but it hasn't been tested yet.)

To make things easier, there's a new "type" code, which indicates whether a segment is protein, RNA, DNA, or ligand, types "P", "R", "D", and "M". The codes appear in the score table and in the string section.

Locked backbone and sidechains

The recipe checks for locked segments, and if any are found, reports on them in a new column in the score table, and in the string section. The Foldit functions detect only locked backbone, so the recipe checks for locked sidechains by looking at rotamer counts.

In the score table, locks are reported in a single column. The entry "U U" means both backbone and sidechain appear to be unlocked; "L U" means locked backbone with unlocked sidechain, and "U L" means unlocked backbone with locked sidechain. Finally, "L L" means both backbone and sidechain are locked.

Backbone locks and sidechain locks are reported separately in the strings section as well. The "modifiable" section also looks at ranges where the lock status of the backbone and the sidechains is different.

New columns

On the "More" page, you can now select the long name and abbreviation for each amino acid or nucleobase, which get added to the score report like the other optional columns.

Limitations

There are some limitations. The check for locked sidechains uses rotamer.GetCount, which slows things down, especially on large puzzles. For amino acids that normally have more than one rotamer, having only one rotamer almost certainly means the segment is locked. But glycine and alanine have only one rotamer, so they're always marked as unlocked.
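
The lock check can be sketched roughly as follows. This is only an illustration of the idea described in this post, not the recipe's actual code: the function name LockCode, its parameters, and the singleRotamer table are hypothetical, and in the recipe the rotamer count would come from rotamer.GetCount.

```lua
-- Hypothetical sketch of the lock column ("U U", "L U", "U L", "L L").
-- Amino acids that normally have only one rotamer, per the post:
-- glycine and alanine. These are always reported as unlocked sidechains.
local singleRotamer = { g = true, a = true }

-- backboneLocked: boolean, locked status of the backbone
-- aaCode:         single-letter amino acid code
-- rotamerCount:   number of rotamers found for the segment
function LockCode ( backboneLocked, aaCode, rotamerCount )
    local sidechainLocked = false
    if not singleRotamer [ aaCode ] and rotamerCount <= 1 then
        -- only one rotamer where several are expected: assume locked
        sidechainLocked = true
    end
    local bb = backboneLocked and "L" or "U"
    local sc = sidechainLocked and "L" or "U"
    return bb .. " " .. sc
end

print ( LockCode ( true, "k", 1 ) )   -- locked backbone, lysine with one rotamer
print ( LockCode ( false, "g", 1 ) )  -- glycine can't be judged by rotamer count
```

The special case for glycine and alanine is what produces the limitation noted above: a truly locked glycine sidechain can't be told apart from an unlocked one this way.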

LociOiling Lv 1

Version 2.8 of "print protein" includes better support for puzzles with locked segments. It also has better reporting for ligands. Ligand segments are now included in all relevant sections, where they were previously excluded. For RNA puzzles, the recipe now handles two-character base (nucleotide) codes.

print protein overview

This version of "print protein" is based on the classic "print protein lua2 V0" by marie_s.

The recipe displays detailed scoring information, including each segment's score and subscores. The subscores include categories like backbone, clashing, packing, hiding, and ideality.

The recipe also reports the protein's primary structure (amino acid sequence) and secondary structure (helixes, sheets, and loops).

The recipe offers copy-and-paste reporting for most of its key outputs. Complete output is also available in the recipe's scriptlog.

Thanks to spvincent, Timo van der Laan, and HerobrinesArmy for code and ideas. Thanks to brgreening for helping to illuminate the mystery of the total score.

ligands and segment count

The recipe has detailed reporting of ligands, looking for any segments with a secondary structure code of "M". Ligands are now included in all reports.

scoring information

The recipe detects active subscores using the logic found in "Tvdl enhanced DRW".

In some cases, this logic may suppress certain subscores, such as disulfides, when they have a low total value across all segments. The recipe reports the active subscores in the main dialog and the scriptlog.

The recipe calculates the "filter bonus" by toggling filters off and on and comparing the total score in each state. In theory, the total score is 8000 points plus the total of all segment subscores, plus the filter bonus. There is usually a discrepancy, which is reported as the "dark" score.
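
The arithmetic can be sketched in plain Lua as below. The field names follow the recipe's BoxScore table, but the function name DarkScore is hypothetical and the numbers are made up for illustration. In the recipe, the two total scores come from current.GetEnergyScore with filters enabled and disabled.

```lua
-- Sketch of the score bookkeeping, outside of Foldit.
-- tSubScores: total of active subscores, all segments
-- tScoreFilt: total score with filters on
-- tScoreFOff: total score with filters off
function DarkScore ( tSubScores, tScoreFilt, tScoreFOff )
    local box = {}
    box.tSubScores = tSubScores
    box.tScoreFilt = tScoreFilt
    box.tScoreFOff = tScoreFOff
    -- filter bonus is the difference between filters on and filters off
    box.tScoreBonus = tScoreFilt - tScoreFOff
    -- the "theoretical" score
    box.tScoreForm = tSubScores + box.tScoreBonus + 8000
    -- whatever is left over is the "dark" score
    box.tScoreDark = tScoreFilt - box.tScoreForm
    return box
end

-- invented numbers, for illustration only
local box = DarkScore ( 1500, 9800, 9600 )
print ( "filter bonus = " .. box.tScoreBonus )
print ( "subscores + filter bonus + 8000 = " .. box.tScoreForm )
print ( "\"dark\" points = " .. box.tScoreDark )
```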

The recipe also reports the Rosetta energy score from scoreboard.GetScore, normally a negative number. The recipe converts the Rosetta score to a Foldit score using the formula "FolditScore = 10 * ( 800 - RosettaScore )". Again, there's normally a discrepancy between this converted score and the current score reported by the Foldit client.
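
As a quick worked example of the conversion, here is a small sketch. The function name RosettaToFoldit is hypothetical, and the input value is invented.

```lua
-- Convert a Rosetta energy score to a Foldit-style score,
-- using the formula quoted in this post.
function RosettaToFoldit ( rosettaScore )
    return 10 * ( 800 - rosettaScore )
end

-- a Rosetta score of -150 converts to 10 * ( 800 + 150 )
print ( "converted Rosetta score = " .. RosettaToFoldit ( -150 ) )
```

Lower (more negative) Rosetta energy gives a higher converted score, which matches the way Foldit scores go up as the fold improves.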

sequence information

The recipe reports the primary sequence as a string of single-letter codes, and the secondary structure as a string with "H" for helix, "E" for sheet, "L" for loop, and "M" for molecule, indicating a ligand.

The recipe also uses single-letter codes for RNA or DNA bases. Since amino acids can have the same codes, if any ligand, RNA, or DNA segments are present, the recipe includes a "type" string which identifies what a particular segment represents. The codes are "P" for protein, "M" for molecule/ligand, "R" for RNA, and "D" for DNA.

In the scriptlog, the sequence and secondary structure information is reported both as single strings, and as fixed-length lines with rulers. The single strings are for copy-and-paste into other tools, while the rulers make it easier to find a specific segment.
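
A minimal sketch of the "fixed-length lines with rulers" idea is shown below. The function name RulerLines and the line width are assumptions for illustration; the recipe's own ruler layout may differ.

```lua
-- Hypothetical helper: break a long string into fixed-length lines,
-- each preceded by a units ruler (1234567890...).
-- Returns a table of lines: ruler, chunk, ruler, chunk, ...
function RulerLines ( str, width )
    local out = {}
    local ruler = ""
    for ii = 1, width do
        ruler = ruler .. ( ii % 10 )
    end
    for start = 1, #str, width do
        local chunk = str:sub ( start, start + width - 1 )
        -- trim the ruler to match a short final chunk
        out [ #out + 1 ] = ruler:sub ( 1, #chunk )
        out [ #out + 1 ] = chunk
    end
    return out
end

-- invented secondary structure string, 10 characters per line
local lines = RulerLines ( "HHHHHLLLEEEE", 10 )
for ii = 1, #lines do
    print ( lines [ ii ] )
end
```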

The recipe also makes the primary sequence and secondary structure available in a copy-and-paste dialog.

The recipe issues warning messages to the scriptlog if a non-standard amino acid code or secondary structure code is found. The code "x" is substituted for a non-standard amino acid code. Ligands are represented by code "x".

The recipe also reports hydrophobicity as a string with "i" if hydrophobic, and "e" if not hydrophobic.

Locked segments are reported as single-character codes - "U" for unlocked, and "L" for locked. There are separate strings for locked backbone and locked sidechains. The same information is also presented in other sections.

modifiable sections

The recipe reports on modifiable sections, including locked and unlocked sections, zero-score sections, and mutable sections. The "mutable segments" report is now optional.

Some puzzles have locked sections with movable sidechains or locked sections that are mutable. Some recipes incorrectly assume that "locked" means not modifiable in any way.

main dialog and segment subscore report

The recipe displays a main dialog before the segment subscore report is produced. Along with reporting other information, the dialog lets you select which subscores are to be included in the report. The mini contact table and detailed mutable reports are also optionally available, as are density analysis reports for Electron Density puzzles.

The main dialog has a "more" button, which displays less frequently used options. The hydropathy index (a fixed value based on the AA code), atom count, and rotamer count can optionally be included, along with the long names and abbreviations for the amino acid or RNA/DNA base. You can select the delimiter character, with the tab character as the default. The number of decimal places reported is also adjustable.
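
The delimiter and decimal-place options work roughly as in the sketch below. The checkbox names (dtab, dcomma, dsemic) and the kFax factor come from the recipe, but the function names PickDelimiter and Truncate are hypothetical, and the truncation formula is an assumption modeled on the small rounding helper in the recipe's Ident function.

```lua
-- "last selected wins": the checkboxes are tested in a fixed order,
-- so a later checked box overrides an earlier one.
function PickDelimiter ( dtab, dcomma, dsemic )
    local delim = "\t" -- assumption: fall back to tab if nothing is checked
    if dtab then delim = "\t" end
    if dcomma then delim = "," end
    if dsemic then delim = ";" end
    return delim
end

-- Assumed rounding: drop everything past the requested decimal places.
-- Lua's % is a floored modulo, so this rounds down, also for negatives.
function Truncate ( value, places )
    local kFax = 10 ^ -places
    return value - value % kFax
end

local delim = PickDelimiter ( true, true, false ) -- tab and comma both checked
print ( "seg" .. delim .. "score" )
print ( Truncate ( 123.456, 2 ) )
```

If both tab and comma are checked, comma wins; if semicolon is also checked, semicolon wins.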

The segment subscore report is available in a copy-and-paste dialog, or in the scriptlog. The report now includes a total line reflecting the column totals for the scoring components.

The fixed-width option found in previous versions (for example, reporting "12389" instead of "123.89" or "123,89") has been eliminated.

density analysis

For Electron Density puzzles, the recipe offers various types of density analysis.

The density analysis has several sections. The density report appears as a default option on puzzles with a density component.

The first section of density analysis looks at density by amino acid type. Some amino acids outscore others. For example, tyrosine might average a density subscore near 50, but glycine might have average density under 20. The "density by AA" section lists each amino acid found in the puzzle, the number of segments with that AA, the total density score of those segments, and the mean density for that AA. It also lists the worst density score and the corresponding segment number, and best density score and segment number.
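
The per-AA bookkeeping can be sketched in plain Lua as follows. The function name DensityByAA, the field names, and the sample numbers are all invented for illustration; the recipe itself reads the values from its cached "Density" subscores and uses numeric table slots rather than named fields.

```lua
-- segs: list of { AA code, density subscore }, one entry per segment
-- returns a table keyed by AA code with count, total, mean, best, worst
function DensityByAA ( segs )
    local tAA = {}
    for ii = 1, #segs do
        local aaCode, density = segs [ ii ] [ 1 ], segs [ ii ] [ 2 ]
        local st = tAA [ aaCode ]
        if st == nil then
            st = { count = 0, total = 0, mean = 0,
                   best = -999, bestSeg = 0, worst = 999, worstSeg = 0 }
            tAA [ aaCode ] = st
        end
        st.count = st.count + 1
        st.total = st.total + density
        if density > st.best then
            st.best = density
            st.bestSeg = ii
        end
        if density < st.worst then
            st.worst = density
            st.worstSeg = ii
        end
    end
    for _, st in pairs ( tAA ) do
        st.mean = st.total / st.count
    end
    return tAA
end

-- made-up densities for two tyrosines and two glycines
local stats = DensityByAA ( { { "y", 52 }, { "g", 18 }, { "y", 48 }, { "g", 14 } } )
print ( "y mean density = " .. stats.y.mean )
print ( "g mean density = " .. stats.g.mean )
```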

The next three sections are similar, but show the density component for "aromatics" (rings) versus non-aromatics, aliphatics versus non-aliphatics, and hydrophobics versus hydrophilics.

Aromatic AAs typically have a much higher density score than non-aromatics. Aliphatics typically score lower than non-aliphatics. Hydrophobics and hydrophilics are close, with hydrophobics typically scoring a bit better.

The aromatics are "f" phenylalanine, "h" histidine, "w" tryptophan, and "y" tyrosine.

For this recipe, aliphatics are "v" valine, "l" leucine, and "i" isoleucine. (Not included: "g" glycine and "a" alanine.)

The first four sections of the density analysis are output in spreadsheet-ready format, similar to the main segment report.

The final section is the "density deviation" report. For each segment, this section shows a "+" if the density subscore is higher than the average for that AA, a "-" if lower, and an "=" if the density subscore is close to the mean.

The density deviation report looks something like this:

"density deviation (above/below mean density by AA)"
1234567890123456789012345678901234567890123456
-++-++-+++-+=+-+---+++=+++++++++++-+----+-=---

The density deviation section is intended to provide a quick indication of which sections are scoring best in terms of density.
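
A rough sketch of how such a string can be built is shown below. It follows the rule used in the recipe ("=" when the rounded density matches the rounded mean, or when an AA occurs only once), but the function name DensityDeviation, the inline rounding, and the sample data are assumptions for illustration.

```lua
-- segs: list of { AA code, density subscore }, one entry per segment
-- places: number of decimal places used when comparing to the mean
function DensityDeviation ( segs, places )
    local function trunc ( x )
        return x - x % ( 10 ^ -places )
    end
    -- first pass: total density and segment count per AA
    local total, count = {}, {}
    for ii = 1, #segs do
        local aa, den = segs [ ii ] [ 1 ], segs [ ii ] [ 2 ]
        total [ aa ] = ( total [ aa ] or 0 ) + den
        count [ aa ] = ( count [ aa ] or 0 ) + 1
    end
    -- second pass: compare each segment to the mean for its AA
    local out = {}
    for ii = 1, #segs do
        local aa, den = segs [ ii ] [ 1 ], segs [ ii ] [ 2 ]
        local mean = total [ aa ] / count [ aa ]
        if count [ aa ] == 1 or trunc ( den ) == trunc ( mean ) then
            out [ #out + 1 ] = "="
        elseif den < mean then
            out [ #out + 1 ] = "-"
        else
            out [ #out + 1 ] = "+"
        end
    end
    return table.concat ( out )
end

-- invented data: two tyrosines, two glycines, one tryptophan
print ( DensityDeviation ( { { "y", 52 }, { "g", 18 }, { "y", 48 },
                             { "g", 18.4 }, { "w", 60 } }, 0 ) )
```

With more decimal places, fewer segments count as "close" to the mean, so the string shows more "+" and "-" entries.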

The density analysis items are available for copy-and-paste, and can also be retrieved from the scriptlog.

copy-and-paste dialogs

When you click "OK" in the main dialog, the segment subscore report and other selected reports are produced. The copy-and-paste dialog then appears, with text boxes for the subscore report, and the primary and secondary structures of the protein. These fields can be copied and pasted into a spreadsheet or another tool.

If density analysis is selected, the results are reported in a separate copy-and-paste dialog.

To copy a given field from a copy-and-paste dialog, click in its text box. Then use ctrl+a on Windows or command+a on Mac to "select all". Then use ctrl+c or command+c to copy. The selected text can then be pasted into the tool or webpage of your choice.

The use of the tab character as the default delimiter produces more legible scriptlog output (in most tools), and also simplifies pasting data into most spreadsheets, such as Excel and Open Office Calc. At least in US English versions, spreadsheets typically recognize the comma-separated value (CSV) format automatically when pasting, and offer the tab character as the default delimiter.

scriptlog

Certain outputs, such as the mini contact table and the detailed mutable report, are available only in the scriptlog.

The scriptlog file has the name "scriptlog.trackname.xml", where trackname is the current track name, or "default" for the default track. The scriptlog file is located in the Foldit installation folder, for example c:\Foldit in a Windows environment.

Although the scriptlog is nominally an XML file, with XML tags at the beginning and end, the recipe output is normally just plain text. (A few recipes create XML tags in their output, however.) A normal text editor, such as notepad on Windows, can be used to view the scriptlog. In some cases, you may need to manually select a tool to open the "XML" type. For example, in Windows, right-click the scriptlog file and select "Open with", then "Notepad".