
Recipe: SS Edit 2.0.1

created by LociOiling

Profile


Name
SS Edit 2.0.1
ID
103294
Shared with
Public
Parent
SS Edit 1.2
Children
Created on
April 16, 2020 at 20:28 PM UTC
Updated on
April 16, 2020 at 20:28 PM UTC
Description

Displays the current secondary structure. The displayed value can be cut or copied and a new value pasted in. Clicking the "Change" button applies the displayed value to the protein. Click "Exit" when any changes are complete. Version 1.1 disables filters during the change and avoids crashes by not attempting to set ligand type "M". Version 1.2 disables undo stack pushes to allow one-key undo. Version 2.0 handles each chain separately, similar to AA Edit 2.0. Version 2.0.1 is a quick fix for proline at the N terminal.

Best for


Code


--[[

    SS Edit

    Get and set secondary structure

    SS Edit displays the current secondary structure.

    The displayed value can be selected and cut or copied.

    A new value can be pasted in. When the "Change" button
    is clicked, the currently displayed secondary structure
    will be applied to the protein.

    SS Edit doesn't validate or edit the secondary structure codes.
    Only the Foldit codes "H" for helix, "E" for sheet, and "L"
    for loop should be used. Other values may produce
    unpredictable results.

    If the structure list is longer than the protein, SS Edit
    discards the extra entries at the end of the list.

    If the structure list is shorter than the protein, SS Edit
    applies the list to the first *n* segments of the protein,
    where *n* is the length of the list. Any remaining segments
    are unchanged.

    All changes are written to the scriptlog.

    See "AA Copy Paste Compare v 1.1.1 -- Brow42" for
    a full-function recipe that works with primary and
    secondary structures.

    version 1.1 -- 2015/07/06 -- LociOiling
        * speed things up a bit by disabling filters
        * add standard cleanup
        * don't crash trying to set type "M", skip invalid codes

    version 1.2 -- 2016/12/23 -- LociOiling
        * a string is not a table
        * enable 1-step undo with undo.SetUndo ( false )

    version 2.0 -- 2020/03/10 -- LociOiling
        * add chain awareness to match AA Edit 2.0
        * copy chain logic from AA Edit 2.0, expand to get SS

    version 2.0.1 -- 2020/04/16 -- LociOiling
        * handle proline at N-terminal correctly

]]--

--
--  Globals
--
Recipe = "SS Edit"
Version = "2.0.1"
ReVersion = Recipe .. " v." .. Version

STYPES = {
    E = { "sheet" },
    H = { "helix" },
    L = { "loop" },
}

AALONG = 1
AACODE = 2  -- redundant for proteins, needed for DNA and RNA
AAATOM = 3
AATYPE = 4
--
--  amino acid names and abbreviations,
--  third element is mid-chain atom count
--
AANames = {
    a = { "alanine",       "a", 10, "P", },
    c = { "cysteine",      "c", 11, "P", },
    d = { "aspartate",     "d", 12, "P", },
    e = { "glutamate",     "e", 15, "P", },
    f = { "phenylalanine", "f", 20, "P", },
    g = { "glycine",       "g",  7, "P", },
    h = { "histidine",     "h", 17, "P", },
    i = { "isoleucine",    "i", 19, "P", },
    k = { "lysine",        "k", 22, "P", },
    l = { "leucine",       "l", 19, "P", },
    m = { "methionine",    "m", 17, "P", },
    n = { "asparagine",    "n", 14, "P", },
    p = { "proline",       "p", 15, "P", },
    q = { "glutamine",     "q", 17, "P", },
    r = { "arginine",      "r", 24, "P", },
    s = { "serine",        "s", 11, "P", },
    t = { "threonine",     "t", 14, "P", },
    v = { "valine",        "v", 16, "P", },
    w = { "tryptophan",    "w", 24, "P", },
    y = { "tyrosine",      "y", 21, "P", },
--
--  bonus! codes for ligands ("x" is common, but "unk" is historic)
--
    x   = { "ligand", "x", 0, "M", },
    unk = { "ligand", "x", 0, "M", },
--
--  bonus! RNA nucleotides
--
    ra = { "adenine",  "a", 0, "R", },
    rc = { "cytosine", "c", 0, "R", },
    rg = { "guanine",  "g", 0, "R", },
    ru = { "uracil",   "u", 0, "R", },
--
--  bonus! DNA nucleotides (as seen in PDB, not confirmed for Foldit)
--
    da = { "adenine",  "a", 0, "D", },
    dc = { "cytosine", "c", 0, "D", },
    dg = { "guanine",  "g", 0, "D", },
    dt = { "thymine",  "t", 0, "D", },
}
--
--  SSNames is to parallel AANames,
--  but not as complex
--
SSNames = {
    H = { "helix", },
    E = { "sheet", },
    L = { "loop", },
    M = { "ligand", },
}

AA_ATOM_MAX = 27    -- modified AA if over this count
--
--  tables for converting external nucleobase codes to Foldit internal codes
--
RNAin = {
    a = "ra",
    c = "rc",
    g = "rg",
    u = "ru",
}

DNAin = {
    a = "da",
    c = "dc",
    g = "dg",
    t = "dt",
}

Ctypes = {
    P = "protein",
    D = "DNA",
    R = "RNA",
    M = "ligand",
}
--
--  begin protNfo Beta package version 0.2a
--
--  version 0.2a is packaged as a pseudo-class or pseudo-module
--  containing a mix of data fields and functions
--
--  all entries must be terminated with a comma to keep Lua happy
--
--  the commas aren't necessary if only function definitions are present
--
--  removed some items found in 0.1 not needed here,
--  added N-terminal and C-terminal checks, first and last analysis
--
--  this version depends on the external AANames table and associated codes,
--  so still a work in progress
--
--  version 0.2a contains a quick fix for proline at N-terminal
--
--  need to reconcile this version with the more extensive version in print protein
--
protNfo = {
    PROTEIN = "P",
    LIGAND = "M",
    RNA = "R",
    DNA = "D",
    UNKNOWN_AA = "x",
    UNKNOWN_BASE = "xx",
    CYSTEINE_AA = "c",
    PROLINE_AA = "p",

    aa = {},        -- amino acid codes
    ss = {},        -- secondary structure codes
    atom = {},      -- atom counts
    mute = {},      -- mutable flag
    ctype = {},     -- segment type - P, M, R, D
    first = {},     -- true if segment is first in chain
    last = {},      -- true if segment is last in chain
    nterm = {},     -- true if protein and if n-terminal
    cterm = {},     -- true if protein and if c-terminal
    fastac = {},    -- external code for FASTA-style output

    setNfo = function ()
        local segCnt = structure.GetCount ()
        --
        --  initial scan: retrieve basic information from Foldit
        --
        for ii = 1, segCnt do
            local nterm = false
            local cterm = false
            protNfo.aa [ #protNfo.aa + 1 ] = structure.GetAminoAcid ( ii )
            protNfo.ss [ #protNfo.ss + 1 ] = structure.GetSecondaryStructure ( ii )
            protNfo.atom [ #protNfo.atom + 1 ] = structure.GetAtomCount ( ii )
            protNfo.mute [ #protNfo.mute + 1 ] = structure.IsMutable ( ii )
            local aatab = AANames [ protNfo.aa [ ii ] ]
            if aatab ~= nil then
                protNfo.ctype [ #protNfo.ctype + 1 ] = aatab [ AATYPE ]
                --
                --  special case for puzzles 879, 1378b, and similar
                --
                --  if unknown amino acid, but secondary structure is not
                --  ligand, mark it as protein
                --
                --  segment 134 in puzzle 879 is the example
                --
                if  protNfo.ctype [ ii ] == protNfo.LIGAND
                and protNfo.ss [ ii ] ~= protNfo.LIGAND then
                    protNfo.ctype [ ii ] = protNfo.PROTEIN
                end
            else
                protNfo.ctype [ #protNfo.ctype + 1 ] = protNfo.LIGAND
                --
                --  code not found in AANames: fall back to the generic
                --  ligand entry so the fastac lookup below can't index nil
                --
                aatab = AANames [ protNfo.UNKNOWN_AA ]
            end
            --
            --  for proteins, determine n-terminal and c-terminal
            --  based on atom count
            --
            if protNfo.ctype [ ii ] == protNfo.PROTEIN then
                local ttyp = ""
                local notable = false
                local ac = protNfo.atom [ ii ]  -- actual atom count
                local act = aatab [ AAATOM ]    -- reference mid-chain atom count
                if ac ~= act
                or ( protNfo.aa [ ii ] == protNfo.CYSTEINE_AA and ac == act ) then
                    ttyp = "non-standard amino acid"
                    if ac == act + 2 then
                        ttyp = "N-terminal"
                        nterm = true
                        notable = true
                    elseif ac == act + 1 then
                        ttyp = "C-terminal"
                        cterm = true
                        notable = true
                    elseif protNfo.aa [ ii ] == protNfo.PROLINE_AA and ac == act + 3 then
                        ttyp = "N-terminal"
                        nterm = true
                        notable = true
                    end
                    if protNfo.aa [ ii ] == protNfo.CYSTEINE_AA then
                        local ds = current.GetSegmentEnergySubscore ( ii, "Disulfides" )
                        -- print ( "cysteine at " .. ii .. ", disulfides score = " .. ds )
                        if ds ~= 0 and math.abs ( ds ) > 0.01 then
                            nterm = false
                            cterm = false
                            ttyp = "disulfide bridge"
                            if ac == act + 1 then
                                ttyp = "N-terminal"
                                nterm = true
                            elseif ac == act then
                                ttyp = "C-terminal"
                                cterm = true
                            end
                            notable = true
                        else
                            ttyp = "unpaired cysteine"
                            notable = false
                        end
                    end
                    if notable then
                        print ( ttyp ..
                            " detected at segment " .. ii ..
                            ", amino acid = \'" .. protNfo.aa [ ii ] ..
                            "\', atom count = " .. ac ..
                            ", reference count = " .. act ..
                            ", secondary structure = " .. protNfo.ss [ ii ] )
                    end
                end
            end
            if protNfo.ctype [ ii ] == protNfo.LIGAND then
                print ( "ligand detected at segment " .. ii )
            end
            protNfo.nterm [ #protNfo.nterm + 1 ] = nterm
            protNfo.cterm [ #protNfo.cterm + 1 ] = cterm
            protNfo.fastac [ #protNfo.fastac + 1 ] = aatab [ AACODE ]
        end
        --
        --  rescan to determine first and last in chain for all types
        --  it's necessary to "peek" at neighbors for DNA and RNA
        --
        for ii = 1, segCnt do
            local nterm = protNfo.nterm [ ii ]
            local cterm = protNfo.cterm [ ii ]
            local first = false
            local last = false
            if ii == 1 then
                first = true
            end
            if ii == segCnt then
                last = true
            end
            if protNfo.ctype [ ii ] == protNfo.PROTEIN then
                if protNfo.nterm [ ii ] then
                    first = true
                end
                if protNfo.cterm [ ii ] then
                    last = true
                end
                --
                --  special case for puzzles 879, 1378b, and similar
                --
                --  if modified AA ends or begins a chain, mark
                --  it as C-terminal or N-terminal
                --
                --  hypothetical: no way to test so far!
                --
                if AANames [ protNfo.aa [ ii ] ] [ AACODE ] == protNfo.UNKNOWN_AA then
                    if ii > 1 and protNfo.ctype [ ii - 1 ] ~= protNfo.ctype [ ii ] then
                        first = true
                        protNfo.nterm [ ii ] = true
                        print ( "non-standard amino acid at segment " .. ii ..
                            " marked as N-terminal" )
                    end
                    if ii < segCnt and protNfo.ctype [ ii + 1 ] ~= protNfo.ctype [ ii ] then
                        last = true
                        protNfo.cterm [ ii ] = true
                        print ( "non-standard amino acid at segment " .. ii ..
                            " marked as C-terminal" )
                    end
                end
            elseif protNfo.ctype [ ii ] == protNfo.DNA
                or protNfo.ctype [ ii ] == protNfo.RNA then
                if ii > 1 and protNfo.ctype [ ii - 1 ] ~= protNfo.ctype [ ii ] then
                    first = true
                end
                if ii < segCnt and protNfo.ctype [ ii + 1 ] ~= protNfo.ctype [ ii ] then
                    last = true
                end
            else -- ligand
                first = true
                last = true
            end
            protNfo.first [ #protNfo.first + 1 ] = first
            protNfo.last [ #protNfo.last + 1 ] = last
        end
    end,
}
--
--  end protNfo Beta package version 0.2
--

--
--  end of globals section
--

function getChains ()
--
--  getChains - build a table of the chains found
--
--  Most Foldit puzzles contain only a single protein (peptide) chain.
--  A few puzzles contain ligands, and some puzzles have had two
--  protein chains. Foldit puzzles may also contain RNA or DNA.
--
--  For proteins, the atom count can be used to identify the first
--  (N terminal) and last (C terminal) ends of the chain. The AANames
--  table has the mid-chain atom counts for each amino acid.
--
--  Cysteine is a special case, since the presence of a disulfide
--  bridge also changes the atom count.
--
--  For DNA and RNA, the beginning and end of the chain is determined
--  by context at present. For example, if the previous segment was protein
--  and this segment is DNA, it's the start of a chain.
--
--  Each ligand is treated as a chain of its own, with a length of 1.
--
--  chain table entries
--  -------------------
--
--  ctype   - chain type - "P" for protein, "M" for ligand, "R" for RNA, "D" for DNA
--  fasta   - FASTA-format sequence, single-letter codes (does not include FASTA header)
--  fastab  - "backup" of fasta
--  sstruc  - secondary structure sequence, parallel to fasta, H/E/L
--  sstrucb - "backup" of sstruc, parallel to fastab
--  start   - Foldit segment number of sequence start
--  stop    - Foldit segment number of sequence end
--  len     - length of sequence
--  chainid - chain id assigned to entry, "A", "B", "C", and so on
--
--  For DNA and RNA, fasta and fastab contain single-letter codes, so "a" for adenine.
--  The codes overlap the amino acid codes (for example, "a" for alanine).
--  The DNA and RNA codes must be converted to the appropriate two-letter codes Foldit
--  uses internally, for example "ra" for RNA adenine and "da" for DNA adenine.
--
    --
    --  we're assuming Foldit won't ever have more chains
    --
    local chainid = {
        "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
        "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"
    }

    local chainz = {}
    local chindx = 0
    local curchn = nil
    local segCnt = structure.GetCount ()

    for ii = 1, segCnt do
        if protNfo.first [ ii ] then
            chindx = chindx + 1
            chainz [ chindx ] = {}
            curchn = chainz [ chindx ]
            curchn.ctype = protNfo.ctype [ ii ]
            curchn.fasta = ""
            curchn.sstruc = ""
            curchn.start = ii
            curchn.chainid = chainid [ chindx ]
            curchn.mute = 0
        end
        curchn.fasta = curchn.fasta .. protNfo.fastac [ ii ]
        --
        --  secondary structure codes don't need translation,
        --  so we don't have the equivalent of "fastac" here,
        --  just use the codes from the Foldit API
        --
        curchn.sstruc = curchn.sstruc .. protNfo.ss [ ii ]
        if protNfo.mute [ ii ] then
            curchn.mute = curchn.mute + 1
        end
        if protNfo.last [ ii ] then
            curchn.stop = ii
            curchn.len = curchn.stop - ( curchn.start - 1 )
        end
    end

    for ii = 1, #chainz do
        chainz [ ii ].fastab = chainz [ ii ].fasta
        chainz [ ii ].sstrucb = chainz [ ii ].sstruc
    end

    return chainz
end

function report_time(start_clock,start_time,clock_msg,time_msg)
    local seconds,minutes,hours,days
    if clock_msg==nil then clock_msg="CPU time" end
    if time_msg==nil then time_msg="Elapsed time" end
    print(string.format("%s",os.date()))

    days,remainder=math.modf((os.clock()-start_clock)/(24*60*60))
    hours,remainder=math.modf(remainder*24)
    minutes,remainder=math.modf(remainder*60)
    seconds,remainder=math.modf(remainder*60)
    print(string.format("%s(%02id:%02ih:%02im:%02is)",clock_msg,days,hours,minutes,seconds))

    days,remainder=math.modf(os.difftime(os.time(),start_time)/(24*60*60))
    hours,remainder=math.modf(remainder*24)
    minutes,remainder=math.modf(remainder*60)
    seconds,remainder=math.modf(remainder*60)
    print(string.format("%s(%02id:%02ih:%02im:%02is)",time_msg,days,hours,minutes,seconds))
end

function getSS ( )
    local ssList = ""
    for ii = 1, structure.GetCount () do
        ssList = ssList .. structure.GetSecondaryStructure ( ii )
    end
    return ssList
end

function setSS ( chain )
    local changes = 0
    local errz = 0

    local offset = chain.start - 1
    local sstrucn = ""  -- possibly changed chain

    for ii = 1, chain.stop - ( chain.start - 1 ) do
        local sType = chain.sstruc:sub ( ii, ii )
        local oType = chain.sstrucb:sub ( ii, ii )
        --
        --  unlike AAEdit,
        --  no conversion of sType is needed here,
        --  assuming it will normally be H/E/L,
        --  or maybe M for ligand
        --
        --  also, H/E/L codes can usually be changed,
        --  but ligand code M can't
        --
        --  (nevertheless, we'll let the user have a
        --  go at it, see what happens)
        --
        if sType ~= oType then
            local sName = SSNames [ sType ]
            if sName ~= nil then
                structure.SetSecondaryStructure ( ii + offset, sType )
                local newss = structure.GetSecondaryStructure ( ii + offset )
                if newss == sType then
                    changes = changes + 1
                    sstrucn = sstrucn .. sType
                else
                    print ( "segment " .. ii + offset ..
                        " (" .. chain.chainid .. ":" .. ii ..
                        ") change to type \"" .. sType .. "\" failed" )
                    errz = errz + 1
                    sstrucn = sstrucn .. oType
                end
            else
                print ( "segment " .. ii + offset ..
                    " (" .. chain.chainid .. ":" .. ii ..
                    "), skipping invalid type \"" .. sType .. "\"" )
                errz = errz + 1
                sstrucn = sstrucn .. oType
            end
        else
            sstrucn = sstrucn .. oType
        end
    end
    chain.sstruc = sstrucn
    chain.sstrucb = sstrucn

    return changes, errz
end

function GetParameters ( chnz )
    local dlog = dialog.CreateDialog ( ReVersion )

    dlog.sc0 = dialog.AddLabel ( "segment count = " .. structure.GetCount () )
    local cwd = "chain"
    if #chnz > 1 then
        cwd = "chains"
    end
    dlog.chz = dialog.AddLabel ( #chnz .. " " .. cwd )

    for ii = 1, #chnz do
        local chain = chnz [ ii ]
        dlog [ "chn" .. ii .. "l1" ] = dialog.AddLabel (
            "Chain " .. chain.chainid ..
            " (" .. Ctypes [ chnz [ ii ].ctype ] .. ")" )
        dlog [ "chn" .. ii .. "l2" ] = dialog.AddLabel (
            "segments " .. chain.start .. "-" .. chain.stop ..
            ", mutables = " .. chain.mute ..
            ", length = " .. chain.len )
        dlog [ "chn" .. ii .. "ss" ] = dialog.AddTextbox ( "sec struct", chain.sstruc )
    end

    dlog.u0 = dialog.AddLabel ( "" )
    dlog.u1 = dialog.AddLabel ( "Usage: use select all and copy, cut, or paste" )
    dlog.u2 = dialog.AddLabel ( "to save or change secondary structure" )
    dlog.w0 = dialog.AddLabel ( "" )
    dlog.w1 = dialog.AddLabel ( "Windows: ctrl + a = select all" )
    dlog.w2 = dialog.AddLabel ( "Windows: ctrl + x = cut" )
    dlog.w3 = dialog.AddLabel ( "Windows: ctrl + c = copy" )
    dlog.w4 = dialog.AddLabel ( "Windows: ctrl + v = paste" )
    dlog.z0 = dialog.AddLabel ( "" )
    dlog.ok = dialog.AddButton ( "Change" , 1 )
    dlog.exit = dialog.AddButton ( "Exit" , 0 )

    if ( dialog.Show ( dlog ) > 0 ) then
        for ii = 1, #chnz do
            chnz [ ii ].sstruc =
                ( dlog [ "chn" .. ii .. "ss" ].value:upper ()):sub ( 1, chnz [ ii ].len )
        end
        return true
    else
        return false
    end
end

function main ()
    print ( ReVersion )
    print ( "Puzzle: " .. puzzle.GetName () )
    print ( "Track: " .. ui.GetTrackName () )

    undo.SetUndo ( false )

    protNfo.setNfo ()

    local changeNum = 0
    local ssList = ""
    local chnTbl = {}   -- chains as table of tables
    chnTbl = getChains ()

    local cwd = "chain"
    if #chnTbl > 1 then
        cwd = "chains"
    end
    print ( #chnTbl .. " " .. cwd )
    for ii = 1, #chnTbl do
        local chain = chnTbl [ ii ]
        if chain.stop == nil then
            chain.stop = 999999
        end
        print ( "chain " .. chain.chainid ..
            ", start = " .. chain.start ..
            ", end = " .. chain.stop )
        print ( chain.sstruc )
    end

    while GetParameters ( chnTbl ) do
        local cmods = 0
        for ii = 1, #chnTbl do
            local chain = chnTbl [ ii ]
            if chain.sstruc ~= chain.sstrucb then
                print ( "--" )
                print ( "chain " .. chain.chainid .. " changed" )
                cmods = cmods + 1
                local old = chain.sstrucb
                changeNum = changeNum + 1
                local start_time = os.time ()
                behavior.SetFiltersDisabled ( true )
                local sChg, sErr = setSS ( chnTbl [ ii ] )
                behavior.SetFiltersDisabled ( false )
                print ( "segments changed = " .. sChg .. ", errors = " .. sErr )
                print ( "old chain " .. chain.chainid .. ": " )
                print ( old )
                print ( "new chain " .. chain.chainid .. ": " )
                print ( chain.sstrucb )
            end
        end
        if cmods == 0 then
            print ( "--" )
            print ( "nothing changed" )
        end
    end

    cleanup ()
end

function cleanup ( errmsg )
    --
    --  optionally, do not loop if cleanup causes an error
    --  (any loop here is automatically terminated after a few iterations, however)
    --
    if CLEANUPENTRY ~= nil then
        return
    end
    CLEANUPENTRY = true

    print ( "---" )
    --
    --  model 100 - print recipe name, puzzle, track, time, score, and gain
    --
    local reason
    local start, stop, line, msg
    if errmsg == nil then
        reason = "complete"
    else
        --
        --  model 120 - civilized errmsg reporting,
        --  thanks to Bruno K. and Jean-Bob
        --
        start, stop, line, msg = errmsg:find ( ":(%d+):%s()" )
        if msg ~= nil then
            errmsg = errmsg:sub ( msg, #errmsg )
        end
        if errmsg:find ( "Cancelled" ) ~= nil then
            reason = "cancelled"
        else
            reason = "error"
        end
    end
    print ( ReVersion .. " " .. reason )
    print ( "Puzzle: " .. puzzle.GetName () )
    print ( "Track: " .. ui.GetTrackName () )
    if reason == "error" then
        print ( "Unexpected error detected" )
        print ( "Error line: " .. line )
        print ( "Error: \"" .. errmsg .. "\"" )
    end
    behavior.SetFiltersDisabled ( false )
end

xpcall ( main, cleanup )

Comments


LociOiling Lv 1

SS Edit 2.0 lets you copy and paste the protein's secondary structure.

Similar to AA Edit 2.0, this version of SS Edit presents each chain of the protein separately. This is handy for binder puzzles like the coronavirus series, where there's a locked target and a designable binder which are separate chains.

When you click the "Change" button, the recipe applies the contents of the "sec struct" box for a given chain to the protein. All entries are converted to upper case. Only the standard Foldit codes "L" for loop, "H" for helix, and "E" for sheet are applied to the protein. Anything else causes the entry to be skipped and the corresponding segment left unchanged.

The code "M" may appear for ligands in some puzzles, but it's normally not possible to change this code. "M" is included here as a valid input, but will probably fail when the recipe tries to apply it.
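The rules described above (upper-casing, truncating to the chain length, skipping invalid codes, leaving unmatched segments alone) can be sketched in plain Lua without the Foldit API. This is only an illustration: `previewSS` and `VALID` are hypothetical names that do not appear in the recipe, and the sketch assumes every valid code would be accepted, whereas the recipe re-reads each segment after setting it and counts a failure if the value didn't stick (as is likely for "M").

```lua
-- Hypothetical helper, not part of SS Edit: previews what a pasted
-- string would do to one chain, without calling the Foldit API.
local VALID = { H = true, E = true, L = true, M = true }

local function previewSS ( pasted, original )
    -- upper-case the input and drop anything past the chain length
    local wanted = pasted:upper ():sub ( 1, #original )
    local result = {}
    local changes, skipped = 0, 0
    for ii = 1, #original do
        local old = original:sub ( ii, ii )
        local new = wanted:sub ( ii, ii )   -- "" if the input is too short
        if new == old then
            result [ ii ] = old
        elseif VALID [ new ] then
            result [ ii ] = new
            changes = changes + 1
        else
            -- invalid or missing code: keep the existing structure
            result [ ii ] = old
            skipped = skipped + 1
        end
    end
    return table.concat ( result ), changes, skipped
end

-- three helix codes, one invalid code, and one missing entry
local preview, changes, skipped = previewSS ( "hhhx", "LLLLL" )
print ( preview, changes, skipped )
```

In this sketch a missing entry at the end of a short input is counted as skipped, which matches how the recipe's own loop reports it as an invalid type while leaving the segment unchanged.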

Seagat2011 Lv 1

Thanks!

Do you think you can add an "idealize" and/or "remix" component, as well as numbered segments or a way to avoid assigning to "locked" segments?

LociOiling Lv 1

Version 2.0.1 fixes chain detection when proline is at the N terminal. In this case, the atom count is 18, instead of 15 when proline is mid-chain. Usually the atom count increases by 2 at the N terminal.

(Edit: mid-chain count is 15.)
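A minimal sketch of the atom-count test described above, in plain Lua. The function name `classifyTerminal` and the table `MID_CHAIN_ATOMS` are made up for this example; the counts are a small subset copied from the recipe's AANames table. Cysteine and disulfide bridges, which the recipe handles separately, are ignored here.

```lua
-- Mid-chain atom counts for a few amino acids (subset of AANames).
local MID_CHAIN_ATOMS = { a = 10, g = 7, p = 15 }

-- Hypothetical helper: classify a segment by how far its atom count
-- is from the mid-chain reference count.
local function classifyTerminal ( aa, atomCount )
    local mid = MID_CHAIN_ATOMS [ aa ]
    if mid == nil then
        return "unknown"
    end
    local extra = atomCount - mid
    if extra == 0 then
        return "mid-chain"
    elseif extra == 2 then
        return "N-terminal"     -- the usual N-terminal case
    elseif extra == 1 then
        return "C-terminal"
    elseif aa == "p" and extra == 3 then
        return "N-terminal"     -- proline: 18 atoms instead of 15
    end
    return "non-standard"
end

print ( classifyTerminal ( "p", 18 ) )
print ( classifyTerminal ( "a", 12 ) )
print ( classifyTerminal ( "a", 11 ) )
```

The order of the checks follows the recipe: the general +2 and +1 cases come first, and the +3 case applies only to proline.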