Recipe: Find The Gap

created by LociOiling

Profile


Name
Find The Gap
ID
106906
Shared with
Public
Parent
None
Children
None
Created on
June 06, 2022 at 03:27 AM UTC
Updated on
June 06, 2022 at 03:27 AM UTC
Description

Looks for protein segments which are suspiciously far apart. Reports these gaps and some nice statistics to boot.

Code


--[[
    Find The Gap!

    Check the distance between the segments of a protein.

    Note the pairs of segments which seem unusually distant.

    There are several reasons for a gap:

    1) the two segments are on different chains
       (the recipe doesn't check for multiple chains,
       found in binder design and other puzzles)
    2) the protein has gaps representing missing segments
       (some electron puzzles have missing segments)
    3) the protein was created by the Trim Tool,
       and contains two or more separate parts
       of an original protein

    Some puzzles have gaps for more than one reason.
    For example, the June 2022 Design of the Month Puzzle
    has a binding target that's composed of separate
    sections of a larger protein. Then the actual binder
    is a separate chain.

    The recipe uses the function structure.GetDistance to
    measure the distance between the backbone alpha carbons
    of each pair of segments.

    All distances are in Angstroms.

    Almost all distances between adjacent segments will
    be in the range of 3.7 to 3.9 Angstroms.

    The recipe considers a distance of 4 Angstroms or more
    to be a gap.

    The recipe prints the segments and distances for gaps.

    The recipe reports the mean or average distance
    along with the minimum and maximum distances.

    The recipe also reports an adjusted mean distance.
    A distance of 3.8 Angstroms is used when there's a gap.
    The mean adjusted distance should be around 3.8 Angstroms.

    version 1.0 - LociOiling - 20220605
    * new
]]--

local segCnt = structure.GetCount ()
local segCnt2 = segCnt
while structure.GetSecondaryStructure ( segCnt2 ) == "M" do
    segCnt2 = segCnt2 - 1
end

local tdist = 0
local tdista = 0
local cdist = 0
local maxdist = 0
local mindist = 99999

for ii = 1, segCnt2 - 1 do
    local dist = structure.GetDistance ( ii, ii + 1 )
    if dist < mindist then
        mindist = dist
    end
    if dist > maxdist then
        maxdist = dist
    end
    tdist = tdist + dist
    cdist = cdist + 1
    if dist >= 4 then
        print ( "gap between " .. ii .. " and " .. ii + 1 .. ", distance = " ..
                string.format ( "%.3f", dist ) )
        tdista = tdista + 3.8
    else
        tdista = tdista + dist
    end
end

print ( "total distance = " .. string.format ( "%.3f", tdist ) )
print ( "segment pairs = " .. cdist )
print ( "mean distance = " .. string.format ( "%.3f", tdist / cdist ) )
print ( "minimum distance = " .. string.format ( "%.3f", mindist ) )
print ( "maximum distance = " .. string.format ( "%.3f", maxdist ) )
print ( "" )
print ( "mean adjusted distance = " .. string.format ( "%.3f", tdista / cdist ) )

Comments


LociOiling Lv 1

This recipe was inspired by recent discussions of the trim tool.

There's no sure-fire way for a recipe to tell whether it's working on a trimmed protein.

However, if the trimmed protein consists of separate sections of the original protein, some of the segments will be abnormally far apart – gaps.

Just to keep things interesting, the protein in puzzle 2155b starts out with two gaps. These gaps are due to residues (segments) which turned up missing in the electron density experiment.

There's no way to tell if a gap is due to trimming the protein, or just missing segments. With puzzle 2155b, you can have both types.

Separate chains, seen in binder puzzles and elsewhere, are another possible source of gaps.

Detecting gaps is easy. The function structure.GetDistance returns the distance between the alpha carbons of two segments. The recipe calls it on each pair of adjacent segments (segment N and segment N + 1).

Distances are in Angstroms. A normal alpha carbon distance for adjacent segments is between 3.7 and 3.9 Angstroms. The recipe considers a distance of 4 Angstroms or more to be a gap.
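
The check described here can be sketched as a small helper that works on a plain table of distances, so it can be tried outside Foldit. The names findGaps and GAP_THRESHOLD are made up for this illustration and are not part of the recipe, and the sample distances are invented.

```lua
-- Hypothetical helper, not part of the recipe: flag adjacent pairs
-- whose alpha carbon distance meets or exceeds the gap threshold.
local GAP_THRESHOLD = 4     -- Angstroms, the same cutoff the recipe uses

-- dists [ ii ] is the distance between segment ii and segment ii + 1
function findGaps ( dists, threshold )
    threshold = threshold or GAP_THRESHOLD
    local gaps = {}
    for ii = 1, #dists do
        if dists [ ii ] >= threshold then
            gaps [ #gaps + 1 ] = { first = ii, second = ii + 1, dist = dists [ ii ] }
        end
    end
    return gaps
end

-- invented sample distances: only the third pair is a gap
local sample = { 3.79, 3.81, 12.46, 3.80 }
for _, gap in ipairs ( findGaps ( sample ) ) do
    print ( "gap between " .. gap.first .. " and " .. gap.second ..
            ", distance = " .. string.format ( "%.3f", gap.dist ) )
end
```

Inside Foldit, the table would presumably be filled by calling structure.GetDistance on each adjacent pair, the way the recipe's own loop does.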

This recipe only reports gaps and some related statistics. It doesn't do anything with the information, and is pretty much guaranteed not to change your score.
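
To show what the statistics look like, here is a rough sketch of the mean and adjusted mean calculation on a plain table of distances. The function name distanceStats and the sample numbers are invented for this illustration. The defaults of 4 and 3.8 Angstroms are the values the recipe uses.

```lua
-- Hypothetical helper, not part of the recipe: summarize a table of
-- adjacent-segment distances in the same way as the recipe's report.
function distanceStats ( dists, threshold, typical )
    threshold = threshold or 4      -- gap cutoff in Angstroms
    typical = typical or 3.8        -- distance substituted for a gap
    local count = #dists
    if count == 0 then
        return nil                  -- nothing to average, avoid dividing by zero
    end
    local total, adjusted = 0, 0
    local minDist, maxDist = math.huge, 0
    for ii = 1, count do
        local dist = dists [ ii ]
        minDist = math.min ( minDist, dist )
        maxDist = math.max ( maxDist, dist )
        total = total + dist
        -- a gap contributes the typical distance to the adjusted total
        if dist >= threshold then
            adjusted = adjusted + typical
        else
            adjusted = adjusted + dist
        end
    end
    return {
        count = count,
        mean = total / count,
        min = minDist,
        max = maxDist,
        adjustedMean = adjusted / count,
    }
end

-- invented sample: three normal pairs and one gap of 11.4 Angstroms
local stats = distanceStats ( { 3.8, 3.8, 11.4, 3.8 } )
print ( "mean distance = " .. string.format ( "%.3f", stats.mean ) )
print ( "mean adjusted distance = " .. string.format ( "%.3f", stats.adjustedMean ) )
```

With this sample, the single gap pulls the plain mean up to about 5.7 Angstroms, while the adjusted mean stays at about 3.8. That's why the adjusted figure is the more useful sanity check when gaps are present.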