LociOiling Lv 1
NetSurfP is yet another web-based secondary structure prediction service. (JPred is another.)
NetSurfP outputs its results in a columnar format. The predictions for helix, sheet, and loop are expressed as probabilities.
(NetSurfP also predicts the "surface accessibility" of a given residue, which seems to be more or less the inverse of the likelihood that the residue is buried in the hydrophobic core.)
The formatting for the NetSurfP results doesn't lend itself to being pasted directly into a spreadsheet.
This recipe does two things. First, it converts the NetSurfP output to a tab-delimited format that can be pasted into a spreadsheet. Second, it creates a secondary structure string. Both formats are also produced by the recipe print protein 2.4.
To use this recipe, run a NetSurfP prediction, then copy the output to the clipboard. (It's not necessary to include the comment lines, but you can if you wish.)
Run the recipe, and paste the NetSurfP output into the text box on the first screen. When you click OK, the second screen displays two text boxes, one containing the spreadsheet output, and one containing the secondary structure string.
The secondary structure string is created by picking the secondary structure type with the highest probability for each segment. The picking logic is quite simple, and doesn't worry about ties or close finishes.
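The picking logic can be illustrated with a short Python sketch. This is not the recipe's own code; the function names are made up for illustration, and the letters H, E, and L for helix, sheet, and loop are an assumption about the output string. Ties here simply go to whichever type is listed first, which may or may not match what the recipe does.

```python
def pick_structure(p_helix, p_sheet, p_loop):
    """Return the code of the most probable structure type for one segment."""
    # Hypothetical letter codes: H = helix, E = sheet, L = loop.
    candidates = [(p_helix, "H"), (p_sheet, "E"), (p_loop, "L")]
    # max() keeps the first candidate on a tie, so no special tie handling.
    best_probability, best_code = max(candidates, key=lambda pair: pair[0])
    return best_code


def structure_string(rows):
    """Build one string with a single code per segment."""
    return "".join(pick_structure(*row) for row in rows)


# Helix, sheet, and loop probabilities for the first three sample residues.
sample = [
    (0.003, 0.003, 0.994),
    (0.694, 0.003, 0.303),
    (0.782, 0.003, 0.216),
]
print(structure_string(sample))  # LHH
```

Note that the second residue is counted as helix even though loop scores 0.303, which is the kind of close finish the simple logic ignores.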
This recipe depends heavily on the NetSurfP output format. Any changes to NetSurfP output will require revisions to the recipe.
Sample NetSurfP output:
# For publication of results, please cite:
# A generic method for assignment of reliability scores applied to solvent accessibility predictions.
# Bent Petersen, Thomas Nordahl Petersen, Pernille Andersen, Morten Nielsen and Claus Lundegaard
# BMC Structural Biology 2009, 9:51 doi:10.1186/1472-6807-9-51
#
# Column 1: Class assignment - B for buried or E for Exposed - Threshold: 25% exposure, but not based on RSA
# Column 2: Amino acid
# Column 3: Sequence name
# Column 4: Amino acid number
# Column 5: Relative Surface Accessibility - RSA
# Column 6: Absolute Surface Accessibility
# Column 7: Z-fit score for RSA prediction
# Column 8: Probability for Alpha-Helix
# Column 9: Probability for Beta-strand
# Column 10: Probability for Coil
E T Sequence 1 0.865 120.003 0.476 0.003 0.003 0.994
E E Sequence 2 0.758 132.423 0.324 0.694 0.003 0.303
E E Sequence 3 0.741 129.488 0.588 0.782 0.003 0.216
E R Sequence 4 0.409 93.707 0.281 0.858 0.002 0.139
E K Sequence 5 0.380 78.063 0.459 0.923 0.002 0.076
E K Sequence 6 0.597 122.844 1.114 0.938 0.007 0.055
E E Sequence 7 0.609 106.340 1.316 0.970 0.001 0.030
B I Sequence 8 0.066 12.284 -0.022 0.970 0.001 0.030
E Q Sequence 9 0.436 77.941 0.867 0.970 0.001 0.030
E K Sequence 10 0.613 126.012 1.216 0.970 0.001 0.030
...
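For readers who want to see the conversion step outside the recipe, here is a rough Python sketch of turning output like the sample above into tab-delimited text. It is an illustration only, not the recipe's code: the function name is hypothetical, and it assumes that comment lines start with "#" and that every data row has exactly ten whitespace-separated fields, as in the sample.

```python
def to_tab_delimited(text):
    """Convert NetSurfP-style columnar text to tab-delimited rows."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#" comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Assumption: a data row has the ten columns listed in the comments.
        if len(fields) != 10:
            continue
        rows.append("\t".join(fields))
    return "\n".join(rows)


example = """# Column 10: Probability for Coil
E T Sequence 1 0.865 120.003 0.476 0.003 0.003 0.994
E E Sequence 2 0.758 132.423 0.324 0.694 0.003 0.303
"""
print(to_tab_delimited(example))
```

Anything that doesn't look like a ten-column data row is dropped silently, so a change in the NetSurfP format would show up as missing rows rather than an error.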