beta_helix Staff Lv 1
It is likely that most people used the PSIPRED predictions we posted for the last puzzle:
https://fold.it/portal/node/2009337#comment-40823
Closed since almost 6 years ago
Novice Overall Prediction
This is a follow-up to Puzzle 1820, but this time we are providing you with a starting model. Players may load in previous work from Puzzle 1820. This protein is encoded in the viral genome of SARS-CoV-2, but the protein's structure is still unknown. If we knew how this protein folds, we might be able to figure out its exact function. Refold this starting model to find higher-scoring solutions, which will tell us how this protein is most likely to fold!
Since we do not know the structure of this protein, we can only speculate.
I did run it through a Signal Peptide Prediction Server, SignalP-5.0, and indeed that was the prediction:
http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E907DFF00006C758F0CC973
As we don't know the structure of this protein, we really can only speculate.
I did run it through a Disulfide Bonding State Prediction Server, DISULFIND, and indeed it predicts 3 disulfide bridges (with 58.9% confidence):
http://disulfind.dsi.unifi.it/monitor.php?query=H0o2y1
I want to reiterate, however, that all these tools (DISULFIND, SignalP-5.0, and even PSIPRED) are just statistical predictors… they are often correct, but are sometimes very wrong.
Proteins tend to follow their own rules, and they love creating exceptions (I learned this the hard way when studying knotted proteins for my PhD! :-)
In line with the 25,83 / 37,90 disulfide topology, an article on bioRxiv says this is gonna be an Ig fold:
mkflvflgiittvaafhqecslqsctqhqpyvvddpcpihfyskwyirvgarksaplielcvdeagskspiqyidignytvsclpftincqepklgslvvrcsfyedfleyhdvrvvldfi
lllllllllllllllllleeeeeeeellleeeeelllllleelllllllleeeeelleeeelllllllllleeelllleeeeelleeeeelllllleeeeeeeelllllllllllllllll
                                    |37------------------------------------------------90|
                        |25-----------------------------------------------------83|
No idea about 61,102 though. My hunch is that it is possible, since this pair only occurs in the SARSr-CoV insertion.
https://www.biorxiv.org/content/10.1101/2020.03.04.977736v1.full
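For anyone who wants to double-check the residue numbering in the pairs above, here is a minimal Python sketch. It only looks at the posted sequence: it lists the 1-based positions of every cysteine and checks that both ends of each proposed pair really are cysteines. The helper names (`cysteine_positions`, `check_pairs`) are made up for this illustration, and the check says nothing about whether the bonds actually form.

```python
# ORF8 sequence as posted in this thread (one-letter codes, lowercase).
SEQ = (
    "mkflvflgiittvaafhqecslqsctqhqpyvvddpcpihfyskwyirvgarksaplielcvdeagskspiq"
    "yidignytvsclpftincqepklgslvvrcsfyedfleyhdvrvvldfi"
)

# Pairs discussed in the thread. These are predictions, not measured bonds.
PAIRS = [(25, 83), (37, 90), (61, 102)]


def cysteine_positions(seq):
    """Return the 1-based positions of every cysteine in seq."""
    return [i for i, aa in enumerate(seq.lower(), start=1) if aa == "c"]


def check_pairs(seq, pairs):
    """Map each (i, j) pair to True if both 1-based positions are cysteines."""
    cys = set(cysteine_positions(seq))
    return {pair: (pair[0] in cys and pair[1] in cys) for pair in pairs}


if __name__ == "__main__":
    print(len(SEQ), "residues")
    print("cysteines at:", cysteine_positions(SEQ))
    for pair, ok in check_pairs(SEQ, PAIRS).items():
        print(pair, "both cysteines" if ok else "NOT both cysteines")
```

The sequence has seven cysteines (20, 25, 37, 61, 83, 90, 102), so at most three intramolecular disulfides are possible and one cysteine is always left over. With the three pairs above, the leftover one is residue 20.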
If the N-terminus is a signal peptide and if it is cleaved, wouldn't it be useful to post the puzzle at some point without the first 15aa? Even if there is no cleavage, the N-term will be in the membrane, so in Foldit any solution separating that helix from the rest of the fold will score horribly and players will be less likely to follow solutions like that.
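A rough sketch of what such a trimmed version would look like, assuming the cut is made after exactly 15 residues. That number comes from the comment above; it is not a confirmed cleavage site. The names `SIGNAL_LEN`, `trim_n_terminus`, and `renumber` are hypothetical. The second helper shows how the cysteine numbering would shift in the trimmed puzzle.

```python
# ORF8 sequence as posted in this thread (one-letter codes, lowercase).
SEQ = (
    "mkflvflgiittvaafhqecslqsctqhqpyvvddpcpihfyskwyirvgarksaplielcvdeagskspiq"
    "yidignytvsclpftincqepklgslvvrcsfyedfleyhdvrvvldfi"
)

# Assumed length of the putative signal peptide (from the comment, unconfirmed).
SIGNAL_LEN = 15


def trim_n_terminus(seq, n):
    """Return seq without its first n residues."""
    return seq[n:]


def renumber(pos, n):
    """Convert a 1-based full-length position to the trimmed numbering.

    Returns None if the residue was removed by the trim.
    """
    return pos - n if pos > n else None


if __name__ == "__main__":
    trimmed = trim_n_terminus(SEQ, SIGNAL_LEN)
    print(len(trimmed), "residues after trimming")
    print(trimmed)
    for full_pos in (25, 37, 61, 83, 90, 102):
        print(full_pos, "->", renumber(full_pos, SIGNAL_LEN))
```

With this cut, all seven cysteines stay in the trimmed sequence, since the first one sits at position 20.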
That is a very good suggestion, Wilm.
We do have to keep in mind that we cannot be 100% sure that the N-terminus is a signal peptide. It's not like we have a cryo-EM map to work off of; we just have the sequence.
We'll definitely consider posting a trimmed version of it, if we end up running another puzzle for this protein… as there are many SARS-CoV-2 protein sequences with no experimental structure available.
The Diamond Light Source in the UK has switched into emergency mode. If somebody can crystallize ORF8 with and without the signal peptide and has good ties to DLS, we could get fast turnaround on XRC.