
1844: Coronavirus ORF3a Prediction

Closed almost 6 years ago

Intermediate Overall Prediction

Summary


Created
May 28, 2020
Expires
Max points
100
Description

Fold this coronavirus protein! This is a portion of a larger protein encoded in the viral genome of SARS-CoV-2. It is encoded in a region of the genome called ORF3a, but the protein's structure and function are still unknown. If we knew how this protein folds, we might be able to figure out its exact function. The puzzle's starting structure shows secondary structure (SS) predictions from PSIPRED, which hint at which parts of the protein might fold into helices or sheets. Refold this protein to find high-scoring solutions, which will tell us how this protein is most likely to fold!



Sequence:


WKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTIDGSSGVVNPVMEPIYDEPTTTTSVPL

Top groups


  1. Go Science 100 pts. 11,452
  2. Gargleblasters 80 pts. 10,957
  3. Anthropic Dreams 63 pts. 10,929
  4. Team India 49 pts. 10,610
  5. Hold My Beer 37 pts. 10,484
  6. Beta Folders 28 pts. 10,455
  7. Void Crushers 21 pts. 10,306
  8. L'Alliance Francophone 15 pts. 10,179
  9. Marvin's bunch 11 pts. 9,789
  10. Contenders 8 pts. 9,779

  251. N00BiZ85 Lv 1 1 pt. 0
  252. SorinHigh Lv 1 1 pt. 0
  253. Sciren Lv 1 1 pt. 0
  254. Gambatte91 Lv 1 1 pt. 0
  255. ANU Lv 1 1 pt. 0
  256. flyingfish08 Lv 1 1 pt. 0
  257. JuliaTwin2 Lv 1 1 pt. 0
  258. Pikamander2 Lv 1 1 pt. 0
  260. Joanna_H Lv 1 1 pt. 0

Comments


bkoep Staff Lv 1


Conf: 962248952023898999555885446656789996999839985466210130014604
Pred: CCCCCCCCCCCCCCEEEEEECCCCCEEEECCCCCCEEEEECCCCCCCCCCCCCCCCCCEE
  AA: WKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPISEHDYQIGGYT
              10        20        30        40        50        60


Conf: 113344213899998535773786403210278740588898554368925430788941
Pred: EECCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCEEEEEEEEECCCCCHHCEEEEEEE
  AA: EKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTID
              70        80        90       100       110       120


Conf: 7988336877533478984002479
Pred: CCCCCCCCCCCCCCCCCCCCCCCCC
  AA: GSSGVVNPVMEPIYDEPTTTTSVPL
             130       140
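
To read the prediction above as a list of segments instead of three wrapped blocks, a small sketch like the one below may help. It assumes the three Pred lines are simply concatenated in order (C = coil, E = strand, H = helix). The names PRED and segments are made up for illustration and are not part of PSIPRED.

```python
from itertools import groupby

# The three "Pred:" lines from the PSIPRED output above, joined in order.
PRED = (
    "CCCCCCCCCCCCCCEEEEEECCCCCEEEECCCCCCEEEEECCCCCCCCCCCCCCCCCCEE"
    "EECCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCEEEEEEEEECCCCCHHCEEEEEEE"
    "CCCCCCCCCCCCCCCCCCCCCCCCC"
)


def segments(pred, states="EH"):
    """Return (state, start, end) for each run of the requested states.

    Positions are 1-based and inclusive, matching the ruler under the
    AA lines in the PSIPRED output.
    """
    found = []
    position = 1
    for state, run in groupby(pred):
        length = len(list(run))
        if state in states:
            found.append((state, position, position + length - 1))
        position += length
    return found


if __name__ == "__main__":
    for state, start, end in segments(PRED):
        print(f"{state} {start}-{end} (length {end - start + 1})")
```

Passing states="E" or states="H" restricts the listing to predicted strands or helices only. The Conf line is ignored here; a stricter version could skip runs whose confidence digits are low.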

Artoria2e5 Lv 1

This is presumably the cytoplasmic part of ORF3a (UniProt P59632 for the very similar SARS-CoV-1 version). In line with this environment, DiANNA predicts no disulfides.

This thing is somewhat conserved in other betacoronavirus genomes; bat versions include A0A088DI21 and A0A1B3Q5V5. Below is an alignment of these guys; you can see that the human-infecting SARSr-CoVs have some extensions not found in the two bat versions.

>Pz1844
WKCRSKNPLLYDANYFLCWHTN
CYDYCIPYNSVTSSI-VITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDY
YQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTIDGSSGVVNPVMEPIYDEPTTTCT
SVP-
>New|UniRef100_A0A088DI21 Uncharacterized protein n=1 Tax=Bat Hp-betacoronavirus/Zhejiang2013 TaxID=1541205 RepID=A0A088DI21_9BETC
IRCKSLVPLCADDDCFVNYNAG
GKTYCMPFDPNEPYLTLVVHQNGIT---------CGSYKLYGDVSIADRIYLVTLTKSVP
YSLQNI---FDAELCTIAFYIADCAV------IEDHTTAGKTPRLELKSDPIYEVPCATI
DVPL
>New|UniRef100_A0A1B3Q5V5 NS3 protein n=1 Tax=Rousettus bat coronavirus TaxID=1892416 RepID=A0A1B3Q5V5_9BETC
IRVHSMAPFVSTADNFAVLRTT
CSRFVFPVESSKDNVVVLTTSRGVF---------CNGIHVEGPTALSDNASIVSLFSTTV
LLLDRVEQGYDY---TVFVYISQQILRNSE--------SNPQGVVNPEFD----------
DVEL
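
To put a rough number on "somewhat conserved", one option is to count identical residues between two aligned rows. The sketch below is only one convention (columns with a gap in either row are skipped). The function name aligned_identity and the two constants are hypothetical; the constants are the second wrapped block of the Pz1844 and A0A088DI21 rows copied from the alignment above.

```python
def aligned_identity(row_a, row_b, gap="-"):
    """Return (matches, compared) for two rows of the same alignment.

    Columns where either row has a gap are not counted. The rows must
    come from the same alignment block and so have equal length.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# Second wrapped block of two rows from the alignment above.
PZ1844_BLOCK = "CYDYCIPYNSVTSSI-VITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDY"
BAT_HP_BLOCK = "GKTYCMPFDPNEPYLTLVVHQNGIT---------CGSYKLYGDVSIADRIYLVTLTKSVP"

if __name__ == "__main__":
    matches, compared = aligned_identity(PZ1844_BLOCK, BAT_HP_BLOCK)
    print(f"{matches} identical out of {compared} gap-free columns")
```

Because the alignment is wrapped into several blocks per sequence, a full comparison would first join each sequence's blocks and check that the joined rows have the same length.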

This probably won't work, but just for fun I asked Rosetta's GREMLIN to figure out which residues change together. This sort of correlation usually indicates close contact in the folded protein: http://openseq.org/sub.php?id=1591034373. The last time GREMLIN was run on this family was in 2013, without enough sequences to work with, and I don't really see that changing much: http://gremlin.bakerlab.org/pfam.php?id=PF11289&year=2013.
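
The idea of residues "changing together" can be illustrated with a much simpler statistic than the one GREMLIN uses: the mutual information between two alignment columns. The sketch below is a toy stand-in, not GREMLIN's method, and the function name column_mutual_information is hypothetical. With only a handful of sequences, as here, the values would be dominated by noise.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j, gap="-"):
    """Mutual information in bits between columns i and j (0-based).

    msa is a list of equal-length aligned sequences. Sequences with a
    gap in either column are left out of the estimate.
    """
    pairs = [(seq[i], seq[j]) for seq in msa if seq[i] != gap and seq[j] != gap]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    toy_msa = ["AC", "AC", "GT", "GT"]  # the two columns always change together
    print(column_mutual_information(toy_msa, 0, 1))
```

Columns that vary independently score near zero, while columns that always change together score high. Real covariation tools also correct for sequence redundancy and for indirect correlations, which this sketch does not attempt.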