
1844: Coronavirus ORF3a Prediction

Closed since almost 6 years ago

Intermediate Overall Prediction

Summary


Created
May 28, 2020
Expires
Max points
100
Description

Fold this coronavirus protein! This is a portion of a larger protein encoded in the viral genome of SARS-CoV-2. It is encoded in a region of the genome called ORF3a, but the protein's structure and function are still unknown. If we knew how this protein folds, we might be able to figure out its exact function. The puzzle's starting structure shows secondary-structure (SS) predictions from PSIPRED, which hint at which parts of the protein might fold into helices or sheets. Refold this protein to find high-scoring solutions, which will tell us how this protein is most likely to fold!



Sequence:


WKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTIDGSSGVVNPVMEPIYDEPTTTTSVPL

Top groups


  11. FoldIt@Netherlands 5 pts. 9,514
  12. Chem Eng Thermo 4 pts. 9,340
  13. BOINC@Poland 2 pts. 9,234
  14. Rechenkraft.net 2 pts. 8,129
  15. Team China 1 pt. 8,126
  16. Trinity Biology 1 pt. 6,557
  18. Russian team 1 pt. 6,093
  19. SETI.Germany 1 pt. 2,865

  1. Scopper Lv 1 100 pts. 11,000
  2. mirp Lv 1 99 pts. 10,933
  3. actiasluna Lv 1 97 pts. 10,924
  4. grogar7 Lv 1 95 pts. 10,923
  5. lynnai Lv 1 94 pts. 10,923
  6. phi16 Lv 1 92 pts. 10,908
  7. malphis Lv 1 90 pts. 10,833
  8. Bruno Kestemont Lv 1 89 pts. 10,817
  9. fiendish_ghoul Lv 1 87 pts. 10,777
  10. drumpeter18yrs9yrs Lv 1 85 pts. 10,767

Comments


bkoep Staff Lv 1


Conf: 962248952023898999555885446656789996999839985466210130014604
Pred: CCCCCCCCCCCCCCEEEEEECCCCCEEEECCCCCCEEEEECCCCCCCCCCCCCCCCCCEE
  AA: WKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPISEHDYQIGGYT
              10        20        30        40        50        60


Conf: 113344213899998535773786403210278740588898554368925430788941
Pred: EECCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCEEEEEEEEECCCCCHHCEEEEEEE
  AA: EKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTID
              70        80        90       100       110       120


Conf: 7988336877533478984002479
Pred: CCCCCCCCCCCCCCCCCCCCCCCCC
  AA: GSSGVVNPVMEPIYDEPTTTTSVPL
             130       140
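
For anyone who wants residue ranges instead of reading the strings by eye, here is a minimal sketch that collapses the Pred lines above into runs of strand (E), helix (H) and coil (C). The function name `ss_segments` is made up for this example, and the Pred string is pasted by hand from the output above, so treat the ranges as only as good as that copy.

```python
from itertools import groupby

# The three "Pred:" lines from the PSIPRED output above, joined in order.
pred = (
    "CCCCCCCCCCCCCCEEEEEECCCCCEEEECCCCCCEEEEECCCCCCCCCCCCCCCCCCEE"
    "EECCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCEEEEEEEEECCCCCHHCEEEEEEE"
    "CCCCCCCCCCCCCCCCCCCCCCCCC"
)

def ss_segments(pred, offset=1):
    """Return (state, start, end) runs; residue numbers are 1-based, inclusive."""
    segments = []
    pos = offset
    for state, run in groupby(pred):
        length = len(list(run))
        segments.append((state, pos, pos + length - 1))
        pos += length
    return segments

segments = ss_segments(pred)

# Print only the predicted strands and helices; coil is everything in between.
for state, start, end in segments:
    if state != "C":
        print(f"{state} {start}-{end}")
```

The Conf digits are ignored here; filtering out low-confidence runs would be a natural next step.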

Artoria2e5 Lv 1

This is presumably the cytoplasmic part of ORF3a (UniProt P59632 for the very similar SARS-CoV-1 version). In line with this reducing environment, DiANNA predicts no disulfides.

This thing is somewhat conserved in other betacov genomes; bat versions include A0A088DI21 and A0A1B3Q5V5. Below is an alignment of these guys; you can see the human-infecting SARSr-CoVs have some extensions not found in the two bat versions.

>Pz1844
WKCRSKNPLLYDANYFLCWHTN
CYDYCIPYNSVTSSI-VITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDY
YQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTIDGSSGVVNPVMEPIYDEPTTTCT
SVP-
>New|UniRef100_A0A088DI21 Uncharacterized protein n=1 Tax=Bat Hp-betacoronavirus/Zhejiang2013 TaxID=1541205 RepID=A0A088DI21_9BETC
IRCKSLVPLCADDDCFVNYNAG
GKTYCMPFDPNEPYLTLVVHQNGIT---------CGSYKLYGDVSIADRIYLVTLTKSVP
YSLQNI---FDAELCTIAFYIADCAV------IEDHTTAGKTPRLELKSDPIYEVPCATI
DVPL
>New|UniRef100_A0A1B3Q5V5 NS3 protein n=1 Tax=Rousettus bat coronavirus TaxID=1892416 RepID=A0A1B3Q5V5_9BETC
IRVHSMAPFVSTADNFAVLRTT
CSRFVFPVESSKDNVVVLTTSRGVF---------CNGIHVEGPTALSDNASIVSLFSTTV
LLLDRVEQGYDY---TVFVYISQQILRNSE--------SNPQGVVNPEFD----------
DVEL
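
To put a rough number on "somewhat conserved", here is a small sketch that computes percent identity between aligned rows, skipping any column where either row has a gap. It only uses the first 22-column block of each record above to keep things short; `percent_identity` and the short labels `pz`, `hp` and `ro` are names invented for this example, and other tools may define identity differently (for instance, by counting gap columns).

```python
from itertools import combinations

def percent_identity(a, b):
    """Percent identity over columns where neither aligned row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# First alignment block of each record above (22 columns, no gaps).
rows = {
    "pz": "WKCRSKNPLLYDANYFLCWHTN",  # Pz1844
    "hp": "IRCKSLVPLCADDDCFVNYNAG",  # A0A088DI21
    "ro": "IRVHSMAPFVSTADNFAVLRTT",  # A0A1B3Q5V5
}

for (name_a, seq_a), (name_b, seq_b) in combinations(rows.items(), 2):
    print(f"{name_a} vs {name_b}: {percent_identity(seq_a, seq_b):.1f}% identity")
```

The length check matters if you paste in the full rows: the function refuses to compare rows that do not line up column for column.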

This probably won't work, but just for the sake of fun I told Rosetta's GREMLIN to try and figure out which residues change together. This sort of correlation is usually indicative of close contact in the actual protein: http://openseq.org/sub.php?id=1591034373. The last time GREMLIN tried this family was in 2013, without enough sequences to work with, and I don't really see that changing much: http://gremlin.bakerlab.org/pfam.php?id=PF11289&year=2013.
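
To illustrate what "changing together" means, here is a toy sketch that scores two alignment columns by mutual information. This is not what GREMLIN does (GREMLIN fits a statistical model over all columns at once); plain mutual information is just the simplest stand-in for the idea. The function name `column_mi` and the four-row alignment are made up for this example, and a real analysis needs far more sequences than are available here.

```python
from collections import Counter
from math import log2

def column_mi(rows, i, j):
    """Mutual information in bits between alignment columns i and j (0-based)."""
    n = len(rows)
    count_i = Counter(r[i] for r in rows)
    count_j = Counter(r[j] for r in rows)
    count_ij = Counter((r[i], r[j]) for r in rows)
    return sum(
        (c / n) * log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )

# Hypothetical toy alignment: columns 0 and 1 always change together,
# while column 2 varies independently of column 0.
toy = ["AKE", "AKD", "GRE", "GRD"]

print("columns 0,1:", column_mi(toy, 0, 1))
print("columns 0,2:", column_mi(toy, 0, 2))
```

A high score between two columns suggests the residues may be in contact, but with only a handful of sequences the scores are dominated by noise and shared ancestry.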