We are very excited about this puzzle, because our collaborators who provided us with the 4 structures from Foldit's 2019 cryo-EM paper have been unable to solve this cryo-EM structure.
This will not be easy, because the full density has 46 symmetric monomers and we can't cut out a single copy of density for you (since we don't know the structure). Instead, we've sliced off a quarter of the entire density, which our collaborators estimate should contain ~10 monomers. Keep this in mind: we are only providing you with a single monomeric chain to fold, even though the density covers several copies. More details and pictures in the puzzle comments!
I missed today's office hour, so below are some questions I have:
(1) In https://fold.it/portal/node/2011305#comment-44223
where is the RNA? Is it inside the empty hole at the center
of the donut, so in effect all of the ED cloud we are given
is from protein? Is the RNA mixed with protein in the ED
cloud we are given? What does the ED typically look like
for RNA? Long ago you gave us images of the ED shapes for
different protein sidechains. Are there similar pictures
you could give us for pieces of RNA?
(2) When the protein interacts with RNA, should we expect
hydrophobic or hydrophilic residues at the protein/RNA
interface? Does this mean the inner surface of the donut
should be full of hydrophobic residues? How about the
outer surface of the donut? Are there certain protein
sidechains or sequences that often bind to RNA?
https://www.rcsb.org/3d-view/1CGM gives a 3D image of
a donut structure that you can move and rotate on the
screen. I think the brown part is protein and the green
part is RNA. In the brown part, each protein subunit has
both its termini at the outer circle of the donut. The
green part (RNA?) is close to the inner circle of the
donut, but some brown parts are even closer to the inner
circle.
https://www.rcsb.org/structure/1cgm has 2 images:
(1) "Biological Assembly 1" looks like our donut's ED.
You can download this image as 1cgm_assembly-1.jpeg.
(2) "Asymmetric Unit" gives 1cgm_model-1.jpeg, which
has a brown part that looks like protein and a green
part that could be RNA. If you click on the right
spot, it goes to the link below, which is a rotatable
3D image of a single protein/RNA subunit:
https://www.rcsb.org/3d-view/1CGM/0
Overall it seems like 1CGM has 4 helices and many
long loops per subunit.
If Puzzle 1964's structure is like 1CGM's,
most of the ED is due to protein, and only
a single RNA subunit (A C G or U) is
included for each protein subunit. Should
we expect all of the RNA subunits to be
identical (like an RNA sequence of all A's),
or should we expect there to be a mix of
A C G & U in the ED? In 1964, should we
expect the overall RNA to be one long strand,
or should we expect there to be base-pairing
(A-U and C-G) between 2 long strands of RNA?
Thanks again,
Jeff
I think AAPAAP may be referring to the fact that if you have a quarter donut, and you shave off the top quarter and bottom quarter of that piece, you will be left with an eighth of a donut.
By the way, even though it would make the cloud bigger, I would prefer the top and bottom not be sliced off, because they would help define what makes a monomer unit. I have one helix that could go either on the top or the bottom of the other ones - if I could see the top and bottom of the donut, I would know which side it goes on.
Keep in mind 1CGM is not the same virus as ours - I just typed in 'mosaic virus' because I knew they had a similar capsid structure and took the first one that looked good. The psipred predicted secondary structure that came with the puzzle is more likely to be accurate than trying to match the secondary structure of 1CGM.
Thank you, Susume, that is what I meant. I cannot clearly see from the cryo-EM map provided where the top or bottom of the plant virus nucleocapsid is, even though I can identify part of the RNA helix. My other big problem is how to fold the protein (by trial and error, I think). The Jpred prediction above shows some sheet structure; are you using only α-helices and loops, or β-sheets as well?
Finally, I do not know how to anchor approximately known points of the protein to approximately known points of the cryo-EM map. That will take some strategy and practice with cryo-EM density maps in Foldit.
Hi Jeff,
Yes, most of the density is due to protein. There are only a few nucleotides per protein subunit, probably between 3 and 5 (but let's consider this to be an unknown). All the RNA bases will appear identical: they are averaged out by the image processing and therefore represent an average of all possible bases present in the viral genome. The RNA backbone should be clearly identifiable.
The overall RNA should be one long, single strand (so, no base pairing).
Hope that helps
A.
My trROSETTA run (on YangLab's webserver) used this as its top match when coming up with its end result: https://www.rcsb.org/3d-view/ngl/3pdm
Though 1CGM was its second match.
My 'tr' result ended up being similar, but still had a few differences.
I hadn't realized that the mega-structure these form is made up of so many of these monomers, so today when I went in and mapped where I thought each monomer was, I was coming up with only 3 or so… After seeing that one I now realize there are roughly a dozen in our section of cloud… DOH!
Back to the drawing board tomorrow! :(
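For anyone else recounting monomers, here is a rough sanity check of the numbers quoted in this thread. It is only a sketch: it assumes the 46 monomers are spread evenly through the donut, which may not hold at the cut edges, and the function names are made up for illustration. The 46, the quarter slice, the "eighth of a donut" and the 3-5 nucleotides per subunit all come from the posts above.

```python
from fractions import Fraction

TOTAL_MONOMERS = 46  # full density, from the puzzle description


def monomers_in(fraction_of_donut, total=TOTAL_MONOMERS):
    """Expected monomer count in a slice, ASSUMING an even spread by volume."""
    return total * fraction_of_donut


def nucleotide_range(n_monomers, per_monomer=(3, 5)):
    """Low/high nucleotide count, using the 3-5 per subunit estimate above."""
    low, high = per_monomer
    return n_monomers * low, n_monomers * high


quarter = Fraction(1, 4)            # the slice we were given
eighth = quarter * Fraction(1, 2)   # quarter with top and bottom quarters shaved off

quarter_count = monomers_in(quarter)
eighth_count = monomers_in(eighth)

print("monomers in a quarter:", float(quarter_count))
print("monomers in an eighth:", float(eighth_count))
print("nucleotides in a quarter:", [float(n) for n in nucleotide_range(quarter_count)])
```

Under the even-spread assumption a quarter works out to about 11.5 copies, which sits between the collaborators' estimate of ~10 and the "roughly a dozen" count above.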