devprev scoring bug

Started by LociOiling

LociOiling Lv 1

Playing the current Refine Density test, I have a solution that shows 10785 but then drops to 10650 on any action. For example, select all, then apply autostructures, and the score drops. Start a recipe, and it reports the lower score at the start. So something's still wrong. I'd expect that a solution showing 10785 could be saved and then reloaded at that score. I don't expect autostructures to change the score. Simply getting the score at the start of a recipe should return the same score seen on the screen.

LociOiling Lv 1

Here are the steps I took, more or less:

  1. open puzzle on fresh client
  2. set low wiggle power
  3. shake for two cycles, wiggle sidechains, wiggle (S-E-W), repeat 3 or so times
  4. run Fuzes 3.0.3
  5. use refine density tool
  6. repeat S-E-W

At this point, I tried applying autostructures, and noticed the score drop. Restore credit best still has the higher score, however.

I'll try again with another fresh client, and report the results.

LociOiling Lv 1

Here's a second attempt. I may not have been looking that carefully at what the recipe did.

Release: V32-20240221-9f7b872844-win_x64-devprev
Puzzle: https://fold.it/puzzles/2013801
initial score: 2320.785

  1. open puzzle, clear selections (nothing selected)
  2. set low wiggle power in behavior options
  3. shake (S) for 2 cycles, score = 9928.216
  4. wiggle sidechains (E) until counter spins (~10 cycles), score = 10115.819
  5. wiggle (W) until it spins, score = 10401.052
  6. shake, score = 10401.265
  7. wiggle sidechains and wiggle find only small gains, score = 10401.274
  8. run Fuzes 3.0.3 with default settings, final score is 10650.130, but the recipe showed a score of 10785 earlier in the run
  9. restore credit best, score now 10785.203
  10. refine density, score back to 10650.130
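
The score sequence above can be checked mechanically. Here is a minimal Python sketch that uses only the scores listed in steps 3 to 10. The `find_drops` helper and its tolerance are illustrative assumptions and are not part of Foldit or its recipe API.

```python
# Scores copied from steps 3-10 above: (step number, action, score).
steps = [
    (3, "shake", 9928.216),
    (4, "wiggle sidechains", 10115.819),
    (5, "wiggle", 10401.052),
    (6, "shake", 10401.265),
    (7, "wiggle sidechains and wiggle", 10401.274),
    (8, "Fuzes 3.0.3 (final)", 10650.130),
    (9, "restore credit best", 10785.203),
    (10, "refine density", 10650.130),
]


def find_drops(log, tolerance=0.001):
    """Return (step, action, points lost) for each step scoring below the one before it.

    `tolerance` is an assumed threshold for ignoring rounding noise.
    """
    drops = []
    for (_, _, prev), (step, action, score) in zip(log, log[1:]):
        if score < prev - tolerance:
            drops.append((step, action, round(prev - score, 3)))
    return drops


for step, action, lost in find_drops(steps):
    print(f"step {step} ({action}): score fell by {lost}")
```

Only step 10 is flagged. Every earlier step, including the restore in step 9, raises the score.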

So it's not super clear to me what's going on.

beta_helix Staff Lv 1

Thank you for your detailed steps. We've been able to reproduce this as well and hope to figure out soon why this is happening!

LociOiling Lv 1

The new release does seem to be an improvement. I'm not sure if the scoring glitches are completely gone, however.

A first cycle of refining density followed by rebuilding went well.

After another refine density followed by shake and wiggle, everything still appeared normal.

Then after another refine density, the score began to drop on wiggle.

Restore credit best eventually found the high score again. It took several tries; a single control-c didn't seem to do the trick.

Then when I started Fuzes, the score dropped on the first score report in the recipe, which I believe is before any work is done. I stopped the recipe, restored credit best, and started the recipe again. This time the high score held.

It almost seems like there's a timing dependency, some asynchronous process that's involved in determining the score.

apetrides Staff Lv 1

Hi @LociOiling, I'm unable to replicate this issue. Could you please double-check that you are still experiencing it, and walk me through the steps required to reproduce it?

LociOiling Lv 1

I'm still seeing a problem in V33-20240305-0cb8bb8189-win_x64-devprev.

The problem seems to be with the credit best solution and the restore credit best function. A Refine Density can drop the score, and after this happens, the credit best solution is no longer stable when restored. The same issue also affects the "very best" solution, and may even hit "recent best" in some circumstances.

On the other hand, a manually saved solution is stable when restored.

My thought is that the automatically saved credit best pose doesn't include the density information found in a manually saved pose.

Manual saves are a workaround, but users often use restore credit best manually, and some recipes do as well. All the automatically saved poses need to work consistently, restoring the full context of when the save was created.

I shared "10646.192 manual save" and "Autosaved Credit Best Soloist Solution" with scientists to show the problem.

Here are the steps:

  1. On a fresh client, load https://fold.it/puzzles/2013801 (score 2320)
  2. Set low wiggle power
  3. Manually shake (S), wiggle (W), shake, wiggle (SWSW) (score 10336)
  4. Run Fuzes 1.5.1 https://fold.it/recipes/100080 (score 10446)
  5. Refine Density (score 10624)
  6. Manually SWSW again (score 10645)
  7. Run Fuzes again (score 10646)
  8. Refine Density again (score 10532)

Everything seems fine in steps 1 to 7.

The score drops at step 8, and doesn't recover (another round of SWSW and Fuzes gets it back to only 10552). I would normally abandon further work on that solution, and revert to the best scoring pose.

At this point, credit best restores the score to 10646, so everything still seems OK. But shake quickly drops the score back to 10532. That's a problem, since the previous 10646 was stable (shake and wiggle didn't change it). It seems like the updated density from step 8 wasn't replaced by the previous density from step 5.
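
The stale-density hypothesis can be illustrated with a toy model in Python. Everything in it is an assumption made for illustration: the `ToyClient` class, the pose and density labels, and the idea that an automatic save stores the pose and its displayed score but not the density. It is not based on the Foldit source. The two scores are the rounded values reported above.

```python
# Hypothetical lookup: the score depends on both the pose and the density in use.
SCORES = {
    ("pose_step7", "density_step5"): 10646,
    ("pose_step7", "density_step8"): 10532,
}


class ToyClient:
    """Toy stand-in for the client; not the real Foldit implementation."""

    def __init__(self, pose, density):
        self.pose, self.density = pose, density
        self.displayed = SCORES[(pose, density)]

    def save(self, with_density):
        # Assumption: a manual save keeps the density, an automatic save does not.
        return {
            "pose": self.pose,
            "displayed": self.displayed,
            "density": self.density if with_density else None,
        }

    def restore(self, saved):
        self.pose = saved["pose"]
        self.displayed = saved["displayed"]  # score shown right after the restore
        if saved["density"] is not None:
            self.density = saved["density"]

    def act(self):
        # Any action (shake, recipe start) rescores against the current density.
        self.displayed = SCORES[(self.pose, self.density)]
        return self.displayed


def demo():
    client = ToyClient("pose_step7", "density_step5")
    manual, auto = client.save(True), client.save(False)
    client.density = "density_step8"  # stands in for the second Refine Density
    client.restore(auto)
    shown = client.displayed          # looks fine immediately after the restore
    after_auto = client.act()         # drops on the first action
    client.restore(manual)
    after_manual = client.act()       # holds, because the density came back too
    return shown, after_auto, after_manual
```

In this model `demo()` returns `(10646, 10532, 10646)`, which matches the observed pattern: credit best looks right until the first action, while a manual save stays stable. That only shows the hypothesis is consistent with the symptoms, not that it is the actual cause.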

A "fresh client" means the puzzle hasn't previously been played by the logged-in user. I have a couple of dummy accounts used for this kind of test.

I made manual saves at each step listed above. Manual saves are files like "puzzle_2013801_time_1713550604.ir_solution" in c:\Foldit or the equivalent directory. The 10646 pose has a size of 58,701 bytes.

The automatically saved credit best solution is in the file "autosave-creditbest.ir_solution" when playing as a soloist. It would be found in "C:\Foldit\puzzles\0002013801*usernum\default", where "0002013801" is the puzzle number, *usernum is the user number, and "default" is the name of the current track. The 10646 creditbest has a size of 58,766 bytes.

It's not really clear to me where the density information is being stored. I compared directories before and after replacing a step 8 unstable solution with a stable manual save. I didn't see any file differences. Even after a restart, I don't see any file changes.
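
For anyone repeating that comparison, here is a minimal Python sketch of a before/after directory snapshot. The helper names `snapshot` and `diff_snapshots` are hypothetical, and the sketch assumes the files are small enough to read into memory. It is not necessarily how the comparison described above was done.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to (size in bytes, SHA-256 digest)."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            key = str(path.relative_to(root))
            result[key] = (len(data), hashlib.sha256(data).hexdigest())
    return result


def diff_snapshots(before, after):
    """List files that were added, removed, or changed between two snapshots."""
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(k for k in common if before[k] != after[k]),
    }


# Example use (the path is the install directory mentioned above):
#   before = snapshot(r"C:\Foldit")
#   ... replace the unstable solution with the manual save ...
#   after = snapshot(r"C:\Foldit")
#   print(diff_snapshots(before, after))
```

Hashing the contents rather than comparing sizes or timestamps catches a file that was rewritten with different data of the same length. If all three lists come back empty, the density is probably not being written to disk at that point.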

I don't think all eight steps I listed are necessary to reproduce the problem. I'll try a simplified version next.