Mutate does indeed use Monte Carlo. (It's the same algorithm that's used in Shake, it just considers rotamers from multiple amino acid identities, not just the one that's currently there.)
Specifically, it uses an approach called Metropolis Monte Carlo Simulated Annealing. The upshot is that, while random, it doesn't actually have to consider all combinations. Instead, what it does is do a random walk across the energy landscape, using information about how it's doing currently to inform where it needs to go.
You can make an analogy of someone with their eyes closed trying to climb a mountain. You make a step in a random direction, and if that's going uphill, you move your "starting point" to that spot and repeat. If it's downhill, you move back to where you were, mostly. The issue is that never-downhill will leave you stuck on small humps instead of climbing upward. To combat that, the Metropolis criteria allows you to go slightly downhill sometimes, in the hope of crossing the valley and finding the bigger hill that's across the way. (The "simulated annealing" bit refers to the fact that how big a step downhill you're allowed to take changes over time - large in the beginning, but progressively smaller as the process progresses.)
From the analogy, you can see that while you'll stumble around the mountain a bit, you won't have to cover all of it. Once you get pointed uphill, you'll reach the top relatively quickly. In fact, if you do it right, you can prove mathematically that (in the limit) you'll spend time in each state in proportion to their score. (So, very little time in bad scoring states, and most of your time in the good scoring states.)
The algorithm used is the same as that used by the Baker lab (and the rest of the Rosetta community) to do their design runs. (Often there's more complications added on top with respect to optimizations, but the actual mutations are found by this algorithm.)
The concept of pruning the number of possibilities is a good one, and we do indeed attempt to remove rotamers/identities which will likely never work. Most often this comes in the form of removing rotamers which clash with the backbone (or other fixed residues), as they will likely never form part of the productive final set. But attempting to do more pruning tends not to be productive, as (outside of those major clashes) the energies of the different rotamers tend to be either similar or context dependent. (For example, a particular rotamer could be bad, except if it happens to make a hydrogen bond with a nearby residue, in which case it's very good.)
There's various approaches such as "Dead End Elimination" which attempt to use various logic-based arguments to narrow down which rotamers (and combination of rotamers) are present in the best-scoring structure. Some work of combining DEE with Rosetta design has been made, but in practice what you see is that you spend about as much time logic-ing through the combinations as you save by the pruning, so there's not much gain.
The score is indeed a weighted combination of subscores, with known weights, but looking at subscores (rather than the full score) isn't likely to get you any benefits. (Aside from the already existing clash pruning.) Subscore energies tend to be compensatory, so you'll get rotamers with a marginally worse hiding score which enables better hydrogen bonding, or those which have worse electrostatic scores to get better packing scores. You can't uniformly pin the best residue identity on a single subscore - how the subscores combine to give you the best amino acid/conformer at that position is highly context dependent – especially when you have multiple residues which can mutate. (In your hypothetical, you might indeed want to be able to mutate the leucine to an alanine if some other position can then mutate to a tyrosine to fill in the packing gap and also make a hydrogen bond. And there's no real way to figure that out prior to considering it as a possibility.)