
loop unrolling in pack_memory #1219

Open: wants to merge 1 commit into base: master

Conversation

rndmcnlly

To improve the state saving time for very large and sparse memories, we unroll the loop that checks for non-zero pages to work with blocks of 64 bytes per iteration.

Context: Towards implementing playable quotes for PC games, I want to take whole-system snapshots quite frequently (e.g. once or more per second). I found that the scan for non-zero pages was dominating the overall time spent saving state. It looks like the code was already somewhat optimized for speed (by working with mem32 rather than mem8), so this enhancement, while somewhat ugly, might fit the surrounding code. In microbenchmarks on a VM with 128MB of guest memory (and an M3 Max host processor), this led to a ~2x speedup for the loop in question. Deeper unrolling helps slightly more on my processor, but that is probably overspecializing for my wide cache lines (128 bytes). The code here only exploits 64-byte lines.
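For readers who want a concrete picture of the idea, here is a minimal sketch of the unrolling, not the actual diff. It assumes 4096-byte pages, a memory size that is a multiple of the page size, and a 32-bit view over the same buffer as the byte view; the names `PAGE_SIZE`, `pageIsZeroSimple`, `pageIsZeroUnrolled`, and `findNonZeroPages` are hypothetical and chosen only for illustration. The unrolled variant ORs sixteen 32-bit words (64 bytes) together and branches once per block instead of once per word.

```javascript
// Illustrative sketch only: names and page size are assumptions, not the PR's code.
const PAGE_SIZE = 4096;                 // bytes per page (assumed)
const WORDS_PER_PAGE = PAGE_SIZE >> 2;  // 1024 32-bit words per page

// Baseline: one 32-bit word and one branch per iteration.
function pageIsZeroSimple(mem32, page) {
    const start = page * WORDS_PER_PAGE;
    const end = start + WORDS_PER_PAGE;
    for (let i = start; i < end; i++) {
        if (mem32[i] !== 0) return false;
    }
    return true;
}

// Unrolled: sixteen 32-bit words (64 bytes) and one branch per iteration.
// ORing the words together is non-zero if and only if any word is non-zero.
function pageIsZeroUnrolled(mem32, page) {
    const start = page * WORDS_PER_PAGE;
    const end = start + WORDS_PER_PAGE;
    for (let i = start; i < end; i += 16) {
        const block =
            mem32[i]      | mem32[i + 1]  | mem32[i + 2]  | mem32[i + 3]  |
            mem32[i + 4]  | mem32[i + 5]  | mem32[i + 6]  | mem32[i + 7]  |
            mem32[i + 8]  | mem32[i + 9]  | mem32[i + 10] | mem32[i + 11] |
            mem32[i + 12] | mem32[i + 13] | mem32[i + 14] | mem32[i + 15];
        if (block !== 0) return false;
    }
    return true;
}

// Collect the indices of pages that contain at least one non-zero byte.
// `isZero` is whichever page predicate above you want to exercise.
function findNonZeroPages(mem8, isZero) {
    const mem32 = new Int32Array(mem8.buffer, mem8.byteOffset, mem8.byteLength >> 2);
    const pageCount = mem8.byteLength / PAGE_SIZE;
    const nonZero = [];
    for (let page = 0; page < pageCount; page++) {
        if (!isZero(mem32, page)) nonZero.push(page);
    }
    return nonZero;
}
```

Both predicates should agree on every input; only the number of branches per page differs (1024 versus 64 in this sketch). Whether that translates into a speedup depends on the JavaScript engine and the host CPU, so it is worth measuring on the target machine.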
