perf(core): reduce snapshot deserialization overhead#33321
Open
bartlomieju wants to merge 2 commits intomainfrom
Open
perf(core): reduce snapshot deserialization overhead#33321bartlomieju wants to merge 2 commits intomainfrom
bartlomieju wants to merge 2 commits intomainfrom
Conversation
1. Clear module requests before snapshotting — the requests contain ModuleReference with parsed URLs (ModuleSpecifier) that trigger ~2500 Url::parse calls during deserialization. Snapshotted modules are already instantiated and linked, so their import requests are never needed at runtime. 2. Use create_external_onebyte_const_unchecked for snapshot and extension sources — these were already validated as ASCII during snapshot creation / transpilation, so the per-source ASCII check is redundant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace FastString/ModuleInfo/SymbolicModule in the snapshot data with snapshot-specific types that use Cow<'a, str> for string fields. This allows bincode to borrow strings directly from the snapshot buffer during deserialization instead of heap-allocating copies via FastString::Deserialize. Also omits ModuleInfo.requests from the snapshot form entirely (moved from the previous commit's clear-after-serialize to not serializing them at all), eliminating the ~2500 Url::parse calls and their associated ModuleRequest/ModuleReference allocations from the serialized data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member
|
Does this actually improve startup time? Also this segfault is because you need to use cargo nextest for running the deno core tests. |
Member
Author
TBH I'm not sure anymore; the first few times I captured flamegraphs I saw quite a few percent spent in Url::parse and other op setup stuff but I can't reproduce it anymore. We can just close this one for now. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reduces V8 snapshot sidecar deserialization overhead at startup:
Eliminate ~2500
Url::parsecalls: Module requests (ModuleRequestcontainingModuleSpecifier/url::Url) were serialized into the snapshot but never used at runtime — snapshotted modules are already instantiated and linked. Now cleared before serialization.Zero-copy string deserialization: Introduced snapshot-specific types (
ModuleInfoSnapshot,SymbolicModuleSnapshot) that useCow<'a, str>instead ofFastString. Bincode borrows strings directly from the&'staticsnapshot buffer.cow_to_fast_stringconvertsCow::BorrowedtoFastString::Staticvia transmute (safe because snapshot buffers are alwaysBox::leak'd to'static). Eliminates ~560 heap allocations.Skip redundant ASCII validation: Uses
create_external_onebyte_const_uncheckedinexternalize_sourcesfor snapshot and extension sources that were already validated during snapshot creation / transpilation.Context
Profiling showed
Url::parseconsuming ~12% of sidecar deserialization time during startup. The sidecar is 18KB vs 815KB for the V8 snapshot itself, but every byte of it was being actively processed (parsing URLs, allocating strings). These changes make sidecar deserialization nearly zero-cost.Test plan
deno run,deno evalwork correctlynode:path,Buffer,process) workscargo test -p deno_core --libpasses (402 passed, 1 pre-existing SIGSEGV ines_snapshottest unrelated to this change)🤖 Generated with Claude Code