fix: seed fast pseudo-random numbers to differ across processes#2663
fix: seed fast pseudo-random numbers to differ across processes#2663Amey Pawar (ameyypawar) wants to merge 5 commits into
Conversation
`gix-utils` and `gix-fs` drew fast pseudo-random numbers from `fastrand`'s global generators, whose per-thread seed is derived only from values that can coincide between separate processes (notably small thread IDs). Two concurrently running processes could therefore produce identical sequences, which weakens backoff jitter (meant to avoid a thundering herd of retries) and can make the filesystem capability probes collide on their temporary file names -- e.g. spuriously reporting that symlinks are unsupported (GitoxideLabs#1789). Add a `gix_utils::rng` module that draws from a per-thread `fastrand::Rng` seeded from a high-entropy OS source via `getrandom`, and route the affected call sites (backoff jitter and the `gix-fs` capability probes) through it. `getrandom` is a target-specific dependency that is excluded on `wasm32-unknown-unknown`, which has no entropy backend by default and runs a single process; there a best-effort seed is used that deliberately avoids `Instant::now()` (which panics on that target). For GitoxideLabs#1816
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: faa19e3139
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".
| use std::hash::{BuildHasher, Hash, Hasher}; | ||
|
|
||
| let mut hasher = std::collections::hash_map::RandomState::new().build_hasher(); | ||
| std::process::id().hash(&mut hasher); |
There was a problem hiding this comment.
Avoid calling process::id in the wasm fallback
On wasm32-unknown-unknown this fallback is always used, and std::process::id() traps at runtime on that target (no pids on this platform). That means the first call to gix_utils::rng::usize—for example through Quadratic::default_with_random()—will abort in browser/unknown WASM builds, whereas the previous fastrand path avoided unsupported OS APIs there. Use a wasm-specific seed that does not call std::process::id() or keep this call behind a non-wasm cfg.
Useful? React with 👍 / 👎.
`std::process::id()` traps at runtime on `wasm32-unknown-unknown`
("no pids on this platform"), and the fallback seed -- the only seed
path on that target -- called it, so the first `rng::usize` would
abort in browser/unknown wasm builds. On that single-process target,
where cross-process distinctness is moot anyway, defer to `fastrand`'s
own wasm-safe default seeding instead. The high-entropy `getrandom`
seed and the `process::id()`-based fallback now compile only off that
target.
| thread_local! { | ||
| static RNG: RefCell<fastrand::Rng> = RefCell::new(new_rng()); | ||
| } |
There was a problem hiding this comment.
This should probably use the lighter-weight Cell instead of RefCell. That is possible by making calls to the thread-local generator replace it with a zeroed-out instance, mutating the temporarily extracted instance to generate a number and advance the instance's state, and then replacing it back. fastrand uses Cell for its own thread-local Rng instance.
It feels intuitive to use RefCell for a thread-local PRNG, because a PRNG is often a large object that is expensive to copy. But fastrand implements the Wyrand algorithm, whose state is only 8 bytes. A fastrand::Rng object is really just a u64 under the hood.
(Because fastrand offers functions for generating a variety of different types, it has various helpers and macros it uses to wrap its usage of with in a way that reduces code duplication. Since the purpose of gix_utils::rng is to serve a small number of fast noncryptographic PRNG needs in gix-* crates, I don't think there's any need to structure the code like that here. But it may still be useful to look at the code of global_rng.rs in fastrand.)
While I think Cell is a better approach than RefCell here, I don't think this is a blocker for the PR. I don't know of any reason a RefCell-based approach cannot be correct. And if this is merged, then a follow-up PR could be made later to change to using Cell.
Acknowledgement: These thoughts are partly motivated by a discussion about #1816 that I had with Nav Ashok (@BitterKanegul) last month. (But any errors or misjudgments in this comment are mine.)
| @@ -21,3 +21,9 @@ bstr = ["dep:bstr"] | |||
| fastrand = "2.0.0" | |||
There was a problem hiding this comment.
fastrand has a default feature std, which is (only) needed for "global generator and global entropy." We needed that before, since we used its thread-local ("global") instance. But as of the change in this PR, we shouldn't need it anymore. So default-features = false can be used now.
This may also be a reasonable time to bump the fastrand dependency from 2.0.0 to 2.4.1.
Per review on GitoxideLabs#2663: - `fastrand::Rng` is a single `u64`, so a `Cell` -- move it out behind a cheap placeholder, advance it, and move it back -- is enough and avoids `RefCell`'s runtime borrow tracking, mirroring `fastrand`'s own thread-local generator. - We no longer use `fastrand`'s global generator, so its `std` default feature isn't needed: set `default-features = false` and bump to 2.4.1. The `wasm32-unknown-unknown` fallback therefore seeds with a fixed value instead of `Rng::new()` (which needs the std-gated global); that target is single-process, so a fixed seed is fine.
What
gix-utilsandgix-fsdraw fast pseudo-random numbers fromfastrand's global generators, whose per-thread seed is derived only from values that can coincide between separate processes — notably small thread IDs. Two concurrently running processes can therefore draw identical sequences, which:gix-lock), andgix-fscapability probes collide on their temporary file names — the spurious "symlinks unsupported" failure seen on CI (Is `overwriting_files_and_lone_directories_works` flaky? #1789).Fixes #1816.
How
Adds a small
gix_utils::rngmodule: a per-threadfastrand::Rngseeded once from a high-entropy OS source viagetrandom::u64(), so sequences differ across processes. The two production call sites are routed through it — the backoff jitter ingix-utils, and the three capability probes ingix-fs(whose now-unused directfastranddependency is dropped).wasm handling
getrandomis added as a target-specific dependency, excluded onwasm32-unknown-unknown, which has no entropy backend by default (and is single-process, so the cross-process concern doesn't apply there). On that target the seed falls back to a best-effort value that deliberately avoidsInstant::now(), since that panics onwasm32-unknown-unknown(#7319). Onwasm32-wasi*targetsgetrandomis included and uses the wasi backend.Note on the public API
This adds a new public
gix_utils::rngmodule (sogix-fscan use it across the crate boundary) — flagging in case it affects the version bump.Out of scope
The
gix-testtoolsport-shuffle (case 3 in the issue) is left as-is — the issue notes it isn't really a problem, only a minor test slowdown.Verification
gix-utils+gix-fsbuild and tests pass;cargo clippy -- -D warningsandcargo fmt --checkare clean.wasm32-unknown-unknown(getrandom excluded, fallback seed used) andwasm32-wasip1(getrandom included). Note: these were verified as builds, not at runtime (no local wasm host).