Skip to content

fix: seed fast pseudo-random numbers to differ across processes#2663

Open
Amey Pawar (ameyypawar) wants to merge 5 commits into
GitoxideLabs:mainfrom
ameyypawar:fix/1816-rng-seed
Open

fix: seed fast pseudo-random numbers to differ across processes#2663
Amey Pawar (ameyypawar) wants to merge 5 commits into
GitoxideLabs:mainfrom
ameyypawar:fix/1816-rng-seed

Conversation

@ameyypawar

Copy link
Copy Markdown
Contributor

What

gix-utils and gix-fs draw fast pseudo-random numbers from fastrand's global generators, whose per-thread seed is derived only from values that can coincide between separate processes — notably small thread IDs. Two concurrently running processes can therefore draw identical sequences, which:

Fixes #1816.

How

Adds a small gix_utils::rng module: a per-thread fastrand::Rng seeded once from a high-entropy OS source via getrandom::u64(), so sequences differ across processes. The two production call sites are routed through it — the backoff jitter in gix-utils, and the three capability probes in gix-fs (whose now-unused direct fastrand dependency is dropped).

wasm handling

getrandom is added as a target-specific dependency, excluded on wasm32-unknown-unknown, which has no entropy backend by default (and is single-process, so the cross-process concern doesn't apply there). On that target the seed falls back to a best-effort value that deliberately avoids Instant::now(), since that panics on wasm32-unknown-unknown (#7319). On wasm32-wasi* targets getrandom is included and uses the wasi backend.

Note on the public API

This adds a new public gix_utils::rng module (so gix-fs can use it across the crate boundary) — flagging in case it affects the version bump.

Out of scope

The gix-testtools port-shuffle (case 3 in the issue) is left as-is — the issue notes it isn't really a problem, only a minor test slowdown.

Verification

  • gix-utils + gix-fs build and tests pass; cargo clippy -- -D warnings and cargo fmt --check are clean.
  • Builds on wasm32-unknown-unknown (getrandom excluded, fallback seed used) and wasm32-wasip1 (getrandom included). Note: these were verified as builds, not at runtime (no local wasm host).

Amey Pawar (ameyypawar) and others added 3 commits June 21, 2026 18:32
`gix-utils` and `gix-fs` drew fast pseudo-random numbers from `fastrand`'s
global generators, whose per-thread seed is derived only from values that
can coincide between separate processes (notably small thread IDs). Two
concurrently running processes could therefore produce identical sequences,
which weakens backoff jitter (meant to avoid a thundering herd of retries)
and can make the filesystem capability probes collide on their temporary
file names -- e.g. spuriously reporting that symlinks are unsupported (GitoxideLabs#1789).

Add a `gix_utils::rng` module that draws from a per-thread `fastrand::Rng`
seeded from a high-entropy OS source via `getrandom`, and route the affected
call sites (backoff jitter and the `gix-fs` capability probes) through it.

`getrandom` is a target-specific dependency that is excluded on
`wasm32-unknown-unknown`, which has no entropy backend by default and runs a
single process; there a best-effort seed is used that deliberately avoids
`Instant::now()` (which panics on that target).

For GitoxideLabs#1816

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: faa19e3139

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".

Comment thread gix-utils/src/rng.rs
use std::hash::{BuildHasher, Hash, Hasher};

let mut hasher = std::collections::hash_map::RandomState::new().build_hasher();
std::process::id().hash(&mut hasher);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid calling process::id in the wasm fallback

On wasm32-unknown-unknown this fallback is always used, and std::process::id() traps at runtime on that target (no pids on this platform). That means the first call to gix_utils::rng::usize—for example through Quadratic::default_with_random()—will abort in browser/unknown WASM builds, whereas the previous fastrand path avoided unsupported OS APIs there. Use a wasm-specific seed that does not call std::process::id() or keep this call behind a non-wasm cfg.

Useful? React with 👍 / 👎.

`std::process::id()` traps at runtime on `wasm32-unknown-unknown`
("no pids on this platform"), and the fallback seed -- the only seed
path on that target -- called it, so the first `rng::usize` would
abort in browser/unknown wasm builds. On that single-process target,
where cross-process distinctness is moot anyway, defer to `fastrand`'s
own wasm-safe default seeding instead. The high-entropy `getrandom`
seed and the `process::id()`-based fallback now compile only off that
target.
Comment thread gix-utils/src/rng.rs
Comment on lines +16 to +18
thread_local! {
static RNG: RefCell<fastrand::Rng> = RefCell::new(new_rng());
}

@EliahKagan Eliah Kagan (EliahKagan) Jun 21, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably use the lighter-weight Cell instead of RefCell. That is possible by making calls to the thread-local generator replace it with a zeroed-out instance, mutating the temporarily extracted instance to generate a number and advance the instance's state, and then replacing it back. fastrand uses Cell for its own thread-local Rng instance.

It feels intuitive to use RefCell for a thread-local PRNG, because a PRNG is often a large object that is expensive to copy. But fastrand implements the Wyrand algorithm, whose state is only 8 bytes. A fastrand::Rng object is really just a u64 under the hood.

(Because fastrand offers functions for generating a variety of different types, it has various helpers and macros it uses to wrap its usage of with in a way that reduces code duplication. Since the purpose of gix_utils::rng is to serve a small number of fast noncryptographic PRNG needs in gix-* crates, I don't think there's any need to structure the code like that here. But it may still be useful to look at the code of global_rng.rs in fastrand.)

While I think Cell is a better approach than RefCell here, I don't think this is a blocker for the PR. I don't know of any reason a RefCell-based approach cannot be correct. And if this is merged, then a follow-up PR could be made later to change to using Cell.

Acknowledgement: These thoughts are partly motivated by a discussion about #1816 that I had with Nav Ashok (@BitterKanegul) last month. (But any errors or misjudgments in this comment are mine.)

Comment thread gix-utils/Cargo.toml Outdated
@@ -21,3 +21,9 @@ bstr = ["dep:bstr"]
fastrand = "2.0.0"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fastrand has a default feature std, which is (only) needed for "global generator and global entropy." We needed that before, since we used its thread-local ("global") instance. But as of the change in this PR, we shouldn't need it anymore. So default-features = false can be used now.

This may also be a reasonable time to bump the fastrand dependency from 2.0.0 to 2.4.1.

Per review on GitoxideLabs#2663:

- `fastrand::Rng` is a single `u64`, so a `Cell` -- move it out behind a
  cheap placeholder, advance it, and move it back -- is enough and avoids
  `RefCell`'s runtime borrow tracking, mirroring `fastrand`'s own
  thread-local generator.
- We no longer use `fastrand`'s global generator, so its `std` default
  feature isn't needed: set `default-features = false` and bump to 2.4.1.
  The `wasm32-unknown-unknown` fallback therefore seeds with a fixed value
  instead of `Rng::new()` (which needs the std-gated global); that target
  is single-process, so a fixed seed is fine.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pseudorandom numbers are sometimes the same across processes

2 participants