Skip to content

Add Windows warm daemon IPC#641

Open
nsxdavid wants to merge 2 commits into
justrach:release/0.2.5828from
nsxdavid:windows-ipc-0.2.5828
Open

Add Windows warm daemon IPC#641
nsxdavid wants to merge 2 commits into
justrach:release/0.2.5828from
nsxdavid:windows-ipc-0.2.5828

Conversation

@nsxdavid

Copy link
Copy Markdown
Contributor

Summary

  • Enable the Windows warm-daemon CLI proxy using named-pipe IPC and a Windows daemon lock.
  • Add Windows detached process spawning and bounded subprocess capture needed by daemon startup and local test coverage.
  • Keep the existing POSIX socket/flock path intact; Windows-specific behavior is selected through platform checks.
  • Add a Windows-only warm-daemon test that exercises cold daemon startup followed by a proxied warm query from a Unicode project root.
  • Fix Windows-local test blockers around heap allocation for large watcher queues, environment helpers, rerank trace file access, and POSIX-only tests.

Refs #621.

Validation

  • zig build --global-cache-dir .zig-global-cache test --summary all
    • 23/23 steps succeeded; 858/862 tests passed (4 skipped)
  • zig build --global-cache-dir .zig-global-cache --summary all
    • 3/3 steps succeeded
  • PYTHONIOENCODING=utf-8 python scripts/e2e_mcp_test.py --binary zig-out/bin/codedb.exe --project .
    • 20/20 passed

@nsxdavid nsxdavid marked this pull request as ready for review June 24, 2026 22:36
@nsxdavid

nsxdavid commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Quick note on the benchmark workflow change in 6541fad:

The first CI run completed the benchmark comparison successfully, but failed afterward when the workflow tried to post bench-report.md as a PR comment from a forked PR token:

Resource not accessible by integration

The compare step still fails the job for real benchmark regressions. This change only makes the final reporting comment best-effort when GitHub does not grant the fork PR token permission to create comments.

The benchmark report is still available in the GitHub Actions run under the bench-results artifact, which includes bench-base.json, bench-head.json, and bench-report.md.

@nsxdavid

Copy link
Copy Markdown
Contributor Author

@justrach this is ready for review now.

I targeted release/0.2.5828 as requested, kept the Windows-specific warm-daemon / named-pipe IPC behavior behind platform checks, and included the platform-gated Windows tests plus local Windows validation results in the PR description. CI is green now as well.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6541fad91c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cli_proxy.zig
var buf: [std.fs.max_path_bytes]u8 = undefined;
const p = std.fmt.bufPrintZ(&buf, "{s}/cli-daemon.lock", .{data_dir}) catch return null;
if (builtin.os.tag == .windows) {
const handle = CreateFileA(p.ptr, GENERIC_READ | GENERIC_WRITE, 0, null, OPEN_ALWAYS, 0, null);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use wide Windows file APIs for the daemon lock

When the Windows data directory contains non-ASCII characters, for example under a localized %USERPROFILE%, this new lock path is still passed to CreateFileA, which interprets the bytes through the process ANSI code page rather than as the UTF-8 paths used elsewhere. In that case daemonLockAvailable/daemonLockTryAcquire can fail or open a mojibake path, so the warm daemon never starts or cannot hold the intended per-project lock; use the wide API (or std.fs) for this path.

Useful? React with 👍 / 👎.

Comment thread src/cio.zig
Comment on lines 748 to +749
if (is_windows) {
spawnDetachedWindows(allocator, argv);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop Windows cli-daemons before nuke removes data

Enabling this Windows detached spawn means ordinary Windows queries now leave an automatic cli-daemon running, but killOtherCodedbProcesses in src/nuke.zig still returns immediately on Windows because it assumed there was no warm daemon to reap. If a user runs codedb nuke during the daemon's idle window, that process can keep files/mappings under ~/.codedb/projects open and Windows deleteTree can fail; the Windows daemon needs a shutdown/reap path before uninstalling.

Useful? React with 👍 / 👎.

@justrach justrach left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking up the named-pipe IPC follow-up @nsxdavid — the MSVCRT argv quoting + round-trip test and the response-size cap are genuinely clean. Before it lands I'd like to tighten the trust model on a couple of the newly-enabled Windows paths. Two main, two minor.

1. Named-pipe squatting → spoofed CLI output + daemon DoS (Medium)
The pipe name is predictable (\\.\pipe\codedb-{Wyhash(0xc0de, root)}-{Wyhash(0xc0de, %USERNAME%)}) and the listener closes + re-creates the instance every loop. A local process can pre-create that name (or win the close/re-create gap):

  • our CreateNamedPipeA then fails and the handler does shutdown.store(true); return; → daemon dies (one-shot DoS);
  • the client opens the name with OPEN_EXISTING, never checks who owns the server end, and writes the reply straight to the terminal (cio.File.stdout().writeAll). A squatter receives the request blob (project path + argv) and returns arbitrary bytes → spoofed output + ANSI-escape injection.

SECURITY_SQOS_PRESENT | SECURITY_IDENTIFICATION on the client is the right call (stops a malicious server impersonating us) but doesn't authenticate the server to the client. Suggested:

  • Client verifies server: GetNamedPipeServerProcessId → compare that PID's token-user SID to our own; bail on mismatch. (Main fix — defeats squat + instance-hijack regardless of name predictability.)
  • Don't shutdown on a single CreateNamedPipeA failure; log + retry.
  • Close the re-create gap (pre-create the next instance before closing the current, or overlapped accept).

2. runCapture runs unqualified git in the repo cwd → potential Windows binary-planting (High, confirming)
runCaptureWindows now actually runs getGitHead's git rev-parse HEAD with argv[0]="git" and cwd = <repo root>. If Windows resolution searches the cwd (incl. PATHEXT), a git.exe/git.bat planted in a repo executes when codedb tags the snapshot SHA — which is automatic. I'm reproducing this on a Windows box; if it confirms, the fix is to resolve git to an absolute path (PATH search excluding cwd) before spawning. Same care for the other unqualified execs.

3. self-update re-enabled on Windows (Low)runCapture also un-gates download-and-replace; just confirm the Windows update path is TLS + checksum/signature verified.

4. pipe DACL (Low)D:P(A;;GA;;;OW) keys on Owner Rights; for an elevated process the owner can default to Administrators, and there's no integrity label. Consider pinning the ACE to the current user's token SID and adding S:(ML;;NW;;;ME).

Happy to pair on the client-side SID check — that one change closes most of the surface. Thanks again!

@justrach

Copy link
Copy Markdown
Owner

Confirmed F2 (binary-planting → code execution) on a real Windows box — and it's a source-level guarantee, not luck.

Empirical PoC (Windows Server, codedb.exe built from this PR's HEAD with Zig 0.16.0):

  1. plant C:\evilrepo\git.batecho PWNED... > C:\codedb_poc_marker.txt
  2. codedb.exe C:\evilrepo index✓ indexed ✓ index ready 2 files
  3. C:\codedb_poc_marker.txt now exists, contents PWNED_BY_PLANTED_GIT_BAT

So indexing a repo that merely contains a git.bat/git.exe/git.cmd runs it. The trigger is automatic: scanBggetGitHead(root)runCapture({ argv = {"git","rev-parse","HEAD"}, cwd = root }).

Why (Zig 0.16.0 std/Io/Threaded.zig, processSpawnWindows): for an unqualified argv[0], dir_buf is seeded with the child cwd (the repo root) and searched before PATH, wildcard + PATHEXT-expanded, with .bat/.cmd routed through cmd.exe:

// "The cwd provided by options is in effect when choosing the executable path to match POSIX semantics."
const cwd_path_w = ... else if (cwd_w) |cwd| break :x cwd;          // child cwd = repo root
var dir_buf: std.ArrayList(u16) = .empty;
if (cwd_path_w.len > 0) try dir_buf.appendSlice(arena, cwd_path_w);  // dir_buf seeded with cwd
windowsCreateProcessPathExt(arena, &dir_buf, &app_buf, PATHEXT, ...) // searched FIRST, before PATH

So even on a box with real git installed, the planted copy wins (cwd is first in the search).

Fix: resolve git to an absolute path (a PATH search that excludes cwd) before runCapture, or otherwise keep cwd out of the executable search. Same care for the other unqualified execs (telemetry/nuke). Happy to send a patch.

@nsxdavid

Copy link
Copy Markdown
Contributor Author

@justrach working the identified issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants