fix(store): preserve corrupt DB instead of deleting (#557) by yasku · Pull Request #620 · DeusData/codebase-memory-mcp

yasku · 2026-06-25T01:31:29Z

What does this PR do?

Fixes #557. The startup/query integrity check unlink()'d a project's .db (plus its -wal/-shm sidecars) on any anomaly, causing total data loss with no backup, no recovery path, and no visible warning. The trigger is frequently a false positive: a WAL leftover after a SIGKILL, a binary variant swap (standard ↔ ui), or a stale-lock break can all make a recoverable DB look corrupt.

This replaces the destructive unlink with a rename to <name>.db.corrupt.<timestamp> and emits a prominent stderr message, matching the issue's suggested fix. The user can now inspect, recover, or attach the file to a bug report.

Two call sites are fixed:

- src/mcp/mcp.c resolve_store(): the reported path, hit on the first query against a project.
- src/pipeline/pipeline.c try_incremental_or_delete_db(): only preserves on real corruption; a healthy DB rebuilt due to a mode change is still deleted, so .corrupt files don't accumulate on normal reindexes.

If the rename fails (for example on a permissions error or a cross-device move), the code falls back to unlink() so that no orphan files are left behind.
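A minimal sketch of the rename-then-fallback behavior described above, not the code from this PR. The helper name preserve_or_unlink, the Unix-seconds timestamp format, and the message wording are assumptions for illustration; the -wal/-shm sidecars are left out for brevity.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): move db_path aside as
 * "<db_path>.corrupt.<unix-seconds>" and fall back to unlink() if the
 * rename fails. The timestamp format is an assumption.
 * Returns 1 if preserved, 0 if deleted, -1 if neither succeeded. */
static int preserve_or_unlink(const char *db_path, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "%s.corrupt.%lld", db_path,
                     (long long)time(NULL));
    if (n < 0 || (size_t)n >= out_len) {
        return -1; /* destination name would be truncated: touch nothing */
    }
    if (rename(db_path, out) == 0) {
        fprintf(stderr, "WARNING: %s looked corrupt; preserved as %s\n",
                db_path, out);
        return 1;
    }
    fprintf(stderr, "WARNING: could not preserve %s (%s); deleting it\n",
            db_path, strerror(errno));
    out[0] = '\0'; /* nothing was preserved */
    return unlink(db_path) == 0 ? 0 : -1;
}
```

A caller would pass the project's .db path and a buffer for the preserved name. Note that rename() replaces an existing destination, so two preservations within the same second would collide under this naming scheme.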

Checklist

- Every commit is signed off (git commit -s); DCO verified via scripts/check-dco.sh
- Tests pass locally (make -f Makefile.cbm test): 5684 passed
- Lint passes for changed files (clang-format clean; cppcheck clean on the three modified files)
- New behavior is covered by a test (reproduce-first): resolve_store_preserves_corrupt_db_issue557 seeds a corrupt projects row, issues a query, and asserts the original .db is gone while a .corrupt sidecar remains.
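The final assertion of that test (original .db gone, a .corrupt sidecar present) could be checked with a directory scan along these lines. This is a sketch under assumptions, not the test in tests/test_mcp.c: has_corrupt_sidecar is a hypothetical helper, and it only matches the "<name>.db.corrupt." prefix because the exact timestamp format is not stated here.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the PR's test): returns 1 if dir holds an
 * entry whose name starts with "<db_name>.corrupt.", 0 if it does not,
 * and -1 if the directory cannot be read or the prefix does not fit. */
static int has_corrupt_sidecar(const char *dir, const char *db_name)
{
    char prefix[512];
    int n = snprintf(prefix, sizeof prefix, "%s.corrupt.", db_name);
    if (n < 0 || (size_t)n >= sizeof prefix) {
        return -1;
    }
    DIR *d = opendir(dir);
    if (d == NULL) {
        return -1;
    }
    int found = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        /* Prefix match only: the timestamp suffix is not inspected. */
        if (strncmp(entry->d_name, prefix, (size_t)n) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}
```

Deriving the prefix length from snprintf avoids hard-coding a literal length in the comparison.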

Notes on the lint checklist item (transparency)

A few constraints worth making explicit so the maintainer isn't surprised:

- make lint-ci does not pass cleanly locally, but only on pre-existing code I did not touch. My local cppcheck is 2.21.0, which is newer/stricter than CI's and flags pre-existing findings in src/cypher/cypher.c and src/cli/cli.c (e.g. knownConditionTrueFalse, compareValueOutOfTypeRangeError). Running cppcheck scoped to just my three changed files exits clean (0). I deliberately did not touch those files to keep this PR focused (per CONTRIBUTING's "avoid unrelated reformatting").
- clang-format version skew. CI uses clang-format-20; I only have clang-format-22 locally. Both modified source files (mcp.c, pipeline.c) are whole-file clean under v22, and since the surrounding pre-existing code shows no diff under v22 either, v20 and v22 appear to agree on these files' style. If CI's v20 disagrees on any of my lines, I'm happy to reformat.
- clang-tidy / magic numbers in the test. The new test uses raw strncmp(..., 19) / strncmp(..., 10) literals, matching the existing style in tests/ (e.g. test_cli.c, test_httpd.c, test_pipeline.c). This is fine because test sources are not in LINT_SRCS, and CI's lint step runs cppcheck + clang-format only (clang-tidy is enforced locally, not in CI).
- Labels. I don't have triage/write access to this repo, so I couldn't apply any labels (e.g. bug, stability/performance) to match issue #557. Please add whatever is appropriate.

The startup/query integrity check unlink()'d a project's .db (plus its -wal/-shm sidecars) on any anomaly, causing total data loss with no backup, no recovery path, and no visible warning. The trigger is often a false positive: a WAL leftover after a SIGKILL, a binary variant swap, or a stale-lock break can all make a recoverable DB look corrupt.

Replace the destructive unlink with a rename to <name>.db.corrupt.<ts> and emit a prominent stderr message so the user can inspect, recover, or attach the file to a bug report.

Two call sites are fixed:
- mcp.c resolve_store(): the reported path, hit on first query.
- pipeline.c try_incremental_or_delete_db(): only preserves on real corruption; a healthy DB rebuilt due to a mode change is still deleted so .corrupt files don't accumulate on normal reindexes.

Add a regression test (resolve_store_preserves_corrupt_db_issue557) that seeds a corrupt projects row, issues a query, and asserts the original .db is gone while a .corrupt sidecar remains.

Signed-off-by: yasku <aguss.cba@gmail.com>

yasku · 2026-06-25T02:15:14Z

Working on fixing this ☺️

CodeQL flagged a time-of-check/time-of-use race in try_incremental_or_delete_db(): the file at db_path was probed with stat() and then operated on by name (rename/unlink), so the underlying file could change between the two.

Drop the redundant stat() existence probe and derive existence from a query-mode open (cbm_store_open_path_query, which never creates the file): a NULL result means absent/unreadable, so we return without touching the path. This removes the stat()->rename() dataflow CodeQL reports and matches resolve_store() in mcp.c, which already uses this pattern.

Behavior is unchanged (absent -> skip, healthy -> reindex, corrupt -> preserve as .corrupt).

Signed-off-by: yasku <aguss.cba@gmail.com>
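The open-first pattern can be sketched with plain POSIX calls. This is an illustration under assumptions, not the project's code: classify_db and db_state are hypothetical names, open(O_RDONLY) stands in for cbm_store_open_path_query, and comparing the 16-byte SQLite header string stands in for the real integrity check.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical three-way result mirroring absent / healthy / corrupt. */
typedef enum { DB_ABSENT, DB_HEALTHY, DB_CORRUPT } db_state;

/* Hypothetical helper: derive existence from the open itself instead of a
 * separate stat() probe. The header comparison is only a stand-in for a
 * real integrity check. */
static db_state classify_db(const char *path)
{
    /* 15 characters plus the terminating NUL: the 16-byte SQLite header. */
    static const char magic[16] = "SQLite format 3";
    char header[16];

    int fd = open(path, O_RDONLY); /* no O_CREAT: never creates the file */
    if (fd < 0) {
        return DB_ABSENT; /* absent or unreadable: caller skips the path */
    }
    ssize_t got = read(fd, header, sizeof header);
    close(fd);
    if (got != (ssize_t)sizeof header ||
        memcmp(header, magic, sizeof magic) != 0) {
        return DB_CORRUPT;
    }
    return DB_HEALTHY;
}
```

This removes the separate existence probe, which is what the commit describes. It does not make a later rename by name atomic with respect to other processes.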

yasku

check

Resolves conflict in src/pipeline/pipeline.c between #557 (preserve corrupt DB via rename) and #516 (preserve ADRs across reindex).

ADR capture now runs before the delete/rename split so it covers both the healthy mode-change reindex and the corrupt-DB path. The DB is opened read-only (cbm_store_open_path_query) so a WAL leftover is not checkpointed into the .db before it is preserved as evidence.

Signed-off-by: yasku <aguss.cba@gmail.com>

yasku · 2026-06-25T22:19:40Z

Merged the latest main into the branch to resolve the conflict that appeared after #516 (ADR persistence, merged via #539) landed — both PRs modified the same reindex block in src/pipeline/pipeline.c (try_incremental_or_delete_db).

Conflict resolution

The two changes are complementary, not competing:

- #516 (manage_adr data loss: ADRs in project_summaries are deleted during graph re-indexing) captures any existing ADR into p->saved_adr before the old DB is removed, so the full reindex can restore project_summaries.
- This PR (for #557) splits the removal into two paths: a corrupt DB is renamed to <name>.db.corrupt.<ts> (preserved as evidence) instead of being unconditionally unlink()'d.

I moved the ADR capture so it runs before the delete/rename split, which means it now covers both the healthy mode-change reindex and the corrupt-DB path, matching the prior unconditional behavior on main. One deliberate refinement: for the ADR read, the DB is reopened with cbm_store_open_path_query (read-only) rather than through a writable handle, so a WAL leftover (itself one of the false-positive corruption triggers this PR addresses) is not checkpointed into the .db immediately before we preserve it as evidence.
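The ordering described above can be sketched as follows. Every name here (fake_store, action, plan_reindex) is a hypothetical stand-in, not the types or functions in pipeline.c; the sketch only shows that capturing the ADR before branching makes both branches keep it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a store opened read-only. */
typedef struct {
    const char *adr; /* ADR text found in the DB, or NULL if none */
    int corrupt;     /* nonzero if the integrity check failed */
} fake_store;

/* Hypothetical outcome of the planning step. */
typedef enum { ACT_SKIP, ACT_DELETE, ACT_PRESERVE } action;

/* Capture the ADR first, then decide between delete and preserve.
 * On return, *saved_adr is a heap copy owned by the caller, or NULL. */
static action plan_reindex(const fake_store *ro_store, char **saved_adr)
{
    *saved_adr = NULL;
    if (ro_store == NULL) {
        return ACT_SKIP; /* open failed: absent or unreadable */
    }
    /* Runs before the split, so it covers both branches below. */
    if (ro_store->adr != NULL) {
        size_t len = strlen(ro_store->adr) + 1;
        *saved_adr = malloc(len);
        if (*saved_adr != NULL) {
            memcpy(*saved_adr, ro_store->adr, len);
        }
    }
    return ro_store->corrupt ? ACT_PRESERVE : ACT_DELETE;
}
```

If the capture sat inside only the delete branch, a corrupt DB would be renamed without its ADR being carried into the rebuilt store.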

Verification

- make -f Makefile.cbm test → 5692 passed locally.
- No conflict markers remain; mcp.c and tests/test_mcp.c merged cleanly (pipeline.c was the only conflict).
- The earlier CodeQL TOCTOU flag on pipeline.c remains addressed: existence is still derived from the cbm_store_open_path_query open, never a stat()-then-use, and the new ADR read follows the same pattern.

No functional change to the #557 fix itself; this update only integrates the latest main.

yasku · 2026-06-26T00:47:44Z

Closing this PR — I'm continuing this work in a private repository, outside the upstream contribution flow. Thanks to the maintainers for the review and the CI feedback on the corrupt-DB preservation change. Issue #557 remains valid and is open for anyone who wants to pick it up independently.

jonlimx mentioned this pull request Jun 25, 2026

cbm v0.8.1 silently deletes project DBs on "corrupt" detection — data loss with no recovery #557

Open

github-advanced-security AI found potential problems Jun 25, 2026

Comment thread src/pipeline/pipeline.c Fixed

yasku closed this Jun 26, 2026

yasku wants to merge 3 commits into DeusData:main from yasku:fix/557-preserve-corrupt-db
