Skip to content

fix: eth block returns correctly filled logs bloom field#7156

Open
akaladarshi wants to merge 7 commits into
mainfrom
akaladarshi/fix-logs-bloom
Open

fix: eth block returns correctly filled logs bloom field#7156
akaladarshi wants to merge 7 commits into
mainfrom
akaladarshi/fix-logs-bloom

Conversation

@akaladarshi

@akaladarshi akaladarshi commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Summary of changes

Changes introduced in this pull request:

  • While creating the full ethereum block data, the log bloom field is correctly filed as the Ethereum yellow paper suggested.
  • Now we have a new column in DB where we store the computed logs bloom, so we don't have to compute if it is available in the DB.

Reference issue to close (if applicable)

Closes #7151

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes
    • Corrected Ethereum block logsBloom values so they are computed from actual block logs instead of using a placeholder value.
    • Improved bloom handling to keep receipts and block data consistent, including better pruning of outdated bloom records.
    • Added support for storing and retrieving bloom data during database upgrades and normal operation.
  • Chores
    • Updated snapshots, filters, and tests to reflect the new block log bloom behavior.

@akaladarshi akaladarshi requested a review from a team as a code owner June 9, 2026 15:54
@akaladarshi akaladarshi requested review from LesnyRumcajs and sudo-shashank and removed request for a team June 9, 2026 15:54
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

This PR fixes an incorrect all-ones eth block logsBloom by computing it from actual block logs via new accrue_eth_log/compute_block_logs_bloom helpers, adds a persistent EthBlockBloomStore trait with implementations across DB backends, a schema migration for a new EthBlockBloom column, GC pruning, and updates filter lists/snapshots/changelog accordingly.

Changes

Eth Block Logs Bloom Computation and Caching

Layer / File(s) Summary
Bloom store contract
src/db/mod.rs
Defines EthBlockBloomStore trait and encode_block_bloom/decode_block_bloom helpers.
Store implementations
src/db/memory.rs, src/db/parity_db.rs, src/db/parity_db/gc.rs
Implements bloom read/write/delete for MemoryDB, ParityDb, and GarbageCollectableParityDb, including a new EthBlockBloom column and encode/decode round-trip test.
Delegation wiring
src/db/car/many.rs, src/db/db_impl.rs, src/tool/subcommands/api_cmd/generate_test_snapshot.rs
Forwards EthBlockBloomStore through ManyCar, DbImpl delegation, and ReadOpsTrackingStore.
Store tests
src/db/tests/subtests/mod.rs, src/db/tests/mem_test.rs, src/db/tests/parity_test.rs
Adds shared block_bloom_prune subtest and per-backend test invocations.
Schema migration
src/db/migration/v0_33_7.rs, src/db/migration/migration_map.rs, src/db/migration/mod.rs
Adds Migration0_33_6_0_33_7 creating the EthBlockBloom column and registers it in the migration chain.
GC pruning
src/db/gc/snapshot.rs
Prunes stale bloom entries below a height cutoff during snapshot export cleanup.
RPC bloom computation
src/rpc/methods/eth.rs
Computes and caches block logsBloom via new helpers; reuses accrue_eth_log in receipt construction; adds decomposition unit test.
Filter lists, snapshots, changelog
scripts/tests/api_compare/filter-list*, src/tool/subcommands/api_cmd/test_snapshots.txt, CHANGELOG.md
Removes now-passing methods from ignore lists, refreshes snapshot filenames, and documents the fix.

Estimated code review effort: 4 (Complex) | ~60 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Rpc as Block construction
  participant BlockBloom as block_logs_bloom
  participant Db as EthBlockBloomStore
  participant Compute as compute_block_logs_bloom
  participant Accrue as accrue_eth_log
  Rpc->>BlockBloom: state_manager, tipset, executed_messages
  BlockBloom->>Db: read_bloom(tipset cid)
  alt cache hit
    Db-->>BlockBloom: cached bloom bytes
  else cache miss
    BlockBloom->>Compute: compute_block_logs_bloom(executed_messages)
    loop per log
      Compute->>Accrue: address + topics
      Accrue-->>Compute: accumulated Bloom
    end
    Compute-->>BlockBloom: computed bloom
    BlockBloom->>Db: write_bloom(tipset cid, height, bloom)
  end
  BlockBloom-->>Rpc: logs_bloom
Loading

Possibly related PRs

Suggested reviewers: hanabi1224, sudo-shashank, LesnyRumcajs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.12% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: fixing Ethereum block logsBloom population.
Linked Issues check ✅ Passed The PR computes block logsBloom from emitted logs, preserves receipt blooms, and applies it to block responses as requested in #7151.
Out of Scope Changes check ✅ Passed The DB, migration, GC, and test snapshot updates support storing and validating block blooms, so they remain in scope.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch akaladarshi/fix-logs-bloom
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch akaladarshi/fix-logs-bloom

Comment @coderabbitai help to get the list of available commands.

@akaladarshi akaladarshi added the RPC requires calibnet RPC checks to run on CI label Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/rpc/methods/eth/filter/mod.rs (1)

392-392: ⚡ Quick win

Add context to the error propagation.

The try_from conversion lacks .context(), which violates the coding guideline to "add context with .context() when errors occur." While unlikely to fail in practice, adding context improves debuggability.

🔧 Suggested fix
-                let event_idx_base = u64::try_from(event_count)?;
+                let event_idx_base = u64::try_from(event_count)
+                    .context("event count exceeds u64::MAX")?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth/filter/mod.rs` at line 392, The conversion call let
event_idx_base = u64::try_from(event_count)? should propagate a contextual
error; replace the bare ? with a Context-wrapped error (e.g., call .context(...)
before ?). Update the expression referencing u64::try_from(event_count) so
failures include a clear message like "failed to convert event_count to u64"
(affecting the event_idx_base assignment) to satisfy the `.context()` guideline.

Source: Coding guidelines

src/rpc/methods/eth.rs (1)

4562-4600: ⚡ Quick win

Consider adding a test for compute_block_logs_bloom.

The current test validates accrue_eth_log behavior well (empty bloom, single log, OR composition, idempotence). However, there's no unit test for compute_block_logs_bloom itself, which handles event collection and address resolution. An integration test or unit test covering:

  • Blocks with no events → empty bloom
  • Blocks with valid EVM events → correct bloom
  • Blocks with mixed valid/invalid events → partial bloom

would increase confidence in the full bloom computation path.

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rpc/methods/eth/filter/mod.rs`:
- Around line 358-365: Add a doc comment above the public function
collect_events_from_messages that succinctly describes its purpose (collect
events from executed messages in a tipset), its parameters (state_manager:
&StateManager, tipset: &Tipset, executed_messages: &[ExecutedMessage], spec:
Option<&impl Matcher>, skip_event: SkipEvent, collected_events: &mut
Vec<CollectedEvent>), its behavior (filters/matches events using spec, skips
events per skip_event, appends found CollectedEvent items into
collected_events), and its return value (anyhow::Result<()> indicating success
or error); make sure the doc mentions that collected_events is mutated in-place
and any notable side effects or error conditions the caller should expect.

---

Nitpick comments:
In `@src/rpc/methods/eth/filter/mod.rs`:
- Line 392: The conversion call let event_idx_base = u64::try_from(event_count)?
should propagate a contextual error; replace the bare ? with a Context-wrapped
error (e.g., call .context(...) before ?). Update the expression referencing
u64::try_from(event_count) so failures include a clear message like "failed to
convert event_count to u64" (affecting the event_idx_base assignment) to satisfy
the `.context()` guideline.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 80cc42cd-3208-4465-a098-75793490f3be

📥 Commits

Reviewing files that changed from the base of the PR and between 7bafcfb and 45541b5.

📒 Files selected for processing (2)
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs

Comment on lines +358 to +365
pub async fn collect_events_from_messages(
state_manager: &StateManager,
tipset: &Tipset,
executed_messages: &[ExecutedMessage],
spec: Option<&impl Matcher>,
skip_event: SkipEvent,
collected_events: &mut Vec<CollectedEvent>,
) -> anyhow::Result<()> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add doc comment for the new public function.

The new collect_events_from_messages function is public but lacks documentation. As per coding guidelines, all public functions should have doc comments explaining their purpose, parameters, and behavior.

📝 Suggested documentation
+    /// Collects events from executed messages for a given tipset.
+    ///
+    /// This is a lower-level helper that operates on already-loaded executed messages,
+    /// enabling reuse from both RPC event collection and block bloom computation.
+    ///
+    /// # Parameters
+    /// - `state_manager`: State manager for address resolution
+    /// - `tipset`: The tipset containing the executed messages
+    /// - `executed_messages`: Pre-loaded executed messages from the tipset
+    /// - `spec`: Optional filter spec to match specific events
+    /// - `skip_event`: Strategy for handling unresolved addresses
+    /// - `collected_events`: Output buffer for matching events
     pub async fn collect_events_from_messages(
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub async fn collect_events_from_messages(
state_manager: &StateManager,
tipset: &Tipset,
executed_messages: &[ExecutedMessage],
spec: Option<&impl Matcher>,
skip_event: SkipEvent,
collected_events: &mut Vec<CollectedEvent>,
) -> anyhow::Result<()> {
/// Collects events from executed messages for a given tipset.
///
/// This is a lower-level helper that operates on already-loaded executed messages,
/// enabling reuse from both RPC event collection and block bloom computation.
///
/// # Parameters
/// - `state_manager`: State manager for address resolution
/// - `tipset`: The tipset containing the executed messages
/// - `executed_messages`: Pre-loaded executed messages from the tipset
/// - `spec`: Optional filter spec to match specific events
/// - `skip_event`: Strategy for handling unresolved addresses
/// - `collected_events`: Output buffer for matching events
pub async fn collect_events_from_messages(
state_manager: &StateManager,
tipset: &Tipset,
executed_messages: &[ExecutedMessage],
spec: Option<&impl Matcher>,
skip_event: SkipEvent,
collected_events: &mut Vec<CollectedEvent>,
) -> anyhow::Result<()> {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth/filter/mod.rs` around lines 358 - 365, Add a doc comment
above the public function collect_events_from_messages that succinctly describes
its purpose (collect events from executed messages in a tipset), its parameters
(state_manager: &StateManager, tipset: &Tipset, executed_messages:
&[ExecutedMessage], spec: Option<&impl Matcher>, skip_event: SkipEvent,
collected_events: &mut Vec<CollectedEvent>), its behavior (filters/matches
events using spec, skips events per skip_event, appends found CollectedEvent
items into collected_events), and its return value (anyhow::Result<()>
indicating success or error); make sure the doc mentions that collected_events
is mutated in-place and any notable side effects or error conditions the caller
should expect.

Source: Coding guidelines

@akaladarshi akaladarshi marked this pull request as draft June 10, 2026 07:20
@akaladarshi akaladarshi force-pushed the akaladarshi/fix-logs-bloom branch 2 times, most recently from b0be642 to 96b6e9c Compare June 19, 2026 16:54
@akaladarshi akaladarshi marked this pull request as ready for review June 19, 2026 16:54
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 54.32526% with 132 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.07%. Comparing base (d83a142) to head (488c3c4).
⚠️ Report is 3 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/db/migration/v0_33_7.rs 6.75% 69 Missing ⚠️
src/db/car/many.rs 0.00% 14 Missing ⚠️
src/db/parity_db/gc.rs 0.00% 14 Missing ⚠️
...tool/subcommands/api_cmd/generate_test_snapshot.rs 0.00% 14 Missing ⚠️
src/rpc/methods/eth.rs 90.58% 3 Missing and 5 partials ⚠️
src/db/gc/snapshot.rs 0.00% 6 Missing ⚠️
src/db/parity_db.rs 86.11% 2 Missing and 3 partials ⚠️
src/db/mod.rs 85.71% 0 Missing and 2 partials ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
src/db/db_impl.rs 34.32% <ø> (ø)
src/db/memory.rs 91.30% <100.00%> (+1.83%) ⬆️
src/db/migration/migration_map.rs 89.74% <ø> (ø)
src/db/mod.rs 62.00% <85.71%> (+9.22%) ⬆️
src/db/parity_db.rs 68.48% <86.11%> (+1.64%) ⬆️
src/db/gc/snapshot.rs 0.00% <0.00%> (ø)
src/rpc/methods/eth.rs 65.27% <90.58%> (+0.44%) ⬆️
src/db/car/many.rs 51.82% <0.00%> (-2.80%) ⬇️
src/db/parity_db/gc.rs 29.93% <0.00%> (-3.16%) ⬇️
...tool/subcommands/api_cmd/generate_test_snapshot.rs 6.55% <0.00%> (-0.43%) ⬇️
... and 1 more

... and 8 files with indirect coverage changes


Continue to review full report in Codecov by Harness.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a4edc19...488c3c4. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sudo-shashank sudo-shashank left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like lotus we should store the computed logs bloom so we don't re-compute it again and again.

@sudo-shashank

Copy link
Copy Markdown
Contributor

There are few more changes required and needs to be ported from the list of changes filecoin-project/lotus#13618 for both correctness and performance.

- Add a tipset_bloom table to the chain index DB for block-level ETH log blooms.
- Add a migration for existing chain index DBs.
- Compute and store tipset-level blooms while indexing actor events.
- Store empty blooms for indexed tipsets with no events.
- Rebuild/store blooms when previously reverted events are unreverted.
- Delete bloom rows on revert.
- Delete old bloom rows during chain index GC.
- Expose GetTipsetBloom on the chain indexer.
- Use indexed blooms when constructing ETH block responses for:
- eth_getBlockByHash
- eth_getBlockByNumber
- new-head subscription block payloads
- Preserve compatibility by falling back to the previous full bloom when no indexed bloom is available.
- Extend TestTxReceiptBloom to cover receipt bloom, empty block bloom, block-by-hash bloom, and block-by-number bloom.

@akaladarshi

Copy link
Copy Markdown
Collaborator Author

There are few more changes required and needs to be ported from the list of changes filecoin-project/lotus#13618 for both correctness and performance.

- Add a tipset_bloom table to the chain index DB for block-level ETH log blooms.
- Add a migration for existing chain index DBs.
- Compute and store tipset-level blooms while indexing actor events.
- Store empty blooms for indexed tipsets with no events.
- Rebuild/store blooms when previously reverted events are unreverted.
- Delete bloom rows on revert.
- Delete old bloom rows during chain index GC.
- Expose GetTipsetBloom on the chain indexer.
- Use indexed blooms when constructing ETH block responses for:
- eth_getBlockByHash
- eth_getBlockByNumber
- new-head subscription block payloads
- Preserve compatibility by falling back to the previous full bloom when no indexed bloom is available.
- Extend TestTxReceiptBloom to cover receipt bloom, empty block bloom, block-by-hash bloom, and block-by-number bloom.

@sudo-shashank When I started working on this, there was a possibility of us shifting to the SQL DB since, that is not the case anymore I will store the data in the current index DB

@akaladarshi akaladarshi marked this pull request as draft June 24, 2026 05:34
@akaladarshi akaladarshi force-pushed the akaladarshi/fix-logs-bloom branch from 96b6e9c to 887dba6 Compare July 1, 2026 12:21
@akaladarshi akaladarshi marked this pull request as ready for review July 1, 2026 12:21

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
CHANGELOG.md (1)

50-50: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Reference the issue instead of the PR.

This entry should link to issue #7151 rather than PR #7156 so the changelog follows the repo’s traceability convention. Based on learnings, when both exist, use [#ISSUE_NO](link-to-issue): description.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` at line 50, The changelog entry currently links to the pull
request instead of the tracking issue, so update the bullet in CHANGELOG.md to
reference issue `#7151` using the repo’s standard [`#ISSUE_NO`](link-to-issue):
description format. Keep the existing description text, but change the linked
identifier from the PR reference to the issue reference so the entry matches the
traceability convention.

Source: Learnings

src/db/gc/snapshot.rs (1)

351-358: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Silent no-op if heaviest_car_tipset() fails.

The delete_blooms_before_height error path logs a warning, but if db.heaviest_car_tipset() itself returns Err, the whole pruning step silently no-ops with no log line, unlike the similar Ok(ts) pattern used later in this function (Lines 365-378) where failures elsewhere are still surfaced. Since this can leave stale bloom rows accumulating unnoticed, consider logging on the Err branch too.

Proposed fix
-            if let Ok(head) = db.heaviest_car_tipset() {
-                let cutoff = head.epoch() - self.recent_state_roots;
-                if let Err(e) = db.delete_blooms_before_height(cutoff) {
-                    tracing::warn!("failed to prune stale block blooms: {e:#}");
-                }
-            }
+            match db.heaviest_car_tipset() {
+                Ok(head) => {
+                    let cutoff = head.epoch() - self.recent_state_roots;
+                    if let Err(e) = db.delete_blooms_before_height(cutoff) {
+                        tracing::warn!("failed to prune stale block blooms: {e:#}");
+                    }
+                }
+                Err(e) => tracing::warn!("failed to get heaviest car tipset for bloom pruning: {e:#}"),
+            }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/gc/snapshot.rs` around lines 351 - 358, The pruning block in snapshot
cleanup silently skips work when db.heaviest_car_tipset() returns Err, so add
explicit logging in that error branch instead of only handling the Ok(head)
case. Update the bloom-pruning logic in the snapshot cleanup path to mirror the
later Ok(ts) style used in the same function by surfacing the failure with
tracing::warn! (or equivalent) and keeping the existing
delete_blooms_before_height error handling unchanged.
src/db/tests/subtests/mod.rs (1)

71-100: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low value

Consider adding an exact-boundary case to the prune test.

The test exercises heights 100 (deleted, < 150) and 200 (kept, > 150), but never a bloom stored exactly at the cutoff (height == 150), which per decode_block_bloom's h < height check should be kept. A small addition would pin down the inclusive boundary explicitly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/tests/subtests/mod.rs` around lines 71 - 100, The block bloom prune
test in block_bloom_prune only covers values below and above the cutoff, so add
an exact-cutoff case to lock in the boundary behavior. In the same test, write a
bloom through EthBlockBloomStore at height 150, then call
delete_blooms_before_height(150) and assert that read_bloom still returns it,
alongside the existing assertions for the lower and higher heights.
src/db/memory.rs (1)

150-172: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Resilient decode-failure handling — good pattern, but see parity_db.rs for a divergent (fail-fast) counterpart.

retain here silently drops any entry that fails to decode, which is a reasonable, resilient default for pruning. Note that ParityDb::delete_blooms_before_height (src/db/parity_db.rs) instead propagates a decode error via ?, aborting the entire prune on a single corrupted row — see the comment there for the concrete risk. Worth aligning both implementations to the same failure semantics.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/memory.rs` around lines 150 - 172, The bloom-pruning logic in
MemoryDB::delete_blooms_before_height currently treats decode failures as silent
drops, which diverges from the fail-fast behavior in
ParityDb::delete_blooms_before_height. Align the failure semantics between the
two implementations by choosing one consistent approach and applying it in the
delete_blooms_before_height path, using the same decode_block_bloom handling
pattern in both MemoryDB and ParityDb so a corrupted entry is either always
tolerated or always surfaced the same way.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/db/migration/v0_33_7.rs`:
- Around line 46-54: Rollback the database rename if `parity_db::Db::add_column`
fails in `v0_33_7` so `old_db` is not left as an empty placeholder while
`temp_db` still contains the real data. Update the migration flow around the
`rename`, `create_dir_all`, and `add_column` steps to restore the original
directory or clean up both `old_db` and `temp_db` before returning the error, so
later startup logic does not delete the only valid copy.

In `@src/db/mod.rs`:
- Around line 156-170: decode_block_bloom currently accepts any payload after
the 8-byte height prefix, so malformed cached blooms can slip through as
logsBloom. Tighten the validation in decode_block_bloom (and any callers that
read its output) to require exactly the expected 256-byte bloom length, and
treat anything shorter or longer as invalid so the cache path falls back or is
ignored.

In `@src/db/parity_db.rs`:
- Around line 269-282: In `ParityDB::delete_blooms_before_height`, a single bad
bloom entry currently aborts pruning because `decode_block_bloom(&entry)?`
propagates immediately. Update this loop to mirror
`MemoryDB::delete_blooms_before_height` by handling decode failures per entry:
skip malformed rows, optionally log them, and continue collecting stale keys for
the rest. Keep the pruning path resilient so one corrupted `EthBlockBloom`
record does not stop `self.db.commit(...)` from deleting the valid stale
entries.

---

Nitpick comments:
In `@CHANGELOG.md`:
- Line 50: The changelog entry currently links to the pull request instead of
the tracking issue, so update the bullet in CHANGELOG.md to reference issue
`#7151` using the repo’s standard [`#ISSUE_NO`](link-to-issue): description format.
Keep the existing description text, but change the linked identifier from the PR
reference to the issue reference so the entry matches the traceability
convention.

In `@src/db/gc/snapshot.rs`:
- Around line 351-358: The pruning block in snapshot cleanup silently skips work
when db.heaviest_car_tipset() returns Err, so add explicit logging in that error
branch instead of only handling the Ok(head) case. Update the bloom-pruning
logic in the snapshot cleanup path to mirror the later Ok(ts) style used in the
same function by surfacing the failure with tracing::warn! (or equivalent) and
keeping the existing delete_blooms_before_height error handling unchanged.

In `@src/db/memory.rs`:
- Around line 150-172: The bloom-pruning logic in
MemoryDB::delete_blooms_before_height currently treats decode failures as silent
drops, which diverges from the fail-fast behavior in
ParityDb::delete_blooms_before_height. Align the failure semantics between the
two implementations by choosing one consistent approach and applying it in the
delete_blooms_before_height path, using the same decode_block_bloom handling
pattern in both MemoryDB and ParityDb so a corrupted entry is either always
tolerated or always surfaced the same way.

In `@src/db/tests/subtests/mod.rs`:
- Around line 71-100: The block bloom prune test in block_bloom_prune only
covers values below and above the cutoff, so add an exact-cutoff case to lock in
the boundary behavior. In the same test, write a bloom through
EthBlockBloomStore at height 150, then call delete_blooms_before_height(150) and
assert that read_bloom still returns it, alongside the existing assertions for
the lower and higher heights.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a154e991-22c1-4133-82f4-19962bd5eaa2

📥 Commits

Reviewing files that changed from the base of the PR and between 96b6e9c and 887dba6.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • scripts/tests/api_compare/filter-list
  • scripts/tests/api_compare/filter-list-gateway
  • src/db/car/many.rs
  • src/db/db_impl.rs
  • src/db/gc/snapshot.rs
  • src/db/memory.rs
  • src/db/migration/migration_map.rs
  • src/db/migration/mod.rs
  • src/db/migration/v0_33_7.rs
  • src/db/mod.rs
  • src/db/parity_db.rs
  • src/db/parity_db/gc.rs
  • src/db/tests/mem_test.rs
  • src/db/tests/parity_test.rs
  • src/db/tests/subtests/mod.rs
  • src/rpc/methods/eth.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
  • src/tool/subcommands/api_cmd/test_snapshots.txt
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)
💤 Files with no reviewable changes (1)
  • scripts/tests/api_compare/filter-list
✅ Files skipped from review due to trivial changes (2)
  • src/db/migration/mod.rs
  • src/db/db_impl.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/tool/subcommands/api_cmd/test_snapshots.txt
  • scripts/tests/api_compare/filter-list-gateway
  • src/rpc/methods/eth.rs

Comment on lines +46 to +54
std::fs::rename(&old_db, &temp_db).context("failed to rename database directory")?;

// Create a placeholder so the delete step succeeds
std::fs::create_dir_all(&old_db).context("failed to create placeholder directory")?;

info!("Adding EthBlockBloom column to database");
let mut opts = paritydb_0_33_6::to_options(temp_db.clone());
parity_db::Db::add_column(&mut opts, paritydb_0_33_6::eth_block_bloom_column_options())
.context("failed to add EthBlockBloom column")?;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect migration wrapper behavior around migrate_core failures.
fd -p 'migration_map.rs' src/db/migration --exec sed -n '1,140p' {}
fd -p 'v0_33_7.rs' src/db/migration --exec sed -n '37,58p' {}

Repository: ChainSafe/forest

Length of output: 6587


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the migration flow, rollback/error handling, and any tests covering failure paths.
rg -n "fn migrate\(|migrate_core\(|remove_dir_all\(|rename\(|rollback|temporary_db_path|migrate failed|migration" src/db/migration -g '!**/target/**'

echo '--- v0_33_7 ---'
sed -n '1,220p' src/db/migration/v0_33_7.rs

echo '--- migration map ---'
sed -n '1,220p' src/db/migration/migration_map.rs

echo '--- search tests ---'
rg -n "v0_33_7|Migration0_33_6_0_33_7|temporary_db|rollback|failed to add EthBlockBloom" src -g '*test*' -g '*tests*'

Repository: ChainSafe/forest

Length of output: 31336


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect how startup/version detection interacts with leftover old_db/temp_db directories.
sed -n '1,260p' src/db/migration/db_migration.rs

echo '---'
rg -n "current db version|version.*dir|old_db_path|new_db_path|temporary_db_path|is_migration_required|Mode::Current|current mode|version database|db version" src -g '!**/target/**'

Repository: ChainSafe/forest

Length of output: 9388


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect version detection and how leftover placeholder/temp directories are interpreted.
sed -n '1,220p' src/db/db_mode.rs

Repository: ChainSafe/forest

Length of output: 5437


Rollback the rename on add_column failure. If this errors, old_db is left as an empty semver-named placeholder while temp_db still holds the only copy of the database; the next startup will treat the placeholder as the source DB and pre_checks will delete temp_db, losing the data. Restore old_db or clean up both paths before returning the error.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/migration/v0_33_7.rs` around lines 46 - 54, Rollback the database
rename if `parity_db::Db::add_column` fails in `v0_33_7` so `old_db` is not left
as an empty placeholder while `temp_db` still contains the real data. Update the
migration flow around the `rename`, `create_dir_all`, and `add_column` steps to
restore the original directory or clean up both `old_db` and `temp_db` before
returning the error, so later startup logic does not delete the only valid copy.

Comment thread src/db/mod.rs Outdated
Comment on lines +156 to +170
pub(crate) fn encode_block_bloom(height: i64, bloom: &[u8]) -> Vec<u8> {
let mut entry = Vec::with_capacity(size_of::<i64>() + bloom.len());
entry.extend_from_slice(&height.to_le_bytes());
entry.extend_from_slice(bloom);
entry
}

/// Splits a block bloom entry into its height and raw bloom bytes.
pub(crate) fn decode_block_bloom(entry: &[u8]) -> anyhow::Result<(i64, &[u8])> {
anyhow::ensure!(
entry.len() >= size_of::<i64>(),
"block bloom entry is too short"
);
let (height, bloom) = entry.split_at(size_of::<i64>());
Ok((i64::from_le_bytes(height.try_into()?), bloom))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Reject malformed cached bloom payloads.

decode_block_bloom only checks the 8-byte height prefix, so truncated or oversized cached entries can be returned as Ethereum logsBloom. Enforce the 256-byte bloom payload here, or make readers treat invalid-sized entries as a cache miss/fallback.

Proposed guard
+const ETH_BLOCK_BLOOM_BYTES: usize = 256;
+const ETH_BLOCK_BLOOM_ENTRY_BYTES: usize = size_of::<i64>() + ETH_BLOCK_BLOOM_BYTES;
+
 /// Encodes a block bloom entry as its little-endian height followed by the raw bloom bytes.
 pub(crate) fn encode_block_bloom(height: i64, bloom: &[u8]) -> Vec<u8> {
     let mut entry = Vec::with_capacity(size_of::<i64>() + bloom.len());
@@
 pub(crate) fn decode_block_bloom(entry: &[u8]) -> anyhow::Result<(i64, &[u8])> {
     anyhow::ensure!(
-        entry.len() >= size_of::<i64>(),
-        "block bloom entry is too short"
+        entry.len() == ETH_BLOCK_BLOOM_ENTRY_BYTES,
+        "block bloom entry has invalid length: expected {} bytes, got {}",
+        ETH_BLOCK_BLOOM_ENTRY_BYTES,
+        entry.len()
     );
     let (height, bloom) = entry.split_at(size_of::<i64>());
-    Ok((i64::from_le_bytes(height.try_into()?), bloom))
+    Ok((
+        i64::from_le_bytes(
+            height
+                .try_into()
+                .expect("height prefix length was checked"),
+        ),
+        bloom,
+    ))
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub(crate) fn encode_block_bloom(height: i64, bloom: &[u8]) -> Vec<u8> {
let mut entry = Vec::with_capacity(size_of::<i64>() + bloom.len());
entry.extend_from_slice(&height.to_le_bytes());
entry.extend_from_slice(bloom);
entry
}
/// Splits a block bloom entry into its height and raw bloom bytes.
pub(crate) fn decode_block_bloom(entry: &[u8]) -> anyhow::Result<(i64, &[u8])> {
anyhow::ensure!(
entry.len() >= size_of::<i64>(),
"block bloom entry is too short"
);
let (height, bloom) = entry.split_at(size_of::<i64>());
Ok((i64::from_le_bytes(height.try_into()?), bloom))
const ETH_BLOCK_BLOOM_BYTES: usize = 256;
const ETH_BLOCK_BLOOM_ENTRY_BYTES: usize = size_of::<i64>() + ETH_BLOCK_BLOOM_BYTES;
/// Encodes a block bloom entry as its little-endian height followed by the raw bloom bytes.
pub(crate) fn encode_block_bloom(height: i64, bloom: &[u8]) -> Vec<u8> {
let mut entry = Vec::with_capacity(size_of::<i64>() + bloom.len());
entry.extend_from_slice(&height.to_le_bytes());
entry.extend_from_slice(bloom);
entry
}
/// Splits a block bloom entry into its height and raw bloom bytes.
pub(crate) fn decode_block_bloom(entry: &[u8]) -> anyhow::Result<(i64, &[u8])> {
anyhow::ensure!(
entry.len() == ETH_BLOCK_BLOOM_ENTRY_BYTES,
"block bloom entry has invalid length: expected {} bytes, got {}",
ETH_BLOCK_BLOOM_ENTRY_BYTES,
entry.len()
);
let (height, bloom) = entry.split_at(size_of::<i64>());
Ok((
i64::from_le_bytes(
height
.try_into()
.expect("height prefix length was checked"),
),
bloom,
))
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/mod.rs` around lines 156 - 170, decode_block_bloom currently accepts
any payload after the 8-byte height prefix, so malformed cached blooms can slip
through as logsBloom. Tighten the validation in decode_block_bloom (and any
callers that read its output) to require exactly the expected 256-byte bloom
length, and treat anything shorter or longer as invalid so the cache path falls
back or is ignored.

Source: Linked repositories

Comment thread src/db/parity_db.rs
Comment on lines +269 to +282
fn delete_blooms_before_height(&self, height: i64) -> anyhow::Result<()> {
let mut stale = Vec::new();
let mut iter = self.db.iter(DbColumn::EthBlockBloom as u8)?;
while let Some((key, entry)) = iter.next()? {
if decode_block_bloom(&entry)?.0 < height {
stale.push(key);
}
}
Ok(self.db.commit(
stale
.into_iter()
.map(|key| (DbColumn::EthBlockBloom as u8, key, None::<Vec<u8>>)),
)?)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

A single corrupted bloom row aborts the entire GC prune, unlike MemoryDB's resilient behavior.

delete_blooms_before_height uses decode_block_bloom(&entry)? inside the iteration loop. If any single stored entry is malformed (disk corruption, partial write, future schema change, etc.), the ? propagates immediately and the whole method returns Errno stale rows get deleted, not even the ones that decoded fine. Since this is invoked from GC pruning (per the stack context, failures there are only logged as a warning and swallowed), a single bad row would silently and permanently block bloom pruning, letting the EthBlockBloom column grow unbounded on a persistent store.

Contrast this with MemoryDB::delete_blooms_before_height (src/db/memory.rs), which skips undecodable entries instead of aborting. Consider applying the same per-entry resilience here: skip/log the bad row and continue pruning the rest.

🛡️ Proposed fix: don't let one bad row block the whole prune
     fn delete_blooms_before_height(&self, height: i64) -> anyhow::Result<()> {
         let mut stale = Vec::new();
         let mut iter = self.db.iter(DbColumn::EthBlockBloom as u8)?;
         while let Some((key, entry)) = iter.next()? {
-            if decode_block_bloom(&entry)?.0 < height {
-                stale.push(key);
-            }
+            match decode_block_bloom(&entry) {
+                Ok((h, _)) if h < height => stale.push(key),
+                Ok(_) => {}
+                Err(err) => {
+                    tracing::warn!("skipping corrupted block bloom entry during prune: {err}");
+                }
+            }
         }
         Ok(self.db.commit(
             stale
                 .into_iter()
                 .map(|key| (DbColumn::EthBlockBloom as u8, key, None::<Vec<u8>>)),
         )?)
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/parity_db.rs` around lines 269 - 282, In
`ParityDB::delete_blooms_before_height`, a single bad bloom entry currently
aborts pruning because `decode_block_bloom(&entry)?` propagates immediately.
Update this loop to mirror `MemoryDB::delete_blooms_before_height` by handling
decode failures per entry: skip malformed rows, optionally log them, and
continue collecting stale keys for the rest. Keep the pruning path resilient so
one corrupted `EthBlockBloom` record does not stop `self.db.commit(...)` from
deleting the valid stale entries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

eth: eth_getBlockBy* / newHeads return an all-ones logsBloom

2 participants