Skip to content

Refactor dep gen and l2 swimlane collectors to common#1262

Open
vegetabledoww wants to merge 1 commit into
hw-native-sys:mainfrom
vegetabledoww:fix/issue-1253-common-collectors-v2
Open

Refactor dep gen and l2 swimlane collectors to common#1262
vegetabledoww wants to merge 1 commit into
hw-native-sys:mainfrom
vegetabledoww:fix/issue-1253-common-collectors-v2

Conversation

@vegetabledoww

@vegetabledoww vegetabledoww commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes #1253.

This PR hoists the shared DepGen and L2 Swimlane collectors out of the per-architecture a2a3/a5 platform trees into src/common/platform.

  • Move host-side DepGen and L2 Swimlane collectors to common shared host sources.
  • Move AICPU-side DepGen and L2 Swimlane collector implementations to common shared AICPU sources.
  • Update a2a3/a5 onboard and sim CMake source lists to build the common collectors.
  • Keep architecture-specific memory transport through each platform's profiling_copy.cpp implementation.
  • Preserve the PR Fix: reduce profiling buffer drops across collectors #1162 buffer/drop-reduction behavior after the relocation.
  • Update related documentation and stale DepGen comments to match the current in-memory replay path.

Validation

  • g++ -fsyntax-only for the common host collectors with both a2a3 and a5 include paths.
  • cmake --build /data/jinzongquan/simpler/build/ut_cpp_issue1253 -j8
  • ctest --test-dir /data/jinzongquan/simpler/build/ut_cpp_issue1253 --output-on-failure -L no_hardware
    • Result: 40/40 passed.
  • pip install --no-build-isolation -e .
  • a2a3 onboard DFX test through task-submit:
    • test_dep_gen.py
    • test_dep_gen_chain.py
    • test_l2_swimlane.py
    • test_l2_swimlane_mixed.py
    • Result: 4 passed with --platform a2a3 --enable-l2-swimlane 4 --enable-dep-gen.

Notes

  • --enable-l2-swimlane 1 was not used for the final onboard validation because level 1 only captures AICore timing and does not include the AICPU/phase data required by the sched overhead analysis.
  • a5 onboard DFX coverage was not run in this local validation pass.

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 75c53c75-2966-4a1c-8c2d-775c450c195c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR consolidates architecture-specific L2SwimlaneCollector and DepGenCollector implementations from src/a2a3 and src/a5 into shared src/common headers/sources, removes the now-redundant platform-specific files, updates memory-copy calls to non-SVM-aware helpers, reworks buffer pool seeding logic, rewires CMake build sources, and updates documentation links accordingly.

Changes

Collector consolidation into src/common

Layer / File(s) Summary
Common collector headers and non-SVM memory copy wiring
src/common/platform/include/host/dep_gen_collector.h, src/common/platform/include/host/l2_swimlane_collector.h, src/common/platform/shared/host/dep_gen_collector.cpp, src/common/platform/shared/host/l2_swimlane_collector.cpp
Include guards renamed to SRC_COMMON_PLATFORM_*, docs generalized from a5-specific to non-SVM wording, set_memory_context calls switched to profiling_copy_to/from_device_or_null(), and buffer pool initialization reworked to seed initial_free_count = min(buffers_per_thread, PLATFORM_PROF_SLOT_COUNT) entries across AICPU/AICore task pools and phase pools.
Removal of a2a3/a5 platform-specific collector code
src/a2a3/platform/include/host/{dep_gen_collector.h,l2_swimlane_collector.h}, src/a2a3/platform/shared/host/{dep_gen_collector.cpp,l2_swimlane_collector.cpp}, src/a5/platform/shared/aicpu/{dep_gen_collector_aicpu.cpp,l2_swimlane_collector_aicpu.cpp}
Deletes the fully duplicated per-platform collector headers, host implementations, and AICPU-side dep_gen capture logic previously specific to a2a3 and a5.
Build system rewiring to common sources
src/{a2a3,a5}/platform/{onboard,sim}/host/CMakeLists.txt, tests/ut/cpp/CMakeLists.txt
Updates HOST_RUNTIME_SOURCES and unit test executable source lists to build l2_swimlane_collector.cpp and dep_gen_collector.cpp from common/platform/shared/host instead of platform-specific directories.
Documentation cross-reference updates
docs/dfx/dep_gen.md, docs/dfx/l2-swimlane-profiling.md, docs/hardware/cache-coherency.md, docs/profiling-framework.md
Updates documentation links and file-path references to the new shared src/common collector paths and generalizes a5-specific ("no SVM") phrasing to "non-SVM platforms".

Estimated code review effort: 4 (Complex) | ~60 minutes

Possibly related issues

Possibly related PRs

  • hw-native-sys/simpler#1152: Overlaps at the collector-code level, touching the same L2SwimlaneCollector::initialize and DepGenCollector::init buffer/record sizing logic.
  • hw-native-sys/simpler#1162: Directly overlaps with the free_queue seeding and drop-accounting changes in the common L2SwimlaneCollector.

Poem

A rabbit hopped through a2a3 and a5,
Gathered scattered burrows, kept the code alive,
Into src/common, one warren, tight and neat,
No more SVM guesswork, non-SVM paths complete,
Free queues seeded deep, buffers hop in line 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive No pull request description was provided, so there is no content to validate against the changeset. Add a brief description of the collector refactor and the common-path updates.
✅ Passed checks (4 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly matches the main change: moving dep gen and L2 swimlane collectors into common paths.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request consolidates the platform-specific implementations of dep_gen_collector and l2_swimlane_collector from src/a2a3 and src/a5 into a unified common implementation under src/common/platform. It updates build configurations, documentation, and unit tests to point to the new shared paths. Additionally, it generalizes platform-specific terminology to generic categories (e.g., "non-SVM platforms") and optimizes the initialization of L2SwimlaneCollector to seed the device-side free queue with multiple buffers. No review comments were provided, so there is no further feedback to address.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@vegetabledoww vegetabledoww force-pushed the fix/issue-1253-common-collectors-v2 branch from 3c2728a to c229fdc Compare July 3, 2026 01:54

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/dfx/l2-swimlane-profiling.md`:
- Line 711: The collector links in the profiling doc are resolving to the wrong
location because the relative path is missing one directory level; update the
`L2SwimlaneCollector` link targets in both the a2a3 and a5 sections to use the
shared header path with the correct `../../src/...` prefix so the markdown
points to the actual file.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 21b3d8c8-3a69-46bf-9dcc-12da23a8d58e

📥 Commits

Reviewing files that changed from the base of the PR and between 153daee and 3c2728a.

📒 Files selected for processing (21)
  • docs/dfx/dep_gen.md
  • docs/dfx/l2-swimlane-profiling.md
  • docs/hardware/cache-coherency.md
  • docs/profiling-framework.md
  • src/a2a3/platform/include/host/dep_gen_collector.h
  • src/a2a3/platform/include/host/l2_swimlane_collector.h
  • src/a2a3/platform/onboard/host/CMakeLists.txt
  • src/a2a3/platform/shared/host/dep_gen_collector.cpp
  • src/a2a3/platform/shared/host/l2_swimlane_collector.cpp
  • src/a2a3/platform/sim/host/CMakeLists.txt
  • src/a5/platform/onboard/host/CMakeLists.txt
  • src/a5/platform/shared/aicpu/dep_gen_collector_aicpu.cpp
  • src/a5/platform/shared/aicpu/l2_swimlane_collector_aicpu.cpp
  • src/a5/platform/sim/host/CMakeLists.txt
  • src/common/platform/include/host/dep_gen_collector.h
  • src/common/platform/include/host/l2_swimlane_collector.h
  • src/common/platform/shared/aicpu/dep_gen_collector_aicpu.cpp
  • src/common/platform/shared/aicpu/l2_swimlane_collector_aicpu.cpp
  • src/common/platform/shared/host/dep_gen_collector.cpp
  • src/common/platform/shared/host/l2_swimlane_collector.cpp
  • tests/ut/cpp/CMakeLists.txt
💤 Files with no reviewable changes (6)
  • src/a2a3/platform/shared/host/l2_swimlane_collector.cpp
  • src/a5/platform/shared/aicpu/dep_gen_collector_aicpu.cpp
  • src/a2a3/platform/shared/host/dep_gen_collector.cpp
  • src/a5/platform/shared/aicpu/l2_swimlane_collector_aicpu.cpp
  • src/a2a3/platform/include/host/dep_gen_collector.h
  • src/a2a3/platform/include/host/l2_swimlane_collector.h

```

[`L2SwimlaneCollector`](../src/a2a3/platform/include/host/l2_swimlane_collector.h)
[`L2SwimlaneCollector`](../src/common/platform/include/host/l2_swimlane_collector.h)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Fix the collector link targets.

../src/... from docs/dfx/l2-swimlane-profiling.md resolves under docs/src/..., so both links are broken as written. Use ../../src/... for the shared header in both sections.

Suggested fix
-[`L2SwimlaneCollector`](../src/common/platform/include/host/l2_swimlane_collector.h)
+[`L2SwimlaneCollector`](../../src/common/platform/include/host/l2_swimlane_collector.h)

Apply the same replacement in both the a2a3 and a5 sections.

Also applies to: 841-841

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dfx/l2-swimlane-profiling.md` at line 711, The collector links in the
profiling doc are resolving to the wrong location because the relative path is
missing one directory level; update the `L2SwimlaneCollector` link targets in
both the a2a3 and a5 sections to use the shared header path with the correct
`../../src/...` prefix so the markdown points to the actual file.

@vegetabledoww vegetabledoww force-pushed the fix/issue-1253-common-collectors-v2 branch 3 times, most recently from ca72843 to 16246e8 Compare July 4, 2026 01:10
Hoist the shared DepGen and L2 Swimlane collector implementations from the per-architecture a2a3/a5 platform trees into src/common/platform.

Update a2a3/a5 onboard and sim build lists to compile the common host collectors, and keep the architecture-specific memory transport behind each platform's profiling_copy.cpp callbacks.

Preserve the PR hw-native-sys#1162 buffer/drop-reduction behavior after relocation, including per-AICPU-thread drain/collector sharding and initial free_queue seeding up to the slot count.

Also refresh related docs and stale DepGen comments to describe the current in-memory replay path instead of a submit_trace.bin intermediary.

Validation: C++ UT build passed; no_hardware ctest passed 40/40; pip editable install passed; a2a3 onboard DepGen and L2 Swimlane DFX tests passed 4/4 via task-submit.
@vegetabledoww vegetabledoww force-pushed the fix/issue-1253-common-collectors-v2 branch from 16246e8 to dd4c487 Compare July 4, 2026 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Code Health] Hoist L2Swimlane & DepGen collectors into common (like TensorDump/ScopeStats): move identical AICPU, migrate host to alloc_paired_buffer

1 participant