diff --git a/doc/continuation-rationale.md b/doc/continuation-rationale.md index c69d3c97a..0ea4a0a50 100644 --- a/doc/continuation-rationale.md +++ b/doc/continuation-rationale.md @@ -23,7 +23,7 @@ The executor concept adopts `continuation&` as the parameter type for struct continuation { std::coroutine_handle<> h; - continuation* next = nullptr; + void* reserved = nullptr; }; concept Executor = requires(E& e, continuation c) { @@ -115,11 +115,11 @@ void post(continuation c) const; **Arguments against:** 1. Breaks zero-allocation queuing. The executor links the - continuation into an intrusive queue via `next`. If `c` is a - stack-local copy, the copy is destroyed when `post` returns and - the queue has a dangling pointer. The whole point of the intrusive - `next` is that the executor queues the *original object*, not a - copy. + continuation into an intrusive queue via the `reserved` slot. If + `c` is a stack-local copy, the copy is destroyed when `post` + returns and the queue has a dangling pointer. The whole point of + the intrusive `reserved` link is that the executor queues the + *original object*, not a copy. 2. For `dispatch`, the inline case (return `c.h` for symmetric transfer) works, but the fallback to `post` has the same problem. @@ -193,12 +193,13 @@ executor's queue. Two locations were considered: must reach into the caller's promise to get the continuation. Both are protocol changes. 2. Burdens task authors. Every promise type that inherits from - `io_awaitable_promise_base` grows by a pointer (the `next` + `io_awaitable_promise_base` grows by a pointer (the `reserved` field) even though most suspension points never queue the continuation (they use symmetric transfer inline). 3. Conflates two concerns. The promise stores "who resumes me when I'm done" — a parent-child relationship. The `continuation` with - `next` means "I'm a queueable unit of work." These are different + its `reserved` queue link means "I'm a queueable unit of work." 
+ These are different concepts. The parent's continuation is only queued when the child finishes and the parent must be posted to a different executor. In the common case (same executor, symmetric transfer), it is @@ -228,7 +229,7 @@ wraps it in the embedded `continuation`, and passes that to **Arguments against:** 1. A new `continuation` is initialized at every `co_await`. Not an - allocation (it is embedded), but `next` and `h` are set each + allocation (it is embedded), but `reserved` and `h` are set each time. 2. Combinator and trampoline patterns (parent dispatch, child launch) do not have an I/O awaitable in scope. These sites need @@ -241,7 +242,7 @@ wraps it in the embedded `continuation`, and passes that to |---|---|---| | Changes `IoAwaitable` concept? | Yes | No | | Continuations per coroutine | One, reused | One per `co_await` | -| Init cost per suspension | None (already set) | Set `h` and `next` | +| Init cost per suspension | None (already set) | Set `h` and `reserved` | | Alignment with corosio `scheduler_op` | Separate patterns | Same pattern | | Burden on task authors | Yes — inherits extra pointer | None | | Combinator / trampoline sites | Free (in promise) | Need explicit storage | @@ -304,7 +305,7 @@ mechanism that operates on handles, not continuations. A `continuation` must not move or be destroyed while it is linked into an executor's queue. When `post(c)` is called, the executor -stores `&c` in an intrusive list via `c.next_`. If `c` moves or is +stores `&c` in an intrusive list via `c.reserved`. If `c` moves or is destroyed before the executor dequeues it, the list has a dangling pointer. @@ -355,7 +356,7 @@ recycling. Two options exist for how the strand interacts with ### Option S1: Strand Queues Continuations Directly -Replace `strand_op` with direct `continuation` queueing via `next`. +Replace `strand_op` with direct `continuation` queueing via the `reserved` link. **Arguments for:** @@ -418,10 +419,10 @@ authors see no change. 
**Arguments against:** -1. Every coroutine frame grows by 8 bytes (the `next` pointer), +1. Every coroutine frame grows by 8 bytes (the `reserved` pointer), even though the parent's continuation is rarely queued. The common case (same executor, symmetric transfer) returns `c.h` inline — - `next` is dead weight. + `reserved` is dead weight. 2. Conflates "who resumes me" with "I'm a queueable unit." ### Option B2: Keep Promise Base Unchanged (chosen) @@ -470,10 +471,10 @@ The `continuation` change requires updates in corosio: 3. **`scheduler::post(coroutine_handle<>)`** — Currently heap-allocates a `post_handler`. With `continuation`, the scheduler - can queue the continuation directly via `next`, eliminating the - allocation. Whether `continuation::next_` and `scheduler_op`'s - intrusive queue unify or coexist is a corosio-internal design - question. + can queue the continuation directly via its `reserved` link, + eliminating the allocation. Whether `continuation`'s `reserved` + link and `scheduler_op`'s intrusive queue unify or coexist is a + corosio-internal design question. 4. **I/O operation types** (`reactor_op`, `overlapped_op`, `waiter_node`) — These store `coroutine_handle<>` and diff --git a/doc/modules/ROOT/pages/9.design/9k.Executor.adoc b/doc/modules/ROOT/pages/9.design/9k.Executor.adoc index bafb069b6..d6762adeb 100644 --- a/doc/modules/ROOT/pages/9.design/9k.Executor.adoc +++ b/doc/modules/ROOT/pages/9.design/9k.Executor.adoc @@ -41,9 +41,9 @@ If the executor determines it is safe (e.g., the current thread is already assoc === `post(c)` -- Always Queue -Queues the continuation for later execution without ever executing it inline. Never blocks. The continuation is linked into the executor's internal queue via its intrusive `next` pointer -- no per-post heap allocation. +Queues the continuation for later execution without ever executing it inline. Never blocks. 
The continuation is linked into the executor's internal queue via its `reserved` slot -- no per-post heap allocation. -Both operations accept `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps a coroutine handle with an intrusive list pointer, enabling zero-allocation queuing. +Both operations accept `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps a coroutine handle with a pointer-sized `reserved` slot, which the executor commandeers as its queue link, enabling zero-allocation queuing. The remaining operations support context access, lifecycle management, and identity: @@ -159,20 +159,20 @@ Corosio confirms this in practice: its entire I/O layer -- sockets, acceptors, t == Why `continuation`, Not Raw `coroutine_handle<>` -The executor accepts `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps the handle with an intrusive `next` pointer for zero-allocation queuing: +The executor accepts `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps the handle with a pointer-sized `reserved` slot that the executor commandeers as its queue link, enabling zero-allocation queuing: [source,cpp] ---- struct continuation { std::coroutine_handle<> h; - continuation* next = nullptr; + void* reserved = nullptr; }; ---- This design has three consequences: -- **Zero-allocation posting.** The thread pool links the `continuation` directly into its work queue via `next`. No `new work(h)` per post. The queue node is embedded in the thing being queued -- the awaitable, combinator state, or trampoline promise that owns the continuation. +- **Zero-allocation posting.** The thread pool links the `continuation` directly into its work queue via `reserved`. No `new work(h)` per post. The queue node is embedded in the thing being queued -- the awaitable, combinator state, or trampoline promise that owns the continuation. 
- **Type erasure remains possible.** `executor_ref` wraps any executor behind a uniform vtable. The vtable function pointers accept `continuation&`, which is a concrete type. No templates on promise type are needed. @@ -320,7 +320,7 @@ When the I/O completes (from the reactor thread for epoll, the completion port f ex_.post(cont_); ---- -`post` links the continuation into the executor's work queue via `cont_.next`. No heap allocation occurs -- the continuation is embedded in the awaitable, which is alive for the duration of the suspension. A worker thread dequeues the continuation and calls `cont_.h.resume()`. +`post` links the continuation into the executor's work queue via `cont_.reserved`. No heap allocation occurs -- the continuation is embedded in the awaitable, which is alive for the duration of the suspension. A worker thread dequeues the continuation and calls `cont_.h.resume()`. === Platform Independence diff --git a/include/boost/capy/continuation.hpp b/include/boost/capy/continuation.hpp index 765d05853..e0a581ad4 100644 --- a/include/boost/capy/continuation.hpp +++ b/include/boost/capy/continuation.hpp @@ -19,9 +19,9 @@ namespace capy { /** Executor-facing schedulable unit. - Wraps a `std::coroutine_handle<>` with an intrusive list - pointer so executors can queue continuations without - per-post heap allocation. + Wraps a `std::coroutine_handle<>` with a single + pointer-sized scratch slot so executors can queue + continuations without per-post heap allocation. @par Fields @@ -30,10 +30,15 @@ namespace capy { an I/O awaitable or combinator). Read by the executor when it dequeues the continuation. - @li `next` — intrusive linked-list pointer, owned and - managed exclusively by executor implementations. Users - must not read or write `next` while the continuation - is enqueued. + @li `reserved` — a pointer-sized scratch slot. Ordinary + users must not touch it. Authors of awaitable algorithms + (e.g. 
`async_mutex`, `async_semaphore`) may commandeer + it for their own node-based data structure, but **only + before** the continuation is submitted to an executor. + On submission the executor **clobbers** `reserved` to + link the continuation into its internal queue; the value + carries **no meaning** afterward. Once submitted, the + caller must not read or modify the continuation at all. @par Ownership and Lifetime @@ -46,14 +51,20 @@ namespace capy { linked into an executor's queue. It must not be moved, destroyed, or enqueued in more than one queue concurrently. + An author who needs a doubly-linked (or otherwise richer) + structure should hold a `continuation` as a member — or + derive from it, since it is an aggregate — and manage their + own links: `reserved` is only a single pre-submission scratch + slot, and it is no longer available once the continuation is + submitted. + @par Copy and Move Trivially copyable and movable (aggregate of a handle and a pointer). However, copying or moving a queued - continuation produces a second object whose `next` is - stale — the executor still points to the original. Copy - and move are safe only when the continuation is not - enqueued. + continuation produces a second object whose `reserved` slot + is stale — the executor still links the original. Copy and + move are safe only when the continuation is not enqueued. @par Thread Safety @@ -69,7 +80,7 @@ namespace capy { struct continuation { std::coroutine_handle<> h; - continuation* next = nullptr; + void* reserved = nullptr; }; } // namespace capy diff --git a/src/ex/detail/strand_queue.hpp b/src/ex/detail/strand_queue.hpp index 0c4a2c9c1..d3dba3cbf 100644 --- a/src/ex/detail/strand_queue.hpp +++ b/src/ex/detail/strand_queue.hpp @@ -21,7 +21,8 @@ namespace detail { /** Single-threaded intrusive FIFO of pending continuations. 
- Links continuations directly through `continuation::next`, so + Links continuations directly through `continuation::reserved` (a + typed `continuation*` round-tripped through the `void*` slot), so push() carries no per-item allocation. @par Thread Safety @@ -54,9 +55,9 @@ class strand_queue void push(continuation& c) noexcept { - c.next = nullptr; + c.reserved = nullptr; if(tail_) - tail_->next = &c; + tail_->reserved = &c; else head_ = &c; tail_ = &c; @@ -102,7 +103,7 @@ class strand_queue while(batch.head) { continuation* c = batch.head; - batch.head = c->next; + batch.head = static_cast<continuation*>(c->reserved); safe_resume(c->h); } batch.tail = nullptr; diff --git a/src/ex/thread_pool.cpp b/src/ex/thread_pool.cpp index de03b922c..14beea8e5 100644 --- a/src/ex/thread_pool.cpp +++ b/src/ex/thread_pool.cpp @@ -62,16 +62,17 @@ class thread_pool::impl // resume) and post. static inline detail::thread_local_ptr current_; - // Intrusive queue of continuations via continuation::next. - // No per-post allocation: the continuation is owned by the caller. + // Intrusive queue of continuations: the next link is stored in + // continuation::reserved (typed continuation* round-tripped through + // void*). No per-post allocation: the continuation is owned by the caller. 
continuation* head_ = nullptr; continuation* tail_ = nullptr; void push(continuation* c) noexcept { - c->next = nullptr; + c->reserved = nullptr; if(tail_) - tail_->next = c; + tail_->reserved = c; else head_ = c; tail_ = c; @@ -82,7 +83,7 @@ class thread_pool::impl if(!head_) return nullptr; continuation* c = head_; - head_ = head_->next; + head_ = static_cast<continuation*>(head_->reserved); if(!head_) tail_ = nullptr; return c; diff --git a/test/unit/ex/priority_executor.hpp b/test/unit/ex/priority_executor.hpp index cdbfb9fc0..1986c23fa 100644 --- a/test/unit/ex/priority_executor.hpp +++ b/test/unit/ex/priority_executor.hpp @@ -69,8 +69,8 @@ drain_list(continuation* head) noexcept while(head) { continuation* c = head; - head = static_cast<continuation*>(c->reserved); + c->reserved = nullptr; ::boost::capy::safe_resume(c->h); } } @@ -129,16 +129,16 @@ class priority_executor void enqueue_under_lock(continuation& c, priority p) const noexcept { - c.next = nullptr; + c.reserved = nullptr; if(p == priority::high) { - if(state_->high_tail) state_->high_tail->next = &c; + if(state_->high_tail) state_->high_tail->reserved = &c; else state_->high_head = &c; state_->high_tail = &c; } else { - if(state_->low_tail) state_->low_tail->next = &c; + if(state_->low_tail) state_->low_tail->reserved = &c; else state_->low_head = &c; state_->low_tail = &c; } @@ -164,7 +164,7 @@ class priority_executor auto inv = detail::make_priority_invoker(state_); auto& self = inv.h_.promise().self; self.h = inv.h_; - self.next = nullptr; + self.reserved = nullptr; inner_ex_.post(self); }
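
Reviewer note: the change above replaces the typed `next` pointer with a `void* reserved` slot, so every dequeue path now needs an explicit cast back to `continuation*`. The following is a minimal standalone sketch of that round trip, not the library's code. It assumes a simplified `continuation` in which an `int id` stands in for the `std::coroutine_handle<>` so that it builds without coroutine support; `intrusive_fifo` and `drain_ids` are hypothetical names introduced only for illustration.

```cpp
#include <vector>

// Simplified stand-in for the patched struct. ASSUMPTION: `id` replaces
// the real `std::coroutine_handle<> h` member so no coroutines are needed.
struct continuation
{
    int id;
    void* reserved = nullptr;  // scratch slot, overwritten on push
};

// Hypothetical FIFO with the same push/pop shape as the queues in the diff.
class intrusive_fifo
{
    continuation* head_ = nullptr;
    continuation* tail_ = nullptr;

public:
    void push(continuation& c) noexcept
    {
        c.reserved = nullptr;      // clobbers any pre-submission value
        if(tail_)
            tail_->reserved = &c;  // continuation* to void* is implicit
        else
            head_ = &c;
        tail_ = &c;
    }

    continuation* pop() noexcept
    {
        if(!head_)
            return nullptr;
        continuation* c = head_;
        // void* back to continuation* requires an explicit cast
        head_ = static_cast<continuation*>(head_->reserved);
        if(!head_)
            tail_ = nullptr;
        return c;
    }
};

// Drains the queue and returns the ids in dequeue order.
std::vector<int> drain_ids(intrusive_fifo& q)
{
    std::vector<int> out;
    while(continuation* c = q.pop())
        out.push_back(c->id);
    return out;
}
```

Pushing three continuations and draining should yield them in insertion order, including one whose `reserved` slot held an unrelated pointer before `push`. That matches the documented contract that submission clobbers the slot.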