37 changes: 19 additions & 18 deletions doc/continuation-rationale.md
@@ -23,7 +23,7 @@ The executor concept adopts `continuation&` as the parameter type for
struct continuation
{
std::coroutine_handle<> h;
continuation* next = nullptr;
void* reserved = nullptr;
};

concept Executor = requires(E& e, continuation c) {
@@ -115,11 +115,11 @@ void post(continuation c) const;
**Arguments against:**

1. Breaks zero-allocation queuing. The executor links the
continuation into an intrusive queue via `next`. If `c` is a
stack-local copy, the copy is destroyed when `post` returns and
the queue has a dangling pointer. The whole point of the intrusive
`next` is that the executor queues the *original object*, not a
copy.
continuation into an intrusive queue via the `reserved` slot. If
`c` is a stack-local copy, the copy is destroyed when `post`
returns and the queue has a dangling pointer. The whole point of
the intrusive `reserved` link is that the executor queues the
*original object*, not a copy.
2. For `dispatch`, the inline case (return `c.h` for symmetric
transfer) works, but the fallback to `post` has the same problem.
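
The dangling-pointer argument above can be sketched in a few lines. This is an illustration under stated assumptions, not capy's implementation: `fifo` and `post_by_ref` are hypothetical names, and `h` is reduced to an `int` tag so the sketch builds without coroutine support (the real field is a `std::coroutine_handle<>`). The queue stores the address of whatever object it is handed, so a by-value `post` would link the address of a parameter copy that no longer exists once `post` returns.

```cpp
#include <cassert>

// Stand-in for capy's continuation. Assumption: `h` is an int tag here;
// the real field is a std::coroutine_handle<>.
struct continuation
{
    int h = 0;
    void* reserved = nullptr;
};

// Hypothetical minimal executor queue, linked through `reserved`.
struct fifo
{
    continuation* head = nullptr;
    continuation* tail = nullptr;

    // Links the object itself; no node is allocated.
    void push(continuation& c) noexcept
    {
        c.reserved = nullptr;
        if(tail)
            tail->reserved = &c;
        else
            head = &c;
        tail = &c;
    }

    // Unlinks and returns the oldest entry, or nullptr if empty.
    continuation* pop() noexcept
    {
        if(!head)
            return nullptr;
        continuation* c = head;
        head = static_cast<continuation*>(head->reserved);
        if(!head)
            tail = nullptr;
        return c;
    }
};

// By-reference post: the queue links the caller's original object,
// which the caller keeps alive until it is dequeued.
inline void post_by_ref(fifo& q, continuation& c) noexcept
{
    q.push(c);
}

// A by-value signature, post(fifo& q, continuation c), would call
// q.push(c) on the parameter copy. That copy is destroyed when the
// function returns, leaving q.head pointing at a dead object.
```

With the by-reference form, `pop()` hands back the addresses of the caller's own objects in FIFO order, which is the property the rejected by-value signature cannot provide.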

@@ -193,12 +193,13 @@ executor's queue. Two locations were considered:
must reach into the caller's promise to get the continuation. Both
are protocol changes.
2. Burdens task authors. Every promise type that inherits from
`io_awaitable_promise_base` grows by a pointer (the `next`
`io_awaitable_promise_base` grows by a pointer (the `reserved`
field) even though most suspension points never queue the
continuation (they use symmetric transfer inline).
3. Conflates two concerns. The promise stores "who resumes me when
I'm done" — a parent-child relationship. The `continuation` with
`next` means "I'm a queueable unit of work." These are different
its `reserved` queue link means "I'm a queueable unit of work."
These are different
concepts. The parent's continuation is only queued when the child
finishes and the parent must be posted to a different executor.
In the common case (same executor, symmetric transfer), it is
@@ -228,7 +229,7 @@ wraps it in the embedded `continuation`, and passes that to
**Arguments against:**

1. A new `continuation` is initialized at every `co_await`. Not an
allocation (it is embedded), but `next` and `h` are set each
allocation (it is embedded), but `reserved` and `h` are set each
time.
2. Combinator and trampoline patterns (parent dispatch, child
launch) do not have an I/O awaitable in scope. These sites need
@@ -241,7 +242,7 @@ wraps it in the embedded `continuation`, and passes that to
|---|---|---|
| Changes `IoAwaitable` concept? | Yes | No |
| Continuations per coroutine | One, reused | One per `co_await` |
| Init cost per suspension | None (already set) | Set `h` and `next` |
| Init cost per suspension | None (already set) | Set `h` and `reserved` |
| Alignment with corosio `scheduler_op` | Separate patterns | Same pattern |
| Burden on task authors | Yes — inherits extra pointer | None |
| Combinator / trampoline sites | Free (in promise) | Need explicit storage |
@@ -304,7 +305,7 @@ mechanism that operates on handles, not continuations.

A `continuation` must not move or be destroyed while it is linked
into an executor's queue. When `post(c)` is called, the executor
stores `&c` in an intrusive list via `c.next_`. If `c` moves or is
stores `&c` in an intrusive list via `c.reserved`. If `c` moves or is
destroyed before the executor dequeues it, the list has a dangling
pointer.
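
The address-stability rule can be made concrete with a small sketch. It rests on assumptions: `intrusive_list` and `links` are hypothetical names, and `h` is an `int` tag standing in for the real `std::coroutine_handle<>`. Because the type is trivially copyable, copying a queued continuation compiles without complaint, but the copy is a different object that the list never links.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for capy's continuation. Assumption: `h` is an int tag so the
// sketch builds without coroutine support.
struct continuation
{
    int h = 0;
    void* reserved = nullptr;
};

static_assert(std::is_trivially_copyable<continuation>::value,
    "copies compile silently, so the lifetime rule is a documented contract");

// Hypothetical intrusive list, linked through `reserved`.
struct intrusive_list
{
    continuation* head = nullptr;
    continuation* tail = nullptr;

    void push(continuation& c) noexcept
    {
        c.reserved = nullptr;
        if(tail)
            tail->reserved = &c;
        else
            head = &c;
        tail = &c;
    }

    // True only if this exact object (compared by address) is linked.
    bool links(continuation const& c) const noexcept
    {
        for(continuation* p = head; p;
            p = static_cast<continuation*>(p->reserved))
        {
            if(p == &c)
                return true;
        }
        return false;
    }
};
```

A copy taken after queuing carries the same `reserved` value as the original, yet `links` reports it as absent: the list only ever knows the original's address. That is why the owner (an awaitable, combinator state, or trampoline promise) has to keep the original in place until it is dequeued.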

@@ -355,7 +356,7 @@ recycling. Two options exist for how the strand interacts with

### Option S1: Strand Queues Continuations Directly

Replace `strand_op` with direct `continuation` queueing via `next`.
Replace `strand_op` with direct `continuation` queueing via the `reserved` link.

**Arguments for:**

@@ -418,10 +419,10 @@ authors see no change.

**Arguments against:**

1. Every coroutine frame grows by 8 bytes (the `next` pointer),
1. Every coroutine frame grows by 8 bytes (the `reserved` pointer),
even though the parent's continuation is rarely queued. The common
case (same executor, symmetric transfer) returns `c.h` inline —
`next` is dead weight.
`reserved` is dead weight.
2. Conflates "who resumes me" with "I'm a queueable unit."

### Option B2: Keep Promise Base Unchanged (chosen)
@@ -470,10 +471,10 @@ The `continuation` change requires updates in corosio:

3. **`scheduler::post(coroutine_handle<>)`** — Currently
heap-allocates a `post_handler`. With `continuation`, the scheduler
can queue the continuation directly via `next`, eliminating the
allocation. Whether `continuation::next_` and `scheduler_op`'s
intrusive queue unify or coexist is a corosio-internal design
question.
can queue the continuation directly via its `reserved` link,
eliminating the allocation. Whether `continuation`'s `reserved`
link and `scheduler_op`'s intrusive queue unify or coexist is a
corosio-internal design question.

4. **I/O operation types** (`reactor_op`, `overlapped_op`,
`waiter_node`) — These store `coroutine_handle<>` and
12 changes: 6 additions & 6 deletions doc/modules/ROOT/pages/9.design/9k.Executor.adoc
@@ -41,9 +41,9 @@ If the executor determines it is safe (e.g., the current thread is already assoc

=== `post(c)` -- Always Queue

Queues the continuation for later execution without ever executing it inline. Never blocks. The continuation is linked into the executor's internal queue via its intrusive `next` pointer -- no per-post heap allocation.
Queues the continuation for later execution without ever executing it inline. Never blocks. The continuation is linked into the executor's internal queue via its `reserved` slot -- no per-post heap allocation.

Both operations accept `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps a coroutine handle with an intrusive list pointer, enabling zero-allocation queuing.
Both operations accept `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps a coroutine handle with a pointer-sized `reserved` slot, which the executor commandeers as its queue link, enabling zero-allocation queuing.

The remaining operations support context access, lifecycle management, and identity:

@@ -159,20 +159,20 @@ Corosio confirms this in practice: its entire I/O layer -- sockets, acceptors, t

== Why `continuation`, Not Raw `coroutine_handle<>`

The executor accepts `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps the handle with an intrusive `next` pointer for zero-allocation queuing:
The executor accepts `continuation&` rather than `std::coroutine_handle<>`. A `continuation` wraps the handle with a pointer-sized `reserved` slot that the executor commandeers as its queue link, enabling zero-allocation queuing:

[source,cpp]
----
struct continuation
{
std::coroutine_handle<> h;
continuation* next = nullptr;
void* reserved = nullptr;
};
----

This design has three consequences:

- **Zero-allocation posting.** The thread pool links the `continuation` directly into its work queue via `next`. No `new work(h)` per post. The queue node is embedded in the thing being queued -- the awaitable, combinator state, or trampoline promise that owns the continuation.
- **Zero-allocation posting.** The thread pool links the `continuation` directly into its work queue via `reserved`. No `new work(h)` per post. The queue node is embedded in the thing being queued -- the awaitable, combinator state, or trampoline promise that owns the continuation.

- **Type erasure remains possible.** `executor_ref` wraps any executor behind a uniform vtable. The vtable function pointers accept `continuation&`, which is a concrete type. No templates on promise type are needed.

@@ -320,7 +320,7 @@ When the I/O completes (from the reactor thread for epoll, the completion port f
ex_.post(cont_);
----

`post` links the continuation into the executor's work queue via `cont_.next`. No heap allocation occurs -- the continuation is embedded in the awaitable, which is alive for the duration of the suspension. A worker thread dequeues the continuation and calls `cont_.h.resume()`.
`post` links the continuation into the executor's work queue via `cont_.reserved`. No heap allocation occurs -- the continuation is embedded in the awaitable, which is alive for the duration of the suspension. A worker thread dequeues the continuation and calls `cont_.h.resume()`.

=== Platform Independence

35 changes: 23 additions & 12 deletions include/boost/capy/continuation.hpp
@@ -19,9 +19,9 @@ namespace capy {

/** Executor-facing schedulable unit.

Wraps a `std::coroutine_handle<>` with an intrusive list
pointer so executors can queue continuations without
per-post heap allocation.
Wraps a `std::coroutine_handle<>` with a single
pointer-sized scratch slot so executors can queue
continuations without per-post heap allocation.

@par Fields

@@ -30,10 +30,15 @@ namespace capy {
an I/O awaitable or combinator). Read by the executor
when it dequeues the continuation.

@li `next` — intrusive linked-list pointer, owned and
managed exclusively by executor implementations. Users
must not read or write `next` while the continuation
is enqueued.
@li `reserved` — a pointer-sized scratch slot. Ordinary
users must not touch it. Authors of awaitable algorithms
(e.g. `async_mutex`, `async_semaphore`) may commandeer
it for their own node-based data structure, but **only
before** the continuation is submitted to an executor.
On submission the executor **clobbers** `reserved` to
link the continuation into its internal queue; the value
carries **no meaning** afterward. Once submitted, the
caller must not read or modify the continuation at all.

@par Ownership and Lifetime

@@ -46,14 +51,20 @@ namespace capy {
linked into an executor's queue. It must not be moved,
destroyed, or enqueued in more than one queue concurrently.

An author who needs a doubly-linked (or otherwise richer)
structure should hold a `continuation` as a member — or
derive from it, since it is an aggregate — and manage their
own links: `reserved` is only a single pre-submission scratch
slot, and it is no longer available once the continuation is
submitted.

@par Copy and Move

Trivially copyable and movable (aggregate of a handle and
a pointer). However, copying or moving a queued
continuation produces a second object whose `next` is
stale — the executor still points to the original. Copy
and move are safe only when the continuation is not
enqueued.
continuation produces a second object whose `reserved` slot
is stale — the executor still links the original. Copy and
move are safe only when the continuation is not enqueued.

@par Thread Safety

@@ -69,7 +80,7 @@ namespace capy {
struct continuation
{
std::coroutine_handle<> h;
continuation* next = nullptr;
void* reserved = nullptr;
};

} // namespace capy
9 changes: 5 additions & 4 deletions src/ex/detail/strand_queue.hpp
@@ -21,7 +21,8 @@ namespace detail {

/** Single-threaded intrusive FIFO of pending continuations.

Links continuations directly through `continuation::next`, so
Links continuations directly through `continuation::reserved` (a
typed `continuation*` round-tripped through the `void*` slot), so
push() carries no per-item allocation.

@par Thread Safety
@@ -54,9 +55,9 @@ class strand_queue
void
push(continuation& c) noexcept
{
c.next = nullptr;
c.reserved = nullptr;
if(tail_)
tail_->next = &c;
tail_->reserved = &c;
else
head_ = &c;
tail_ = &c;
@@ -102,7 +103,7 @@ class strand_queue
while(batch.head)
{
continuation* c = batch.head;
batch.head = c->next;
batch.head = static_cast<continuation*>(c->reserved);
safe_resume(c->h);
}
batch.tail = nullptr;
11 changes: 6 additions & 5 deletions src/ex/thread_pool.cpp
@@ -62,16 +62,17 @@ class thread_pool::impl
// resume) and post.
static inline detail::thread_local_ptr<impl const> current_;

// Intrusive queue of continuations via continuation::next.
// No per-post allocation: the continuation is owned by the caller.
// Intrusive queue of continuations: the next link is stored in
// continuation::reserved (typed continuation* round-tripped through
// void*). No per-post allocation: the continuation is owned by the caller.
continuation* head_ = nullptr;
continuation* tail_ = nullptr;

void push(continuation* c) noexcept
{
c->next = nullptr;
c->reserved = nullptr;
if(tail_)
tail_->next = c;
tail_->reserved = c;
else
head_ = c;
tail_ = c;
@@ -82,7 +83,7 @@ class thread_pool::impl
if(!head_)
return nullptr;
continuation* c = head_;
head_ = head_->next;
head_ = static_cast<continuation*>(head_->reserved);
if(!head_)
tail_ = nullptr;
return c;
12 changes: 6 additions & 6 deletions test/unit/ex/priority_executor.hpp
@@ -69,8 +69,8 @@ drain_list(continuation* head) noexcept
while(head)
{
continuation* c = head;
head = c->next;
c->next = nullptr;
head = static_cast<continuation*>(c->reserved);
c->reserved = nullptr;
::boost::capy::safe_resume(c->h);
}
}
@@ -129,16 +129,16 @@ class priority_executor
void
enqueue_under_lock(continuation& c, priority p) const noexcept
{
c.next = nullptr;
c.reserved = nullptr;
if(p == priority::high)
{
if(state_->high_tail) state_->high_tail->next = &c;
if(state_->high_tail) state_->high_tail->reserved = &c;
else state_->high_head = &c;
state_->high_tail = &c;
}
else
{
if(state_->low_tail) state_->low_tail->next = &c;
if(state_->low_tail) state_->low_tail->reserved = &c;
else state_->low_head = &c;
state_->low_tail = &c;
}
@@ -164,7 +164,7 @@ class priority_executor
auto inv = detail::make_priority_invoker(state_);
auto& self = inv.h_.promise().self;
self.h = inv.h_;
self.next = nullptr;
self.reserved = nullptr;
inner_ex_.post(self);
}
