Add `unstable.sandbox` by scotttrinh · Pull Request #128 · vercel/vercel-py

scotttrinh · 2026-05-28T18:29:07Z

Introduces a new unstable package at vercel.unstable (and internals at vercel._internal.unstable) that currently exposes the latest v2 persistent Sandbox.

Looks like this (excerpt from included example):

async def review_code(files: list[tuple[str, str]], review_agent: str) -> str:
    async with sandbox.create_sandbox(
        runtime="python3.13",
        execution_time_limit=timedelta(minutes=1),
    ) as box:
        await box.fs.mkdir("workspace")
        async with box.fs.batch() as batch:
            for path, content in files:
                batch.write_text(path, content)
            batch.write_text("workspace/review_agent.py", review_agent)

        await box.run_process(
            "python",
            ["workspace/review_agent.py", "workspace"],
            kill_after=timedelta(seconds=30),
            check=True,
        )
        return await box.fs.read_text("workspace/review.md")

This doesn't need to be public and we certainly don't need to make a fake private method that calls this public method.

Slightly better factoring here with two low-level helper methods that vary only in streaming vs non-streaming, and then a small JSON-specific layer atop that for the endpoints that need it.
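A minimal sketch of that layering, using a stand-in transport rather than the real one. Every name here (`FakeTransport`, `HttpHelpers`, `stream`, `request`, `request_json`) is hypothetical and not taken from the PR; the point is only that the JSON layer sits on top of two low-level helpers that differ in whether they buffer.

```python
import json
from collections.abc import Iterator


class FakeTransport:
    """Hypothetical stand-in for an HTTP transport; returns canned chunks."""

    def send(self, method: str, url: str) -> list[bytes]:
        # A real transport would perform network I/O here.
        return [b'{"id": "sbx', b'_123"}']


class HttpHelpers:
    def __init__(self, transport: FakeTransport, base_url: str) -> None:
        self._transport = transport
        self._base_url = base_url

    def stream(self, method: str, path: str) -> Iterator[bytes]:
        """Low-level helper: yield body chunks as they arrive."""
        yield from self._transport.send(method, self._base_url + path)

    def request(self, method: str, path: str) -> bytes:
        """Low-level helper: buffer the whole body."""
        return b"".join(self.stream(method, path))

    def request_json(self, method: str, path: str) -> object:
        """Thin JSON layer on top of the non-streaming helper."""
        return json.loads(self.request(method, path))


helpers = HttpHelpers(FakeTransport(), "https://example.invalid")
result = helpers.request_json("GET", "/sandboxes/sbx_123")
```

Endpoints that return raw or streamed bodies would call `request` or `stream` directly and skip the JSON layer.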

Also moves base URL ownership to the services since we might eventually need to support a single service making requests to different origins for common multi-origin flows like OIDC.

This system was a little bit over-designed. We don't really care about avoiding use-after-free at the resource level, since that will error as expected from the resource server end. There are also plenty of cases where you'll hit these errors (like if a sandbox exceeds its maximum execution time limit) so instead of having local state and server-owned state for whether some operation on a resource is valid, we let the server own that. Now that we don't have multiple tokens floating around, a simple boolean on the session is enough to track the session lifetime part of this. Simpler to reason about with fewer moving parts.
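Roughly, the simplified lifetime tracking could look like the sketch below. The `Session` class, its `request` method, and `SessionClosedError` are illustrative names assumed for this example, not the SDK's actual types.

```python
class SessionClosedError(RuntimeError):
    """Raised locally when the session itself has been closed."""


class Session:
    def __init__(self) -> None:
        self._closed = False  # the only piece of local lifetime state

    def close(self) -> None:
        self._closed = True

    def request(self, path: str) -> str:
        if self._closed:
            raise SessionClosedError("session is closed")
        # Whether the *resource* is still valid is left to the server,
        # which would answer with an error for, say, a stopped sandbox.
        return f"sent {path}"


session = Session()
first = session.request("/sandboxes/example")
session.close()
try:
    session.request("/sandboxes/example")
except SessionClosedError as exc:
    error = str(exc)
```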

Prioritize live tests covering a lot of ground over small, focused unit tests that use mocks. There is still some coverage overlap between live and example-based tests, but I think that's acceptable for now. We can switch to running example tests less frequently if it starts to be expensive.

We have some complex rules that affect filtering and sorting, so instead of presenting the raw API knobs, this exposes only the legal querying knobs. This is going to be higher maintenance, but it seems worth it for a really nice DX.

vercel · 2026-05-28T18:29:13Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
vercel-py	Ready	Preview	Jun 11, 2026 2:54pm

- Add caching
- Improve typing of the log stream
- Maintain ordering when getting "both" streams
- Raise on in-band error
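As a rough illustration of ordering and in-band errors, assume the log stream is newline-delimited JSON where each event carries a stream name and a chunk of text, or an error. The field names (`stream`, `data`, `error`) and the `LogStreamError` type are assumptions for this sketch, not the actual wire format.

```python
import json
from collections.abc import Iterable, Iterator


class LogStreamError(RuntimeError):
    """Raised when the stream itself reports a failure."""


def iter_logs(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (stream, data) pairs in arrival order from NDJSON lines."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank keep-alive lines
        event = json.loads(line)
        if "error" in event:
            # In-band error: surface it instead of silently ending the stream.
            raise LogStreamError(event["error"])
        yield event["stream"], event["data"]


sample = [
    '{"stream": "stdout", "data": "one"}',
    '{"stream": "stderr", "data": "warn"}',
    '{"stream": "stdout", "data": "two"}',
]
events = list(iter_logs(sample))

try:
    list(iter_logs(['{"error": "sandbox stopped"}']))
except LogStreamError as exc:
    failure = str(exc)
```

Because both streams are read from a single sequence of events, interleaving is preserved without any timestamp-based merging.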

The old implementation was leaky. Instead of it being an async-shaped iter_coroutine driven business logic module, it had some awkward binding machinery and very loose types. We are striving for only exposing the sync/async split at the very top and very bottom of the stack, so this does a bunch of refactoring to achieve a cleaner separation.

Before

    Async facade -> SandboxService -> SandboxApiClient -> AsyncTransport
    Sync facade -> iter_coroutine(SandboxService) -> SandboxApiClient -> SyncTransport
                   |
                   -> SandboxService constructs async/sync public handles

After

    Async facade -> AsyncSandboxClient -> SandboxService -> SandboxApiClient -> AsyncTransport
    Sync facade -> SyncSandboxClient -> SandboxService -> SandboxApiClient -> SyncTransport
                   |                    |
                   -> public handles    -> neutral frozen state only

`iter_coroutine` now exists only in `SyncSandboxClient`; handle binding and stream consumption live in the runtime-specific clients.
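For readers unfamiliar with the pattern, here is one way an `iter_coroutine`-style helper can work. Only the name comes from this PR; the implementation below, along with `Service` and `sync_fetch`, is an assumed sketch. The idea is that business logic is written once as `async def`, and the sync path drives it without an event loop because its injected I/O functions never actually suspend.

```python
from collections.abc import Awaitable, Callable, Coroutine
from typing import Any, TypeVar

T = TypeVar("T")


def iter_coroutine(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine that never truly suspends, without an event loop."""
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    # Reaching here means the coroutine yielded to a (non-existent) loop.
    coro.close()
    raise RuntimeError("coroutine suspended; it needs a real event loop")


class Service:
    """Runtime-neutral business logic written once in async style."""

    def __init__(self, fetch: Callable[[str], Awaitable[str]]) -> None:
        self._fetch = fetch

    async def status(self, name: str) -> str:
        return (await self._fetch(name)).upper()


async def sync_fetch(name: str) -> str:
    # Async-shaped, but would do blocking I/O and never awaits anything.
    return f"{name}: running"


status = iter_coroutine(Service(sync_fetch).status("dev"))
```

An async client would pass a genuinely asynchronous `fetch` and simply `await` the same `Service.status` call.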

The session still holds the service options, transport, sleep functions, etc., but each separate domain service will own the service constructor. That means the session itself will not know much about the underlying domains, so the set of domains can grow without introducing a centralized registry that knows about all of the different services.
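A small sketch of that ownership split, with assumed names throughout (`Session`, `SandboxService.from_session`, and the placeholder `base_url` are illustrative, not the SDK's real attributes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Holds shared plumbing only; knows nothing about specific domains."""

    token: str
    timeout_s: float = 30.0


class SandboxService:
    # Each service owns its base URL, so one session can serve many origins.
    base_url = "https://sandbox.example.invalid"

    def __init__(self, session: Session) -> None:
        self._session = session

    @classmethod
    def from_session(cls, session: Session) -> "SandboxService":
        # The domain owns its own construction; Session has no registry.
        return cls(session)

    def url_for(self, path: str) -> str:
        return self.base_url + path


service = SandboxService.from_session(Session(token="placeholder"))
url = service.url_for("/sandboxes")
```

Adding another domain would mean adding another service class with its own constructor, with no change to `Session`.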

- `Sandbox.fs.write_bytes`/`Sandbox.fs.write_text`/`Sandbox.fs.batch` for file operations instead of the old `write_files` method.
- `SandboxCommand` -> `Process`. Modeled on stdlib Popen and asyncio.subprocess classes and concepts, but both async and sync runtimes have a unified set of methods and data types. Not 100% compatible, but we've kept it as close to the existing interfaces and affordances as possible without backend changes.

socket-security · 2026-06-05T18:40:58Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	pypi/trio@0.33.0

We don't need the whole Filesystem abstraction, just a callback that should run when the batch scope exits.
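The callback-only design can be sketched as below. `WriteBatch` and `fake_write_files` are hypothetical names, and the choice to discard queued writes when the scope exits with an exception is an assumption of this sketch rather than documented behavior.

```python
import asyncio
from collections.abc import Awaitable, Callable


class WriteBatch:
    """Queue writes and hand them to one callback when the scope exits."""

    def __init__(self, write_files: Callable[[dict[str, bytes]], Awaitable[None]]) -> None:
        self._write_files = write_files
        self._pending: dict[str, bytes] = {}

    def write_text(self, path: str, content: str) -> None:
        self._pending[path] = content.encode("utf-8")

    async def __aenter__(self) -> "WriteBatch":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Assumed policy: flush only on a clean exit.
        if exc_type is None and self._pending:
            await self._write_files(self._pending)


uploaded: dict[str, bytes] = {}


async def fake_write_files(files: dict[str, bytes]) -> None:
    # Stand-in for the bound upload function the batch receives.
    uploaded.update(files)


async def main() -> None:
    async with WriteBatch(fake_write_files) as batch:
        batch.write_text("workspace/a.py", "print('a')")
        batch.write_text("workspace/b.py", "print('b')")


asyncio.run(main())
```

The batch never sees the filesystem object itself, only the single function it needs to call once.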

1st1

Few comments here and there, let's have a quick iteration and then merge

1st1 · 2026-06-09T22:19:11Z

+            collect_output=self._collect_output,
+        )
+
+    async def listdir(


list_dir to align with other methods? Or this is inspired by shutil?

Yeah, this is the name from os but I also don't love it.

1st1 · 2026-06-09T22:20:57Z

+            lambda: service.process_logs_response(session_id=self._session_id, process_id=self.id)
+        )
+
+    async def refresh(self) -> Self:


I think we should add docstrings to all public API methods? What's the purpose of refresh?

Good point, there is too much info locked up in the readme that needs to move into docstrings. Will fix.

To the actual question: the purpose of refresh is to get the latest state from the server, typically for updating things like the status and the current_session_id.

1st1 · 2026-06-09T22:26:11Z

+        kill_after: float | timedelta | None = None,
+        check: bool = False,
+        stdout: TextIO | int | None = None,
+        stderr: TextIO | int | None = None,


This can be a problem -- sometimes process output isn't decodable, especially if a read lands in the middle of a multi-byte UTF-8 sequence or something.

Does the underlying vercel sandbox API operate on text or bytes?

In general, subprocess APIs are typically better with BytesIO for this reason. Users basically can decode on their own (potentially setting the codec to ignore decoding/encoding errors for this reason)

The underlying backend API is text-only via NDJSON lines, so yeah, it can't be bytes unless we just re-encode it into bytes. We can talk to the backend team about having something more low-level if they expose bytes, but as-is, we have text.

This seems like it's worth taking a pause on while we figure out what we want to do here. We can expose a text and byte interface where bytes are actually just re-encoded text for now and then switch that to some kind of mode on the backend side that sends buffers directly as number[]?
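To make the decoding concern concrete, here is a standalone illustration using only the Python standard library. It does not involve the sandbox API at all; the chunk boundaries are made up to show what happens when a multi-byte character is split across reads.

```python
import codecs

# "é" is two bytes in UTF-8; split it across chunks as a pipe might.
chunks = [b"caf\xc3", b"\xa9\n"]

# Decoding each chunk on its own yields replacement characters.
naive = "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)

# An incremental decoder buffers the incomplete sequence between chunks.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
streamed = "".join(decoder.decode(chunk) for chunk in chunks)
streamed += decoder.decode(b"", final=True)

# Re-encoding text as bytes only round-trips output that was valid text.
reencoded = streamed.encode("utf-8")
```

With a text-only backend, any bytes that were not valid text have already been lost or replaced before the client sees them, so a client-side bytes interface built by re-encoding cannot recover them.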

1st1 · 2026-06-09T22:28:25Z

+        self._apply_payload(payload)
+        return self
+
+    async def extend_execution_time_limit(self, duration: DurationInput) -> Self:


This one is weird, just stating for the record.

I assume we do want to let the sandbox manage its timeout via the API for.. reasons, right?

Yeah, this is a pretty well-used API in a lot of integrations where it is basically like an active keep-alive mechanism. Set the max execution time to something reasonable, and if some event happens before then, extend it again.

1st1 · 2026-06-09T22:28:57Z

+        return self
+
+    async def update_network_policy(self, network_policy: JSONValue) -> Self:
+        payload = await self._service.update_runtime_session_network_policy(


What's JSONValue here? I'd rather work with well-defined types than dicts

ahh, this is a vestige of how I built out the subresources, I missed this one. It's actually a pretty rich type, so I'll add the proper class structure here.

1st1 · 2026-06-09T22:30:27Z

+        )
+
+    def batch(self, *, cwd: RemotePath | None = None) -> "SandboxFilesystemBatch":
+        return SandboxFilesystemBatch(write_files=lambda files: self._write_files(files, cwd=cwd))


Maybe rename to .batch_io()?

For now it's write-only. The original API was write_files(files: Sequence[WriteFile]) and this is just a different API over that same underlying function. @elprans can you weigh in here on this design since it's mostly from your feedback? Or maybe a simple rename to batch_write?

Maybe eventually we'll have a combined read-write batch endpoint, but if we had that, I'm not sure how we'd even expose the read results back given this design.

scotttrinh · 2026-06-10T13:02:42Z

Wanted to make a note here that we've decided to remove the logs() command from the process class for now. It has some utility, but it feels like a misfit compared to the other ways of consuming the stdio streams, so we're going to wait until we have a better understanding of the use case and see if there is a more natural API for it.

Removes the `logs()` method in favor of Popen-style `stdout`/`stderr` streams.

scotttrinh · 2026-06-10T21:08:29Z

One more bit of feedback to record here. It doesn't block merging this PR since we can fast follow, but it blocks the promotion of this unstable package to the main namespace:

Follow-up from the sandbox/session API discussion with @elprans

The backend model is:

- A sandbox is the durable identity, configuration, snapshots, and session history. Omitting name generates one.
- A session is one billable execution. Creation always starts an initial session, and a sandbox has at most one active session.
- A snapshot preserves filesystem state between sessions and is used to resume a stopped sandbox.

The current Sandbox.session() API exposes sessions too prominently and suggests users can create independent active sessions. We agreed to keep normal lifecycle management at the sandbox level instead:

- get_sandbox() only fetches state and never resumes.
- resume_sandbox() ensures the sandbox has an active session.
- Awaiting create_sandbox() or resume_sandbox() performs no automatic cleanup.
- Both can be used as context managers, which always stop the active session on exit.
- A create_sandbox() context additionally destroys the sandbox by default.
- create_sandbox(..., destroy=False) stops the session but preserves the sandbox, snapshots, and history.
- Sandbox.stop() explicitly stops the active session.
- Session objects remain available for history, billing, diagnostics, and advanced inspection, but are not required for normal workflows.

Example:

# Stops the session and destroys the sandbox on exit.
async with sandbox.create_sandbox(runtime="python3.13") as box:
    ...

# Stops the session but preserves the sandbox on exit.
async with sandbox.create_sandbox(name="development", destroy=False) as box:
    ...

# Fetches state without resuming or cleanup.
box = await sandbox.get_sandbox(name="development")

# Ensures an active session and stops it on exit.
async with sandbox.resume_sandbox(name="development") as box:
    ...

# Ensures an active session without automatic cleanup.
box = await sandbox.resume_sandbox(name="development")
await box.stop()

Nested or concurrent resume_sandbox() scopes may share the same active session; either scope exiting can stop it. We will document this and correctly surface stopped-session errors rather than adding runtime ownership or lease tracking.
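One way to get an object that supports both `await` and `async with`, with the cleanup rules listed above, is sketched here. `FakeSandbox` and `SandboxOperation` are hypothetical, and this local `create_sandbox` is a stand-in that only records events; it is not the SDK function and makes no network calls.

```python
import asyncio


class FakeSandbox:
    def __init__(self, name: str) -> None:
        self.name = name
        self.events: list[str] = ["session started"]

    async def stop(self) -> None:
        self.events.append("session stopped")

    async def destroy(self) -> None:
        self.events.append("sandbox destroyed")


class SandboxOperation:
    """Awaitable for manual lifecycle, context manager for scoped cleanup."""

    def __init__(self, name: str, *, destroy: bool) -> None:
        self._name = name
        self._destroy = destroy
        self._box: FakeSandbox | None = None

    async def _start(self) -> FakeSandbox:
        self._box = FakeSandbox(self._name)
        return self._box

    def __await__(self):
        # Plain `await`: start the session and perform no cleanup.
        return self._start().__await__()

    async def __aenter__(self) -> FakeSandbox:
        return await self._start()

    async def __aexit__(self, exc_type, exc, tb) -> None:
        assert self._box is not None
        await self._box.stop()  # a context exit always stops the session
        if self._destroy:
            await self._box.destroy()


def create_sandbox(name: str = "generated", *, destroy: bool = True) -> SandboxOperation:
    return SandboxOperation(name, destroy=destroy)


async def main() -> tuple[list[str], list[str], list[str]]:
    async with create_sandbox() as scoped:
        pass
    async with create_sandbox("development", destroy=False) as kept:
        pass
    manual = await create_sandbox("development")
    return scoped.events, kept.events, manual.events


scoped_events, kept_events, manual_events = asyncio.run(main())
```

A `resume_sandbox` equivalent would follow the same shape with `destroy` fixed to false, since its context only stops the session.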

scotttrinh added 13 commits May 27, 2026 15:49

Add unstable SDK session foundation

7330b24

Add Sandbox v2 lifecycle APIs

326e8b9

Add Sandbox runtime workflows

029f260

Add Sandbox snapshot workflows

c2f64f5

Polish unstable Sandbox public surface

c9183a3

Propagate alive tokens to command handles

452d4a9

Call credentials factory directly

36e104d

This doesn't need to be public and we certainly don't need to make a fake private method that calls this public method.

Collapse some of the http helpers

2f0384c

Slightly better factoring here with two low-level helper methods that vary only in streaming vs non-streaming, and then a small JSON-specific layer atop that for the endpoints that need it.

Separate Sync vs Async sessions

45d18d8

Also moves base URL ownership to the services since we might eventually need to support a single service making requests to different origins for common multi-origin flows like OIDC.

Separate wire models from active handles

3057e79

Use typed query object

27ddbbf

We have some complex rules that affect filtering and sorting, so instead of presenting the raw API knobs, this exposes only the legal querying knobs. This is going to be higher maintenance, but it seems worth it for a really nice DX.

scotttrinh temporarily deployed to ci May 28, 2026 18:29 — with GitHub Actions Inactive

Improve command log impl

f4d30cc

- Add caching
- Improve typing of the log stream
- Maintain ordering when getting "both" streams
- Raise on in-band error

scotttrinh temporarily deployed to ci May 28, 2026 19:05 — with GitHub Actions Inactive

vercel Bot deployed to Preview May 28, 2026 19:05 View deployment

vercel Bot deployed to Preview May 28, 2026 20:08 View deployment

scotttrinh temporarily deployed to ci May 28, 2026 20:08 — with GitHub Actions Inactive

scotttrinh temporarily deployed to ci June 2, 2026 14:57 — with GitHub Actions Inactive

fantix approved these changes Jun 2, 2026

vercel Bot deployed to Preview June 2, 2026 20:17 View deployment

scotttrinh temporarily deployed to ci June 2, 2026 20:17 — with GitHub Actions Inactive

scotttrinh temporarily deployed to ci June 5, 2026 18:40 — with GitHub Actions Inactive

vercel Bot deployed to Preview June 5, 2026 18:40 View deployment

scotttrinh added 3 commits June 5, 2026 17:32

Only pass the bound write_files for batch

09a5dd8

We don't need the whole Filesystem abstraction, just a callback that should run when the batch scope exits.

Make the batch state transitions more explicit

53676b7

Move shared batch functionality

1838649

1st1 approved these changes Jun 9, 2026

scotttrinh added 2 commits June 10, 2026 11:35

wip Docstrings

db63978

Move log streaming to create_process

80b7095

Removes the `logs()` method in favor of Popen-style `stdout`/`stderr` streams.

scotttrinh added 3 commits June 10, 2026 18:40

Finish improving the docstrings

8dd077d

Add structured NetworkPolicy classes

3ae0b47

Center sandbox lifecycle on handles

fa3d4d6

scotttrinh mentioned this pull request Jun 16, 2026

Add vercel.unstable and begin Sandbox V2 #123

Closed


4 participants
