fix(worker): preserve explicit option defaults#1798
fix(worker): preserve explicit option defaults#1798rosetta-livekit-bot[bot] wants to merge 1 commit into
Conversation
🦋 Changeset detectedLatest commit: 106e78b The changes in this PR will be included in the next version bump. This PR includes changesets to release 35 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
| this.apiSecret = apiSecret; | ||
| this.workerToken = workerToken; | ||
| this.host = host; | ||
| this.port = port || Default.port(production); | ||
| this.port = port ?? Default.port(production); |
There was a problem hiding this comment.
🔴 CLI production mode gets dev defaults for numIdleProcesses and port due to || → ?? change
When a user runs their agent with cli.runApp(new ServerOptions({ agent: '...' })) and uses the start command (production mode), the CLI re-constructs ServerOptions at agents/src/cli.ts:31-32 by spreading the already-resolved options and overriding production to true. The development-mode defaults numIdleProcesses=0 and port=0 (resolved during the first construction with production=false) are now non-nullish values, so ?? preserves them instead of re-evaluating the production defaults.
Trace of the regression
- User code:
new ServerOptions({ agent: '...' })→production=false→numIdleProcesses = undefined ?? Default.numIdleProcesses(false) = 0,port = undefined ?? Default.port(false) = 0 - CLI start:
runServer({ opts, production: true })→ atcli.ts:32:new ServerOptions({ production: true, ...opts })where opts hasnumIdleProcesses: 0, port: 0 - Second construction:
0 ?? Default.numIdleProcesses(true) = 0(wrong, should bemin(parallelism, 4)),0 ?? Default.port(true) = 0(wrong, should be8081) - Old
||behavior:0 || Default.numIdleProcesses(true) = 4,0 || Default.port(true) = 8081— worked by accident because0is falsy
This means all production deployments via the standard CLI path get no pre-warmed processes (numIdleProcesses=0, causing latency for every job) and a random health-check port (port=0, breaking production health probes that expect port 8081).
(Refers to lines 222-246)
Prompt for agents
The root cause is that the CLI re-constructs ServerOptions with a changed `production` flag by spreading the already-resolved instance (cli.ts:31-32), making it impossible to distinguish 'user explicitly passed 0' from 'resolved to 0 from dev default'. The fix from || to ?? is correct for preserving explicit values, but needs a mechanism to handle the re-construction case.
Possible approaches:
1. Track which fields were explicitly set vs defaulted (e.g. store the raw constructor args, or use sentinel values/symbols for 'not explicitly provided').
2. In the CLI re-construction path (cli.ts:31-32), strip out fields that equal their dev-mode defaults before spreading, so the new constructor re-evaluates them with the correct production flag.
3. Instead of re-constructing ServerOptions in cli.ts, mutate the production field on the existing instance and have a method to re-evaluate production-dependent defaults.
4. Use undefined as the representation for 'use the default' and only resolve to concrete values lazily (e.g. in AgentServer constructor), keeping ServerOptions fields as `number | undefined`.
Files involved: agents/src/worker.ts (ServerOptions constructor, lines 222-246), agents/src/cli.ts (runServer function, lines 30-32).
Was this helpful? React with 👍 or 👎 to provide feedback.
| this.loadFunc = loadFunc; | ||
| this.loadThreshold = loadThreshold || Default.loadThreshold(production); | ||
| this.numIdleProcesses = numIdleProcesses || Default.numIdleProcesses(production); | ||
| this.loadThreshold = loadThreshold ?? Default.loadThreshold(production); |
There was a problem hiding this comment.
🚩 Pre-existing bug: loadThreshold also not re-evaluated for production in CLI path
The loadThreshold field has the same re-construction problem as numIdleProcesses and port, but it was already broken before this PR. The dev default is Infinity (truthy), so even the old || code would keep Infinity when the CLI switches to production: true (since Infinity || 0.7 = Infinity). This is partially mitigated for Cloud deployments by the guard at agents/src/worker.ts:322-328 in the AgentServer constructor which forces the production default when workerToken is set. However, self-hosted production deployments (no workerToken) would still get loadThreshold = Infinity, meaning the worker never reports as full. This is pre-existing and not introduced by this PR.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
ServerOptionsdefaults.@livekit/agents.Testing
pnpm build:agentsPorted from livekit/agents#6121
Original PR description
This PR fixes
update_optionsto useNotGivenOr[float]defaults for timeouts, ensuring they are not reset to default values if not explicitly specified. It also adds the missing assignment forinitialize_process_timeoutin theupdate_optionsfunction body.