Skip to content

avoid quadratic multiline-string context scan in is_line_short_enough#5188

Open
sahvx655-wq wants to merge 1 commit into
psf:mainfrom
sahvx655-wq:is-line-short-enough-quadratic
Open

avoid quadratic multiline-string context scan in is_line_short_enough#5188
sahvx655-wq wants to merge 1 commit into
psf:mainfrom
sahvx655-wq:is-line-short-enough-quadratic

Conversation

@sahvx655-wq

Copy link
Copy Markdown
Contributor

Description

While profiling Black on a generated config module I noticed formatting hung for several seconds on a single dict literal whose values were all triple-quoted strings. A flame graph put ~80% of the wall time in is_line_short_enough, specifically the while str(ctx) in line_str loop that walks the multiline string up through its enclosing AST nodes. Each step renders the whole ancestor subtree to a string and substring-searches the line, so when the topmost ancestor is the entire collection that render is O(N), and because is_line_short_enough runs once per candidate split line during line generation the whole pass is O(N²). A dict with 1600 triple-quoted values took ~2.2s; the cost grew quadratically with the number of members.

The walk only needs the ancestors that are wholly present on the current line, and since a line is a contiguous run of leaves an ancestor qualifies exactly when its first and last leaves both belong to the line. Checking two leaf ids against a set built once is O(depth) per step and bails out immediately for ancestors that aren't fully on the line, so the render-and-search disappears and the pass becomes linear (the 1600-value case drops to ~0.23s). Keeping the check here rather than at the call sites means every caller of is_line_short_enough benefits without having to special-case multiline strings. Output is unchanged: byte-identical across the test suite, the Black and stdlib sources in stable/preview, and several thousand fuzzed multiline-string structures.

Checklist - did you ...

  • Implement any code style changes under the --preview style, following the stability policy?
  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • Add new / update outdated documentation?

@github-actions

Copy link
Copy Markdown
Contributor

diff-shades results comparing this PR (33fb575) to main (b74b230):

--preview style: no changes

--stable style: no changes


What is this? | Workflow run | diff-shades documentation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant