perf: reduce allocations on Option/Result/Either hot paths #110

Merged
samber merged 8 commits into master from perf/reduce-hot-path-allocations
Jun 10, 2026
samber commented Jun 10, 2026 (Owner)

Summary

Reduces allocations and CPU time on serialization and comparison hot paths, verified with a new benchmark suite (in bench/):

  • Option.UnmarshalJSON: detect null with a case-insensitive byte comparison (bytes.EqualFold) instead of first copying the payload into a lowercased buffer with bytes.ToLower.
  • Option.MarshalJSON: fast path for None — return the constant null bytes without invoking json.Marshal.
  • Result JSON marshaling: marshal via dedicated structs instead of map[string]any, eliminating map allocation and interface boxing.
  • Option/Either/Either3-5 MarshalBinary: write the discriminant byte into the bytes.Buffer up front so the gob payload no longer needs to be copied with append afterwards.
  • Option.Equal: fast path for comparable scalar kinds before falling back to reflect.DeepEqual.

Benchmarks

benchstat comparing master vs this branch (-count=10, linux/amd64, Intel Xeon @ 2.80GHz):

                                      │     master     │            this branch               │
                                      │     sec/op     │    sec/op     vs base                │
EitherBinary/MarshalBinary-4              969.9n ±  5%   899.8n ±  8%   -7.24% (p=0.007 n=10)
EitherBinary/UnmarshalBinary-4            1.307µ ± 26%   1.229µ ±  2%   -6.01% (p=0.000 n=10)
EitherNBinary/Either3/MarshalBinary-4     980.4n ±  2%   880.5n ±  5%  -10.19% (p=0.000 n=10)
EitherNBinary/Either4/MarshalBinary-4    1020.5n ±  4%   917.9n ±  4%  -10.05% (p=0.000 n=10)
EitherNBinary/Either5/MarshalBinary-4     982.2n ±  5%   984.2n ± 10%        ~ (p=0.363 n=10)
OptionMarshalJSON/Some-4                  184.1n ±  1%   186.8n ± 10%        ~ (p=0.404 n=10)
OptionMarshalJSON/None-4                  69.27n ±  5%   14.26n ±  3%  -79.41% (p=0.000 n=10)
OptionUnmarshalJSON/SmallValue-4          341.0n ±  3%   275.3n ±  5%  -19.27% (p=0.000 n=10)
OptionUnmarshalJSON/Null-4               40.695n ±  3%   8.674n ±  2%  -78.69% (p=0.000 n=10)
OptionUnmarshalJSON/LargeValue-4          7.205µ ±  1%   5.794µ ±  5%  -19.59% (p=0.000 n=10)
OptionBinary/MarshalBinary-4              997.2n ±  3%   969.5n ±  2%   -2.78% (p=0.001 n=10)
OptionBinary/UnmarshalBinary-4            1.268µ ±  4%   1.283µ ±  8%        ~ (p=0.616 n=10)
OptionEqual/Int-4                        32.695n ±  3%   9.242n ±  1%  -71.73% (p=0.000 n=10)
OptionEqual/String-4                      97.25n ±  7%   11.94n ±  1%  -87.72% (p=0.000 n=10)
OptionEqual/Struct-4                      142.4n ±  2%   149.1n ±  2%   +4.67% (p=0.000 n=10)
ResultMarshalJSON/Ok-4                    701.6n ±  5%   233.2n ±  9%  -66.75% (p=0.000 n=10)
ResultMarshalJSON/Err-4                  1249.0n ±  2%   257.3n ±  7%  -79.40% (p=0.000 n=10)
geomean                                   437.8n         244.5n        -44.16%

Allocation highlights:

                                      │   master    │            this branch                  │
                                      │ allocs/op   │ allocs/op   vs base                     │
OptionUnmarshalJSON/Null-4                1.000 ± 0%     0.000 ± 0%  -100.00% (p=0.000 n=10)
OptionUnmarshalJSON/SmallValue-4          3.000 ± 0%     2.000 ± 0%   -33.33% (p=0.000 n=10)
OptionEqual/String-4                      2.000 ± 0%     0.000 ± 0%  -100.00% (p=0.000 n=10)
OptionBinary/MarshalBinary-4              15.00 ± 0%     13.00 ± 0%   -13.33% (p=0.000 n=10)
EitherBinary/MarshalBinary-4              13.00 ± 0%     11.00 ± 0%   -15.38% (p=0.000 n=10)
ResultMarshalJSON/Ok-4                    7.000 ± 0%     2.000 ± 0%   -71.43% (p=0.000 n=10)
ResultMarshalJSON/Err-4                  12.000 ± 0%     2.000 ± 0%   -83.33% (p=0.000 n=10)
ResultMarshalJSON/Ok B/op                 456 B → 40 B   -91.23%
ResultMarshalJSON/Err B/op                880 B → 48 B   -94.55%

Note: OptionEqual/Struct shows a small +4.7% regression from the kind check before the reflect.DeepEqual fallback; the scalar fast paths (-72% to -88%) seemed worth the trade-off.

Wire formats (JSON and binary) are unchanged — all existing marshal/unmarshal round-trip tests pass, and the full suite is green with -race.

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg


Generated by Claude Code

claude added 7 commits June 10, 2026 17:30
Cover Option, Result, Either/Either3-5, Future, Task, IO, State and Do:
constructors, accessors/transformations, JSON and binary codecs, SQL Scan.
Benchmarks use classic b.N loops to stay compatible with go1.18 CI builds.

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
bytes.ToLower allocates a full copy of the input on every call, even for
large value payloads that are only compared against "null". bytes.EqualFold
performs the same case-insensitive comparison without allocating.

benchstat (go1.24, count=6):
                                       │ before  │ after                 │
OptionUnmarshalJSON/Null-4   ns/op       29.91          8.09  -72.93%
OptionUnmarshalJSON/Null-4   allocs/op    1             0    -100.00%
OptionUnmarshalJSON/Small-4  B/op       176           160      -9.09%
OptionUnmarshalJSON/Large-4  B/op      2320          1168     -49.66%
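
As a rough illustration of the change described in this commit message (a sketch, not the library's actual code), the two null checks can be contrasted as follows. The helper names isJSONNull and isJSONNullLower are hypothetical and exist only for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// isJSONNull (hypothetical helper) reports whether data is the JSON null
// literal, ignoring case. bytes.EqualFold compares the bytes in place, so
// the payload itself is never copied.
func isJSONNull(data []byte) bool {
	return bytes.EqualFold(data, []byte("null"))
}

// isJSONNullLower (hypothetical helper) shows the older style: bytes.ToLower
// builds a lowercased copy of the whole payload before the comparison.
func isJSONNullLower(data []byte) bool {
	return bytes.Equal(bytes.ToLower(data), []byte("null"))
}

func main() {
	for _, in := range []string{"null", "NULL", `"null"`, "42"} {
		fmt.Println(in, isJSONNull([]byte(in)), isJSONNullLower([]byte(in)))
	}
}
```

Both helpers agree on every input; the difference is only that the second one pays for a copy proportional to the payload size, which matches the B/op drop reported for large values.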

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
json.Marshal(nil) runs the full encoder machinery (encoder cache lookup,
encode state) just to produce the constant "null". Return the literal
directly. A fresh slice is returned on each call since callers may mutate
the result.

benchstat (go1.24, count=6):
                                  │ before │ after            │
OptionMarshalJSON/None-4  ns/op     55.39    12.00   -78.34%
OptionMarshalJSON/None-4  B/op       8        4      -50.00%

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
Encoding a map[string]any allocates the map, boxes every value in an
interface and sorts keys inside the encoder. Anonymous structs with json
tags produce the exact same bytes ({"result":...} / {"error":{"message":...}})
without any of that work.

benchstat (go1.24, count=6):
                              │ before │ after            │
ResultMarshalJSON/Ok-4   ns/op  576.6    175.2    -69.63%
ResultMarshalJSON/Ok-4   B/op   456       40      -91.23%
ResultMarshalJSON/Ok-4   allocs   7        2      -71.43%
ResultMarshalJSON/Err-4  ns/op 1037.0    203.4    -80.39%
ResultMarshalJSON/Err-4  B/op   880       48      -94.55%
ResultMarshalJSON/Err-4  allocs  12        2      -83.33%

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
MarshalBinary used to prepend the discriminator byte with
append([]byte{tag}, buf.Bytes()...), copying the whole gob payload into a
second slice. Writing the tag byte into the buffer before encoding removes
that copy and its allocation; the cost scales with payload size.

benchstat (go1.24, count=6, small payloads):
                                     │ before │ after            │
OptionBinary/MarshalBinary-4   ns/op  851.7    808.5     -5.08%
OptionBinary/MarshalBinary-4   allocs  15       13      -13.33%
EitherBinary/MarshalBinary-4   ns/op  812.0    748.4     -7.83%
EitherBinary/MarshalBinary-4   allocs  13       11      -15.38%
Either3/4/5 MarshalBinary      allocs  13       11      -15.38%

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
reflect.DeepEqual is 50-200x slower than a typed comparison. For scalar
kinds (bool, ints, uints, floats, complex, string) == is strictly
equivalent to DeepEqual, so compare through interface equality instead.
Pointers, structs, arrays, maps, slices and interfaces keep DeepEqual
semantics (DeepEqual follows pointers, == does not).

benchstat (go1.24, count=6):
                          │ before │ after             │
OptionEqual/Int-4    ns/op  26.41     6.24     -76.36%
OptionEqual/String-4 ns/op  77.49     8.18     -89.44%
OptionEqual/String-4 allocs  2        0       -100.00%
OptionEqual/Struct-4 ns/op  117.6   119.8   ~ (p=0.167)
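
A sketch of the scalar fast path, under the assumption that the kind is taken from the static type T. How option.go actually obtains the kind is not shown in this PR text, and the names isScalarKind and valuesEqual are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// isScalarKind (hypothetical helper) reports whether k is one of the kinds
// the commit message lists as safe to compare with ==.
func isScalarKind(k reflect.Kind) bool {
	switch k {
	case reflect.Bool,
		reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
		reflect.Float32, reflect.Float64,
		reflect.Complex64, reflect.Complex128,
		reflect.String:
		return true
	}
	return false
}

// valuesEqual (hypothetical helper) compares scalars through interface
// equality and falls back to reflect.DeepEqual for everything else.
func valuesEqual[T any](a, b T) bool {
	// Use the static type T, so a nil interface value cannot cause a panic.
	if isScalarKind(reflect.TypeOf((*T)(nil)).Elem().Kind()) {
		// Both operands have dynamic type T, so this is a plain == on T.
		return any(a) == any(b)
	}
	return reflect.DeepEqual(a, b)
}

func main() {
	x, y := 1, 1
	fmt.Println(valuesEqual(1, 1), valuesEqual("a", "b"))
	// Pointers keep DeepEqual semantics: distinct pointers to equal values match.
	fmt.Println(valuesEqual(&x, &y))
	fmt.Println(valuesEqual([]int{1, 2}, []int{1, 2}))
}
```

Pointers are deliberately left out of the fast path: == on two distinct pointers would report false even when the pointed-to values are equal, which would change the result compared with DeepEqual.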

https://claude.ai/code/session_01CkPtmgw2vWn8vUJm1JERhg
Copilot AI review requested due to automatic review settings June 10, 2026 22:17

Copilot AI left a comment

Pull request overview

This PR optimizes performance-critical paths in Option, Result, and Either by reducing allocations in JSON/binary serialization and speeding up Option.Equal, and adds a new benchmark suite under bench/ to measure the impact.

Changes:

  • Optimize Result.MarshalJSON to marshal via dedicated structs instead of map[string]any.
  • Optimize Option JSON handling (MarshalJSON fast-path for None, UnmarshalJSON null detection via bytes.EqualFold) and add a scalar-kind fast path for Option.Equal.
  • Reduce allocations in Option/Either/Either3-5 MarshalBinary by writing the discriminant byte directly into the bytes.Buffer before gob encoding; add benchmarks covering these hot paths.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| result.go | Switch Result.MarshalJSON from map-based encoding to struct-based encoding to reduce allocations. |
| option.go | Add None JSON fast-path, allocation-reducing null check in UnmarshalJSON, discriminant write optimization for MarshalBinary, and scalar fast-path in Equal. |
| either.go | Write discriminant byte directly into buffer before gob payload to avoid post-encode copy. |
| either3.go | Write argId discriminant into buffer up front in MarshalBinary to avoid append copy. |
| either4.go | Same discriminant-write optimization for Either4.MarshalBinary. |
| either5.go | Same discriminant-write optimization for Either5.MarshalBinary. |
| bench/result_bench_test.go | Add benchmarks for Result constructors/accessors and JSON marshal/unmarshal. |
| bench/option_bench_test.go | Add benchmarks for Option constructors/accessors, JSON marshal/unmarshal, binary marshal/unmarshal, IsZero, Equal, and Scan. |
| bench/either_bench_test.go | Add benchmarks for Either constructors/accessors and binary marshal/unmarshal (including Either3-5 marshal). |
| bench/io_state_bench_test.go | Add benchmarks for IO, State, and Do hot paths. |
| bench/future_bench_test.go | Add benchmarks for Future/Task creation and chaining/collection. |

codecov Bot commented Jun 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.33%. Comparing base (35a143a) to head (e520cec).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #110      +/-   ##
==========================================
+ Coverage   85.21%   85.33%   +0.12%     
==========================================
  Files          28       28              
  Lines        2192     2210      +18     
==========================================
+ Hits         1868     1886      +18     
  Misses        271      271              
  Partials       53       53              
Flag Coverage Δ
unittests 85.33% <100.00%> (+0.12%) ⬆️

samber merged commit aab5427 into master Jun 10, 2026
19 checks passed
samber deleted the perf/reduce-hot-path-allocations branch June 10, 2026 22:24