optimize the fixed32 and fixed64 encoding #772

Open
wants to merge 1 commit into
base: master
from

@tardyp tardyp commented Aug 21, 2025

The current implementation of write_raw_little_endian32 simply calls self.write_raw_bytes.

Because this call crosses a function boundary, the compiler is not smart enough to simplify the data copy and ends up calling memcpy.
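
The difference between the two approaches can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `Buf` type and the `write_fixed32_via_bytes` / `write_fixed32_inlined` names are hypothetical, and `#[inline(never)]` is used only to mimic a call that the optimizer cannot see through.

```rust
/// Hypothetical stand-in for an output stream's internal buffer.
struct Buf {
    data: Vec<u8>,
    pos: usize,
}

impl Buf {
    fn new(cap: usize) -> Self {
        Buf { data: vec![0; cap], pos: 0 }
    }

    /// Generic path: copies a slice whose length is only known at run time.
    /// `inline(never)` mimics a call that crosses a function boundary.
    #[inline(never)]
    fn write_raw_bytes(&mut self, bytes: &[u8]) {
        let end = self.pos + bytes.len();
        self.data[self.pos..end].copy_from_slice(bytes);
        self.pos = end;
    }

    /// "Before" shape: delegate to the generic slice copy.
    fn write_fixed32_via_bytes(&mut self, v: u32) {
        self.write_raw_bytes(&v.to_le_bytes());
    }

    /// "After" shape: the length (4) is a compile-time constant here,
    /// so the copy can typically be lowered to a single 32-bit store.
    #[inline]
    fn write_fixed32_inlined(&mut self, v: u32) {
        let end = self.pos + 4;
        self.data[self.pos..end].copy_from_slice(&v.to_le_bytes());
        self.pos = end;
    }
}
```

Both methods produce the same bytes; the only difference is whether the compiler knows the copy length at the call site.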

This inlined implementation is roughly twice as fast as the previous one, according to the attached bench. Before the change:

test bench_write_raw_fixed32   ... bench:       5,860.52 ns/iter (+/- 62.51)
test bench_write_raw_fixed64   ... bench:       4,303.08 ns/iter (+/- 29.47)
test bench_write_raw_varint_32 ... bench:       2,555.20 ns/iter (+/- 126.98)

With the change, fixed32/64 is, as expected, a bit faster than varint:

test bench_write_raw_fixed32   ... bench:       2,143.48 ns/iter (+/- 21.70)
test bench_write_raw_fixed64   ... bench:       2,209.28 ns/iter (+/- 32.42)
test bench_write_raw_varint_32 ... bench:       2,480.89 ns/iter (+/- 88.95)
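
One way to see why fixed-width encoding is expected to be at least as cheap as varint: a varint write needs a data-dependent loop, while a fixed32 write is a single constant-size copy. The sketch below is illustrative only; the free functions `write_varint32` and `write_fixed32` are hypothetical helpers operating on a `Vec<u8>`, not the functions measured in the bench above.

```rust
/// Appends `v` as a protobuf base-128 varint; returns the bytes written (1 to 5).
fn write_varint32(out: &mut Vec<u8>, mut v: u32) -> usize {
    let mut n = 1;
    // Emit 7 bits at a time, setting the continuation bit while more remain.
    while v >= 0x80 {
        out.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
        n += 1;
    }
    out.push(v as u8);
    n
}

/// Appends `v` as 4 little-endian bytes; always returns 4.
fn write_fixed32(out: &mut Vec<u8>, v: u32) -> usize {
    out.extend_from_slice(&v.to_le_bytes());
    4
}
```

The varint path branches once per output byte, whereas the fixed path has no value-dependent control flow.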

tardyp commented Aug 22, 2025

I think the CI failure is a flaky test. Let me know if you disagree.
