Make DCE mandatory and run it much earlier (even per-crate, shrinking .rlibs). #350

Merged
merged 4 commits into from
Jul 26, 2025

eddyb
Collaborator

@eddyb eddyb commented Jul 25, 2025

The DCE ("dead code elimination", i.e. removing definitions not reachable from e.g. shader entry-points) pass was, for one reason or another, toggleable, and supposedly shaders should've still compiled without it.

However, this only worked because the "zombie" (deferred error) reporting pass ignored anything that DCE would've removed, and itself removed the subset of those definitions that also had "zombie" decorations.

Additionally, I believe that the concept of "optional DCE" has been outdated for a while, because of:

  • non-enumerable SPIR-T interned (Type, Const) / module-scoped (GlobalVar, Func) handles
    • that is, you can't even observe a FooDef without finding a Foo handle elsewhere
    • everything has to be reached from spirt::Module exports (linker ones + shader entry-points)
      (the DCE pass has a similar definition of "roots", even if rspirv makes it more ad-hoc)
    • even if lowering a SPIR-V module into SPIR-T may allocate all original definitions,
      all unreachable ones are lost (though returning some kind of Map<Id, _> might be nice)
  • moving away from the old "zombie" (deferred error) system, and favoring more legalization

So this PR removes the --no-dce flag, applies DCE in a few more places, and simplifies the early "zombie" reporting pass (which no longer has any reason to mutate the SPIR-V module itself).
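To make the "roots" idea above concrete, here is a minimal sketch of reachability-based DCE followed by a read-only zombie report. It is a toy model, not the PR's implementation: `Def`, `Module`, `dce`, `report_zombies` and `example_module` are hypothetical names invented for illustration, and none of them correspond to rspirv or SPIR-T APIs.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a module-level definition (function, global, ...).
#[derive(Debug, Clone)]
struct Def {
    /// IDs of the other definitions this one refers to.
    refs: Vec<u32>,
    /// Deferred ("zombie") error attached to this definition, if any.
    zombie: Option<String>,
}

/// Toy module: definitions keyed by ID, plus the roots that keep them alive
/// (standing in for linker exports and shader entry-points).
#[derive(Debug, Default)]
struct Module {
    defs: BTreeMap<u32, Def>,
    roots: Vec<u32>,
}

/// Mark everything reachable from the roots, then drop everything else.
fn dce(module: &mut Module) {
    let mut live = BTreeSet::new();
    let mut worklist = module.roots.clone();
    while let Some(id) = worklist.pop() {
        if !live.insert(id) {
            continue; // already visited
        }
        // References to IDs that are never defined are simply skipped.
        if let Some(def) = module.defs.get(&id) {
            worklist.extend(def.refs.iter().copied());
        }
    }
    module.defs.retain(|id, _| live.contains(id));
}

/// Read-only reporting: once DCE has run, every zombie still present is
/// reachable, so there is nothing left to filter out or remove here.
fn report_zombies(module: &Module) -> Vec<String> {
    module
        .defs
        .iter()
        .filter_map(|(id, def)| def.zombie.as_ref().map(|msg| format!("%{id}: {msg}")))
        .collect()
}

/// Small example: %1 is a root using %2; %3 is an unreachable zombie using %4.
fn example_module() -> Module {
    let def = |refs: &[u32], zombie: Option<&str>| Def {
        refs: refs.to_vec(),
        zombie: zombie.map(str::to_owned),
    };
    let mut module = Module::default();
    module.defs.insert(1, def(&[2], None));
    module.defs.insert(2, def(&[], None));
    module.defs.insert(3, def(&[4], Some("unsupported op")));
    module.defs.insert(4, def(&[], None));
    module.roots.push(1);
    module
}

fn main() {
    let mut module = example_module();
    dce(&mut module);
    println!("surviving IDs: {:?}", module.defs.keys().collect::<Vec<_>>());
    println!("zombie errors: {:?}", report_zombies(&module));
}
```

In this model the unreachable zombie never produces an error because it is gone before reporting runs, which is why the reporting pass can take the module by shared reference. For a per-crate use, the assumption would be that `roots` also contains every linker export, so only definitions that nothing exported can reach are dropped.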

A niche (but welcome) benefit of running DCE really early is that --dump-post-merge can now actually dump .spirt/.spirt.html as well (instead of only .spv): the pre-DCE SPIR-V module is really malformed (instructions referring to IDs never defined anywhere), and DCE is needed to clean that up.


However, the most exciting new use of DCE being added (in the last commit) is in the per-crate "optimize" stage (just before saving SPIR-V modules on disk for e.g. .rlibs and the final linking step).

We've been neglecting it because we're effectively stuck doing something like LTO, but there's no reason to not optimize CGUs (or, well, crates, because we run with 1 CGU for now - another thing I want to fix), and DCE is a pretty safe way to start (but we could also consider e.g. the mem2reg->inline->mem2reg sequence).

And it turns out we can save roughly 1.3 MiB of SPIR-V, about 5.5% (that shader linking will never see again after this PR), from the minimal dependency set (core, compiler-builtins, glam, libm, num-traits, spirv-std):

Exact numbers and methodology:

Before

$ cargo compiletest --target-env spv1.3
$ echo target/compiletest-deps/spirv-unknown-spv1.3/debug/deps/*.rlib | xargs -n1 llvm-ar tv | rg '\.o$' | sort -n -k 3
--------- 0/0     48 Jan  1 02:00 1970 bitflags-3dcf3d9be677438b.bitflags.a47078eccd6f1016-cgu.0.rcgu.o
--------- 0/0     48 Jan  1 02:00 1970 compiletests_deps_helper-32dfa6c895360273.cetel80wcnyywktmqqi32z0t9.1c6l5wu.rcgu.o
--------- 0/0  10900 Jan  1 02:00 1970 spirv_std_types-688f22e3cf3995e3.0h0l2egaag7apdniux376o36z.09692e3.rcgu.o
--------- 0/0 1020660 Jan  1 02:00 1970 spirv_std-3fa5eb13241d7c30.9o2xt2nbmavkkxywdyb4v65wl.13t9lsz.rcgu.o
--------- 0/0 2416856 Jan  1 02:00 1970 num_traits-aa5c62cc8821fd9c.num_traits.92968d3e04044d3b-cgu.0.rcgu.o
--------- 0/0 2921312 Jan  1 02:00 1970 libm-4ae827bf8b5c3151.libm.30b39fea0ef0ed48-cgu.0.rcgu.o
--------- 0/0 3732808 Jan  1 02:00 1970 glam-ce3b52c9fe00e7e0.glam.158081abeaed99-cgu.0.rcgu.o
--------- 0/0 5711704 Jan  1 02:00 1970 compiler_builtins-b469d9463e1e9364.compiler_builtins.93b624f8cc19b5ca-cgu.0.rcgu.o
--------- 0/0 9499536 Jan  1 02:00 1970 core-0d15d1819711cec1.core.d3a3ec86f32056-cgu.0.rcgu.o
$ cargo run -p example-runner-wgpu --release
$ echo target/spirv-builder/spirv-unknown-vulkan1.1/release/deps/*.rlib | xargs -n1 llvm-ar tv | rg '\.o$' | sort -n -k 3 | rg -v shared-
--------- 0/0     88 Jan  1 02:00 1970 bitflags-0de303e2ee1c80c4.bitflags.ca8277fe98621545-cgu.0.rcgu.o
--------- 0/0     88 Jan  1 02:00 1970 bytemuck-e70aaa40b159da49.bytemuck.433330b546080123-cgu.0.rcgu.o
--------- 0/0  10812 Jan  1 02:00 1970 spirv_std_types-531bcff05bb52ca4.2kp5062ggpggjb8a9bw5p13g7.1i1tktr.rcgu.o
--------- 0/0 1017296 Jan  1 02:00 1970 spirv_std-acd900a964871511.5fh9vgyaa0jioh67zv8mt54km.14k8tk4.rcgu.o
--------- 0/0 2405552 Jan  1 02:00 1970 num_traits-cb12f45b10b98d97.num_traits.e2bc6a17b294da77-cgu.0.rcgu.o
--------- 0/0 2943520 Jan  1 02:00 1970 libm-f828430d7c4ee7de.libm.646417dd18176353-cgu.0.rcgu.o
--------- 0/0 3659432 Jan  1 02:00 1970 glam-63f13bd94f2750aa.glam.852b52b1c1856f37-cgu.0.rcgu.o
--------- 0/0 5396468 Jan  1 02:00 1970 compiler_builtins-2138275e1f227a29.compiler_builtins.b8f71e5756201651-cgu.0.rcgu.o
--------- 0/0 9258412 Jan  1 02:00 1970 core-55cff8dca7a6fe24.core.6da4bd542a2a13d-cgu.0.rcgu.o

After

$ cargo compiletest --target-env spv1.3
$ echo target/compiletest-deps/spirv-unknown-spv1.3/debug/deps/*.rlib | xargs -n1 llvm-ar tv | rg '\.o$' | sort -n -k 3
--------- 0/0     48 Jan  1 02:00 1970 bitflags-3dcf3d9be677438b.bitflags.a47078eccd6f1016-cgu.0.rcgu.o
--------- 0/0     48 Jan  1 02:00 1970 compiletests_deps_helper-32dfa6c895360273.cetel80wcnyywktmqqi32z0t9.1kwcmt2.rcgu.o
--------- 0/0  10900 Jan  1 02:00 1970 spirv_std_types-688f22e3cf3995e3.0h0l2egaag7apdniux376o36z.1ldvf4p.rcgu.o
--------- 0/0 850820 Jan  1 02:00 1970 spirv_std-3fa5eb13241d7c30.9o2xt2nbmavkkxywdyb4v65wl.0wbp380.rcgu.o
--------- 0/0 2239204 Jan  1 02:00 1970 num_traits-aa5c62cc8821fd9c.num_traits.92968d3e04044d3b-cgu.0.rcgu.o
--------- 0/0 2854988 Jan  1 02:00 1970 libm-4ae827bf8b5c3151.libm.30b39fea0ef0ed48-cgu.0.rcgu.o
--------- 0/0 3534688 Jan  1 02:00 1970 glam-ce3b52c9fe00e7e0.glam.158081abeaed99-cgu.0.rcgu.o
--------- 0/0 5507400 Jan  1 02:00 1970 compiler_builtins-b469d9463e1e9364.compiler_builtins.93b624f8cc19b5ca-cgu.0.rcgu.o
--------- 0/0 8925012 Jan  1 02:00 1970 core-0d15d1819711cec1.core.d3a3ec86f32056-cgu.0.rcgu.o
$ cargo run -p example-runner-wgpu --release
$ echo target/spirv-builder/spirv-unknown-vulkan1.1/release/deps/*.rlib | xargs -n1 llvm-ar tv | rg '\.o$' | sort -n -k 3 | rg -v shared-
--------- 0/0     88 Jan  1 02:00 1970 bitflags-0de303e2ee1c80c4.bitflags.ca8277fe98621545-cgu.0.rcgu.o
--------- 0/0     88 Jan  1 02:00 1970 bytemuck-e70aaa40b159da49.bytemuck.433330b546080123-cgu.0.rcgu.o
--------- 0/0  10812 Jan  1 02:00 1970 spirv_std_types-531bcff05bb52ca4.2kp5062ggpggjb8a9bw5p13g7.0jr4t2s.rcgu.o
--------- 0/0 847580 Jan  1 02:00 1970 spirv_std-acd900a964871511.5fh9vgyaa0jioh67zv8mt54km.1b0oz1f.rcgu.o
--------- 0/0 2228172 Jan  1 02:00 1970 num_traits-cb12f45b10b98d97.num_traits.e2bc6a17b294da77-cgu.0.rcgu.o
--------- 0/0 2876984 Jan  1 02:00 1970 libm-f828430d7c4ee7de.libm.646417dd18176353-cgu.0.rcgu.o
--------- 0/0 3459556 Jan  1 02:00 1970 glam-63f13bd94f2750aa.glam.852b52b1c1856f37-cgu.0.rcgu.o
--------- 0/0 5195832 Jan  1 02:00 1970 compiler_builtins-2138275e1f227a29.compiler_builtins.b8f71e5756201651-cgu.0.rcgu.o
--------- 0/0 8695756 Jan  1 02:00 1970 core-55cff8dca7a6fe24.core.6da4bd542a2a13d-cgu.0.rcgu.o

Debug (via cargo compiletest)

                             Before       After        Change
Total                        24.14 MiB    22.81 MiB    -1.33 MiB   (-5.49%)
Largest (core)                9.06 MiB     8.51 MiB    -0.55 MiB   (-6.05%)
Most impacted (spirv-std)     0.97 MiB     0.81 MiB    -0.16 MiB  (-16.64%)

Release (via example-runner-wgpu)

                             Before       After        Change
Total                        23.55 MiB    22.23 MiB    -1.31 MiB   (-5.58%)
Largest (core)                8.83 MiB     8.29 MiB    -0.54 MiB   (-6.08%)
Most impacted (spirv-std)     0.97 MiB     0.81 MiB    -0.16 MiB  (-16.68%)
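For anyone wanting to re-derive the summary figures from the `llvm-ar tv` listings, the arithmetic can be sketched as below. This is only an illustration under assumptions: the helper names (`total_bytes`, `mib`, `percent_change`) are made up here, the size is assumed to be the third whitespace-separated field (the same column `sort -n -k 3` uses above), and only the spirv_std and core rows of the debug listings are embedded to keep it short.

```rust
/// Sum the size column (third whitespace-separated field) of `llvm-ar tv` lines.
/// Lines without a numeric third field are ignored.
fn total_bytes(listing: &str) -> u64 {
    listing
        .lines()
        .filter_map(|line| line.split_whitespace().nth(2))
        .filter_map(|size| size.parse::<u64>().ok())
        .sum()
}

/// Bytes to MiB (1 MiB = 1024 * 1024 bytes).
fn mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}

/// Relative change in percent; negative means the "after" size is smaller.
fn percent_change(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

// Two rows copied from the debug ("cargo compiletest") listings above.
const BEFORE: &str = "
--------- 0/0 1020660 Jan  1 02:00 1970 spirv_std-3fa5eb13241d7c30.9o2xt2nbmavkkxywdyb4v65wl.13t9lsz.rcgu.o
--------- 0/0 9499536 Jan  1 02:00 1970 core-0d15d1819711cec1.core.d3a3ec86f32056-cgu.0.rcgu.o
";
const AFTER: &str = "
--------- 0/0 850820 Jan  1 02:00 1970 spirv_std-3fa5eb13241d7c30.9o2xt2nbmavkkxywdyb4v65wl.0wbp380.rcgu.o
--------- 0/0 8925012 Jan  1 02:00 1970 core-0d15d1819711cec1.core.d3a3ec86f32056-cgu.0.rcgu.o
";

fn main() {
    let (before, after) = (total_bytes(BEFORE), total_bytes(AFTER));
    println!(
        "{:.2} MiB -> {:.2} MiB ({:+.2}%)",
        mib(before),
        mib(after),
        percent_change(before, after)
    );
}
```

Feeding the complete listings through `total_bytes` instead of these two-row excerpts gives the totals that the "Total" rows summarize.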

(the overall impact of such a change might look even better if/once we can find a way to remove e.g. the OpSource embedded source code, which amounts to almost 4 MiB of core's 8-9 MiB, but that's more future work, along with increasing the CGU count etc.)

Collaborator

@LegNeato LegNeato left a comment

This looks awesome and is surprisingly straightforward. Nice! 🍻

@eddyb eddyb added this pull request to the merge queue Jul 26, 2025
Merged via the queue into Rust-GPU:main with commit 87ea628 Jul 26, 2025
13 checks passed
@eddyb eddyb deleted the max-dce branch July 26, 2025 03:28
2 participants