Skip to content

[LLVM][AArch64InstrInfo] Prevent fill folding when DstReg is SP. #148885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 16, 2025

Conversation

paulwalker-arm
Copy link
Collaborator

We can remove subreg COPY instructions by filling directly into the COPY's destination register. However, this is only valid when the copy and fill have compatible register classes.

Fixes #148659

@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Paul Walker (paulwalker-arm)

Changes

We can remove subreg COPY instructions by filling directly into the COPY's destination register. However, this is only valid when the copy and fill have compatible register classes.

Fixes #148659


Full diff: https://github.com/llvm/llvm-project/pull/148885.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/spill-fold.mir (+19)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 5420545cc3cec..4d243dcaf4627 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -6294,7 +6294,8 @@ MachineInstr *AArch64InstrInfo::foldMemoryOperandImpl(
         FillRC = nullptr;
         break;
       case AArch64::sub_32:
-        FillRC = &AArch64::GPR32RegClass;
+        if (AArch64::GPR64RegClass.hasSubClassEq(getRegClass(DstReg)))
+          FillRC = &AArch64::GPR32RegClass;
         break;
       case AArch64::ssub:
         FillRC = &AArch64::FPR32RegClass;
diff --git a/llvm/test/CodeGen/AArch64/spill-fold.mir b/llvm/test/CodeGen/AArch64/spill-fold.mir
index 0149e4504bed2..9ea9ce53b68a8 100644
--- a/llvm/test/CodeGen/AArch64/spill-fold.mir
+++ b/llvm/test/CodeGen/AArch64/spill-fold.mir
@@ -10,6 +10,7 @@
   define i64 @test_subreg_fill_fold() { ret i64 0 }
   define double @test_subreg_fill_fold2() { ret double 0.0 }
   define <4 x float> @test_subreg_fill_fold3() { ret <4 x float> undef }
+  define i64 @test_subreg_fill_fold4() { ret i64 0 }
   define i64 @test_nzcv_spill_fold() { ret i64 0 }
 ...
 ---
@@ -121,6 +122,24 @@ body:             |
     RET_ReallyLR implicit $s0
 ...
 ---
+# CHECK-LABEL: name: test_subreg_fill_fold4
+# Ensure the COPY is maintained when its result register class is not compatible
+# with the fill load's.
+name:            test_subreg_fill_fold4
+registers:
+  - { id: 0, class: gpr32 }
+  - { id: 1, class: gpr64sp }
+body:             |
+  bb.0:
+    %0 = COPY $wzr
+    INLINEASM &nop, 1, 12, implicit-def dead $x0, 12, implicit-def dead $x1, 12, implicit-def dead $x2, 12, implicit-def dead $x3, 12, implicit-def dead $x4, 12, implicit-def dead $x5, 12, implicit-def dead $x6, 12, implicit-def dead $x7, 12, implicit-def dead $x8, 12, implicit-def dead $x9, 12, implicit-def dead $x10, 12, implicit-def dead $x11, 12, implicit-def dead $x12, 12, implicit-def dead $x13, 12, implicit-def dead $x14, 12, implicit-def dead $x15, 12, implicit-def dead $x16, 12, implicit-def dead $x17, 12, implicit-def dead $x18, 12, implicit-def dead $x19, 12, implicit-def dead $x20, 12, implicit-def dead $x21, 12, implicit-def dead $x22, 12, implicit-def dead $x23, 12, implicit-def dead $x24, 12, implicit-def dead $x25, 12, implicit-def dead $x26, 12, implicit-def dead $x27, 12, implicit-def dead $x28, 12, implicit-def dead $fp, 12, implicit-def dead $lr, 12, implicit-def $sp
+    ; CHECK: %2:gpr32 = LDRWui %stack.0, 0 :: (load (s32) from %stack.0)
+    ; CHECK: undef %1.sub_32:gpr64sp = COPY %2
+    undef %1.sub_32:gpr64sp = COPY %0
+    $x0 = COPY %1
+    RET_ReallyLR implicit $x0
+...
+---
 # CHECK-LABEL: name: test_nzcv_spill_fold
 # Ensure that nzcv COPY cannot be folded.
 name:            test_nzcv_spill_fold

Comment on lines +128 to +129
name: test_subreg_fill_fold4
registers:
Copy link
Collaborator Author

@paulwalker-arm paulwalker-arm Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clone of test_subreg_fill_fold with the register class of %1 changed from gpr64 to gpr64sp.

@@ -6294,7 +6294,8 @@ MachineInstr *AArch64InstrInfo::foldMemoryOperandImpl(
FillRC = nullptr;
break;
case AArch64::sub_32:
FillRC = &AArch64::GPR32RegClass;
if (AArch64::GPR64RegClass.hasSubClassEq(getRegClass(DstReg)))
FillRC = &AArch64::GPR32RegClass;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default case sets FillRC to nullptr, but in the case DstReg is not a super-class of GPR64, FillRC is not defined.

We can remove subreg COPY instructions by filling directly into the
COPY's destination register. However, this is only valid when the copy
and fill have compatible register classes.

Fixes llvm#148659
@paulwalker-arm paulwalker-arm merged commit 5abdce4 into llvm:main Jul 16, 2025
9 checks passed
@paulwalker-arm paulwalker-arm deleted the D148659 branch July 16, 2025 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AArch64] Expected a GPR64common register, but got a GPR64sp register
3 participants