[AMDGPU] Add the code generation support for `llvm.[sin/cos].bf16` #149631

shiltian · 2025-07-19T01:53:29Z

This is only partial support because some other instructions have not been upstreamed yet.

shiltian · 2025-07-19T01:53:56Z

This stack of pull requests is managed by Graphite.

llvmbot · 2025-07-19T01:54:01Z

@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

Changes

This is only partial support because some other instructions have not been upstreamed yet.

Full diff: https://github.com/llvm/llvm-project/pull/149631.diff

3 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+1-1)
(added) llvm/test/CodeGen/AMDGPU/llvm.cos.bf16.ll (+38)
(added) llvm/test/CodeGen/AMDGPU/llvm.sin.bf16.ll (+38)
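
The only functional change is the one-line edit to `SIISelLowering.cpp` listed above, which widens an existing `setOperationAction` call so that `ISD::FCOS` and `ISD::FSIN` on `MVT::bf16` are marked `Custom` alongside `ISD::FDIV`. As a rough illustration of what such a registration means, here is a minimal standalone sketch of an (opcode, type) to action table. It is not LLVM's real `TargetLoweringBase` implementation: `ToyLowering`, its enums, `makeLowering`, and the `Expand` default are all hypothetical names and assumptions made up for this example.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes, MVT types and legalize actions.
enum class Opcode { FDIV, FSIN, FCOS };
enum class ValueType { f16, bf16 };
enum class Action { Legal, Promote, Expand, Custom };

// Toy action table keyed on (opcode, type). Assumption: anything that was
// never registered falls back to Expand in this model.
class ToyLowering {
public:
  void setOperationAction(std::initializer_list<Opcode> ops, ValueType vt,
                          Action action) {
    for (Opcode op : ops)
      actions_[std::make_pair(op, vt)] = action;
  }

  Action getOperationAction(Opcode op, ValueType vt) const {
    auto it = actions_.find(std::make_pair(op, vt));
    return it == actions_.end() ? Action::Expand : it->second;
  }

private:
  std::map<std::pair<Opcode, ValueType>, Action> actions_;
};

// Models the registration before (withThisPatch = false) and after
// (withThisPatch = true) the change, guarded by the subtarget feature.
inline ToyLowering makeLowering(bool hasBF16TransInsts, bool withThisPatch) {
  ToyLowering lowering;
  if (hasBF16TransInsts) {
    if (withThisPatch)
      lowering.setOperationAction({Opcode::FCOS, Opcode::FSIN, Opcode::FDIV},
                                  ValueType::bf16, Action::Custom);
    else
      lowering.setOperationAction({Opcode::FDIV}, ValueType::bf16,
                                  Action::Custom);
  }
  return lowering;
}

// Small self-check: only the patched configuration marks bf16 sin as Custom.
inline void demoToyLowering() {
  assert(makeLowering(true, true).getOperationAction(
             Opcode::FSIN, ValueType::bf16) == Action::Custom);
  assert(makeLowering(true, false).getOperationAction(
             Opcode::FSIN, ValueType::bf16) == Action::Expand);
}
```

In this toy model, `makeLowering(true, true)` reports `Custom` for bf16 `FSIN`, `FCOS`, and `FDIV`, while `makeLowering(true, false)` reports `Custom` only for `FDIV`. A subtarget without the feature gets none of the three registrations either way.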

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 79487dcec3525..181db6291b361 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -620,7 +620,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
 
     // BF16 - VOP1 Actions.
     if (Subtarget->hasBF16TransInsts())
-      setOperationAction(ISD::FDIV, MVT::bf16, Custom);
+      setOperationAction({ISD::FCOS, ISD::FSIN, ISD::FDIV}, MVT::bf16, Custom);
 
     setOperationAction({ISD::FP_TO_SINT, ISD::FP_TO_UINT}, MVT::f16, Promote);
     setOperationAction({ISD::FP_TO_SINT, ISD::FP_TO_UINT}, MVT::bf16, Promote);
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.cos.bf16.ll b/llvm/test/CodeGen/AMDGPU/llvm.cos.bf16.ll
new file mode 100644
index 0000000000000..ced96ee98e0ad
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/llvm.cos.bf16.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 %s -o - | FileCheck -check-prefixes=GCN %s
+; xUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
+
+; FIXME: GlobalISel does not work with bf16
+
+declare bfloat @llvm.cos.bf16(bfloat) #0
+
+define amdgpu_kernel void @cos_bf16_constant_4(ptr addrspace(1) %out) #1 {
+; GCN-LABEL: cos_bf16_constant_4:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_load_b64 s[0:1], s[4:5], 0x0
+; GCN-NEXT:    v_cos_bf16_e32 v0, 0x3f23
+; GCN-NEXT:    v_mov_b32_e32 v1, 0
+; GCN-NEXT:    s_wait_kmcnt 0x0
+; GCN-NEXT:    global_store_b16 v1, v0, s[0:1]
+; GCN-NEXT:    s_endpgm
+  %cos = call bfloat @llvm.cos.bf16(bfloat 4.0) #0
+  store bfloat %cos, ptr addrspace(1) %out, align 2
+  ret void
+}
+
+define amdgpu_kernel void @cos_bf16_constant_100(ptr addrspace(1) %out) #1 {
+; GCN-LABEL: cos_bf16_constant_100:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_load_b64 s[0:1], s[4:5], 0x0
+; GCN-NEXT:    v_cos_bf16_e32 v0, 0x417f
+; GCN-NEXT:    v_mov_b32_e32 v1, 0
+; GCN-NEXT:    s_wait_kmcnt 0x0
+; GCN-NEXT:    global_store_b16 v1, v0, s[0:1]
+; GCN-NEXT:    s_endpgm
+  %cos = call bfloat @llvm.cos.bf16(bfloat 100.0) #0
+  store bfloat %cos, ptr addrspace(1) %out, align 2
+  ret void
+}
+
+attributes #0 = { nounwind readnone }
+attributes #1 = { nounwind }
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.sin.bf16.ll b/llvm/test/CodeGen/AMDGPU/llvm.sin.bf16.ll
new file mode 100644
index 0000000000000..7a355a36b15bf
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/llvm.sin.bf16.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 %s -o - | FileCheck -check-prefixes=GCN %s
+; xUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
+
+; FIXME: GlobalISel does not work with bf16
+
+declare bfloat @llvm.sin.bf16(bfloat) #0
+
+define amdgpu_kernel void @sin_bf16_constant_4(ptr addrspace(1) %out) #1 {
+; GCN-LABEL: sin_bf16_constant_4:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_load_b64 s[0:1], s[4:5], 0x0
+; GCN-NEXT:    v_sin_bf16_e32 v0, 0x3f23
+; GCN-NEXT:    v_mov_b32_e32 v1, 0
+; GCN-NEXT:    s_wait_kmcnt 0x0
+; GCN-NEXT:    global_store_b16 v1, v0, s[0:1]
+; GCN-NEXT:    s_endpgm
+  %sin = call bfloat @llvm.sin.bf16(bfloat 4.0) #0
+  store bfloat %sin, ptr addrspace(1) %out, align 2
+  ret void
+}
+
+define amdgpu_kernel void @sin_bf16_constant_100(ptr addrspace(1) %out) #1 {
+; GCN-LABEL: sin_bf16_constant_100:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_load_b64 s[0:1], s[4:5], 0x0
+; GCN-NEXT:    v_sin_bf16_e32 v0, 0x417f
+; GCN-NEXT:    v_mov_b32_e32 v1, 0
+; GCN-NEXT:    s_wait_kmcnt 0x0
+; GCN-NEXT:    global_store_b16 v1, v0, s[0:1]
+; GCN-NEXT:    s_endpgm
+  %sin = call bfloat @llvm.sin.bf16(bfloat 100.0) #0
+  store bfloat %sin, ptr addrspace(1) %out, align 2
+  ret void
+}
+
+attributes #0 = { nounwind readnone }
+attributes #1 = { nounwind }
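
A note on the immediates in the check lines above: `0x3f23` (input 4.0) and `0x417f` (input 100.0) are not the bf16 encodings of 4.0 and 100.0 themselves. They are consistent with the argument being pre-scaled by 1/(2π) and the product being folded into a bf16 constant before it reaches `v_sin_bf16` or `v_cos_bf16`. The sketch below reproduces that arithmetic in standalone C++. It is not LLVM code and is not taken from the patch: the helper names are hypothetical, and the scaling model (round 1/(2π) to bf16, multiply, round the product to bf16 with round-to-nearest-even) is an assumption that happens to match both test constants.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: narrow an IEEE-754 binary32 value to bf16 bits using
// round-to-nearest-even. NaN handling is omitted for brevity.
inline std::uint16_t floatToBF16Bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  const std::uint32_t roundingBias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + roundingBias) >> 16);
}

// Hypothetical helper: widen bf16 bits back to binary32 (exact).
inline float bf16BitsToFloat(std::uint16_t bf16Bits) {
  const std::uint32_t bits = static_cast<std::uint32_t>(bf16Bits) << 16;
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Assumed model: the operand is x * 1/(2*pi), with both the scale factor and
// the product rounded to bf16.
inline std::uint16_t scaledTrigOperand(float x) {
  const float oneOver2Pi =
      bf16BitsToFloat(floatToBF16Bits(0.15915494309189535f));
  return floatToBF16Bits(x * oneOver2Pi);
}

// Self-check against the immediates that appear in the test check lines.
inline void checkAgainstTestImmediates() {
  assert(scaledTrigOperand(4.0f) == 0x3F23);
  assert(scaledTrigOperand(100.0f) == 0x417F);
}
```

For comparison, `floatToBF16Bits(4.0f)` is `0x4080` and `floatToBF16Bits(100.0f)` is `0x42C8`, so the constants in the generated assembly are clearly the scaled values rather than the raw inputs.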

shiltian · 2025-07-21T14:51:08Z

Merge activity

Jul 21, 2:51 PM UTC: A user started a stack merge that includes this pull request via Graphite.
Jul 21, 2:58 PM UTC: Graphite rebased this pull request as part of a merge.
Jul 21, 3:02 PM UTC: @shiltian merged this pull request with Graphite.

shiltian requested review from changpeng and rampitec July 19, 2025 01:53

llvmbot added the backend:AMDGPU label Jul 19, 2025

This was referenced Jul 19, 2025

[NFC][AMDGPU] Add an IR test for v_cvt_f16_bf8 #149627

Merged

[gfx1250][SDAG] Lower unsafe bf16 divisions #149628

Merged

changpeng approved these changes Jul 19, 2025

shiltian force-pushed the users/shiltian/codegen-for-llvm-sin-cos-bf16 branch from 76cf513 to d362ce6 July 21, 2025 13:39

shiltian force-pushed the users/shiltian/unsafe-bf16-div branch from 251b027 to 622cf01 July 21, 2025 13:39

shiltian force-pushed the users/shiltian/unsafe-bf16-div branch from 622cf01 to fa52b7e July 21, 2025 14:55

Base automatically changed from users/shiltian/unsafe-bf16-div to main July 21, 2025 14:58

[AMDGPU] Add the code generation support for llvm.[sin/cos].bf16

f59e3ce

shiltian force-pushed the users/shiltian/codegen-for-llvm-sin-cos-bf16 branch from d362ce6 to f59e3ce July 21, 2025 14:58

shiltian merged commit e801a10 into main Jul 21, 2025
7 of 9 checks passed

shiltian deleted the users/shiltian/codegen-for-llvm-sin-cos-bf16 branch July 21, 2025 15:02

This was referenced Jul 23, 2025

test abhinavgaba/llvm-project#2

Closed

Add dataFence plugin interface abhinavgaba/llvm-project#3

Closed

mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025

[AMDGPU] Add the code generation support for llvm.[sin/cos].bf16 (llvm#149631)

f7c4eeb
