[AMDGPU] Emit entry function Dwarf CFI #164722
base: users/slinder1/11-04-_clang_default_to_async_unwind_tables_for_amdgcn
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. This stack of pull requests is managed by Graphite.
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-amdgpu

Author: Scott Linder (slinder1)

Changes

Entry functions represent the end of unwinding, as they are the

Co-authored-by: Scott Linder <scott.linder@amd.com>

Patch is 361.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164722.diff

26 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
index 5c39f7a3d6daa..71356aa2aced1 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
@@ -12,8 +12,10 @@
#include "GCNSubtarget.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
#include "SIMachineFunctionInfo.h"
+#include "llvm/BinaryFormat/Dwarf.h"
#include "llvm/CodeGen/LiveRegUnits.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
+#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/RegisterScavenging.h"
#include "llvm/Target/TargetMachine.h"
@@ -43,6 +45,15 @@ static MCRegister findUnusedRegister(MachineRegisterInfo &MRI,
return MCRegister();
}
+static bool needsFrameMoves(const MachineFunction &MF) {
+ // FIXME: There are some places in the compiler which are sensitive to the CFI
+ // pseudos and so using MachineFunction::needsFrameMoves has the unintended
+ // effect of making enabling debug info affect codegen. Once we have
+ // identified and fixed those cases this should be replaced with
+ // MF.needsFrameMoves()
+ return true;
+}
+
// Find a scratch register that we can use in the prologue. We avoid using
// callee-save registers since they may appear to be free when this is called
// from canUseAsPrologue (during shrink wrapping), but then no longer be free
@@ -615,10 +626,39 @@ void SIFrameLowering::emitEntryFunctionPrologue(MachineFunction &MF,
const SIRegisterInfo *TRI = &TII->getRegisterInfo();
MachineRegisterInfo &MRI = MF.getRegInfo();
const Function &F = MF.getFunction();
+ const MCRegisterInfo *MCRI = MF.getContext().getRegisterInfo();
MachineFrameInfo &FrameInfo = MF.getFrameInfo();
assert(MFI->isEntryFunction());
+ // Debug location must be unknown since the first debug location is used to
+ // determine the end of the prologue.
+ DebugLoc DL;
+ MachineBasicBlock::iterator I = MBB.begin();
+
+ if (needsFrameMoves(MF)) {
+ // On entry the SP/FP are not set up, so we need to define the CFA in terms
+ // of a literal location expression.
+ static const char CFAEncodedInstUserOpsArr[] = {
+ dwarf::DW_CFA_def_cfa_expression,
+ 4, // length
+ static_cast<char>(dwarf::DW_OP_lit0),
+ static_cast<char>(dwarf::DW_OP_lit0 +
+ dwarf::DW_ASPACE_LLVM_AMDGPU_private_wave),
+ static_cast<char>(dwarf::DW_OP_LLVM_user),
+ static_cast<char>(dwarf::DW_OP_LLVM_form_aspace_address)};
+ static StringRef CFAEncodedInstUserOps =
+ StringRef(CFAEncodedInstUserOpsArr, sizeof(CFAEncodedInstUserOpsArr));
+ buildCFI(MBB, I, DL,
+ MCCFIInstruction::createEscape(nullptr, CFAEncodedInstUserOps,
+ SMLoc(),
+ "CFA is 0 in private_wave aspace"));
+ // Unwinding halts when the return address (PC) is undefined.
+ buildCFI(MBB, I, DL,
+ MCCFIInstruction::createUndefined(
+ nullptr, MCRI->getDwarfRegNum(AMDGPU::PC_REG, false)));
+ }
+
Register PreloadedScratchWaveOffsetReg = MFI->getPreloadedReg(
AMDGPUFunctionArgInfo::PRIVATE_SEGMENT_WAVE_BYTE_OFFSET);
@@ -655,11 +695,6 @@ void SIFrameLowering::emitEntryFunctionPrologue(MachineFunction &MF,
}
}
- // Debug location must be unknown since the first debug location is used to
- // determine the end of the prologue.
- DebugLoc DL;
- MachineBasicBlock::iterator I = MBB.begin();
-
// We found the SRSRC first because it needs four registers and has an
// alignment requirement. If the SRSRC that we found is clobbering with
// the scratch wave offset, which may be in a fixed SGPR or a free SGPR
@@ -2210,3 +2245,15 @@ bool SIFrameLowering::requiresStackPointerReference(
// references the SP, like variable sized stack objects.
return frameTriviallyRequiresSP(MFI);
}
+
+MachineInstr *SIFrameLowering::buildCFI(MachineBasicBlock &MBB,
+ MachineBasicBlock::iterator MBBI,
+ const DebugLoc &DL,
+ const MCCFIInstruction &CFIInst,
+ MachineInstr::MIFlag flag) const {
+ MachineFunction &MF = *MBB.getParent();
+ const SIInstrInfo *TII = MF.getSubtarget<GCNSubtarget>().getInstrInfo();
+ return BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
+ .addCFIIndex(MF.addFrameInst(CFIInst))
+ .setMIFlag(flag);
+}
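
Reviewer note on the escape built in `emitEntryFunctionPrologue` above: the byte layout can be reproduced outside LLVM with a small standalone sketch. The `DW_CFA_def_cfa_expression` and `DW_OP_lit0` values are standard DWARF. The three LLVM/AMDGPU extension values (`0xe9`, `0x02`, and address space `6`) are assumptions inferred from the `escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02` CHECK lines in this patch, and `encodeEntryCFA` is a hypothetical helper name that does not exist in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard DWARF encodings.
constexpr std::uint8_t kDefCfaExpression = 0x0f; // DW_CFA_def_cfa_expression
constexpr std::uint8_t kOpLit0 = 0x30;           // DW_OP_lit0

// ASSUMPTION: values inferred from the CHECK lines in this patch, not taken
// from llvm/BinaryFormat/Dwarf.h.
constexpr std::uint8_t kOpLLVMUser = 0xe9;              // DW_OP_LLVM_user
constexpr std::uint8_t kOpLLVMFormAspaceAddress = 0x02; // DW_OP_LLVM_form_aspace_address
constexpr std::uint8_t kAspacePrivateWave = 6; // DW_ASPACE_LLVM_AMDGPU_private_wave

// DW_OP_lit0..DW_OP_lit31 occupy 0x30..0x4f, so the address space number must
// stay in that range for the "DW_OP_lit0 + N" trick to produce a valid opcode.
static_assert(kOpLit0 + kAspacePrivateWave <= 0x4f,
              "address space does not fit a DW_OP_lit<n> opcode");

// Hypothetical helper: returns the escape bytes for "CFA is address 0 in the
// private_wave address space".
inline std::array<std::uint8_t, 6> encodeEntryCFA() {
  std::array<std::uint8_t, 6> Bytes = {{
      kDefCfaExpression,
      4,       // length of the expression that follows (single-byte ULEB128)
      kOpLit0, // push the literal offset 0
      static_cast<std::uint8_t>(kOpLit0 + kAspacePrivateWave), // push the address space
      kOpLLVMUser,
      kOpLLVMFormAspaceAddress}};
  // The length byte must describe everything after the opcode and itself.
  assert(static_cast<std::size_t>(Bytes[1]) == Bytes.size() - 2);
  return Bytes;
}
```

Calling `encodeEntryCFA()` should yield the same six bytes that the updated MIR tests expect, which gives a quick way to sanity-check the `4, // length` literal if the expression ever grows.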
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.h b/llvm/lib/Target/AMDGPU/SIFrameLowering.h
index a72772987262e..0b691d8f15a48 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.h
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.h
@@ -104,6 +104,12 @@ class SIFrameLowering final : public AMDGPUFrameLowering {
public:
bool requiresStackPointerReference(const MachineFunction &MF) const;
+ /// Create a CFI index for CFIInst and build a MachineInstr around it.
+ MachineInstr *
+ buildCFI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+ const DebugLoc &DL, const MCCFIInstruction &CFIInst,
+ MachineInstr::MIFlag flag = MachineInstr::FrameSetup) const;
+
// Returns true if the function may need to reserve space on the stack for the
// CWSR trap handler.
bool mayReserveScratchForCWSR(const MachineFunction &MF) const;
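
Reading aid for the test diff that follows: every updated entry block begins with the same escape, `0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02`. The sketch below turns those bytes back into opcode names. It is a hypothetical standalone decoder, not an LLVM API; it handles only this one escape shape, and the meaning of `0xe9 0x02` is an assumption taken from the operand names used in `SIFrameLowering.cpp` in this patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: describes a DW_CFA_def_cfa_expression escape whose
// expression uses only DW_OP_lit<n> and the LLVM user extension pair.
inline std::string describeEntryCFAEscape(const std::vector<std::uint8_t> &Bytes) {
  // Expect: opcode, a one-byte length, then exactly `length` expression bytes.
  if (Bytes.size() < 2 || Bytes[0] != 0x0f) // 0x0f: DW_CFA_def_cfa_expression
    return "not a DW_CFA_def_cfa_expression escape";
  if (Bytes[1] >= 0x80 ||
      Bytes.size() != static_cast<std::size_t>(Bytes[1]) + 2)
    return "unexpected expression length";

  std::string Out = "DW_CFA_def_cfa_expression:";
  for (std::size_t I = 2; I < Bytes.size(); ++I) {
    const std::uint8_t Op = Bytes[I];
    if (Op >= 0x30 && Op <= 0x4f) {
      // DW_OP_lit0 .. DW_OP_lit31 push the literal (Op - 0x30).
      Out += " DW_OP_lit" + std::to_string(Op - 0x30);
    } else if (Op == 0xe9 && I + 1 < Bytes.size() && Bytes[I + 1] == 0x02) {
      // ASSUMPTION: 0xe9 0x02 is DW_OP_LLVM_user followed by the
      // DW_OP_LLVM_form_aspace_address sub-opcode, as named in the patch.
      Out += " DW_OP_LLVM_user DW_OP_LLVM_form_aspace_address";
      ++I; // consume the sub-opcode byte
    } else {
      Out += " <unknown>";
    }
  }
  return Out;
}
```

Feeding it the six bytes from the CHECK lines should name `DW_OP_lit0` (the offset), `DW_OP_lit6` (the address space), and the form-aspace-address operation, matching the "CFA is 0 in private_wave aspace" comment attached to the escape.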
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll
index e86f7473363f7..c037a93af124b 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll
@@ -13,18 +13,24 @@
define amdgpu_kernel void @system_one_as_acquire() #0 {
; GFX6-LABEL: name: system_one_as_acquire
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_WAITCNT_soft 3952
; GFX6-NEXT: BUFFER_WBINVL1 implicit $exec
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: system_one_as_acquire
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_WAITCNT_soft 3952
; GFX8-NEXT: BUFFER_WBINVL1_VOL implicit $exec
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: system_one_as_acquire
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_WAITCNT_soft 16240
; GFX10WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -33,6 +39,8 @@ define amdgpu_kernel void @system_one_as_acquire() #0 {
;
; GFX10CU-LABEL: name: system_one_as_acquire
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_WAITCNT_soft 16240
; GFX10CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -41,6 +49,8 @@ define amdgpu_kernel void @system_one_as_acquire() #0 {
;
; GFX11WGP-LABEL: name: system_one_as_acquire
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_WAITCNT_soft 1015
; GFX11WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -49,6 +59,8 @@ define amdgpu_kernel void @system_one_as_acquire() #0 {
;
; GFX11CU-LABEL: name: system_one_as_acquire
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_WAITCNT_soft 1015
; GFX11CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -62,34 +74,46 @@ entry:
define amdgpu_kernel void @system_one_as_release() #0 {
; GFX6-LABEL: name: system_one_as_release
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_WAITCNT_soft 3952
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: system_one_as_release
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_WAITCNT_soft 3952
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: system_one_as_release
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_WAITCNT_soft 16240
; GFX10WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: system_one_as_release
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_WAITCNT_soft 16240
; GFX10CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: system_one_as_release
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_WAITCNT_soft 1015
; GFX11WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11WGP-NEXT: S_ENDPGM 0
;
; GFX11CU-LABEL: name: system_one_as_release
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_WAITCNT_soft 1015
; GFX11CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11CU-NEXT: S_ENDPGM 0
@@ -101,18 +125,24 @@ entry:
define amdgpu_kernel void @system_one_as_acq_rel() #0 {
; GFX6-LABEL: name: system_one_as_acq_rel
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_WAITCNT_soft 3952
; GFX6-NEXT: BUFFER_WBINVL1 implicit $exec
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: system_one_as_acq_rel
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_WAITCNT_soft 3952
; GFX8-NEXT: BUFFER_WBINVL1_VOL implicit $exec
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: system_one_as_acq_rel
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_WAITCNT_soft 16240
; GFX10WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -121,6 +151,8 @@ define amdgpu_kernel void @system_one_as_acq_rel() #0 {
;
; GFX10CU-LABEL: name: system_one_as_acq_rel
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_WAITCNT_soft 16240
; GFX10CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -129,6 +161,8 @@ define amdgpu_kernel void @system_one_as_acq_rel() #0 {
;
; GFX11WGP-LABEL: name: system_one_as_acq_rel
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_WAITCNT_soft 1015
; GFX11WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -137,6 +171,8 @@ define amdgpu_kernel void @system_one_as_acq_rel() #0 {
;
; GFX11CU-LABEL: name: system_one_as_acq_rel
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_WAITCNT_soft 1015
; GFX11CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -150,18 +186,24 @@ entry:
define amdgpu_kernel void @system_one_as_seq_cst() #0 {
; GFX6-LABEL: name: system_one_as_seq_cst
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_WAITCNT_soft 3952
; GFX6-NEXT: BUFFER_WBINVL1 implicit $exec
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: system_one_as_seq_cst
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_WAITCNT_soft 3952
; GFX8-NEXT: BUFFER_WBINVL1_VOL implicit $exec
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: system_one_as_seq_cst
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_WAITCNT_soft 16240
; GFX10WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -170,6 +212,8 @@ define amdgpu_kernel void @system_one_as_seq_cst() #0 {
;
; GFX10CU-LABEL: name: system_one_as_seq_cst
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_WAITCNT_soft 16240
; GFX10CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX10CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -178,6 +222,8 @@ define amdgpu_kernel void @system_one_as_seq_cst() #0 {
;
; GFX11WGP-LABEL: name: system_one_as_seq_cst
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_WAITCNT_soft 1015
; GFX11WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -186,6 +232,8 @@ define amdgpu_kernel void @system_one_as_seq_cst() #0 {
;
; GFX11CU-LABEL: name: system_one_as_seq_cst
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_WAITCNT_soft 1015
; GFX11CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -199,26 +247,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_acquire() #0 {
; GFX6-LABEL: name: singlethread_one_as_acquire
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_acquire
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_acquire
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_acquire
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_acquire
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_ENDPGM 0
;
; GFX11CU-LABEL: name: singlethread_one_as_acquire
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_ENDPGM 0
entry:
fence syncscope("singlethread-one-as") acquire
@@ -228,26 +288,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_release() #0 {
; GFX6-LABEL: name: singlethread_one_as_release
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_release
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_release
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_release
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_release
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_ENDPGM 0
;
; GFX11CU-LABEL: name: singlethread_one_as_release
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_ENDPGM 0
entry:
fence syncscope("singlethread-one-as") release
@@ -257,26 +329,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_acq_rel() #0 {
; GFX6-LABEL: name: singlethread_one_as_acq_rel
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_acq_rel
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_acq_rel
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_acq_rel
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_acq_rel
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCT...
[truncated]
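
Reviewer note on the second new line in each kernel above, `CFI_INSTRUCTION undefined $pc_reg`: the patch comment says unwinding halts when the return address (PC) is undefined. A toy model of that consumer-side behaviour, under loud assumptions: `FrameRow`, `backtrace`, and the PC-keyed table are all hypothetical, real unwinders evaluate full CFI rows and a CFA rather than a single flag, and nothing here reflects how any particular debugger is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, heavily simplified unwind row for one function.
struct FrameRow {
  bool ReturnAddressUndefined; // models an `undefined` rule for the PC register
  std::uint64_t CallerPC;      // used only when ReturnAddressUndefined is false
};

// Walks from PC toward the outermost frame and returns the PCs visited.
// The walk stops at the first frame whose return address is undefined,
// which is how an entry function (kernel) marks the end of the stack.
inline std::vector<std::uint64_t>
backtrace(const std::map<std::uint64_t, FrameRow> &Table, std::uint64_t PC,
          std::size_t MaxDepth = 64) {
  std::vector<std::uint64_t> Trace;
  for (std::size_t Depth = 0; Depth < MaxDepth; ++Depth) {
    const auto It = Table.find(PC);
    if (It == Table.end())
      break; // no unwind information for this PC
    Trace.push_back(PC);
    if (It->second.ReturnAddressUndefined)
      break; // outermost frame reached: there is no caller to unwind into
    PC = It->second.CallerPC;
  }
  return Trace;
}
```

With a table in which only the kernel's row has `ReturnAddressUndefined` set, the walk ends cleanly at the kernel instead of trying to read a return address that was never written.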
;
; GFX11WGP-LABEL: name: system_one_as_seq_cst
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_WAITCNT_soft 1015
; GFX11WGP-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11WGP-NEXT: BUFFER_GL1_INV implicit $exec
@@ -186,6 +232,8 @@ define amdgpu_kernel void @system_one_as_seq_cst() #0 {
;
; GFX11CU-LABEL: name: system_one_as_seq_cst
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_WAITCNT_soft 1015
; GFX11CU-NEXT: S_WAITCNT_VSCNT_soft undef $sgpr_null, 0
; GFX11CU-NEXT: BUFFER_GL1_INV implicit $exec
@@ -199,26 +247,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_acquire() #0 {
; GFX6-LABEL: name: singlethread_one_as_acquire
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_acquire
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_acquire
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_acquire
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_acquire
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_ENDPGM 0
;
; GFX11CU-LABEL: name: singlethread_one_as_acquire
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_ENDPGM 0
entry:
fence syncscope("singlethread-one-as") acquire
@@ -228,26 +288,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_release() #0 {
; GFX6-LABEL: name: singlethread_one_as_release
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_release
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_release
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_release
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_release
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11WGP-NEXT: S_ENDPGM 0
;
; GFX11CU-LABEL: name: singlethread_one_as_release
; GFX11CU: bb.0.entry:
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX11CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX11CU-NEXT: S_ENDPGM 0
entry:
fence syncscope("singlethread-one-as") release
@@ -257,26 +329,38 @@ entry:
define amdgpu_kernel void @singlethread_one_as_acq_rel() #0 {
; GFX6-LABEL: name: singlethread_one_as_acq_rel
; GFX6: bb.0.entry:
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX6-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX6-NEXT: S_ENDPGM 0
;
; GFX8-LABEL: name: singlethread_one_as_acq_rel
; GFX8: bb.0.entry:
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX8-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX8-NEXT: S_ENDPGM 0
;
; GFX10WGP-LABEL: name: singlethread_one_as_acq_rel
; GFX10WGP: bb.0.entry:
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10WGP-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10WGP-NEXT: S_ENDPGM 0
;
; GFX10CU-LABEL: name: singlethread_one_as_acq_rel
; GFX10CU: bb.0.entry:
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02
+ ; GFX10CU-NEXT: frame-setup CFI_INSTRUCTION undefined $pc_reg
; GFX10CU-NEXT: S_ENDPGM 0
;
; GFX11WGP-LABEL: name: singlethread_one_as_acq_rel
; GFX11WGP: bb.0.entry:
+ ; GFX11WGP-NEXT: frame-setup CFI_INSTRUCT...
[truncated]
  const SIRegisterInfo *TRI = &TII->getRegisterInfo();
  MachineRegisterInfo &MRI = MF.getRegInfo();
  const Function &F = MF.getFunction();
  const MCRegisterInfo *MCRI = MF.getContext().getRegisterInfo();
  const MCRegisterInfo *MCRI = MF.getContext().getRegisterInfo();
Replace the uses of this with TRI.
Entry functions represent the end of unwinding, as they are the outer-most frame. This implies they can only have a meaningful definition for the CFA, which AMDGPU defines using a memory location description with a literal private address space address. The return address is set to undefined as a sentinel value to signal the end of unwinding.
Co-authored-by: Scott Linder <scott.linder@amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu@amd.com>
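For readers decoding the `escape 0x0f, 0x04, 0x30, 0x36, 0xe9, 0x02` lines in the test updates, here is a minimal sketch of how those bytes appear to map onto the CFA definition described in the commit message. `describeCfaEscape` is a hypothetical helper, not part of the patch. The `DW_CFA_def_cfa_expression` and `DW_OP_lit*` values come from the DWARF standard. The `DW_OP_LLVM_user` (0xe9) prefix and the `DW_OP_LLVM_form_aspace_address` (0x02) sub-opcode are assumed from LLVM's DWARF extension documentation and should be checked against it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard DWARF values.
constexpr uint8_t DW_CFA_def_cfa_expression = 0x0f;
constexpr uint8_t DW_OP_lit0 = 0x30;
constexpr uint8_t DW_OP_lit31 = 0x4f;
// Assumed from LLVM's DWARF extension docs; verify before relying on them.
constexpr uint8_t DW_OP_LLVM_user = 0xe9;
constexpr uint8_t DW_OP_LLVM_form_aspace_address = 0x02;

// Hypothetical helper: renders a CFI escape sequence as readable text.
// It only handles the opcodes above and a single-byte ULEB128 length.
std::string describeCfaEscape(const std::vector<uint8_t> &Bytes) {
  if (Bytes.size() < 2 || Bytes[0] != DW_CFA_def_cfa_expression)
    return "unsupported";
  const std::size_t Len = Bytes[1];
  if (Len >= 0x80 || Bytes.size() != 2 + Len)
    return "unsupported";

  std::string Out = "DW_CFA_def_cfa_expression:";
  for (std::size_t I = 2; I < Bytes.size(); ++I) {
    const uint8_t Op = Bytes[I];
    if (Op >= DW_OP_lit0 && Op <= DW_OP_lit31) {
      // DW_OP_litN pushes the literal N onto the expression stack.
      Out += " DW_OP_lit" + std::to_string(Op - DW_OP_lit0);
    } else if (Op == DW_OP_LLVM_user && I + 1 < Bytes.size() &&
               Bytes[I + 1] == DW_OP_LLVM_form_aspace_address) {
      // Pops an address space and an address, forms a memory location.
      Out += " DW_OP_LLVM_form_aspace_address";
      ++I; // consume the sub-opcode byte
    } else {
      return "unsupported";
    }
  }
  return Out;
}
```

Under these assumptions, the sequence reads as "push address 0, push address space 6, form a memory location", which matches the commit message's "literal private address space address" if 6 is the DWARF number for the private address space. The `undefined $pc_reg` line that follows each escape in the tests is the end-of-unwinding sentinel for the return address.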
Use nounwind to try to avoid cluttering tests
b3d52d3 to 638fa37
cbbe613 to 9bd44a8
@arsenm not sure if you also had these tests in mind for using (also, Graphite is broken for me, so I'm just manually managing the stack; ignore the Graphite comments for now)
The cases that require introducing the IR section and declarations are probably not worth it. We would need a way to directly set nounwind on the MIR function.
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -mattr=+wavefrontsize64 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefixes=FLATSCRW64,GFX12 %s

--- |
  define void @v_add_co_u32_e32__inline_imm__fi_offset0() #0 { unreachable }
Ouch, adding nounwind this way in MIR is too painful. Is there really no way to directly set this for MIR?
