Skip to content

[llvm-readobj][COFF] Implement --coff-pseudoreloc in llvm-readobj to dump runtime pseudo-relocation records #151816

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

kikairoya
Copy link
Contributor

MinGW toolchain uses "runtime pseudo-relocation" mechanism to support auto-importing symbols from DLLs.
There is no commonly used tools for dump the pseudo-relocation records, so we implement that functionality in llvm-readobj.

@llvmbot
Copy link
Member

llvmbot commented Aug 2, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Tomohiro Kashiwada (kikairoya)

Changes

MinGW toolchain uses "runtime pseudo-relocation" mechanism to support auto-importing symbols from DLLs.
There is no commonly used tools for dump the pseudo-relocation records, so we implement that functionality in llvm-readobj.


Full diff: https://github.com/llvm/llvm-project/pull/151816.diff

6 Files Affected:

  • (added) llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe ()
  • (added) llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test (+97)
  • (modified) llvm/tools/llvm-readobj/COFFDumper.cpp (+109)
  • (modified) llvm/tools/llvm-readobj/ObjDumper.h (+1)
  • (modified) llvm/tools/llvm-readobj/Opts.td (+3)
  • (modified) llvm/tools/llvm-readobj/llvm-readobj.cpp (+4)
diff --git a/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe b/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe
new file mode 100644
index 0000000000000..d4106e99d96f3
Binary files /dev/null and b/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe differ
diff --git a/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test b/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test
new file mode 100644
index 0000000000000..f3db464b4ae69
--- /dev/null
+++ b/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test
@@ -0,0 +1,97 @@
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC
+
+CHECK:      Format: COFF-i386
+CHECK-NEXT: Arch: i386
+CHECK-NEXT: AddressSize: 32bit
+CHECK-NEXT: PseudoReloc [
+CHECK-NEXT:  Entry {
+CHECK-NEXT:   Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   SymbolName: sym1
+CHECK-NEXT:   Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   BitWidth: {{[0-9]+}}
+CHECK-NEXT:  }
+CHECK-NEXT:  Entry {
+CHECK-NEXT:   Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   SymbolName: sym2
+CHECK-NEXT:   Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   BitWidth: {{[0-9]+}}
+CHECK-NEXT:  }
+CHECK-NEXT:  Entry {
+CHECK-NEXT:   Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   SymbolName: sym1
+CHECK-NEXT:   Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT:   BitWidth: {{[0-9]+}}
+CHECK-NEXT:  }
+CHECK-NEXT: ]
+
+NOSYM-NOT: PseudoReloc
+NOSYM: The symbol table has been stripped
+NOSYM-NOT: PseudoReloc
+
+NORELOC-NOT: PseudoReloc
+NORELOC: The symbols for runtime pseudo-relocation are not found
+NORELOC-NOT: PseudoReloc
+
+
+pseudoreloc.exe is generated by following script:
+
+#--- generate.sh
+llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.dll.s -o pseudoreloc.dll.o
+ld.lld -m i386pe --dll pseudoreloc.dll.o -o pseudoreloc.dll -entry=
+llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.s -o pseudoreloc.o
+ld.lld -m i386pe pseudoreloc.o pseudoreloc.dll -o pseudoreloc.exe -entry=start
+
+#--- pseudoreloc.dll.s
+    .data
+    .globl _sym1
+_sym1:
+    .long 0x11223344
+    .globl _sym2
+_sym2:
+    .long 0x55667788
+    .section .drectve
+    .ascii " -export:sym1,data "
+    .ascii " -export:sym2,data "
+    .addrsig
+
+#--- pseudoreloc.s
+    .text
+    .globl _start
+_start:
+    mov _local1b, %eax
+    movsb (%eax), %ecx
+    mov _local2, %eax
+    movsb (%eax), %edx
+    mov _local1a, %eax
+    movsb (%eax), %eax
+    add %edx, %eax
+    add %ecx, %eax
+    ret
+
+    .globl __pei386_runtime_relocator
+__pei386_runtime_relocator:
+    mov ___RUNTIME_PSEUDO_RELOC_LIST__, %eax
+    mov ___RUNTIME_PSEUDO_RELOC_LIST_END__, %ecx
+    sub %ecx, %eax
+    ret
+
+    .data
+    .globl  _local1a
+    .p2align 2
+_local1a:
+    .long _sym1+1
+
+    .globl _local2
+    .p2align 2
+_local2:
+    .long _sym2+1
+
+    .globl  _local1b
+    .p2align 2
+_local1b:
+    .long _sym1+3
+
+    .addrsig
+
diff --git a/llvm/tools/llvm-readobj/COFFDumper.cpp b/llvm/tools/llvm-readobj/COFFDumper.cpp
index 96e0a634648e4..45ca018b714f2 100644
--- a/llvm/tools/llvm-readobj/COFFDumper.cpp
+++ b/llvm/tools/llvm-readobj/COFFDumper.cpp
@@ -95,6 +95,7 @@ class COFFDumper : public ObjDumper {
   void printCOFFExports() override;
   void printCOFFDirectives() override;
   void printCOFFBaseReloc() override;
+  void printCOFFPseudoReloc() override;
   void printCOFFDebugDirectory() override;
   void printCOFFTLSDirectory() override;
   void printCOFFResources() override;
@@ -2000,6 +2001,114 @@ void COFFDumper::printCOFFBaseReloc() {
   }
 }
 
+void COFFDumper::printCOFFPseudoReloc() {
+  const StringRef RelocBeginName = Obj->getArch() == Triple::x86
+                                       ? "___RUNTIME_PSEUDO_RELOC_LIST__"
+                                       : "__RUNTIME_PSEUDO_RELOC_LIST__";
+  const StringRef RelocEndName = Obj->getArch() == Triple::x86
+                                     ? "___RUNTIME_PSEUDO_RELOC_LIST_END__"
+                                     : "__RUNTIME_PSEUDO_RELOC_LIST_END__";
+
+  COFFSymbolRef RelocBegin, RelocEnd;
+  auto Count = Obj->getNumberOfSymbols();
+  if (Count == 0) {
+    W.startLine() << "The symbol table has been stripped\n";
+    return;
+  }
+  for (auto i = 0u;
+       i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
+    auto Sym = Obj->getSymbol(i);
+    if (Sym.takeError())
+      continue;
+    auto Name = Obj->getSymbolName(*Sym);
+    if (Name.takeError())
+      continue;
+    if (*Name == RelocBeginName) {
+      if (Sym->getSectionNumber() > 0)
+        RelocBegin = *Sym;
+    } else if (*Name == RelocEndName) {
+      if (Sym->getSectionNumber() > 0)
+        RelocEnd = *Sym;
+    }
+  }
+  if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) {
+    W.startLine()
+        << "The symbols for runtime pseudo-relocation are not found\n";
+    return;
+  }
+
+  ArrayRef<uint8_t> Data;
+  auto Section = Obj->getSection(RelocBegin.getSectionNumber());
+  if (auto E = Section.takeError()) {
+    reportError(std::move(E), Obj->getFileName());
+    return;
+  }
+  if (auto E = Obj->getSectionContents(*Section, Data)) {
+    reportError(std::move(E), Obj->getFileName());
+    return;
+  }
+  ArrayRef<uint8_t> RawRelocs =
+      Data.take_front(RelocEnd.getValue()).drop_front(RelocBegin.getValue());
+  struct alignas(4) PseudoRelocationHeader {
+    uint32_t Zero1;
+    uint32_t Zero2;
+    uint32_t Signature;
+  };
+  static const PseudoRelocationHeader HeaderV2 = {0, 0, 1};
+  if (RawRelocs.size() < sizeof(HeaderV2) ||
+      (memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
+    reportWarning(
+        createStringError("Invalid runtime pseudo-relocation records"),
+        Obj->getFileName());
+    return;
+  }
+  struct alignas(4) PseudoRelocationRecord {
+    uint32_t Symbol;
+    uint32_t Target;
+    uint32_t BitSize;
+  };
+  ArrayRef<PseudoRelocationRecord> RelocRecords(
+      reinterpret_cast<const PseudoRelocationRecord *>(
+          RawRelocs.data() + sizeof(PseudoRelocationHeader)),
+      (RawRelocs.size() - sizeof(PseudoRelocationHeader)) /
+          sizeof(PseudoRelocationRecord));
+
+  // Cache of symbol searched at least once in IAT
+  DenseMap<uint32_t, StringRef> ImportedSymbols;
+
+  ListScope D(W, "PseudoReloc");
+  for (const auto &Reloc : RelocRecords) {
+    DictScope Entry(W, "Entry");
+    W.printHex("Symbol", Reloc.Symbol);
+
+    // find and print the pointed symbol from IAT
+    [&]() {
+      for (auto D : Obj->import_directories()) {
+        uint32_t RVA;
+        if (auto E = D.getImportAddressTableRVA(RVA))
+          reportError(std::move(E), Obj->getFileName());
+        if (Reloc.Symbol < RVA)
+          continue;
+        for (auto S : D.imported_symbols()) {
+          if (RVA == Reloc.Symbol) {
+            if (auto E = S.getSymbolName(ImportedSymbols[RVA]))
+              reportError(std::move(E), Obj->getFileName());
+            return;
+          }
+          RVA += Obj->is64() ? 8 : 4;
+        }
+      }
+    }();
+    if (auto Ite = ImportedSymbols.find(Reloc.Symbol);
+        Ite != ImportedSymbols.end()) {
+      W.printString("SymbolName", Ite->second);
+    }
+
+    W.printHex("Target", Reloc.Target);
+    W.printNumber("BitWidth", Reloc.BitSize);
+  }
+}
+
 void COFFDumper::printCOFFResources() {
   ListScope ResourcesD(W, "Resources");
   for (const SectionRef &S : Obj->sections()) {
diff --git a/llvm/tools/llvm-readobj/ObjDumper.h b/llvm/tools/llvm-readobj/ObjDumper.h
index 1dc29661f7178..a654078a770ff 100644
--- a/llvm/tools/llvm-readobj/ObjDumper.h
+++ b/llvm/tools/llvm-readobj/ObjDumper.h
@@ -146,6 +146,7 @@ class ObjDumper {
   virtual void printCOFFExports() { }
   virtual void printCOFFDirectives() { }
   virtual void printCOFFBaseReloc() { }
+  virtual void printCOFFPseudoReloc() {}
   virtual void printCOFFDebugDirectory() { }
   virtual void printCOFFTLSDirectory() {}
   virtual void printCOFFResources() {}
diff --git a/llvm/tools/llvm-readobj/Opts.td b/llvm/tools/llvm-readobj/Opts.td
index 48d43cc635a4f..d519e34a72983 100644
--- a/llvm/tools/llvm-readobj/Opts.td
+++ b/llvm/tools/llvm-readobj/Opts.td
@@ -82,6 +82,9 @@ def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView ty
 def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>;
 def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>;
 def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>;
+def coff_pseudoreloc
+    : FF<"coff-pseudoreloc", "Display runtime pseudo-relocations">,
+      Group<grp_coff>;
 def coff_debug_directory : FF<"coff-debug-directory", "Display debug directory">, Group<grp_coff>;
 def coff_directives : FF<"coff-directives", "Display .drectve section">, Group<grp_coff>;
 def coff_exports : FF<"coff-exports", "Display export table">, Group<grp_coff>;
diff --git a/llvm/tools/llvm-readobj/llvm-readobj.cpp b/llvm/tools/llvm-readobj/llvm-readobj.cpp
index 4c84ed701bb9a..2b34761b2cc6c 100644
--- a/llvm/tools/llvm-readobj/llvm-readobj.cpp
+++ b/llvm/tools/llvm-readobj/llvm-readobj.cpp
@@ -154,6 +154,7 @@ static bool CodeViewEnableGHash;
 static bool CodeViewMergedTypes;
 bool CodeViewSubsectionBytes;
 static bool COFFBaseRelocs;
+static bool COFFPseudoRelocs;
 static bool COFFDebugDirectory;
 static bool COFFDirectives;
 static bool COFFExports;
@@ -305,6 +306,7 @@ static void parseOptions(const opt::InputArgList &Args) {
   opts::CodeViewMergedTypes = Args.hasArg(OPT_codeview_merged_types);
   opts::CodeViewSubsectionBytes = Args.hasArg(OPT_codeview_subsection_bytes);
   opts::COFFBaseRelocs = Args.hasArg(OPT_coff_basereloc);
+  opts::COFFPseudoRelocs = Args.hasArg(OPT_coff_pseudoreloc);
   opts::COFFDebugDirectory = Args.hasArg(OPT_coff_debug_directory);
   opts::COFFDirectives = Args.hasArg(OPT_coff_directives);
   opts::COFFExports = Args.hasArg(OPT_coff_exports);
@@ -492,6 +494,8 @@ static void dumpObject(ObjectFile &Obj, ScopedPrinter &Writer,
       Dumper->printCOFFDirectives();
     if (opts::COFFBaseRelocs)
       Dumper->printCOFFBaseReloc();
+    if (opts::COFFPseudoRelocs)
+      Dumper->printCOFFPseudoReloc();
     if (opts::COFFDebugDirectory)
       Dumper->printCOFFDebugDirectory();
     if (opts::COFFTLSDirectory)

@kikairoya
Copy link
Contributor Author

While investigating #149639, we used the script previously created by @mstorsjo and @jeremyd2019 for dumping pseudo‑relocs.
So the tool has already been used at least twice — I bet a third time will come too.

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, this is very much appreciated! The script has indeed been useful a number of times. Some of the times I needed it, I also considered writing something like this, but in that case, most of the cases where I needed it, I wanted to find pseudo relocations that were too narrow (32 bit pseudo relocations in 64 bit binaries), and wanted the more precise diagnostics about where it was caused (which I ask about in a review comment here as well). But in the end, at the time, I ended up implementing that warning in LLD instead, in 6daa4b9.

@@ -0,0 +1,97 @@
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, it'd be nicer to synthesize the test binary from yaml with yaml2obj rather than checking in a binary. (We do have some binaries checked in from before, but we'd like to keep that to a minimum, e.g. for binaries that can't be synthesized with yaml2obj yet.)

I see that you have the full procedure included for regenerating the binary, that's nice and appreciated! If converting it to yaml, it's also somewhat customary to strip down the size of it by removing unnecessary data from it. Perhaps it's not necessary in this case if the payload of each section is only a couple dozens of bytes anyway though. But if it is, the instructions would unfortunately end with obj2yaml pseudoreloc.exe > pseudoreloc.exe.yaml # and manually strip down the .yaml file. But if there's not that much unnecessary in there, perhaps we don't need to strip it manually at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Using yaml2obj, I was able to edit the content using sed and easily add tests for errors.

CHECK-NEXT: }
CHECK-NEXT: ]

NOSYM-NOT: PseudoReloc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of needing to repeat the -NOT condition before/after the positive check that we do look for, FileCheck also has got an option --implicit-check-not which you can consider using.

CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
CHECK-NEXT: SymbolName: sym1
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
CHECK-NEXT: BitWidth: {{[0-9]+}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While these regex patterns are nice for accepting anything that we'd expect to output, it also makes the test kinda weak - we could start outputting different wrong addresses (and bitwidth sizes), without the test catching it - that's not ideal. So I think it would be good to check the actual numbers as well, to make sure the test catches any potential breakage in it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileCheck has the concept of numeric capture and substitutions and even simple maths like additions and subtractions. It sounds like this would be a good use-case for this functionality. See https://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-numeric-substitution-blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've rewritten to use numeric capture.

pseudoreloc.exe is generated by following script:

#--- generate.sh
llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.dll.s -o pseudoreloc.dll.o
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps include a split-file command here as well, for easier execution if one wants to try it?

@@ -82,6 +82,9 @@ def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView ty
def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>;
def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>;
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>;
def coff_pseudoreloc
: FF<"coff-pseudoreloc", "Display runtime pseudo-relocations">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should include the word "mingw" in the option description as well, as this isn't relevant for general PE-COFF?

i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC you can't just ignore the errors like this (the error classes has got a destructor that aborts if you haven't actually done anything with the error). See other similar functions here for ways of doing it; e.g. if (!Sym) { consumeError(Sym.takeError()); return; } if we just want to ignore the error, or if (!Sym) reportError(Sym.takeError(), Obj->getFileName()); if we want to print it.

These error classes are quite tricky to use in that sense, so ideally one would need to have tested triggering all of these error cases - and unfortunately it can probably be quite hard to actually force these to fail as well... Perhaps by hex editing a binary to make symbol string offsets out of bounds?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be a call to reportWarning with the error, or something along those lines IMO. Same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Warning about this potentially causes too noisy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you've realised this, but reportWarning only emits the first instance of each identical warning, so if the same warning were to be reported lots of times, only the first one will appear in the actual warning output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i + 100) to enforce to emit warnings, I got 6 completely same warning messages while dumping pseudoreloc.test.tmp.exe-x86_64 as following:

$ bin/llvm-readobj test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64 --coff-pseudoreloc

File: test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
PseudoReloc [
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': the marker symbols for runtime pseudo-relocation were not found
]

What does "each identical warning" mean?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, I got a bit mixed up.

ObjDumper (from which COFFDumper derives) has a reportUniqueWarning method, which you should use instead of reportWarning. There was discussion quite some time ago that we'd replace reportWarning entirely with reportUniqueWarning usage within llvm-readobj and I thought that had happened, but apparently not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'll use reportUniqueWarning. Thank you!

return;
}
ArrayRef<uint8_t> RawRelocs =
Data.take_front(RelocEnd.getValue()).drop_front(RelocBegin.getValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before subtracting RelocEnd from RelocBegin, we'd need to validate that they indeed point into the same section.

const PseudoRelocationHeader HeaderV2(1);
if (RawRelocs.size() < sizeof(HeaderV2) ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
reportWarning(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If RawRelocs.size() == 0, we probably should print a different message than this? (Not sure if the linker actually ever does produce that, and what the runtime would do with it.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime treat such case as a old header-less relocation block (V1). Should we support the V1 relocation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't need to bother with the V1 format. But if RawRelocs.size() == 0, then there's no data to interpret as V1 either, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, right. I should treat the case of RawRelocs.size() == 0 as just an empty relocation block.

if (auto E = D.getImportAddressTableRVA(RVA))
reportError(std::move(E), Obj->getFileName());
if (EntryRVA < RVA)
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a similar check we could do, if (EntryRVA > RVA + ImportAddressTableSize) to avoid iterating over the table, if we easily could see that the address is past the end of this table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the exact end position of the list cannot be calculated directly, I calculated an approximate value using a table list sorted by RVA.

ListScope D(W, "PseudoReloc");
for (const auto &Reloc : RelocRecords) {
DictScope Entry(W, "Entry");
W.printHex("Symbol", Reloc.Symbol);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nice to have feature here, which probably is out of scope for the initial version at least, would be to figure out which data block it belongs to (e.g. which function, or which variable contains the pseudo relocation). But as COFF symbols only have offset but not size, we can't probably do that easily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier than expected. This is because searching __RUNTIME_PSEUDO_RELOC_LIST__ requires iterating through the symbol table, which allows us to construct a RVA-to-symbol map.

@kikairoya
Copy link
Contributor Author

Thank you for the review! I’ll work on addressing your comments and suggestions.

CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
CHECK-NEXT: SymbolName: sym1
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
CHECK-NEXT: BitWidth: {{[0-9]+}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileCheck has the concept of numeric capture and substitutions and even simple maths like additions and subtractions. It sounds like this would be a good use-case for this functionality. See https://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-numeric-substitution-blocks.

i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be a call to reportWarning with the error, or something along those lines IMO. Same below.

ArrayRef<uint8_t> Data;
auto Section = Obj->getSection(RelocBegin.getSectionNumber());
if (auto E = Section.takeError()) {
reportError(std::move(E), Obj->getFileName());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer reportWarning over reportError in dumping code, because it allows execution to continue e.g. to run other dumping operations that have been requested. Same throughout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed all reportError to reportWarning.

if (RawRelocs.size() < sizeof(HeaderV2) ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
reportWarning(
createStringError("Invalid runtime pseudo-relocation records"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style guide says warnings and errors should start with lower-case letters.

Copy link
Contributor Author

@kikairoya kikairoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I've addressed the comments. Thanks!

: "__RUNTIME_PSEUDO_RELOC_LIST__";
const StringRef RelocEndName = Obj->getArch() == Triple::x86
? "___RUNTIME_PSEUDO_RELOC_LIST_END__"
: "__RUNTIME_PSEUDO_RELOC_LIST_END__";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the test for x86_64 and switched testee source to LLVM IR to support multiple arch.

@@ -0,0 +1,97 @@
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Using yaml2obj, I was able to edit the content using sed and easily add tests for errors.

ArrayRef<uint8_t> Data;
auto Section = Obj->getSection(RelocBegin.getSectionNumber());
if (auto E = Section.takeError()) {
reportError(std::move(E), Obj->getFileName());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed all reportError to reportWarning.

const PseudoRelocationHeader HeaderV2(1);
if (RawRelocs.size() < sizeof(HeaderV2) ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
reportWarning(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime treat such case as a old header-less relocation block (V1). Should we support the V1 relocation?

if (auto E = D.getImportAddressTableRVA(RVA))
reportError(std::move(E), Obj->getFileName());
if (EntryRVA < RVA)
continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the exact end position of the list cannot be calculated directly, I calculated an approximate value using a table list sorted by RVA.

ListScope D(W, "PseudoReloc");
for (const auto &Reloc : RelocRecords) {
DictScope Entry(W, "Entry");
W.printHex("Symbol", Reloc.Symbol);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier than expected. This is because searching __RUNTIME_PSEUDO_RELOC_LIST__ requires iterating through the symbol table, which allows us to construct a RVA-to-symbol map.

CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
CHECK-NEXT: SymbolName: sym1
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
CHECK-NEXT: BitWidth: {{[0-9]+}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've rewritten to use numeric capture.

void COFFDumper::printCOFFPseudoReloc() {
if (!Obj->getDOSHeader()) {
W.startLine()
<< "pseudo-relocation is only meaningful for a PE image file\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question - such this message should be a warning or a just printing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the ELF case, if a thing isn't meaningful, we often just emit an empty output. For example, try --dyn-symbols on an ELF ET_REL object. I haven't checked if that's the pattern with the COFF side of things though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made changes to exit silently with showing an empty list on these conditions:

  • not an exe/dll image file
  • the symbol table was stripped
  • the relocation list has no entry
  • the size of the relocation list is zero (begin == end)

and to warn with reportWarning on these conditions:

  • the marker of the pseudo relocation is absent from the symbol table
  • the header of relocation list is broken
  • the marker of the relocation list points out of section space
  • the size of the relocation list is negative
  • the symbol table is broken
    • the symbol table is truncated
    • the name of the symbol or the section can't be fetched
      (I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )
    • the section which the symbol belongs to is not found

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )

Optional, but you could (in a separate PR) look at enhancing yaml2obj to allow you to create such a corrupted binary. In ELF, we have several fields whose only real use-case is to allow us to test specific error cases. See for example tests that use fields like SHName and SHOffset in the ELF llvm-readobj tests.

@@ -81,12 +81,17 @@ def codeview : FF<"codeview", "Display CodeView debug information">, Group<grp_c
def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView type stream de-duplication">, Group<grp_coff>;
def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>;
def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>;
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>;
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">,
Group<grp_coff>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git clang-format folded this line. Should it be ignored?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's somewhat expected that the formatting tools can tweak surrounding lines a little bit - but in this case, it's kinda far away. So I'm ok with excluding this change (although the CI formatting checker might complain and want you to include it), but I'm also ok with including it.

@kikairoya
Copy link
Contributor Author

kikairoya commented Aug 6, 2025

The CI failure seems related to #152299; I can't find Linking CXX executable bin/llc from build log so perhaps is it reused across jobs?

RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC
REQUIRE: x86-registered-target
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
REQUIRE: x86-registered-target
REQUIRES: x86-registered-target

But why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's totally wrong spurious... I think I was confused about generating the yaml and running the test.

DEFINE: %{local} = 0x3000
DEFINE: %{addr1l} = E8
DEFINE: %{addr3l} = D8
RUN: yaml2obj %p/Inputs/pseudoreloc.x86_64.yaml | llvm-readobj --coff-pseudoreloc - 2>&1 \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to do yaml2obj ... -o %t then a separate llvm-readobj run command, as it means the intermediate object is left on disk, allowing for post-mortem test debugging.

Also for longer commands, I prefer to split by having the | at the end of the previous line, to show that the first line is continued, without needing to look at the next line, then indentation on the next to signify it's a continuation without needing to look at the line before.

NORELOC-NOT: PseudoReloc
NORELOC: The symbols for runtime pseudo-relocation are not found
NORELOC-NOT: PseudoReloc
; check that silently ignore when imported symbol name is not found
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment feels a bit garbled. Try "Test that llvm-readobj silently ignores missing imported symbol names." Also, please ensure comments follow standard English grammar, with capital letters and full stops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for my poor English. I very appreciate you correcting me!

And, as other erroneous conditions should be warned rather than ignored, this one also should be warned, right?

INVALIDSYMBOL: Symbol: 0xFFFF00
INVALIDSYMBOL-NEXT: Target: 0x[[#%X,LOCAL1A+8]]


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: get rid of extra blank line.

NORELOC: The symbols for runtime pseudo-relocation are not found
NORELOC-NOT: PseudoReloc
; check that silently ignore when imported symbol name is not found
RUN: sed -E -e '/Name: *\.rdata/,/Name:/{/SectionData:/{s/%{addr1l}200000/30000000/;s/%{addr3l}200000/00FFFF00/}}' %p/Inputs/pseudoreloc.i386.yaml \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels extremely complex for what is necessary. yaml2obj provides a -D option, which you can use to parameterise things, rather than trying to do complicated find/replace sed expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it's complex and hard to understand, but it's a binary patch relied on where the linker actually places symbols at, so it seems complexed sed-ing can't be avoided.
I had tried to make easier to read with moving sed into the yaml generator script.

};
std::map<uint32_t, SymbolEntry> RVASymbolMap;
COFFSymbolRef RelocBegin, RelocEnd;
for (auto i = 0u; i < Count; ++i) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LLVM style guide generally prefers to avoid auto. Please review the examples. I don't think it's appropriate here or immediately below.

auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
if (!Sym) {
consumeError(Sym.takeError());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's generally poor form to simply throw away errors. We should be reporting them with reportWarning, as I already said in a previous comment. Same applies throughout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to reportWarning.
However, if the image file is truly corrupted, it might emit a huge amount of warnings. I think such warnings should be reported by --symbols, not by --coff-pseudoreloc. Alternatively, it may be more appropriate to limit the reporting to once. May I ask your opinion?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment about how reportWarning deduplicates identical warnings. If the warnings aren't actually identical, but are noisy, there's probably a case for only reporting the first time (per input file).

}
if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) {
W.startLine()
<< "The symbols for runtime pseudo-relocation are not found\n";
<< "the symbols for runtime pseudo-relocation are not found\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely feels like it should be a warning. Also "were not found" not "are not found".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's usually suspicious state to be warned, but theoretically not.

  • Although link.exe (and lld-link without -debug:dwarf ) omits the symbol table, the PE spec allows a PE image with a symbol table.
  • With custom crt0.o and/or a custom linker script can omit the marker of the pseudo relocation.

That said, making this message a warning makes sense to me as those conditions are extremely rare.


if (RelocEnd.getValue() < RelocBegin.getValue()) {
reportWarning(createStringError("the symbols for runtime pseudo-relocation "
"don't consist a valid region"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warnings should generally include additional context, e.g. the incorrect values that were derived.

- fix position of line breaks
- name intermediate output binary
- remove a sprious and doubled empty lines
- make parameter more symbolic a bit
- use parametrized yaml
  * sed while generating source rather than testing
Copy link
Contributor Author

@kikairoya kikairoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review again!
I've made changes to address your comments and suggestions. Thanks!

RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC
REQUIRE: x86-registered-target
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's totally wrong spurious... I think I was confused about generating the yaml and running the test.

NORELOC-NOT: PseudoReloc
NORELOC: The symbols for runtime pseudo-relocation are not found
NORELOC-NOT: PseudoReloc
; check that silently ignore when imported symbol name is not found
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for my poor English. I very appreciate you correcting me!

And, as other erroneous conditions should be warned rather than ignored, this one also should be warned, right?

NORELOC: The symbols for runtime pseudo-relocation are not found
NORELOC-NOT: PseudoReloc
; check that silently ignore when imported symbol name is not found
RUN: sed -E -e '/Name: *\.rdata/,/Name:/{/SectionData:/{s/%{addr1l}200000/30000000/;s/%{addr3l}200000/00FFFF00/}}' %p/Inputs/pseudoreloc.i386.yaml \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it's complex and hard to understand, but it's a binary patch relied on where the linker actually places symbols at, so it seems complexed sed-ing can't be avoided.
I had tried to make easier to read with moving sed into the yaml generator script.

const PseudoRelocationHeader HeaderV2(1);
if (RawRelocs.size() < sizeof(HeaderV2) ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
reportWarning(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, right. I should treat the case of RawRelocs.size() == 0 as just an empty relocation block.

const coff_section *Section;
StringRef SymbolName;
};
std::map<uint32_t, SymbolEntry> RVASymbolMap;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've switched to use SmallVector and then stable_sort and unique.

void COFFDumper::printCOFFPseudoReloc() {
if (!Obj->getDOSHeader()) {
W.startLine()
<< "pseudo-relocation is only meaningful for a PE image file\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made changes to exit silently with showing an empty list on these conditions:

  • not an exe/dll image file
  • the symbol table was stripped
  • the relocation list has no entry
  • the size of the relocation list is zero (begin == end)

and to warn with reportWarning on these conditions:

  • the marker of the pseudo relocation is absent from the symbol table
  • the header of relocation list is broken
  • the marker of the relocation list points out of section space
  • the size of the relocation list is negative
  • the symbol table is broken
    • the symbol table is truncated
    • the name of the symbol or the section can't be fetched
      (I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )
    • the section which the symbol belongs to is not found

auto Count = Obj->getNumberOfSymbols();
if (Count == 0) {
W.startLine() << "The symbol table has been stripped\n";
W.startLine() << "the symbol table has been stripped\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be aligned with the "not a PE" case, returning silently with emitting an empty list.

auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
if (!Sym) {
consumeError(Sym.takeError());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to reportWarning.
However, if the image file is truly corrupted, it might emit a huge amount of warnings. I think such warnings should be reported by --symbols, not by --coff-pseudoreloc. Alternatively, it may be more appropriate to limit the reporting to once. May I ask your opinion?

i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Warning about this potentially causes too noisy.

}
if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) {
W.startLine()
<< "The symbols for runtime pseudo-relocation are not found\n";
<< "the symbols for runtime pseudo-relocation are not found\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's usually suspicious state to be warned, but theoretically not.

  • Although link.exe (and lld-link without -debug:dwarf ) omits the symbol table, the PE spec allows a PE image with a symbol table.
  • With custom crt0.o and/or a custom linker script can omit the marker of the pseudo relocation.

That said, making this message a warning makes sense to me as those conditions are extremely rare.

Copy link

github-actions bot commented Aug 11, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not done a full review this time around, but there are plenty of things to address nonetheless. Thanks for working on this!



pseudoreloc.*.yaml is generated by following script:
To regenerate Inputs/pseudoreloc.*.yaml, run following one-liner and review actual address map:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment marker here for readability.


# Ensure the binaries generated from parameterized yaml and original one are exact same.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Ensure the binaries generated from parameterized yaml and original one are exact same.
# Ensure the binaries generated from the parameterized yaml and original one are exactly the same.

W.startLine()
<< "pseudo-relocation is only meaningful for a PE image file\n";
ListScope D(W, "PseudoReloc");
W.flush();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of the flush?

ListScope D(W, "PseudoReloc");
W.flush();

// Pseudo-relocation is only meaningful for a PE image file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "pseudo-relocation" here referring to the general concept or specifically the option? I'm guessing it probably doesn't matter, but it impacts how you write this comment. If referring to the option, use the option name, including the "--" prefix. Otherwise, I'd say "Pseudo-relocations are only meaningful with PE image files."

COFFSymbolRef Sym;
if (Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i))
Sym = *SymOrErr;
else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style says if one part of an if/else needs braces, they all should have them.

for (const auto &Reloc : RelocRecords) {
DictScope Entry(W, "Entry");

W.printHex("Symbol", Reloc.Symbol);
if (const auto *Sym = ImportedSymbols.find(Reloc.Symbol))
W.printString("SymbolName", *Sym);
if (Expected<StringRef> SymOrErr = ImportedSymbols.find(Reloc.Symbol))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above re. braces.


W.printHex("Target", Reloc.Target);
if (auto Ite = RVASymbolMap.upper_bound(Reloc.Target.value());
if (auto Ite = llvm::upper_bound(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above re. braces.

i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you've realised this, but reportWarning only emits the first instance of each identical warning, so if the same warning were to be reported lots of times, only the first one will appear in the actual warning output.

void COFFDumper::printCOFFPseudoReloc() {
if (!Obj->getDOSHeader()) {
W.startLine()
<< "pseudo-relocation is only meaningful for a PE image file\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )

Optional, but you could (in a separate PR) look at enhancing yaml2obj to allow you to create such a corrupted binary. In ELF, we have several fields whose only real use-case is to allow us to test specific error cases. See for example tests that use fields like SHName and SHOffset in the ELF llvm-readobj tests.

auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
if (!Sym) {
consumeError(Sym.takeError());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment about how reportWarning deduplicates identical warnings. If the warnings aren't actually identical, but are noisy, there's probably a case for only reporting the first time (per input file).

Copy link
Contributor Author

@kikairoya kikairoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you the reviewing a lot!
I've made a change for the wording etc, only contains NFC portions.
I'm working on addressing the remaining comments.

i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
auto Sym = Obj->getSymbol(i);
if (Sym.takeError())
continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i + 100) to enforce to emit warnings, I got 6 completely same warning messages while dumping pseudoreloc.test.tmp.exe-x86_64 as following:

$ bin/llvm-readobj test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64 --coff-pseudoreloc

File: test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
PseudoReloc [
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': the marker symbols for runtime pseudo-relocation were not found
]

What does "each identical warning" mean?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants