-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[llvm-readobj][COFF] Implement --coff-pseudoreloc in llvm-readobj to dump runtime pseudo-relocation records #151816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-binary-utilities Author: Tomohiro Kashiwada (kikairoya) ChangesMinGW toolchain uses "runtime pseudo-relocation" mechanism to support auto-importing symbols from DLLs. Full diff: https://github.com/llvm/llvm-project/pull/151816.diff 6 Files Affected:
diff --git a/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe b/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe
new file mode 100644
index 0000000000000..d4106e99d96f3
Binary files /dev/null and b/llvm/test/tools/llvm-readobj/COFF/Inputs/pseudoreloc.exe differ
diff --git a/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test b/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test
new file mode 100644
index 0000000000000..f3db464b4ae69
--- /dev/null
+++ b/llvm/test/tools/llvm-readobj/COFF/pseudoreloc.test
@@ -0,0 +1,97 @@
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM
+RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC
+
+CHECK: Format: COFF-i386
+CHECK-NEXT: Arch: i386
+CHECK-NEXT: AddressSize: 32bit
+CHECK-NEXT: PseudoReloc [
+CHECK-NEXT: Entry {
+CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: SymbolName: sym1
+CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: BitWidth: {{[0-9]+}}
+CHECK-NEXT: }
+CHECK-NEXT: Entry {
+CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: SymbolName: sym2
+CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: BitWidth: {{[0-9]+}}
+CHECK-NEXT: }
+CHECK-NEXT: Entry {
+CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: SymbolName: sym1
+CHECK-NEXT: Target: 0x{{[0-9A-Z]+}}
+CHECK-NEXT: BitWidth: {{[0-9]+}}
+CHECK-NEXT: }
+CHECK-NEXT: ]
+
+NOSYM-NOT: PseudoReloc
+NOSYM: The symbol table has been stripped
+NOSYM-NOT: PseudoReloc
+
+NORELOC-NOT: PseudoReloc
+NORELOC: The symbols for runtime pseudo-relocation are not found
+NORELOC-NOT: PseudoReloc
+
+
+pseudoreloc.exe is generated by following script:
+
+#--- generate.sh
+llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.dll.s -o pseudoreloc.dll.o
+ld.lld -m i386pe --dll pseudoreloc.dll.o -o pseudoreloc.dll -entry=
+llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.s -o pseudoreloc.o
+ld.lld -m i386pe pseudoreloc.o pseudoreloc.dll -o pseudoreloc.exe -entry=start
+
+#--- pseudoreloc.dll.s
+ .data
+ .globl _sym1
+_sym1:
+ .long 0x11223344
+ .globl _sym2
+_sym2:
+ .long 0x55667788
+ .section .drectve
+ .ascii " -export:sym1,data "
+ .ascii " -export:sym2,data "
+ .addrsig
+
+#--- pseudoreloc.s
+ .text
+ .globl _start
+_start:
+ mov _local1b, %eax
+ movsb (%eax), %ecx
+ mov _local2, %eax
+ movsb (%eax), %edx
+ mov _local1a, %eax
+ movsb (%eax), %eax
+ add %edx, %eax
+ add %ecx, %eax
+ ret
+
+ .globl __pei386_runtime_relocator
+__pei386_runtime_relocator:
+ mov ___RUNTIME_PSEUDO_RELOC_LIST__, %eax
+ mov ___RUNTIME_PSEUDO_RELOC_LIST_END__, %ecx
+ sub %ecx, %eax
+ ret
+
+ .data
+ .globl _local1a
+ .p2align 2
+_local1a:
+ .long _sym1+1
+
+ .globl _local2
+ .p2align 2
+_local2:
+ .long _sym2+1
+
+ .globl _local1b
+ .p2align 2
+_local1b:
+ .long _sym1+3
+
+ .addrsig
+
diff --git a/llvm/tools/llvm-readobj/COFFDumper.cpp b/llvm/tools/llvm-readobj/COFFDumper.cpp
index 96e0a634648e4..45ca018b714f2 100644
--- a/llvm/tools/llvm-readobj/COFFDumper.cpp
+++ b/llvm/tools/llvm-readobj/COFFDumper.cpp
@@ -95,6 +95,7 @@ class COFFDumper : public ObjDumper {
void printCOFFExports() override;
void printCOFFDirectives() override;
void printCOFFBaseReloc() override;
+ void printCOFFPseudoReloc() override;
void printCOFFDebugDirectory() override;
void printCOFFTLSDirectory() override;
void printCOFFResources() override;
@@ -2000,6 +2001,114 @@ void COFFDumper::printCOFFBaseReloc() {
}
}
+void COFFDumper::printCOFFPseudoReloc() {
+ const StringRef RelocBeginName = Obj->getArch() == Triple::x86
+ ? "___RUNTIME_PSEUDO_RELOC_LIST__"
+ : "__RUNTIME_PSEUDO_RELOC_LIST__";
+ const StringRef RelocEndName = Obj->getArch() == Triple::x86
+ ? "___RUNTIME_PSEUDO_RELOC_LIST_END__"
+ : "__RUNTIME_PSEUDO_RELOC_LIST_END__";
+
+ COFFSymbolRef RelocBegin, RelocEnd;
+ auto Count = Obj->getNumberOfSymbols();
+ if (Count == 0) {
+ W.startLine() << "The symbol table has been stripped\n";
+ return;
+ }
+ for (auto i = 0u;
+ i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) {
+ auto Sym = Obj->getSymbol(i);
+ if (Sym.takeError())
+ continue;
+ auto Name = Obj->getSymbolName(*Sym);
+ if (Name.takeError())
+ continue;
+ if (*Name == RelocBeginName) {
+ if (Sym->getSectionNumber() > 0)
+ RelocBegin = *Sym;
+ } else if (*Name == RelocEndName) {
+ if (Sym->getSectionNumber() > 0)
+ RelocEnd = *Sym;
+ }
+ }
+ if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) {
+ W.startLine()
+ << "The symbols for runtime pseudo-relocation are not found\n";
+ return;
+ }
+
+ ArrayRef<uint8_t> Data;
+ auto Section = Obj->getSection(RelocBegin.getSectionNumber());
+ if (auto E = Section.takeError()) {
+ reportError(std::move(E), Obj->getFileName());
+ return;
+ }
+ if (auto E = Obj->getSectionContents(*Section, Data)) {
+ reportError(std::move(E), Obj->getFileName());
+ return;
+ }
+ ArrayRef<uint8_t> RawRelocs =
+ Data.take_front(RelocEnd.getValue()).drop_front(RelocBegin.getValue());
+ struct alignas(4) PseudoRelocationHeader {
+ uint32_t Zero1;
+ uint32_t Zero2;
+ uint32_t Signature;
+ };
+ static const PseudoRelocationHeader HeaderV2 = {0, 0, 1};
+ if (RawRelocs.size() < sizeof(HeaderV2) ||
+ (memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) {
+ reportWarning(
+ createStringError("Invalid runtime pseudo-relocation records"),
+ Obj->getFileName());
+ return;
+ }
+ struct alignas(4) PseudoRelocationRecord {
+ uint32_t Symbol;
+ uint32_t Target;
+ uint32_t BitSize;
+ };
+ ArrayRef<PseudoRelocationRecord> RelocRecords(
+ reinterpret_cast<const PseudoRelocationRecord *>(
+ RawRelocs.data() + sizeof(PseudoRelocationHeader)),
+ (RawRelocs.size() - sizeof(PseudoRelocationHeader)) /
+ sizeof(PseudoRelocationRecord));
+
+ // Cache of symbol searched at least once in IAT
+ DenseMap<uint32_t, StringRef> ImportedSymbols;
+
+ ListScope D(W, "PseudoReloc");
+ for (const auto &Reloc : RelocRecords) {
+ DictScope Entry(W, "Entry");
+ W.printHex("Symbol", Reloc.Symbol);
+
+ // find and print the pointed symbol from IAT
+ [&]() {
+ for (auto D : Obj->import_directories()) {
+ uint32_t RVA;
+ if (auto E = D.getImportAddressTableRVA(RVA))
+ reportError(std::move(E), Obj->getFileName());
+ if (Reloc.Symbol < RVA)
+ continue;
+ for (auto S : D.imported_symbols()) {
+ if (RVA == Reloc.Symbol) {
+ if (auto E = S.getSymbolName(ImportedSymbols[RVA]))
+ reportError(std::move(E), Obj->getFileName());
+ return;
+ }
+ RVA += Obj->is64() ? 8 : 4;
+ }
+ }
+ }();
+ if (auto Ite = ImportedSymbols.find(Reloc.Symbol);
+ Ite != ImportedSymbols.end()) {
+ W.printString("SymbolName", Ite->second);
+ }
+
+ W.printHex("Target", Reloc.Target);
+ W.printNumber("BitWidth", Reloc.BitSize);
+ }
+}
+
void COFFDumper::printCOFFResources() {
ListScope ResourcesD(W, "Resources");
for (const SectionRef &S : Obj->sections()) {
diff --git a/llvm/tools/llvm-readobj/ObjDumper.h b/llvm/tools/llvm-readobj/ObjDumper.h
index 1dc29661f7178..a654078a770ff 100644
--- a/llvm/tools/llvm-readobj/ObjDumper.h
+++ b/llvm/tools/llvm-readobj/ObjDumper.h
@@ -146,6 +146,7 @@ class ObjDumper {
virtual void printCOFFExports() { }
virtual void printCOFFDirectives() { }
virtual void printCOFFBaseReloc() { }
+ virtual void printCOFFPseudoReloc() {}
virtual void printCOFFDebugDirectory() { }
virtual void printCOFFTLSDirectory() {}
virtual void printCOFFResources() {}
diff --git a/llvm/tools/llvm-readobj/Opts.td b/llvm/tools/llvm-readobj/Opts.td
index 48d43cc635a4f..d519e34a72983 100644
--- a/llvm/tools/llvm-readobj/Opts.td
+++ b/llvm/tools/llvm-readobj/Opts.td
@@ -82,6 +82,9 @@ def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView ty
def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>;
def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>;
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>;
+def coff_pseudoreloc
+ : FF<"coff-pseudoreloc", "Display runtime pseudo-relocations">,
+ Group<grp_coff>;
def coff_debug_directory : FF<"coff-debug-directory", "Display debug directory">, Group<grp_coff>;
def coff_directives : FF<"coff-directives", "Display .drectve section">, Group<grp_coff>;
def coff_exports : FF<"coff-exports", "Display export table">, Group<grp_coff>;
diff --git a/llvm/tools/llvm-readobj/llvm-readobj.cpp b/llvm/tools/llvm-readobj/llvm-readobj.cpp
index 4c84ed701bb9a..2b34761b2cc6c 100644
--- a/llvm/tools/llvm-readobj/llvm-readobj.cpp
+++ b/llvm/tools/llvm-readobj/llvm-readobj.cpp
@@ -154,6 +154,7 @@ static bool CodeViewEnableGHash;
static bool CodeViewMergedTypes;
bool CodeViewSubsectionBytes;
static bool COFFBaseRelocs;
+static bool COFFPseudoRelocs;
static bool COFFDebugDirectory;
static bool COFFDirectives;
static bool COFFExports;
@@ -305,6 +306,7 @@ static void parseOptions(const opt::InputArgList &Args) {
opts::CodeViewMergedTypes = Args.hasArg(OPT_codeview_merged_types);
opts::CodeViewSubsectionBytes = Args.hasArg(OPT_codeview_subsection_bytes);
opts::COFFBaseRelocs = Args.hasArg(OPT_coff_basereloc);
+ opts::COFFPseudoRelocs = Args.hasArg(OPT_coff_pseudoreloc);
opts::COFFDebugDirectory = Args.hasArg(OPT_coff_debug_directory);
opts::COFFDirectives = Args.hasArg(OPT_coff_directives);
opts::COFFExports = Args.hasArg(OPT_coff_exports);
@@ -492,6 +494,8 @@ static void dumpObject(ObjectFile &Obj, ScopedPrinter &Writer,
Dumper->printCOFFDirectives();
if (opts::COFFBaseRelocs)
Dumper->printCOFFBaseReloc();
+ if (opts::COFFPseudoRelocs)
+ Dumper->printCOFFPseudoReloc();
if (opts::COFFDebugDirectory)
Dumper->printCOFFDebugDirectory();
if (opts::COFFTLSDirectory)
|
While investigating #149639, we used the script previously created by @mstorsjo and @jeremyd2019 for dumping pseudo‑relocs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, this is very much appreciated! The script has indeed been useful a number of times. Some of the times I needed it, I also considered writing something like this, but in that case, most of the cases where I needed it, I wanted to find pseudo relocations that were too narrow (32 bit pseudo relocations in 64 bit binaries), and wanted the more precise diagnostics about where it was caused (which I ask about in a review comment here as well). But in the end, at the time, I ended up implementing that warning in LLD instead, in 6daa4b9.
@@ -0,0 +1,97 @@ | |||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, it'd be nicer to synthesize the test binary from yaml with yaml2obj
rather than checking in a binary. (We do have some binaries checked in from before, but we'd like to keep that to a minimum, e.g. for binaries that can't be synthesized with yaml2obj
yet.)
I see that you have the full procedure included for regenerating the binary, that's nice and appreciated! If converting it to yaml, it's also somewhat customary to strip down the size of it by removing unnecessary data from it. Perhaps it's not necessary in this case if the payload of each section is only a couple dozens of bytes anyway though. But if it is, the instructions would unfortunately end with obj2yaml pseudoreloc.exe > pseudoreloc.exe.yaml # and manually strip down the .yaml file
. But if there's not that much unnecessary in there, perhaps we don't need to strip it manually at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Using yaml2obj, I was able to edit the content using sed
and easily add tests for errors.
CHECK-NEXT: } | ||
CHECK-NEXT: ] | ||
|
||
NOSYM-NOT: PseudoReloc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of needing to repeat the -NOT
condition before/after the positive check that we do look for, FileCheck
also has got an option --implicit-check-not
which you can consider using.
CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: SymbolName: sym1 | ||
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: BitWidth: {{[0-9]+}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While these regex patterns are nice for accepting anything that we'd expect to output, it also makes the test kinda weak - we could start outputting different wrong addresses (and bitwidth sizes), without the test catching it - that's not ideal. So I think it would be good to check the actual numbers as well, to make sure the test catches any potential breakage in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileCheck has the concept of numeric capture and substitutions and even simple maths like additions and subtractions. It sounds like this would be a good use-case for this functionality. See https://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-numeric-substitution-blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I've rewritten to use numeric capture.
pseudoreloc.exe is generated by following script: | ||
|
||
#--- generate.sh | ||
llvm-mc -triple i386-mingw32 -filetype obj pseudoreloc.dll.s -o pseudoreloc.dll.o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps include a split-file
command here as well, for easier execution if one wants to try it?
llvm/tools/llvm-readobj/Opts.td
Outdated
@@ -82,6 +82,9 @@ def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView ty | |||
def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>; | |||
def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>; | |||
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>; | |||
def coff_pseudoreloc | |||
: FF<"coff-pseudoreloc", "Display runtime pseudo-relocations">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this should include the word "mingw" in the option description as well, as this isn't relevant for general PE-COFF?
i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) { | ||
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC you can't just ignore the errors like this (the error classes has got a destructor that aborts if you haven't actually done anything with the error). See other similar functions here for ways of doing it; e.g. if (!Sym) { consumeError(Sym.takeError()); return; }
if we just want to ignore the error, or if (!Sym) reportError(Sym.takeError(), Obj->getFileName());
if we want to print it.
These error classes are quite tricky to use in that sense, so ideally one would need to have tested triggering all of these error cases - and unfortunately it can probably be quite hard to actually force these to fail as well... Perhaps by hex editing a binary to make symbol string offsets out of bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be a call to reportWarning
with the error, or something along those lines IMO. Same below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Warning about this potentially causes too noisy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you've realised this, but reportWarning
only emits the first instance of each identical warning, so if the same warning were to be reported lots of times, only the first one will appear in the actual warning output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i + 100)
to enforce to emit warnings, I got 6 completely same warning messages while dumping pseudoreloc.test.tmp.exe-x86_64
as following:
$ bin/llvm-readobj test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64 --coff-pseudoreloc
File: test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
PseudoReloc [
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': the marker symbols for runtime pseudo-relocation were not found
]
What does "each identical warning" mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies, I got a bit mixed up.
ObjDumper
(from which COFFDumper
derives) has a reportUniqueWarning
method, which you should use instead of reportWarning
. There was discussion quite some time ago that we'd replace reportWarning
entirely with reportUniqueWarning
usage within llvm-readobj and I thought that had happened, but apparently not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I'll use reportUniqueWarning
. Thank you!
return; | ||
} | ||
ArrayRef<uint8_t> RawRelocs = | ||
Data.take_front(RelocEnd.getValue()).drop_front(RelocBegin.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before subtracting RelocEnd
from RelocBegin
, we'd need to validate that they indeed point into the same section.
const PseudoRelocationHeader HeaderV2(1); | ||
if (RawRelocs.size() < sizeof(HeaderV2) || | ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) { | ||
reportWarning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If RawRelocs.size() == 0
, we probably should print a different message than this? (Not sure if the linker actually ever does produce that, and what the runtime would do with it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The runtime treat such case as a old header-less relocation block (V1). Should we support the V1 relocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't need to bother with the V1 format. But if RawRelocs.size() == 0
, then there's no data to interpret as V1 either, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, right. I should treat the case of RawRelocs.size() == 0
as just an empty relocation block.
if (auto E = D.getImportAddressTableRVA(RVA)) | ||
reportError(std::move(E), Obj->getFileName()); | ||
if (EntryRVA < RVA) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a similar check we could do, if (EntryRVA > RVA + ImportAddressTableSize)
to avoid iterating over the table, if we easily could see that the address is past the end of this table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the exact end position of the list cannot be calculated directly, I calculated an approximate value using a table list sorted by RVA.
ListScope D(W, "PseudoReloc"); | ||
for (const auto &Reloc : RelocRecords) { | ||
DictScope Entry(W, "Entry"); | ||
W.printHex("Symbol", Reloc.Symbol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another nice to have feature here, which probably is out of scope for the initial version at least, would be to figure out which data block it belongs to (e.g. which function, or which variable contains the pseudo relocation). But as COFF symbols only have offset but not size, we can't probably do that easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easier than expected. This is because searching __RUNTIME_PSEUDO_RELOC_LIST__
requires iterating through the symbol table, which allows us to construct a RVA-to-symbol map.
Thank you for the review! I’ll work on addressing your comments and suggestions. |
CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: SymbolName: sym1 | ||
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: BitWidth: {{[0-9]+}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileCheck has the concept of numeric capture and substitutions and even simple maths like additions and subtractions. It sounds like this would be a good use-case for this functionality. See https://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-numeric-substitution-blocks.
i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) { | ||
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be a call to reportWarning
with the error, or something along those lines IMO. Same below.
ArrayRef<uint8_t> Data; | ||
auto Section = Obj->getSection(RelocBegin.getSectionNumber()); | ||
if (auto E = Section.takeError()) { | ||
reportError(std::move(E), Obj->getFileName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer reportWarning
over reportError
in dumping code, because it allows execution to continue e.g. to run other dumping operations that have been requested. Same throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed all reportError
to reportWarning
.
if (RawRelocs.size() < sizeof(HeaderV2) || | ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) { | ||
reportWarning( | ||
createStringError("Invalid runtime pseudo-relocation records"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM style guide says warnings and errors should start with lower-case letters.
- ignore properly - use reportWarning instead of reportError - uncapitalize message
- use --implicit-check-not - use obj2yaml - use numeric capture pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I've addressed the comments. Thanks!
: "__RUNTIME_PSEUDO_RELOC_LIST__"; | ||
const StringRef RelocEndName = Obj->getArch() == Triple::x86 | ||
? "___RUNTIME_PSEUDO_RELOC_LIST_END__" | ||
: "__RUNTIME_PSEUDO_RELOC_LIST_END__"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the test for x86_64 and switched testee source to LLVM IR to support multiple arch.
@@ -0,0 +1,97 @@ | |||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Using yaml2obj, I was able to edit the content using sed
and easily add tests for errors.
ArrayRef<uint8_t> Data; | ||
auto Section = Obj->getSection(RelocBegin.getSectionNumber()); | ||
if (auto E = Section.takeError()) { | ||
reportError(std::move(E), Obj->getFileName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed all reportError
to reportWarning
.
const PseudoRelocationHeader HeaderV2(1); | ||
if (RawRelocs.size() < sizeof(HeaderV2) || | ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) { | ||
reportWarning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The runtime treat such case as a old header-less relocation block (V1). Should we support the V1 relocation?
if (auto E = D.getImportAddressTableRVA(RVA)) | ||
reportError(std::move(E), Obj->getFileName()); | ||
if (EntryRVA < RVA) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the exact end position of the list cannot be calculated directly, I calculated an approximate value using a table list sorted by RVA.
ListScope D(W, "PseudoReloc"); | ||
for (const auto &Reloc : RelocRecords) { | ||
DictScope Entry(W, "Entry"); | ||
W.printHex("Symbol", Reloc.Symbol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easier than expected. This is because searching __RUNTIME_PSEUDO_RELOC_LIST__
requires iterating through the symbol table, which allows us to construct a RVA-to-symbol map.
CHECK-NEXT: Symbol: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: SymbolName: sym1 | ||
CHECK-NEXT: Target: 0x{{[0-9A-Z]+}} | ||
CHECK-NEXT: BitWidth: {{[0-9]+}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I've rewritten to use numeric capture.
void COFFDumper::printCOFFPseudoReloc() { | ||
if (!Obj->getDOSHeader()) { | ||
W.startLine() | ||
<< "pseudo-relocation is only meaningful for a PE image file\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question - such this message should be a warning or a just printing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the ELF case, if a thing isn't meaningful, we often just emit an empty output. For example, try --dyn-symbols
on an ELF ET_REL object. I haven't checked if that's the pattern with the COFF side of things though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made changes to exit silently with showing an empty list on these conditions:
- not an exe/dll image file
- the symbol table was stripped
- the relocation list has no entry
- the size of the relocation list is zero (begin == end)
and to warn with reportWarning
on these conditions:
- the marker of the pseudo relocation is absent from the symbol table
- the header of relocation list is broken
- the marker of the relocation list points out of section space
- the size of the relocation list is negative
- the symbol table is broken
- the symbol table is truncated
- the name of the symbol or the section can't be fetched
(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. ) - the section which the symbol belongs to is not found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )
Optional, but you could (in a separate PR) look at enhancing yaml2obj to allow you to create such a corrupted binary. In ELF, we have several fields whose only real use-case is to allow us to test specific error cases. See for example tests that use fields like SHName
and SHOffset
in the ELF llvm-readobj tests.
llvm/tools/llvm-readobj/Opts.td
Outdated
@@ -81,12 +81,17 @@ def codeview : FF<"codeview", "Display CodeView debug information">, Group<grp_c | |||
def codeview_ghash : FF<"codeview-ghash", "Enable global hashing for CodeView type stream de-duplication">, Group<grp_coff>; | |||
def codeview_merged_types : FF<"codeview-merged-types", "Display the merged CodeView type stream">, Group<grp_coff>; | |||
def codeview_subsection_bytes : FF<"codeview-subsection-bytes", "Dump raw contents of codeview debug sections and records">, Group<grp_coff>; | |||
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, Group<grp_coff>; | |||
def coff_basereloc : FF<"coff-basereloc", "Display .reloc section">, | |||
Group<grp_coff>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git clang-format
folded this line. Should it be ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's somewhat expected that the formatting tools can tweak surrounding lines a little bit - but in this case, it's kinda far away. So I'm ok with excluding this change (although the CI formatting checker might complain and want you to include it), but I'm also ok with including it.
The CI failure seems related to #152299; I can't find |
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s | ||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM | ||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC | ||
REQUIRE: x86-registered-target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REQUIRE: x86-registered-target | |
REQUIRES: x86-registered-target |
But why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's totally wrong spurious... I think I was confused about generating the yaml and running the test.
DEFINE: %{local} = 0x3000 | ||
DEFINE: %{addr1l} = E8 | ||
DEFINE: %{addr3l} = D8 | ||
RUN: yaml2obj %p/Inputs/pseudoreloc.x86_64.yaml | llvm-readobj --coff-pseudoreloc - 2>&1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to do yaml2obj ... -o %t
then a separate llvm-readobj
run command, as it means the intermediate object is left on disk, allowing for post-mortem test debugging.
Also for longer commands, I prefer to split by having the |
at the end of the previous line, to show that the first line is continued, without needing to look at the next line, then indentation on the next to signify it's a continuation without needing to look at the line before.
NORELOC-NOT: PseudoReloc | ||
NORELOC: The symbols for runtime pseudo-relocation are not found | ||
NORELOC-NOT: PseudoReloc | ||
; check that silently ignore when imported symbol name is not found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment feels a bit garbled. Try "Test that llvm-readobj silently ignores missing imported symbol names." Also, please ensure comments follow standard English grammar, with capital letters and full stops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for my poor English. I very appreciate you correcting me!
And, as other erroneous conditions should be warned rather than ignored, this one also should be warned, right?
INVALIDSYMBOL: Symbol: 0xFFFF00 | ||
INVALIDSYMBOL-NEXT: Target: 0x[[#%X,LOCAL1A+8]] | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: get rid of extra blank line.
NORELOC: The symbols for runtime pseudo-relocation are not found | ||
NORELOC-NOT: PseudoReloc | ||
; check that silently ignore when imported symbol name is not found | ||
RUN: sed -E -e '/Name: *\.rdata/,/Name:/{/SectionData:/{s/%{addr1l}200000/30000000/;s/%{addr3l}200000/00FFFF00/}}' %p/Inputs/pseudoreloc.i386.yaml \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels extremely complex for what is necessary. yaml2obj provides a -D option, which you can use to parameterise things, rather than trying to do complicated find/replace sed expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think it's complex and hard to understand, but it's a binary patch relied on where the linker actually places symbols at, so it seems complexed sed
-ing can't be avoided.
I had tried to make easier to read with moving sed into the yaml generator script.
}; | ||
std::map<uint32_t, SymbolEntry> RVASymbolMap; | ||
COFFSymbolRef RelocBegin, RelocEnd; | ||
for (auto i = 0u; i < Count; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LLVM style guide generally prefers to avoid auto. Please review the examples. I don't think it's appropriate here or immediately below.
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
if (!Sym) { | ||
consumeError(Sym.takeError()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally poor form to simply throw away errors. We should be reporting them with reportWarning
, as I already said in a previous comment. Same applies throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to reportWarning
.
However, if the image file is truly corrupted, it might emit a huge amount of warnings. I think such warnings should be reported by --symbols
, not by --coff-pseudoreloc
. Alternatively, it may be more appropriate to limit the reporting to once. May I ask your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment about how reportWarning
deduplicates identical warnings. If the warnings aren't actually identical, but are noisy, there's probably a case for only reporting the first time (per input file).
} | ||
if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) { | ||
W.startLine() | ||
<< "The symbols for runtime pseudo-relocation are not found\n"; | ||
<< "the symbols for runtime pseudo-relocation are not found\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely feels like it should be a warning. Also "were not found" not "are not found".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually suspicious state to be warned, but theoretically not.
- Although
link.exe
(andlld-link
without-debug:dwarf
) omits the symbol table, the PE spec allows a PE image with a symbol table. - With custom
crt0.o
and/or a custom linker script can omit the marker of the pseudo relocation.
That said, making this message a warning makes sense to me as those conditions are extremely rare.
|
||
if (RelocEnd.getValue() < RelocBegin.getValue()) { | ||
reportWarning(createStringError("the symbols for runtime pseudo-relocation " | ||
"don't consist a valid region"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warnings should generally include additional context, e.g. the incorrect values that were derived.
- fix position of line breaks - name intermediate output binary - remove a sprious and doubled empty lines
- make parameter more symbolic a bit - use parametrized yaml * sed while generating source rather than testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review again!
I've made changes to address your comments and suggestions. Thanks!
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/pseudoreloc.exe | FileCheck %s | ||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/nop.exe.coff-x86-64 | FileCheck %s --check-prefix=NOSYM | ||
RUN: llvm-readobj --coff-pseudoreloc %p/Inputs/trivial.obj.coff-i386 | FileCheck %s --check-prefix=NORELOC | ||
REQUIRE: x86-registered-target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's totally wrong spurious... I think I was confused about generating the yaml and running the test.
NORELOC-NOT: PseudoReloc | ||
NORELOC: The symbols for runtime pseudo-relocation are not found | ||
NORELOC-NOT: PseudoReloc | ||
; check that silently ignore when imported symbol name is not found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for my poor English. I very appreciate you correcting me!
And, as other erroneous conditions should be warned rather than ignored, this one also should be warned, right?
NORELOC: The symbols for runtime pseudo-relocation are not found | ||
NORELOC-NOT: PseudoReloc | ||
; check that silently ignore when imported symbol name is not found | ||
RUN: sed -E -e '/Name: *\.rdata/,/Name:/{/SectionData:/{s/%{addr1l}200000/30000000/;s/%{addr3l}200000/00FFFF00/}}' %p/Inputs/pseudoreloc.i386.yaml \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think it's complex and hard to understand, but it's a binary patch relied on where the linker actually places symbols at, so it seems complexed sed
-ing can't be avoided.
I had tried to make easier to read with moving sed into the yaml generator script.
const PseudoRelocationHeader HeaderV2(1); | ||
if (RawRelocs.size() < sizeof(HeaderV2) || | ||
(memcmp(RawRelocs.data(), &HeaderV2, sizeof(HeaderV2)) != 0)) { | ||
reportWarning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, right. I should treat the case of RawRelocs.size() == 0
as just an empty relocation block.
const coff_section *Section; | ||
StringRef SymbolName; | ||
}; | ||
std::map<uint32_t, SymbolEntry> RVASymbolMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched to use SmallVector
and then stable_sort
and unique
.
void COFFDumper::printCOFFPseudoReloc() { | ||
if (!Obj->getDOSHeader()) { | ||
W.startLine() | ||
<< "pseudo-relocation is only meaningful for a PE image file\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made changes to exit silently with showing an empty list on these conditions:
- not an exe/dll image file
- the symbol table was stripped
- the relocation list has no entry
- the size of the relocation list is zero (begin == end)
and to warn with reportWarning
on these conditions:
- the marker of the pseudo relocation is absent from the symbol table
- the header of relocation list is broken
- the marker of the relocation list points out of section space
- the size of the relocation list is negative
- the symbol table is broken
- the symbol table is truncated
- the name of the symbol or the section can't be fetched
(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. ) - the section which the symbol belongs to is not found
auto Count = Obj->getNumberOfSymbols(); | ||
if (Count == 0) { | ||
W.startLine() << "The symbol table has been stripped\n"; | ||
W.startLine() << "the symbol table has been stripped\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be aligned with the "not a PE" case, returning silently with emitting an empty list.
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
if (!Sym) { | ||
consumeError(Sym.takeError()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to reportWarning
.
However, if the image file is truly corrupted, it might emit a huge amount of warnings. I think such warnings should be reported by --symbols
, not by --coff-pseudoreloc
. Alternatively, it may be more appropriate to limit the reporting to once. May I ask your opinion?
i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) { | ||
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Warning about this potentially causes too noisy.
} | ||
if (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()) { | ||
W.startLine() | ||
<< "The symbols for runtime pseudo-relocation are not found\n"; | ||
<< "the symbols for runtime pseudo-relocation are not found\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually suspicious state to be warned, but theoretically not.
- Although
link.exe
(andlld-link
without-debug:dwarf
) omits the symbol table, the PE spec allows a PE image with a symbol table. - With custom
crt0.o
and/or a custom linker script can omit the marker of the pseudo relocation.
That said, making this message a warning makes sense to me as those conditions are extremely rare.
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not done a full review this time around, but there are plenty of things to address nonetheless. Thanks for working on this!
|
||
|
||
pseudoreloc.*.yaml is generated by following script: | ||
To regenerate Inputs/pseudoreloc.*.yaml, run following one-liner and review actual address map: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment marker here for readability.
|
||
# Ensure the binaries generated from parameterized yaml and original one are exact same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Ensure the binaries generated from parameterized yaml and original one are exact same. | |
# Ensure the binaries generated from the parameterized yaml and original one are exactly the same. |
W.startLine() | ||
<< "pseudo-relocation is only meaningful for a PE image file\n"; | ||
ListScope D(W, "PseudoReloc"); | ||
W.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of the flush?
ListScope D(W, "PseudoReloc"); | ||
W.flush(); | ||
|
||
// Pseudo-relocation is only meaningful for a PE image file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "pseudo-relocation" here referring to the general concept or specifically the option? I'm guessing it probably doesn't matter, but it impacts how you write this comment. If referring to the option, use the option name, including the "--" prefix. Otherwise, I'd say "Pseudo-relocations are only meaningful with PE image files."
COFFSymbolRef Sym; | ||
if (Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i)) | ||
Sym = *SymOrErr; | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM style says if one part of an if/else needs braces, they all should have them.
for (const auto &Reloc : RelocRecords) { | ||
DictScope Entry(W, "Entry"); | ||
|
||
W.printHex("Symbol", Reloc.Symbol); | ||
if (const auto *Sym = ImportedSymbols.find(Reloc.Symbol)) | ||
W.printString("SymbolName", *Sym); | ||
if (Expected<StringRef> SymOrErr = ImportedSymbols.find(Reloc.Symbol)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above re. braces.
|
||
W.printHex("Target", Reloc.Target); | ||
if (auto Ite = RVASymbolMap.upper_bound(Reloc.Target.value()); | ||
if (auto Ite = llvm::upper_bound( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above re. braces.
i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) { | ||
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you've realised this, but reportWarning
only emits the first instance of each identical warning, so if the same warning were to be reported lots of times, only the first one will appear in the actual warning output.
void COFFDumper::printCOFFPseudoReloc() { | ||
if (!Obj->getDOSHeader()) { | ||
W.startLine() | ||
<< "pseudo-relocation is only meaningful for a PE image file\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I can't make the tests for these 2 conditions, because yaml2obj can't generate such corrupted binary. )
Optional, but you could (in a separate PR) look at enhancing yaml2obj to allow you to create such a corrupted binary. In ELF, we have several fields whose only real use-case is to allow us to test specific error cases. See for example tests that use fields like SHName
and SHOffset
in the ELF llvm-readobj tests.
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
if (!Sym) { | ||
consumeError(Sym.takeError()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment about how reportWarning
deduplicates identical warnings. If the warnings aren't actually identical, but are noisy, there's probably a case for only reporting the first time (per input file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you the reviewing a lot!
I've made a change for the wording etc, only contains NFC portions.
I'm working on addressing the remaining comments.
i < Count && (!RelocBegin.getRawPtr() || !RelocEnd.getRawPtr()); ++i) { | ||
auto Sym = Obj->getSymbol(i); | ||
if (Sym.takeError()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Expected<COFFSymbolRef> SymOrErr = Obj->getSymbol(i + 100)
to enforce to emit warnings, I got 6 completely same warning messages while dumping pseudoreloc.test.tmp.exe-x86_64
as following:
$ bin/llvm-readobj test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64 --coff-pseudoreloc
File: test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
PseudoReloc [
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': Invalid data was encountered while parsing the file
bin/llvm-readobj: warning: 'test/tools/llvm-readobj/COFF/Output/pseudoreloc.test.tmp.exe-x86_64': the marker symbols for runtime pseudo-relocation were not found
]
What does "each identical warning" mean?
MinGW toolchain uses "runtime pseudo-relocation" mechanism to support auto-importing symbols from DLLs.
There is no commonly used tools for dump the pseudo-relocation records, so we implement that functionality in llvm-readobj.