c++ - LLVM IR codegen segfaults during exit only when method declarations have parameters
Problem Description
Explanation
I am writing a compiler for a C-like language using yacc/bison, flex, and the LLVM C++ API (LLVM 12). I develop and test on Ubuntu 20.04.3 LTS (Focal Fossa) and macOS 11.6 Big Sur. The compiler segfaults on exit whenever a method declaration has parameters, for example:
func test(x int) void {}
The LLVM IR is printed correctly:
; ModuleID = 'Test'
source_filename = "Test"
define void @test(i32 %x) {
entry:
  %x1 = alloca i32, align 4
  store i32 %x, i32* %x1, align 4
  ret void
}
The program segfaults immediately afterwards.
A method declaration without parameters, such as
func test() int {
    var x int;
    x = 5;
    return (x);
}
does not segfault.
GDB reports that the segfault occurs during llvm::LLVMContextImpl::~LLVMContextImpl(). Valgrind reports ~LLVMContextImpl() doing an invalid read of size 8.
Edit: Valgrind output relating to invalid read
==10254== Invalid read of size 8
==10254== at 0x5553C30: llvm::LLVMContextImpl::~LLVMContextImpl() (in /usr/lib/x86_64-linux-gnu/libLLVM-12.so.1)
==10254== by 0x5552130: llvm::LLVMContext::~LLVMContext() (in /usr/lib/x86_64-linux-gnu/libLLVM-12.so.1)
==10254== by 0xA44AA26: __run_exit_handlers (exit.c:108)
==10254== by 0xA44ABDF: exit (exit.c:139)
==10254== by 0xA4280B9: (below main) (libc-start.c:342)
==10254== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==10254==
==10254==
==10254== Process terminating with default action of signal 11 (SIGSEGV)
==10254== Access not within mapped region at address 0x0
==10254== at 0x5553C30: llvm::LLVMContextImpl::~LLVMContextImpl() (in /usr/lib/x86_64-linux-gnu/libLLVM-12.so.1)
==10254== by 0x5552130: llvm::LLVMContext::~LLVMContext() (in /usr/lib/x86_64-linux-gnu/libLLVM-12.so.1)
==10254== by 0xA44AA26: __run_exit_handlers (exit.c:108)
==10254== by 0xA44ABDF: exit (exit.c:139)
==10254== by 0xA4280B9: (below main) (libc-start.c:342)
==10254== If you believe this happened as a result of a stack
==10254== overflow in your program's main thread (unlikely but
==10254== possible), you can try to increase the size of the
==10254== main thread stack using the --main-stacksize= flag.
==10254== The main thread stack size used in this run was 8388608.
I'm hoping that by asking here I can get some kind of hint for how to work towards solving this issue. I've been stuck on this for days.
Source Code Fragments
The sections of my code relating to method declarations and method parameters are as follows. I apologize for the length.
Bison grammar rule for program
program: extern_list decafpackage
    {
        ProgramAST *prog = new ProgramAST((decafStmtList*)$1, (PackageAST*)$2);
        if (printAST) {
            cout << getString(prog) << endl;
        }
        prog->Codegen();
        delete prog;
    }
    ;
Bison grammar rule for method declaration
method_decl: T_FUNC T_ID T_LPAREN params T_RPAREN method_type method_block
    {
        $$ = new Method(*$2, $6->str(), $4, $7);
        delete $2;
        delete $6;
    }
Bison grammar rule for method parameter
param: T_ID type { $$ = new VarDef(*$1, $2->str()); delete $1; delete $2; }
;
C++ Method::Codegen() handling of parameters
llvm::Function *func = llvm::Function::Create(
    llvm::FunctionType::get(returnTy, args, false),
    llvm::Function::ExternalLinkage,
    name,
    TheModule
);
llvm::BasicBlock *BB = llvm::BasicBlock::Create(TheContext, "entry", func);
Builder.SetInsertPoint(BB);
. . .
for (auto &Arg : func->args()) {
    llvm::AllocaInst* Alloca = CreateEntryBlockAlloca(func, Arg.getName().str());
    Builder.CreateStore(&Arg, Alloca);
    sTStack->enter_symtbl(Arg.getName().str(), Alloca);
}
C++ VarDef::Codegen()
llvm::Value *Codegen() {
    llvm::Type* ty = getLLVMType(type);
    // No array size: allocate a single stack slot named after the variable.
    llvm::AllocaInst* V = Builder.CreateAlloca(ty, nullptr, name);
    V->setName(name);
    sTStack->enter_symtbl(name, V);
    return V;
}
Bison main
int main() {
    // Setup
    llvm::LLVMContext &Context = TheContext;
    TheModule = new llvm::Module("Test", Context);
    FPM = std::make_unique<llvm::legacy::FunctionPassManager>(TheModule);
    FPM->add(llvm::createPromoteMemoryToRegisterPass());
    FPM->add(llvm::createInstructionCombiningPass());
    FPM->add(llvm::createReassociatePass());
    FPM->add(llvm::createGVNPass());
    FPM->add(llvm::createCFGSimplificationPass());
    FPM->doInitialization();
    int retval = yyparse();
    TheModule->print(llvm::errs(), nullptr);
    return(retval >= 1 ? EXIT_FAILURE : EXIT_SUCCESS);
}
Solution
The problem was in lines of code not included in the question. llvm::Function::Create requires an llvm::FunctionType, which can be built from a vector of llvm::Type* objects. I wrote a function to fill that vector:
void getLLVMTypes(vector<llvm::Type*>* v) {
    for (auto* i : stmts) {
        llvm::Type* type = getLLVMType(i->getType());
        ((llvm::Value*)(type))->setName(i->getName()); // Problem
        v->push_back(type);
    }
}
The issue was casting each llvm::Type* object to llvm::Value* and calling llvm::Value::setName on it to set its name. I did this to work around an earlier problem with parameter names not being set. I'm not entirely sure of the exact failure mechanism, since I had trouble compiling LLVM from source with debug flags, but llvm::Type does not derive from llvm::Value, so the cast is invalid and setName ends up operating on an object that is not a Value. Removing that line, and preserving method parameter names another way, solved the issue.
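As a rough illustration of why that cast is unsafe, and of one possible way to keep parameter names without it, here is a minimal sketch. It uses small stand-in structs (Value, Argument, Type, Param) instead of the real LLVM classes so that it compiles without LLVM installed. The helper names collectTypes and nameArguments are hypothetical, and the replacement actually used in the compiler is not shown in this answer.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins, NOT the real LLVM classes. They mirror the one
// relationship that matters here: llvm::Argument derives from llvm::Value,
// while llvm::Type does not.
struct Value {
    std::string name;
    void setName(const std::string &n) { name = n; }
};
struct Argument : Value {};
struct Type {
    int id;
};

// A C-style cast from Type* to Value* compiles because it silently falls
// back to reinterpret_cast. These checks show that the types are unrelated,
// so static_cast<Value*>(typePtr) would have been rejected by the compiler.
constexpr bool kTypeIsValue = std::is_base_of<Value, Type>::value;
constexpr bool kArgumentIsValue = std::is_base_of<Value, Argument>::value;
static_assert(!kTypeIsValue, "a Type must never be treated as a Value");

// A parameter as the AST might store it (field names are assumptions).
struct Param {
    std::string name;
    Type *type;
};

// Collect only the types; nothing is renamed at this stage.
std::vector<Type *> collectTypes(const std::vector<Param> &params) {
    std::vector<Type *> types;
    types.reserve(params.size());
    for (const Param &p : params) {
        types.push_back(p.type);
    }
    return types;
}

// Name the argument objects once the function exists. With real LLVM this
// would correspond to calling Arg.setName(...) while iterating func->args().
void nameArguments(std::vector<Argument> &args,
                   const std::vector<Param> &params) {
    for (std::size_t i = 0; i < args.size() && i < params.size(); ++i) {
        args[i].setName(params[i].name);
    }
}
```

With the real API, the same idea would be to iterate func->args() after llvm::Function::Create and call setName on each argument, which is safe because llvm::Argument is an llvm::Value. Writing the conversion as a static_cast instead of a C-style cast would also have turned the original mistake into a compile error.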