Overview
Now that I was able to build mlibc, I needed to start actually implementing the things that it requires to function.
Stubbing the Internal Dependencies
- Now cautiously optimistic, I decide to stub out all of the functions that mlibc needs to work (the ones giving me the linker errors), to at least make sure I can do that and resolve any issues before I actually start implementing them. I mainly follow the Managarm sysdep to find what it should all look like. And it compiles successfully! It builds several shared object files, including libc.so and ld.so. To double check everything, I run file build/libc.so and am pleased to see that it is an ELF 64-bit shared object for aarch64.
  - Note that it does say warning: .fini_array section has zero size. This is because I do not have any global destructors, and adding one makes the warning go away.
-
Now if I can just link with a sample program, it should pretty much just work [I thought optimistically]. I find a StackOverflow answer on how to link with a different libc. It was as simple as
    aarch64-linux-gnu-gcc -Xlinker -rpath=./build/ -Xlinker -I./build/ld.so test.c
gcc exits with success, and the output file metadata looks good. But I can’t easily test it, since I have an x86-64 machine. I could probably get a Linux ISO and boot it up in QEMU, but at this moment I’m running around campus working on my laptop between classes, so the quicker thing to do is use the few Google Cloud credits I have left over from a cloud computing class and spin up an ARM VM for a few minutes to test with. But then I don’t really know how I’ll get the shared library to work correctly on Linux, considering that it was built with a custom linker. So I end up switching over to trying a static binary instead (by appending -static to the gcc command above). I upload the file and try to run it in GDB on the ARM machine. The test is a simple C program that immediately calls exit, which in mlibc calls mlibc::sys_exit, defined by the sysdep. But when I try setting a breakpoint on sys_exit, there is no such symbol. Not completely trusting this, I load up the file in Ghidra, and sure enough none of the sys_ symbols are there.
-
Previously, I had been using the meson setup option -Ddefault_library=both for mlibc, to build both shared and static libraries. Now I decide to focus on static libraries to try to get that part working first. So I run
    meson setup build --cross-file scripts/aarch64-pinceros-gcc.txt --reconfigure -Ddefault_library=static
recompile mlibc, and try to link with the test program again. However, this yields a massive wall of undefined reference linker errors, mainly to what seem to be functions relating to floating point and atomic operations.
-
At this point, I actually join the Managarm Discord server using the link in the README to see if anyone has encountered similar issues before. I search for “undefined reference to __dso_handle” since that is in the first error. I eventually see a conversation from October 5, 2020 between @geertiebear and @Beliriel which makes me think to try installing after compiling. So I run
meson install -C build --destdir=./install
-
Now after installing, when I try to link I get a different set of undefined reference linker errors, which are almost entirely related to atomic instructions. That seems like an improvement from before.
- At this point, I feel like I am just going in circles, so I decide to give clang another try. I’m going to omit the majority of this because it ended up being a dead end, but basically I clone the llvm-project repo and try building compiler-rt and using clang for everything. It did not end well for me.
-
After that fails, I wonder: instead of trying to use gcc for both libc and the test program, what if I were to use gcc for libc and clang for the test program? First I have to create some stubs for the undefined references (they won’t behave correctly, but I just want to be able to build). I am able to build successfully without linker errors. When I run it on the ARM VM, there is a stack overflow, but the good news is that the stack trace shows it is clearly running with mlibc! (The __ensure_fail function and the logging system call each other in mutual recursion, since the logger is stubbed out with the MLIBC_UNIMPLEMENTED macro, which ensures failure, which involves logging, and so on.)
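The cycle behind that stack overflow can be sketched in a few lines of C. This is only an illustration: stub_log, stub_ensure_fail, cycle_depth, and MAX_DEPTH are hypothetical names, not mlibc's real functions or signatures, and the depth guard exists only so that the sketch terminates instead of overflowing the stack.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not mlibc's real API. */
static int depth = 0;
enum { MAX_DEPTH = 8 }; /* guard so the sketch terminates */

static void stub_ensure_fail(const char *what);

/* A logger that was never implemented: like a stub written with an
   "unimplemented" macro, all it does is report a failure. */
static void stub_log(const char *msg) {
    (void)msg;
    stub_ensure_fail("stub_log is unimplemented");
}

/* The failure path wants to print a message, so it calls the logger,
   which fails again, and so on. */
static void stub_ensure_fail(const char *what) {
    if (++depth >= MAX_DEPTH)
        return; /* the real cycle has no such guard */
    stub_log(what);
}

/* Runs the cycle once and reports how deep it went before the guard hit. */
int cycle_depth(void) {
    depth = 0;
    stub_log("hello");
    return depth;
}
```

Without the guard, every log attempt would add two more frames until the stack runs out, which matches the alternating pattern seen in the stack trace.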
Simplifying the Build Process
- So now that I have it working with a combination of gcc and clang, I want to get it working under the roof of a single compiler toolchain. There is a lot of trial and error over the span of a couple of weeks, trying various combinations of compiler and linker flags. For some of my more notable attempts, see the commented-out portions of compile.sh at this commit on GitHub.
-
Eventually I sync up with Alex (@ameyer1024), who had some prior work getting Newlib to work for porting DOOM. His main suggestions are to separate compiling and linking into two commands, and to specify all of the paths manually to make sure the compiler and linker find everything correctly. Eventually I end up with the following (which is the uncommented portion of compile.sh in the commit above):
    aarch64-linux-gnu-gcc -c test.c -I./build/install/usr/local/include
    aarch64-linux-gnu-ld -nostdlib test.o -L./build/install/usr/local/lib -static \
        -o a.out \
        ./build/sysdeps/pinceros/crt0.o ./build/sysdeps/pinceros/crti.o \
        -lc /usr/lib/gcc/aarch64-linux-gnu/14.2.0/libgcc.a
- While this resolves the majority of the linker errors, I have to do some slightly cursed special handling for __getauxval and __dso_handle. Despite this being a static build (which you would assume wouldn’t need these, since they relate to dynamic linking), the resulting binary still has references to them. I am not sure if this is due to the slightly questionable way that mlibc reuses dynamic linker logic in static builds, or if libgcc is just being weird here. For __getauxval, I have to edit some conditional compilation by adding my own defined(PINCEROS), along with the corresponding -DPINCEROS in the cross compilation file, and hardcoding __getauxval to return 0, since our kernel does not supply an auxiliary vector. For __dso_handle, I just make an assembly file that defines it as a global quad (aka a pointer), and add that file to the meson.build. Now, building the test program with mlibc and running it on the ARM VM works (or at least gets to the stack overflow), without even having to stub out a bunch of builtins!
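For readers who want to picture the two workarounds, here is a rough C sketch of their shape. It is not the actual patch: sketch_getauxval and sketch_dso_handle are hypothetical names used so the snippet compiles on an ordinary host without clashing with the real symbols, the PINCEROS fallback define stands in for the -DPINCEROS flag from the cross file, and the NULL initial value is my assumption (the real __dso_handle lives in an assembly file).

```c
#include <stddef.h>

/* Normally supplied as -DPINCEROS by the cross file; defined here only
   so the sketch compiles on its own. */
#ifndef PINCEROS
#define PINCEROS 1
#endif

#if defined(PINCEROS)
/* Hypothetical stand-in for the hardcoded __getauxval: the kernel passes
   no auxiliary vector, so every lookup reports "not found" as 0. */
unsigned long sketch_getauxval(unsigned long type) {
    (void)type;
    return 0;
}
#endif

/* C counterpart of "a global quad": a single pointer-sized global.
   The NULL initial value is an assumption of this sketch. */
void *sketch_dso_handle = NULL;
```

The point of both pieces is only to give the linker a definition to resolve against; neither one does any real work in a static binary on a kernel with no auxiliary vector.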