To be very frank, I kind of got lost in the sauce and don't remember exactly everything I did, so I'm basing this on my scratch pad and all my diffs.
Anyways, one thing I learned is that the first thing you need working is AMD's custom compiler toolchain (their LLVM fork). It doesn't work out of the box: you need a few tiny patches for stuff like mapping CLOCK_MONOTONIC_FAST to CLOCK_MONOTONIC_RAW, and some other weird patches like a weird link to the environ variable. Lowkey I was shocked that what I did worked, and that it correctly pulled environment variables when I ran the program. You can get a copy of my branch at https://github.com/Yohello1/llvm-project-rocm# (hash: 8450d34b327abfe9e2994a6476003f33221ad8d5)
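For anyone wondering what patches of that kind can look like, here is a minimal sketch in C. It is not the actual diff from the branch. It assumes the direction of the clock mapping (code asks for CLOCK_MONOTONIC_RAW, FreeBSD only has CLOCK_MONOTONIC_FAST), and the helper names env_lookup and monotonic_ns are made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes the extra clock IDs on glibc; harmless elsewhere */
#endif
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Assumption: if the platform lacks CLOCK_MONOTONIC_RAW, alias it to the
 * closest clock that does exist (CLOCK_MONOTONIC_FAST on FreeBSD). */
#if !defined(CLOCK_MONOTONIC_RAW) && defined(CLOCK_MONOTONIC_FAST)
#define CLOCK_MONOTONIC_RAW CLOCK_MONOTONIC_FAST
#elif !defined(CLOCK_MONOTONIC_RAW)
#define CLOCK_MONOTONIC_RAW CLOCK_MONOTONIC
#endif

/* POSIX leaves it to the program to declare environ itself. */
extern char **environ;

/* Hypothetical helper: find NAME by walking environ directly. */
static const char *env_lookup(const char *name)
{
    size_t n = strlen(name);
    for (char **e = environ; e != NULL && *e != NULL; ++e) {
        if (strncmp(*e, name, n) == 0 && (*e)[n] == '=')
            return *e + n + 1; /* skip past "NAME=" */
    }
    return NULL;
}

/* Hypothetical helper: read the aliased clock in nanoseconds, -1 on error. */
static long long monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC_RAW, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

The alias keeps the calling code untouched, which is the main appeal of doing it in the preprocessor instead of editing every call site.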
I used this command:
cmake -S llvm -B build \
-DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_EXTERNAL_PROJECTS="amd-device-libs;comgr" \
-DLLVM_EXTERNAL_AMD_DEVICE_LIBS_SOURCE_DIR="/root/dev/llvm-project/amd/device-libs" \
-DLLVM_EXTERNAL_COMGR_SOURCE_DIR="/root/dev/llvm-project/amd/comgr" \
-DLLVM_TARGETS_TO_BUILD="AMDGPU;X86" \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_INSTALL_PREFIX=/usr/local/rocm-llvm
One weird thing I found: if you build it with Debug, it does not work, because it (seemingly) tries to optimise some parts and strips the debug info out of them, and then the linking stage fails because of that??? I did not dive deeply into the issue, but you can replicate it by just changing the build type to Debug.
NONE THANK GOD, AND IT WORKED WHOOOO MY SHIM LAYER WHICH WAS SHODDILY CODED WORKS WEEEE
So I was able to get it to successfully compile and install today. I too was shocked, so I decided to put it to the test with a quick script to see if it can detect the GPU, and (to no one's surprise) it does not detect any; in fact it can't even talk to the amdkfd driver. HOWEVER IT GRACEFULLY WENT TO THE EXCEPTION THINGY AND DID NOT CRASH, IT WORKED AS EXPECTED WHOOOO THIS IS MASSIVE PROGRESS. This is the cmake build thingy I used:
cmake -DNUMA_LIBRARIES=/usr/local/lib/libnuma.so -DNUMA_INCLUDE_DIR=/usr/local/include -DClang_DIR=/usr/local/rocm-llvm/lib/cmake/clang/ -DCMAKE_BUILD_TYPE=RelWithDeb ..
https://github.com/Yohello1/rocm-systems (hash:660429045eafe89636f9d17c08f8797037dedc19)
Anyways, to get this working, there are about 1542 lines of changes... So it might take a minute to get through all of them.
Realised this was also needed.
So there's this value on Linux (_IOC_SIZESHIFT), which is the bit position of the size field inside an ioctl request number. Linux builds it out of macros because the ioctl layout can change by hardware, while on FreeBSD it's just hard coded to 16... So I just hard coded it to 16 by doing #define _IOC_SIZESHIFT 16. Then uh there's this thing called MADV_DONTFORK, which is used to tell the kernel not to copy that memory into the child when you fork a process, if I remember right. On Linux this works via madvise, but on FreeBSD you need to do it via minherit. So I wrote a static inline function, plus a macro which substitutes the custom wrapper in place of madvise where it's needed. Following that you'll see that I set MAP_NORESERVE to 0. On Linux this would cause a lot of issues, because it is strict in terms of memory allocation, but FreeBSD is "lazy" and doesn't care till the memory is actually used, so we're good to just not care. Then finally we have MADV_HUGEPAGE. The man pages for this literally say it's optional... So I just no-op'ed it.
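Roughly, a shim like the one described could look like the sketch below. It is not the code from the branch. The name shim_madvise and the two sentinel values are assumptions picked for illustration; minherit() and INHERIT_NONE are the real FreeBSD interfaces.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#ifndef _IOC_SIZESHIFT
#define _IOC_SIZESHIFT 16 /* bit offset of the size field in an ioctl number */
#endif

#ifndef MAP_NORESERVE
#define MAP_NORESERVE 0 /* the flag becomes a no-op when OR'ed into mmap flags */
#endif

#ifdef __FreeBSD__
/* Hypothetical sentinels, chosen to sit well above FreeBSD's own MADV_* values. */
#ifndef MADV_DONTFORK
#define MADV_DONTFORK 0x1000
#endif
#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 0x1001
#endif
#endif

/* Wrapper with the same signature as madvise(). */
static inline int shim_madvise(void *addr, size_t len, int advice)
{
#ifdef __FreeBSD__
    if (advice == MADV_DONTFORK)
        return minherit(addr, len, INHERIT_NONE); /* child does not get the range */
    if (advice == MADV_HUGEPAGE)
        return 0; /* only a hint, so ignoring it is safe */
#endif
    return madvise(addr, len, advice); /* macro not defined yet: the real call */
}

/* From here on, every madvise(...) call goes through the wrapper. */
#define madvise(addr, len, advice) shim_madvise((addr), (len), (advice))
```

Defining the macro after the wrapper matters: inside shim_madvise the name madvise still refers to the libc function, so there is no recursion.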
Had to wrap the #include inside an "is Linux defined" preprocessor check cause the header doesn't exist on FreeBSD, yeah, pretty simple.
I uh just deleted the #include, oops; should've wrapped it in #ifdef __linux__ and #endif. Also switched over some lines to use standard Unix calls instead.
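The guard-the-include pattern looks roughly like this. The header and function here are stand-ins, not the ones from the actual files: sys/sysinfo.h is used only because it is a well-known Linux-only header, and online_cpu_count is a hypothetical name.

```c
#include <unistd.h>

#ifdef __linux__
#include <sys/sysinfo.h> /* Linux-only header, stand-in for the real one */
#endif

/* Hypothetical helper: number of CPUs currently online. */
static long online_cpu_count(void)
{
#ifdef __linux__
    return (long)get_nprocs(); /* Linux-specific call */
#else
    /* Widely available Unix call; works on FreeBSD and Linux alike. */
    return sysconf(_SC_NPROCESSORS_ONLN);
#endif
}
```

Where a portable call like sysconf() covers both systems, dropping the Linux branch entirely is the simpler option, which matches the "switch to standard Unix calls" approach.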
This build file was fundamentally flawed: the runtime requires a lot of C++20 features, but it built with C++17 by default???? Also, if you were on a Unix-based system, it would assume Linux... I also made it use my custom FreeBSD src files instead of the Linux ones, so yeah.
Just switched out includes for preprocessor-guarded includes.
Just switched out includes for preprocessor-guarded includes.
Just switched out includes for preprocessor-guarded includes. This one required some brain cells to do.
Required A LOT of preprocessor code to be switched out and changed to get it to work.
Just switched out syscalls
Added headers
MORE CURSED PREPROCESSOR MACROS HAHAHAH
I dunno ngl, but this was needed cause otherwise it wouldn't make the driver GPU list types thingamabob. Also, for some reason it couldn't init the list the way it was before, so I changed how it gets init'd.
I couldn't get bash or anything else installed, so I just used ksh...
I copied over the Linux file and compiled it to get all the errors, then asked an LLM for solutions and spent like 1+ hours reading its code to make sure it wouldn't break. But lowkey I do expect it to break.
This one was a bit weird, had to force it to use the fallback method, which is OS agnostic.
This one was a bit more than just a header include. I had to define a macro to replace all uses of SYS_gettid with thr_self(0). Lowkey idk if it works; it should, but I'm not too familiar with the API. Then I had to make a lottt of type conversions a lot more specific/explicit to stop the errors.
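One thing worth double-checking here: on FreeBSD, thr_self() is declared as int thr_self(long *id) and writes the thread ID through the pointer, so passing 0 hands it a null pointer instead of returning the ID. A minimal sketch of a wrapper that works on both systems, with portable_gettid as a hypothetical name and not the code from the branch:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes syscall() on glibc */
#endif
#include <sys/types.h>
#include <unistd.h>

#if defined(__FreeBSD__)
#include <sys/thr.h>     /* thr_self() */
#elif defined(__linux__)
#include <sys/syscall.h> /* SYS_gettid */
#endif

/* Hypothetical helper: kernel thread ID of the caller, or -1 if unavailable. */
static inline long portable_gettid(void)
{
#if defined(__FreeBSD__)
    long tid = 0;
    if (thr_self(&tid) != 0) /* fills tid via the out-parameter */
        return -1;
    return tid;
#elif defined(__linux__)
    return (long)syscall(SYS_gettid);
#else
    return -1;
#endif
}
```

Call sites that expect a pid_t would then need an explicit cast, e.g. (pid_t)portable_gettid(), which is the kind of explicit conversion mentioned above.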
Based on what the gdb tracing showed, I will need to fix how it handles paths so it does not load dxg no matter what and instead loads the correct driver (the code is definitely... interesting???), then work on getting amdkfd working, and then I can get ioctl working. I cannot easily provide a timeline this time, but I will target getting the wrong-driver loading issue fixed by next Friday.
So I was running the test program and stepping through EVERYTHING (including the mutex and atomics code), and it took about 3 hours of deep focus to get to the loader, where I found that it tries to load dxg (Windows thingy) and, no matter what, says it is loaded?????? I don't know if this is the important part, but I am gonna keep tracing tmrw; had to stop today cause a friend's apartment building caught fire and they needed a place. This project is somewhere between "This is a challenge which I didn't know I needed" and "This will push my understanding of computers to its limits".