FreeBSD Full native port of ROCm
Hello everyone, this is the first post of many to come. This term I'm working at FreeBSD, and the project I've been assigned is: "Port ROCm from Linux to FreeBSD". That's a pretty tall order, but I'm not one to turn down a challenge like this, and to be honest it sounds fun.
So I started. It took about 2-3 days to figure out how to use FreeBSD (and lowkey I still need a lot of help), but I got `kldload amdgpu` to run successfully!!!! By the way, I'm using my laptop (Ryzen 7 7840HS, Radeon 780M, RTX 4060, 32 GB RAM) as workstation 1; it runs Linux, and I use it as a reference. For most of the testing and development I'm using the FreeBSD Foundation's Framework Desktop with a Strix Halo CPU. I spent maybe 2 days trying to get graphics working before just dropping it and moving on. Anyway, to get the drivers working you have to pull the firmware from the Linux firmware repo and use the latest version of drm-kmod (not the one from ports); then it will work. You might also need to recompile kernel/world.
Anyway, now that that works, let's move on to the important part: getting all the projects in the ROCm super-repo working. First, I looked through the projects, saw "rocm-core", and thought it would be a really good place to start, so I began working on that. I cloned the super-repo, ran cmake, ran make, and then wrote a short test program to query the version... AND IT WORKED, FIRST TRY, NO ISSUES, INCREDIBLE. Then I decided to do some googling... and realised that rocr-runtime is where all the important stuff lives...
Then I decided to go and try it, and it failed; to be honest, I would have been jumping with joy if it had worked. It failed because it needed libdrm and libnuma. Now, libdrm on FreeBSD is basically the same as on Linux (same lib, iirc). But FreeBSD does not have a libnuma; NUMA is handled by the kernel instead, and it's handled VERY differently.
I'm going to assume you (the reader) know what a NUMA memory region is and why it matters. FreeBSD handles NUMA policy per process/thread, i.e. a process can choose to be in one NUMA region, and from then on whatever it touches, reads, or writes is allocated from the region it assigned itself to. Linux works VERY differently: it works by address mapping (per VMA iirc, not sure if that's the right name). It says that address range `X` belongs to one domain and range `Y` belongs to another. If you've done any sort of "low level" work (highkey this shit is still high level), you can see the issue: code written against the address-range model doesn't map cleanly onto the per-process one.
The other issue, as mentioned, is that there is no libnuma to link against: on FreeBSD this is a kernel API, with the topology exposed via sysctl. So I wrote a libnuma translation layer, since I don't want to make significant changes to the ROCm source code when they aren't needed (upstreaming would NOT be fun). Here's the library. If you go through it, you'll notice that it's very small. That's intentional: I went through (or, more accurately, I made an LLM go through) and found all the unique functions that are defined by numactl and imported by the runtime, and implemented those. Now uh... if you think back, I'm working on a single-domain setup, and I don't want to rewrite how FreeBSD manages NUMA policies, so I uh, just returned 0 for all bindings, since that works fine for a single-domain setup. We can make it a lot better later.
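To give a feel for what a single-domain shim like this can look like, here is a minimal sketch in C. It is not the actual library: the functions shown are just an illustrative subset of the libnuma API (`numa_available`, `numa_max_node`, `numa_node_of_cpu`, `mbind`), and `shim_domain_count` is a made-up helper that reads the `vm.ndomains` sysctl on FreeBSD and assumes one domain everywhere else.

```c
#include <stddef.h>

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper (not part of libnuma): how many NUMA domains the
 * kernel reports. On FreeBSD this reads the vm.ndomains sysctl; on any
 * other system, or on error, it assumes a single domain.
 */
static int shim_domain_count(void)
{
#ifdef __FreeBSD__
    int ndomains = 1;
    size_t len = sizeof(ndomains);

    if (sysctlbyname("vm.ndomains", &ndomains, &len, NULL, 0) == 0 &&
        ndomains > 0)
        return ndomains;
#endif
    return 1;
}

/* libnuma returns -1 when NUMA is unavailable; the shim always says "ok". */
int numa_available(void)
{
    return 0;
}

/* Highest node number, so 0 on a single-domain machine. */
int numa_max_node(void)
{
    return shim_domain_count() - 1;
}

/* Single-domain assumption: every CPU belongs to node 0. */
int numa_node_of_cpu(int cpu)
{
    (void)cpu;
    return 0;
}

/*
 * Linux binds an address range to a set of nodes here. The sketch accepts
 * the request and does nothing, which is only harmless with one domain.
 */
long mbind(void *addr, unsigned long len, int mode,
           const unsigned long *nodemask, unsigned long maxnode,
           unsigned int flags)
{
    (void)addr;
    (void)len;
    (void)mode;
    (void)nodemask;
    (void)maxnode;
    (void)flags;
    return 0;
}
```

Every binding call reports success without doing anything, so callers carry on as if the policy had been applied. On a multi-domain machine that would no longer be true, and the bindings would have to be translated into FreeBSD's per-process/thread policies instead of being ignored.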
Now that we have the library, we can compile the runtime! Make sure NOT to build with the compat layer. By the way, here is my branch. It will finish the cmake stage but fail on make; I'm working on patches to the runtime to fix that, but I haven't published them yet. There will be a lot of touching ioctl stuff heheheh. So stay tuned, they will be out by Friday next week!
Let me know if you have any suggestions.