muziqaz wrote: ↑Mon May 12, 2025 4:58 am
What distro is this?
Kernel?
ROCm version (if used)
Thanks
Debian bookworm (kernel 6.1) with ROCm 6.4.0.
The problem is not a FAH problem but probably an amdgpu problem. The folding core is using as much memory as it needs. It is the DRM framebuffer subsystem that fails to allocate memory (despite GTT being available) and kills the graphics, logging that it failed to pin the framebuffer with error -12 (ENOMEM, i.e. insufficient memory for the allocation). The folding core keeps merrily chugging along the whole time, using its 504 MB of VRAM. It's not even aware that anything happened.
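For anyone who wants to check what a negative code like that means, here is a minimal sketch that maps it to its errno name. It assumes a Linux box, where ENOMEM is 12; the helper name `describe_kernel_error` is just something I made up for illustration.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Translate a kernel-style negative error code into its errno name.

    Kernel log messages report errors as negative errno values, so -12
    is looked up as errno 12. The helper name is hypothetical.
    """
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    # os.strerror gives the platform's human-readable message for the code.
    return f"{name} ({os.strerror(number)})"


print(describe_kernel_error(-12))
```

The exact message text after the name depends on the platform's C library, so only the symbolic name should be relied on.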
I have since blacklisted 12922-12925 so it is no longer a problem for me.
I think Debian Trixie will have a newer kernel. I could use a newer kernel on Bookworm but it's the stable Debian so things are a bit old. I'll give it a try if I run into more projects that use too much VRAM.
It doesn't bother me too much because I can blacklist projects now, and because I'm comfortable with the command line, so I can work with the graphical desktop environment turned off. Tmux is as nice as any window manager.
Not a complaint, just an observation from a WU I had to dump: 16781 had a TPF of nearly 40 minutes and resource utilization was low, indicating that it's not particularly efficient on this GPU compared to other projects of similar atom count and number of steps.
I'm sure it will fold with ease when HIP is finally out, though!
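To put that TPF in perspective, here is a rough back-of-the-envelope sketch. It assumes TPF means time per frame and that a WU consists of 100 frames of 1% each; the function name is hypothetical and the 40-minute figure is rounded up from "nearly 40".

```python
from datetime import timedelta


def estimated_wu_duration(tpf_minutes: float, frames: int = 100) -> timedelta:
    """Estimate total folding time from the time per frame (TPF).

    Assumes a WU is split into `frames` equal frames (100 by default,
    i.e. 1% each) and that the TPF stays constant for the whole WU.
    """
    return timedelta(minutes=tpf_minutes * frames)


total = estimated_wu_duration(40)
print(total)                               # 2 days, 18:40:00
print(round(total.total_seconds() / 3600, 1), "hours")
```

Under those assumptions the WU would have needed close to three days of uninterrupted folding.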
arisu wrote: ↑Sat Jun 07, 2025 12:49 pm
Not a complaint, just an observation from a WU I had to dump: 16781 had a TPF of nearly 40 minutes and resource utilization was low, indicating that it's not particularly efficient on this GPU compared to other projects of similar atom count and number of steps.
I'm sure it will fold with ease when HIP is finally out, though!
Interesting. I have sometimes seen pretty much the same low performance and very low power consumption with the species 3 iGPU in a Ryzen 5850U (this one is not even RDNA). Even worse, sometimes a WU couldn't even fold the first 1% and just timed out after 15 minutes. Something is definitely broken somewhere, and I tend to agree with muziqaz that iGPU folding is not a good idea (at least in its current state).