
rusticl - should it work?

Posted: Wed Oct 08, 2025 10:30 pm
by borketh
I just finished a massive Core 24 WU with only rusticl installed, and it seemed to work just fine. However, Core 27 appears to very much not like me when I have my setup like this.

An example of a Core 27 WU (Project 15320) failing to find an OpenCL context and promptly dying.


22:12:58:I1:WU164:Folding@home GPU Core27 Folding@home Core
22:12:58:I1:WU164:Version 8.2.1
22:12:58:I1:WU164:  GPU info: Platform: OpenCL: rusticl
22:12:58:I1:WU164:  GPU info: PlatformIndex: 0
22:12:58:I1:WU164:  GPU info: Device: AMD Radeon RX 7800 XT (radeonsi, navi32, LLVM 20.1.8, DRM 3.64, 6.17.1-2-cachyos)
22:12:58:I1:WU164:  GPU info: DeviceIndex: 0
22:12:58:I1:WU164:  GPU info: Vendor: 0x1002
22:12:58:I1:WU164:  GPU info: PCI: 03:00:00
22:12:58:I1:WU164:  GPU info: Compute: 3.0
22:12:58:I1:WU164:  GPU info: Driver: 25.2
22:12:58:I1:WU164:  GPU info: GPU: true
22:12:58:I1:WU164:  GPU info: Platform: OpenCL: rusticl
22:12:58:I1:WU164:  GPU info: PlatformIndex: 0
22:12:58:I1:WU164:  GPU info: Device: AMD Radeon Graphics (radeonsi, raphael_mendocino, LLVM 20.1.8, DRM 3.64, 6.17.1-2-cachyos)
22:12:58:I1:WU164:  GPU info: DeviceIndex: 1
22:12:58:I1:WU164:  GPU info: Vendor: 0x1002
22:12:58:I1:WU164:  GPU info: PCI: 18:00:00
22:12:58:I1:WU164:  GPU info: Compute: 3.0
22:12:58:I1:WU164:  GPU info: Driver: 25.2
22:12:58:I1:WU164:  GPU info: GPU: true
22:12:58:I1:WU164:  Checkpoint write interval: 62500 steps (5%) [20 total]
22:12:58:I1:WU164:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
22:12:58:I1:WU164:  XTC frame write interval: 50000 steps (4%) [25 total]
22:12:58:I1:WU164:  TRR frame write interval: disabled
22:12:58:I1:WU164:  Global context and integrator variables write interval: disabled
22:12:58:I1:WU164:There are 3 platforms available.
22:12:58:I1:WU164:Platform 0: Reference
22:12:58:I1:WU164:Platform 1: CPU
22:12:58:I1:WU164:Platform 2: OpenCL
22:12:58:I1:WU164:  opencl-device 0 specified
22:13:00:I1:WU164:Attempting to create OpenCL context:
22:13:00:I1:WU164:  Configuring platform OpenCL
22:13:00:I1:WU164:Failed to create OpenCL context:
22:13:00:I1:WU164:No compatible OpenCL platform is available
22:13:00:I1:WU164:ERROR:125: Failed to create a GPU-enabled OpenMM Context.
22:13:00:I1:WU164:Saving result file ../logfile_01.txt
22:13:00:I1:WU164:Saving result file science.log
22:13:00:I1:WU164:Folding@home Core Shutdown: BAD_WORK_UNIT
22:13:00:E :WU164:Core returned BAD_WORK_UNIT (114)

And here's Core 24 (Project 12129) finding valid OpenCL just fine:


01:50:11:I1:WU159:Folding@home GPU Core24 Folding@home Core
01:50:11:I1:WU159:Version 8.1.4
01:50:11:I1:WU159:  Checkpoint write interval: 250000 steps (5%) [20 total]
01:50:11:I1:WU159:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
01:50:11:I1:WU159:  XTC frame write interval: 25000 steps (0.5%) [200 total]
01:50:11:I1:WU159:  TRR frame write interval: disabled
01:50:11:I1:WU159:  Global context and integrator variables write interval: disabled
01:50:11:I1:WU159:There are 3 platforms available.
01:50:11:I1:WU159:Platform 0: Reference
01:50:11:I1:WU159:Platform 1: CPU
01:50:11:I1:WU159:Platform 2: OpenCL
01:50:11:I1:WU159:  opencl-device 0 specified
01:50:12:I1:Machine state finish
01:50:14:I1:WU159:Attempting to create OpenCL context:
01:50:14:I1:WU159:  Configuring platform OpenCL
01:50:15:I1:WU159:  Using OpenCL on OpenCL platformId 0 and gpu 0
01:50:15:I1:WU159:  GPU info: Platform: OpenCL: rusticl
01:50:15:I1:WU159:  GPU info: PlatformIndex: 0
01:50:15:I1:WU159:  GPU info: Device: AMD Radeon RX 7800 XT (radeonsi, navi32, LLVM 20.1.8, DRM 3.64, 6.17.0-4-cachyos)
01:50:15:I1:WU159:  GPU info: DeviceIndex: 0
01:50:15:I1:WU159:  GPU info: Vendor: 0x1002
01:50:15:I1:WU159:  GPU info: PCI: 03:00:00
01:50:15:I1:WU159:  GPU info: Compute: 3.0
01:50:15:I1:WU159:  GPU info: Driver: 25.2
01:50:15:I1:WU159:  GPU info: GPU: true
01:50:15:I1:WU159:Completed 2000000 out of 5000000 steps (40%)

What's interesting to me is that the client now properly shows OpenCL compute level 3.0, which it never did in other configurations where the ROCm OpenCL was installed, even alongside rusticl. Could that be why I was assigned a big WU like 12129? I usually don't get WUs worth over 100k points.

I tried the fix from the last time I posted (replacing glibc with the system version), but that didn't fix it the way it did for Core 23.
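
For anyone comparing the two logs above side by side, here is a rough sketch of a helper that reports which core ran, whether it got an OpenCL context, and any ERROR lines. It only matches the exact strings visible in the logs in this post; the function name `summarize_core_log` is made up for illustration, and other cores or client versions may word these messages differently.

```python
import re

# Patterns taken from the log lines quoted in this thread.
CORE_RE = re.compile(r"Folding@home GPU (Core\d+)")
USING_RE = re.compile(r"Using OpenCL on OpenCL platformId (\d+) and gpu (\d+)")
ERROR_RE = re.compile(r"ERROR:\d+: .*")


def summarize_core_log(lines):
    """Summarize one work unit's log (hypothetical helper, not part of FAH)."""
    summary = {"core": None, "context_created": None, "errors": []}
    for line in lines:
        core = CORE_RE.search(line)
        if core:
            summary["core"] = core.group(1)
        if USING_RE.search(line):
            # The core picked a platform and device, so the context exists.
            summary["context_created"] = True
        elif "Failed to create OpenCL context" in line:
            summary["context_created"] = False
        error = ERROR_RE.search(line)
        if error:
            summary["errors"].append(error.group(0))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < log.txt
    print(summarize_core_log(sys.stdin))
```

Run against the Core 27 log it should flag the failed context plus the ERROR:125 line, and against the Core 24 log it should show the context as created with no errors.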

Re: rusticl - should it work?

Posted: Wed Oct 08, 2025 10:41 pm
by muziqaz
Rusticl does not work with FAH.
Remove the mesa-ocl-icd loader (the package name might be a bit different), and let the ROCm ICD loader take over to push AMD's proprietary drivers forward.
If you cannot live without Mesa, then abandon FAH until rusticl is supported (not soon).
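
To see which OpenCL implementations the ICD loader can currently pick up, a quick sketch like the one below may help. It assumes the usual Linux location `/etc/OpenCL/vendors`, where each `.icd` file names the library to load; some distros or custom installs use a different directory, so treat the path as an assumption. The function name `list_opencl_icds` is only illustrative.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each registered .icd file name to the library it points at.

    vendors_dir defaults to the common Linux location; this is an
    assumption and may differ on your distribution.
    """
    registered = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return registered
    for icd_file in sorted(directory.glob("*.icd")):
        lines = icd_file.read_text(errors="replace").splitlines()
        # An .icd file normally holds a single line: the library name or path.
        library = next((ln.strip() for ln in lines if ln.strip()), "")
        registered[icd_file.name] = library
    return registered


if __name__ == "__main__":
    for name, library in list_opencl_icds().items():
        print(f"{name}: {library}")
```

If a rusticl entry still shows up here after removing the Mesa package, the loader can still offer rusticl to the core.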

Re: rusticl - should it work?

Posted: Wed Oct 08, 2025 10:57 pm
by borketh
I understand it isn't supported, which is why I'm asking why it worked in the first place.

Re: rusticl - should it work?

Posted: Wed Oct 08, 2025 11:01 pm
by muziqaz
borketh wrote: Wed Oct 08, 2025 10:57 pm I understand it isn't supported, which is why I'm asking why it worked in the first place.
Because it is not supported. If it worked with one core, that is a miracle, and not a constant.

Re: rusticl - should it work?

Posted: Wed Oct 08, 2025 11:26 pm
by Joe_H
A different WU from the same project, or one from a different project using different features of the OpenMM code in the same core number, might also have failed. F@h's experience with Mesa's OpenCL, and now rusticl, is that it is computationally fragile. The developers have been very good at getting the code to work on benchmarks, but neither has been as stable as the OpenCL implementations from AMD, Nvidia or Intel.

Re: rusticl - should it work?

Posted: Thu Oct 09, 2025 4:34 am
by muziqaz
FAH is moving to HIP sometime in the next two decades, so there is that.

Re: rusticl - should it work?

Posted: Thu Oct 09, 2025 5:09 am
by borketh
I heard, and I'm excited for it. Count me in for being a lab rat for experimental HIP cores!

Re: rusticl - should it work?

Posted: Thu Oct 09, 2025 5:37 am
by muziqaz
borketh wrote: Thu Oct 09, 2025 5:09 am I heard, and I'm excited for it. Count me in for being a lab rat for experimental HIP cores!
There is not going to be an experimental core ;)