Instead, GPU folding stopped working altogether.
fahclient would find the GPU OK, download a work unit, and start it only to choke immediately with BAD_WORK_UNIT. Repeat until the client auto-disabled the GPU slot.
This looked and felt like an OpenCL problem; OpenCL has given me trouble before when folding on a GPU under Linux. But this time OpenCL appeared to be working fine elsewhere on the system: clinfo ran normally, fahbench could find and use the GPU, and even the fahclient log reported the OpenCL device:
04:27:21:******************************* System ********************************
04:27:21: CPU: AMD Ryzen 7 5800X 8-Core Processor
04:27:21: CPU ID: AuthenticAMD Family 25 Model 33 Stepping 2
04:27:21: CPUs: 16
04:27:21: Memory: 31.27GiB
04:27:21: Free Memory: 24.57GiB
04:27:21: Threads: POSIX_THREADS
04:27:21: OS Version: 5.15
04:27:21: Has Battery: false
04:27:21: On Battery: false
04:27:21: UTC Offset: -6
04:27:21: PID: 42255
04:27:21: CWD: /var/lib/fahclient
04:27:21: OS: Linux 5.15.0-56-generic x86_64
04:27:21: OS Arch: AMD64
04:27:21: GPUs: 1
04:27:21: GPU 0: Bus:9 Slot:0 Func:0 AMD:6 Navi 22 XT-XL [Radeon RX
04:27:21: 6700/6700XT/6800M]
04:27:21: CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
04:27:21: libcuda.so: cannot open shared object file: No such file or
04:27:21: directory
04:27:21:OpenCL Device 0: Platform:0 Device:0 Bus:9 Slot:0 Compute:2.0 Driver:3513.0
Yet the core itself could not find any OpenCL platform and refused to run:
04:28:03:WU00:FS01:0x22:Project: 18909 (Run 37, Clone 4, Gen 31)
04:28:03:WU00:FS01:0x22:Reading tar file core.xml
04:28:03:WU00:FS01:0x22:Reading tar file integrator.xml
04:28:03:WU00:FS01:0x22:Reading tar file state.xml
04:28:03:WU00:FS01:0x22:Reading tar file system.xml
04:28:03:WU00:FS01:0x22:Digital signatures verified
04:28:03:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
04:28:03:WU00:FS01:0x22:Version 0.0.20
04:28:03:WU00:FS01:0x22: Checkpoint write interval: 62500 steps (5%) [20 total]
04:28:03:WU00:FS01:0x22: JSON viewer frame write interval: 12500 steps (1%) [100 total]
04:28:03:WU00:FS01:0x22: XTC frame write interval: 25000 steps (2%) [50 total]
04:28:03:WU00:FS01:0x22: Global context and integrator variables write interval: disabled
04:28:03:WU00:FS01:0x22:There are 2 platforms available.
04:28:03:WU00:FS01:0x22:Platform 0: Reference
04:28:03:WU00:FS01:0x22:Platform 1: CPU
04:28:03:WU00:FS01:0x22:opencl-device was set but OpenCL platform could not be found.
04:28:03:WU00:FS01:0x22:ERROR:126: Neither CUDA nor OpenCL is available.
For comparison, a core log from a working setup lists OpenCL as a third platform:
02:27:17:WU01:FS01:0x22:There are 3 platforms available.
02:27:17:WU01:FS01:0x22:Platform 0: Reference
02:27:17:WU01:FS01:0x22:Platform 1: CPU
02:27:17:WU01:FS01:0x22:Platform 2: OpenCL
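To tell the two cases apart quickly, something like the following sketch can check a log for the "Platform N: OpenCL" line. The function name core_sees_opencl is made up for illustration, and the log location in the comment is an assumption based on the CWD shown above; adjust it for your install.

```shell
# Hypothetical helper: succeed if the given log contains a core line
# announcing an OpenCL platform, e.g. "Platform 2: OpenCL".
core_sees_opencl() {
  grep -q 'Platform [0-9][0-9]*: OpenCL' "$1"
}

# Example (log path is an assumption, not taken from the post):
#   core_sees_opencl /var/lib/fahclient/log.txt && echo "core found OpenCL"
```

The log accumulates across runs, so a match may come from an older work unit; check the timestamps of the matching lines before trusting the result.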
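The cause: Core22 ships from Folding@Home work servers with a libstdc++.so.6 that only goes up to GLIBCXX_3.4.28, but libamdocl64.so (the AMD ROCm OpenCL implementation) requires GLIBCXX_3.4.29. Since the core uses its bundled copy, the OpenCL library cannot be loaded and the OpenCL platform never shows up.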
Fortunately the system libstdc++ (/lib/x86_64-linux-gnu/libstdc++.so.6) is GLIBCXX_3.4.30, so it can just be swapped in.
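If you want to confirm the version mismatch on your own machine, a rough sketch like this prints the highest GLIBCXX version string embedded in a library. The function name max_glibcxx is made up for illustration, it assumes GNU grep and sort, and the location of libamdocl64.so depends on how ROCm was installed, so that path is left as a placeholder.

```shell
# Hypothetical helper: print the highest GLIBCXX_x.y.z string found in a file.
# For libstdc++.so.6 these are the versions it provides; for a library such
# as libamdocl64.so they are the versions it requires.
max_glibcxx() {
  grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u -V | tail -n 1
}

# Examples:
#   max_glibcxx /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libstdc++.so.6
#   max_glibcxx /lib/x86_64-linux-gnu/libstdc++.so.6
#   max_glibcxx /path/to/libamdocl64.so   # placeholder, depends on your ROCm install
```

If the number printed for the bundled libstdc++.so.6 is lower than the one printed for libamdocl64.so, you are hitting the same problem.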
Workaround
Configure the GPU slot, and enable it.
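Let it fail and auto-disable; this ensures the core has been downloaded and the directory below exists.
Then replace the bundled libstdc++ with a symlink to the system one: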
cd /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/
sudo rm libstdc++.so.6
sudo ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 libstdc++.so.6
This workaround will last until a new core version is released or something else clears your cores.foldingathome.org cache. Then you will need to apply it again in the new directory.
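Since the workaround has to be reapplied whenever a new core directory appears, here is a rough sketch of a helper that does it for every Core_22.fah directory under a cores root. The function name relink_libstdcxx is made up for illustration. Unlike the commands above, it renames the bundled library to libstdc++.so.6.bundled instead of deleting it, which is my own choice so the change is easy to undo.

```shell
# Hypothetical helper: point the bundled libstdc++.so.6 in each Core_22.fah
# directory under $1 at the system library given as $2.
relink_libstdcxx() {
  cores_root=$1
  system_lib=$2
  find "$cores_root" -type d -name 'Core_22.fah' | while IFS= read -r dir; do
    target="$dir/libstdc++.so.6"
    # Nothing to do if this core does not bundle libstdc++ at all.
    if [ ! -e "$target" ] && [ ! -L "$target" ]; then
      continue
    fi
    # Skip directories that already point at the system library.
    if [ -L "$target" ] && [ "$(readlink "$target")" = "$system_lib" ]; then
      continue
    fi
    # Keep the bundled copy as a backup instead of deleting it.
    if [ ! -L "$target" ]; then
      mv "$target" "$target.bundled"
    fi
    ln -sfn "$system_lib" "$target"
    echo "relinked $target"
  done
}

# Example (run as root):
#   relink_libstdcxx /var/lib/fahclient/cores /usr/lib/x86_64-linux-gnu/libstdc++.so.6
```

Running it a second time should print nothing, since already-relinked directories are skipped.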
Fix
The Folding@Home team needs to update the version of libstdc++ they ship with Core22.
Oh, and does the new ROCm improve folding performance?
Estimated PPD 2110772