
How to Fix, GPUs went to Unsupported after reboot

Posted: Thu Aug 28, 2025 7:56 am
by quartz64
I have a test platform with 8 Nvidia H800 GPUs. Everything worked fine for two days (Arch Linux, fah-client 8.4.9, driver version 580.76.05, CUDA version 13.0), but in the morning, after a reboot, all GPUs are shown as unsupported. The GPU is listed in the GPUS.txt file (0x10de:0x2322:2:9:GH100 [H800 PCIe]). What went wrong?


07:34:16:I1:*********************** Folding@home Client ***********************
07:34:16:I1: Version: 8.4.9
07:34:16:I1: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:34:16:I1: Org: foldingathome.org
07:34:16:I1: Copyright: 2023-2024, foldingathome.org
07:34:16:I1: Homepage: https://foldingathome.org/
07:34:16:I1: License: GPL-3.0-or-later
07:34:16:I1: URL: https://v8-4.foldingathome.org/
07:34:16:I1: Date: Aug 25 2025
07:34:16:I1: Time: 14:11:23
07:34:16:I1: Revision: 360fe71b1bd05bb89814bfb97b73a5bda84802d6
07:34:16:I1: Branch: makepkg
07:34:16:I1: Compiler: GNU 15.2.1 20250813
07:34:16:I1: Options: -Wsuggest-override -faligned-new -std=c++14 -fsigned-char
07:34:16:I1: -ffunction-sections -fdata-sections -O3 -funroll-loops -fno-pie
07:34:16:I1: Platform: linux 6.16.1-arch1-1
07:34:16:I1: Bits: 64
07:34:16:I1: Mode: Release
07:34:16:I1: Args: --config=/etc/fah-client/config.xml
07:34:16:I1: --log=/var/log/fah-client/log.txt
07:34:16:I1: --log-rotate-dir=/var/log/fah-client/
07:34:16:I1: Config: /etc/fah-client/config.xml

07:34:20:I3:gpus = {
07:34:20:I3: "gpu:206:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:209:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:213:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:214:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:79:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:82:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:86:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"},
07:34:20:I3: "gpu:87:00:00": {"vendor": 4318, "device": 8994, "type": "nvidia", "supported": false, "description": "GH100 [H800 PCIe]"}
07:34:20:I3:}
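
As a quick cross-check, the decimal "vendor" and "device" values in the log can be converted to hex and compared with the leading fields of the GPUS.txt entry quoted above. This is only a sketch: the helper name to_pci_id is made up for illustration, and it assumes the first two colon-separated fields of a GPUS.txt line are the PCI vendor and device IDs.

```shell
#!/bin/sh
# Convert decimal vendor/device IDs from the client log into the
# 0xVVVV:0xDDDD form seen at the start of the GPUS.txt entry.
# to_pci_id is a hypothetical helper name, not part of fah-client.
to_pci_id() {
    printf '0x%04x:0x%04x\n' "$1" "$2"
}

# Values taken from the log above: "vendor": 4318, "device": 8994
to_pci_id 4318 8994
# prints 0x10de:0x2322
```

The result matches the GPUS.txt entry, so the ID lookup itself does not appear to be what marks the cards as unsupported.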

Re: How to Fix, GPUs went to Unsupported after reboot

Posted: Thu Aug 28, 2025 9:15 am
by muziqaz
sudo systemctl restart fah-client

This is needed after every reboot on Linux.
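
A rough way to check whether the restart helped is to count the "supported": false lines in the client log before and after. This is a sketch under assumptions: the log path is the one from the Args line in the first post, the pattern mirrors the gpus block printed there, and count_unsupported is a made-up helper name.

```shell
#!/bin/sh
# Count log lines that report a GPU as unsupported.
# Hypothetical helper; the pattern mirrors the "gpus = {...}" block
# in the log from the first post.
count_unsupported() {
    grep -c '"supported": false' "$1"
}

# Usage (not run here; needs root and a running fah-client service):
#   count_unsupported /var/log/fah-client/log.txt
#   sudo systemctl restart fah-client
#   count_unsupported /var/log/fah-client/log.txt
```

The log may still hold entries from earlier starts, so the useful signal is whether the count grows after the restart, not whether it is zero.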

Re: How to Fix, GPUs went to Unsupported after reboot

Posted: Thu Aug 28, 2025 10:56 am
by quartz64
The problem was probably related to the driver version: after the reboot, the new Nvidia driver (v. 580) was loaded.
I solved the problem by switching to Ubuntu Server 24.04 (Nvidia driver 570.158.01).

Re: How to Fix, GPUs went to Unsupported after reboot

Posted: Thu Aug 28, 2025 12:46 pm
by muziqaz
No, this is more of an issue with the latest distros. Ubuntu 24.04 does not seem to have this issue.