Someone just started folding using an Nvidia H100 SXM5!!

muziqaz
Posts: 1916
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by muziqaz »

The Lars developer frequents the FAH section of the Linus Tech Tips forum.
If you want, you can try contacting them there.
FAH Omega tester
arisu
Posts: 579
Joined: Mon Feb 24, 2025 11:11 pm

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by arisu »

calxalot wrote: Sat May 17, 2025 5:57 am Yup. I would say it would be good for the client itself to be more accurate.
Speaking of which, my iGPU is now estimated to be getting 120M PPD after a sudden restart. The client really does need to be fixed. :lol:

If I was connected to Lars at the time, I would have obliterated the 780M stats. Maybe that happened to someone else. It would explain why it thinks that this little iGPU gets 1M PPD when it gets barely half that on the best of days.

Unfortunately the LTT forum isn't letting me register. It's a shame because all I wanted to do was ask the Lars owner if he could disable Cloudflare bot detection on api-folding.lar.systems so that I could write a Python script for users who don't want to keep their browser open 24/7 just to contribute to the database.
BobWilliams757
Posts: 565
Joined: Fri Apr 03, 2020 2:22 pm
Hardware configuration: ASRock X370M PRO4
Ryzen 2400G APU
16 GB DDR4-3200
MSI GTX 1660 Super Gaming X

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by BobWilliams757 »

arisu wrote: Sat May 17, 2025 11:30 pm
calxalot wrote: Sat May 17, 2025 5:57 am Yup. I would say it would be good for the client itself to be more accurate.
Speaking of which, my iGPU is now estimated to be getting 120M PPD after a sudden restart. The client really does need to be fixed. :lol:

If I was connected to Lars at the time, I would have obliterated the 780M stats. Maybe that happened to someone else. It would explain why it thinks that this little iGPU gets 1M PPD when it gets barely half that on the best of days.

Unfortunately the LTT forum isn't letting me register. It's a shame because all I wanted to do was ask the Lars owner if he could disable Cloudflare bot detection on api-folding.lar.systems so that I could write a Python script for users who don't want to keep their browser open 24/7 just to contribute to the database.
I have to admit, I don't contribute to LAR very often because I just close the browser and forget. I appreciate what he is doing; I just have to remind myself not to close the browser when I'm contributing.
Fold them if you get them!
arisu
Posts: 579
Joined: Mon Feb 24, 2025 11:11 pm

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by arisu »

That's exactly why I'd like to make a Python script to do that. It could run in the background and you could forget about it. Whenever FAH is running, it could automatically connect to it and send data, while also doing sanity checks to ensure that it doesn't upload faulty data. I can implement reliable sanity checks for the most common problems pretty easily.

But alas, the API is behind Cloudflare.
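
A minimal sketch of the kind of sanity check described above, not the actual Lars uploader or its API. The function name `is_plausible_ppd`, the `max_ratio` threshold, and the sample numbers are all assumptions for illustration. It only rejects a PPD reading that is wildly out of line with the recent median, such as an estimate that spikes right after a restart.

```python
from statistics import median


def is_plausible_ppd(sample, history, max_ratio=3.0):
    """Return True if a PPD sample looks sane relative to recent history.

    Hypothetical helper: `history` is a list of earlier PPD readings for the
    same GPU, and `max_ratio` is an arbitrary illustrative threshold.
    """
    if sample <= 0:
        return False  # zero or negative PPD is never a valid reading
    if not history:
        return True  # nothing to compare against yet
    typical = median(history)
    # Reject readings far above or far below the typical value.
    return typical / max_ratio <= sample <= typical * max_ratio


# Illustrative numbers only: an iGPU that normally reports around 500k PPD.
recent = [480_000, 510_000, 495_000, 520_000]
print(is_plausible_ppd(505_000, recent))      # ordinary reading
print(is_plausible_ppd(120_000_000, recent))  # post-restart spike
```

A median is used rather than a mean so that one bad reading already in the history does not drag the baseline with it.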
CaptainHalon
Posts: 114
Joined: Mon Apr 13, 2020 11:47 am

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by CaptainHalon »

Just for fun (I know, I have no life), I tried getting an H100 SXM5 folding on TensorDock. No matter which Ubuntu image I used or how I tried to configure things and make sure OpenCL was installed and ready, I could never get the thing to fold. Ubuntu 22, Ubuntu 24, Nvidia 550, Nvidia 570... I tried just about every image combo, as well as configuring after the fact, but it never would fold. On the other hand, putting a pair of 5090s under a Windows VM is far easier.
muziqaz
Posts: 1916
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by muziqaz »

Your conventional Nvidia drivers won't work with SXM5, I believe. You probably need some sort of pro drivers or something. I'm not sure how Nvidia separates them.
FAH Omega tester
CaptainHalon
Posts: 114
Joined: Mon Apr 13, 2020 11:47 am

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by CaptainHalon »

muziqaz wrote: Wed Aug 13, 2025 10:13 am Your conventional nVidia drivers won't work with SXM5, I believe. You probably need some sort of pro drivers or something. Not sure how Nvidia separates them
Yeah, I'm sure someone who knows a lot more about it than me would have better luck. I sank about 2 hours into trying to get it configured, even resorting to asking ChatGPT how to fix it. I got to the point where everything was showing that it should work, but it just didn't.
toTOW
Site Moderator
Posts: 6472
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by toTOW »

The templates provided by this cloud service might lack some software components required by FAH ...

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
arisu
Posts: 579
Joined: Mon Feb 24, 2025 11:11 pm

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by arisu »

CaptainHalon wrote: Wed Aug 13, 2025 10:23 am
muziqaz wrote: Wed Aug 13, 2025 10:13 am Your conventional nVidia drivers won't work with SXM5, I believe. You probably need some sort of pro drivers or something. Not sure how Nvidia separates them
Yeah, someone who knows a lot more about it than me I'm sure would have better luck. I sank about 2 hours into trying to get it configured, even resorting to asking chatGPT how to fix it. Got to the point where everything was showing that it should work. But it just didn't.
Are you able to run OpenCL test applications? What is the vendor ID and product ID of the GPU? If it is a cloud system then it might be exposing a fake vendor/product ID that is not in the FAH GPU whitelist.
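
One way to answer the vendor/product ID question on Linux is to read the bracketed `[vendor:device]` pair that `lspci -nn` prints. The sketch below only parses such a line. The sample line is made up for illustration (a cloud VM may report something quite different), and `parse_pci_ids` is a hypothetical helper. It prints the IDs in decimal as well as hex, in case the list you compare against uses decimal.

```python
import re

# Matches the "[vvvv:dddd]" vendor:device pair in `lspci -nn` output.
# The class code, e.g. "[0302]", has no colon and is therefore skipped.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_pci_ids(line):
    """Return (vendor, device) as integers, or None if no ID pair is found."""
    match = PCI_ID_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Made-up sample line for illustration; real output varies by system.
sample = "01:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:2330] (rev a1)"
vendor, device = parse_pci_ids(sample)
print(f"vendor 0x{vendor:04x} = {vendor}, device 0x{device:04x} = {device}")
```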
muziqaz
Posts: 1916
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by muziqaz »

arisu wrote: Thu Aug 14, 2025 8:32 pm
CaptainHalon wrote: Wed Aug 13, 2025 10:23 am
muziqaz wrote: Wed Aug 13, 2025 10:13 am Your conventional nVidia drivers won't work with SXM5, I believe. You probably need some sort of pro drivers or something. Not sure how Nvidia separates them
Yeah, someone who knows a lot more about it than me I'm sure would have better luck. I sank about 2 hours into trying to get it configured, even resorting to asking chatGPT how to fix it. Got to the point where everything was showing that it should work. But it just didn't.
Are you able to run OpenCL test applications? What is the vendor ID and product ID of the GPU? If it is a cloud system then it might be exposing a fake vendor/product ID that is not in the FAH GPU whitelist.
That fake PCI ID is usually a nonsensical MS ID, which cannot be whitelisted.
FAH Omega tester
arisu
Posts: 579
Joined: Mon Feb 24, 2025 11:11 pm

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by arisu »

If that's the case then it explains why everything seems like it should work. Maybe the client should give a more meaningful error, something like "GPU detected but not whitelisted" instead of just "supported: false" which could mean anything from not having CUDA drivers installed to being on a genuinely unsupported GPU. Better yet, it could distinguish between being blacklisted (species 0) and not being found in the gpus.json file (in that case it could add a message saying to post a request on the forum to whitelist the GPU). I'll write a PR for that in a week when I return from a trip.
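
The distinction could look roughly like this. It is a sketch, not the client's code: the entry layout (`vendor`, `device`, `species` keys), the `describe_gpu_support` name, and the message wording are assumptions, and the two whitelist entries are invented placeholders rather than real gpus.json data.

```python
def describe_gpu_support(vendor, device, whitelist):
    """Explain why a detected GPU is or is not usable.

    `whitelist` is assumed to be a list of dicts with "vendor", "device"
    and "species" keys; the real gpus.json layout may differ.
    """
    for entry in whitelist:
        if entry["vendor"] == vendor and entry["device"] == device:
            if entry["species"] == 0:
                return "GPU detected but disabled in the whitelist (species 0)"
            return "GPU detected and whitelisted"
    return ("GPU detected but not found in gpus.json; "
            "post a request on the forum to have it whitelisted")


# Invented placeholder entries, not real whitelist data.
example_whitelist = [
    {"vendor": 1, "device": 100, "species": 0},
    {"vendor": 1, "device": 200, "species": 5},
]

print(describe_gpu_support(1, 100, example_whitelist))
print(describe_gpu_support(1, 200, example_whitelist))
print(describe_gpu_support(1, 300, example_whitelist))
```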

I may also write a PR that allows people to manually override the GPU detection via the config file. Maybe like <gpu-detection-override v='0:4318:9008'/> to mean "treat GPU at index 0 as vendor 4318 device 9008" (aka H100 SXM5). That would be helpful for people running on the cloud who know what they're doing, as well as testers. It probably should not be exposed in the web interface or naive people might try to fold on some mobile Fermi GPU or something.
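
Parsing the proposed value is straightforward. The sketch below assumes the `index:vendor:device` format suggested above, with decimal numbers. The option does not exist in the client, and `GpuOverride` and `parse_gpu_override` are hypothetical names.

```python
from typing import NamedTuple


class GpuOverride(NamedTuple):
    index: int   # position of the GPU in the detected list
    vendor: int  # PCI vendor ID, decimal
    device: int  # PCI device ID, decimal


def parse_gpu_override(value):
    """Parse a proposed override string such as "0:4318:9008"."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected index:vendor:device, got {value!r}")
    index, vendor, device = (int(p) for p in parts)  # int() rejects non-numbers
    # PCI vendor and device IDs are 16-bit values.
    if index < 0 or not 0 <= vendor <= 0xFFFF or not 0 <= device <= 0xFFFF:
        raise ValueError(f"value out of range in {value!r}")
    return GpuOverride(index, vendor, device)


print(parse_gpu_override("0:4318:9008"))
```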
muziqaz
Posts: 1916
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by muziqaz »

arisu wrote: Thu Aug 14, 2025 9:18 pm If that's the case then it explains why everything seems like it should work. Maybe the client should give a more meaningful error, something like "GPU detected but not whitelisted" instead of just "supported: false" which could mean anything from not having CUDA drivers installed to being on a genuinely unsupported GPU. Better yet, it could distinguish between being blacklisted (species 0) and not being found in the gpus.json file (in that case it could add a message saying to post a request on the forum to whitelist the GPU). I'll write a PR for that in a week when I return from a trip.

I may also write a PR that allows people to manually override the GPU detection via the config file. Maybe like <gpu-detection-override v='0:4318:9008'/> to mean "treat GPU at index 0 as vendor 4318 device 9008" (aka H100 SXM5). That would be helpful for people running on the cloud who know what they're doing, as well as testers. It probably should not be exposed in the web interface or naive people might try to fold on some mobile Fermi GPU or something.
The last bit might be dangerous once people start spoofing their GPUs ;) Even if you hide it, people will find out and exploit it.
FAH Omega tester
Joe_H
Site Admin
Posts: 8180
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by Joe_H »

muziqaz wrote: Thu Aug 14, 2025 9:50 pm The last bit might be dangerous, when people start spoofing their GPUs Even if you hide it, people will find out and exploit it
There is no "might be" about it; it is dangerous. Earlier, when GPU folding was just starting out, people figured out how to spoof what GPU they had. Mostly it went okay, then some new GPUs came out that turned out to have a bug. An entire project that I heard about was messed up by invalid results from those GPUs and had to be rerun from the beginning. Since I wasn't a mod or doing internal testing at the time, there may have been more damage I didn't hear about.

That experience is part of what is behind the GPU "whitelist" that also bans a whole bunch of PCI IDs. Some of those very old ones in the list were banned for returning bad results.
arisu
Posts: 579
Joined: Mon Feb 24, 2025 11:11 pm

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by arisu »

Joe_H wrote: Thu Aug 14, 2025 10:02 pm That experience is part of what is behind the GPU "whitelist" that also bans a whole bunch of PCI IDs. Some of those very old ones in the list were banned for returning bad results.
That makes sense. If I make a PR, I'll have it so that it does not allow overriding an explicitly banned device.
muziqaz
Posts: 1916
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Re: Someone just started folding using an Nvidia H100 SXM5!!

Post by muziqaz »

arisu wrote: Thu Aug 14, 2025 10:06 pm
Joe_H wrote: Thu Aug 14, 2025 10:02 pm That experience is part of what is behind the GPU "whitelist" that also bans a whole bunch of PCI IDs. Some of those very old ones in the list were banned for returning bad results.
That makes sense. If I make a PR, I'll have it so that it does not allow overriding an explicitly banned device.
There are no explicitly banned GPUs per se. All GPUs on the whitelist that are disabled are marked as species 0, so you can code for those GPUs. However, the rest of the GPUs are enabled and have different species designations, and we use those species to disable GPUs on a per-project basis. Say species 5 from AMD fails a specific project: we constrain against that species. In the grand scheme of things the GPUs in that species are not banned, only temporarily disabled for that project; they still work in other projects.
So a user who has a species 5 AMD GPU might decide that they are not getting enough work and spoof their GPU as some Nvidia GPU. It then starts receiving the project it is not supposed to get, and starts failing it left and right. The user thinks it might be a one-off and keeps the GPU running, which continues downloading all the projects that fail on it.
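
The scenario can be illustrated with a toy model. Everything in it is hypothetical: the `Gpu` and `Project` classes, the project number, and the species values are invented, and the real assignment logic is not shown in this thread. The point is only that a per-project constraint keyed on the reported vendor and species cannot catch a GPU that reports a different identity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Gpu:
    vendor: str   # vendor the client reports, e.g. "amd" or "nvidia"
    species: int  # species the whitelist assigns to the reported device


@dataclass
class Project:
    number: int
    # (vendor, species) pairs that are constrained against for this project
    excluded: set = field(default_factory=set)

    def may_assign(self, gpu):
        """A GPU is eligible unless it is disabled or excluded here."""
        if gpu.species == 0:
            return False  # species 0 is disabled for all projects
        return (gpu.vendor, gpu.species) not in self.excluded


# Invented project that AMD species 5 is known to fail.
project = Project(number=1, excluded={("amd", 5)})

honest = Gpu(vendor="amd", species=5)
spoofed = Gpu(vendor="nvidia", species=8)  # same card, false identity

print(project.may_assign(honest))   # constraint applies
print(project.may_assign(spoofed))  # constraint is bypassed
```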
FAH Omega tester