What is best practice for overclocking / undervolting a 50xx in Linux?

foldinghomealone
Posts: 146
Joined: Wed Feb 01, 2017 7:07 pm
Hardware configuration: 5900x + 5080

Post by foldinghomealone »

Hi there,

I just managed to set up a FAH client in Xubuntu with my new 5080.

As I have never used Linux before, how can I overclock and/or undervolt a 5080?
What is the best testing method as I don't want to dump too many WUs?

Thanks in advance
muziqaz
Posts: 2063
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Post by muziqaz »

Best practice is to keep it how god (Jensen, or the Man in the Jacket) intended. That way, when things fail, you know it is not your hardware.
FAH Omega tester
Joe_H
Site Admin
Posts: 8212
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Post by Joe_H »

With the more recent GPUs, experience says to let the firmware handle clocks and voltages. Control the GPU by setting power limits or thermal limits.
Albuquerquefx
Posts: 9
Joined: Wed Oct 01, 2025 3:05 am
Hardware configuration: AMD 9800X3D + 5090, Windows 11
AMD 5950X + 4070 Super, Fedora 42 VM on Proxmox 8.3
Intel i7-3930k + 4070 Super + 4090, Fedora 42

Post by Albuquerquefx »

Since the question is about how to undervolt + overclock, I'm going to answer the question rather than opine on the merits. This is gonna be long; I assume you're OK with reading.

If you're accustomed to a Windows-esque undervolt + overclock experience, you're in for a hard time. Getting a proper undervolt + overclock in Linux is a mostly command-line-driven set of tasks. Also, to be really honest, I find the GPU load / stability testing methods available in Linux sorely lacking. Speaking only for me and my experience over the past decade or so, I much prefer to test the card's limits and stability within Windows, where I can run more extensive GPU tests like Kombustor or 3DMark or the like. I also like how fine-grained we can get with MSI Afterburner or Asus's GPU utility, whatever it's called (can you tell I don't use it? heh...). Then I take good notes and move to Linux to implement those limits.

Purely for example's sake, let's take my personal RTX 4090 on my dedicated folding rig. Under Windows 10, after weeks of load testing, stability testing, and playing video games ( :D ), I arrived at a fully stable undervolt + overclock combo of 0.950v at 2745MHz. Great! Now how do we translate those settings into Linux?

First, you'll need Coolbits enabled, which means you need NVIDIA's proprietary (black-box) driver, as Nouveau can't do it. Getting the driver is pretty straightforward; Coolbits, however, isn't always so easy. There's a litany of examples online on how you can use nvidia-settings, or edit xorg.conf, or, depending on your distro, use an override file or some such to get Coolbits fully enabled. I'll tell you right now: sometimes it SUCKS getting Coolbits to fully enable. Both of my dedicated folding boxes run Fedora because, again speaking only for myself, I've had the most success getting Coolbits enabled under Fedora, versus Ubuntu LTS and the two or three other distros I bounced through and gave up on. Coolbits is what enables clock management, which is obviously required for any amount of overclocking.
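
To make the xorg.conf route a bit more concrete, here is a minimal shell sketch that writes a Coolbits snippet to a file. The helper name write_coolbits_snippet, the Identifier string, the example file name, and the default value 8 are all assumptions for illustration, not something tested on every distro. 8 is the Coolbits bit that NVIDIA's driver README describes as unlocking clock offsets on the PowerMizer page; other guides use larger values such as 28 to add fan and voltage controls. Whether your distro honors a snippet like this varies, which is exactly the pain described above.

```shell
# Sketch: write an Xorg config snippet that turns on Coolbits.
# Assumptions (not from the post): the helper name, the Identifier
# string, and the default value 8 (the bit that unlocks clock offsets).

write_coolbits_snippet() {
    # $1 = output file, $2 = Coolbits value (defaults to 8)
    out_file=$1
    coolbits=${2:-8}
    cat > "$out_file" <<EOF
Section "Device"
    Identifier "NvidiaGPU"
    Driver     "nvidia"
    Option     "Coolbits" "$coolbits"
EndSection
EOF
}

# Example (run as root, then restart the X session):
#   write_coolbits_snippet /etc/X11/xorg.conf.d/20-nvidia-coolbits.conf 8
```

If the snippet takes effect, the clock offset field should appear in nvidia-settings after the X session restarts. If it doesn't, the driver's nvidia-xconfig tool has a --cool-bits option that edits xorg.conf directly and may be worth trying instead.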

Next, we set a clock limit. For my example, I need my card to stop at 2745MHz GPU core, so sudo nvidia-smi -lgc 2745 is the proper command. This runs the nvidia-smi tool as superuser to lock the graphics clock (-lgc, long form --lock-gpu-clocks) at 2745MHz. If you're familiar with the -lgc argument, you may wonder why we aren't using the optional lower / upper bound form (example: -lgc 210,2745). This is because, without a proper voltage curve editor, the low-clock state can actually destabilize your GPU. I'll explain in the next step...
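
As a sketch, the lock step and its undo can be wrapped in two small shell helpers. The helper names and the DRY_RUN switch are assumptions added for illustration; the nvidia-smi flags themselves (-lgc to lock, -rgc to reset) are the real ones.

```shell
# Sketch: lock and unlock the GPU core clock with nvidia-smi.
# DRY_RUN and the helper names are assumptions added for illustration.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

lock_gpu_clock() {
    # $1 = clock in MHz; reject anything that is not a plain integer.
    case "$1" in
        ''|*[!0-9]*) echo "usage: lock_gpu_clock MHZ" >&2; return 1 ;;
    esac
    run sudo nvidia-smi -lgc "$1"
}

unlock_gpu_clock() {
    # -rgc resets the lock so the card can change clocks freely again.
    run sudo nvidia-smi -rgc
}

# Example:
#   lock_gpu_clock 2745
#   nvidia-smi --query-gpu=clocks.gr --format=csv,noheader   # check the result
```

Setting DRY_RUN=1 first lets you see exactly what would be run before touching the card. Keep unlock_gpu_clock handy for when you want to return to stock behavior.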

Now we set a clock offset, and finding the right clock offset requires a little bit of fiddling and CLI work. Once Coolbits is enabled AND you've locked your graphics clock, you can find the clock offset field in sudo nvidia-settings under the PowerMizer tab for your GPU. Type a clock offset into the field at the bottom of the pane, press Enter, and it applies immediately. For my example, I want 0.95v at my 2745MHz clock target. Clock offset moves in increments of 15MHz, rounded down, so if you type in +23 you're actually getting +15MHz, if you type in +49 you're actually getting +45MHz, etc. Eventually you'll find the clock offset which forces the voltage you want at your desired clock, but what you've really done is shift the entire factory clock / voltage curve. So if you let your card drop back into idle clocks, you might end up in a place where your idle voltage (0.825v or whatever) is now somehow trying to run 500MHz, and it crashes and burns. This is why we lock the card to the high speed and don't let the clocks "wander."
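
The 15MHz rounding is easy to get wrong in your notes, so here is a small shell sketch that shows what offset you will really get and then applies it from the command line instead of the GUI. The helper names, the DRY_RUN switch, and GPU index 0 are assumptions for illustration. GPUGraphicsClockOffsetAllPerformanceLevels is the nvidia-settings attribute behind that offset field on recent drivers; depending on your setup it may need to run inside the X session, with or without sudo.

```shell
# Sketch: compute the offset the driver will really apply, then set it.
# Assumptions: helper names, DRY_RUN, and GPU index 0.

snap_offset() {
    # Round a non-negative MHz offset down to a multiple of 15.
    echo $(( $1 / 15 * 15 ))
}

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

set_clock_offset() {
    # $1 = requested offset in MHz; the snapped value is what gets applied.
    mhz=$(snap_offset "$1")
    run nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffsetAllPerformanceLevels=$mhz"
}

# Example:
#   snap_offset 49        # the post's example: +49 really means +45
#   set_clock_offset 210
```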

To get the voltage tuned right via clock offset, you'll need another command: nvidia-smi -q -d VOLTAGE (no sudo needed), which shows you the current running voltage of the GPU. What you need to do is load up the GPU with something, maybe folding so long as you move gently, and then begin dialing the clock offset upward, checking the VOLTAGE output, moving the clock offset again, and so on until the VOLTAGE output closely matches your intended target. Also, please round up on the voltage, as you will not get it exactly where you want. Since I'm aiming for 0.95v and I keep moving the clock offset in 15MHz increments, eventually it will jump from 0.956v to 0.947v, which is too far. Once you find that point, move the clock offset down two notches (e.g. if +240MHz got you 0.947v and you're aiming for 0.95v, then I'd suggest moving down to +210MHz as a good stable point). We want to err on the side of caution, because I can tell you from experience you'll never dead-center the voltage on Linux like you could on Windows. It will always be just a tiny bit higher in the nvidia-smi -q -d VOLTAGE output. I just looked, and my 4090 under Linux actually runs at about 0.963v, even though it was stable for weeks at 0.95v under Windows. This is because you're setting an offset to the quirky factory curve (in Linux) rather than a hard voltage limit (in Windows), and there will be seemingly random instances where the Linux interpretation of the load point on the factory voltage vs clock curve somehow skews in a way where you get a lower blip of voltage and it locks up your box.
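
The back-and-forth above boils down to one decision per step, which can be sketched in shell. This assumes the VOLTAGE report contains a line of the form "Graphics : 956.250 mV"; the exact layout can differ between driver versions, and some cards report N/A. The helper names are made up for illustration.

```shell
# Sketch of the decision step in the tuning loop.
# Assumption: the VOLTAGE report has a line like "Graphics : 956.250 mV".

read_graphics_mv() {
    # Read `nvidia-smi -q -d VOLTAGE` output on stdin and print the
    # graphics voltage in whole millivolts (fraction truncated).
    awk -F: '/Graphics/ && /mV/ { split($2, f, " "); print int(f[1]); exit }'
}

next_offset() {
    # $1 = target mV, $2 = measured mV, $3 = current offset in MHz.
    # At or above target: try one 15MHz notch higher.
    # Below target: that was too far, so settle two notches (30MHz) lower.
    if [ "$2" -ge "$1" ]; then
        echo $(( $3 + 15 ))
    else
        echo $(( $3 - 30 ))
    fi
}

# Example, with the GPU under load:
#   mv=$(nvidia-smi -q -d VOLTAGE | read_graphics_mv)
#   next_offset 950 "$mv" 225
```

With the numbers from the post, a reading of 947 at +240 against a 950 target gives +210, which matches the two-notch back-off described above.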

Alright, so you've set the hard clock limit at 2745MHz, and you've done a bunch of back and forth with the clock offset and the VOLTAGE command to get yourself in the right place. Now you fold, and you watch it very closely. So long as you gave yourself the voltage slack and you did a good job testing the stability of your undervolt + overclock in Windows, this should be the end of it. If at any point you notice the box is suddenly sluggish and folding performance is crawling, it's because your GPU has crashed and recovered, and you need to back off the clock offset by another 30MHz. I'd say keep an eye on it for at least the first 48 hours of full-time folding. If it stays stable for that long, check it at least once a day for probably the first two weeks, and then just check on it occasionally (e.g. at least twice a week) for the rest of time. Depending on the temperature of your room, you may notice instability as the seasons change, thanks to environmental temperature swings.
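
One way to make the "has it crashed and recovered?" check less of a judgment call is to look at the kernel log. The sketch below assumes the NVIDIA driver reports GPU faults as kernel log lines containing "NVRM: Xid", which is how NVIDIA documents its Xid errors. The helper names are made up, and flooring the offset at 0 (back to the stock curve) is an assumption for illustration.

```shell
# Sketch: spot driver-reported GPU faults and compute the backed-off offset.
# Assumption: faults show up in the kernel log as lines with "NVRM: Xid".

count_xid() {
    # Count Xid lines in kernel log text supplied on stdin.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -c 'NVRM: Xid' || true
}

back_off() {
    # $1 = current offset in MHz; drop 30MHz, never going below 0.
    new=$(( $1 - 30 ))
    if [ "$new" -lt 0 ]; then
        new=0
    fi
    echo "$new"
}

# Example:
#   sudo dmesg | count_xid     # anything above 0 deserves a closer look
#   back_off 210               # next offset to try after a crash
```

Not every Xid line means the undervolt is at fault, so treat a non-zero count as a prompt to investigate rather than proof.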

Some people will tell you to manage it by power limit, which does provide results but isn't the same as undervolting, despite what you may be told. Power limiting the card does nothing to the stock clock vs voltage curve, which means 2745MHz at 1.05v will never be achievable on my 4090, because a power limit of 350W is far too low to permit it. Using all the examples I gave above, 2745MHz at 0.95v is something I can hit pretty consistently, even at a 350W power limit, because I've actually undervolted it.
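
For comparison, the power limit route is a single nvidia-smi call, sketched below with the same kind of helper. The helper names and DRY_RUN are assumptions for illustration; -pl is the real nvidia-smi flag. It only caps watts and leaves the clock vs voltage curve untouched, so it can be combined with the locked clock and offset above.

```shell
# Sketch: cap board power with nvidia-smi.
# DRY_RUN and the helper names are assumptions added for illustration.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

set_power_limit() {
    # $1 = limit in watts; reject anything that is not a plain integer.
    case "$1" in
        ''|*[!0-9]*) echo "usage: set_power_limit WATTS" >&2; return 1 ;;
    esac
    run sudo nvidia-smi -pl "$1"
}

# Example:
#   set_power_limit 350
#   nvidia-smi --query-gpu=power.limit,power.draw --format=csv,noheader
```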