What is best practice for overclocking / undervolting a 50xx in Linux?

foldinghomealone · Post by **foldinghomealone** » Thu Oct 09, 2025 7:02 am

Hi there,

I just managed to set up a FAH client in Xubuntu with my new 5080.

As I never used Linux before, how can I overclock and/or undervolt a 5080?
What is the best testing method as I don't want to dump too many WUs?

Thanks in advance

muziqaz · Post by **muziqaz** » Thu Oct 09, 2025 7:19 am

Best practice is to keep it how god (Jensen, or the Man in the jacket) intended. That way, when things fail, you know it is not your hardware.

Post by **Joe_H** » Thu Oct 09, 2025 4:12 pm

With the more recent GPUs, experience says to let the firmware handle clocks and voltages. Control the GPU by setting power limits or thermal limits instead.

Albuquerquefx · Post by **Albuquerquefx** » Thu Oct 09, 2025 8:34 pm

Since the question is about how to undervolt + overclock, I'm going to answer the question rather than opine on the merits. This is gonna be long, I assume you're OK with reading.

If you're accustomed to a Windows-esque undervolt + overclock experience, you're in for a hard time. Getting a proper undervolt + overclock in Linux is a mostly command-line driven set of tasks. Also, to be really honest, I find the GPU load / stability testing methods available in Linux sorely lacking. Speaking only for me and my experience over the past decade or so, I much prefer to test the card's limits and stability within Windows, where I can run expanded GPU tests like Kombustor or 3DMark or the like. I also like how fine-grained we can get with MSI Afterburner or Asus's GPU utility, whatever it's called (can you tell I don't use it? heh...). Then I take good notes and move to Linux to implement those limits.

Purely for example's sake, let's take my personal RTX 4090 on my dedicated folding rig. Under Windows 10, after weeks of load testing, stability testing, and playing video games, I arrived at a fully stable undervolt + overclock combo of 0.950v at 2745MHz. Great! Now how do we translate those settings into Linux?

First, you'll need Coolbits enabled, which means you need the NVIDIA blackbox driver, as Nouveau can't do it. Getting the driver is pretty straightforward; Coolbits, however, isn't always so easy. There's a litany of examples online on how you can use nvidia-settings, or edit xorg.conf, or depending on your distro use an override file or some such, to get Coolbits fully enabled. I'll tell you right now, sometimes it SUCKS getting Coolbits to fully enable. Both of my dedicated Folding boxes are Fedora because, again speaking only for myself, I have the most success getting Coolbits enabled under Fedora versus Ubuntu LTS versus the other distro or three I bounced through and gave up on. Coolbits is what enables clock management, which is obviously required for any quantity of overclocking.

Next, we set a clock limit. For my example, I need my card to stop at 2745MHz GPU core, so sudo nvidia-smi -lgc 2745 is the proper command. This executes the nvidia-smi tool in a superuser context to LockGraphicsClock at 2745MHz. If you're familiar with the -lgc argument, you may wonder why we aren't using the optional lower / upper bound form (example: -lgc 210,2745). This is because, without a proper voltage curve editor, the low-clock state can actually destabilize your GPU. I'll explain in the next step...

Now we set a clock offset, and finding the right clock offset requires a little bit of fiddling and CLI work. Once Coolbits is enabled AND you've locked your graphics clock, you can find Clock Offset in sudo nvidia-settings under the PowerMizer configuration tab for your GPU. Type in a clock offset down at the bottom of the pane, press enter, and it applies immediately. For my example, I want 0.95v for my 2745MHz clock target. Clock offset moves in increments of 15MHz, rounded down, so if you type in +23, you're actually getting +15MHz, if you type in +49, you're actually getting +45MHz, etc. Eventually you'll find the clock offset which forces the voltage you want at your desired clock, but what you've really done is jack up the whole factory clock / voltage curve. So, if you permit your card to try moving back into idle clocks, you might be at a place where your idle voltage (0.825 or whatever) is now somehow trying to run at 500MHz, and it crashes and burns. This is why we lock the card to the high speed and don't let the clocks "wander."
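To make the rounding described above concrete, here is a minimal Python sketch. It only models the arithmetic from the post (15MHz steps, rounded down); the function name `effective_offset` is made up for illustration and nothing here talks to the driver.

```python
STEP_MHZ = 15  # increment size reported in the post above


def effective_offset(requested_mhz: int) -> int:
    """Return the offset that is actually applied for a positive request.

    Models the behaviour described in the post: the requested value is
    rounded down to the nearest multiple of 15 MHz.
    """
    if requested_mhz < 0:
        raise ValueError("this sketch only models positive offsets")
    return (requested_mhz // STEP_MHZ) * STEP_MHZ


for requested in (23, 49, 210):
    print(f"+{requested} requested -> +{effective_offset(requested)} applied")
```

In other words, when stepping through offsets by hand there is no point in trying values that are not multiples of 15.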

To get the voltage tuned right via clock offset, you'll need another command: nvidia-smi -q -d VOLTAGE (no sudo needed), which shows the current running voltage of the GPU. Load up the GPU with something, maybe folding so long as you move gently, then dial the clock offset upward, check the VOLTAGE output, and move the clock offset again, until the VOLTAGE output closely matches your intended target. Also, please round up on the voltage, as you will not get it exactly where you want. Since I'm aiming for 0.95v and I keep moving the clock offset in 15MHz increments, eventually it will jump from 0.956v to 0.947v, which is too far. Once you find that point, move the clock offset down two notches (eg if +240MHz got you 0.947v and you're aiming for 0.95v, I'd suggest moving down to +210MHz as a good stable point). We want to err on the side of caution, because I can tell you from experience you'll never dead-center the voltage on Linux like you could on Windows. It will always be just a tiny bit higher on the nvidia-smi -q -d VOLTAGE output. I just looked, and my 4090 under Linux actually runs at like 0.963v, even though it was weeks-long stable at 0.95v under Windows. This is because you're actually setting an offset to the quirky factory curve (in Linux) rather than a hard voltage limit (in Windows), and there will be seemingly random instances where the Linux interpretation of the load point on the factory volts vs clock curve somehow skews in a way where you get a lower blip of voltage and it locks up your box.
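The "step up until the voltage drops below target, then back off two notches" rule can be sketched in Python as follows. This is only an illustration of the bookkeeping: the readings are typed in by hand after each manual step, the function name `pick_offset` is hypothetical, and the sample numbers are illustrative values loosely based on the ones quoted in the post.

```python
from typing import List, Optional, Tuple

STEP_MHZ = 15  # offset increment described in the post


def pick_offset(readings: List[Tuple[int, float]],
                target_v: float,
                back_off_steps: int = 2) -> Optional[int]:
    """Choose a conservative clock offset from manual measurements.

    readings: (offset_mhz, observed_volts) pairs in increasing offset order.
    Returns the offset `back_off_steps` notches below the first reading whose
    voltage fell under the target, or None if the voltage never got that low.
    """
    for offset_mhz, volts in readings:
        if volts < target_v:
            return max(0, offset_mhz - back_off_steps * STEP_MHZ)
    return None


# Illustrative readings noted down while stepping the offset under load.
readings = [(210, 0.963), (225, 0.956), (240, 0.947)]
print(pick_offset(readings, target_v=0.950))
```

With these sample readings the first value under 0.95v appears at +240, so the sketch lands on +210, matching the example in the post.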

Alright, so you've set the hard clock limit at 2745, and you've done a bunch of back and forth with clock offset and VOLTAGE command line work to get yourself in the right place. Now you fold, and you watch it very closely. So long as you gave yourself the voltage slack aaaaaand you did a good job testing for stability of your undervolt + overclock in Windows, this should be the end of it. If at any point you notice the box is suddenly sluggish and folding performance is crawling, it's because your GPU has crashed and recovered and you need to back off the clock offset by another 30MHz. I'd say keep an eye on it for at least the first 48 hours of full-time folding, and then if it stays stable for that long, check it at least once a day for the first probably two weeks, and then just check on it occasionally (eg at least twice a week) for the rest of time. Depending on the temperature of your room, you may notice instability as the seasons change, thanks to environmental temperature swings.

Some people will tell you to manage it by power limit, which does provide results but isn't the same as undervolting, despite what you may be told. Power limiting the card does nothing to the stock clock vs voltage curve, which means 2745MHz at 1.05v will never be achievable on my 4090, because the power limit of 350W is far too low to permit it. Using all the examples I gave above, 2745MHz at 0.95v is something I can hit pretty consistently, even at a 350W power limit, because I've actually undervolted it.

foldinghomealone · Post by **foldinghomealone** » Sat Oct 25, 2025 9:31 am

Albuquerquefx wrote: ↑Thu Oct 09, 2025 8:34 pm Since the question is about how to undervolt + overclock, I'm going to answer the question rather than opine on the merits. This is gonna be long, I assume you're OK with reading.
...

For sure I am. I am here to learn...
Thank you so much for your detailed manual

So, finally I had time to test settings in Win11 properly.

Currently, when folding in Win11, I use the following settings (using Afterburner):
clock lock: 3000MHz (locked by pressing "l" and "l")
curve: +470MHz (entered, not shifted manually in the curve editor)
mem clock: +1500MHz (don't know if it has any positive or negative influence)
Maybe not the most fine-tuned settings, but they have been stable with the WUs I've gotten over the last week.

Today I tried to transfer those settings to Xubuntu.

Enabling coolbits:
sudo nvidia-xconfig --cool-bits=28
sudo reboot
Seemed to work properly

Locking GPU Clock:
sudo nvidia-smi -lgc 3000

Using nvidia-settings and Voltage information:
Unfortunately, when using the following command, no voltage information is given:
nvidia-smi -q -d VOLTAGE

=============NVSMI LOG==============

Timestamp : Sat Oct 25 11:21:59 2025
Driver Version : 580.65.06
CUDA Version : 13.0

Attached GPUs : 1

So I just entered the +470MHz clock offset and +1500MHz mem clock that I use in Windows, and currently it seems to work.

When swapping between windows in Xubuntu and going back and forth to the terminal, it seems a bit slow to respond. At least it feels different from before.
Will see what the next days will bring.

edit:
+470MHz doesn't work properly --> next try with 440MHz

Albuquerquefx · Post by **Albuquerquefx** » Mon Oct 27, 2025 1:56 am

It looks like somewhere between the 560 and 580 drivers the nvidia-smi -q -d VOLTAGE got broken somehow. I haven't looked at it in a while, but it absolutely used to work...

For whatever it's worth, my own benchmarking strongly suggested Windows and Linux perform almost identically when left alone in a headless environment. There are some caveats to this, such as Windows being a jerk and forcing reboots for OS updates, which then logs you out. I'm not against patching an operating system, but it screws with your uptime / folding time. I end up using Linux anyway just because I prefer Linux for these sorts of headless workloads...

BobWilliams757 · Post by **BobWilliams757** » Thu Oct 30, 2025 4:07 pm

Just for input - a couple of others and I have found that lowering the memory clocks is actually more efficient. I haven't seen any testing on newer hardware, but the idea is that anything since Turing has more than enough memory bandwidth, so there is little if anything to be gained. BUT driving the memory clocks up uses more power, and that power (if running power limited) could be used to increase core clocks more, or simply not be spent on something that is already fast enough. It might be worth some testing when using locked clocks.

As for the undervolting, it seems to be covered. But for the ultimate in how to get hard data to make your decisions, the thread below has some great ideas.

https://linustechtips.com/topic/14 ... gpus/

Albuquerquefx · Post by **Albuquerquefx** » Fri Nov 07, 2025 11:24 pm

That thread is a gold mine, thank you for posting it Bob! I know what I'm gonna be doing this weekend

muziqaz · Post by **muziqaz** » Fri Nov 07, 2025 11:45 pm

Albuquerquefx wrote: ↑Fri Nov 07, 2025 11:24 pm That thread is a gold mine, thank you for posting it Bob! I know what I'm gonna be doing this weekend

Overheating your GPU?

Albuquerquefx · Post by **Albuquerquefx** » Sat Nov 08, 2025 2:51 am

muziqaz wrote: ↑Fri Nov 07, 2025 11:45 pm
Albuquerquefx wrote: ↑Fri Nov 07, 2025 11:24 pm That thread is a gold mine, thank you for posting it Bob! I know what I'm gonna be doing this weekend
Overheating your GPU?

I mean, in theory, the result should be even less power draw than the roughly ~1500W (GPU-only) I currently dedicate to the fold. My office and "wiring closet" are always quite a bit warmer than the rest of the house for a reason!

BobWilliams757 · Post by **BobWilliams757** » Sat Nov 08, 2025 7:03 pm

Albuquerquefx wrote: ↑Fri Nov 07, 2025 11:24 pm That thread is a gold mine, thank you for posting it Bob! I know what I'm gonna be doing this weekend

To say that he goes in depth would be an understatement. Looking for that efficiency always makes sense for the long game.

Albuquerquefx · Post by **Albuquerquefx** » Mon Nov 24, 2025 5:52 pm

Thought I'd come back and give some thoughts on the tuning thread linked above.

First, HFM (Harlam's Folding Monitor) doesn't support the v8 client, which means there's no good way to ascertain "points per frame". It looks like Harlam abandoned the project somewhere around the beginning of 2024, which is understandable yet also sad. Given this, I decided to just use a single WU and tightly measure the timing between frames to build a reasonable estimate of performance based on seconds per frame; lower obviously being better.
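For anyone wanting to repeat the seconds-per-frame measurement, here is a minimal Python sketch of the arithmetic. It assumes you have noted the wall-clock time at which each frame completed, as `HH:MM:SS` strings; the function name and the sample timestamps are made up for illustration, and the sketch does not parse any particular client log format or handle runs that cross midnight.

```python
from datetime import datetime
from typing import List


def seconds_per_frame(stamps: List[str]) -> float:
    """Average time between consecutive frame completions, in seconds.

    stamps: completion times as 'HH:MM:SS' strings, in order, same day.
    """
    if len(stamps) < 2:
        raise ValueError("need at least two timestamps")
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    deltas = [(later - earlier).total_seconds()
              for earlier, later in zip(times, times[1:])]
    return sum(deltas) / len(deltas)


# Hypothetical frame completion times noted for one WU.
print(seconds_per_frame(["10:00:00", "10:01:48", "10:03:36"]))
```

Averaging over as many frames of the same WU as practical helps smooth out frame-to-frame jitter before comparing two settings.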

Ultimately, here are my findings:

- Irrespective of ultimate power or clock speed, properly undervolting always gives a far more efficient result. In Linux, this is enabled by a combination of a positive GPU clock offset and then limiting the clock to a specific point (ie this is how you stop the GPU from moving further up the voltage ramp.)
- The "most efficient" method discussed in the linked thread was clock-locking for both minimum and maximum speeds. When combined with the undervolting above, this was indeed more power efficient than simply setting an upper power limit.
- Despite what I just typed in the prior bullet, I found that setting an upper power limit with a lot of extra clock on the table substantially increased performance for certain WUs. Specifically, WUs with fewer atoms end up generating less load, so the clocks can ramp higher. While this does permit the card to gulp down a few more watts, it also results in a significant uptick in performance for smaller-atom WUs. This is more important on "big" GPUs where many WUs just aren't big enough to fill all the umpteen-thousand CUDA cores.

I ended up using the data to determine the best power limit for each card, then found a better upper clock limit and fine-tuned the undervolt settings. For example, my 4090 was already running at a power limit of 350W, a clock offset of +210, and a clock limit of 2625 (which resulted in about a 900mV upper voltage limit.) After tuning, the same 4090 now has a power limit of 335W, a clock offset of +240, and a clock limit of 2550 (which results in about an 850mV upper voltage limit.) The result is actually better than the PL indicates; it's about 10% less power for about 2-3% less performance. Compared to purely stock, I'm down roughly 30% power but only about 7% total performance, which is IMO astounding.
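To put the percentages above into a single efficiency number, here is a small Python sketch of the arithmetic. It only uses the rough figures quoted in the post (taking 2.5% as the midpoint of "2-3%"); the function name is hypothetical.

```python
def relative_efficiency(power_ratio: float, perf_ratio: float) -> float:
    """Performance per watt relative to the baseline (1.0 means unchanged).

    power_ratio: new power / baseline power (0.90 means 10% less power)
    perf_ratio:  new performance / baseline performance
    """
    return perf_ratio / power_ratio


# Tuned 4090 vs the previous settings: ~10% less power, ~2.5% less performance.
vs_previous = relative_efficiency(power_ratio=0.90, perf_ratio=0.975)

# Tuned 4090 vs pure stock: ~30% less power, ~7% less performance.
vs_stock = relative_efficiency(power_ratio=0.70, perf_ratio=0.93)

print(f"vs previous settings: {vs_previous:.2f}x performance per watt")
print(f"vs stock:             {vs_stock:.2f}x performance per watt")
```

With those inputs the tuned card works out to roughly 1.08x the performance per watt of the previous settings and roughly 1.33x that of stock.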

The 4070 Supers were far less interesting; just like the 4090, they were already power and clock limited before I started. After data collection, I set the MSI flavored one to PL=180, LGC=2490 and an offset of +150, and the Asus one to PL=190, LGC=2550 and an offset of +195. The power change wasn't a lot; I think I found another 5-6% at most, with about a 2-3% loss in PPD. Compared to pure stock, they're down about 20% power at roughly equivalent performance. For whatever reason, the MSI one has always been a little less performant, and I assume it's related to the cooling.
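As a rough sketch of how the PL and LGC values above map onto commands, the Python below only builds and prints the nvidia-smi invocations; it does not run them. The `-pl` (power limit in watts) and `-lgc` (lock GPU clocks) flags are standard nvidia-smi options; the GPU index 0 and the helper name `build_commands` are assumptions for illustration. The clock offset is not covered here, since the thread sets it through nvidia-settings.

```python
import shlex
from typing import List


def build_commands(gpu_index: int, power_limit_w: int,
                   lock_clock_mhz: int) -> List[List[str]]:
    """Return the nvidia-smi command lines for a power limit and a clock lock.

    Nothing is executed; the caller decides whether to run the commands.
    """
    base = ["sudo", "nvidia-smi", "-i", str(gpu_index)]
    return [
        base + ["-pl", str(power_limit_w)],    # power limit in watts
        base + ["-lgc", str(lock_clock_mhz)],  # lock the graphics clock in MHz
    ]


# Values quoted above for the MSI 4070 Super; GPU index 0 is assumed.
for cmd in build_commands(gpu_index=0, power_limit_w=180, lock_clock_mhz=2490):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them, or to paste them into a startup script, before applying anything to the card.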

I did futz around with the 5090, but it does double-duty in my gaming rig, and I didn't find anything more interesting than the undervolt I already had via Afterburner. I did play with more aggressive PLs; however, I was already operating at 400W, so further reductions in PL took ever-larger bites out of the GPU clock. I kept my current config of 2820MHz at 890mV with PL=400, and it's really happy there.

Cliff's Notes: the LTT forum suggestions are still worth doing, but it's far less intuitive since the recommended tooling doesn't work anymore. A solid undervolt will get you most of the results, but if you're looking for that extra fistful of efficiency, you may indeed find it by looking more closely.

foldinghomealone · Post by **foldinghomealone** » Mon Dec 01, 2025 1:58 pm

BobWilliams757 wrote: ↑Thu Oct 30, 2025 4:07 pm Just for input - Myself and a couple others have found that lowering the memory clocks is actually more efficient. I haven't seen any testing on newere hardware, ...
https://linustechtips.com/topic/14 ... gpus/

I have done some testing with my 5080, but can only speak for the "big" project 18263:
-500MHz MemClock: TPF ~1:59 (~25M PPD)
+1500MHz MemClock: TPF ~1:48 (~29M PPD)

So, for this project it clearly is faster when memclock is increased instead of decreased.
However, I didn't consider / measure energy efficiency.
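To quantify the difference between the two memory clock settings, here is a short Python sketch using only the TPF and PPD figures quoted above. The helper name `tpf_seconds` is made up for illustration.

```python
def tpf_seconds(tpf: str) -> int:
    """Convert a 'M:SS' time-per-frame string into seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


slow = tpf_seconds("1:59")  # -500MHz memory clock
fast = tpf_seconds("1:48")  # +1500MHz memory clock

# Frames completed per unit time scale with 1/TPF.
throughput_gain = slow / fast - 1

# Reported PPD, in millions, for the same two settings.
ppd_gain = 29 / 25 - 1

print(f"throughput gain: {throughput_gain:.1%}")
print(f"PPD gain:        {ppd_gain:.1%}")
```

The throughput gain comes out at about 10%, while the reported PPD gain is 16%. The gap is at least plausibly explained by the quick-return bonus: assuming PPD scales roughly with TPF to the power of -1.5, a 10% throughput gain would predict a PPD gain of around 15-16%.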

Please consider, that I have overclocked / undervolted the system as stated above. Only memclock changed.

It also seems that the smaller WUs from other projects benefit from an increased memclock. However, it's much more difficult to analyze the effect properly, because the difference in TPF is much smaller and the TPF of shorter / smaller WUs fluctuates more than that of big WUs.

BobWilliams757 · Post by **BobWilliams757** » Wed Dec 03, 2025 4:05 pm

The testing I did was on Turing hardware, with many fewer cores. I would imagine at some point the memory bandwidth vs total core count comes into the picture, along with power limits and where a little extra power helps and where it doesn't.

I've found that most changes will work on all projects, but at times there are outliers for some reason. So possibly the CPU load vs bandwidth vs cores, etc..... it can get tricky.

Either way, without testing we never know.
