MeeLee wrote: How do you overclock and set fan speed in Ubuntu?
Do you use the GUI, and then copy a settings file to boot up with these settings?
Caveat - I am not a Linux pro or expert, I'm barely a Linux user. What follows is what I did. It caused me a lot of frustration and probably has a lot of bad practice in it.
That machine was in hibernation for the summer and it did not want to be awoken, it was comfy where it was.
I'm not overclocking per se, I'm underclocking, but the techniques are the same.
This only applies to NVidia GPUs. I have neither the suitable GPUs nor the inclination to figure this out for AMD.
I run advanced control on a machine separate from the ones I run the client on.
Notes on the Machine
It's Ubuntu desktop I'm running. 18.04 was originally installed, it's now upgraded to 20.04.
I'm folding exclusively on NVidia GPUs, those are the most powerful GPUs I have.
I'm running it headless on a wireless connection. The desktop install gives me the desktop session needed to automatically connect the wireless and run the GPUs.
I didn't disable the gui or anything else to reduce resources, it's a very capable machine I'm running.
That said, I don't do cpu folding on it, the cpu was impeding the gpu folding more than it was adding to cpu folding. My Ryzen handles the cpu folding.
Code: Select all
*********************** Log Started 2021-02-08T23:10:06Z ***********************
23:10:06:Trying to access database...
23:10:06:Successfully acquired database lock
23:10:06:Read GPUs.txt
23:10:08:Enabled folding slot 00: PAUSED gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 (by user)
23:10:08:Enabled folding slot 01: PAUSED gpu:1:TU104 [GeForce RTX 2070 SUPER] 8218 (by user)
23:10:08:****************************** FAHClient ******************************
23:10:08: Version: 7.6.13
23:10:08: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:10:08: Copyright: 2020 foldingathome.org
23:10:08: Homepage: https://foldingathome.org/
23:10:08: Date: Apr 28 2020
23:10:08: Time: 04:20:16
23:10:08: Revision: 5a652817f46116b6e135503af97f18e094414e3b
23:10:08: Branch: master
23:10:08: Compiler: GNU 8.3.0
23:10:08: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:10:08: -funroll-loops -fno-pie
23:10:08: Platform: linux2 4.19.0-5-amd64
23:10:08: Bits: 64
23:10:08: Mode: Release
23:10:08: Args: --child /etc/fahclient/config.xml --run-as fahclient
23:10:08: --pid-file=/var/run/fahclient.pid --daemon
23:10:08: Config: /etc/fahclient/config.xml
23:10:08:******************************** CBang ********************************
23:10:08: Date: Apr 25 2020
23:10:08: Time: 00:07:53
23:10:08: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
23:10:08: Branch: master
23:10:08: Compiler: GNU 8.3.0
23:10:08: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:10:08: -funroll-loops -fno-pie -fPIC
23:10:08: Platform: linux2 4.19.0-5-amd64
23:10:08: Bits: 64
23:10:08: Mode: Release
23:10:08:******************************* System ********************************
23:10:08: CPU: Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz
23:10:08: CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
23:10:08: CPUs: 8
23:10:08: Memory: 31.28GiB
23:10:08: Free Memory: 30.49GiB
23:10:08: Threads: POSIX_THREADS
23:10:08: OS Version: 5.4
23:10:08: Has Battery: false
23:10:08: On Battery: false
23:10:08: UTC Offset: 0
23:10:08: PID: 1080
23:10:08: CWD: /var/lib/fahclient
23:10:08: OS: Linux 5.4.0-65-generic x86_64
23:10:08: OS Arch: AMD64
23:10:08: GPUs: 2
23:10:08: GPU 0: Bus:2 Slot:0 Func:0 NVIDIA:8 GP102 [GeForce GTX 1080 Ti] 11380
23:10:08: GPU 1: Bus:1 Slot:0 Func:0 NVIDIA:8 TU104 [GeForce RTX 2070 SUPER]
23:10:08: 8218
23:10:08: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:11.1
23:10:08: CUDA Device 1: Platform:0 Device:1 Bus:2 Slot:0 Compute:6.1 Driver:11.1
23:10:08:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:455.38
23:10:08:OpenCL Device 1: Platform:0 Device:1 Bus:2 Slot:0 Compute:1.2 Driver:455.38
23:10:08:******************************* libFAH ********************************
Set up the machine
The easy bit first.
Install a desktop Linux OS and get wireless working if you plan to use it.
Download the Linux 64-bit GeForce driver from NVidia; I found using the browser easier than wget.
Update/upgrade Ubuntu as far as you can through the package manager and the console.
In the console, if I know I'm going to be doing stuff that needs root privileges, I don't waste time prepending everything with sudo; I just elevate the environment to root using sudo -i
Download and install the FAHClient.
You may need to enable SSH for remote access, I can't remember just now.
WARNING - I have found that ubuntu desktop silently installs updates without waiting for your input and can break the loading of the driver, making the following a little sketchy. I'm assuming it's breaking the loader, I'm re-installing the same driver that was installed before the silent updates and it works perfectly fine after. My lazy ass workaround is to simply re-install the drivers while I am applying less important updates/upgrades.
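A quick way to tell whether the driver is still loading after an update is to look for the nvidia kernel module. This is only a minimal sketch: the helper name module_loaded is made up for illustration, and it assumes the proprietary driver registers a module called "nvidia" in /proc/modules.

```shell
#!/bin/bash
# module_loaded NAME [FILE]: succeed if kernel module NAME is listed in FILE.
# FILE defaults to /proc/modules, where each line starts with the module name.
# (module_loaded is a hypothetical helper, not part of any driver tooling.)
module_loaded() {
    local name="$1" file="${2:-/proc/modules}"
    grep -q "^${name} " "$file" 2>/dev/null
}

if module_loaded nvidia; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is NOT loaded, the driver may need re-installing"
fi
```

If the module is missing after a round of updates, that would be the cue to re-run the driver installer.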
The not-so-easy bit?
Purge the nouveau.
You'll need to exit to the console, shut down the X session and remove/purge the nouveau drivers (the open-source drivers for NVidia cards). Google for guidelines.
The nouveau drivers need to be completely purged or NVidia's proprietary driver will fail to install.
It took me several rounds of purging the nouveau drivers before the proprietary drivers would install; I think I ended up just disconnecting the monitor and doing this remotely through PuTTY.
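The guides you'll find usually come down to blacklisting the nouveau kernel module so it never loads. The following is a sketch of that common approach, not necessarily the exact steps used on this machine; the function name and the file name blacklist-nouveau.conf are assumptions (any .conf file under /etc/modprobe.d should do).

```shell
#!/bin/bash
# write_nouveau_blacklist PATH: write a modprobe config that stops the
# nouveau module from loading. (Hypothetical helper for illustration.)
write_nouveau_blacklist() {
    local target="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Usage as root, followed by an initramfs rebuild and a reboot:
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#   update-initramfs -u
#   reboot
```

After the reboot, nouveau should no longer appear in the output of lsmod and the NVidia installer has a clear run.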
Install the proprietary drivers
Run the NVidia%%%.run file with bash.
These NVidia drivers contain everything we need - OpenCL, CUDA, nvidia-smi, nvidia-settings. No need to separately install OpenCL.
For the umpteenth time reboot the machine, make sure advanced control can communicate with, and control, the client.
At this point you should be ready if you just want to fold.
But no, that's way too easy. We want to remotely control the GPU performance and fan speeds. Now the fun starts.
Below is my own custom rc.local file. It applies the power limits at startup but fails to apply the fan speeds, so I have to run it again once I've logged in.
Let's pull apart one of the nvidia-smi commands.
"/usr/bin/nvidia-smi -i 0 -pl 110"
"-i 0" - this is the index of the GPU you are affecting. Running the command "nvidia-smi" will give a report of the NVidia GPUs in your system; the index is to the left of the GPU name. If you leave this out the command will affect all GPUs.
"-pl 110" - this is the power limit in watts you want to apply to your card(s); it must be within the range set by NVidia for that card.
Just punching "nvidia-smi" into the console gives you a summary of the cards in your system. For everything else there's "nvidia-smi -h", go knock yourself out.
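To find out what range a card accepts before picking a number, nvidia-smi can report the minimum and maximum power limits. A rough sketch follows; the within_range helper and the example limits of 100 W to 250 W are made up for illustration, so substitute whatever your own cards report.

```shell
#!/bin/bash
# within_range WATTS MIN MAX: succeed if MIN <= WATTS <= MAX (decimals allowed).
# (Hypothetical helper, not part of nvidia-smi.)
within_range() {
    awk -v w="$1" -v lo="$2" -v hi="$3" 'BEGIN { exit !(w >= lo && w <= hi) }'
}

# Show each card's allowed power-limit range, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,power.min_limit,power.max_limit --format=csv
fi

# Example with made-up limits (100 W to 250 W).
if within_range 110 100 250; then
    echo "110 W is within range"
else
    echo "110 W is outside the range"
fi
```

Checking the range first saves a failed "-pl" call when the requested wattage is below the card's minimum.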
The fan settings part
"/usr/bin/nvidia-settings -h" - this simply gives you the commands available to nvidia-settings
"DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority" - this is the context in which nvidia-settings needs to run if you actually want to query or set anything on the gpus. It's important you locate your Xauthority file, this is mine, online guides gave a different location. If you're using the wrong Xauthority file the following nvidia-settings command just throws an error.
"DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority /usr/bin/nvidia-settings -q fans" - will give a report of the fans available to control
In the command to set the speed of the fans, "[fan:0]/" is an optional entry: with it you're specifying a single fan to control, without it you're doing them all. My dual card setup has 3 fans between the two cards, and nvidia-settings exposed all 3 of them.
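Since the Xauthority location differs between systems, it can help to test a few likely paths before calling nvidia-settings. This is a minimal sketch: first_existing is a made-up helper, the first candidate is the path from this machine (uid 1000), and the second is the traditional per-user location, which is only a guess for your setup.

```shell
#!/bin/bash
# first_existing PATH...: print the first argument that is a readable file.
# (Hypothetical helper for illustration.)
first_existing() {
    local candidate
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; adjust the uid (1000) and paths to suit.
if xauth_file=$(first_existing /run/user/1000/gdm/Xauthority "$HOME/.Xauthority"); then
    echo "Using XAUTHORITY=$xauth_file"
else
    echo "No Xauthority file found in the candidate locations" >&2
fi
```

Whatever path it prints is the one to put after XAUTHORITY= in the nvidia-settings commands.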
Code: Select all
#!/bin/bash
# scripts/commands to execute at boot time using root privileges
# persistence mode does not apply to geforce but may work on other nvidia cards
nvidia-smi -pm 1
# lower the power target of the GPUs, GTX 1080 Ti should be id 1 with RTX 2070 Super id 0
# RTX 2070 SUPER and GTX 1080Ti combo
/usr/bin/nvidia-smi -i 0 -pl 110
/usr/bin/nvidia-smi -i 1 -pl 150
# take control of GPU fans
DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority /usr/bin/nvidia-settings -a GPUFanControlState=1
# set speed of gpu fans individually, the "[fan:0]/" part is important, change the number to suit
# NOTE - The fans may be in a different order to the GPUs, test to determine which fan is which
# DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority /usr/bin/nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=65"
# set speed of all gpu fans
DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority /usr/bin/nvidia-settings -a GPUTargetFanSpeed=55
exit 0
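One possible way around the fan settings failing at boot is to keep retrying until the desktop session exists. This is an untested sketch built on two assumptions: that the failure happens because the X session (and its Xauthority file) isn't up yet when rc.local runs, and that nvidia-settings returns a non-zero exit status when it can't reach the display. The retry helper is made up for illustration.

```shell
#!/bin/bash
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# at most ATTEMPTS times, sleeping DELAY seconds between tries.
# (Hypothetical helper for illustration.)
retry() {
    local attempts="$1" delay="$2" i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        if ((i < attempts)); then sleep "$delay"; fi
    done
    return 1
}

# Hypothetical usage in rc.local: try every 5 seconds for about 2 minutes,
# in the background so the rest of the boot isn't held up.
# retry 24 5 env DISPLAY=:0 XAUTHORITY=/run/user/1000/gdm/Xauthority \
#     /usr/bin/nvidia-settings -a GPUFanControlState=1 -a GPUTargetFanSpeed=55 &
```

The "env" in front is needed because variable assignments passed as arguments to a function are not treated as environment settings by bash.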
I've just spent the last 3-4 hours pulling all of this together, I hope it makes sense and is useful.
I would love to hear some feedback.