markdotgooley wrote:HaloJones wrote:
Are the cards running that hot because you're letting NVidia control the fan curves? If you don't intervene that's what NVidia will do - run them up to 82C.
My 1070s never go much above 25C over ambient but that's what water-cooling will get you.
I’m running Ubuntu 18.04 (can’t get things to work under anything later) and I seem to have limited control of things (or I’m just too ignorant). I think I could maybe force power consumption to 125W a card rather than the roughly 150W each they’re currently using. Maybe I can use the nvidia-smi command on a Linux terminal if that’s working (haven’t tried recently) to try to tweak some settings. The cards aren’t supposed to throttle until 90C and if neither goes over 82C, maybe I shouldn’t worry. Ambient is typically 25C.
I can only tell you my way of doing things.
First, enable coolbits=28 on your system (28 = 4 + 8 + 16, which unlocks manual fan control, clock offsets and overvoltage):
Code:
sudo nvidia-xconfig --enable-all-gpus
sudo nvidia-xconfig --cool-bits=28
Then reboot.
Next, in the GUI, set the fan curve to max (or the maximum you feel comfortable with; most GPUs handle 3000RPM fine with little noise, but some people prefer to run them at 2800RPM or 2500RPM).
The lower the fan speed, the lower the performance, because a hotter GPU boosts to lower clocks.
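If you'd rather do the fan step from a terminal, here is a rough sketch of how it could look with nvidia-settings. It assumes Coolbits is already enabled and an X session is running. `run` and `set_fan_speed` are helper names I made up for illustration. Note that the setting takes a percentage rather than RPM, and the fan index may not match the GPU index on cards with more than one fan.

```shell
# Sketch: pin a GPU fan to a fixed percentage via nvidia-settings.
# Assumes Coolbits is enabled and X is running (DISPLAY set).
# run and set_fan_speed are hypothetical helpers, not NVIDIA tools.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: set_fan_speed GPU_INDEX FAN_INDEX PERCENT
set_fan_speed() {
  local gpu="$1" fan="$2" pct="$3"
  # Switch the GPU to manual fan control, then set the target speed.
  run nvidia-settings \
    -a "[gpu:${gpu}]/GPUFanControlState=1" \
    -a "[fan:${fan}]/GPUTargetFanSpeed=${pct}"
}
```

Try `DRY_RUN=1 set_fan_speed 0 0 100` first to see what would be run, then drop `DRY_RUN=1` to apply it.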
Then cap the power. The 2060 and 2060 Super (and more than likely the KO as well) should run at 125W. There's no performance gain above this value, just wasted power.
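The power cap can be set with nvidia-smi. A minimal sketch, assuming two cards at indices 0 and 1; `run` and `cap_power` are hypothetical helper names. The limit does not survive a reboot, so it has to be reapplied each boot.

```shell
# Sketch: cap the power limit on one or more GPUs with nvidia-smi.
# run and cap_power are hypothetical helpers for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: cap_power WATTS GPU_INDEX [GPU_INDEX...]
cap_power() {
  local watts="$1"
  shift
  local gpu
  # Persistence mode keeps the driver loaded so the limit stays applied.
  run sudo nvidia-smi -pm 1
  for gpu in "$@"; do
    # -i selects the GPU, -pl sets its power limit in watts.
    run sudo nvidia-smi -i "$gpu" -pl "$watts"
  done
}
```

For the two cards discussed here that would be `cap_power 125 0 1`. nvidia-smi refuses values outside the range the card's firmware allows.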
Then do a safe OC. The safe values depend on your GPU manufacturer and differ from card to card.
I have EVGA GPUs that only allow +65MHz safely (they'll do +100MHz for a short while, but that causes bad WUs). I have Gigabyte GPUs that do +120MHz. I have Asus GPUs that do only +35MHz for the ROG Strix (factory overclocked), or +220MHz for one of their blower GPUs. Every brand is different, and there are even differences between models.
The memory I always set to +1400MHz, as the memory modules are underclocked from the factory by default. Those memory modules are rated for 15Gbps (both Samsung and Hynix), but run at 14Gbps or lower from the factory. The difference between +1000MHz and +1400MHz isn't really noticeable (less than 1 to 2% in PPD).
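The offsets can also be applied from a terminal with nvidia-settings once Coolbits is on. This is only a sketch: `run` and `set_offsets` are made-up helper names, and the performance-level index in the square brackets is an assumption that varies by card and driver, so check which level your card exposes before using it.

```shell
# Sketch: apply core and memory clock offsets via nvidia-settings.
# Assumes Coolbits is enabled and X is running (DISPLAY set).
# run and set_offsets are hypothetical helpers for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: set_offsets GPU_INDEX PERF_LEVEL CORE_OFFSET MEM_OFFSET
# PERF_LEVEL is the performance-level index (assumption: card-specific).
set_offsets() {
  local gpu="$1" level="$2" core="$3" mem="$4"
  run nvidia-settings \
    -a "[gpu:${gpu}]/GPUGraphicsClockOffset[${level}]=${core}" \
    -a "[gpu:${gpu}]/GPUMemoryTransferRateOffset[${level}]=${mem}"
}
```

For example, `DRY_RUN=1 set_offsets 0 4 65 1400` prints the command for a +65 core and +1400 memory offset on GPU 0 at level 4. Start low and watch for bad WUs before going higher.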
Once a safe OC is paired with a safe power cap and the fan speed is maxed out, you'll run much lower temps.
Also, make sure the PSU is helping the case fans pull the hot air out of the case. Sometimes that requires mounting the PSU upside down.
Another thing you can do to reduce GPU temps is get rid of the front mounting bracket. And if your GPUs are sandwiched in the case, remove the PCIe mounting bracket on the case that's between the GPUs, for better airflow. If hot exhaust air would otherwise get sucked back into the PC, use cardboard and tape to build a duct.
Your setup should allow GPU temps to run below 75C.
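To check whether the cards actually stay under that mark, something like the following could work. It is a sketch: `check_temps` is a hypothetical helper that reads "index, temperature" lines, which is the format nvidia-smi prints for the query shown in the note below the block.

```shell
# Sketch: flag GPUs whose temperature exceeds a limit.
# check_temps is a hypothetical helper; it reads lines of the form
# "INDEX, TEMP" on stdin (nvidia-smi CSV output without header or units).

# Usage: ... | check_temps LIMIT_C
check_temps() {
  local limit="$1"
  # Split on comma plus optional spaces; compare the temperature numerically.
  awk -F', *' -v limit="$limit" \
    '$2 + 0 > limit { printf "GPU %s at %sC exceeds %sC\n", $1, $2, limit }'
}
```

Feed it live readings with `nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader,nounits | check_temps 75`. It prints nothing when every card is at or below the limit.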