Core temperatures

ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

But why with one piece of software only, and not with others that use the same instruction sets? And why only when using a certain processor, and not with others?

And why do I have those 997 errors only when using the Threadripper? Twice already, I went from an X299 system to a TRX40 system with the same Windows, on the same disk. Just place the system disk in the new kit, boot, let Windows load the drivers for the new CPU and chipset, and hit the ground running. Those errors appear when the only change in the whole set-up is the processor and the architecture it demands. All else stays the same.

And I don't have a temperature problem - I have that covered now. I could even overclock into instability without throttling or over-stressing the cooling. What I have is a one-dead-CPU problem, on a machine that is crucial for me. And I don't want a two-dead-CPUs one.

But that said, I also disable the boost during the night, when all the machine is doing is backups and index-updates. With the regedit tweak, it's a matter of 2-3 clicks.

EDIT: And the dead CPU died while running (and folding) without any boost (or DOCP), at base speed only, because I did not have sufficient cooling then.
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

MeeLee wrote:I'm currently experiencing issues with my 3950X, where the scalar doesn't seem to work well anymore, and the CPU is throttling at 4.1 GHz (with threads going between 100% and 80% active).
Something is not working right now, which is why I looked online for a possible PBO overboost issue with Ryzen (and Threadrippers) that causes high temps and high wattages.
With what kind of load (software) and at which Package temperature are you experiencing these problems? Have you tried/compared Linpack Xtreme with FAH?

EDIT: And what kind of cooling are you using? https://youtu.be/QxEPye6mSsI
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

PantherX wrote:
ajm wrote:A question for @PantherX: What do you think of AIDA64 Extreme CPU stress test? As a complement to Linpack? ...
I haven't used AIDA64 so can't comment. While it seems that they have AVX2 support, I guess that the question is how they are using it in the benchmark.

(...)
From AIDA64 Technical Support:
The Stability test on your CPU is using Zen2 FMA with YMM registers instead of AVX2
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Core temperatures

Post by PantherX »

Hmm... does that mean that AVX2 support is emulated (not native) in AIDA64? If my understanding is right (correct me if I am wrong), then it would make sense that the thermal output is lower than with folding, since folding would use native AVX2 support, which would generate more heat.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

Or it means that AIDA64 adapts the stability test to the tested processor, and it didn't use AVX2 at all on the Threadripper.
Anyway, AIDA64 generated more heat than Linpack Xtreme.

But I asked the support. We'll see.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Core temperatures

Post by PantherX »

ajm wrote:...But I asked the support. We'll see.
Appreciate that you're going the extra mile to see what's happening!

Out-of-the-box thinking: I wonder if you can log a support case pointing out that, on a Threadripper, folding is more intensive than their stability test, so their product may not be fit for purpose, and see what their response is.
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

done!
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

AMD Threadripper 3970X under heavy AVX2 load: Defective design? (No, but there is an issue)
https://forum.level1techs.com/t/amd-thr ... e/153883/5

https://www.anandtech.com/show/15044/th ... s-on-7nm/6
Image

COMMENT: If you want to fold, go Xeon or X299, and stay away from the Threadripper...
MeeLee
Posts: 1339
Joined: Tue Feb 19, 2019 10:16 pm

Re: Core temperatures

Post by MeeLee »

ajm wrote:
MeeLee wrote:I'm currently experiencing issues with my 3950X, where the scalar doesn't seem to work well anymore, and the CPU is throttling at 4.1 GHz (with threads going between 100% and 80% active).
Something is not working right now, which is why I looked online for a possible PBO overboost issue with Ryzen (and Threadrippers) that causes high temps and high wattages.
With what kind of load (software) and at which Package temperature are you experiencing these problems? Have you tried/compared Linpack Xtreme with FAH?

EDIT: And what kind of cooling are you using? https://youtu.be/QxEPye6mSsI
Cooling is not an issue. 240mm water cooled. The CPU remains relatively cool.
The board doesn't have a sensor, but the cooling solution doesn't exceed 40 °C, which makes me believe the CPU runs at around 60-70 °C max.

I presume either I have one of the first 3950x versions, or my motherboard (ASUS X570 TUF) is pure garbage (also one of the first boards available supporting the 3000 series CPUs, with their inherent problems).

I knew the Infinity Fabric couldn't run at 1800 MHz, even though it should, but it seemed to run oddly at 1700 MHz too.
For a while it ran fine, but down the road it started having problems (hangs, freezes) that became more frequent.
I lowered it to 1666 MHz, which ran fine for a while but later produced errors again, then further to 1633 MHz, where it runs stable.
Then I lowered the 3600 MHz DDR RAM, which was already running at 3400 MHz, down to 3266 MHz. That seemed to do the trick.
It's mostly running stable again now.
I'm starting to believe the board/CPU combination doesn't support anything faster than maybe 3200 MHz (3400 MHz possibly as a far overclock from factory).
Faster memory never ran well on this board.
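
As a rough sanity check on those numbers: DDR memory transfers data twice per clock, so the real memory clock is half the advertised rate, and on Zen 2 the Infinity Fabric clock (FCLK) is normally kept 1:1 with it. A minimal Python sketch (helper names are made up for illustration) shows that 3266 pairs with 1633, while 3400 would need 1700:

```python
# Sketch only: helper names are made up for illustration.
# Assumption: the fabric clock (FCLK) is meant to run 1:1 with the
# memory clock, and DDR moves data twice per clock.

def memory_clock_mhz(ddr_rate: int) -> float:
    """Real memory clock for a given DDR data rate (e.g. 3600 -> 1800)."""
    return ddr_rate / 2

def is_one_to_one(ddr_rate: int, fclk_mhz: int) -> bool:
    """True when FCLK matches the memory clock (1:1 mode)."""
    return memory_clock_mhz(ddr_rate) == fclk_mhz

for ddr_rate, fclk in [(3600, 1800), (3400, 1700), (3400, 1633), (3266, 1633)]:
    state = "1:1" if is_one_to_one(ddr_rate, fclk) else "not 1:1"
    print(f"RAM at {ddr_rate} with FCLK {fclk} MHz: {state}")
```

This only checks the clock ratio; it says nothing about whether a given board or CPU is actually stable at those settings.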

It'll be replaced by the end of the week, whenever the new delivery comes in.
The seller already returned the money before I even shipped it back.
Means something...
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

MeeLee wrote:(...)Cooling is not an issue. 240mm water cooled. The CPU remains relatively cool.
The board doesn't have a sensor, but the cooling solution doesn't exceed 40 °C, which makes me believe the CPU runs at around 60-70 °C max.
(...)
The board does not have a sensor for the CPU??? Incredible!

I don't think that you can deduce the CPU temp on the basis of the loop temp. There is a threshold above which the loop just can't dissipate more heat and that heat then stays in the CPU. Be careful. And glad you can get a new board!
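
A simpler way to see the same point: in steady state the die sits above the coolant by roughly the package power times the thermal resistance of the die-to-coolant path, so the same 40 °C loop can hide very different CPU temperatures. A minimal sketch of that model follows; the resistance and wattage values are illustrative assumptions, not measurements of any real block or CPU.

```python
# Steady-state sketch: T_cpu = T_coolant + P * R_th.
# R_TH_C_PER_W is an assumed die-to-coolant thermal resistance,
# not a measured value for any particular CPU or water block.
R_TH_C_PER_W = 0.25

def estimate_cpu_temp(coolant_c: float, package_w: float,
                      r_th: float = R_TH_C_PER_W) -> float:
    """Estimated die temperature for a given coolant temperature and load."""
    return coolant_c + package_w * r_th

# Same coolant temperature, three hypothetical package powers.
for watts in (80, 140, 200):
    temp = estimate_cpu_temp(40.0, watts)
    print(f"coolant 40 C, {watts} W -> CPU about {temp:.0f} C")
```

With these assumed numbers the estimate moves by 30 °C while the loop reading stays the same, which is why a direct CPU sensor reading is the safer reference.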
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: Core temperatures

Post by JimboPalmer »

This is what I think I know about F@H Core_a7 0.0.18 and AVX.

Core_a7 can use either SSE2 or AVX_256. It will not use AVX2 or AVX_512 on any CPU, nor for that matter, other flavors of SSE. (GROMACS can, so this may change in the future)

Zen and Zen+ can do AVX_256 but they do so with two micro ops of 128 bits each, Zen 2 does AVX_256 in one micro op as does Intel. So there is a distinct performance gain with Zen2, and so a gain in temperature.

SSE2 does a vector of four 32-bit FP operations at once per thread, while AVX_256 does a vector of eight 32-bit FP operations at once, so for Intel and Zen2, expect a 2 to 1 jump in performance. In the above charts with some other program, it jumps from 9987 to 18,996. That is close to 2 to 1.
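
The lane counts behind that 2 to 1 estimate follow directly from register width divided by element width. A small Python sketch (the function name is made up for illustration) that also checks the ratio of the two chart figures quoted above:

```python
# Lanes per vector register = register width / element width.
def fp32_lanes(register_bits: int, element_bits: int = 32) -> int:
    """Number of single-precision values that fit in one register."""
    return register_bits // element_bits

sse2_lanes = fp32_lanes(128)  # SSE2 uses 128-bit XMM registers
avx_lanes = fp32_lanes(256)   # AVX uses 256-bit YMM registers
print(sse2_lanes, avx_lanes, avx_lanes / sse2_lanes)

# Ratio of the two scores quoted from the chart (with AVX / without).
print(round(18996 / 9987, 2))
```

The lane ratio is only an upper bound on the speed-up; memory access and non-vector code keep real results somewhat below it.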
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

JimboPalmer wrote:Zen and Zen+ can do AVX_256 but they do so with two micro ops of 128 bits each, Zen 2 does AVX_256 in one micro op as does Intel. So there is a distinct performance gain with Zen2, and so a gain in temperature.

SSE2 does a vector of four 32-bit FP operations at once per thread, while AVX_256 does a vector of eight 32-bit FP operations at once, so for Intel and Zen2, expect a 2 to 1 jump in performance. In the above charts with some other program, it jumps from 9987 to 18,996. That is close to 2 to 1.
But that is nothing compared to the Xeon W-3175X, which jumps from 6522 to 52889, that is, a factor of 8.1.
There IS a major difference between Intel and Zen2 with AVX, and it doesn't favor Zen2: the 3970X scores almost 3 times lower than the Xeon, despite having more cores, and almost 2 times lower than an i9-7960X, despite having twice the number of cores. I know that it gets hotter than a 7940X, too.
Why? Probably a less than optimal implementation of AVX, no?
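
For reference, the ratios behind those statements can be recomputed from the figures quoted in this thread. The sketch below assumes the 9987 and 18,996 scores belong to the 3970X, which the thread implies but does not state; the dictionary keys are just labels.

```python
# Scores as quoted in this thread: (without AVX, with AVX).
# Assumption: the 9987 / 18996 pair is the 3970X result.
scores = {
    "3970X (assumed)": (9987, 18996),
    "Xeon W-3175X": (6522, 52889),
}

def avx_gain(without_avx: int, with_avx: int) -> float:
    """Speed-up factor between the without-AVX and with-AVX scores."""
    return with_avx / without_avx

for cpu, (base, avx) in scores.items():
    print(f"{cpu}: x{avx_gain(base, avx):.1f} with AVX")

# How far ahead the Xeon is when both with-AVX scores are compared.
xeon_lead = scores["Xeon W-3175X"][1] / scores["3970X (assumed)"][1]
print(f"Xeon lead with AVX: x{xeon_lead:.2f}")
```

The i9-7960X comparison is left out because its score is not quoted in the thread.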
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core temperatures

Post by bruce »

The recent Intel line of CPUs all support AVX. (I've got some older AMD and Intel CPUs that only support SSE2.) I'm not sure about AMD's AVX, but Intel's AVX adds an appreciable amount of heat over just using SSE so the package is more likely to thermal-limit, especially if you're using a generic heat-sink. I suppose it will be even worse when somebody decides to permit the use of Intel's iGPU for folding.

Anyway, I'd get more production out of my CPU chips if I upgraded my heat-sinks. That may be what's going on in the with-AVX / without-AVX charts above, too.
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Core temperatures

Post by ajm »

There is no "generic" heat-sink for the latest Threadrippers.
But it's true that hardly any of the cooling solutions AMD is proposing (https://www.amd.com/en/thermal-solutions-threadripper) would allow for prolonged folding at boost speed.
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Core temperatures

Post by Neil-B »

… but by definition boost is not for sustained processing of any sort, let alone something as intensive as FAH folding, so this shouldn't come as a surprise? … my guess is that, in order to claim high boosts, AMD accepted that they would only be sustainable for short spiky bursts due to heat issues (whatever the cooling), and that is how they have defined boost.

tbh I use Intel server-grade CPUs, which always sound "slow" but tend to consistently overachieve their stated speeds without thermal issues, even when folding for FAH … even if those stated speeds keep becoming less effective as Intel nerfs their CPUs' performance !!!

It all depends on where the various chip makers see their market and how they choose to design/sell their wares … It would appear that AMD and the MoBo manufacturers have possibly pushed Threadrippers to an extreme where power consumption and heat generation issues may be more of an issue than they would like for a larger part of their market sector (it is beginning to snowball a bit in the press/media).

Inadequate cooling is the "easy target", but there seems to be an equally if not more valid issue with significant heat generation at high frequency/power levels causing cooling to be inadequate … Quite what AMD have done within the Threadrippers to make this such a marked issue (especially for FAH folding) will probably never see the light of day - but I guess they might learn from this, and future CPUs may perform well across a slightly wider range of usage patterns.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)