Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 12:17 am
by PantherX
On the 1000 series: reduction of 16s
On the 2000 series: reduction of 10s
On the 3000 series: reduction of 4s

Considering that this is the first driver release, I am optimistic that we might see further reductions after 6 months. Also, FahCore_22 is in active development, so more performance from optimizations/new features is possible :)

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 4:40 am
by bruce
MeeLee wrote: Cuda doesn't seem to scale very well. This is when 2 identical GPUs plugged in, and load sharing between them via CUDA?
Seems to work well for the GTX series, but the higher the GPU, the lower the performance gains, it seems.
With older GPUs, the only load sharing is in doubling the frame-rate of your game. The actual 3d processing isn't shared.
A lot of your tests are probably with small WUs, which means your problem can't benefit from additional parallelism. There's a limit to the benefit from adding shaders because the steps that are serial in nature begin to dominate. When that happens, adding more shaders or even double-pumping them can't do much to reduce the TPF.
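The limit described above is commonly modelled with Amdahl's law. The following is a minimal sketch of that textbook formula, not a measurement of FahCore_22; the parallel fractions and shader counts are made-up illustrative values.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Ideal speedup over a single unit when only `parallel_fraction`
    of the work can be spread across `n_units` (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Hypothetical numbers: a small WU with more serial work (0.90)
    # versus a large WU that parallelizes better (0.99).
    for fraction in (0.90, 0.99):
        for shaders in (1000, 2000, 4000, 8000):
            speedup = amdahl_speedup(fraction, shaders)
            print(f"parallel={fraction:.2f} shaders={shaders:5d} "
                  f"speedup={speedup:7.2f}x")
```

With a parallel fraction of 0.90 the speedup can never reach 10x no matter how many shaders are added, which is the "serial steps begin to dominate" effect.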

Has any of that changed with the RTX3000 series? (I have not studied the details of the 3000 series yet.)

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 5:45 am
by ajm
Warning: Potential problem for folding with 30x0 cards that use (only or too many) cheaper POSCAP capacitors instead of MLCC groups for powering the GPU: https://www.igorslab.de/en/what-real-wh ... drtx-3090/

Worst known cases: Zotac Trinity and Colorful cards.
Best bet: ASUS TUF and of course Strix.

See also: https://youtu.be/x6bUUEEe-X8

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 6:14 am
by gunnarre
PantherX wrote: FYI, here's some data for comparison only, Linux with Project 11765, time for 1% in seconds:
Different project, so not directly comparable, but on project 14487 time for 1% in seconds:
100 1660S OpenCL
69 1660S CUDA
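To put the two figures above in relative terms, here is a small sketch that converts a pair of "time for 1%" values into a percentage reduction and a speedup factor. The function name is hypothetical; only the 100 s and 69 s inputs come from the post.

```python
def compare_times(baseline_s: float, new_s: float) -> tuple[float, float]:
    """Return (reduction in percent, speedup factor) for two
    'time for 1%' measurements given in seconds."""
    if baseline_s <= 0 or new_s <= 0:
        raise ValueError("times must be positive")
    reduction_pct = (baseline_s - new_s) / baseline_s * 100.0
    speedup = baseline_s / new_s
    return reduction_pct, speedup


if __name__ == "__main__":
    # 1660S on project 14487: OpenCL 100 s, CUDA 69 s per 1%
    reduction, speedup = compare_times(100.0, 69.0)
    print(f"reduction: {reduction:.1f}%  speedup: {speedup:.2f}x")
```

For this card that works out to about a 31% shorter time per 1%, or roughly 1.45 times the throughput.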
ajm wrote: See also: https://youtu.be/x6bUUEEe-X8
I agree that the most likely "fix" is a new card firmware that will bring down the clocks. Manually underclocking the card is the workaround, and it will make your card more energy efficient at the same time. A bit sad for the people who will spend extra on a factory-overclocked card for a couple more frames per second in a video game.

Edit: He's mixing up some of the engineering terms - like he says "ESD" when he means electromagnetic interference (EMI), but the gist seems correct.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 6:33 am
by ajm
And if the problem can cause crashes in games, it will probably cause enough instability to derail FAH at much lower frequencies.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 6:41 am
by gunnarre
I'm not so sure about that - maybe it might happen at the end of work units. The crashes happen when the card boosts above a certain frequency, and that only happens when the temperature and power draw are low enough to allow it. That won't happen in the middle of a WU unless you're water cooling or something. Even worse for folding is if this causes incorrect results to be produced. In that sense it's better to crash than to give incorrect results to the science team.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 6:50 am
by ajm
:idea: FAH could/should ask manufacturers (or official reviewers like Jay) to send such cards (for a limited period of time) to some of their trusted beta testers in order to see if the client or the GPU.txt should be adapted.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 7:48 am
by PantherX
gunnarre wrote: ...In that sense it's better to crash than to give incorrect results to the science team.
There are plenty of sanity checks:
1) During the checkpoints of GPU WUs, which are verified by the CPU
2) When packing the WU from the client
3) When the WU is received at the Server
4) When researchers do spot checks (this can be time consuming but you can't easily overlook an error)

If any error gets past all those levels, that's a PhD paper since they have found a way to defy the laws of physics :)

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 7:56 am
by gunnarre
So I take it that protein folding is one of those "hard to do, but quick to verify" computing tasks, a bit like cryptographic hashing?

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 8:01 am
by PantherX
The way I understand it is that atoms/molecules can only move a certain distance in 2 femtoseconds. Let's say it can only move 2 meters in 2 femtoseconds. Thus, I would check to ensure that the atoms can't move 20 meters in 4 femtoseconds. If it does, then the data is invalid. I do believe that the actual simulation uses cubic boxes since it is 3D space. Thus, it would expect atoms/molecules to be within that cube. If it "escapes" then something is not going right.
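The two checks described above (a maximum distance per step and staying inside the box) can be sketched roughly as follows. This is only an illustration of the idea in this post, not the actual check used by FahCore_22 or the servers; the function name, the limits, and the arbitrary length units are all assumptions, and it ignores details such as periodic boundaries.

```python
import math


def find_invalid_atoms(prev_positions, curr_positions, max_step, box_length):
    """Return indices of atoms that moved farther than `max_step` in one
    step, or that ended up outside a cubic box spanning 0..box_length."""
    invalid = []
    for index, (prev, curr) in enumerate(zip(prev_positions, curr_positions)):
        moved = math.dist(prev, curr)  # straight-line distance in 3D
        outside = any(coord < 0.0 or coord > box_length for coord in curr)
        if moved > max_step or outside:
            invalid.append(index)
    return invalid


if __name__ == "__main__":
    # Made-up coordinates in arbitrary units
    before = [(1.0, 1.0, 1.0), (2.0, 2.0, 2.0), (9.9, 5.0, 5.0)]
    after = [(1.1, 1.0, 1.0), (2.0, 2.0, 9.0), (10.1, 5.0, 5.0)]
    bad = find_invalid_atoms(before, after, max_step=2.0, box_length=10.0)
    print("invalid atoms:", bad)
```

Atom 1 is flagged because it jumped too far in one step, and atom 2 because it left the box even though its step was small.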

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 10:21 am
by scott@bjorn3d
EVGA has stated they found the issue in lab testing and their shipping cards are fine. Some reviewers got cards before the lab found the issue. My 3080 FTW3 Ultra is shipping next week, so I will do a lot of testing with it and will report back how it does compared to my 2080TI FTW3 Ultra.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 10:23 am
by ap1978
ajm wrote: Warning: Potential problem for folding with 30x0 cards that use (only or too many) cheaper POSCAP capacitors instead of MLCC groups for powering the GPU: https://www.igorslab.de/en/what-real-wh ... drtx-3090/

Worst known cases: Zotac Trinity and Colorful cards.
Best bet: ASUS TUF and of course Strix.

See also: https://youtu.be/x6bUUEEe-X8
Yeah, I saw that. I have ordered two Strix OC cards, hope they are ok.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 1:15 pm
by ajm
ap1978 wrote: Yeah, I saw that.. I have ordered two Strix OC cards, hope they are ok.
They are: Strix cards have six MLCC groups and no cheap capacitors at all.

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 3:07 pm
by ajm
As for the 3080 FTW3 Ultra, it is a bit uncertain. There are pictures online with a 6-POSCAP design (bad) and others with 4 POSCAPs / 2 MLCC groups (same as FE).

6 POSCAPs:
Image

4 POSCAPs / 2 MLCC:
Image

6 MLCC groups (Strix 3080):
Image

Re: New RTX3xxx cards

Posted: Sat Sep 26, 2020 3:44 pm
by ap1978
"Grapevine says that there are reports of instabilities on ASUS TUF and Strix cards as well. So 6x MLCC does not make you immune."

reddit - Ampere POSCAP/MLCC Counts : hardware