WU 13416 low ppd long run time


HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: WU 13416 low ppd long run time

Post by HaloJones »

Just for some balance:

P13416 (R376, C170, G1) running around 35% above average
P13416 (R791, C148, G1) ditto
single 1070

bumbel123
Posts: 21
Joined: Fri Mar 20, 2020 12:17 pm

Re: WU 13416 low ppd long run time

Post by bumbel123 »

Breach wrote: Same observations, 13416 (1297, 133, 1) for example - PPD is lower than usual by about 25%.
That may be the worst case for a single job/WU and GPU ... I have also seen a drop in average PPD across my bunch of Turing GTX GPUs, since they have mostly been getting 13416 over the last few days. But in longer-term monitoring it is more like 15% overall.

Many of these jobs earn fairly normal average credit, but many are lower. It doesn't hurt my setup that much, because I run a bunch of GPUs and CPUs.

From what I can see, my GTX 1660 Ti doesn't really suffer compared to the slower models: the GTX 1650/1650S consistently gets much lower credit, while the 1660/1660S ride a roller coaster in terms of PPD.

Anyone with just one GPU cannot really compensate and suffers a bit ... I can compensate with my Ryzens; my R5 3600 in particular consistently does 130,000+ PPD. The 0xa7 WUs apparently reward clock speed (MHz) more than thread count.
  • Ryzen 3 3100 (Zen2), Nv GTX 1660 Ti, Win10 Ent
  • Ryzen 5 3600 (Zen2), Nv RTX 2060, Win10 Ent
  • Ryzen 7 2700 (Zen+), Nv RTX 2060, Nv GTX 1660, Ubuntu LTS 20.04
  • Ryzen 5 3600 (Zen2), Nv RTX 2060, Nv GTX 1650, Ubuntu LTS 20.04
kiore
Posts: 921
Joined: Fri Jan 16, 2009 5:45 pm
Location: USA

Re: WU 13416 low ppd long run time

Post by kiore »

HaloJones wrote: Just for some balance:

P13416 (R376, C170, G1) running around 35% above average
P13416 (R791, C148, G1) ditto

Agreed, there are significant variations in run length and PPD for these units on my RTX 2080 Ti in Windows; while many have very low PPD, some are significantly higher. I have observed a range from 1.1 to 3.6 MPPD between different runs.
I understand that this can be frustrating, but these seem to be essential units, so I encourage everyone to bear with this minor blip in the machinery and, if possible, tweak the CPU core allowance for the GPU to give them some extra headroom.
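For reference, in the v7 client that usually means lowering the CPU slot's thread count in config.xml so each GPU keeps a free core to feed it. A minimal sketch, assuming an 8-thread machine with two GPUs (the slot ids and counts are illustrative only, not a recommendation for any particular machine):

```xml
<config>
  <!-- leave two of eight threads free, one to feed each GPU -->
  <slot id='0' type='CPU'>
    <cpus v='6'/>
  </slot>
  <slot id='1' type='GPU'/>
  <slot id='2' type='GPU'/>
</config>
```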
i7 7800x, RTX 3070, OS: Win10. AMD 3700x, RTX 2080ti, OS: Win10.

Team page: https://www.rationalskepticism.org/viewtopic.php?t=616
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: WU 13416 low ppd long run time

Post by Sparkly »

These 13416 things are now taking so much CPU and system resources away from the GPU that the GPU doesn't really do much because of all the waiting. Maybe it is time to take a closer look at how the software handles the CPU-to-GPU side of WUs. (R1166 C7 G2)

Bastiaan_NL
Posts: 25
Joined: Wed May 13, 2020 4:34 am
Hardware configuration: I7 7700k, RTX 2080Ti
I3 9100F, 2x 5700XT
Location: Netherlands

Re: WU 13416 low ppd long run time

Post by Bastiaan_NL »

Sparkly wrote: These 13416 things are now taking so much CPU and system resources away from the GPU that the GPU doesn't really do much because of all the waiting. Maybe it is time to take a closer look at how the software handles the CPU-to-GPU side of WUs. (R1166 C7 G2)

Are you still running a CPU client?
I paused the CPU client and the load went back onto the GPUs, even though the CPU was only at 80%.
So for now I'm running without the CPU client: a 20 kPPD sacrifice for at least a tenfold gain.
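For what it's worth, you don't have to drop the CPU slot for good: the v7 client listens on a local command port (default 127.0.0.1:36330), so the CPU slot can be paused and resumed on a schedule. A minimal Python sketch, assuming slot 0 is the CPU slot (run slot-info against your own client to check):

```python
import socket

def send_command(cmd, host="127.0.0.1", port=36330):
    """Send one command to FAHClient's local command interface."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.recv(4096)                      # discard the welcome banner
        s.sendall((cmd + "\n").encode())
        return s.recv(4096).decode(errors="replace")

print(send_command("pause 0"))            # pause only the CPU slot
# send_command("unpause 0")               # resume it later
```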
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: WU 13416 low ppd long run time

Post by Sparkly »

Bastiaan_NL wrote: Are you still running a CPU client?
I have no CPU slots on these systems; they are GPU-only and run dedicated, with no other activity on them. These WUs just use an insane amount of CPU and system resources to feed the GPU.
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: WU 13416 low ppd long run time

Post by Joe_H »

You keep bringing this up. The projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.

There are also known issues with the AMD drivers.

Future developments may make it possible to change the assignment mix, but these WUs and their results are needed immediately.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: WU 13416 low ppd long run time

Post by Sparkly »

Joe_H wrote: You keep bringing this up. The projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.
Who said anything about running these projects on CPU? What I am saying is that the CPU-to-GPU communication is creating an insane amount of overhead that shouldn't be there in the first place. This is about programming and nothing else.
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: WU 13416 low ppd long run time

Post by Joe_H »

Sparkly wrote:
Joe_H wrote: You keep bringing this up. The projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.
Who said anything about running these projects on CPU? What I am saying is that the CPU-to-GPU communication is creating an insane amount of overhead that shouldn't be there in the first place. This is about programming and nothing else.
This is what you wrote:
Sparkly wrote: ...maybe it is time to take a closer look at how the software handles the CPU-to-GPU side of WUs
On top of all of your other complaints, how else is it to be taken? Programming changes will take time, perhaps a lot of time, and it is only now that you bring up "programming". There is also the apparent assumption on your part that the people programming OpenMM and creating the folding core from that code base can make major changes in how the GPU is used.

All of that would be in the future; for now they can just use the data from these projects to improve the configuration of future projects. Maybe they will identify why some seemingly similar runs end up processing so differently. That might show up as an improvement in a future revision of the core, in the projects, or not until an entirely new folding core.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: WU 13416 low ppd long run time

Post by Sparkly »

Joe_H wrote: On top of all of your other complaints, how else is it to be taken?
What I wrote has nothing to do with sending these projects to the CPU, if you actually read it: it talks about how the software handles the CPU-to-GPU communication, which you are right I have also commented on elsewhere. Hardware comparisons and running the numbers for CPU, GPU and system make it perfectly clear that the overhead created is not taken into account when WUs are handled on the GPU, since a good part of it comes from keeping the PCIe bus active and so on.
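For anyone who would rather measure that overhead than argue about it, here is a rough sketch that samples how much CPU each folding core process actually burns. FahCore_22 is the usual GPU core process name, but the name matching here is an assumption; psutil is a third-party package (pip install psutil):

```python
import time
import psutil

# find the folding core processes (e.g. FahCore_22 for the GPU core)
cores = [p for p in psutil.process_iter(["name"])
         if p.info["name"] and "FahCore" in p.info["name"]]

for p in cores:
    p.cpu_percent(None)                   # first call primes the counter
time.sleep(10)                            # sample over a 10 s window
for p in cores:
    print(f"{p.info['name']} (pid {p.pid}): "
          f"{p.cpu_percent(None):.0f}% of one core")
```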
astrorob
Posts: 43
Joined: Sun Mar 15, 2020 7:59 pm

Re: WU 13416 low ppd long run time

Post by astrorob »

so not to pile on, but i do feel something is weird with 13416. i've got 5 GPUs across 2 machines. 13416 happens to be running on 2 GPUs on a windows 10 machine and 1 GPU on a linux machine. on the windows 10 machine, the two GPUs in question are a GTX 1060 with 8GB of memory and an RTX 2060 with 6GB of memory. on those two GPUs the TPF is 3m39s and 2m12s respectively.

on the linux box, 13416 is running on an ATI RX5500, and the TPF is 25m30s.

my understanding has always been that the nvidia drivers are subpar on all platforms and don't use DMA to move data between the GPU and CPU, leading to higher CPU utilization for the CPU thread associated with the GPU thread. in contrast i understood the AMD drivers to use DMA and generally be more efficient at data transfer. on the linux box i do see the nvidia CPU thread pegged at 100%, but the ATI cpu thread is ~70% of a single core, which is higher than i've ever seen.

13416 is taking 2+ days to complete on the ATI GPU. because i have to shut down during the day due to high electricity prices (TOU plan), i can't actually complete 13416 on the RX5500 before it times out. therefore all the power and cost of the video card are being wasted.

hopefully this is a transient situation... if this is happening to me then it is certainly happening to 1000s of other people and lots of compute power is going to waste.

if this turns out to be a permanent situation, is there a mechanism for researchers to blacklist certain types of GPU for a given WU?

and an update - it does seem like 13416 is CPU-bound on the RX5500 - the machine in question had 6 threads (out of 8) running rosetta@home, with 2 threads reserved for the GPUs. i've lowered R@H to 5 threads and now i'm seeing 16m55s TPF on the RX5500.
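for anyone watching the same thing, here's a rough sketch that pulls TPF out of the client log. it assumes the usual "Completed X out of Y steps (N%)" progress lines, and the log path is a common linux default that may differ on your install:

```python
import re
from datetime import datetime, timedelta

LOG = "/var/lib/fahclient/log.txt"  # adjust per system
pat = re.compile(
    r"^(\d\d:\d\d:\d\d):.*:Completed \d+ out of \d+ steps \((\d+)%\)")

frames = []
with open(LOG) as f:
    for line in f:
        m = pat.match(line)
        if m:
            frames.append((datetime.strptime(m.group(1), "%H:%M:%S"),
                           int(m.group(2))))

# TPF = wall time between consecutive progress lines, per percent step
for (t0, p0), (t1, p1) in zip(frames, frames[1:]):
    if p1 > p0:
        delta = t1 - t0
        if delta < timedelta(0):        # log rolled past midnight
            delta += timedelta(days=1)
        print(f"{p0}% -> {p1}%: TPF {delta / (p1 - p0)}")
```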
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: WU 13416 low ppd long run time

Post by HaloJones »

P13416 has huge variations, so you're not comparing apples to apples.

e.g. two 1070s in the same machine, same PCIE speed, same overclock, same drivers, same OS.
P13416 (R577, C150, G1) - TPF 02:20 for 1307292 ppd
P13416 (R910, C67, G0) - TPF 03:14 for 799240 ppd

Don't get hung up on this. Some P13416 are super quick and give great points, some are the opposite. Overall, it balances out.
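Part of why the spread looks so dramatic is the quick-return bonus: per the F@H points FAQ, credit scales with the square root of how quickly the WU comes back, so PPD falls off roughly as TPF^-1.5 rather than linearly. A back-of-envelope sketch in Python (the base credit, k-factor and deadline are made-up placeholders, not project 13416's real values):

```python
from math import sqrt

def ppd(tpf_seconds, base=20_000, k=0.75, deadline_days=3.0, frames=100):
    """Estimate PPD under the quick-return bonus model."""
    wu_days = tpf_seconds * frames / 86_400
    credit = base * max(1.0, sqrt(k * deadline_days / wu_days))
    return credit / wu_days               # credit per WU times WUs per day

print(f"TPF 2:20 -> {ppd(140):>9,.0f} PPD")
print(f"TPF 3:14 -> {ppd(194):>9,.0f} PPD")
```

With these placeholders the two TPFs come out about 1.6x apart in PPD, which is roughly the gap between the two 1070 results above.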
single 1070

psaam0001
Posts: 378
Joined: Mon May 18, 2020 2:02 am
Location: Ruckersville, Virginia, USA

Re: WU 13416 low ppd long run time

Post by psaam0001 »

For now, I am not running any more GPU jobs on my Ryzen 3's integrated GPU. All GPU tasks I get are relegated to the GTX 1650 in that same machine.

Paul
astrorob
Posts: 43
Joined: Sun Mar 15, 2020 7:59 pm

Re: WU 13416 low ppd long run time

Post by astrorob »

HaloJones wrote: P13416 has huge variations, so you're not comparing apples to apples.

e.g. two 1070s in the same machine, same PCIE speed, same overclock, same drivers, same OS.
P13416 (R577, C150, G1) - TPF 02:20 for 1307292 ppd
P13416 (R910, C67, G0) - TPF 03:14 for 799240 ppd

Don't get hung up on this. Some P13416 are super quick and give great points, some are the opposite. Overall, it balances out.
i get it, but there is evidence that something is wrong with these WUs on AMD GPUs running under linux. i'm getting CORE_RESTART and BAD_WORK_UNIT after running for 10+ hours, with Potential Energy errors.

this is only happening on the AMD GPU in my linux machine.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WU 13416 low ppd long run time

Post by bruce »

So when Rosetta is running and the GPU is NOT running, how many CPU threads remain idle?

Yes, the NVIDIA drivers require one or more CPU threads to move data to/from the GPU. AMD's advertising makes an excellent point that their drivers' access to main RAM is a good feature.