Project 13402


Nuitari
Posts: 78
Joined: Sun Jun 09, 2019 4:03 am
Hardware configuration: 1x Nvidia 1050ti
1x Nvidia 1660Super
1x Nvidia GTX 660
1x Nvidia 1060 3gb
1x AMD rx570
2x AMD rx560
1x AMD Ryzen 7 PRO 1700
1x AMD Ryzen 7 3700X
1x AMD Phenom II
1x AMD A8-9600
1x Intel i5-4590S

Project 13402

Post by Nuitari »

According to the announcement, WUs are expected to take 3 to 4h and should only be assigned to fast GPUs.
viewtopic.php?f=24&t=35056

I see that a WU was assigned to an RX 560, and I'm not sure this counts as "fast". It should take about 15h to complete one WU from that project on this card.
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: Project 13402

Post by JohnChodera »

That's odd---I'm not finding the RX 560 in GPUs.txt:

Code:

$ grep 560 GPUs.txt 
0x1002:0x68b9:::Juniper [Radeon HD 5600/5700]
0x1002:0x68c1:::Redwood [Radeon HD 5600 Series]
0x1002:0x731f:1:6:Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
0x1002:0x958c:::RV630GL [FireGL v5600]
0x1002:0x9904:1:5:Trinity [Radeon HD 7560D] S 389
0x10de:0x019d:::G80 [Quadro FX 5600]
0x10de:0x0311:::NV31 [GeForce FX 5600 Ultra]
0x10de:0x0312:::NV31 [GeForce FX 5600]
0x10de:0x0314:::NV31 [GeForce FX 5600XT]
0x10de:0x031a:::NV31M [GeForce FX Go5600]
0x10de:0x039e:::G73GL [Quadro FX 560]
0x10de:0x1082:2:2:GF114 [GeForce GTX 560 Ti]
0x10de:0x1084:2:2:GF114 [GeForce GTX 560]
0x10de:0x1087:2:2:GF110 [GeForce GTX 560 Ti]
0x10de:0x1200:2:2:GF114 [GeForce GTX 560 Ti]
0x10de:0x1201:2:2:GF114 [GeForce GTX 560]
0x10de:0x1202:2:2:GF114 [GeForce GTX 560 Ti OEM]
0x10de:0x1208:2:2:GF114 [GeForce GTX 560 SE]
0x10de:0x1251:2:2:GF116 [GeForce GTX 560M]
What GPU device ID is this, and which GPUSpecies does it get assigned to?
JohnChodera

Re: Project 13402

Post by JohnChodera »

I've restricted the AMD GPU Species to >=6 until we can sort this out.

~ John Chodera // MSKCC
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Project 13402

Post by Joe_H »

JohnChodera wrote: That's odd---I'm not finding the RX 560 in GPUs.txt:
The RX 560 most likely shows up as an RX 460 in GPUs.txt, since one version is basically the same GPU chip at a slightly higher clock rate. Or, since there were 3 different variants of the RX 560, it might match one of the other entries for cards based on AMD's Baffin chip.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Nuitari

Re: Project 13402

Post by Nuitari »

The PCI ID is 1002:67ef.

In GPUs.txt it is identified as:

Code:

0x1002:0x67ef:1:5:Baffin XT [Radeon RX 460]
The lspci output shows:
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (rev e5)
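
The earlier `grep 560` missed this card because it searched by marketing name. A minimal sketch of a lookup keyed on the PCI vendor and device IDs instead, assuming every GPUs.txt line uses the colon-separated layout `vendor:device:type:species:description` seen in the entries quoted in this thread. The field names `gpu_type` and `species` and the helper names are my own labels for illustration, not taken from the client's source:

```python
from typing import NamedTuple, Optional


class GpuEntry(NamedTuple):
    vendor: int
    device: int
    gpu_type: Optional[int]  # assumed meaning of the third field; empty on some lines
    species: Optional[int]   # fourth field; empty on some lines
    description: str


def parse_gpus_line(line: str) -> GpuEntry:
    # Split on the first four colons only, so the description is kept whole.
    vendor, device, gpu_type, species, description = line.strip().split(":", 4)
    return GpuEntry(
        vendor=int(vendor, 16),
        device=int(device, 16),
        gpu_type=int(gpu_type) if gpu_type else None,
        species=int(species) if species else None,
        description=description,
    )


def lookup(lines, pci_id: str) -> Optional[GpuEntry]:
    # pci_id is in lspci form, e.g. "1002:67ef" (hex, no 0x prefix).
    vendor, device = (int(part, 16) for part in pci_id.split(":"))
    for line in lines:
        if not line.strip():
            continue
        entry = parse_gpus_line(line)
        if (entry.vendor, entry.device) == (vendor, device):
            return entry
    return None


# Entries copied from the posts in this thread.
SAMPLE = [
    "0x1002:0x67ef:1:5:Baffin XT [Radeon RX 460]",
    "0x1002:0x731f:1:6:Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]",
    "0x10de:0x039e:::G73GL [Quadro FX 560]",
]

entry = lookup(SAMPLE, "1002:67ef")
print(entry.description, "-> species", entry.species)
```

Looking up by ID finds the Baffin XT entry even though its description never mentions "560".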

The RX 570 is also classified as Species 5 and runs this project at a TPF of 4m 3s, so about 6.75h per WU. I'm not sure whether you want to exclude those or not.
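
The 6.75h figure follows directly from the time per frame. A small sketch of the arithmetic, assuming a WU consists of 100 frames (one per 1% of progress); that frame count is an assumption here, though it matches the numbers in this post, and `wu_hours` is a hypothetical helper name:

```python
FRAMES_PER_WU = 100  # assumption: one frame per 1% of progress


def wu_hours(tpf_minutes: int, tpf_seconds: int, frames: int = FRAMES_PER_WU) -> float:
    """Estimate the total run time of one WU in hours from its time per frame (TPF)."""
    tpf_total_seconds = tpf_minutes * 60 + tpf_seconds
    return tpf_total_seconds * frames / 3600


# RX 570 at a TPF of 4m 3s: 243 s * 100 frames = 24300 s
print(wu_hours(4, 3))  # 6.75
```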
JohnChodera

Re: Project 13402

Post by JohnChodera »

There's no way to selectively exclude a subset of Species 5 cards, so I'm forced to exclude all Species 5 until we can further refine the GPU Species for AMD.
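
To illustrate the constraint, here is a minimal sketch that models the restriction as a single minimum-species cutoff, which is how it is described in this thread. The real assignment server logic is not shown here and may differ. The card list uses only the species and run times reported earlier in the thread, and `eligible` is a hypothetical helper:

```python
# (card, AMD species, reported hours per WU, or None if not reported in this thread)
CARDS = [
    ("RX 560", 5, 15.0),   # "about 15h" per the first post
    ("RX 570", 5, 6.75),   # from a TPF of 4m 3s
    ("Navi 10 (RX 5600/5700)", 6, None),
]


def eligible(cards, min_species: int):
    """Return the names of cards allowed under a minimum-species cutoff."""
    return [name for name, species, _hours in cards if species >= min_species]


print(eligible(CARDS, 5))  # all three cards, including the slow RX 560
print(eligible(CARDS, 6))  # only Navi 10; the RX 570 is dropped along with the RX 560
```

Because the cutoff only sees the species number, no value of `min_species` keeps the RX 570 while dropping the RX 560.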

We have some plans to do data analytics on some standard benchmark systems this week to improve the state of things!

~ John Chodera // MSKCC
VegaZhree3
Posts: 22
Joined: Sat Apr 11, 2020 10:42 am

Re: Project 13402

Post by VegaZhree3 »

I also got assigned this WU. The GPU is a GTX 1660 Super, but this WU utilizes the hardware differently. Usually WUs make the GPU draw about 110-120W. When running this one it draws 90W at most, and there is no power limit or anything like that set. This one also shows 70% "Copy" load in Task Manager, where it is normally around 30%. The overall GPU load reported by GPU-Z is similar to other WUs. The PPD is the lowest I've seen, around 450K.
JohnChodera

Re: Project 13402

Post by JohnChodera »

Thanks for the report, @VegaZhree3! This is a new type of workload for us---one that allows us to directly make predictions for the chemists about which molecules to make---and we're still iteratively refining performance.

~ John Chodera // MSKCC
Nuitari

Re: Project 13402

Post by Nuitari »

@JohnChodera the same seems to apply to project 13403 (75, 10, 2).
I'm seeing about 91W usage on the Nvidia GTX 1660 Super, 85% GPU utilisation, and PCIe usage at 19%.
The core thread is using 50% of a CPU core at all times.

Increasing the priority of the CPU FAHCore_22 thread does increase GPU usage to 99% and power draw to about 100W (out of 125W). PCIe usage jumps to 23%.
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: Project 13402

Post by HaloJones »

13403 (11, 94, 2) on a dedicated 1070 with Windows 10. TPF 2:35 for an estimated 732225 ppd.

GPU Usage is showing at around 93%, Bus at 38%

PPD is a little low for this card but not excessively so.
single 1070

Nuitari

Re: Project 13402

Post by Nuitari »

Tweaking the CPU priority for the RX560 reduced the TPF enough to save about 2h on a whole WU.
muziqaz
Posts: 946
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Project 13402

Post by muziqaz »

The problem with AMD is that they have no clue how to systematically codename their GPUs, and they have just assigned the same "string", if you like, to a wide bunch of their GPUs.
You have:
AMDSpecies6, which is the 5600, 5700 and 5700xt
AMDSpecies5, which is the Radeon 7, Vega56/64 and all the frikkin GPUs below that.
Researchers can only exclude GPUs per Species. It is similar for nVidia as well, where one species includes everything from that series of GPUs, from ultra fast high end to ultra slow low end.
If you exclude everything from AMD except AMDSpecies6, you end up with 4-6 cards available on the AMD side: the 5600, 5600xt, 5700, 5700xt, and probably the much slower 5500 and 5500xt. In the process you exclude the super fast Radeon 7, which is doing extremely well on these two projects, and you also exclude the Vega56 and 64, which again are doing wonderfully on this project. The suggestion to leave Species5 in was made before these projects went live. I am still for it, because again, we are losing a crap ton of fast cards in Species5 if we leave only Species6 for these projects.

My suggestion would be to increase the deadline for these projects a bit to accommodate somewhat slower cards. When does the server re-issue a WU: on the preferred deadline or the final deadline? If it's on the preferred deadline, then shorten it, but then again, no one wants to fold a WU for a day and see it dumped by the server just because someone with a much faster GPU finished it before you :(
There is no favorable outcome in this situation until we come up with a new identification process in a new fahclient.

On the other hand, these projects are getting chewed up by masses of nVidia GPU folders anyway, I would guess :D
FAH Omega tester
muziqaz

Re: Project 13402

Post by muziqaz »

By the way, these two projects love cards with a high shader count and a wide memory controller, which means higher end hardware. All the mid range and low end cards, be it nVidia or AMD, will see lower than normal PPD, while high end and ultra high end GPUs will shine on these.
We see these kinds of PPD discrepancies on CPU projects a lot, and I believe it's time to get used to similar behavior on the GPU side as well. Some projects work well on certain hardware, others on other hardware ;)
FAH Omega tester
jrweiss
Posts: 704
Joined: Tue Dec 04, 2007 6:56 am
Hardware configuration: Ryzen 7 5700G, 22.40.46 VGA driver; 32GB G-Skill Trident DDR4-3200; Samsung 860EVO 1TB Boot SSD; VelociRaptor 1TB; MSI GTX 1050ti, 551.23 studio driver; BeQuiet FM 550 PSU; Lian Li PC-9F; Win11Pro-64, F@H 8.3.5.

[Suspended] Ryzen 7 3700X, MSI X570MPG, 32GB G-Skill Trident Z DDR4-3600; Corsair MP600 M.2 PCIe Gen4 Boot, Samsung 840EVO-250 SSDs; VelociRaptor 1TB, Raptor 150; MSI GTX 1050ti, 526.98 driver; Kingwin Stryker 500 PSU; Lian Li PC-K7B. Win10Pro-64, F@H 8.3.5.
Location: @Home

Re: Project 13402

Post by jrweiss »

Currently running 13402 (65, 64, 0) on my 1050ti. With 61% complete, it's showing 6:28 per frame, and est 172602 PPD. Some other current projects get over 200K PPD, but this is within range...
Ryzen 7 5700G, 22.40.46 VGA driver; MSI GTX 1050ti, 551.23 studio driver
Ryzen 7 3700X; MSI GTX 1050ti, 551.23 studio driver [Suspended]
Nuitari

Re: Project 13402

Post by Nuitari »

TBH I really don't care about the PPD. It's nice, but it's far from the whole point of doing this.

The only reason I posted about it is that the announcement seems to indicate they want a quick (3 to 4h) turnaround, and there are cards which are clearly not going to meet that expectation. If they are OK with a slower turnaround (like 15h) then I'm fine with it, even if I get less PPD than otherwise. The original WU completed in 14h 11m on the slowest RX560 I have.

From what I read, the server will reissue a WU once the preferred deadline has been reached.