Wu 17721 on Radeon 6900XT (WTH???)
Posted: Tue Mar 30, 2021 4:06 am
by Starman157
What's wrong with WU 17721 (PRCG 11,2,51) running on Radeon driver 21.3.1 WHQL (specifically on a 6900XT)? PPD of only 1.6 million????
I'm guessing, based on a 2080Ti or a 3080, that the 6900XT should be somewhere in the 5.0-5.5 million range.
Something is really wrong with this WU.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Tue Mar 30, 2021 3:02 pm
by JimboPalmer
It is possible for a protein to be too simple, with not enough bonds to utilize all the shaders available.
As GPUs get more powerful, the proteins that can be folded get larger, but existing research on smaller proteins still needs to finish.
https://www.techpowerup.com/gpu-specs/r ... 0-xt.c3481
The 6900 XT has 5120 shaders.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Tue Mar 30, 2021 5:20 pm
by ajm
Well, there is a HUGE difference indeed between the 6900XT and e.g. the 3080 for that project:
6900XT avr. PPD 1,563,523
3080 avr. PPD 6,011,894
https://folding.lar.systems/projects/fo ... file/17721
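To put the gap in numbers, here is a quick sketch using only the two averages quoted above and the 5.0-5.5 million range guessed in the first post. The variable names are just for illustration.

```python
# Average PPD figures quoted above for project 17721.
ppd_6900xt = 1_563_523
ppd_3080 = 6_011_894

# How many times faster the 3080 earns points on this project.
ratio_3080_vs_6900xt = ppd_3080 / ppd_6900xt

# Fraction of the expected 5.0-5.5 million PPD the 6900XT actually reaches
# (the expected range is the original poster's guess, not a measurement).
expected_low, expected_high = 5_000_000, 5_500_000
share_of_expected_best = ppd_6900xt / expected_low
share_of_expected_worst = ppd_6900xt / expected_high

print(f"3080 / 6900XT: {ratio_3080_vs_6900xt:.2f}x")
print(f"6900XT reaches {share_of_expected_worst:.0%} to {share_of_expected_best:.0%} of the guessed range")
```

So on this project the 3080 averages a bit under four times the 6900XT's PPD, and the 6900XT lands at roughly a third of what was expected.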
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Tue Mar 30, 2021 5:33 pm
by iero
ajm wrote: Well, there is a HUGE difference indeed between the 6900XT and e.g. the 3080 for that project:
6900XT avr. PPD 1,563,523
3080 avr. PPD 6,011,894
https://folding.lar.systems/projects/fo ... file/17721
Any ideas as to why this is happening?
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Tue Mar 30, 2021 6:01 pm
by ajm
Nope, never seen this before. Maybe the researcher is doing some experiment with that project, and thus making especially good use of CUDA. I remember having seen a few "unicorn" WUs with it on my system.
It is listed as unspecified but belongs to a Cancer series managed by Dr. Matthew Chan:
https://stats.foldingathome.org/project?p=17721
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 3:38 pm
by Starman157
It would be a good thing if FAH could associate WUs with appropriate graphics cards. The 17711-17724 WUs are obviously NOT optimized for AMD graphics cards.
I'm just thinking of all of the other more appropriate work that could be done with the same amount of electricity.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 3:50 pm
by JimboPalmer
Starman157 wrote: It would be a good thing if FAH could associate WUs with appropriate graphics cards. The 17711-17724 WUs are obviously NOT optimized for AMD graphics cards.
F@H can blackball given species from work on given projects. Asking would involve someone closer to the development cycle than I am. (Red, orange, or bolded names are a clue.)
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 4:31 pm
by Joe_H
JimboPalmer wrote: Starman157 wrote: It would be a good thing if FAH could associate WUs with appropriate graphics cards. The 17711-17724 WUs are obviously NOT optimized for AMD graphics cards.
F@H can blackball given species from work on given projects. Asking would involve someone closer to the development cycle than I am. (Red, orange, or bolded names are a clue.)
Yes, they could block the project from being assigned to the 6900 XT. But with the current setup for GPU species on the AMD side, that block would include all Navi-based AMD GPUs, not just the 6900 XT. In the current GPUs.txt file that is a total of 15 different entries, and because of the way AMD uses PCI device ID numbers, that represents more cards than that.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 4:39 pm
by bruce
FAH Development has a plan to optimize assignments by expanding the concept of GPU_Species but it seems that the project is stalled. That would be a good place to incorporate an enhancement, but resurrecting the overall plan would be better than trying to fix the issue individually.
The project owner can
1) Figure out how to fix those WUs on AMD
... or ...
2) Exclude assignments to all AMD GPUs or to half of them.
Fixing the bug is obviously the better choice.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 4:46 pm
by Starman157
"that block would include all Navi based AMD GPUs"; Not true Joe. It seems FAH does make the distinction of Navi based AMD products between the 5xxx (Navi 10) and 6xxx (Navi 21).
Based on the performance of this WU set, why not exclude ALL of Navi from running them, so that those cards can get to other WUs that they CAN compute well?
Either that, or fix the WUs, as I can't even guess why these WUs take 2.5 times longer than on Nvidia. Yes, that should equate to 2.5 times the electricity to do the work; no, I'm not equating the power used between AMD and Nvidia.
Or make FAH compute structure GPU agnostic (LOL).
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 4:56 pm
by Starman157
I just got one in this set on my Radeon HD7950. Sure enough, it's running 2.5x slower than "normal". So this problem is more AMD generic than Navi based.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 5:08 pm
by Joe_H
Starman157 wrote: "that block would include all Navi based AMD GPUs"; Not true Joe. It seems FAH does make the distinction of Navi based AMD products between the 5xxx (Navi 10) and 6xxx (Navi 21).
No, it does not do what you think. The description field where Navi is listed by type is descriptive only and not used otherwise. The parts that are used are the 3rd and 4th fields. In the case of all Navi entries, that is a '1' in the 3rd field indicating AMD, and a '6' in the 4th field (species) indicating Navi, because the change in microarchitecture required code changes in OpenMM and the folding core code.
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 5:39 pm
by Starman157
Not sure what you mean, Joe. The PCI Device ID for the 6900XT is 1002:73BF and the 5700XT is 1002:731F. Clearly they are different.
Are you saying that FAH doesn't use the PCI device ID for identification, and instead uses other means that don't discriminate the way the PCI device ID does?
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 6:14 pm
by Joe_H
F@h uses the device ID to determine which GPU you have. After that, for assignment purposes, they are classified by maker (field 3) and species (field 4). So all Navi-based GPUs are shown as maker 1 (AMD) and species 6. All other usable AMD GPUs are species 5. Species 4 and lower either do not support FP64 calculations, or are based on microarchitectures that F@h no longer supports. If the fields are zero or empty, then the card is blacklisted.
The third field in the GPUs.txt entry uses 1 for AMD, 2 for nVidia, and 3 for Intel. The species field numbers are specific to each card source, so a species 5 nVidia card is distinct from a species 5 AMD one.
For the AMD entries, AMD uses the same first two fields of the PCI device ID number on a range of cards using the same basic chip. Distinction between them is in a third field that F@h does not currently scan for and use. Sometimes the description field lists all variants connected to that device ID, but not always. It depends on the information available at the time the entry was created.
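A rough Python sketch of the classification described above may help. The colon-separated layout and the two sample lines are assumptions made up for illustration, not copied from the real GPUs.txt. Only the two device IDs (quoted earlier in this thread) and the meaning of the maker and species fields come from the posts here. `parse_entry`, `is_blacklisted`, and `is_blocked` are hypothetical helper names, not part of any F@h software.

```python
from typing import NamedTuple

# 3rd-field values as described in the post above.
MAKERS = {1: "AMD", 2: "nVidia", 3: "Intel"}

class GpuEntry(NamedTuple):
    vendor_id: int
    device_id: int
    maker: int        # 3rd field
    species: int      # 4th field
    description: str  # descriptive only, not used for assignment

def parse_entry(line: str) -> GpuEntry:
    # ASSUMED layout: vendor:device:maker:species:description
    vendor, device, maker, species, description = line.strip().split(":", 4)
    return GpuEntry(int(vendor, 16), int(device, 16),
                    int(maker or 0), int(species or 0), description)

def is_blacklisted(entry: GpuEntry) -> bool:
    # Zero or empty maker/species fields mean the card is blacklisted.
    return entry.maker == 0 or entry.species == 0

def is_blocked(entry: GpuEntry, blocked: set) -> bool:
    # A project block is keyed on (maker, species), not on the device ID.
    return (entry.maker, entry.species) in blocked

# Illustrative lines only; descriptions are invented.
SAMPLE = [
    "0x1002:0x73bf:1:6:Navi 21 (6900 XT, illustrative)",
    "0x1002:0x731f:1:6:Navi 10 (5700 XT, illustrative)",
]
entries = [parse_entry(line) for line in SAMPLE]

# Blocking AMD species 6 for a project catches both cards,
# even though their device IDs differ.
blocked = {(1, 6)}
for e in entries:
    print(hex(e.device_id), MAKERS[e.maker], e.species, is_blocked(e, blocked))
```

The point of the sketch is that the device ID is only used to look up the entry. Once the entry is found, assignment decisions see just the (maker, species) pair, so the 6900 XT and the 5700 XT are indistinguishable at that stage.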
Re: Wu 17721 on Radeon 6900XT (WTH???)
Posted: Wed Mar 31, 2021 6:19 pm
by bruce
Starman157 wrote: ... it's running 2.5x slower than "normal"...
Each project is unique. "Normal" and "2.5x slower" depend on several internal project settings. While the number of atoms isn't likely to be adjusted, it's fairly easy to alter the project to process a shorter or longer simulated time. Setting it to process 40% as much modeled time would cause it to upload completed WUs 2.5x as frequently and the points per WU would need to be adjusted to about 40%.
That would change your perception of the project. It would also make it run slightly less efficiently because of the increased time spent uploading and downloading data.
The number of simulated ns per WU is preset by the project owner. You'll find it printed in FAH's science log.
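The arithmetic behind the 40% example can be sketched in Python. The run time and point values below are hypothetical numbers chosen only to make the scaling visible, and the sketch assumes run time and points both scale linearly with simulated time. It ignores the upload/download overhead mentioned above.

```python
def rescale_wu(minutes_per_wu: float, points_per_wu: float, fraction: float):
    """Scale the simulated time per WU by `fraction`.

    ASSUMPTION: run time and points both scale linearly with simulated time.
    """
    return minutes_per_wu * fraction, points_per_wu * fraction

def points_per_day(minutes_per_wu: float, points_per_wu: float) -> float:
    # WUs finished per day times points per WU.
    return points_per_wu * (24 * 60 / minutes_per_wu)

# Hypothetical baseline: one WU takes 150 minutes and is worth 100,000 points.
base_minutes, base_points = 150.0, 100_000.0

# Project owner sets each WU to cover 40% as much modeled time.
short_minutes, short_points = rescale_wu(base_minutes, base_points, 0.4)

upload_frequency_ratio = base_minutes / short_minutes
print(f"uploads happen {upload_frequency_ratio:.1f}x as often")
print(f"PPD before: {points_per_day(base_minutes, base_points):,.0f}")
print(f"PPD after:  {points_per_day(short_minutes, short_points):,.0f}")
```

Under these assumptions the WUs finish 2.5x as often for 40% of the points each, so the daily total is unchanged. Only the per-WU numbers look different, which is why "normal" is hard to compare across projects.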