GPU shows "failed"

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: GPU shows "failed"

Post by JimboPalmer »

Peter_Hucker wrote:I don't have the time to mess around with versions..
That is why I linked the last known version for you. I am retired and have the time.

OpenCL comes in versions. F@H only 'recently' required version 1.2; prior to that, many older Nvidia cards worked that fail currently. Speeding up the software occasionally requires hardware features added in newer cards. It may be frustrating that a 13-year-old design is no longer new, but it is like dog years. I would not buy a card older than the Pascal or RDNA designs currently.
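
To make the version requirement concrete, here is a minimal sketch of checking a device's reported OpenCL version against the 1.2 minimum mentioned above. It assumes you already have the device version string (for example from a tool such as clinfo); the OpenCL specification defines that string as starting with "OpenCL <major>.<minor>". The function names and the sample strings are illustrative only and are not taken from the F@H client.

```python
import re


def opencl_version(device_version: str) -> tuple[int, int]:
    """Parse the leading 'OpenCL <major>.<minor>' from a device version string."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", device_version)
    if match is None:
        raise ValueError(f"not an OpenCL version string: {device_version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(device_version: str, minimum: tuple[int, int] = (1, 2)) -> bool:
    """Return True if the device reports at least the minimum OpenCL version."""
    return opencl_version(device_version) >= minimum


# Hypothetical sample strings, for illustration only.
for sample in ("OpenCL 1.1 CUDA", "OpenCL 1.2 AMD-APP (3075.13)", "OpenCL 3.0 CUDA"):
    print(sample, "->", "ok" if meets_minimum(sample) else "too old")
```

A card that passes this check can still fail for other reasons, such as a bad driver build, so treat it as a first filter only.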
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Peter_Hucker
Posts: 308
Joined: Wed Feb 16, 2022 1:18 am

Re: GPU shows "failed"

Post by Peter_Hucker »

JimboPalmer wrote:
Peter_Hucker wrote:I don't have the time to mess around with versions..
That is why I linked the last known version for you. I am retired and have the time.

OpenCL comes in versions. F@H only 'recently' required version 1.2; prior to that, many older Nvidia cards worked that fail currently. Speeding up the software occasionally requires hardware features added in newer cards. It may be frustrating that a 13-year-old design is no longer new, but it is like dog years. I would not buy a card older than the Pascal or RDNA designs currently.
Sorry, I didn't realise you'd linked to a good version. This? https://www.amd.com/en/support/graphics ... eon-r9-280 That's just the latest version?

Anyway, it would appear the offending computer has an out-of-date driver. Yet if I click update in the Radeon software, it says it's up to date. I'll try installing the one you linked to another day (I can't reboot at the moment because it's scanning 5 USB sticks for errors).
Peter_Hucker
Posts: 308
Joined: Wed Feb 16, 2022 1:18 am

Re: GPU shows "failed"

Post by Peter_Hucker »

Joe_H wrote:Double precision is used in a few spots where it is needed to carry enough precision in critical calculations. Those spots are a fraction of the total calculations done in the F@h GPU Core_22. You can emulate the double precision using singles, but it takes more than just three. Emulating the double precision work has been tried in the past; it was a lot slower than native double precision, even with GPUs that process it at a small fraction of their single precision rate.
You say more than three. How many roughly? Because modern Nvidia cards are 32 times faster at single than double, so even if it required 32 single instructions to emulate 1 double, you wouldn't lose any speed. If it's less than 32, you'd gain speed.
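
The break-even argument above can be written out as a back-of-the-envelope calculation. This is only a sketch: the 32:1 ratio is the figure quoted in the post, the operation counts are hypothetical, and the model counts raw instructions only. It ignores register use, memory traffic, and dependencies between instructions, which is one reason a measured result can differ from this estimate.

```python
def relative_throughput(single_to_double_ratio: float,
                        single_ops_per_emulated_double: float) -> float:
    """Emulated-double throughput divided by native-double throughput.

    Above 1.0, emulation wins on raw instruction count; below 1.0, it loses.
    """
    if single_ops_per_emulated_double <= 0:
        raise ValueError("operation count must be positive")
    return single_to_double_ratio / single_ops_per_emulated_double


# Hypothetical costs per emulated double operation, for illustration only.
for ops in (16, 32, 64):
    print(ops, relative_throughput(32, ops))
```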
Last edited by Peter_Hucker on Sat Feb 19, 2022 2:30 pm, edited 1 time in total.
Peter_Hucker
Posts: 308
Joined: Wed Feb 16, 2022 1:18 am

Re: GPU shows "failed"

Post by Peter_Hucker »

Peter_Hucker wrote:
JimboPalmer wrote:
Peter_Hucker wrote:I don't have the time to mess around with versions..
That is why I linked the last known version for you. I am retired and have the time.

OpenCL comes in versions. F@H only 'recently' required version 1.2; prior to that, many older Nvidia cards worked that fail currently. Speeding up the software occasionally requires hardware features added in newer cards. It may be frustrating that a 13-year-old design is no longer new, but it is like dog years. I would not buy a card older than the Pascal or RDNA designs currently.
Sorry, I didn't realise you'd linked to a good version. This? https://www.amd.com/en/support/graphics ... eon-r9-280 That's just the latest version?

Anyway, it would appear the offending computer has an out-of-date driver. Yet if I click update in the Radeon software, it says it's up to date. I'll try installing the one you linked to another day (I can't reboot at the moment because it's scanning 5 USB sticks for errors).
And another computer that's not working has your driver version. The one that's OK has a later version (presumably for the newer card that's also in there), so I'm not sure what that does with the old card; I would have assumed it was using your version. But it works and the one with yours doesn't. I'll try upgrading them both by one notch later and see what happens. Rebooting is a pain and can't be done just now.
Joe_H
Site Admin
Posts: 7936
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: GPU shows "failed"

Post by Joe_H »

Peter_Hucker wrote:
Joe_H wrote:Double precision is used in a few spots where it is needed to carry enough precision in critical calculations. Those spots are a fraction of the total calculations done in the F@h GPU Core_22. You can emulate the double precision using singles, but it takes more than just three. Emulating the double precision work has been tried in the past; it was a lot slower than native double precision, even with GPUs that process it at a small fraction of their single precision rate.
You say more than three. How many roughly? Because modern Nvidia cards are 32 times faster at single than double, so even if it required 32 single instructions to emulate 1 double, you wouldn't lose any speed. If it's less than 32, you'd gain speed.
I don't know how many, but that does not matter. You skipped right over the part where I mentioned that emulation had been tried in the past and was much slower, period. I will take the word of the researchers who actually did try this out.

Edit: I looked a little deeper. Papers on one approach to emulating doubles using singles were reviewed by researchers using another distributed computing platform. To implement it, they would also have needed to add BLAS and FFT libraries to their codebase, which would have added complexity and increased the executable size.
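
As a rough illustration of why emulation takes "more than just three" singles, here is a minimal sketch of one well-known technique: double-single (two-float) addition built on Knuth's TwoSum. This is not F@H or Core_22 code, and it is not necessarily the approach those researchers evaluated. The helper names f32, two_sum and ds_add are made up for this example. Python floats are 64-bit, so f32 rounds every intermediate result to 32 bits to mimic single precision hardware.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum: 6 single ops, returns (rounded sum, rounding error)."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err


def ds_add(x: tuple[float, float], y: tuple[float, float]) -> tuple[float, float]:
    """Add two (hi, lo) pairs of singles; 11 single ops in this simple variant."""
    s, e = two_sum(x[0], y[0])        # 6 ops
    e = f32(e + f32(x[1] + y[1]))     # 2 ops: fold in the low words
    hi = f32(s + e)                   # 1 op
    lo = f32(e - f32(hi - s))         # 2 ops: renormalise the pair
    return hi, lo


tiny = f32(2.0 ** -30)
print(f32(1.0 + tiny))                   # plain single: the small term is lost
print(ds_add((1.0, 0.0), (tiny, 0.0)))   # the pair keeps it in the low word
```

Even this one addition costs 11 single precision operations, and a pair of singles carries about 48 significand bits against 53 for a native double, so the result is still not a full replacement.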

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Peter_Hucker
Posts: 308
Joined: Wed Feb 16, 2022 1:18 am

Re: GPU shows "failed"

Post by Peter_Hucker »

JimboPalmer wrote:
Peter_Hucker wrote:I don't have the time to mess around with versions..
That is why I linked the last known version for you. I am retired and have the time.

OpenCL comes in versions. F@H only 'recently' required version 1.2; prior to that, many older Nvidia cards worked that fail currently. Speeding up the software occasionally requires hardware features added in newer cards. It may be frustrating that a 13-year-old design is no longer new, but it is like dog years. I would not buy a card older than the Pascal or RDNA designs currently.
One of my computers had an earlier version than this, so I'm now upgrading it to see if that helps. But the other computer already had this version and was also sticking. Overnight it got going, and it reports 2 successful tasks and no failed tasks, so I assume it was the problem I'd seen people refer to on Discord: that a certain project doesn't like AMD GPUs. Presumably the first part of the task was horrendously slow because all the work was being done on the CPU, and the end part got the GPU running and completed it on time.