
Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 9:43 am
by Demandzm
I have two RX 580s with the latest drivers. Whenever I get GPU work units, they always go to folding slot 2. The other card has never seen any work as far as I can tell, so I can't say for sure whether this is specific to one card. After a few minutes of work, folding stops. The log shows "Core Shutdown: BAD_WORK_UNIT", and at some point the folding slot fails. I logged the temperatures and saw a peak of about 68 degrees. Both cards have 0 memory errors and run games and benchmarks fine. I even ran a cryptominer for an hour or so and everything was stable.

The log is too big for the forum:
https://pastebin.com/5y775Lw3

Re: Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 1:03 pm
by ajm
You have a lot of bad units, always with the same code (114 = 0x72), and a series of "Error invoking kernel sortShortList" errors. I don't know exactly, but from what I gather from older cases, it looks like a driver issue. There are a number of those with Windows 10. For example, when Windows 10 updates a driver, FAH can get messed up, all the more so if you use two cards.
The best solution I'm aware of is to thoroughly uninstall and then reinstall FAH. You can first make a copy of the file config.xml in %AppData%\FAHClient, which contains everything you need to quickly reconfigure your account afterwards.
When FAH is installed, it scans the hardware more accurately than it can after a driver update.
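The backup step above could be scripted roughly like this. It is only a minimal sketch and not part of FAH: the function name `backup_fah_config` and the choice of backup folder are made up for illustration, and the only thing taken from the post is the location of config.xml in %AppData%\FAHClient.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_fah_config(client_dir, backup_dir):
    """Copy config.xml out of the FAHClient data directory before a reinstall.

    Both arguments are directories; the path of the new copy is returned.
    (Hypothetical helper, not something the FAH installer provides.)
    """
    source = Path(client_dir) / "config.xml"
    if not source.is_file():
        raise FileNotFoundError(f"no config.xml found in {client_dir}")

    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Put a timestamp in the name so backups taken at different times coexist.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"config-{stamp}.xml"

    # copy2 also preserves the file's modification time.
    shutil.copy2(source, target)
    return target
```

On Windows it would be called with something like `backup_fah_config(os.path.expandvars(r"%AppData%\FAHClient"), r"C:\fah-backup")`, where the second path is an arbitrary example. After reinstalling, the saved file can be copied back by hand.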

Re: Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 5:30 pm
by Joe_H
A problem was found with some projects of a certain size on AMD cards that results in this issue. They are working on determining whether the problem is in the driver or the folding core code. Temporarily, they have disabled assignment of these projects to most AMD cards; the RDNA-based ones such as the 5700 XT do not appear to be affected. Searching for "sortShortList" should find the topic where this has been discussed.

I am not sure exactly when the setting was made to stop sending these GPU projects to AMD cards, but you should not be getting any more. Let us know if you do, and provide the project number.
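If it helps with finding the project numbers, here is a rough sketch that pulls the Project/Run/Clone/Gen of every failed unit out of a client log. It assumes log lines shaped like the ones quoted in this thread ("Project: ... (Run ..., Clone ..., Gen ...)" from the core and "FahCore returned: BAD_WORK_UNIT" from the client). The function name `failed_work_units` is made up, and other client versions may format these lines differently.

```python
import re

# Core-level line, e.g.
# 20:11:19:WU00:FS01:0x22:Project: 11781 (Run 0, Clone 1092, Gen 15)
PROJECT_RE = re.compile(
    r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):0x[0-9a-fA-F]+:"
    r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
# Client-level line, e.g.
# 20:11:45:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
FAILED_RE = re.compile(
    r"^\d\d:\d\d:\d\d:WARNING:(WU\d+):(FS\d+):FahCore returned: BAD_WORK_UNIT"
)


def failed_work_units(log_lines):
    """Return (slot, project, run, clone, gen) for each BAD_WORK_UNIT found."""
    current = {}   # (WU id, slot) -> PRCG of the unit most recently started there
    failures = []
    for line in log_lines:
        line = line.strip()
        m = PROJECT_RE.match(line)
        if m:
            wu, slot, *prcg = m.groups()
            current[(wu, slot)] = tuple(int(x) for x in prcg)
            continue
        m = FAILED_RE.match(line)
        if m and m.groups() in current:
            # Pair the failure with the unit that was running in that WU/slot.
            failures.append((m.group(2),) + current.pop(m.groups()))
    return failures


sample = """\
20:11:19:WU00:FS01:0x22:Project: 11781 (Run 0, Clone 1092, Gen 15)
20:11:44:WU00:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
20:11:45:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
""".splitlines()

for slot, project, run, clone, gen in failed_work_units(sample):
    print(f"{slot}: project {project} (Run {run}, Clone {clone}, Gen {gen})")
# FS01: project 11781 (Run 0, Clone 1092, Gen 15)
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines. The slot is included in each result so that failures on one card can be told apart from failures on another.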

Re: Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 6:42 pm
by rts
FWIW I see the exact same error on a 7870 GHz Edition. On Linux it just didn't work at all with FAH (invalid platformId size error, both with radeon and amdgpu; note that OpenCL memtest worked). On Windows it produces the following error.
GPU 0: Bus:1 Slot:0 Func:0 AMD:5 Pitcairn [Radeon HD 7800]
...
16:19:37:WU01:FS01:0x22:Project: 11759 (Run 0, Clone 145, Gen 17)
16:19:37:WU01:FS01:0x22:Unit: 0x0000002180fccb0a5e6d7c7c8e4495f9
16:19:37:WU01:FS01:0x22:Reading tar file core.xml
16:19:37:WU01:FS01:0x22:Reading tar file integrator.xml
16:19:37:WU01:FS01:0x22:Reading tar file state.xml
16:19:38:WU01:FS01:0x22:Reading tar file system.xml
16:19:39:WU01:FS01:0x22:Digital signatures verified
16:19:39:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
16:19:39:WU01:FS01:0x22:Version 0.0.2
16:20:03:WU01:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
16:20:03:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
16:20:03:WU01:FS01:0x22:Saving result file science.log
16:20:03:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
16:20:03:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
This is a fresh Windows 10 install, and the drivers were never updated: I installed them today and installed FAH afterwards. So I think it's pretty safe to say it's not a driver detection / installation problem.

Edit: It finally got another WU and this one seems to work just fine. PRCG is 11744 (0, 3733, 24).

Re: Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 8:34 pm
by mbainrot
I am getting this issue too, for projects 11752 and 11781 on an RX 570.

Glad to hear it's not just me, as that card is a problem child for me (it bluescreens/kernel-panics computers when in an Akitio Node Pro eGPU enclosure...) and I was gonna throw it out lol

My log (with a few reboots/restarts of FAHClient, as this is a freshly spun-up box):

Code:

*********************** Log Started 2020-03-29T20:07:10Z ***********************
20:07:10:************************* Folding@home Client *************************
20:07:10:        Website: https://foldingathome.org/
20:07:10:      Copyright: (c) 2009-2018 foldingathome.org
20:07:10:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:07:10:           Args: --child --lifeline 3781 /etc/fahclient/config.xml --run-as root
20:07:10:                 --pid-file=/var/run/fahclient.pid --daemon
20:07:10:         Config: /etc/fahclient/config.xml
20:07:10:******************************** Build ********************************
20:07:10:        Version: 7.5.1
20:07:10:           Date: May 11 2018
20:07:10:           Time: 19:59:04
20:07:10:     Repository: Git
20:07:10:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
20:07:10:         Branch: master
20:07:10:       Compiler: GNU 6.3.0 20170516
20:07:10:        Options: -std=gnu++98 -O3 -funroll-loops
20:07:10:       Platform: linux2 4.14.0-3-amd64
20:07:10:           Bits: 64
20:07:10:           Mode: Release
20:07:10:******************************* System ********************************
20:07:10:            CPU: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
20:07:10:         CPU ID: GenuineIntel Family 6 Model 15 Stepping 6
20:07:10:           CPUs: 2
20:07:10:         Memory: 1.94GiB
20:07:10:    Free Memory: 810.36MiB
20:07:10:        Threads: POSIX_THREADS
20:07:10:     OS Version: 4.15
20:07:10:    Has Battery: false
20:07:10:     On Battery: false
20:07:10:     UTC Offset: 11
20:07:10:            PID: 3783
20:07:10:            CWD: /var/lib/fahclient
20:07:10:             OS: Linux 4.15.0-20-generic x86_64
20:07:10:        OS Arch: AMD64
20:07:10:           GPUs: 1
20:07:10:          GPU 0: Bus:1 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
20:07:10:                 470/480/570/580/590]
20:07:10:           CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
20:07:10:                 libcuda.so: cannot open shared object file: No such file or
20:07:10:                 directory
20:07:10:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:3004.6
20:07:10:***********************************************************************
20:07:10:<config>
20:07:10:  <!-- Client Control -->
20:07:10:  <fold-anon v='true'/>
20:07:10:
20:07:10:  <!-- HTTP Server -->
20:07:10:  <allow v='127.0.0.1 172.16.0.0/16'/>
20:07:10:
20:07:10:  <!-- Network -->
20:07:10:  <proxy v=':8080'/>
20:07:10:
20:07:10:  <!-- Remote Command Server -->
20:07:10:  <command-allow-no-pass v='127.0.0.1 172.16.0.0/16'/>
20:07:10:
20:07:10:  <!-- Slot Control -->
20:07:10:  <power v='full'/>
20:07:10:
20:07:10:  <!-- Folding Slots -->
20:07:10:  <slot id='1' type='GPU'>
20:07:10:    <opencl-index v='0'/>
20:07:10:  </slot>
20:07:10:</config>
20:07:10:Switching to user root
20:07:10:Trying to access database...
20:07:12:Successfully acquired database lock
20:07:12:Enabled folding slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590]
20:07:12:WU00:FS01:Connecting to 65.254.110.245:8080
20:07:14:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:07:14:WU00:FS01:Connecting to 18.218.241.186:80
20:07:15:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:07:15:ERROR:WU00:FS01:Exception: Could not get an assignment
20:07:15:WU00:FS01:Connecting to 65.254.110.245:8080
20:07:16:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:07:16:WU00:FS01:Connecting to 18.218.241.186:80
20:07:17:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:07:17:ERROR:WU00:FS01:Exception: Could not get an assignment
20:08:15:WU00:FS01:Connecting to 65.254.110.245:8080
20:08:16:WU00:FS01:Assigned to work server 13.90.152.57
20:08:16:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590] from 13.90.152.57
20:08:16:WU00:FS01:Connecting to 13.90.152.57:8080
20:09:00:Saving configuration to /etc/fahclient/config.xml
20:09:00:<config>
20:09:00:  <!-- Client Control -->
20:09:00:  <fold-anon v='true'/>
20:09:00:
20:09:00:  <!-- HTTP Server -->
20:09:00:  <allow v='127.0.0.1 172.16.0.0/16'/>
20:09:00:
20:09:00:  <!-- Network -->
20:09:00:  <proxy v=':8080'/>
20:09:00:
20:09:00:  <!-- Remote Command Server -->
20:09:00:  <command-allow-no-pass v='127.0.0.1 172.16.0.0/16'/>
20:09:00:
20:09:00:  <!-- Slot Control -->
20:09:00:  <power v='full'/>
20:09:00:
20:09:00:  <!-- User Information -->
20:09:00:  <team v='223512'/>
20:09:00:  <user v='Max_Bainrot'/>
20:09:00:
20:09:00:  <!-- Folding Slots -->
20:09:00:  <slot id='1' type='GPU'>
20:09:00:    <opencl-index v='0'/>
20:09:00:  </slot>
20:09:00:</config>
20:09:05:WU00:FS01:Downloading 86.24MiB
20:09:11:WU00:FS01:Download 4.28%
20:09:14:Saving configuration to /etc/fahclient/config.xml
20:09:14:<config>
20:09:14:  <!-- Client Control -->
20:09:14:  <fold-anon v='true'/>
20:09:14:
20:09:14:  <!-- HTTP Server -->
20:09:14:  <allow v='127.0.0.1 172.16.0.0/16'/>
20:09:14:
20:09:14:  <!-- Network -->
20:09:14:  <proxy v=':8080'/>
20:09:14:
20:09:14:  <!-- Remote Command Server -->
20:09:14:  <command-allow-no-pass v='127.0.0.1 172.16.0.0/16'/>
20:09:14:
20:09:14:  <!-- Slot Control -->
20:09:14:  <power v='full'/>
20:09:14:
20:09:14:  <!-- User Information -->
20:09:14:  <team v='223512'/>
20:09:14:  <user v='Max_Bainrot'/>
20:09:14:
20:09:14:  <!-- Folding Slots -->
20:09:14:  <slot id='1' type='GPU'>
20:09:14:    <opencl-index v='0'/>
20:09:14:  </slot>
20:09:14:</config>
20:09:17:WU00:FS01:Download 8.84%
20:09:23:WU00:FS01:Download 14.57%
20:09:23:Saving configuration to /etc/fahclient/config.xml
20:09:23:<config>
20:09:23:  <!-- Client Control -->
20:09:23:  <fold-anon v='true'/>
20:09:23:
20:09:23:  <!-- HTTP Server -->
20:09:23:  <allow v='127.0.0.1 172.16.0.0/16'/>
20:09:23:
20:09:23:  <!-- Network -->
20:09:23:  <proxy v=':8080'/>
20:09:23:
20:09:23:  <!-- Remote Command Server -->
20:09:23:  <command-allow-no-pass v='127.0.0.1 172.16.0.0/16'/>
20:09:23:
20:09:23:  <!-- Slot Control -->
20:09:23:  <power v='full'/>
20:09:23:
20:09:23:  <!-- User Information -->
20:09:23:  <passkey v='********************************'/>
20:09:23:  <team v='223518'/>
20:09:23:  <user v='Max_Bainrot'/>
20:09:23:
20:09:23:  <!-- Folding Slots -->
20:09:23:  <slot id='1' type='GPU'>
20:09:23:    <opencl-index v='0'/>
20:09:23:  </slot>
20:09:23:</config>
20:09:30:WU00:FS01:Download 17.18%
20:09:36:WU00:FS01:Download 20.80%
20:09:42:WU00:FS01:Download 25.00%
20:09:48:WU00:FS01:Download 30.00%
20:09:54:WU00:FS01:Download 34.79%
20:10:00:WU00:FS01:Download 39.64%
20:10:06:WU00:FS01:Download 44.43%
20:10:12:WU00:FS01:Download 49.07%
20:10:15:Saving configuration to /etc/fahclient/config.xml
20:10:15:<config>
20:10:15:  <!-- Client Control -->
20:10:15:  <fold-anon v='true'/>
20:10:15:
20:10:15:  <!-- HTTP Server -->
20:10:15:  <allow v='127.0.0.1 172.16.0.0/16'/>
20:10:15:
20:10:15:  <!-- Network -->
20:10:15:  <proxy v=':8080'/>
20:10:15:
20:10:15:  <!-- Remote Command Server -->
20:10:15:  <command-allow-no-pass v='127.0.0.1 172.16.0.0/16'/>
20:10:15:
20:10:15:  <!-- Slot Control -->
20:10:15:  <power v='full'/>
20:10:15:
20:10:15:  <!-- User Information -->
20:10:15:  <passkey v='********************************'/>
20:10:15:  <team v='223518'/>
20:10:15:  <user v='Max_Bainrot'/>
20:10:15:
20:10:15:  <!-- Folding Slots -->
20:10:15:  <slot id='1' type='GPU'>
20:10:15:    <opencl-index v='0'/>
20:10:15:  </slot>
20:10:15:</config>
20:10:18:WU00:FS01:Download 53.85%
20:10:24:WU00:FS01:Download 58.70%
20:10:30:WU00:FS01:Download 62.47%
20:10:36:WU00:FS01:Download 64.50%
20:10:42:WU00:FS01:Download 68.78%
20:10:48:WU00:FS01:Download 74.43%
20:10:54:WU00:FS01:Download 78.93%
20:11:00:WU00:FS01:Download 84.22%
20:11:06:WU00:FS01:Download 89.36%
20:11:12:WU00:FS01:Download 94.51%
20:11:18:WU00:FS01:Download 99.51%
20:11:18:WU00:FS01:Download complete
20:11:19:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:11781 run:0 clone:1092 gen:15 core:0x22 unit:0x0000001e0d5a98395e73c52e8b172c52
20:11:19:WU00:FS01:Starting
20:11:19:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 705 -lifeline 3783 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
20:11:19:WU00:FS01:Started FahCore on PID 3819
20:11:19:WU00:FS01:Core PID:3823
20:11:19:WU00:FS01:FahCore 0x22 started
20:11:19:WU00:FS01:0x22:*********************** Log Started 2020-03-29T20:11:19Z ***********************
20:11:19:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
20:11:19:WU00:FS01:0x22:       Type: 0x22
20:11:19:WU00:FS01:0x22:       Core: Core22
20:11:19:WU00:FS01:0x22:    Website: https://foldingathome.org/
20:11:19:WU00:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
20:11:19:WU00:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
20:11:19:WU00:FS01:0x22:             <rafal.wiewiora@choderalab.org>
20:11:19:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 705 -lifeline 3819 -checkpoint 15
20:11:19:WU00:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
20:11:19:WU00:FS01:0x22:     Config: <none>
20:11:19:WU00:FS01:0x22:************************************ Build *************************************
20:11:19:WU00:FS01:0x22:    Version: 0.0.2
20:11:19:WU00:FS01:0x22:       Date: Dec 6 2019
20:11:19:WU00:FS01:0x22:       Time: 21:20:17
20:11:19:WU00:FS01:0x22: Repository: Git
20:11:19:WU00:FS01:0x22:   Revision: f87d92b58abdf7e6bf2e173cfbc4dc3e837c7042
20:11:19:WU00:FS01:0x22:     Branch: core22
20:11:19:WU00:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
20:11:19:WU00:FS01:0x22:    Options: -std=gnu++98 -O3 -funroll-loops
20:11:19:WU00:FS01:0x22:   Platform: linux2 4.9.87-linuxkit-aufs
20:11:19:WU00:FS01:0x22:       Bits: 64
20:11:19:WU00:FS01:0x22:       Mode: Release
20:11:19:WU00:FS01:0x22:************************************ System ************************************
20:11:19:WU00:FS01:0x22:        CPU: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
20:11:19:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 6
20:11:19:WU00:FS01:0x22:       CPUs: 2
20:11:19:WU00:FS01:0x22:     Memory: 1.94GiB
20:11:19:WU00:FS01:0x22:Free Memory: 758.71MiB
20:11:19:WU00:FS01:0x22:    Threads: POSIX_THREADS
20:11:19:WU00:FS01:0x22: OS Version: 4.15
20:11:19:WU00:FS01:0x22:Has Battery: false
20:11:19:WU00:FS01:0x22: On Battery: false
20:11:19:WU00:FS01:0x22: UTC Offset: 11
20:11:19:WU00:FS01:0x22:        PID: 3823
20:11:19:WU00:FS01:0x22:        CWD: /var/lib/fahclient/work
20:11:19:WU00:FS01:0x22:         OS: Linux 4.15.0-20-generic x86_64
20:11:19:WU00:FS01:0x22:    OS Arch: AMD64
20:11:19:WU00:FS01:0x22:********************************************************************************
20:11:19:WU00:FS01:0x22:Project: 11781 (Run 0, Clone 1092, Gen 15)
20:11:19:WU00:FS01:0x22:Unit: 0x0000001e0d5a98395e73c52e8b172c52
20:11:19:WU00:FS01:0x22:Reading tar file core.xml
20:11:19:WU00:FS01:0x22:Reading tar file integrator.xml
20:11:19:WU00:FS01:0x22:Reading tar file state.xml
20:11:19:WU00:FS01:0x22:Reading tar file system.xml
20:11:20:WU00:FS01:0x22:Digital signatures verified
20:11:20:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
20:11:20:WU00:FS01:0x22:Version 0.0.2
20:11:44:WU00:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
20:11:44:WU00:FS01:0x22:Saving result file ../logfile_01.txt
20:11:44:WU00:FS01:0x22:Saving result file science.log
20:11:44:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
20:11:45:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
20:11:45:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:11781 run:0 clone:1092 gen:15 core:0x22 unit:0x0000001e0d5a98395e73c52e8b172c52
20:11:45:WU00:FS01:Uploading 8.00KiB to 13.90.152.57
20:11:45:WU00:FS01:Connecting to 13.90.152.57:8080
20:11:45:WU01:FS01:Connecting to 65.254.110.245:8080
20:11:46:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:11:46:WU01:FS01:Connecting to 18.218.241.186:80
20:11:47:WU01:FS01:Assigned to work server 40.114.52.201
20:11:47:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590] from 40.114.52.201
20:11:47:WU01:FS01:Connecting to 40.114.52.201:8080
20:12:01:WU00:FS01:Upload complete
20:12:01:WU00:FS01:Server responded WORK_ACK (400)
20:12:01:WU00:FS01:Cleaning up
20:13:57:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
20:13:57:WU01:FS01:Connecting to 40.114.52.201:80
20:16:08:ERROR:WU01:FS01:Exception: Failed to connect to 40.114.52.201:80: Connection timed out
20:16:08:WU01:FS01:Connecting to 65.254.110.245:8080
20:16:09:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:16:09:WU01:FS01:Connecting to 18.218.241.186:80
20:16:10:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:16:10:ERROR:WU01:FS01:Exception: Could not get an assignment
20:17:08:WU01:FS01:Connecting to 65.254.110.245:8080
20:17:09:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:17:09:WU01:FS01:Connecting to 18.218.241.186:80
20:17:10:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:17:10:ERROR:WU01:FS01:Exception: Could not get an assignment
20:18:46:WU01:FS01:Connecting to 65.254.110.245:8080
20:18:47:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:18:47:WU01:FS01:Connecting to 18.218.241.186:80
20:18:48:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:18:48:ERROR:WU01:FS01:Exception: Could not get an assignment
20:21:23:WU01:FS01:Connecting to 65.254.110.245:8080
20:21:24:WU01:FS01:Assigned to work server 140.163.4.231
20:21:24:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590] from 140.163.4.231
20:21:24:WU01:FS01:Connecting to 140.163.4.231:8080
20:22:08:WU01:FS01:Downloading 13.15MiB
20:22:14:WU01:FS01:Download 13.31%
20:22:20:WU01:FS01:Download 17.11%
20:22:26:WU01:FS01:Download 22.34%
20:22:32:WU01:FS01:Download 46.11%
20:22:38:WU01:FS01:Download 69.40%
20:22:45:WU01:FS01:Download 80.80%
20:22:51:WU01:FS01:Download 99.82%
20:22:51:WU01:FS01:Download complete
20:22:51:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11752 run:0 clone:3229 gen:13 core:0x22 unit:0x0000001b8ca304e75e6a806dd63c1b2e
20:22:51:WU01:FS01:Starting
20:22:51:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 705 -lifeline 3783 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
20:22:51:WU01:FS01:Started FahCore on PID 3858
20:22:51:WU01:FS01:Core PID:3862
20:22:51:WU01:FS01:FahCore 0x22 started
20:22:51:WU01:FS01:0x22:*********************** Log Started 2020-03-29T20:22:51Z ***********************
20:22:51:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
20:22:51:WU01:FS01:0x22:       Type: 0x22
20:22:51:WU01:FS01:0x22:       Core: Core22
20:22:51:WU01:FS01:0x22:    Website: https://foldingathome.org/
20:22:51:WU01:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
20:22:51:WU01:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
20:22:51:WU01:FS01:0x22:             <rafal.wiewiora@choderalab.org>
20:22:51:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 705 -lifeline 3858 -checkpoint 15
20:22:51:WU01:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
20:22:51:WU01:FS01:0x22:     Config: <none>
20:22:51:WU01:FS01:0x22:************************************ Build *************************************
20:22:51:WU01:FS01:0x22:    Version: 0.0.2
20:22:51:WU01:FS01:0x22:       Date: Dec 6 2019
20:22:51:WU01:FS01:0x22:       Time: 21:20:17
20:22:51:WU01:FS01:0x22: Repository: Git
20:22:51:WU01:FS01:0x22:   Revision: f87d92b58abdf7e6bf2e173cfbc4dc3e837c7042
20:22:51:WU01:FS01:0x22:     Branch: core22
20:22:51:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
20:22:51:WU01:FS01:0x22:    Options: -std=gnu++98 -O3 -funroll-loops
20:22:51:WU01:FS01:0x22:   Platform: linux2 4.9.87-linuxkit-aufs
20:22:51:WU01:FS01:0x22:       Bits: 64
20:22:51:WU01:FS01:0x22:       Mode: Release
20:22:51:WU01:FS01:0x22:************************************ System ************************************
20:22:51:WU01:FS01:0x22:        CPU: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
20:22:51:WU01:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 6
20:22:51:WU01:FS01:0x22:       CPUs: 2
20:22:51:WU01:FS01:0x22:     Memory: 1.94GiB
20:22:51:WU01:FS01:0x22:Free Memory: 837.48MiB
20:22:51:WU01:FS01:0x22:    Threads: POSIX_THREADS
20:22:51:WU01:FS01:0x22: OS Version: 4.15
20:22:51:WU01:FS01:0x22:Has Battery: false
20:22:51:WU01:FS01:0x22: On Battery: false
20:22:51:WU01:FS01:0x22: UTC Offset: 11
20:22:51:WU01:FS01:0x22:        PID: 3862
20:22:51:WU01:FS01:0x22:        CWD: /var/lib/fahclient/work
20:22:51:WU01:FS01:0x22:         OS: Linux 4.15.0-20-generic x86_64
20:22:51:WU01:FS01:0x22:    OS Arch: AMD64
20:22:51:WU01:FS01:0x22:********************************************************************************
20:22:51:WU01:FS01:0x22:Project: 11752 (Run 0, Clone 3229, Gen 13)
20:22:51:WU01:FS01:0x22:Unit: 0x0000001b8ca304e75e6a806dd63c1b2e
20:22:51:WU01:FS01:0x22:Reading tar file core.xml
20:22:51:WU01:FS01:0x22:Reading tar file integrator.xml
20:22:51:WU01:FS01:0x22:Reading tar file state.xml
20:22:54:WU01:FS01:0x22:Reading tar file system.xml
20:22:55:WU01:FS01:0x22:Digital signatures verified
20:22:55:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
20:22:55:WU01:FS01:0x22:Version 0.0.2
20:23:20:WU01:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
20:23:20:WU01:FS01:0x22:Saving result file ../logfile_01.txt
20:23:20:WU01:FS01:0x22:Saving result file science.log
20:23:20:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
20:23:20:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
20:23:20:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11752 run:0 clone:3229 gen:13 core:0x22 unit:0x0000001b8ca304e75e6a806dd63c1b2e
20:23:20:WU01:FS01:Uploading 2.65KiB to 140.163.4.231
20:23:20:WU01:FS01:Connecting to 140.163.4.231:8080
20:23:21:WU00:FS01:Connecting to 65.254.110.245:8080
20:23:22:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:23:22:WU00:FS01:Connecting to 18.218.241.186:80
20:23:23:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:23:23:ERROR:WU00:FS01:Exception: Could not get an assignment
20:23:23:WU00:FS01:Connecting to 65.254.110.245:8080
20:23:24:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:23:24:WU00:FS01:Connecting to 18.218.241.186:80
20:23:25:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:23:25:ERROR:WU00:FS01:Exception: Could not get an assignment
20:24:19:WU01:FS01:Upload complete
20:24:19:WU01:FS01:Server responded WORK_ACK (400)
20:24:19:WU01:FS01:Cleaning up
20:24:23:WU00:FS01:Connecting to 65.254.110.245:8080
20:24:24:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:24:24:WU00:FS01:Connecting to 18.218.241.186:80
20:24:25:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:24:25:ERROR:WU00:FS01:Exception: Could not get an assignment
20:26:00:WU00:FS01:Connecting to 65.254.110.245:8080
20:26:01:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:26:01:WU00:FS01:Connecting to 18.218.241.186:80
20:26:02:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:26:02:ERROR:WU00:FS01:Exception: Could not get an assignment
20:28:37:WU00:FS01:Connecting to 65.254.110.245:8080
20:28:38:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
20:28:38:WU00:FS01:Connecting to 18.218.241.186:80
20:28:39:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
20:28:39:ERROR:WU00:FS01:Exception: Could not get an assignment

Re: Bad work units/folding slot failed RX580

Posted: Sun Mar 29, 2020 11:44 pm
by Demandzm
Joe_H wrote: A problem was found with some projects of a certain size on AMD cards that results in this issue. They are working on determining whether the problem is in the driver or the folding core code. Temporarily, they have disabled assignment of these projects to most AMD cards; the RDNA-based ones such as the 5700 XT do not appear to be affected. Searching for "sortShortList" should find the topic where this has been discussed.

I am not sure exactly when the setting was made to stop sending these GPU projects to AMD cards, but you should not be getting any more. Let us know if you do, and provide the project number.

I will look into that. I also did a clean install of Folding@home and the GPU drivers, just to be sure that wasn't an issue. Thanks, guys.

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 1:24 am
by Demandzm
I'm seeing the same errors again. I haven't made it through the entire thread about the sortShortList issue yet, but so far I don't think this is related. The projects that are failing have 62180 atoms, and they fail only on slot 2. The other card has work on project 11749 (also 62180 atoms) and is working fine. I'm going to wait until the current WUs are finished, then remove one of the cards and see if I still get errors.

Here is the log if anyone wants to see it.
https://pastebin.com/UM6UvwUy

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 3:16 am
by rocketraman
Same error here, Linux RX590. I don't see the `sortShortList` error in any logs in `/var/lib/fahclient`.

Code:

04:03:17:WU03:FS01:0x22:*********************** Log Started 2020-03-29T04:03:17Z ***********************
04:03:17:WU03:FS01:0x22:*************************** Core22 Folding@home Core ***************************
04:03:17:WU03:FS01:0x22:       Type: 0x22
04:03:17:WU03:FS01:0x22:       Core: Core22
04:03:17:WU03:FS01:0x22:    Website: https://foldingathome.org/
04:03:17:WU03:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
04:03:17:WU03:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
04:03:17:WU03:FS01:0x22:             <rafal.wiewiora@choderalab.org>
04:03:17:WU03:FS01:0x22:       Args: -dir 03 -suffix 01 -version 705 -lifeline 1540691 -checkpoint 15
04:03:17:WU03:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
04:03:17:WU03:FS01:0x22:     Config: <none>
04:03:17:WU03:FS01:0x22:************************************ Build *************************************
04:03:17:WU03:FS01:0x22:    Version: 0.0.2
04:03:17:WU03:FS01:0x22:       Date: Dec 6 2019
04:03:17:WU03:FS01:0x22:       Time: 21:20:17
04:03:17:WU03:FS01:0x22: Repository: Git
04:03:17:WU03:FS01:0x22:   Revision: f87d92b58abdf7e6bf2e173cfbc4dc3e837c7042
04:03:17:WU03:FS01:0x22:     Branch: core22
04:03:17:WU03:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
04:03:17:WU03:FS01:0x22:    Options: -std=gnu++98 -O3 -funroll-loops
04:03:17:WU03:FS01:0x22:   Platform: linux2 4.9.87-linuxkit-aufs
04:03:17:WU03:FS01:0x22:       Bits: 64
04:03:17:WU03:FS01:0x22:       Mode: Release
04:03:17:WU03:FS01:0x22:************************************ System ************************************
04:03:17:WU03:FS01:0x22:        CPU: AMD Ryzen Threadripper 1950X 16-Core Processor
04:03:17:WU03:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
04:03:17:WU03:FS01:0x22:       CPUs: 32
04:03:17:WU03:FS01:0x22:     Memory: 62.73GiB
04:03:17:WU03:FS01:0x22:Free Memory: 1.42GiB
04:03:17:WU03:FS01:0x22:    Threads: POSIX_THREADS
04:03:17:WU03:FS01:0x22: OS Version: 5.5
04:03:17:WU03:FS01:0x22:Has Battery: false
04:03:17:WU03:FS01:0x22: On Battery: false
04:03:17:WU03:FS01:0x22: UTC Offset: -4
04:03:17:WU03:FS01:0x22:        PID: 1540695
04:03:17:WU03:FS01:0x22:        CWD: /var/lib/fahclient/work
04:03:17:WU03:FS01:0x22:         OS: Linux 5.5.10-200.fc31.x86_64 x86_64
04:03:17:WU03:FS01:0x22:    OS Arch: AMD64
04:03:17:WU03:FS01:0x22:********************************************************************************
04:03:17:WU03:FS01:0x22:Project: 11762 (Run 0, Clone 7554, Gen 19)
04:03:17:WU03:FS01:0x22:Unit: 0x0000002080fccb0a5e7113d4b08154af
04:03:17:WU03:FS01:0x22:Reading tar file core.xml
04:03:17:WU03:FS01:0x22:Reading tar file integrator.xml
04:03:17:WU03:FS01:0x22:Reading tar file state.xml
04:03:17:WU03:FS01:0x22:Reading tar file system.xml
04:03:17:WU03:FS01:0x22:Digital signatures verified
04:03:17:WU03:FS01:0x22:Folding@home GPU Core22 Folding@home Core
04:03:17:WU03:FS01:0x22:Version 0.0.2
04:03:17:WU03:FS01:0x22:ERROR:126: Bad platformId size.
04:03:17:WU03:FS01:0x22:Saving result file ../logfile_01.txt
04:03:17:WU03:FS01:0x22:Saving result file science.log
04:03:17:WU03:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
04:03:17:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
04:03:18:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:11762 run:0 clone:7554 gen:19 core:0x22 unit:0x0000002080fccb0a5e7113d4b08154af
04:03:18:WU03:FS01:Uploading 7.50KiB to 128.252.203.10
04:03:18:WU03:FS01:Connecting to 128.252.203.10:8080
04:03:33:WU03:FS01:Upload 100.00%
04:04:35:WU03:FS01:Upload complete
04:04:35:WU03:FS01:Server responded WORK_ACK (400)
04:04:35:WU03:FS01:Cleaning up

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 4:39 am
by Joe_H
Demandzm wrote: I'm seeing the same errors again. I haven't made it through the entire thread about the sortShortList issue yet, but so far I don't think this is related. The projects that are failing have 62180 atoms, and they fail only on slot 2. The other card has work on project 11749 (also 62180 atoms) and is working fine. I'm going to wait until the current WUs are finished, then remove one of the cards and see if I still get errors.

Here is the log if anyone wants to see it.
https://pastebin.com/UM6UvwUy

Going through the beginning of the log in your first post, those WUs were failing with the sortShortList error. If you are seeing a different error in later WUs, could you post extracts from the log here that include just those errors?

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 1:44 pm
by rts
Another failure, Project: 14533 (Run 0, Clone 3484, Gen 11)

Code:

gpu:0:Pitcairn [Radeon HD 7800]

02:00:46:WU00:FS01:0x22:Project: 14533 (Run 0, Clone 3484, Gen 11)
02:00:46:WU00:FS01:0x22:Unit: 0x0000001980fccb025e72f219706e5d21
02:00:46:WU00:FS01:0x22:Reading tar file core.xml
02:00:46:WU00:FS01:0x22:Reading tar file integrator.xml
02:00:46:WU00:FS01:0x22:Reading tar file state.xml
02:00:46:WU00:FS01:0x22:Reading tar file system.xml
02:00:47:WU00:FS01:0x22:Digital signatures verified
02:00:47:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
02:00:47:WU00:FS01:0x22:Version 0.0.2
******************************* Date: 2020-03-30 *******************************
13:38:55:WU00:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
13:38:55:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
13:38:55:WU00:FS01:0x22:Saving result file science.log
13:38:55:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
13:38:56:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
Note the huge delay between the WU starting (02:00:47) and it failing (13:38:55).
rocketraman wrote:Same error here, Linux RX590. I don't see the `sortShortList` error in any logs in `/var/lib/fahclient`.

Code: Select all

04:03:17:WU03:FS01:0x22:ERROR:126: Bad platformId size.
That's a different issue. FWIW I don't think f@h works with opencl-mesa (radeon/amdgpu). You might try the proprietary AMDGPU-PRO drivers.
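Since different errors in this thread all end in the same BAD_WORK_UNIT (114 = 0x72) line, it can help to pair each "FahCore returned" warning with the last ERROR line logged for the same WU and slot. This is only a rough sketch: the regular expressions are my own guess at the line format, based on the excerpts posted here, and classify_failures is a made-up helper name, not part of FAHClient.

```python
import re
from collections import Counter

# Core log lines such as
# "13:23:18:WU00:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: ..."
ERROR_RE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):0x[0-9a-fA-F]+:ERROR:(.*)$")
# Client lines such as
# "13:23:19:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
RETURN_RE = re.compile(r"^\d\d:\d\d:\d\d:WARNING:(WU\d+):(FS\d+):FahCore returned: (\w+)")


def classify_failures(lines):
    """Count (core result, last ERROR message) pairs, one per 'FahCore returned' line."""
    last_error = {}  # (wu, slot) -> most recent ERROR text seen for that unit
    counts = Counter()
    for line in lines:
        line = line.strip()
        m = ERROR_RE.match(line)
        if m:
            last_error[(m.group(1), m.group(2))] = m.group(3).strip()
            continue
        m = RETURN_RE.match(line)
        if m:
            key = (m.group(1), m.group(2))
            # Forget the error once it is paired, so it is not reused for the next unit.
            cause = last_error.pop(key, "<no ERROR line logged>")
            counts[(m.group(3), cause)] += 1
    return counts
```

Feeding it the lines of a client log (for example an open file object) gives one counter entry per distinct cause, which makes it easier to see whether a machine is hitting sortShortList, the platformId error, or something else.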

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 2:27 pm
by omcn777
I also have 2 RX580s, running Kubuntu 18.04.03. I spent several hours trying to get the GPUs to work, ended up messing up my whole install, and started again with a fresh one.

I got FAHClient, FAHControl, and FAHViewer installed, although I had to use aptitude to install python-gnome2 and then wget a .deb of python-support. When following the docs, the --force-depends switch installs the FAHControl deb without its dependencies, but the package does not use the dependencies that are the default on Kubuntu. There is a note in the docs about this new version just working with current dependencies on most distros (not Kubuntu). The Red Hat install instructions mention that a symlink redirect would work for the Python dependencies, but for Debian-based distros it's not clear how to resolve those issues. Anyhow, I got FAHControl to not crash.

Next issue: no GPUs. After more googling and reading docs, I manually edited /etc/fahclient/config.xml and changed the gpu line to true. I restarted the init.d service (systemctl fails to start it because it is already running. Why is this software built for init.d? I know it's old code, but still... can we stop the whining and move on :) ). I then clicked Configure in the FAHControl panel and added my two GPUs with the default -1. They show as "Ready", but the System Info tab shows GPUs: 0, so I am a bit confused.

I am happy to provide log files if you tell me where to dump them from. I did check /var/lib/fahclient/logs.txt and will paste that here. I see a note about OpenCL, but I borked my Kubuntu install last night trying to get amdgpu and/or amdgpu-pro installed; I ended up screwing up the X server and lost my GUI altogether. I am not the most advanced of users, admittedly, but I am not completely hopeless. :D Little help?

I am also getting a lot of "Empty work assignments", which I assume is due to the 20x more end-nodes added in the last week? Please let me know if I am doing something wrong on my end for that as well. Thanks.

Code: Select all

13:49:13:WU01:FS01:Started FahCore on PID 9528
13:49:13:WU01:FS01:Core PID:9532
13:49:13:WU01:FS01:FahCore 0x22 started
13:49:13:WU01:FS01:0x22:*********************** Log Started 2020-03-30T13:49:13Z ***********************
13:49:13:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
13:49:13:WU01:FS01:0x22:       Type: 0x22
13:49:13:WU01:FS01:0x22:       Core: Core22
13:49:13:WU01:FS01:0x22:    Website: https://foldingathome.org/
13:49:13:WU01:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
13:49:13:WU01:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
13:49:13:WU01:FS01:0x22:             <rafal.wiewiora@choderalab.org>
13:49:13:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 704 -lifeline 9528 -checkpoint 15 -gpu
13:49:13:WU01:FS01:0x22:             0 -gpu-vendor ati
13:49:13:WU01:FS01:0x22:     Config: <none>
13:49:13:WU01:FS01:0x22:************************************ Build *************************************
13:49:13:WU01:FS01:0x22:    Version: 0.0.2
13:49:13:WU01:FS01:0x22:       Date: Dec 6 2019
13:49:13:WU01:FS01:0x22:       Time: 21:20:17
13:49:13:WU01:FS01:0x22: Repository: Git
13:49:13:WU01:FS01:0x22:   Revision: f87d92b58abdf7e6bf2e173cfbc4dc3e837c7042
13:49:13:WU01:FS01:0x22:     Branch: core22
13:49:13:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
13:49:13:WU01:FS01:0x22:    Options: -std=gnu++98 -O3 -funroll-loops
13:49:13:WU01:FS01:0x22:   Platform: linux2 4.9.87-linuxkit-aufs
13:49:13:WU01:FS01:0x22:       Bits: 64
13:49:13:WU01:FS01:0x22:       Mode: Release
13:49:13:WU01:FS01:0x22:************************************ System ************************************
13:49:13:WU01:FS01:0x22:        CPU: AMD Ryzen 7 2700 Eight-Core Processor
13:49:13:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
13:49:13:WU01:FS01:0x22:       CPUs: 16
13:49:13:WU01:FS01:0x22:     Memory: 15.56GiB
13:49:13:WU01:FS01:0x22:Free Memory: 6.13GiB
13:49:13:WU01:FS01:0x22:    Threads: POSIX_THREADS
13:49:13:WU01:FS01:0x22: OS Version: 5.3
13:49:13:WU01:FS01:0x22:Has Battery: false
13:49:13:WU01:FS01:0x22: On Battery: false
13:49:13:WU01:FS01:0x22: UTC Offset: -4
13:49:13:WU01:FS01:0x22:        PID: 9532
13:49:13:WU01:FS01:0x22:        CWD: /var/lib/fahclient/work
13:49:13:WU01:FS01:0x22:         OS: Linux 5.3.0-42-generic x86_64
13:49:13:WU01:FS01:0x22:    OS Arch: AMD64
13:49:13:WU01:FS01:0x22:********************************************************************************
13:49:13:WU01:FS01:0x22:Project: 11780 (Run 0, Clone 3697, Gen 13)
13:49:13:WU01:FS01:0x22:Unit: 0x000000140d5a98395e73c557ebe618e7
13:49:13:WU01:FS01:0x22:Reading tar file core.xml
13:49:13:WU01:FS01:0x22:Reading tar file integrator.xml
13:49:13:WU01:FS01:0x22:Reading tar file state.xml
13:49:13:WU01:FS01:0x22:Reading tar file system.xml
13:49:13:WU01:FS01:0x22:Digital signatures verified
13:49:13:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
13:49:13:WU01:FS01:0x22:Version 0.0.2
13:49:13:WU01:FS01:0x22:ERROR:exception: There is no registered Platform called "OpenCL"
13:49:13:WU01:FS01:0x22:Saving result file ../logfile_01.txt
13:49:13:WU01:FS01:0x22:Saving result file science.log
13:49:13:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
13:49:13:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
13:49:13:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11780 run:0 clone:3697 gen:13 core:0x22 unit:0x000000140d5a98395e73c557ebe618e7
13:49:13:WU01:FS01:Uploading 7.00KiB to 13.90.152.57
13:49:13:WU01:FS01:Connecting to 13.90.152.57:8080
13:49:14:WU03:FS01:Connecting to 65.254.110.245:80
13:49:14:WARNING:WU03:FS01:Failed to get assignment from '65.254.110.245:80': Empty work server assignment
13:49:14:WU03:FS01:Connecting to 18.218.241.186:80
Can anyone point me in the correct direction? I wish to commit as many GPU and CPU cycles to science as I can. Thank you.

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 2:40 pm
by favrepeoria
I am seeing the same errors with an RX 570. It seems odd that it stops working now. I have only had the card for about a week and it ran fine until last night.

Code: Select all

13:23:05:WU00:FS01:0x22:Project: 11776 (Run 0, Clone 12294, Gen 5)
13:23:05:WU00:FS01:0x22:Unit: 0x0000000e287234c95e74333c8d969930
13:23:05:WU00:FS01:0x22:Reading tar file core.xml
13:23:05:WU00:FS01:0x22:Reading tar file integrator.xml
13:23:05:WU00:FS01:0x22:Reading tar file state.xml
13:23:05:WU00:FS01:0x22:Reading tar file system.xml
13:23:05:WU00:FS01:0x22:Digital signatures verified
13:23:05:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
13:23:05:WU00:FS01:0x22:Version 0.0.2
13:23:18:WU00:FS01:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
13:23:18:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
13:23:18:WU00:FS01:0x22:Saving result file science.log
13:23:18:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
13:23:19:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
13:23:19:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:11776 run:0 clone:12294 gen:5 core:0x22 unit:0x0000000e287234c95e74333c8d969930

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 6:49 pm
by rts
omcn777 wrote:I also have 2 RX580s. I am running kubuntu 18.04.03 [...]

Code: Select all

13:49:13:WU01:FS01:0x22:ERROR:exception: There is no registered Platform called "OpenCL"
"There is no registered Platform called "OpenCL"" is a very different problem than the one discussed here ("Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)"). Your problem is most likely an incorrect OpenCL driver install. Usually there is one package for the runtime (e.g. opencl-mesa, opencl-amd, opencl-nvidia) and another for the ICD (ocl-icd), and you may also need a package for the driver itself. You can use a tool like clinfo to check if OpenCL is set up correctly.
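If you want to script the clinfo check, something like the following might do. It is a sketch under assumptions: it relies on clinfo's human-readable output containing a "Number of platforms" line (which is what the common clinfo tool prints, as far as I know), and count_platforms and check_opencl are names I made up for illustration.

```python
import re
import shutil
import subprocess


def count_platforms(clinfo_output):
    """Return the platform count reported by clinfo, or None if the line is missing."""
    m = re.search(r"^Number of platforms\s+(\d+)", clinfo_output, re.MULTILINE)
    return int(m.group(1)) if m else None


def check_opencl():
    """Run clinfo (if it is installed) and summarize what it reports."""
    if shutil.which("clinfo") is None:
        return "clinfo is not installed"
    result = subprocess.run(["clinfo"], capture_output=True, text=True)
    n = count_platforms(result.stdout)
    if not n:  # covers both "0 platforms" and "line not found"
        return "no OpenCL platform found; check the runtime and ICD packages"
    return f"{n} OpenCL platform(s) found"


if __name__ == "__main__":
    print(check_opencl())
```

If this reports no platform, the core has nothing to register as "OpenCL", which would be consistent with the error in the quoted log.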

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 6:52 pm
by rts
Some more PRCGs with this error

Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
Project: 11764 (Run 0, Clone 3514, Gen 16)
...

... yep, the machine got the same PRCG assigned some nine or ten times in a row.
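For anyone who wants to check their own log for repeats like this, here is a small sketch that counts how often each PRCG shows up in the core's "Project: ..." lines. The pattern is inferred from the log lines in this thread and count_prcg is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches core lines such as "0x22:Project: 11764 (Run 0, Clone 3514, Gen 16)"
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def count_prcg(lines):
    """Count how many times each (project, run, clone, gen) was started."""
    counts = Counter()
    for line in lines:
        m = PRCG_RE.search(line)
        if m:
            counts[tuple(int(g) for g in m.groups())] += 1
    return counts
```

Anything with a count above 1 was handed to the same machine more than once; counts.most_common() lists the worst offenders first.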

Only one of these produced a different error:

Code: Select all

14:03:51:WU00:FS01:0x22:Project: 11764 (Run 0, Clone 3514, Gen 16)
14:03:51:WU00:FS01:0x22:Unit: 0x0000001f80fccb0a5e6d869ef2b6d474
14:03:51:WU00:FS01:0x22:Reading tar file core.xml
14:03:51:WU00:FS01:0x22:Reading tar file integrator.xml
14:03:51:WU00:FS01:0x22:Reading tar file state.xml
14:03:52:WU00:FS01:0x22:Reading tar file system.xml
14:03:54:WU00:FS01:0x22:Digital signatures verified
14:03:54:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
14:03:54:WU00:FS01:0x22:Version 0.0.2
14:05:32:WARNING:WU00:FS01:FahCore crashed with Windows unhandled exception code 0x40010004, searching for this code online may provide more information
14:05:32:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (1073807364 = 0x40010004)
Edit: That last one is a false positive. This error code happens when Windows terminates a process due to the machine restarting.
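As an aside, the two numbers in the parentheses of a "FahCore returned" line are the same value written in decimal and in hex (114 = 0x72, 1073807364 = 0x40010004). A minimal sketch for pulling the name and code out of such a line, assuming the format seen in the logs above (parse_return is a made-up helper name):

```python
import re

# Matches e.g. "FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
RETURN_RE = re.compile(r"FahCore returned: (\w+) \((\d+) = (0x[0-9a-fA-F]+)\)")


def parse_return(line):
    """Return (name, code) from a 'FahCore returned' line, or None if it does not match."""
    m = RETURN_RE.search(line)
    if not m:
        return None
    name, dec, hx = m.group(1), int(m.group(2)), int(m.group(3), 16)
    if dec != hx:
        # The client prints one value twice, so a mismatch means a mangled line.
        raise ValueError(f"decimal {dec} and hex {hx:#x} disagree")
    return name, dec
```

This makes it easy to filter out codes like 0x40010004 that have nothing to do with the WU itself.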

Re: Bad work units/folding slot failed RX580

Posted: Mon Mar 30, 2020 8:08 pm
by Demandzm
Joe_H wrote:Going through the beginning of the log in your first post, those WUs were failing with the sortShortList error. If you are seeing a different error in later WUs, could you post extracts from the log here that include just those errors?
Sure. I'll do what I can to get this figured out.