WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)

Posted: Sun Mar 15, 2020 5:48 am
by jonfen801
How should I troubleshoot this? Each time I restart the process (Ubuntu 18.04), a different set of GPUs (or sometimes only one) keeps throwing this error.

Code:

05:40:40:WU04:FS01:0x22:Completed 20000 out of 1000000 steps (2%)
05:40:51:WU00:FS00:0xa7:Completed 90000 out of 125000 steps (72%)
05:40:56:WU05:FS04:Starting
05:40:56:WU05:FS04:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 05 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 3 -gpu 3
05:40:56:WU05:FS04:Started FahCore on PID 7478
05:40:56:WU05:FS04:Core PID:7482
05:40:56:WU05:FS04:FahCore 0x21 started
05:40:56:WU03:FS03:Starting
05:40:56:WU03:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
05:40:56:WU03:FS03:Started FahCore on PID 7485
05:40:56:WU03:FS03:Core PID:7489
05:40:56:WU03:FS03:FahCore 0x21 started
05:41:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
05:41:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)
05:41:25:7:192.168.0.5:New Web connection
05:41:56:WU05:FS04:Starting
05:41:56:WU05:FS04:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 05 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 3 -gpu 3
05:41:56:WU05:FS04:Started FahCore on PID 7492
05:41:56:WU05:FS04:Core PID:7496
05:41:56:WU05:FS04:FahCore 0x21 started
05:41:56:WU03:FS03:Starting
05:41:56:WU03:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
05:41:56:WU03:FS03:Started FahCore on PID 7499
05:41:56:WU03:FS03:Core PID:7503
05:41:56:WU03:FS03:FahCore 0x21 started
05:42:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
05:42:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)
05:42:31:WU04:FS01:0x22:Completed 30000 out of 1000000 steps (3%)
05:42:56:WU05:FS04:Starting
05:42:56:WU05:FS04:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 05 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 3 -gpu 3
05:42:56:WU05:FS04:Started FahCore on PID 7506
05:42:56:WU05:FS04:Core PID:7510
05:42:56:WU05:FS04:FahCore 0x21 started
05:42:56:WU03:FS03:Starting
05:42:56:WU03:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
05:42:56:WU03:FS03:Started FahCore on PID 7513
05:42:56:WU03:FS03:Core PID:7517
05:42:56:WU03:FS03:FahCore 0x21 started
05:43:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
05:43:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)
05:43:56:WU05:FS04:Starting
05:43:56:WU05:FS04:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 05 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 3 -gpu 3
05:43:56:WU05:FS04:Started FahCore on PID 7521
05:43:56:WU05:FS04:Core PID:7525
05:43:56:WU05:FS04:FahCore 0x21 started
05:43:56:WU03:FS03:Starting
05:43:56:WU03:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
05:43:56:WU03:FS03:Started FahCore on PID 7528
05:43:56:WU03:FS03:Core PID:7532
05:43:56:WU03:FS03:FahCore 0x21 started
05:43:58:WU01:FS02:0x21:Completed 250000 out of 25000000 steps (1%)
05:44:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
05:44:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)
05:44:23:WU04:FS01:0x22:Completed 40000 out of 1000000 steps (4%)
05:44:56:WU05:FS04:Starting
05:44:56:WU05:FS04:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 05 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 3 -gpu 3
05:44:56:WU05:FS04:Started FahCore on PID 7535
05:44:56:WU05:FS04:Core PID:7539
05:44:56:WU05:FS04:FahCore 0x21 started
05:44:56:WU03:FS03:Starting
05:44:56:WU03:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 705 -lifeline 7258 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
05:44:56:WU03:FS03:Started FahCore on PID 7542
05:44:56:WU03:FS03:Core PID:7546
05:44:56:WU03:FS03:FahCore 0x21 started
05:45:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
05:45:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)
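To see at a glance which slots are affected, the restart warnings in a log like the one above can be tallied per work unit and folding slot. The following is only a minimal sketch: it assumes the line format shown in this log, and the names `RESTART_RE` and `count_core_restarts` are made up for illustration, not part of any Folding@home tool.

```python
import re
from collections import Counter

# Matches warning lines in the format seen in the log above, e.g.
# 05:41:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)
# (assumption: other client versions may format these lines differently)
RESTART_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WARNING:(WU\d+):(FS\d+):FahCore returned: CORE_RESTART"
)


def count_core_restarts(lines):
    """Return a Counter mapping (work unit, folding slot) to restart count."""
    counts = Counter()
    for line in lines:
        match = RESTART_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "05:41:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)",
    "05:41:11:WARNING:WU03:FS03:FahCore returned: CORE_RESTART (98 = 0x62)",
    "05:42:11:WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0x62)",
    "05:42:31:WU04:FS01:0x22:Completed 30000 out of 1000000 steps (3%)",
]

for (wu, slot), n in sorted(count_core_restarts(sample).items()):
    print(f"{wu} {slot}: {n} restart(s)")
```

To run it against a full log, pass an open file instead of `sample`, for example `count_core_restarts(open("log.txt"))`, where `log.txt` stands for wherever the client writes its log on your machine.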

Re: WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0

Posted: Sun Mar 15, 2020 6:03 am
by bruce
What process are you restarting and why?

The FAHClient process should run continuously as a daemon, whether you're folding or not. As a background task, it downloads new assignments (WUs) and starts other processes such as FahCore_*, which perform the computations to process each WU to completion. Result files are then passed back to FAHClient for uploading. FAHClient should be started automatically at boot time.

You should manage all these processes using WebControl or FAHControl, which allow you to monitor what is happening in the background, interactively pause active WUs, and reconfigure other settings.

Re: WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0

Posted: Sun Mar 15, 2020 12:53 pm
by toTOW
I don't know what you're doing, but this is the first time I've seen such messages ...

Re: WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0

Posted: Sun Mar 15, 2020 1:11 pm
by foldy
@jonfen801: How many GPUs do you have? Can you post the first part of the FAH log.txt?

Re: WARNING:WU05:FS04:FahCore returned: CORE_RESTART (98 = 0

Posted: Sun Mar 15, 2020 11:02 pm
by jonfen801
@bruce: The daemon doesn't work on 18.04; it just fails. It gives no information, just "FAIL" with an exit code of 1.

@foldy: This is the config.xml file I am using:
viewtopic.php?f=16&t=32427&e=0

As a workaround I have been running the following: $ cd /var/lib/fahclient && sudo FAHClient

That works, but the same config in /etc/fahclient does not.