
bad work units

Posted: Sat Aug 22, 2020 3:50 pm
by Forcinghavok
Bad work units are crashing my system, and I'm also seeing other errors, including error 10053: An established connection was aborted. Let's see if we can figure this out. Thank you.

Code:

*********************** Log Started 2020-08-22T15:38:03Z ***********************
15:38:03:Trying to access database...
15:38:03:Successfully acquired database lock
15:38:03:Read GPUs.txt
15:38:03:Enabled folding slot 00: READY cpu:4
15:38:04:Enabled folding slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590]
15:38:05:****************************** FAHClient ******************************
15:38:05:        Version: 7.6.13
15:38:05:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:38:05:      Copyright: 2020 foldingathome.org
15:38:05:       Homepage: https://foldingathome.org/
15:38:05:           Date: Apr 27 2020
15:38:05:           Time: 21:21:01
15:38:05:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
15:38:05:         Branch: master
15:38:05:       Compiler: Visual C++ 2008
15:38:05:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:38:05:       Platform: win32 10
15:38:05:           Bits: 32
15:38:05:           Mode: Release
15:38:05:         Config: C:\Users\erikb\AppData\Roaming\FAHClient\config.xml
15:38:05:******************************** CBang ********************************
15:38:05:           Date: Apr 24 2020
15:38:05:           Time: 17:07:55
15:38:05:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
15:38:05:         Branch: master
15:38:05:       Compiler: Visual C++ 2008
15:38:05:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:38:05:       Platform: win32 10
15:38:05:           Bits: 32
15:38:05:           Mode: Release
15:38:05:******************************* System ********************************
15:38:05:            CPU: AMD FX(tm)-8350 Eight-Core Processor
15:38:05:         CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
15:38:05:           CPUs: 8
15:38:05:         Memory: 15.90GiB
15:38:05:    Free Memory: 13.22GiB
15:38:05:        Threads: WINDOWS_THREADS
15:38:05:     OS Version: 6.2
15:38:05:    Has Battery: false
15:38:05:     On Battery: false
15:38:05:     UTC Offset: -4
15:38:05:            PID: 8952
15:38:05:            CWD: C:\Users\erikb\AppData\Roaming\FAHClient
15:38:05:  Win32 Service: false
15:38:05:             OS: Windows 10 Home
15:38:05:        OS Arch: AMD64
15:38:05:           GPUs: 1
15:38:05:          GPU 0: Bus:1 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
15:38:05:                 470/480/570/580/590]
15:38:05:           CUDA: Not detected: Failed to open dynamic library 'nvcuda.dll': The
15:38:05:                 specified module could not be found.
15:38:05:
15:38:05:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:3110.7
15:38:05:******************************* libFAH ********************************
15:38:05:           Date: Apr 15 2020
15:38:05:           Time: 14:53:14
15:38:05:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
15:38:05:         Branch: master
15:38:05:       Compiler: Visual C++ 2008
15:38:05:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:38:05:       Platform: win32 10
15:38:05:           Bits: 32
15:38:05:           Mode: Release
15:38:05:***********************************************************************
15:38:05:<config>
15:38:05:  <!-- Folding Core -->
15:38:05:  <checkpoint v='3'/>
15:38:05:  <core-priority v='low'/>
15:38:05:
15:38:05:  <!-- Network -->
15:38:05:  <proxy v=':8080'/>
15:38:05:
15:38:05:  <!-- Slot Control -->
15:38:05:  <pause-on-battery v='false'/>
15:38:05:  <power v='full'/>
15:38:05:
15:38:05:  <!-- User Information -->
15:38:05:  <team v='2092'/>
15:38:05:  <user v='ForcingHavok'/>
15:38:05:
15:38:05:  <!-- Folding Slots -->
15:38:05:  <slot id='0' type='CPU'>
15:38:05:    <cpus v='4'/>
15:38:05:  </slot>
15:38:05:  <slot id='1' type='GPU'/>
15:38:05:</config>
15:38:05:WU00:FS00:Starting
15:38:05:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -np 4
15:38:05:WU00:FS00:Started FahCore on PID 7228
15:38:06:WU00:FS00:Core PID:2504
15:38:06:WU00:FS00:FahCore 0xa7 started
15:38:07:WU02:FS01:Starting
15:38:07:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.11/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
15:38:07:WU02:FS01:Started FahCore on PID 8456
15:38:07:WU02:FS01:Core PID:2676
15:38:07:WU02:FS01:FahCore 0x22 started
15:38:08:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:38:08:WU00:FS00:0xa7:*********************** Log Started 2020-08-22T15:38:08Z ***********************
15:38:08:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
15:38:08:WU00:FS00:0xa7:       Type: 0xa7
15:38:08:WU00:FS00:0xa7:       Core: Gromacs
15:38:08:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 7228 -checkpoint 3 -np 4
15:38:08:WU00:FS00:0xa7:************************************ CBang *************************************
15:38:08:WU00:FS00:0xa7:       Date: Nov 27 2019
15:38:08:WU00:FS00:0xa7:       Time: 03:40:09
15:38:08:WU00:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
15:38:08:WU00:FS00:0xa7:     Branch: master
15:38:08:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
15:38:08:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:38:08:WU00:FS00:0xa7:   Platform: win32 10
15:38:08:WU00:FS00:0xa7:       Bits: 64
15:38:08:WU00:FS00:0xa7:       Mode: Release
15:38:08:WU00:FS00:0xa7:************************************ System ************************************
15:38:08:WU00:FS00:0xa7:        CPU: AMD FX(tm)-8350 Eight-Core Processor
15:38:08:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
15:38:08:WU00:FS00:0xa7:       CPUs: 8
15:38:08:WU00:FS00:0xa7:     Memory: 15.90GiB
15:38:08:WU00:FS00:0xa7:Free Memory: 13.17GiB
15:38:08:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
15:38:08:WU00:FS00:0xa7: OS Version: 6.2
15:38:08:WU00:FS00:0xa7:Has Battery: false
15:38:08:WU00:FS00:0xa7: On Battery: false
15:38:08:WU00:FS00:0xa7: UTC Offset: -4
15:38:08:WU00:FS00:0xa7:        PID: 2504
15:38:08:WU00:FS00:0xa7:        CWD: C:\Users\erikb\AppData\Roaming\FAHClient\work
15:38:08:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
15:38:08:WU00:FS00:0xa7:    Version: 0.0.19
15:38:08:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:38:08:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
15:38:08:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
15:38:08:WU00:FS00:0xa7:       Date: Nov 25 2019
15:38:08:WU00:FS00:0xa7:       Time: 17:12:41
15:38:08:WU00:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
15:38:08:WU00:FS00:0xa7:     Branch: master
15:38:08:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
15:38:08:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:38:08:WU00:FS00:0xa7:   Platform: win32 10
15:38:08:WU00:FS00:0xa7:       Bits: 64
15:38:08:WU00:FS00:0xa7:       Mode: Release
15:38:08:WU00:FS00:0xa7:************************************ Build *************************************
15:38:08:WU00:FS00:0xa7:       SIMD: avx_256
15:38:08:WU00:FS00:0xa7:********************************************************************************
15:38:08:WU00:FS00:0xa7:Project: 13828 (Run 672, Clone 5, Gen 121)
15:38:08:WU00:FS00:0xa7:Unit: 0x0000008c80fccb095e73ac58e6026ab3
15:38:08:WU00:FS00:0xa7:Digital signatures verified
15:38:08:WU02:FS01:Starting
15:38:08:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.11/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
15:38:08:WU02:FS01:Started FahCore on PID 4388
15:38:08:WU02:FS01:Core PID:8828
15:38:08:WU02:FS01:FahCore 0x22 started
15:38:08:WU00:FS00:0xa7:Calling: mdrun -s frame121.tpr -o frame121.trr -x frame121.xtc -cpi state.cpt -cpt 3 -nt 4
15:38:09:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:38:16:WU00:FS00:0xa7:Steps: first=15125000 total=125000
15:38:19:FS00:Shutting core down
15:38:20:WU00:FS00:0xa7:WARNING:Console control signal 1 on PID 2504
15:38:20:WU00:FS00:0xa7:Exiting, please wait. . .
15:38:23:WU00:FS00:0xa7:Completed 104441 out of 125000 steps (83%)
15:38:31:WU00:FS00:0xa7:Folding@home Core Shutdown: INTERRUPTED
15:38:31:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
15:39:04:Removing old file 'configs/config-20200819-000925.xml'
15:39:04:Saving configuration to config.xml
15:39:04:<config>
15:39:04:  <!-- Folding Core -->
15:39:04:  <checkpoint v='3'/>
15:39:04:  <core-priority v='low'/>
15:39:04:
15:39:04:  <!-- Network -->
15:39:04:  <proxy v=':8080'/>
15:39:04:
15:39:04:  <!-- Slot Control -->
15:39:04:  <pause-on-battery v='false'/>
15:39:04:  <power v='full'/>
15:39:04:
15:39:04:  <!-- User Information -->
15:39:04:  <team v='2092'/>
15:39:04:  <user v='ForcingHavok'/>
15:39:04:
15:39:04:  <!-- Folding Slots -->
15:39:04:  <slot id='0' type='CPU'>
15:39:04:    <cpus v='4'/>
15:39:04:    <idle v='true'/>
15:39:04:  </slot>
15:39:04:  <slot id='1' type='GPU'>
15:39:04:    <idle v='true'/>
15:39:04:  </slot>
15:39:04:</config>
15:44:03:WU00:FS00:Starting
15:44:03:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -np 4
15:44:03:WU00:FS00:Started FahCore on PID 2904
15:44:03:WU00:FS00:Core PID:7312
15:44:03:WU00:FS00:FahCore 0xa7 started
15:44:03:WU02:FS01:Starting
15:44:03:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.11/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
15:44:03:WU02:FS01:Started FahCore on PID 8536
15:44:03:WU00:FS00:0xa7:*********************** Log Started 2020-08-22T15:44:03Z ***********************
15:44:03:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
15:44:03:WU00:FS00:0xa7:       Type: 0xa7
15:44:03:WU00:FS00:0xa7:       Core: Gromacs
15:44:03:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 2904 -checkpoint 3 -np 4
15:44:03:WU02:FS01:Core PID:8980
15:44:03:WU00:FS00:0xa7:************************************ CBang *************************************
15:44:03:WU02:FS01:FahCore 0x22 started
15:44:03:WU00:FS00:0xa7:       Date: Nov 27 2019
15:44:03:WU00:FS00:0xa7:       Time: 03:40:09
15:44:03:WU00:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
15:44:03:WU00:FS00:0xa7:     Branch: master
15:44:03:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
15:44:03:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:44:03:WU00:FS00:0xa7:   Platform: win32 10
15:44:03:WU00:FS00:0xa7:       Bits: 64
15:44:03:WU00:FS00:0xa7:       Mode: Release
15:44:03:WU00:FS00:0xa7:************************************ System ************************************
15:44:03:WU00:FS00:0xa7:        CPU: AMD FX(tm)-8350 Eight-Core Processor
15:44:03:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
15:44:03:WU00:FS00:0xa7:       CPUs: 8
15:44:03:WU00:FS00:0xa7:     Memory: 15.90GiB
15:44:03:WU00:FS00:0xa7:Free Memory: 13.48GiB
15:44:03:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
15:44:03:WU00:FS00:0xa7: OS Version: 6.2
15:44:03:WU00:FS00:0xa7:Has Battery: false
15:44:03:WU00:FS00:0xa7: On Battery: false
15:44:03:WU00:FS00:0xa7: UTC Offset: -4
15:44:03:WU00:FS00:0xa7:        PID: 7312
15:44:03:WU00:FS00:0xa7:        CWD: C:\Users\erikb\AppData\Roaming\FAHClient\work
15:44:03:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
15:44:03:WU00:FS00:0xa7:    Version: 0.0.19
15:44:03:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:44:03:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
15:44:03:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
15:44:03:WU00:FS00:0xa7:       Date: Nov 25 2019
15:44:03:WU00:FS00:0xa7:       Time: 17:12:41
15:44:03:WU00:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
15:44:03:WU00:FS00:0xa7:     Branch: master
15:44:03:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
15:44:03:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:44:03:WU00:FS00:0xa7:   Platform: win32 10
15:44:03:WU00:FS00:0xa7:       Bits: 64
15:44:03:WU00:FS00:0xa7:       Mode: Release
15:44:03:WU00:FS00:0xa7:************************************ Build *************************************
15:44:03:WU00:FS00:0xa7:       SIMD: avx_256
15:44:03:WU00:FS00:0xa7:********************************************************************************
15:44:03:WU00:FS00:0xa7:Project: 13828 (Run 672, Clone 5, Gen 121)
15:44:03:WU00:FS00:0xa7:Unit: 0x0000008c80fccb095e73ac58e6026ab3
15:44:03:WU00:FS00:0xa7:Digital signatures verified
15:44:03:WU00:FS00:0xa7:Calling: mdrun -s frame121.tpr -o frame121.trr -x frame121.xtc -cpi state.cpt -cpt 3 -nt 4
15:44:04:WU00:FS00:0xa7:Steps: first=15125000 total=125000
15:44:07:WU00:FS00:0xa7:Completed 104452 out of 125000 steps (83%)
15:44:21:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:44:21:Removing old file 'configs/config-20200819-035204.xml'
15:44:21:Saving configuration to config.xml
15:44:21:<config>
15:44:21:  <!-- Folding Core -->
15:44:21:  <checkpoint v='3'/>
15:44:21:  <core-priority v='low'/>
15:44:21:
15:44:21:  <!-- Network -->
15:44:21:  <proxy v=':8080'/>
15:44:21:
15:44:21:  <!-- Slot Control -->
15:44:21:  <pause-on-battery v='false'/>
15:44:21:  <power v='full'/>
15:44:21:
15:44:21:  <!-- User Information -->
15:44:21:  <team v='2092'/>
15:44:21:  <user v='ForcingHavok'/>
15:44:21:
15:44:21:  <!-- Folding Slots -->
15:44:21:  <slot id='0' type='CPU'>
15:44:21:    <cpus v='4'/>
15:44:21:  </slot>
15:44:21:  <slot id='1' type='GPU'/>
15:44:21:</config>
15:44:23:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
15:45:03:WU02:FS01:Starting
15:45:03:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\erikb\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.11/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 8952 -checkpoint 3 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
15:45:03:WU02:FS01:Started FahCore on PID 9060
15:45:03:WU02:FS01:Core PID:7896
15:45:03:WU02:FS01:FahCore 0x22 started
15:45:04:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)

Re: bad work units

Posted: Sat Aug 22, 2020 6:05 pm
by JimboPalmer
15:44:23:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
is usually the web client or viewer closing and is harmless.

15:38:05:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:3110.7
You appear to have a recent AMD driver, with OpenCL installed correctly.

Re: bad work units

Posted: Sat Aug 22, 2020 9:39 pm
by Forcinghavok
What about the bad work units?

Re: bad work units

Posted: Sat Aug 22, 2020 10:02 pm
by PantherX
Looking at your configuration, I see that you're missing a passkey. It is recommended for security reasons and bonus points: https://foldingathome.org/support/faq/points/passkey/

I have noticed that the checkpoint is set to 3 minutes (<checkpoint v='3'/>) which I think is a bit excessive. GPU WUs do not use that value for checkpoints. Instead, it is only used by the CPU slot. The default is 15 minutes but if you fold 24/7 on that system, you may want to consider 30 minutes if you have a reliable power supply to the system.

I would generally recommend sticking with the default, which stops folding when the system is on battery, since folding is very intensive and can drain the battery very quickly (your config has <pause-on-battery v='false'/>).
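If it helps, the three config points above can be checked with a small script. This is only a rough sketch, not an official FAH tool: the function name `review_config` is made up for illustration, the `passkey` element name is my assumption about how the client stores it, and the sample text is the config from the posted log with the comments left out.

```python
import xml.etree.ElementTree as ET


def review_config(xml_text):
    """Return a list of notes about the settings discussed in this thread."""
    root = ET.fromstring(xml_text)
    notes = []

    def value(tag):
        # The posted config stores each option as <tag v='...'/>.
        node = root.find(tag)
        return None if node is None else node.get("v")

    # Assumption: the passkey is stored as <passkey v='...'/>.
    if not value("passkey"):
        notes.append("no passkey set")

    # 15 minutes is the default mentioned above.
    checkpoint = value("checkpoint")
    if checkpoint is not None and int(checkpoint) < 15:
        notes.append(f"checkpoint is {checkpoint} min (default is 15)")

    if value("pause-on-battery") == "false":
        notes.append("pause-on-battery is disabled")

    return notes


# Config as shown in the posted log (comments omitted).
SAMPLE = """<config>
  <checkpoint v='3'/>
  <core-priority v='low'/>
  <proxy v=':8080'/>
  <pause-on-battery v='false'/>
  <power v='full'/>
  <team v='2092'/>
  <user v='ForcingHavok'/>
  <slot id='0' type='CPU'><cpus v='4'/></slot>
  <slot id='1' type='GPU'/>
</config>"""

if __name__ == "__main__":
    for note in review_config(SAMPLE):
        print(note)
```

None of these settings would explain the BAD_WORK_UNIT results by themselves; they are just housekeeping.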

Regarding the GPU WU, it's interesting that the output of FahCore_22 isn't being displayed. That's an unusual behavior and I would suggest that you find the science.log file which should be here: C:\Users\erikb\AppData\Roaming\FAHClient\work\02\01 and paste it so we can see what's happening. If you can also make a copy of the work folder for archival reasons, it would be great for further investigation :)
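For the archive copy, something along these lines should work. It is a minimal sketch: the helper names `archive_work_unit` and `read_science_log` and the `fah-archive` destination are made up for illustration, and the folder layout (work\02 for the work unit, 01 for the suffix) is taken from the path above. It is probably safest to pause the slot before copying.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_work_unit(work_dir, archive_root):
    """Copy a work-unit folder to a timestamped archive and return the copy's path."""
    work_dir = Path(work_dir)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(archive_root) / f"{work_dir.name}-{stamp}"
    shutil.copytree(work_dir, target)  # creates missing parent folders as well
    return target


def read_science_log(work_dir, suffix="01"):
    """Return the text of <work_dir>/<suffix>/science.log, or None if it is missing."""
    log_path = Path(work_dir) / suffix / "science.log"
    if not log_path.is_file():
        return None
    return log_path.read_text(errors="replace")


if __name__ == "__main__":
    # Work-unit folder from this thread; the archive location is hypothetical.
    wu_dir = Path(r"C:\Users\erikb\AppData\Roaming\FAHClient\work\02")
    if wu_dir.is_dir():
        print("Archived to", archive_work_unit(wu_dir, Path.home() / "fah-archive"))
    text = read_science_log(wu_dir)
    print(text if text is not None else "science.log not found")
```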

Re: bad work units

Posted: Sun Aug 23, 2020 7:11 am
by bruce
PantherX wrote:I would generally recommend sticking with the default, which stops folding when the system is on battery, since folding is very intensive and can drain the battery very quickly (your config has <pause-on-battery v='false'/>).
As a general recommendation, I support that, but I run with pause-on-battery set to false myself. I sometimes have to move my laptop from one room to another, and that's enough to unnecessarily restart from the previous checkpoint. :evil: If I'm going to be on battery for more than a few minutes, I deal with it manually. YMMV

Re: bad work units

Posted: Tue Nov 17, 2020 11:54 pm
by Forcinghavok
15:44:23:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.

Hello, I posted about this but never really seemed to get an answer for it. Is this an issue? If not, why does my system keep saying something about the connection being aborted?

Thanks.

Re: bad work units

Posted: Wed Nov 18, 2020 2:58 am
by bruce
Forcinghavok wrote:15:44:23:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.

Hello, I posted about this but never really seemed to get an answer for it. Is this an issue? If not, why does my system keep saying something about the connection being aborted?

Thanks.
There are several possible causes, but 99% of them can be ignored.

The most common event is when you close FAHControl or WebControl using the close-window [X] rather than selecting the Exit button in the screen's interior. It seems that the programmer reasoned that (s)he gave you the functionality to exit the app without closing the window, and decided to warn you that, in their opinion, you should use the feature that has been provided. (Well, maybe not ... that's just my sarcasm.)

Everybody knows that if they close a window, the APP that's running in that window should silently exit without a warning message ... everybody except FAH.
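
If you want to see at a glance which log messages are the ignorable 10053 ones and which are actual core failures, a quick tally like the one below may help. It is only a sketch: the function name `triage` is made up, and the two patterns are based solely on the line formats visible in the log posted in this thread, so other client versions may print them differently.

```python
import re
from collections import Counter

# Patterns modelled on lines from the posted log, e.g.
#   15:38:08:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
#   15:44:23:ERROR:Receive error: 10053: An established connection was aborted ...
CORE_RETURN = re.compile(
    r"(WU\d+):(FS\d+):FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)
RECEIVE_ERROR = re.compile(r"ERROR:Receive error: (\d+):")


def triage(lines):
    """Count FahCore return statuses per slot and receive errors per code."""
    core_results = Counter()
    receive_errors = Counter()
    for line in lines:
        match = CORE_RETURN.search(line)
        if match:
            _wu, slot, status, _code = match.groups()
            core_results[(slot, status)] += 1
            continue
        match = RECEIVE_ERROR.search(line)
        if match:
            receive_errors[int(match.group(1))] += 1
    return core_results, receive_errors


if __name__ == "__main__":
    # Assumes log.txt sits in the current folder; adjust the path as needed.
    with open("log.txt", errors="replace") as handle:
        cores, receives = triage(handle)
    for (slot, status), count in sorted(cores.items()):
        print(f"{slot}: {status} x{count}")
    for code, count in sorted(receives.items()):
        print(f"Receive error {code} x{count}")
```

In the log from the first post, the repeated BAD_WORK_UNIT results on FS01 are the entries worth following up; the single 10053 line is the kind described above.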