13404 (378, 16, 0) failed at 49%

Post by STFC9F22

Hi,

Project 13404 (Run 378, Clone 16, Gen 0) failed this morning at 49% with the error 'Following exception occured: Particle coordinate is nan'.

I reported a similar issue yesterday for 13405 (156, 45, 0) (see viewtopic.php?f=19&t=35144). At that time the WU Status page showed it as the third failure of that WU that day, although I now see that a fourth donor has since been reported as folding it successfully.

The following is an extract from today's log of the GPU slot at the time of failure:

09:06:51:WU00:FS01:0x22:Completed 470000 out of 1000000 steps (47%)
09:09:36:WU00:FS01:0x22:Completed 480000 out of 1000000 steps (48%)
09:12:23:WU00:FS01:0x22:Completed 490000 out of 1000000 steps (49%)
09:14:14:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:14:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:29:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:29:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:44:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:44:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:44:WU00:FS01:0x22:ERROR:114: Max Retries Reached
09:14:44:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
09:14:44:WU00:FS01:0x22:Saving result file badstate-0.xml
09:14:44:WU00:FS01:0x22:Saving result file badstate-1.xml
09:14:44:WU00:FS01:0x22:Saving result file badstate-2.xml
09:14:44:WU00:FS01:0x22:Saving result file checkpointState.xml
09:14:45:WU00:FS01:0x22:Saving result file checkpt.crc
09:14:45:WU00:FS01:0x22:Saving result file globals.csv
09:14:45:WU00:FS01:0x22:Saving result file positions.xtc
09:14:45:WU00:FS01:0x22:Saving result file science.log
09:14:45:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
09:14:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
09:14:46:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13404 run:378 clone:16 gen:0 core:0x22 unit:0x0000000112bc7d9a5eb37aa6e4bad523
09:14:46:WU00:FS01:Uploading 4.81MiB to 18.188.125.154
09:14:46:WU00:FS01:Connecting to 18.188.125.154:8080
09:14:47:WU02:FS01:Connecting to assign1.foldingathome.org:80
09:14:47:WARNING:WU02:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
09:14:47:WU02:FS01:Connecting to assign2.foldingathome.org:80
09:14:48:WARNING:WU02:FS01:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
09:14:48:WU02:FS01:Connecting to assign3.foldingathome.org:80
09:14:48:WARNING:WU02:FS01:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
09:14:48:WU02:FS01:Connecting to assign4.foldingathome.org:80
09:14:49:WU02:FS01:Assigned to work server 128.252.203.10
09:14:49:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070] 6463 from 128.252.203.10
09:14:49:WU02:FS01:Connecting to 128.252.203.10:8080
09:14:52:WU00:FS01:Upload 14.29%
09:14:58:WU00:FS01:Upload 28.58%
09:15:04:WU00:FS01:Upload 41.57%
09:15:10:WU00:FS01:Upload 55.86%
09:15:10:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
09:15:10:WU02:FS01:Connecting to 128.252.203.10:80
09:15:16:WU00:FS01:Upload 70.15%
09:15:22:WU00:FS01:Upload 83.14%
09:15:28:WU00:FS01:Upload 97.43%
09:15:30:WU00:FS01:Upload complete
09:15:30:WU00:FS01:Server responded WORK_ACK (400)
09:15:30:WU00:FS01:Cleaning up

My GPU is not overclocked and its temperature is consistently reported below 50 °C. The Work Unit Status page for 13404 (378, 16, 0) currently shows it as having failed for a second donor.
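
In case it helps anyone comparing logs from several donors, here is a rough Python sketch that pulls the failure details out of log lines like those in the extract above. It is only an illustration: the line pattern is inferred from the lines in this post, and the function name `summarize_failures` and the keys of the dictionary it returns are my own invention, not part of FAHClient.

```python
import re

# Line layout inferred from the extract in this post (an assumption):
# HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:[0xNN:]message
LINE_RE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d):"
    r"(?:(?P<level>WARNING|ERROR):)?"
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):"
    r"(?:(?P<core>0x[0-9a-fA-F]+):)?"
    r"(?P<msg>.*)$"
)
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps \((\d+)%\)")


def summarize_failures(lines):
    """Return {(wu, slot): summary} for work units that reported a bad state."""
    summary = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # client-level lines without a WU/FS prefix are skipped
        entry = summary.setdefault(
            (m["wu"], m["slot"]),
            {"last_percent": None, "bad_states": 0, "exceptions": [], "result": None},
        )
        msg = m["msg"]
        progress = PROGRESS_RE.search(msg)
        if progress:
            entry["last_percent"] = int(progress.group(3))
        elif msg.startswith("Bad State detected"):
            entry["bad_states"] += 1
        elif msg.startswith("Following exception occured:"):
            # The spelling "occured" is matched exactly as the core logs it.
            entry["exceptions"].append(msg.split(":", 1)[1].strip())
        elif msg.startswith("FahCore returned:"):
            entry["result"] = msg.split(":", 1)[1].strip()
    # Keep only work units that actually hit a bad state.
    return {key: val for key, val in summary.items() if val["bad_states"]}


if __name__ == "__main__":
    extract = [
        "09:12:23:WU00:FS01:0x22:Completed 490000 out of 1000000 steps (49%)",
        "09:14:14:WU00:FS01:0x22:Bad State detected... attempting to resume "
        "from last good checkpoint. Is your system overclocked?",
        "09:14:14:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan",
        "09:14:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
    ]
    for (wu, slot), info in summarize_failures(extract).items():
        print(f"WU{wu} FS{slot}: {info}")
```

To run it against a whole log, pass the open log file straight to `summarize_failures`, since the function accepts any iterable of lines.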

The following is the full log of this morning's session, which might help to identify whether there is any commonality in the hardware or software set-up of the donors for whom these Work Units are failing.

*********************** Log Started 2020-05-08T06:58:35Z ***********************
06:58:35:Trying to access database...
06:58:35:Successfully acquired database lock
06:58:35:Read GPUs.txt
06:58:36:Enabled folding slot 01: PAUSED gpu:0:GP104 [GeForce GTX 1070] 6463 (by user)
06:58:36:Enabled folding slot 00: PAUSED cpu:4 (by user)
06:58:36:****************************** FAHClient ******************************
06:58:36:        Version: 7.6.13
06:58:36:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:58:36:      Copyright: 2020 foldingathome.org
06:58:36:       Homepage: https://foldingathome.org/
06:58:36:           Date: Apr 27 2020
06:58:36:           Time: 21:21:01
06:58:36:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
06:58:36:         Branch: master
06:58:36:       Compiler: Visual C++ 2008
06:58:36:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:58:36:       Platform: win32 10
06:58:36:           Bits: 32
06:58:36:           Mode: Release
06:58:36:         Config: C:\Users\Malcolm\AppData\Roaming\FAHClient\config.xml
06:58:36:******************************** CBang ********************************
06:58:36:           Date: Apr 24 2020
06:58:36:           Time: 17:07:55
06:58:36:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
06:58:36:         Branch: master
06:58:36:       Compiler: Visual C++ 2008
06:58:36:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:58:36:       Platform: win32 10
06:58:36:           Bits: 32
06:58:36:           Mode: Release
06:58:36:******************************* System ********************************
06:58:36:            CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
06:58:36:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
06:58:36:           CPUs: 8
06:58:36:         Memory: 15.86GiB
06:58:36:    Free Memory: 11.71GiB
06:58:36:        Threads: WINDOWS_THREADS
06:58:36:     OS Version: 6.2
06:58:36:    Has Battery: false
06:58:36:     On Battery: false
06:58:36:     UTC Offset: 1
06:58:36:            PID: 12492
06:58:36:            CWD: C:\Users\Malcolm\AppData\Roaming\FAHClient
06:58:36:  Win32 Service: false
06:58:36:             OS: Windows 10 Home
06:58:36:        OS Arch: AMD64
06:58:36:           GPUs: 1
06:58:36:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
06:58:36:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:11.0
06:58:36:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:445.75
06:58:36:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
06:58:36:******************************* libFAH ********************************
06:58:36:           Date: Apr 15 2020
06:58:36:           Time: 14:53:14
06:58:36:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
06:58:36:         Branch: master
06:58:36:       Compiler: Visual C++ 2008
06:58:36:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:58:36:       Platform: win32 10
06:58:36:           Bits: 32
06:58:36:           Mode: Release
06:58:36:***********************************************************************
06:58:36:<config>
06:58:36:  <!-- Folding Core -->
06:58:36:  <checkpoint v='30'/>
06:58:36:
06:58:36:  <!-- Folding Slot Configuration -->
06:58:36:  <cause v='COVID_19'/>
06:58:36:
06:58:36:  <!-- Network -->
06:58:36:  <proxy v=':8080'/>
06:58:36:
06:58:36:  <!-- Slot Control -->
06:58:36:  <pause-on-start v='true'/>
06:58:36:
06:58:36:  <!-- User Information -->
06:58:36:  <passkey v='*****'/>
06:58:36:  <user v='STFC9F22'/>
06:58:36:
06:58:36:  <!-- Work Unit Control -->
06:58:36:  <next-unit-percentage v='100'/>
06:58:36:
06:58:36:  <!-- Folding Slots -->
06:58:36:  <slot id='1' type='GPU'/>
06:58:36:  <slot id='0' type='CPU'>
06:58:36:    <cpus v='4'/>
06:58:36:  </slot>
06:58:36:</config>
06:58:57:FS01:Unpaused
06:58:57:FS00:Unpaused
06:58:57:WU00:FS01:Connecting to assign1.foldingathome.org:80
06:58:57:WU01:FS00:Connecting to assign1.foldingathome.org:80
06:58:57:WARNING:WU00:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
06:58:57:WU00:FS01:Connecting to assign2.foldingathome.org:80
06:58:58:WU01:FS00:Assigned to work server 3.133.76.19
06:58:58:WU01:FS00:Requesting new work unit for slot 00: READY cpu:4 from 3.133.76.19
06:58:58:WU01:FS00:Connecting to 3.133.76.19:8080
06:58:58:WARNING:WU00:FS01:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
06:58:58:WU00:FS01:Connecting to assign3.foldingathome.org:80
06:58:58:WARNING:WU00:FS01:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
06:58:58:WU00:FS01:Connecting to assign4.foldingathome.org:80
06:58:59:WARNING:WU00:FS01:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
06:58:59:ERROR:WU00:FS01:Exception: Could not get an assignment
06:58:59:WU00:FS01:Connecting to assign1.foldingathome.org:80
06:59:00:WU00:FS01:Assigned to work server 3.133.76.19
06:59:00:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070] 6463 from 3.133.76.19
06:59:00:WU00:FS01:Connecting to 3.133.76.19:8080
06:59:19:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
06:59:19:WU01:FS00:Connecting to 3.133.76.19:80
06:59:21:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
06:59:21:WU00:FS01:Connecting to 3.133.76.19:80
06:59:40:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
06:59:40:WU01:FS00:Connecting to assign1.foldingathome.org:80
06:59:40:WU01:FS00:Assigned to work server 206.223.170.146
06:59:40:WU01:FS00:Requesting new work unit for slot 00: READY cpu:4 from 206.223.170.146
06:59:40:WU01:FS00:Connecting to 206.223.170.146:8080
06:59:41:WU01:FS00:Downloading 1.37MiB
06:59:42:ERROR:WU00:FS01:Exception: Failed to connect to 3.133.76.19:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
06:59:42:WU01:FS00:Download complete
06:59:43:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:14534 run:135 clone:2 gen:45 core:0xa7 unit:0x00000032cedfaa925ea34b75ceff2474
06:59:43:WU01:FS00:Starting
06:59:43:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Malcolm\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 12492 -checkpoint 30 -np 4
06:59:43:WU01:FS00:Started FahCore on PID 16716
06:59:43:WU01:FS00:Core PID:29820
06:59:43:WU01:FS00:FahCore 0xa7 started
06:59:43:WU01:FS00:0xa7:*********************** Log Started 2020-05-08T06:59:43Z ***********************
06:59:43:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
06:59:43:WU01:FS00:0xa7:       Type: 0xa7
06:59:43:WU01:FS00:0xa7:       Core: Gromacs
06:59:43:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 16716 -checkpoint 30 -np
06:59:43:WU01:FS00:0xa7:             4
06:59:43:WU01:FS00:0xa7:************************************ CBang *************************************
06:59:43:WU01:FS00:0xa7:       Date: Oct 26 2019
06:59:43:WU01:FS00:0xa7:       Time: 01:38:25
06:59:43:WU01:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
06:59:43:WU01:FS00:0xa7:     Branch: master
06:59:43:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
06:59:43:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:59:43:WU01:FS00:0xa7:   Platform: win32 10
06:59:43:WU01:FS00:0xa7:       Bits: 64
06:59:43:WU01:FS00:0xa7:       Mode: Release
06:59:43:WU01:FS00:0xa7:************************************ System ************************************
06:59:43:WU01:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
06:59:43:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
06:59:43:WU01:FS00:0xa7:       CPUs: 8
06:59:43:WU01:FS00:0xa7:     Memory: 15.86GiB
06:59:43:WU01:FS00:0xa7:Free Memory: 11.65GiB
06:59:43:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
06:59:43:WU01:FS00:0xa7: OS Version: 6.2
06:59:43:WU01:FS00:0xa7:Has Battery: false
06:59:43:WU01:FS00:0xa7: On Battery: false
06:59:43:WU01:FS00:0xa7: UTC Offset: 1
06:59:43:WU01:FS00:0xa7:        PID: 29820
06:59:43:WU01:FS00:0xa7:        CWD: C:\Users\Malcolm\AppData\Roaming\FAHClient\work
06:59:43:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
06:59:43:WU01:FS00:0xa7:    Version: 0.0.18
06:59:43:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:59:43:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
06:59:43:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
06:59:43:WU01:FS00:0xa7:       Date: Oct 26 2019
06:59:43:WU01:FS00:0xa7:       Time: 01:52:30
06:59:43:WU01:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
06:59:43:WU01:FS00:0xa7:     Branch: master
06:59:43:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
06:59:43:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:59:43:WU01:FS00:0xa7:   Platform: win32 10
06:59:43:WU01:FS00:0xa7:       Bits: 64
06:59:43:WU01:FS00:0xa7:       Mode: Release
06:59:43:WU01:FS00:0xa7:************************************ Build *************************************
06:59:43:WU01:FS00:0xa7:       SIMD: avx_256
06:59:43:WU01:FS00:0xa7:********************************************************************************
06:59:43:WU01:FS00:0xa7:Project: 14534 (Run 135, Clone 2, Gen 45)
06:59:43:WU01:FS00:0xa7:Unit: 0x00000032cedfaa925ea34b75ceff2474
06:59:43:WU01:FS00:0xa7:Reading tar file core.xml
06:59:43:WU01:FS00:0xa7:Reading tar file frame45.tpr
06:59:43:WU01:FS00:0xa7:Digital signatures verified
06:59:43:WU01:FS00:0xa7:Calling: mdrun -s frame45.tpr -o frame45.trr -x frame45.xtc -cpt 30 -nt 4
06:59:43:WU01:FS00:0xa7:Steps: first=45000000 total=1000000
06:59:43:WU01:FS00:0xa7:Completed 1 out of 1000000 steps (0%)
06:59:59:WU00:FS01:Connecting to assign1.foldingathome.org:80
07:00:00:WARNING:WU00:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
07:00:00:WU00:FS01:Connecting to assign2.foldingathome.org:80
07:00:00:WU00:FS01:Assigned to work server 18.188.125.154
07:00:00:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070] 6463 from 18.188.125.154
07:00:00:WU00:FS01:Connecting to 18.188.125.154:8080
07:00:02:WU00:FS01:Downloading 6.63MiB
07:00:08:WU00:FS01:Download 81.13%
07:00:09:WU00:FS01:Download complete
07:00:09:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13404 run:378 clone:16 gen:0 core:0x22 unit:0x0000000112bc7d9a5eb37aa6e4bad523
07:00:09:WU00:FS01:Starting
07:00:09:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Malcolm\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 12492 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
07:00:09:WU00:FS01:Started FahCore on PID 31272
07:00:09:WU00:FS01:Core PID:2532
07:00:09:WU00:FS01:FahCore 0x22 started
07:00:10:WU00:FS01:0x22:*********************** Log Started 2020-05-08T07:00:09Z ***********************
07:00:10:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
07:00:10:WU00:FS01:0x22:       Type: 0x22
07:00:10:WU00:FS01:0x22:       Core: Core22
07:00:10:WU00:FS01:0x22:    Website: https://foldingathome.org/
07:00:10:WU00:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
07:00:10:WU00:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
07:00:10:WU00:FS01:0x22:             <rafal.wiewiora@choderalab.org>
07:00:10:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 31272 -checkpoint 30
07:00:10:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
07:00:10:WU00:FS01:0x22:             0 -gpu 0
07:00:10:WU00:FS01:0x22:     Config: <none>
07:00:10:WU00:FS01:0x22:************************************ Build *************************************
07:00:10:WU00:FS01:0x22:    Version: 0.0.5
07:00:10:WU00:FS01:0x22:       Date: Apr 22 2020
07:00:10:WU00:FS01:0x22:       Time: 04:42:59
07:00:10:WU00:FS01:0x22: Repository: Git
07:00:10:WU00:FS01:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
07:00:10:WU00:FS01:0x22:     Branch: HEAD
07:00:10:WU00:FS01:0x22:   Compiler: Visual C++ 2008
07:00:10:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:00:10:WU00:FS01:0x22:   Platform: win32 10
07:00:10:WU00:FS01:0x22:       Bits: 64
07:00:10:WU00:FS01:0x22:       Mode: Release
07:00:10:WU00:FS01:0x22:************************************ System ************************************
07:00:10:WU00:FS01:0x22:        CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
07:00:10:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
07:00:10:WU00:FS01:0x22:       CPUs: 8
07:00:10:WU00:FS01:0x22:     Memory: 15.86GiB
07:00:10:WU00:FS01:0x22:Free Memory: 11.54GiB
07:00:10:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
07:00:10:WU00:FS01:0x22: OS Version: 6.2
07:00:10:WU00:FS01:0x22:Has Battery: false
07:00:10:WU00:FS01:0x22: On Battery: false
07:00:10:WU00:FS01:0x22: UTC Offset: 1
07:00:10:WU00:FS01:0x22:        PID: 2532
07:00:10:WU00:FS01:0x22:        CWD: C:\Users\Malcolm\AppData\Roaming\FAHClient\work
07:00:10:WU00:FS01:0x22:         OS: Windows 10 Home
07:00:10:WU00:FS01:0x22:    OS Arch: AMD64
07:00:10:WU00:FS01:0x22:********************************************************************************
07:00:10:WU00:FS01:0x22:Project: 13404 (Run 378, Clone 16, Gen 0)
07:00:10:WU00:FS01:0x22:Unit: 0x0000000112bc7d9a5eb37aa6e4bad523
07:00:10:WU00:FS01:0x22:Reading tar file core.xml
07:00:10:WU00:FS01:0x22:Reading tar file integrator.xml
07:00:10:WU00:FS01:0x22:Reading tar file state.xml
07:00:10:WU00:FS01:0x22:Reading tar file system.xml
07:00:10:WU00:FS01:0x22:Digital signatures verified
07:00:10:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:00:10:WU00:FS01:0x22:Version 0.0.5
07:00:23:WU00:FS01:0x22:Completed 0 out of 1000000 steps (0%)
07:00:23:WU00:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
07:02:58:WU01:FS00:0xa7:Completed 10000 out of 1000000 steps (1%)
07:03:05:WU00:FS01:0x22:Completed 10000 out of 1000000 steps (1%)
07:05:46:WU00:FS01:0x22:Completed 20000 out of 1000000 steps (2%)
07:06:11:WU01:FS00:0xa7:Completed 20000 out of 1000000 steps (2%)
07:08:25:WU00:FS01:0x22:Completed 30000 out of 1000000 steps (3%)
07:09:19:WU01:FS00:0xa7:Completed 30000 out of 1000000 steps (3%)
07:11:06:WU00:FS01:0x22:Completed 40000 out of 1000000 steps (4%)
07:12:27:WU01:FS00:0xa7:Completed 40000 out of 1000000 steps (4%)
07:13:46:WU00:FS01:0x22:Completed 50000 out of 1000000 steps (5%)
07:15:37:WU01:FS00:0xa7:Completed 50000 out of 1000000 steps (5%)
07:16:25:WU00:FS01:0x22:Completed 60000 out of 1000000 steps (6%)
07:18:45:WU01:FS00:0xa7:Completed 60000 out of 1000000 steps (6%)
07:19:04:WU00:FS01:0x22:Completed 70000 out of 1000000 steps (7%)
07:21:41:WU00:FS01:0x22:Completed 80000 out of 1000000 steps (8%)
07:21:52:WU01:FS00:0xa7:Completed 70000 out of 1000000 steps (7%)
07:24:19:WU00:FS01:0x22:Completed 90000 out of 1000000 steps (9%)
07:24:59:WU01:FS00:0xa7:Completed 80000 out of 1000000 steps (8%)
07:26:58:WU00:FS01:0x22:Completed 100000 out of 1000000 steps (10%)
07:28:06:WU01:FS00:0xa7:Completed 90000 out of 1000000 steps (9%)
07:29:36:WU00:FS01:0x22:Completed 110000 out of 1000000 steps (11%)
07:31:14:WU01:FS00:0xa7:Completed 100000 out of 1000000 steps (10%)
07:32:14:WU00:FS01:0x22:Completed 120000 out of 1000000 steps (12%)
07:34:22:WU01:FS00:0xa7:Completed 110000 out of 1000000 steps (11%)
07:34:52:WU00:FS01:0x22:Completed 130000 out of 1000000 steps (13%)
07:37:30:WU00:FS01:0x22:Completed 140000 out of 1000000 steps (14%)
07:37:30:WU01:FS00:0xa7:Completed 120000 out of 1000000 steps (12%)
07:40:08:WU00:FS01:0x22:Completed 150000 out of 1000000 steps (15%)
07:40:38:WU01:FS00:0xa7:Completed 130000 out of 1000000 steps (13%)
07:42:46:WU00:FS01:0x22:Completed 160000 out of 1000000 steps (16%)
07:43:45:WU01:FS00:0xa7:Completed 140000 out of 1000000 steps (14%)
07:45:24:WU00:FS01:0x22:Completed 170000 out of 1000000 steps (17%)
07:46:54:WU01:FS00:0xa7:Completed 150000 out of 1000000 steps (15%)
07:48:02:WU00:FS01:0x22:Completed 180000 out of 1000000 steps (18%)
07:50:03:WU01:FS00:0xa7:Completed 160000 out of 1000000 steps (16%)
07:50:40:WU00:FS01:0x22:Completed 190000 out of 1000000 steps (19%)
07:53:11:WU01:FS00:0xa7:Completed 170000 out of 1000000 steps (17%)
07:53:18:WU00:FS01:0x22:Completed 200000 out of 1000000 steps (20%)
07:55:56:WU00:FS01:0x22:Completed 210000 out of 1000000 steps (21%)
07:56:20:WU01:FS00:0xa7:Completed 180000 out of 1000000 steps (18%)
07:58:34:WU00:FS01:0x22:Completed 220000 out of 1000000 steps (22%)
07:59:29:WU01:FS00:0xa7:Completed 190000 out of 1000000 steps (19%)
08:01:12:WU00:FS01:0x22:Completed 230000 out of 1000000 steps (23%)
08:02:38:WU01:FS00:0xa7:Completed 200000 out of 1000000 steps (20%)
08:03:50:WU00:FS01:0x22:Completed 240000 out of 1000000 steps (24%)
08:05:46:WU01:FS00:0xa7:Completed 210000 out of 1000000 steps (21%)
08:06:28:WU00:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
08:08:54:WU01:FS00:0xa7:Completed 220000 out of 1000000 steps (22%)
08:09:15:WU00:FS01:0x22:Completed 260000 out of 1000000 steps (26%)
08:11:58:WU00:FS01:0x22:Completed 270000 out of 1000000 steps (27%)
08:12:02:WU01:FS00:0xa7:Completed 230000 out of 1000000 steps (23%)
08:14:42:WU00:FS01:0x22:Completed 280000 out of 1000000 steps (28%)
08:15:10:WU01:FS00:0xa7:Completed 240000 out of 1000000 steps (24%)
08:17:26:WU00:FS01:0x22:Completed 290000 out of 1000000 steps (29%)
08:18:18:WU01:FS00:0xa7:Completed 250000 out of 1000000 steps (25%)
08:20:10:WU00:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
08:21:26:WU01:FS00:0xa7:Completed 260000 out of 1000000 steps (26%)
08:22:54:WU00:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
08:24:34:WU01:FS00:0xa7:Completed 270000 out of 1000000 steps (27%)
08:25:38:WU00:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
08:27:42:WU01:FS00:0xa7:Completed 280000 out of 1000000 steps (28%)
08:28:22:WU00:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
08:30:50:WU01:FS00:0xa7:Completed 290000 out of 1000000 steps (29%)
08:31:06:WU00:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
08:33:52:WU00:FS01:0x22:Completed 350000 out of 1000000 steps (35%)
08:33:59:WU01:FS00:0xa7:Completed 300000 out of 1000000 steps (30%)
08:36:37:WU00:FS01:0x22:Completed 360000 out of 1000000 steps (36%)
08:37:07:WU01:FS00:0xa7:Completed 310000 out of 1000000 steps (31%)
08:39:22:WU00:FS01:0x22:Completed 370000 out of 1000000 steps (37%)
08:40:15:WU01:FS00:0xa7:Completed 320000 out of 1000000 steps (32%)
08:42:06:WU00:FS01:0x22:Completed 380000 out of 1000000 steps (38%)
08:43:23:WU01:FS00:0xa7:Completed 330000 out of 1000000 steps (33%)
08:44:51:WU00:FS01:0x22:Completed 390000 out of 1000000 steps (39%)
08:46:31:WU01:FS00:0xa7:Completed 340000 out of 1000000 steps (34%)
08:47:35:WU00:FS01:0x22:Completed 400000 out of 1000000 steps (40%)
08:49:39:WU01:FS00:0xa7:Completed 350000 out of 1000000 steps (35%)
08:50:20:WU00:FS01:0x22:Completed 410000 out of 1000000 steps (41%)
08:52:47:WU01:FS00:0xa7:Completed 360000 out of 1000000 steps (36%)
08:53:04:WU00:FS01:0x22:Completed 420000 out of 1000000 steps (42%)
08:55:49:WU00:FS01:0x22:Completed 430000 out of 1000000 steps (43%)
08:55:55:WU01:FS00:0xa7:Completed 370000 out of 1000000 steps (37%)
08:58:36:WU00:FS01:0x22:Completed 440000 out of 1000000 steps (44%)
08:59:03:WU01:FS00:0xa7:Completed 380000 out of 1000000 steps (38%)
09:01:21:WU00:FS01:0x22:Completed 450000 out of 1000000 steps (45%)
09:02:09:WU01:FS00:0xa7:Completed 390000 out of 1000000 steps (39%)
09:04:06:WU00:FS01:0x22:Completed 460000 out of 1000000 steps (46%)
09:05:15:WU01:FS00:0xa7:Completed 400000 out of 1000000 steps (40%)
09:06:51:WU00:FS01:0x22:Completed 470000 out of 1000000 steps (47%)
09:08:21:WU01:FS00:0xa7:Completed 410000 out of 1000000 steps (41%)
09:09:36:WU00:FS01:0x22:Completed 480000 out of 1000000 steps (48%)
09:11:28:WU01:FS00:0xa7:Completed 420000 out of 1000000 steps (42%)
09:12:23:WU00:FS01:0x22:Completed 490000 out of 1000000 steps (49%)
09:14:14:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:14:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:29:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:29:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:34:WU01:FS00:0xa7:Completed 430000 out of 1000000 steps (43%)
09:14:44:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
09:14:44:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
09:14:44:WU00:FS01:0x22:ERROR:114: Max Retries Reached
09:14:44:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
09:14:44:WU00:FS01:0x22:Saving result file badstate-0.xml
09:14:44:WU00:FS01:0x22:Saving result file badstate-1.xml
09:14:44:WU00:FS01:0x22:Saving result file badstate-2.xml
09:14:44:WU00:FS01:0x22:Saving result file checkpointState.xml
09:14:45:WU00:FS01:0x22:Saving result file checkpt.crc
09:14:45:WU00:FS01:0x22:Saving result file globals.csv
09:14:45:WU00:FS01:0x22:Saving result file positions.xtc
09:14:45:WU00:FS01:0x22:Saving result file science.log
09:14:45:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
09:14:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
09:14:46:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13404 run:378 clone:16 gen:0 core:0x22 unit:0x0000000112bc7d9a5eb37aa6e4bad523
09:14:46:WU00:FS01:Uploading 4.81MiB to 18.188.125.154
09:14:46:WU00:FS01:Connecting to 18.188.125.154:8080
09:14:47:WU02:FS01:Connecting to assign1.foldingathome.org:80
09:14:47:WARNING:WU02:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
09:14:47:WU02:FS01:Connecting to assign2.foldingathome.org:80
09:14:48:WARNING:WU02:FS01:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
09:14:48:WU02:FS01:Connecting to assign3.foldingathome.org:80
09:14:48:WARNING:WU02:FS01:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
09:14:48:WU02:FS01:Connecting to assign4.foldingathome.org:80
09:14:49:WU02:FS01:Assigned to work server 128.252.203.10
09:14:49:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070] 6463 from 128.252.203.10
09:14:49:WU02:FS01:Connecting to 128.252.203.10:8080
09:14:52:WU00:FS01:Upload 14.29%
09:14:58:WU00:FS01:Upload 28.58%
09:15:04:WU00:FS01:Upload 41.57%
09:15:10:WU00:FS01:Upload 55.86%
09:15:10:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
09:15:10:WU02:FS01:Connecting to 128.252.203.10:80
09:15:16:WU00:FS01:Upload 70.15%
09:15:22:WU00:FS01:Upload 83.14%
09:15:28:WU00:FS01:Upload 97.43%
09:15:30:WU00:FS01:Upload complete
09:15:30:WU00:FS01:Server responded WORK_ACK (400)
09:15:30:WU00:FS01:Cleaning up
09:15:31:ERROR:WU02:FS01:Exception: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
09:15:31:WU02:FS01:Connecting to assign1.foldingathome.org:80
09:15:32:WARNING:WU02:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
09:15:32:WU02:FS01:Connecting to assign2.foldingathome.org:80
09:15:32:WARNING:WU02:FS01:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
09:15:32:WU02:FS01:Connecting to assign3.foldingathome.org:80
09:15:33:WARNING:WU02:FS01:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
09:15:33:WU02:FS01:Connecting to assign4.foldingathome.org:80
09:15:33:WARNING:WU02:FS01:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
09:15:33:ERROR:WU02:FS01:Exception: Could not get an assignment
09:16:31:WU02:FS01:Connecting to assign1.foldingathome.org:80
09:16:32:WARNING:WU02:FS01:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
09:16:32:WU02:FS01:Connecting to assign2.foldingathome.org:80
09:16:32:WARNING:WU02:FS01:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
09:16:32:WU02:FS01:Connecting to assign3.foldingathome.org:80
09:16:33:WARNING:WU02:FS01:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
09:16:33:WU02:FS01:Connecting to assign4.foldingathome.org:80
09:16:33:WARNING:WU02:FS01:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
09:16:33:ERROR:WU02:FS01:Exception: Could not get an assignment
09:17:24:WU01:FS00:0xa7:Completed 440000 out of 1000000 steps (44%)
09:18:09:WU02:FS01:Connecting to assign1.foldingathome.org:80
09:18:09:WU02:FS01:Assigned to work server 18.188.125.154
09:18:09:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070] 6463 from 18.188.125.154
09:18:09:WU02:FS01:Connecting to 18.188.125.154:8080
09:18:11:WU02:FS01:Downloading 5.99MiB
09:18:17:WU02:FS01:Download 92.84%
09:18:17:WU02:FS01:Download complete
09:18:17:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:13405 run:56 clone:93 gen:2 core:0x22 unit:0x0000000512bc7d9a5eb3a391112dfec8
09:18:17:WU02:FS01:Starting
09:18:17:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Malcolm\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 12492 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
09:18:17:WU02:FS01:Started FahCore on PID 34308
09:18:17:WU02:FS01:Core PID:8300
09:18:17:WU02:FS01:FahCore 0x22 started
09:18:18:WU02:FS01:0x22:*********************** Log Started 2020-05-08T09:18:17Z ***********************
09:18:18:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
09:18:18:WU02:FS01:0x22:       Type: 0x22
09:18:18:WU02:FS01:0x22:       Core: Core22
09:18:18:WU02:FS01:0x22:    Website: https://foldingathome.org/
09:18:18:WU02:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
09:18:18:WU02:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
09:18:18:WU02:FS01:0x22:             <rafal.wiewiora@choderalab.org>
09:18:18:WU02:FS01:0x22:       Args: -dir 02 -suffix 01 -version 706 -lifeline 34308 -checkpoint 30
09:18:18:WU02:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
09:18:18:WU02:FS01:0x22:             0 -gpu 0
09:18:18:WU02:FS01:0x22:     Config: <none>
09:18:18:WU02:FS01:0x22:************************************ Build *************************************
09:18:18:WU02:FS01:0x22:    Version: 0.0.5
09:18:18:WU02:FS01:0x22:       Date: Apr 22 2020
09:18:18:WU02:FS01:0x22:       Time: 04:42:59
09:18:18:WU02:FS01:0x22: Repository: Git
09:18:18:WU02:FS01:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
09:18:18:WU02:FS01:0x22:     Branch: HEAD
09:18:18:WU02:FS01:0x22:   Compiler: Visual C++ 2008
09:18:18:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
09:18:18:WU02:FS01:0x22:   Platform: win32 10
09:18:18:WU02:FS01:0x22:       Bits: 64
09:18:18:WU02:FS01:0x22:       Mode: Release
09:18:18:WU02:FS01:0x22:************************************ System ************************************
09:18:18:WU02:FS01:0x22:        CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
09:18:18:WU02:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
09:18:18:WU02:FS01:0x22:       CPUs: 8
09:18:18:WU02:FS01:0x22:     Memory: 15.86GiB
09:18:18:WU02:FS01:0x22:Free Memory: 11.56GiB
09:18:18:WU02:FS01:0x22:    Threads: WINDOWS_THREADS
09:18:18:WU02:FS01:0x22: OS Version: 6.2
09:18:18:WU02:FS01:0x22:Has Battery: false
09:18:18:WU02:FS01:0x22: On Battery: false
09:18:18:WU02:FS01:0x22: UTC Offset: 1
09:18:18:WU02:FS01:0x22:        PID: 8300
09:18:18:WU02:FS01:0x22:        CWD: C:\Users\Malcolm\AppData\Roaming\FAHClient\work
09:18:18:WU02:FS01:0x22:         OS: Windows 10 Home
09:18:18:WU02:FS01:0x22:    OS Arch: AMD64
09:18:18:WU02:FS01:0x22:********************************************************************************
09:18:18:WU02:FS01:0x22:Project: 13405 (Run 56, Clone 93, Gen 2)
09:18:18:WU02:FS01:0x22:Unit: 0x0000000512bc7d9a5eb3a391112dfec8
09:18:18:WU02:FS01:0x22:Reading tar file core.xml
09:18:18:WU02:FS01:0x22:Reading tar file integrator.xml
09:18:18:WU02:FS01:0x22:Reading tar file state.xml
09:18:18:WU02:FS01:0x22:Reading tar file system.xml
09:18:18:WU02:FS01:0x22:Digital signatures verified
09:18:18:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
09:18:18:WU02:FS01:0x22:Version 0.0.5
09:18:30:WU02:FS01:0x22:Completed 0 out of 1000000 steps (0%)
09:18:30:WU02:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
09:20:25:WU01:FS00:0xa7:Completed 450000 out of 1000000 steps (45%)
09:21:13:WU02:FS01:0x22:Completed 10000 out of 1000000 steps (1%)
09:23:34:WU01:FS00:0xa7:Completed 460000 out of 1000000 steps (46%)
09:23:51:WU02:FS01:0x22:Completed 20000 out of 1000000 steps (2%)
09:26:29:WU02:FS01:0x22:Completed 30000 out of 1000000 steps (3%)
09:26:42:WU01:FS00:0xa7:Completed 470000 out of 1000000 steps (47%)
09:29:08:WU02:FS01:0x22:Completed 40000 out of 1000000 steps (4%)
09:29:50:WU01:FS00:0xa7:Completed 480000 out of 1000000 steps (48%)
09:31:47:WU02:FS01:0x22:Completed 50000 out of 1000000 steps (5%)
09:32:58:WU01:FS00:0xa7:Completed 490000 out of 1000000 steps (49%)
09:34:26:WU02:FS01:0x22:Completed 60000 out of 1000000 steps (6%)
09:36:06:WU01:FS00:0xa7:Completed 500000 out of 1000000 steps (50%)
09:37:04:WU02:FS01:0x22:Completed 70000 out of 1000000 steps (7%)
09:39:13:WU01:FS00:0xa7:Completed 510000 out of 1000000 steps (51%)
09:39:43:WU02:FS01:0x22:Completed 80000 out of 1000000 steps (8%)
09:42:21:WU01:FS00:0xa7:Completed 520000 out of 1000000 steps (52%)
09:42:21:WU02:FS01:0x22:Completed 90000 out of 1000000 steps (9%)
09:45:00:WU02:FS01:0x22:Completed 100000 out of 1000000 steps (10%)
09:45:30:WU01:FS00:0xa7:Completed 530000 out of 1000000 steps (53%)
09:47:39:WU02:FS01:0x22:Completed 110000 out of 1000000 steps (11%)
09:48:40:WU01:FS00:0xa7:Completed 540000 out of 1000000 steps (54%)
09:50:18:WU02:FS01:0x22:Completed 120000 out of 1000000 steps (12%)
09:51:48:WU01:FS00:0xa7:Completed 550000 out of 1000000 steps (55%)
09:52:57:WU02:FS01:0x22:Completed 130000 out of 1000000 steps (13%)
09:54:57:WU01:FS00:0xa7:Completed 560000 out of 1000000 steps (56%)
09:55:35:WU02:FS01:0x22:Completed 140000 out of 1000000 steps (14%)
09:58:06:WU01:FS00:0xa7:Completed 570000 out of 1000000 steps (57%)
09:58:14:WU02:FS01:0x22:Completed 150000 out of 1000000 steps (15%)
10:00:53:WU02:FS01:0x22:Completed 160000 out of 1000000 steps (16%)
10:01:16:WU01:FS00:0xa7:Completed 580000 out of 1000000 steps (58%)
10:03:32:WU02:FS01:0x22:Completed 170000 out of 1000000 steps (17%)
10:04:25:WU01:FS00:0xa7:Completed 590000 out of 1000000 steps (59%)
10:06:11:WU02:FS01:0x22:Completed 180000 out of 1000000 steps (18%)
10:07:34:WU01:FS00:0xa7:Completed 600000 out of 1000000 steps (60%)
10:08:50:WU02:FS01:0x22:Completed 190000 out of 1000000 steps (19%)
10:10:43:WU01:FS00:0xa7:Completed 610000 out of 1000000 steps (61%)
10:11:28:WU02:FS01:0x22:Completed 200000 out of 1000000 steps (20%)
10:13:52:WU01:FS00:0xa7:Completed 620000 out of 1000000 steps (62%)
10:14:07:WU02:FS01:0x22:Completed 210000 out of 1000000 steps (21%)
10:16:46:WU02:FS01:0x22:Completed 220000 out of 1000000 steps (22%)
10:17:01:WU01:FS00:0xa7:Completed 630000 out of 1000000 steps (63%)
10:19:24:WU02:FS01:0x22:Completed 230000 out of 1000000 steps (23%)
10:20:11:WU01:FS00:0xa7:Completed 640000 out of 1000000 steps (64%)
10:22:03:WU02:FS01:0x22:Completed 240000 out of 1000000 steps (24%)
10:23:21:WU01:FS00:0xa7:Completed 650000 out of 1000000 steps (65%)
10:24:42:WU02:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
10:26:33:WU01:FS00:0xa7:Completed 660000 out of 1000000 steps (66%)
10:27:31:WU02:FS01:0x22:Completed 260000 out of 1000000 steps (26%)
10:29:48:WU01:FS00:0xa7:Completed 670000 out of 1000000 steps (67%)
10:30:22:WU02:FS01:0x22:Completed 270000 out of 1000000 steps (27%)
10:33:06:WU01:FS00:0xa7:Completed 680000 out of 1000000 steps (68%)
10:33:15:WU02:FS01:0x22:Completed 280000 out of 1000000 steps (28%)
10:36:07:WU02:FS01:0x22:Completed 290000 out of 1000000 steps (29%)
10:36:22:WU01:FS00:0xa7:Completed 690000 out of 1000000 steps (69%)
10:38:57:WU02:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
10:39:37:WU01:FS00:0xa7:Completed 700000 out of 1000000 steps (70%)
10:41:45:WU02:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
10:42:49:WU01:FS00:0xa7:Completed 710000 out of 1000000 steps (71%)
10:44:31:WU02:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
10:46:00:WU01:FS00:0xa7:Completed 720000 out of 1000000 steps (72%)
10:47:19:WU02:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
10:49:10:WU01:FS00:0xa7:Completed 730000 out of 1000000 steps (73%)
10:50:06:WU02:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
10:52:24:WU01:FS00:0xa7:Completed 740000 out of 1000000 steps (74%)
10:52:52:WU02:FS01:0x22:Completed 350000 out of 1000000 steps (35%)
10:55:34:WU01:FS00:0xa7:Completed 750000 out of 1000000 steps (75%)
10:55:38:WU02:FS01:0x22:Completed 360000 out of 1000000 steps (36%)
10:58:24:WU02:FS01:0x22:Completed 370000 out of 1000000 steps (37%)
10:58:45:WU01:FS00:0xa7:Completed 760000 out of 1000000 steps (76%)
11:01:09:WU02:FS01:0x22:Completed 380000 out of 1000000 steps (38%)
11:01:56:WU01:FS00:0xa7:Completed 770000 out of 1000000 steps (77%)
11:03:54:WU02:FS01:0x22:Completed 390000 out of 1000000 steps (39%)
11:05:06:WU01:FS00:0xa7:Completed 780000 out of 1000000 steps (78%)
11:06:40:WU02:FS01:0x22:Completed 400000 out of 1000000 steps (40%)
11:08:16:WU01:FS00:0xa7:Completed 790000 out of 1000000 steps (79%)
11:09:25:WU02:FS01:0x22:Completed 410000 out of 1000000 steps (41%)
11:11:24:WU01:FS00:0xa7:Completed 800000 out of 1000000 steps (80%)
11:12:10:WU02:FS01:0x22:Completed 420000 out of 1000000 steps (42%)
11:14:31:WU01:FS00:0xa7:Completed 810000 out of 1000000 steps (81%)
11:14:56:WU02:FS01:0x22:Completed 430000 out of 1000000 steps (43%)
11:17:40:WU01:FS00:0xa7:Completed 820000 out of 1000000 steps (82%)
11:17:41:WU02:FS01:0x22:Completed 440000 out of 1000000 steps (44%)
11:20:26:WU02:FS01:0x22:Completed 450000 out of 1000000 steps (45%)
11:20:47:WU01:FS00:0xa7:Completed 830000 out of 1000000 steps (83%)
11:22:36:FS01:Finishing
11:22:36:FS00:Finishing
11:23:11:WU02:FS01:0x22:Completed 460000 out of 1000000 steps (46%)
11:23:55:WU01:FS00:0xa7:Completed 840000 out of 1000000 steps (84%)
11:25:55:WU02:FS01:0x22:Completed 470000 out of 1000000 steps (47%)
11:27:02:WU01:FS00:0xa7:Completed 850000 out of 1000000 steps (85%)
11:28:40:WU02:FS01:0x22:Completed 480000 out of 1000000 steps (48%)
11:30:10:WU01:FS00:0xa7:Completed 860000 out of 1000000 steps (86%)
11:31:26:WU02:FS01:0x22:Completed 490000 out of 1000000 steps (49%)
11:33:19:WU01:FS00:0xa7:Completed 870000 out of 1000000 steps (87%)
11:34:11:WU02:FS01:0x22:Completed 500000 out of 1000000 steps (50%)
11:36:27:WU01:FS00:0xa7:Completed 880000 out of 1000000 steps (88%)
11:36:52:WU02:FS01:0x22:Completed 510000 out of 1000000 steps (51%)
11:39:32:WU02:FS01:0x22:Completed 520000 out of 1000000 steps (52%)
11:39:36:WU01:FS00:0xa7:Completed 890000 out of 1000000 steps (89%)
11:42:12:WU02:FS01:0x22:Completed 530000 out of 1000000 steps (53%)
11:42:44:WU01:FS00:0xa7:Completed 900000 out of 1000000 steps (90%)
11:44:52:WU02:FS01:0x22:Completed 540000 out of 1000000 steps (54%)
11:45:52:WU01:FS00:0xa7:Completed 910000 out of 1000000 steps (91%)
11:47:32:WU02:FS01:0x22:Completed 550000 out of 1000000 steps (55%)
11:49:00:WU01:FS00:0xa7:Completed 920000 out of 1000000 steps (92%)
11:50:12:WU02:FS01:0x22:Completed 560000 out of 1000000 steps (56%)
11:52:10:WU01:FS00:0xa7:Completed 930000 out of 1000000 steps (93%)
11:52:50:WU02:FS01:0x22:Completed 570000 out of 1000000 steps (57%)
11:55:19:WU01:FS00:0xa7:Completed 940000 out of 1000000 steps (94%)
11:55:30:WU02:FS01:0x22:Completed 580000 out of 1000000 steps (58%)
11:58:11:WU02:FS01:0x22:Completed 590000 out of 1000000 steps (59%)
11:58:28:WU01:FS00:0xa7:Completed 950000 out of 1000000 steps (95%)
12:00:51:WU02:FS01:0x22:Completed 600000 out of 1000000 steps (60%)
12:01:37:WU01:FS00:0xa7:Completed 960000 out of 1000000 steps (96%)
12:03:31:WU02:FS01:0x22:Completed 610000 out of 1000000 steps (61%)
12:04:47:WU01:FS00:0xa7:Completed 970000 out of 1000000 steps (97%)
12:06:10:WU02:FS01:0x22:Completed 620000 out of 1000000 steps (62%)
12:07:56:WU01:FS00:0xa7:Completed 980000 out of 1000000 steps (98%)
12:08:50:WU02:FS01:0x22:Completed 630000 out of 1000000 steps (63%)
12:11:06:WU01:FS00:0xa7:Completed 990000 out of 1000000 steps (99%)
12:11:30:WU02:FS01:0x22:Completed 640000 out of 1000000 steps (64%)
12:14:10:WU02:FS01:0x22:Completed 650000 out of 1000000 steps (65%)
12:14:15:WU01:FS00:0xa7:Completed 1000000 out of 1000000 steps (100%)
12:14:15:WU01:FS00:0xa7:Saving result file ..\logfile_01.txt
12:14:15:WU01:FS00:0xa7:Saving result file frame45.trr
12:14:15:WU01:FS00:0xa7:Saving result file frame45.xtc
12:14:15:WU01:FS00:0xa7:Saving result file md.log
12:14:15:WU01:FS00:0xa7:Saving result file science.log
12:14:16:WU01:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
12:14:16:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:14:16:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14534 run:135 clone:2 gen:45 core:0xa7 unit:0x00000032cedfaa925ea34b75ceff2474
12:14:16:WU01:FS00:Uploading 2.67MiB to 206.223.170.146
12:14:16:WU01:FS00:Connecting to 206.223.170.146:8080
12:14:22:WU01:FS00:Upload 25.75%
12:14:28:WU01:FS00:Upload 49.17%
12:14:34:WU01:FS00:Upload 74.92%
12:14:40:WU01:FS00:Upload 100.00%
12:14:40:WU01:FS00:Upload complete
12:14:40:WU01:FS00:Server responded WORK_ACK (400)
12:14:40:WU01:FS00:Final credit estimate, 5780.00 points
12:14:40:WU01:FS00:Cleaning up
12:16:46:WU02:FS01:0x22:Completed 660000 out of 1000000 steps (66%)
12:19:22:WU02:FS01:0x22:Completed 670000 out of 1000000 steps (67%)
12:21:58:WU02:FS01:0x22:Completed 680000 out of 1000000 steps (68%)
12:24:34:WU02:FS01:0x22:Completed 690000 out of 1000000 steps (69%)
12:27:10:WU02:FS01:0x22:Completed 700000 out of 1000000 steps (70%)
12:28:03:FS01:Paused
12:28:03:FS01:Shutting core down
12:28:03:WU02:FS01:0x22:WARNING:Console control signal 1 on PID 8300
12:28:03:WU02:FS01:0x22:Exiting, please wait. . .
12:28:03:WU02:FS01:0x22:Folding@home Core Shutdown: INTERRUPTED
12:28:03:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
12:29:00:Removing old file 'configs/config-20200426-165141.xml'
12:29:00:Saving configuration to config.xml
12:29:00:<config>
12:29:00:  <!-- Folding Core -->
12:29:00:  <checkpoint v='30'/>
12:29:00:
12:29:00:  <!-- Folding Slot Configuration -->
12:29:00:  <cause v='COVID_19'/>
12:29:00:
12:29:00:  <!-- Network -->
12:29:00:  <proxy v=':8080'/>
12:29:00:
12:29:00:  <!-- Slot Control -->
12:29:00:  <pause-on-start v='true'/>
12:29:00:
12:29:00:  <!-- User Information -->
12:29:00:  <passkey v='*****'/>
12:29:00:  <user v='STFC9F22'/>
12:29:00:
12:29:00:  <!-- Work Unit Control -->
12:29:00:  <next-unit-percentage v='100'/>
12:29:00:
12:29:00:  <!-- Folding Slots -->
12:29:00:  <slot id='1' type='GPU'>
12:29:00:    <paused v='true'/>
12:29:00:  </slot>
12:29:00:  <slot id='0' type='CPU'>
12:29:00:    <cpus v='4'/>
12:29:00:  </slot>
12:29:00:</config>
For what it's worth:

The only change to my machine, made immediately prior to these two failures, was an upgrade of the client from 7.6.9 to 7.6.13.

Also, the log shows that after today's failure of 13404 (378, 16 ,0) I picked up 13405 (56,93,2), which progressed successfully to just over 70%. At that point, as an experiment, I paused it (the last action in the log above) to test whether there might be an issue resuming from checkpoint. On restarting the client I noticed that it resumed from 50%. Is it possible that the checkpoints for these Projects have inadvertently been set to 50% rather than 5%?
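
For anyone wanting to check the same thing in their own log, here is a minimal Python sketch that pulls the most recent "Completed N out of M steps" figure for each work unit and slot, so the last progress before a pause can be compared with the first progress reported after a restart. The regular expression is an assumption based only on the line format visible in the log above, and `last_progress` is a hypothetical helper name, not part of FAHClient.

```python
import re

# Pattern inferred from the log lines quoted in this post (an assumption,
# not an official FAHClient log specification).
PROGRESS = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(WU\d+):(FS\d+):0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps \((\d+)%\)"
)


def last_progress(lines):
    """Return {(work_unit, slot): percent} using the latest progress line seen."""
    latest = {}
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match:
            wu, slot, done, total, _pct = match.groups()
            # Recompute the percentage from the step counts instead of
            # trusting the rounded value printed in parentheses.
            latest[(wu, slot)] = 100.0 * int(done) / int(total)
    return latest


# Three lines copied from the log above; the "Paused" line is ignored.
sample = [
    "12:24:34:WU02:FS01:0x22:Completed 690000 out of 1000000 steps (69%)",
    "12:27:10:WU02:FS01:0x22:Completed 700000 out of 1000000 steps (70%)",
    "12:28:03:FS01:Paused",
]

print(last_progress(sample))
```

Running the same function over the log written after the restart would show the first percentage reported on resume, which is the figure to compare against.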
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: 13404 (378, 16 ,0) failed at 49%

Post by JohnChodera »

Thanks for the report! These projects represent a totally new workload for us (nonequilibrium free energy calculations to help prioritize compounds for the COVID Moonshot [https://covid.postera.ai/covid]), and we're still working out some of the issues with various parts of this workflow. It's almost certainly not an issue with your machine; some RUNs are just much more likely to fail. We'll improve this iteratively over the next few projects in this series.

The checkpoints were set at every 25% in order to have a maximum chance of recovery, since the risky nonequilibrium part starts at 25% and runs to 50%. Backing up to 25% is the only sensible way to recover at this point, though we're working on ways to improve that to allow much finer-grained checkpointing.
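
As a rough illustration of that arithmetic, the following Python sketch computes where a work unit would resume under the simplifying assumption that a checkpoint is written exactly at each multiple of the interval. The function name `resume_point` and this model are illustrative assumptions, not the actual Core22 logic.

```python
def resume_point(progress_pct, interval_pct=25):
    """Return the last checkpoint (in percent) at or below progress_pct.

    Assumes checkpoints fall exactly on multiples of interval_pct,
    which is a simplification of whatever the core really does.
    """
    if not 0 <= progress_pct <= 100:
        raise ValueError("progress_pct must be between 0 and 100")
    return int(progress_pct // interval_pct) * interval_pct


# A failure at 49% backs up to 25%; a pause at 70% resumes from 50%.
for pct in (49, 70):
    print(pct, "->", resume_point(pct))
```

Under this model, a pause at just over 70% resuming from 50% is the expected behaviour for a 25% interval, rather than a sign that the interval was set to 50%.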

Thanks so much for bearing with us! We're still getting a ton of usable data despite the failures, but will work to improve things for everyone.

~ John Chodera // MSKCC