Project 13414: 2 Faulty WUs
Posted: Mon Jun 22, 2020 6:08 am
I've had 9 WUs from project 13414 in the last 2 or 3 days, and 2 of them, both in the last 24 hours, were faulty. Here are the relevant logs for both units:
12:06:07:WU01:FS01:Starting
12:06:07:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 1531 -checkpoint 15 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
12:06:07:WU01:FS01:Started FahCore on PID 19874
12:06:07:WU01:FS01:Core PID:19878
12:06:07:WU01:FS01:FahCore 0x22 started
12:06:07:WU01:FS01:0x22:*********************** Log Started 2020-06-21T12:06:07Z ***********************
12:06:07:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
12:06:07:WU01:FS01:0x22: Core: Core22
12:06:07:WU01:FS01:0x22: Type: 0x22
12:06:07:WU01:FS01:0x22: Version: 0.0.10
12:06:07:WU01:FS01:0x22: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
12:06:07:WU01:FS01:0x22: Copyright: 2020 foldingathome.org
12:06:07:WU01:FS01:0x22: Homepage: https://foldingathome.org/
12:06:07:WU01:FS01:0x22: Date: Jun 16 2020
12:06:07:WU01:FS01:0x22: Time: 15:55:31
12:06:07:WU01:FS01:0x22: Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
12:06:07:WU01:FS01:0x22: Branch: core22-0.0.10
12:06:07:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
12:06:07:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
12:06:07:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
12:06:07:WU01:FS01:0x22: Bits: 64
12:06:07:WU01:FS01:0x22: Mode: Release
12:06:07:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
12:06:07:WU01:FS01:0x22: <peastman@stanford.edu>
12:06:07:WU01:FS01:0x22: Args: -dir 01 -suffix 01 -version 706 -lifeline 19874 -checkpoint 15
12:06:07:WU01:FS01:0x22: -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
12:06:07:WU01:FS01:0x22:************************************ libFAH ************************************
12:06:07:WU01:FS01:0x22: Date: Jun 2 2020
12:06:07:WU01:FS01:0x22: Time: 00:07:31
12:06:07:WU01:FS01:0x22: Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
12:06:07:WU01:FS01:0x22: Branch: HEAD
12:06:07:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
12:06:07:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
12:06:07:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
12:06:07:WU01:FS01:0x22: Bits: 64
12:06:07:WU01:FS01:0x22: Mode: Release
12:06:07:WU01:FS01:0x22:************************************ CBang *************************************
12:06:07:WU01:FS01:0x22: Date: May 31 2020
12:06:07:WU01:FS01:0x22: Time: 20:16:34
12:06:07:WU01:FS01:0x22: Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
12:06:07:WU01:FS01:0x22: Branch: HEAD
12:06:07:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
12:06:07:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
12:06:07:WU01:FS01:0x22: -fPIC
12:06:07:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
12:06:07:WU01:FS01:0x22: Bits: 64
12:06:07:WU01:FS01:0x22: Mode: Release
12:06:07:WU01:FS01:0x22:************************************ System ************************************
12:06:07:WU01:FS01:0x22: CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
12:06:07:WU01:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
12:06:07:WU01:FS01:0x22: CPUs: 12
12:06:07:WU01:FS01:0x22: Memory: 31.13GiB
12:06:07:WU01:FS01:0x22:Free Memory: 12.19GiB
12:06:07:WU01:FS01:0x22: Threads: POSIX_THREADS
12:06:07:WU01:FS01:0x22: OS Version: 4.15
12:06:07:WU01:FS01:0x22:Has Battery: false
12:06:07:WU01:FS01:0x22: On Battery: false
12:06:07:WU01:FS01:0x22: UTC Offset: 1
12:06:07:WU01:FS01:0x22: PID: 19878
12:06:07:WU01:FS01:0x22: CWD: /var/lib/fahclient/work
12:06:07:WU01:FS01:0x22:********************************************************************************
12:06:07:WU01:FS01:0x22:Project: 13414 (Run 92, Clone 49, Gen 0)
12:06:07:WU01:FS01:0x22:Unit: 0x0000000412bc7d9a5eed8c71067ebec5
12:06:07:WU01:FS01:0x22:Digital signatures verified
12:06:07:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
12:06:07:WU01:FS01:0x22:Version 0.0.10
12:06:07:WU01:FS01:0x22: Checkpoint write interval: 50000 steps (5%) [20 total]
12:06:07:WU01:FS01:0x22: JSON viewer frame write interval: 10000 steps (1%) [100 total]
12:06:07:WU01:FS01:0x22: XTC frame write interval: 250000 steps (25%) [4 total]
12:06:07:WU01:FS01:0x22: Global context and integrator variables write interval: 250 steps (0.025%) [4000 total]
12:06:11:WU01:FS01:0x22:Completed 0 out of 1000000 steps (0%)
12:06:34:WU01:FS01:0x22:An exception occurred at step 501: Particle coordinate is nan
12:06:34:WU01:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
12:06:34:WU01:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
12:06:34:WU01:FS01:0x22:Saving result file ../logfile_01.txt
12:06:34:WU01:FS01:0x22:Saving result file globals.csv
12:06:34:WU01:FS01:0x22:Saving result file science.log
12:06:34:WU01:FS01:0x22:Saving result file state.xml
12:06:35:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
12:06:36:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
12:06:36:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13414 run:92 clone:49 gen:0 core:0x22 unit:0x0000000412bc7d9a5eed8c71067ebec5
Second faulty WU (Run 141, Clone 69, Gen 0):
01:09:30:WU01:FS01:Starting
01:09:30:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 1531 -checkpoint 15 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
01:09:30:WU01:FS01:Started FahCore on PID 9691
01:09:30:WU01:FS01:Core PID:9695
01:09:30:WU01:FS01:FahCore 0x22 started
01:09:31:WU01:FS01:0x22:*********************** Log Started 2020-06-22T01:09:30Z ***********************
01:09:31:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
01:09:31:WU01:FS01:0x22: Core: Core22
01:09:31:WU01:FS01:0x22: Type: 0x22
01:09:31:WU01:FS01:0x22: Version: 0.0.10
01:09:31:WU01:FS01:0x22: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:09:31:WU01:FS01:0x22: Copyright: 2020 foldingathome.org
01:09:31:WU01:FS01:0x22: Homepage: https://foldingathome.org/
01:09:31:WU01:FS01:0x22: Date: Jun 16 2020
01:09:31:WU01:FS01:0x22: Time: 15:55:31
01:09:31:WU01:FS01:0x22: Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
01:09:31:WU01:FS01:0x22: Branch: core22-0.0.10
01:09:31:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
01:09:31:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
01:09:31:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
01:09:31:WU01:FS01:0x22: Bits: 64
01:09:31:WU01:FS01:0x22: Mode: Release
01:09:31:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
01:09:31:WU01:FS01:0x22: <peastman@stanford.edu>
01:09:31:WU01:FS01:0x22: Args: -dir 01 -suffix 01 -version 706 -lifeline 9691 -checkpoint 15
01:09:31:WU01:FS01:0x22: -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
01:09:31:WU01:FS01:0x22:************************************ libFAH ************************************
01:09:31:WU01:FS01:0x22: Date: Jun 2 2020
01:09:31:WU01:FS01:0x22: Time: 00:07:31
01:09:31:WU01:FS01:0x22: Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
01:09:31:WU01:FS01:0x22: Branch: HEAD
01:09:31:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
01:09:31:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
01:09:31:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
01:09:31:WU01:FS01:0x22: Bits: 64
01:09:31:WU01:FS01:0x22: Mode: Release
01:09:31:WU01:FS01:0x22:************************************ CBang *************************************
01:09:31:WU01:FS01:0x22: Date: May 31 2020
01:09:31:WU01:FS01:0x22: Time: 20:16:34
01:09:31:WU01:FS01:0x22: Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
01:09:31:WU01:FS01:0x22: Branch: HEAD
01:09:31:WU01:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
01:09:31:WU01:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
01:09:31:WU01:FS01:0x22: -fPIC
01:09:31:WU01:FS01:0x22: Platform: linux2 4.19.76-linuxkit
01:09:31:WU01:FS01:0x22: Bits: 64
01:09:31:WU01:FS01:0x22: Mode: Release
01:09:31:WU01:FS01:0x22:************************************ System ************************************
01:09:31:WU01:FS01:0x22: CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
01:09:31:WU01:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
01:09:31:WU01:FS01:0x22: CPUs: 12
01:09:31:WU01:FS01:0x22: Memory: 31.13GiB
01:09:31:WU01:FS01:0x22:Free Memory: 15.40GiB
01:09:31:WU01:FS01:0x22: Threads: POSIX_THREADS
01:09:31:WU01:FS01:0x22: OS Version: 4.15
01:09:31:WU01:FS01:0x22:Has Battery: false
01:09:31:WU01:FS01:0x22: On Battery: false
01:09:31:WU01:FS01:0x22: UTC Offset: 1
01:09:31:WU01:FS01:0x22: PID: 9695
01:09:31:WU01:FS01:0x22: CWD: /var/lib/fahclient/work
01:09:31:WU01:FS01:0x22:********************************************************************************
01:09:31:WU01:FS01:0x22:Project: 13414 (Run 141, Clone 69, Gen 0)
01:09:31:WU01:FS01:0x22:Unit: 0x0000000112bc7d9a5eed8c6fefe67d54
01:09:31:WU01:FS01:0x22:Digital signatures verified
01:09:31:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
01:09:31:WU01:FS01:0x22:Version 0.0.10
01:09:31:WU01:FS01:0x22: Checkpoint write interval: 50000 steps (5%) [20 total]
01:09:31:WU01:FS01:0x22: JSON viewer frame write interval: 10000 steps (1%) [100 total]
01:09:31:WU01:FS01:0x22: XTC frame write interval: 250000 steps (25%) [4 total]
01:09:31:WU01:FS01:0x22: Global context and integrator variables write interval: 250 steps (0.025%) [4000 total]
01:09:35:WU01:FS01:0x22:Completed 0 out of 1000000 steps (0%)
01:10:13:WU01:FS01:0x22:An exception occurred at step 2258: Particle coordinate is nan
01:10:13:WU01:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
01:10:13:WU01:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
01:10:13:WU01:FS01:0x22:Saving result file ../logfile_01.txt
01:10:13:WU01:FS01:0x22:Saving result file globals.csv
01:10:13:WU01:FS01:0x22:Saving result file science.log
01:10:13:WU01:FS01:0x22:Saving result file state.xml
01:10:14:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
01:10:15:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:10:15:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13414 run:141 clone:69 gen:0 core:0x22 unit:0x0000000112bc7d9a5eed8c6fefe67d54
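
In case it helps anyone checking their own logs for the same problem, here is a minimal Python sketch that pulls the project/run/clone/gen of every unit reported as FAULTY. It assumes the "Sending unit results" line format shown in the logs above; the function name faulty_units and the regular expression are my own illustration, not part of FAHClient, and other client versions may format the line differently.

```python
import re

# Pattern for the "Sending unit results" line, based only on the two
# examples in this post (assumption: the field order is always the same).
RESULT_RE = re.compile(
    r"Sending unit results:.*?error:(?P<error>\w+)\s+"
    r"project:(?P<project>\d+)\s+run:(?P<run>\d+)\s+"
    r"clone:(?P<clone>\d+)\s+gen:(?P<gen>\d+)"
)


def faulty_units(lines):
    """Return (project, run, clone, gen) tuples for units sent back as FAULTY."""
    found = []
    for line in lines:
        match = RESULT_RE.search(line)
        if match and match.group("error") == "FAULTY":
            found.append(
                tuple(int(match.group(k)) for k in ("project", "run", "clone", "gen"))
            )
    return found


# The two result lines from the logs above.
sample = [
    "12:06:36:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY "
    "project:13414 run:92 clone:49 gen:0 core:0x22 "
    "unit:0x0000000412bc7d9a5eed8c71067ebec5",
    "01:10:15:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY "
    "project:13414 run:141 clone:69 gen:0 core:0x22 "
    "unit:0x0000000112bc7d9a5eed8c6fefe67d54",
]

print(faulty_units(sample))
# → [(13414, 92, 49, 0), (13414, 141, 69, 0)]
```

To run it against a real log, pass the open file instead of the sample list, e.g. faulty_units(open(path)) with path pointing at your own log file.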