Issues, perhaps bad WU? - 16417
Posted: Tue Apr 07, 2020 3:55 pm
I've been seeing failures recently. Could this be a bad WU? My system is stable and not overclocked, and I haven't seen this error before.
Code:
15:37:20:WU01:FS00:Starting
15:37:20:WU01:FS00:Removing old file './work/01/logfile_01-20200407-150655.txt'
15:37:20:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 705 -lifeline 2089 -checkpoint 15 -np 24
15:37:20:WU01:FS00:Started FahCore on PID 8993
15:37:20:WU01:FS00:Core PID:8997
15:37:20:WU01:FS00:FahCore 0xa7 started
15:37:21:WU01:FS00:0xa7:*********************** Log Started 2020-04-07T15:37:20Z ***********************
15:37:21:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
15:37:21:WU01:FS00:0xa7: Type: 0xa7
15:37:21:WU01:FS00:0xa7: Core: Gromacs
15:37:21:WU01:FS00:0xa7: Args: -dir 01 -suffix 01 -version 705 -lifeline 8993 -checkpoint 15 -np
15:37:21:WU01:FS00:0xa7: 24
15:37:21:WU01:FS00:0xa7:************************************ CBang *************************************
15:37:21:WU01:FS00:0xa7: Date: Nov 5 2019
15:37:21:WU01:FS00:0xa7: Time: 06:06:57
15:37:21:WU01:FS00:0xa7: Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:37:21:WU01:FS00:0xa7: Branch: master
15:37:21:WU01:FS00:0xa7: Compiler: GNU 8.3.0
15:37:21:WU01:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:37:21:WU01:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
15:37:21:WU01:FS00:0xa7: Bits: 64
15:37:21:WU01:FS00:0xa7: Mode: Release
15:37:21:WU01:FS00:0xa7:************************************ System ************************************
15:37:21:WU01:FS00:0xa7: CPU: AMD Ryzen 9 3900X 12-Core Processor
15:37:21:WU01:FS00:0xa7: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
15:37:21:WU01:FS00:0xa7: CPUs: 24
15:37:21:WU01:FS00:0xa7: Memory: 31.37GiB
15:37:21:WU01:FS00:0xa7:Free Memory: 9.22GiB
15:37:21:WU01:FS00:0xa7: Threads: POSIX_THREADS
15:37:21:WU01:FS00:0xa7: OS Version: 5.3
15:37:21:WU01:FS00:0xa7:Has Battery: false
15:37:21:WU01:FS00:0xa7: On Battery: false
15:37:21:WU01:FS00:0xa7: UTC Offset: 1
15:37:21:WU01:FS00:0xa7: PID: 8997
15:37:21:WU01:FS00:0xa7: CWD: /var/lib/fahclient/work
15:37:21:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
15:37:21:WU01:FS00:0xa7: Version: 0.0.18
15:37:21:WU01:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:37:21:WU01:FS00:0xa7: Copyright: 2019 foldingathome.org
15:37:21:WU01:FS00:0xa7: Homepage: https://foldingathome.org/
15:37:21:WU01:FS00:0xa7: Date: Nov 5 2019
15:37:21:WU01:FS00:0xa7: Time: 06:13:26
15:37:21:WU01:FS00:0xa7: Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:37:21:WU01:FS00:0xa7: Branch: master
15:37:21:WU01:FS00:0xa7: Compiler: GNU 8.3.0
15:37:21:WU01:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:37:21:WU01:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
15:37:21:WU01:FS00:0xa7: Bits: 64
15:37:21:WU01:FS00:0xa7: Mode: Release
15:37:21:WU01:FS00:0xa7:************************************ Build *************************************
15:37:21:WU01:FS00:0xa7: SIMD: avx_256
15:37:21:WU01:FS00:0xa7:********************************************************************************
15:37:21:WU01:FS00:0xa7:Project: 16417 (Run 1957, Clone 3, Gen 11)
15:37:21:WU01:FS00:0xa7:Unit: 0x0000000b96880e6e5e8a605c572322d6
15:37:21:WU01:FS00:0xa7:Reading tar file core.xml
15:37:21:WU01:FS00:0xa7:Reading tar file frame11.tpr
15:37:21:WU01:FS00:0xa7:Digital signatures verified
15:37:21:WU01:FS00:0xa7:Calling: mdrun -s frame11.tpr -o frame11.trr -x frame11.xtc -cpt 15 -nt 24
15:37:22:WU01:FS00:0xa7:Steps: first=2750000 total=250000
15:37:22:WU01:FS00:0xa7:ERROR:
15:37:22:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
15:37:22:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
15:37:22:WU01:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
15:37:22:WU01:FS00:0xa7:ERROR:
15:37:22:WU01:FS00:0xa7:ERROR:Fatal error:
15:37:22:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
15:37:22:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
15:37:22:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
15:37:22:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
15:37:22:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
15:37:22:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
15:37:27:WU01:FS00:0xa7:WARNING:Unexpected exit() call
15:37:27:WU01:FS00:0xa7:WARNING:Unexpected exit from science code
15:37:27:WU01:FS00:0xa7:Saving result file ../logfile_01.txt
15:37:27:WU01:FS00:0xa7:Saving result file md.log
15:37:27:WU01:FS00:0xa7:Saving result file science.log
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:27:WU01:FS00:0xa7:Caught signal SIGSEGV(11) on PID 8997
15:37:28:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
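For anyone reading the log, the fatal error says that GROMACS could not split the simulation box into 20 cells that are each at least 1.4227 nm wide. The core was launched with 24 threads but reports 20 ranks, presumably because some threads were set aside as separate PME ranks. Below is a rough Python sketch of that check, not the actual GROMACS algorithm: it assumes a rectangular box and ignores load-balancing margins. The box size is not in the log, so `BOX_NM` is a made-up value purely for illustration, and `feasible_grids` is a hypothetical helper name.

```python
from itertools import product
from math import floor

def feasible_grids(n_ranks, box_nm, min_cell_nm):
    """Return all (nx, ny, nz) with nx*ny*nz == n_ranks whose cells are
    at least min_cell_nm wide along every axis of a rectangular box."""
    # Largest number of cells that fit along each axis.
    max_cells = [floor(length / min_cell_nm) for length in box_nm]
    grids = []
    for nx, ny in product(range(1, max_cells[0] + 1),
                          range(1, max_cells[1] + 1)):
        if n_ranks % (nx * ny):
            continue  # nx*ny does not divide the rank count
        nz = n_ranks // (nx * ny)
        if nz <= max_cells[2]:
            grids.append((nx, ny, nz))
    return grids

MIN_CELL_NM = 1.4227      # taken from the error message
BOX_NM = (6.0, 6.0, 6.0)  # hypothetical; the real box size is not in the log

for ranks in (16, 20, 24):
    print(ranks, feasible_grids(ranks, BOX_NM, MIN_CELL_NM))
```

With this assumed box, at most 4 cells fit along each axis, so 20 ranks (which always needs a factor of 5 somewhere) has no valid grid, while 16 and 24 do. That is in line with the error's own suggestion to change the number of ranks, although which counts work for the real WU depends on its actual box.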