There is no domain decomposition for 20 ranks
Posted: Tue Apr 21, 2020 1:24 pm
I keep getting WUs that won't fold on 24 cores, so they just sit there doing nothing all night while I am asleep. How can I avoid these?
13:22:48:WU00:FS00:0xa7:*********************** Log Started 2020-04-21T13:22:47Z ***********************
13:22:48:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
13:22:48:WU00:FS00:0xa7: Type: 0xa7
13:22:48:WU00:FS00:0xa7: Core: Gromacs
13:22:48:WU00:FS00:0xa7: Args: -dir 00 -suffix 01 -version 706 -lifeline 129135 -checkpoint 5 -np
13:22:48:WU00:FS00:0xa7: 24
13:22:48:WU00:FS00:0xa7:************************************ CBang *************************************
13:22:48:WU00:FS00:0xa7: Date: Nov 5 2019
13:22:48:WU00:FS00:0xa7: Time: 06:06:57
13:22:48:WU00:FS00:0xa7: Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
13:22:48:WU00:FS00:0xa7: Branch: master
13:22:48:WU00:FS00:0xa7: Compiler: GNU 8.3.0
13:22:48:WU00:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
13:22:48:WU00:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
13:22:48:WU00:FS00:0xa7: Bits: 64
13:22:48:WU00:FS00:0xa7: Mode: Release
13:22:48:WU00:FS00:0xa7:************************************ System ************************************
13:22:48:WU00:FS00:0xa7: CPU: AMD Ryzen Threadripper 1950X 16-Core Processor
13:22:48:WU00:FS00:0xa7: CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
13:22:48:WU00:FS00:0xa7: CPUs: 32
13:22:48:WU00:FS00:0xa7: Memory: 15.55GiB
13:22:48:WU00:FS00:0xa7:Free Memory: 500.07MiB
13:22:48:WU00:FS00:0xa7: Threads: POSIX_THREADS
13:22:48:WU00:FS00:0xa7: OS Version: 5.5
13:22:48:WU00:FS00:0xa7:Has Battery: false
13:22:48:WU00:FS00:0xa7: On Battery: false
13:22:48:WU00:FS00:0xa7: UTC Offset: -5
13:22:48:WU00:FS00:0xa7: PID: 129139
13:22:48:WU00:FS00:0xa7: CWD: /var/lib/fahclient/work
13:22:48:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
13:22:48:WU00:FS00:0xa7: Version: 0.0.18
13:22:48:WU00:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:22:48:WU00:FS00:0xa7: Copyright: 2019 foldingathome.org
13:22:48:WU00:FS00:0xa7: Homepage: https://foldingathome.org/
13:22:48:WU00:FS00:0xa7: Date: Nov 5 2019
13:22:48:WU00:FS00:0xa7: Time: 06:13:26
13:22:48:WU00:FS00:0xa7: Revision: 490c9aa2957b725af319379424d5c5cb36efb656
13:22:48:WU00:FS00:0xa7: Branch: master
13:22:48:WU00:FS00:0xa7: Compiler: GNU 8.3.0
13:22:48:WU00:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie
13:22:48:WU00:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
13:22:48:WU00:FS00:0xa7: Bits: 64
13:22:48:WU00:FS00:0xa7: Mode: Release
13:22:48:WU00:FS00:0xa7:************************************ Build *************************************
13:22:48:WU00:FS00:0xa7: SIMD: avx_256
13:22:48:WU00:FS00:0xa7:********************************************************************************
13:22:48:WU00:FS00:0xa7:Project: 14576 (Run 0, Clone 770, Gen 67)
13:22:48:WU00:FS00:0xa7:Unit: 0x00000053287234c95e792335830607aa
13:22:48:WU00:FS00:0xa7:Reading tar file core.xml
13:22:48:WU00:FS00:0xa7:Reading tar file frame67.tpr
13:22:48:WU00:FS00:0xa7:Digital signatures verified
13:22:48:WU00:FS00:0xa7:Calling: mdrun -s frame67.tpr -o frame67.trr -x frame67.xtc -cpt 5 -nt 24
13:22:48:WU00:FS00:0xa7:Steps: first=33500000 total=500000
13:22:48:WU00:FS00:0xa7:ERROR:
13:22:48:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
13:22:48:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
13:22:48:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
13:22:48:WU00:FS00:0xa7:ERROR:
13:22:48:WU00:FS00:0xa7:ERROR:Fatal error:
13:22:48:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
13:22:48:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
13:22:48:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
13:22:48:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
13:22:48:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
13:22:48:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
13:22:52:WU00:FS00:0xa7:WARNING:Unexpected exit() call
13:22:52:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
13:22:52:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
13:22:52:WU00:FS00:0xa7:Saving result file md.log
13:22:52:WU00:FS00:0xa7:Saving result file science.log
13:22:53:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
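
For anyone comparing their own logs: the two numbers that matter here are the thread count passed to mdrun (-nt 24) and the rank count GROMACS actually failed on (20). The log does not say why they differ. A common explanation is that some ranks are set aside for other work such as PME, but that is an assumption on my part. Below is a rough Python sketch that pulls both numbers out of a log like the one above and lists the prime factors of the rank count, since the error text itself suggests changing the number of ranks. The function names (summarize_dd_failure, prime_factors) are made up for this illustration and are not part of the FAH client or GROMACS, and the regular expressions are only known to match the exact wording in this log.

```python
import re

# These patterns match the two log lines shown in this post.
# Other core versions may word them differently (assumption).
NT_RE = re.compile(r"Calling: mdrun .*-nt (\d+)")
DD_RE = re.compile(
    r"There is no domain decomposition for (\d+) ranks .* "
    r"minimum cell size of ([\d.]+) nm"
)


def summarize_dd_failure(log_lines):
    """Return threads, ranks and minimum cell size for a failed WU, or None."""
    threads = ranks = min_cell_nm = None
    for line in log_lines:
        m = NT_RE.search(line)
        if m:
            threads = int(m.group(1))
        m = DD_RE.search(line)
        if m:
            ranks = int(m.group(1))
            min_cell_nm = float(m.group(2))
    if ranks is None:
        # No domain decomposition error found in these lines.
        return None
    return {"threads": threads, "ranks": ranks, "min_cell_nm": min_cell_nm}


def prime_factors(n):
    """Prime factors of n in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


if __name__ == "__main__":
    sample = [
        "13:22:48:WU00:FS00:0xa7:Calling: mdrun -s frame67.tpr -o frame67.trr "
        "-x frame67.xtc -cpt 5 -nt 24",
        "13:22:48:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 20 "
        "ranks that is compatible with the given box and a minimum cell size "
        "of 1.37225 nm",
    ]
    info = summarize_dd_failure(sample)
    print(info)
    print(prime_factors(info["ranks"]))
```

Running it over a whole log file (one line per list entry) gives a quick way to see which thread and rank counts were involved each time a WU dies with this error, which helps when trying a different thread count for the slot.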