14246 - error with 15 cpu threads

Posted: Fri Dec 20, 2019 2:53 am
by squads

Code:

*********************** Log Started 2019-12-19T12:05:59Z ***********************
************************** Gromacs Folding@home Core ***************************
       Type: 0xa7
       Core: Gromacs
       Args: -dir 02 -suffix 01 -version 705 -lifeline 17602 -checkpoint 15 -np
             15
************************************ CBang *************************************
       Date: Nov 5 2019
       Time: 06:06:57
   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
     Branch: master
   Compiler: GNU 8.3.0
    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
   Platform: linux2 4.19.0-5-amd64
       Bits: 64
       Mode: Release
************************************ System ************************************
        CPU: AMD Ryzen 7 3700X 8-Core Processor
     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
       CPUs: 16
     Memory: 31.34GiB
Free Memory: 28.05GiB
    Threads: POSIX_THREADS
 OS Version: 5.0
Has Battery: false
 On Battery: false
 UTC Offset: -5
        PID: 17606
        CWD: /var/lib/fahclient/work
******************************** Build - libFAH ********************************
    Version: 0.0.18
     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
  Copyright: 2019 foldingathome.org
   Homepage: https://foldingathome.org/
       Date: Nov 5 2019
       Time: 06:13:26
   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
     Branch: master
   Compiler: GNU 8.3.0
    Options: -std=c++11 -O3 -funroll-loops -fno-pie
   Platform: linux2 4.19.0-5-amd64
       Bits: 64
       Mode: Release
************************************ Build *************************************
       SIMD: avx_256
********************************************************************************
Project: 14246 (Run 0, Clone 71, Gen 229)
Unit: 0x0000014c80fccb0a5d6fe21fc23c8136
Reading tar file core.xml
Reading tar file frame229.tpr
Digital signatures verified
Calling: mdrun -s frame229.tpr -o frame229.trr -x frame229.xtc -cpt 15 -nt 15
Steps: first=57250000 total=250000
ERROR:
ERROR:-------------------------------------------------------
ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
ERROR:
ERROR:Fatal error:
ERROR:There is no domain decomposition for 15 ranks that is compatible with the given box and a minimum cell size of 1.45733 nm
ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
ERROR:Look in the log file for details on the domain decomposition
ERROR:For more information and tips for troubleshooting, please check the GROMACS
ERROR:website at http://www.gromacs.org/Documentation/Errors
ERROR:-------------------------------------------------------
WARNING:Unexpected exit() call
WARNING:Unexpected exit from science code
Saving result file ../logfile_01.txt
Saving result file md.log
Saving result file science.log
This happens every time I receive a 14246 WU. I haven't had issues with any other type of work using this CPU with 15 threads. If I am understanding the GROMACS documentation correctly, this WU isn't large enough to be broken up into 15 pieces.
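
A rough sketch of the check behind that error message may help. It assumes a simplified model in which the box is cut into an nx x ny x nz grid with one cell per rank, and every cell must be at least the minimum cell size along each axis. The 6.0 nm cubic box is a made-up value for illustration (the real box size is not in the log), `feasible_grids` is a hypothetical helper, and the real GROMACS logic considers more than this (for example separate PME ranks).

```python
from itertools import product


def feasible_grids(ranks, box_nm, min_cell_nm):
    """Return all (nx, ny, nz) grids with nx * ny * nz == ranks whose
    cells are at least min_cell_nm long along every box axis."""
    # Most cells that fit along each axis without dropping below the minimum.
    max_cells = [int(length // min_cell_nm) for length in box_nm]
    grids = []
    for nx, ny in product(range(1, max_cells[0] + 1),
                          range(1, max_cells[1] + 1)):
        if ranks % (nx * ny) != 0:
            continue
        nz = ranks // (nx * ny)
        if nz <= max_cells[2]:
            grids.append((nx, ny, nz))
    return grids


# Hypothetical box; only the minimum cell size comes from the log above.
BOX_NM = (6.0, 6.0, 6.0)
MIN_CELL_NM = 1.45733

for ranks in (12, 14, 15, 16):
    grids = feasible_grids(ranks, BOX_NM, MIN_CELL_NM)
    print(ranks, "ranks:", grids if grids else "no valid decomposition")
```

With this assumed box at most 4 cells fit along any axis, so 15 = 3 x 5 has no valid grid, while 12 and 16 do.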

Re: 14246 - error with 15 cpu threads

Posted: Tue Feb 25, 2020 8:15 pm
by Frontiers
You can try updating both the BIOS and the OS to a later kernel version. At launch, Zen 2 had some problems with Linux, which were fixed by later BIOS updates and newer Linux versions. As far as I remember, Linux kernel 5.0 predates Zen 2.

Re: 14246 - error with 15 cpu threads

Posted: Tue Feb 25, 2020 10:49 pm
by JimboPalmer
You have 16 threads and, I think, a GPU, which reserves one thread for its own use, so the client is trying to use 15.
F@H CPU cores hate 'large' prime numbers and multiples of 'large' prime numbers. 7 is always 'large', but 5 is mostly not 'large', except occasionally, like today, for you. To the best of my knowledge, 3 is never 'large'.
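
To see which thread counts fall under that rule of thumb, you can list the prime factors of each count. This is only a sketch of the rule as described in this post; `prime_factors` is a hypothetical helper, not anything the client exposes.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        factors.append(n)
    return factors


for threads in (12, 14, 15, 16):
    print(threads, "=", " x ".join(str(p) for p in prime_factors(threads)))
```

15 contains a 5 and 14 contains a 7, while 12 and 16 contain only 2s and 3s.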

I can't see what version of the client you are using but 7.5.1 is supposed to try to adjust automatically.

Setting the CPU slot to 12 threads would work, but it would leave more threads unused. You could add a second CPU slot with 3 threads so that all of them stay in use.
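
The 12 + 3 split can be worked out mechanically. The sketch below assumes that only 2 and 3 are 'safe' prime factors (treating 5 as 'large', as it was for this WU); `split_threads` and its `safe_primes` parameter are hypothetical names, and this is not how the client itself chooses thread counts.

```python
def has_only_safe_factors(n, safe_primes=(2, 3)):
    """True if n (>= 1) has no prime factor outside safe_primes."""
    for p in safe_primes:
        while n % p == 0:
            n //= p
    return n == 1


def split_threads(total, safe_primes=(2, 3)):
    """Split total threads into (first_slot, second_slot), making the
    first slot as large as possible while both avoid 'large' primes.
    A second slot of 0 means no split is needed."""
    for first in range(total, 0, -1):
        second = total - first
        if not has_only_safe_factors(first, safe_primes):
            continue
        if second == 0 or has_only_safe_factors(second, safe_primes):
            return first, second
    return None


print(split_threads(15))  # (12, 3)
```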

If you are not folding on a GPU, then you may have other configuration issues. Posting the first 200 lines of the log tells us a great deal about your PC and its configuration.

viewtopic.php?f=24&t=26036

Re: 14246 - error with 15 cpu threads

Posted: Tue Feb 25, 2020 11:59 pm
by MeeLee
Then use 14 threads. You should always reserve 1 core for the OS anyway, and 1 for feeding the remaining CPU cores (especially true once you hit more than 15 threads).

Re: 14246 - error with 15 cpu threads

Posted: Wed Feb 26, 2020 12:02 am
by Joe_H
MeeLee wrote: Then use 14 threads. You should always reserve 1 core for the OS anyway, and 1 for feeding the remaining CPU cores (especially true once you hit more than 15 threads)
14 is a multiple of the prime number 7, so it will never be used fully. At best, the AS will assign WUs that use 12 of those 14 threads.