
Fedora 32 issue with FAH Control / GROMACS

Posted: Sun Jun 14, 2020 5:42 pm
by Kirito309
HELP!

Running Fedora 32. FAH had been working well, then FAH Control disappeared after an update.
I was able to get FAH Control back (a Python 2 vs. 3 issue), but now the log below repeats about once a minute
on my machine. I was turning 72,000 PPD and now I am running 32. At this point I am at the limit
of my understanding of software. Any help would be appreciated.

Project is 14524 (Run 333, Clone 5, Gen 22)

Thank you! Kirito ( I broke all the hyperlinks in this )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Log File Below~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


15:59:43:WU00:FS00:Starting
15:59:43:WU00:FS00:Removing old file 'work/00/logfile_01-20200612-231159.txt'
15:59:43:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome-org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 706 -lifeline 1528 -checkpoint 15 -np 24
15:59:43:WU00:FS00:Started FahCore on PID 8562
15:59:43:WU00:FS00:Core PID:8566
15:59:43:WU00:FS00:FahCore 0xa7 started
15:59:44:WU00:FS00:0xa7:*********************** Log Started 2020-06-14T15:59:43Z ***********************
15:59:44:WU00:FS00:0xa7:************************** Gromacs Folding @ home Core ***************************
15:59:44:WU00:FS00:0xa7:       Type: 0xa7
15:59:44:WU00:FS00:0xa7:       Core: Gromacs
15:59:44:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 8562 -checkpoint 15 -np
15:59:44:WU00:FS00:0xa7:             24
15:59:44:WU00:FS00:0xa7:************************************ CBang *************************************
15:59:44:WU00:FS00:0xa7:       Date: Nov 5 2019
15:59:44:WU00:FS00:0xa7:       Time: 06:06:57
15:59:44:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:59:44:WU00:FS00:0xa7:     Branch: master
15:59:44:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
15:59:44:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:59:44:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:59:44:WU00:FS00:0xa7:       Bits: 64
15:59:44:WU00:FS00:0xa7:       Mode: Release
15:59:44:WU00:FS00:0xa7:************************************ System ************************************
15:59:44:WU00:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
15:59:44:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 62 Stepping 4
15:59:44:WU00:FS00:0xa7:       CPUs: 24
15:59:44:WU00:FS00:0xa7:     Memory: 62.72GiB
15:59:44:WU00:FS00:0xa7:Free Memory: 58.75GiB
15:59:44:WU00:FS00:0xa7:    Threads: POSIX_THREADS
15:59:44:WU00:FS00:0xa7: OS Version: 5.6
15:59:44:WU00:FS00:0xa7:Has Battery: false
15:59:44:WU00:FS00:0xa7: On Battery: false
15:59:44:WU00:FS00:0xa7: UTC Offset: -4
15:59:44:WU00:FS00:0xa7:        PID: 8566
15:59:44:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
15:59:44:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
15:59:44:WU00:FS00:0xa7:    Version: 0.0.18
15:59:44:WU00:FS00:0xa7:     Author: 
15:59:44:WU00:FS00:0xa7:  Copyright:
15:59:44:WU00:FS00:0xa7:   Homepage: 
15:59:44:WU00:FS00:0xa7:       Date: Nov 5 2019
15:59:44:WU00:FS00:0xa7:       Time: 06:13:26
15:59:44:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:59:44:WU00:FS00:0xa7:     Branch: master
15:59:44:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
15:59:44:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:59:44:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:59:44:WU00:FS00:0xa7:       Bits: 64
15:59:44:WU00:FS00:0xa7:       Mode: Release
15:59:44:WU00:FS00:0xa7:************************************ Build *************************************
15:59:44:WU00:FS00:0xa7:       SIMD: avx_256
15:59:44:WU00:FS00:0xa7:********************************************************************************
15:59:44:WU00:FS00:0xa7:Project: 14524 (Run 333, Clone 5, Gen 22)
15:59:44:WU00:FS00:0xa7:Unit: 0x0000002580fccb0a5e781c0de4bfe2da
15:59:44:WU00:FS00:0xa7:Reading tar file core.xml
15:59:44:WU00:FS00:0xa7:Reading tar file frame22.tpr
15:59:44:WU00:FS00:0xa7:Digital signatures verified
15:59:44:WU00:FS00:0xa7:Calling: mdrun -s frame22.tpr -o frame22.trr -x frame22.xtc -cpt 15 -nt 24
15:59:44:WU00:FS00:0xa7:Steps: first=5500000 total=250000
15:59:44:WU00:FS00:0xa7:ERROR:
15:59:44:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
15:59:44:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
15:59:44:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
15:59:44:WU00:FS00:0xa7:ERROR:
15:59:44:WU00:FS00:0xa7:ERROR:Fatal error:
15:59:44:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
15:59:44:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
15:59:44:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
15:59:44:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
15:59:44:WU00:FS00:0xa7:ERROR:website  -gromacs-org / Documentation / Errors
15:59:44:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
15:59:49:WU00:FS00:0xa7:WARNING:Unexpected exit() call
15:59:49:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
15:59:49:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
15:59:49:WU00:FS00:0xa7:Saving result file md.log
15:59:49:WU00:FS00:0xa7:Saving result file science.log
15:59:49:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
16:00:34:35:127.0.0.1:New Web session
16:00:43:WU00:FS00:Starting
16:00:43:WU00:FS00:Removing old file 'work/00/logfile_01-20200612-231259.txt'
16:00:43:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome-org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 706 -lifeline 1528 -checkpoint 15 -np 24
16:00:43:WU00:FS00:Started FahCore on PID 8920
16:00:43:WU00:FS00:Core PID:8924
16:00:43:WU00:FS00:FahCore 0xa7 started
Mod Edit: Added Code Tags - PantherX

Re: Fedora 32 issue with FAH Control / GROMACS

Posted: Sun Jun 14, 2020 6:31 pm
by Joe_H
Try pausing, then setting the number of CPU threads to a lower number such as 18. Wait a minute to allow the setting to save, then restart. It may also work on 21 threads, and almost certainly will work on 16. There have been recurring problems with this project being assigned to CPU thread counts that it cannot handle, and the settings keep reverting.
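
To illustrate why some counts fail, here is a rough Python sketch of the check behind the "no domain decomposition" error. It is a simplified model, not GROMACS's actual algorithm: it assumes a rectangular box and only asks whether the ranks can be arranged as an nx x ny x nz grid with every cell edge at least the minimum cell size. The 1.4227 nm minimum comes from the log above; the 6 nm box is a made-up value for illustration, and the function names are hypothetical. Note that the log shows 24 threads but a decomposition over 20 ranks, so the rank count is not always the thread count (presumably some ranks are set aside for other work such as PME).

```python
def grid_candidates(ranks):
    """Return every ordered (nx, ny, nz) whose product equals ranks."""
    grids = []
    for nx in range(1, ranks + 1):
        if ranks % nx:
            continue
        rest = ranks // nx
        for ny in range(1, rest + 1):
            if rest % ny:
                continue
            grids.append((nx, ny, rest // ny))
    return grids


def feasible_grids(ranks, box_nm, min_cell_nm):
    """Keep only grids where each cell edge is at least min_cell_nm.

    Simplified model: rectangular box, uniform cells, no PME ranks.
    """
    return [
        grid
        for grid in grid_candidates(ranks)
        if all(length / n >= min_cell_nm for length, n in zip(box_nm, grid))
    ]


if __name__ == "__main__":
    box = (6.0, 6.0, 6.0)   # hypothetical box edges in nm, NOT from the log
    min_cell = 1.4227       # minimum cell size reported in the log
    for ranks in (20, 18, 16, 12):
        print(ranks, feasible_grids(ranks, box, min_cell))
```

With this hypothetical box at most 4 cells fit along any edge, so 20 ranks (which needs a factor of 5 along one edge) has no valid grid, while 18, 16, and 12 each have several.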

Re: Fedora 32 issue with FAH Control / GROMACS

Posted: Sun Jun 14, 2020 11:16 pm
by Kirito309
Thank you!

I had to turn it down to medium speed at 12 of 24 cores to get it to keep running.
Once this job finishes (tomorrow sometime) hopefully I can go back to full speed.

Thank you folks for helping me out.

Kirito

Re: Fedora 32 issue with FAH Control / GROMACS

Posted: Tue Jun 16, 2020 3:16 am
by bruce
There will be times when what you're calling "full speed" will be difficult to fulfill. You may find that long-term you'd be better off with two slots of 12, or one of 16 and one of 8 threads. The slider was developed when most CPUs had 16 or fewer threads and it has not been updated.
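
As a rough aid for picking such splits, the Python sketch below lists ways to divide a thread total into two slots where both counts have only small prime factors. The "largest prime factor of 3 or less" rule is an assumption used here as a rule of thumb, not something the client or GROMACS guarantees, and the function names are hypothetical.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 for n == 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Whatever remains above 1 is itself a prime factor.
    return n if n > 1 else largest


def two_slot_splits(total, max_prime=3):
    """List (a, b) with a + b == total, a >= b, both free of large primes.

    max_prime=3 is an assumed rule of thumb, not a documented limit.
    """
    splits = []
    for a in range((total + 1) // 2, total):
        b = total - a
        if (largest_prime_factor(a) <= max_prime
                and largest_prime_factor(b) <= max_prime):
            splits.append((a, b))
    return splits


if __name__ == "__main__":
    print(two_slot_splits(24))
```

For 24 threads this yields 12+12, 16+8, and 18+6, which includes both splits suggested above.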