your system is not well equilibrated

Posted: Thu May 07, 2020 9:00 pm
by haf1646
I can't get past one failing work unit on one of my clients, which had been running great for weeks. I have tried stopping the Linux service and running FAHClient --dump 00, and even restarting the entire machine, but I still get an error in the log (pasted below). Any ideas how to get past this?

It is a pretty simple Linux Mint 19.3 Tricia install on an older Dell Inspiron 530 with no extra GPUs or anything.

Thanks in advance

*********************** Log Started 2020-05-07T20:42:54Z ***********************
20:42:54:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
20:42:54:WU00:FS00:0xa7:       Type: 0xa7
20:42:54:WU00:FS00:0xa7:       Core: Gromacs
20:42:54:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 3550 -checkpoint 15 -np 4
20:42:54:WU00:FS00:0xa7:************************************ CBang *************************************
20:42:54:WU00:FS00:0xa7:       Date: Nov 5 2019
20:42:54:WU00:FS00:0xa7:       Time: 05:57:01
20:42:54:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
20:42:54:WU00:FS00:0xa7:     Branch: master
20:42:54:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
20:42:54:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
20:42:54:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
20:42:54:WU00:FS00:0xa7:       Bits: 64
20:42:54:WU00:FS00:0xa7:       Mode: Release
20:42:54:WU00:FS00:0xa7:************************************ System ************************************
20:42:54:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
20:42:54:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 11
20:42:54:WU00:FS00:0xa7:       CPUs: 4
20:42:54:WU00:FS00:0xa7:     Memory: 3.83GiB
20:42:54:WU00:FS00:0xa7:Free Memory: 1.21GiB
20:42:54:WU00:FS00:0xa7:    Threads: POSIX_THREADS
20:42:54:WU00:FS00:0xa7: OS Version: 5.3
20:42:54:WU00:FS00:0xa7:Has Battery: false
20:42:54:WU00:FS00:0xa7: On Battery: false
20:42:54:WU00:FS00:0xa7: UTC Offset: -4
20:42:54:WU00:FS00:0xa7:        PID: 3554
20:42:54:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
20:42:54:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
20:42:54:WU00:FS00:0xa7:    Version: 0.0.18
20:42:54:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:42:54:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
20:42:54:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
20:42:54:WU00:FS00:0xa7:       Date: Nov 5 2019
20:42:54:WU00:FS00:0xa7:       Time: 06:13:26
20:42:54:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
20:42:54:WU00:FS00:0xa7:     Branch: master
20:42:54:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
20:42:54:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
20:42:54:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
20:42:54:WU00:FS00:0xa7:       Bits: 64
20:42:54:WU00:FS00:0xa7:       Mode: Release
20:42:54:WU00:FS00:0xa7:************************************ Build *************************************
20:42:54:WU00:FS00:0xa7:       SIMD: sse2
20:42:54:WU00:FS00:0xa7:********************************************************************************
20:42:54:WU00:FS00:0xa7:Project: 14800 (Run 643, Clone 1, Gen 0)
20:42:54:WU00:FS00:0xa7:Unit: 0x000000000002894b5eaa15f52926aac3
20:42:54:WU00:FS00:0xa7:Digital signatures verified
20:42:54:WU00:FS00:0xa7:Calling: mdrun -s frame0.tpr -o frame0.trr -cpt 15 -nt 4
20:42:54:WU00:FS00:0xa7:Steps: first=0 total=250000
20:42:56:Removing old file 'configs/config-20200430-000957.xml'
20:42:56:Saving configuration to /etc/fahclient/config.xml
20:42:56:<config>
20:42:56:  <!-- Client Control -->
20:42:56:  <fold-anon v='true'/>
20:42:56:
20:42:56:  <!-- Folding Slot Configuration -->
20:42:56:  <gpu v='false'/>
20:42:56:
20:42:56:  <!-- Network -->
20:42:56:  <proxy v=':8080'/>
20:42:56:
20:42:56:  <!-- Remote Command Server -->
20:42:56:  <password v='************'/>
20:42:56:
20:42:56:  <!-- Slot Control -->
20:42:56:  <power v='full'/>
20:42:56:
20:42:56:  <!-- User Information -->
20:42:56:  <passkey v='********************************'/>
20:42:56:  <team v='44851'/>
20:42:56:  <user v='synnoack'/>
20:42:56:
20:42:56:  <!-- Folding Slots -->
20:42:56:  <slot id='0' type='CPU'/>
20:42:56:</config>
20:42:57:WU00:FS00:0xa7:Completed 1 out of 250000 steps (0%)
20:42:57:WU00:FS00:0xa7:ERROR:
20:42:57:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
20:42:57:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
20:42:57:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-sse-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
20:42:57:WU00:FS00:0xa7:ERROR:
20:42:57:WU00:FS00:0xa7:ERROR:Fatal error:
20:42:57:WU00:FS00:0xa7:ERROR:7 particles communicated to PME rank 2 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
20:42:57:WU00:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
20:42:57:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
20:42:57:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
20:42:57:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
20:43:03:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
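
For anyone triaging a similar log, here is a minimal sketch that pulls out just the core's ERROR lines and the final FahCore status. The function name `summarize_failure` and the line pattern are assumptions based only on the layout of the log above, not on any documented FAHClient log format, so other client versions may need a different pattern.

```python
import re

# Assumed line layout, inferred from the log above:
#   HH:MM:SS:WUnn:FSnn:[0xNN:]message
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(?:0x[0-9a-fA-F]+:)?(.*)$"
)


def summarize_failure(log_text):
    """Return (error_lines, fahcore_status) found in a pasted client log."""
    errors = []
    status = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # client-level lines such as the config dump
        message = match.group(4)
        if message.startswith("ERROR:"):
            text = message[len("ERROR:"):].strip()
            # Skip empty lines and the dashed separator lines.
            if text and not set(text) <= {"-"}:
                errors.append(text)
        elif message.startswith("FahCore returned:"):
            status = message[len("FahCore returned:"):].strip()
    return errors, status
```

Fed the log above, this would keep the "Fatal error:" block (including the "not well equilibrated" line) and the INTERRUPTED status, and drop the build banner and config dump.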

Re: your system is not well equilibrated

Posted: Thu May 07, 2020 9:32 pm
by HaloJones
The work is stored in /var/lib/fahclient/work.

Stop the client, then go to that directory and delete the directory 00 that you will find there.

Restart the client.
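
The delete step above can be sketched as a small script. This is only an illustration: the helper name `remove_work_unit` and its `dry_run` flag are hypothetical, and it assumes the client has already been stopped and that the script runs with enough permissions to modify /var/lib/fahclient/work (the CWD shown in the log).

```python
import shutil
from pathlib import Path


def remove_work_unit(work_root, slot_dir="00", dry_run=True):
    """Delete one work-unit directory directly under work_root.

    Returns True if the directory exists (and was removed unless dry_run),
    False if there is nothing to delete.
    """
    root = Path(work_root).resolve()
    target = (root / slot_dir).resolve()
    # Refuse anything that is not a direct child of the work root,
    # so a typo such as "../" cannot delete unrelated data.
    if target.parent != root:
        raise ValueError(f"{slot_dir!r} is not directly inside {root}")
    if not target.is_dir():
        return False
    if not dry_run:
        shutil.rmtree(target)
    return True


if __name__ == "__main__":
    # Dry run by default: reports whether the directory would be removed.
    print(remove_work_unit("/var/lib/fahclient/work", "00", dry_run=True))
```

With `dry_run=True` nothing is deleted; switch it to `False` only after the client is stopped, then restart the client so it fetches a fresh work unit.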

Re: your system is not well equilibrated

Posted: Thu May 07, 2020 11:54 pm
by haf1646
Perfect! Fixed. Thanks

Re: your system is not well equilibrated

Posted: Fri May 08, 2020 10:29 am
by vvoelz
Sorry you got a bad WU, haf1646! Thanks for reporting this. Each of these RUNs contains a different ligand, and it appears that a rare few of them are not completely equilibrated. I've STOPPED all further WUs from p14800/RUN643.