Blowing up - 14369

Moderators: Site Moderators, FAHC Science Team

Jan
Posts: 79
Joined: Tue Mar 31, 2020 6:46 pm

Post by Jan »

Hi,

I have pretty much never had a problem with CPU projects. However, I have now hit this error several times in a row, I think always with project 14369:

Code: Select all

14:22:06:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:\Program Files (x86)\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe" -dir 02 -suffix 01 -version 705 -lifeline 9040 -checkpoint 15 -np 7
14:22:06:WU02:FS00:Started FahCore on PID 908
14:22:06:WU02:FS00:Core PID:11892
14:22:06:WU02:FS00:FahCore 0xa7 started
14:22:06:WU02:FS00:0xa7:*********************** Log Started 2020-04-02T14:22:06Z ***********************
14:22:06:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
14:22:06:WU02:FS00:0xa7:       Type: 0xa7
14:22:06:WU02:FS00:0xa7:       Core: Gromacs
14:22:06:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 705 -lifeline 908 -checkpoint 15 -np 7
14:22:06:WU02:FS00:0xa7:************************************ CBang *************************************
14:22:06:WU02:FS00:0xa7:       Date: Oct 26 2019
14:22:06:WU02:FS00:0xa7:       Time: 01:38:25
14:22:06:WU02:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
14:22:06:WU02:FS00:0xa7:     Branch: master
14:22:06:WU02:FS00:0xa7:   Compiler: Visual C++ 2008
14:22:06:WU02:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
14:22:06:WU02:FS00:0xa7:   Platform: win32 10
14:22:06:WU02:FS00:0xa7:       Bits: 64
14:22:06:WU02:FS00:0xa7:       Mode: Release
14:22:06:WU02:FS00:0xa7:************************************ System ************************************
14:22:06:WU02:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
14:22:06:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
14:22:06:WU02:FS00:0xa7:       CPUs: 8
14:22:06:WU02:FS00:0xa7:     Memory: 15.96GiB
14:22:06:WU02:FS00:0xa7:Free Memory: 9.59GiB
14:22:06:WU02:FS00:0xa7:    Threads: WINDOWS_THREADS
14:22:06:WU02:FS00:0xa7: OS Version: 6.2
14:22:06:WU02:FS00:0xa7:Has Battery: false
14:22:06:WU02:FS00:0xa7: On Battery: false
14:22:06:WU02:FS00:0xa7: UTC Offset: 2
14:22:06:WU02:FS00:0xa7:        PID: 11892
14:22:06:WU02:FS00:0xa7:        CWD: C:\Program Files (x86)\FAHClient\work
14:22:06:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
14:22:06:WU02:FS00:0xa7:    Version: 0.0.18
14:22:06:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:22:06:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
14:22:06:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
14:22:06:WU02:FS00:0xa7:       Date: Oct 26 2019
14:22:06:WU02:FS00:0xa7:       Time: 01:52:30
14:22:06:WU02:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
14:22:06:WU02:FS00:0xa7:     Branch: master
14:22:06:WU02:FS00:0xa7:   Compiler: Visual C++ 2008
14:22:06:WU02:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
14:22:06:WU02:FS00:0xa7:   Platform: win32 10
14:22:06:WU02:FS00:0xa7:       Bits: 64
14:22:06:WU02:FS00:0xa7:       Mode: Release
14:22:06:WU02:FS00:0xa7:************************************ Build *************************************
14:22:06:WU02:FS00:0xa7:       SIMD: avx_256
14:22:06:WU02:FS00:0xa7:********************************************************************************
14:22:06:WU02:FS00:0xa7:Project: 14369 (Run 357, Clone 0, Gen 0)
14:22:06:WU02:FS00:0xa7:Unit: 0x000000049bf7a4d55e81dcfc6421d95c
14:22:06:WU02:FS00:0xa7:Digital signatures verified
14:22:06:WU02:FS00:0xa7:Reducing thread count from 7 to 6 to avoid domain decomposition by a prime number > 3
14:22:06:WU02:FS00:0xa7:Calling: mdrun -s frame0.tpr -o frame0.trr -cpt 15 -nt 6
14:22:06:WU02:FS00:0xa7:Steps: first=0 total=2500000
14:22:06:WU02:FS00:0xa7:Completed 1 out of 2500000 steps (0%)
14:22:12:WU02:FS00:0xa7:ERROR:
14:22:12:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
14:22:12:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
14:22:12:WU02:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\pme.c, line: 754
14:22:12:WU02:FS00:0xa7:ERROR:
14:22:12:WU02:FS00:0xa7:ERROR:Fatal error:
14:22:12:WU02:FS00:0xa7:ERROR:1 particles communicated to PME rank 1 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
14:22:12:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
14:22:12:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
14:22:12:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
14:22:12:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
14:22:17:WU02:FS00:0xa7:ERROR:
14:22:17:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
14:22:17:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
14:22:17:WU02:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\pme.c, line: 754
14:22:17:WU02:FS00:0xa7:ERROR:
14:22:17:WU02:FS00:0xa7:ERROR:Fatal error:
14:22:17:WU02:FS00:0xa7:ERROR:2 particles communicated to PME rank 1 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
14:22:17:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
14:22:17:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
14:22:17:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
14:22:17:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
14:22:22:WARNING:WU02:FS00:FahCore returned an unknown error code which probably indicates that it crashed
14:22:22:WARNING:WU02:FS00:FahCore returned: UNKNOWN_ENUM (-1073741819 = 0xc0000005)
Please let me know if you need more information. As far as I understand the description at gromacs.org, the error is not on my end.
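For anyone reading the last log line: the negative number and the hex value are the same 32-bit exit code, shown once as signed and once as unsigned. A minimal sketch of that conversion follows; the function names and the one-entry lookup table are illustrative only and are not part of FAHClient. On Windows, 0xC0000005 is the NTSTATUS value for an access violation, which fits the client's guess that the core crashed after the GROMACS fatal error.

```python
# Sketch: reinterpret the signed exit code from the log as an unsigned
# 32-bit value and look it up in a small, deliberately incomplete table.
# Helper names here are hypothetical, not FAHClient APIs.

KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_unsigned_hex(exit_code: int) -> str:
    """Format a signed 32-bit exit code as unsigned hexadecimal."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


def describe(exit_code: int) -> str:
    """Return a name for the exit code if this table knows it."""
    return KNOWN_NTSTATUS.get(exit_code & 0xFFFFFFFF, "unknown")


if __name__ == "__main__":
    code = -1073741819  # value reported in the log above
    print(to_unsigned_hex(code), describe(code))
```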
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Blowing up - 14369

Post by toTOW »

It looks like we have a bad WU here. I've forwarded your report to the researcher in charge of this project.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Blowing up - 14369

Post by bruce »

Gen 0. Not yet retried.

The researcher probably needs to fix other runs/clones in that project.
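The "Gen 0" comes from the Project line in the posted log. As a rough sketch for pulling those identifiers out of a log when reporting a bad WU, the following assumes the line looks exactly like the one in the log above; `parse_prcg` is a hypothetical helper name, not something the client provides.

```python
import re

# Pattern assumes the "Project: N (Run N, Clone N, Gen N)" layout seen in
# the log in this thread; other client versions may format it differently.
PRCG_PATTERN = re.compile(
    r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)


def parse_prcg(line: str):
    """Return (project, run, clone, gen) as ints, or None if not found."""
    match = PRCG_PATTERN.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


if __name__ == "__main__":
    sample = "14:22:06:WU02:FS00:0xa7:Project: 14369 (Run 357, Clone 0, Gen 0)"
    print(parse_prcg(sample))
```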
vvoelz
Pande Group Member
Posts: 552
Joined: Sun Dec 02, 2007 8:07 pm
Location: Temple University, Philadelphia PA

Re: Blowing up - 14369

Post by vvoelz »

Hi Jan -- Thanks for this valuable report! The error is definitely in the WU, and NOT your fault.

All of these jobs are *slightly* different (they test different molecules), and we can't test them *all* before release. For now I will STOP that RUN to see what's going on. Hopefully you pick up more/different projects soon!
Jan
Posts: 79
Joined: Tue Mar 31, 2020 6:46 pm

Re: Blowing up - 14369

Post by Jan »

No worries, thanks for the quick replies. Glad I could help.