Project 13813 - BAD_FRAME_CHECKSUM

Dr. Merkwürdigliebe
Posts: 30
Joined: Tue Nov 08, 2016 7:52 pm
Hardware configuration: Xeon 1230v3 + Geforce RTX 2080
Location: Germany

Project 13813 - BAD_FRAME_CHECKSUM

Post by Dr. Merkwürdigliebe »

*********************** Log Started 2018-07-01T08:30:51Z ***********************
08:30:51:************************* Folding@home Client *************************
08:30:51:        Website: https://foldingathome.org/
08:30:51:      Copyright: (c) 2009-2018 foldingathome.org
08:30:51:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:30:51:           Args: --child --lifeline 2434 /etc/fahclient/config.xml --run-as
08:30:51:                 fahclient --pid-file=/var/run/fahclient.pid --daemon
08:30:51:         Config: /etc/fahclient/config.xml
08:30:51:******************************** Build ********************************
08:30:51:        Version: 7.5.1
08:30:51:           Date: May 11 2018
08:30:51:           Time: 19:59:04
08:30:51:     Repository: Git
08:30:51:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
08:30:51:         Branch: master
08:30:51:       Compiler: GNU 6.3.0 20170516
08:30:51:        Options: -std=gnu++98 -O3 -funroll-loops
08:30:51:       Platform: linux2 4.14.0-3-amd64
08:30:51:           Bits: 64
08:30:51:           Mode: Release
08:30:51:******************************* System ********************************
08:30:51:            CPU: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz
08:30:51:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
08:30:51:           CPUs: 8
08:30:51:         Memory: 15.59GiB
08:30:51:    Free Memory: 14.60GiB
08:30:51:        Threads: POSIX_THREADS
08:30:51:     OS Version: 4.16
08:30:51:    Has Battery: false
08:30:51:     On Battery: false
08:30:51:     UTC Offset: 2
08:30:51:            PID: 2436
08:30:51:            CWD: /var/lib/fahclient
08:30:51:             OS: Linux 4.16.13-041613-generic x86_64
08:30:51:        OS Arch: AMD64
08:30:51:           GPUs: 0
08:30:51:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:9.2
08:30:51:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:396.24
08:30:51:***********************************************************************
08:30:51:<config>
08:30:51:  <!-- Client Control -->
08:30:51:  <fold-anon v='true'/>
08:30:51:
08:30:51:  <!-- Folding Slot Configuration -->
08:30:51:  <gpu v='false'/>
08:30:51:
08:30:51:  <!-- Network -->
08:30:51:  <proxy v=':8080'/>
08:30:51:
08:30:51:  <!-- Slot Control -->
08:30:51:  <power v='full'/>
08:30:51:
08:30:51:  <!-- User Information -->
08:30:51:  <passkey v='********************************'/>
08:30:51:  <team v='38412'/>
08:30:51:  <user v='Random_Dude'/>
08:30:51:
08:30:51:  <!-- Folding Slots -->
08:30:51:  <slot id='0' type='CPU'/>
08:30:51:</config>
08:30:51:Switching to user fahclient
08:30:51:Trying to access database...
08:30:51:Successfully acquired database lock
08:30:51:Enabled folding slot 00: READY cpu:8
08:30:51:WU00:FS00:Starting
08:30:51:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 2436 -checkpoint 15 -np 8
08:30:51:WU00:FS00:Started FahCore on PID 2446
08:30:51:WU00:FS00:Core PID:2460
08:30:51:WU00:FS00:FahCore 0xa7 started
08:30:52:WU00:FS00:0xa7:*********************** Log Started 2018-07-01T08:30:51Z ***********************
08:30:52:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
08:30:52:WU00:FS00:0xa7:       Type: 0xa7
08:30:52:WU00:FS00:0xa7:       Core: Gromacs
08:30:52:WU00:FS00:0xa7:    Website: https://foldingathome.org/
08:30:52:WU00:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
08:30:52:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:30:52:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 2446 -checkpoint 15 -np 8
08:30:52:WU00:FS00:0xa7:     Config: <none>
08:30:52:WU00:FS00:0xa7:************************************ Build *************************************
08:30:52:WU00:FS00:0xa7:    Version: 0.0.17
08:30:52:WU00:FS00:0xa7:       Date: Apr 27 2018
08:30:52:WU00:FS00:0xa7:       Time: 19:09:21
08:30:52:WU00:FS00:0xa7: Repository: Git
08:30:52:WU00:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
08:30:52:WU00:FS00:0xa7:     Branch: master
08:30:52:WU00:FS00:0xa7:   Compiler: GNU 6.3.0 20170516
08:30:52:WU00:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops
08:30:52:WU00:FS00:0xa7:   Platform: linux2 4.14.0-3-amd64
08:30:52:WU00:FS00:0xa7:       Bits: 64
08:30:52:WU00:FS00:0xa7:       Mode: Release
08:30:52:WU00:FS00:0xa7:       SIMD: avx_256
08:30:52:WU00:FS00:0xa7:************************************ System ************************************
08:30:52:WU00:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz
08:30:52:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
08:30:52:WU00:FS00:0xa7:       CPUs: 8
08:30:52:WU00:FS00:0xa7:     Memory: 15.59GiB
08:30:52:WU00:FS00:0xa7:Free Memory: 14.59GiB
08:30:52:WU00:FS00:0xa7:    Threads: POSIX_THREADS
08:30:52:WU00:FS00:0xa7: OS Version: 4.16
08:30:52:WU00:FS00:0xa7:Has Battery: false
08:30:52:WU00:FS00:0xa7: On Battery: false
08:30:52:WU00:FS00:0xa7: UTC Offset: 2
08:30:52:WU00:FS00:0xa7:        PID: 2460
08:30:52:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
08:30:52:WU00:FS00:0xa7:         OS: Linux 4.16.13-041613-generic x86_64
08:30:52:WU00:FS00:0xa7:    OS Arch: AMD64
08:30:52:WU00:FS00:0xa7:********************************************************************************
08:30:52:WU00:FS00:0xa7:Project: 13813 (Run 0, Clone 506, Gen 5)
08:30:52:WU00:FS00:0xa7:Unit: 0x0000000580fccb025ae0907ab3af11a4
08:30:52:WU00:FS00:0xa7:Digital signatures verified
08:30:52:WU00:FS00:0xa7:Calling: mdrun -s frame5.tpr -o frame5.trr -x frame5.xtc -cpi state.cpt -cpt 15 -nt 8
08:30:52:WU00:FS00:0xa7:Steps: first=1250000 total=250000
08:30:52:WU00:FS00:0xa7:Completed 1387 out of 250000 steps (0%)
08:31:14:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
08:31:14:WU00:FS00:Starting
08:31:14:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 2436 -checkpoint 15 -np 8
08:31:14:WU00:FS00:Started FahCore on PID 3408
08:31:14:WU00:FS00:Core PID:3426
08:31:14:WU00:FS00:FahCore 0xa7 started
08:31:14:WARNING:WU00:FS00:FahCore returned: BAD_FRAME_CHECKSUM (112 = 0x70)
08:31:14:WARNING:WU00:FS00:Fatal error, dumping
Apparently, it happened twice in a row.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 13813 - BAD_FRAME_CHECKSUM

Post by bruce »

It's not clear what happened to this WU on your system. Because the WU was dumped, the server never received a report of the failure from your system.

Nevertheless, Project: 13813 (Run 0, Clone 506, Gen 5) was assigned to other systems and was successfully completed by one of the other three assignees. Based on that successful completion, the next WU in that trajectory (Gen 6) was produced and was also successfully completed.

Both WU_STALLED and BAD_FRAME_CHECKSUM are local hardware errors, often caused by excessive heat or by marginal overclocking.
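
If you want to check whether these errors keep coming back, something like the following could scan a client log for the "FahCore returned" lines and count the failures. This is only a rough sketch: the regular expression is based on the two WARNING lines in the log above, the function names are made up for illustration, and treating FINISHED_UNIT as the only non-failure result is an assumption.

```python
import re
from collections import Counter

# Pattern modelled on the lines in the posted log, e.g.
# 08:31:14:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:|ERROR:)?"
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):FahCore returned: "
    r"(?P<name>\w+) \((?P<code>\d+) = 0x(?P<hex>[0-9a-fA-F]+)\)"
)

# The two return lines copied from the log in the first post.
SAMPLE = [
    "08:31:14:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)",
    "08:31:14:WARNING:WU00:FS00:FahCore returned: BAD_FRAME_CHECKSUM (112 = 0x70)",
]


def find_core_returns(lines):
    """Return (time, name, code) for every 'FahCore returned' line."""
    results = []
    for line in lines:
        match = RETURN_RE.match(line.strip())
        if match:
            results.append((match["time"], match["name"], int(match["code"])))
    return results


def count_failures(lines, ok_names=("FINISHED_UNIT",)):
    """Count return names that are not listed in ok_names (assumed successes)."""
    return Counter(
        name for _, name, _ in find_core_returns(lines) if name not in ok_names
    )


if __name__ == "__main__":
    for name, count in count_failures(SAMPLE).items():
        print(f"{name}: {count}")
```

To run it against a real log, pass the open file instead of SAMPLE, for example `count_failures(open("log.txt"))`, where the path is whatever your client actually writes to. A count that keeps climbing across different projects would point toward the local machine rather than one bad WU.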
Dr. Merkwürdigliebe

Re: Project 13813 - BAD_FRAME_CHECKSUM

Post by Dr. Merkwürdigliebe »

Hmm, weird, my system is not overclocked. I don't think excessive heat is the problem either.

Will keep an eye on this...