
Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 6:37 am
by Sumsum
Hi!

This is the 2nd time I have 'lost' two days of processing work.
I booted my PC and got a message that fahCore_a7.exe had crashed. Two days ago, when I shut down the PC, I had a work unit about 70% done (give or take a few percent), but that progress is gone. I think it's a waste of energy if the app keeps crashing and restarting the work every day.

This is the last log, maybe someone can help:


*********************** Log Started 2020-05-22T06:09:15Z ***********************
06:09:15:Trying to access database...
06:09:15:Successfully acquired database lock
06:09:15:Read GPUs.txt
06:09:16:Enabled folding slot 00: READY cpu:2
06:09:17:****************************** FAHClient ******************************
06:09:17:        Version: 7.6.13
06:09:17:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:09:17:      Copyright: 2020 foldingathome.org
06:09:17:       Homepage: https://foldingathome.org/
06:09:17:           Date: Apr 27 2020
06:09:17:           Time: 21:21:01
06:09:17:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
06:09:17:         Branch: master
06:09:17:       Compiler: Visual C++ 2008
06:09:17:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:09:17:       Platform: win32 10
06:09:17:           Bits: 32
06:09:17:           Mode: Release
06:09:17:         Config: C:\Users\S\AppData\Roaming\FAHClient\config.xml
06:09:17:******************************** CBang ********************************
06:09:17:           Date: Apr 24 2020
06:09:17:           Time: 17:07:55
06:09:17:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
06:09:17:         Branch: master
06:09:17:       Compiler: Visual C++ 2008
06:09:17:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:09:17:       Platform: win32 10
06:09:17:           Bits: 32
06:09:17:           Mode: Release
06:09:17:******************************* System ********************************
06:09:17:            CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
06:09:17:         CPU ID: GenuineIntel Family 6 Model 15 Stepping 11
06:09:17:           CPUs: 4
06:09:17:         Memory: 8.00GiB
06:09:17:    Free Memory: 6.20GiB
06:09:17:        Threads: WINDOWS_THREADS
06:09:17:     OS Version: 6.2
06:09:17:    Has Battery: false
06:09:17:     On Battery: false
06:09:17:     UTC Offset: 2
06:09:17:            PID: 9040
06:09:17:            CWD: C:\Users\S\AppData\Roaming\FAHClient
06:09:17:  Win32 Service: false
06:09:17:             OS: Windows 10 Home
06:09:17:        OS Arch: AMD64
06:09:17:           GPUs: 1
06:09:17:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GF114 [GeForce GTX 560 Ti]
06:09:17:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:2.1 Driver:9.1
06:09:17:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.1 Driver:388.13
06:09:17:******************************* libFAH ********************************
06:09:17:           Date: Apr 15 2020
06:09:17:           Time: 14:53:14
06:09:17:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
06:09:17:         Branch: master
06:09:17:       Compiler: Visual C++ 2008
06:09:17:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:09:17:       Platform: win32 10
06:09:17:           Bits: 32
06:09:17:           Mode: Release
06:09:17:***********************************************************************
06:09:19:<config>
06:09:19:  <!-- Folding Slot Configuration -->
06:09:19:  <cause v='COVID_19'/>
06:09:19:
06:09:19:  <!-- Network -->
06:09:19:  <proxy v=':8080'/>
06:09:19:
06:09:19:  <!-- User Information -->
06:09:19:  <passkey v='*****'/>
06:09:19:  <team v='263429'/>
06:09:19:  <user v='Longhitter'/>
06:09:19:
06:09:19:  <!-- Folding Slots -->
06:09:19:  <slot id='0' type='CPU'>
06:09:19:    <cpus v='2'/>
06:09:19:    <idle v='False'/>
06:09:19:    <paused v='False'/>
06:09:19:  </slot>
06:09:19:</config>
06:09:19:WU00:FS00:Starting
06:09:19:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\S\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 706 -lifeline 9040 -checkpoint 15 -np 2
06:09:19:WU00:FS00:Started FahCore on PID 11600
06:09:22:WU00:FS00:Core PID:12796
06:09:22:WU00:FS00:FahCore 0xa7 started
06:09:22:4:127.0.0.1:New Web session
06:09:22:WU00:FS00:0xa7:*********************** Log Started 2020-05-22T06:09:22Z ***********************
06:09:22:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
06:09:22:WU00:FS00:0xa7:       Type: 0xa7
06:09:22:WU00:FS00:0xa7:       Core: Gromacs
06:09:22:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 11600 -checkpoint 15 -np
06:09:22:WU00:FS00:0xa7:             2
06:09:22:WU00:FS00:0xa7:************************************ CBang *************************************
06:09:22:WU00:FS00:0xa7:       Date: Oct 26 2019
06:09:22:WU00:FS00:0xa7:       Time: 01:38:35
06:09:22:WU00:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
06:09:22:WU00:FS00:0xa7:     Branch: master
06:09:22:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
06:09:22:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:09:22:WU00:FS00:0xa7:   Platform: win32 10
06:09:22:WU00:FS00:0xa7:       Bits: 64
06:09:22:WU00:FS00:0xa7:       Mode: Release
06:09:22:WU00:FS00:0xa7:************************************ System ************************************
06:09:22:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
06:09:22:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 11
06:09:22:WU00:FS00:0xa7:       CPUs: 4
06:09:22:WU00:FS00:0xa7:     Memory: 8.00GiB
06:09:22:WU00:FS00:0xa7:Free Memory: 5.84GiB
06:09:22:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
06:09:22:WU00:FS00:0xa7: OS Version: 6.2
06:09:22:WU00:FS00:0xa7:Has Battery: false
06:09:22:WU00:FS00:0xa7: On Battery: false
06:09:22:WU00:FS00:0xa7: UTC Offset: 2
06:09:22:WU00:FS00:0xa7:        PID: 12796
06:09:22:WU00:FS00:0xa7:        CWD: C:\Users\S\AppData\Roaming\FAHClient\work
06:09:22:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
06:09:22:WU00:FS00:0xa7:    Version: 0.0.18
06:09:22:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:09:22:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
06:09:22:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
06:09:22:WU00:FS00:0xa7:       Date: Oct 26 2019
06:09:22:WU00:FS00:0xa7:       Time: 01:52:44
06:09:22:WU00:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
06:09:22:WU00:FS00:0xa7:     Branch: master
06:09:22:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
06:09:22:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:09:22:WU00:FS00:0xa7:   Platform: win32 10
06:09:22:WU00:FS00:0xa7:       Bits: 64
06:09:22:WU00:FS00:0xa7:       Mode: Release
06:09:22:WU00:FS00:0xa7:************************************ Build *************************************
06:09:23:WU00:FS00:0xa7:       SIMD: sse2
06:09:23:WU00:FS00:0xa7:********************************************************************************
06:09:23:WU00:FS00:0xa7:Project: 13828 (Run 289, Clone 5, Gen 27)
06:09:23:WU00:FS00:0xa7:Unit: 0x0000002780fccb095e73ae2451fa12c2
06:09:23:WU00:FS00:0xa7:Digital signatures verified
06:09:23:WU00:FS00:0xa7:Calling: mdrun -s frame27.tpr -o frame27.trr -x frame27.xtc -cpi state.cpt -cpt 15 -nt 2
06:09:23:WU00:FS00:0xa7:ERROR:Guru Meditation #56a4b8aa84726420.8c770da332567b8f (4664832.9329664) '00/01/frame27.trr'
06:09:23:WU00:FS00:0xa7:WARNING:Unexpected exit() call
06:09:23:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
06:09:23:WU00:FS00:0xa7:Saving result file ..\logfile_01.txt
06:09:23:WU00:FS00:0xa7:Saving result file frame27.trr
06:09:23:WU00:FS00:0xa7:ERROR:Guru Meditation #56a4b8aa84726420.8c770da332567b8f (4664832.9329664) '00/01/frame27.trr'
06:09:23:9:127.0.0.1:New Web session
06:10:05:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
06:10:05:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13828 run:289 clone:5 gen:27 core:0xa7 unit:0x0000002780fccb095e73ae2451fa12c2
06:10:05:WU00:FS00:Connecting to 128.252.203.9:8080
06:10:05:WU01:FS00:Connecting to assign1.foldingathome.org:80
06:10:06:WU01:FS00:Assigned to work server 128.252.203.1
06:10:06:WU01:FS00:Requesting new work unit for slot 00: READY cpu:2 from 128.252.203.1
06:10:06:WU01:FS00:Connecting to 128.252.203.1:8080
06:10:09:WU01:FS00:Downloading 1.39MiB
06:10:10:WU01:FS00:Download complete
06:10:10:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:16428 run:1012 clone:2 gen:73 core:0xa7 unit:0x0000005380fccb015ea74e8c6735bcc6
06:10:10:WU01:FS00:Starting
06:10:10:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\S\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 9040 -checkpoint 15 -np 2
06:10:10:WU01:FS00:Started FahCore on PID 9536
06:10:10:WU01:FS00:Core PID:9420
06:10:10:WU01:FS00:FahCore 0xa7 started
06:10:11:WU01:FS00:0xa7:*********************** Log Started 2020-05-22T06:10:10Z ***********************
06:10:11:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
06:10:11:WU01:FS00:0xa7:       Type: 0xa7
06:10:11:WU01:FS00:0xa7:       Core: Gromacs
06:10:11:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 9536 -checkpoint 15 -np 2
06:10:11:WU01:FS00:0xa7:************************************ CBang *************************************
06:10:11:WU01:FS00:0xa7:       Date: Oct 26 2019
06:10:11:WU01:FS00:0xa7:       Time: 01:38:35
06:10:11:WU01:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
06:10:11:WU01:FS00:0xa7:     Branch: master
06:10:11:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
06:10:11:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:10:11:WU01:FS00:0xa7:   Platform: win32 10
06:10:11:WU01:FS00:0xa7:       Bits: 64
06:10:11:WU01:FS00:0xa7:       Mode: Release
06:10:11:WU01:FS00:0xa7:************************************ System ************************************
06:10:11:WU01:FS00:0xa7:        CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
06:10:11:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 11
06:10:11:WU01:FS00:0xa7:       CPUs: 4
06:10:11:WU01:FS00:0xa7:     Memory: 8.00GiB
06:10:11:WU01:FS00:0xa7:Free Memory: 5.79GiB
06:10:11:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
06:10:11:WU01:FS00:0xa7: OS Version: 6.2
06:10:11:WU01:FS00:0xa7:Has Battery: false
06:10:11:WU01:FS00:0xa7: On Battery: false
06:10:11:WU01:FS00:0xa7: UTC Offset: 2
06:10:11:WU01:FS00:0xa7:        PID: 9420
06:10:11:WU01:FS00:0xa7:        CWD: C:\Users\S\AppData\Roaming\FAHClient\work
06:10:11:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
06:10:11:WU01:FS00:0xa7:    Version: 0.0.18
06:10:11:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:10:11:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
06:10:11:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
06:10:11:WU01:FS00:0xa7:       Date: Oct 26 2019
06:10:11:WU01:FS00:0xa7:       Time: 01:52:44
06:10:11:WU01:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
06:10:11:WU01:FS00:0xa7:     Branch: master
06:10:11:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
06:10:11:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:10:11:WU01:FS00:0xa7:   Platform: win32 10
06:10:11:WU01:FS00:0xa7:       Bits: 64
06:10:11:WU01:FS00:0xa7:       Mode: Release
06:10:11:WU01:FS00:0xa7:************************************ Build *************************************
06:10:11:WU01:FS00:0xa7:       SIMD: sse2
06:10:11:WU01:FS00:0xa7:********************************************************************************
06:10:11:WU01:FS00:0xa7:Project: 16428 (Run 1012, Clone 2, Gen 73)
06:10:11:WU01:FS00:0xa7:Unit: 0x0000005380fccb015ea74e8c6735bcc6
06:10:11:WU01:FS00:0xa7:Reading tar file core.xml
06:10:11:WU01:FS00:0xa7:Reading tar file frame73.tpr
06:10:11:WU01:FS00:0xa7:Digital signatures verified
06:10:11:WU01:FS00:0xa7:Calling: mdrun -s frame73.tpr -o frame73.trr -x frame73.xtc -cpt 15 -nt 2
06:10:11:WU01:FS00:0xa7:Steps: first=36500000 total=500000
06:10:12:WU01:FS00:0xa7:Completed 1 out of 500000 steps (0%)

Re: Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 7:24 am
by bruce
I'd recommend that when you go to bed, you do one of two things:
* Click on FINISH in the advanced client. It'll finish the active assignment and upload the results, and it WILL NOT download and start the next WU.
* Click on PAUSE in the advanced client. Wait a minute or two for the active WU to switch to the paused condition. Then shut down your computer normally.

Either way, when you start up your computer the next morning, the progress should still be associated with the WU.

Re: Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 7:52 am
by JimboPalmer
06:09:17:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.1 Driver:388.13

It may not be your problem, but 388.13 is not the newest driver for your graphics card. The latest, and last, driver for a card that old is:

https://www.nvidia.com/en-us/drivers/results/132845/

https://www.techpowerup.com/gpu-specs/g ... 60-ti.c273

Re: Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 8:05 am
by PantherX
Welcome to the F@H Forum Sumsum,

Please note that this error:
ERROR:Guru Meditation

is generally caused by corrupted checkpoints. That can happen for a few reasons: improper shutdown, anti-virus/anti-malware/anti-spyware/anti-ransomware software, corrupted files, or storage issues. In your case, you stated that the issue happens when you turn off your system and later turn it back on. The reason is that sometimes the OS doesn't give the process enough time to flush its data to disk and verify it before the OS stops the process. Thus, bruce's suggestion is a potential solution for you.
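
If you want to check how often this is happening, a small script can scan the client log for the two failure lines seen in the log above. This is only a rough sketch: the function name `find_core_failures` is made up for illustration, and it assumes every log line starts with an `HH:MM:SS` timestamp, as the lines posted in this thread do.

```python
import re

# Matches lines like "FahCore returned: FAILED_3 (255 = 0xff)" from the posted log.
CORE_EXIT = re.compile(r"FahCore returned: (\w+)")


def find_core_failures(log_lines):
    """Return (timestamp, kind, detail) tuples for failure lines in a client log."""
    hits = []
    for line in log_lines:
        line = line.rstrip("\n")
        timestamp = line[:8]  # assumption: lines begin with HH:MM:SS
        if "ERROR:Guru Meditation" in line:
            # Keep everything after "ERROR:" so the affected file name is visible.
            hits.append((timestamp, "guru_meditation", line.split("ERROR:", 1)[1]))
        match = CORE_EXIT.search(line)
        if match:
            hits.append((timestamp, "core_exit", match.group(1)))
    return hits
```

You would feed it the lines of your own log file, for example `find_core_failures(open(path_to_log))`, where `path_to_log` is wherever your client writes its log.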

Re: Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 8:16 am
by Sumsum
Thank you all, I'll try that next time.

I'm surprised there is a newer driver; autoupdate should have installed it. Anyway, I disabled the GPU, as I felt that it was producing a lot of heat while not performing very well. Heat means noise, and that's not ideal for working from home. That was different some years ago when I used the PC for gaming with a headset on ;-)

Maybe I'll find a way to change the shutdown timeout; if not, I'll go the 'pause' way, even though fire-and-forget was what I really wanted.

Thx!

Re: Error after reboot next day, starting new jobs

Posted: Fri May 22, 2020 8:33 am
by NRT_AntiKytherA
I had that behaviour happen on Windows 10 a couple of times initially, so I now always pause any folding before shutting down or restarting the machine. In the evenings I tend to hit the Finish button in the Advanced GUI, then leave the machine on for at least 10 minutes after it says the slots are paused, to let it clean up and cool down a bit before shutting the machine down.

Re: Error after reboot next day, starting new jobs

Posted: Tue May 26, 2020 6:04 am
by Sumsum
One more thing... :D

Is there an expert option to unpause / start folding at startup? I've learned to pause folding before shutdown, but skipping the manual restart would be nice, especially since the PC sits idle for a while after I start it and go to the bathroom. :wink:

Re: Error after reboot next day, starting new jobs

Posted: Tue May 26, 2020 8:07 am
by PantherX
I use this, which means that whenever my system reboots, my GPU slot is always paused and I need to unpause it manually:
<pause-on-start v='true'/>

The default setting is false, which means that it will start folding unless the previous state of the slot was Paused. In that case, it remembers that it was paused and will stick with that state even after a restart. That is denoted by this flag:
<paused v='true'/>
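
Put as a tiny truth table, the way the two flags combine at startup looks roughly like this. It only restates the description above in code; `folds_on_startup` is a hypothetical helper for illustration, not anything taken from the client.

```python
def folds_on_startup(pause_on_start: bool, paused: bool) -> bool:
    """Return True if a slot would begin folding when the client launches."""
    if pause_on_start:
        # pause-on-start=true: the slot always comes up paused.
        return False
    # Default (pause-on-start=false): the remembered paused state decides.
    return not paused
```

So a slot that was paused before shutdown stays paused after the restart, even with the default `pause-on-start` setting.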

In theory, you could set up a batch script that sends a telnet command to FAHClient instructing it to start folding X seconds after start-up. Have a look at this topic for some pointers: viewtopic.php?f=106&t=33809
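
Instead of a batch file plus telnet, the same idea can be sketched in Python with a plain TCP socket. Treat this as an untested outline under some assumptions: that the v7 client accepts text commands on its default local command port 36330, that `unpause` is the command that resumes folding, and that no password is configured for local connections. The function names are made up for this example.

```python
import socket
import time


def send_fah_command(command, host="127.0.0.1", port=36330, timeout=2.0):
    """Send one command line to the client's command socket; return any reply text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("ascii"))
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # the other side closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # no more output within the timeout; keep what arrived
    return b"".join(chunks).decode("ascii", errors="replace")


def unpause_after(delay_seconds, **connection_options):
    """Wait delay_seconds, then ask the client to resume folding."""
    time.sleep(delay_seconds)
    # Assumption: "unpause" resumes all slots on the v7 command interface.
    return send_fah_command("unpause", **connection_options)

# Example (not run here): unpause_after(120) waits two minutes after it is started.
```

Started from a scheduled task at logon, `unpause_after(120)` would give the desktop a couple of minutes before folding resumes. The same helper could send `pause` shortly before shutdown, again assuming that command name is accepted.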