Core 17 has suddenly started crashing

PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Core 17 has suddenly started crashing

Post by PantherX »

Eagle wrote:...I don't know why that "Date: 2014-06-05"-line is written into the log, although the day (and hence the date) is still the same...
This information is always printed in the log file every 6 hours by default.
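Because the per-line stamps carry only a time, the periodic date line is what lets a script rebuild full timestamps. A minimal sketch, assuming the date marker contains `Date: YYYY-MM-DD` as in the quoted line and that `Log Started` headers look like the ones posted in this thread; `attach_dates` is a hypothetical helper, and a rollover past midnight between two markers is not handled:

```python
import re
from datetime import date, datetime, time

# Assumed marker formats, based only on excerpts quoted in this thread.
DATE_MARKER = re.compile(r"(?:Date: |Log Started )(\d{4}-\d{2}-\d{2})")
TIME_PREFIX = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):")


def attach_dates(lines):
    """Yield (datetime or None, line) using the most recent date marker seen."""
    current = None
    for line in lines:
        marker = DATE_MARKER.search(line)
        if marker:
            current = date.fromisoformat(marker.group(1))
        stamp = TIME_PREFIX.match(line)
        if stamp and current is not None:
            hours, minutes, seconds = (int(g) for g in stamp.groups())
            yield datetime.combine(current, time(hours, minutes, seconds)), line
        else:
            # No time prefix, or no date marker seen yet.
            yield None, line


if __name__ == "__main__":
    sample = [
        "Date: 2014-06-05",  # hypothetical layout; only this text is quoted in the thread
        "11:24:25:WU01:FS01:0x17:Temperature control disabled.",
    ]
    for when, text in attach_dates(sample):
        print(when, text)
```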
Eagle wrote:...The only "new thing" I found is this line:

Code:

11:24:25:WU01:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
However, 7im told me that it's unused and hence can be ignored...
To elaborate a little, this feature was introduced in FahCore_17 version 0.0.52 (IIRC) so that users with only a single Nvidia GPU can stop folding for X time when Y temperature is reached. This feature doesn't work on AMD GPUs or with multiple Nvidia GPUs. Since you haven't seen this message before, it suggests that something fishy occurred with the FahCore_17 update.
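Read literally, the log line lists three conditions. The sketch below only mirrors that message; it is not taken from the FahCore_17 source, and the function name, the argument names and the way `tmax` and `twait` are supplied are all assumptions:

```python
def temperature_control_enabled(gpu_vendors, tmax, twait):
    """Return True only if all three conditions named in the log line hold.

    gpu_vendors: vendor name of each GPU in the machine (hypothetical input).
    tmax, twait: the two values named in the log message; units are not
    stated in the thread, so none are assumed here.
    """
    single_nvidia = len(gpu_vendors) == 1 and gpu_vendors[0].lower() == "nvidia"
    return single_nvidia and tmax < 110 and twait >= 900


if __name__ == "__main__":
    print(temperature_control_enabled(["nvidia"], 90, 900))
    print(temperature_control_enabled(["nvidia", "nvidia"], 90, 900))
```

With anything other than exactly one Nvidia GPU, or with either value outside the stated limits, the check fails, which matches the "Temperature control disabled" wording.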
Eagle wrote:...So, the only real change is the change of the work-directory from my hard disk to my solid-state drive. Can this really be causing "lost lifeline" and things like that? I can hardly imagine that, but then again, I'm just a passionate FAH user, no insider...
That shouldn't cause an issue at all. I have kept the data directory on an HDD with the program folder on the SSD and can fold without any issues. Here's my configuration:

Code:

*********************** Log Started 2014-05-28T20:00:37Z ***********************
20:00:37:************************* Folding@home Client *************************
20:00:37:      Website: http://folding.stanford.edu/
20:00:37:    Copyright: (c) 2009-2014 Stanford University
20:00:37:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:00:37:         Args: 
20:00:37:       Config: D:/FAH/V7/config.xml
20:00:37:******************************** Build ********************************
20:00:37:      Version: 7.4.4
20:00:37:         Date: Mar 4 2014
20:00:37:         Time: 20:26:54
20:00:37:      SVN Rev: 4130
20:00:37:       Branch: fah/trunk/client
20:00:37:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
20:00:37:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
20:00:37:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
20:00:37:     Platform: win32 XP
20:00:37:         Bits: 32
20:00:37:         Mode: Release
20:00:37:******************************* System ********************************
20:00:37:          CPU: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
20:00:37:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
20:00:37:         CPUs: 8
20:00:37:       Memory: 15.89GiB
20:00:37:  Free Memory: 12.08GiB
20:00:37:      Threads: WINDOWS_THREADS
20:00:37:   OS Version: 6.2
20:00:37:  Has Battery: true
20:00:37:   On Battery: false
20:00:37:   UTC Offset: 3
20:00:37:          PID: 2096
20:00:37:          CWD: D:/FAH/V7
20:00:37:           OS: Windows 8 Pro
20:00:37:      OS Arch: AMD64
20:00:37:         GPUs: 1
20:00:37:        GPU 0: NVIDIA:2 GF114 [GeForce GTX 675M]
20:00:37:         CUDA: 2.1
20:00:37:  CUDA Driver: 6000
20:00:37:Win32 Service: false
20:00:37:***********************************************************************
20:00:38:<config>
20:00:38:  <!-- Network -->
20:00:38:  <proxy v=':8080'/>
20:00:38:
20:00:38:  <!-- Remote Command Server -->
20:00:38:  <password v='*********'/>
20:00:38:
20:00:38:  <!-- Slot Control -->
20:00:38:  <power v='full'/>
20:00:38:
20:00:38:  <!-- User Information -->
20:00:38:  <passkey v='********************************'/>
20:00:38:  <team v='69411'/>
20:00:38:  <user v='PantherX'/>
20:00:38:
20:00:38:  <!-- Folding Slots -->
20:00:38:  <slot id='0' type='CPU'>
20:00:38:    <cpus v='7'/>
20:00:38:    <max-packet-size v='small'/>
20:00:38:    <max-slot-errors v='1'/>
20:00:38:    <max-unit-errors v='1'/>
20:00:38:    <next-unit-percentage v='100'/>
20:00:38:    <pause-on-start v='true'/>
20:00:38:  </slot>
20:00:38:  <slot id='1' type='GPU'>
20:00:38:    <max-slot-errors v='1'/>
20:00:38:    <max-unit-errors v='1'/>
20:00:38:    <next-unit-percentage v='100'/>
20:00:38:    <pause-on-start v='true'/>
20:00:38:  </slot>
20:00:38:</config>
20:00:38:Connecting to assign-GPU.stanford.edu:80
20:00:39:Updated GPUs.txt
20:00:39:Read GPUs.txt
20:00:39:Trying to access database...
20:00:40:Successfully acquired database lock
20:00:40:Enabled folding slot 00: PAUSED cpu:7 (by user)
20:00:40:Enabled folding slot 01: PAUSED gpu:0:GF114 [GeForce GTX 675M] (by user)
20:04:22:Clean exit
Eagle wrote:...Any further information would be greatly appreciated!
Since this issue started a few weeks ago, my best guess would be a mangled FahCore update. As a test, can you please replace the beta value with advanced and see if it continues to fold FahCore_17 WUs properly? Do note that with the advanced flag, you might download FahCore_17 version 0.0.52 into a different directory and continue to fold FahCore_17 WUs without issues. If this works fine, then you can revert to your original setup by doing a fresh installation.
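If you prefer to script the change instead of using the GUI, something along these lines should do on a copy of config.xml. It assumes the beta/advanced setting is stored as a top-level `client-type` element with a `v` attribute, like the other options in the config dumps in this thread; that element does not appear in those dumps, so check your own file first. `set_client_type` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET


def set_client_type(config_xml, value):
    """Return config_xml with the (assumed) client-type option set to value.

    Note: ElementTree drops XML comments such as <!-- Network --> on parsing,
    so the result keeps the settings but loses the section comments.
    """
    root = ET.fromstring(config_xml)
    node = root.find("client-type")
    if node is None:
        # Option not present yet: add it at the top level.
        node = ET.SubElement(root, "client-type")
    node.set("v", value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    before = "<config><client-type v='beta'/><slot id='1' type='GPU'/></config>"
    print(set_client_type(before, "advanced"))
```

Working on a copy keeps the original file available in case the assumption about the element name is wrong.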
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Eagle
Posts: 116
Joined: Sun Feb 17, 2008 1:06 am
Hardware configuration: AMD Ryzen ThreadRipper 2950X (3.5 GHz)
ASUS Prime X399-A
G.Skill 32 GB DDR4-RAM (3.2 GHz)
EVGA GeForce RTX 2080 Ti Black (1.8 / 7 GHz)
Samsung 970 Pro 1 TB, 850 Pro 512 GB, Crucial C300 256 GB
Western Digital Black 2 TB, Gold 4 TB
Location: » Earth » Europe » Germany
Contact:

Post by Eagle »

PantherX wrote:This information is always printed in the log file every 6 hours by default.
Thanks for the information!
PantherX wrote:To elaborate a little, this feature was introduced in FahCore_17 version 0.0.52 (IIRC) so that users with only a single Nvidia GPU can stop folding for X time when Y temperature is reached. This feature doesn't work on AMD GPUs or with multiple Nvidia GPUs. Since you haven't seen this message before, it suggests that something fishy occurred with the FahCore_17 update.
That makes me wonder..
PantherX wrote:That shouldn't cause an issue at all. I have kept the data directory on an HDD with the program folder on the SSD and can fold without any issues. Here's my configuration:

(...)
Since this issue started a few weeks ago, my best guess would be a mangled FahCore update.
So, although your basic setup equaled mine, you didn't experience the error, but I did. Very strange..
PantherX wrote:As a test, can you please replace the beta value with advanced and see if it continues to fold FahCore_17 WUs properly? Do note that with the advanced flag, you might download FahCore_17 version 0.0.52 into a different directory and continue to fold FahCore_17 WUs without issues. If this works fine, then you can revert to your original setup by doing a fresh installation.
Core 17 Beta will take about an hour and a half before it's finished. Right afterwards, I'm going to try your suggestion and report back.
Michael Jordan: “I can accept failure — But I can’t accept not trying.”

Post by Eagle »

Alright, this is getting strange now! Everything runs just fine:

Code:

12:00:02:WU00:FS01:Connecting to 171.67.108.201:80
12:00:02:WU00:FS01:Assigned to work server 171.64.65.93
12:00:02:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GK110 [GeForce GTX 780] from 171.64.65.93
12:00:02:WU00:FS01:Connecting to 171.64.65.93:8080
12:00:03:WU00:FS01:Downloading 2.92MiB
12:00:09:WU00:FS01:Download 23.51%
12:00:15:WU00:FS01:Download 57.70%
12:00:21:WU00:FS01:Download 89.76%
12:00:22:WU00:FS01:Download complete
12:00:22:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9102 run:2 clone:19 gen:25 core:0x17 unit:0x0000001e0a3b1e81537c069339e5f10d
12:00:22:WU00:FS01:Downloading core from http://web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah
12:00:22:WU00:FS01:Connecting to web.stanford.edu:80
12:00:22:WU00:FS01:FahCore 17: Downloading 2.55MiB
12:00:28:WU00:FS01:FahCore 17: 34.31%
12:00:34:WU00:FS01:FahCore 17: 73.52%
12:00:38:WU00:FS01:FahCore 17: Download complete
12:00:38:WU00:FS01:Valid core signature
12:00:38:WU00:FS01:Unpacked 8.60MiB to cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe
12:00:38:WU00:FS01:Starting
12:00:38:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/USER/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 6128 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
12:00:38:WU00:FS01:Started FahCore on PID 1620
12:00:38:WU00:FS01:Core PID:1664
12:00:38:WU00:FS01:FahCore 0x17 started
12:00:39:WU00:FS01:0x17:*********************** Log Started 2014-06-06T12:00:38Z ***********************
12:00:39:WU00:FS01:0x17:Project: 9102 (Run 2, Clone 19, Gen 25)
12:00:39:WU00:FS01:0x17:Unit: 0x0000001e0a3b1e81537c069339e5f10d
12:00:39:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
12:00:39:WU00:FS01:0x17:Machine: 1
12:00:39:WU00:FS01:0x17:Reading tar file state.xml
12:00:39:WU00:FS01:0x17:Reading tar file system.xml
12:00:40:WU00:FS01:0x17:Reading tar file integrator.xml
12:00:40:WU00:FS01:0x17:Reading tar file core.xml
12:00:40:WU00:FS01:0x17:Digital signatures verified
12:00:40:WU00:FS01:0x17:Folding@home GPU core17
12:00:40:WU00:FS01:0x17:Version 0.0.52
12:02:08:WU00:FS01:0x17:Completed 0 out of 2500000 steps (0%)
12:02:08:WU00:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
12:05:00:WU00:FS01:0x17:Completed 25000 out of 2500000 steps (1%)
12:07:38:WU00:FS01:0x17:Completed 50000 out of 2500000 steps (2%)
12:10:22:WU00:FS01:0x17:Completed 75000 out of 2500000 steps (3%)
12:12:54:WU00:FS01:0x17:Completed 100000 out of 2500000 steps (4%)
12:15:42:WU00:FS01:0x17:Completed 125000 out of 2500000 steps (5%)
12:18:16:WU00:FS01:0x17:Completed 150000 out of 2500000 steps (6%)
12:22:01:WU00:FS01:0x17:Completed 175000 out of 2500000 steps (7%)
12:25:14:WU00:FS01:0x17:Completed 200000 out of 2500000 steps (8%)
12:28:58:WU00:FS01:0x17:Completed 225000 out of 2500000 steps (9%)
12:32:11:WU00:FS01:0x17:Completed 250000 out of 2500000 steps (10%)
(...)
No lost lifeline, not a single error up until 10% - and since the initial error(s) happened way before reaching 5%, I assume the current "incarnation" of Core 17, version 0.0.52, is correct. Should I try and return to my previous setup? I don't want a WU mess being sent back to Stanford because of all these trial-and-error actions..
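For anyone who wants to check a longer log the same way, here is a rough sketch that reports the last logged percentage and any lines that look like errors. The patterns are taken from log lines posted in this thread; `summarize` and the marker list are assumptions and certainly not exhaustive:

```python
import re

# Progress lines as they appear in the FahCore_17 output quoted in this thread.
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps \((\d+)%\)")
# Substrings treated as error indicators (an assumed, incomplete list).
ERROR_MARKERS = ("Lost lifeline", "CLIENT_DIED", "ERROR:")


def summarize(lines):
    """Return (last logged percentage or None, list of suspicious lines)."""
    last_percent = None
    errors = []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            last_percent = int(match.group(3))
        if any(marker in line for marker in ERROR_MARKERS):
            errors.append(line)
    return last_percent, errors


if __name__ == "__main__":
    sample = [
        "12:02:08:WU00:FS01:0x17:Completed 0 out of 2500000 steps (0%)",
        "12:32:11:WU00:FS01:0x17:Completed 250000 out of 2500000 steps (10%)",
    ]
    print(summarize(sample))
```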
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Post by 7im »

This is at least the second time I have seen moving the data location from a secondary drive back to the primary drive fix a weird problem. But I have also seen dual locations work as PX has shown again.

I also don't know why Eagle's X: drive location worked for so long and then stopped working. Maybe drives A through F work better, and G - Z have an issue?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
davidcoton
Posts: 1094
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Post by davidcoton »

I suggest that it was the core update that didn't like the data directory location. That could explain the timing in this case. But I don't know why, and I can't explain why it seems to work in other cases -- presumably there would be more reports if core updates always failed with a non-standard data directory.

David

Post by PantherX »

Eagle wrote:...So, although your basic setup equaled mine, you didn't experience the error, but I did. Very strange...
You might be in luck since I haven't folded on my laptop with V7, only the Folding App. Thus, the cores haven't updated. I can start up V7 and monitor it to see if the issue is replicated or not.
Eagle wrote:Alright, this is getting strange now! Everything runs just fine:...No lost lifeline, not a single error up until 10% - and since the initial error(s) happened way before reaching 5%, I assume the current "incarnation" of Core 17, version 0.0.52, is correct. Should I try and return to my previous setup? I don't want a WU mess being sent back to Stanford because of all these trial-and-error actions..
That's good news. If you do want to return to your previous set-up, set the slot to Finish and, once all WUs are completed, perform a fresh installation (during uninstallation, remember to select the option to delete the data) and hope for the best. To avoid burning through WUs, once you start up folding on the new set-up, simply select the Slot, right-click it and select Finish. Thus, if it errors out, no new WU will be downloaded. If it finishes successfully, it means that you can keep the set-up and fold happily.

Post by Eagle »

Sorry for the late reply, but I had personal issues to deal with and I also wanted to get reliable results.
Now, let's get right to it:
7im wrote:This is at least the second time I have seen moving the data location from a secondary drive back to the primary drive fix a weird problem. But I have also seen dual locations work as PX has shown again.

I also don't know why Eagle's X: drive location worked for so long and then stopped working. Maybe drives A through F work better, and G - Z have an issue?
FAH finished and was removed completely (including the folders on the X: drive) afterwards. I then rebooted my machine, installed FAH via the freshly downloaded installer (with the data directory being on the X: drive again) and configured it like this:

Code:

20:30:23:<config>
20:30:23:  <!-- Network -->
20:30:23:  <proxy v=':8080'/>
20:30:23:
20:30:23:  <!-- Slot Control -->
20:30:23:  <power v='full'/>
20:30:23:
20:30:23:  <!-- User Information -->
20:30:23:  <passkey v='********************************'/>
20:30:23:  <team v='34361'/>
20:30:23:  <user v='Eagle'/>
20:30:23:
20:30:23:  <!-- Folding Slots -->
20:30:23:  <slot id='0' type='CPU'/>
20:30:23:  <slot id='1' type='GPU'/>
20:30:23:</config>
As you can see, I didn't enable advanced and/or beta folding. However, right after installing I did notice these two changes:
1. Right-clicking FAH's Systray-icon opened the context menu right away and at the icon's position - previously, it took about 1-3 seconds and it opened at the cursor's position (wherever I moved it to in the meantime).
2. Although no beta (i.e. FAH 0.0.55 right now, IIRC) was requested and it did download 0.0.52, I received (and still receive) the temperature line:

Code:

WU02:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
davidcoton wrote:I suggest that it was the core update that didn't like the data directory location. That could explain the timing in this case. But I don't know why, and I can't explain why it seems to work in other cases -- presumably there would be more reports if core updates always failed with a non-standard data directory.

David
Since FAH has been folding for a while now, I don't understand why it folds with 0.0.52 again (using the same program and data location as before the "single drive test") without the previous error?!
PantherX wrote:You might be in luck since I haven't folded on my laptop with V7, only the Folding App. Thus, the cores haven't updated. I can start up V7 and monitor it to see if the issue is replicated or not.
It would be great if you can do so. :)
PantherX wrote:That's good news. If you do want to return to your previous set-up, set the slot to Finish and, once all WUs are completed, perform a fresh installation (during uninstallation, remember to select the option to delete the data) and hope for the best. To avoid burning through WUs, once you start up folding on the new set-up, simply select the Slot, right-click it and select Finish. Thus, if it errors out, no new WU will be downloaded. If it finishes successfully, it means that you can keep the set-up and fold happily.
Thanks for all those handy tips. They're all appreciated and being followed. :) Results were/are as described above.

Although I'd like to investigate this to the point of "total clarification", I do want to take a break here and thank you guys for helping me out and guiding me to a point where my FAH setup returned to its desired state. Thanks a lot! :)
If I can be of any further help, please let me know!

Post by PantherX »

Eagle wrote:...1. Right-clicking FAH's Systray-icon opened the context menu right away and at the icon's position - previously, it took about 1-3 seconds and it opened at the cursor's position (wherever I moved it to in the meantime)...
Hmm, that was fixed in V7.3.7 (https://fah.stanford.edu/projects/FAHClient/ticket/994), which means that once you installed V7.4.4, you should have gotten the proper version.
Eagle wrote:...2. Although no beta (i.e. FAH 0.0.55 right now, IIRC) was requested and it did download 0.0.52, I received (and still receive) the temperature line:

Code:

WU02:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
...
That is expected since it is a new feature introduced in the Windows-only version 0.0.52.
Eagle wrote:...It would be great if you can do so. :)...
Sure, might have to schedule it for the weekend (at the earliest) so I can monitor it. Will post the results here once I have completed the test.

Post by Eagle »

That's strange, because I do swear that I've experienced it with 7.4.4.

Regarding the message: but I've never seen it before the advanced/beta try, although FAH was already at version 0.0.52 back then?!

Alright, I'll wait then. But please note that shortly after my post yesterday, the error returned today:

Code:

02:59:26:WU01:FS01:0x15:Folding@home Core Shutdown: FINISHED_UNIT
02:59:26:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
02:59:26:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:7624 run:423 clone:0 gen:299 core:0x15 unit:0x00000199664f2dd14fe612f7ab56012e
02:59:26:WU01:FS01:Uploading 807.35KiB to 171.64.65.105
02:59:26:WU02:FS01:Starting
02:59:26:WU01:FS01:Connecting to 171.64.65.105:8080
02:59:26:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 02 -suffix 01 -version 704 -lifeline 8264 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
02:59:26:WU02:FS01:Started FahCore on PID 12180
02:59:27:WU02:FS01:Core PID:10900
02:59:27:WU02:FS01:FahCore 0x17 started
02:59:27:WU02:FS01:0x17:*********************** Log Started 2014-06-18T02:59:27Z ***********************
02:59:27:WU02:FS01:0x17:Project: 9406 (Run 29, Clone 0, Gen 61)
02:59:27:WU02:FS01:0x17:Unit: 0x000000600a3b1e5c533dd15c8f1365b9
02:59:27:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
02:59:27:WU02:FS01:0x17:Machine: 1
02:59:27:WU02:FS01:0x17:Reading tar file state.xml
02:59:28:WU02:FS01:0x17:Reading tar file system.xml
02:59:29:WU02:FS01:0x17:Reading tar file integrator.xml
02:59:29:WU02:FS01:0x17:Reading tar file core.xml
02:59:29:WU02:FS01:0x17:Digital signatures verified
02:59:29:WU02:FS01:0x17:Folding@home GPU core17
02:59:29:WU02:FS01:0x17:Version 0.0.52
02:59:32:WU01:FS01:Upload 47.56%
02:59:37:WU01:FS01:Upload complete
02:59:38:WU01:FS01:Server responded WORK_ACK (400)
02:59:38:WU01:FS01:Final credit estimate, 14093.00 points
02:59:38:WU01:FS01:Cleaning up
03:04:10:WU02:FS01:0x17:Completed 0 out of 2000000 steps (0%)
03:04:10:WU02:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
03:07:56:WU02:FS01:0x17:Completed 20000 out of 2000000 steps (1%)
(...)
All Folding went just fine with only percentages being logged, hence cut.
(...)
08:11:48:WU02:FS01:0x17:Completed 1740000 out of 2000000 steps (87%)

--- REBOOT (caused by Windows updates) ---

*********************** Log Started 2014-06-18T08:16:34Z ***********************
08:16:34:************************* Folding@home Client *************************
08:16:34:      Website: http://folding.stanford.edu/
08:16:34:    Copyright: (c) 2009-2014 Stanford University
08:16:34:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:16:34:         Args: 
08:16:34:       Config: X:/Folding At Home/config.xml
08:16:34:******************************** Build ********************************
08:16:34:      Version: 7.4.4
08:16:34:         Date: Mar 4 2014
08:16:34:         Time: 20:26:54
08:16:34:      SVN Rev: 4130
08:16:34:       Branch: fah/trunk/client
08:16:34:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
08:16:34:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
08:16:34:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
08:16:34:     Platform: win32 XP
08:16:34:         Bits: 32
08:16:34:         Mode: Release
08:16:34:******************************* System ********************************
08:16:34:          CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
08:16:34:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
08:16:34:         CPUs: 8
08:16:34:       Memory: 31.97GiB
08:16:34:  Free Memory: 20.66GiB
08:16:34:      Threads: WINDOWS_THREADS
08:16:34:   OS Version: 6.1
08:16:34:  Has Battery: false
08:16:34:   On Battery: false
08:16:34:   UTC Offset: 2
08:16:34:          PID: 5196
08:16:34:          CWD: X:/Folding At Home
08:16:34:           OS: Windows 7 Ultimate
08:16:34:      OS Arch: AMD64
08:16:34:         GPUs: 1
08:16:34:        GPU 0: NVIDIA:3 GK110 [GeForce GTX 780]
08:16:34:         CUDA: 3.5
08:16:34:  CUDA Driver: 6000
08:16:34:Win32 Service: false
08:16:34:***********************************************************************
08:16:34:<config>
08:16:34:  <!-- Network -->
08:16:34:  <proxy v=':8080'/>
08:16:34:
08:16:34:  <!-- Slot Control -->
08:16:34:  <power v='full'/>
08:16:34:
08:16:34:  <!-- User Information -->
08:16:34:  <passkey v='********************************'/>
08:16:34:  <team v='34361'/>
08:16:34:  <user v='Eagle'/>
08:16:34:
08:16:34:  <!-- Folding Slots -->
08:16:34:  <slot id='0' type='CPU'/>
08:16:34:  <slot id='1' type='GPU'/>
08:16:34:</config>
08:16:34:Trying to access database...
08:16:35:Successfully acquired database lock
08:16:35:Enabled folding slot 00: READY cpu:7
08:16:35:Enabled folding slot 01: READY gpu:0:GK110 [GeForce GTX 780]
08:16:35:WU00:FS00:Starting
08:16:35:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe" -dir 00 -suffix 01 -version 704 -lifeline 5196 -checkpoint 15 -np 7
08:16:35:WU00:FS00:Started FahCore on PID 6840
08:16:35:WU00:FS00:Core PID:6852
08:16:35:WU00:FS00:FahCore 0xa3 started

--- NOTE: I've cut the CPU slot log below this line ---

08:16:35:WU02:FS01:Starting
08:16:35:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 02 -suffix 01 -version 704 -lifeline 5196 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
08:16:35:WU02:FS01:Started FahCore on PID 6952
08:16:36:WU02:FS01:Core PID:7012
08:16:36:WU02:FS01:FahCore 0x17 started
08:16:37:WU02:FS01:0x17:*********************** Log Started 2014-06-18T08:16:37Z ***********************
08:16:37:WU02:FS01:0x17:Project: 9406 (Run 29, Clone 0, Gen 61)
08:16:37:WU02:FS01:0x17:Unit: 0x000000600a3b1e5c533dd15c8f1365b9
08:16:37:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
08:16:37:WU02:FS01:0x17:Machine: 1
08:16:37:WU02:FS01:0x17:Digital signatures verified
08:16:37:WU02:FS01:0x17:Folding@home GPU core17
08:16:37:WU02:FS01:0x17:Version 0.0.52
08:16:37:WU02:FS01:0x17:  Found a checkpoint file
08:21:29:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:21:29:WU02:FS01:0x17:Lost lifeline PID 6952, exiting
08:21:29:WU02:FS01:0x17:Lost lifeline PID 6952, exiting
08:21:29:WU02:FS01:0x17:ERROR:103: Lost client lifeline
08:21:29:WU02:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
08:21:30:WARNING:WU02:FS01:FahCore returned an unknown error code which probably indicates that it crashed
08:21:30:WARNING:WU02:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
08:21:30:WU02:FS01:Starting
08:21:30:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 02 -suffix 01 -version 704 -lifeline 5196 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
08:21:30:WU02:FS01:Started FahCore on PID 8120
08:21:30:WU02:FS01:Core PID:4220
08:21:30:WU02:FS01:FahCore 0x17 started
08:21:30:WU02:FS01:0x17:*********************** Log Started 2014-06-18T08:21:30Z ***********************
08:21:30:WU02:FS01:0x17:Project: 9406 (Run 29, Clone 0, Gen 61)
08:21:30:WU02:FS01:0x17:Unit: 0x000000600a3b1e5c533dd15c8f1365b9
08:21:30:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
08:21:30:WU02:FS01:0x17:Machine: 1
08:21:30:WU02:FS01:0x17:Digital signatures verified
08:21:30:WU02:FS01:0x17:Folding@home GPU core17
08:21:30:WU02:FS01:0x17:Version 0.0.52
08:21:30:WU02:FS01:0x17:  Found a checkpoint file
08:26:50:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:26:50:WU02:FS01:0x17:Lost lifeline PID 8120, exiting
08:26:50:WU02:FS01:0x17:Lost lifeline PID 8120, exiting
08:26:50:WU02:FS01:0x17:ERROR:103: Lost client lifeline
08:26:50:WU02:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
08:26:50:WARNING:WU02:FS01:FahCore returned an unknown error code which probably indicates that it crashed
08:26:50:WARNING:WU02:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
08:26:50:WU02:FS01:Starting
08:26:50:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 02 -suffix 01 -version 704 -lifeline 5196 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
08:26:50:WU02:FS01:Started FahCore on PID 7488
08:26:50:WU02:FS01:Core PID:5940
08:26:50:WU02:FS01:FahCore 0x17 started
08:26:51:WU02:FS01:0x17:*********************** Log Started 2014-06-18T08:26:50Z ***********************
08:26:51:WU02:FS01:0x17:Project: 9406 (Run 29, Clone 0, Gen 61)
08:26:51:WU02:FS01:0x17:Unit: 0x000000600a3b1e5c533dd15c8f1365b9
08:26:51:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
08:26:51:WU02:FS01:0x17:Machine: 1
08:26:51:WU02:FS01:0x17:Digital signatures verified
08:26:51:WU02:FS01:0x17:Folding@home GPU core17
08:26:51:WU02:FS01:0x17:Version 0.0.52
08:26:51:WU02:FS01:0x17:  Found a checkpoint file
08:31:38:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:31:38:WU02:FS01:0x17:Lost lifeline PID 7488, exiting
08:31:38:WU02:FS01:0x17:Lost lifeline PID 7488, exiting
08:31:38:WU02:FS01:0x17:ERROR:103: Lost client lifeline
08:31:38:WU02:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
08:31:38:WARNING:WU02:FS01:FahCore returned an unknown error code which probably indicates that it crashed
08:31:38:WARNING:WU02:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
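
For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch. It is not part of FAHClient, and the names `summarize_runs` and `is_crash_loop` are made up for illustration. It assumes the log line format shown above and flags a slot whose last few FahCore runs all lost their lifeline at the same step count.

```python
import re

STARTED = re.compile(r"Started FahCore on PID (\d+)")
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")
LOST = re.compile(r"Lost lifeline PID (\d+), exiting")


def summarize_runs(lines, slot):
    """Group the log lines of one slot (e.g. "FS01") into FahCore runs."""
    runs = []
    for line in lines:
        if f":{slot}:" not in line:
            continue  # line belongs to another slot or to the client itself
        started = STARTED.search(line)
        if started:
            runs.append({"pid": int(started.group(1)),
                         "steps": None,
                         "lost_lifeline": False})
            continue
        if not runs:
            continue  # ignore anything logged before the first start
        progress = PROGRESS.search(line)
        if progress:
            runs[-1]["steps"] = int(progress.group(1))
        elif LOST.search(line):
            runs[-1]["lost_lifeline"] = True
    return runs


def is_crash_loop(runs, min_repeats=3):
    """True if the last min_repeats runs all lost the lifeline at the same step."""
    tail = runs[-min_repeats:]
    return (len(tail) == min_repeats
            and all(run["lost_lifeline"] for run in tail)
            and len({run["steps"] for run in tail}) == 1)


# Lines copied from the GPU slot log above (shortened to the relevant ones).
SAMPLE = """\
08:16:35:WU02:FS01:Started FahCore on PID 6952
08:21:29:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:21:29:WU02:FS01:0x17:Lost lifeline PID 6952, exiting
08:21:30:WU02:FS01:Started FahCore on PID 8120
08:26:50:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:26:50:WU02:FS01:0x17:Lost lifeline PID 8120, exiting
08:26:50:WU02:FS01:Started FahCore on PID 7488
08:31:38:WU02:FS01:0x17:Completed 1700000 out of 2000000 steps (85%)
08:31:38:WU02:FS01:0x17:Lost lifeline PID 7488, exiting
""".splitlines()

runs = summarize_runs(SAMPLE, "FS01")
print(is_crash_loop(runs))
```

In this log the PID reported in each "Lost lifeline" line is the one from the preceding "Started FahCore on PID" line, not the 5196 passed via -lifeline. That is only an observation from these lines, not a statement about how the lifeline check is implemented.
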
In between, the following updates were installed (in chronological order from old to new) and I finally rebooted about 5 hours ago:

Code: Select all

Definition Update for Windows Defender - KB915597 (Definition 1.175.1478.0)
Definition Update for Windows Defender - KB915597 (Definition 1.175.1813.0)
Security update for Microsoft Office 2010 (KB2767915) 32-Bit-Edition
Update for Windows 7 for x64 based Systems (KB2952664)
Security update for Windows 7 for x64 based Systems (KB2957503)
Definition Update for Microsoft Office 2010 (KB982726) 32-Bit-Edition
Cumulative Security Update for Internet Explorer 11 for Windows 7 for x64 Systems (KB2957689)
Update for Microsoft Word 2010 (KB2880529) 32-Bit-Edition
Security Update for Windows 7 for x64 based Systems (KB2965788)
Update for Windows 7 for x64 based Systems (KB2800095)
Security Update for Windows 7 for x64 based Systems (KB2939576)
Security Update for Windows 7 for x64 based Systems (KB2957189)
Security Update for Windows 7 for x64 based Systems (KB2957509)
Windows Malicious Software Removal Tool x64 - June 2014 (KB890830)
Definition Update for Windows Defender - KB915597 (Definition 1.175.2521.0)
I'm starting to believe FAH wants my SSD to be heavily used (which I try to avoid for obvious reasons).. :?
Michael Jordan: “I can accept failure — But I can’t accept not trying.”
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core 17 has suddenly started crashing

Post by bruce »

Eagle wrote:That's strange, because I do swear that I've experienced it with 7.4.4.

Regarding the message: but I've never seen it before the advanced/beta try, although FAH was already at version 0.0.52 back then?!
Version numbers of FAHClients, of a Linux FahCore_*, and a Windows FahCore_* are independent of each other. Last I checked, the latest Linux FahCore_17 was 0.0.46, the latest Windows FahCore_17 was 0.0.52 (0.0.55 is being tested) and the FAHClient was 7.4.4. Each one can be updated independently and presumably work with unmodified versions of something else. You're responsible for updating FAHClient. Periodically the FahCores update themselves automatically, depending on the requirements of the WU you're assigned.
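
Since the client and each FahCore carry their own version numbers, both can be read from the same log. A minimal Python sketch, assuming the two "Version" line formats that appear in the logs in this thread (the function name `versions` is hypothetical, not a FAHClient API):

```python
import re

# Client build header, e.g. "21:06:56:      Version: 7.4.4"
CLIENT_VERSION = re.compile(r"^\d\d:\d\d:\d\d:\s+Version: (\S+)$")
# Core banner, e.g. "21:08:01:WU00:FS01:0x17:Version 0.0.52"
CORE_VERSION = re.compile(
    r"^\d\d:\d\d:\d\d:WU\d+:FS\d+:0x([0-9a-fA-F]+):Version (\S+)$")


def versions(lines):
    """Return the client version and a {core type: version} mapping."""
    found = {"client": None, "cores": {}}
    for line in lines:
        line = line.rstrip()
        client = CLIENT_VERSION.match(line)
        if client:
            found["client"] = client.group(1)
            continue
        core = CORE_VERSION.match(line)
        if core:
            found["cores"][core.group(1)] = core.group(2)
    return found


# Two lines copied from a log in this thread.
LOG = [
    "21:06:56:      Version: 7.4.4",
    "21:08:01:WU00:FS01:0x17:Version 0.0.52",
]
print(versions(LOG))
```
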

Re: Core 17 has suddenly started crashing

Post by PantherX »

Eagle wrote:...I'm starting to believe FAH wants my SSD to be heavily used (which I try to avoid for obvious reasons).. :?
It is understandable that you want to minimize the data written to an SSD. However, it seems that unless you write several GBs of data every day, a consumer SSD would last quite a while with normal usage (http://techreport.com/review/26523/the- ... a-petabyte). Pretty sure that F@H alone is nowhere close to writing that much data, so it should be fairly safe.

I started up my GTX 675M and so far it is working fine:

Code: Select all

*********************** Log Started 2014-06-18T21:06:56Z ***********************
21:06:56:************************* Folding@home Client *************************
21:06:56:      Website: http://folding.stanford.edu/
21:06:56:    Copyright: (c) 2009-2014 Stanford University
21:06:56:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
21:06:56:         Args: 
21:06:56:       Config: D:/FAH/V7/config.xml
21:06:56:******************************** Build ********************************
21:06:56:      Version: 7.4.4
21:06:56:         Date: Mar 4 2014
21:06:56:         Time: 20:26:54
21:06:56:      SVN Rev: 4130
21:06:56:       Branch: fah/trunk/client
21:06:56:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
21:06:56:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
21:06:56:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
21:06:56:     Platform: win32 XP
21:06:56:         Bits: 32
21:06:56:         Mode: Release
21:06:56:******************************* System ********************************
21:06:56:          CPU: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
21:06:56:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
21:06:56:         CPUs: 8
21:06:56:       Memory: 15.89GiB
21:06:56:  Free Memory: 11.29GiB
21:06:56:      Threads: WINDOWS_THREADS
21:06:56:   OS Version: 6.2
21:06:56:  Has Battery: true
21:06:56:   On Battery: false
21:06:56:   UTC Offset: 3
21:06:56:          PID: 10568
21:06:56:          CWD: D:/FAH/V7
21:06:56:           OS: Windows 8 Pro
21:06:56:      OS Arch: AMD64
21:06:56:         GPUs: 1
21:06:56:        GPU 0: NVIDIA:2 GF114 [GeForce GTX 675M]
21:06:56:         CUDA: 2.1
21:06:56:  CUDA Driver: 6000
21:06:56:Win32 Service: false
21:06:56:***********************************************************************
21:06:56:<config>
21:06:56:  <!-- Network -->
21:06:56:  <proxy v=':8080'/>
21:06:56:
21:06:56:  <!-- Remote Command Server -->
21:06:56:  <password v='*********'/>
21:06:56:
21:06:56:  <!-- Slot Control -->
21:06:56:  <power v='full'/>
21:06:56:
21:06:56:  <!-- User Information -->
21:06:56:  <passkey v='********************************'/>
21:06:56:  <team v='69411'/>
21:06:56:  <user v='PantherX'/>
21:06:56:
21:06:56:  <!-- Folding Slots -->
21:06:56:  <slot id='0' type='CPU'>
21:06:56:    <cpus v='7'/>
21:06:56:    <max-packet-size v='small'/>
21:06:56:    <max-slot-errors v='1'/>
21:06:56:    <max-unit-errors v='1'/>
21:06:56:    <next-unit-percentage v='100'/>
21:06:56:    <pause-on-start v='true'/>
21:06:56:  </slot>
21:06:56:  <slot id='1' type='GPU'>
21:06:56:    <max-slot-errors v='1'/>
21:06:56:    <max-unit-errors v='1'/>
21:06:56:    <next-unit-percentage v='100'/>
21:06:56:    <pause-on-start v='true'/>
21:06:56:  </slot>
21:06:56:</config>
21:06:56:Trying to access database...
21:06:56:Successfully acquired database lock
21:06:56:Enabled folding slot 00: PAUSED cpu:7 (by user)
21:06:56:Enabled folding slot 01: PAUSED gpu:0:GF114 [GeForce GTX 675M] (by user)
21:07:06:FS01:Unpaused
21:07:07:WU00:FS01:Connecting to 171.67.108.201:80
21:07:11:WU00:FS01:Assigned to work server 140.163.4.231
21:07:11:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GF114 [GeForce GTX 675M] from 140.163.4.231
21:07:11:WU00:FS01:Connecting to 140.163.4.231:8080
21:07:12:WU00:FS01:Downloading 4.84MiB
21:07:17:WU00:FS01:Download complete
21:07:17:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:316 clone:0 gen:16 core:0x17 unit:0x00000028538b3db75328a93e6b24aff3
21:07:18:WU00:FS01:Downloading core from http://web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah
21:07:18:WU00:FS01:Connecting to web.stanford.edu:80
21:07:32:WU00:FS01:FahCore 17: Downloading 2.55MiB
21:07:38:WU00:FS01:FahCore 17: 19.60%
21:07:44:WU00:FS01:FahCore 17: 46.56%
21:07:50:WU00:FS01:FahCore 17: 71.07%
21:07:56:WU00:FS01:FahCore 17: 98.02%
21:07:56:WU00:FS01:FahCore 17: Download complete
21:07:57:WU00:FS01:Valid core signature
21:07:57:WU00:FS01:Unpacked 8.60MiB to cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe
21:07:57:WU00:FS01:Starting
21:07:57:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" D:/FAH/V7/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 10568 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
21:07:57:WU00:FS01:Started FahCore on PID 4744
21:07:57:WU00:FS01:Core PID:10104
21:07:57:WU00:FS01:FahCore 0x17 started
21:07:58:WU00:FS01:0x17:*********************** Log Started 2014-06-18T21:07:58Z ***********************
21:07:58:WU00:FS01:0x17:Project: 13001 (Run 316, Clone 0, Gen 16)
21:07:58:WU00:FS01:0x17:Unit: 0x00000028538b3db75328a93e6b24aff3
21:07:58:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
21:07:58:WU00:FS01:0x17:Machine: 1
21:07:58:WU00:FS01:0x17:Reading tar file state.xml
21:08:00:WU00:FS01:0x17:Reading tar file system.xml
21:08:01:WU00:FS01:0x17:Reading tar file integrator.xml
21:08:01:WU00:FS01:0x17:Reading tar file core.xml
21:08:01:WU00:FS01:0x17:Digital signatures verified
21:08:01:WU00:FS01:0x17:Folding@home GPU core17
21:08:01:WU00:FS01:0x17:Version 0.0.52
21:11:13:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
21:11:13:WU00:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
21:13:41:FS01:Finishing
21:45:34:WU00:FS01:0x17:Completed 50000 out of 5000000 steps (1%)
22:20:10:WU00:FS01:0x17:Completed 100000 out of 5000000 steps (2%)
22:55:32:WU00:FS01:0x17:Completed 150000 out of 5000000 steps (3%)
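
The time per frame (TPF) can be estimated from the "Completed ... steps" lines above. A minimal Python sketch, assuming each percent step is one frame and that the log crosses midnight at most once (the function name `time_per_frame` is made up for illustration):

```python
import re
from datetime import datetime, timedelta

PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):.*Completed \d+ out of \d+ steps \((\d+)%\)")


def time_per_frame(lines):
    """Average time per 1% frame between the first and last progress line."""
    points = []
    for line in lines:
        match = PROGRESS.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            points.append((stamp, int(match.group(2))))
    if len(points) < 2:
        return None
    (t0, p0), (t1, p1) = points[0], points[-1]
    if p1 == p0:
        return None  # no progress between the two lines
    elapsed = t1 - t0
    if elapsed < timedelta(0):
        elapsed += timedelta(days=1)  # assumption: crossed midnight once
    return elapsed / (p1 - p0)


# Progress lines copied from the log above.
LOG = [
    "21:11:13:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)",
    "21:45:34:WU00:FS01:0x17:Completed 50000 out of 5000000 steps (1%)",
    "22:20:10:WU00:FS01:0x17:Completed 100000 out of 5000000 steps (2%)",
    "22:55:32:WU00:FS01:0x17:Completed 150000 out of 5000000 steps (3%)",
]
tpf = time_per_frame(LOG)
print(round(tpf.total_seconds() / 60, 1), "minutes per frame")
```

For these four lines the average works out to roughly 34.8 minutes per frame.
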
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Eagle
Posts: 116
Joined: Sun Feb 17, 2008 1:06 am
Hardware configuration: AMD Ryzen ThreadRipper 2950X (3.5 GHz)
ASUS Prime X399-A
G.Skill 32 GB DDR4-RAM (3.2 GHz)
EVGA GeForce RTX 2080 Ti Black (1.8 / 7 GHz)
Samsung 970 Pro 1 TB, 850 Pro 512 GB, Crucial C300 256 GB
Western Digital Black 2 TB, Gold 4 TB
Location: » Earth » Europe » Germany
Contact:

Re: Core 17 has suddenly started crashing

Post by Eagle »

bruce wrote:Version numbers of FAHClients, of a Linux FahCore_*, and a Windows FahCore_* are independent of each other. Last I checked, the latest Linux FahCore_17 was 0.0.46, the latest Windows FahCore_17 was 0.0.52 (0.0.55 is being tested) and the FAHClient was 7.4.4. Each one can be updated independently and presumably work with unmodified versions of something else. You're responsible for updating FAHClient. Periodically the FahCores update themselves automatically, depending on the requirements of the WU you're assigned.
You got me wrong on this. I had 7.4.4 installed right when it came out. About every 2-4 weeks, I check for a new FAH client package, because the client still lacks an auto-updater (no offense intended here, I'm patiently waiting for it).
However, AFAIR, the message of FAH 0.0.52 in question was logged _after_ I switched program- and data-directory onto C:, but _never_ before the switch, i.e. on the X: drive.
PantherX wrote:It is understandable that you want to minimize the data written to an SSD. However, it seems that unless you write several GBs of data every day, a consumer SSD would last quite a while with normal usage (http://techreport.com/review/26523/the- ... a-petabyte). Pretty sure that F@H alone is nowhere close to writing that much data, so it should be fairly safe.
I know about the reviews regarding wear-leveling, and that NAND flash from the Intel-Micron joint venture (I own a Crucial C300 with 256 GB) is considered (one of the) best in class, with about 72 TB of data that can be written before EOL. I also know that using a RAM disk for all temporary data, complemented by software adjustments like disabling the disk cache in Firefox, will greatly extend the expected lifetime of my SSD. But I'm a software architect/engineer, so I'm aiming for improvement: the CPU slot has no such issues, only the GPU one. That's why I do believe it's a bug, and bugs can be fixed. ;)
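
To put the 72 TB figure into perspective, here is a rough back-of-the-envelope sketch in Python. The daily write volumes are hypothetical examples, not measurements of what F@H actually writes, and the calculation ignores write amplification and everything else the drive is used for.

```python
def years_until_rated_endurance(endurance_tb, writes_gb_per_day):
    """Years of constant daily writes before the rated endurance is reached."""
    if writes_gb_per_day <= 0:
        raise ValueError("writes_gb_per_day must be positive")
    days = endurance_tb * 1000 / writes_gb_per_day  # decimal units: 1 TB = 1000 GB
    return days / 365.25


# 72 TB is the endurance figure quoted in this post; the rates are made up.
for rate in (5, 20, 50):
    years = years_until_rated_endurance(72, rate)
    print(f"{rate} GB/day -> {years:.1f} years")
```

At a hypothetical 20 GB per day, the rated 72 TB would be reached after roughly 9.9 years.
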

Regarding my FAH setup, I'm not happy about the SSD usage. Weeks have passed since the error's origin was found, but no fix has been provided so far. However, I do respect all the people working on FAH and the fact that their time is limited, which might be why they are working on other issues right now. So, I'm only waiting for GROMACS to finish and will then switch to the one-drive workaround until a fix is available.
PantherX wrote:I started up my GTX 675M and so far it is working fine...
Excuse my nit-picking question, but did it fold until 100%? I'm just asking because mine went fine for about a week, and even on the first failing WU it worked until 87%, which, after a reboot, was "reset" to 85% (latest checkpoint), and then the client immediately died. :(
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: Core 17 has suddenly started crashing

Post by 7im »

The FAHClient and FAHCores would seem to be in a feature freeze at the moment. This bug also has a low occurrence rate (a few) and has a valid workaround. It will get very low priority, unfortunately.

Over time, FAH may be switching to a new type of FAHCore, which may render this (and many other pending issues) a non-issue. Realistically, I would not expect this issue to be addressed in the current FAHClient, unless the bug was embarrassingly obvious and easy to fix. Sorry.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.

Re: Core 17 has suddenly started crashing

Post by Eagle »

I'm an optimist - I'm waiting for 7.4.5 / 0.0.56 then and if that doesn't work, well, Maxwell is coming..
Thanks anyway, 7im! ;)

Re: Core 17 has suddenly started crashing

Post by PantherX »

Eagle wrote:...Excuse my nit-picking question, but did it fold until 100%? I'm just asking because mine went fine for about a week, and even on the first failing WU it worked until 87%, which, after a reboot, was "reset" to 85% (latest checkpoint), and then the client immediately died. :(
It is still folding non-stop on the same WU of Project 13001 with a TPF of ~35 minutes. It is currently at 45% and folding (needs ~1.33 days to finish). Do you want me to reboot the system? If so, do you remember how you rebooted the system:
1) Exited FAHClient, waited for some time, then rebooted the system
2) Simply rebooted the system without exiting the FAHClient

Also, how do you start up FAHClient:
1) Automatically at Windows log-in
2) Manually via a shortcut

With the above information, I can have a better chance at reproducing your error and hopefully, find further similarities if not the root cause.
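
As a side note, the remaining-time estimate quoted in this post follows directly from the TPF. A minimal Python sketch, assuming 100 frames per WU and a constant TPF (the function name `remaining_time` is made up for illustration):

```python
from datetime import timedelta


def remaining_time(percent_done, tpf):
    """Time left for a WU, assuming 100 frames and a constant time per frame."""
    frames_left = 100 - percent_done
    return frames_left * tpf


# Figures from this post: 45% done, TPF of about 35 minutes.
left = remaining_time(45, timedelta(minutes=35))
print(round(left.total_seconds() / 86400, 2), "days")
```

With 55 frames left at 35 minutes each, that is 1925 minutes, or about 1.34 days, in line with the ~1.33 days mentioned above.
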