Re: Core 17 has suddenly started crashing

Posted: Mon Dec 09, 2013 10:22 am
by ArVee
No no, as you suggest, that would be redundant and overkill. The advice named both of them and I repeated it. In my case it was Afterburner, which I otherwise found quite helpful, but I've reverted to running Everest in short bursts in its stead when needed.

Re: Core 17 has suddenly started crashing

Posted: Mon Dec 09, 2013 5:04 pm
by bruce
Presumably, by shutting off both tools, the GPU has reverted to the default clock speed. Also, presumably, you were using those tools to overclock. If these assumptions are correct, I doubt it was either (or both) of those tools that caused the GPU to be unstable, but rather a clock speed that was too high for a GPU running an application that actually puts a heavy load on it.

The hammer-and-tongs approach was a very good first step, and you did learn something, but some additional experimentation might clarify which of several possibilities is the actual issue.

You've probably already seen comments saying that increasing the core clock is very important to FAH whereas adjusting memory clocks is rather unimportant. For games, the reverse is probably true. To some extent, if you reduce one, you may be able to increase the other slightly as total heat is probably the limiting factor.

Re: Core 17 has suddenly started crashing

Posted: Mon Dec 09, 2013 6:16 pm
by ArVee
My experience with a number of systems is that the clock rates set by Afterburner at least (to be clear, I never ran both that and Precision at once) are retained whether or not either tool is running, so the info is getting written and retained somewhere. The clocks are therefore still at the same overclock that had run both Core 15 and Core 17 in the past, so something else changed to cause the problem. Dust, goblins, who knows. Just as likely is something like the revised Core_17, not that I'm blaming it. As for the fix: I worked through a series of scenarios, holding everything constant and changing one thing at a time, and the ONLY change in the last, successful one was turning off Afterburner. That's not necessary on any of my other folding systems, but on that one it worked immediately and has continued to work a day later. I was surprised, but so delighted that I wasn't arguing. My guess is that it's the constant polling of the GPU that the utility must do, but I don't know.

Thank you for the reminder on mem clocks, Bruce. I've meant for a long time to try reducing mem clocks below stock to see if it allows a bit more upside on the shader side. I never thought it would net me much, so I never bothered; there was always something else to do. Given the reminder, I will now, probably this coming weekend.

Re: Core 17 has suddenly started crashing

Posted: Mon Dec 09, 2013 7:34 pm
by 7im
Both tools have a check box option to load the profile at Windows startup, which also starts the application.

I had a similar problem, where the profile worked for weeks, then it started causing Win7 not to fully boot, then restart, crash, restart, etc. When I booted to safe mode and removed the profile (basically going back to defaults), the Windows crashes stopped.

Past performance is no guarantee of future stability. Something always changes: drivers, WUs, FAHCores, etc. When you live on the edge of stability, you occasionally fall off the edge.

Re: Core 17 has suddenly started crashing

Posted: Mon Jan 06, 2014 3:32 am
by mlportersr
Ok, I'm back. I am the guy who started this thread originally and I have followed all the advice that was given to me. The last thing I did was update my video driver. Unfortunately, after that update, all I was downloading was Core 15 WUs. I finally caught a Core 17 WU and it is still crashing. What should I do now, and is there a way to tell my client to ignore Core 17 WUs?

Mike...

Re: Core 17 has suddenly started crashing

Posted: Mon Jan 06, 2014 5:30 am
by 7im
Not with V7. Core 15 is reaching end of life, and soon there will only be Core 17 WUs.

Re: Core 17 has suddenly started crashing

Posted: Wed Mar 12, 2014 3:50 am
by Time2Kill
mlportersr wrote: Ok, I'm back. I am the guy who started this thread originally and I have followed all the advice that was given to me. The last thing I did was update my video driver. Unfortunately, after that update, all I was downloading was Core 15 WUs. I finally caught a Core 17 WU and it is still crashing. What should I do now, and is there a way to tell my client to ignore Core 17 WUs?

Mike...

I am with this guy. I have invested 4 hours of troubleshooting and testing to try to get a Core 17 WU to pass, but they just keep failing right out of the gate.

I am trying to get this thing to do more than 20K per day. I am on the EVGA Team and they raised the values beyond what this machine has been doing. HELP???

System:

i7 960 stock, GTX 570s stock, Win7 x64

Re: Core 17 has suddenly started crashing

Posted: Wed Mar 12, 2014 2:32 pm
by 7im
What driver version?

Re: Core 17 has suddenly started crashing

Posted: Wed Mar 12, 2014 9:21 pm
by Joe_H
Do you also have the OpenCL support installed as part of the nVidia driver package? Without it, Core_17 will not work; Core_15 uses CUDA instead.
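
A rough way to check whether an OpenCL runtime is visible on the machine at all is to look for the OpenCL loader library. The snippet below is only a minimal sketch, assuming Python is installed; the function name `opencl_loader_present` is made up for illustration. It only tells you that the loader library can be found and loaded, not that the GPU is actually exposed as an OpenCL device or that Core_17 will run on it.

```python
import ctypes
import ctypes.util


def opencl_loader_present():
    """Return True if an OpenCL loader library can be located and loaded.

    This checks only for the loader (e.g. OpenCL.dll on Windows); it does
    not enumerate platforms or devices.
    """
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return False
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if opencl_loader_present():
        print("OpenCL loader found")
    else:
        print("OpenCL loader not found; check the driver installation")
```

If the loader is missing, reinstalling the driver package with all components selected would be the first thing to try.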

Re: Core 17 has suddenly started crashing

Posted: Thu Mar 13, 2014 2:21 am
by PantherX
Welcome to the F@H Forum Time2Kill,

Please include the log file so that we can see your system configuration and F@H settings. Any error messages would be helpful too for troubleshooting your issue.

Re: Core 17 has suddenly started crashing

Posted: Mon Jun 02, 2014 4:53 pm
by Eagle
Hello everyone,

Core 17 (version 0.0.52) crashes on my end, too.

nVidia driver package: 337.88
CUDA driver version: 8.17.13.3788 (CUDA 6.0.1)

My FAH setup:

Code:

*********************** Log Started 2014-06-02T16:23:30Z ***********************
16:23:30:************************* Folding@home Client *************************
16:23:30:      Website: http://folding.stanford.edu/
16:23:30:    Copyright: (c) 2009-2014 Stanford University
16:23:30:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:23:30:         Args: 
16:23:30:       Config: X:/Folding At Home/config.xml
16:23:30:******************************** Build ********************************
16:23:30:      Version: 7.4.4
16:23:30:         Date: Mar 4 2014
16:23:30:         Time: 20:26:54
16:23:30:      SVN Rev: 4130
16:23:30:       Branch: fah/trunk/client
16:23:30:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
16:23:30:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
16:23:30:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
16:23:30:     Platform: win32 XP
16:23:30:         Bits: 32
16:23:30:         Mode: Release
16:23:30:******************************* System ********************************
16:23:30:          CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
16:23:30:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
16:23:30:         CPUs: 8
16:23:30:       Memory: 31.97GiB
16:23:30:  Free Memory: 24.40GiB
16:23:30:      Threads: WINDOWS_THREADS
16:23:30:   OS Version: 6.1
16:23:30:  Has Battery: false
16:23:30:   On Battery: false
16:23:30:   UTC Offset: 2
16:23:30:          PID: 5640
16:23:30:          CWD: X:/Folding At Home
16:23:30:           OS: Windows 7 Ultimate
16:23:30:      OS Arch: AMD64
16:23:30:         GPUs: 1
16:23:30:        GPU 0: NVIDIA:3 GK110 [GeForce GTX 780]
16:23:30:         CUDA: 3.5
16:23:30:  CUDA Driver: 6000
16:23:30:Win32 Service: false
16:23:30:***********************************************************************
16:23:30:<config>
16:23:30:  <service-description v='Folding@home Client'/>
16:23:30:  <service-restart v='true'/>
16:23:30:  <service-restart-delay v='5000'/>
16:23:30:
16:23:30:  <!-- Client Control -->
16:23:30:  <client-threads v='6'/>
16:23:30:  <cycle-rate v='4'/>
16:23:30:  <cycles v='-1'/>
16:23:30:  <data-directory v='.'/>
16:23:30:  <disable-sleep-when-active v='true'/>
16:23:30:  <exec-directory v='C:\Program Files (x86)\FAHClient'/>
16:23:30:  <exit-when-done v='false'/>
16:23:30:  <fold-anon v='false'/>
16:23:30:  <open-web-control v='false'/>
16:23:30:
16:23:30:  <!-- Configuration -->
16:23:30:  <config-rotate v='true'/>
16:23:30:  <config-rotate-dir v='configs'/>
16:23:30:  <config-rotate-max v='16'/>
16:23:30:
16:23:30:  <!-- Debugging -->
16:23:30:  <assignment-servers>
16:23:30:    assign3.stanford.edu:8080 assign4.stanford.edu:80
16:23:30:  </assignment-servers>
16:23:30:  <auth-as v='true'/>
16:23:30:  <capture-directory v='capture'/>
16:23:30:  <capture-on-error v='false'/>
16:23:30:  <capture-packets v='false'/>
16:23:30:  <capture-requests v='false'/>
16:23:30:  <capture-responses v='false'/>
16:23:30:  <capture-sockets v='false'/>
16:23:30:  <core-exec v='FahCore_$type'/>
16:23:30:  <core-wrapper-exec v='FAHCoreWrapper'/>
16:23:30:  <debug-sockets v='false'/>
16:23:30:  <exception-locations v='true'/>
16:23:30:  <gpu-assignment-servers>
16:23:30:    assign-GPU.stanford.edu:80 assign-GPU2.stanford.edu:80
16:23:30:  </gpu-assignment-servers>
16:23:30:  <stack-traces v='false'/>
16:23:30:
16:23:30:  <!-- Error Handling -->
16:23:30:  <max-slot-errors v='10'/>
16:23:30:  <max-unit-errors v='5'/>
16:23:30:
16:23:30:  <!-- Folding Core -->
16:23:30:  <checkpoint v='15'/>
16:23:30:  <core-dir v='cores'/>
16:23:30:  <core-priority v='idle'/>
16:23:30:  <cpu-affinity v='false'/>
16:23:30:  <cpu-usage v='100'/>
16:23:30:  <gpu-usage v='100'/>
16:23:30:  <no-assembly v='false'/>
16:23:30:
16:23:30:  <!-- Folding Slot Configuration -->
16:23:30:  <cause v='ANY'/>
16:23:30:  <client-subtype v='STDCLI'/>
16:23:30:  <client-type v='normal'/>
16:23:30:  <cpu-species v='X86_PENTIUM_II'/>
16:23:30:  <cpu-type v='AMD64'/>
16:23:30:  <cpus v='-1'/>
16:23:30:  <gpu v='true'/>
16:23:30:  <max-packet-size v='normal'/>
16:23:30:  <os-species v='UNKNOWN'/>
16:23:30:  <os-type v='WIN32'/>
16:23:30:  <project-key v='0'/>
16:23:30:  <smp v='true'/>
16:23:30:
16:23:30:  <!-- GUI -->
16:23:30:  <gui-enabled v='true'/>
16:23:30:
16:23:30:  <!-- HTTP Server -->
16:23:30:  <allow v='127.0.0.1'/>
16:23:30:  <connection-timeout v='60'/>
16:23:30:  <deny v='0/0'/>
16:23:30:  <http-addresses v='0:7396'/>
16:23:30:  <https-addresses v=''/>
16:23:30:  <max-connect-time v='900'/>
16:23:30:  <max-connections v='800'/>
16:23:30:  <max-request-length v='52428800'/>
16:23:30:  <min-connect-time v='300'/>
16:23:30:  <threads v='8'/>
16:23:30:
16:23:30:  <!-- Logging -->
16:23:30:  <log v='log.txt'/>
16:23:30:  <log-color v='false'/>
16:23:30:  <log-crlf v='true'/>
16:23:30:  <log-date v='false'/>
16:23:30:  <log-date-periodically v='21600'/>
16:23:30:  <log-debug v='true'/>
16:23:30:  <log-domain v='false'/>
16:23:30:  <log-header v='true'/>
16:23:30:  <log-level v='true'/>
16:23:30:  <log-no-info-header v='true'/>
16:23:30:  <log-redirect v='false'/>
16:23:30:  <log-rotate v='true'/>
16:23:30:  <log-rotate-dir v='logs'/>
16:23:30:  <log-rotate-max v='5'/>
16:23:30:  <log-short-level v='false'/>
16:23:30:  <log-simple-domains v='true'/>
16:23:30:  <log-thread-id v='false'/>
16:23:30:  <log-thread-prefix v='true'/>
16:23:30:  <log-time v='true'/>
16:23:30:  <log-to-screen v='true'/>
16:23:30:  <log-truncate v='false'/>
16:23:30:  <verbosity v='5'/>
16:23:30:
16:23:30:  <!-- Network -->
16:23:30:  <proxy v=':8080'/>
16:23:30:  <proxy-enable v='false'/>
16:23:30:  <proxy-pass v=''/>
16:23:30:  <proxy-user v=''/>
16:23:30:
16:23:30:  <!-- Process Control -->
16:23:30:  <child v='false'/>
16:23:30:  <daemon v='false'/>
16:23:30:  <pid v='false'/>
16:23:30:  <pid-file v='Folding@home Client.pid'/>
16:23:30:  <respawn v='false'/>
16:23:30:  <service v='false'/>
16:23:30:
16:23:30:  <!-- Remote Command Server -->
16:23:30:  <command-address v='0.0.0.0'/>
16:23:30:  <command-allow-no-pass v='127.0.0.1'/>
16:23:30:  <command-deny-no-pass v='0/0'/>
16:23:30:  <command-enable v='true'/>
16:23:30:  <command-port v='36330'/>
16:23:30:
16:23:30:  <!-- Slot Control -->
16:23:30:  <idle v='false'/>
16:23:30:  <max-shutdown-wait v='60'/>
16:23:30:  <pause-on-battery v='true'/>
16:23:30:  <pause-on-start v='false'/>
16:23:30:  <paused v='false'/>
16:23:30:  <power v='full'/>
16:23:30:
16:23:30:  <!-- User Information -->
16:23:30:  <machine-id v='0'/>
16:23:30:  <passkey v='********************************'/>
16:23:30:  <team v='34361'/>
16:23:30:  <user v='Eagle3386'/>
16:23:30:
16:23:30:  <!-- Web Server -->
16:23:30:  <web-allow v='127.0.0.1'/>
16:23:30:  <web-deny v='0/0'/>
16:23:30:  <web-enable v='true'/>
16:23:30:
16:23:30:  <!-- Web Server Sessions -->
16:23:30:  <session-cookie v='sid'/>
16:23:30:  <session-lifetime v='86400'/>
16:23:30:  <session-timeout v='3600'/>
16:23:30:
16:23:30:  <!-- Work Unit Control -->
16:23:30:  <dump-after-deadline v='true'/>
16:23:30:  <max-queue v='16'/>
16:23:30:  <max-units v='0'/>
16:23:30:  <next-unit-percentage v='99'/>
16:23:30:  <stall-detection-enabled v='false'/>
16:23:30:  <stall-percent v='5'/>
16:23:30:  <stall-timeout v='1800'/>
16:23:30:
16:23:30:  <!-- Folding Slots -->
16:23:30:  <slot id='1' type='GPU'>
16:23:30:    <client-type v='advanced'/>
16:23:30:  </slot>
16:23:30:  <slot id='0' type='CPU'>
16:23:30:    <client-type v='bigadv'/>
16:23:30:    <cpus v='7'/>
16:23:30:  </slot>
16:23:30:</config>
16:23:30:Trying to access database...
16:23:30:Successfully acquired database lock
16:23:30:Enabled folding slot 01: READY gpu:0:GK110 [GeForce GTX 780]
16:23:30:Enabled folding slot 00: READY cpu:7
16:23:30:Started thread 6 on PID 5640
16:23:30:Started thread 4 on PID 5640
16:23:30:Started thread 5 on PID 5640
16:23:30:Started thread 9 on PID 5640
16:23:30:Started thread 7 on PID 5640
16:23:30:Started thread 8 on PID 5640
16:23:30:WU02:FS00:Starting
16:23:30:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe" -dir 02 -suffix 01 -version 704 -lifeline 5640 -checkpoint 15 -np 7
16:23:30:WU02:FS00:Started FahCore on PID 6888
16:23:30:Started thread 10 on PID 5640
16:23:30:WU02:FS00:Core PID:6912
16:23:30:WU02:FS00:FahCore 0xa3 started
16:23:30:WU02:FS00:0xa3:
16:23:30:WU02:FS00:0xa3:*------------------------------*
16:23:30:WU02:FS00:0xa3:Folding@Home Gromacs SMP Core
16:23:30:WU02:FS00:0xa3:Version 2.27 (Dec. 15, 2010)
16:23:30:WU02:FS00:0xa3:
16:23:30:WU02:FS00:0xa3:Preparing to commence simulation
16:23:30:WU02:FS00:0xa3:- Ensuring status. Please wait.
16:23:32:WU00:FS01:Connecting to 171.67.108.201:80
16:23:34:WU00:FS01:Assigned to work server 140.163.4.231
16:23:34:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GK110 [GeForce GTX 780] from 140.163.4.231
16:23:34:WU00:FS01:Connecting to 140.163.4.231:8080
16:23:35:WU00:FS01:Downloading 4.83MiB
16:23:40:WU00:FS01:Download complete
16:23:40:WU02:FS00:0xa3:- Looking at optimizations...
16:23:40:WU02:FS00:0xa3:- Working with standard loops on this execution.
16:23:40:WU02:FS00:0xa3:- Previous termination of core was improper.
16:23:40:WU02:FS00:0xa3:- Files status OK
16:23:40:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:434 clone:1 gen:23 core:0x17 unit:0x00000028538b3db75328cabc362813c8
16:23:40:WU00:FS01:Downloading core from http://web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah
16:23:40:WU00:FS01:Connecting to web.stanford.edu:80
16:23:40:WU02:FS00:0xa3:- Expanded 1766332 -> 2700840 (decompressed 152.9 percent)
16:23:40:WU02:FS00:0xa3:Called DecompressByteArray: compressed_data_size=1766332 data_size=2700840, decompressed_data_size=2700840 diff=0
16:23:40:WU02:FS00:0xa3:- Digital signature verified
16:23:40:WU02:FS00:0xa3:
16:23:40:WU02:FS00:0xa3:Project: 7504 (Run 14, Clone 78, Gen 234)
16:23:40:WU02:FS00:0xa3:
16:23:40:WU02:FS00:0xa3:Entering M.D.
16:23:41:WU00:FS01:FahCore 17: Downloading 2.55MiB
16:23:46:WU02:FS00:0xa3:Using Gromacs checkpoints
16:23:47:WU00:FS01:FahCore 17: 36.76%
16:23:47:WU02:FS00:0xa3:Mapping NT from 7 to 7 
16:23:48:WU02:FS00:0xa3:Resuming from checkpoint
16:23:48:WU02:FS00:0xa3:Verified 02/wudata_01.log
16:23:49:WU02:FS00:0xa3:Verified 02/wudata_01.trr
16:23:49:WU02:FS00:0xa3:Verified 02/wudata_01.xtc
16:23:49:WU02:FS00:0xa3:Verified 02/wudata_01.edr
16:23:49:WU02:FS00:0xa3:Completed 267340 out of 500000 steps  (53%)
16:23:53:WU00:FS01:FahCore 17: 75.97%
16:23:56:WU00:FS01:FahCore 17: Download complete
16:23:56:WU00:FS01:Valid core signature
16:23:56:WARNING:WU00:FS01:FahCore has not changed since last download, aborting core update
16:23:56:WU00:FS01:Starting
16:23:56:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 00 -suffix 01 -version 704 -lifeline 5640 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
16:23:56:WU00:FS01:Started FahCore on PID 6964
16:23:56:Started thread 11 on PID 5640
16:23:57:WU00:FS01:Core PID:4132
16:23:57:WU00:FS01:FahCore 0x17 started
16:23:58:WU00:FS01:0x17:*********************** Log Started 2014-06-02T16:23:58Z ***********************
16:23:58:WU00:FS01:0x17:Project: 13001 (Run 434, Clone 1, Gen 23)
16:23:58:WU00:FS01:0x17:Unit: 0x00000028538b3db75328cabc362813c8
16:23:58:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
16:23:58:WU00:FS01:0x17:Machine: 1
16:23:58:WU00:FS01:0x17:Reading tar file state.xml
16:23:59:WU00:FS01:0x17:Reading tar file system.xml
16:24:00:WU00:FS01:0x17:Reading tar file integrator.xml
16:24:00:WU00:FS01:0x17:Reading tar file core.xml
16:24:00:WU00:FS01:0x17:Digital signatures verified
16:24:00:WU00:FS01:0x17:Folding@home GPU core17
16:24:00:WU00:FS01:0x17:Version 0.0.52
16:26:38:WU02:FS00:0xa3:Completed 270000 out of 500000 steps  (54%)
16:26:45:Started thread 12 on PID 5640
16:27:53:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
16:27:53:WU00:FS01:0x17:Lost lifeline PID 6964, exiting
16:27:53:WU00:FS01:0x17:Lost lifeline PID 6964, exiting
16:27:53:WU00:FS01:0x17:ERROR:103: Lost client lifeline
16:27:53:WU00:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
16:27:54:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
16:27:54:WARNING:WU00:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
16:27:54:WU00:FS01:Starting
16:27:54:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 00 -suffix 01 -version 704 -lifeline 5640 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
16:27:54:WU00:FS01:Started FahCore on PID 7912
16:27:54:Started thread 13 on PID 5640
16:27:54:WU00:FS01:Core PID:7924
16:27:54:WU00:FS01:FahCore 0x17 started
16:27:54:WU00:FS01:0x17:*********************** Log Started 2014-06-02T16:27:54Z ***********************
16:27:54:WU00:FS01:0x17:Project: 13001 (Run 434, Clone 1, Gen 23)
16:27:54:WU00:FS01:0x17:Unit: 0x00000028538b3db75328cabc362813c8
16:27:54:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
16:27:54:WU00:FS01:0x17:Machine: 1
16:27:54:WU00:FS01:0x17:Reading tar file state.xml
16:27:55:WU00:FS01:0x17:Reading tar file system.xml
16:27:56:WU00:FS01:0x17:Reading tar file integrator.xml
16:27:56:WU00:FS01:0x17:Reading tar file core.xml
16:27:56:WU00:FS01:0x17:Digital signatures verified
16:27:56:WU00:FS01:0x17:Folding@home GPU core17
16:27:56:WU00:FS01:0x17:Version 0.0.52
16:31:48:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
16:31:48:WU00:FS01:0x17:Lost lifeline PID 7912, exiting
16:31:48:WU00:FS01:0x17:Lost lifeline PID 7912, exiting
16:31:48:WU00:FS01:0x17:ERROR:103: Lost client lifeline
16:31:48:WU00:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
16:31:49:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
16:31:49:WARNING:WU00:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
16:31:49:WU00:FS01:Starting
16:31:49:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "X:/Folding At Home/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe" -dir 00 -suffix 01 -version 704 -lifeline 5640 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
16:31:49:WU00:FS01:Started FahCore on PID 8032
16:31:49:Started thread 14 on PID 5640
16:31:49:WU00:FS01:Core PID:2396
16:31:49:WU00:FS01:FahCore 0x17 started
16:31:50:WU00:FS01:0x17:*********************** Log Started 2014-06-02T16:31:49Z ***********************
16:31:50:WU00:FS01:0x17:Project: 13001 (Run 434, Clone 1, Gen 23)
16:31:50:WU00:FS01:0x17:Unit: 0x00000028538b3db75328cabc362813c8
16:31:50:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
16:31:50:WU00:FS01:0x17:Machine: 1
16:31:50:WU00:FS01:0x17:Reading tar file state.xml
16:31:50:WU00:FS01:0x17:Reading tar file system.xml
16:31:51:WU00:FS01:0x17:Reading tar file integrator.xml
16:31:51:WU00:FS01:0x17:Reading tar file core.xml
16:31:51:WU00:FS01:0x17:Digital signatures verified
16:31:51:WU00:FS01:0x17:Folding@home GPU core17
16:31:51:WU00:FS01:0x17:Version 0.0.52
16:32:48:WU02:FS00:0xa3:Completed 275000 out of 500000 steps  (55%)
16:35:39:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
16:35:39:WU00:FS01:0x17:Lost lifeline PID 8032, exiting
16:35:39:WU00:FS01:0x17:Lost lifeline PID 8032, exiting
16:35:39:WU00:FS01:0x17:ERROR:103: Lost client lifeline
16:35:39:WU00:FS01:0x17:Folding@home Core Shutdown: CLIENT_DIED
16:35:39:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
16:35:39:WARNING:WU00:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)
The error is repeated several times until the slot finally fails. If you need any further information, please let me know.

Best regards,
Martin
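
For anyone wading through a long log like the one above, the repeated failures can be pulled out mechanically. The following is a minimal sketch, assuming the line layout seen in this log (`WUxx:FSxx:FahCore returned: NAME (code = 0x..)` and `Completed N out of M steps`); the function name `summarize_core_returns` is hypothetical and not part of any FAH tool. It lists every FahCore return code it finds, together with the last step count reported for that work unit and slot.

```python
import re

# e.g. "16:27:54:WARNING:WU00:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)"
RETURNED = re.compile(
    r"(WU\d+):(FS\d+):FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)
# e.g. "16:27:53:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)"
PROGRESS = re.compile(
    r"(WU\d+):(FS\d+):0x\w+:Completed (\d+) out of (\d+) steps"
)


def summarize_core_returns(log_lines):
    """List (wu, slot, return_name, return_code, last_progress) per core exit.

    last_progress is a (steps_done, steps_total) tuple from the most recent
    "Completed" line for that WU and slot, or None if none was seen.
    """
    last_progress = {}
    results = []
    for line in log_lines:
        match = PROGRESS.search(line)
        if match:
            key = (match.group(1), match.group(2))
            last_progress[key] = (int(match.group(3)), int(match.group(4)))
            continue
        match = RETURNED.search(line)
        if match:
            key = (match.group(1), match.group(2))
            results.append(
                (key[0], key[1], match.group(3), int(match.group(4)),
                 last_progress.get(key))
            )
    return results


if __name__ == "__main__":
    sample = [
        "16:27:53:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)",
        "16:27:54:WARNING:WU00:FS01:FahCore returned: CLIENT_DIED (103 = 0x67)",
    ]
    for entry in summarize_core_returns(sample):
        print(entry)
```

To run it on a real log, pass the lines of `log.txt` instead of the sample, for example `summarize_core_returns(open("log.txt", encoding="utf-8", errors="replace"))`.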

Re: Core 17 has suddenly started crashing

Posted: Mon Jun 02, 2014 5:00 pm
by 7im
When it errors at 0%, it's more likely a bad WU than a client or core crash. Also, FAHCores don't really crash (in a released version). They just report the error data from the WU.

Beta cores are a different issue.

Please put the verbosity setting back to 3. All the extra data in the log is not helpful. The log only shows non-default config settings at level 3, and that's the data we want.
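
The verbosity option normally gets changed through the client's own configuration screens, but for reference, here is a minimal sketch of doing it on the XML text directly. It assumes config.xml uses the flat layout shown in the log dump above (option elements with a `v` attribute directly under `<config>`); `set_verbosity` is a hypothetical helper name. The client rewrites config.xml itself, so stop it before editing the file by hand.

```python
import xml.etree.ElementTree as ET


def set_verbosity(config_xml_text, level=3):
    """Return config XML text with <verbosity v='...'/> set to the given level.

    Assumes options are direct children of the root <config> element,
    each carrying its value in a 'v' attribute.
    """
    root = ET.fromstring(config_xml_text)
    node = root.find("verbosity")
    if node is None:
        # Option not present yet, so add it under <config>.
        node = ET.SubElement(root, "verbosity")
    node.set("v", str(level))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    before = "<config><verbosity v='5'/><power v='full'/></config>"
    print(set_verbosity(before, 3))
```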

Re: Core 17 has suddenly started crashing

Posted: Mon Jun 02, 2014 5:10 pm
by Eagle
Sorry for the log spam. I changed that accordingly; here is the new config log:

Code:

17:06:39:Saving configuration to config.xml
17:06:39:<config>
17:06:39:  <!-- Logging -->
17:06:39:  <log-rotate-max v='5'/>
17:06:39:
17:06:39:  <!-- Network -->
17:06:39:  <proxy v=':8080'/>
17:06:39:
17:06:39:  <!-- Slot Control -->
17:06:39:  <power v='full'/>
17:06:39:
17:06:39:  <!-- User Information -->
17:06:39:  <passkey v='********************************'/>
17:06:39:  <team v='34361'/>
17:06:39:  <user v='Eagle3386'/>
17:06:39:
17:06:39:  <!-- Folding Slots -->
17:06:39:  <slot id='1' type='GPU'>
17:06:39:    <client-type v='advanced'/>
17:06:39:  </slot>
17:06:39:  <slot id='0' type='CPU'>
17:06:39:    <client-type v='bigadv'/>
17:06:39:    <cpus v='7'/>
17:06:39:  </slot>
17:06:39:</config>
Regarding the error, it doesn't happen at exactly 0%. It's mostly around 1% when the error happens - if that makes a difference.

Re: Core 17 has suddenly started crashing

Posted: Mon Jun 02, 2014 5:16 pm
by 7im
Is the GPU overclocked?

Been folding a while and this just started? Or just started folding?

Re: Core 17 has suddenly started crashing

Posted: Mon Jun 02, 2014 5:32 pm
by Eagle
No personal OC, but EVGA applied some higher clocks.
This specific GPU has been folding for about a year now, and until a couple of days/weeks ago, everything ran just fine.