FAH makes GPU driver fail (official 13.9 and beta 13.11v9.2)

Posted: Thu Nov 21, 2013 3:19 am
by heffeque
This is the system:

*********************** Log Started 2013-11-21T02:35:12Z ***********************
02:35:12:************************* Folding@home Client *************************
02:35:12:      Website: http://folding.stanford.edu/
02:35:12:    Copyright: (c) 2009-2013 Stanford University
02:35:12:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
02:35:12:         Args: --open-web-control
02:35:12:       Config: <none>
02:35:12:******************************** Build ********************************
02:35:12:      Version: 7.3.6
02:35:12:         Date: Feb 18 2013
02:35:12:         Time: 15:25:17
02:35:12:      SVN Rev: 3923
02:35:12:       Branch: fah/trunk/client
02:35:12:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
02:35:12:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
02:35:12:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
02:35:12:     Platform: win32 XP
02:35:12:         Bits: 32
02:35:12:         Mode: Release
02:35:12:******************************* System ********************************
02:35:12:          CPU: AMD E-450 APU with Radeon(tm) HD Graphics
02:35:12:       CPU ID: AuthenticAMD Family 20 Model 2 Stepping 0
02:35:12:         CPUs: 2
02:35:12:       Memory: 6.98GiB
02:35:12:  Free Memory: 4.70GiB
02:35:12:      Threads: WINDOWS_THREADS
02:35:12:  Has Battery: false
02:35:12:   On Battery: false
02:35:12:   UTC offset: 1
02:35:12:          PID: 6036
02:35:12:          CWD: C:/ProgramData/FAHClient
02:35:12:           OS: Windows 7 Ultimate
02:35:12:      OS Arch: AMD64
02:35:12:         GPUs: 1
02:35:12:        GPU 0: ATI:4 Wrestler [Radeon HD 6320]
02:35:12:         CUDA: Not detected
02:35:12:Win32 Service: false
02:35:12:***********************************************************************
02:35:12:<config>
02:35:12:  <!-- Folding Slots -->
02:35:12:</config>
02:35:13:Connecting to assign-GPU.stanford.edu:80
02:35:14:Connecting to assign-GPU.stanford.edu:8080
02:35:15:Read GPUs.txt
02:35:15:Trying to access database...
02:35:16:Successfully acquired database lock
02:35:16:Enabled folding slot 00: PAUSED gpu:0:Wrestler [Radeon HD 6320] (not configured)
02:35:16:Enabled folding slot 01: PAUSED cpu:1 (not configured)
02:35:20:3:127.0.0.1:New Web connection
02:35:43:Set client configured
02:35:43:WU00:FS01:Connecting to assign-GPU.stanford.edu:80
02:35:44:WU00:FS01:Connecting to assign3.stanford.edu:8080
02:35:45:WU00:FS01:News: Welcome to Folding@Home
02:35:45:WU00:FS01:Assigned to work server 171.64.65.81
02:35:45:WU00:FS01:Requesting new work unit for slot 01: READY cpu:1 from 171.64.65.81
02:35:45:WU00:FS01:Connecting to 171.64.65.81:8080
02:35:46:WU00:FS01:Downloading 368.09KiB
02:35:48:WU00:FS01:Download complete
02:35:48:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:10450 run:162 clone:0 gen:175 core:0xa4 unit:0x000003360a3b1e7550a539fc3dd17901
02:35:49:WU00:FS01:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah
02:35:49:WU00:FS01:Connecting to www.stanford.edu:80
02:35:49:WU01:FS00:Connecting to assign-GPU.stanford.edu:80
02:35:49:WU00:FS01:FahCore a4: Downloading 2.89MiB
02:35:50:WU01:FS00:News: Welcome to Folding@Home
02:35:50:WU01:FS00:Assigned to work server 171.64.65.69
02:35:50:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:Wrestler [Radeon HD 6320] from 171.64.65.69
02:35:50:WU01:FS00:Connecting to 171.64.65.69:8080
02:35:51:WU01:FS00:Downloading 4.18MiB
02:35:55:WU00:FS01:FahCore a4: 23.81%
02:35:57:WU01:FS00:Download 34.42%
02:35:57:WARNING:Exception: 8:127.0.0.1: Send error: 10053: An established connection was aborted by the software in your host machine.
02:36:01:WU00:FS01:FahCore a4: 51.94%
02:36:03:WU01:FS00:Download 77.83%
02:36:05:WU01:FS00:Download complete
02:36:05:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:8900 run:67 clone:3 gen:29 core:0x17 unit:0x00000037028c126651a6355e8250728b
02:36:05:WU01:FS00:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah
02:36:05:WU01:FS00:Connecting to www.stanford.edu:80
02:36:06:WU01:FS00:FahCore 17: Downloading 2.55MiB
02:36:07:WU00:FS01:FahCore a4: 77.91%
02:36:11:WU00:FS01:FahCore a4: Download complete
02:36:11:WU00:FS01:Valid core signature
02:36:11:WU00:FS01:Unpacked 9.59MiB to cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe
02:36:11:WU00:FS01:Starting
02:36:11:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 6036 -checkpoint 15
02:36:12:WU00:FS01:Started FahCore on PID 2356
02:36:12:WU01:FS00:FahCore 17: 24.51%
02:36:12:WU00:FS01:Core PID:5348
02:36:12:WU00:FS01:FahCore 0xa4 started
02:36:13:WU00:FS01:0xa4:
02:36:13:WU00:FS01:0xa4:*------------------------------*
02:36:13:WU00:FS01:0xa4:Folding@Home Gromacs GB Core
02:36:13:WU00:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
02:36:13:WU00:FS01:0xa4:
02:36:13:WU00:FS01:0xa4:Preparing to commence simulation
02:36:13:WU00:FS01:0xa4:- Looking at optimizations...
02:36:13:WU00:FS01:0xa4:- Created dyn
02:36:13:WU00:FS01:0xa4:- Files status OK
02:36:13:WU00:FS01:0xa4:- Expanded 376417 -> 690248 (decompressed 183.3 percent)
02:36:13:WU00:FS01:0xa4:Called DecompressByteArray: compressed_data_size=376417 data_size=690248, decompressed_data_size=690248 diff=0
02:36:13:WU00:FS01:0xa4:- Digital signature verified
02:36:13:WU00:FS01:0xa4:
02:36:13:WU00:FS01:0xa4:Project: 10450 (Run 162, Clone 0, Gen 175)
02:36:13:WU00:FS01:0xa4:
02:36:13:WU00:FS01:0xa4:Assembly optimizations on if available.
02:36:13:WU00:FS01:0xa4:Entering M.D.
02:36:18:WU01:FS00:FahCore 17: 58.81%
02:36:19:WU00:FS01:0xa4:Mapping NT from 1 to 1 
02:36:20:WU00:FS01:0xa4:Completed 0 out of 2000000 steps  (0%)
02:36:24:WU01:FS00:FahCore 17: 85.77%
02:36:26:WU01:FS00:FahCore 17: Download complete
02:36:26:WU01:FS00:Valid core signature
02:36:26:WU01:FS00:Unpacked 8.60MiB to cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe
02:36:27:WU01:FS00:Starting
02:36:27:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 6036 -checkpoint 15 -gpu 0 -gpu-vendor ati
02:36:27:WU01:FS00:Started FahCore on PID 3780
02:36:28:WU01:FS00:Core PID:5460
02:36:28:WU01:FS00:FahCore 0x17 started
02:36:29:WU01:FS00:0x17:*********************** Log Started 2013-11-21T02:36:29Z ***********************
02:36:29:WU01:FS00:0x17:Project: 8900 (Run 67, Clone 3, Gen 29)
02:36:29:WU01:FS00:0x17:Unit: 0x00000037028c126651a6355e8250728b
02:36:29:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
02:36:29:WU01:FS00:0x17:Machine: 0
02:36:29:WU01:FS00:0x17:Reading tar file state.xml
02:36:32:WU01:FS00:0x17:Reading tar file system.xml
02:36:34:WU01:FS00:0x17:Reading tar file integrator.xml
02:36:34:WU01:FS00:0x17:Reading tar file core.xml
02:36:34:WU01:FS00:0x17:Digital signatures verified
02:36:34:WU01:FS00:0x17:Folding@home GPU core17
02:36:34:WU01:FS00:0x17:Version 0.0.52
It's a cheap Zotac AD04 with 8 GB of RAM (I've set it to 7 GB for the system and 1 GB for the GPU).

Things I've noticed:
  • The GPU on my E-450 isn't Wrestler (FAH bug?). It's Zacate.
  • It takes a few minutes of Core_17 folding to make the GPU driver crash (tested on official "13.9" and beta "13.11 v9.2")
  • After it makes the GPU driver hang, the machine restarts the GPU driver, but anything that has to do with the GPU gets really slow, even after closing FAH.
  • There's no log on FAH that indicates any errors on the GPU side even after the GPU driver crash.
  • Progress on the GPU side stays at 0.00% although it does complete a few steps.
  • CPU usage drops to 50%, even in Full mode.
  • Turning Folding Power to off for a while and turning it to Full again will make the GPU start folding again.
Log after setting it to off, closing FAH and starting it again:

*********************** Log Started 2013-11-21T03:12:33Z ***********************
03:12:33:************************* Folding@home Client *************************
03:12:33:      Website: http://folding.stanford.edu/
03:12:33:    Copyright: (c) 2009-2013 Stanford University
03:12:33:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:12:33:         Args: --open-web-control
03:12:33:       Config: C:/ProgramData/FAHClient/config.xml
03:12:33:******************************** Build ********************************
03:12:33:      Version: 7.3.6
03:12:33:         Date: Feb 18 2013
03:12:33:         Time: 15:25:17
03:12:33:      SVN Rev: 3923
03:12:33:       Branch: fah/trunk/client
03:12:33:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
03:12:33:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
03:12:33:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
03:12:33:     Platform: win32 XP
03:12:33:         Bits: 32
03:12:33:         Mode: Release
03:12:33:******************************* System ********************************
03:12:33:          CPU: AMD E-450 APU with Radeon(tm) HD Graphics
03:12:33:       CPU ID: AuthenticAMD Family 20 Model 2 Stepping 0
03:12:33:         CPUs: 2
03:12:33:       Memory: 6.98GiB
03:12:33:  Free Memory: 4.79GiB
03:12:33:      Threads: WINDOWS_THREADS
03:12:33:  Has Battery: false
03:12:33:   On Battery: false
03:12:33:   UTC offset: 1
03:12:33:          PID: 5092
03:12:33:          CWD: C:/ProgramData/FAHClient
03:12:33:           OS: Windows 7 Ultimate
03:12:33:      OS Arch: AMD64
03:12:33:         GPUs: 1
03:12:33:        GPU 0: ATI:4 Wrestler [Radeon HD 6320]
03:12:33:         CUDA: Not detected
03:12:33:Win32 Service: false
03:12:33:***********************************************************************
03:12:33:<config>
03:12:33:  <!-- Folding Slot Configuration -->
03:12:33:  <power v='off'/>
03:12:33:
03:12:33:  <!-- User Information -->
03:12:33:  <team v='****'/>
03:12:33:  <user v='****'/>
03:12:33:
03:12:33:  <!-- Folding Slots -->
03:12:33:  <slot id='0' type='GPU'/>
03:12:33:  <slot id='1' type='CPU'/>
03:12:33:</config>
03:12:33:Trying to access database...
03:12:33:Successfully acquired database lock
03:12:33:Enabled folding slot 00: PAUSED gpu:0:Wrestler [Radeon HD 6320] (paused)
03:12:33:Enabled folding slot 01: PAUSED cpu:2 (paused)
03:12:41:WU00:FS01:Starting
03:12:41:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 5092 -checkpoint 15
03:12:41:WU00:FS01:Started FahCore on PID 6032
03:12:41:WU00:FS01:Core PID:5984
03:12:41:WU00:FS01:FahCore 0xa4 started
03:12:42:WU01:FS00:Starting
03:12:42:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 5092 -checkpoint 15 -gpu 0 -gpu-vendor ati
03:12:42:WU01:FS00:Started FahCore on PID 2856
03:12:42:WU01:FS00:Core PID:5936
03:12:42:WU01:FS00:FahCore 0x17 started
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:*------------------------------*
03:12:42:WU00:FS01:0xa4:Folding@Home Gromacs GB Core
03:12:42:WU00:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Preparing to commence simulation
03:12:42:WU00:FS01:0xa4:- Looking at optimizations...
03:12:42:WU00:FS01:0xa4:- Files status OK
03:12:42:WU00:FS01:0xa4:- Expanded 376417 -> 690248 (decompressed 183.3 percent)
03:12:42:WU00:FS01:0xa4:Called DecompressByteArray: compressed_data_size=376417 data_size=690248, decompressed_data_size=690248 diff=0
03:12:42:WU00:FS01:0xa4:- Digital signature verified
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Project: 10450 (Run 162, Clone 0, Gen 175)
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Assembly optimizations on if available.
03:12:42:WU00:FS01:0xa4:Entering M.D.
03:12:42:WU01:FS00:0x17:*********************** Log Started 2013-11-21T03:12:42Z ***********************
03:12:42:WU01:FS00:0x17:Project: 8900 (Run 67, Clone 3, Gen 29)
03:12:42:WU01:FS00:0x17:Unit: 0x00000037028c126651a6355e8250728b
03:12:42:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
03:12:42:WU01:FS00:0x17:Machine: 0
03:12:42:WU01:FS00:0x17:Reading tar file state.xml
03:12:45:WU01:FS00:0x17:Reading tar file system.xml
03:12:46:WU01:FS00:0x17:Reading tar file integrator.xml
03:12:46:WU01:FS00:0x17:Reading tar file core.xml
03:12:46:WU01:FS00:0x17:Digital signatures verified
03:12:46:WU01:FS00:0x17:Folding@home GPU core17
03:12:46:WU01:FS00:0x17:Version 0.0.52
03:12:48:WU00:FS01:0xa4:Using Gromacs checkpoints
03:12:48:WU00:FS01:0xa4:Mapping NT from 1 to 1 
03:12:49:WU00:FS01:0xa4:Resuming from checkpoint
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.log
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.trr
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.xtc
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.edr
03:12:49:WU00:FS01:0xa4:Completed 4260 out of 2000000 steps  (0%)
Any more stuff I can add?

Is it a driver problem or is it a Core_17 problem? Should this GPU type be blacklisted because of this?

Thanks in advance. (Oh, and I understand if this thread isn't on your top priorities. This hardware is hardly worth the fuss, but who knows... you might find bugs to squish on other hardware thanks to this info) Cheers!

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Thu Nov 21, 2013 3:52 am
by P5-133XL
I'd like to see more of the log. Both of the above logs contain only about one minute of folding. What we need to see is progress, or the lack of it, and that takes time (hours or days).

Are you OC'ing the GPU in any way? Video driver failures are common when OC'ing. Folding stresses your video card far more than other activities, so any problems with it will likely show up. Folding also requires accuracy in its calculations, while applications like games are far less persnickety: a miscalculated pixel won't cause a problem for a game, but a miscalculation in folding is a disaster for the simulation it is running. Also, when your driver fails, the GPU will typically down-clock significantly, making every GPU activity sluggish. The only way to restore the clock rate is to reboot. Another common failure mode after a driver failure is for FAHControl/Web Control to indicate GPU progress while the log indicates none.

Most of the observations you describe are normal. Core_17 can take several minutes (pre-loading the WU) before starting to run. Lag is common when GPU folding. Your CPU is at 50% because the CPU WU you are folding is a single-core WU on a dual-core CPU. It is common for a new install to get a single-core WU when it first runs because the default is medium mode, which uses one less CPU core than full (i.e. 2-1 = 1, or 50%). After the WU completes, if you stay in full mode, successive WUs will use all the CPU cores.

The description for your GPU is merely cosmetic and doesn't affect folding in any way.

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Thu Nov 21, 2013 3:56 am
by PantherX
Welcome to the F@H Forum heffeque,

Can you please tell us what temperature the GPU is operating at while folding? You can use GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) to get an idea of the idle and load temperatures. If it is too hot, I would suggest that you set your CPU Slot to Finish and, once it is done, see if the temperature drops to an acceptable level.

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Thu Nov 21, 2013 10:13 am
by heffeque
Hi, thanks for your kind welcome and quick replies!

No OC'ing is going on. The whole thing consumes 25 W at most (CPU+GPU is 18 W), so there's no down-clocking going on either.

CPU temperature: 52 °C.
GPU temperature: 68 °C.
System fan: 2368 RPM.

Here are some more logs:

*********************** Log Started 2013-11-21T03:12:33Z ***********************
03:12:33:************************* Folding@home Client *************************
03:12:33:      Website: http://folding.stanford.edu/
03:12:33:    Copyright: (c) 2009-2013 Stanford University
03:12:33:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:12:33:         Args: --open-web-control
03:12:33:       Config: C:/ProgramData/FAHClient/config.xml
03:12:33:******************************** Build ********************************
03:12:33:      Version: 7.3.6
03:12:33:         Date: Feb 18 2013
03:12:33:         Time: 15:25:17
03:12:33:      SVN Rev: 3923
03:12:33:       Branch: fah/trunk/client
03:12:33:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
03:12:33:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
03:12:33:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
03:12:33:     Platform: win32 XP
03:12:33:         Bits: 32
03:12:33:         Mode: Release
03:12:33:******************************* System ********************************
03:12:33:          CPU: AMD E-450 APU with Radeon(tm) HD Graphics
03:12:33:       CPU ID: AuthenticAMD Family 20 Model 2 Stepping 0
03:12:33:         CPUs: 2
03:12:33:       Memory: 6.98GiB
03:12:33:  Free Memory: 4.79GiB
03:12:33:      Threads: WINDOWS_THREADS
03:12:33:  Has Battery: false
03:12:33:   On Battery: false
03:12:33:   UTC offset: 1
03:12:33:          PID: 5092
03:12:33:          CWD: C:/ProgramData/FAHClient
03:12:33:           OS: Windows 7 Ultimate
03:12:33:      OS Arch: AMD64
03:12:33:         GPUs: 1
03:12:33:        GPU 0: ATI:4 Wrestler [Radeon HD 6320]
03:12:33:         CUDA: Not detected
03:12:33:Win32 Service: false
03:12:33:***********************************************************************
03:12:33:<config>
03:12:33:  <!-- Folding Slot Configuration -->
03:12:33:  <power v='off'/>
03:12:33:
03:12:33:  <!-- User Information -->
03:12:33:  <team v='16794'/>
03:12:33:  <user v='heffeque'/>
03:12:33:
03:12:33:  <!-- Folding Slots -->
03:12:33:  <slot id='0' type='GPU'/>
03:12:33:  <slot id='1' type='CPU'/>
03:12:33:</config>
03:12:33:Trying to access database...
03:12:33:Successfully acquired database lock
03:12:33:Enabled folding slot 00: PAUSED gpu:0:Wrestler [Radeon HD 6320] (paused)
03:12:33:Enabled folding slot 01: PAUSED cpu:2 (paused)
03:12:41:WU00:FS01:Starting
03:12:41:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 5092 -checkpoint 15
03:12:41:WU00:FS01:Started FahCore on PID 6032
03:12:41:WU00:FS01:Core PID:5984
03:12:41:WU00:FS01:FahCore 0xa4 started
03:12:42:WU01:FS00:Starting
03:12:42:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 5092 -checkpoint 15 -gpu 0 -gpu-vendor ati
03:12:42:WU01:FS00:Started FahCore on PID 2856
03:12:42:WU01:FS00:Core PID:5936
03:12:42:WU01:FS00:FahCore 0x17 started
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:*------------------------------*
03:12:42:WU00:FS01:0xa4:Folding@Home Gromacs GB Core
03:12:42:WU00:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Preparing to commence simulation
03:12:42:WU00:FS01:0xa4:- Looking at optimizations...
03:12:42:WU00:FS01:0xa4:- Files status OK
03:12:42:WU00:FS01:0xa4:- Expanded 376417 -> 690248 (decompressed 183.3 percent)
03:12:42:WU00:FS01:0xa4:Called DecompressByteArray: compressed_data_size=376417 data_size=690248, decompressed_data_size=690248 diff=0
03:12:42:WU00:FS01:0xa4:- Digital signature verified
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Project: 10450 (Run 162, Clone 0, Gen 175)
03:12:42:WU00:FS01:0xa4:
03:12:42:WU00:FS01:0xa4:Assembly optimizations on if available.
03:12:42:WU00:FS01:0xa4:Entering M.D.
03:12:42:WU01:FS00:0x17:*********************** Log Started 2013-11-21T03:12:42Z ***********************
03:12:42:WU01:FS00:0x17:Project: 8900 (Run 67, Clone 3, Gen 29)
03:12:42:WU01:FS00:0x17:Unit: 0x00000037028c126651a6355e8250728b
03:12:42:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
03:12:42:WU01:FS00:0x17:Machine: 0
03:12:42:WU01:FS00:0x17:Reading tar file state.xml
03:12:45:WU01:FS00:0x17:Reading tar file system.xml
03:12:46:WU01:FS00:0x17:Reading tar file integrator.xml
03:12:46:WU01:FS00:0x17:Reading tar file core.xml
03:12:46:WU01:FS00:0x17:Digital signatures verified
03:12:46:WU01:FS00:0x17:Folding@home GPU core17
03:12:46:WU01:FS00:0x17:Version 0.0.52
03:12:48:WU00:FS01:0xa4:Using Gromacs checkpoints
03:12:48:WU00:FS01:0xa4:Mapping NT from 1 to 1 
03:12:49:WU00:FS01:0xa4:Resuming from checkpoint
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.log
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.trr
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.xtc
03:12:49:WU00:FS01:0xa4:Verified 00/wudata_01.edr
03:12:49:WU00:FS01:0xa4:Completed 4260 out of 2000000 steps  (0%)
03:54:38:WU00:FS01:0xa4:Completed 20000 out of 2000000 steps  (1%)
04:45:47:WU00:FS01:0xa4:Completed 40000 out of 2000000 steps  (2%)
05:36:13:WU00:FS01:0xa4:Completed 60000 out of 2000000 steps  (3%)
06:26:32:WU00:FS01:0xa4:Completed 80000 out of 2000000 steps  (4%)
07:17:50:WU00:FS01:0xa4:Completed 100000 out of 2000000 steps  (5%)
08:10:54:WU00:FS01:0xa4:Completed 120000 out of 2000000 steps  (6%)
09:03:42:WU00:FS01:0xa4:Completed 140000 out of 2000000 steps  (7%)
******************************* Date: 2013-11-21 *******************************
09:56:35:WU00:FS01:0xa4:Completed 160000 out of 2000000 steps  (8%)
This basically says that the CPU is doing stuff. No news from the GPU after 6 hours.
Should I set it to "off" and to "full" again to see if the GPU starts working again?
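
One way to check "progress or lack of it" per slot is to scan the log for `Completed ... steps` lines. The following is a minimal Python sketch, assuming the line layout shown in the logs above (`HH:MM:SS:WUnn:FSnn:0xNN:Completed X out of Y steps`); the function names are made up for illustration, and whether Core_17 reports progress in the same wording is not confirmed by this thread.

```python
import re
from datetime import datetime

# Matches FahCore progress lines such as:
# 03:54:38:WU00:FS01:0xa4:Completed 20000 out of 2000000 steps  (1%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:(FS\d+):0x\w+:Completed (\d+) out of (\d+) steps"
)

def progress_by_slot(lines):
    """Group (time, steps_done, steps_total) samples by folding slot."""
    samples = {}
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            when = datetime.strptime(m.group(1), "%H:%M:%S")
            samples.setdefault(m.group(2), []).append(
                (when, int(m.group(3)), int(m.group(4)))
            )
    return samples

def seconds_per_percent(slot_samples):
    """Average wall-clock seconds per 1% between the first and last sample.

    Needs at least two samples with different step counts.
    """
    (t0, done0, total), (t1, done1, _) = slot_samples[0], slot_samples[-1]
    elapsed = (t1 - t0).total_seconds() % 86400  # tolerate one midnight rollover
    return elapsed / ((done1 - done0) / total * 100)

# A few lines copied from the log above.
LOG = """\
03:12:46:WU01:FS00:0x17:Version 0.0.52
03:54:38:WU00:FS01:0xa4:Completed 20000 out of 2000000 steps  (1%)
04:45:47:WU00:FS01:0xa4:Completed 40000 out of 2000000 steps  (2%)
05:36:13:WU00:FS01:0xa4:Completed 60000 out of 2000000 steps  (3%)
"""

by_slot = progress_by_slot(LOG.splitlines())
for slot in ("FS00", "FS01"):
    if slot in by_slot:
        print(slot, f"{seconds_per_percent(by_slot[slot]) / 60:.1f} min per 1%")
    else:
        print(slot, "no progress lines")
```

On these lines the CPU slot (FS01) works out to roughly 51 minutes per 1%, while the GPU slot (FS00) has no progress lines at all, which matches what the log shows.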

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Thu Nov 21, 2013 12:28 pm
by PantherX
During the 6 hours, is there any GPU Usage or is the GPU idle?

Also, from your log, it seems that you are not using a passkey (http://folding.stanford.edu/home/faq/faq-passkey/). I would recommend that you get one so that you can take advantage of the bonus points program once you qualify for it.

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Fri Nov 22, 2013 12:37 am
by heffeque
Not sure what the bonus points are for, but I did the passkey thing anyway :-)

As for GPU Usage... I can leave the GPU-Z log going on tonight if it's helpful.

I'll reboot my system, I'll start GPU-Z and then I'll unpause FAH.

Generally unpausing FAH will make the driver fail about 5 to 10 minutes after that.
Maybe the Memory Usage will help clarify the problem.

Tomorrow I'll post GPU-Z's log and FAH's log.

Cheers!

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Fri Nov 22, 2013 1:22 am
by PantherX
In short, bonus points allow a donor to gain additional points for returning a WU before the Timeout. The quicker the WU is successfully returned, the higher the bonus points are. The reason is that WUs returned quicker than others are more valuable to F@H. We prefer quick returns because they let the serial steps of a project progress faster and thus yield more scientific data for analysis.
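
To make "the quicker the return, the higher the bonus" concrete, here is a rough Python sketch. It assumes the quick-return bonus formula as it is commonly quoted from the F@H points FAQ, `final = base * max(1, sqrt(k * deadline / elapsed))`; the exact form, the per-project `k` value, and the way the Timeout gates the bonus are assumptions here, and every number below is hypothetical.

```python
import math

def estimated_points(base_points, k, deadline_days, timeout_days, elapsed_days):
    """Estimate WU credit under the assumed quick-return bonus formula."""
    if elapsed_days > timeout_days:
        # Assumption: a WU returned after the Timeout earns base points only.
        return float(base_points)
    factor = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * factor

# Hypothetical numbers, not taken from any real project.
fast = estimated_points(100, k=0.75, deadline_days=12, timeout_days=6, elapsed_days=1)
slow = estimated_points(100, k=0.75, deadline_days=12, timeout_days=6, elapsed_days=8)
print(fast, slow)
```

With these made-up inputs the one-day return earns three times the base points, while the eight-day return, which misses the Timeout, earns only the base points.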

Driver failure isn't a good start. Let's hope it can be resolved once more data is made available.

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sat Nov 23, 2013 1:54 pm
by heffeque
Folding log:

*********************** Log Started 2013-11-22T12:19:32Z ***********************
12:19:32:************************* Folding@home Client *************************
12:19:32:      Website: http://folding.stanford.edu/
12:19:32:    Copyright: (c) 2009-2013 Stanford University
12:19:32:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
12:19:32:         Args: 
12:19:32:       Config: C:/ProgramData/FAHClient/config.xml
12:19:32:******************************** Build ********************************
12:19:32:      Version: 7.3.6
12:19:32:         Date: Feb 18 2013
12:19:32:         Time: 15:25:17
12:19:32:      SVN Rev: 3923
12:19:32:       Branch: fah/trunk/client
12:19:32:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
12:19:32:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
12:19:32:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
12:19:32:     Platform: win32 XP
12:19:32:         Bits: 32
12:19:32:         Mode: Release
12:19:32:******************************* System ********************************
12:19:32:          CPU: AMD E-450 APU with Radeon(tm) HD Graphics
12:19:32:       CPU ID: AuthenticAMD Family 20 Model 2 Stepping 0
12:19:32:         CPUs: 2
12:19:32:       Memory: 6.98GiB
12:19:32:  Free Memory: 5.92GiB
12:19:32:      Threads: WINDOWS_THREADS
12:19:32:  Has Battery: false
12:19:32:   On Battery: false
12:19:32:   UTC offset: 1
12:19:32:          PID: 2364
12:19:32:          CWD: C:/ProgramData/FAHClient
12:19:32:           OS: Windows 7 Ultimate
12:19:32:      OS Arch: AMD64
12:19:32:         GPUs: 1
12:19:32:        GPU 0: ATI:4 Wrestler [Radeon HD 6320]
12:19:32:         CUDA: Not detected
12:19:32:Win32 Service: false
12:19:32:***********************************************************************
12:19:32:<config>
12:19:32:  <!-- Folding Slot Configuration -->
12:19:32:  <power v='off'/>
12:19:32:
12:19:32:  <!-- User Information -->
12:19:32:  <passkey v='********************************'/>
12:19:32:  <team v='16794'/>
12:19:32:  <user v='heffeque'/>
12:19:32:
12:19:32:  <!-- Folding Slots -->
12:19:32:  <slot id='0' type='GPU'/>
12:19:32:  <slot id='1' type='CPU'/>
12:19:32:</config>
12:19:32:Trying to access database...
12:19:38:Successfully acquired database lock
12:19:38:Enabled folding slot 00: PAUSED gpu:0:Wrestler [Radeon HD 6320] (paused)
12:19:38:Enabled folding slot 01: PAUSED cpu:2 (paused)
12:43:53:WU00:FS01:Starting
12:43:53:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 2364 -checkpoint 15
12:43:54:WU00:FS01:Started FahCore on PID 2920
12:43:54:WU00:FS01:Core PID:4616
12:43:54:WU00:FS01:FahCore 0xa4 started
12:43:54:WU01:FS00:Starting
12:43:54:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 2364 -checkpoint 15 -gpu 0 -gpu-vendor ati
12:43:54:WU01:FS00:Started FahCore on PID 4380
12:43:54:WU01:FS00:Core PID:3160
12:43:54:WU01:FS00:FahCore 0x17 started
12:43:54:WU00:FS01:0xa4:
12:43:54:WU00:FS01:0xa4:*------------------------------*
12:43:54:WU00:FS01:0xa4:Folding@Home Gromacs GB Core
12:43:54:WU00:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
12:43:54:WU00:FS01:0xa4:
12:43:54:WU00:FS01:0xa4:Preparing to commence simulation
12:43:54:WU00:FS01:0xa4:- Looking at optimizations...
12:43:54:WU00:FS01:0xa4:- Files status OK
12:43:54:WU00:FS01:0xa4:- Expanded 376417 -> 690248 (decompressed 183.3 percent)
12:43:54:WU00:FS01:0xa4:Called DecompressByteArray: compressed_data_size=376417 data_size=690248, decompressed_data_size=690248 diff=0
12:43:54:WU00:FS01:0xa4:- Digital signature verified
12:43:54:WU00:FS01:0xa4:
12:43:54:WU00:FS01:0xa4:Project: 10450 (Run 162, Clone 0, Gen 175)
12:43:54:WU00:FS01:0xa4:
12:43:54:WU00:FS01:0xa4:Assembly optimizations on if available.
12:43:54:WU00:FS01:0xa4:Entering M.D.
12:43:57:WU01:FS00:0x17:*********************** Log Started 2013-11-22T12:43:57Z ***********************
12:43:57:WU01:FS00:0x17:Project: 8900 (Run 67, Clone 3, Gen 29)
12:43:57:WU01:FS00:0x17:Unit: 0x00000037028c126651a6355e8250728b
12:43:57:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
12:43:57:WU01:FS00:0x17:Machine: 0
12:43:57:WU01:FS00:0x17:Reading tar file state.xml
12:43:59:WU01:FS00:0x17:Reading tar file system.xml
12:44:00:WU00:FS01:0xa4:Using Gromacs checkpoints
12:44:00:WU00:FS01:0xa4:Mapping NT from 1 to 1 
12:44:00:WU00:FS01:0xa4:Resuming from checkpoint
12:44:00:WU00:FS01:0xa4:Verified 00/wudata_01.log
12:44:01:WU01:FS00:0x17:Reading tar file integrator.xml
12:44:01:WU01:FS00:0x17:Reading tar file core.xml
12:44:01:WU01:FS00:0x17:Digital signatures verified
12:44:01:WU01:FS00:0x17:Folding@home GPU core17
12:44:01:WU01:FS00:0x17:Version 0.0.52
12:44:01:WU00:FS01:0xa4:Verified 00/wudata_01.trr
12:44:01:WU00:FS01:0xa4:Verified 00/wudata_01.xtc
12:44:01:WU00:FS01:0xa4:Verified 00/wudata_01.edr
12:44:01:WU00:FS01:0xa4:Completed 406460 out of 2000000 steps  (20%)
13:18:28:WU00:FS01:0xa4:Completed 420000 out of 2000000 steps  (21%)
14:09:09:WU00:FS01:0xa4:Completed 440000 out of 2000000 steps  (22%)
15:00:11:WU00:FS01:0xa4:Completed 460000 out of 2000000 steps  (23%)
15:50:57:WU00:FS01:0xa4:Completed 480000 out of 2000000 steps  (24%)
16:41:56:WU00:FS01:0xa4:Completed 500000 out of 2000000 steps  (25%)
17:32:37:WU00:FS01:0xa4:Completed 520000 out of 2000000 steps  (26%)
******************************* Date: 2013-11-22 *******************************
18:23:17:WU00:FS01:0xa4:Completed 540000 out of 2000000 steps  (27%)
19:13:55:WU00:FS01:0xa4:Completed 560000 out of 2000000 steps  (28%)
20:04:43:WU00:FS01:0xa4:Completed 580000 out of 2000000 steps  (29%)
20:55:18:WU00:FS01:0xa4:Completed 600000 out of 2000000 steps  (30%)
21:21:40:FS00:Shutting core down
21:21:40:FS01:Shutting core down
21:21:40:WU01:FS00:0x17:WARNING:Console control signal 1 on PID 3160
21:21:40:WU01:FS00:0x17:Exiting, please wait. . .
21:21:41:WU00:FS01:0xa4:Client no longer detected. Shutting down core 
21:21:41:WU00:FS01:0xa4:
21:21:41:WU00:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
21:21:41:WU00:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
GPU-Z log:
https://dl.dropboxusercontent.com/u/387 ... r%20Log.7z

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sat Nov 23, 2013 4:07 pm
by 7im
Zacate is the CPU core, not the GPU core, so the GPU was correctly detected as an HD 6320, which has 80 shaders. While I would like to solve this driver crashing problem, it may be that the device is simply underpowered, and that could be the cause of the driver reset. And as you said, it may not be powerful enough to finish the WU before the deadlines.

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sat Nov 23, 2013 6:29 pm
by heffeque
7im wrote: Zacate is the CPU core, not the GPU core, so the GPU was correctly detected as an HD 6320, which has 80 shaders. While I would like to solve this driver crashing problem, it may be that the device is simply underpowered, and that could be the cause of the driver reset. And as you said, it may not be powerful enough to finish the WU before the deadlines.
Is there any way I can disable the GPU so that only the CPU does the job and doesn't make my GPU driver fail?
As for being underpowered... maybe, but it does other OpenCL stuff without problems.

I guess it's too much of a fuss to make this thing work with the GPU too :-/

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sat Nov 23, 2013 9:40 pm
by 7im
Remove the GPU slot under Configure, slots tab.
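
For reference, the slot list in the logged `<config>` block above would be left with only the CPU slot after this change. The Python sketch below only illustrates that end state on an in-memory copy of the logged config; the Configure dialog is the route recommended here, and treating `config.xml` as something to edit by script is an assumption, not advice from this thread.

```python
import xml.etree.ElementTree as ET

# Copied from the <config> block in the log above (comments omitted).
CONFIG = """<config>
  <power v='off'/>
  <team v='16794'/>
  <user v='heffeque'/>
  <slot id='0' type='GPU'/>
  <slot id='1' type='CPU'/>
</config>"""

def without_gpu_slots(config_xml):
    """Return the parsed config with every GPU slot element removed."""
    root = ET.fromstring(config_xml)
    # findall() returns a list, so removing while looping over it is safe.
    for slot in root.findall("slot"):
        if slot.get("type", "").upper() == "GPU":
            root.remove(slot)
    return root

root = without_gpu_slots(CONFIG)
remaining = [(s.get("id"), s.get("type")) for s in root.findall("slot")]
print(remaining)
```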

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sat Nov 23, 2013 10:31 pm
by heffeque
Thanks for the info!

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sun Nov 24, 2013 3:33 am
by PantherX
From the GPU-Z log file, the only period in which the GPU frequency dropped from 3300 MHz to 507.7 MHz lasted about a minute, starting at 22-11-13 1:23. Since the F@H log is in UTC, I'm not sure of the time difference (accounting for DST), so I can't match up the GPU-Z log with the F@H log.
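
Finding such clock drops by eye in a long sensor log is tedious, so here is a small Python sketch of the idea. It assumes the GPU-Z log is comma-separated with a padded header row and that the columns are named `Date` and `GPU Core Clock [MHz]`; both names, the threshold, and all sample values are made up for illustration and are not taken from the attached log.

```python
import csv
import io

# Made-up sample in the assumed GPU-Z layout (padded, comma-separated).
SAMPLE = """\
        Date        , GPU Core Clock [MHz]
2013-11-22 13:40:00 , 600.0
2013-11-22 13:50:00 , 200.0
2013-11-22 13:50:30 , 200.0
2013-11-22 13:51:00 , 600.0
"""

def read_rows(text):
    """Parse the log into dicts, stripping the padding around names and values."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    return [dict(zip(header, (v.strip() for v in row))) for row in reader if row]

def low_clock_runs(rows, column, threshold_mhz):
    """Return (first_stamp, last_stamp) for each run of samples below the threshold."""
    runs, start, prev = [], None, None
    for row in rows:
        stamp = row["Date"]
        low = float(row[column]) < threshold_mhz
        if low and start is None:
            start = stamp
        elif not low and start is not None:
            runs.append((start, prev))
            start = None
        prev = stamp
    if start is not None:  # log ended while the clock was still low
        runs.append((start, prev))
    return runs

runs = low_clock_runs(read_rows(SAMPLE), "GPU Core Clock [MHz]", 300.0)
print(runs)
```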

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Sun Nov 24, 2013 9:27 pm
by heffeque
PantherX wrote: From the GPU-Z log file, the only period in which the GPU frequency dropped from 3300 MHz to 507.7 MHz lasted about a minute, starting at 22-11-13 1:23. Since the F@H log is in UTC, I'm not sure of the time difference (accounting for DST), so I can't match up the GPU-Z log with the F@H log.
FAH's 12:43:53 coincides with GPU-Z's 13:43:53 (I live in Barcelona right now).

Hope that helps!
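
The same conversion can be done mechanically from the `UTC offset: 1` line in the F@H log preamble. A minimal Python sketch, assuming that value is a whole number of hours to add to the UTC log time (the function name is made up for illustration):

```python
from datetime import datetime, timedelta

def fah_to_local(log_date, log_time, utc_offset_hours):
    """Convert a F@H log timestamp (UTC) to local wall-clock time."""
    utc = datetime.strptime(f"{log_date} {log_time}", "%Y-%m-%d %H:%M:%S")
    return utc + timedelta(hours=utc_offset_hours)

# Date from the "Log Started" line, time from the log entry, offset from the preamble.
local = fah_to_local("2013-11-22", "12:43:53", 1)
print(local)  # 2013-11-22 13:43:53
```

This agrees with the 12:43:53 to 13:43:53 match given above.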

Re: FAH makes GPU driver fail (official 13.9 and beta 13.11v

Posted: Mon Nov 25, 2013 5:37 am
by bruce
PantherX wrote: From the GPU-Z log file, the only period in which the GPU frequency dropped from 3300 MHz to 507.7 MHz lasted about a minute, starting at 22-11-13 1:23. Since the F@H log is in UTC, I'm not sure of the time difference (accounting for DST), so I can't match up the GPU-Z log with the F@H log.
You probably never noticed, but you'll find all kinds of interesting information in the preamble to the log:
heffeque wrote: 12:19:32: On Battery: false
12:19:32: UTC offset: 1
12:19:32: PID: 2364
12:19:32: CWD: C:/ProgramData/FAHClient