I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 8:17 pm
by Breach
Hello,

I'm new to FAH, so please bear with me. ;-) I installed FAH on both my PC (i7-3770K/GTX 295) and my MacBook Air (2011). CPU folding works great on both the PC and the Mac. However, I have an issue with GPU folding [Edit: on my PC]. Both my GPUs got work assigned, and there are no errors in the logs. The problem is that if I move the slider to Full OR wait for the FAH screensaver to kick in on Medium, nothing seems to happen (besides the computer becoming unresponsive, which I guess is OK given that GPUs can't prioritise). However:

- Nothing seems to happen: the GPU work units stay at 0.00%...?
- When I switch back to Medium or exit the screensaver, my computer remains unresponsive. It's as if the GPUs are still processing even though the client states they're shut down and waiting for idle. I have to kill FAH altogether to get control of my PC back. Is this normal? I tried reinstalling the client but nothing changed... I'm using the latest nVidia drivers, 314.07, I believe.

Many thanks!

Updated with specs:
Computer Specs: CPU - i7-3770K (4 cores + HT), 8GB DDR3; GPU - nVidia GTX 295, drivers 317.04
Network Connection: Cable - NAT, no proxies
Operating System: Windows 8, all HFs and updates
Overclocked?: Was... given my troubles it's at stock
Stable?: Yes, memtest86, furmark, memtestcl, prime95...
Software: FAH 7.3.6, CPU a4, GPU FAH 11
WU details: Project 5767 (Run 4, Clone 206, Gen 3735), Project 5770 (Run 2, Clone 158, Gen 5195)

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 9:00 pm
by PantherX
Welcome to the F@H Forum, Breach,

Depending on the WU assigned, it might take some time for the first percent to be completed, so how long did you wait? On my GTX 260, most WUs take less than 1 minute to show the first percentage. Once the screensaver is exited, the screen might lag for a second or so, then everything will work fine while the GPU Slot pauses. I didn't have to terminate F@H from Task Manager.

Can you please post your log, especially the initial section as it will contain information about your system and configuration of F@H. Do note that the Code Tag is available in the Full Editor.

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 9:30 pm
by kiore
Hi and welcome. GPU folding does not yet work on OSX or Linux; what operating system are you using?
I assume OSX on the Mac.

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 9:33 pm
by Breach
PantherX wrote: Welcome to the F@H Forum, Breach,

Depending on the WU assigned, it might take some time for the first percent to be completed, so how long did you wait? On my GTX 260, most WUs take less than 1 minute to show the first percentage. Once the screensaver is exited, the screen might lag for a second or so, then everything will work fine while the GPU Slot pauses. I didn't have to terminate F@H from Task Manager.

Can you please post your log, especially the initial section as it will contain information about your system and configuration of F@H. Do note that the Code Tag is available in the Full Editor.

OK, complete log here:
https://dl.dropbox.com/u/28406605/log2.txt

Basically, when I exit the screensaver I'm looking at a blue (Windows 8) background for a few seconds, and then my desktop lags for about 1 minute before it returns to normal, even though the GPUs are in the 'stopped, waiting for idle' state. Is it normal to have to wait so long after the GPU job is stopped? Thanks!

Edit: And another example - left it on the screensaver a bit longer - that time it took even longer to get back to the desktop (still 0.00%...):
https://dl.dropbox.com/u/28406605/log3.txt

@kiore - sorry, the Mac is unrelated - this is about a problem I'm having on my PC.

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 10:11 pm
by PantherX
IIRC, the 1 minute lag should happen only once with a fresh installation of V7. Does this happen every time or not?

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 10:22 pm
by Breach
Well, this is a fresh installation (I reinstalled Windows today too), but it happens every time... I can't seem to get GPU crunching going at all. I tried rebooting and reinstalling the FAH client, but nothing seems to help.

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 10:57 pm
by PantherX
Okay, how about this:

Since you have a fresh installation of F@H, get GPU-Z and start it.
Set the power level to Full.
Does GPU-Z show 90% to 100% on one or both of the GPUs?

You can filter the logs according to the Slots and you can post it here by using the Code Tags.
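
If you'd rather filter a saved log.txt outside FAHControl, here is a minimal sketch, assuming the line layout seen in the logs in this thread (HH:MM:SS, an optional level such as WARNING, an optional WUxx field, then FSxx for the slot). The function name filter_slot is made up for illustration; it is not part of the client.

```python
import re

# Client log lines look like "HH:MM:SS:WU01:FS00:..."; the FSxx field names the slot.
# An upper-case level prefix (e.g. WARNING) and the WUxx field are both optional.
SLOT_RE = re.compile(r"^\d{2}:\d{2}:\d{2}:(?:[A-Z]+:)?(?:WU\d+:)?FS(\d+):")

def filter_slot(lines, slot):
    """Return only the log lines that belong to the given folding slot number."""
    kept = []
    for line in lines:
        m = SLOT_RE.match(line)
        if m and int(m.group(1)) == slot:
            kept.append(line)
    return kept

sample = [
    "23:19:52:WU01:FS00:Starting",
    "23:19:52:WU02:FS01:Starting",
    "23:21:12:FS00:Shutting core down",
    "14:29:10:WARNING:WU01:FS00:Detected clock skew (1 mins 36 secs), adjusting time estimates",
]

# Print everything that slot 0 logged, dropping the slot 1 line.
for line in filter_slot(sample, 0):
    print(line)
```

Lines without an FSxx field (the client banner, the config dump) are dropped, which is usually what you want when posting a per-slot excerpt.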

Re: I'm new and I have some GPU issue with FAH

Posted: Sat Mar 09, 2013 11:27 pm
by Breach
PantherX wrote: Okay, how about this:

Since you have a fresh installation of F@H, get GPU-Z and start it.
Set the power level to Full.
Does GPU-Z show 90% to 100% on one or both of the GPUs?

You can filter the logs according to the Slots and you can post it here by using the Code Tags.

Hi,

Thanks, here are the logs:

Slot 0:

*********************** Log Started 2013-03-09T23:19:51Z ***********************
23:19:52:WU01:FS00:Starting
23:19:52:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 01 -suffix 01 -version 703 -lifeline 5008 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
23:19:52:WU01:FS00:Started FahCore on PID 2404
23:19:52:WU01:FS00:Core PID:5600
23:19:52:WU01:FS00:FahCore 0x11 started
23:19:52:WU01:FS00:0x11:
23:19:52:WU01:FS00:0x11:*------------------------------*
23:19:52:WU01:FS00:0x11:Folding@Home GPU Core
23:19:52:WU01:FS00:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
23:19:52:WU01:FS00:0x11:
23:19:52:WU01:FS00:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
23:19:52:WU01:FS00:0x11:Build host: amoeba
23:19:52:WU01:FS00:0x11:Board Type: Nvidia
23:19:52:WU01:FS00:0x11:Core      : 
23:19:52:WU01:FS00:0x11:Preparing to commence simulation
23:19:52:WU01:FS00:0x11:- Looking at optimizations...
23:19:52:WU01:FS00:0x11:- Files status OK
23:19:52:WU01:FS00:0x11:- Expanded 46800 -> 252912 (decompressed 540.4 percent)
23:19:52:WU01:FS00:0x11:Called DecompressByteArray: compressed_data_size=46800 data_size=252912, decompressed_data_size=252912 diff=0
23:19:52:WU01:FS00:0x11:- Digital signature verified
23:19:52:WU01:FS00:0x11:
23:19:52:WU01:FS00:0x11:Project: 5767 (Run 12, Clone 88, Gen 2746)
23:19:52:WU01:FS00:0x11:
23:19:52:WU01:FS00:0x11:Assembly optimizations on if available.
23:19:52:WU01:FS00:0x11:Entering M.D.
23:19:58:WU01:FS00:0x11:Will resume from checkpoint file
23:19:58:WU01:FS00:0x11:Tpr hash 01/wudata_01.tpr:  476679852 5082355 852223987 3859769145 67790066
23:19:58:WU01:FS00:0x11:
23:19:58:WU01:FS00:0x11:Calling fah_main args: 14 usage=100
23:19:58:WU01:FS00:0x11:
23:19:58:WU01:FS00:0x11:Working on Protein
23:20:37:WU01:FS00:0x11:Client config unavailable.
23:20:37:WU01:FS00:0x11:Starting GUI Server
23:20:39:WU01:FS00:0x11:Resuming from checkpoint
23:20:39:WU01:FS00:0x11:fcCheckPointResume: retreived and current tpr file hash:
23:20:39:WU01:FS00:0x11:   0    476679852    476679852
23:20:39:WU01:FS00:0x11:   1      5082355      5082355
23:20:39:WU01:FS00:0x11:   2    852223987    852223987
23:20:39:WU01:FS00:0x11:   3   3859769145   3859769145
23:20:39:WU01:FS00:0x11:   4     67790066     67790066
23:20:39:WU01:FS00:0x11:fcCheckPointResume: file hashes same.
23:20:39:WU01:FS00:0x11:fcCheckPointResume: state restored.
23:20:39:WU01:FS00:0x11:Verified 01/wudata_01.log
23:20:39:WU01:FS00:0x11:Verified 01/wudata_01.edr
23:20:39:WU01:FS00:0x11:Verified 01/wudata_01.xtc
23:21:12:FS00:Shutting core down
23:21:22:WU01:FS00:0x11:Client no longer detected. Shutting down core 
23:21:22:WU01:FS00:0x11:
23:21:22:WU01:FS00:0x11:Folding@home Core Shutdown: CLIENT_DIED
23:21:22:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)

Slot 1:

*********************** Log Started 2013-03-09T23:19:51Z ***********************
23:19:52:WU02:FS01:Starting
23:19:52:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 02 -suffix 01 -version 703 -lifeline 5008 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
23:19:52:WU02:FS01:Started FahCore on PID 5624
23:19:52:WU02:FS01:Core PID:6804
23:19:52:WU02:FS01:FahCore 0x11 started
23:19:52:WU02:FS01:0x11:
23:19:52:WU02:FS01:0x11:*------------------------------*
23:19:52:WU02:FS01:0x11:Folding@Home GPU Core
23:19:52:WU02:FS01:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
23:19:52:WU02:FS01:0x11:
23:19:52:WU02:FS01:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
23:19:52:WU02:FS01:0x11:Build host: amoeba
23:19:52:WU02:FS01:0x11:Board Type: Nvidia
23:19:52:WU02:FS01:0x11:Core      : 
23:19:52:WU02:FS01:0x11:Preparing to commence simulation
23:19:52:WU02:FS01:0x11:- Looking at optimizations...
23:19:52:WU02:FS01:0x11:- Files status OK
23:19:52:WU02:FS01:0x11:- Expanded 46762 -> 252912 (decompressed 540.8 percent)
23:19:52:WU02:FS01:0x11:Called DecompressByteArray: compressed_data_size=46762 data_size=252912, decompressed_data_size=252912 diff=0
23:19:52:WU02:FS01:0x11:- Digital signature verified
23:19:52:WU02:FS01:0x11:
23:19:52:WU02:FS01:0x11:Project: 5768 (Run 5, Clone 178, Gen 2958)
23:19:52:WU02:FS01:0x11:
23:19:52:WU02:FS01:0x11:Assembly optimizations on if available.
23:19:52:WU02:FS01:0x11:Entering M.D.
23:19:58:WU02:FS01:0x11:Will resume from checkpoint file
23:19:58:WU02:FS01:0x11:Tpr hash 02/wudata_01.tpr:  2649353037 2018842240 4081007759 447651459 1725502388
23:19:58:WU02:FS01:0x11:
23:19:58:WU02:FS01:0x11:Calling fah_main args: 14 usage=100
23:19:58:WU02:FS01:0x11:
23:19:58:WU02:FS01:0x11:Working on Protein
23:20:31:WU02:FS01:0x11:Client config unavailable.
23:20:31:WU02:FS01:0x11:Starting GUI Server
23:20:32:WU02:FS01:0x11:Resuming from checkpoint
23:20:32:WU02:FS01:0x11:fcCheckPointResume: retreived and current tpr file hash:
23:20:32:WU02:FS01:0x11:   0   2649353037   2649353037
23:20:32:WU02:FS01:0x11:   1   2018842240   2018842240
23:20:32:WU02:FS01:0x11:   2   4081007759   4081007759
23:20:32:WU02:FS01:0x11:   3    447651459    447651459
23:20:32:WU02:FS01:0x11:   4   1725502388   1725502388
23:20:32:WU02:FS01:0x11:fcCheckPointResume: file hashes same.
23:20:32:WU02:FS01:0x11:fcCheckPointResume: state restored.
23:20:32:WU02:FS01:0x11:Verified 02/wudata_01.log
23:20:32:WU02:FS01:0x11:Verified 02/wudata_01.edr
23:20:32:WU02:FS01:0x11:Verified 02/wudata_01.xtc
23:21:12:FS01:Shutting core down
23:21:22:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)

GPU-Z shows average GPU usage of 30-40% (!) on both GPUs. HOWEVER, I did notice my VRAM utilisation sometimes exceeding 800MB (the GTX 295 has about 870MB). Coupled with the lag, I suspect it might be thrashing to system memory, which would explain the horrid responsiveness... Is that a possibility? Thanks.

[Edit: At desktop my VRAM utilisation is about 360MB per GPU]
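
Neither slot log above contains a "Completed" line before the core is shut down. A minimal sketch for checking that automatically, assuming the two progress formats that appear in the logs in this thread ("Completed 17%" from the GPU core and "Completed N out of M steps  (P%)" from the CPU core); last_progress is a hypothetical helper, not a client feature.

```python
import re

# "HH:MM:SS:[LEVEL:]WUxx:FSyy:message"; lines without a WU field are ignored.
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2}:(?:[A-Z]+:)?WU(\d+):FS(\d+):(.*)$")
# GPU core style ("Completed 17%") or CPU core style ("... steps  (5%)").
PROGRESS_RE = re.compile(r"Completed (\d+)%|Completed \d+ out of \d+ steps\s+\((\d+)%\)")

def last_progress(lines):
    """Map (work unit, slot) to the last reported percent, or None if never reported."""
    progress = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        key = (int(m.group(1)), int(m.group(2)))
        progress.setdefault(key, None)
        p = PROGRESS_RE.search(m.group(3))
        if p:
            progress[key] = int(p.group(1) or p.group(2))
    return progress

sample = [
    "23:20:39:WU01:FS00:0x11:Verified 01/wudata_01.xtc",
    "23:21:22:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
    "14:28:11:WU02:FS01:0x11:Completed 17%",
    "14:27:41:WU00:FS02:0xa4:Completed 29160 out of 500000 steps  (5%)",
]

for (wu, slot), pct in sorted(last_progress(sample).items()):
    status = "no progress reported" if pct is None else f"{pct}%"
    print(f"WU{wu:02d} on FS{slot:02d}: {status}")
```

A work unit that shows "no progress reported" after several minutes of running is the symptom described here; the sketch only flags it and says nothing about the cause.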

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 1:03 am
by Breach
OK, so after spending 3 hours on this I deleted the two projects from the working directory and guess what? I'm back in business - close to 100% GPU usage (VRAM is about 840MB so I guess that wasn't it). Weird, weird, weird. Either there was something wrong with the projects or... I have no idea :-) I'm now working on project 5768 and 5769 and it's folding on Full with no UI lag whatsoever too.
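
For anyone repeating this, a cautious variant is to move the work-unit folders aside instead of deleting them, so they can be inspected later. This is only a sketch under assumptions: the client is fully stopped first, the work units sit in numbered folders such as 01 and 02 (the -dir values in the logs above), and you pass in the actual work directory yourself, since its location is not confirmed in this thread. set_aside_work_units is a made-up helper name.

```python
import shutil
from pathlib import Path

def set_aside_work_units(work_dir, backup_dir, names):
    """Move the named work-unit folders out of work_dir into backup_dir.

    Returns the names that were actually found and moved.
    """
    work_dir, backup_dir = Path(work_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        src = work_dir / name
        if src.is_dir():
            # Moving (not deleting) keeps the checkpoint files for later inspection.
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved

# Hypothetical usage; both paths are placeholders to adapt to your own install:
# set_aside_work_units("path/to/FAHClient/work", "path/to/wu-backup", ["01", "02"])
```

Note that the client also keeps its own record of assigned work, so it may report the missing units on the next start; that behaviour is not covered by this sketch.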

If someone has any insights - I'd be delighted to be enlightened.

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 2:42 am
by PantherX
It could be that you got 2 bad WUs... highly unlikely but plausible. I checked the WU Database for records and nothing showed up, so I have marked it for a follow-up.

If your GPU usage was 30% to 40%, then it could be that the GPU was being "starved" by the CPU. The GPU needs a constant stream of data, which is provided by the CPU. If the CPU is too slow or busy, the GPU can end up under-utilized. Are you running any GPU-intensive applications? I ask because 360 MB per GPU (720 MB of VRAM in total) is a rather significant amount. For comparison, my desktop GTX 260 with 896 MB is using 499.19 MB with only FAHControl and HFM.NET running. Idle usage is around 100 MB, and under normal usage, which includes folding, I get to about 750 MB.

Nonetheless, I am glad that you are folding now without any issues :)

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 9:45 am
by Breach
PantherX wrote: If your GPU was 30% to 40%, then it could be that the GPU was being "starved" by the CPU

Possible, though I don't think so... Other GPU tasks start updating progress within 1 minute of download, and this one was stuck forever at 0.00%... Also, I don't have any other CPU-intensive applications competing with FAH.

PantherX wrote: Are you running any GPU intensive application since 360 MB per GPU (720 MB in total VRAM) is a rather significant amount.

No, nothing really, just Chrome. I did a test though: I rebooted, and VRAM usage after the reboot was about 110MB. When I started FAH it jumped to about 660MB. With Chrome and Firefox open it would sit at about 330MB idle; with FAH on top, that would become about 800MB.

At the moment (things working properly) with Chrome and FAH it's running smooth while the VRAM usage is at 880MB. So although the VRAM utilisation is a bit unexpected I don't think it's the cause itself.

I think for some reason either the WUs or the core put the GPU in some sort of a deadlock situation which wouldn't clear itself. I have no evidence to prove it though - just the set of symptoms ;-) I'll update if I hit it again with another WU. Cheers.

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 2:43 pm
by Breach
OK, my woes seem to continue. Now, I don't know whether it's related or not, but I seem to have the following issue:

Scenario:

1. FAH starts, I put it into Full - it fetches GPU WUs and starts folding - GPUs at 90% - no UI lag
2. I exit FAH, FAH control and reboot the computer
3. FAH automatically starts after reboot in Full mode
4. GPU folding continues, but:
- The UI is really choppy
- I have a HUGE number of increasing Virtual Memory page faults
- GPU utilisation is about 30-40%
- I give it some time to recover, but it doesn't. I quit FAH; in about 2-3 minutes the core process quits and my UI is responsive again

Log:

*********************** Log Started 2013-03-10T14:27:34Z ***********************
14:27:34:************************* Folding@home Client *************************
14:27:34:      Website: http://folding.stanford.edu/
14:27:34:    Copyright: (c) 2009-2013 Stanford University
14:27:34:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:27:34:         Args: 
14:27:34:       Config: C:/ProgramData/FAHClient/config.xml
14:27:34:******************************** Build ********************************
14:27:34:      Version: 7.3.6
14:27:34:         Date: Feb 18 2013
14:27:34:         Time: 15:25:17
14:27:34:      SVN Rev: 3923
14:27:34:       Branch: fah/trunk/client
14:27:34:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
14:27:34:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
14:27:34:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
14:27:34:     Platform: win32 XP
14:27:34:         Bits: 32
14:27:34:         Mode: Release
14:27:34:******************************* System ********************************
14:27:34:          CPU: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
14:27:34:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
14:27:34:         CPUs: 8
14:27:34:       Memory: 7.88GiB
14:27:34:  Free Memory: 6.18GiB
14:27:34:      Threads: WINDOWS_THREADS
14:27:34:  Has Battery: false
14:27:34:   On Battery: false
14:27:34:   UTC offset: 1
14:27:34:          PID: 6500
14:27:34:          CWD: C:/ProgramData/FAHClient
14:27:34:           OS: Windows 8 Pro with Media Center
14:27:34:      OS Arch: AMD64
14:27:34:         GPUs: 2
14:27:34:        GPU 0: NVIDIA:1 GT200 [GeForce GTX 295]
14:27:34:        GPU 1: NVIDIA:1 GT200 [GeForce GTX 295]
14:27:34:         CUDA: 1.3
14:27:34:  CUDA Driver: 5000
14:27:34:Win32 Service: false
14:27:34:***********************************************************************
14:27:34:<config>
14:27:34:  <!-- Folding Slot Configuration -->
14:27:34:  <power v='full'/>
14:27:34:
14:27:34:  <!-- Network -->
14:27:34:  <proxy v=':8080'/>
14:27:34:
14:27:34:  <!-- User Information -->
14:27:34:  <passkey v='********************************'/>
14:27:34:  <user v='Alexander_Ivanchev'/>
14:27:34:
14:27:34:  <!-- Folding Slots -->
14:27:34:  <slot id='0' type='GPU'/>
14:27:34:  <slot id='1' type='GPU'/>
14:27:34:  <slot id='2' type='CPU'/>
14:27:34:</config>
14:27:34:Trying to access database...
14:27:34:Successfully acquired database lock
14:27:34:Enabled folding slot 00: READY gpu:0:GT200 [GeForce GTX 295]
14:27:34:Enabled folding slot 01: READY gpu:1:GT200 [GeForce GTX 295]
14:27:34:Enabled folding slot 02: READY cpu:8
14:27:34:WU02:FS01:Starting
14:27:34:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 02 -suffix 01 -version 703 -lifeline 6500 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
14:27:34:WU02:FS01:Started FahCore on PID 6612
14:27:34:WU02:FS01:Core PID:6624
14:27:34:WU02:FS01:FahCore 0x11 started
14:27:34:WU00:FS02:Starting
14:27:34:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 6500 -checkpoint 15 -np 8
14:27:34:WU00:FS02:Started FahCore on PID 6636
14:27:34:WU00:FS02:Core PID:6648
14:27:34:WU00:FS02:FahCore 0xa4 started
14:27:34:WU01:FS00:Starting
14:27:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 01 -suffix 01 -version 703 -lifeline 6500 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
14:27:34:WU01:FS00:Started FahCore on PID 6672
14:27:34:WU01:FS00:Core PID:6684
14:27:34:WU01:FS00:FahCore 0x11 started
14:27:34:WU02:FS01:0x11:
14:27:34:WU02:FS01:0x11:*------------------------------*
14:27:34:WU02:FS01:0x11:Folding@Home GPU Core
14:27:34:WU02:FS01:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
14:27:34:WU02:FS01:0x11:
14:27:34:WU02:FS01:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
14:27:34:WU02:FS01:0x11:Build host: amoeba
14:27:34:WU02:FS01:0x11:Board Type: Nvidia
14:27:34:WU02:FS01:0x11:Core      : 
14:27:34:WU02:FS01:0x11:Preparing to commence simulation
14:27:34:WU02:FS01:0x11:- Looking at optimizations...
14:27:34:WU02:FS01:0x11:- Files status OK
14:27:34:WU02:FS01:0x11:- Expanded 45444 -> 251112 (decompressed 552.5 percent)
14:27:34:WU02:FS01:0x11:Called DecompressByteArray: compressed_data_size=45444 data_size=251112, decompressed_data_size=251112 diff=0
14:27:34:WU02:FS01:0x11:- Digital signature verified
14:27:34:WU02:FS01:0x11:
14:27:34:WU02:FS01:0x11:Project: 5770 (Run 2, Clone 158, Gen 5195)
14:27:34:WU02:FS01:0x11:
14:27:34:WU02:FS01:0x11:Assembly optimizations on if available.
14:27:34:WU02:FS01:0x11:Entering M.D.
14:27:34:WU00:FS02:0xa4:
14:27:34:WU00:FS02:0xa4:*------------------------------*
14:27:34:WU00:FS02:0xa4:Folding@Home Gromacs GB Core
14:27:34:WU00:FS02:0xa4:Version 2.27 (Dec. 15, 2010)
14:27:34:WU00:FS02:0xa4:
14:27:34:WU00:FS02:0xa4:Preparing to commence simulation
14:27:34:WU00:FS02:0xa4:- Looking at optimizations...
14:27:34:WU00:FS02:0xa4:- Files status OK
14:27:34:WU00:FS02:0xa4:- Expanded 1081655 -> 3050744 (decompressed 282.0 percent)
14:27:34:WU00:FS02:0xa4:Called DecompressByteArray: compressed_data_size=1081655 data_size=3050744, decompressed_data_size=3050744 diff=0
14:27:34:WU00:FS02:0xa4:- Digital signature verified
14:27:34:WU00:FS02:0xa4:
14:27:34:WU00:FS02:0xa4:Project: 8082 (Run 22, Clone 48, Gen 10)
14:27:34:WU00:FS02:0xa4:
14:27:34:WU00:FS02:0xa4:Assembly optimizations on if available.
14:27:34:WU00:FS02:0xa4:Entering M.D.
14:27:34:WU01:FS00:0x11:
14:27:34:WU01:FS00:0x11:*------------------------------*
14:27:34:WU01:FS00:0x11:Folding@Home GPU Core
14:27:34:WU01:FS00:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
14:27:34:WU01:FS00:0x11:
14:27:34:WU01:FS00:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
14:27:34:WU01:FS00:0x11:Build host: amoeba
14:27:34:WU01:FS00:0x11:Board Type: Nvidia
14:27:34:WU01:FS00:0x11:Core      : 
14:27:34:WU01:FS00:0x11:Preparing to commence simulation
14:27:34:WU01:FS00:0x11:- Looking at optimizations...
14:27:34:WU01:FS00:0x11:- Files status OK
14:27:34:WU01:FS00:0x11:- Expanded 46678 -> 252912 (decompressed 541.8 percent)
14:27:34:WU01:FS00:0x11:Called DecompressByteArray: compressed_data_size=46678 data_size=252912, decompressed_data_size=252912 diff=0
14:27:34:WU01:FS00:0x11:- Digital signature verified
14:27:34:WU01:FS00:0x11:
14:27:34:WU01:FS00:0x11:Project: 5767 (Run 4, Clone 206, Gen 3735)
14:27:34:WU01:FS00:0x11:
14:27:34:WU01:FS00:0x11:Assembly optimizations on if available.
14:27:34:WU01:FS00:0x11:Entering M.D.
14:27:40:WU02:FS01:0x11:Will resume from checkpoint file
14:27:40:WU02:FS01:0x11:Tpr hash 02/wudata_01.tpr:  837396161 1558970019 2183451600 3988018081 301792261
14:27:40:WU02:FS01:0x11:
14:27:40:WU02:FS01:0x11:Calling fah_main args: 14 usage=100
14:27:40:WU02:FS01:0x11:
14:27:40:WU01:FS00:0x11:Will resume from checkpoint file
14:27:40:WU01:FS00:0x11:Tpr hash 01/wudata_01.tpr:  2847500468 2107912794 3009904781 3312413012 1078031564
14:27:40:WU01:FS00:0x11:
14:27:40:WU01:FS00:0x11:Calling fah_main args: 14 usage=100
14:27:40:WU01:FS00:0x11:
14:27:40:WU02:FS01:0x11:Working on Protein
14:27:40:WU00:FS02:0xa4:Using Gromacs checkpoints
14:27:40:WU00:FS02:0xa4:Mapping NT from 8 to 8 
14:27:41:WU00:FS02:0xa4:Resuming from checkpoint
14:27:41:WU00:FS02:0xa4:Verified 00/wudata_01.log
14:27:41:WU00:FS02:0xa4:Verified 00/wudata_01.trr
14:27:41:WU00:FS02:0xa4:Verified 00/wudata_01.xtc
14:27:41:WU00:FS02:0xa4:Verified 00/wudata_01.edr
14:27:41:WU00:FS02:0xa4:Completed 29160 out of 500000 steps  (5%)
14:27:41:WU01:FS00:0x11:Working on Protein
14:28:11:WU02:FS01:0x11:Client config unavailable.
14:28:11:WU02:FS01:0x11:Resuming from checkpoint
14:28:11:WU02:FS01:0x11:fcCheckPointResume: retreived and current tpr file hash:
14:28:11:WU02:FS01:0x11:   0    837396161    837396161
14:28:11:WU02:FS01:0x11:   1   1558970019   1558970019
14:28:11:WU02:FS01:0x11:   2   2183451600   2183451600
14:28:11:WU02:FS01:0x11:   3   3988018081   3988018081
14:28:11:WU02:FS01:0x11:   4    301792261    301792261
14:28:11:WU02:FS01:0x11:fcCheckPointResume: file hashes same.
14:28:11:WU02:FS01:0x11:fcCheckPointResume: state restored.
14:28:11:WU02:FS01:0x11:Verified 02/wudata_01.log
14:28:11:WU02:FS01:0x11:Verified 02/wudata_01.edr
14:28:11:WU02:FS01:0x11:Verified 02/wudata_01.xtc
14:28:11:WU02:FS01:0x11:Completed 17%
14:28:11:WU02:FS01:0x11:Starting GUI Server
14:29:09:WU01:FS00:0x11:Client config unavailable.
14:29:10:WARNING:WU01:FS00:Detected clock skew (1 mins 36 secs), adjusting time estimates
14:29:10:WU01:FS00:0x11:Starting GUI Server
14:29:12:WU01:FS00:0x11:Resuming from checkpoint
14:29:12:WU01:FS00:0x11:fcCheckPointResume: retreived and current tpr file hash:
14:29:12:WU01:FS00:0x11:   0   2847500468   2847500468
14:29:12:WU01:FS00:0x11:   1   2107912794   2107912794
14:29:12:WU01:FS00:0x11:   2   3009904781   3009904781
14:29:12:WU01:FS00:0x11:   3   3312413012   3312413012
14:29:12:WU01:FS00:0x11:   4   1078031564   1078031564
14:29:12:WU01:FS00:0x11:fcCheckPointResume: file hashes same.
14:29:12:WU01:FS00:0x11:fcCheckPointResume: state restored.
14:29:12:WU01:FS00:0x11:Verified 01/wudata_01.log
14:29:12:WU01:FS00:0x11:Verified 01/wudata_01.edr
14:29:12:WU01:FS00:0x11:Verified 01/wudata_01.xtc
14:29:12:WU01:FS00:0x11:Completed 19%
14:29:14:WU00:FS02:0xa4:Completed 30000 out of 500000 steps  (6%)
14:32:00:WU00:FS02:0xa4:Completed 35000 out of 500000 steps  (7%)
14:32:14:FS00:Shutting core down
14:32:14:FS01:Shutting core down
14:32:14:FS02:Shutting core down
14:32:24:WU02:FS01:0x11:Client no longer detected. Shutting down core 
14:32:24:WU02:FS01:0x11:
14:32:24:WU02:FS01:0x11:Folding@home Core Shutdown: CLIENT_DIED
14:32:25:WU01:FS00:0x11:Client no longer detected. Shutting down core 
14:32:25:WU00:FS02:0xa4:Client no longer detected. Shutting down core 
14:32:25:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
14:32:25:WU01:FS00:0x11:
14:32:25:WU00:FS02:0xa4:
14:32:25:WU01:FS00:0x11:Folding@home Core Shutdown: CLIENT_DIED
14:32:25:WU00:FS02:0xa4:Folding@home Core Shutdown: CLIENT_DIED
14:32:25:WU00:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
14:32:25:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
14:32:25:WU00:FS02:Starting
14:32:25:WARNING:WU00:FS02:Changed SMP threads from 8 to 7 this can cause some work units to fail
14:32:25:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 703 -lifeline 6500 -checkpoint 15 -np 7
14:32:25:WU00:FS02:Started FahCore on PID 2968
14:32:25:WU00:FS02:Core PID:1620
14:32:25:WU00:FS02:FahCore 0xa4 started
14:32:26:WU00:FS02:0xa4:
14:32:26:WU00:FS02:0xa4:*------------------------------*
14:32:26:WU00:FS02:0xa4:Folding@Home Gromacs GB Core
14:32:26:WU00:FS02:0xa4:Version 2.27 (Dec. 15, 2010)
14:32:26:WU00:FS02:0xa4:
14:32:26:WU00:FS02:0xa4:Preparing to commence simulation
14:32:26:WU00:FS02:0xa4:- Looking at optimizations...
14:32:26:WU00:FS02:0xa4:- Files status OK
14:32:26:WU00:FS02:0xa4:- Expanded 1081655 -> 3050744 (decompressed 282.0 percent)
14:32:26:WU00:FS02:0xa4:Called DecompressByteArray: compressed_data_size=1081655 data_size=3050744, decompressed_data_size=3050744 diff=0
14:32:26:WU00:FS02:0xa4:- Digital signature verified
14:32:26:WU00:FS02:0xa4:
14:32:26:WU00:FS02:0xa4:Project: 8082 (Run 22, Clone 48, Gen 10)
14:32:26:WU00:FS02:0xa4:
14:32:26:WU00:FS02:0xa4:Assembly optimizations on if available.
14:32:26:WU00:FS02:0xa4:Entering M.D.
14:32:32:WU00:FS02:0xa4:Using Gromacs checkpoints
14:32:32:WU00:FS02:0xa4:Mapping NT from 7 to 7 
14:32:32:WU00:FS02:0xa4:Resuming from checkpoint
14:32:32:WU00:FS02:0xa4:Verified 00/wudata_01.log
14:32:32:WU00:FS02:0xa4:Verified 00/wudata_01.trr
14:32:32:WU00:FS02:0xa4:Verified 00/wudata_01.xtc
14:32:32:WU00:FS02:0xa4:Verified 00/wudata_01.edr
14:32:32:WU00:FS02:0xa4:Completed 29160 out of 500000 steps  (5%)
14:32:55:WU00:FS02:0xa4:Completed 30000 out of 500000 steps  (6%)

Virtual memory screenshot when the GPU is folding normally (before the reboot):

[Image]

Virtual memory screenshot when things go to hell (after the reboot - note the delta!):

[Image]

The page faults explain the choppy desktop experience, but I have absolutely no clue why they happen. I'll now try to get a dump, though for some reason Process Explorer always ends up with a 0KB file.

Am I cursed? ;-)

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 2:57 pm
by P5-133XL
How much RAM does your machine have? What you are describing is a Windows feature: if it runs out of physical RAM, it starts page-swapping to the HD, slowing everything down by a couple of orders of magnitude.

You are also noticing this behavior on reboot. Does it settle down after a few minutes? During a boot, Windows will be fetching lots of stuff until it gets everything important into RAM. This process will continue for several minutes after the desktop is up and accepting input.

Neither of these issues is specifically folding-related, but I could see GPU folding lag amplifying the bad user experience while they go on.

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 3:02 pm
by Napoleon
Try setting the CPU slot to use 6 CPUs - go to Advanced Control -> Configure -> Slots -> cpu to do that. The two GPU feeders (FahCore_11.exe processes) need some CPU time as well, and in my experience they cause frequent HW interrupts, which may or may not explain the frequent page faults delta.
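
The Advanced Control route is the one to use. Purely to illustrate what that setting amounts to, here is a minimal sketch that edits a config string, assuming the thread count is stored as a `<cpus v='...'/>` child of the CPU slot element. That element name is an assumption; it does not appear in the config posted in this thread, and the sample below is a trimmed-down stand-in for the real file. set_cpu_slot_threads is a made-up helper.

```python
import xml.etree.ElementTree as ET

def set_cpu_slot_threads(config_xml, cpus):
    """Return config_xml with every CPU slot limited to the given thread count.

    Assumes the option is a <cpus v='N'/> child of the slot element.
    """
    root = ET.fromstring(config_xml)
    for slot in root.iter("slot"):
        if slot.get("type") == "CPU":
            opt = slot.find("cpus")
            if opt is None:
                opt = ET.SubElement(slot, "cpus")
            opt.set("v", str(cpus))
    return ET.tostring(root, encoding="unicode")

# Trimmed-down stand-in for the slot section of the posted config.
sample = """<config>
  <power v='full'/>
  <slot id='0' type='GPU'/>
  <slot id='1' type='GPU'/>
  <slot id='2' type='CPU'/>
</config>"""

# Leave two of the eight logical CPUs free for the two GPU feeder processes.
print(set_cpu_slot_threads(sample, 6))
```

Only the CPU slot is touched; the two GPU slots are left as they are. Stop the client before changing its config by hand, and keep a copy of the original file.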

Unfortunately, choppy UI during GPU folding is more of a rule than an exception. GPUs don't have a priority system like the CPU/OS does. It's "first come, first served" only which tends to make the display unresponsive. You're better off pausing the GPU folding one way or the other when you're using your computer.

According to the log, the setup has plenty of free RAM, so swapping shouldn't be a problem. AFAIK, page fault doesn't necessarily mean swap file thrashing. AFAIK, it also happens when a (user) process needs to get outside of its process boundary limits to do something HW specific.

These parts of the log look weird though. Why isn't it showing an even 8GB for total memory? 64bit Win8 Pro shouldn't have any problems with 8GB (it can support up to 512GB). Some weird chipset limitation, perhaps?

14:27:34:       Memory: 7.88GiB
14:27:34:  Free Memory: 6.18GiB
...
14:27:34:           OS: Windows 8 Pro with Media Center
14:27:34:      OS Arch: AMD64
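
A quick back-of-the-envelope check on the gap, assuming the installed "8GB" is really 8 GiB (which is how RAM modules are sized). The shortfall works out to a bit over a hundred MiB, which is the sort of amount commonly reserved by firmware or integrated graphics rather than lost to a chipset limit; that explanation is an assumption, not something the log confirms.

```python
def hidden_mib(installed_gib, reported_gib):
    """Memory installed but not reported to the OS, in MiB."""
    return (installed_gib - reported_gib) * 1024

# Figures from the log: 7.88GiB reported; 8 GiB installed is assumed.
print(f"{hidden_mib(8.0, 7.88):.0f} MiB not visible to the OS")   # 123 MiB

# A pure unit mix-up would not explain it: 8 decimal GB is only about 7.45 GiB.
print(f"{8e9 / 2**30:.2f} GiB")                                    # 7.45 GiB
```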

Re: I'm new and I have some GPU issue with FAH

Posted: Sun Mar 10, 2013 3:17 pm
by Breach
Thanks guys.

@P5-133XL - I have 8GB of RAM, and with FAH and everything else it never goes beyond 3/8 utilization; that's why I'm even more perplexed by the intense swapping. No, it doesn't settle. When I quit FAH after the initial boot (waiting for all cores/processes to exit in Task Manager) and start FAH again, resuming the GPU units results in exactly the same behavior.

@Napoleon - I stopped the CPU slot altogether - same story. I'm aware that GPUs can't prioritise, but when it's working normally (without the VM page faults) it works like a charm, with barely any lag; when it's swapping... horrible. I'm OK with pausing GPU folding, but a) that behavior is not normal, b) my GPUs aren't fully utilized when running, and c) these writes will destroy my SSD, where the swap file is :)

So the mystery remains: why isn't it swapping when it initially gets the WUs, while after a reboot/resume it starts acting like that? Any ideas what to try next? Thanks.