					
				GT 640M
				Posted: Tue Jul 03, 2012 6:25 pm
				by RedKnightRG
				I downloaded the updated GPUs.txt file; without it v7 doesn't even detect the GT 640M.  With the updated GPUS.txt the v7 client will create a GPU slot for me but it never actually does anything as described elsewhere.
			 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 7:11 pm
				by bruce
				I can think of three possibilities.  Perhaps the GT 640M should not be whitelisted.  Perhaps it should be but there are no WUs for it.  Perhaps there are active projects but the FAHCore isn't working with that GPU.  Stanford is not fully prepared for the other new Kepler GPUs (reason 2 and/or 3).
I'll ask around.
			 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 7:18 pm
				by RedKnightRG
				For clarity here's my original description of the issue:
I'm getting the same issue as paz: the GPU slot hangs after the Starting GUI Server line. 
I'm trying to fold on a new Samsung Series 7 laptop that has a GT 640M in it. GPU-Z shows that FAH is allocating a couple hundred megs of VRAM and the core turbos up to 645 MHz, but GPU load stays pegged at 0% and no work is ever done. I observe the same problem if I use the v6 client with the -forceGPU fermi flag, so I assume this has to be a compatibility problem between Core 15 and the Kepler architecture in my GPU.... Or a driver problem?
I'm on driver version 295.55, as that is the latest driver Samsung supports for this laptop; I have as yet been unable to install the latest WHQL driver from NVIDIA's website.
Here's a copy of the log for that slot:
Code:
*********************** Log Started 2012-07-03T15:06:34Z ***********************
15:06:34:WU01:FS01:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah
15:06:34:WU01:FS01:Connecting to www.stanford.edu:80
15:06:35:WU01:FS01:FahCore 15: Downloading 1.49MiB
15:06:41:WU01:FS01:FahCore 15: 42.05%
15:06:47:WU01:FS01:FahCore 15: 88.31%
15:06:48:WU01:FS01:FahCore 15: Download complete
15:06:48:WU01:FS01:Valid core signature
15:06:48:WU01:FS01:Unpacked 4.47MiB to cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe
15:06:48:WU01:FS01:Starting
15:06:48:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/Red Knight/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 01 -suffix 01 -version 701 -lifeline 7724 -checkpoint 15 -gpu 0
15:06:48:WU01:FS01:Started FahCore on PID 2248
15:06:48:WU01:FS01:Core PID:7224
15:06:48:WU01:FS01:FahCore 0x15 started
15:06:49:WU01:FS01:0x15:
15:06:49:WU01:FS01:0x15:*------------------------------*
15:06:49:WU01:FS01:0x15:Folding@Home GPU Core
15:06:49:WU01:FS01:0x15:Version                2.22 (Thu Dec 8 17:08:05 PST 2011)
15:06:49:WU01:FS01:0x15:Build host             SimbiosNvdWin7
15:06:49:WU01:FS01:0x15:Board Type             NVIDIA/CUDA
15:06:49:WU01:FS01:0x15:Core                   15
15:06:49:WU01:FS01:0x15:
15:06:49:WU01:FS01:0x15:Window's signal control handler registered.
15:06:49:WU01:FS01:0x15:Preparing to commence simulation
15:06:49:WU01:FS01:0x15:- Looking at optimizations...
15:06:49:WU01:FS01:0x15:- Files status OK
15:06:49:WU01:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
15:06:49:WU01:FS01:0x15:- Expanded 145840 -> 660994 (decompressed 453.2 percent)
15:06:49:WU01:FS01:0x15:Called DecompressByteArray: compressed_data_size=145840 data_size=660994, decompressed_data_size=660994 diff=0
15:06:49:WU01:FS01:0x15:- Digital signature verified
15:06:49:WU01:FS01:0x15:
15:06:49:WU01:FS01:0x15:Project: 8020 (Run 4, Clone 137, Gen 81)
15:06:49:WU01:FS01:0x15:
15:06:49:WU01:FS01:0x15:Assembly optimizations on if available.
15:06:49:WU01:FS01:0x15:Entering M.D.
15:06:51:WU01:FS01:0x15:Tpr hash 01/wudata_01.tpr:  1918208994 601738095 2893739497 697289720 2995203303
15:06:51:WU01:FS01:0x15:GPU device info: vendor=0 device=0 name=<NA> match=0
15:06:51:WU01:FS01:0x15:Working on Gromacs Runs On Most of All Computer Systems
15:06:51:WU01:FS01:0x15:Client config unavailable.
15:06:51:WU01:FS01:0x15:Starting GUI Server
Also here's a copy of the GPU-Z output for my card:
 
The eagle-eyed reader may note that the CUDA option is unchecked in GPU-Z.  I do not know if this is a problem with GPU-Z or the drivers.  I'm going to try to compile a simple CUDA application to verify that the driver version I have doesn't have some major CUDA-related bug in it...
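For anyone comparing their own log against the one above, here is a minimal Python sketch of the symptom: the slot reaches "Starting GUI Server" and then never logs a "Completed ... steps" line. The function name and the default slot tag are made up for illustration, and the check only looks at the log text, so it cannot tell a real hang from a core that simply has not reported its first step yet.

```python
import re

# Matches progress lines such as
# "23:02:56:WU01:FS01:0x15:Completed         3 out of 25000000 steps (0%)."
PROGRESS = re.compile(r"Completed\s+\d+ out of \d+ steps")


def slot_looks_hung(log_text: str, slot: str = "FS01") -> bool:
    """Return True if the slot's most recent core start reached
    'Starting GUI Server' and no progress line was logged after it.

    `slot` is the folding-slot tag used in the log (hypothetical default).
    """
    waiting = False
    for line in log_text.splitlines():
        if f":{slot}:" not in line:
            continue  # line belongs to another slot or to the client itself
        if "Starting GUI Server" in line:
            waiting = True   # core is up, first progress line still pending
        elif PROGRESS.search(line):
            waiting = False  # progress seen, so the core is doing work
    return waiting
```

Give the core a few minutes before trusting a True result; a log snapshot taken right after the core starts will look the same as a stalled one.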
 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 7:45 pm
				by bruce
				Send your eagle eye toward the bottom of the "System Info" panel, where V7 says GPUs n / CUDA : xxx
CUDA should be part of any of the recent NVIDIA drivers.
			 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 8:53 pm
				by 7im
				Without the benefit of seeing the **** System **** section of the log, I can think of three possibilities.
1.  Non-OEM Samsung driver doesn't have CUDA.  Move to an NV driver version.
2.  Installed fah client to start as a service.  This is not supported by Microsoft.  Run fah client at Windows startup instead.
3.  Unlikely, but accessing the PC through Remote Desktop disables CUDA, disabling the FAH client.  Use LMI or VNC instead.
			 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 10:08 pm
				by RedKnightRG
				System portion of the log:  (It reports CUDA 3.0)
Code:
*********************** Log Started 2012-07-03T15:03:35Z ***********************
15:03:35:************************* Folding@home Client *************************
15:03:35:      Website: http://folding.stanford.edu/
15:03:35:    Copyright: (c) 2009-2012 Stanford University
15:03:35:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:03:35:         Args: --lifeline 1808 --command-port=36330
15:03:35:       Config: C:/Users/Red Knight/AppData/Roaming/FAHClient/config.xml
15:03:35:******************************** Build ********************************
15:03:35:      Version: 7.1.52
15:03:35:         Date: Mar 20 2012
15:03:35:         Time: 19:37:42
15:03:35:      SVN Rev: 3515
15:03:35:       Branch: fah/trunk/client
15:03:35:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
15:03:35:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
15:03:35:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT
15:03:35:     Platform: win32 XP
15:03:35:         Bits: 32
15:03:35:         Mode: Release
15:03:35:******************************* System ********************************
15:03:35:          CPU: Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
15:03:35:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
15:03:35:         CPUs: 8
15:03:35:       Memory: 5.79GiB
15:03:35:  Free Memory: 2.30GiB
15:03:35:      Threads: WINDOWS_THREADS
15:03:35:   On Battery: false
15:03:35:   UTC offset: -4
15:03:35:          PID: 9000
15:03:35:          CWD: C:/Users/Red Knight/AppData/Roaming/FAHClient
15:03:35:           OS: Windows 7 Home Premium
15:03:35:      OS Arch: AMD64
15:03:35:         GPUs: 1
15:03:35:        GPU 0: FERMI:1 GK107 [GeForce GT 640M]
15:03:35:         CUDA: 3.0
15:03:35:  CUDA Driver: 4020
15:03:35:Win32 Service: false
15:03:35:***********************************************************************
15:03:35:<config>
15:03:35:  <!-- Folding Slot Configuration -->
15:03:35:  <gpu v='true'/>
15:03:35:
15:03:35:  <!-- Network -->
15:03:35:  <proxy v=':8080'/>
15:03:35:
15:03:35:  <!-- User Information -->
15:03:35:  <passkey v='********************************'/>
15:03:35:  <team v='14'/>
15:03:35:  <user v='RedKnightRG'/>
15:03:35:
15:03:35:  <!-- Folding Slots -->
15:03:35:  <slot id='0' type='SMP'/>
15:03:35:  <slot id='1' type='GPU'>
15:03:35:    <client-type v='beta'/>
15:03:35:  </slot>
15:03:35:</config>
"1. Non-OEM Samsung driver doesn't have CUDA. Move to an NV driver version."
I'm worried about this one, but the reference NV drivers refuse to install on this laptop.  This is something I will have to take up with Samsung tech support, as I'm sure I'll want updated drivers for other reasons like game fixes.
"2. Installed fah client to start as a service. This is not supported by Microsoft. Run fah client at Windows startup instead."
I'm running FAHControl as a Windows application, so this isn't the problem.
"3. Unlikely, but accessing the PC through Remote Desktop disables CUDA, disabling the FAH client. Use LMI or VNC instead."
Not possible, as I do not RDP into this laptop.
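The checks above can be read straight out of the System section of the log. Below is a rough Python sketch that pulls the "Key: value" lines into a dictionary and flags the service and CUDA items. All function names are hypothetical. The driver decoding assumes the "CUDA Driver" field uses the CUDA driver API convention of 1000 * major + 10 * minor, which would make 4020 mean CUDA 4.2; the client's log does not confirm that, so treat it as a guess.

```python
import re

# Matches "HH:MM:SS:   Key: value" lines; banner and XML lines do not match
# because their first non-space character is not a letter or digit.
FIELD = re.compile(r"^\d\d:\d\d:\d\d:\s*([A-Za-z0-9 ]+?):\s*(.*)$")


def parse_system_fields(log_text):
    """Collect 'Key: value' pairs from a pasted FAHClient log header."""
    fields = {}
    for line in log_text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def decode_cuda_driver(value):
    """Split a driver number like '4020' into (major, minor).

    Assumes the 1000 * major + 10 * minor encoding (an assumption here).
    """
    number = int(value)
    return number // 1000, (number % 1000) // 10


def review(fields):
    """Return notes for the items discussed in this thread."""
    notes = []
    if fields.get("Win32 Service", "").lower() == "true":
        notes.append("client is running as a Windows service")
    if "CUDA" not in fields:
        notes.append("no CUDA line reported; check the driver")
    return notes
```

Run against the log above, `review` returns an empty list: the client is not installed as a service and a CUDA line is present.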
 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 10:29 pm
				by 7im
				Quit FAHControl.  Go to Start, All Programs, FAHClient, Data folder.  Open the Work folder.  Remove the 0x folder that corresponds to the GPU slot; that's 01 from what I see in the log above...
Restart FAHControl.  And...?
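If you would rather script that step, here is a cautious Python sketch. It swaps in a slightly different technique: instead of deleting the slot's work folder it moves it out of the way, so the work unit can still be inspected or put back. The function name and the "work-01-removed" destination are invented for illustration, and the lowercase "work" folder name is assumed from the step above. The data directory is whatever the log's CWD line shows. Quit FAHControl and the client first.

```python
import shutil
from pathlib import Path


def set_aside_slot_work(data_dir, slot="01"):
    """Move <data_dir>/work/<slot> to <data_dir>/work-<slot>-removed.

    Returns the new location, or None if the slot folder does not exist.
    Folder names are assumptions based on the steps described in this thread.
    """
    source = Path(data_dir) / "work" / slot
    if not source.is_dir():
        return None  # nothing to move for this slot
    destination = Path(data_dir) / f"work-{slot}-removed"
    if destination.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"{destination} already exists")
    shutil.move(str(source), str(destination))
    return destination
```

For the log above, the call would look like `set_aside_slot_work("C:/Users/Red Knight/AppData/Roaming/FAHClient", "01")`. Moving the folder abandons the assigned work unit just as deleting it does, so it is still a last resort.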
			 
			
					
				Re: GT 640M
				Posted: Tue Jul 03, 2012 11:19 pm
				by RedKnightRG
				We're off to the races!  I'd tried deleting the core before but I didn't touch the work folder; did I grab a beta 'kepler=okay' WU this time?
Log file:
Code:
*********************** Log Started 2012-07-03T23:01:27Z ***********************
23:01:27:WU01:FS01:Cleaning up
23:01:27:WU01:FS01:Connecting to assign-GPU.stanford.edu:80
23:01:28:WU01:FS01:News: Welcome to Folding@Home
23:01:28:WU01:FS01:Assigned to work server 171.67.108.143
23:01:28:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:"GK107 [GeForce GT 640M]" from 171.67.108.143
23:01:28:WU01:FS01:Connecting to 171.67.108.143:8080
23:01:28:WU01:FS01:Downloading 143.02KiB
23:01:29:WU01:FS01:Download complete
23:01:29:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:OK project:8020 run:7 clone:22 gen:36 core:0x15 unit:0x0000002c6953ee2f4f967acc6e0b501a
23:01:29:WU01:FS01:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_15.fah
23:01:29:WU01:FS01:Connecting to www.stanford.edu:80
23:01:29:WU01:FS01:FahCore 15: Downloading 1.88MiB
23:01:35:WU01:FS01:FahCore 15: 86.64%
23:01:35:WU01:FS01:FahCore 15: Download complete
23:01:35:WU01:FS01:Valid core signature
23:01:35:WU01:FS01:Unpacked 7.71MiB to cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_15.fah/FahCore_15.exe
23:01:35:WU01:FS01:Starting
23:01:35:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/Red Knight/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_15.fah/FahCore_15.exe" -dir 01 -suffix 01 -version 701 -lifeline 8700 -checkpoint 15 -gpu 0
23:01:35:WU01:FS01:Started FahCore on PID 6328
23:01:36:WU01:FS01:Core PID:7468
23:01:36:WU01:FS01:FahCore 0x15 started
23:01:36:WU01:FS01:0x15:
23:01:36:WU01:FS01:0x15:*------------------------------*
23:01:36:WU01:FS01:0x15:Folding@Home GPU Core
23:01:36:WU01:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
23:01:36:WU01:FS01:0x15:Build host             AmoebaRemote
23:01:36:WU01:FS01:0x15:Board Type             NVIDIA/CUDA
23:01:36:WU01:FS01:0x15:Core                   15
23:01:36:WU01:FS01:0x15:
23:01:36:WU01:FS01:0x15:Window's signal control handler registered.
23:01:36:WU01:FS01:0x15:Preparing to commence simulation
23:01:36:WU01:FS01:0x15:- Looking at optimizations...
23:01:36:WU01:FS01:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
23:01:36:WU01:FS01:0x15:- Created dyn
23:01:36:WU01:FS01:0x15:- Files status OK
23:01:36:WU01:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
23:01:36:WU01:FS01:0x15:- Expanded 145945 -> 660994 (decompressed 452.9 percent)
23:01:36:WU01:FS01:0x15:Called DecompressByteArray: compressed_data_size=145945 data_size=660994, decompressed_data_size=660994 diff=0
23:01:36:WU01:FS01:0x15:- Digital signature verified
23:01:36:WU01:FS01:0x15:
23:01:36:WU01:FS01:0x15:Project: 8020 (Run 7, Clone 22, Gen 36)
23:01:36:WU01:FS01:0x15:
23:01:36:WU01:FS01:0x15:Assembly optimizations on if available.
23:01:36:WU01:FS01:0x15:Entering M.D.
23:01:38:WU01:FS01:0x15:Tpr hash 01/wudata_01.tpr:  1046507648 730977226 3966105438 3908558382 529823314
23:01:38:WU01:FS01:0x15:GPU device id=0
23:01:38:WU01:FS01:0x15:Working on Gromacs Runs On Most of All Computer Systems
23:01:38:WU01:FS01:0x15:Client config unavailable.
23:01:38:WU01:FS01:0x15:Starting GUI Server
23:02:56:WU01:FS01:0x15:Setting checkpoint frequency: 250000
23:02:56:WU01:FS01:0x15:Completed         3 out of 25000000 steps (0%).
23:10:12:FS01:Paused
23:10:13:FS01:Shutting core down
23:10:16:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
23:10:44:FS01:Unpaused
23:10:44:WU01:FS01:Starting
23:10:44:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/Red Knight/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_15.fah/FahCore_15.exe" -dir 01 -suffix 01 -version 701 -lifeline 8700 -checkpoint 15 -gpu 0
23:10:44:WU01:FS01:Started FahCore on PID 2600
23:10:44:WU01:FS01:Core PID:4328
23:10:44:WU01:FS01:FahCore 0x15 started
23:10:45:WU01:FS01:0x15:
23:10:45:WU01:FS01:0x15:*------------------------------*
23:10:45:WU01:FS01:0x15:Folding@Home GPU Core
23:10:45:WU01:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
23:10:45:WU01:FS01:0x15:Build host             AmoebaRemote
23:10:45:WU01:FS01:0x15:Board Type             NVIDIA/CUDA
23:10:45:WU01:FS01:0x15:Core                   15
23:10:45:WU01:FS01:0x15:
23:10:45:WU01:FS01:0x15:Window's signal control handler registered.
23:10:45:WU01:FS01:0x15:Preparing to commence simulation
23:10:45:WU01:FS01:0x15:- Looking at optimizations...
23:10:45:WU01:FS01:0x15:- Files status OK
23:10:45:WU01:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
23:10:45:WU01:FS01:0x15:- Expanded 145945 -> 660994 (decompressed 452.9 percent)
23:10:45:WU01:FS01:0x15:Called DecompressByteArray: compressed_data_size=145945 data_size=660994, decompressed_data_size=660994 diff=0
23:10:45:WU01:FS01:0x15:- Digital signature verified
23:10:45:WU01:FS01:0x15:
23:10:45:WU01:FS01:0x15:Project: 8020 (Run 7, Clone 22, Gen 36)
23:10:45:WU01:FS01:0x15:
23:10:45:WU01:FS01:0x15:Assembly optimizations on if available.
23:10:45:WU01:FS01:0x15:Entering M.D.
23:10:47:WU01:FS01:0x15:Tpr hash 01/wudata_01.tpr:  1046507648 730977226 3966105438 3908558382 529823314
23:10:47:WU01:FS01:0x15:GPU device id=0
23:10:47:WU01:FS01:0x15:Working on Gromacs Runs On Most of All Computer Systems
23:10:47:WU01:FS01:0x15:Client config unavailable.
23:10:47:WU01:FS01:0x15:Starting GUI Server
23:11:52:WU01:FS01:0x15:Setting checkpoint frequency: 250000
23:11:52:WU01:FS01:0x15:Completed         3 out of 25000000 steps (0%).
GPU-Z indicates that FAH is now loading the 640M:
 
You can see that NVIDIA can turbo the core clock up to 708 MHz (reference clock is 625 MHz).  After a while, though, it starts rapidly switching between turbo speed and a very low speed (88 MHz) according to GPU-Z, and the long-term average is only ~500 MHz:
 
This is why you can't get killer PPD on a laptop - there's just too much damn heat to dissipate, at least on a laptop this thin...  Still being able to run SMP-8 and GPU fold on 384 CUDA cores with a machine that's only 0.9" thick is pretty awesome!  
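As a back-of-the-envelope check on those clock numbers, here is a small Python sketch. It assumes the GPU spends all of its time in just two states, full turbo (708 MHz) or the low clock (88 MHz), and ignores anything in between, so it is only a rough model of what GPU-Z is showing. The function name is made up for illustration.

```python
def turbo_fraction(avg_mhz, turbo_mhz, low_mhz):
    """Fraction of time at turbo under a two-state model.

    Solves avg = f * turbo + (1 - f) * low for f.
    """
    return (avg_mhz - low_mhz) / (turbo_mhz - low_mhz)


fraction = turbo_fraction(avg_mhz=500.0, turbo_mhz=708.0, low_mhz=88.0)
print(f"time at turbo: {fraction:.1%}")              # time at turbo: 66.5%
print(f"average vs reference: {500.0 / 625.0:.0%}")  # average vs reference: 80%
```

Under that model the card sits at the low clock for roughly a third of the time, and the ~500 MHz average works out to about 80% of the 625 MHz reference clock.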
 
 
Thanks for all the help 7im/Bruce, you guys have been a big help.  It looks like the whitelisting of the GT 640M is just fine!
 
			
					
				Re: GT 640M
				Posted: Wed Jul 04, 2012 1:07 am
				by 7im
				Just so everyone is clear:  deleting a work unit is ALWAYS the last resort.  I have always said that, and I have always done that.  NOTE that in this thread as well, I tried all other courses of action and verified all other settings before coming to that step!
Also note that I do not condone the use of the beta setting by non-beta team members, nor did I promote that option in this thread. Besides, that information is easy enough to find in a dozen other locations, so I never need to cross that line. Advanced WUs have a higher failure rate than standard FAH WUs, but not all that high. Beta WUs have a higher failure rate than Advanced WUs. In many cases you WILL get WUs that have problems. For help with beta problems, you MUST join the beta team and post the discussion in that forum. This topic is a borderline infringement of that policy because you admit to using the beta switch. But I also have issues with letting hardware sit idle when it could be productive finding a treatment or cure to a disease that hits close to home. Technically I should ask you to remove the beta setting before helping you, otherwise someone might get butt hurt about it instead of seeing the bigger picture.
And lastly, I would not shortcut the diagnostic process when that process could be helpful in any way in finding and solving a problem.  But this Kepler-not-folding issue has been documented in several threads already, all with logs posted, and no new insights have come from it or from the logs, other than that Keplers can't fold yet under normal circumstances:  1. most Keplers were not whitelisted until just a day ago, and 2. most if not all public work units do not fold on Kepler hardware, and tend to lock up the slot so that even diagnostic attempts are pointless.
Pande Group will just have to duplicate and solve this problem in house under a spot light.  Field stripping this one is of no value.
That being said, I am sure every effort is being made to make folding on Kepler hardware as easy and productive as possible, for everyone, some time in the very near future.
Have a safe Holiday, bre-atches.