
stuck at 99.99%

Posted: Thu Feb 07, 2013 3:12 am
by wered
OK, all three of the work units running on my GPUs have been sitting at 99.99% for over an hour now. They have all been showing between 6 and 14 seconds to go for that whole time. Can anyone explain why this is, or what's wrong, so I can fix it?

System specs:
MSI 990FXA-GD80
AMD FX-8350
16 GB G.Skill Sniper 2133 MHz
2 XFX Double-D Radeon HD 6850s
1 MSI Twin Frozr Radeon HD 6770 (Catalyst 12.11)
2 Seagate 1 TB hard drives, 7200 rpm
1 ADATA 128 GB SSD (system on this one, also FAH on here)
Windows 7

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 3:24 am
by P5-133XL
You need to look at the logs to find out what the slots are doing. It is common for the v7 client to sit there for up to a frame after being unpaused, but an hour is a bit long for a frame time on most GPUs. The v7 client can also sit at 99% while a WU is preparing to be sent, but it would be unusual for all three GPUs to finish at the same time. So one has to look at the logs.
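As a rough illustration of what to look for, the sketch below pulls the most recent "Completed ... out of ... steps" line for each work unit out of a v7 log. The line format is taken from log lines like the ones posted later in this thread; the function name `last_progress` and the regular expression are assumptions made for this example, not part of the FAH client.

```python
import re

# Matches core progress lines such as:
# 14:38:50:WU01:FS01:0x16:Completed  49500001 out of 50000000 steps (99%).
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):0x[0-9a-fA-F]+:"
    r"Completed\s+(\d+) out of (\d+) steps"
)

def last_progress(log_lines):
    """Return {(wu, slot): (time, done, total)} for the latest progress line of each WU."""
    latest = {}
    for line in log_lines:
        m = PROGRESS.match(line.strip())
        if m:
            time, wu, slot, done, total = m.groups()
            # Later lines overwrite earlier ones, so the last report wins.
            latest[(wu, slot)] = (time, int(done), int(total))
    return latest

sample = [
    "14:38:50:WU01:FS01:0x16:Completed  49500001 out of 50000000 steps (99%).",
    "14:38:50:WU02:FS02:0x16:Completed  34500001 out of 50000000 steps (69%).",
    "14:38:50:WU00:FS00:0x16:Completed  43799855 out of 59999872 steps (72%).",
    "14:38:50:WU00:FS00:0x16:Completed  43799907 out of 59999872 steps (73%).",
    "14:39:54:WU03:FS03:0xa4:Completed 435000 out of 1500000 steps  (29%)",
]

for (wu, slot), (time, done, total) in sorted(last_progress(sample).items()):
    print(f"{wu} {slot}: {done}/{total} steps, last update {time}")
```

If a slot's last update time is far in the past while the others keep advancing, that slot is the one to investigate.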

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 4:48 am
by mmonnin
I wonder if they're all working on the same WU and attempting to write/complete at once.
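One way to check this from a log is to compare the Project/Run/Clone/Gen that each core prints at startup. The sketch below assumes lines of the form `Project: N (Run R, Clone C, Gen G)`, as they appear in the log posted further down; the helper names `work_units` and `duplicated` are made up for this example.

```python
import re

# Matches core lines such as:
# 14:38:45:WU01:FS01:0x16:Project: 11293 (Run 13, Clone 321, Gen 42)
PRCG = re.compile(
    r":(WU\d+):FS\d+:0x[0-9a-fA-F]+:"
    r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)

def work_units(log_lines):
    """Map each WU id to its (project, run, clone, gen) tuple."""
    found = {}
    for line in log_lines:
        m = PRCG.search(line)
        if m:
            found[m.group(1)] = tuple(int(g) for g in m.groups()[1:])
    return found

def duplicated(found):
    """Return the (project, run, clone, gen) tuples assigned to more than one WU id."""
    seen = {}
    for wu, prcg in found.items():
        seen.setdefault(prcg, []).append(wu)
    return {prcg: wus for prcg, wus in seen.items() if len(wus) > 1}

sample = [
    "14:38:45:WU01:FS01:0x16:Project: 11293 (Run 13, Clone 321, Gen 42)",
    "14:38:46:WU02:FS02:0x16:Project: 11292 (Run 1, Clone 48, Gen 12)",
    "14:38:46:WU03:FS03:0xa4:Project: 7809 (Run 1, Clone 432, Gen 36)",
    "14:38:46:WU00:FS00:0x16:Project: 11292 (Run 8, Clone 105, Gen 30)",
]

units = work_units(sample)
print(units)
print("duplicates:", duplicated(units))
```

For the sample lines, which are copied from the log in this thread, the four tuples are all different, so the slots are not sharing a WU.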

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 2:42 pm
by wered

Code:

*********************** Log Started 2013-02-07T14:38:45Z ***********************
14:38:45:************************* Folding@home Client *************************
14:38:45:      Website: http://folding.stanford.edu/
14:38:45:    Copyright: (c) 2009-2012 Stanford University
14:38:45:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:38:45:         Args: --lifeline 5476 --command-port=36330
14:38:45:       Config: C:/Users/kyle/AppData/Roaming/FAHClient/config.xml
14:38:45:******************************** Build ********************************
14:38:45:      Version: 7.2.9
14:38:45:         Date: Oct 3 2012
14:38:45:         Time: 18:05:48
14:38:45:      SVN Rev: 3578
14:38:45:       Branch: fah/trunk/client
14:38:45:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
14:38:45:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
14:38:45:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
14:38:45:     Platform: win32 XP
14:38:45:         Bits: 32
14:38:45:         Mode: Release
14:38:45:******************************* System ********************************
14:38:45:          CPU: AMD FX(tm)-8350 Eight-Core Processor
14:38:45:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
14:38:45:         CPUs: 8
14:38:45:       Memory: 15.97GiB
14:38:45:  Free Memory: 13.55GiB
14:38:45:      Threads: WINDOWS_THREADS
14:38:45:   On Battery: false
14:38:45:   UTC offset: -5
14:38:45:          PID: 5028
14:38:45:          CWD: C:/Users/kyle/AppData/Roaming/FAHClient
14:38:45:           OS: Windows 7 Ultimate
14:38:45:      OS Arch: AMD64
14:38:45:         GPUs: 3
14:38:45:        GPU 0: ATI:4 Barts PRO [ATI Radeon HD 6800 Series]
14:38:45:        GPU 1: ATI:4 Barts PRO [ATI Radeon HD 6800 Series]
14:38:45:        GPU 2: ATI:4 Juniper XT [AMD Radeon HD 6000 Series]
14:38:45:         CUDA: Not detected
14:38:45:Win32 Service: false
14:38:45:***********************************************************************
14:38:45:<config>
14:38:45:  <!-- Folding Slot Configuration -->
14:38:45:  <gpu v='true'/>
14:38:45:
14:38:45:  <!-- User Information -->
14:38:45:  <team v='223752'/>
14:38:45:  <user v='Wered'/>
14:38:45:
14:38:45:  <!-- Folding Slots -->
14:38:45:</config>
14:38:45:Trying to access database...
14:38:45:Successfully acquired database lock
14:38:45:Enabled folding slot 00: READY gpu:0:"Barts PRO [ATI Radeon HD 6800 Series]"
14:38:45:Enabled folding slot 01: READY gpu:1:"Barts PRO [ATI Radeon HD 6800 Series]"
14:38:45:Enabled folding slot 02: READY gpu:2:"Juniper XT [AMD Radeon HD 6000 Series]"
14:38:45:Enabled folding slot 03: READY smp:8
14:38:45:WU01:FS01:Starting
14:38:45:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/kyle/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_16.fah/FahCore_16.exe -dir 01 -suffix 01 -version 702 -lifeline 5028 -checkpoint 15 -gpu 1
14:38:45:WU01:FS01:Started FahCore on PID 4196
14:38:45:WU01:FS01:Core PID:5204
14:38:45:WU01:FS01:FahCore 0x16 started
14:38:45:WU02:FS02:Starting
14:38:45:WU02:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/kyle/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_16.fah/FahCore_16.exe -dir 02 -suffix 01 -version 702 -lifeline 5028 -checkpoint 15 -gpu 2
14:38:45:WU02:FS02:Started FahCore on PID 1948
14:38:45:WU02:FS02:Core PID:3908
14:38:45:WU02:FS02:FahCore 0x16 started
14:38:45:WU00:FS00:Starting
14:38:45:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/kyle/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_16.fah/FahCore_16.exe -dir 00 -suffix 01 -version 702 -lifeline 5028 -checkpoint 15 -gpu 0
14:38:45:WU00:FS00:Started FahCore on PID 4400
14:38:45:WU00:FS00:Core PID:5064
14:38:45:WU00:FS00:FahCore 0x16 started
14:38:45:WU03:FS03:Starting
14:38:45:WU03:FS03:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/kyle/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 03 -suffix 01 -version 702 -lifeline 5028 -checkpoint 15 -np 8
14:38:45:WU03:FS03:Started FahCore on PID 4412
14:38:45:WU03:FS03:Core PID:4472
14:38:45:WU03:FS03:FahCore 0xa4 started
14:38:45:WU01:FS01:0x16:
14:38:45:WU01:FS01:0x16:*------------------------------*
14:38:45:WU01:FS01:0x16:Folding@Home GPU Core
14:38:45:WU01:FS01:0x16:Version 2.11 (Thu Dec 9 15:00:14 PST 2010)
14:38:45:WU01:FS01:0x16:
14:38:45:WU01:FS01:0x16:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86 
14:38:45:WU01:FS01:0x16:Build host: user-f6d030f24f
14:38:45:WU01:FS01:0x16:Board Type: AMD/OpenCL
14:38:45:WU01:FS01:0x16:Core      : x=16
14:38:45:WU01:FS01:0x16: Window's signal control handler registered.
14:38:45:WU01:FS01:0x16:Preparing to commence simulation
14:38:45:WU01:FS01:0x16:- Looking at optimizations...
14:38:45:WU02:FS02:0x16:
14:38:45:WU01:FS01:0x16:- Files status OK
14:38:45:WU02:FS02:0x16:*------------------------------*
14:38:45:WU01:FS01:0x16:sizeof(CORE_PACKET_HDR) = 512 file=<>
14:38:45:WU02:FS02:0x16:Folding@Home GPU Core
14:38:45:WU01:FS01:0x16:- Expanded 45052 -> 171163 (decompressed 379.9 percent)
14:38:45:WU02:FS02:0x16:Version 2.11 (Thu Dec 9 15:00:14 PST 2010)
14:38:45:WU01:FS01:0x16:Called DecompressByteArray: compressed_data_size=45052 data_size=171163, decompressed_data_size=171163 diff=0
14:38:45:WU02:FS02:0x16:
14:38:45:WU01:FS01:0x16:- Digital signature verified
14:38:45:WU02:FS02:0x16:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86 
14:38:45:WU01:FS01:0x16:
14:38:45:WU00:FS00:0x16:
14:38:45:WU02:FS02:0x16:Build host: user-f6d030f24f
14:38:45:WU01:FS01:0x16:Project: 11293 (Run 13, Clone 321, Gen 42)
14:38:45:WU00:FS00:0x16:*------------------------------*
14:38:45:WU02:FS02:0x16:Board Type: AMD/OpenCL
14:38:45:WU01:FS01:0x16:
14:38:45:WU00:FS00:0x16:Folding@Home GPU Core
14:38:45:WU02:FS02:0x16:Core      : x=16
14:38:45:WU01:FS01:0x16:Assembly optimizations on if available.
14:38:45:WU00:FS00:0x16:Version 2.11 (Thu Dec 9 15:00:14 PST 2010)
14:38:45:WU02:FS02:0x16: Window's signal control handler registered.
14:38:45:WU01:FS01:0x16:Entering M.D.
14:38:45:WU03:FS03:0xa4:
14:38:45:WU00:FS00:0x16:
14:38:45:WU02:FS02:0x16:Preparing to commence simulation
14:38:45:WU03:FS03:0xa4:*------------------------------*
14:38:45:WU00:FS00:0x16:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86 
14:38:46:WU02:FS02:0x16:- Looking at optimizations...
14:38:46:WU03:FS03:0xa4:Folding@Home Gromacs GB Core
14:38:46:WU00:FS00:0x16:Build host: user-f6d030f24f
14:38:46:WU02:FS02:0x16:- Files status OK
14:38:46:WU03:FS03:0xa4:Version 2.27 (Dec. 15, 2010)
14:38:46:WU00:FS00:0x16:Board Type: AMD/OpenCL
14:38:46:WU02:FS02:0x16:sizeof(CORE_PACKET_HDR) = 512 file=<>
14:38:46:WU03:FS03:0xa4:
14:38:46:WU00:FS00:0x16:Core      : x=16
14:38:46:WU02:FS02:0x16:- Expanded 44819 -> 171163 (decompressed 381.8 percent)
14:38:46:WU03:FS03:0xa4:Preparing to commence simulation
14:38:46:WU00:FS00:0x16: Window's signal control handler registered.
14:38:46:WU02:FS02:0x16:Called DecompressByteArray: compressed_data_size=44819 data_size=171163, decompressed_data_size=171163 diff=0
14:38:46:WU03:FS03:0xa4:- Looking at optimizations...
14:38:46:WU00:FS00:0x16:Preparing to commence simulation
14:38:46:WU02:FS02:0x16:- Digital signature verified
14:38:46:WU03:FS03:0xa4:- Files status OK
14:38:46:WU00:FS00:0x16:- Looking at optimizations...
14:38:46:WU02:FS02:0x16:
14:38:46:WU03:FS03:0xa4:- Expanded 2079390 -> 5386224 (decompressed 259.0 percent)
14:38:46:WU00:FS00:0x16:- Files status OK
14:38:46:WU02:FS02:0x16:Project: 11292 (Run 1, Clone 48, Gen 12)
14:38:46:WU03:FS03:0xa4:Called DecompressByteArray: compressed_data_size=2079390 data_size=5386224, decompressed_data_size=5386224 diff=0
14:38:46:WU00:FS00:0x16:sizeof(CORE_PACKET_HDR) = 512 file=<>
14:38:46:WU02:FS02:0x16:
14:38:46:WU03:FS03:0xa4:- Digital signature verified
14:38:46:WU00:FS00:0x16:- Expanded 45120 -> 171163 (decompressed 379.3 percent)
14:38:46:WU02:FS02:0x16:Assembly optimizations on if available.
14:38:46:WU03:FS03:0xa4:
14:38:46:WU00:FS00:0x16:Called DecompressByteArray: compressed_data_size=45120 data_size=171163, decompressed_data_size=171163 diff=0
14:38:46:WU02:FS02:0x16:Entering M.D.
14:38:46:WU03:FS03:0xa4:Project: 7809 (Run 1, Clone 432, Gen 36)
14:38:46:WU00:FS00:0x16:- Digital signature verified
14:38:46:WU03:FS03:0xa4:
14:38:46:WU00:FS00:0x16:
14:38:46:WU03:FS03:0xa4:Assembly optimizations on if available.
14:38:46:WU00:FS00:0x16:Project: 11292 (Run 8, Clone 105, Gen 30)
14:38:46:WU03:FS03:0xa4:Entering M.D.
14:38:46:WU00:FS00:0x16:
14:38:46:WU00:FS00:0x16:Assembly optimizations on if available.
14:38:46:WU00:FS00:0x16:Entering M.D.
14:38:47:WU01:FS01:0x16:Will resume from checkpoint file 01/wudata_01.ckp
14:38:47:WU01:FS01:0x16:Tpr hash 01/wudata_01.tpr:  1121881092 3864891076 3122073343 1673674026 3916301202
14:38:47:WU01:FS01:0x16:Working on ALZHEIMER DISEASE AMYLOID
14:38:47:WU01:FS01:0x16:Client config unavailable.
14:38:47:WU02:FS02:0x16:Will resume from checkpoint file 02/wudata_01.ckp
14:38:47:WU02:FS02:0x16:Tpr hash 02/wudata_01.tpr:  607329007 1780157933 3718750709 2581859805 1395931316
14:38:47:WU02:FS02:0x16:Working on ALZHEIMER DISEASE AMYLOID
14:38:47:WU02:FS02:0x16:Client config unavailable.
14:38:47:WU00:FS00:0x16:Will resume from checkpoint file 00/wudata_01.ckp
14:38:47:WU00:FS00:0x16:Tpr hash 00/wudata_01.tpr:  370268691 4065539119 1418384160 2068288595 2435575424
14:38:47:WU00:FS00:0x16:Working on ALZHEIMER DISEASE AMYLOID
14:38:47:WU00:FS00:0x16:Client config unavailable.
14:38:47:WU01:FS01:0x16:Starting GUI Server
14:38:47:WU02:FS02:0x16:Starting GUI Server
14:38:47:WU00:FS00:0x16:Starting GUI Server
14:38:48:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1
14:38:50:WU01:FS01:0x16:Resuming from checkpoint
14:38:50:WU01:FS01:0x16:fcCheckPointResume: retreived and current tpr file hash:
14:38:50:WU01:FS01:0x16:   0   1121881092   1121881092
14:38:50:WU01:FS01:0x16:   1   3864891076   3864891076
14:38:50:WU01:FS01:0x16:   2   3122073343   3122073343
14:38:50:WU01:FS01:0x16:   3   1673674026   1673674026
14:38:50:WU01:FS01:0x16:   4   3916301202   3916301202
14:38:50:WU01:FS01:0x16:fcCheckPointResume: file hashes same.
14:38:50:WU01:FS01:0x16:fcCheckPointResume: state restored.
14:38:50:WU01:FS01:0x16:fcCheckPointResume: name 01/wudata_01.log Verified 01/wudata_01.log
14:38:50:WU02:FS02:0x16:Resuming from checkpoint
14:38:50:WU01:FS01:0x16:fcCheckPointResume: name 01/wudata_01.trr Verified 01/wudata_01.trr
14:38:50:WU02:FS02:0x16:fcCheckPointResume: retreived and current tpr file hash:
14:38:50:WU01:FS01:0x16:fcCheckPointResume: name 01/wudata_01.xtc Verified 01/wudata_01.xtc
14:38:50:WU02:FS02:0x16:   0    607329007    607329007
14:38:50:WU01:FS01:0x16:fcCheckPointResume: name 01/wudata_01.edr Verified 01/wudata_01.edr
14:38:50:WU02:FS02:0x16:   1   1780157933   1780157933
14:38:50:WU00:FS00:0x16:Resuming from checkpoint
14:38:50:WU01:FS01:0x16:fcCheckPointResume: state restored 2
14:38:50:WU02:FS02:0x16:   2   3718750709   3718750709
14:38:50:WU00:FS00:0x16:fcCheckPointResume: retreived and current tpr file hash:
14:38:50:WU01:FS01:0x16:Resumed from checkpoint
14:38:50:WU02:FS02:0x16:   3   2581859805   2581859805
14:38:50:WU00:FS00:0x16:   0    370268691    370268691
14:38:50:WU01:FS01:0x16:Setting checkpoint frequency: 500000
14:38:50:WU02:FS02:0x16:   4   1395931316   1395931316
14:38:50:WU00:FS00:0x16:   1   4065539119   4065539119
14:38:50:WU01:FS01:0x16:Completed  49500001 out of 50000000 steps (99%).
14:38:50:WU02:FS02:0x16:fcCheckPointResume: file hashes same.
14:38:50:WU00:FS00:0x16:   2   1418384160   1418384160
14:38:50:WU02:FS02:0x16:fcCheckPointResume: state restored.
14:38:50:WU00:FS00:0x16:   3   2068288595   2068288595
14:38:50:WU02:FS02:0x16:fcCheckPointResume: name 02/wudata_01.log Verified 02/wudata_01.log
14:38:50:WU00:FS00:0x16:   4   2435575424   2435575424
14:38:50:WU02:FS02:0x16:fcCheckPointResume: name 02/wudata_01.trr Verified 02/wudata_01.trr
14:38:50:WU00:FS00:0x16:fcCheckPointResume: file hashes same.
14:38:50:WU02:FS02:0x16:fcCheckPointResume: name 02/wudata_01.xtc Verified 02/wudata_01.xtc
14:38:50:WU00:FS00:0x16:fcCheckPointResume: state restored.
14:38:50:WU02:FS02:0x16:fcCheckPointResume: name 02/wudata_01.edr Verified 02/wudata_01.edr
14:38:50:WU00:FS00:0x16:fcCheckPointResume: name 00/wudata_01.log Verified 00/wudata_01.log
14:38:50:WU02:FS02:0x16:fcCheckPointResume: state restored 2
14:38:50:WU00:FS00:0x16:fcCheckPointResume: name 00/wudata_01.trr Verified 00/wudata_01.trr
14:38:50:WU02:FS02:0x16:Resumed from checkpoint
14:38:50:WU02:FS02:0x16:Setting checkpoint frequency: 500000
14:38:50:WU02:FS02:0x16:Completed  34500001 out of 50000000 steps (69%).
14:38:50:WU00:FS00:0x16:fcCheckPointResume: name 00/wudata_01.xtc Verified 00/wudata_01.xtc
14:38:50:WU00:FS00:0x16:fcCheckPointResume: name 00/wudata_01.edr Verified 00/wudata_01.edr
14:38:50:WU00:FS00:0x16:fcCheckPointResume: state restored 2
14:38:50:WU00:FS00:0x16:Resumed from checkpoint
14:38:50:WU00:FS00:0x16:Setting checkpoint frequency: 599998
14:38:50:WU00:FS00:0x16:Completed  43799855 out of 59999872 steps (72%).
14:38:50:WU00:FS00:0x16:Completed  43799907 out of 59999872 steps (73%).
14:38:51:WU03:FS03:0xa4:Using Gromacs checkpoints
14:38:51:WU03:FS03:0xa4:Mapping NT from 8 to 8 
14:38:52:WU03:FS03:0xa4:Resuming from checkpoint
14:38:52:WU03:FS03:0xa4:Verified 03/wudata_01.log
14:38:52:WU03:FS03:0xa4:Verified 03/wudata_01.trr
14:38:52:WU03:FS03:0xa4:Verified 03/wudata_01.xtc
14:38:52:WU03:FS03:0xa4:Verified 03/wudata_01.edr
14:38:53:WU03:FS03:0xa4:Completed 434090 out of 1500000 steps  (28%)
14:39:54:WU03:FS03:0xa4:Completed 435000 out of 1500000 steps  (29%)
Here is a copy of my log. I'm not sure what I'm looking for. Also, I upgraded to the 13.1 drivers today.

Mod edit: Added Code tags

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 2:45 pm
by wered
Forgot to add: since installing the drivers, WU00 is now at 73%, WU01 is still at 99.99% with 7 seconds to go like it was all night long, and WU02 is now at 69%, so I'm not sure what is up with that.

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 3:20 pm
by wered
Well, it sat for about another hour and then finished, so I'm guessing the new drivers fixed that problem. One other thing: my GPUs never go above 40% usage. Why is that?

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 3:59 pm
by mmonnin
The GPU utilization is a driver problem. Either revert back to 12.8 or go to one of the later beta releases.

http://www.overclock.net/t/1323729/upda ... ivers/0_30

Re: stuck at 99.99%

Posted: Thu Feb 07, 2013 4:43 pm
by Joe_H
Besides the issue mentioned with some versions of the Catalyst drivers not fully utilizing ATI GPUs for folding, I noticed your CPU is overcommitted. Each ATI GPU can use up to an entire CPU core to process data going to and from the GPU, so with your CPU slot set at SMP:8 the GPU and CPU slots are fighting for CPU resources. I would recommend reducing the SMP slot to 6 at most; 4 or 5 might work even better. The contention over CPU usage is probably why the GPU core takes a long time to do the end-of-run finishing up that makes the WU ready to upload.
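The arithmetic behind that recommendation can be sketched as follows, assuming the worst case of one full CPU core reserved per GPU slot. The function `smp_threads` and its parameters are hypothetical names for this example, not a client setting.

```python
def smp_threads(cpu_cores, gpu_slots, cores_per_gpu=1):
    """Threads left for the SMP slot after reserving CPU cores for the GPU slots.

    cores_per_gpu=1 is the worst-case assumption that each GPU slot
    keeps an entire CPU core busy feeding the GPU.
    """
    reserved = gpu_slots * cores_per_gpu
    # Never go below one thread, even if the GPUs would claim every core.
    return max(cpu_cores - reserved, 1)

# This system: 8 cores (FX-8350) and three GPU slots.
print(smp_threads(8, 3))  # 5
```

With eight cores and three GPUs this leaves five threads for the SMP slot, which is in line with the 4 or 5 suggested above.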

Re: stuck at 99.99%

Posted: Sat Feb 09, 2013 3:17 am
by wered
Well, after I updated to 13.1 I no longer had the issue of being stuck at 99%; it's just not using the graphics cards fully. Going back to the 12.8 drivers fixed that, but it hurt my computer's performance in Crysis 3 a lot, so I went back to 13.1, and it will just have to stay that way until the drivers are updated and work right with it.

Re: stuck at 99.99%

Posted: Sat Feb 09, 2013 3:36 am
by mmonnin
Driver fix:
http://www.overclock.net/t/1323729/upda ... ivers/0_30