F@H TFP problems?

Posted: Sat Nov 24, 2012 5:37 pm
by T_Yamamoto
My usual estimated TPF is 1 hour 1 minute, but now it has suddenly jumped to 7 minutes 45 seconds.
It got to 99.99% once and I wasn't convinced, so I restarted F@H and it dropped down to 92%.
I restarted my computer and it went back down to 92%, and then the TPF went from 1 hour 1 minute back to 7 minutes 45 seconds.

Here is the log


*********************** Log Started 2012-11-24T17:16:14Z ***********************
17:16:14:************************* Folding@home Client *************************
17:16:14:      Website: http://folding.stanford.edu/
17:16:14:    Copyright: (c) 2009-2012 Stanford University
17:16:14:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:16:14:         Args: --lifeline 348 --command-port=36330
17:16:14:       Config: C:/Documents and Settings/Takumi/Application
17:16:14:               Data/FAHClient/config.xml
17:16:14:******************************** Build ********************************
17:16:14:      Version: 7.2.9
17:16:14:         Date: Oct 3 2012
17:16:14:         Time: 18:05:48
17:16:14:      SVN Rev: 3578
17:16:14:       Branch: fah/trunk/client
17:16:14:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
17:16:14:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
17:16:14:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
17:16:14:     Platform: win32 XP
17:16:14:         Bits: 32
17:16:14:         Mode: Release
17:16:14:******************************* System ********************************
17:16:14:          CPU: Intel(R) Pentium(R) Dual CPU E2140 @ 1.60GHz
17:16:14:       CPU ID: GenuineIntel Family 6 Model 15 Stepping 13
17:16:14:         CPUs: 2
17:16:14:       Memory: 1.25GiB
17:16:14:  Free Memory: 707.02MiB
17:16:14:      Threads: WINDOWS_THREADS
17:16:14:   On Battery: false
17:16:14:   UTC offset: -5
17:16:14:          PID: 3464
17:16:14:          CWD: C:/Documents and Settings/Takumi/Application Data/FAHClient
17:16:14:           OS: Microsoft Windows XP Service Pack 3
17:16:14:      OS Arch: X86
17:16:14:         GPUs: 2
17:16:14:        GPU 0: UNSUPPORTED: RV516 [Radeon X1300/X1550 Series]
17:16:14:        GPU 1: UNSUPPORTED: RV516 [Radeon X1300/X1550 Series] (Secondary)
17:16:14:         CUDA: Not detected
17:16:14:Win32 Service: false
17:16:14:***********************************************************************
17:16:14:<config>
17:16:14:  <!-- Folding Slot Configuration -->
17:16:14:  <gpu v='true'/>
17:16:14:
17:16:14:  <!-- Network -->
17:16:14:  <proxy v=':8080'/>
17:16:14:
17:16:14:  <!-- User Information -->
17:16:14:  <team v='198'/>
17:16:14:  <user v='T_Yamamoto'/>
17:16:14:
17:16:14:  <!-- Folding Slots -->
17:16:14:  <slot id='0' type='SMP'/>
17:16:14:</config>
17:16:14:Trying to access database...
17:16:23:Successfully acquired database lock
17:16:23:Enabled folding slot 00: READY smp:2
17:16:26:WU00:FS00:Starting
17:16:26:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1
17:16:26:WU00:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" "C:/Documents and Settings/Takumi/Application Data/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe" -dir 00 -suffix 01 -version 702 -lifeline 3464 -checkpoint 15 -np 2
17:16:31:WU00:FS00:Started FahCore on PID 3004
17:16:33:WU00:FS00:Core PID:844
17:16:33:WU00:FS00:FahCore 0xa4 started
17:16:35:WU00:FS00:0xa4:
17:16:35:WU00:FS00:0xa4:*------------------------------*
17:16:35:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
17:16:35:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
17:16:35:WU00:FS00:0xa4:
17:16:35:WU00:FS00:0xa4:Preparing to commence simulation
17:16:35:WU00:FS00:0xa4:- Looking at optimizations...
17:16:35:WU00:FS00:0xa4:- Files status OK
17:16:36:WU00:FS00:0xa4:- Expanded 2053633 -> 5365960 (decompressed 261.2 percent)
17:16:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=2053633 data_size=5365960, decompressed_data_size=5365960 diff=0
17:16:36:WU00:FS00:0xa4:- Digital signature verified
17:16:36:WU00:FS00:0xa4:
17:16:36:WU00:FS00:0xa4:Project: 7808 (Run 7, Clone 319, Gen 7)
17:16:36:WU00:FS00:0xa4:
17:16:41:WU00:FS00:0xa4:Assembly optimizations on if available.
17:16:41:WU00:FS00:0xa4:Entering M.D.
17:16:48:WU00:FS00:0xa4:Using Gromacs checkpoints
17:16:49:WU00:FS00:0xa4:Mapping NT from 2 to 2 
17:16:56:WU00:FS00:0xa4:Resuming from checkpoint
17:16:56:WU00:FS00:0xa4:Verified 00/wudata_01.log
17:16:56:WU00:FS00:0xa4:Verified 00/wudata_01.trr
17:16:56:WU00:FS00:0xa4:Verified 00/wudata_01.xtc
17:16:56:WU00:FS00:0xa4:Verified 00/wudata_01.edr
17:16:57:WU00:FS00:0xa4:Completed 1388440 out of 1500000 steps  (92%)
The log shows it completed up to 92%, but progress currently shows 95.52%.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 6:03 pm
by P5-133XL
You didn't show enough of the log to see the changes in frame times. If the current log doesn't go back far enough, you can look through older logs in the log folder (Start->All Programs->FAHClient->Data Directory->logs).

Whenever you restart folding, whether from a reboot or by simply restarting the client, you risk going backwards. Checkpoints are only written every so often (in your case, the default of every 15 minutes), so any work done since the last checkpoint is lost. There is also a risk that the checkpoint is invalid and the entire WU restarts from 0%. The best I can offer is that the less you restart folding, the less backwards movement you will experience.

Next, whenever you restart folding, the time estimates can be significantly off for the first few frames. The client simply has no data from which to establish an accurate estimate until some frames have been completed.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 6:17 pm
by T_Yamamoto
I checked my logs, but they don't show the problem.
ETA says 1 hour, so I will be back then with an update.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 6:52 pm
by T_Yamamoto
ETA 12 minutes
Progress: 98.37%
Log shows:


17:16:57:WU00:FS00:0xa4:Completed 1388440 out of 1500000 steps  (92%)
18:10:25:WU00:FS00:0xa4:Completed 1395000 out of 1500000 steps  (93%)
It has been 15 minutes and no save?

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 7:06 pm
by P5-133XL
It does not record the saving of checkpoints in the log. You'd need to check file times.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 7:09 pm
by T_Yamamoto
Which files would that be?

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 7:12 pm
by T_Yamamoto
Also, it's stuck at ETA 4 seconds.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 7:17 pm
by P5-133XL
The files in the work folder (Start->All Programs->FAHClient->Data Directory->Work). There you will see folders that correspond to the work queue entries. Open one of them and look at the *.ckp files and their modification times.

At the very end the ETA can get stuck for a few moments as the WU gets prepared for transmission.
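
For anyone who would rather script the file-time check described above, here is a minimal sketch. It assumes the work folder path is passed in by the caller and that checkpoints end in .ckp as mentioned in this post; the function name `checkpoint_ages` is made up for illustration and is not part of FAHClient.

```python
from datetime import datetime
from pathlib import Path


def checkpoint_ages(work_dir, now=None):
    """Return (path, age_in_minutes) for every *.ckp file under work_dir.

    work_dir is whatever folder the Work shortcut opens (an assumption:
    pass the real path for your install). Newest checkpoints come first.
    """
    now = now or datetime.now()
    results = []
    # rglob descends into the per-WU subfolders (00, 01, ...)
    for path in Path(work_dir).rglob("*.ckp"):
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        age_minutes = (now - modified).total_seconds() / 60
        results.append((path, age_minutes))
    return sorted(results, key=lambda item: item[1])


# Example use (the path is a placeholder, not a real location):
# for path, age in checkpoint_ages(r"C:\path\to\FAHClient\work"):
#     print(f"{path.name}: written {age:.1f} minutes ago")
```

With a 15 minute checkpoint interval, an age well beyond 15 minutes on a running WU would be the thing to worry about.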

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 7:33 pm
by T_Yamamoto
It says last modified 2:31, which is 2 minutes ago. I guess I'll just leave it as is for a while.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 8:15 pm
by T_Yamamoto


19:38:47:Server connection id=2 on 0.0.0.0:36330 from 127.0.0.1
19:39:04:Server connection id=2 ended
This just showed up; I'm wondering what it means.
Looks like now it's down to 94% >.<

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 8:44 pm
by Joe_H
T_Yamamoto wrote:


19:38:47:Server connection id=2 on 0.0.0.0:36330 from 127.0.0.1
19:39:04:Server connection id=2 ended
This just showed up; I'm wondering what it means.
Looks like now it's down to 94% >.<
The various components of the folding client communicate with each other using network connections. The 127.0.0.1 address is a local (loopback) connection. So when you open FAHControl, there will be an entry for that connection in the log FAHClient keeps, and another when the connection ends because FAHControl was closed. There is a documented telnet interface to FAHClient; it can also be used by third-party monitoring utilities.

Re: F@H TFP problems?

Posted: Sat Nov 24, 2012 9:51 pm
by Jesse_V
Joe_H wrote:There is a documented telnet interface to FAHClient; it can also be used by third-party monitoring utilities.
Documentation is here: https://fah-web.stanford.edu/projects/F ... eInterface
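
As a rough illustration of what a third-party monitor does with that interface, here is a minimal Python sketch that opens a plain TCP connection to the command port shown in the log above (36330 on 127.0.0.1) and sends one line. The helper name `send_command` is made up for this example, and the `queue-info` command in the usage comment is an assumption to check against the linked documentation. The response also includes whatever greeting and prompt text the client sends.

```python
import socket


def send_command(command, host="127.0.0.1", port=36330, timeout=2.0):
    """Send one command line and return the text received before the
    connection closes or goes quiet for `timeout` seconds."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # nothing more arrived within the timeout
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example use (command name is an assumption; see the documentation):
# print(send_command("queue-info"))
```

Each call like this should show up in the FAHClient log as a "Server connection id=N" line followed by "ended", just like opening and closing FAHControl.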

Re: F@H TFP problems?

Posted: Sun Nov 25, 2012 1:52 pm
by bruce
T_Yamamoto wrote:Also, it's stuck at ETA 4 seconds.
If the TimePerFrame is inaccurate (perhaps because of a recent restart or a change in the activity of other programs), the Estimated Time of Arrival is going to be off, too. If the estimate is for a fast completion but the WU doesn't finish by then, there's no new data on which to base an improved estimate, so all it can do is wait for the WU to finish.
T_Yamamoto wrote:It says last modified 2:31, which is 2 minutes ago. I guess I'll just leave it as is for a while.
So checkpointing is working.
T_Yamamoto wrote:My usual estimated TPF is 1 hour 1 minute, but now it has suddenly jumped to 7 minutes 45 seconds.
It's unusual for anything of that magnitude to happen. If the 61m time was for a different WU from a different project, then you're assuming all projects are similar, and that is NOT TRUE. Different projects are assigned different numbers of points to (approximately) compensate for variations in TPF.

If both the 61m and 7.8m were frame times on the same WU, that's an indication that some other program was putting a VERY heavy load on your CPU. FAH is designed to run at an extremely low priority. All other processing requests should slow FAH down, but that's a pretty radical change. What changes to your processing load might have caused such a radical shift in TPF?
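
To make the TPF and ETA relationship concrete, here is a small Python sketch that derives both from the "Completed ... steps" lines posted earlier in this thread. It assumes a frame is 1% of the WU (100 frames), which matches the percentages in the log but is still an assumption, and the function names are made up for illustration. The first sample is a resume point rather than a frame boundary, so treat the result as a rough figure.

```python
import re
from datetime import datetime, timedelta

# Matches core progress lines such as:
# 17:16:57:WU00:FS00:0xa4:Completed 1388440 out of 1500000 steps  (92%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x\w+:Completed (\d+) out of (\d+) steps"
)


def parse_progress(lines):
    """Return a list of (time, steps_done, total_steps) tuples."""
    samples = []
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            samples.append((stamp, int(match.group(2)), int(match.group(3))))
    return samples


def estimate_tpf_and_eta(samples, frames_per_wu=100):
    """Return (seconds_per_frame, seconds_remaining) from first to last sample."""
    first, last = samples[0], samples[-1]
    elapsed = last[0] - first[0]
    if elapsed < timedelta(0):
        elapsed += timedelta(days=1)  # the log crossed midnight
    steps = last[1] - first[1]
    if steps <= 0:
        raise ValueError("need two samples with increasing step counts")
    seconds_per_step = elapsed.total_seconds() / steps
    steps_per_frame = last[2] / frames_per_wu  # assumption: 1 frame = 1%
    remaining_steps = last[2] - last[1]
    return seconds_per_step * steps_per_frame, seconds_per_step * remaining_steps


log = [
    "17:16:57:WU00:FS00:0xa4:Completed 1388440 out of 1500000 steps  (92%)",
    "18:10:25:WU00:FS00:0xa4:Completed 1395000 out of 1500000 steps  (93%)",
]
tpf, eta = estimate_tpf_and_eta(parse_progress(log))
print(f"TPF ~{tpf / 60:.0f} min, ETA ~{eta / 3600:.1f} h")
```

For those two log lines the rate works out to roughly two hours per frame, which illustrates how far a displayed estimate can drift from what the log actually recorded.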

Re: F@H TFP problems?

Posted: Tue Nov 27, 2012 3:10 am
by T_Yamamoto
I'm running an E2140 and there is no way that my TPF could jump that high. I had it folding while I wasn't using my computer and it was going at 61 minutes per frame. Then all of a sudden, while gaming, I checked and it had shot up to 7.8 minutes.

Thankfully, that WU is done and over.
It ended up being worth 3000+ points :P

Re: F@H TFP problems?

Posted: Tue Nov 27, 2012 3:49 am
by P5-133XL
Even small amounts of non-folding CPU usage can cause dramatic TPF increases for SMP WUs. The SMP core normally uses one thread per core, and the threads are highly synchronized with each other. If an outside activity like gaming starts using a CPU core, the folding thread on that core will be suspended to allow your application unfettered access to the CPU. Meanwhile, the other threads, being synchronized with the suspended thread, sit in a loop waiting for it to reappear.

Typically, with a 4-8 core machine, the recommendation is to SMP-fold on fewer than all of the cores to prevent that from happening. If there is a free core, then the non-folding application will not suspend a folding thread. You, however, only have two cores on an E2140, so the solution is to run two uniprocessor slots so that if one is suspended, it won't stop the other.

The difficulty is that machines that either do not run 24x7 or are not powerful enough may miss deadlines. The other problem is that multiple uniprocessor slots generally give fewer PPD than one SMP slot. On the brighter side, two uniprocessor slots may generate more PPD than your current situation.

So my suggestion is to try two uniprocessor slots and see if that works better for you than a single SMP slot. If not, you can always go back to the single SMP slot.
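
The reasoning above can be sketched as a toy model in Python. It is only a simplification under stated assumptions: each core gives folding some fraction of its time ("availability"), synchronized SMP threads all advance at the pace of the slowest one, and independent uniprocessor slots each advance at their own pace. The function names are invented for illustration, and the model measures raw compute only, not PPD or bonus points.

```python
def smp_throughput(availability):
    """Synchronized threads: every thread waits for the slowest one, so
    the whole slot runs at the minimum availability on all cores."""
    return len(availability) * min(availability)


def independent_throughput(availability):
    """Independent uniprocessor slots: each core contributes whatever
    time it actually gets, regardless of the others."""
    return sum(availability)


# Idle machine: both cores fully available to folding.
idle = [1.0, 1.0]
# Gaming: one core gives folding only a quarter of its time (made-up figure).
gaming = [1.0, 0.25]

for name, cores in (("idle", idle), ("gaming", gaming)):
    print(name, smp_throughput(cores), independent_throughput(cores))
```

In this model the two setups do the same raw work on an idle machine, but once one core is mostly taken by a game, the SMP slot falls to 0.5 cores' worth of work while the two independent slots still deliver 1.25.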