
8019 and time skew?

Posted: Fri Oct 11, 2013 12:49 am
by gimpy
I returned home to find 12-15 failed 8019 WUs. The client was not paused, OC'd, or touched at all. The only common denominator is that it paused to fix "clock skew" and then failed. I hope this topic might help someone else. I've done my best. I've invested money (for components) and time babysitting the PC. Now QRB requirements punish anyone running advanced units that might fail and anyone trying to get optimum stability and performance from their systems. Alas, my stats keep me from earning QRB points. So I must admit my inferiority at Folding and let those better at it do so. It is discouraging (at least) to see base points @ 2555, QRB 16000, on a unit that takes 2-3 days to complete. I can't justify the electricity. Best of luck to all, and I hope my "time skew" crashes help others.

22:03:27:WU00:FS02:0x15:- Digital signature verified
22:03:27:WU00:FS02:0x15:
22:03:27:WU00:FS02:0x15:Project: 8018 (Run 223, Clone 1, Gen 52)
22:03:27:WU00:FS02:0x15:
22:03:28:WU00:FS02:0x15:Entering M.D.
22:03:30:WU00:FS02:0x15:Tpr hash 00/wudata_01.tpr:  2731574642 3022946372 102154533 2248366412 3983019597
22:03:30:WU00:FS02:0x15:GPU device id=0
22:03:32:WU01:FS00:0xa4:Using Gromacs checkpoints
22:03:33:WU01:FS00:0xa4:Mapping NT from 4 to 4 
22:03:33:WU00:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
22:03:33:WU00:FS02:0x15:Client config unavailable.
22:03:33:WU00:FS02:0x15:Starting GUI Server
22:03:41:WU01:FS00:0xa4:Resuming from checkpoint
22:03:41:WU01:FS00:0xa4:Verified 01/wudata_01.log
22:03:48:WU01:FS00:0xa4:Verified 01/wudata_01.trr
22:03:48:WU01:FS00:0xa4:Verified 01/wudata_01.xtc
22:03:48:WU01:FS00:0xa4:Verified 01/wudata_01.edr
22:03:49:WU01:FS00:0xa4:Completed 1703640 out of 2500000 steps  (68%)
22:05:37:WU00:FS02:0x15:Setting checkpoint frequency: 250000
22:05:37:WU00:FS02:0x15:Completed         3 out of 25000000 steps (0%).
22:05:38:WARNING:WU00:FS02:Detected clock skew (2 mins 11 secs), adjusting time estimates
22:06:12:FS02:Shutting core down
22:06:12:FS00:Shutting core down
22:06:15:WU01:FS00:0xa4:Client no longer detected. Shutting down core 
22:06:15:WU01:FS00:0xa4:
22:06:15:WU01:FS00:0xa4:Folding@home Core Shutdown: CLIENT_DIED
22:06:16:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Mod Edit: Added Code Tags - PantherX

Re: 8019 and time skew?

Posted: Fri Oct 11, 2013 1:13 am
by bruce
gimpy wrote: I returned home to find 12-15 failed 8019 WUs. The client was not paused, OC'd, or touched at all. The only common denominator is that it paused to fix "clock skew" and then failed.
It's the other way around. The clock skew is a RESULT of a WU being stopped and resumed later. It's the client's way of adjusting the calculated time per frame, recognizing that it can't be estimated just based on the length of time since the WU was started.

22:03:27:WU00:FS02:0x15:Project: 8018 (Run 223, Clone 1, Gen 52)
22:03:27:WU00:FS02:0x15:
22:03:28:WU00:FS02:0x15:Entering M.D.
22:03:30:WU00:FS02:0x15:Tpr hash 00/wudata_01.tpr: 2731574642 3022946372 102154533 2248366412 3983019597
22:03:30:WU00:FS02:0x15:GPU device id=0
22:03:33:WU00:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
22:03:33:WU00:FS02:0x15:Client config unavailable.
22:03:33:WU00:FS02:0x15:Starting GUI Server
22:05:37:WU00:FS02:0x15:Setting checkpoint frequency: 250000
22:05:37:WU00:FS02:0x15:Completed 3 out of 25000000 steps (0%).
22:05:38:WARNING:WU00:FS02:Detected clock skew (2 mins 11 secs), adjusting time estimates

Within +- 1 second, the FahCore_15 was running for 2m11s initializing (not actually making progress) so that can't be counted toward the total time required to reach "3 out of 25000000 steps." The time skew message is not reporting an error, just a necessary adjustment.
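The arithmetic behind the 2m11s figure can be checked with a short sketch. It only subtracts the two timestamps taken from the log excerpt above; the helper name `gap_between` is made up for illustration and is not part of FAHClient.

```python
from datetime import datetime, timedelta

def gap_between(start: str, end: str) -> timedelta:
    """Return the time between two HH:MM:SS log timestamps.

    FAHClient log lines carry only a time of day, so a negative
    difference is assumed to mean the log rolled past midnight once.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta

# Core start (signature verified) vs. the clock skew warning.
startup_gap = gap_between("22:03:27", "22:05:38")
print(startup_gap)  # 0:02:11
```

The result matches the "2 mins 11 secs" in the warning, which is consistent with the skew being the startup time and not a fault.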


Finding the cause of any failures is a different matter but I don't see any errors at all. It looks like you shut down your client at 22:06:12 since both slots stopped then.

Re: 8019 and time skew?

Posted: Fri Oct 11, 2013 1:59 am
by PantherX
gimpy wrote: ...Now QRB requirements punish anyone running advanced units that might fail and anyone trying to get optimum stability and performance from their systems. Alas, my stats keep me from earning QRB points. So I must admit my inferiority at Folding and let those better at it do so. It is discouraging (at least) to see base points @ 2555, QRB 16000, on a unit that takes 2-3 days to complete...
Please note that QRB is calculated from the Server time, i.e. the Server time when the WU was assigned to you and the Server time when it was successfully returned. The only effect a local time difference might have is that the FAHClient might dump the WU if the current date is after the expiration date.
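As a rough illustration of why the local clock does not matter for points, here is a minimal sketch that takes only two server-side timestamps as input. It assumes the bonus has the form base_points * max(1, sqrt(k * deadline / elapsed)). The k value, the deadline, and the dates below are invented for the example and are not the real settings for project 8018 or any other project.

```python
from datetime import datetime, timedelta
from math import sqrt

def estimated_points(base_points, k, deadline_days, assigned, returned):
    """Estimate credit from server-side assign/return times.

    Assumed formula: base_points * max(1, sqrt(k * deadline / elapsed)).
    The client's local clock is not an input at all.
    """
    elapsed_days = (returned - assigned) / timedelta(days=1)
    if elapsed_days <= 0:
        raise ValueError("returned must be later than assigned")
    bonus = max(1.0, sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus

# Hypothetical WU: base 2555 points, returned 2 days after assignment.
assigned = datetime(2013, 10, 10, 22, 0, 0)   # server time (made up)
returned = assigned + timedelta(days=2)       # server time (made up)
print(round(estimated_points(2555, k=0.75, deadline_days=10,
                             assigned=assigned, returned=returned)))
```

Under that assumed formula, a faster return raises the bonus, and a very slow return falls back to base points because the multiplier never drops below 1.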

I just experienced a clock skew. The system was up for several days and successfully folded 20+ WUs. The issue that I was seeing was that the PPD estimates were extremely low while I got the correct points. I updated the time via the internet and it did the correction. Here is a brief section of the log:

*********************** Log Started 2009-12-31T21:01:33Z ***********************
21:01:33:************************* Folding@home Client *************************
21:01:33:      Website: http://folding.stanford.edu/
21:01:33:    Copyright: (c) 2009-2012 Stanford University
21:01:33:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
21:01:33:         Args: 
21:01:33:       Config: C:/ProgramData/FAHClient/config.xml
21:01:33:******************************** Build ********************************
21:01:33:      Version: 7.2.9
21:01:33:         Date: Oct 3 2012
21:01:33:         Time: 18:05:48
21:01:33:      SVN Rev: 3578
21:01:33:       Branch: fah/trunk/client
21:01:33:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
21:01:33:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
21:01:33:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
21:01:33:     Platform: win32 XP
21:01:33:         Bits: 32
21:01:33:         Mode: Release
21:01:33:******************************* System ********************************
21:01:33:          CPU: Pentium(R) Dual-Core CPU E5400 @ 2.70GHz
21:01:33:       CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
21:01:33:         CPUs: 2
21:01:33:       Memory: 2.97GiB
21:01:33:  Free Memory: 2.27GiB
21:01:33:      Threads: WINDOWS_THREADS
21:01:33:   On Battery: false
21:01:33:   UTC offset: 3
21:01:33:          PID: 1668
21:01:33:          CWD: C:/Windows/system32
21:01:33:           OS: Windows 7 Ultimate
21:01:33:      OS Arch: AMD64
21:01:33:         GPUs: 0
21:01:33:         CUDA: Not detected
21:01:33:Win32 Service: true
21:01:33:***********************************************************************
21:01:33:<config>
21:01:33:  <!-- Network -->
21:01:33:  <proxy v=':8080'/>
21:01:33:
21:01:33:  <!-- Remote Command Server -->
21:01:33:  <command-allow v='127.0.0.1 192.168.1.0/24'/>
21:01:33:  <password v=REDACTED>
21:01:33:
21:01:33:  <!-- User Information -->
21:01:33:  <passkey v='********************************'/>
21:01:33:  <team v='69411'/>
21:01:33:  <user v='PantherX'/>
21:01:33:
21:01:33:  <!-- Folding Slots -->
21:01:33:  <slot id='0' type='SMP'>
21:01:33:    <cpus v='-1'/>
21:01:33:    <max-packet-size v='small'/>
21:01:33:    <next-unit-percentage v='100'/>
21:01:33:    <pause-on-start v='false'/>
21:01:33:  </slot>
21:01:33:</config>
...
******************************** Date: 16/01/10 ********************************
******************************** Date: 17/01/10 ********************************
06:49:53:WU01:FS00:0xa4:Completed 1180000 out of 2000000 steps  (59%)
******************************** Date: 10/10/13 ********************************
23:51:39:WU01:FS00:Downloading project 10450 description
23:51:39:WU01:FS00:Connecting to fah-web.stanford.edu:80
23:51:40:WU01:FS00:Project 10450 description downloaded successfully
23:51:40:WARNING:WU01:FS00:Detected clock skew (3.73 years), adjusting time estimates
00:02:37:WU01:FS00:0xa4:Completed 1200000 out of 2000000 steps  (60%)
00:20:13:WU01:FS00:0xa4:Completed 1220000 out of 2000000 steps  (61%)
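For anyone who wants to pull these warnings out of a long log, here is a small sketch. The regular expression is modelled only on the two warning lines quoted in this thread, so treat the pattern as an assumption and not a specification of the FAHClient log format. The function name `find_clock_skews` is made up. The second part reproduces the "3.73 years" figure, reading the date banners above as 17 Jan 2010 and 10 Oct 2013.

```python
import re
from datetime import datetime

# Pattern inferred from the warning lines in this thread (an assumption).
SKEW_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):WARNING:(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"Detected clock skew \((?P<amount>[^)]+)\)"
)

def find_clock_skews(log_text):
    """Return one dict per clock skew warning found in the log text."""
    hits = []
    for line in log_text.splitlines():
        match = SKEW_RE.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits

sample = """\
22:05:37:WU00:FS02:0x15:Completed         3 out of 25000000 steps (0%).
22:05:38:WARNING:WU00:FS02:Detected clock skew (2 mins 11 secs), adjusting time estimates
23:51:40:WARNING:WU01:FS00:Detected clock skew (3.73 years), adjusting time estimates
"""
for hit in find_clock_skews(sample):
    print(hit["time"], hit["wu"], hit["slot"], hit["amount"])

# Gap between the last line logged under the old date and the warning.
last_before = datetime(2010, 1, 17, 6, 49, 53)
warning_at = datetime(2013, 10, 10, 23, 51, 40)
gap_years = (warning_at - last_before).days / 365.25
print(round(gap_years, 2))
```

The computed gap agrees with the reported skew, so the warning here simply reflects the system clock jumping from January 2010 to the correct date.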

Re: 8019 and time skew?

Posted: Fri Oct 11, 2013 2:12 am
by Joe_H
I have also seen the message about time skew on a system that went to sleep with processing active on a WU. When it was woken up, processing resumed and the message was entered into the log, reporting a time lapse similar to the length of time the machine was sleeping.

Re: 8019 and time skew?

Posted: Fri Oct 11, 2013 7:22 am
by gimpy
WOW! Everyone missed my point(s). Not going to "quote" everyone. 1. My system is set to power "always on". My system shouldn't have paused. 2. I'm not talking about system time or server time. I'm talking about the 80% successful return rate, not time on the clock. 3. If time skew is from a pause... why? I live alone with no ghosts, so why would the client pause at all? Lastly, I was trying to say I've been folding since mid-2002-2003(?) anonymously, then registered in 2005(?). What I meant is that F@H has now been taken over by people with more resources/skill than I have. Example: 1 user has 5.5 billion points with 100- WUs than I have. I have 2.5 million. I was saying others are exceeding in 1 day what took me 10-11 years to do. I took up folding because I'm disabled with a disease that COULD be cured. Now others carry the torch... and better. I'm retiring. 2 WUs left, set to finish, then I'm gone. Good luck to all.

Re: 8019 and time skew?

Posted: Fri Oct 11, 2013 7:32 am
by bruce
We're all trying to tell you to completely ignore the messages about the time skew. Other than that, what error are you reporting? (That is NOT an error.)

If you still want more help, please post the first couple of pages of the log, where it reports what hardware you have, followed by the configuration settings that you're using.

Re: 8019 and time skew?

Posted: Sat Oct 12, 2013 9:22 pm
by gimpy
I apologize if I wasn't clear. That is where my log stopped and my client dumped work units. Problem solved. I was using a brand new 650Ti that was defective. The card ran at varying speeds and then crashed. The "time skew" message is where it failed. Since that was the end of the log, I assumed it was applicable to the problem. So it was hardware, not software. All I got was 2-3 completed units with it. I burned it in, then left it unattended. I had to watch it behaving erratically. This is a late reply, but I wanted to thank those who tried to assist me. Peace and good health to all! gimpy