Update server? [stats down - fixed]


billford
Posts: 1003
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: Update server? [stats down]

Post by billford »

Thanks bruce.
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Update server? [stats down]

Post by Grandpa_01 »

bruce wrote:
billford wrote:Sorry if this is a daft question from a newbie, but surely that wouldn't happen in this case?
Right. Work Servers and Collection Servers were not down. Something happened to the stats servers and/or stats database and points were not added to the database, but the WUs were collected from the clients whenever they were completed.

The WU is time-stamped by the server when it's assigned and time-stamped again when it's returned to the WS or to the CS. QRB is based on that time difference and each server logs that information, whether or not the data is collected from those logs and sent to the stats server. Maybe grandpa is thinking of times when both the WS and the CS were temporarily unable to accept the uploads. That would have extended the time the WU was out for processing and reduced the QRB, but that's not what happened.
Correct, but I had read the earlier post about the time stamp error and was taking that into consideration, so from what was posted earlier it appears it was entirely possible for the QRB to be messed up.

Re: Update server?

Postby bruce » Wed Sep 18, 2013 3:47 pm
As 7im says, the date problem has been reported. Unfortunately I think the problem is bigger than just the date being tomorrow. The old stats page reported a date AND A TIME and if it was older than the top of the previous hour, there was a pretty good chance that one of the stats processes needed to be restarted. (All the stats eventually came through.) Without the time-stamp, I can't tell how old they are.

As a Moderator, I can, however, look up Project: 6361 (Run 1, Clone 73, Gen 15) and see that the credit has not yet been processed, confirming that SOMETHING is wrong with the stats processing cycle.

BTW: it's not 22:00 yet. (It's meaningless unless you specify what timezone you're in, which you did later.) While the logs are timestamped in UTC, Stanford servers run on PST/PDT and Donors are scattered all over the globe.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
schwancr
Pande Group Member
Posts: 136
Joined: Wed Jun 01, 2011 9:45 pm

Re: Update server? [stats down]

Post by schwancr »

These last WUs should be in the stats system now. I'm not sure how they didn't make it the first time.

-Christian
schwancr
Pande Group Member
Posts: 136
Joined: Wed Jun 01, 2011 9:45 pm

Re: Update server? [stats down]

Post by schwancr »

Actually, something went wrong when I tried to re-run some of the logs earlier. I can see which ones, but I don't have time to update it right now.

Bear with me,
-Christian
Leonardo
Posts: 260
Joined: Tue Dec 04, 2007 5:09 am
Hardware configuration: GPU slots on home-built, purpose-built PCs.
Location: Eagle River, Alaska

Re: Update server? [stats down]

Post by Leonardo »

Christian, thank you very much for the communication with us. It makes a big difference - always has, always will.
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Update server? [stats down]

Post by codysluder »

Grandpa_01 wrote:Correct, but I had read the earlier post about the time stamp error and was taking that into consideration, so from what was posted earlier it appears it was entirely possible for the QRB to be messed up.
Wrong, again. Any time-stamps that you can see are relative to your local clock, not the server clock. They may contribute to unofficial estimates (and you can manipulate the local clock so that cannot be official) but local clocks have nothing to do with your real points. The QRB times are 100% server based and if there's any way to see them, they're only in the Moderator Data.
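For readers following the QRB discussion, here is a minimal sketch of how a bonus could be computed from the two server-side time stamps alone. The formula below is the commonly published form of the QRB calculation, but it is not stated anywhere in this thread, so treat it as an assumption. The function name `qrb_points` and all project values (`base_points`, `k_factor`, `deadline_days`) are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone
from math import sqrt


def qrb_points(base_points, k_factor, deadline_days, assigned_at, returned_at):
    """Estimate credit from the two server-side time stamps.

    Assumed formula (not confirmed by this thread):
        points = base_points * max(1, sqrt(k_factor * deadline_days / elapsed_days))
    """
    elapsed_days = (returned_at - assigned_at).total_seconds() / 86400.0
    if elapsed_days <= 0:
        raise ValueError("return time must be later than assignment time")
    bonus = max(1.0, sqrt(k_factor * deadline_days / elapsed_days))
    return base_points * bonus


# Hypothetical values: both time stamps come from the server, never the client.
assigned = datetime(2013, 9, 17, 12, 0, 0, tzinfo=timezone.utc)
returned = datetime(2013, 9, 18, 12, 0, 0, tzinfo=timezone.utc)
late_return = datetime(2013, 9, 19, 12, 0, 0, tzinfo=timezone.utc)

fast = qrb_points(1000, 4.0, 4.0, assigned, returned)     # WU out for 1 day
slow = qrb_points(1000, 4.0, 4.0, assigned, late_return)  # WU out for 2 days
print(fast, slow)
```

Nothing in this sketch reads the local clock or the stats front page, which is the point made above: under these assumptions, only a delay in the server accepting the upload would lower the bonus.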
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Update server? [stats down]

Post by Grandpa_01 »

codysluder wrote:
Grandpa_01 wrote:Correct, but I had read the earlier post about the time stamp error and was taking that into consideration, so from what was posted earlier it appears it was entirely possible for the QRB to be messed up.
Wrong, again. Any time-stamps that you can see are relative to your local clock, not the server clock. They may contribute to unofficial estimates (and you can manipulate the local clock so that cannot be official) but local clocks have nothing to do with your real points. The QRB times are 100% server based and if there's any way to see them, they're only in the Moderator Data.
Not quite sure what you are saying I was wrong about, since it was the Stanford stats page here http://fah-web.stanford.edu/cgi-bin/mai ... =userstats that was reporting the wrong time stamp, which I am assuming is tied to Stanford's servers. It may not be, but to a layman like myself I believe that is a reasonable assumption since it is the official stats page. :wink:
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Update server? [stats down]

Post by 7im »

Reread the thread. The only place that had the wrong date and time was the stats front page. Everything else was checked and was accurate at that time.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Update server?

Post by Grandpa_01 »

bruce wrote:As 7im says, the date problem has been reported. Unfortunately I think the problem is bigger than just the date being tomorrow. The old stats page reported a date AND A TIME and if it was older than the top of the previous hour, there was a pretty good chance that one of the stats processes needed to be restarted. (All the stats eventually came through.) Without the time-stamp, I can't tell how old they are.

As a Moderator, I can, however, look up Project: 6361 (Run 1, Clone 73, Gen 15) and see that the credit has not yet been processed, confirming that SOMETHING is wrong with the stats processing cycle.

BTW: it's not 22:00 yet. (It's meaningless unless you specify what timezone you're in, which you did later.) While the logs are timestamped in UTC, Stanford servers run on PST/PDT and Donors are scattered all over the globe.
7im, are you talking about this post?

In my mind, when I read this, it says there was no time stamp and that something is wrong with the timestamps.
(All the stats eventually came through.) Without the time-stamp,
I guess I read it wrong.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
-alias-
Posts: 121
Joined: Sun Feb 22, 2009 1:20 pm

Re: Update server? [stats down]

Post by -alias- »

bollix47 wrote:
schwancr wrote:Alright, we tracked down some logs that hadn't been put into the stats DBs. Hopefully this covers all of them, if not then please post PRCG's so we can track them down.

Thanks,
Christian
from -alias-
Project: 8103 (Run 1, Clone 31, Gen 206)
Project: 8101 (Run 23, Clone 2, Gen 237)
Project: 8105 (Run 0, Clone 15, Gen 149)
see log in above post
Thanks,
Christian

Four of my five missing WUs are now in the statistics, but I do not know which of the two in the log below is the fourth. I can only assume that it is Project: 8104 (Run 0, Clone 5, Gen 110) with timestamp [September 19 07:21:55 UTC]. Is it still possible to look up the fifth, which I think is Project: 8105 (Run 0, Clone 4, Gen 176) with timestamp [September 18 15:43:57 UTC]?

Code:

[15:36:25] Completed 247500 out of 250000 steps  (99%)
[15:43:17] Completed 250000 out of 250000 steps  (100%)
[15:43:29] DynamicWrapper: Finished Work Unit: sleep=10000
[15:43:39] 
[15:43:39] Finished Work Unit:
[15:43:39] - Reading up to 64340496 from "work/wudata_04.trr": Read 64340496
[15:43:39] trr file hash check passed.
[15:43:39] - Reading up to 31618328 from "work/wudata_04.xtc": Read 31618328
[15:43:39] xtc file hash check passed.
[15:43:39] edr file hash check passed.
[15:43:39] logfile size: 195672
[15:43:39] Leaving Run
[15:43:42] - Writing 96315372 bytes of core data to disk...
[15:43:56] Done: 96314860 -> 91564001 (compressed to 5.8 percent)
[15:43:56]   ... Done.
[15:43:57] - Shutting down core
[15:43:57] 
[15:43:57] Folding@home Core Shutdown: FINISHED_UNIT
[15:43:57] CoreStatus = 64 (100)
[15:43:57] Unit 4 finished with 88 percent of time to deadline remaining.
[15:43:57] Updated performance fraction: 0.879325
[15:43:57] Sending work to server
[15:43:57] Project: 8105 (Run 0, Clone 4, Gen 176)
[15:43:57] + Attempting to send results [September 18 15:43:57 UTC]
[15:43:57] - Reading file work/wuresults_04.dat from core
[15:43:57]   (Read 91564513 bytes from disk)
[15:43:57] Connecting to http://128.143.231.201:8080/
[15:46:35] Posted data.
[15:46:35] Initial: 0000; - Uploaded at ~565 kB/s
[15:46:35] - Averaged speed for that direction ~554 kB/s
[15:46:35] + Results successfully sent
[15:46:35] Thank you for your contribution to Folding@Home.
[15:46:35] + Number of Units Completed: 287

[15:46:35] Trying to send all finished work units
[15:46:35] + No unsent completed units remaining.



**********************************************************************

[07:13:43] Completed 247500 out of 250000 steps  (99%)
[07:20:59] Completed 250000 out of 250000 steps  (100%)
[07:21:11] DynamicWrapper: Finished Work Unit: sleep=10000
[07:21:21] 
[07:21:21] Finished Work Unit:
[07:21:21] - Reading up to 64206000 from "work/wudata_00.trr": Read 64206000
[07:21:22] trr file hash check passed.
[07:21:22] - Reading up to 31550284 from "work/wudata_00.xtc": Read 31550284
[07:21:22] xtc file hash check passed.
[07:21:22] edr file hash check passed.
[07:21:22] logfile size: 202330
[07:21:22] Leaving Run
[07:21:25] - Writing 96119490 bytes of core data to disk...
[07:21:48] Done: 96118978 -> 91368721 (compressed to 5.6 percent)
[07:21:49]   ... Done.
[07:21:54] - Shutting down core
[07:21:54] 
[07:21:54] Folding@home Core Shutdown: FINISHED_UNIT
[07:21:55] CoreStatus = 64 (100)
[07:21:55] Unit 0 finished with 83 percent of time to deadline remaining.
[07:21:55] Updated performance fraction: 0.799564
[07:21:55] Sending work to server
[07:21:55] Project: 8104 (Run 0, Clone 5, Gen 110)
[07:21:55] + Attempting to send results [September 19 07:21:55 UTC]
[07:21:55] - Reading file work/wuresults_00.dat from core
[07:21:55]   (Read 91369233 bytes from disk)
[07:21:55] Connecting to http://128.143.231.201:8080/
[07:24:35] Posted data.
[07:24:35] Initial: 0000; - Uploaded at ~557 kB/s
[07:24:35] - Averaged speed for that direction ~553 kB/s
[07:24:35] + Results successfully sent
[07:24:35] Thank you for your contribution to Folding@Home.
[07:24:35] + Number of Units Completed: 555

[07:24:36] Trying to send all finished work units
[07:24:36] + No unsent completed units remaining.
My last upload before the stats server went down was on 09.18 at 12pm, as shown in the image below.
Image
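Since Christian asked for PRCGs to track down missing credits, a small script can pull them out of a client log like the one above. This is only a sketch: the two regular expressions assume the exact line shapes seen in this log ("Project: ... (Run ..., Clone ..., Gen ...)" followed by "+ Attempting to send results [...]"), and the helper name `uploads_from_log` is made up for illustration.

```python
import re

# Patterns match the two line shapes seen in the log above.
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SEND_RE = re.compile(r"\+ Attempting to send results \[(.+?)\]")


def uploads_from_log(log_text):
    """Return a list of (project, run, clone, gen, send_time) tuples."""
    uploads = []
    prcg = None
    for line in log_text.splitlines():
        match = PRCG_RE.search(line)
        if match:
            # Remember the most recent PRCG until its send line appears.
            prcg = tuple(int(x) for x in match.groups())
            continue
        match = SEND_RE.search(line)
        if match and prcg is not None:
            uploads.append(prcg + (match.group(1),))
            prcg = None
    return uploads


sample = """\
[15:43:57] Sending work to server
[15:43:57] Project: 8105 (Run 0, Clone 4, Gen 176)
[15:43:57] + Attempting to send results [September 18 15:43:57 UTC]
[07:21:55] Project: 8104 (Run 0, Clone 5, Gen 110)
[07:21:55] + Attempting to send results [September 19 07:21:55 UTC]
"""
for p, r, c, g, when in uploads_from_log(sample):
    print(f"P{p} R{r} C{c} G{g} sent {when}")
```

The output uses the same short "P/R/C/G" form that appears in the credit look-ups, which makes it easy to compare the two lists by eye.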
bollix47
Posts: 2959
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Update server? [stats down]

Post by bollix47 »

Both of those work units were credited:

Hi -alias- (team 37651),
Your WU (P8104 R0 C5 G110) was added to the stats database on 2013-09-19 09:18:35 for 294363 points of credit.
Hi -alias- (team 37651),
Your WU (P8105 R0 C4 G176) was added to the stats database on 2013-09-18 09:07:58 for 446966 points of credit.
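When comparing these stats times with a client log, keep in mind what bruce said earlier: the logs are stamped in UTC while the Stanford servers run on PST/PDT. Whether the stats database time shown here is Pacific time is an assumption on my part, not something confirmed in this thread. A rough sketch of the conversion, using a fixed UTC-7 offset for PDT and a hypothetical helper name `utc_log_time_to_pdt`:

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Pacific Daylight Time (UTC-7), which applies in September.
# A general script would use zoneinfo.ZoneInfo("America/Los_Angeles") instead.
PDT = timezone(timedelta(hours=-7), "PDT")


def utc_log_time_to_pdt(stamp, year):
    """Convert a client-log stamp such as 'September 18 15:43:57 UTC' to PDT.

    The log stamp carries no year, so the caller has to supply it.
    """
    parsed = datetime.strptime(f"{year} {stamp}", "%Y %B %d %H:%M:%S UTC")
    return parsed.replace(tzinfo=timezone.utc).astimezone(PDT)


# Stamp taken from the "+ Attempting to send results" line in the log above.
uploaded = utc_log_time_to_pdt("September 18 15:43:57 UTC", 2013)
print(uploaded.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Under that assumption, the upload of P8105 R0 C4 G176 started at 08:43:57 PDT, which falls shortly before the 09:07:58 stats entry quoted above.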
-alias-
Posts: 121
Joined: Sun Feb 22, 2009 1:20 pm

Re: Update server? [stats down]

Post by -alias- »

Thanks,
I think I am still missing one, but I do not know which one it could be. I could of course be wrong, but I do not think so. But OK, losing one is fair enough.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Update server?

Post by 7im »

Grandpa_01 wrote:
bruce wrote:As 7im says, the date problem has been reported. Unfortunately I think the problem is bigger than just the date being tomorrow. The old stats page reported a date AND A TIME and if it was older than the top of the previous hour, there was a pretty good chance that one of the stats processes needed to be restarted. (All the stats eventually came through.) Without the time-stamp, I can't tell how old they are.

As a Moderator, I can, however, look up Project: 6361 (Run 1, Clone 73, Gen 15) and see that the credit has not yet been processed, confirming that SOMETHING is wrong with the stats processing cycle.

BTW: it's not 22:00 yet. (It's meaningless unless you specify what timezone you're in, which you did later.) While the logs are timestamped in UTC, Stanford servers run on PST/PDT and Donors are scattered all over the globe.
7im, are you talking about this post?

In my mind, when I read this, it says there was no time stamp and that something is wrong with the timestamps.
(All the stats eventually came through.) Without the time-stamp,
I guess I read it wrong.
No, I am talking about this post: viewtopic.php?p=249131#p249131

More importantly, the date and time on the Stats server front page have nothing to do with QRB calculations. Only the date/time stamps on the work servers are used to calculate points. There was never any indication the work servers had a date/time problem.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
schwancr
Pande Group Member
Posts: 136
Joined: Wed Jun 01, 2011 9:45 pm

Re: Update server? [stats down - fixed]

Post by schwancr »

All of the stats issues from the end of last week should be resolved, so please let me know if anything else comes up.

-Christian