Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 4:41 pm
by orphic1
GreyWhiskers wrote:
Timestamp  Server IP  Name  Client  WU Avail  WUs Rcv
Tue Dec 13 04:50:01 PST 2011 171.64.65.102 vspg2v2 GPU 16012 9292
Tue Dec 13 05:00:00 PST 2011 171.64.65.102 vspg2v2 GPU 16040 220
Tue Dec 13 05:10:00 PST 2011 171.64.65.102 vspg2v2 GPU 16860 2205
Tue Dec 13 05:20:00 PST 2011 171.64.65.102 vspg2v2 GPU 16035 4215
Tue Dec 13 05:30:00 PST 2011 171.64.65.102 vspg2v2 GPU 16865 6166
Tue Dec 13 05:40:00 PST 2011 171.64.65.102 vspg2v2 GPU 15983 8160
Tue Dec 13 05:50:00 PST 2011 171.64.65.102 vspg2v2 GPU 15970 10139
Tue Dec 13 06:00:00 PST 2011 171.64.65.102 vspg2v2 GPU 16790 197
Tue Dec 13 06:10:00 PST 2011 171.64.65.102 vspg2v2 GPU 15935 2132
Tue Dec 13 06:20:00 PST 2011 171.64.65.102 vspg2v2 GPU 15949 4052
Tue Dec 13 06:30:00 PST 2011 171.64.65.102 vspg2v2 GPU 15944 5975
Tue Dec 13 06:40:00 PST 2011 171.64.65.102 vspg2v2 GPU 15901 7863
Tue Dec 13 06:50:00 PST 2011 171.64.65.102 vspg2v2 GPU 16659 9706
Tue Dec 13 07:00:00 PST 2011 171.64.65.102 vspg2v2 GPU 15877 169
Tue Dec 13 07:10:01 PST 2011 171.64.65.102 vspg2v2 GPU 15943 2084
Tue Dec 13 07:20:00 PST 2011 171.64.65.102 vspg2v2 GPU 16812 4047
Tue Dec 13 07:30:00 PST 2011 171.64.65.102 vspg2v2 GPU 16891 6076
Tue Dec 13 07:40:00 PST 2011 171.64.65.102 vspg2v2 GPU 16934 8145
Tue Dec 13 07:50:00 PST 2011 171.64.65.102 vspg2v2 GPU 16895 10157
I grabbed a few hours of the details for one of the GPU servers that seemed in the stats to have a lot (10,157) of WUs received.
The WU Available count is going up and down a bit - what one would expect from the replenishment process.
But the WU Received count is cycling. The top row shows 9292, then ten minutes later it's down to 220, then it climbs steadily until it reaches 10139, then drops to 197. For this sample, the cycle is hourly.
Per the definition of WU Received, that would tell me that a batch of WUs are being sent to the Stats server at the top of the hour. I presume the stats server is receiving these, but Stanford has chosen not to make the stats available quite yet.
NOTE: this is pure supposition on my part from the data.
Do you think they might be sanitizing the data due to the outage? It would make sense to me that they would want to make sure all stats are correct and validated before pushing them to the public site.
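For anyone who wants to check the hourly cycle in GreyWhiskers's log for themselves, here is a minimal Python sketch. It assumes each row has the layout shown in the quoted log (six date tokens, then IP, name, client, and the two counters); the function names are made up for illustration. It simply flags every sample where WUs Rcv is lower than the previous one, which is where the counter appears to restart.

```python
# A subset of the rows quoted above, copied verbatim from the log.
SAMPLE = """\
Tue Dec 13 04:50:01 PST 2011 171.64.65.102 vspg2v2 GPU 16012 9292
Tue Dec 13 05:00:00 PST 2011 171.64.65.102 vspg2v2 GPU 16040 220
Tue Dec 13 05:10:00 PST 2011 171.64.65.102 vspg2v2 GPU 16860 2205
Tue Dec 13 05:20:00 PST 2011 171.64.65.102 vspg2v2 GPU 16035 4215
Tue Dec 13 05:30:00 PST 2011 171.64.65.102 vspg2v2 GPU 16865 6166
Tue Dec 13 05:40:00 PST 2011 171.64.65.102 vspg2v2 GPU 15983 8160
Tue Dec 13 05:50:00 PST 2011 171.64.65.102 vspg2v2 GPU 15970 10139
Tue Dec 13 06:00:00 PST 2011 171.64.65.102 vspg2v2 GPU 16790 197
Tue Dec 13 06:10:00 PST 2011 171.64.65.102 vspg2v2 GPU 15935 2132
"""


def parse_row(line):
    """Split one log row into (time_of_day, wus_available, wus_received)."""
    fields = line.split()
    # Assumed layout: the time of day is the 4th token and the
    # two counters are the last two tokens on the line.
    return fields[3], int(fields[-2]), int(fields[-1])


def find_resets(rows):
    """Return the times at which WUs Rcv fell below the previous sample."""
    resets = []
    previous = None
    for time_of_day, _available, received in rows:
        if previous is not None and received < previous:
            resets.append(time_of_day)
        previous = received
    return resets


rows = [parse_row(line) for line in SAMPLE.strip().splitlines()]
print(find_resets(rows))  # ['05:00:00', '06:00:00']
```

On these rows the counter restarts at 05:00 and 06:00, which matches the hourly cycle described in the quote. It says nothing about why the counter restarts; that part is still supposition.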
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 4:43 pm
by new08
It could be possible that SOME WUs have gone adrift [nothing terrible about that, really] but that until the problem is resolved 'Schtum' is the watchword ... they have SO much more data on all this, and minor worries can get amplified easily in the dark.
That's also not the donors' fault. We are all little cogs in the F@H wheel, but don't actually want to be total mice.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 5:15 pm
by orphic1
new08 wrote:It could be possible that SOME WUs have gone adrift [nothing terrible about that, really] but that until the problem is resolved 'Schtum' is the watchword ... they have SO much more data on all this, and minor worries can get amplified easily in the dark.
That's also not the donors' fault. We are all little cogs in the F@H wheel, but don't actually want to be total mice.
Agreed, I have been following Vijay's blog but it hasn't been updated since this weekend.
http://folding.typepad.com/
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 6:48 pm
by chriskwarren
I'm getting questions about the possibility of points being lost due to this outage. I understand that work is still being sent and received. I have been encouraging folks to continue folding as normal.
I know the politically-correct and scientifically-correct thing to do is to keep folding, but keeping in mind that some donors DO fold for the points, I just want to make sure I am giving them the best possible advice for their circumstances. Some folks are still sore from the Bigadv changes, so I would like to be careful on this current network issue.
Can I safely assume for these donors (and continue to assert) that the points will be awarded as they should have been, bonuses intact and all? I have been looking for some kind of official word on this but can't seem to find it.
Thanks.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 7:07 pm
by kromberg
I just hope the stats servers have the ability to catch up on 3 days of WUs.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 7:36 pm
by DocJonz
It would be good if Vijay's blog was updated to let people know what's going on - it's the first place I look when there's 'an issue' - or there should be some response from the new Donor Advisory Board, as this was put together with the aim, in part, of improving communication flow both ways ... come on chaps, let's have some news
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 7:41 pm
by Jesse_V
Apparently Dr. Pande has Twitter and Google+ accounts, but from what I've seen both have been publicly silent. So it looks like his F@h community messages come through either forum or blog posts.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 7:54 pm
by kasson
Thanks for your posts. Points are still being recorded and credited; it appears that one of the databases where they're stored for display has been down since the outage. I sent email to help move this along.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 7:55 pm
by orphic1
kasson wrote:Thanks for your posts. Points are still being recorded and credited; it appears that one of the databases where they're stored for display has been down since the outage. I sent email to help move this along.
Thank you!!!
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 8:12 pm
by CougTek
Just when I was starting to think that it was a test by the Pande Lab to check how much production would drop if they'd stop the points reward system...
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 8:38 pm
by k1net1cs
@kasson
Thanks for the update. =)
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 9:08 pm
by Jesse_V
CougTek wrote:Just when I was starting to think that it was a test by the Pande Lab to check how much production would drop if they'd stop the points reward system...
It's a test of your commitment. After the test is over, the PG will use all of these forum posts to survey how many people fold for the points, and how many people fold for the science. Then you'll need to assume the Party Escort Submission Position and cake will be served.
JK, but seriously though, I've been watching the petaFLOPS and noticed that we've dropped from 9.0-9.1 x86 petaFLOPS down to 8.4, which is about a 7% drop. Of course that's not the best way to measure scientific production, and it might be just coincidence, but I just wanted to throw that out there. I hope to see everything up and running smoothly soon.
Re: Are you missing points??? - please read this!!!
Posted: Tue Dec 13, 2011 9:08 pm
by bruce
As Joe_H has said and as I've said in other posts, most of the work servers are back up so you should be sending and receiving work normally.
The database that accumulates data for the stats is still off-line. That DB is not essential to the science though it's important to many donors.
Each Work Server routinely accumulates stats records for WUs that have been returned to it. Under normal circumstances, those data are collected hourly and added to the master database. Since the DB is down, that updating process is not functioning and the records are collecting on each Work Server. Eventually, when the stats are functioning again, the data will be collected and there will be one or more massive updates covering everything that has happened during the outage.
I have no information about when that might change.
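To illustrate the accumulate-then-catch-up behaviour described above, here is a rough Python sketch. It is only a toy model built from that description, not how the Stanford servers are actually implemented; the class names, the `online` flag, and the donor name are all invented for the example.

```python
class WorkServer:
    """Toy work server that queues a stats record for each returned WU."""

    def __init__(self, name):
        self.name = name
        self.pending = []  # records waiting to be collected

    def record_return(self, donor, points):
        self.pending.append((donor, points))


class StatsDatabase:
    """Toy master stats database fed by an hourly collection pass."""

    def __init__(self):
        self.online = True
        self.totals = {}

    def collect(self, servers):
        """Merge queued records into the totals; return how many were merged."""
        if not self.online:
            # Database is down: nothing is lost, the records simply
            # stay queued on each work server until the next pass.
            return 0
        merged = 0
        for server in servers:
            for donor, points in server.pending:
                self.totals[donor] = self.totals.get(donor, 0) + points
                merged += 1
            server.pending.clear()
        return merged


server = WorkServer("ws-example")
db = StatsDatabase()

# Three hourly passes while the database is offline.
db.online = False
for _hour in range(3):
    server.record_return("donor_a", 100)
    db.collect([server])
print(len(server.pending), db.totals)  # 3 {}

# Database comes back: one large catch-up update.
db.online = True
print(db.collect([server]), db.totals)  # 3 {'donor_a': 300}
```

In this model the credit for the outage period shows up all at once on the first successful pass, which is the "massive update" described above.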
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 9:26 pm
by bruce
Jesse_V wrote:It's a test of your commitment. After the test is over, the PG will use all of these forum posts to survey how many people fold for the points, and how many people fold for the science. Then you'll need to assume the Party Escort Submission Position and cake will be served.
I really don't think the PG cares whether folks fold for the science, for the points, or for some combination of the two. Personally, I treat everyone equally and I know I really don't care what your motivation is as long as you fold.
Jesse_V wrote:JK, but seriously though, I've been watching the petaFLOPS and noticed that we've dropped from 9.0-9.1 x86 petaFLOPS down to 8.4, which is about a 7% drop. Of course that's not the best way to measure scientific production, and it might be just coincidence, but I just wanted to throw that out there. I hope to see everything up and running smoothly soon.
That's a combination of the servers being off-line, the stats not being accumulated, and maybe a few who have suspended processing, feeling rather indignant.
During the most serious part of the server outage, some folks couldn't get work because all the servers that have work for their particular client were off-line. As far as I can tell, that has been corrected.
I'm not sure where the petaFLOPS data is collected but my guess is that it's part of the stats system which is still down. We'll know if you see a corresponding jump in petaFLOPS when the stats system comes back on line.
Those who indignantly shut down fail to realize that there will be a big jump in credits for those who keep folding and they'll miss out on the bump.
Re: Stanford Network Issue
Posted: Tue Dec 13, 2011 9:35 pm
by k1wi
bruce wrote:Those who indignantly shut down fail to realize that there will be a big jump in credits for those who keep folding and they'll miss out on the bump.
That's the first thing that comes to my mind whenever the servers have a hiccup (small or otherwise) and people start declaring that they're going to pull their clients, particularly given PG's track record in these circumstances (I can only vaguely recall one instance where points weren't tracked perfectly, and donors still received a compensation credit). I would have thought that the risk of missing out on the points bump would outweigh the risk of not having points credited...