Re: 171.64.65.62
Posted: Sun Oct 31, 2010 3:26 pm
by Fireball0236
Well, tomorrow is the 1st of November, a holiday (at least here it is). If the Pande Group takes the day off too, problems could last longer =\ .
Btw, did you also include server 171.67.108.33 in that message, sortofageek? Would be tremendously helpful.
~ Fireball0236
Re: 171.64.65.62
Posted: Sun Oct 31, 2010 3:28 pm
by sortofageek
They are aware we can't get classic WUs, that there may be other servers involved, yes.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 3:33 pm
by alancabler
All of my CPU clients get assigned to 171.64.65.62, even SMP client.
Problem might have something to do with the assignment server...
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 3:40 pm
by sortofageek
Mine do as well, Alan. All I see in my logs is 171.64.65.62. I did mention I'm not getting reassigned.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 3:57 pm
by Baowoulf
alancabler wrote:All of my CPU clients get assigned to 171.64.65.62, even SMP client.
Problem might have something to do with the assignment server...
That might be it, because I keep getting the message about the server having no record of my WU. So something isn't being updated. I wonder if this is connected to the AS server info needing to be updated after the ProtoMol WUs were removed. Just guessing here unless someone else has other info. Thank you, sortofageek, for the update.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 3:59 pm
by sortofageek
You're welcome. They are looking, but I don't really know anything more than that at this point.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:00 pm
by VijayPande
Thanks for the heads up and sorry about the shortage. We're looking into it.
Here's the situation. We had one physical box go down yesterday that was serving multiple server VMs, including GPU3 and classic WUs. The classic WUs are running on the v5 (not v6) WS code and so it takes a long time for them to reload. This put a big load on the other classic servers, leading to the situation we have here.
We'll keep an eye on it as the day goes on, but part of this will just take time for vsp05c to load jobs. If vsp05c takes too long (e.g., isn't done today), then we may force the upgrade to v6 just to get that machine online sooner. This unfortunately isn't as simple as just restarting a server.
Re: 171.64.65.62
Posted: Sun Oct 31, 2010 4:05 pm
by yahavbr
sortofageek wrote:I have a couple of clients in the same situation.
It's Sunday, I'm not promising anything today, but I did send a message to Pande Group.
It's a pity that users need to send messages to them - millions of potential folding cycles (or whatever units are used to measure folding) are lost in these couple of days.
Re: 171.64.65.62
Posted: Sun Oct 31, 2010 4:07 pm
by VijayPande
yahavbr wrote:sortofageek wrote:I have a couple of clients in the same situation.
It's Sunday, I'm not promising anything today, but I did send a message to Pande Group.
It's a pity that users need to send messages to them - millions of potential folding cycles (or whatever units are used to measure folding) are lost in these couple of days.
We do have our own scripts that flag these sorts of issues as well. The main issue is that we (and Stanford in general) are short staffed on the weekend, so we don't respond to the forum as much. Please see my previous post for details regarding the server that went down, etc.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:08 pm
by sortofageek
I'm not certain I even needed to send a message. Looking at Professor Pande's post, it seems like something they were already working on. I sent the message so I could assure you folks they were aware.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:15 pm
by Baowoulf
yahavbr wrote:sortofageek wrote:I have a couple of clients in the same situation.
It's Sunday, I'm not promising anything today, but I did send a message to Pande Group.
It's a pity that users need to send messages to them - millions of potential folding cycles (or whatever units are used to measure folding) are lost in these couple of days.
I figured they knew something was going on; we were just wondering what the problem was. I had no doubt they were working on it too. It's just nice to have updates when something like this happens, which VijayPande and sortofageek provided, so we users feel like we're kept in the loop as well.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:17 pm
by brityank
Many thanks for the info, Dr. Pande and sortofageek. Will just wait it all out.
Oh - and
Happy Halloween from my part of the web!
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:20 pm
by VijayPande
Yea, sorry for more of a trick than a treat today, but vspg10c is back online right now and serving lots of WUs, so at the moment classic WUs are available again. I think we should be OK short term (although there may be some brief shortages here and there today), with the real solution coming on Monday.
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:32 pm
by Fireball0236
Many thanks, Vijay Pande! My client is happily folding again, working on P10032!
~ Fireball0236
Re: Can't get classic WUs: 171.64.65.62, 171.67.108.33
Posted: Sun Oct 31, 2010 4:46 pm
by pmasley
I still cannot get an assignment from the server. Using the classic program. It has been trying for the past two days. Guess I will just shut it down for a while and let the Gang sort it all out.
[16:20:52] + Attempting to get work packet
[16:20:52] - Connecting to assignment server
[16:20:53] - Successful: assigned to (171.64.65.62).
[16:20:53] + News From Folding@Home: Welcome to Folding@Home
[16:20:53] Loaded queue successfully.
[16:20:53] - Couldn't send HTTP request to server
[16:20:53] (Got status 503)
[16:20:53] + Could not connect to Work Server
[16:20:53] - Attempt #13 to get work failed, and no other work to do.
Waiting before retry.
Actually, this is not #13. I have restarted the program eight times over the past two days. I will restart it on Tuesday and see what happens.
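For anyone who wants to tally failures like these across several restarts, here is a minimal sketch in Python. It assumes the log lines look exactly like the excerpt above; other client versions may word them differently. The function name `summarize_log` and the `SAMPLE` text are hypothetical and not part of the Folding@Home client.

```python
import re

# These patterns assume the line shapes quoted in the log excerpt above.
STATUS_RE = re.compile(r"\(Got status (\d+)\)")
ATTEMPT_RE = re.compile(r"Attempt #(\d+) to get work failed")
ASSIGNED_RE = re.compile(r"assigned to \((\d{1,3}(?:\.\d{1,3}){3})\)")


def summarize_log(text):
    """Collect HTTP status codes, the highest attempt number, and assigned servers."""
    statuses = [int(code) for code in STATUS_RE.findall(text)]
    attempts = [int(number) for number in ATTEMPT_RE.findall(text)]
    servers = sorted(set(ASSIGNED_RE.findall(text)))
    return {
        "statuses": statuses,
        "max_attempt": max(attempts) if attempts else 0,
        "servers": servers,
    }


# Hypothetical sample built from lines in the excerpt above.
SAMPLE = """\
[16:20:53] - Successful: assigned to (171.64.65.62).
[16:20:53] - Couldn't send HTTP request to server
[16:20:53] (Got status 503)
[16:20:53] - Attempt #13 to get work failed, and no other work to do.
"""

if __name__ == "__main__":
    print(summarize_log(SAMPLE))
```

HTTP status 503 means "Service Unavailable", so a run of these suggests the work server is reachable but not able to hand out work. Since the attempt counter restarts each time the client is relaunched, the highest attempt number from a single log understates the total number of failures.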