
171.64.65.56

Posted: Wed Oct 22, 2008 2:46 am
by ArVee
171.64.65.56 is in Reject mode and the CS is failing to accept as well. Again.

Re: 171.64.65.56 is in Reject status

Posted: Wed Oct 22, 2008 4:17 am
by 314159
Rejecting again. :roll:

Any troubleshooters awake? :wink:

Thanks,

John

Re: 171.64.65.56 is in Reject status

Posted: Wed Oct 22, 2008 4:27 am
by kasson
Restarting the server code now...

Re: 171.64.65.56

Posted: Wed Oct 22, 2008 7:41 am
by ppetrone
It's up now. Thanks for the report.

Paula

Re: 171.64.65.56 is in Reject status

Posted: Wed Oct 22, 2008 1:27 pm
by 314159
Thanks for the quick resolution. :)
This server has been "behaving" fairly well recently.

Serverstats' DL column looks a bit thin.
Should a Linux/OSX folder be concerned? :?

John

Re: 171.64.65.56 is in Reject status

Posted: Thu Oct 30, 2008 2:25 am
by Foxbat
My Mac Mini stalled waiting on 171.64.65.56 this evening. It took about 30 minutes before it downloaded another WU. Must have been rush hour; the server status showed Accepting, but I got reject about a dozen times. I try not to take it personally... ;)

Re: 171.64.65.56 is in Reject status

Posted: Thu Oct 30, 2008 3:26 am
by Ragnar Dan
This server, and all the servers my machines use, are obviously not behaving properly. Either add more servers or quit allowing new downloads of the client so you're not so overwhelmed.

Re: 171.64.65.56 is in Reject status

Posted: Fri Oct 31, 2008 6:04 pm
by susato
Peace Ragnar, balky servers are just as troublesome for the PandeGroup scientists as they are for us donors.

Got a question though, Paula and Peter - For the past week or so the "WU To Go" and "WU Available" numbers on this server have hovered around 100 or less except for some VERY short intervals (for two hours there were between 100 and 300 WU available, and for another hour around 1600 WU were available).

In the past a low number of "WU Available" on a server meant a shortage of work units for the machines provisioned by that server.
Is this still true or has the WU supply behind the work servers been adjusted so that units will continue to flow freely to donors even if the numbers available on the work server are low?

TIA for your answers.

Re: 171.64.65.56 is in Reject status

Posted: Fri Oct 31, 2008 10:13 pm
by kasson
We had a problem with the server code on that machine that was causing it to "leak" jobs--jobs that should be available for assignment were being marked as not available. We've fixed the code, but we need to reclaim more of the jobs. We're hoping to improve the situation.

Re: 171.64.65.56 is in Reject status

Posted: Tue Nov 11, 2008 1:04 pm
by susato
Thanks Peter. It's now 12 days later and the server is still low on WU. Over the last week my three older Linux dual-core machines have been unable to get any work at all from this server. They are folding WU from the .65.64 server, which is designed for quad-core machines, at an average of 1.67x minimum speed. Similar reports are coming in on team forum pages from other donors whose Mac Minis, Linux servers and Mac laptops are also struggling to finish these units on time.

All donors know by heart that the PG needs work units returned promptly in order to keep the average generation time down and move the research forward swiftly. The original assignment-server logic diverting dual-core machines to the .65.64 server was supposed to be a stopgap to keep duallies in work during brief upsets of the .65.56 server. Three weeks is not brief. Is this still a server problem or is there a shortage of duallie work units?

The serverstats page also indicates that very few work units are returning to 65.56 -- this has to be related to the server's failure to distribute WU in the first place, because dual core machines are out there ready to fold them.

Looking forward to an update on this situation. Thanks.

Re: 171.64.65.56 is in Reject status

Posted: Tue Nov 11, 2008 2:30 pm
by kasson
Thanks for the ping. We currently have 3644 jobs available on vspg4; hopefully that will help somewhat.

Re: 171.64.65.56 is in Reject status

Posted: Wed Nov 12, 2008 12:24 pm
by susato
Helps? Definitely! Those jobs are quickly being snapped up - a moment ago there were only 2136. At this rate the Minis will be hungry again by noon Thursday.
It can't be easy keeping up with the demand for all kinds of work units around the clock. Thanks for keeping us "provisioned".

Re: 171.64.65.56 is in Reject status

Posted: Sun Nov 16, 2008 2:07 pm
by Aardvark
Server will not accept WU and CS is not accepting either. Serverstats indicate the 171.64.65.56 is REJECT. Am sending SOS...

Re: 171.64.65.56 is in Reject status

Posted: Sun Nov 16, 2008 3:40 pm
by AgrFan
171.64.65.56 has been having issues for almost a month now. It looks to have run out of disk space. I noticed the DL column on the server stat page was showing a zero last night.

Peter, any ETA on when this server will be functioning normally again?

Code:

[15:01:07] + Attempting to send results
[15:01:07] - Reading file work/wuresults_05.dat from core
[15:01:07]   (Read 26035859 bytes from disk)
[15:01:07] Connecting to http://171.64.65.56:8080/
[15:01:07] - Couldn't send HTTP request to server
[15:01:07] + Could not connect to Work Server (results)
[15:01:07]     (171.64.65.56:8080)
[15:01:07] - Error: Could not transmit unit 05 (completed November 16) to work server.
[15:01:07] - 4 failed uploads of this unit.


[15:01:07] + Attempting to send results
[15:01:07] - Reading file work/wuresults_05.dat from core
[15:01:07]   (Read 26035859 bytes from disk)
[15:01:07] Connecting to http://171.67.108.25:8080/
[15:01:08] - Couldn't send HTTP request to server
[15:01:08] + Could not connect to Work Server (results)
[15:01:08]     (171.67.108.25:8080)
[15:01:08]   Could not transmit unit 05 to Collection server; keeping in queue.
[15:01:08] + Sent 0 of 1 completed units to the server
[15:01:08] - Autosend completed
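
For anyone reading the log above: it shows the client trying the Work Server first, then the Collection Server, and keeping the unit in the queue when both fail. A minimal sketch of that order of operations, purely for illustration, might look like the following. This is not the client's actual code; the names `autosend`, `try_send`, `all_down` and `only_cs_up` are hypothetical, and no real network connection is made.

```python
from typing import Callable, Optional, Sequence

# Addresses taken from the log above: Work Server first, Collection Server second.
SERVERS = ["171.64.65.56:8080", "171.67.108.25:8080"]


def autosend(unit_id: str,
             servers: Sequence[str],
             try_send: Callable[[str, str], bool]) -> Optional[str]:
    """Try each server in order; return the first one that accepts the unit.

    Returns None when every server refuses, in which case the caller
    keeps the unit in its queue and retries later.
    """
    for address in servers:
        if try_send(address, unit_id):
            return address
    return None


# Hypothetical stand-ins for the real upload step (assumptions, not client code).
def all_down(address: str, unit_id: str) -> bool:
    """Simulates the situation in the log: neither server can be reached."""
    return False


def only_cs_up(address: str, unit_id: str) -> bool:
    """Simulates the Work Server being down while the Collection Server accepts."""
    return address == "171.67.108.25:8080"


if __name__ == "__main__":
    for name, sender in (("all_down", all_down), ("only_cs_up", only_cs_up)):
        accepted_by = autosend("05", SERVERS, sender)
        if accepted_by is None:
            print(name, "-> unit 05 kept in queue")
        else:
            print(name, "-> unit 05 accepted by", accepted_by)
```

With `all_down` the result is `None`, which matches the "keeping in queue" line in the log; with `only_cs_up` the unit would go to the Collection Server instead.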

Re: 171.64.65.56 is in Reject status

Posted: Sun Nov 16, 2008 4:40 pm
by noorman
Aardvark wrote: Server will not accept WU and CS is not accepting either. Serverstats indicate the 171.64.65.56 is REJECT. Am sending SOS...


SOS to whom?

I made a new thread on this and sent a PM to Kasson

