Two days without 171.67.108.17

Posted: Thu Nov 12, 2009 1:48 am
by preet.to

[00:58:53] + Attempting to send results [November 12 00:58:53 UTC]
[00:58:53] - Reading file work/wuresults_00.dat from core
[00:58:53]   (Read 4768341 bytes from disk)
[00:58:53] Connecting to http://171.67.108.17:8080/
[00:58:53] - Couldn't send HTTP request to server
[00:58:53]   (Got status 503)
[00:58:53] + Could not connect to Work Server (results)
[00:58:53]     (171.67.108.17:8080)
[00:58:53] + Retrying using alternative port
[00:58:53] Connecting to http://171.67.108.17:80/
[00:58:54] - Couldn't send HTTP request to server
[00:58:54]   (Got status 503)
[00:58:54] + Could not connect to Work Server (results)
[00:58:54]     (171.67.108.17:80)
[00:58:54]   Could not transmit unit 00 to Collection server; keeping in queue.
[00:58:54] + Sent 0 of 2 completed units to the server
[00:58:54] - Autosend completed

I have a number of WUs in the queue. Nothing has uploaded since Sunday. The server stats page shows nothing obviously wrong. It also sometimes takes 12 hours to get a new WU.

Can anyone shed light on this?

Re: Two days without 171.67.108.17

Posted: Thu Nov 12, 2009 8:29 am
by toTOW
I repeat once again: 108.17 is a collection server. Unfortunately, it's often overloaded (error 503, as in your example), or it fails to do its job because it's not aware of all the WUs distributed by the work servers.

It would be better to locate the original work server (probably a couple of lines before the ones you quoted) and find out why it's down (or whether someone else has already reported it).

Re: Two days without 171.67.108.17

Posted: Thu Nov 12, 2009 2:02 pm
by preet.to
Please don't get me wrong, I am not complaining. F@H is a victim of its own success. You cannot keep up with all the donors. I think that is good.

My post was about collections and not about getting new WU. I just get worried when nothing is collected for four days. Usually when things are slow, one or two will get through. But this is zero and the server stats page does not show that something is wrong. The network load is looking like others. I searched the announcements, forums on specific clients and did not get a clear answer.

Previous posts on this CS were from a different time period. So I was posting to make people aware. Perhaps to get a status.

Re: Two days without 171.67.108.17

Posted: Thu Nov 12, 2009 3:05 pm
by toTOW
Different time period, but still the same underlying causes/explanations ;)

Re: Two days without 171.67.108.17

Posted: Fri Nov 13, 2009 9:07 pm
by bruce
preet.to wrote:
My post was about collections and not about getting new WU. I just get worried when nothing is collected for four days. Usually when things are slow, one or two will get through. But this is zero and the server stats page does not show that something is wrong. The network load is looking like others. I searched the announcements, forums on specific clients and did not get a clear answer.

I think your understanding is good, as far as it goes, but you're not understanding what toTOW is saying.

When you look for a new WU, you can accept a download from one of several different servers, so you don't necessarily know if a server is down.

When you complete a WU, it must upload to the same Work Server it came from, so you will know if that server has gone down. That server is NOT a collection server.

If a WU fails to upload, whether because the designated Work Server is down or overloaded or because you hit a temporary internet problem, the client will try to upload to a second server, called a Collection Server, which then re-sends the result directly to the primary Work Server whenever it's available. The Collection Servers have been perpetually overloaded, but they do successfully upload some results that fail to reach the Work Server.
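
To make the order of attempts concrete, here is a rough Python sketch of that fallback. It is only an illustration, not the client's actual code: upload_result, the send callback and the example host names are all made up for this post, and the 8080-then-80 port order is simply what the quoted log shows.

```python
def upload_result(send, work_server, collection_server, ports=(8080, 80)):
    """Try the Work Server first, then the Collection Server.

    `send` is a hypothetical callback taking (host, port) and returning
    True when the server accepted the result. Returns the host that
    accepted it, or None if every attempt failed.
    """
    for host in (work_server, collection_server):
        for port in ports:  # main port first, then the alternative port
            if send(host, port):
                return host
    return None  # nothing accepted the result; the unit stays in the queue


# Example: the Work Server is down, the Collection Server answers on port 80.
def fake_send(host, port):
    return host == "cs.example" and port == 80

print(upload_result(fake_send, "ws.example", "cs.example"))
```

In the situation from the first post, both servers refuse the result, so the function would return None and the unit would stay queued until the next autosend.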

Reporting a problem with a CS such as 171.67.108.17 isn't a primary concern -- you should pay attention to the status of the WS and look for other reports regarding it. In the segment of FAHlog just before the parts that have been posted, you'll find similar messages but with a different IP address. That's what you should be reporting.
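
If the log is long, a small script can pull out every server address that failed, which makes the Work Server easy to spot. This is just a sketch that assumes the log lines look like the ones quoted in the first post; failed_servers is a made-up helper name, not part of any F@H tool.

```python
import re

# Matches the indented "(ip:port)" line printed after a failed connection.
ADDRESS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)")


def failed_servers(log_text):
    """Return the unique server IPs, in order of appearance, that are
    listed on the line right after a 'Could not connect' message."""
    found = []
    previous_failed = False
    for line in log_text.splitlines():
        if previous_failed:
            match = ADDRESS.search(line)
            if match and match.group(1) not in found:
                found.append(match.group(1))
        previous_failed = "Could not connect to Work Server" in line
    return found


sample = """\
[00:58:53] + Could not connect to Work Server (results)
[00:58:53]     (171.67.108.17:8080)
[00:58:53] + Retrying using alternative port
[00:58:54] + Could not connect to Work Server (results)
[00:58:54]     (171.67.108.17:80)
"""
print(failed_servers(sample))  # ['171.67.108.17']
```

Run on the full FAHlog, any address in the output other than 171.67.108.17 is the one worth searching the forum for.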

I'm going to close this topic.