171.67.108.25 and 171.64.65.56

Posted: Thu May 21, 2009 1:26 am
by Prohibition
] Completed 27510 out of 250000 steps (11%)
[20:05:17] - Couldn't send HTTP request to server
[20:05:17] + Could not connect to Work Server (results)
[20:05:17] (171.64.65.56:8080)
[20:05:18] - Error: Could not transmit unit 04 (completed May 20) to work server.


[20:05:18] + Attempting to send results
[20:05:18] - Couldn't send HTTP request to server
[20:05:18] + Could not connect to Work Server (results)
[20:05:18] (171.67.108.25:8080)
[20:05:18] Could not transmit unit 04 to Collection server; keeping in queue.


I've been stuck with my VMware SMP clients not sending in results. I have even wiped out the folder and reinstalled the clients. The clients worked a couple of days ago. I have 2 GPU clients running strong with no problems. I have disabled the Windows firewall (Vista 64) and also tried disabling the firewall on the router. Even after a restart, no luck: still no send. Any ideas?

Re: 171.67.108.25 and 171.64.65.56

Posted: Thu May 21, 2009 2:21 am
by patonb
Same with both of my 260s.

Attempting to send results [May 21 02:17:30 UTC]
[02:17:31] - Couldn't send HTTP request to server
[02:17:31] (Got status 503)
[02:17:31] + Could not connect to Work Server (results)
[02:17:31] (171.67.108.25:8080)
[02:17:31] + Retrying using alternative port
[02:17:31] - Couldn't send HTTP request to server
[02:17:31] (Got status 503)
[02:17:31] + Could not connect to Work Server (results)
[02:17:31] (171.67.108.25:80)
[02:17:31] Could not transmit unit 05 to Collection server; keeping in queue.
[02:17:39] + Attempting to get work packet
[02:17:39] - Connecting to assignment server
[02:17:40] + No appropriate work server was available; will try again in a bit.
[02:17:40] + Couldn't get work instructions.
[02:17:40] - Attempt #2 to get work failed,

It's been 2 hours.

Re: 171.67.108.25 and 171.64.65.56

Posted: Thu May 21, 2009 4:21 pm
by bruce
patonb wrote:[02:17:31] (171.67.108.25:80)
[02:17:31] Could not transmit unit 05 to Collection server; keeping in queue.
Two observations:
1) You're only showing a 503 error with a single Collection Server, so we need more of your FAHlog. Collection Servers, like 171.67.108.25, are a backup for the primary Work Server. Error 503 indicates the server is busy. That is a relatively common problem, and the Pande Group knows that the only real fix is better server hardware and software.
2) A 2-hour problem uploading a result isn't considered serious. The client is designed to hold the output in queue and move on to downloading a new work assignment from a different server. The upload will be retried later, automatically, so be patient.
[02:17:39] - Connecting to assignment server
[02:17:40] + No appropriate work server was available; will try again in a bit.
[02:17:40] + Couldn't get work instructions.
[02:17:40] - Attempt #2 to get work failed,

It's been 2 hours.
Your client did have trouble downloading a new assignment. This is a more important problem. I see other discussions about this issue and one of the Mods has notified the Pande Group. Presumably that is being or has been worked on by now (9:30 AM Stanford time).
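
For anyone sorting through a long FAHlog, the two kinds of upload failure seen in this thread (a 503, where the server answered but was busy, versus a failure with no status line at all) can be tallied per server with a small script. This is only a rough sketch in Python: the patterns are copied from the log lines quoted here, other client versions may word the messages differently, and the function name summarize_upload_failures is made up for illustration. It is not part of the FAH client.

```python
import re
from collections import Counter

# Patterns based on the log lines quoted in this thread (assumption:
# other client versions may format these messages differently).
ENDPOINT_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)")
STATUS_RE = re.compile(r"\(Got status (\d+)\)")
FAIL_MARKER = "Couldn't send HTTP request to server"


def summarize_upload_failures(log_text):
    """Count failed upload attempts per (ip, port, status).

    status is the HTTP code from a "(Got status NNN)" line, or None when
    the log shows no status line for that attempt.
    """
    failures = Counter()
    pending = False  # True between a failure line and its endpoint line
    status = None
    for line in log_text.splitlines():
        if FAIL_MARKER in line:
            pending, status = True, None
            continue
        if not pending:
            continue
        match = STATUS_RE.search(line)
        if match:
            status = int(match.group(1))
            continue
        match = ENDPOINT_RE.search(line)
        if match:
            failures[(match.group(1), int(match.group(2)), status)] += 1
            pending = False
    return failures
```

Feeding it the text of FAHlog.txt and printing the resulting counter shows at a glance whether only the Collection Server is refusing uploads or the primary Work Server is failing too.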

Re: 171.67.108.25 and 171.64.65.56

Posted: Thu May 21, 2009 4:36 pm
by bruce
Welcome to the foldingforum, Prohibition.

See the answers in my previous post regarding uploading to a Collection Server. You have included information about the primary work server failing to accept the upload, and that's useful information but I'll still say that uploading problems are less significant than downloading problems. The important perspective here is that the current WU has reached 11% and is still being processed so your system is working, not waiting. What you're seeing here is the client automatically retrying an upload that failed either because a server was off-line or too busy. That's the way the client is supposed to work when there's a server problem.
Prohibition wrote:] Completed 27510 out of 250000 steps (11%)
[20:05:17] - Couldn't send HTTP request to server
[20:05:17] + Could not connect to Work Server (results)
[20:05:17] (171.64.65.56:8080)
[20:05:18] - Error: Could not transmit unit 04 (completed May 20) to work server.
Once the current WU passes 100% and the client tries to get new work again, you may have the same problem as patonb, but you should face that problem if and when it happens. Hopefully it will be fixed by then. In either case, there's nothing you can do about it, since the problems are at Stanford.
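
To make the retry behaviour concrete, here is a minimal Python sketch of the fallback order that the more verbose logs in this thread appear to show: primary Work Server on port 8080, then port 80, then the Collection Server on the same two ports, and otherwise the unit stays in the queue. This is inferred from the logs only and is not the client's actual code. The names try_upload and send are hypothetical.

```python
def try_upload(send, work_server, collection_server, ports=(8080, 80)):
    """Try each server and port in order.

    `send(host, port)` is a hypothetical callable that returns True when
    the upload is accepted. Returns the (host, port) that accepted the
    upload, or None if every attempt failed and the unit should be kept
    in the queue for a later retry.
    """
    for host in (work_server, collection_server):
        for port in ports:
            if send(host, port):
                return (host, port)
    return None
```

A None result corresponds to the "keeping in queue" lines in the logs: nothing is lost, and the same sequence is simply attempted again later.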

Re: 171.67.108.25 and 171.64.65.56

Posted: Fri May 22, 2009 8:18 pm
by Prohibition
OK. I still can't send that to the Collection Server. I uninstalled the program, reinstalled it, and tried again. As for more of the log, I don't know how much you need. Here is more from the latest failed send.

[13:43:50] + Attempting to send results
[13:45:10] - Couldn't send HTTP request to server
[13:45:10] + Could not connect to Work Server (results)
[13:45:10] (171.64.65.56:8080)
[13:45:10] - Error: Could not transmit unit 01 (completed May 22) to work server.
[13:45:10] Keeping unit 01 in queue.


[13:45:10] + Attempting to send results
[13:46:00] - Couldn't send HTTP request to server
[13:46:00] + Could not connect to Work Server (results)
[13:46:00] (171.64.65.56:8080)
[13:46:00] - Error: Could not transmit unit 01 (completed May 22) to work server.


[13:46:00] + Attempting to send results
[13:46:00] - Couldn't send HTTP request to server
[13:46:00] + Could not connect to Work Server (results)
[13:46:00] (171.67.108.25:8080)
[13:46:00] Could not transmit unit 01 to Collection server; keeping in queue.
[13:46:00] - Preparing to get new work unit...
[13:46:00] + Attempting to get work packet
[13:46:00] - Connecting to assignment server
[13:46:01] - Successful: assigned to (171.64.65.56)

Re: 171.67.108.25 and 171.64.65.56

Posted: Fri May 22, 2009 8:44 pm
by bruce
Prohibition wrote:OK. I still can't send that to the Collection Server. I uninstalled the program, reinstalled it, and tried again. As for more of the log, I don't know how much you need. Here is more from the latest failed send.

[13:43:50] + Attempting to send results
[13:45:10] - Couldn't send HTTP request to server
[13:45:10] + Could not connect to Work Server (results)
[13:45:10] (171.64.65.56:8080)
[13:45:10] - Error: Could not transmit unit 01 (completed May 22) to work server.
[13:45:10] Keeping unit 01 in queue.


[13:45:10] + Attempting to send results
[13:46:00] - Couldn't send HTTP request to server
[13:46:00] + Could not connect to Work Server (results)
[13:46:00] (171.64.65.56:8080)
[13:46:00] - Error: Could not transmit unit 01 (completed May 22) to work server.


[13:46:00] + Attempting to send results
[13:46:00] - Couldn't send HTTP request to server
[13:46:00] + Could not connect to Work Server (results)
[13:46:00] (171.67.108.25:8080)
[13:46:00] Could not transmit unit 01 to Collection server; keeping in queue.
[13:46:00] - Preparing to get new work unit...
[13:46:00] + Attempting to get work packet
[13:46:00] - Connecting to assignment server
[13:46:01] - Successful: assigned to (171.64.65.56)
Reinstalling rarely helps. The Collection Server at 171.67.108.25 is perpetually overloaded, though it does process a limited number of uploads, so that's not worth discussing. The real question is what is happening with the primary Work Server at 171.64.65.56. It's pretty busy right now, with a NetLoad of 105, but the client will probably upload to it soon.

It looks like FAH has been able to connect to that server so it's probably not a firewall problem, but it never hurts to turn your security software off temporarily to see. Also, you should confirm that you can connect to http://171.64.65.56:8080/ with your browser.
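
If a browser isn't handy (for example on a headless Linux box), the same check can be approximated from Python's standard library. This is a sketch only: the helper name check_server is made up for illustration, and the opener parameter exists just so the function can be exercised without touching the network. It reports whether the server answered at all and with which HTTP status, but it does not inspect the page contents.

```python
import urllib.error
import urllib.request


def check_server(url, timeout=10, opener=urllib.request.urlopen):
    """Return a short description of how `url` responds to an HTTP GET."""
    try:
        with opener(url, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 503.
        return f"server answered with HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # No HTTP answer at all: refused, timed out, or blocked on the way.
        return f"no connection ({err})"
```

Calling check_server("http://171.64.65.56:8080/") from the machine that is folding gives a quick hint about where to look. An HTTP error code means the request reached the server, so a local firewall is an unlikely cause. A "no connection" result from several machines at once suggests the server itself is unreachable.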

Re: 171.67.108.25 and 171.64.65.56

Posted: Sun May 24, 2009 7:31 pm
by BrokenWolf
Looks like there may be another issue with these servers. The following is from one of my 2 systems that cannot upload their WUs or get new ones. When I go to the servers through my web browser, I do not get connected, so I do not get the OK on the page.

Happy Memorial Day Weekend.

BrokenWolf


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/timk/Folding
Executable: ./fah6
Arguments: -smp -verbosity 9 -smp -verbosity 9 

[19:24:35] - Ask before connecting: No
[19:24:35] - User name: BrokenWolf (Team 1971)
[19:24:35] - User ID: 22A5CFF206376BBE
[19:24:35] - Machine ID: 2
[19:24:35] 
[19:24:35] Loaded queue successfully.
[19:24:35] - Preparing to get new work unit...
[19:24:35] + Attempting to get work packet
[19:24:35] - Will indicate memory of 1505 MB
[19:24:35] - Connecting to assignment server
[19:24:35] Connecting to http://assign.stanford.edu:8080/
[19:24:35] - Autosending finished units... [May 24 19:24:35 UTC]
[19:24:35] Trying to send all finished work units
[19:24:35] Project: 2669 (Run 14, Clone 162, Gen 58)


[19:24:35] + Attempting to send results [May 24 19:24:35 UTC]
[19:24:35] - Reading file work/wuresults_02.dat from core
[19:24:36] Posted data.
[19:24:37] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[19:24:37] + Couldn't get work instructions.
[19:24:37] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[19:24:37]   (Read 25922859 bytes from disk)
[19:24:37] Connecting to http://171.64.65.56:8080/
[19:24:38] - Couldn't send HTTP request to server
[19:24:38] + Could not connect to Work Server (results)
[19:24:38]     (171.64.65.56:8080)
[19:24:38] + Retrying using alternative port
[19:24:38] Connecting to http://171.64.65.56:80/
[19:24:38] - Couldn't send HTTP request to server
[19:24:38] + Could not connect to Work Server (results)
[19:24:38]     (171.64.65.56:80)
[19:24:38] - Error: Could not transmit unit 02 (completed May 24) to work server.
[19:24:38] - 3 failed uploads of this unit.


[19:24:38] + Attempting to send results [May 24 19:24:38 UTC]
[19:24:38] - Reading file work/wuresults_02.dat from core
[19:24:38]   (Read 25922859 bytes from disk)
[19:24:38] Connecting to http://171.67.108.25:8080/
[19:24:38] - Couldn't send HTTP request to server
[19:24:38]   (Got status 503)
[19:24:38] + Could not connect to Work Server (results)
[19:24:38]     (171.67.108.25:8080)
[19:24:38] + Retrying using alternative port
[19:24:38] Connecting to http://171.67.108.25:80/
[19:24:38] - Couldn't send HTTP request to server
[19:24:38]   (Got status 503)
[19:24:38] + Could not connect to Work Server (results)
[19:24:38]     (171.67.108.25:80)
[19:24:38]   Could not transmit unit 02 to Collection server; keeping in queue.
[19:24:38] + Sent 0 of 1 completed units to the server
[19:24:38] - Autosend completed

Re: 171.67.108.25 and 171.64.65.56

Posted: Sun May 24, 2009 7:49 pm
by Thorsten_Q.
At the moment 2 of my systems can't send or receive a WU. I have the same log as already posted by others...


[21:03:02] Completed 247506 out of 250000 steps  (99%)
[21:10:32] Completed 250000 out of 250000 steps  (100%)
[21:10:33] DynamicWrapper: Finished Work Unit: sleep=10000
[21:10:43] 
[21:10:43] Finished Work Unit:
[21:10:43] - Reading up to 21128112 from "work/wudata_02.trr": Read 21128112
[21:10:43] trr file hash check passed.
[21:10:43] - Reading up to 4431108 from "work/wudata_02.xtc": Read 4431108
[21:10:43] xtc file hash check passed.
[21:10:43] edr file hash check passed.
[21:10:43] logfile size: 182695
[21:10:43] Leaving Run
[21:10:47] - Writing 25887011 bytes of core data to disk...
[21:10:48]   ... Done.
[21:10:48] - Shutting down core
[21:10:48] 
[21:10:48] Folding@home Core Shutdown: FINISHED_UNIT
[21:14:08] CoreStatus = 64 (100)
[21:14:08] Sending work to server


[21:14:08] + Attempting to send results
[21:14:08] - Couldn't send HTTP request to server
[21:14:08] + Could not connect to Work Server (results)
[21:14:08]     (171.64.65.56:8080)
[21:14:08] - Error: Could not transmit unit 02 (completed May 24) to work server.
[21:14:08]   Keeping unit 02 in queue.


[21:14:08] + Attempting to send results
[21:14:09] - Couldn't send HTTP request to server
[21:14:09] + Could not connect to Work Server (results)
[21:14:09]     (171.64.65.56:8080)
[21:14:09] - Error: Could not transmit unit 02 (completed May 24) to work server.


[21:14:09] + Attempting to send results
[21:14:19] - Couldn't send HTTP request to server
[21:14:19] + Could not connect to Work Server (results)
[21:14:19]     (171.67.108.25:8080)
[21:14:19]   Could not transmit unit 02 to Collection server; keeping in queue.
[21:14:19] - Preparing to get new work unit...
[21:14:19] + Attempting to get work packet
[21:14:19] - Connecting to assignment server
[21:14:20] + No appropriate work server was available; will try again in a bit.
[21:14:20] + Couldn't get work instructions.
[21:14:20] - Attempt #1  to get work failed, and no other work to do.
             Waiting before retry.
And I can't connect to http://171.64.65.56:8080/ ...

Re: 171.67.108.25 and 171.64.65.56

Posted: Sun May 24, 2009 9:19 pm
by codysluder
Thorsten_Q. wrote:And I can't connect to http://171.64.65.56:8080/ ...
I believe this has been fixed. I can connect to it.