
171.64.65.56 and 171.67.108.25

Posted: Sat Mar 28, 2009 1:33 pm
by ikerekes

Code:

[13:28:44] - Autosending finished units... [March 28 13:28:44 UTC]
[13:28:44] Trying to send all finished work units
[13:28:44] Project: 2669 (Run 0, Clone 42, Gen 94)


[13:28:44] + Attempting to send results [March 28 13:28:44 UTC]
[13:28:44] - Reading file work/wuresults_01.dat from core
[13:28:44]   (Read 25980593 bytes from disk)
[13:28:44] Connecting to http://171.64.65.56:8080/
[13:28:44] - Couldn't send HTTP request to server
[13:28:44] + Could not connect to Work Server (results)
[13:28:44]     (171.64.65.56:8080)
[13:28:44] + Retrying using alternative port
[13:28:44] Connecting to http://171.64.65.56:80/
[13:28:44] - Couldn't send HTTP request to server
[13:28:44]   (Got status 503)
[13:28:44] + Could not connect to Work Server (results)
[13:28:44]     (171.64.65.56:80)
[13:28:44] - Error: Could not transmit unit 01 (completed March 28) to work server.
[13:28:44] - 4 failed uploads of this unit.


[13:28:44] + Attempting to send results [March 28 13:28:44 UTC]
[13:28:44] - Reading file work/wuresults_01.dat from core
[13:28:44]   (Read 25980593 bytes from disk)
[13:28:44] Connecting to http://171.67.108.25:8080/
[13:28:45] - Couldn't send HTTP request to server
[13:28:45] + Could not connect to Work Server (results)
[13:28:45]     (171.67.108.25:8080)
[13:28:45] + Retrying using alternative port
[13:28:45] Connecting to http://171.67.108.25:80/
[13:28:45] - Couldn't send HTTP request to server
[13:28:45] + Could not connect to Work Server (results)
[13:28:45]     (171.67.108.25:80)
[13:28:45]   Could not transmit unit 01 to Collection server; keeping in queue.
[13:28:45] + Sent 0 of 1 completed units to the server
[13:28:45] - Autosend completed
This has been happening for the last 4 hours :(

Re: 171.64.65.56 - 171.67.108.25

Posted: Sat Mar 28, 2009 3:57 pm
by dschief
Looks like 171.64.65.56 has been back up since 08:55 West Coast time.
Finally got an upload through after 7 tries, and got an A1 core in return.

Re: 171.64.65.56 - 171.67.108.25

Posted: Sat Mar 28, 2009 4:13 pm
by Pick2
I'm still having the same problem as ikerekes.
Some results go up and some new WUs come down, but most are not going up or down.
Net load is 150, and CPU usage is up/yellow.

Re: 171.64.65.56 - 171.67.108.25

Posted: Sat Mar 28, 2009 11:29 pm
by toTOW
I've just notified the Pande Group about this issue.

Re: 171.64.65.56 - 171.67.108.25

Posted: Sun Mar 29, 2009 12:54 am
by VijayPande
We're keeping an eye on this. For now, we've restarted the server, which usually helps shed some issues.

Re: 171.64.65.56 - 171.67.108.25

Posted: Mon Mar 30, 2009 4:14 pm
by Shadowtester
Looks like 171.64.65.56 is having problems again this morning. I have a client that has been assigned to this server and keeps getting the following error:

Code:

[15:58:37] + Attempting to get work packet
[15:58:37] - Will indicate memory of 2048 MB
[15:58:37] - Connecting to assignment server
[15:58:37] Connecting to http://assign.stanford.edu:8080/
[15:58:38] Posted data.
[15:58:38] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[15:58:38] + News From Folding@Home: Welcome to Folding@Home
[15:58:38] Loaded queue successfully.
[15:58:38] Connecting to http://171.64.65.56:8080/
[15:58:38] - Couldn't send HTTP request to server
[15:58:38]   (Got status 503)
[15:58:38] + Could not connect to Work Server
[15:58:38] - Attempt #5  to get work failed, and no other work to do.
Waiting before retry.

Re: 171.64.65.56 - 171.67.108.25

Posted: Mon Mar 30, 2009 4:28 pm
by kasson
We're doing a bit of server maintenance, so you may see the server go down for ~5-minute periods this morning. Nothing longer expected, though.

Re: 171.64.65.56 - 171.67.108.25

Posted: Mon Mar 30, 2009 4:46 pm
by Shadowtester
Finally got a new work unit, but I think your 5 minutes was a little understated. I started to get the 503 errors at 15:40:08 and did not receive a new work unit until 16:40:37, so I would say 1 hour would have been a closer downtime estimate. ;)

Re: 171.64.65.56 - 171.67.108.25

Posted: Mon Mar 30, 2009 6:22 pm
by ikerekes
Shadowtester wrote: Finally got a new work unit, but I think your 5 minutes was a little understated. I started to get the 503 errors at 15:40:08 and did not receive a new work unit until 16:40:37, so I would say 1 hour would have been a closer downtime estimate. ;)
In my case, almost 3 hours :(

Code:

[13:24:31] + Attempting to send results [March 30 13:24:31 UTC]
[13:24:31] - Reading file work/wuresults_05.dat from core
[13:24:31]   (Read 49228043 bytes from disk)
[13:24:31] Connecting to http://171.64.65.56:8080/
[13:24:31] - Couldn't send HTTP request to server
[13:24:31]   (Got status 503)
[13:24:31] + Could not connect to Work Server (results)
[13:24:31]     (171.64.65.56:8080)
[13:24:31] + Retrying using alternative port
[13:24:31] Connecting to http://171.64.65.56:80/
[13:24:31] - Couldn't send HTTP request to server
[13:24:31]   (Got status 503)
[13:24:31] + Could not connect to Work Server (results)
[13:24:31]     (171.64.65.56:80)
[13:24:31] - Error: Could not transmit unit 05 (completed March 30) to work server.
[13:24:31] - 1 failed uploads of this unit.
[13:24:31]   Keeping unit 05 in queue.
[13:24:31] Trying to send all finished work units
[13:24:31] Project: 2677 (Run 39, Clone 92, Gen 0)


[13:24:31] + Attempting to send results [March 30 13:24:31 UTC]
[13:24:31] - Reading file work/wuresults_05.dat from core
[13:24:31]   (Read 49228043 bytes from disk)
[13:24:31] Connecting to http://171.64.65.56:8080/
[13:24:31] - Couldn't send HTTP request to server
[13:24:31] + Could not connect to Work Server (results)
[13:24:31]     (171.64.65.56:8080)
[13:24:31] + Retrying using alternative port
[13:24:31] Connecting to http://171.64.65.56:80/
[13:24:32] - Couldn't send HTTP request to server
[13:24:32]   (Got status 503)
[13:24:32] + Could not connect to Work Server (results)
[13:24:32]     (171.64.65.56:80)
[13:24:32] - Error: Could not transmit unit 05 (completed March 30) to work server.
[13:24:32] - 2 failed uploads of this unit.


[13:24:32] + Attempting to send results [March 30 13:24:32 UTC]
[13:24:32] - Reading file work/wuresults_05.dat from core
[13:24:32]   (Read 49228043 bytes from disk)
[13:24:32] Connecting to http://171.67.108.25:8080/
[13:38:09] Posted data.
[13:38:09] Initial: 0000; - Uploaded at ~58 kB/s
[13:38:09] - Averaged speed for that direction ~52 kB/s
[13:38:09] + Results successfully sent
[13:38:09] Thank you for your contribution to Folding@Home.
[13:38:09] + Number of Units Completed: 194

[13:38:09]   Successfully sent unit 05 to Collection server.
[13:38:10] + Sent 1 of 1 completed units to the server
[13:38:10] - Preparing to get new work unit...
[13:38:10] + Attempting to get work packet
[13:38:10] - Will indicate memory of 2004 MB
[13:38:10] - Connecting to assignment server
[13:38:10] Connecting to http://assign.stanford.edu:8080/
[13:38:11] Posted data.
[13:38:11] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:38:11] + News From Folding@Home: Welcome to Folding@Home
[13:38:11] Loaded queue successfully.
[13:38:11] Connecting to http://171.64.65.56:8080/
[13:38:11] - Couldn't send HTTP request to server
[13:38:11]   (Got status 503)
[13:38:11] + Could not connect to Work Server
[13:38:11] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[13:38:25] + Attempting to get work packet
[13:38:25] - Will indicate memory of 2004 MB
[13:38:25] - Connecting to assignment server
[13:38:25] Connecting to http://assign.stanford.edu:8080/
[13:38:25] Posted data.
[13:38:25] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:38:25] + News From Folding@Home: Welcome to Folding@Home
[13:38:25] Loaded queue successfully.
[13:38:25] Connecting to http://171.64.65.56:8080/
[13:38:25] - Couldn't send HTTP request to server
[13:38:25] + Could not connect to Work Server
[13:38:25] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[13:38:38] + Attempting to get work packet
[13:38:38] - Will indicate memory of 2004 MB
[13:38:38] - Connecting to assignment server
[13:38:38] Connecting to http://assign.stanford.edu:8080/
[13:38:38] Posted data.
[13:38:38] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:38:38] + News From Folding@Home: Welcome to Folding@Home
[13:38:38] Loaded queue successfully.
[13:38:38] Connecting to http://171.64.65.56:8080/
[13:38:38] - Couldn't send HTTP request to server
[13:38:38]   (Got status 503)
[13:38:38] + Could not connect to Work Server
[13:38:38] - Attempt #3  to get work failed, and no other work to do.
Waiting before retry.
[13:39:12] + Attempting to get work packet
[13:39:12] - Will indicate memory of 2004 MB
[13:39:12] - Connecting to assignment server
[13:39:12] Connecting to http://assign.stanford.edu:8080/
[13:39:12] Posted data.
[13:39:12] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:39:12] + News From Folding@Home: Welcome to Folding@Home
[13:39:12] Loaded queue successfully.
[13:39:12] Connecting to http://171.64.65.56:8080/
[13:39:13] - Couldn't send HTTP request to server
[13:39:13]   (Got status 503)
[13:39:13] + Could not connect to Work Server
[13:39:13] - Attempt #4  to get work failed, and no other work to do.
Waiting before retry.
[13:40:05] + Attempting to get work packet
[13:40:05] - Will indicate memory of 2004 MB
[13:40:05] - Connecting to assignment server
[13:40:05] Connecting to http://assign.stanford.edu:8080/
[13:40:05] Posted data.
[13:40:05] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:40:05] + News From Folding@Home: Welcome to Folding@Home
[13:40:06] Loaded queue successfully.
[13:40:06] Connecting to http://171.64.65.56:8080/
[13:40:06] - Couldn't send HTTP request to server
[13:40:06] + Could not connect to Work Server
[13:40:06] - Attempt #5  to get work failed, and no other work to do.
Waiting before retry.
[13:41:33] + Attempting to get work packet
[13:41:33] - Will indicate memory of 2004 MB
[13:41:33] - Connecting to assignment server
[13:41:33] Connecting to http://assign.stanford.edu:8080/
[13:41:33] Posted data.
[13:41:33] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:41:33] + News From Folding@Home: Welcome to Folding@Home
[13:41:33] Loaded queue successfully.
[13:41:33] Connecting to http://171.64.65.56:8080/
[13:41:33] - Couldn't send HTTP request to server
[13:41:33] + Could not connect to Work Server
[13:41:33] - Attempt #6  to get work failed, and no other work to do.
Waiting before retry.
[13:44:16] + Attempting to get work packet
[13:44:16] - Will indicate memory of 2004 MB
[13:44:16] - Connecting to assignment server
[13:44:16] Connecting to http://assign.stanford.edu:8080/
[13:44:17] Posted data.
[13:44:17] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:44:17] + News From Folding@Home: Welcome to Folding@Home
[13:44:17] Loaded queue successfully.
[13:44:17] Connecting to http://171.64.65.56:8080/
[13:44:17] - Couldn't send HTTP request to server
[13:44:17] + Could not connect to Work Server
[13:44:17] - Attempt #7  to get work failed, and no other work to do.
Waiting before retry.
[13:49:47] + Attempting to get work packet
[13:49:47] - Will indicate memory of 2004 MB
[13:49:47] - Connecting to assignment server
[13:49:47] Connecting to http://assign.stanford.edu:8080/
[13:49:48] Posted data.
[13:49:48] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[13:49:48] + News From Folding@Home: Welcome to Folding@Home
[13:49:48] Loaded queue successfully.
[13:49:48] Connecting to http://171.64.65.56:8080/
[13:49:48] - Couldn't send HTTP request to server
[13:49:48] + Could not connect to Work Server
[13:49:48] - Attempt #8  to get work failed, and no other work to do.
Waiting before retry.
[14:00:42] + Attempting to get work packet
[14:00:42] - Will indicate memory of 2004 MB
[14:00:42] - Connecting to assignment server
[14:00:42] Connecting to http://assign.stanford.edu:8080/
[14:00:43] Posted data.
[14:00:43] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[14:00:43] + News From Folding@Home: Welcome to Folding@Home
[14:00:43] Loaded queue successfully.
[14:00:43] Connecting to http://171.64.65.56:8080/
[14:00:43] - Couldn't send HTTP request to server
[14:00:43]   (Got status 503)
[14:00:43] + Could not connect to Work Server
[14:00:43] - Attempt #9  to get work failed, and no other work to do.
Waiting before retry.
[14:22:05] + Attempting to get work packet
[14:22:05] - Will indicate memory of 2004 MB
[14:22:05] - Connecting to assignment server
[14:22:05] Connecting to http://assign.stanford.edu:8080/
[14:22:05] Posted data.
[14:22:05] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[14:22:05] + News From Folding@Home: Welcome to Folding@Home
[14:22:05] Loaded queue successfully.
[14:22:05] Connecting to http://171.64.65.56:8080/
[14:22:05] - Couldn't send HTTP request to server
[14:22:05]   (Got status 503)
[14:22:05] + Could not connect to Work Server
[14:22:05] - Attempt #10  to get work failed, and no other work to do.
Waiting before retry.
[15:04:51] + Attempting to get work packet
[15:04:51] - Will indicate memory of 2004 MB
[15:04:51] - Connecting to assignment server
[15:04:51] Connecting to http://assign.stanford.edu:8080/
[15:04:51] Posted data.
[15:04:51] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[15:04:51] + News From Folding@Home: Welcome to Folding@Home
[15:04:51] Loaded queue successfully.
[15:04:51] Connecting to http://171.64.65.56:8080/
[15:04:51] - Couldn't send HTTP request to server
[15:04:51] + Could not connect to Work Server
[15:04:51] - Attempt #11  to get work failed, and no other work to do.
Waiting before retry.
[15:42:32] ***** Got a SIGTERM signal (15)
[15:42:32] Killing all core threads
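
The waits between the failed "get work" attempts in the log above get longer each time. A minimal sketch for measuring them from a pasted log, assuming the timestamp and "Attempt #N" line format shown here (the function name `retry_gaps` is made up for illustration, and this is not the client's own code):

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# [13:38:11] - Attempt #1  to get work failed, and no other work to do.
ATTEMPT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] - Attempt #(\d+)\s+to get work failed"
)

def retry_gaps(log_text):
    """Return [(attempt_number, seconds_since_previous_attempt), ...]."""
    stamps = []
    for line in log_text.splitlines():
        m = ATTEMPT_RE.match(line.strip())
        if m:
            when = datetime.strptime(m.group(1), "%H:%M:%S")
            stamps.append((int(m.group(2)), when))

    gaps = []
    for (_, prev), (num, cur) in zip(stamps, stamps[1:]):
        delta = cur - prev
        if delta < timedelta(0):
            # The log only carries a time of day, so assume a midnight rollover.
            delta += timedelta(days=1)
        gaps.append((num, int(delta.total_seconds())))
    return gaps
```

Run against the log above, the gaps grow from about 14 seconds between attempts #1 and #2 to about 43 minutes between attempts #10 and #11, roughly doubling over the later attempts.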

Re: 171.64.65.56 - 171.67.108.25

Posted: Wed Apr 01, 2009 9:41 am
by toTOW
Net load is back to 200 for 171.64.65.56, and some people are having trouble connecting to it :(

Re: 171.64.65.56 - 171.67.108.25

Posted: Wed Apr 01, 2009 2:53 pm
by weedacres
171.64.65.56 is causing a lot of upload problems. I'm having to hand-tend 3 machines that stall when trying to upload results. No errors, just stuck at:

[12:39:08] + Attempting to send results [April 1 12:39:08 UTC]
[12:39:08] - Reading file work/wuresults_07.dat from core
[12:39:11] (Read 26710802 bytes from disk)
[12:39:11] Connecting to http://171.64.65.56:8080/

Does anyone know the status of getting this fixed?

Re: 171.64.65.56 - 171.67.108.25

Posted: Wed Apr 01, 2009 2:59 pm
by kasson
There's a lot of traffic on the server right now. I'm watching the log--work units are coming in and out, so it's not stuck.

Re: 171.64.65.56 - 171.67.108.25

Posted: Wed Apr 01, 2009 4:30 pm
by weedacres
kasson wrote:There's a lot of traffic on the server right now. I'm watching the log--work units are coming in and out, so it's not stuck.
I just tried to send the stuck unit again and it worked fine. I got up this morning to find one machine that had been stalled for close to 4 hours. Once stalled, it doesn't appear able to recover without Ctrl-C and restarting the WU.
This has been going on for a couple of days, all on A2 cores running fah6.static under a VM.

Re: 171.64.65.56 - 171.67.108.25

Posted: Wed Apr 01, 2009 8:00 pm
by chluk2425
kasson wrote:There's a lot of traffic on the server right now. I'm watching the log--work units are coming in and out, so it's not stuck.
The problem is that once it gets stuck, it will not reconnect by itself; it stays there seemingly forever (8 hrs).
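
For illustration only, here is one general way an upload can be kept from hanging indefinitely: give the request an explicit timeout so that a silent server produces an error the caller can retry on. This is a sketch using Python's standard library, not the FAH client's actual code; `try_upload` and the 30-second default are assumptions made up for the example.

```python
import urllib.error
import urllib.request

def try_upload(url, payload, timeout_s=30.0):
    """POST payload to url.

    Returns the HTTP status code (including error codes such as 503),
    or None if the server could not be reached or did not answer
    within timeout_s seconds.
    """
    req = urllib.request.Request(url, data=payload, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 503).
        return err.code
    except OSError:
        # Covers URLError (refused, unreachable) and socket timeouts.
        return None
```

A caller that gets `None` or 503 back can keep the unit in the queue and try again later instead of waiting on a connection that never completes.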

Re: 171.64.65.56 - 171.67.108.25

Posted: Fri Apr 10, 2009 1:52 pm
by ikerekes
Just an FYI: 171.64.65.56 is in reject status, and 171.67.108.25 is returning 503 again.