171.67.108.25 down -resolved

Posted: Sun Mar 20, 2011 10:49 pm
by Jeannie
This collection server has been down since early Friday morning. I have a completed uniprocessor result to send, and another uniprocessor workunit destined for this collection server will complete within the next day. Any news?

Re: 171.67.108.25 down

Posted: Sun Mar 20, 2011 11:08 pm
by HendricksSA
Possibly related to this. http://folding.typepad.com/news/2011/03 ... vsp22.html
The server status page shows it down right now ... along with several others.

Re: 171.67.108.25 down

Posted: Sun Mar 20, 2011 11:10 pm
by Jeannie
Except this server wasn't listed in that news item, and according to the news item the problem with those other servers is resolved.

Re: 171.67.108.25 down

Posted: Sun Mar 20, 2011 11:19 pm
by HendricksSA
Stats page here: http://fah-web.stanford.edu/serverstat.html
This CS is listed in the troubleshooting guide at: viewtopic.php?f=18&t=17794 and it should help you. Your queue should be able to hold these for a bit until the connection problems are resolved.

Re: 171.67.108.25 down

Posted: Mon Mar 21, 2011 1:07 am
by Jeannie

Code:

00:54:00:Sending unit results: id:01 state:SEND project:6517 run:15 clone:170 gen:45 core:0x78 unit:0x4c1325354d8461f7002d00aa000f1975
00:54:01:Unit 01: Uploading 4.14KiB
00:54:01:Connecting to 171.64.65.62:8080
00:54:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet
00:54:01:Trying to send results to collection server
00:54:01:Unit 01: Uploading 4.14KiB
00:54:01:Connecting to 171.67.108.25:8080
00:54:05:Unit 02:(Starting from checkpoint)
00:54:05:Unit 02:Protein: 1CFC_A_8 in water
00:54:05:Unit 02:
00:54:05:Unit 02:Writing local files
00:54:05:Unit 02:Completed 52500 out of 250000 steps  (21%)
00:54:22:WARNING: WorkServer connection failed on port 8080 trying 80
00:54:22:Connecting to 171.67.108.25:80
00:54:43:ERROR: Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:55:00:Sending unit results: id:01 state:SEND project:6517 run:15 clone:170 gen:45 core:0x78 unit:0x4c1325354d8461f7002d00aa000f1975
00:55:01:Unit 01: Uploading 4.14KiB
00:55:01:Connecting to 171.64.65.62:8080
00:55:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet
00:55:01:Trying to send results to collection server
00:55:01:Unit 01: Uploading 4.14KiB
00:55:01:Connecting to 171.67.108.25:8080
So really my problem is with work server 171.64.65.62. The Stats page says it is 'accepting', and if I try this URL in my browser I get the OK response. But if you look at the first few lines above, you'll see I get "00:54:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet".

I rebooted my computer to make sure that I didn't have an internet connection problem, even though I hadn't seen any indications of such a problem.

Is this possibly a problem with my client rather than with the server? (I have successfully returned a different workunit after completing 6517 (15, 170, 45).)
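
For anyone wanting to repeat the browser check from a script, here is a minimal sketch using only Python's standard socket module. The helper name can_connect is hypothetical and not part of any Folding@home tool. It only tests whether a TCP connection opens, so a True result shows the port is reachable, not that the server will accept an upload.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("171.64.65.62", 8080) and can_connect("171.67.108.25", 8080) would cover the two servers and the first port the client tried in the log; port 80 is the fallback it tried afterwards.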

Re: 171.67.108.25 down -resolved

Posted: Mon Mar 21, 2011 1:21 am
by Jeannie
I looked back in a prior log and found that I had a message that 6517(15, 170, 45) was detected as a bad work unit when it was 29% complete. That MIGHT have been around the time I upgraded to a new client, so I assume this one was 'user error'. I've deleted the work folder for it.
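
Searching older logs for one work unit is easier if the project, run, clone and gen numbers are pulled out of each line. A rough sketch, assuming the log lines use the same "project:... run:... clone:... gen:..." layout as the "Sending unit results" lines posted earlier; the function name parse_prcg is made up for illustration.

```python
import re

# Matches the "project:N run:N clone:N gen:N" fields seen in the posted log lines.
PRCG_PATTERN = re.compile(r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)")


def parse_prcg(line):
    """Return (project, run, clone, gen) as integers, or None if the line has no such fields."""
    match = PRCG_PATTERN.search(line)
    if match is None:
        return None
    return tuple(int(value) for value in match.groups())
```

Filtering a log file for lines where parse_prcg returns (6517, 15, 170, 45) would then show every entry for that work unit.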

Re: 171.67.108.25 down -resolved

Posted: Mon Mar 21, 2011 7:38 am
by HendricksSA
Hard to know exactly how this one went. Something was not working perfectly since the client had to try the CS ... but at the same time it was a bad work unit. Since you could get the "OK" in your browser, I would just chalk it up to a bad combination of events and not anything you did. I suspect your client is ok. Let it run another and see what happens. If it fails again, we are here. Good job getting the essentials posted.