171.67.108.25 down -resolved

Moderators: Site Moderators, FAHC Science Team

Jeannie
Posts: 49
Joined: Sun Dec 02, 2007 3:07 am
Location: Central New Jersey

171.67.108.25 down -resolved

Post by Jeannie »

This collection server has been down since early Friday morning. I have a completed uniprocessor result to send, and another uniprocessor work unit, also destined for this collection server, will complete within the next day. Any news?
Last edited by Jeannie on Mon Mar 21, 2011 1:39 am, edited 1 time in total.
HendricksSA
Posts: 336
Joined: Fri Jun 26, 2009 4:34 am

Re: 171.67.108.25 down

Post by HendricksSA »

Possibly related to this: http://folding.typepad.com/news/2011/03 ... vsp22.html
The server status page shows it down right now ... along with several others.
Jeannie
Posts: 49
Joined: Sun Dec 02, 2007 3:07 am
Location: Central New Jersey

Re: 171.67.108.25 down

Post by Jeannie »

Except this server wasn't listed in that news item, and according to the news item the problem with those other servers is resolved.
HendricksSA
Posts: 336
Joined: Fri Jun 26, 2009 4:34 am

Re: 171.67.108.25 down

Post by HendricksSA »

Stats page here: http://fah-web.stanford.edu/serverstat.html
This CS is listed in the troubleshooting guide at viewtopic.php?f=18&t=17794, which should help you. Your queue should be able to hold these results for a bit until the connection problems are resolved.
Jeannie
Posts: 49
Joined: Sun Dec 02, 2007 3:07 am
Location: Central New Jersey

Re: 171.67.108.25 down

Post by Jeannie »

Code:

00:54:00:Sending unit results: id:01 state:SEND project:6517 run:15 clone:170 gen:45 core:0x78 unit:0x4c1325354d8461f7002d00aa000f1975
00:54:01:Unit 01: Uploading 4.14KiB
00:54:01:Connecting to 171.64.65.62:8080
00:54:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet
00:54:01:Trying to send results to collection server
00:54:01:Unit 01: Uploading 4.14KiB
00:54:01:Connecting to 171.67.108.25:8080
00:54:05:Unit 02:(Starting from checkpoint)
00:54:05:Unit 02:Protein: 1CFC_A_8 in water
00:54:05:Unit 02:
00:54:05:Unit 02:Writing local files
00:54:05:Unit 02:Completed 52500 out of 250000 steps  (21%)
00:54:22:WARNING: WorkServer connection failed on port 8080 trying 80
00:54:22:Connecting to 171.67.108.25:80
00:54:43:ERROR: Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:55:00:Sending unit results: id:01 state:SEND project:6517 run:15 clone:170 gen:45 core:0x78 unit:0x4c1325354d8461f7002d00aa000f1975
00:55:01:Unit 01: Uploading 4.14KiB
00:55:01:Connecting to 171.64.65.62:8080
00:55:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet
00:55:01:Trying to send results to collection server
00:55:01:Unit 01: Uploading 4.14KiB
00:55:01:Connecting to 171.67.108.25:8080
So really my problem is with work server 171.64.65.62. The Stats page says it is 'accepting', and if I try this URL in my browser I get the OK response. But if you look at the first few lines above, you'll see I get "00:54:01:WARNING: Exception: Failed to send results to work server: Failed to read response packet".

I rebooted my computer to make sure that I didn't have an internet connection problem, even though I hadn't seen any indications of such a problem.

Is this possibly a problem with my client rather than with the server? (I have successfully returned a different work unit after completing 6517 (15, 170, 45).)
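
For anyone wanting to separate "my connection" from "the server" in a case like this, here is a minimal sketch of a reachability check in Python. It is not part of the FAH client; the function names `check_port` and `probe` are made up for illustration. It only tries to open a TCP connection to each address on ports 8080 and 80, the two ports the log above shows the client trying.

```python
import socket


def check_port(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(hosts, ports=(8080, 80), timeout=5.0):
    """Check every host/port pair and return {(host, port): reachable}."""
    return {(h, p): check_port(h, p, timeout) for h in hosts for p in ports}
```

Calling `probe(["171.64.65.62", "171.67.108.25"])` would cover the work server and collection server from the log. A successful connect only shows that the port is reachable, much like getting the OK page in a browser. It does not prove the server will accept an upload, so a "Failed to read response packet" warning can still occur when every port checks out.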
Jeannie
Posts: 49
Joined: Sun Dec 02, 2007 3:07 am
Location: Central New Jersey

Re: 171.67.108.25 down -resolved

Post by Jeannie »

I looked back in a prior log and found that I had a message that 6517(15, 170, 45) was detected as a bad work unit when it was 29% complete. That MIGHT have been around the time I upgraded to a new client, so I assume this one was 'user error'. I've deleted the work folder for it.
HendricksSA
Posts: 336
Joined: Fri Jun 26, 2009 4:34 am

Re: 171.67.108.25 down -resolved

Post by HendricksSA »

Hard to know exactly how this one went. Something was not working perfectly since the client had to try the CS ... but at the same time it was a bad work unit. Since you could get the "OK" in your browser, I would just chalk it up to a bad combination of events and not anything you did. I suspect your client is ok. Let it run another and see what happens. If it fails again, we are here. Good job getting the essentials posted.
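
If it helps to see at a glance which servers the client tried and what went wrong each time, the following is a rough Python sketch that scans log text in the format posted above. The function name `summarize_uploads` is hypothetical, and the pairing rule is only a heuristic: the first WARNING or ERROR line after a "Connecting to" line is attributed to that attempt, which could mislabel an unrelated warning from another unit.

```python
import re

# Matches lines such as "00:54:01:Connecting to 171.64.65.62:8080"
CONNECT_RE = re.compile(r"^(\d\d:\d\d:\d\d):Connecting to (\d+\.\d+\.\d+\.\d+):(\d+)$")
# Matches lines such as "00:54:01:WARNING: Exception: ..."
PROBLEM_RE = re.compile(r"^(\d\d:\d\d:\d\d):(WARNING|ERROR): (.*)$")


def summarize_uploads(log_text):
    """Return one (time, host, port, problem) tuple per connection attempt.

    problem is the text of the first WARNING/ERROR seen after the attempt
    and before the next attempt, or None if there was none.
    """
    attempts = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = CONNECT_RE.match(line)
        if m:
            attempts.append([m.group(1), m.group(2), int(m.group(3)), None])
            continue
        m = PROBLEM_RE.match(line)
        if m and attempts and attempts[-1][3] is None:
            attempts[-1][3] = m.group(3)
    return [tuple(a) for a in attempts]
```

Run over the log in this thread, it would list the attempt on 171.64.65.62:8080 first, followed by the fallbacks to 171.67.108.25 on ports 8080 and 80, each with the message that followed it.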