171.64.65.62

Moderators: Site Moderators, FAHC Science Team

Saryn
Posts: 20
Joined: Sun Dec 02, 2007 9:08 pm

171.64.65.62

Post by Saryn »

This server seems to have been down for several days now. I have a WU waiting for it, but it has been sitting in limbo :(
Last edited by Saryn on Thu Apr 28, 2011 3:59 pm, edited 1 time in total.
gwildperson
Posts: 450
Joined: Tue Dec 04, 2007 8:36 pm

Re: 171.67.108.25

Post by gwildperson »

That's a Collection Server. It has been down for months and nobody is going to do anything about it soon. You have to report the Work Server, not the Collection Server.

Read the "do this first" post at the top of this forum.
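To tell the two server roles apart in a V7 log, here is a minimal sketch. It assumes the log format quoted in this thread, where the first connection after "Sending unit results" goes to the Work Server and the connections after "Trying to send results to collection server" go to the Collection Server. The function name `classify_servers` is hypothetical and not part of the client.

```python
import re

# Matches the address in lines such as "14:41:23:Connecting to 171.64.65.62:8080"
CONNECT = re.compile(r"Connecting to (\S+)")


def classify_servers(log_lines):
    """Group the addresses contacted during an upload by server role.

    Assumption: role is inferred purely from the order of log messages.
    """
    roles = {"work": set(), "collection": set()}
    role = None
    for line in log_lines:
        if "Sending unit results" in line:
            role = "work"
        elif "Trying to send results to collection server" in line:
            role = "collection"
        else:
            match = CONNECT.search(line)
            if match and role:
                roles[role].add(match.group(1))
    return roles


# Lines taken from the log posted in this thread.
sample = [
    "14:41:23:Sending unit results: id:02 state:SEND project:6513 run:16 clone:284 gen:66 core:0x78 unit:0x35c503eb4db6fb790042011c00101971",
    "14:41:23:Connecting to 171.64.65.62:8080",
    "14:41:23:Trying to send results to collection server",
    "14:41:23:Connecting to 171.67.108.25:8080",
    "14:41:44:Connecting to 171.67.108.25:80",
]
roles = classify_servers(sample)
print("Work Server:", sorted(roles["work"]))
print("Collection Server:", sorted(roles["collection"]))
```

The address listed under "work" is the one to report.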
Saryn
Posts: 20
Joined: Sun Dec 02, 2007 9:08 pm

171.64.65.62:8080

Post by Saryn »

Excuse me, I copied the wrong IP address.

Code:

14:41:23:Sending unit results: id:02 state:SEND project:6513 run:16 clone:284 gen:66 core:0x78 unit:0x35c503eb4db6fb790042011c00101971
14:41:23:Unit 02: Uploading 3.98KiB
14:41:23:Connecting to 171.64.65.62:8080
14:41:23:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:41:23:Trying to send results to collection server
14:41:23:Unit 02: Uploading 3.98KiB
14:41:23:Connecting to 171.67.108.25:8080
14:41:44:WARNING: WorkServer connection failed on port 8080 trying 80
14:41:44:Connecting to 171.67.108.25:80
14:42:05:ERROR: Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
gwildperson
Posts: 450
Joined: Tue Dec 04, 2007 8:36 pm

Re: 171.64.65.62:8080

Post by gwildperson »

I think you're probably looking at a bug in V7 rather than a problem with that server. I've seen a number of similar V7 reports, such as viewtopic.php?f=67&t=17987. There's an open ticket covering it. So far, everybody has had to dump the WU manually.

If you check the history of that server, you'll see that it's accepting several hundred WUs per hour and although it's moderately busy, it doesn't seem to be overloaded.

Search backward in the log and see if you can find where Unit 02 ended. Was it an error or a successful completion of the WU?
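If the log is long, a short script can do the backward search. This is only a sketch: it assumes the core's exit is logged as "FahCore, running Unit NN, returned: NAME (code)", which is the format of the V7 log lines quoted in this thread. The function name `last_core_result` is hypothetical.

```python
import re

# Matches e.g. "17:55:47:FahCore, running Unit 02, returned: BAD_WORK_UNIT (114)"
RETURNED = re.compile(r"FahCore, running Unit (\d+), returned: (\w+) \((\d+)\)")


def last_core_result(log_lines, unit):
    """Scan the log from the end and return (status, code) for the given unit.

    Returns None if no matching "returned:" line is found.
    """
    for line in reversed(log_lines):
        match = RETURNED.search(line)
        if match and match.group(1) == unit:
            return match.group(2), int(match.group(3))
    return None


# Lines taken from the log posted in this thread.
sample = [
    "17:55:47:Unit 02:Folding@home Core Shutdown: EARLY_UNIT_END",
    "17:55:47:FahCore, running Unit 02, returned: BAD_WORK_UNIT (114)",
    "17:55:47:Unit 02: Uploading 3.98KiB",
]
result = last_core_result(sample, "02")
print(result)
```

To run it against a real log, read the file into a list first, for example with `open(path).read().splitlines()`, where `path` is wherever your client keeps its log.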
Saryn
Posts: 20
Joined: Sun Dec 02, 2007 9:08 pm

Re: 171.64.65.62

Post by Saryn »

Code:

17:49:35:Unit 02:Writing local files
17:49:35:Unit 02:Completed 12500 out of 250000 steps  (5%)
17:49:43:Unit 00:Completed   8000000 out of 50000000 steps (16%).
17:54:12:Unit 00:Completed   8500000 out of 50000000 steps (17%).
17:55:47:Unit 02:Gromacs cannot continue further.
17:55:47:Unit 02:Going to send back what have done.
17:55:47:Unit 02:logfile size: 9633
17:55:47:Unit 02:- Writing 10169 bytes of core data to disk...
17:55:47:Unit 02:Done: 9657 -> 3562 (compressed to 36.8 percent)
17:55:47:Unit 02:  ... Done.
17:55:47:Unit 02:
17:55:47:Unit 02:Folding@home Core Shutdown: EARLY_UNIT_END
17:55:47:FahCore, running Unit 02, returned: BAD_WORK_UNIT (114)
17:55:47:Sending unit results: id:02 state:SEND project:6513 run:16 clone:284 gen:66 core:0x78 unit:0x35c503eb4db6fb790042011c00101971
17:55:47:Unit 02: Uploading 3.98KiB
17:55:47:Connecting to 171.64.65.62:8080
17:55:47:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
17:55:47:Trying to send results to collection server
17:55:47:Unit 02: Uploading 3.98KiB
17:55:47:Connecting to 171.67.108.25:8080
17:55:47:Connecting to assign3.stanford.edu:8080
Looks like something wasn't folding right, so it terminated. Should I just delete the unit then?
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: 171.64.65.62

Post by codysluder »

Yes. See ticket #615.