Is plfah1-1.mskcc.org (140.163.4.231) down?

Posted: Wed Apr 08, 2020 1:04 pm
by bronozoj

12:29:42:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:29:42:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:29:42:WU01:FS01:Connecting to 140.163.4.231:8080
12:30:04:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
12:30:04:WU01:FS01:Connecting to 140.163.4.231:80
12:30:19:WU01:FS01:Upload 0.50%
12:31:01:WU01:FS01:Upload 0.99%
12:31:01:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
12:31:19:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:31:19:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:31:19:WU01:FS01:Connecting to 140.163.4.231:8080
12:32:02:WU01:FS01:Upload 0.99%
12:32:02:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
12:33:56:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:33:56:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:33:56:WU01:FS01:Connecting to 140.163.4.231:8080
12:35:36:WU01:FS01:Upload 0.99%
12:35:36:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
12:38:11:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:38:11:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:38:11:WU01:FS01:Connecting to 140.163.4.231:8080
12:38:32:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
12:38:32:WU01:FS01:Connecting to 140.163.4.231:80
12:38:53:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
12:45:02:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:45:02:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:45:02:WU01:FS01:Connecting to 140.163.4.231:8080
12:45:23:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
12:45:23:WU01:FS01:Connecting to 140.163.4.231:80
12:45:44:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
12:56:07:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11748 run:0 clone:5487 gen:10 core:0x22 unit:0x0000001a8ca304e75e6bafe323dd07eb
12:56:07:WU01:FS01:Uploading 12.58MiB to 140.163.4.231
12:56:07:WU01:FS01:Connecting to 140.163.4.231:8080
12:56:28:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
12:56:28:WU01:FS01:Connecting to 140.163.4.231:80
12:56:50:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
This suggests that the work server is either unreachable or not responding. Other applications work fine, and connections to other FAH servers succeed.
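For anyone who wants to check this outside the client, here is a minimal sketch that tries the same two ports the log shows (8080 first, then 80). The function name `probe` and the 5-second timeout are my own choices, not anything from the FAH client. A successful TCP connection also does not prove that an upload would complete, since the log above shows transfers stalling after the connection was made.

```python
import socket

def probe(host, ports=(8080, 80), timeout=5.0):
    """Try a plain TCP connection to each port in order.

    Returns a dict mapping port -> True if the connection was
    established, False if it was refused or timed out.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling `probe("140.163.4.231")` from the affected machine and from a second network would show whether the failure is local or on the server side.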

Re: Is plfah1-1.mskcc.org (140.163.4.231) down?

Posted: Wed Apr 08, 2020 3:27 pm
by Neil-B
The server stats page (https://apps.foldingathome.org/serverstats) isn't showing it as down at the moment, and it looks like it still has enough storage, so it might just be overloaded right now.

Re: Is plfah1-1.mskcc.org (140.163.4.231) down?

Posted: Fri Apr 10, 2020 1:09 am
by bronozoj
Is there any other way to upload a work unit? It has kept failing with the same error up to now, and it's nearing its expiration (2020-04-16T08:31:28Z). Connecting through a VPN from multiple locations makes no difference, and pinging the server results in 100% dropped packets. The server does have a warning that the collection server is not connected. Can this affect the ability to upload finished work units?

Re: Is plfah1-1.mskcc.org (140.163.4.231) down?

Posted: Fri Apr 10, 2020 1:19 am
by PantherX
bronozoj wrote:Is there any other way to upload a work unit?...
Unfortunately there isn't.
bronozoj wrote:...It continues to fail with the same error until now and it's nearing its expiration (2020-04-16T08:31:28Z)...
Today is 2020-04-10T01:16:50Z, which means about 6 days until it expires. I am hopeful that the issue will be resolved before then.
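The remaining time can be worked out from the two timestamps quoted above. This is just a small sketch using Python's standard library; the helper name `time_left` is made up for illustration.

```python
from datetime import datetime, timezone

def time_left(now_iso, expiry_iso):
    """Return expiry - now as a timedelta for two UTC 'Z' timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    now = datetime.strptime(now_iso, fmt).replace(tzinfo=timezone.utc)
    expiry = datetime.strptime(expiry_iso, fmt).replace(tzinfo=timezone.utc)
    return expiry - now

remaining = time_left("2020-04-10T01:16:50Z", "2020-04-16T08:31:28Z")
print(remaining)  # 6 days, 7:14:38
```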
bronozoj wrote:...The server does have a warning that the collection server is not connected. Can this affect the ability to upload finished work units?
The "warning" isn't really a warning per se. Configuring a CS (collection server) is entirely optional and up to the researcher. If a WS (work server) has no CS and is unable to accept completed WUs, there is nowhere else to send them. If a WS does have a CS, the completed WU can be uploaded to the CS when the WS fails.
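The rule above can be written down as a tiny decision function. This is only an illustration of the description in this post, not the client's actual code; the function name and its parameters are hypothetical.

```python
def choose_upload_target(ws_up, cs_configured, cs_up=False):
    """Pick where a finished WU can go, per the rule described above.

    Returns "WS" if the work server accepts uploads, "CS" if the WS is
    failing but a collection server is configured and reachable, and
    None if the WU has to wait and be retried later.
    """
    if ws_up:
        return "WS"
    if cs_configured and cs_up:
        return "CS"
    return None

# The situation in this thread: WS not responding, no CS configured.
print(choose_upload_target(ws_up=False, cs_configured=False))  # None
```

In that last case the only option is to keep retrying the WS until it comes back or the WU expires.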

Re: Is plfah1-1.mskcc.org (140.163.4.231) down?

Posted: Fri Apr 10, 2020 5:06 am
by iceman1992
Do the assignment servers check whether the work servers are up? I keep getting assigned to 140.163.4.231 even though it's currently down.