128.252.203.10 problem or WU?

Moderators: Site Moderators, FAHC Science Team

level6
Posts: 13
Joined: Tue May 05, 2020 2:35 am
Hardware configuration: See the current list, here: https://www.leper.org/FAH/level6/client_stats.html
Location: Dallas, Texas, USA

Re: 128.252.203.10 problem or WU?

Post by level6 »

Ah, excellent info, thanks PantherX! And, that is great news, indeed.

I definitely have more lurking to do to understand these details better.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: 128.252.203.10 problem or WU?

Post by PantherX »

Oussebon wrote:...Anything to be done?
Apart from reporting it in the Forum which is then raised to the researcher, not much :( You can simply leave the client running and hopefully, the completed WU is uploaded before the Expiration date. Apart from that, there's not much one can do unless the researchers ask for something specific.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Oussebon
Posts: 5
Joined: Mon Mar 16, 2020 3:10 pm

Re: 128.252.203.10 problem or WU?

Post by Oussebon »

GDF wrote:This is only anecdotal, worked for me, and might have been complete coincidence. I paused the slot with the problem, waited for the server to reboot (which you can see on the serverstats page by watching uptime roll back to zero), then restarted the slot. The upload went right through.
Thanks for the tip. The server was just restarted but sadly no joy.

Fails at 0.23% as above.

Previously, it would occasionally start sending, get as far as 0.23%, and fail; the server would then "actively refuse" the connection on every subsequent attempt.

Now, it starts uploading every time it is meant to. It just always fails at 0.23%.

Same messages as per previously-posted logs (Transfer failed).
PantherX wrote:
Oussebon wrote:...Anything to be done?
Apart from reporting it in the Forum which is then raised to the researcher, not much :( You can simply leave the client running and hopefully, the completed WU is uploaded before the Expiration date. Apart from that, there's not much one can do unless the researchers ask for something specific.
Thanks - and sorry, initially missed your post somehow! As things have changed a little (first hurdle overcome, still 2nd hurdle in the way) I hope the update helps them narrow it down. Shame to waste WUs after all.

Further edit: Scratch the above - back to the "actively refused" error message, as per my last post.
GDF
Posts: 8
Joined: Mon May 04, 2020 7:53 pm

Re: 128.252.203.10 problem or WU?

Post by GDF »

PantherX wrote:I do understand your POV and it negatively impacts all involved, the researchers and the donors. However, considering that there are multiple labs involved (https://foldingathome.org/about/the-fol ... onsortium/) across the globe in various countries dealing with various lock-down policies, even on a "good" day, it would take a bit of time. In a pandemic situation, it is a lot harder, but no one has given up; instead, they have doubled down and are working to improve various aspects to ensure that it is fixed. Sometimes, labs will have to involve their internal IT department, which can also add to the delay if it is a University infrastructure limitation like internet or electricity.
Thanks for the welcome, and I understand the problem intimately, as remote server management is something I do as part of my day job. Right now I'm having to get explicit permission to be in the building where my hardware is located, so I'm grateful that nearly all of the process can be managed through the net. I'm also glad I don't have to hear about problems through word of mouth in a public forum!

I appreciate all the effort that is going into this and I'm happy to be able to play a tiny part.
kalamai2
Posts: 2
Joined: Wed May 06, 2020 4:51 pm

Re: 128.252.203.10 problem or WU?

Post by kalamai2 »

Failing for me as well, for at least a few hours (log freshly started after a restart attempt). It does seem that redundancy/scaling on the work collection servers would be helpful.

*********************** Log Started 2020-05-06T16:37:53Z ***********************
16:37:53:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:37:53:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:37:53:WU00:FS01:Connecting to 128.252.203.10:8080
16:38:15:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
16:38:15:WU00:FS01:Connecting to 128.252.203.10:80
16:38:36:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
16:38:37:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:38:37:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:38:37:WU00:FS01:Connecting to 128.252.203.10:8080
16:38:58:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
16:38:58:WU00:FS01:Connecting to 128.252.203.10:80
16:39:19:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
16:39:37:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:39:37:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:39:37:WU00:FS01:Connecting to 128.252.203.10:8080
16:41:06:WU00:FS01:Upload 0.54%
16:41:06:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
16:41:14:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:41:14:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:41:14:WU00:FS01:Connecting to 128.252.203.10:8080
16:41:30:WU00:FS01:Upload 0.27%
16:42:38:WU00:FS01:Upload 0.54%
16:42:38:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
16:43:52:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:43:52:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:43:52:WU00:FS01:Connecting to 128.252.203.10:8080
16:45:29:WU00:FS01:Upload 0.54%
16:45:29:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
16:45:30:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:45:30:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:45:30:WU00:FS01:Connecting to 128.252.203.10:8080
16:45:49:WU00:FS01:Upload 0.54%
16:45:49:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
16:47:07:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:47:07:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:47:07:WU00:FS01:Connecting to 128.252.203.10:8080
16:47:13:WU00:FS01:Upload 20.33%
16:47:19:WU00:FS01:Upload 84.83%
16:47:54:WARNING:WU00:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
16:49:44:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:49:44:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:49:44:WU00:FS01:Connecting to 128.252.203.10:8080
16:53:01:WU00:FS01:Upload 1.63%
16:53:01:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
16:53:58:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:6289 gen:29 core:0x22 unit:0x0000003180fccb0a5e6f0a8922c6cfcd
16:53:59:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:53:59:WU00:FS01:Connecting to 128.252.203.10:8080
16:54:01:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
16:54:01:WU00:FS01:Connecting to 128.252.203.10:80
16:54:04:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: No connection could be made because the target machine actively refused it.
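
A log like the one above can be tallied with a short script. The following is only a rough sketch, not part of FAHClient: the function name `summarize_uploads` and the regular expressions are assumptions based solely on the line formats visible in this excerpt, and other client versions may log differently.

```python
import re
from collections import Counter

# Assumed formats, taken from the log lines shown in this thread.
START_RE = re.compile(r"^\d\d:\d\d:\d\d:WU\d+:FS\d+:Uploading ")
UPLOAD_RE = re.compile(r"^\d\d:\d\d:\d\d:WU\d+:FS\d+:Upload (\d+(?:\.\d+)?)%")
FAIL_RE = re.compile(r"Exception: Failed to send results to work server: (.*)$")


def summarize_uploads(log_text):
    """Return (attempts, failure_counts, best_percent) for a log excerpt."""
    attempts = 0
    failures = Counter()
    best = 0.0
    for line in log_text.splitlines():
        if START_RE.match(line):
            attempts += 1  # each "Uploading ..." line starts a new attempt
            continue
        m = UPLOAD_RE.match(line)
        if m:
            best = max(best, float(m.group(1)))  # furthest progress seen
            continue
        m = FAIL_RE.search(line)
        if m:
            reason = m.group(1)
            # Group the long socket error messages under one short label.
            if reason.startswith("Failed to connect"):
                reason = "Failed to connect"
            failures[reason] += 1
    return attempts, failures, best


sample = """\
16:41:14:WU00:FS01:Uploading 23.06MiB to 128.252.203.10
16:41:14:WU00:FS01:Connecting to 128.252.203.10:8080
16:41:30:WU00:FS01:Upload 0.27%
16:42:38:WU00:FS01:Upload 0.54%
16:42:38:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
"""

attempts, failures, best = summarize_uploads(sample)
print(attempts, dict(failures), best)
```

Run against the full excerpt, this would separate the "Transfer failed" attempts from the connection timeouts and refusals, which may help when reporting the pattern to the researchers.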
kalamai2
Posts: 2
Joined: Wed May 06, 2020 4:51 pm

Re: 128.252.203.10 problem or WU?

Post by kalamai2 »

This finally uploaded for me. I noticed a fresh restart in the server stats and had also just updated my client version to .13. I imagine it was the server restart that got it working, but who knows :)

Thanks.

Mike
Jeanne de Flandre
Posts: 3
Joined: Sat May 02, 2020 11:55 am

Re: 128.252.203.10 problem or WU?

Post by Jeanne de Flandre »

Hi,

Upload to 128.252.203.10 for Project 11760 has kept failing since 2020/05/06 21:43.

The estimated credit is already the same as the base credit. I don't care about my 'lost credit', but I do want the server to receive the result.

*********************** Log Started 2020-05-06T17:20:44Z ***********************
...
21:43:03:WU00:FS01:Uploading 23.09MiB to 128.252.203.10
21:43:03:WU00:FS01:Connecting to 128.252.203.10:8080
21:43:18:WU00:FS01:Upload 0.27%
21:43:42:WU00:FS01:Upload 0.54%
21:43:43:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
21:43:43:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11760 run:0 clone:1179 gen:25 core:0x22 unit:0x0000003080fccb0a5e6d7cd5382edcbf
...
*********************** Log Started 2020-05-09T02:19:53Z ***********************
...
05:54:57:WU00:FS01:Uploading 23.09MiB to 128.252.203.10
05:54:57:WU00:FS01:Connecting to 128.252.203.10:8080
05:55:12:WU00:FS01:Upload 0.54%
05:55:12:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: 128.252.203.10 problem or WU?

Post by PantherX »

Welcome to the F@H Forum Jeanne de Flandre,

Please note that the Server 128.252.203.10 has an uptime of ~15 minutes which means that it was recently restarted. Thus, your completed WU will hopefully be accepted soon :)
Jeanne de Flandre
Posts: 3
Joined: Sat May 02, 2020 11:55 am

Re: 128.252.203.10 problem or WU?

Post by Jeanne de Flandre »

Thanks for your reply PantherX.
Each time I check https://apps.foldingathome.org/serverstats , 128.252.203.10 *always* has a very short uptime, and now it says "a few seconds". So I guess it keeps rebooting.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: 128.252.203.10 problem or WU?

Post by PantherX »

Thanks for that, I have informed the researcher so let's see what happens :)
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 128.252.203.10 problem or WU?

Post by bruce »

anandhanju wrote:Thanks for your reports. The necessary folks have been notified and they will be looking into this.
Since the start of the COVID surge, the demand for new assignments has regularly exceeded the available bandwidth on FAH's servers. As fast as FAH could add more servers, the demand increased even more. Code was added to the Assignment Server to limit this excess to something on the order of what can actually be useful.

From my observations, there's nothing really limiting the bandwidth of the WUs being returned. When a FAHClient decides it's time to upload a result, it proceeds without any knowledge of how much inbound bandwidth is available. I frequently see very slow upload speeds, which IMHO indicates the inbound path is (probably) saturated. I can't think of a good way to manage that bandwidth other than to let an increasing percentage of upload transactions fail and redirect them to a Collection Server.

Rebooting the server, of course, terminates the active uploads and then it takes some time for the backlog to decide to retry. That's not a very good system but as I said, I can't think of a better solution.
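
To make the idea of failing over to a Collection Server and spacing out retries concrete, here is a minimal client-side sketch. It is an illustration under stated assumptions, not how FAHClient actually behaves: `upload_with_fallback`, the `send` callback, the server names, and the delay values are all hypothetical.

```python
import random
import time


def upload_with_fallback(send, servers, max_rounds=3, base_delay=60.0,
                         sleep=time.sleep, rng=random.random):
    """Try each server in order; back off with jitter between rounds.

    send(server) is a caller-supplied (hypothetical) function that returns
    True on success and raises OSError on a failed connection or transfer.
    Returns the server that accepted the result, or None if all rounds fail.
    """
    for round_no in range(max_rounds):
        for server in servers:
            try:
                if send(server):
                    return server
            except OSError:
                continue  # move on to the next server in the list
        if round_no < max_rounds - 1:
            # Exponential backoff with random jitter, so that a backlog of
            # clients does not retry all at once right after a restart.
            sleep(base_delay * (2 ** round_no) * (0.5 + rng()))
    return None


def demo_send(server):
    # Pretend the work server is saturated and the collection server is fine.
    if server == "work-server.example":
        raise OSError("Transfer failed")
    return True


accepted = upload_with_fallback(
    demo_send, ["work-server.example", "collection-server.example"])
print(accepted)
```

The jitter is the part aimed at the restart problem described above: without it, every client that was cut off by a reboot would tend to retry on the same schedule.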

Comments anyone?
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: 128.252.203.10 problem or WU?

Post by PantherX »

Please note that the Server (128.252.203.10) has been poked and it seems to be stable now. Let's see if your WUs are now uploaded without issues :)
bruce wrote:...I can't think of a good way to manage that bandwidth other than to let an increasing percentage of upload transactions fail and redirect them to a Collection Server...
Is it possible to alternate the return path when WUs are allocated, so that, say, WU A returns to the WS and WU B returns to the CS? In other words, alternate the primary and secondary return targets. That way, the initial load on the WS would be "halved", although the CS would go from being a backup to being a production server. Plus, the links between the WS and CS would be used continuously during production rather than only for backup.
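
The alternating scheme suggested above could look roughly like this. It is only a sketch of the idea: `return_order` and the use of a simple running WU index are assumptions made for illustration, not anything the Assignment Server is known to do.

```python
def return_order(wu_index, work_server, collection_server):
    """Pick the primary and secondary return targets for one WU.

    Even-numbered WUs return to the WS first and odd-numbered WUs to the CS
    first; the other server stays available as the fallback.
    """
    if wu_index % 2 == 0:
        return [work_server, collection_server]
    return [collection_server, work_server]


# Over ten consecutive WUs, each server is the primary target five times.
primaries = [return_order(i, "WS", "CS")[0] for i in range(10)]
print(primaries.count("WS"), primaries.count("CS"))
```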
Jeanne de Flandre
Posts: 3
Joined: Sat May 02, 2020 11:55 am

Re: 128.252.203.10 problem or WU?

Post by Jeanne de Flandre »

Thanks for your intervention. Now it is finally uploaded. :)

******************************* Date: 2020-05-09 *******************************
23:54:57:WU00:FS01:Uploading 23.09MiB to 128.252.203.10
23:54:57:WU00:FS01:Connecting to 128.252.203.10:8080
23:55:03:WU00:FS01:Upload 2.98%
...
23:58:15:WU00:FS01:Upload 98.51%
23:58:19:WU00:FS01:Upload complete
23:58:19:WU00:FS01:Server responded WORK_ACK (400)
23:58:19:WU00:FS01:Final credit estimate, 12884.00 points
23:58:19:WU00:FS01:Cleaning up
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: 128.252.203.10 problem or WU?

Post by PantherX »

That's great to hear! Thanks for the confirmation :)