11752 cant upload to 140.163.4.231:80
Posted: Fri Apr 03, 2020 6:41 pm
by treckin
*********************** Log Started 2020-04-03T18:24:49Z ***********************
18:24:50:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:24:50:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:24:50:WU01:FS02:Connecting to 140.163.4.231:8080
18:25:11:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
18:25:11:WU01:FS02:Connecting to 140.163.4.231:80
18:25:33:WARNING:WU01:FS02:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
18:25:33:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:25:33:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:25:33:WU01:FS02:Connecting to 140.163.4.231:8080
18:25:54:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
18:25:54:WU01:FS02:Connecting to 140.163.4.231:80
18:26:16:WARNING:WU01:FS02:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
18:26:33:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:26:33:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:26:33:WU01:FS02:Connecting to 140.163.4.231:8080
18:26:54:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
18:26:54:WU01:FS02:Connecting to 140.163.4.231:80
18:27:16:WARNING:WU01:FS02:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
18:28:10:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:28:10:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:28:10:WU01:FS02:Connecting to 140.163.4.231:8080
18:29:26:WU01:FS02:Upload 0.51%
18:29:26:WARNING:WU01:FS02:Exception: Failed to send results to work server: Transfer failed
18:30:48:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:30:48:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:30:48:WU01:FS02:Connecting to 140.163.4.231:8080
18:31:09:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
18:31:09:WU01:FS02:Connecting to 140.163.4.231:80
18:31:16:WU01:FS02:Upload 0.26%
18:32:24:WU01:FS02:Upload 0.51%
18:32:24:WARNING:WU01:FS02:Exception: Failed to send results to work server: Transfer failed
18:35:02:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:11752 run:0 clone:12172 gen:1 core:0x22 unit:0x000000038ca304e75e6d6d269489d6ee
18:35:02:WU01:FS02:Uploading 24.34MiB to 140.163.4.231
18:35:02:WU01:FS02:Connecting to 140.163.4.231:8080
18:35:23:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
18:35:23:WU01:FS02:Connecting to 140.163.4.231:80
18:35:23:WU01:FS02:Upload 0.26%
18:35:43:WU01:FS02:Upload 0.51%
18:35:43:WARNING:WU01:FS02:Exception: Failed to send results to work server: Transfer failed
I have already restarted my machine a few times, not sure what else I can try.
A manual ping of the server fails.
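A failed ping does not by itself prove the server is down, since many hosts drop ICMP echo requests. Testing the TCP ports the client actually uses (8080, then 80, as in the log above) is more telling. A minimal Python sketch using only the standard library; the helper name `can_connect` is made up for illustration and is not part of the Folding@home client:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("140.163.4.231", 8080)` and then `can_connect("140.163.4.231", 80)` mirrors the two connection attempts in the log. Note that a successful connect only shows the port is open; it would not catch the "Transfer failed" case, where the connection opens but the upload stalls.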
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sat Apr 04, 2020 7:56 am
by Yuko
I have exactly the same problem: same server, same project.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sun Apr 05, 2020 3:55 am
by Qwarkman
I'm having trouble as well.
I'm getting two different responses, though:
23:01:47:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11752 run:0 clone:12494 gen:1 core:0x22 unit:0x000000028ca304e75e6d6d0d628d914a
23:01:47:WU02:FS01:Uploading 24.33MiB to 140.163.4.231
23:01:47:WU02:FS01:Connecting to 140.163.4.231:8080
23:02:08:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
23:02:08:WU02:FS01:Connecting to 140.163.4.231:80
23:03:20:WU02:FS01:Upload 0.26%
23:03:20:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
or
01:04:46:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11752 run:0 clone:12494 gen:1 core:0x22 unit:0x000000028ca304e75e6d6d0d628d914a
01:04:46:WU02:FS01:Uploading 24.33MiB to 140.163.4.231
01:04:46:WU02:FS01:Connecting to 140.163.4.231:8080
01:05:07:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
01:05:07:WU02:FS01:Connecting to 140.163.4.231:80
01:05:28:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
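The two responses above are different failure modes: in the first the connection opens and the transfer dies partway, in the second the connection never opens. To see which one dominates in a longer log, something like the following Python sketch could tally them. It assumes the log line wording shown in this thread; the function name `summarize_upload_attempts` is hypothetical:

```python
import re
from collections import Counter

# Matches progress lines such as "Upload 0.26%".
UPLOAD_RE = re.compile(r"Upload (\d+(?:\.\d+)?)%")


def summarize_upload_attempts(log_text: str) -> dict:
    """Count upload failure types and record the highest upload progress seen."""
    outcomes = Counter()
    max_progress = 0.0
    for line in log_text.splitlines():
        match = UPLOAD_RE.search(line)
        if match:
            max_progress = max(max_progress, float(match.group(1)))
        elif "Failed to send results to work server" in line:
            if "Transfer failed" in line:
                # Connection opened, but the upload broke off partway.
                outcomes["transfer_failed"] += 1
            elif "Failed to connect" in line:
                # The TCP connection never opened.
                outcomes["connect_failed"] += 1
            else:
                outcomes["other"] += 1
    return {"outcomes": dict(outcomes), "max_progress_percent": max_progress}
```

If almost every attempt that does connect ends in `transfer_failed` at well under 1%, that points at the server side rather than at the local network.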
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sun Apr 05, 2020 7:18 am
by Nachro
Same problem here, same server but project 11748.
The server 140.163.4.231 seems to have been offline for 2 days.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sun Apr 05, 2020 8:13 am
by uyaem
The error messages hint at a server under heavy load or overloaded.
According to https://apps.foldingathome.org/serverstats, this WS doesn't have a collection server (where results would be parked until the WS has time), which is probably why several people are having issues with this particular server.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sun Apr 05, 2020 10:46 am
by semaphore
Same here, but it doesn't seem to be completely dead... more like out of disk space.
I'm not posting my log here (due to the restrictions on forum newbies), but it looks like the ones many in this thread have already posted: the upload starts, gets a small way in, and then stops.
Mine (project 11750) stops at 0.43%.
I hope this can be fixed. I have a feeling that almost 48 hours of folding results are trying to hammer this server now, which won't make things any easier if the server does come back up.
A bit off topic: why isn't any kind of redundancy built into this? Why only try one IP, instead of asking another server for Collection Servers that can handle a specific project? From what I can see in this thread, more than one project ID is affected now.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Sun Apr 05, 2020 10:42 pm
by Arnold0
Hi,
So I have a WU that has not been able to upload for 3 days. When it tries to upload, I either get a long error basically stating that the server didn't answer, or the upload seems to start and always fails at 0.99%.
The server that fails is 140.163.4.231. It doesn't answer when I ping it, but on the server status page it isn't listed as down.
Here are the logs:
22:05:41:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11748 run:0 clone:5555 gen:2 core:0x22 unit:0x000000068ca304e75e6baff246ee8f4f
22:05:41:WU02:FS01:Uploading 12.58MiB to 140.163.4.231
22:05:41:WU02:FS01:Connecting to 140.163.4.231:8080
22:06:03:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
22:06:03:WU02:FS01:Connecting to 140.163.4.231:80
22:06:24:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
22:23:38:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11748 run:0 clone:5555 gen:2 core:0x22 unit:0x000000068ca304e75e6baff246ee8f4f
22:23:38:WU02:FS01:Uploading 12.58MiB to 140.163.4.231
22:23:38:WU02:FS01:Connecting to 140.163.4.231:8080
22:23:59:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
22:23:59:WU02:FS01:Connecting to 140.163.4.231:80
22:24:01:WU00:FS01:0x22:Completed 620000 out of 2000000 steps (31%)
22:24:07:WU02:FS01:Upload 0.50%
22:24:27:WU03:FS00:0xa7:Completed 142500 out of 250000 steps (57%)
22:25:15:WU02:FS01:Upload 0.99%
22:25:15:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
However, I have noticed something very weird. In the logs here you can see that the project is 11748, run 0, clone 5555, gen 2. I entered these details on the WU status page, and I can see that someone had already returned it on March 30th. It was assigned to me on April 3rd, with a timeout on April 4th and expiration on April 11th.
I checked the ones from Qwarkman and treckin, and they both also got assigned WUs that someone else had already returned on March 30th.
Is it normal that we got WUs that had already been returned a few days earlier by other people? Could that be the reason why we can't return ours?
Re: 11752 cant upload to 140.163.4.231:80
Posted: Mon Apr 06, 2020 2:43 am
by PantherX
Arnold0 wrote:...I checked the ones from Qwarkman and treckin, and they both also got assigned WUs that someone else had already returned on March 30th.
Is it normal that we got WUs that had already been returned a few days earlier by other people? Could that be the reason why we can't return ours?
Generally speaking, on the first attempt, a WU is assigned to a single system only. If that WU doesn't return before the timeout period or reports an error, then the following happens:
1) WU doesn't arrive before the timeout period -> It gets reassigned to a different system
2) WU returns an error -> It is assigned to 3 other systems to verify if the WU is a bad one or not
Apart from that, it doesn't normally get assigned to multiple systems.
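To make the two rules above concrete, here is a small Python sketch. It only restates the description in this post and is not the actual work server logic; the names `Outcome` and `additional_assignments` are invented for illustration:

```python
from enum import Enum


class Outcome(Enum):
    RETURNED_OK = "returned_ok"        # result arrived in time, no error
    TIMED_OUT = "timed_out"            # result did not arrive before the timeout
    RETURNED_ERROR = "returned_error"  # result arrived but reported an error


def additional_assignments(outcome: Outcome) -> int:
    """How many further systems get the WU, per the rules described above."""
    if outcome is Outcome.TIMED_OUT:
        return 1  # reassigned to one different system
    if outcome is Outcome.RETURNED_ERROR:
        return 3  # sent to 3 other systems to check whether the WU is bad
    return 0      # normally no further copies go out
```

Under these rules a WU that was returned without error on March 30th would not be expected to go out again, which is why the duplicate assignments reported in this thread look unusual.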
Re: 11752 cant upload to 140.163.4.231:80
Posted: Mon Apr 06, 2020 9:39 pm
by semaphore
PantherX wrote:
Arnold0 wrote:...I checked the ones from Qwarkman and treckin, and they both also got assigned WUs that someone else had already returned on March 30th.
Is it normal that we got WUs that had already been returned a few days earlier by other people? Could that be the reason why we can't return ours?
Generally speaking, on the first attempt, a WU is assigned to a single system only. If that WU doesn't return before the timeout period or reports an error, then the following happens:
1) WU doesn't arrive before the timeout period -> It gets reassigned to a different system
2) WU returns an error -> It is assigned to 3 other systems to verify if the WU is a bad one or not
Apart from that, it doesn't normally get assigned to multiple systems.
That's good info, thanks PantherX.
My WU has been in the "Work Queue" with status "SEND" for 4 days now, and I have stopped restarting my client/computer, because the server clearly doesn't respond. With the above info, I am guessing that the reassignments of my WU started 3 days ago, and if the other clients got the WU from the same server, they will in turn trigger a reassignment (due to the work server not accepting uploads), which will cause the next reassignment to the next client, etc.
So my question then:
Is it the same work server that my client is trying to upload to that does the reassigning? Or is it the two main ones at assign1.foldingathome.org and assign2.foldingathome.org (acting as some sort of central WU warehouse)?
Because if a work server can be configured NOT to have a collection server, then this could be as bad as the reassignment explosion above.
But IF the central servers have already reassigned my WU to other (better configured) work servers, then all of us in this thread will have to wait for the expiration (when I guess the WU gets deleted).
Re: 11752 cant upload to 140.163.4.231:80
Posted: Mon Apr 06, 2020 9:59 pm
by Tohya
You can mouse over the error to get the error message. Both of the servers run by rafal.wiewiora use fah4.eastus.cloudapp.azure.com as a collection server, but that collection server is down.
The 2 assignment servers just direct your client to a work server that has work available for your configuration. The work server controls the projects that are on it and will reassign a WU if it expires.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Mon Apr 06, 2020 10:34 pm
by semaphore
Thanks Tohya.
The server page is reporting its CS, fah4, as "failed", so I kind of want to come to the same conclusion.
However, my client has NEVER tried to connect to fah4 (52.224.109.74).
My client reports 140.163.4.231 as the Work Server
and 0.0.0.0 as the Collection Server.
And I think someone said that if there is no Collection Server configured, then the upload goes to the Work Server.
Given all the reports of uploads only getting to around 0.5%, my conclusion is that the Work Server is out of disk space.
It would be nice if you could access the client's database somehow and just enter the IP of one of the working Collection Servers.
Or even better (if I let my brain stretch a bit): have a client that creates a torrent for each finished WU, which all other clients (if they choose to accept) join in on (with DHT), and suddenly you get one of the world's largest distribution networks for the scientists to collect their WUs from (once they have managed to extend their disk). A central tracker could then remove the torrent for that WU when the credit tracker reports "credits entered to the database".
Re: 11752 cant upload to 140.163.4.231:80
Posted: Mon Apr 06, 2020 11:58 pm
by PantherX
semaphore wrote:...With the above info, I am guessing that the reassignments of my WU started 3 days ago, and if the other clients got the WU from the same server, they will in turn trigger a reassignment (due to the work server not accepting uploads), which will cause the next reassignment to the next client, etc.
So my question then: Is it the same work server that my client is trying to upload to that does the reassigning? Or is it the two main ones at assign1.foldingathome.org and assign2.foldingathome.org (acting as some sort of central WU warehouse)?
Because if a work server can be configured NOT to have a collection server, then this could be as bad as the reassignment explosion above.
But IF the central servers have already reassigned my WU to other (better configured) work servers, then all of us in this thread will have to wait for the expiration (when I guess the WU gets deleted).
The WU is reassigned after the Timeout period has passed. This will vary for each Project.
The WU is always returned to the WS that it was downloaded from, or to a CS if one is configured.
The AS is not really a central WU warehouse... it is more of a WU index. It knows of all the WUs present on the WSs, so when your client asks for a WU, it will point your client to the most suitable WS. There are limits to the number of times a WU is reassigned before it gets blacklisted. Thus, the WU explosion is rather limited.
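The return rule above can be pictured with a short Python sketch. This is an illustration of the description only, not the client's real code: the function name `upload_targets`, the ordering (WS first, CS second), and the reading of a reported CS of 0.0.0.0 as "no CS configured" are all assumptions drawn from what was reported earlier in this thread:

```python
from typing import List


def upload_targets(work_server: str, collection_server: str = "0.0.0.0") -> List[str]:
    """List the servers a finished WU could be returned to, in the order tried."""
    # The WU always goes back to the WS it was downloaded from.
    targets = [work_server]
    # Assumption: an address of 0.0.0.0 means no Collection Server is configured.
    if collection_server and collection_server != "0.0.0.0":
        targets.append(collection_server)
    return targets
```

With the values semaphore reported, `upload_targets("140.163.4.231", "0.0.0.0")` gives a single target, which would explain why a client in that state has nowhere else to send its result while the WS is unreachable.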
semaphore wrote:...It would be nice if you could access the client's database somehow and just enter the IP of one of the working Collection Servers.
Even if you did that, it would fail, as the CS will only accept WUs that it knows it should accept. This is set up at the Project level, where the WS and potentially a CS are configured.
semaphore wrote:...Or even better (if I let my brain stretch a bit): have a client that creates a torrent for each finished WU, which all other clients (if they choose to accept) join in on (with DHT), and suddenly you get one of the world's largest distribution networks for the scientists to collect their WUs from (once they have managed to extend their disk). A central tracker could then remove the torrent for that WU when the credit tracker reports "credits entered to the database".
I like that idea as I have the disk space and internet connection to contribute but the idea didn't make it past the initial testing phase back in 2008: viewtopic.php?f=16&t=643
Re: 11752 cant upload to 140.163.4.231:80
Posted: Tue Apr 07, 2020 4:19 pm
by semaphore
140.163.4.231 has been working again for about 1 hour.
Right-click your F@H icon next to the clock (on the Windows desktop) and choose "Quit".
Then start your client again by searching for "Folding@home" and pressing Enter.
The client should now start up again, but with shorter timeouts for the now-working Work Server (140.163.4.231:80).
Re: 11752 cant upload to 140.163.4.231:80
Posted: Tue Apr 07, 2020 4:21 pm
by Arnold0
Hi, I don't know if anything changed on Folding's side, but it appears my stuck WU did successfully upload to 140.163.4.231 today after being stuck for 4 days. I now see two lines on the WU Status page: the one from March 30th and mine from today.
Re: 11752 cant upload to 140.163.4.231:80
Posted: Tue Apr 07, 2020 4:29 pm
by semaphore
PantherX wrote:
semaphore wrote:...Or even better (if I let my brain stretch a bit): have a client that creates a torrent for each finished WU, which all other clients (if they choose to accept) join in on (with DHT), and suddenly you get one of the world's largest distribution networks for the scientists to collect their WUs from (once they have managed to extend their disk). A central tracker could then remove the torrent for that WU when the credit tracker reports "credits entered to the database".
I like that idea as I have the disk space and internet connection to contribute but the idea didn't make it past the initial testing phase back in 2008: viewtopic.php?f=16&t=643
Ah... more info than I found, but still not a clean shutdown of the "Storage@Home" project.
The functionality could still be built, just not as a separately coded client.
For instance, using a torrent tracker could easily solve any disk space problem that Work Servers (or Collection Servers) have.
You just collect the science when you can, and if a client needs to go offline, it can do that more quickly by sending its work to other clients (that are of course willing to help out). Oh well... I will start a separate thread about it as soon as I have time.
Still, for those of us affected by this server: send your data now, before it breaks again ;)
I was lucky, and it looks like I got full points (some have reported their WUs being rejected).