Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 12:39 am
by GeekFantasy
I still have not successfully uploaded to 140.163.4.231; my client hasn't even tried to upload for over 30 minutes now. I have already folded another GPU WU for a different server with no download or upload issues. Now my GPU is folding a WU for the other server in this problem group, 140.163.4.241. Let's hope the upload experience for 241 is better than for 231. :roll:

EDIT:
Kougar wrote:Started to upload, took 5 minutes to upload 0.56% before it failed. All subsequent attempts to connect have failed.
After a long pause, my client uploaded 0.57% over 2 minutes before failing again on the original 231 server. I'm still folding the WU from 241.

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 1:46 am
by Kougar
Finished a second WU and it uploaded to a different server, yay. Meanwhile, this one is still stuck and has so far lost 45% of its points value. Sometimes it connects and starts to upload, but it always fails at around half a percent.

Code:

00:50:15:WU01:FS01:Uploading 24.34MiB to 140.163.4.231
00:50:15:WU01:FS01:Connecting to 140.163.4.231:8080
00:50:36:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
00:50:36:WU01:FS01:Connecting to 140.163.4.231:80
00:50:39:WU01:FS01:Upload 0.26%
00:51:04:WU00:FS01:0x22:Completed 4640000 out of 8000000 steps (58%)
00:51:47:WU01:FS01:Upload 0.51%
00:51:47:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed

PRCG 11749(0,4957,15) no collection server assigned

Posted: Fri Apr 10, 2020 2:22 am
by favrepeoria

Code:

02:14:36:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11749 run:0 clone:4957 gen:15 core:0x22 unit:0x0000001f8ca304e75e6a80183896451e
02:14:36:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
02:14:36:WU00:FS01:Connecting to 140.163.4.231:8080
02:14:57:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
02:14:57:WU00:FS01:Connecting to 140.163.4.231:80
02:15:18:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
02:15:18:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11749 run:0 clone:4957 gen:15 core:0x22 unit:0x0000001f8ca304e75e6a80183896451e
02:15:18:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
02:15:18:WU00:FS01:Connecting to 140.163.4.231:8080
02:15:39:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
02:15:39:WU00:FS01:Connecting to 140.163.4.231:80
02:16:00:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
02:16:55:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11749 run:0 clone:4957 gen:15 core:0x22 unit:0x0000001f8ca304e75e6a80183896451e
02:16:55:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
02:16:55:WU00:FS01:Connecting to 140.163.4.231:8080
02:17:16:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
02:17:16:WU00:FS01:Connecting to 140.163.4.231:80
02:17:37:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
02:19:33:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11749 run:0 clone:4957 gen:15 core:0x22 unit:0x0000001f8ca304e75e6a80183896451e
02:19:33:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
02:19:33:WU00:FS01:Connecting to 140.163.4.231:8080
02:19:54:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
02:19:54:WU00:FS01:Connecting to 140.163.4.231:80
02:20:15:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
I am unable to submit results because a collection server is not being assigned. Is this normal? The client has made at least 10 attempts. I quit the client and restarted it to see whether that would help, but no dice. Lately I have been having issues with WUs taking a very long time to upload, often getting stuck at 99% uploaded on one machine.
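
For anyone who wants to count these retries rather than eyeball them, here is a minimal Python sketch (not an official F@H tool). It assumes log lines shaped like the excerpts in this thread, i.e. HH:MM:SS:[WARNING:]WUnn:FSnn:message. The function name summarize_uploads and the counter keys are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts in this thread:
#   HH:MM:SS:[WARNING:]WUnn:FSnn:message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):(?P<msg>.*)$"
)


def summarize_uploads(lines):
    """Count upload-related events per WU from an iterable of log lines."""
    summary = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not carry a WU/FS prefix
        counts = summary.setdefault(m["wu"], Counter())
        msg = m["msg"]
        if msg.startswith("Uploading "):
            counts["attempts"] += 1
        elif "connection failed on port 8080 trying 80" in msg:
            counts["fallbacks_to_80"] += 1
        elif msg.startswith("Exception: Failed to send results"):
            counts["failures"] += 1
        elif msg == "Upload complete":
            counts["completed"] += 1
    return summary


# Sample lines copied from the log above; the exception line is shortened here.
SAMPLE = """\
02:14:36:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
02:14:36:WU00:FS01:Connecting to 140.163.4.231:8080
02:14:57:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
02:14:57:WU00:FS01:Connecting to 140.163.4.231:80
02:15:18:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80
02:15:18:WU00:FS01:Uploading 12.57MiB to 140.163.4.231
""".splitlines()

if __name__ == "__main__":
    for wu, counts in summarize_uploads(SAMPLE).items():
        print(wu, dict(counts))
```

To run it on a real log, pass an open file object (for example open("log.txt")) instead of SAMPLE; the path depends on your install.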

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 2:25 am
by bruce
Does your WU report that a CS is configured?

A reasonable test of the server is to open http://140.163.4.231. If it fails, the server is down. If it opens the WS landing page it's probably working.
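
The same check can be roughly scripted. This is only a sketch: it tries a plain TCP connection on port 8080 and then on port 80, the order the client log in this thread shows. The function name first_open_port and its connect parameter are hypothetical, added so the logic can be exercised without touching the network.

```python
import socket


def first_open_port(host, ports=(8080, 80), timeout=5.0,
                    connect=socket.create_connection):
    """Return the first port that accepts a TCP connection, or None.

    `connect` defaults to socket.create_connection; it is a parameter only
    so a stand-in can be supplied for testing (an assumption of this sketch).
    """
    for port in ports:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers refused connections and timeouts alike.
            continue
        conn.close()
        return port
    return None


if __name__ == "__main__":
    port = first_open_port("140.163.4.231")
    if port is None:
        print("No TCP connection on 8080 or 80; the server looks unreachable.")
    else:
        print(f"TCP connection accepted on port {port}.")
```

An accepted connection only shows that something is listening. As the logs earlier in this thread show, an upload can still fail after the connection is made.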

Re: PRCG 11749(0,4957,15) no collection server assigned

Posted: Fri Apr 10, 2020 2:28 am
by bruce
Something's not right with the work server AND no CS is assigned.

I'll merge your topic with viewtopic.php?f=18&t=34116

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 2:35 am
by zelthian
bruce wrote:Does your WU report that a CS is configured?

A reasonable test of the server is to open http://140.163.4.231. If it fails, the server is down. If it opens the WS landing page it's probably working.
CS on my WU shows 0.0.0.0.

Opening the server's IP address in a browser did not load the landing page (the site couldn't be reached). It seems pretty clear that the server is down.

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 2:45 am
by GeekFantasy
The server at 140.163.4.231 has been unreachable for me for the last 4 hours, save for one time when it started an upload and then failed. An upload attempt that just took place shows it is still unreachable as I write this.

However, to my pleasant surprise, a subsequent work unit from the same client was just uploaded successfully to 140.163.4.241, which seems to be in the same server cluster. So luckily the issue seems to affect only the server named in this thread. Good luck.

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 2:55 am
by PantherX
zelthian wrote:...CS on my WU shows 0.0.0.0...
Welcome to the F@H Forum zelthian,

That means that the CS is not configured. This isn't a mistake or an error since the configuration of a CS is entirely optional and depends on the researcher(s).
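
If you want to check this in a script, here is a tiny sketch. It relies only on the point above, that a CS shown as 0.0.0.0 means none is configured; the function name cs_configured is hypothetical.

```python
import ipaddress


def cs_configured(cs_address):
    """Return False when the CS address is the unspecified address 0.0.0.0."""
    return not ipaddress.ip_address(cs_address).is_unspecified


if __name__ == "__main__":
    print(cs_configured("0.0.0.0"))
    print(cs_configured("140.163.4.241"))
```

The second address is used purely as an example of a non-zero address, not as a claim that it is a collection server.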

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 2:57 am
by zelthian
PantherX wrote: Welcome to the F@H Forum zelthian,
Thank you!
PantherX wrote: That means that the CS is not configured. This isn't a mistake or an error since the configuration of a CS is entirely optional and depends on the researcher(s).
So nothing wrong, nothing I can/need to do?

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 3:00 am
by PantherX
zelthian wrote:...So nothing wrong, nothing I can/need to do?
Technically, the WS isn't functioning as expected (that's what's wrong), but there's nothing you can do from your end except leave the client running in the background; it will continue to try to upload the completed WU to the WS.

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 8:49 am
by x-MaSh-x

Code:

03:37:13:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11749 run:0 clone:3710 gen:25 core:0x22 unit:0x0000002a8ca304e75e6a8010612a6e92
03:37:13:WU00:FS01:Uploading 12.56MiB to 140.163.4.231
03:37:13:WU00:FS01:Connecting to 140.163.4.231:8080
03:37:22:WU00:FS01:Upload 0.50%
03:38:44:WU02:FS01:0x22:Completed 20000 out of 1000000 steps (2%)
03:39:11:WU00:FS01:Upload complete
03:39:11:WU00:FS01:Server responded WORK_QUIT (404)
03:39:11:WARNING:WU00:FS01:Server did not like results, dumping
03:39:11:WU00:FS01:Cleaning up
Boo. After 50-odd attempts to upload...

Can I avoid this server via settings? Seeing the credit dwindle over a couple of days is bad enough, but then finding that the compute was wasted as well makes me want to avoid it...

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 11:29 am
by Manfred.Knick
PantherX wrote: it will continue to try to upload the completed WU to the WS
Interactions with mskcc (with fah4 involved) continue to produce frustration:
It's no fun burning energy into heat and noise for nothing, indeed.

# [ 11748 | 0 | 463 | 9 ] :

$ grep WU02 /opt/foldingathome/log.txt (excerpt)

08:13:03:WU02:FS01:Connecting to 65.254.110.245:8080
08:13:04:WU02:FS01:Assigned to work server 140.163.4.231
08:13:04:WU02:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP104 [GeForce GTX 1070 Ti] 8186 from 140.163.4.231
08:13:04:WU02:FS01:Connecting to 140.163.4.231:8080
...
08:14:19:WU02:FS01:Download complete
08:14:19:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11748 run:0 clone:463 gen:9 core:0x22 unit:0x0000001a8ca304e75e6a7fe1ae52558b
08:14:19:WU02:FS01:Starting
...
...
10:20:41:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11748 run:0 clone:463 gen:9 core:0x22 unit:0x0000001a8ca304e75e6a7fe1ae52558b
10:20:41:WU02:FS01:Uploading 12.58MiB to 140.163.4.231
10:20:41:WU02:FS01:Connecting to 140.163.4.231:8080
...
10:22:06:WU02:FS01:Upload complete
10:22:06:WU02:FS01:Server responded WORK_QUIT (404)
10:22:06:WARNING:WU02:FS01:Server did not like results, dumping
10:22:07:WU02:FS01:Cleaning up (*)

(*) the latter implying that e.g. no wuresults_??.dat survives for analysis.
Where can I find documentation about error codes of WUs?
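
Since the client cleans up after a dump, the log is about the only record left. Here is a minimal Python sketch (not an official tool) that lists which units were dumped. It assumes the "Sending unit results" and "Server responded WORK_QUIT" line formats shown in the excerpt above; dumped_units is a hypothetical name.

```python
import re

# Patterns assumed from the log excerpts in this thread.
SEND_RE = re.compile(
    r":(?P<wu>WU\d+):FS\d+:Sending unit results: .*?"
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)
QUIT_RE = re.compile(
    r":(?P<wu>WU\d+):FS\d+:Server responded WORK_QUIT \((?P<code>\d+)\)"
)


def dumped_units(lines):
    """Return (wu, (project, run, clone, gen), code) for each WORK_QUIT seen.

    The PRCG tuple is None if no matching "Sending unit results" line
    appeared earlier in the input for that WU.
    """
    last_sent = {}
    dumped = []
    for line in lines:
        m = SEND_RE.search(line)
        if m:
            last_sent[m["wu"]] = tuple(
                int(m[key]) for key in ("project", "run", "clone", "gen")
            )
            continue
        m = QUIT_RE.search(line)
        if m:
            dumped.append((m["wu"], last_sent.get(m["wu"]), int(m["code"])))
    return dumped


# Lines copied from the excerpt above.
SAMPLE = [
    "10:20:41:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR "
    "project:11748 run:0 clone:463 gen:9 core:0x22 "
    "unit:0x0000001a8ca304e75e6a7fe1ae52558b",
    "10:22:06:WU02:FS01:Upload complete",
    "10:22:06:WU02:FS01:Server responded WORK_QUIT (404)",
    "10:22:06:WARNING:WU02:FS01:Server did not like results, dumping",
]

if __name__ == "__main__":
    for wu, prcg, code in dumped_units(SAMPLE):
        print(wu, prcg, code)
```

This only reports what the log says; it does not explain what the server means by the code in parentheses.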

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 12:00 pm
by SvicidalBug
Manfred.Knick wrote: 10:22:06:WU02:FS01:Upload complete
10:22:06:WU02:FS01:Server responded WORK_QUIT (404)
10:22:06:WARNING:WU02:FS01:Server did not like results, dumping
10:22:07:WU02:FS01:Cleaning up (*)
I was having the same issue uploading the finished WU to this server. When it finally went through, I got this error.

11:50:33:WU01:FS01:Server responded WORK_QUIT (404)
11:50:33:WARNING:WU01:FS01:Server did not like results, dumping

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 1:03 pm
by Kougar

Code:

03:41:26:WU01:FS01:Upload complete
03:41:26:WU01:FS01:Server responded WORK_QUIT (404)
03:41:26:WARNING:WU01:FS01:Server did not like results, dumping
So does this mean the WU was thrown out and no credit was given?

Re: Can't upload to 140.163.4.231 again

Posted: Fri Apr 10, 2020 1:39 pm
by iceman1992
Kougar wrote:

Code:

03:41:26:WU01:FS01:Upload complete
03:41:26:WU01:FS01:Server responded WORK_QUIT (404)
03:41:26:WARNING:WU01:FS01:Server did not like results, dumping
So does this mean the WU was thrown out and no credit was given?
Yes