Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 3:17 pm
by Manfred.Knick
POSTSCRIPTUM:
How come this WU was assigned to me this morning at 08:13, although it had already been returned 3x by different donors and even successfully marked OK and credited 2x days ago?!
https://apps.foldingathome.org/wu#proje ... =463&gen=9
Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 5:11 pm
by GeekFantasy
I was issued another WU from this broken server this morning before my original WU was even able to be uploaded. After finishing the second WU from this morning, both were uploaded only to be met with the WORK_QUIT message people seem to be seeing. I was half joking earlier in this thread, but after seeing this, I also feel we should be allowed to blacklist this server from our clients. Or at the very least, can it be prevented from being assigned by the assignment servers until these issues are tested and resolved? It doesn't seem fair to us volunteers to waste our time and resources by continuing to issue WUs from a faulty server. We would all really appreciate it!
Code:
06:44:33:WU03:FS00:Connecting to 65.254.110.245:8080
06:44:33:WU03:FS00:Assigned to work server 140.163.4.231
06:44:33:WU03:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:TU106 [GeForce RTX 2060 Super] from 140.163.4.231
06:44:33:WU03:FS00:Connecting to 140.163.4.231:8080
06:45:20:WU03:FS00:Downloading 13.15MiB
06:45:22:WU03:FS00:Download complete
06:45:22:WU03:FS00:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:11752 run:0 clone:7238 gen:11 core:0x22 unit:0x0000001d8ca304e75e6bbed67e4fc3e3
...
(Original WU failing for a day)
07:30:33:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:11747 run:0 clone:9000 gen:15 core:0x22 unit:0x000000228ca304e75e6bae3514579ddb
07:30:33:WU02:FS00:Uploading 21.92MiB to 140.163.4.231
07:30:33:WU02:FS00:Connecting to 140.163.4.231:8080
07:30:39:WU02:FS00:Upload 27.94%
07:30:45:WU02:FS00:Upload 61.86%
07:30:51:WU02:FS00:Upload 97.21%
07:30:53:WU02:FS00:Upload complete
07:30:53:WU02:FS00:Server responded WORK_QUIT (404)
07:30:53:WARNING:WU02:FS00:Server did not like results, dumping
07:30:53:WU02:FS00:Cleaning up
...
(Unit issued and folded this morning)
09:13:17:WU03:FS00:0x22:Completed 1000000 out of 1000000 steps (100%)
09:13:26:WU03:FS00:0x22:Saving result file ..\logfile_01.txt
09:13:26:WU03:FS00:0x22:Saving result file checkpointState.xml
09:13:32:WU03:FS00:0x22:Saving result file checkpt.crc
09:13:32:WU03:FS00:0x22:Saving result file positions.xtc
09:13:35:WU03:FS00:0x22:Saving result file science.log
09:13:35:WU03:FS00:0x22:Folding@home Core Shutdown: FINISHED_UNIT
09:13:36:WU03:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
09:13:36:WU03:FS00:Sending unit results: id:03 state:SEND error:NO_ERROR project:11752 run:0 clone:7238 gen:11 core:0x22 unit:0x0000001d8ca304e75e6bbed67e4fc3e3
09:13:36:WU03:FS00:Uploading 24.34MiB to 140.163.4.231
09:13:36:WU03:FS00:Connecting to 140.163.4.231:8080
09:13:42:WU03:FS00:Upload 30.56%
09:13:48:WU03:FS00:Upload 60.60%
09:13:55:WU03:FS00:Upload 76.27%
09:14:08:WU03:FS00:Upload complete
09:14:08:WU03:FS00:Server responded WORK_QUIT (404)
09:14:08:WARNING:WU03:FS00:Server did not like results, dumping
09:14:08:WU03:FS00:Cleaning up
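For anyone who wants to scan a long log for these server responses instead of reading it by eye, here is a minimal sketch. It assumes the line layout seen in the log above (time, WU id, slot id, then the message); the function name `server_responses` and the regular expression are my own illustration, not part of FAHClient.

```python
import re

# Matches lines shaped like the ones quoted in this thread, e.g.
# "07:30:53:WU02:FS00:Server responded WORK_QUIT (404)".
# The layout is inferred from the pasted logs, not from any FAHClient spec.
RESPONSE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:WARNING:|ERROR:)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"Server responded (?P<name>[A-Z_]+) \((?P<code>\d+)\)$"
)


def server_responses(log_text):
    """Return (time, wu, slot, name, code) for each 'Server responded' line."""
    found = []
    for line in log_text.splitlines():
        m = RESPONSE_RE.match(line.strip())
        if m:
            found.append(
                (m["time"], m["wu"], m["slot"], m["name"], int(m["code"]))
            )
    return found


sample = """\
07:30:53:WU02:FS00:Upload complete
07:30:53:WU02:FS00:Server responded WORK_QUIT (404)
07:30:53:WARNING:WU02:FS00:Server did not like results, dumping
09:14:08:WU03:FS00:Server responded WORK_QUIT (404)
"""

for entry in server_responses(sample):
    print(entry)
```

Run against the two uploads above, this picks out one WORK_QUIT (404) line per work unit and ignores everything else.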
Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 5:17 pm
by Sarr
I also have a work unit repeatedly failing to upload to this server. Someone else pointed out that their work unit seemed to have already been processed by someone else and credited. I checked mine, and similarly, it had apparently been assigned to two others before.
logs:
Code:
15:14:00:WU02:FS01:0x22:Completed 1000000 out of 1000000 steps (100%)
15:14:04:WU02:FS01:0x22:Saving result file ../logfile_01.txt
15:14:04:WU02:FS01:0x22:Saving result file checkpointState.xml
15:14:07:WU02:FS01:0x22:Saving result file checkpt.crc
15:14:07:WU02:FS01:0x22:Saving result file positions.xtc
15:14:09:WU02:FS01:0x22:Saving result file science.log
15:14:09:WU02:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
15:14:09:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
15:14:09:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:14:10:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:14:10:WU02:FS01:Connecting to 140.163.4.231:8080
15:15:36:WU00:FS01:Connecting to 65.254.110.245:8080
15:15:36:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
15:15:36:WU00:FS01:Connecting to 18.218.241.186:80
15:15:37:WU00:FS01:Assigned to work server 13.90.152.57
15:15:37:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:Baffin [Polaris11] from 13.90.152.57
15:15:37:WU00:FS01:Connecting to 13.90.152.57:8080
15:15:37:ERROR:WU00:FS01:Exception: Server did not assign work unit
15:15:50:WU02:FS01:Upload 0.43%
15:15:50:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:15:50:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:15:50:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:15:50:WU02:FS01:Connecting to 140.163.4.231:8080
15:18:01:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
15:18:01:WU02:FS01:Connecting to 140.163.4.231:80
15:19:51:WU00:FS01:Connecting to 65.254.110.245:8080
15:19:51:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
15:19:51:WU00:FS01:Connecting to 18.218.241.186:80
15:19:52:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
15:19:52:ERROR:WU00:FS01:Exception: Could not get an assignment
15:20:12:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: Connection timed out
15:20:12:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:20:12:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:20:12:WU02:FS01:Connecting to 140.163.4.231:8080
15:22:54:WU02:FS01:Upload 0.43%
15:22:54:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:22:54:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:22:54:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:22:54:WU02:FS01:Connecting to 140.163.4.231:8080
15:25:05:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
15:25:05:WU02:FS01:Connecting to 140.163.4.231:80
15:26:42:WU00:FS01:Connecting to 65.254.110.245:8080
15:26:42:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
15:26:42:WU00:FS01:Connecting to 18.218.241.186:80
15:26:42:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
15:26:42:ERROR:WU00:FS01:Exception: Could not get an assignment
15:27:16:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: Connection timed out
15:27:16:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:27:16:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:27:16:WU02:FS01:Connecting to 140.163.4.231:8080
15:29:58:WU02:FS01:Upload 0.43%
15:29:58:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:31:30:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:31:31:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:31:31:WU02:FS01:Connecting to 140.163.4.231:8080
15:33:14:WU02:FS01:Upload 0.43%
15:33:15:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:37:47:WU00:FS01:Connecting to 65.254.110.245:8080
15:37:48:WU00:FS01:Assigned to work server 140.163.4.231
15:37:48:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:Baffin [Polaris11] from 140.163.4.231
15:37:48:WU00:FS01:Connecting to 140.163.4.231:8080
15:38:22:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:38:22:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:38:22:WU02:FS01:Connecting to 140.163.4.231:8080
15:39:58:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
15:39:58:WU00:FS01:Connecting to 140.163.4.231:80
15:40:14:WU02:FS01:Upload 0.43%
15:40:14:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:42:09:ERROR:WU00:FS01:Exception: Failed to connect to 140.163.4.231:80: Connection timed out
15:49:27:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
15:49:27:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:49:27:WU02:FS01:Connecting to 140.163.4.231:8080
15:51:34:WU02:FS01:Upload 0.43%
15:51:34:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:55:44:WU00:FS01:Connecting to 65.254.110.245:8080
15:55:45:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
15:55:45:WU00:FS01:Connecting to 18.218.241.186:80
15:55:45:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
15:55:45:ERROR:WU00:FS01:Exception: Could not get an assignment
16:07:24:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
16:07:24:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
16:07:24:WU02:FS01:Connecting to 140.163.4.231:8080
16:09:35:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
16:09:35:WU02:FS01:Connecting to 140.163.4.231:80
16:11:47:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: Connection timed out
16:24:46:WU00:FS01:Connecting to 65.254.110.245:8080
16:24:47:WU00:FS01:Assigned to work server 128.252.203.10
16:24:47:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:Baffin [Polaris11] from 128.252.203.10
16:24:47:WU00:FS01:Connecting to 128.252.203.10:8080
16:26:58:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
16:26:58:WU00:FS01:Connecting to 128.252.203.10:80
16:29:09:ERROR:WU00:FS01:Exception: Failed to connect to 128.252.203.10:80: Connection timed out
16:36:26:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11750 run:0 clone:2592 gen:8 core:0x22 unit:0x0000001c8ca304e75e6a802851b79d8f
16:36:26:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
16:36:26:WU02:FS01:Connecting to 140.163.4.231:8080
16:38:06:WU02:FS01:Upload 0.43%
16:38:06:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
17:11:46:WU00:FS01:Connecting to 65.254.110.245:8080
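A log like the one above is easier to report if the retries are counted. The following is a rough sketch that tallies upload attempts per work server and failed sends per work unit. It assumes the "Uploading ... to <IP>" and "Failed to send results to work server" wording shown above; `upload_summary` is a hypothetical helper name, not a FAHClient tool.

```python
import re
from collections import Counter

# Both patterns are inferred from the log lines pasted in this thread.
UPLOADING_RE = re.compile(
    r":(WU\d+):FS\d+:Uploading \S+ to (\d+\.\d+\.\d+\.\d+)$"
)
FAILED_RE = re.compile(
    r":WARNING:(WU\d+):FS\d+:Exception: Failed to send results to work server"
)


def upload_summary(log_text):
    """Count upload attempts per (WU, server) and failed sends per WU."""
    attempts = Counter()  # (wu, server) -> number of "Uploading ... to" lines
    failures = Counter()  # wu -> number of "Failed to send results" warnings
    for line in log_text.splitlines():
        line = line.strip()
        m = UPLOADING_RE.search(line)
        if m:
            attempts[(m.group(1), m.group(2))] += 1
            continue
        m = FAILED_RE.search(line)
        if m:
            failures[m.group(1)] += 1
    return attempts, failures


sample = """\
15:14:10:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:15:50:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
15:15:50:WU02:FS01:Uploading 14.51MiB to 140.163.4.231
15:20:12:WARNING:WU02:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.231:80: Connection timed out
"""

attempts, failures = upload_summary(sample)
print(dict(attempts))
print(dict(failures))
```

Pasting the whole log into `sample` gives a count of attempts and failures that can be quoted in a post instead of the full log.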
Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 5:47 pm
by iceman1992
Yeah, me too. I haven't folded a WU today because I kept getting assigned to the same server.
Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 6:34 pm
by bruce
Can somebody confirm yes/no: Has a CS been added? (one answer is enough)
Re: Can't upload to 140.163.4.231 again
Posted: Fri Apr 10, 2020 10:23 pm
by PantherX
Yes, fah4.eastus.cloudapp.azure.com
Re: Can't upload to 140.163.4.231 again
Posted: Sun Apr 12, 2020 1:02 am
by GeekFantasy
Glad this server got fixed with a proper CS added. I have had many successful WUs come down and up from this server now that it's back in full swing. Thanks to all those involved in fixing it.
40.114.52.201 has 0 bytes storage!
Posted: Mon Apr 13, 2020 9:54 pm
by L0w3r
According to https://apps.foldingathome.org/serverstats, 40.114.52.201 has 0 bytes of storage.
Perhaps this is why I haven't been able to upload more than 60% of a WU to it for the last 2 days?
Re: 40.114.52.201 has 0 bytes storage!
Posted: Mon Apr 13, 2020 11:36 pm
by greblos
I've noticed this as well. I have a work unit that has been trying to upload to the 40.114.52.201 server since 14:22 UTC on March 12. It has no issues connecting, but it gets a 464 response after each upload attempt. Here's the latest attempt:
Code:
23:29:07:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:13850 run:0 clone:1784 gen:28 core:0xa7 unit:0x00000029287234c95e725943f0860ba4
23:29:07:WU01:FS00:Uploading 2.48MiB to 40.114.52.201
23:29:07:WU01:FS00:Connecting to 40.114.52.201:8080
23:29:13:WU01:FS00:Upload 70.70%
23:29:16:WU01:FS00:Upload complete
23:29:16:WU01:FS00:Server responded PLEASE_WAIT (464)
23:29:16:WARNING:WU01:FS00:Failed to send results, will try again later
Re: 40.114.52.201 has 0 bytes storage!
Posted: Tue Apr 14, 2020 1:06 am
by L0w3r
It was rebooted 36 minutes ago and now has 22.6GB of storage to accept new work.
My WU was accepted.
01:02:07:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13851 run:0 clone:4753 gen:13 core:0xa7 unit:0x0000001d287234c95e72581fd43cf311
01:02:07:WU00:FS01:Uploading 2.54MiB to 40.114.52.201
01:02:07:WU00:FS01:Connecting to 40.114.52.201:8080
01:02:13:WU00:FS01:Upload 61.53%
01:02:16:WU00:FS01:Upload complete
01:02:19:WU00:FS01:Server responded WORK_ACK (400)
01:02:19:WU00:FS01:Final credit estimate, 1382.00 points
01:02:19:WU00:FS01:Cleaning up
01:02:19:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
Re: Can't upload to 140.163.4.231 again
Posted: Tue Apr 14, 2020 2:30 am
by vmzy
I have been encountering the 'Server responded PLEASE_WAIT (464)' problem for over 48 hours.
Code:
00:00:48:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11751 run:0 clone:6782 gen:14 core:0x22 unit:0x0000001f8ca304e75e6bbbd247e4f258
00:00:48:WU00:FS01:Uploading 14.66MiB to 140.163.4.231
00:00:48:WU00:FS01:Connecting to 140.163.4.231:8080
00:00:54:WU00:FS01:Upload 2.98%
00:01:00:WU00:FS01:Upload 6.82%
00:01:06:WU00:FS01:Upload 9.81%
00:01:12:WU00:FS01:Upload 11.94%
00:01:18:WU00:FS01:Upload 15.78%
00:01:24:WU00:FS01:Upload 17.91%
00:01:30:WU00:FS01:Upload 21.75%
00:01:36:WU00:FS01:Upload 24.30%
00:01:42:WU00:FS01:Upload 27.29%
00:01:48:WU00:FS01:Upload 29.85%
00:01:54:WU00:FS01:Upload 33.68%
00:02:00:WU00:FS01:Upload 37.10%
00:02:07:WU00:FS01:Upload 40.93%
00:02:13:WU00:FS01:Upload 44.77%
00:02:19:WU00:FS01:Upload 46.90%
00:02:25:WU00:FS01:Upload 50.31%
00:02:31:WU00:FS01:Upload 55.86%
00:02:37:WU00:FS01:Upload 59.69%
00:02:43:WU00:FS01:Upload 63.11%
00:02:49:WU00:FS01:Upload 64.81%
00:02:55:WU00:FS01:Upload 68.65%
00:03:01:WU00:FS01:Upload 71.21%
00:03:07:WU00:FS01:Upload 73.34%
00:03:13:WU00:FS01:Upload 76.75%
00:03:19:WU00:FS01:Upload 82.29%
00:03:25:WU00:FS01:Upload 85.70%
00:03:32:WU00:FS01:Upload 88.26%
00:03:38:WU00:FS01:Upload 91.67%
00:03:44:WU00:FS01:Upload 95.51%
00:03:50:WU00:FS01:Upload complete
00:03:50:WU00:FS01:Server responded PLEASE_WAIT (464)
00:03:50:WARNING:WU00:FS01:Failed to send results, will try again later
Re: 40.114.52.201 has 0 bytes storage!
Posted: Tue Apr 14, 2020 3:09 am
by PantherX
It currently has 14 GB of free space, but its status is Accept, which means that it will only accept completed WUs. Hopefully that will be enough space for the returned WUs.
Re: Can't upload to 140.163.4.231 again
Posted: Tue Apr 14, 2020 5:26 am
by bruce
The PLEASE_WAIT issue is a known problem. FAHClient should retry the upload. If, after a few tries, it still has not been accepted, please add that information below.
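To keep the three responses in this thread apart, here is a small lookup sketch. It only records what the pasted logs show the client doing after each response (WORK_ACK 400, WORK_QUIT 404, PLEASE_WAIT 464); it is an assumption-laden summary of this thread, not an official list of Folding@home response codes, and `describe` is a made-up helper name.

```python
# Client reactions exactly as they appear in the logs quoted in this thread.
# Not an official or complete list of Folding@home server responses.
OBSERVED_RESPONSES = {
    400: ("WORK_ACK", "accepted: credit estimate logged, then cleaning up"),
    404: ("WORK_QUIT", "rejected: 'Server did not like results, dumping'"),
    464: ("PLEASE_WAIT", "deferred: 'Failed to send results, will try again later'"),
}


def describe(code):
    """Return a one-line description of a response code seen in this thread."""
    name, outcome = OBSERVED_RESPONSES.get(
        code, ("UNKNOWN", "not seen in this thread")
    )
    return f"{name} ({code}): {outcome}"


for code in (400, 404, 464):
    print(describe(code))
```

The practical difference is that, in these logs, a 464 leaves the work unit queued for another attempt, while a 404 ends with the results being dumped.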
Re: Can't upload to 140.163.4.231 again
Posted: Tue Apr 14, 2020 9:52 am
by vmzy
bruce wrote: The PLEASE_WAIT issue is a known problem. FAHClient should retry the upload. If, after a few tries, it still has not been accepted, please add that information below.
It has retried over 15 times in the past two days.
Re: Can't upload to 140.163.4.231 again
Posted: Tue Apr 14, 2020 11:45 am
by Neil-B
Hopefully a few more people will be around today after the extended weekend, and a number of these types of issues will get resolved (fingers crossed for you).