03:46:20:WU01:FS00:0x22:Completed 1000000 out of 1000000 steps (100%)
03:46:26:WU01:FS00:0x22:Saving result file ../logfile_01.txt
03:46:26:WU01:FS00:0x22:Saving result file checkpointState.xml
03:46:26:WU01:FS00:0x22:Saving result file checkpt.crc
03:46:26:WU01:FS00:0x22:Saving result file positions.xtc
03:46:26:WU01:FS00:0x22:Saving result file science.log
03:46:26:WU01:FS00:0x22:Folding@home Core Shutdown: FINISHED_UNIT
03:46:26:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
03:46:26:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:11753 run:0 clone:363 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d76bf9d7e206a
03:46:26:WU01:FS00:Uploading 49.92MiB to 155.247.164.213
03:46:26:WU01:FS00:Connecting to 155.247.164.213:8080
03:46:26:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
03:46:26:WU01:FS00:Trying to send results to collection server
03:46:26:WU01:FS00:Uploading 49.92MiB to 155.247.164.214
03:46:26:WU01:FS00:Connecting to 155.247.164.214:8080
03:46:27:ERROR:WU01:FS00:Exception: Transfer failed
... Multiple Attempts ...
06:46:16:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:11753 run:0 clone:363 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d76bf9d7e206a
06:46:16:WU01:FS00:Uploading 49.92MiB to 155.247.164.213
06:46:16:WU01:FS00:Connecting to 155.247.164.213:8080
06:46:16:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
06:46:16:WU01:FS00:Trying to send results to collection server
06:46:16:WU01:FS00:Uploading 49.92MiB to 155.247.164.214
06:46:16:WU01:FS00:Connecting to 155.247.164.214:8080
06:46:17:ERROR:WU01:FS00:Exception: Transfer failed
06:41:01:WU05:FS02:0x22:Completed 1960000 out of 2000000 steps (98%)
06:41:02:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:248 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d770fce597dbe
06:41:02:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
06:41:02:WU02:FS02:Connecting to 155.247.164.213:8080
06:41:02:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
06:41:02:WU02:FS02:Trying to send results to collection server
06:41:02:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
06:41:02:WU02:FS02:Connecting to 155.247.164.214:8080
06:41:03:ERROR:WU02:FS02:Exception: Transfer failed
06:47:54:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:248 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d770fce597dbe
06:47:54:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
06:47:54:WU02:FS02:Connecting to 155.247.164.213:8080
06:47:54:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
06:47:54:WU02:FS02:Trying to send results to collection server
06:47:54:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
06:47:54:WU02:FS02:Connecting to 155.247.164.214:8080
06:47:54:ERROR:WU02:FS02:Exception: Transfer failed
04:05:17:WU02:FS02:0x22:Saving result file ../logfile_01.txt
04:05:17:WU02:FS02:0x22:Saving result file checkpointState.xml
04:05:17:WU02:FS02:0x22:Saving result file checkpt.crc
04:05:17:WU02:FS02:0x22:Saving result file positions.xtc
04:05:17:WU02:FS02:0x22:Saving result file science.log
04:05:17:WU02:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
04:05:17:WU02:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
04:05:17:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:248 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d770fce597dbe
04:05:17:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
04:05:17:WU02:FS02:Connecting to 155.247.164.213:8080
04:05:18:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
04:05:18:WU02:FS02:Trying to send results to collection server
04:05:18:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
04:05:18:WU02:FS02:Connecting to 155.247.164.214:8080
04:05:18:ERROR:WU02:FS02:Exception: Transfer failed
... Multiple Attempts ...
06:59:59:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:248 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d770fce597dbe
06:59:59:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
06:59:59:WU02:FS02:Connecting to 155.247.164.213:8080
06:59:59:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
06:59:59:WU02:FS02:Trying to send results to collection server
06:59:59:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
06:59:59:WU02:FS02:Connecting to 155.247.164.214:8080
06:59:59:ERROR:WU02:FS02:Exception: Transfer failed
It's the middle of the night everywhere in the USA right now, and volunteers should be sleeping ... preparing their immune systems for another day of exposure to some random viruses.
I don't know who will be responsible for figuring out why the servers are down and fixing them, but it won't happen until tomorrow. Without more info, I don't know if it's the responsibility of the FAH team at temple.edu or the campus network support folks.
suchamoneypit wrote:My clients can't connect to .214, so it is down. Are we able to switch servers so we can fold ? Or is waiting the only option.
Most likely this is due to the sheer number of new donors as a result of both Intel and Nvidia tweeting about the PC Master Race on Reddit. It's a "good thing". Sort of like when the kid who had no friends just wanted a greeting card for his birthday, and people in the community responded and sent him a card, or rather enough cards to swim in.
If you still want to contribute, just leave your machine running. Eventually it should be able to pick up new WUs and start folding, provided that it has been configured correctly (which in most cases likely just means the defaults).
Also I noticed that 155.247.164.214's status is set to "Assign", while it's the collection server for my WU that's been trying to send for a couple of hours. It appears to be up but not accepting work units.
One of my machines has been trying to submit a WU for 11758 for over 24hrs now (72 attempts); during Sunday I noted that the servers (213 & 214) were mostly down, but as of this morning they seem to be up according to the server status page, yet I am still getting this (UTC time):
07:48:52:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
07:48:52:WU00:FS01:Connecting to 155.247.164.213:8080
07:48:52:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
07:48:52:WU00:FS01:Trying to send results to collection server
07:48:52:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
07:48:52:WU00:FS01:Connecting to 155.247.164.214:8080
07:48:56:ERROR:WU00:FS01:Exception: Transfer failed
07:53:06:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1756 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d771303ec7ef7
07:53:06:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
07:53:06:WU00:FS01:Connecting to 155.247.164.213:8080
07:53:07:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
07:53:07:WU00:FS01:Trying to send results to collection server
07:53:07:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
07:53:07:WU00:FS01:Connecting to 155.247.164.214:8080
07:53:07:ERROR:WU00:FS01:Exception: Transfer failed
07:59:58:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1756 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d771303ec7ef7
07:59:58:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
07:59:58:WU00:FS01:Connecting to 155.247.164.213:8080
07:59:58:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
07:59:58:WU00:FS01:Trying to send results to collection server
07:59:58:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
07:59:58:WU00:FS01:Connecting to 155.247.164.214:8080
07:59:59:ERROR:WU00:FS01:Exception: Transfer failed
08:11:03:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1756 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d771303ec7ef7
08:11:03:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
08:11:03:WU00:FS01:Connecting to 155.247.164.213:8080
08:11:04:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
08:11:04:WU00:FS01:Trying to send results to collection server
08:11:04:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
08:11:04:WU00:FS01:Connecting to 155.247.164.214:8080
08:11:04:ERROR:WU00:FS01:Exception: Transfer failed
08:29:00:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1756 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d771303ec7ef7
08:29:00:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
08:29:00:WU00:FS01:Connecting to 155.247.164.213:8080
08:29:01:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
08:29:01:WU00:FS01:Trying to send results to collection server
08:29:01:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
08:29:01:WU00:FS01:Connecting to 155.247.164.214:8080
08:29:01:ERROR:WU00:FS01:Exception: Transfer failed
I've reset the retry timer multiple times as it has extended well beyond 1hr (log indicates it's been up over 2hrs).
The same machine (and folding slot) has carried on and completed multiple other WUs while trying to send this one, so it's not blocking anything. I'm just trying to figure out why it fails to submit the work even now, when the servers are reported to be up.
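For anyone who wants to count failed upload attempts from their own log rather than by eye, here is a minimal Python sketch. It assumes the log lines look exactly like the ones pasted above (timestamp, ERROR, WU, slot, "Exception: Transfer failed"); the function name and the sample list are only for illustration and are not part of FAHClient.

```python
import re
from collections import Counter

# Matches the final failure line of one upload attempt, for example:
# 07:48:56:ERROR:WU00:FS01:Exception: Transfer failed
FAILED = re.compile(
    r"^(\d\d:\d\d:\d\d):ERROR:(WU\d+):(FS\d+):Exception: Transfer failed"
)

def count_failed_uploads(lines):
    """Count failed upload attempts per (work unit, folding slot)."""
    counts = Counter()
    for line in lines:
        match = FAILED.match(line.strip())
        if match:
            counts[(match.group(2), match.group(3))] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "07:48:52:WU00:FS01:Uploading 55.24MiB to 155.247.164.213",
    "07:48:52:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed",
    "07:48:52:WU00:FS01:Trying to send results to collection server",
    "07:48:56:ERROR:WU00:FS01:Exception: Transfer failed",
    "07:53:07:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed",
    "07:53:07:ERROR:WU00:FS01:Exception: Transfer failed",
]

failed = count_failed_uploads(sample)
for (wu, slot), n in sorted(failed.items()):
    print(f"{wu} {slot}: {n} failed attempts")
```

Only the ERROR line is counted, because the WARNING line just marks the work server failing before the client falls back to the collection server.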
But why is the Estimated Credit constantly being reduced? The work was done earlier, and now the credit is shrinking minute by minute while the Collection Server 155.247.164.214 is not receiving any data. That's not OK.
10:13:31:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1756 gen:0 core:0x22 unit:0x000000009bf7a4d55e6d771303ec7ef7
10:13:31:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
10:13:31:WU00:FS01:Connecting to 155.247.164.213:8080
10:13:31:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
10:13:31:WU00:FS01:Trying to send results to collection server
10:13:31:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
10:13:31:WU00:FS01:Connecting to 155.247.164.214:8080
10:13:32:ERROR:WU00:FS01:Exception: Transfer failed
10:13:37:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
10:13:37:WU02:FS01:Connecting to 128.252.203.10:80
10:14:02:ERROR:WU02:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
10:14:16:WU02:FS01:Connecting to 65.254.110.245:8080
10:14:16:WU02:FS01:Assigned to work server 155.247.164.213
10:14:16:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:TU106M [GeForce RTX 2060 Mobile] from 155.247.164.213
10:14:16:WU02:FS01:Connecting to 155.247.164.213:8080
10:14:27:WU02:FS01:Downloading 11.98MiB
10:14:33:WU02:FS01:Download 84.00%
10:14:34:WU02:FS01:Download complete
10:14:34:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11753 run:0 clone:3574 gen:1 core:0x22 unit:0x000000029bf7a4d55e6d76caa76041b9
10:14:34:WU02:FS01:Starting
suchamoneypit wrote:My clients can't connect to .214, so it is down. Are we able to switch servers so we can fold ? Or is waiting the only option.
I would be interested in this as well. Instead of trying again and again to connect to a certain server, with each retry coming after a longer interval, is it possible to leave that server be for a bit (let's say after 3 or 4 failed attempts) and find another one?
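To make the idea concrete, here is a rough Python sketch of "give up on a server after a few failures and move to the next one". This is not how the client behaves (the logs above show it only retrying the work server and then the collection server assigned to the WU), and every name here, including the third server, is hypothetical.

```python
def upload_with_failover(servers, send, max_attempts_per_server=3):
    """Try each server in order; move on after max_attempts_per_server failures.

    `send` is any callable that takes a server address and returns True on success.
    Returns (server, attempt_number) on success, or (None, 0) if every server failed.
    """
    for server in servers:
        for attempt in range(1, max_attempts_per_server + 1):
            if send(server):
                return server, attempt
    return None, 0

# Simulated outage: both Temple servers refuse uploads, a made-up third one accepts.
down = {"155.247.164.213", "155.247.164.214"}
tried = []

def fake_send(server):
    tried.append(server)
    return server not in down

result = upload_with_failover(
    ["155.247.164.213", "155.247.164.214", "backup.example.invalid"],
    fake_send,
)
print(result, len(tried))
```

A real client would also need a wait between attempts and a server that is actually able to accept results for that project, which this sketch leaves out.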
It's the middle of the night everywhere in the USA right now, and volunteers should be sleeping ... preparing their immune systems for another day of exposure to some random viruses.
I don't know who will be responsible for figuring out why the servers are down and fixing them, but it won't happen until tomorrow. Without more info, I don't know if it's the responsibility of the FAH team at temple.edu or the campus network support folks.
The Estimated Credit is now dropping every second. That's not fair; FAH should change that. The donors are very patient and want to support FAH. They should not be punished for a server failure.
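As for why the estimate keeps dropping: as far as I understand it, the quick-return bonus scales with the square root of (k x deadline / elapsed time), and the clock keeps running until the result is actually received. A small Python sketch under that assumption follows. The base points, k factor and deadline below are made-up numbers, not the real values for project 11753 or 11758, and the sketch ignores the timeout cutoff.

```python
import math

def estimated_credit(base_points, k_factor, deadline_days, elapsed_days):
    """Assumed bonus form: base * max(1, sqrt(k * deadline / elapsed)).

    All inputs are illustrative; real projects publish their own values.
    """
    bonus = math.sqrt(k_factor * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

# Hypothetical WU: 1000 base points, k = 4, 4-day deadline.
after_one_day = estimated_credit(1000, 4, 4, 1)
after_four_days = estimated_credit(1000, 4, 4, 4)
after_sixteen_days = estimated_credit(1000, 4, 4, 16)
print(after_one_day, after_four_days, after_sixteen_days)
```

Under this form the estimate falls quickly at first and then flattens out, and it never goes below the base points.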
Will these servers be back again?
I have been trying to upload a finished project for almost 2 days but keep getting an error message in the log file. I've been running a 24/7 system.
Project number: 11753
Last edited by GeriCom76 on Mon Mar 16, 2020 3:45 pm, edited 1 time in total.