Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 2:17 am
by jpstep2
I have a pending WU for project 11758 that is stuck on send. It was completed on my system over five days ago. It expires 2020-03-25T10:55:45Z. It does not prevent me from downloading or sending a new WU. Please advise on what can be done to resolve this issue. If this WU expires, will it just go away on its own? Thanks in advance.

Latest log entry on send failure:

01:28:50:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:730 gen:0 core:0x22 unit:0x000000029bf7a4d55e6d7710dfff252c
01:28:50:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
01:28:50:WU01:FS01:Connecting to 155.247.164.213:8080
01:28:51:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
01:28:51:WU01:FS01:Trying to send results to collection server
01:28:51:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
01:28:51:WU01:FS01:Connecting to 155.247.164.214:8080
01:28:51:ERROR:WU01:FS01:Exception: Transfer failed

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 2:52 am
by Joe_H
Yes, it will go away on its own after expiring. There is a known issue with this project and the servers; it is being looked into.

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 9:26 pm
by bruce
This WU seems to be corrupt. It has been returned twice with errors. Your log shows the upload is 55.24MiB, which is probably too big to upload, likely because it contains too many errors.

Can you find the portion of FAH's log showing what happened when it was processing? If FAH's log is still active, the filters in FAHControl will allow you to narrow your search. If FAHClient has been restarted, it's more difficult to do this from the older logs in the /logs directory.
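
For older logs that FAHControl no longer shows, a small script can do roughly the same filtering. This is only a sketch: the line shape (`HH:MM:SS`, an optional `WARNING`/`ERROR` level, then `WUnn:FSnn:`) is inferred from the log excerpts in this thread, and `filter_log` is a hypothetical helper name, not part of FAHClient or FAHControl.

```python
import re

# Line shape inferred from the logs quoted in this thread (an assumption):
#   HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>WARNING|ERROR):)?"
    r"WU(?P<wu>\d{2}):FS(?P<fs>\d{2}):(?P<msg>.*)$"
)


def filter_log(lines, wu, fs):
    """Return only the lines that belong to work unit `wu` on folding slot `fs`.

    Lines that do not match the assumed shape (for example client start-up
    banners) are skipped.
    """
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        match = LINE_RE.match(text)
        if match and int(match.group("wu")) == wu and int(match.group("fs")) == fs:
            kept.append(text)
    return kept
```

Calling `filter_log(open(path), wu=2, fs=2)` on a saved log (where `path` is wherever your copy of the log lives) would keep only the `WU02:FS02` lines, including any warnings and errors for that unit.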

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 9:45 pm
by jonault
I have an 11758 WU which has also been stuck failing to upload for several days now. It's the same size as jpstep2's. I dug up the log entries for it and don't see anything unusual, but FWIW here they are. Slot 2 on that machine is an RTX 2080 Ti.

08:49:37:WU02:FS02:0x22:*********************** Log Started 2020-03-18T08:49:37Z ***********************
08:49:37:WU02:FS02:0x22:*************************** Core22 Folding@home Core ***************************
08:49:37:WU02:FS02:0x22:       Type: 0x22
08:49:37:WU02:FS02:0x22:       Core: Core22
08:49:37:WU02:FS02:0x22:    Website: https://foldingathome.org/
08:49:37:WU02:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
08:49:37:WU02:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
08:49:37:WU02:FS02:0x22:             <rafal.wiewiora@choderalab.org>
08:49:37:WU02:FS02:0x22:       Args: -dir 02 -suffix 01 -version 705 -lifeline 17076 -checkpoint 15
08:49:37:WU02:FS02:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
08:49:37:WU02:FS02:0x22:             0 -gpu 0
08:49:37:WU02:FS02:0x22:     Config: <none>
08:49:37:WU02:FS02:0x22:************************************ Build *************************************
08:49:37:WU02:FS02:0x22:    Version: 0.0.2
08:49:37:WU02:FS02:0x22:       Date: Dec 6 2019
08:49:37:WU02:FS02:0x22:       Time: 21:30:31
08:49:37:WU02:FS02:0x22: Repository: Git
08:49:37:WU02:FS02:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
08:49:37:WU02:FS02:0x22:     Branch: HEAD
08:49:37:WU02:FS02:0x22:   Compiler: Visual C++ 2008
08:49:37:WU02:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
08:49:37:WU02:FS02:0x22:   Platform: win32 10
08:49:37:WU02:FS02:0x22:       Bits: 64
08:49:37:WU02:FS02:0x22:       Mode: Release
08:49:37:WU02:FS02:0x22:************************************ System ************************************
08:49:37:WU02:FS02:0x22:        CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
08:49:37:WU02:FS02:0x22:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 12
08:49:37:WU02:FS02:0x22:       CPUs: 8
08:49:37:WU02:FS02:0x22:     Memory: 15.92GiB
08:49:37:WU02:FS02:0x22:Free Memory: 11.35GiB
08:49:37:WU02:FS02:0x22:    Threads: WINDOWS_THREADS
08:49:37:WU02:FS02:0x22: OS Version: 6.2
08:49:37:WU02:FS02:0x22:Has Battery: true
08:49:37:WU02:FS02:0x22: On Battery: false
08:49:37:WU02:FS02:0x22: UTC Offset: -5
08:49:37:WU02:FS02:0x22:        PID: 19248
08:49:37:WU02:FS02:0x22:        CWD: C:\Users\Jon\AppData\FAHClient\work
08:49:37:WU02:FS02:0x22:         OS: Windows 10 Pro
08:49:37:WU02:FS02:0x22:    OS Arch: AMD64
08:49:37:WU02:FS02:0x22:********************************************************************************
08:49:37:WU02:FS02:0x22:Project: 11758 (Run 0, Clone 1033, Gen 0)
08:49:37:WU02:FS02:0x22:Unit: 0x000000039bf7a4d55e6d7711b29ac3b3
08:49:37:WU02:FS02:0x22:Reading tar file core.xml
08:49:37:WU02:FS02:0x22:Reading tar file integrator.xml
08:49:37:WU02:FS02:0x22:Reading tar file state.xml
08:49:38:WU02:FS02:0x22:Reading tar file system.xml
08:49:38:WU02:FS02:0x22:Digital signatures verified
08:49:38:WU02:FS02:0x22:Folding@home GPU Core22 Folding@home Core
08:49:38:WU02:FS02:0x22:Version 0.0.2
08:49:47:WU00:FS02:Upload 0.85%
08:49:52:WU02:FS02:0x22:Completed 0 out of 1000000 steps (0%)
08:49:52:WU02:FS02:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
08:50:00:WU00:FS02:Upload 26.01%
08:50:14:WU00:FS02:Upload 66.53%
08:50:44:WU02:FS02:0x22:Completed 10000 out of 1000000 steps (1%)
08:51:08:WU00:FS02:Upload complete
08:51:08:WU00:FS02:Server responded WORK_ACK (400)
08:51:08:WU00:FS02:Final credit estimate, 97135.00 points
08:51:08:WU00:FS02:Cleaning up
08:51:36:WU02:FS02:0x22:Completed 20000 out of 1000000 steps (2%)
08:52:27:WU02:FS02:0x22:Completed 30000 out of 1000000 steps (3%)
08:53:18:WU02:FS02:0x22:Completed 40000 out of 1000000 steps (4%)
08:54:09:WU02:FS02:0x22:Completed 50000 out of 1000000 steps (5%)
08:55:05:WU02:FS02:0x22:Completed 60000 out of 1000000 steps (6%)
08:55:56:WU02:FS02:0x22:Completed 70000 out of 1000000 steps (7%)
08:56:48:WU02:FS02:0x22:Completed 80000 out of 1000000 steps (8%)
08:57:39:WU02:FS02:0x22:Completed 90000 out of 1000000 steps (9%)
08:58:31:WU02:FS02:0x22:Completed 100000 out of 1000000 steps (10%)
08:59:26:WU02:FS02:0x22:Completed 110000 out of 1000000 steps (11%)
09:00:18:WU02:FS02:0x22:Completed 120000 out of 1000000 steps (12%)
09:01:09:WU02:FS02:0x22:Completed 130000 out of 1000000 steps (13%)
09:02:01:WU02:FS02:0x22:Completed 140000 out of 1000000 steps (14%)
09:02:52:WU02:FS02:0x22:Completed 150000 out of 1000000 steps (15%)
09:03:48:WU02:FS02:0x22:Completed 160000 out of 1000000 steps (16%)
09:04:39:WU02:FS02:0x22:Completed 170000 out of 1000000 steps (17%)
09:05:31:WU02:FS02:0x22:Completed 180000 out of 1000000 steps (18%)
09:06:23:WU02:FS02:0x22:Completed 190000 out of 1000000 steps (19%)
09:07:14:WU02:FS02:0x22:Completed 200000 out of 1000000 steps (20%)
09:08:10:WU02:FS02:0x22:Completed 210000 out of 1000000 steps (21%)
09:09:01:WU02:FS02:0x22:Completed 220000 out of 1000000 steps (22%)
09:09:53:WU02:FS02:0x22:Completed 230000 out of 1000000 steps (23%)
09:10:44:WU02:FS02:0x22:Completed 240000 out of 1000000 steps (24%)
09:11:36:WU02:FS02:0x22:Completed 250000 out of 1000000 steps (25%)
09:12:32:WU02:FS02:0x22:Completed 260000 out of 1000000 steps (26%)
09:13:23:WU02:FS02:0x22:Completed 270000 out of 1000000 steps (27%)
09:14:14:WU02:FS02:0x22:Completed 280000 out of 1000000 steps (28%)
09:15:06:WU02:FS02:0x22:Completed 290000 out of 1000000 steps (29%)
09:15:57:WU02:FS02:0x22:Completed 300000 out of 1000000 steps (30%)
09:16:53:WU02:FS02:0x22:Completed 310000 out of 1000000 steps (31%)
09:17:45:WU02:FS02:0x22:Completed 320000 out of 1000000 steps (32%)
09:18:36:WU02:FS02:0x22:Completed 330000 out of 1000000 steps (33%)
09:19:28:WU02:FS02:0x22:Completed 340000 out of 1000000 steps (34%)
09:20:19:WU02:FS02:0x22:Completed 350000 out of 1000000 steps (35%)
09:21:15:WU02:FS02:0x22:Completed 360000 out of 1000000 steps (36%)
09:22:06:WU02:FS02:0x22:Completed 370000 out of 1000000 steps (37%)
09:22:58:WU02:FS02:0x22:Completed 380000 out of 1000000 steps (38%)
09:23:49:WU02:FS02:0x22:Completed 390000 out of 1000000 steps (39%)
09:24:41:WU02:FS02:0x22:Completed 400000 out of 1000000 steps (40%)
09:25:37:WU02:FS02:0x22:Completed 410000 out of 1000000 steps (41%)
09:26:28:WU02:FS02:0x22:Completed 420000 out of 1000000 steps (42%)
09:27:19:WU02:FS02:0x22:Completed 430000 out of 1000000 steps (43%)
09:28:11:WU02:FS02:0x22:Completed 440000 out of 1000000 steps (44%)
09:29:02:WU02:FS02:0x22:Completed 450000 out of 1000000 steps (45%)
09:29:58:WU02:FS02:0x22:Completed 460000 out of 1000000 steps (46%)
09:30:49:WU02:FS02:0x22:Completed 470000 out of 1000000 steps (47%)
09:31:41:WU02:FS02:0x22:Completed 480000 out of 1000000 steps (48%)
09:32:33:WU02:FS02:0x22:Completed 490000 out of 1000000 steps (49%)
09:33:24:WU02:FS02:0x22:Completed 500000 out of 1000000 steps (50%)
09:34:20:WU02:FS02:0x22:Completed 510000 out of 1000000 steps (51%)
09:35:11:WU02:FS02:0x22:Completed 520000 out of 1000000 steps (52%)
09:36:02:WU02:FS02:0x22:Completed 530000 out of 1000000 steps (53%)
09:36:54:WU02:FS02:0x22:Completed 540000 out of 1000000 steps (54%)
09:37:45:WU02:FS02:0x22:Completed 550000 out of 1000000 steps (55%)
09:38:41:WU02:FS02:0x22:Completed 560000 out of 1000000 steps (56%)
09:39:32:WU02:FS02:0x22:Completed 570000 out of 1000000 steps (57%)
09:40:24:WU02:FS02:0x22:Completed 580000 out of 1000000 steps (58%)
09:41:15:WU02:FS02:0x22:Completed 590000 out of 1000000 steps (59%)
09:42:07:WU02:FS02:0x22:Completed 600000 out of 1000000 steps (60%)
09:43:03:WU02:FS02:0x22:Completed 610000 out of 1000000 steps (61%)
09:43:54:WU02:FS02:0x22:Completed 620000 out of 1000000 steps (62%)
09:44:45:WU02:FS02:0x22:Completed 630000 out of 1000000 steps (63%)
09:45:37:WU02:FS02:0x22:Completed 640000 out of 1000000 steps (64%)
09:46:28:WU02:FS02:0x22:Completed 650000 out of 1000000 steps (65%)
09:47:24:WU02:FS02:0x22:Completed 660000 out of 1000000 steps (66%)
09:48:16:WU02:FS02:0x22:Completed 670000 out of 1000000 steps (67%)
09:49:07:WU02:FS02:0x22:Completed 680000 out of 1000000 steps (68%)
09:49:59:WU02:FS02:0x22:Completed 690000 out of 1000000 steps (69%)
09:50:50:WU02:FS02:0x22:Completed 700000 out of 1000000 steps (70%)
09:51:46:WU02:FS02:0x22:Completed 710000 out of 1000000 steps (71%)
09:52:38:WU02:FS02:0x22:Completed 720000 out of 1000000 steps (72%)
09:53:29:WU02:FS02:0x22:Completed 730000 out of 1000000 steps (73%)
09:54:21:WU02:FS02:0x22:Completed 740000 out of 1000000 steps (74%)
09:55:12:WU02:FS02:0x22:Completed 750000 out of 1000000 steps (75%)
09:56:08:WU02:FS02:0x22:Completed 760000 out of 1000000 steps (76%)
09:56:59:WU02:FS02:0x22:Completed 770000 out of 1000000 steps (77%)
09:57:51:WU02:FS02:0x22:Completed 780000 out of 1000000 steps (78%)
09:58:43:WU02:FS02:0x22:Completed 790000 out of 1000000 steps (79%)
09:59:34:WU02:FS02:0x22:Completed 800000 out of 1000000 steps (80%)
10:00:30:WU02:FS02:0x22:Completed 810000 out of 1000000 steps (81%)
10:01:21:WU02:FS02:0x22:Completed 820000 out of 1000000 steps (82%)
10:02:13:WU02:FS02:0x22:Completed 830000 out of 1000000 steps (83%)
10:03:04:WU02:FS02:0x22:Completed 840000 out of 1000000 steps (84%)
10:03:56:WU02:FS02:0x22:Completed 850000 out of 1000000 steps (85%)
10:04:51:WU02:FS02:0x22:Completed 860000 out of 1000000 steps (86%)
10:05:43:WU02:FS02:0x22:Completed 870000 out of 1000000 steps (87%)
10:06:34:WU02:FS02:0x22:Completed 880000 out of 1000000 steps (88%)
10:07:26:WU02:FS02:0x22:Completed 890000 out of 1000000 steps (89%)
10:08:17:WU02:FS02:0x22:Completed 900000 out of 1000000 steps (90%)
10:09:13:WU02:FS02:0x22:Completed 910000 out of 1000000 steps (91%)
10:10:04:WU02:FS02:0x22:Completed 920000 out of 1000000 steps (92%)
10:10:56:WU02:FS02:0x22:Completed 930000 out of 1000000 steps (93%)
10:11:47:WU02:FS02:0x22:Completed 940000 out of 1000000 steps (94%)
10:12:39:WU02:FS02:0x22:Completed 950000 out of 1000000 steps (95%)
10:13:35:WU02:FS02:0x22:Completed 960000 out of 1000000 steps (96%)
10:14:26:WU02:FS02:0x22:Completed 970000 out of 1000000 steps (97%)
10:15:17:WU02:FS02:0x22:Completed 980000 out of 1000000 steps (98%)
10:16:09:WU02:FS02:0x22:Completed 990000 out of 1000000 steps (99%)
10:16:10:WU00:FS02:Connecting to 65.254.110.245:8080
10:16:10:WARNING:WU00:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:16:10:WU00:FS02:Connecting to 18.218.241.186:80
10:16:10:WARNING:WU00:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:16:10:ERROR:WU00:FS02:Exception: Could not get an assignment
10:16:10:WU00:FS02:Connecting to 65.254.110.245:8080
10:16:10:WARNING:WU00:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:16:10:WU00:FS02:Connecting to 18.218.241.186:80
10:16:11:WARNING:WU00:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:16:11:ERROR:WU00:FS02:Exception: Could not get an assignment
10:17:00:WU02:FS02:0x22:Completed 1000000 out of 1000000 steps (100%)
10:17:05:WU02:FS02:0x22:Saving result file ..\logfile_01.txt
10:17:05:WU02:FS02:0x22:Saving result file checkpointState.xml
10:17:05:WU02:FS02:0x22:Saving result file checkpt.crc
10:17:05:WU02:FS02:0x22:Saving result file positions.xtc
10:17:05:WU02:FS02:0x22:Saving result file science.log
10:17:05:WU02:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
10:17:05:WU02:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
10:17:05:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:1033 gen:0 core:0x22 unit:0x000000039bf7a4d55e6d7711b29ac3b3
10:17:05:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
10:17:05:WU02:FS02:Connecting to 155.247.164.213:8080
10:17:06:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
10:17:06:WU02:FS02:Trying to send results to collection server
10:17:06:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
10:17:06:WU02:FS02:Connecting to 155.247.164.214:8080
10:17:06:ERROR:WU02:FS02:Exception: Transfer failed
10:17:07:WU02:FS02:Sending unit results: id:02 state:SEND error:NO_ERROR project:11758 run:0 clone:1033 gen:0 core:0x22 unit:0x000000039bf7a4d55e6d7711b29ac3b3
10:17:07:WU02:FS02:Uploading 55.24MiB to 155.247.164.213
10:17:07:WU02:FS02:Connecting to 155.247.164.213:8080
10:17:07:WARNING:WU02:FS02:Exception: Failed to send results to work server: Transfer failed
10:17:07:WU02:FS02:Trying to send results to collection server
10:17:07:WU02:FS02:Uploading 55.24MiB to 155.247.164.214
10:17:07:WU02:FS02:Connecting to 155.247.164.214:8080
10:17:08:ERROR:WU02:FS02:Exception: Transfer failed
10:17:10:WU00:FS02:Connecting to 65.254.110.245:8080
10:17:10:WARNING:WU00:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:17:10:WU00:FS02:Connecting to 18.218.241.186:80
10:17:11:WARNING:WU00:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:17:11:ERROR:WU00:FS02:Exception: Could not get an assignment

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 9:49 pm
by bruce
The server will not accept oversized uploads. I don't know what the limit is, but 55.24MiB looks pretty big. I have no idea if that's because it contains a lot of error messages or if somebody needs to shrink the output or increase the upload limit for that server. Can you filter FAH's log and see if there are any clues that appeared during the processing of that WU?

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 9:54 pm
by jonault
bruce, that's everything that was in the log for that slot while it was folding that WU. The only things I cut out were progress reports from the other folding slot that was active.

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 10:10 pm
by bruce
OK. I've bumped it up to the project owner.

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 10:18 pm
by jonault
I made a backup copy of the work folder for that WU, in case there's anything in there that would be useful.

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Mon Mar 23, 2020 11:03 pm
by alxbelu
bruce wrote:The server will not accept oversized uploads. I don't know what the limit is, but 55.24MiB looks pretty big. I have no idea if that's because it contains a lot of error messages or if somebody needs to shrink the output or increase the upload limit for that server. Can you filter FAH's log and see if there are any clues that appeared during the processing of that WU?
bruce, the size has been mentioned as a concern in this thread: viewtopic.php?f=18&t=32492&start=75

Note especially the HTTP responses recorded by Wireshark for these WUs: "HTTP/1.0 413 HTTP_REQUEST_ENTITY_TOO_LARGE"
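
HTTP status 413 is the standard code a server returns when the request body is larger than it is willing to accept, which fits the size theory. To see how big the rejected uploads are across a log, the "Uploading ... to ..." lines can be pulled out with a short script. This is a rough sketch only: the helper names are made up for illustration, the `MiB` unit is the only one that appears in this thread (the other units in the table are assumptions), and the server's actual limit is not known here, so `limit_bytes` has to be supplied by whoever runs it.

```python
import re

# Matches lines such as "Uploading 55.24MiB to 155.247.164.213".
UPLOAD_RE = re.compile(
    r"Uploading (?P<size>\d+(?:\.\d+)?)(?P<unit>B|KiB|MiB|GiB) to (?P<host>[\d.]+)"
)

# Only MiB is seen in this thread; the other units are assumed for completeness.
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def upload_attempts(lines):
    """Return one dict per upload attempt: target host and size in bytes."""
    attempts = []
    for line in lines:
        match = UPLOAD_RE.search(line)
        if match:
            size = float(match.group("size")) * UNIT_BYTES[match.group("unit")]
            attempts.append({"host": match.group("host"), "bytes": round(size)})
    return attempts


def over_limit(attempts, limit_bytes):
    """Return the attempts larger than a caller-supplied (hypothetical) limit."""
    return [a for a in attempts if a["bytes"] > limit_bytes]
```

If the project owner or server admin states the real limit, passing it as `limit_bytes` would show which attempts in a log exceed it.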

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Tue Mar 24, 2020 12:21 am
by Empie
I also have one that refuses to upload:
11758 (Run 0, Clone 3318, Gen 0)

23:10:08:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:3318 gen:0 core:0x22 unit:0x000000029bf7a4d55e6d7718287852a7
23:10:08:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
23:10:08:WU01:FS01:Connecting to 155.247.164.213:8080
23:10:08:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
23:10:08:WU01:FS01:Trying to send results to collection server
23:10:08:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
23:10:08:WU01:FS01:Connecting to 155.247.164.214:8080
23:10:09:ERROR:WU01:FS01:Exception: Transfer failed

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Wed Apr 01, 2020 5:45 pm
by TxRedneck
What's the recommended course of action in this situation?

Tx

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Wed Apr 01, 2020 8:44 pm
by Joe_H
TxRedneck wrote:What's the recommended course of action in this situation?
It depends on what is being logged about the upload failure. Normally, just let the client retry sending until the WU is accepted, since the server is up and supposed to be accepting returns.

Re: Unable to upload WU, stuck on send, expires 2020-03-25

Posted: Wed Apr 01, 2020 9:00 pm
by TxRedneck
Joe_H wrote:
TxRedneck wrote:What's the recommended course of action in this situation?
Depends on what is being logged about the upload failing. Normally just let the client retry sending until the WU is accepted as the server is up and supposed to be accepting returns.
I'll pull and post logs shortly, ty sir.

Tx