Problem on sending results [Project 14283]
Moderators: Site Moderators, FAHC Science Team
Problem on sending results [Project 14283]
Hi guys,
I'm currently getting this error on several of my rigs.
The problem seems to be on project 14283 only.
01:22:48:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:14283 run:0 clone:1 gen:46 core:0x21 unit:0x0000003380fccb0a5d9e11688fbd34af
01:22:48:WU00:FS01:Uploading 160.21MiB to 128.252.203.10
01:22:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:14283 run:0 clone:79 gen:1 core:0x21 unit:0x0000000280fccb0a5d9e116d639f080f
01:22:48:WU00:FS01:Connecting to 128.252.203.10:8080
01:22:48:WU01:FS01:Uploading 193.10MiB to 128.252.203.10
...
01:22:50:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
01:22:50:WU00:FS01:Trying to send results to collection server
01:22:50:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
01:22:50:WU00:FS01:Uploading 160.21MiB to 155.247.166.219
01:22:50:WU01:FS01:Trying to send results to collection server
01:22:50:WU00:FS01:Connecting to 155.247.166.219:8080
01:22:50:WU01:FS01:Uploading 193.10MiB to 155.247.166.219
01:22:50:WU01:FS01:Connecting to 155.247.166.219:8080
01:22:51:ERROR:WU00:FS01:Exception: Transfer failed
01:22:51:ERROR:WU01:FS01:Exception: Transfer failed
01:22:52:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:14283 run:0 clone:1 gen:46 core:0x21 unit:0x0000003380fccb0a5d9e11688fbd34af
01:22:52:WU00:FS01:Uploading 160.21MiB to 128.252.203.10
01:22:52:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:14283 run:0 clone:79 gen:1 core:0x21 unit:0x0000000280fccb0a5d9e116d639f080f
01:22:52:WU00:FS01:Connecting to 128.252.203.10:8080
01:22:52:WU01:FS01:Uploading 193.10MiB to 128.252.203.10
01:22:52:WU01:FS01:Connecting to 128.252.203.10:8080
01:22:53:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
01:22:53:WU00:FS01:Trying to send results to collection server
01:22:53:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
01:22:53:WU00:FS01:Uploading 160.21MiB to 155.247.166.219
01:22:53:WU01:FS01:Trying to send results to collection server
01:22:53:WU00:FS01:Connecting to 155.247.166.219:8080
01:22:53:WU01:FS01:Uploading 193.10MiB to 155.247.166.219
01:22:53:WU01:FS01:Connecting to 155.247.166.219:8080
01:22:53:ERROR:WU00:FS01:Exception: Transfer failed
01:22:54:ERROR:WU01:FS01:Exception: Transfer failed
Re: Problem on sending results
See my explanation here
You've paused those WUs many times while they were processing (Most likely you processed them "on idle"). The upload packets are all greater than 100 MiB and are much too big to be valid results from project:14283
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Problem on sending results
I'm not sure I understand: the rigs are dedicated to folding, and one WU takes about 4 hours to compute.
I see that my monitoring tool detected a problem and relaunched the WU several times. Is that enough to lose the WU?
So we can download WUs of various sizes (all my rigs are configured for WUs of up to 200 MB), but the upload is limited to 100 MB? Well, that's a lot of time lost ...
Re: Problem on sending results
As I said in the linked explanation, every time the WU enters or leaves the paused state, extra garbage is added to the upload. If the WU never pauses, the bug in FAHCore_a7 for Windows is not triggered and the results upload stays correct (and concise). The new version of FAHCore_a7 fixes this problem, so the results will be uploadable.
Posting FAH's log:
How to provide enough info to get helpful support.
- Site Admin
- Posts: 7927
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4, MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Problem on sending results
How many times did your tool relaunch the WU? Post that log and perhaps it will indicate where the problem was. Generally, just a few restarts are not enough to blow the WU upload size up to 193 MB; if it takes that many restarts, either the WU itself was bad or your system is not stable for GPU folding.
In this case Bruce missed that the WUs involved were running the GPU Core_21, so his comments about the Core_A7 issue are not completely relevant. Someone who has processed a Project 14283 WU will have to weigh in with the normal upload size for a WU from that project.
I have looked up both WUs. So far each has one report of a return where the WU failed to be processed successfully. Additional reports would be needed to determine that the WUs are bad; someone may successfully process them when reassigned.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: Problem on sending results [Project 14283]
Oops. It looks like size has nothing to do with it. (So much for spending a week on "vacation".)
In fact, your client detected the WU as FAULTY, so there's probably more useful information in an earlier part of the log. Scroll back to where those WUs were downloaded.
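To find which units the client flagged, the log can be scanned mechanically. Here is a minimal Python sketch, assuming the log lines have the same layout as the excerpts posted in this thread. The regular expression and the name `faulty_units` are illustrative only and are not part of FAHClient.

```python
import re

# Layout assumed from the "Sending unit results" lines quoted in this thread.
SEND_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WU\d+:FS\d+:Sending unit results: "
    r"id:\d+ state:\w+ error:(?P<error>\w+) "
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)

def faulty_units(log_lines):
    """Return the set of (project, run, clone, gen) reported with error:FAULTY."""
    found = set()
    for line in log_lines:
        m = SEND_RE.match(line.strip())
        if m and m.group("error") == "FAULTY":
            found.add(tuple(int(m.group(k)) for k in ("project", "run", "clone", "gen")))
    return found

# Lines copied from the logs posted in this thread.
sample = [
    "01:22:48:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:14283 run:0 clone:1 gen:46 core:0x21 unit:0x0000003380fccb0a5d9e11688fbd34af",
    "01:22:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:14283 run:0 clone:79 gen:1 core:0x21 unit:0x0000000280fccb0a5d9e116d639f080f",
    "12:44:40:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:14283 run:0 clone:2 gen:5 core:0x21 unit:0x0000000580fccb0a5d9e11684ba342e0",
]
print(sorted(faulty_units(sample)))
```

Once the project/run/clone/gen of a faulty unit is known, searching backwards in the same log for that WU slot should lead to the download and to the first reported error.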
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Problem on sending results [Project 14283]
I won't be able to find out how many times the WU was relaunched; my logs don't have that level of detail.
- Site Admin
- Posts: 7927
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4, MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Problem on sending results [Project 14283]
The logs kept by the client would have that detail. Even if your tool is completely relaunching processing, the client keeps the last 16 logs by default.
Perhaps you need to rethink how your monitoring tool is handling problems.
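Going through the retained logs by hand is tedious, so here is a rough Python sketch that counts how often a given pattern shows up in each log file of a directory. The directory location, the `*.txt` glob and the file names in the demo are assumptions, and the text that marks a relaunch has to be taken from your own logs; the demo only counts "Uploading" lines like the ones quoted in this thread.

```python
import re
import tempfile
from pathlib import Path

def count_matches_per_log(log_dir, pattern, glob="*.txt"):
    """Map each log file name in log_dir to the number of lines matching pattern."""
    rx = re.compile(pattern)
    counts = {}
    for path in sorted(Path(log_dir).glob(glob)):
        with path.open(encoding="utf-8", errors="replace") as fh:
            counts[path.name] = sum(1 for line in fh if rx.search(line))
    return counts

# Demo on throwaway files; the names are hypothetical stand-ins for rotated logs.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "log-old.txt").write_text(
        "01:22:48:WU00:FS01:Uploading 160.21MiB to 128.252.203.10\n"
        "01:22:50:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed\n",
        encoding="utf-8",
    )
    Path(tmp, "log-new.txt").write_text(
        "01:22:52:WU00:FS01:Uploading 160.21MiB to 128.252.203.10\n"
        "01:22:53:WU00:FS01:Uploading 160.21MiB to 155.247.166.219\n",
        encoding="utf-8",
    )
    demo_counts = count_matches_per_log(tmp, r"WU00:FS01:Uploading")

print(demo_counts)
```

Pointing `count_matches_per_log` at the client's own log directory with a pattern for whatever line your logs print on a restart would give a per-file restart count to post here.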
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Problem on sending results [Project 14283]
Additional data added at each failure (bad state) on an already big WU might exceed the maximum upload size of the server ... and p14283 is already big when everything is fine (more than 100MB to upload).
Feel free to dump these WUs; they will never make it back to the server (and wouldn't be very useful anyway, since they failed).
- Posts: 652
- Joined: Sun Nov 22, 2009 8:42 pm
- Hardware configuration: AMD R7 3700X @ 4.0 GHz; ASUS ROG STRIX X470-F GAMING; DDR4 2x8GB @ 3.0 GHz; GByte RTX 3060 Ti @ 1890 MHz; Fortron-550W 80+ bronze; Win10 Pro/64
- Location: Bulgaria/Team #224497/artoar11_ALL_....
Re: Problem on sending results [Project 14283]
I don't know if it's fair to compare that way. My WU upload from this project/2019-10-13T12:44:41Z:
12:44:40:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:14283 run:0 clone:2 gen:5 core:0x21 unit:0x0000000580fccb0a5d9e11684ba342e0
12:44:40:WU00:FS01:Uploading 115.64MiB to 128.252.203.10
12:44:40:WU00:FS01:Connecting to 128.252.203.10:8080
12:44:46:WU00:FS01:Upload 24.11%
12:44:52:WU00:FS01:Upload 59.24%
12:44:58:WU00:FS01:Upload 91.45%
12:45:01:WU00:FS01:Upload complete
12:45:01:WU00:FS01:Server responded WORK_ACK (400)
12:45:01:WU00:FS01:Final credit estimate, 155440.00 points
12:45:01:WU00:FS01:Cleaning up
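For anyone who wants to put numbers on the comparison, here is a small Python sketch that pulls the upload sizes out of "Uploading" lines and flags those well above a baseline. It assumes the line layout quoted in this thread. The 10% tolerance is an arbitrary illustrative value, and a single successful upload is only one data point, so treat the result as a hint rather than a verdict.

```python
import re

# Layout assumed from the "Uploading ... MiB to ..." lines quoted in this thread.
UPLOAD_RE = re.compile(
    r"WU(?P<wu>\d+):FS\d+:Uploading (?P<size>\d+(?:\.\d+)?)MiB to (?P<server>\S+)"
)

def upload_sizes(log_lines):
    """Return (wu_id, size_in_mib, server) for every "Uploading" line."""
    out = []
    for line in log_lines:
        m = UPLOAD_RE.search(line)
        if m:
            out.append((m.group("wu"), float(m.group("size")), m.group("server")))
    return out

def larger_than_baseline(sizes, baseline_mib, tolerance=0.10):
    """Keep uploads more than `tolerance` (a fraction) above baseline_mib."""
    limit = baseline_mib * (1.0 + tolerance)
    return [s for s in sizes if s[1] > limit]

# The two failed uploads from the first post and the successful one from this post.
failed = [
    "01:22:48:WU00:FS01:Uploading 160.21MiB to 128.252.203.10",
    "01:22:48:WU01:FS01:Uploading 193.10MiB to 128.252.203.10",
]
baseline = upload_sizes(["12:44:40:WU00:FS01:Uploading 115.64MiB to 128.252.203.10"])[0][1]

for wu, size, server in larger_than_baseline(upload_sizes(failed), baseline):
    print(f"WU{wu}: {size:.2f} MiB is {size / baseline:.2f}x the baseline")
```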
Re: Problem on sending results [Project 14283]
P14283 is a GPU project. The bug in the CPU core_a7 which adds extra data to the upload has nothing to do with P14283. That bug has been causing congestion on 155.247.166.2xx and 14283 is on a server at a different site: 128.252.203.10.
Posting FAH's log:
How to provide enough info to get helpful support.
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Problem on sending results [Project 14283]
artoar_11 wrote: I don't know if it's fair to compare that way. My WU upload from this project/2019-10-13T12:44:41Z:
12:44:40:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:14283 run:0 clone:2 gen:5 core:0x21 unit:0x0000000580fccb0a5d9e11684ba342e0
12:44:40:WU00:FS01:Uploading 115.64MiB to 128.252.203.10
12:44:40:WU00:FS01:Connecting to 128.252.203.10:8080
12:44:46:WU00:FS01:Upload 24.11%
12:44:52:WU00:FS01:Upload 59.24%
12:44:58:WU00:FS01:Upload 91.45%
12:45:01:WU00:FS01:Upload complete
12:45:01:WU00:FS01:Server responded WORK_ACK (400)
12:45:01:WU00:FS01:Final credit estimate, 155440.00 points
12:45:01:WU00:FS01:Cleaning up
This is the normal upload size for this project for a WU completed without Bad States ...
Re: Problem on sending results [Project 14283]
Unknown answer....
P14283 is a project that runs on the GPU. The recent change to the FAHCore was for CPU WUs so your question isn't applicable. Also, you can't really assume that one project returns a similar amount of data as some other project.
Posting FAH's log:
How to provide enough info to get helpful support.