Stuck sending results 13.82.98.119

Moderators: Site Moderators, FAHC Science Team

PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Stuck sending results 13.82.98.119

Post by PantherX »

durval wrote:...My client is reporting "Connection timed out", so (with all due respect) I beg to differ: it's not some "logical" problem like not assigning new WUs -- on the contrary, it means the IP address assigned to the server is not answering. It could be because the server has crashed (the OS, not the F@H software), it has been powered down, or even (sorry) that its LAN cable has indeed been unplugged. Otherwise we would be seeing other errors like "Connection reset"...
I guess we might be describing the issue from two perspectives. Mine describes what you see, while yours goes into more technical detail.
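To make the distinction in the quote above concrete, here is a minimal Python sketch of how a timeout differs from a reset or a refusal at the socket level. The function names and the short labels it returns are made up for illustration; this is not how the F@H client reports errors, and a timeout can also be caused by a firewall silently dropping packets rather than a dead host.

```python
import socket


def classify_connect_error(exc: BaseException) -> str:
    """Map a connection exception to a rough label (labels are illustrative)."""
    # Nothing answered at all: host down, unreachable, or packets dropped.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    # The host answered, but nothing is listening on that port.
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    # The host answered and then tore the connection down.
    if isinstance(exc, ConnectionResetError):
        return "reset"
    return "other"


def probe(host: str, port: int, timeout: float = 10.0) -> str:
    """Try one TCP connection and report what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return classify_connect_error(exc)


# Example call (the port number is an assumption, not taken from this thread):
# print(probe("13.82.98.119", 8080))
```

A "timeout" result means no reply came back at all, while "refused" or "reset" would mean the machine itself is up and answering.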
durval wrote:...
00:26:50:WU04:FS01:Upload complete
00:26:50:WU04:FS01:Server responded WORK_QUIT (404)
00:26:50:WARNING:WU04:FS01:Server did not like results, dumping
Does that mean that it was all in vain? :cry:
Unfortunately, it means that the WU didn't pass the validation test that the Server ran. If a WU fails the validation test, the result is discarded and you don't get any credit for it.
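If you want to spot these rejections in a long log, a small Python sketch like the one below can pick them out. The line layout (time, optional level, WU number, FS number, message) is inferred only from the three lines quoted above; the ERROR level and the function names are assumptions.

```python
import re

# Layout inferred from the quoted log lines: HH:MM:SS:[LEVEL:]WUnn:FSnn:message
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>WARNING|ERROR):)?"  # ERROR is an assumed level name
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):"
    r"(?P<message>.*)$"
)
SERVER_REPLY = re.compile(r"Server responded (?P<status>\w+) \((?P<code>\d+)\)")


def parse_line(line):
    """Split one log line into fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    reply = SERVER_REPLY.search(entry["message"])
    entry["status"] = reply.group("status") if reply else None
    entry["code"] = int(reply.group("code")) if reply else None
    return entry


def rejected_units(lines):
    """Return (wu, slot) pairs for which the server answered WORK_QUIT."""
    rejected = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == "WORK_QUIT":
            rejected.append((entry["wu"], entry["slot"]))
    return rejected
```

Fed the three lines quoted above, `rejected_units` reports WU04 on FS01.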
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Stuck sending results 13.82.98.119

Post by PantherX »

VAcharonD1 wrote:I'm still timing out when trying to connect. My WU is going to time out soon, too.
Please note that if the completed WU is successfully uploaded before the Timeout date, it will get the bonus credits.
If the completed WU is successfully uploaded after the Timeout date but before the Expiration date, it will get base credits.
Once the WU reaches the Expiration date, it will automatically be deleted from the client.
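The three rules above can be written down as a small Python sketch. The function name and the returned labels are made up for illustration, and how an upload landing exactly on the Timeout date is treated is an assumption (here it counts as base credit).

```python
from datetime import datetime


def credit_for_upload(uploaded: datetime, timeout: datetime, expiration: datetime) -> str:
    """Classify an upload by the Timeout and Expiration dates of its WU."""
    if uploaded < timeout:
        return "bonus"  # uploaded before the Timeout date
    if uploaded < expiration:
        return "base"   # after Timeout, before Expiration
    return "none"       # WU expired and was deleted from the client
```

For example, with a Timeout of 16 April and an Expiration of 20 April, an upload on 18 April would fall in the "base" bracket.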
VAcharonD1
Posts: 14
Joined: Tue Mar 24, 2020 2:46 am

Re: Stuck sending results 13.82.98.119

Post by VAcharonD1 »

The estimated credit has dropped by over 20,000 points. I know points are mainly for internet bragging rights, but insofar as they represent the value of one's contribution, the value of this one is dropping rapidly. I hope the project finds a way to route around single points of failure like this.
Sn0wy23
Posts: 16
Joined: Mon Dec 12, 2016 1:10 pm

Re: Stuck sending results 13.82.98.119

Post by Sn0wy23 »

At 6:00 GMT the server is still showing as Down and my WUs have no place to go, the same issue others are having.
I hope the server can be sorted out so that the WUs can be uploaded and we don't lose valuable work.

I have had a few "machine did not like the results" over the last 3 days, but I am still getting and sending WUs when they are available and the server will accept my uploads.
A quick pause of the slots in the client happily resets the retry timer, so the time between retries is not huge.
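As a rough illustration of why resetting the timer helps, here is a Python sketch of a generic capped exponential backoff. The numbers (60 s first delay, doubling, one-hour cap) are hypothetical and are not the client's actual retry schedule.

```python
def retry_delays(first_delay_s: int, factor: int, cap_s: int, attempts: int) -> list:
    """Delays before each retry: grow by `factor` per attempt, never above `cap_s`."""
    delays = []
    delay = first_delay_s
    for _ in range(attempts):
        delays.append(min(delay, cap_s))
        delay *= factor
    return delays


# Hypothetical schedule: after a few failures the wait reaches the cap,
# whereas a reset puts the next retry back at the first, short delay.
schedule = retry_delays(60, 2, 3600, 8)
```

With a scheme like this, each failed attempt makes the next wait longer, so starting over from the first delay brings the next attempt much sooner.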

Typed with thumbs and with only half an hour of sleep in the last 4 days, so excuse the typos and grammar.
Glad it isn't just myself with issues!
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Stuck sending results 13.82.98.119

Post by PantherX »

Good news is that I have just received confirmation that the issue should be resolved for 13.82.98.119 so hopefully, your completed WU will be accepted soon. We appreciate your patience during this :)
Epsilon_Process
Posts: 6
Joined: Fri Apr 10, 2020 5:52 am

Re: Stuck sending results 13.82.98.119

Post by Epsilon_Process »

PantherX wrote:Good news is that I have just received confirmation that the issue should be resolved for 13.82.98.119 so hopefully, your completed WU will be accepted soon. We appreciate your patience during this :)
Yes, looks good now. I restarted my clients, and all my backlogged WUs were accepted and points were logged. Looks like the server version was updated in the process, to 9.6.7. Thanks!
durval
Posts: 15
Joined: Sun Apr 12, 2020 12:27 am

Re: Stuck sending results 13.82.98.119

Post by durval »

PantherX wrote:Good news is that I have just received confirmation that the issue should be resolved for 13.82.98.119 so hopefully, your completed WU will be accepted soon. We appreciate your patience during this :)
Thanks for the feedback, @PantherX.
Epsilon_Process wrote:Yes, looks good now. I restarted my clients, and all my backlogged WUs were accepted and points were logged. Looks like the server version was updated in the process, to 9.6.7. Thanks!
I second that: my other clients were able to upload their pending WUs and log points (albeit severely reduced by losing most of the QRB due to the wait). But all seems normal now.
Sn0wy23 wrote:I have had a few "machine did not like the results" over the last 3 days, but I am still getting and sending WUs when they are available and the server will accept my uploads.
That sucks :? I just lost that one WU recently (the other one I lost so far was late last week).
Sn0wy23 wrote:A quick pause of the slots in the client happily resets the retry timer, so the time between retries is not huge.
Interesting. I've been restarting the client (and sometimes losing a few % on the other slot) in order to avoid hours-long retries, as recommended in a sticky post somewhere here in the Forum. Nice to hear that just pausing/unpausing the slots also works; I will try that from now on. Thanks for the info, @Sn0wy23.
PantherX wrote:I guess we might be describing the issue from two perspectives. Mine describes what you see, while yours goes into more technical detail.
Yep, I'm a sysadmin by trade, so I tend to see things 'right down to the bare metal'. From an application perspective, what you said is of course correct. BTW, if F@H ever needs sysadmins to take care of those servers, please count me in as a volunteer.
PantherX wrote:Unfortunately, it means that the WU didn't pass the validation test that the Server ran. If a WU fails the validation test, the result is discarded and you don't get any credit for it.
:e( This is really disappointing, it was like 5 hours of GPU time and 130K points straight down the toilet :(

@PantherX, why does that happen? Like the other WU I lost last week, this particular client runs on an HP Professional machine, sporting a Xeon with ECC RAM and a Quadro P5000 GPU, and on Linux (no crappy Windows here), so I'm pretty sure the fault did not happen at my end...
FoldingStorm
Posts: 3
Joined: Wed Apr 15, 2020 3:26 pm

Re: Stuck sending results 13.82.98.119

Post by FoldingStorm »

PantherX wrote:Good news is that I have just received confirmation that the issue should be resolved for 13.82.98.119 so hopefully, your completed WU will be accepted soon. We appreciate your patience during this :)
Too late unfortunately, at least it only took like 19 hours.
It continued to fail to upload until about 9:59 UTC today, at which point it appears to have just dropped the WU entirely.
Looks like I wasted all that energy.
Jan
Posts: 79
Joined: Tue Mar 31, 2020 6:46 pm

Re: Stuck sending results 13.82.98.119

Post by Jan »

It's an issue that shouldn't pop up again once the infrastructure has completely adapted to the new scale of users. I feel for your lost work though. :(
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Stuck sending results 13.82.98.119

Post by PantherX »

durval wrote:...@PantherX, why does that happen? Like the other WU I lost last week, this particular client runs on an HP Professional machine, sporting a Xeon with ECC RAM and a Quadro P5000 GPU, and on Linux (no crappy Windows here), so I'm pretty sure the fault did not happen at my end...
In this particular case (where you successfully completed the WU but the Server rejected it), I can think of two reasons (there might be more):
1) Corruption of data while transferring to the Server. If the Server receives invalid data, it will simply discard it. There's no retry.
2) Very rarely, a WU that folds fine on one set of hardware picks up, on another set, just enough corruption that it doesn't fail on the system folding it but still fails the validation test on the Server.
durval
Posts: 15
Joined: Sun Apr 12, 2020 12:27 am

Re: Stuck sending results 13.82.98.119

Post by durval »

@PantherX, thanks for the excellent, detailed response.
PantherX wrote:
durval wrote:...@PantherX, why does that happen? Like the other WU I lost last week, this particular client runs on an HP Professional machine, sporting a Xeon with ECC RAM and a Quadro P5000 GPU, and on Linux (no crappy Windows here), so I'm pretty sure the fault did not happen at my end...
In this particular case (where you successfully completed the WU but the Server rejected it), I can think of two reasons (there might be more):
1) Corruption of data while transferring to the Server. If the Server receives invalid data, it will simply discard it. There's no retry.
2) Very rarely, a WU that folds fine on one set of hardware picks up, on another set, just enough corruption that it doesn't fail on the system folding it but still fails the validation test on the Server.
In my experience, reason #1 is very improbable (the transmission is done over TCP, in which every segment carries its own checksum; the receiver verifies it and discards a corrupted segment, which is then retransmitted, before the data is passed "up" from the OS to the application).
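For anyone curious what that checksum looks like, here is a minimal Python sketch of the 16-bit ones' complement Internet checksum (RFC 1071) that TCP uses. It covers only a raw byte string; the real TCP checksum also includes a pseudo-header and the TCP header, which this sketch leaves out.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over `data` (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes.fromhex("0001f203f4f5f6f7")
corrupted = bytes.fromhex("0101f203f4f5f6f7")  # one bit flipped in the first byte

# A single flipped bit changes the checksum, so the receiver would reject it.
detected = internet_checksum(original) != internet_checksum(corrupted)
```

A 16-bit sum is not bulletproof (two 16-bit words swapped in place give the same sum, for instance), which is why I say improbable rather than impossible.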

So I guess we are seeing a case of #2 here.

I do not think it is so rare: over only 241 WUs I've processed so far, it has already happened twice.
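To put a number on it, here is a quick Python sketch of the arithmetic: 2 rejections out of 241 WUs is roughly 0.83%. With a sample this small the true rate is quite uncertain, so the sketch also computes a Wilson score interval; choosing that interval and the 95% level (z = 1.96) are my own assumptions, not anything from the project.

```python
import math


def wilson_interval(failures: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    p = failures / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


rejected, processed = 2, 241
rate = rejected / processed  # about 0.0083, i.e. roughly 0.83%
low, high = wilson_interval(rejected, processed)
print(f"observed rate: {rate:.2%}, plausible range: {low:.2%} to {high:.2%}")
```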

If I can be of assistance in finding and resolving the root cause of this, please don't hesitate to contact me.

Thanks again,
-- Durval