Azure servers down 40.121.152.108 / 52.224.109.74

Posted: Fri Jul 17, 2020 7:11 pm
by comixgoddess
I have been getting the following string of log entries for the last 20 minutes or so --

19:00:34:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14570 run:0 clone:1384 gen:222 core:0xa7 unit:0x00000103287234c95e7eea1b6620dfda
19:00:34:WU01:FS00:Uploading 6.82MiB to 40.114.52.201
19:00:34:WU01:FS00:Connecting to 40.114.52.201:8080
19:00:34:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:00:34:WU01:FS00:Connecting to 40.114.52.201:80
19:00:34:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 40.114.52.201:80: Connection refused
19:00:34:WU01:FS00:Trying to send results to collection server
19:00:34:WU01:FS00:Uploading 6.82MiB to 52.224.109.74
19:00:34:WU01:FS00:Connecting to 52.224.109.74:8080
19:00:34:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:00:34:WU01:FS00:Connecting to 52.224.109.74:80
19:00:34:ERROR:WU01:FS00:Exception: Failed to connect to 52.224.109.74:80: Connection refused

Looking at the server stats page, 40.114.52.201 is showing as "Down" while 52.224.109.74 is showing that it should be accepting returned results. Is there anything I should be doing on my end to help this along? Thanks!
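
For anyone who wants to confirm from their own machine what the log is reporting, a minimal sketch follows. It assumes Python 3 is available and only tests whether a plain TCP connection can be opened on the two ports the client tries (8080, then 80). The function names are hypothetical and this is not part of the folding client. A "refused" result here would simply match the log above; it does not say why the server is refusing.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


def check_servers(servers, ports=(8080, 80), timeout: float = 5.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): port_accepts_connections(host, port, timeout)
        for host in servers
        for port in ports
    }


# Example call (performs real network connections when run):
# print(check_servers(["40.114.52.201", "52.224.109.74"]))
```

If every entry comes back False while other sites load normally, the problem is most likely on the server side rather than on the local network.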

Azure servers down 40.121.152.108 / 52.224.109.74

Posted: Fri Jul 17, 2020 7:32 pm
by itskieran
I noticed my client was stuck sending while the next CPU WU was already at 5%, so I investigated.

The WS is 40.114.52.201 and the CS is 52.224.109.74

According to the server stats, it seems that a lot of servers (8) are down.

Someone might want to take a look.

Re: Large number of servers down

Posted: Fri Jul 17, 2020 8:06 pm
by matrix1999
Yes, same here. My WU failed to upload to 52.224.109.74. According to the server stats you posted, many servers are down at the moment, namely eastus.cloudapp.azure.com, seas.wustl.edu, temple.edu, and some others. Can someone look into it, please?

Re: Large number of servers down

Posted: Fri Jul 17, 2020 8:12 pm
by bollix47
Apparently the Azure servers are experiencing a problem and development is currently looking into it ... hopefully it will be fixed 'soon'.

I too have a few WUs that I can't return so I will be keeping an 'eye' on events and will let you know if anything new develops.

Re: Cannot upload to 40.114.52.201

Posted: Fri Jul 17, 2020 8:20 pm
by bollix47
viewtopic.php?f=18&t=35812&p=339825#p339825

Re: Large number of servers down

Posted: Fri Jul 17, 2020 11:51 pm
by Foxbat
The UV index must be 10 because there isn't a working Cloud in the Azure Sky…

(sorry)

Glad to see someone is working on this. So far I have just the one WU trying to upload.

WU not sending (40.114.52.201 and 52.224.109.74)

Posted: Sat Jul 18, 2020 2:24 am
by Familyman_19
I have a completed work unit that has been stuck for several hours. The log shows the following errors:

02:04:46:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 40.114.52.201:80: No connection could be made because the target machine actively refused it.
02:04:51:ERROR:WU01:FS00:Exception: Failed to connect to 52.224.109.74:80: No connection could be made because the target machine actively refused it.

It keeps doing this over and over. Other WUs have completed and have been sent back just fine. Any ideas?
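
When the log repeats like this, it can help to tally which servers are actually failing. The sketch below assumes Python 3 and that the log lines contain the phrase "Failed to connect to <ip>:<port>" as in the two lines above. The function name is hypothetical, and the log file path in the commented usage is a placeholder, not a documented location.

```python
import re
from collections import Counter

# Matches the "Failed to connect to <ipv4>:<port>" fragment seen in the log.
FAILED_CONNECT = re.compile(r"Failed to connect to (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def count_failed_targets(log_lines):
    """Count failed connection attempts per (host, port) pair."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_CONNECT.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Example usage with a placeholder path:
# with open("log.txt", encoding="utf-8", errors="replace") as handle:
#     for (host, port), n in count_failed_targets(handle).most_common():
#         print(f"{host}:{port} failed {n} times")
```

If only one work server and one collection server show up in the tally, other WUs returning normally is consistent with a server-side outage affecting just those two addresses.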

Re: WU not sending

Posted: Sat Jul 18, 2020 5:39 am
by comixgoddess
Same here; mine has been "stuck" for 7 hours now. Please see this thread - viewtopic.php?f=18&t=35812.

Re: WU not sending

Posted: Sat Jul 18, 2020 8:14 pm
by RichieDoubleU
Same with me: this WU hasn't been sent for over 12 hours now, while another WU has been processed and sent successfully. So right now I'm stuck with 13851.
Here is one sample from the log, which has grown very lengthy by now.
project:13851 run:0 clone:8229 gen:208 core:0xa7 unit:0x000000fe287234c95e72ea9026ea9b9b
20:01:40:WU00:FS00:Uploading 2.47MiB to 40.114.52.201
20:01:40:WU00:FS00:Connecting to 40.114.52.201:8080
20:01:40:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
20:01:40:WU00:FS00:Connecting to 40.114.52.201:80
20:01:40:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 40.114.52.201:80: Connection refused

Question: Can I do anything to solve this problem myself? I'd rather think not...
For the time being I've stopped folding and would prefer to have this thing solved before I start folding again.

Re: Large number of servers down

Posted: Sat Jul 18, 2020 8:44 pm
by Joe_H
Foxbat wrote: The UV index must be 10 because there isn't a working Cloud in the Azure Sky…
One of the five servers on Azure is up and running; we are waiting on information as to when the others will be back.

Re: WU not sending

Posted: Sat Jul 18, 2020 9:26 pm
by Neil-B
A number of servers are down at the moment, so completed WUs for those servers cannot upload until they are back up. Since those servers are down, they won't be issuing any more WUs either. Let your client handle this: it will keep retrying until the server is up and the WU uploads, or until the WU passes its expiration and is dumped by the client. Continuing to fold from the servers that are up would be the normal approach, since the client is designed to work this way. If you wish to put a hold on folding until the WU clears, that is obviously a perfectly OK choice. Whether you fold or not won't make any difference to how quickly the completed WU clears.
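
The retry-until-uploaded-or-expired behaviour described above can be illustrated with a rough sketch. This is not the client's actual code: the function name, the doubling delay, and the default intervals are all assumptions made for illustration, and the real client's retry schedule may differ.

```python
import time


def retry_until_expired(try_upload, expires_at, delay=60.0, max_delay=3600.0,
                        now=time.time, sleep=time.sleep):
    """Call try_upload() until it returns True or the deadline passes.

    Returns "uploaded" on success, or "dumped" once expires_at is reached.
    The doubling delay is an illustrative assumption, not the client's schedule.
    """
    while now() < expires_at:
        if try_upload():
            return "uploaded"
        remaining = expires_at - now()
        if remaining <= 0:
            break
        # Wait before the next attempt, but never past the expiration time.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
    return "dumped"
```

The point of the sketch is that the outcome depends only on when the server comes back relative to the expiration time, which is why pausing or continuing other folding does not change how soon the stuck WU clears.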

Re: WU not sending

Posted: Sat Jul 18, 2020 10:29 pm
by RichieDoubleU
Neil, thanks for the info. I understand it better now.

Re: WU not sending

Posted: Sat Jul 18, 2020 10:47 pm
by Neil-B
It is a real pain (for everyone: folders, researchers, devs) when this happens, because it holds up the science and everyone gets frustrated, as it in effect "wastes" effort and slows progress. But issues happen. Believe me, the researchers and devs behind the scenes will be doing the best they can to get the issues resolved ASAP, though that doesn't make it any less annoying. In time one either has to be patient (which I am really bad at) or learn to look at the logs/control interfaces less often and have faith that things are working or will sort themselves out! I spotted in another thread that they have got one of the servers back up (hopefully functioning properly), but when the others will follow is anyone's guess, and as usual it is a weekend, so trying to fix stuff is harder/slower :(

Re: WU not sending (40.114.52.201 and 52.224.109.74)

Posted: Sat Jul 18, 2020 11:36 pm
by bruce
There are reports that foreign hackers are targeting COVID research. Subject: Cozy Bear (APT-29) claws Coronavirus research from the West.

Yes, there are several servers down and people are working on fixing them. I don't know if there's any connection with the hackers, but it would not surprise me to learn that there is one.

Re: WU not sending (40.114.52.201 and 52.224.109.74)

Posted: Sun Jul 19, 2020 1:10 am
by psaam0001
I know I have 4 WUs (so far) that are waiting to go to a collection server...

May the ultimate social distancing regulator separate these uncouth hackers from their tools--permanently!

Paul