Could not get an assignment - empty work server
Posted: Tue Dec 20, 2016 9:51 pm
by orion456
Suddenly my 4p rig won't get work assignments while my other identical rig is fine.
21:38:38:WU00:FS00:Connecting to 171.67.108.45:8080
21:38:38:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080': Empty work server assignment
21:38:38:WU00:FS00:Connecting to 171.64.65.35:80
21:38:40:WARNING:WU00:FS00:Failed to get assignment from '171.64.65.35:80': Empty work server assignment
21:38:40:ERROR:WU00:FS00:Exception: Could not get an assignment
The network is working fine.
I can upload and download files.
All the settings are identical to my other machine.
Any ideas?
Re: Could not get an assignment - empty work server
Posted: Tue Dec 20, 2016 10:16 pm
by Joe_H
Could you post the first 100 or so lines of your log file to show the system info and configuration?
Depending on settings, there are projects available for up to 27 cores. But other settings may lead to no assignments being available, either temporarily or permanently. More important than the addresses of the Assignment Servers that are currently unable to match your system to a WS are the IP addresses the system has received work from in the past.
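A minimal sketch for pulling those first 100 or so lines out of a log file so they can be pasted into a reply. The function name `head_of_log` and the file name in the usage comment are assumptions for illustration; use the actual location of your own log.

```python
from itertools import islice
from pathlib import Path


def head_of_log(path, count=100):
    """Return the first `count` lines of a text log, without line endings."""
    # errors="replace" keeps the read from failing on oddly encoded lines.
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, count)]


# Usage (the file name is a placeholder, not a documented path):
#     print("\n".join(head_of_log("log.txt")))
```

Reading with `islice` stops after the requested number of lines, so a large log is not loaded in full.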
Re: Could not get an assignment - empty work server
Posted: Wed Dec 21, 2016 9:35 pm
by bruce
The wording in your title is incorrect. You have not found an "empty work server". What you have found is a condition where the WorkServerAssignment is empty.
That means that with your setup, you cannot be assigned to a work server that can supply you with a WU. The first 100 lines of your log will allow us to identify the problem and help you reconfigure your system to accept the WUs that are available.
Re: Could not get an assignment - empty work server
Posted: Mon Dec 26, 2016 9:29 pm
by sco01
Same thing on 3 different computers:
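19:13:34:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': Failed to connect to 171.67.108.45:80: Попытка установить соединение была безуспешной, т.к. от другого компьютера за требуемое время не получен нужный отклик, или было разорвано уже установленное соединение из-за неверного отклика уже подключенного компьютера. [English: The connection attempt failed because the other computer did not give the required response within the required time, or an established connection was broken because of an incorrect response from the connected computer.]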
19:13:34:WU01:FS01:Connecting to 171.64.65.35:80
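19:13:55:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': Failed to connect to 171.64.65.35:80: Подключение не установлено, т.к. конечный компьютер отверг запрос на подключение. [English: The connection was not established because the target computer rejected the connection request.]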
19:13:55:ERROR:WU01:FS01:Exception: Could not get an assignment
Re: Could not get an assignment - empty work server
Posted: Mon Dec 26, 2016 11:18 pm
by bruce
@sco01
Post the top hundred or so lines of your log per the instructions in the signature block of my previous post.
Re: Could not get an assignment - empty work server
Posted: Tue Dec 27, 2016 7:34 am
by sco01
bruce
log -
https://yadi.sk/i/rN_E4pN735Eev4
The client writes incorrectly to the log in languages other than English: in the client I see everything in Russian, but in the log it is garbage...
Re: Could not get an assignment - empty work server
Posted: Tue Dec 27, 2016 10:23 am
by foldy
Looks like FS01 tried to download a new work unit after some connection failures but then hangs while downloading at 1.53%.
Code:
19:24:31:WU01:FS01:Download 1.53%
19:24:33:WU00:FS01:Final credit estimate, 61871.00 points
19:24:33:WU00:FS01:Cleaning up
The other FS02 works fine.
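A rough sketch of how a stalled transfer like this could be spotted in a log automatically. It assumes progress and completion lines have the shape seen in the logs quoted in this thread ("Download 1.53%", "Download complete"); the function name `unfinished_transfers` is made up for illustration.

```python
import re

# Matches progress lines such as "19:24:31:WU01:FS01:Download 1.53%".
PROGRESS = re.compile(
    r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):(Download|Upload) (\d+(?:\.\d+)?)%$"
)
# Matches completion lines such as "04:03:58:WU00:FS00:Download complete".
COMPLETE = re.compile(
    r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):(Download|Upload) complete$"
)


def unfinished_transfers(lines):
    """Map (work unit, slot, direction) to the last percentage seen for
    transfers that logged progress but never logged 'complete'."""
    pending = {}
    for raw in lines:
        line = raw.strip()
        done = COMPLETE.match(line)
        if done:
            # The three captured groups form the same key used below.
            pending.pop(done.groups(), None)
            continue
        progress = PROGRESS.match(line)
        if progress:
            wu, slot, direction, percent = progress.groups()
            pending[(wu, slot, direction)] = float(percent)
    return pending
```

A transfer that is still running when the log ends is reported as well, so an entry is only suspicious if the log continues for a while without further progress on it.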
As a workaround, can you try quitting the folding client and then restarting it?
Re: Could not get an assignment - empty work server
Posted: Tue Dec 27, 2016 7:19 pm
by bruce
In the log 4 posts above, it says "Failed to connect to 171.67.108.45:80" followed by what is probably Cyrillic (which means nothing to me). Nevertheless, I have to ask if you can connect to http://171.67.108.45 from inside your browser. (It works for me.) The same goes for http://171.64.65.35/
If you can connect, perhaps a translation of the messages after the English messages would help.
EDIT: I found it:
The attempt to establish a connection was unsuccessful because the other computer did not give the required response within the required time, or an established connection was broken because of an incorrect response from the connected computer.
This confirms the suspicion of both foldy and myself that your internet connection is unreliable. Certain types of internet errors can only be corrected by restarting FAHClient. That is on the list of improvements recommended for an upgraded client.
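For anyone who wants to repeat the browser check from a script, here is a minimal sketch that tries a plain TCP connection to each address and port that appears in the logs in this thread. The names `can_connect` and `report` are made up for illustration. A successful TCP connect is only part of what a browser does, so treat the result as a rough indication.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Try one TCP connection; return (True, "") or (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # Covers timeouts, refused connections and unreachable hosts.
        return False, str(exc)


# Addresses and ports as they appear in the logs quoted in this thread.
SERVERS = [("171.67.108.45", 8080), ("171.67.108.45", 80), ("171.64.65.35", 80)]


def report(servers=SERVERS, timeout=10.0):
    """Print one line per server saying whether the connect succeeded."""
    for host, port in servers:
        ok, error = can_connect(host, port, timeout)
        print(f"{host}:{port} -> {'ok' if ok else 'FAILED: ' + error}")
```

Calling `report()` a few times over several minutes gives a better picture than a single run, since an intermittent failure may not show up on the first attempt.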
By the way, the error "OpenCL device matching slot 1 not found" is one of the few remaining serious problems in FAHClient 7.4.15. The only workaround I've found is to set the index values manually, but you have to do it about half of the times you restart FAHClient -- or simply restart enough times for it to hit the combination that happens to work. As far as I know, it's only a problem when you have a pair of identical GPUs.
Re: Could not get an assignment - empty work server
Posted: Wed Dec 28, 2016 7:29 pm
by sco01
bruce
My connection is stable. Deleting the work folder of the hung slot solves the problem for a while. Other members of our team in different cities (and therefore with different providers) have exactly the same problem. The problems began on the 27th.
Re: Could not get an assignment - empty work server
Posted: Wed Dec 28, 2016 7:38 pm
by sco01
P.S. The OpenCL bug is annoying. Not so much because you need to set the index values manually, but because it has to be done after almost every reboot. By the way, after this the current job is usually lost and a new one gets downloaded.
Re: Could not get an assignment - empty work server
Posted: Wed Dec 28, 2016 9:00 pm
by foldy
Isn't this a known issue in FAHClient 7.4.15 beta, that if a work unit download is stalled, the client does not recover from it on its own?
Re: Could not get an assignment - empty work server
Posted: Wed Dec 28, 2016 9:36 pm
by Joe_H
It is a continuing issue from earlier releases. Updates to the network connection code have improved the situation a bit: the client detects and resumes a stalled download or upload much more often than the previous full release did. But it has not reached a 100% fix yet. When the stall detection does work, it can take upwards of 10-15 minutes at times in my experience.
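To illustrate the general idea of stall detection (this is not FAHClient's actual code, just a sketch of the technique), a transfer can be treated as stalled when its byte count has not changed for some timeout. The class name `StallDetector` and the timeout value are assumptions.

```python
import time


class StallDetector:
    """Flags a transfer as stalled when no new bytes arrive for a while."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_bytes = 0
        self._last_change = clock()

    def update(self, bytes_transferred):
        """Record the current byte count; return True if stalled."""
        now = self._clock()
        if bytes_transferred != self._last_bytes:
            # Progress was made, so restart the timer.
            self._last_bytes = bytes_transferred
            self._last_change = now
        return now - self._last_change >= self._timeout
```

A caller would poll `update()` with the running byte total and, once it returns True, drop the connection and retry. A short timeout recovers faster but risks aborting transfers that are merely slow.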
Re: Could not get an assignment - empty work server
Posted: Thu Dec 29, 2016 5:23 am
by bruce
The manual test that I used to use when I opened one of the bug reports was to wait until a transfer or two were in progress (maybe like this:)
Code:
04:03:33:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
04:03:33:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:8684 run:0 clone:32 gen:0 core:0xa7 unit:0x000000010002894b5846e874fbc1802a
04:03:33:WU02:FS00:Uploading 2.97MiB to 155.247.166.219
04:03:33:WU02:FS00:Connecting to 155.247.166.219:8080
04:03:39:WU02:FS00:Upload 18.94%
04:03:45:WU02:FS00:Upload 25.25%
04:03:45:WU00:FS00:Download 71.49%
04:03:51:WU02:FS00:Upload 39.98%
04:03:52:WU00:FS00:Download 87.38%
04:03:57:WU02:FS00:Upload 50.50%
04:03:58:WU00:FS00:Download 100.00%
04:03:58:WU00:FS00:Download complete
04:04:04:WU02:FS00:Upload 61.02%
04:04:11:WU02:FS00:Upload 73.64%
04:04:17:WU02:FS00:Upload 84.16%
04:04:23:WU02:FS00:Upload 94.68%
04:04:31:WU02:FS00:Upload complete
04:04:31:WU02:FS00:Server responded WORK_ACK (400)
and somewhere in the middle, I'd unplug the internet cable or reset my router so I couldn't finish. Later I'd reconnect to see what happens.
Re: Could not get an assignment - empty work server
Posted: Thu Dec 29, 2016 5:57 am
by bruce
One thing is new. There are now some error messages that weren't there before ... like this:
Code:
05:47:50:WU01:FS02:0x21:Completed 4000000 out of 5000000 steps (80%)
05:48:03:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
05:49:17:WU01:FS02:0x21:Completed 4050000 out of 5000000 steps (81%)
In this particular case, I closed FAHControl by closing the window itself rather than using the Exit button on the client. I'm going to guess that the telnet port between FAHControl and FAHClient was in the middle of a data transfer. I never saw that message until recently -- and errors which are logged can be corrected.
Has anybody gotten a message like that when the only thing that could have been forcibly interrupted was on the internet port (such as in my test scenario above)?