Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Tue May 30, 2017 10:19 pm
by SteveWillis
I guess it's a moot point now but here is a new version of my Linux pause script that only pauses the one FS that the script thinks is hung up.
#!/bin/bash
# Pause, then unpause, the one folding slot (FS) whose most recent
# network log entry is a refused connection or a failed assignment.
cd /var/lib/fahclient || exit 1
while true
do
    # Most recent network-related line in the client log.
    last=$(grep -Ei "Connected|assign|refused|Upload|Download" log.txt | tail -n 1)
    echo "$last" | grep -Eq "refused|assignment"
    results=$?
    echo "$(date) results = $results" # you can delete this line
    if [ "$results" -eq 0 ]
    then
        # Slot number from the ":FSnn:" field, without leading zeros (FS01 -> 1).
        INDEX=$(echo "$last" | sed -n 's/.*:FS0*\([0-9][0-9]*\):.*/\1/p')
        echo "PAUSED ******* $(date) INDEX = $INDEX" # you can delete this line
        echo -e "pause $INDEX\nquit" | nc localhost 36330 &> /dev/null
        sleep 10
        echo -e "unpause $INDEX\nquit" | nc localhost 36330 &> /dev/null
    fi
    sleep 60
done
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Tue May 30, 2017 10:38 pm
by Aurum
<cause v='ALZHEIMERS'/> is working very well so I'm bringing all my rigs back to help fight Alzheimer's disease.
I assume if Alzheimer's runs out of WUs that it will assign something else.
I think it would be nice if I could prioritize all of the causes and not just pick one.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Tue May 30, 2017 10:42 pm
by Aurum
msultan wrote: Hello everyone,
I apologize for the late response. 171.67.108.105 is my WS, which has been given assignments by the AS even though it has no assignable jobs.
Best, Muneeb
That's where we started: we pointed out that the Server Stats page said 171.67.108.105 had no WUs, but we were told that Server Stats is inaccurate.
Definitely need to fix the "failover" code for the AS.
Muneeb, get some more WUs in the queue so we can fold them for you.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 4:06 am
by Leonardo
Ohh, you just made my day!
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 4:27 pm
by rwh202
msultan wrote: Hello everyone,
I apologize for the late response. 171.67.108.105 is my WS, which has been given assignments by the AS even though it has no assignable jobs. We are currently trying to fix the problem with the AS where it keeps sending jobs to my WS. In the meantime, I have reduced the priority of my WS so that it doesn't assign jobs as frequently (it is currently 1/10 of the original value).
I am terribly sorry for all the problems that this issue is causing everyone. We appreciate all of your support and hope this doesn't turn you away from F@H. Again, I am sorry for the problem, and we are trying to fix it.
Best,
Muneeb
Any chance you could reduce the WS priority even further (i.e. to 0) until the underlying problem is fixed?
I'm still seeing multiple failed assignments to this server where no preference / beta flags are set, so clients get stuck in an ever-increasing retry loop. I can't immediately update the flags on these remote clients, so I'm still stuck doing pause / unpause on them.
Thanks.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 5:03 pm
by Aurum
rwh202, I switched all mine to Alzheimer's and they're working great. Not sure what happens if they run out of Alzheimer's WUs.
By "remote", do you mean that FAHControl cannot communicate with them?
I'm planning on setting up a folding farm at my mother's house and was hoping I could control them from home via TightVNC.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 5:30 pm
by rwh202
Aurum wrote: rwh202, I switched all mine to Alzheimer's and they're working great. Not sure what happens if they run out of Alzheimer's WUs.
By "remote", do you mean that FAHControl cannot communicate with them?
I'm planning on setting up a folding farm at my mother's house and was hoping I could control them from home via TightVNC.
Yeah, all my local clients (at home) are now set for Huntington's and working. It's just my machine in the office that still has no flags and problems.
I have remote access over port 36330, so I use HFM to control both my local and remote clients from home. I haven't tried FAHControl recently; it always used to be buggy, so I switched to HFM.
VNC / remote desktop is an alternative that should work, but it's too much firewall / ISO 27001 hassle for me to implement.
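For anyone in the same spot, the pause / unpause workaround could in principle be sent to a remote client over port 36330 rather than to localhost. What follows is only a sketch: it reuses the `pause`, `unpause` and `quit` commands from SteveWillis's script earlier in the thread, the function names `fah_cmd` and `fah_bounce` are made up for illustration, and it assumes the remote client accepts the connection without a password, which may not be true of your setup.

```shell
#!/bin/bash
# Hypothetical helper: build the command text for one slot action,
# followed by "quit" so the connection closes.
fah_cmd() {
    local action=$1 slot=$2
    printf '%s %s\nquit\n' "$action" "$slot"
}

# Hypothetical helper: pause one slot on a remote client, wait, unpause it.
# Assumes nc is installed and the client allows passwordless access.
fah_bounce() {
    local host=$1 slot=$2 port=${3:-36330}
    fah_cmd pause "$slot" | nc "$host" "$port" > /dev/null 2>&1
    sleep 10
    fah_cmd unpause "$slot" | nc "$host" "$port" > /dev/null 2>&1
}
```

Something like `fah_bounce office-pc 1` (with `office-pc` standing in for your own host name) would then bounce slot 1 on that machine.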
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 5:32 pm
by AJMSmith
So how did I get this?
Code:
17:21:52:WU00:FS01:Connecting to 171.67.108.45:80
17:21:53:WU00:FS01:Assigned to work server 171.67.108.105
17:21:53:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP106 [GeForce GTX 1060 6GB] from 171.67.108.105
17:21:53:WU00:FS01:Connecting to 171.67.108.105:8080
17:21:54:ERROR:WU00:FS01:Exception: Server did not assign work unit
17:21:54:WU00:FS01:Connecting to 171.67.108.45:80
17:21:54:WU00:FS01:Assigned to work server 171.67.108.105
17:21:54:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP106 [GeForce GTX 1060 6GB] from 171.67.108.105
17:21:54:WU00:FS01:Connecting to 171.67.108.105:8080
17:21:55:WU00:FS01:Downloading 21.17MiB
I removed the GPU slot, saved the configuration, then re-added the GPU slot and got the above.
BTW 171.67.108 & 171.67.108.45 are both part of a block of 262,144 (0x40000) addresses assigned to Stanford (the block being 171.64.0.0 to 171.67.255.255).
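The block arithmetic is easy to check in a few lines of shell. This is a rough sketch with helper names (`ip_to_int`, `block_size`, `in_block`) invented here; it only confirms the size of the range and that an address falls inside it, not who the block is registered to.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    local IFS=.
    set -- $1          # split the address on dots into $1..$4
    echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
}

# Number of addresses from first to last, inclusive.
block_size() {
    echo "$(( $(ip_to_int "$2") - $(ip_to_int "$1") + 1 ))"
}

# Succeeds if address $1 lies between $2 and $3, inclusive.
in_block() {
    local ip first last
    ip=$(ip_to_int "$1")
    first=$(ip_to_int "$2")
    last=$(ip_to_int "$3")
    [ "$ip" -ge "$first" ] && [ "$ip" -le "$last" ]
}

block_size 171.64.0.0 171.67.255.255    # prints 262144
```

Running `in_block 171.67.108.45 171.64.0.0 171.67.255.255` succeeds, as it does for the work server 171.67.108.105 from the log.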
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 5:40 pm
by Sailer
I have also been having a hard time getting work units assigned since Friday, May 26. The server involved is 171.67.108.105:8080. This has been occurring mainly on two computers running two GTX 1080 Ti cards each, but also on a computer with a single GTX 1080 and one with a single GTX 980 Ti. Sometimes a restart results in a download, but often not. The total effect has been a loss of about 1 million PPD.
17:27:18:WU02:FS01:Connecting to 171.67.108.45:80
17:27:20:WU02:FS01:Assigned to work server 171.67.108.105
17:27:21:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] from 171.67.108.105
17:27:21:WU02:FS01:Connecting to 171.67.108.105:8080
17:27:22:ERROR:WU02:FS01:Exception: Server did not assign work unit
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 6:05 pm
by markfw
I have 11 GPUs. At any one time 2-5 of them get the error "Exception: Server did not assign work unit". If I edit the slots on a box (remove the GPU, save, add the GPU back, save) and repeat that, it gets a unit after about 5-7 tries. I have been doing this for a week now, but it's really getting annoying. Can someone please fix this issue? Below is one log file:
Code:
15:32:33:WU00:FS00:Upload complete
15:32:33:WU00:FS00:Server responded WORK_ACK (400)
15:32:33:WU00:FS00:Final credit estimate, 36672.00 points
15:32:33:WU00:FS00:Cleaning up
15:32:34:WU01:FS00:Assigned to work server 171.67.108.105
15:32:34:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:32:34:WU01:FS00:Connecting to 171.67.108.105:8080
15:32:34:ERROR:WU01:FS00:Exception: Server did not assign work unit
15:34:10:WU01:FS00:Connecting to 171.67.108.45:80
15:34:11:WU01:FS00:Assigned to work server 171.67.108.105
15:34:11:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:34:11:WU01:FS00:Connecting to 171.67.108.105:8080
15:34:11:ERROR:WU01:FS00:Exception: Server did not assign work unit
15:36:47:WU01:FS00:Connecting to 171.67.108.45:80
15:36:47:WU01:FS00:Assigned to work server 171.67.108.105
15:36:48:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:36:48:WU01:FS00:Connecting to 171.67.108.105:8080
15:36:48:ERROR:WU01:FS00:Exception: Server did not assign work unit
15:41:01:WU01:FS00:Connecting to 171.67.108.45:80
15:41:01:WU01:FS00:Assigned to work server 171.67.108.105
15:41:02:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:41:02:WU01:FS00:Connecting to 171.67.108.105:8080
15:41:02:ERROR:WU01:FS00:Exception: Server did not assign work unit
15:47:53:WU01:FS00:Connecting to 171.67.108.45:80
15:47:53:WU01:FS00:Assigned to work server 171.67.108.105
15:47:54:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:47:54:WU01:FS00:Connecting to 171.67.108.105:8080
15:47:54:ERROR:WU01:FS00:Exception: Server did not assign work unit
15:58:59:WU01:FS00:Connecting to 171.67.108.45:80
15:58:59:WU01:FS00:Assigned to work server 171.67.108.105
15:58:59:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
15:58:59:WU01:FS00:Connecting to 171.67.108.105:8080
15:59:00:ERROR:WU01:FS00:Exception: Server did not assign work unit
16:16:55:WU01:FS00:Connecting to 171.67.108.45:80
16:16:55:WU01:FS00:Assigned to work server 171.67.108.105
16:16:56:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
16:16:56:WU01:FS00:Connecting to 171.67.108.105:8080
16:16:56:ERROR:WU01:FS00:Exception: Server did not assign work unit
16:45:57:WU01:FS00:Connecting to 171.67.108.45:80
16:45:57:WU01:FS00:Assigned to work server 171.67.108.105
16:45:58:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
16:45:58:WU01:FS00:Connecting to 171.67.108.105:8080
16:45:58:ERROR:WU01:FS00:Exception: Server did not assign work unit
17:32:56:WU01:FS00:Connecting to 171.67.108.45:80
17:32:56:WU01:FS00:Assigned to work server 171.67.108.105
17:32:57:WU01:FS00:Requesting new work unit for slot 00: READY gpu:1:GP104 [GeForce GTX 1070] from 171.67.108.105
17:32:57:WU01:FS00:Connecting to 171.67.108.105:8080
17:32:58:ERROR:WU01:FS00:Exception: Server did not assign work unit
And the log file from another box:
Code:
15:07:18:WU01:FS00:Upload 97.11%
15:07:42:WU01:FS00:Upload complete
15:07:43:WU01:FS00:Server responded WORK_ACK (400)
15:07:43:WU01:FS00:Final credit estimate, 77597.00 points
15:07:43:WU01:FS00:Cleaning up
15:10:59:WU00:FS00:Connecting to 171.67.108.45:80
15:10:59:WU00:FS00:Assigned to work server 171.67.108.105
15:11:01:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
15:11:04:WU00:FS00:Connecting to 171.67.108.105:8080
15:11:05:ERROR:WU00:FS00:Exception: Server did not assign work unit
15:17:50:WU00:FS00:Connecting to 171.67.108.45:80
15:17:50:WU00:FS00:Assigned to work server 171.67.108.105
15:17:52:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
15:17:56:WU00:FS00:Connecting to 171.67.108.105:8080
15:17:57:ERROR:WU00:FS00:Exception: Server did not assign work unit
15:28:56:WU00:FS00:Connecting to 171.67.108.45:80
15:28:56:WU00:FS00:Assigned to work server 171.67.108.105
15:28:58:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
15:29:00:WU00:FS00:Connecting to 171.67.108.105:8080
15:29:01:ERROR:WU00:FS00:Exception: Server did not assign work unit
15:46:52:WU00:FS00:Connecting to 171.67.108.45:80
15:46:53:WU00:FS00:Assigned to work server 171.67.108.105
15:46:55:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
15:46:56:WU00:FS00:Connecting to 171.67.108.105:8080
15:46:57:ERROR:WU00:FS00:Exception: Server did not assign work unit
16:15:55:WU00:FS00:Connecting to 171.67.108.45:80
16:15:55:WU00:FS00:Assigned to work server 171.67.108.105
16:15:57:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
16:16:00:WU00:FS00:Connecting to 171.67.108.105:8080
16:16:01:ERROR:WU00:FS00:Exception: Server did not assign work unit
17:02:53:WU00:FS00:Connecting to 171.67.108.45:80
17:02:54:WU00:FS00:Assigned to work server 171.67.108.105
17:02:55:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM204 [GeForce GTX 980] 4612 from 171.67.108.105
17:02:55:WU00:FS00:Connecting to 171.67.108.105:8080
17:02:56:ERROR:WU00:FS00:Exception: Server did not assign work unit
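The growing delay between attempts in logs like the two above can be measured directly. A minimal sketch, assuming the log is fed on stdin, covers a single slot (filter with grep first if not), and does not cross midnight; the function name `retry_gaps` is made up for this example.

```shell
#!/bin/bash
# Print the number of seconds between successive
# "Server did not assign work unit" errors read from stdin.
# Timestamps are the leading HH:MM:SS fields of each log line.
retry_gaps() {
    awk -F: '/Server did not assign work unit/ {
        t = $1 * 3600 + $2 * 60 + $3
        if (seen) print t - prev
        prev = t
        seen = 1
    }'
}
```

Fed the first log above (saved to a file and piped in with `retry_gaps < file`), the gaps it prints grow from 97 seconds to 2820 seconds, which is the ever-increasing retry loop mentioned earlier in the thread.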
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 8:06 pm
by Leonardo
I've set all my slots (5) to Huntington's. Since that setting change 16 hours ago, there has not been an instance of a slot without a work unit to process. All slots have been fully engaged without pause.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 8:28 pm
by JonasTheMovie
I've been working on Huntington's for the last day without any problems getting WUs.
But when I checked just now, I had two WUs in my single slot: instead of finishing a 10496, FAH downloaded a new 11431 while the 10496 was at 99%, and I only noticed when the new one was at 90%.
Rebooting restarted the 10496 at 94%; it then finished and uploaded, and the client went on to work on the 11431.
I have no idea if this has any connection to the recent server problems, but I have never seen anyone else report it and cannot find any thread about it when I search.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 9:29 pm
by Aurum
msultan wrote: Hello everyone,
I apologize for the late response. 171.67.108.105 is my WS, which has been given assignments by the AS even though it has no assignable jobs. We are currently trying to fix the problem with the AS where it keeps sending jobs to my WS. In the meantime, I have reduced the priority of my WS so that it doesn't assign jobs as frequently (it is currently 1/10 of the original value).
I am terribly sorry for all the problems that this issue is causing everyone. We appreciate all of your support and hope this doesn't turn you away from F@H. Again, I am sorry for the problem, and we are trying to fix it.
Best,
Muneeb
Select a Cause on the FAHControl/Configure/Advanced tab. Alzheimer's has worked great for a day now. Others say Huntington's is great as well. Cancer may be the problem.
See Muneeb's explanation in the quote that 171.67.108.105 has no WUs. Some day when they fix the Assignment Server code this problem may go away.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 10:08 pm
by Kjetil
Leonardo wrote: I've set all my slots (5) to Huntington's. Since that setting change 16 hours ago, there has not been an instance of a slot without a work unit to process. All slots have been fully engaged without pause.
I am running beta on 12 slots, sat 5,28.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Posted: Wed May 31, 2017 11:10 pm
by markfw
I am using Alzheimer's and it's now working great. I hate not contributing to them all, but until they figure this out, that is better than sitting idle.
Number 38 worldwide and rising!