Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Wed May 31, 2017 11:32 pm
by markfw
So I decided to spread it around with Huntington's, Parkinson's and Alzheimer's. All work fine. I still wish I could let the server choose what needs the most help.

So far I have had to reconfigure 5 boxes out of 11.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Wed May 31, 2017 11:56 pm
by Kougar
msultan wrote:Hello everyone,
I apologize for the late response. 171.67.108.105 is my WS, which has been given assignments by the AS even though it has no assignable jobs. We are currently trying to fix the problem with the AS where it keeps sending jobs to my WS. In the meantime, I have reduced the priority of my WS so that it doesn't assign jobs as frequently (it is currently 1/10 of the original value).

I am terribly sorry for all the problems that this issue is causing everyone. We appreciate all of your support and hope this doesn't turn you away from F@H. Again, I am sorry for the problem, and we are trying to fix it.
Best,
Muneeb
Thank you for the update. Unfortunately another one of my GPUs was just assigned to this server two hours ago. :!:

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Thu Jun 01, 2017 12:55 am
by markfw
I have machines doing everything but cancer (which is what I would really want) and they are all working now. 7 of 11 GPUs, 7 million PPD.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Thu Jun 01, 2017 6:37 am
by snapshot
JonasTheMovie wrote:I've been working on Huntington's for the last day without any problems getting WUs.

But when I checked just now, I had two WUs in my single slot: instead of finishing a 10496, FAH downloaded a new 11431 while the 10496 was at 99%, and I only noticed once the 11431 was at 90%.
Rebooting restarted the 10496 at 94%; it then finished and uploaded, and the client went on to work on the 11431.

I have no idea if this has any connection to the recent server problems, but I have never seen anyone else have this problem, and I cannot find any thread when I search for it.
I had this once. It happened because the 10496 hit a problem at 99%, after the new WU had already been downloaded. When FAHClient restarted, it picked up the new WU instead of the almost completed one. A manual restart of FAH got things going again, as you describe. I find 10496 somewhat prone to errors. Normally it just restarts with the loss of a few percent and carries on, but when it fails at 99% we get the problem we've both seen. It has absolutely nothing to do with this thread and might deserve its own.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Thu Jun 01, 2017 6:19 pm
by braddblk
I'm sure this won't be the first time it has been suggested, but how about some way to exclude a malfunctioning server until it has been corrected? This problem, whether caused by corrupted WUs, empty servers or something else, has shown up time and again over the years.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Thu Jun 01, 2017 7:18 pm
by Luscious
Ironically I have Cancer selected as my Cause Preference and am not seeing any problems as of the last 12 hours. When I was using the "Any" option I would get stuck slots. Folding with four 980 Ti cards.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Thu Jun 01, 2017 9:41 pm
by Leonardo
Luscious wrote:Ironically I have Cancer selected as my Cause Preference and am not seeing any problems as of the last 12 hours. When I was using the "Any" option I would get stuck slots. Folding with four 980 Ti cards.
Maybe the Pande Group/Stanford has resolved the problem now? Maybe we'll get lucky and they'll tell us.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Fri Jun 02, 2017 1:48 am
by k2e2ni
Adam A. Wanderer wrote:
Leonardo wrote:
Luscious wrote:Ironically I have Cancer selected as my Cause Preference and am not seeing any problems as of the last 12 hours. When I was using the "Any" option I would get stuck slots. Folding with four 980 Ti cards.
Maybe the Pande Group/Stanford has resolved the problem now? Maybe we'll get lucky and they'll tell us.
We have to take silence to mean that they haven't solved the problem.
Nope, not solved in the slightest! I just got assigned to 171.67.108.105 about 15 minutes ago and, lo and behold, no WU: 6 failed attempts and still stuck there. I did the Alzheimer's trick mentioned in this thread and immediately went to another assignment server, got the job and started folding. For now I have swapped it to cancer; let's see if it will still point to the faulty servers later.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Fri Jun 02, 2017 2:28 am
by Aurum
markfw wrote:I am using Alzheimer's and it's now working great. I hate to not contribute to them all, but until they figure this out, that is better than idle.

Number 38 worldwide and rising !
Look ma, I'm almost half an Ed :shock: :D :lol:
http://folding.stanford.edu/stats/donors-monthly

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Fri Jun 02, 2017 5:15 am
by snapshot
Leonardo wrote:Maybe the Pande Group/Stanford has resolved the problem now? Maybe we'll get lucky and they'll tell us.
In your dreams:

Code:

03:24:43:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:24:43:WU00:FS01:Connecting to 171.67.108.105:8080
03:24:44:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:24:44:WU00:FS01:Connecting to 171.67.108.45:80
03:24:45:WU00:FS01:Assigned to work server 171.67.108.105
03:24:45:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:24:45:WU00:FS01:Connecting to 171.67.108.105:8080
03:24:45:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:25:44:WU00:FS01:Connecting to 171.67.108.45:80
03:25:45:WU00:FS01:Assigned to work server 171.67.108.105
03:25:45:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:25:45:WU00:FS01:Connecting to 171.67.108.105:8080
03:25:45:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:27:09:WU01:FS01:0x21:Completed 2500000 out of 2500000 steps (100%)
03:27:11:WU01:FS01:0x21:Saving result file logfile_01.txt
03:27:11:WU01:FS01:0x21:Saving result file checkpointState.xml
03:27:11:WU01:FS01:0x21:Saving result file checkpt.crc
03:27:11:WU01:FS01:0x21:Saving result file log.txt
03:27:11:WU01:FS01:0x21:Saving result file positions.xtc
03:27:11:WU01:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
03:27:11:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
03:27:11:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9176 run:25 clone:16 gen:100 core:0x21 unit:0x00000086ab436c6957b24c292b842b3b
03:27:11:WU01:FS01:Uploading 12.62MiB to 171.67.108.105
03:27:11:WU01:FS01:Connecting to 171.67.108.105:8080
03:27:17:WU01:FS01:Upload 2.48%
03:27:21:WU00:FS01:Connecting to 171.67.108.45:80
03:27:22:WU00:FS01:Assigned to work server 171.67.108.105
03:27:22:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:27:22:WU00:FS01:Connecting to 171.67.108.105:8080
03:27:22:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:27:23:WU01:FS01:Upload 5.94%
<upload snipped>
03:29:53:WU01:FS01:Upload 92.62%
03:29:58:WU00:FS01:Connecting to 171.67.108.45:80
03:29:59:WU01:FS01:Upload 96.58%
03:29:59:WU00:FS01:Assigned to work server 171.67.108.105
03:29:59:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:29:59:WU00:FS01:Connecting to 171.67.108.105:8080
03:30:00:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:30:05:WU01:FS01:Upload 99.56%
03:30:06:WU01:FS01:Upload complete
03:30:06:WU01:FS01:Server responded WORK_ACK (400)
03:30:06:WU01:FS01:Final credit estimate, 24767.00 points
03:30:06:WU01:FS01:Cleaning up
03:34:13:WU00:FS01:Connecting to 171.67.108.45:80
03:34:14:WU00:FS01:Assigned to work server 171.67.108.105
03:34:14:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:34:14:WU00:FS01:Connecting to 171.67.108.105:8080
03:34:14:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:41:04:WU00:FS01:Connecting to 171.67.108.45:80
03:41:05:WU00:FS01:Assigned to work server 171.67.108.105
03:41:05:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:41:05:WU00:FS01:Connecting to 171.67.108.105:8080
03:41:06:ERROR:WU00:FS01:Exception: Server did not assign work unit
03:52:10:WU00:FS01:Connecting to 171.67.108.45:80
03:52:11:WU00:FS01:Assigned to work server 171.67.108.105
03:52:11:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
03:52:11:WU00:FS01:Connecting to 171.67.108.105:8080
03:52:11:ERROR:WU00:FS01:Exception: Server did not assign work unit
04:10:07:WU00:FS01:Connecting to 171.67.108.45:80
04:10:08:WU00:FS01:Assigned to work server 171.67.108.105
04:10:08:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
04:10:08:WU00:FS01:Connecting to 171.67.108.105:8080
04:10:08:ERROR:WU00:FS01:Exception: Server did not assign work unit
04:39:09:WU00:FS01:Connecting to 171.67.108.45:80
04:39:10:WU00:FS01:Assigned to work server 171.67.108.105
04:39:10:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050] 1862 from 171.67.108.105
04:39:10:WU00:FS01:Connecting to 171.67.108.105:8080
04:39:10:ERROR:WU00:FS01:Exception: Server did not assign work unit
04:58:20:FS01:Paused
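
For anyone who wants to put numbers on a log like the one above, here is a minimal sketch in Python. It assumes the log lines keep exactly the format shown in this post; the function name `summarize` and the regular expressions are my own illustration and are not part of FAHClient. It counts "Server did not assign work unit" errors per server and reports the gaps between consecutive failures, which in the log above grow from about a second to roughly half an hour.

```python
import re
from collections import Counter

# Both patterns are assumptions based only on the log excerpt in this thread.
CONNECT_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):FS\d+:Connecting to ([\d.]+):\d+"
)
FAIL_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):ERROR:(WU\d+):FS\d+:"
    r"Exception: Server did not assign work unit"
)


def to_seconds(stamp):
    """Convert an HH:MM:SS log timestamp to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(lines):
    """Return (failures per server, gaps in seconds between failures).

    A failure is attributed to the last server that the same WU id
    connected to. Timestamps are assumed not to wrap past midnight.
    """
    last_server = {}      # WU id -> most recent server it connected to
    failures = Counter()  # server IP -> number of failed assignments
    fail_times = []       # failure timestamps, in seconds

    for line in lines:
        connect = CONNECT_RE.match(line)
        if connect:
            _, wu, server = connect.groups()
            last_server[wu] = server
            continue
        fail = FAIL_RE.match(line)
        if fail:
            stamp, wu = fail.groups()
            failures[last_server.get(wu, "unknown")] += 1
            fail_times.append(to_seconds(stamp))

    gaps = [later - earlier for earlier, later in zip(fail_times, fail_times[1:])]
    return dict(failures), gaps
```

To try it, pass any iterable of log lines, for example `summarize(open("log.txt"))`, where `log.txt` stands for wherever your client writes its log. A long list of growing gaps against a single IP is the pattern shown in the log above.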

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Fri Jun 02, 2017 7:48 am
by Leonardo
Hmm, I've given Huntington's some love for a couple days now. I guess it's time to visit Alzheimer's.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Fri Jun 02, 2017 9:58 pm
by bruce
JimF wrote:The information flow on this project is all downhill. The purpose of the moderators (helpful though they may be in many cases) is to shield the developers from problems rather than feeding information back to them. These are not new issues (and a lot of others not apparent at the moment). They have been going on for years. PG's usual response is to start a new public relations campaign to make up for the people who leave.
FALSE.

First of all, this forum is a peer-to-peer self-help forum consisting of volunteers. We Mods/Admins here are volunteers too. The charter that we have adopted is aimed mainly at fighting spam and keeping some kind of coherence to the topics posted. I don't shield anybody. What I do is help gather enough information so that the problem has a coherent summary that describes the problem(s) in terms that Development can attack. You'd be surprised how many posts simply don't give enough information to be useful to them... and nobody wants to have their requests behind a long queue of incomplete descriptions.

In fact, though it's not general knowledge, quite a bit of my time is spent directing focused requests TO specific PG members who can solve the specific problem(s) encountered. I've developed a list of people to contact when a certain type of problem arises. (...and I'm not the only one doing this).

If you'd like to volunteer to gather that sort of information and direct it to somebody who can fix it, you're more than welcome to do so. We can certainly use your help.

In this particular case the corrective actions have (very likely) been slower in coming because I've been out of the country and AFK, but the holiday has been a more significant contributor. The good news is that my first contact after returning turned out to be a person who was already aware of the problem and already working toward a solution.

I'm now aware that there have been several different problems that needed to be resolved by different people. Also, several were things that take time to resolve adequately ... so as you have seen, returning to normal has consisted of a number of step-by-step changes that have produced gradual improvements over time.

As I'm sure you know, FAH consists of many different servers in several locations and coordinating them so that they appear to act as a single system requires a rather complex set of interactions. The only really good news here is that Development has put a plan together to improve several aspects of this coordination process. Once the changes are rolled out, there will be several improvements that were discovered during the recent troubles.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Sun Jun 04, 2017 3:02 pm
by foldy
Today I got a new work unit from 171.67.108.105 so the issue is fixed now?

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Sun Jun 04, 2017 3:25 pm
by Aurum
foldy wrote:Today I got a new work unit from 171.67.108.105 so the issue is fixed now?
Doubt it. Muneeb probably just loaded some WUs on his IP.

Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105

Posted: Sun Jun 04, 2017 3:28 pm
by SombraGuerrero
Agreed. Two or three times while this server has been like this, I have managed to obtain a WU, but it seems to be a very particular one that pushes through.