WU's Not Being Assigned by 171.67.108.102/171.67.108.105/?
Moderators: Site Moderators, FAHC Science Team
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Seems to have improved a lot in the last 24 hours. I'm still getting the occasional WU not assigned, but it's nowhere near as bad as it was.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Nert wrote: This whole episode is sad and disrespectful to the people that contribute to this project. Two questions come to mind:
1) Why do the volunteer contributors have a sense of urgency and those responsible for the project do not?
2) These problems ALWAYS seem to happen over holiday weekends. Is everything so fragile that it fails when no one is there to hand hold the systems and keep them running?
Good questions. The information flow on this project is all downhill. The purpose of the moderators (helpful though they may be in many cases) is to shield the developers from problems rather than feeding information back to them. These are not new issues (and there are a lot of others not apparent at the moment). They have been going on for years. PG's usual response is to start a new public relations campaign to make up for the people who leave.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Joe_H wrote: I have heard back that it is being looked into, but nothing further to post. The first reports came in on a Friday evening and were reported to PG on Saturday morning. This is a relatively major holiday weekend, so limited staff would be available to work on this.
When I was a graduate student I did NOT get holidays. When I worked at Intel I was on call 24x7. I bet they could even remedy this remotely. I notified Pande and Chodera and have heard nothing.
Please notify us when the servers are working reliably so I can move my rigs back to F@H. I'm down below 20% of my capacity and if they don't all have WUs when I get home today I'll move the last of them to another project.
In Science We Trust
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Adam A. Wanderer wrote: As sad as these developments are, I'll stick with F@H. There's just no other project that does the work F@H does. And F@H has improved over the years; I hope it'll continue to do so.
That is a good choice for their science, which I think is quite good too (though not being an expert, I can't prove it). I will check back by the end of the year to see if any problems are resolved. Given their usual rate of progress, that should be sufficient.
-
- Posts: 86
- Joined: Wed Jan 06, 2016 4:16 am
- Location: Northern Sweden
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
I see the same problem with slots stalling; glad to see I'm not alone, if you read me right.
But I have to ask, what is the main problem here?
That there are unresponsive project owners whose servers stall contributors' slots if they happen to be directed to those projects/servers, or
That the client does not recognize a stall caused by multiple failed attempts to download a new assignment and then download from another project?
Each time this has happened a reboot has "solved" my problem: a new WU has downloaded and it has been processing for a day, until I happen upon a problematic server again.
Since the reboot helps, that tells me that the client should be able to recognize the problem and go on to the next project/server.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Adam A. Wanderer wrote: Nert wrote: Was any form of "hacking" or a virus involved?
I hope the Stanford IT is robust and has many backups so that if those things happen they can recover from it.
For the donors the worst case is the servers don't work for some days. For the science the worst case would be if the folding results are lost or corrupted.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Serge_Grenier wrote: Seems <client-type v='beta'/> is working to get WUs since yesterday.
boristsybin wrote: seems it works
Good idea. I'll try it when I get home.
I used to use client-type v='advanced' to try to send the biggest jobs to my best rigs but it did not seem to have any effect so I deleted them.
In Science We Trust
-
- Posts: 410
- Joined: Mon Nov 15, 2010 8:51 pm
- Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
- Location: South Coast, UK
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
SteveWillis wrote: I should mention that my older machine has also not had any problem at all. Only my newer machine had the problem. I mentioned it earlier but didn't bother to include my log.
Code: Select all
*********************** Log Started 2017-05-29T23:18:46Z ***********************
23:18:46:************************* Folding@home Client *************************
23:18:46: Website: http://folding.stanford.edu/
23:18:46: Copyright: (c) 2009-2014 Stanford University
23:18:46: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:18:46: Args: --child --lifeline 1895 /etc/fahclient/config.xml --run-as
23:18:46: fahclient --pid-file=/var/run/fahclient.pid --daemon
23:18:46: Config: /etc/fahclient/config.xml
23:18:46:******************************** Build ********************************
23:18:46: Version: 7.4.4
23:18:46: Date: Mar 4 2014
23:18:46: Time: 12:02:38
23:18:46: SVN Rev: 4130
23:18:46: Branch: fah/trunk/client
23:18:46: Compiler: GNU 4.4.7
23:18:46: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
23:18:46: -fno-unsafe-math-optimizations -msse2
23:18:46: Platform: linux2 3.2.0-1-amd64
23:18:46: Bits: 64
23:18:46: Mode: Release
23:18:46:******************************* System ********************************
23:18:46: CPU: AMD FX(tm)-8320 Eight-Core Processor
23:18:46: CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
23:18:46: CPUs: 8
23:18:46: Memory: 31.32GiB
23:18:46:Free Memory: 30.66GiB
23:18:46: Threads: POSIX_THREADS
23:18:46: OS Version: 3.19
23:18:46:Has Battery: false
23:18:46: On Battery: false
23:18:46: UTC Offset: -5
23:18:46: PID: 1897
23:18:46: CWD: /var/lib/fahclient
23:18:46: OS: Linux 3.19.0-32-generic x86_64
23:18:46: OS Arch: AMD64
23:18:46: GPUs: 6
23:18:46: GPU 0: NVIDIA:7 GP104 [GeForce GTX 1080] 8873
23:18:46: GPU 1: UNSUPPORTED: NV3 [PCI]
23:18:46: GPU 2: NVIDIA:7 GP104 [GeForce GTX 1080] 8873
23:18:46: GPU 3: UNSUPPORTED: NV3 [PCI]
23:18:46: GPU 4: NVIDIA:7 GP104 [GeForce GTX 1080] 8873
23:18:46: GPU 5: UNSUPPORTED: NV3 [PCI]
23:18:46: CUDA: 6.1
23:18:46:CUDA Driver: 8000
23:18:46:***********************************************************************
23:18:46:<config>
23:18:46: <!-- Client Control -->
23:18:46: <fold-anon v='true'/>
23:18:46:
23:18:46: <!-- Folding Core -->
23:18:46: <checkpoint v='30'/>
23:18:46:
23:18:46: <!-- Folding Slot Configuration -->
23:18:46: <cause v='HUNTINGTONS'/>
23:18:46:
23:18:46: <!-- Network -->
23:18:46: <proxy v=':8080'/>
23:18:46:
23:18:46: <!-- Slot Control -->
23:18:46: <power v='full'/>
23:18:46:
23:18:46: <!-- User Information -->
23:18:46: <passkey v='********************************'/>
23:18:46: <team v='224497'/>
23:18:46: <user v='DarthMouse_ALL_1GD5nCZbh7gNo1SESPLT24xEd2Jsu4rTP9'/>
23:18:46:
23:18:46: <!-- Work Unit Control -->
23:18:46: <next-unit-percentage v='100'/>
23:18:46:
23:18:46: <!-- Folding Slots -->
23:18:46: <slot id='0' type='GPU'/>
23:18:46: <slot id='1' type='GPU'/>
23:18:46: <slot id='2' type='GPU'/>
23:18:46:</config>
Is this the one that works? If so, I think it could be the <cause v='HUNTINGTONS'/>
I've added that flag and got work straight away on 3 different rigs. I'm guessing that this flag (and others, like beta) gives you preferential referral to non-affected WorkServers.
Thanks!
-
- Posts: 177
- Joined: Tue Aug 26, 2014 9:48 pm
- Hardware configuration: 10 SMP folding slots on Intel Phi "Knights Landing" system, configured as 24 CPUs/slot
9 AMD GPU folding slots
31 Nvidia GPU folding slots
50 total folding slots
Average PPD/slot = 459,500
- Location: Dallas, TX
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
@rwh202 I can confirm that changing the cause preference to Huntington's does avoid the problematic work server/assignment server. All slots are finally operational. Changing this value got 14 slots that were in "ready" mode to get a work unit and start processing. The procedure is to pause the slot that's in "ready" mode, then go to Configure, select the Advanced tab, set the Cause Preference to Huntington's, click Save, then un-pause the slot. The slot should pick up a work unit right away. Thanks rwh202
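For anyone editing the config file by hand rather than through FAHControl, the same cause preference shows up as a single element in config.xml, as in the log SteveWillis posted earlier in the thread. The following is only a sketch under that assumption: the slot line is illustrative, and the file location (/etc/fahclient/config.xml in that log) will differ between installs.

```xml
<config>
  <!-- Folding Slot Configuration -->
  <!-- Cause preference; HUNTINGTONS is the value reported to work in this thread -->
  <cause v='HUNTINGTONS'/>

  <!-- Folding Slots (illustrative only; keep your own slot definitions) -->
  <slot id='0' type='GPU'/>
</config>
```

A hand edit is presumably only picked up after the client is restarted, so the GUI procedure above is likely the simpler route when FAHControl is available.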
Hardware config viewtopic.php?f=66&t=17997&p=277235#p277235
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
We might just cure Huntington's tonight with the entire F@H network cranking it
In Science We Trust
-
- Posts: 389
- Joined: Fri Apr 15, 2016 12:42 am
- Hardware configuration: PC 1:
Linux Mint 17.3
three gtx 1080 GPUs One on a powered header
Motherboard = [MB-AM3-AS-SB-990FXR2] qty 1 Asus Sabertooth 990FX(+59.99)
CPU = [CPU-AM3-FX-8320BR] qty 1 AMD FX 8320 Eight Core 3.5GHz(+41.99)
PC2:
Linux Mint 18
Open air case
Motherboard: ASUS Crosshair V Formula-Z AM3+ AMD 990FX SATA 6Gb/s USB 3.0 ATX AMD
AMD FD6300WMHKBOX FX-6300 6-Core Processor Black Edition with Cooler Master Hyper 212 EVO - CPU Cooler with 120mm PWM Fan
three gtx 1080,
one gtx 1080 TI on a powered header
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Yes that is the one that works.
1080 and 1080TI GPUs on Linux Mint
-
- Posts: 40
- Joined: Sat Jan 30, 2010 2:38 am
- Location: Washington D.C.
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Serge_Grenier wrote: Seems <client-type v='beta'/> is working to get WUs since yesterday.
boristsybin wrote: seems it works
Code: Select all
client-type
beta
I have 4 rigs, and kept wondering why the last two of them never had the WS x.x.x.105 issues that the first two kept having. I assumed it was the 1080 Ti's that kept the malpracticing server at bay in those two 100% uptime rigs. But then I was like, "why is @PS3EdOlkkola having such a huge problem if I am not? Surely he has lots of high-end cards too..."
Lo and behold, when I checked, the last two rigs had the "beta" flag set in them, whereas my first two rigs didn't. So I went into FAHControl > Configure > Expert (tab) > Extra client options > then added the above "beta" flag to the first two rigs as well > hit OK > hit Save. The next time any slot checked, it got a "beta" assignment right away.
Since then I have had zero problems. Although I suspect the PPD is "slightly" lower than non-beta, at least I don't have to waste my time pausing and unpausing several times an hour.
Good find @Serge_Grenier
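The Extra client options entry described above should correspond to the element Serge_Grenier posted, written into config.xml. A minimal sketch, assuming the rest of the file is left as it is:

```xml
<config>
  <!-- Extra client option: name "client-type", value "beta",
       as entered under FAHControl > Configure > Expert -->
  <client-type v='beta'/>
</config>
```

Removing that one line (or the entry in FAHControl) should return the client to normal assignments once the servers are behaving again.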
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
PS3EdOlkkola wrote: @rwh202 I can confirm that changing the cause preference to Huntington's does avoid the problematic work server/assignment server. All slots are finally operational. Changing this value got 14 slots that were in "ready" mode to get a work unit and start processing. The procedure is to pause the slot that's in "ready" mode, then go to Configure, select tab Advanced, then select the Cause Preference as Huntington's, click Save then un-pause the slot. The slot should pick up a work unit right away. Thanks rwh202
Is it possible to do that in some way through telnet localhost 36330?
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
JimF wrote: PG's usual response is to start a new public relations campaign to make up for the people who leave.
PG should probably also start a campaign to get more biologists to join their team and work on folding projects, because the paper publication rate is far from keeping pace with the increase in the network's computational power... That's quite a lot of electricity spent worldwide for rather few published papers in recent years...
Re: WU's Not Being Assigned by 171.67.108.102/171.67.108.105
Hello everyone,
I apologize for the late response. 171.67.108.105 is my WS, which has been given assignments by the AS even though it has no assignable jobs. We are currently trying to fix the problem with the AS where it keeps sending jobs to my WS. In the meantime, I have reduced the priority of my WS so that it doesn't assign jobs as frequently (it is currently 1/10 of the original value).
I am terribly sorry for all the problems that this issue is causing everyone. We appreciate all of your support and hope this doesn't turn you away from F@H. Again, I am sorry for the problem, and we are trying to fix it.
Best,
Muneeb