Configuration ineffectual causing performance degradation
Posted: Thu Apr 19, 2012 8:20 am
by EugeneG
I have just installed fah on my Windows 7 64bit PC and find that it is causing a noticeable delay in my day-to-day work. I have tried limiting smp threads to 6 and CPU usage to 90% but all 4 cores are still permanently 100% busy. I have restarted the PC since setting this up. What have I missed please ?
Re: Configuration ineffectual causing performance degradatio
Posted: Thu Apr 19, 2012 2:35 pm
by 7im
Hello EugeneG, welcome to the forum.
I couldn't tell from the picture, but if you are running both an SMP and a GPU slot, pause the GPU slot to see if the system lag goes away.
Re: Configuration ineffectual causing performance degradatio
Posted: Thu Apr 19, 2012 4:05 pm
by Joe_H
EugeneG wrote:I have just installed fah on my Windows 7 64bit PC and find that it is causing a noticeable delay in my day-to-day work. I have tried limiting smp threads to 6 and CPU usage to 90% but all 4 cores are still permanently 100% busy. I have restarted the PC since setting this up. What have I missed please ?
By saying you have limited smp threads to 6, do you mean in the configuration for the folding slot? If so, that is at least part of your problem. Your Core 2 Quad processor can only run 4 threads simultaneously. So either leave the SMP setting at -1 to let the client choose the appropriate number, or set it no higher than 4. An SMP setting of 3 will use about 75% of the CPU. The CPU percentage setting currently only works well with uniprocessor cores, not the SMP cores.
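To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. It assumes each compute thread keeps one hardware thread fully busy and ignores everything else running on the machine; the function name `approx_cpu_share` is made up for illustration and is not part of the FAH client.

```python
def approx_cpu_share(smp_threads: int, hardware_threads: int) -> float:
    """Rough percentage of total CPU that smp_threads busy threads can occupy.

    Assumption: one busy compute thread saturates one hardware thread.
    Threads beyond the hardware limit cannot add usage, they only compete.
    """
    if smp_threads < 1 or hardware_threads < 1:
        raise ValueError("thread counts must be at least 1")
    busy = min(smp_threads, hardware_threads)
    return 100.0 * busy / hardware_threads


# A Q9300 exposes 4 hardware threads (4 cores, no hyper-threading).
for setting in (2, 3, 4, 6):
    print(setting, approx_cpu_share(setting, 4))
```

On a 4-thread CPU a setting of 3 comes out at 75%, and a setting of 6 is capped at 100% because the two extra threads have nowhere to run.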
Re: Configuration ineffectual causing performance degradatio
Posted: Thu Apr 19, 2012 4:17 pm
by 7im
Oh, nice catch, hence the "CPUS: 4" item in the ***System*** section. I should have also recognized the Q9300 as no HT.
As noted, running with more threads than CPUs will not speed up the folding and can cause system lag. Run with 4 or fewer. If you are also running a GPU client slot, you may need to drop to 3.
Re: Configuration ineffectual causing performance degradatio
Posted: Fri Apr 20, 2012 12:30 pm
by EugeneG
Thank you for the replies. Having run with different settings I now think that there is no more than a weak relationship between the configuration settings and program resource demands - at least for smp machines. I have not yet tried this on another PC - perhaps one without available GPU.
The smp CPUs field in the configure folding slot screen states that it controls the number of THREADS in use, not cores. Neither appears to be true, though, as I have set it as low as 2 and all 4 of my cores were active. Moreover, running Resource Monitor shows that the number of threads always exceeds the limit placed in the FaH configure screen. I have given up on this method of control and set it to -1.
The Percent CPU usage slider allows a little more control. I have settled on 40% as that way I seem to average at about 90% usage, and the 10% that the system idles is what I seem to need to be able to get an acceptable response. I have a little outdated knowledge of the VMS operating system (in many ways the forerunner of Windows) and that leads me to think that what I am seeing is the result of too large a quantum, yet surely that can't really be it. But if the FaH process (which does have the lowest available priority) is being pre-empted as it should, then I should never have to wait longer than about 200ms for a response.
Joe_H, don't confuse cores with threads. The clue is in FaH's suggestion to make the number of threads a multiple of two, presumably as that is the number of hyper threads Intel makes available per core. I think labeling the section 'CPUs' then going on to mention threads is at best confusing. However, a process can have numerous threads running concurrently, although only two per core will be able to be processed at a time; and as 7im mentioned, there is no point in trying to exceed that. Your comment about the CPU usage only working well with a uniprocessor machine seems likely to point to the explanation of the settings I am having to use.
Re: Configuration ineffectual causing performance degradatio
Posted: Fri Apr 20, 2012 3:34 pm
by Joe_H
I am not confused. The important number is the number of cores and how many threads they support. The number placed in the configuration of the slot sets the number of main compute threads; the FAHcore also uses additional threads, but relatively little CPU time is spent on them. So, in your case you have a 4-core Core 2 Quad CPU that only executes one thread per core at a time, and the maximum number you should set is 4. If you had a CPU with hyper-threading support, such as an i7 with 4 cores, then you could set that number as high as 8.
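As a quick illustration of that limit, the sketch below works out the ceiling from the core count and the number of threads each core can execute at once. The helper names are hypothetical, and the per-core thread counts are the ones given in this thread (1 for the Q9300, 2 for a hyper-threaded i7), not values read from the hardware.

```python
def max_smp_threads(cores: int, threads_per_core: int) -> int:
    """Highest SMP setting that does not oversubscribe the CPU."""
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be at least 1")
    return cores * threads_per_core


def is_oversubscribed(smp_setting: int, cores: int, threads_per_core: int) -> bool:
    """True when more compute threads are requested than the CPU can run at once."""
    return smp_setting > max_smp_threads(cores, threads_per_core)


print(max_smp_threads(4, 1))       # Core 2 Quad Q9300, no hyper-threading
print(max_smp_threads(4, 2))       # 4-core i7 with hyper-threading
print(is_oversubscribed(6, 4, 1))  # the original setting of 6 on the Q9300
```

The only point of separating the two inputs is to make clear that the setting counts threads, while the limit comes from cores multiplied by threads per core.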
I have not looked at how many extra threads are used by each FAHcore, but for the A4 core it is 3 under OS X. I should have checked the last time I ran under Windows to see if it had the same overhead, but it is not that important. So, on my Core 2 Duo machine I see 5 threads, and on the i7 and Xeon machines 11 threads are associated with the FAHcore_a4 process. 2 and 8 are the counts of active threads, respectively, when the processes are profiled. At a minimum the thread count was listed as 4, or 1 plus 3, the times I tried out the A4 as a uniprocessor folding core.
As for the recommended numbers to use in that setting being multiples of 2, that is because many WUs fail when using SMP settings that are "large" prime numbers or multiples of the same. How large is "large" is being looked into. Many WUs will work with 5 or 7; almost all fail with settings of 11 or higher. However, the primes 2 and 3 work. In theory the number 9, as a non-prime multiple of 3, should also work okay, but I don't recall any reports of tests with that number. So if you set your folding slot to 2 or 3, you will use less than the max CPU available. The other cores will not be completely inactive; your OS and the other apps running, such as FAHControl, will use some of the available processing depending on how CPU intensive they are.
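The rule of thumb above can be written down as a small check on the largest prime factor of the thread count. This is only a sketch of what is described in this post: the cut-offs (2 and 3 fine, 5 and 7 hit or miss, 11 and up mostly failing) are the anecdotal ones quoted here, not anything taken from the FAHcore source, and the function names are invented for the example.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (returns 1 for n == 1)."""
    if n < 1:
        raise ValueError("thread count must be at least 1")
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def smp_risk(threads: int) -> str:
    """Classify an SMP thread count using the thresholds reported in this thread."""
    p = largest_prime_factor(threads)
    if p <= 3:
        return "ok"
    if p <= 7:
        return "may fail"
    return "likely to fail"


for threads in (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 22):
    print(threads, smp_risk(threads))
```

Under these assumptions 9 and 12 count as fine because their largest prime factor is 3, while 10 inherits the uncertainty of 5 and 22 inherits the problems of 11.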
Re: Configuration ineffectual causing performance degradatio
Posted: Fri Apr 20, 2012 3:56 pm
by P5-133XL
Your Q9300 does not have hyper-threading, so your quad-core CPU should have a maximum of four threads dedicated to folding, i.e. SMP:4. Any more and the threads will compete with each other for CPU time, causing folding to slow down. There is significant overhead in a task switch that moves a thread to another core, and if the number of threads matches the number of cores, the number of task switches is minimized. The reason that Stanford recommends even numbers of threads is that the SMP FAHCores occasionally choke when running with a prime number of threads (i.e. 5, 7, 11 ...). So please don't over-subscribe the number of threads beyond the number of cores plus the number of hyper-threaded simulated cores, which for you is four. Going SMP:6 will drop your PPD for absolutely no benefit.
I doubt that the SMP is causing a noticeable performance degradation for normal usage. The Windows priority system is quite effective at preventing that. However, GPU folding does cause issues, for there is no resource conflict resolution within the video subsystem. If you are seeing performance issues on your machine, then the first thing to try is suspending GPU folding. There are some techniques that may significantly reduce lag caused by GPU folding, but they involve a significant amount of fiddling to find the right driver version, combined with using the Aero theme.
Re: Configuration ineffectual causing performance degradatio
Posted: Fri Apr 20, 2012 4:18 pm
by EugeneG
That's helpful - thanks.