Can't choose large number of CPU cores
Posted: Tue May 19, 2020 8:34 pm
by bambihunter
I have a couple of servers that I am no longer going to use in production. Last night, I was working with the smaller one. It has dual 12-core Xeons, for 24 cores and 48 threads. It will not let me set it higher than 30 cores. While I realize that the HT "cores" won't double the yield, I was surprised it wouldn't go higher, especially because my other system has quad E5-4650 v3s and I was going to use most of the CPU power for this until I migrate the last few smaller VMs off it. Of course I can set it to use more but smaller WUs, but is there a top limit on the number of CPUs?
Unrelated, but one of my old gaming systems has 3 x GTX 980s in it, and it seems there is constantly one GPU failing. It is not the same GPU, nor the same slot, etc. I have moved them from slot to slot and PC to PC. There is no rhyme or reason that I have found as to what is causing it. One card may run 25 units in a row with no failures, then fail repeatedly, then return to working fine again the next day. I have also tried opening up the side and adding cooling, but no change. Has there been a rash of random GPU WUs that have caused this over the past 6 weeks or so?
https://stats.foldingathome.org/donor/2261
Re: Can't choose large number of CPU cores
Posted: Tue May 19, 2020 8:46 pm
by Neil-B
Guess you are using Windows ... 32 threads is the limit for a CPU slot and tends to work nicely, and if possible you want one slot this large ... a second 12-thread CPU slot would leave 4 threads free - you may need no more than that as FAH runs at low priority and your other stuff should take precedence ... if the servers end up pure FAH then I'd put the 16 remaining threads as the 2nd slot ... some people will argue the need to leave some threads free to avoid contention issues, but my experience is that server-grade systems running Xeons actually cope OK even if you "max" the thread count - I get the best from my twin 14-core, 56-thread system when running 32/56 and 24/56 slots.
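The slot arithmetic above can be sketched in a few lines of Python. This is only an illustration of the split, not anything the FAH client provides: `split_into_slots` is a hypothetical helper, the 32-thread cap is the per-slot limit quoted in this post, and `reserve` is an assumed knob for threads left free for other work.

```python
def split_into_slots(total_threads, max_per_slot=32, reserve=0):
    """Split a machine's threads into CPU slot sizes, largest slot first.

    Hypothetical helper for illustration only. max_per_slot=32 is the
    per-slot limit quoted in the thread; reserve is the number of threads
    deliberately left free for non-FAH work.
    """
    if total_threads < 0 or reserve < 0 or max_per_slot < 1:
        raise ValueError("thread counts must be non-negative and max_per_slot >= 1")
    usable = max(total_threads - reserve, 0)
    slots = []
    while usable > 0:
        size = min(usable, max_per_slot)  # fill one slot up to the cap
        slots.append(size)
        usable -= size
    return slots


if __name__ == "__main__":
    print(split_into_slots(48, reserve=4))  # [32, 12] -> 4 threads left free
    print(split_into_slots(48))             # [32, 16] -> pure FAH box
    print(split_into_slots(56))             # [32, 24] -> the twin 14-core system
```

The three calls reproduce the splits mentioned in the post: 32 + 12 with 4 threads spare, 32 + 16 on a dedicated 48-thread box, and 32 + 24 on the 56-thread system.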
As to GPU issues I won't try to offer advice - they confuse me !!
Re: Can't choose large number of CPU cores
Posted: Tue May 19, 2020 9:11 pm
by bambihunter
Thanks Neil. I appreciate the response. So it is a Windows limitation then? Or Windows FAH Client?
Yes, it is running Windows Server 2012 R2 at the moment with VMs in Hyper-V, though if I keep it as a partial work server it will be moved to VMware. It looks like your top system is similar to this server. It is a Dell T630. I have been reading up on
https://flings.vmware.com/vmware-applia ... lding-home as it sounds like it could be viable. I still need to have the server available so that I CAN turn it down if needed. I am the local SysAdmin for our company and sometimes I have to take sandbox data and manipulate/test it. Once I finish getting the rest of the stuff off the blade server, THEN I can crank up the CPU folding to join the decent output of my half-dozen GPUs.
Re: Can't choose large number of CPU cores
Posted: Tue May 19, 2020 9:25 pm
by Neil-B
I have seen posts that imply enterprise versions can do more than 32-thread slots ... Mine didn't and I never felt the desire to work out why ... I could run the latest server product as I have a license, but I'm happy just running it in Win10 Ent with a base system installation of FAH ... For work-related use I'll tend to rebuild to the latest drop of Server and run everything in Hyper-V, as all I do is experimental/sandbox-style proof-of-concept work - and when doing that I am thrashing the kit and FAH has to take a break.
There can be some issues with thread counts higher than 32 (people do run them under Linux), as those slot sizes are tested less (maybe never) and some odder things can happen with the way Gromacs in the FAHCore splits up the thread usage ... Another reason I have been relaxed about just using 32 threads as the max.
Re: Can't choose large number of CPU cores
Posted: Tue May 19, 2020 11:42 pm
by Rel25917
If the failed GPU units are all 13404 or 13405 I wouldn't worry too much; they have a higher than normal failure rate. If it is other projects that are failing, there could be a problem. I would need to see the logs from a failure to have any idea why.
Re: Can't choose large number of CPU cores
Posted: Wed May 20, 2020 1:28 am
by MeeLee
Just make sure you have enough headroom on the PSU. If you use a dual-PSU system, try to see if some sort of logger notices electrical spikes.
Nvidia GPUs are very susceptible to voltage drops or spikes.
Sometimes setting the fan curve to high helps reduce voltage spikes, but it works against voltage droop...
Under load, my GPUs get somewhere between 11.5 and 11.8 V.
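As a rough way to judge logged readings like these, here is a small Python sketch. It assumes a ±5% tolerance around the 12 V nominal, which is the figure the ATX design guide gives for the +12 V rail and is not something stated in this thread. `rail_status` is a hypothetical helper, and software sensor readings can be imprecise, so treat the result as a hint rather than a measurement.

```python
NOMINAL_V = 12.0   # nominal +12 V rail
TOLERANCE = 0.05   # assumed +/-5% window (ATX figure, not from this thread)


def rail_status(reading_v, nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Classify a 12 V rail reading as 'droop', 'spike', or 'ok'.

    Hypothetical helper for illustration: compares one logged reading
    against nominal * (1 +/- tolerance).
    """
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    if reading_v < low:
        return "droop"
    if reading_v > high:
        return "spike"
    return "ok"


if __name__ == "__main__":
    # The under-load readings quoted above fall inside the assumed window.
    for volts in (11.5, 11.8):
        print(volts, rail_status(volts))  # both print "ok"
```

Under that assumption the window is roughly 11.4 V to 12.6 V, so 11.5 to 11.8 V under load is low but still in range. Short spikes or dips between log samples would not show up in a check like this.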
Re: Can't choose large number of CPU cores
Posted: Wed May 20, 2020 3:54 am
by PantherX
Assuming that you're using work hardware, please make sure that you have permission (generally written) from people authorized to make those decisions (Internal IT, TL, Manager, CISO, CTO, GM, EGM, etc.) as per the EULA.
If you have to run F@H in a VM, consider using a Linux-based one. Generally speaking, the current version of FahCore_a7 is more efficient when running in Linux than in Windows.
Re: Can't choose large number of CPU cores
Posted: Wed May 20, 2020 4:05 pm
by bambihunter
These are my own systems, PantherX. I buy some of the equipment as we retire it or from other places. For about 10 years I had my own I.T. consulting business, which was why I bought these initially. It's interesting that the core performs better in Linux. I used to use Linux for everything at home but haven't used it much in at least 10 years, maybe 15, except for a bootable ISO to fix issues on Windows servers/PCs.
MeeLee, very good tips. One should never underestimate the value of a good PSU. I retired my old gaming system a year ago and built a new HEDT with everything new, except I reused the 1200 W PSU. When I fired it up, the same game was crashing to desktop at the same point. On a hunch, I put a NIB warranty replacement 1 kW power supply in. That problem went away, but occasionally it would just click and shut clear off. This led me to believe 1 kW wasn't enough and it was shutting off in protect mode. So, I bought a 1600 W Corsair and it has run flawlessly since. The old system with the 980s now has a known good 1200 W power supply (same model as the flaky one). It actually has adjustable voltage, so one can step up the 12 V rail a bit if there's too much voltage droop.
This newer gaming system (with a little work here and there) is a great folder: a pair of 2080 Tis running on an 18-core i9-9980XE. Stock it is 3.0 GHz, but I fold 24/7 with it at 4.7 GHz on all cores when not using it for gaming or work. The CPU on this, and on my servers, really doesn't do much PPD comparatively. Maybe 80k per day (I can't remember), compared to my 2 mil+ from each of the 2080 Tis. If I were smart, I'd sell some of the servers and buy another video card.
This isn't a bad OC for a workstation board with 128 GB of RAM:
https://valid.x86.fr/y1srm1