There are potential problems with dividing the atoms of a protein into groups that can run in parallel (on separate threads). One is that the protein might be too small, making each group too small; a slot with a large number of threads only works well on proteins with a large number of atoms. Another is that GROMACS doesn't like thread counts with large prime factors. In both cases, FAH should assign a project with an acceptable number of threads, leaving the other threads idle.
While this has nothing directly to do with the 40 vs 32-thread limitation, it would limit your total productivity somewhat, even if the Windows 40 / 32 thread problem were to be fixed.
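To illustrate the prime-factor point, here is a rough Python sketch that picks the largest thread count at or below what a slot offers whose prime factors are all small. The cutoff of 5 (`max_prime`) and both function names are assumptions made for illustration only; this is not the rule FAH or GROMACS actually applies.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (returns 1 for n < 2)."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def usable_threads(available: int, max_prime: int = 5) -> int:
    """Largest count <= available whose prime factors are all <= max_prime.

    max_prime=5 is an illustrative assumption, not a documented FAH limit.
    """
    for count in range(available, 1, -1):
        if largest_prime_factor(count) <= max_prime:
            return count
    return 1


if __name__ == "__main__":
    for n in (40, 38, 34, 22):
        print(n, "threads available ->", usable_threads(n), "used")
```

Under that assumed cutoff, a 40-thread slot keeps all 40 threads (40 = 2 x 2 x 2 x 5), while a 38-thread slot (2 x 19) would drop to 36 and leave two threads idle.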
Dual 20 Core xeons only one being utilized.
Moderators: Site Moderators, FAHC Science Team
Re: Dual 20 Core xeons only one being utilized.
^^ Thanks - am I to understand that it would be better to create several slots with say 8 threads each? Productivity-wise?
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: Dual 20 Core xeons only one being utilized.
No, 2 slots is enough to get all cores used. It is better to fold only 2 work units fast, using all threads, than to have 4 work units finish slowly.
Re: Dual 20 Core xeons only one being utilized.
Depends.
On my Xeon, I can set 10 out of 20 cores, and it appears that Windows is using core 0, 2, 4, 6, 8.... (all the master cores), and not the hyperthreading cores by default.
Remember that each core (or core set) has only so much L-cache assigned to it, so it makes more sense to run only half as many threads on a hyperthreading CPU,
and to run all the cores (minus the ones assigned to the graphics card) on a non-hyperthreading CPU.
Surprisingly, the difference between running my Xeon at 10 cores, or running them at 20 cores is less than 3 Watts on the wall!
CPU PPD on the other hand went up by 33+%.
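As a small sketch of the "one thread per physical core" idea, the following lists the logical CPU indices that would be used if hyperthread siblings are numbered in adjacent pairs (0,1), (2,3), and so on. That numbering is an assumption that matches what I see in Windows, but it is not guaranteed on every OS or machine, and the function name is made up for illustration.

```python
from typing import List


def one_thread_per_core(logical_cpus: int, threads_per_core: int = 2) -> List[int]:
    """Return one logical CPU index per physical core.

    Assumes sibling hyperthreads are numbered adjacently, e.g. (0,1), (2,3).
    Use threads_per_core=1 for a CPU without hyperthreading.
    """
    return list(range(0, logical_cpus, threads_per_core))


if __name__ == "__main__":
    # 20 logical CPUs with hyperthreading -> 10 "master" cores
    print(one_thread_per_core(20))
```

The length of that list is the thread count to set on the CPU slot; actually pinning threads to those cores would need an affinity tool on top of this.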
Re: Dual 20 Core xeons only one being utilized.
Does anyone else see the same result, that using only real cores and not virtual (hyperthreading) cores improves PPD?
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
- Contact:
Re: Dual 20 Core xeons only one being utilized.
I wouldn't be surprised, with an AVX core and/or a WU with very few atoms ...
Re: Dual 20 Core xeons only one being utilized.
With SSE2, using an odd-even pair of cores runs slower than using only even-numbered or only odd-numbered cores, since the odd-even pair shares the SSE2 hardware. I've never tested AVX, so I'm not sure how it behaves in similar situations. Of course, for most people it's the OS that decides which cores to leave idle when they're not all in use (i.e., unless you intentionally assign affinity). For the most part, using most of your cores gets more work done than intentionally leaving some idle.
If you do decide to test AVX, please report the details of which projects you tested, how many concurrent frames you tested, and how the TPF changed.
Posting FAH's log:
How to provide enough info to get helpful support.