core 0x17 GPU usage
core 0x17 GPU usage
Hi,
Have a question re core 0x17. I'm playing with my new graphics card and noticed that GPU utilization never goes beyond 90% when folding, which is a bit strange - I'd expect it to make full use of the GPU. I have a full physical core (+1 HT one) dedicated at 4.6 GHz, so I don't think this has anything to do with being CPU starved (also confirmed by the fact that 10% goes to the System Idle Process in Task Manager). So I'm wondering whether this [the GPU being loaded to only 90%] is a general thing or card specific, and why that is.
Thanks
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
-
- Posts: 823
- Joined: Tue Mar 25, 2008 12:45 am
- Hardware configuration: Core i7 3770K @3.5 GHz (not folding), 8 GB DDR3 @2133 MHz, 2xGTX 780 @1215 MHz, Windows 7 Pro 64-bit running 7.3.6 w/ 1xSMP, 2xGPU
4P E5-4650 @3.1 GHz, 64 GB DDR3 @1333MHz, Ubuntu Desktop 13.10 64-bit
Re: core 0x17 GPU usage
It isn't specific to your card. I just took a look at the usage of my 780s and they're also hovering in the 90% range. I don't fold on my CPU, so it's definitely not due to the GPUs being starved for CPU time.
I wouldn't worry about it. It's probably just the cores not being able to scale perfectly with shader count.
Re: core 0x17 GPU usage
The core can max out a Titan or a 780 Ti; I'm not sure what is holding yours back.
Re: core 0x17 GPU usage
What project? I admit to not having checked GPU usage in a long time, so the only data I have is what's running now, which has been from 9201.
Re: core 0x17 GPU usage
All projects that I've seen so far.
Re: core 0x17 GPU usage
It's false to assume your CPU is maxed out only when it's reporting 100%. Your CPU has to spend some time reading/writing disk data, switching tasks, reading/writing memory, etc. A good disk cache can hide most of the time your CPU has to wait for the disk, but there still will be times when the CPU is not at 100%. Sometimes it has to wait for the internet or for Keyboard/Mouse interactions, too.
Similarly, it's false to assume that a GPU is only maxed out when it reports 100% utilization. Some time has to be spent moving data through the PCIe bus, and the data being processed isn't always an exact multiple of the number of shaders, so when data is delivered to the GPU, some data blocks will be smaller than the number of free shaders. At certain points in the calculations, data must be synchronized before the next type of processing can begin. [etc.]
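A rough way to see the "not an exact multiple of the shader count" effect is a minimal sketch like the one below; the work-item and shader counts are made up for illustration and don't correspond to any particular project or card:

```python
import math

def average_occupancy(work_items: int, shaders: int) -> float:
    """Fraction of shader-cycles doing useful work when work is issued in
    shader-sized batches and the last batch is only partially filled."""
    batches = math.ceil(work_items / shaders)
    return work_items / (batches * shaders)

# Hypothetical numbers: 10,000 work items on a 2,880-shader GPU.
print(f"{average_occupancy(10_000, 2_880):.1%}")   # ~86.8%
```

Synchronization points and PCIe transfers add further idle time on top of that, so the reported utilization ends up a few percent lower still.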
Posting FAH's log:
How to provide enough info to get helpful support.
Re: core 0x17 GPU usage
Thanks. Yes, it does look like there's a bottleneck somewhere. So shader count? With core 0x15 my utilisation is 99%, so it does seem to also have to do with the way core 0x17 works.
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
Re: core 0x17 GPU usage
If you want to reduce everything to a single number, then look only at the shader count. My point is that you can't reduce everything to a single number.
Of course there is a bottleneck, but you're assuming that somehow the shaders will always be your bottleneck and nothing else matters. In fact, the real bottleneck is always a mixture of all of the delays from everything that might saturate for some fraction of the time you're processing a WU. Reducing any of those delays will speed up production in proportion to how those pieces, both big and small, add up to the total delay.
In your case, 90% of the delay is due to the limitations of your shaders and 10% of your delay is due to something else. Some people would attempt to reduce the shader delay by overclocking, but that won't help the other 10%. Assuming the overclock is stable, both throughput and the percentage of non-shader delays will increase. Replacing PCIe 2.0 with PCIe 3.0 will reduce the non-shader delays and increase the percentage of delays attributable to the shaders, in proportion to the fraction of time spent waiting on PCIe transfers.
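To make the "mixture of delays" arithmetic concrete, here's a minimal sketch; the 90/10 split echoes the discussion above, while the 10% overclock and the halved transfer time are hypothetical round numbers:

```python
def wu_time(shader_s: float, other_s: float) -> float:
    """Time per work unit = time waiting on shaders + everything else
    (PCIe transfers, CPU prep, synchronization)."""
    return shader_s + other_s

base = wu_time(90.0, 10.0)              # 90% shader delay, 10% everything else

# A stable 10% shader overclock shortens only the shader portion ...
oc = wu_time(90.0 / 1.10, 10.0)
print(f"overclock:   {base / oc - 1:.1%} faster, "
      f"shaders busy {(90.0 / 1.10) / oc:.1%} of the time")

# ... while halving the non-shader delays (e.g. a faster PCIe link)
# shortens only the other portion.
pcie = wu_time(90.0, 10.0 / 2.0)
print(f"faster PCIe: {base / pcie - 1:.1%} faster, "
      f"shaders busy {90.0 / pcie:.1%} of the time")
```

It's Amdahl's law in miniature: speeding up one component only helps in proportion to its share of the total delay.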
Posting FAH's log:
How to provide enough info to get helpful support.
Re: core 0x17 GPU usage
Thanks. I haven't assumed anything - my observation is that not 1% but 10% of my GPU's core is not used, which is a shame. So if indeed most of that 10% is caused by a shader bottleneck, this should not be an issue (or at least not to such an extent) on 980 cards, right? Which would mean that 970 cards are at a disadvantage regardless of how high they can clock. That's rather an academic point of course, since in the grand scheme of things it's the bottom-line performance we get that matters...
By the way, I'm on PCIe 3.0 x16, so GPU-to-bus latencies should be minimal. With the core and shader clocks now shared, I can't overclock the shaders alone.
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
-
- Posts: 57
- Joined: Fri Dec 28, 2007 9:07 am
- Hardware configuration: Computer 1:
CPU: Intel Q6600@2,4GHz
RAM: 8GB
OS: Windows 7 SP1
Video: EVGA GTX550Ti SC (NVIDIA GeForce GTX550Ti GPU - 1GB GDDR5)
(OC: GPU@981MHz / Shaders@1962 / Memory@4514)
PSU: OCZ StealthXtream 600 Watt
Client 7.4.4
Computer 2:
CPU: AMD AthlonII X4 635 @2.9GHz
RAM: 4GB
OS: Windows Server 2008 R2 SP2
Client 7.4.4, configured as a service
Computer 3:
CPU: Intel Core i7-4790K @4.0GHz
GPU: EVGA GTX980 @1.518GHz
RAM: 32 GB
OS: Windows 7 SP1
Client 7.4.4
Computer 4:
CPU: Intel Core i5 M560 @2,67GHz
RAM: 4 GB
OS: Windows 7 Enterprise
Client: Win-SMP2
Computer 5:
CPU: Intel Core i3 4370 @3.8GHz
RAM: 8GB
OS: Windows 7 SP1
Client 7.4.4 configured as a service - Location: Netherlands
Re: core 0x17 GPU usage
When I moved my GTX980 (EVGA SC ACX2) from my old Core2Quad Q6600@2.4/PCIe 2.0 to my new i7-4790K@4.0/PCIe 3.0, the GPU usage (as shown by EVGA PrecisionX) went up from 82% to 88-89%.
To be honest, I had hoped for a better result.
Re: core 0x17 GPU usage
If you want a higher GPU usage, get a GTX 750 Ti. I have two on a Haswell MB, running at 98% for both Core_17 and Core_18 (each fed by a virtual core of an i7-4770).
On the other hand, if you want higher output, stick with your GTX 980.
The real issue insofar as I am concerned is PPD per dollar (and per watt, depending on your energy costs, green-ness, etc.). The GTX 980 is nothing to complain about, all things considered. And as Bruce points out, there will always be a bottleneck somewhere. You have just found the bottleneck du jour.
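For anyone weighing that trade-off, the arithmetic is a one-liner; the prices, wattages and PPD figures below are purely hypothetical placeholders, not measurements for these cards:

```python
# Hypothetical figures for illustration only -- substitute your own measured
# PPD, purchase price and power draw.
cards = {
    "GTX 750 Ti": {"ppd": 60_000,  "price_usd": 150, "watts": 60},
    "GTX 980":    {"ppd": 220_000, "price_usd": 550, "watts": 165},
}

for name, c in cards.items():
    print(f"{name}: {c['ppd'] / c['price_usd']:,.0f} PPD per dollar, "
          f"{c['ppd'] / c['watts']:,.0f} PPD per watt")
```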
Re: core 0x17 GPU usage
Breach wrote: Thanks. I haven't assumed anything - my observation is that not 1% but 10% of my GPU's core is not used, which is a shame. So if indeed most of that 10% is caused by a shader bottleneck, this should not be an issue (or at least not to such an extent) on 980 cards, right?

You missed my point. If your shaders are not busy 10% of the time, then 90% of the time the reason the result has not been completed is that you're waiting on the shaders -- which is what I called "shader delay". You're asking for 100% of the delays to be caused by there not being enough shaders, or by them not being fast enough (and never having to wait for ANYTHING ELSE). Your shaders can't process data that hasn't arrived yet, and whenever the shaders complete a work segment, some time is spent transferring data (PCIe delay) and some time (CPU delay) is spent while the CPU processes the data it has received and prepares the next work segment. You're saying those pieces add up to 10%.
(I'm assuming there are only three categories, and the only reason the WU has not been completed falls into one of those three. When more than one thing is busy, it has to be listed under one of those three categories.) Also, during the processing of each WU, there are several different types of work segments. Some will be delayed more by one thing and some by something else, so by only looking at one number you're taking some kind of weighted average.
CBT reduced his PCIe delay and CPU delay, and a bigger percentage was spent waiting for the GPU (from 82% to 88-89%). JimF suggests that if you use a slower GPU without changing the other delays, a bigger percentage of the time will be spent waiting on the shaders.
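A minimal sketch of that bookkeeping; the split into three categories follows the description above, and the per-segment times are invented so that the totals land near the 82% and 88-89% CBT reported:

```python
def gpu_usage(segments):
    """Reported GPU usage = shader-busy time / total time, summed over all
    work-segment types in a WU -- the weighted average described above."""
    shader = sum(s["shader"] for s in segments)
    total = sum(s["shader"] + s["pcie"] + s["cpu"] for s in segments)
    return shader / total

# Invented per-segment delays (arbitrary units) on the old PCIe 2.0 host ...
old = [{"shader": 70, "pcie": 8, "cpu": 5},
       {"shader": 12, "pcie": 3, "cpu": 2}]
print(f"old host: {gpu_usage(old):.0%}")   # 82%

# ... and the same shader work after PCIe 3.0 and a faster CPU shrink
# the other two categories.
new = [{"shader": 70, "pcie": 4, "cpu": 3},
       {"shader": 12, "pcie": 2, "cpu": 1}]
print(f"new host: {gpu_usage(new):.0%}")   # ~89%
```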
Posting FAH's log:
How to provide enough info to get helpful support.
Re: core 0x17 GPU usage
FYI, by switching off the CPU client (even though it was already configured for 7 cores), I managed to get the '% GPU usage' up to 91%.
I'm not sure if the i7's turbo mode kicks in at this point. If not, there may be another improvement to be had there.
How can I tell if turbo is active and at which frequency that core runs?
This now leads to a higher PPD from the GPU, but I miss the PPD from the CPU. I'm not sure yet which combination is best (for a Core_17). Since I seem to keep receiving Core_15 WUs, where the equation is quite different, I certainly don't want to miss out on the PPD from the CPU client for now.
Corné
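The "which combination is best" question is just a sum over the two slots; a minimal sketch with entirely hypothetical PPD figures (measure what your own client reports under each configuration for a few WUs before deciding):

```python
# Hypothetical PPD figures for illustration only -- replace them with what
# your client actually reports under each configuration.
configs = {
    "CPU:7 + GPU": {"cpu_ppd": 25_000, "gpu_ppd": 180_000},
    "CPU:4 + GPU": {"cpu_ppd": 16_000, "gpu_ppd": 195_000},
    "GPU only":    {"cpu_ppd": 0,      "gpu_ppd": 200_000},
}

for name, c in configs.items():
    print(f"{name}: {c['cpu_ppd'] + c['gpu_ppd']:,} total PPD")
```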
-
- Posts: 1576
- Joined: Tue May 28, 2013 12:14 pm
- Location: Tokyo
Re: core 0x17 GPU usage
@CBT: try CPU:6 instead ... that might keep your GPU usage level up and still add CPU points. Or even just CPU:4.
Please contribute your logs to http://ppd.fahmm.net
Re: core 0x17 GPU usage
I've been thinking along the same lines. I might even try CPU:3, to make sure a 'complete' core is available, not just a HT module. My GPU now has a Core_15, so I can't test for the moment.
Btw. why doesn't core_15 receive any bonus points?
Corné
Last edited by CBT on Sat Jan 31, 2015 10:44 am, edited 1 time in total.