bruce wrote: The OpenCL driver for Linux is significantly different than the Windows OpenCL driver. Each GPU uses at most one CPU thread. Many projects do significantly better if you have a faster CPU (Central Processing Unit). The GPU can't get data fast enough to stay busy ...
And how much more does one need to run 2 GPUs (with no CPU folding) than a FX-8320 Eight-Core Processor running at 3500.00MHz?
Project 10496 (158,16,66)
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 230
- Joined: Mon Dec 12, 2016 4:06 am
Re: Project 10496 (158,16,66)
ComputerGenie wrote: And how much more does one need to run 2 GPUs (with no CPU folding) than a FX-8320 Eight-Core Processor running at 3500.00MHz?
Unfortunately having 8-cores doesn't matter. Drivers are not multi-threaded. Each GPU uses one CPU core. I'll bet you could create a slot that uses 6 CPUs without changing the GPU performance ... unless you already run some "heavy" applications that continuously use some of your CPU threads.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 230
- Joined: Mon Dec 12, 2016 4:06 am
Re: Project 10496 (158,16,66)
bruce wrote:The OpenCL driver for Linux is significantly different than the Windows OpenCL driver. Each GPU uses at most one CPU thread. Many projects do significantly better if you have a faster CPU (Central Processing Unit). The GPU can't get data fast enough to stay busy ...
ComputerGenie wrote: And how much more does one need to run 2 GPUs (with no CPU folding) than a FX-8320 Eight-Core Processor running at 3500.00MHz?
bruce wrote: Unfortunately having 8-cores doesn't matter. Drivers are not multi-threaded. Each GPU uses one CPU core. I'll bet you could create a slot that uses 6 CPUs without changing the GPU performance ... unless you already run some "heavy" applications that continuously use some of your CPU threads.
OK, so to re-ask:
And how much more does one need to run 2 1080 GPUs (with no CPU folding) than a CPU running at 3500.00MHz?
-
- Site Admin
- Posts: 7937
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Project 10496 (158,16,66)
Ignore the clock speed of the CPU; it is a relatively meaningless measure of performance. Clock speed is only useful for comparing processors from the same family. As a design that dates back nearly 5 years, the FX-8320 is probably fast enough in most cases, but its ability to transfer data between the CPU and the GPU card in a PCIe slot also depends on the chipset that connects them.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
-
- Posts: 230
- Joined: Mon Dec 12, 2016 4:06 am
Re: Project 10496 (158,16,66)
Joe_H wrote: Ignore the clock speed of the CPU, it is a relatively meaningless measure of performance. The clock speed is only useful comparing processors from the same family. As a design that dates back nearly 5 years, the FX-8320 is probably fast enough in most cases, but its ability to transfer data between the CPU and the GPU card in a PCIe slot is also going to be dependent on the chipset that connects them.
M5A99FX PRO R2.0
"Chipset - AMD 990FX/SB950
System Bus - Up to 5.2 GT/s HyperTransport™ 3.0 "
Still not sure how that could/would have a massive effect on one given RCG and not another RCG in the same project.
-
- Posts: 17
- Joined: Sat Oct 08, 2011 8:33 am
- Hardware configuration: 1: AMD FX 8150, 8 GB 1333MHZ, 256 GB Samsung evo 850, Radeon 280X
2: Intel i3 4170, 8GB ram, 120GB SSDNow V300, 2X KFA2 gtx 1080
3: AMD Phenom II x6 1100T, 8GB 1600MHZ, Radeon 280X
4: AMD FX 4300, 8GB 1600MHz, Crucial C300 128GB, 2X Radeon 290X
- Location: Norway
Re: Project 10496 (158,16,66)
The only thing I notice with this project and 10494/10492 is that they have very large atom counts (277543/189218).
Maybe the issue is connected to this?
Re: Project 10496 (158,16,66)
It could be dependent on the atom count ... in fact, that's pretty likely. It still depends a lot on how NVidia wrote their OpenCL driver. A certain amount of the OpenCL work is done on the CPU, preparing data to be transferred to the GPU. Add the time for the chipset to transfer the data to the GPU and it produces less than ideal performance. As I've said several times, PG is aware of the problem and working toward an improvement but they're still dependent on whatever drivers NVidia distributes.
-
- Posts: 69
- Joined: Sun Feb 28, 2016 10:06 pm
Re: Project 10496 (158,16,66)
Any update on this? This project seems to have spread out. I currently have all nine of my GPUs running it at the same time. From 980s to 1080 Tis, all see the same impact. Ubuntu 14.04/16.04. Nvidia 370.28 to 281.22. So a mix.
Dedicated to my grandparents who have passed away from Alzheimer's
Dedicated folding rig on Linux Mint 19.1:
2 - GTX 980 OC +200
1 - GTX 980 Ti OC +20
4 - GTX 1070 FE OC +200
3 - GTX 1080 OC +140
1 - GTX 1080Ti OC +120
Re: Project 10496 (158,16,66)
When there are many projects that are designed to work with your hardware, you'll get a variety of assignments. If some of those projects happen to be off-line, the frequency of assignments for the remaining projects will increase. If it happens that only a few (or even only one) project is available at the time, it's conceivable that all your systems will be running that project(s).
Rest assured that every WU is unique. FAH doesn't reassign the same WU multiple times -- except after somebody fails to return results by the deadline.
Re: Project 10496 (158,16,66)
Unfortunately no, we have not seen the end of this project.
GTX 1080 ti also runs low PPD on this, commonly less than 800k PPD (vs 1 million PLUS for everything else my 1080 ti cards have seen to date).
I'm starting to wonder if this project has memory latency issues, perhaps due to its size, given it also doesn't seem to run well on the GTX 1080.
Re: Project 10496 (158,16,66)
QuintLeo wrote: I'm starting to wonder if this project has memory latency issues, perhaps due to its size, given it also doesn't seem to run well on the GTX 1080.
It's the same for me: I've been folding only project 10496 on my GTX 1080 Ti for the last two days. Anyway, this is my personal theory: what if users block this project to fold more "profitable" WUs? I read something similar some time ago in another thread: they just blocked incoming data from a particular server that was offline for several days.
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
- Location: Jersey, Channel islands
Re: Project 10496 (158,16,66)
Cherry-picking of WUs is frowned upon by PG; people have had their points zeroed in the past for doing it.
-
- Posts: 260
- Joined: Tue Dec 04, 2007 5:09 am
- Hardware configuration: GPU slots on home-built, purpose-built PCs.
- Location: Eagle River, Alaska
Re: Project 10496 (158,16,66)
QuintLeo wrote: GTX 1080 ti also runs low PPD on this, commonly less than 800k PPD (vs 1 million PLUS for everything else my 1080 ti cards have seen to date).
Is there a possibility your GPU is throttling, due to heat buildup? I have three 1080Tis, all of which typically process a 10496 work unit at about 1M PPD. I have overclocked the GPUs, but only at a minimal 100MHz core boost. They stay relatively cool, typically at about 65 to 70C core temp.
Keep in mind also that a 1080Ti working a 10496 WU pumps a lot of data through the PCIe bus. If there are multiple video cards Folding on the same motherboard, it can very easily saturate the bus.
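For a rough sense of how much headroom a slot has, here is a minimal Python sketch that computes the theoretical one-direction bandwidth of a PCIe link from the published per-lane signalling rates and line encodings. The function name `link_bandwidth_gb_s` is made up for illustration, and the thread does not say how much traffic a 10496 WU actually generates, so this only gives the ceiling for each slot configuration, not a measurement. Real-world throughput is lower because of protocol overhead.

```python
# Theoretical per-direction PCIe bandwidth, derived from the per-lane
# signalling rate and line encoding of each generation.
PCIE_GENERATIONS = {
    # generation: (transfer rate in GT/s per lane, encoding efficiency)
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
}


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Return the theoretical one-direction bandwidth of a link in GB/s."""
    if generation not in PCIE_GENERATIONS:
        raise ValueError(f"unsupported PCIe generation: {generation}")
    if lanes not in (1, 2, 4, 8, 16):
        raise ValueError(f"unsupported lane count: {lanes}")
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer moves one bit per lane; divide by 8 to get bytes.
    return rate_gt_s * efficiency * lanes / 8


if __name__ == "__main__":
    # Slot configurations that commonly appear when several cards share a board.
    for gen, lanes in [(2, 16), (2, 8), (3, 16), (3, 8), (3, 4)]:
        print(f"PCIe {gen}.0 x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Halving the lane count halves the ceiling, which is why a second or third card dropping a slot to x8 or x4 matters more for a data-heavy project than for a small one.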
Re: Project 10496 (158,16,66)
Unfortunately not - and this project is very bad on 1080 (same to a hair LESS PPD than a 1070) and VERY bad on 1080 ti (gives about 10% more PPD than a 1070). 10494 is very similar.
Not nearly as bad on the 1080ti as 9415 and 9414 though, where it gets A LOT WORSE PPD than the 1070 on a 1080ti (literally about 60% based on the ones I've accumulated in HFM so far).
The performance is SO CRAZY BAD on my 1080ti cards that I'm seriously considering blocking the workserver on that machine - it's ridiculous to WASTE top-end folding cards on a work unit that performs so much BETTER on MUCH WORSE hardware.
Re: Project 10496 (158,16,66)
I think it's fair to assume that the GPU is either waiting for data to process or it's computing something. In other words, it's either waiting on the PCIe bus or the shaders. If the shaders are 80% busy, that means that 20% of the time it's waiting on the PCIe bus to give it data to work on -- and the PCIe bus is either moving data or it's waiting on the CPU to prepare that data. (In fact, those numbers can add up to more than 100% because it's possible to transfer data concurrently with computing data.)
Given your ability to measure the %Busy for the shaders, the %Busy for the bus, and the %Busy for the FAHCore, give us your estimate of what's going on and why it's doing whatever it's doing.
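The reasoning above can be turned into a back-of-the-envelope estimate. The following Python sketch takes a measured shader %Busy and bus %Busy and splits the GPU's time into computing, waiting on the bus, and waiting on the CPU. The names `Breakdown`, `estimate_breakdown`, and `likely_bottleneck` are hypothetical, and the split rests on a simplifying assumption: all bus activity is charged against shader idle time first, so the bus share is an upper bound whenever transfers actually overlap with computation.

```python
from dataclasses import dataclass


@dataclass
class Breakdown:
    computing: float        # percent of time the shaders are busy
    waiting_on_bus: float   # percent of time attributed to PCIe transfers
    waiting_on_cpu: float   # remaining idle time, attributed to CPU-side prep


def estimate_breakdown(shader_busy_pct: float, bus_busy_pct: float) -> Breakdown:
    """Split GPU time into compute, bus wait, and CPU wait (all in percent)."""
    for name, value in (("shader_busy_pct", shader_busy_pct),
                        ("bus_busy_pct", bus_busy_pct)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    idle = 100.0 - shader_busy_pct
    # Assumption: bus transfers are charged against idle time first, so the
    # bus can never be blamed for more than the time the shaders sat idle.
    waiting_on_bus = min(idle, bus_busy_pct)
    # Whatever idle time the bus does not explain is attributed to the CPU.
    waiting_on_cpu = idle - waiting_on_bus
    return Breakdown(shader_busy_pct, waiting_on_bus, waiting_on_cpu)


def likely_bottleneck(breakdown: Breakdown) -> str:
    """Name the component that accounts for the largest share of time."""
    shares = {
        "shaders": breakdown.computing,
        "PCIe bus": breakdown.waiting_on_bus,
        "CPU": breakdown.waiting_on_cpu,
    }
    return max(shares, key=shares.get)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from this thread.
    result = estimate_breakdown(shader_busy_pct=80.0, bus_busy_pct=5.0)
    print(result, "->", likely_bottleneck(result))
```

If the shaders come out as the largest share, the card is doing what it should and the low PPD is more likely a property of the project itself. A large CPU share points back at the driver and CPU-side preparation discussed earlier in the thread.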