Folding Forum

Posted: **Tue Jan 28, 2020 12:12 pm**

And the "EVGA RTX 2060 KO" performance speedup only shows in compute workloads like CUDA or OpenCL, which FAH uses. It does not show up in 3D games, so it won't sell out because of gamers. It will be a good choice for FAH users.

Posted: **Tue Jan 28, 2020 4:54 pm**

Does anybody know if there's an OpenCL program somewhere that can report the actual hardware characteristics of a GPU? Utilities like GPU-Z simply look up the information in a table, so you always get the same information that's advertised (or posted on Wikipedia).

The official Ti suffix means there's an actual change to the hardware. I'm not sure if the KO suffix means the same thing since it's not a change to the NVidia designation.

Posted: **Wed Jan 29, 2020 3:57 am**

bruce wrote: Does anybody know if there's an OpenCL program somewhere that can report the actual hardware characteristics of a GPU?

I found that the Nvidia SDK for Linux has a sample program for querying CUDA device information. I've included the output for my EVGA 2060 KO (non-Ultra) from both the deviceQuery app and "nvidia-smi --query". Both report 1920 CUDA cores, as expected.
It's only been running for ~15 minutes, but FahCore_21 on Project 14320 is showing an estimated TPF of 2 minutes 50 seconds and an estimated PPD of 997345.

Code: Select all

[@localhost bin]$ lspci | grep -i 2060
4a:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2060] (rev a1)

Code: Select all

Device 0: "GeForce RTX 2060"
  CUDA Driver Version / Runtime Version          10.2 / 10.2
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 5935 MBytes (6222839808 bytes)
  (30) Multiprocessors, ( 64) CUDA Cores/MP:     1920 CUDA Cores
  GPU Max Clock rate:                            1680 MHz (1.68 GHz)
  Memory Clock rate:                             7001 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 3145728 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 74 / 0
  Compute Mode:

Code: Select all

GPU 00000000:4A:00.0
    Product Name                    : GeForce RTX 2060
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Minor Number                    : 2
    VBIOS Version                   : 90.04.63.40.58
    MultiGPU Board                  : No
    Board ID                        : 0x4a00
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.02.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization Mode         : None
        Host VGPU Mode              : N/A
    IBMNPU
        Relaxed Ordering Mode       : N/A
    PCI
        Bus                         : 0x4A
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1E8910DE
        Bus Id                      : 00000000:4A:00.0
        Sub System Id               : 0x20663842
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 3
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays Since Reset         : 0
        Replay Number Rollovers     : 0
        Tx Throughput               : 8000 KB/s
        Rx Throughput               : 51000 KB/s
    Fan Speed                       : 89 %
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 5934 MiB
        Used                        : 107 MiB
        Free                        : 5827 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 8 MiB
        Free                        : 248 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 99 %
        Memory                      : 10 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    FBC Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            SRAM Correctable        : N/A
            SRAM Uncorrectable      : N/A
            DRAM Correctable        : N/A
            DRAM Uncorrectable      : N/A
        Aggregate
            SRAM Correctable        : N/A
            SRAM Uncorrectable      : N/A
            DRAM Correctable        : N/A
            DRAM Uncorrectable      : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending Page Blacklist      : N/A
    Temperature
        GPU Current Temp            : 82 C
        GPU Shutdown Temp           : 93 C
        GPU Slowdown Temp           : 90 C
        GPU Max Operating Temp      : 88 C
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 153.49 W
        Power Limit                 : 170.00 W
        Default Power Limit         : 170.00 W
        Enforced Power Limit        : 170.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 170.00 W
    Clocks
        Graphics                    : 1845 MHz
        SM                          : 1845 MHz
        Memory                      : 6801 MHz
        Video                       : 1710 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 2100 MHz
        SM                          : 2100 MHz
        Memory                      : 7001 MHz
        Video                       : 1950 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 4263
            Type                    : C
            Name                    : /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_21.fah/FahCore_21
            Used GPU Memory         : 95 MiB
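For anyone who would rather log a handful of these fields than read the full dump, here is a minimal Python sketch. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV mode; the helper names `parse_gpu_csv` and `query_gpus` and the choice of fields are purely illustrative. The sample line reuses values from the output above.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi (illustrative selection).
FIELDS = ["name", "temperature.gpu", "power.draw", "clocks.sm", "utilization.gpu"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, row)) for row in rows if row]


def query_gpus():
    """Run nvidia-smi and return the parsed rows (needs an NVIDIA driver installed)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Offline example using the numbers from the dump above.
sample = "GeForce RTX 2060, 82, 153.49, 1845, 99\n"
gpus = parse_gpu_csv(sample)
print(gpus[0]["name"], gpus[0]["clocks.sm"], "MHz")
```

On a machine with the driver installed, calling `query_gpus()` in a loop would give a simple way to watch temperature and power draw while a WU runs.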

Posted: **Wed Jan 29, 2020 2:57 pm**

That PPD of 997k on 14320 is roughly in line with my normal 2060, so no increase there.

Posted: **Wed Jan 29, 2020 3:55 pm**

Nathan_P wrote: That PPD of 997k on 14320 is roughly in line with my normal 2060 so no increase there

Useful data, @Nathan_P.

I've seen massive variations in 11738 WU with anything from 1.19m to 700K on identical 1070s. That figure earlier of nearly 1.4m might not be overly representative of the 2060 KO.

Posted: **Wed Jan 29, 2020 7:17 pm**

HaloJones wrote:
Nathan_P wrote: That PPD of 997k on 14320 is roughly in line with my normal 2060 so no increase there
Useful data, @Nathan_P.

I've seen massive variations in 11738 WU with anything from 1.19m to 700K on identical 1070s. That figure earlier of nearly 1.4m might not be overly representative of the 2060 KO.

I think you will find it is representative; my KO Ultra gets those frame rates (and around that PPD) on every 11737 and 11738 WU.

The 2060 KO is currently running an 11738 at 2 min 03 secs, PPD 1354549. I have not seen it work an 11739 yet (it's probably in my logs), but since my 1080 Ti is currently at 1 min 03 secs, PPD 1402197, I anticipate the 2060 KO Ultra doing the same. Also, to be accurate, all my numbers indicate an average of 1,350,000 to 1,360,000, not 1.4 million.

My question is: why doesn't it perform better on the other WUs?

This is a dedicated folding rig with the 1080 TI handling the screen and any background usage...The 1080 TI PPD goes up and down but the KO is constant. If the KO was running the monitor and handling other tasks, I would expect PPD to be lower.
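The TPF and PPD figures quoted in this thread can be tied together with a little arithmetic. The sketch below is a rough illustration only: it assumes a WU is 100 frames (one frame per 1% of progress), and the function names are made up for this example. It works backwards from a reported PPD and TPF to the credit per WU those numbers imply.

```python
SECONDS_PER_DAY = 86400
FRAMES_PER_WU = 100  # assumption: one frame = 1% of a work unit


def wus_per_day(tpf_seconds):
    """Work units finished per day at a steady time-per-frame."""
    return SECONDS_PER_DAY / (tpf_seconds * FRAMES_PER_WU)


def implied_credit_per_wu(ppd, tpf_seconds):
    """Credit per WU that a reported PPD/TPF pair implies."""
    return ppd / wus_per_day(tpf_seconds)


# Figures quoted above: 2060 KO at 2 min 03 s, 1080 Ti at 1 min 03 s.
ko_credit = implied_credit_per_wu(1354549, 2 * 60 + 3)
ti_credit = implied_credit_per_wu(1402197, 1 * 60 + 3)
print(round(ko_credit), round(ti_credit))
```

The two implied credits come out quite different, which would suggest the two cards were on different projects at the time, although FAH's quick-return bonus also changes the credit with completion time, so treat this as a sanity check rather than a benchmark.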

Posted: **Thu Jan 30, 2020 4:35 pm**

The crazy numbers are due to the Core22 WU, not the chip. My 2060 does 1.4M PPD on 11738 as well, and around 1.2M on 11739. Core21 WUs generally sit around 1M.

Posted: **Thu Jan 30, 2020 10:27 pm**

squads wrote: The crazy numbers are due to the Core22 WU, not the chip. My 2060 does 1.4M PPD on 11738 as well, and does around 1.2M on 11739. Core21 WU generally sit around 1M

Now that is interesting. I have not seen that on my two vanilla 2060s. Is that highly OC'ed?

Posted: **Fri Jan 31, 2020 7:33 pm**

Hardware-wise, a GPU with N shaders (or "CUDA Cores") running at M GHz will produce a certain number of GFLOPS when producing 100% of its capabilities. The concept of "hidden cores" isn't valid.

It's certainly true that small proteins can't maintain 100% production if N is a large number whereas large proteins get very, very close.

FACT: Your GPU can't produce more than 100%, no matter what changes in the FAHCore.

This whole discussion can be validated or invalidated by testing a somewhat smaller protein on a smaller GPU. That should produce close to 100% throughput on that GPU and should establish a reasonable estimate of what PPD that GPU can do. Larger proteins would then also produce 100% throughput and (if the proteins are properly benchmarked) about the same PPD.
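As a rough illustration of the "N shaders at M GHz" point, the usual back-of-the-envelope estimate for peak single-precision throughput is cores × clock × 2, where the factor of 2 assumes one fused multiply-add per core per clock. The sketch below plugs in the 1920 cores and the 1680 MHz and 1845 MHz clocks from the deviceQuery and nvidia-smi output earlier in the thread; the function name is made up for this example, and real FAH throughput will sit below this ceiling.

```python
def peak_fp32_gflops(cuda_cores, clock_ghz, flops_per_core_per_clock=2):
    """Theoretical peak FP32 GFLOPS, assuming one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * flops_per_core_per_clock * clock_ghz


# 1920 cores at the 1.68 GHz max clock reported by deviceQuery.
at_rated_clock = peak_fp32_gflops(1920, 1.68)
# Same card at the 1845 MHz SM clock nvidia-smi showed while folding.
at_observed_clock = peak_fp32_gflops(1920, 1.845)

print(f"{at_rated_clock:.0f} GFLOPS rated, {at_observed_clock:.0f} GFLOPS observed clock")
```

Since the ceiling depends only on the active core count and the clock, two cards with the same N and M have the same ceiling no matter which die they were cut from.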

Posted: **Fri Jan 31, 2020 9:49 pm**

This video benchmarked the RTX 2060 KO and found improvements in CUDA compute workloads.
https://www.youtube.com/watch?v=mUFRBnJdx3Y

Posted: **Sat Feb 01, 2020 12:22 am**

At this point, FAH uses OpenCL. As far as I know, OpenCL is built on top of CUDA, so if CUDA's floating-point performance is made more efficient, some of that improvement might or might not be passed on as an improvement to OpenCL. Your benchmarking is probably a good tool to determine that. It's reasonable to assume that comparing the OpenCL performance of the KO to that of the non-KO wouldn't show as large a difference as comparing KO CUDA to non-KO CUDA, because FAHCores have not run native CUDA recently.

At some time in the future, FAHCore_22 might (or might not) be recompiled to use native CUDA rather than OpenCL. Such a recompile will be beneficial for all nVidia GPUs but AMD GPUs will still be using the OpenCL version.

Posted: **Sat Feb 01, 2020 1:21 am**

bruce wrote: ... I'm not sure if the KO suffix means the same thing since it's not a change to the NVidia designation.

KO is just an old suffix EVGA has resurrected for this line of cards like the ‘XC’ they also use.

It has the somewhat anaemic cooler from the 1660 Super which is inferior to that found on the 2060 and 2060 Super XC and Ultra models they sell.

I can also confirm that the PPD seen on this card is in line with my 2060 XC Ultra.

Posted: **Sat Feb 01, 2020 10:31 am**

gordonbb wrote: I can also confirm that the PPD seen on this card is in line with my 2060 XC Ultra.

Too bad. So no hot deal GPU for FAH.

Posted: **Sat Feb 01, 2020 5:09 pm**

chex313 wrote:
squads wrote: The crazy numbers are due to the Core22 WU, not the chip. My 2060 does 1.4M PPD on 11738 as well, and does around 1.2M on 11739. Core21 WU generally sit around 1M
Now that is interesting. I have not seen that on my two vanilla 2060s. Is that highly OC'ed?

No, it's actually power limited to 140W (stock 160W). I also use Linux, which generally gives a bit of a bump in PPD, so those two factors probably balance out.

Posted: **Sat Feb 01, 2020 6:34 pm**

foldy wrote: So the hidden shaders of EVGA RTX 2060 KO really help FAH compute performance.

No they don't, they're disabled. But the entire chip architecture (memory controllers and compute pipeline design) does.

GeForce RTX 2060
