Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 3:29 pm
by benas_s
Nvidia driver 461.09; GPU: GTX 1660 Ti; FAH client 7.6.13.

The GPU seems to fold for a while (~40 minutes), gets loud and reaches 81 °C; then the temperature drops to 44 °C and it doesn't seem to fold for another 40 minutes. After some time the GPU starts folding again and the cycle repeats. It doesn't matter which WU is being folded.

Is my GPU dying, or do I just need to replace the thermal paste?

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 3:37 pm
by gunnarre
Are you folding on Windows or Linux? Can you please post the first 200 or so lines of your log, as well as the sections of the log where the GPU stops? (How-to here: viewtopic.php?p=327412)

What is the output of the following commands? (They should be available on both Windows and Linux.)

Code:

nvidia-smi -q
and

Code:

nvidia-smi
If on Windows, can you run GPU-Z and look at the sensors data?
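
To catch the moment the GPU stops, it may help to log a few sensor values at intervals instead of taking a single snapshot. Below is a minimal Python sketch, assuming `nvidia-smi` is on the PATH and accepts the `--query-gpu` fields listed. The field selection and the function names (`build_command`, `parse_sample`, `read_sample`, `log_samples`) are illustrative and not from this thread.

```python
import subprocess
import time

# Query fields passed to nvidia-smi; this particular selection is an assumption.
FIELDS = [
    "timestamp",
    "temperature.gpu",
    "utilization.gpu",
    "power.draw",
    "fan.speed",
    "pstate",
]


def build_command():
    """Return the nvidia-smi command line that prints one CSV row per GPU."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]


def parse_sample(line):
    """Turn one CSV row into a dict keyed by field name (values stay strings)."""
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of columns: %r" % line)
    return dict(zip(FIELDS, values))


def read_sample():
    """Run nvidia-smi once and parse the row for the first GPU."""
    result = subprocess.run(
        build_command(), capture_output=True, text=True, check=True
    )
    return parse_sample(result.stdout.strip().splitlines()[0])


def log_samples(count, interval_s=5.0):
    """Print `count` samples, waiting `interval_s` seconds between them."""
    for _ in range(count):
        print(read_sample())
        time.sleep(interval_s)
```

Calling `log_samples(720, 5.0)` would cover roughly an hour, which should span one full fold/idle cycle of about 40 minutes each way if started near a transition.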

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 3:48 pm
by benas_s
I'm folding on Windows.
As I was looking for the logs, folding started again, so I will add the

Code:

nvidia-smi
output later.

Code:

C:\Users\PC>nvidia-smi -q

==============NVSMI LOG==============

Timestamp                                 : Wed Mar 17 17:40:34 2021
Driver Version                            : 461.09
CUDA Version                              : 11.2

Attached GPUs                             : 1
GPU 00000000:09:00.0
    Product Name                          : GeForce GTX 1660 Ti
    Product Brand                         : GeForce
    Display Mode                          : Enabled
    Display Active                        : Enabled
    Persistence Mode                      : N/A
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : WDDM
        Pending                           : WDDM
    Serial Number                         : N/A
    GPU UUID                              : GPU-8e3f54f2-71c5-2a02-58dd-284ae44e15bc
    Minor Number                          : N/A
    VBIOS Version                         : 90.16.29.00.94
    MultiGPU Board                        : No
    Board ID                              : 0x900
    GPU Part Number                       : N/A
    Inforom Version
        Image Version                     : G001.0000.02.04
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x09
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x218210DE
        Bus Id                            : 00000000:09:00.0
        Sub System Id                     : 0x37501462
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 3
            Link Width
                Max                       : 16x
                Current                   : 16x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 1316000 KB/s
        Rx Throughput                     : 15707000 KB/s
    Fan Speed                             : 44 %
    Performance State                     : P2
    Clocks Throttle Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 6144 MiB
        Used                              : 793 MiB
        Free                              : 5351 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 229 MiB
        Free                              : 27 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 1 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 42 C
        GPU Shutdown Temp                 : 95 C
        GPU Slowdown Temp                 : 92 C
        GPU Max Operating Temp            : 90 C
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 30.18 W
        Power Limit                       : 120.00 W
        Default Power Limit               : 120.00 W
        Enforced Power Limit              : 120.00 W
        Min Power Limit                   : 70.00 W
        Max Power Limit                   : 120.00 W
    Clocks
        Graphics                          : 1500 MHz
        SM                                : 1500 MHz
        Memory                            : 5750 MHz
        Video                             : 1395 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2160 MHz
        SM                                : 2160 MHz
        Memory                            : 6001 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 1296
            Type                          : C+G
            Name                          : Insufficient Permissions
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 2776
            Type                          : C+G
            Name                          : C:\Program Files\WindowsApps\Microsoft.Windows.Photos_2020.20120.4004.0_x64__8wekyb3d8bbwe\Microsoft.Photos.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 5256
            Type                          : C
            Name                          : C:\Users\PC\AppData\Roaming\FAHClient\cores\cores.foldingathome.org\win\64bit\22-0.0.13\Core_22.fah\FahCore_22.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 6364
            Type                          : C+G
            Name                          : C:\Program Files\WindowsApps\Microsoft.WindowsCalculator_10.2101.10.0_x64__8wekyb3d8bbwe\Calculator.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 6568
            Type                          : C+G
            Name                          : C:\Windows\explorer.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 7408
            Type                          : C+G
            Name                          : C:\Program Files\Mozilla Firefox\firefox.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 7660
            Type                          : C+G
            Name                          : C:\Windows\SystemApps\Microsoft.Windows.StartMenuExperienceHost_cw5n1h2txyewy\StartMenuExperienceHost.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 7940
            Type                          : C+G
            Name                          : C:\Windows\SystemApps\Microsoft.Windows.Search_cw5n1h2txyewy\SearchApp.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 8468
            Type                          : C+G
            Name                          : C:\Program Files\WindowsApps\Microsoft.YourPhone_1.21021.117.0_x64__8wekyb3d8bbwe\YourPhone.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 8740
            Type                          : C+G
            Name                          : C:\Windows\SystemApps\Microsoft.LockApp_cw5n1h2txyewy\LockApp.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 9964
            Type                          : C+G
            Name                          : C:\Windows\SystemApps\ShellExperienceHost_cw5n1h2txyewy\ShellExperienceHost.exe
            Used GPU Memory               : Not available in WDDM driver model
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 9996
            Type                          : C+G
            Name                          : C:\Windows\SystemApps\MicrosoftWindows.Client.CBS_cw5n1h2txyewy\InputApp\TextInputHost.exe
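
Note that this snapshot was taken at 0 % GPU utilization and 42 C, so every entry under "Clocks Throttle Reasons" reads "Not Active"; a capture taken while the card is folding near 81 C would be more telling. A small Python sketch that pulls the active throttle reasons out of saved `nvidia-smi -q` text follows. It assumes the indented layout shown above, and the function name `active_throttle_reasons` is hypothetical.

```python
def active_throttle_reasons(report):
    """Return the names under 'Clocks Throttle Reasons' whose value is 'Active'.

    `report` is the text printed by `nvidia-smi -q`. The section is assumed to
    end at the first line indented no deeper than its header.
    """
    active = []
    section_indent = None
    for raw in report.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        text = raw.strip()
        if section_indent is None:
            # Still looking for the section header.
            if text == "Clocks Throttle Reasons":
                section_indent = indent
            continue
        if indent <= section_indent:
            break  # Reached the next section (e.g. "FB Memory Usage").
        name, sep, value = text.partition(":")
        if sep and value.strip() == "Active":
            active.append(name.strip())
    return active
```

An empty list means no throttle reason was reported as active at the instant of the snapshot; it says nothing about what happened a minute earlier.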

FAH log

Code:

16:12:41:WARNING:WU00:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
16:12:41:WU00:FS00:Connecting to assign4.foldingathome.org:80
16:12:42:WARNING:WU00:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
16:12:42:ERROR:WU00:FS00:Exception: Could not get an assignment
16:15:03:WU01:FS01:0x22:Completed 637500 out of 750000 steps (85%)
16:17:25:WU01:FS01:0x22:Completed 645000 out of 750000 steps (86%)
16:17:27:WU01:FS01:0x22:Checkpoint completed at step 645000
16:22:18:FS00:Paused
16:22:18:FS01:Paused
16:22:18:FS01:Shutting core down
16:22:18:WU01:FS01:0x22:WARNING:Console control signal 1 on PID 12920
16:22:18:WU01:FS01:0x22:Exiting, please wait. . .
16:22:28:Removing old file 'configs/config-20210310-145902.xml'
16:22:28:Saving configuration to config.xml
16:22:28:<config>
16:22:28:  <!-- Slot Control -->
16:22:28:  <power v='FULL'/>
16:22:28:
16:22:28:  <!-- User Information -->
16:22:28:  <passkey v='*****'/>
16:22:28:  <team v='36816'/>
16:22:28:  <user v='benas_s'/>
16:22:28:
16:22:28:  <!-- Folding Slots -->
16:22:28:  <slot id='0' type='CPU'>
16:22:28:    <paused v='true'/>
16:22:28:  </slot>
16:22:28:  <slot id='1' type='GPU'>
16:22:28:    <paused v='true'/>
16:22:28:  </slot>
16:22:28:</config>
16:23:19:WARNING:FS01:Killing WU01
16:23:19:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
18:11:37:FS00:Unpaused
18:11:37:FS01:Unpaused
18:11:37:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:11:37:WU01:FS01:Starting
18:11:37:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\PC\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 4020 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
18:11:37:WU01:FS01:Started FahCore on PID 15832
18:11:37:WU01:FS01:Core PID:10436
18:11:37:WU01:FS01:FahCore 0x22 started
18:11:37:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
18:11:37:WU00:FS00:Connecting to assign2.foldingathome.org:80
18:11:38:WU01:FS01:0x22:*********************** Log Started 2021-03-15T18:11:37Z ***********************
18:11:38:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
18:11:38:WU01:FS01:0x22:       Core: Core22
18:11:38:WU01:FS01:0x22:       Type: 0x22
18:11:38:WU01:FS01:0x22:    Version: 0.0.13
18:11:38:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:11:38:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
18:11:38:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
18:11:38:WU01:FS01:0x22:       Date: Sep 19 2020
18:11:38:WU01:FS01:0x22:       Time: 02:35:58
18:11:38:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
18:11:38:WU01:FS01:0x22:     Branch: core22-0.0.13
18:11:38:WU01:FS01:0x22:   Compiler: Visual C++ 2015
18:11:38:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
18:11:38:WU01:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
18:11:38:WU01:FS01:0x22:   Platform: win32 10
18:11:38:WU01:FS01:0x22:       Bits: 64
18:11:38:WU01:FS01:0x22:       Mode: Release
18:11:38:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
18:11:38:WU01:FS01:0x22:             <peastman@stanford.edu>
18:11:38:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 15832 -checkpoint 15
18:11:38:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
18:11:38:WU01:FS01:0x22:             0 -gpu 0
18:11:38:WU01:FS01:0x22:************************************ libFAH ************************************
18:11:38:WU01:FS01:0x22:       Date: Sep 7 2020
18:11:38:WU01:FS01:0x22:       Time: 19:09:56
18:11:38:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
18:11:38:WU01:FS01:0x22:     Branch: HEAD
18:11:38:WU01:FS01:0x22:   Compiler: Visual C++ 2015
18:11:38:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
18:11:38:WU01:FS01:0x22:   Platform: win32 10
18:11:38:WU01:FS01:0x22:       Bits: 64
18:11:38:WU01:FS01:0x22:       Mode: Release
18:11:38:WU01:FS01:0x22:************************************ CBang *************************************
18:11:38:WU01:FS01:0x22:       Date: Sep 7 2020
18:11:38:WU01:FS01:0x22:       Time: 19:08:30
18:11:38:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
18:11:38:WU01:FS01:0x22:     Branch: HEAD
18:11:38:WU01:FS01:0x22:   Compiler: Visual C++ 2015
18:11:38:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
18:11:38:WU01:FS01:0x22:   Platform: win32 10
18:11:38:WU01:FS01:0x22:       Bits: 64
18:11:38:WU01:FS01:0x22:       Mode: Release
18:11:38:WU01:FS01:0x22:************************************ System ************************************
18:11:38:WU01:FS01:0x22:        CPU: AMD Ryzen 7 3700X 8-Core Processor
18:11:38:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
18:11:38:WU01:FS01:0x22:       CPUs: 16
18:11:38:WU01:FS01:0x22:     Memory: 31.92GiB
18:11:38:WU01:FS01:0x22:Free Memory: 26.45GiB
18:11:38:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
18:11:38:WU01:FS01:0x22: OS Version: 6.2
18:11:38:WU01:FS01:0x22:Has Battery: false
18:11:38:WU01:FS01:0x22: On Battery: false
18:11:38:WU01:FS01:0x22: UTC Offset: 2
18:11:38:WU01:FS01:0x22:        PID: 10436
18:11:38:WU01:FS01:0x22:        CWD: C:\Users\PC\AppData\Roaming\FAHClient\work
18:11:38:WU01:FS01:0x22:************************************ OpenMM ************************************
18:11:38:WU01:FS01:0x22:   Revision: 189320d0
18:11:38:WU01:FS01:0x22:********************************************************************************
18:11:38:WU01:FS01:0x22:Project: 17329 (Run 0, Clone 765, Gen 64)
18:11:38:WU01:FS01:0x22:Unit: 0x00000000000000000000000000000000
18:11:38:WU01:FS01:0x22:Digital signatures verified
18:11:38:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
18:11:38:WU01:FS01:0x22:Version 0.0.13
18:11:38:WU01:FS01:0x22:  Checkpoint write interval: 15000 steps (2%) [50 total]
18:11:38:WU01:FS01:0x22:  JSON viewer frame write interval: 7500 steps (1%) [100 total]
18:11:38:WU01:FS01:0x22:  XTC frame write interval: 250000 steps (33%) [3 total]
18:11:38:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
18:11:38:WU01:FS01:0x22:There are 4 platforms available.
18:11:38:WU01:FS01:0x22:Platform 0: Reference
18:11:38:WU01:FS01:0x22:Platform 1: CPU
18:11:38:WU01:FS01:0x22:Platform 2: OpenCL
18:11:38:WU01:FS01:0x22:  opencl-device 0 specified
18:11:38:WU01:FS01:0x22:Platform 3: CUDA
18:11:38:WU01:FS01:0x22:  cuda-device 0 specified
18:11:45:WARNING:WU00:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
18:11:45:WU00:FS00:Connecting to assign3.foldingathome.org:80
18:11:46:WARNING:WU00:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
18:11:46:WU00:FS00:Connecting to assign4.foldingathome.org:80
18:11:47:WU00:FS00:Assigned to work server 206.223.170.146
18:11:47:WU00:FS00:Requesting new work unit for slot 00: READY cpu:15 from 206.223.170.146
18:11:47:WU00:FS00:Connecting to 206.223.170.146:8080
18:11:48:ERROR:WU00:FS00:Exception: Server did not assign work unit
18:11:53:WU01:FS01:0x22:Attempting to create CUDA context:
18:11:53:WU01:FS01:0x22:  Configuring platform CUDA
18:11:58:WU01:FS01:0x22:  Using CUDA and gpu 0
18:11:58:WU01:FS01:0x22:Completed 645000 out of 750000 steps (86%)
18:12:16:Removing old file 'configs/config-20210310-161214.xml'
18:12:16:Saving configuration to config.xml
18:12:16:<config>
18:12:16:  <!-- Slot Control -->
18:12:16:  <power v='FULL'/>
18:12:16:
18:12:16:  <!-- User Information -->
18:12:16:  <passkey v='*****'/>
18:12:16:  <team v='36816'/>
18:12:16:  <user v='benas_s'/>
18:12:16:
18:12:16:  <!-- Folding Slots -->
18:12:16:  <slot id='0' type='CPU'/>
18:12:16:  <slot id='1' type='GPU'/>
18:12:16:</config>
18:13:14:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:13:15:WU00:FS00:Assigned to work server 129.32.209.203
18:13:25:WU00:FS00:Requesting new work unit for slot 00: READY cpu:15 from 129.32.209.203
18:13:25:WU00:FS00:Connecting to 129.32.209.203:8080
18:13:25:ERROR:WU00:FS00:Exception: Server did not assign work unit
18:14:31:WU01:FS01:0x22:Completed 652500 out of 750000 steps (87%)
18:15:51:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:15:52:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
18:15:52:WU00:FS00:Connecting to assign2.foldingathome.org:80
18:15:53:WARNING:WU00:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
18:15:53:WU00:FS00:Connecting to assign3.foldingathome.org:80
18:15:54:WARNING:WU00:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
18:15:54:WU00:FS00:Connecting to assign4.foldingathome.org:80
18:15:55:WARNING:WU00:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
18:15:55:ERROR:WU00:FS00:Exception: Could not get an assignment
18:17:02:WU01:FS01:0x22:Completed 660000 out of 750000 steps (88%)
18:17:04:WU01:FS01:0x22:Checkpoint completed at step 660000
18:19:58:WU01:FS01:0x22:Completed 667500 out of 750000 steps (89%)
18:20:06:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:20:06:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
18:20:06:WU00:FS00:Connecting to assign2.foldingathome.org:80
18:20:07:WARNING:WU00:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
18:20:07:WU00:FS00:Connecting to assign3.foldingathome.org:80
18:20:08:WU00:FS00:Assigned to work server 128.252.203.9
18:20:08:WU00:FS00:Requesting new work unit for slot 00: READY cpu:15 from 128.252.203.9
18:20:08:WU00:FS00:Connecting to 128.252.203.9:8080
18:20:08:ERROR:WU00:FS00:Exception: Server did not assign work unit
18:22:31:WU01:FS01:0x22:Completed 675000 out of 750000 steps (90%)
18:22:33:WU01:FS01:0x22:Checkpoint completed at step 675000
18:22:59:FS00:Finishing
18:22:59:FS01:Finishing
18:24:58:WU01:FS01:0x22:Completed 682500 out of 750000 steps (91%)
18:26:57:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:26:57:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
18:26:57:WU00:FS00:Connecting to assign2.foldingathome.org:80
18:26:58:WARNING:WU00:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
18:26:58:WU00:FS00:Connecting to assign3.foldingathome.org:80
18:26:58:WARNING:WU00:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
18:26:58:WU00:FS00:Connecting to assign4.foldingathome.org:80
18:26:59:WARNING:WU00:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
18:26:59:ERROR:WU00:FS00:Exception: Could not get an assignment
18:27:26:WU01:FS01:0x22:Completed 690000 out of 750000 steps (92%)
18:27:28:WU01:FS01:0x22:Checkpoint completed at step 690000
18:29:54:WU01:FS01:0x22:Completed 697500 out of 750000 steps (93%)
18:32:20:WU01:FS01:0x22:Completed 705000 out of 750000 steps (94%)
18:32:22:WU01:FS01:0x22:Checkpoint completed at step 705000
18:34:50:WU01:FS01:0x22:Completed 712500 out of 750000 steps (95%)
18:37:22:WU01:FS01:0x22:Completed 720000 out of 750000 steps (96%)
18:37:24:WU01:FS01:0x22:Checkpoint completed at step 720000
18:38:03:WU00:FS00:Connecting to assign1.foldingathome.org:80
18:38:03:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
18:38:03:WU00:FS00:Connecting to assign2.foldingathome.org:80
18:38:04:WARNING:WU00:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
18:38:04:WU00:FS00:Connecting to assign3.foldingathome.org:80
18:38:04:WARNING:WU00:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
18:38:04:WU00:FS00:Connecting to assign4.foldingathome.org:80
18:38:05:WARNING:WU00:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
18:38:05:ERROR:WU00:FS00:Exception: Could not get an assignment
18:40:02:WU01:FS01:0x22:Completed 727500 out of 750000 steps (97%)
18:42:43:WU01:FS01:0x22:Completed 735000 out of 750000 steps (98%)
18:42:45:WU01:FS01:0x22:Checkpoint completed at step 735000
18:45:26:WU01:FS01:0x22:Completed 742500 out of 750000 steps (99%)
18:48:08:WU01:FS01:0x22:Completed 750000 out of 750000 steps (100%)
18:48:08:WU01:FS01:0x22:Average performance: 267.492 ns/day
18:48:10:WU01:FS01:0x22:Checkpoint completed at step 750000
18:48:29:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
18:48:29:WU01:FS01:0x22:Saving result file checkpointIntegrator.xml.bz2
18:48:29:WU01:FS01:0x22:Saving result file checkpointState.xml.bz2
18:48:29:WU01:FS01:0x22:Saving result file positions.xtc
18:48:29:WU01:FS01:0x22:Saving result file science.log
18:48:29:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
18:48:30:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
18:48:30:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17329 run:0 clone:765 gen:64 core:0x22 unit:0x000002fd00000040000043b100000000
18:48:30:WU01:FS01:Uploading 29.23MiB to 140.163.4.200
18:48:30:WU01:FS01:Connecting to 140.163.4.200:8080
18:48:36:WU01:FS01:Upload 9.20%
18:48:42:WU01:FS01:Upload 19.67%
18:48:48:WU01:FS01:Upload 30.15%
18:48:54:WU01:FS01:Upload 40.21%
18:49:00:WU01:FS01:Upload 50.26%
18:49:06:WU01:FS01:Upload 60.52%
18:49:12:WU01:FS01:Upload 70.79%
18:49:18:WU01:FS01:Upload 81.05%
18:49:24:WU01:FS01:Upload 91.32%
18:49:29:WU01:FS01:Upload complete
18:49:29:WU01:FS01:Server responded WORK_ACK (400)
18:49:29:WU01:FS01:Final credit estimate, 159788.00 points
18:49:29:WU01:FS01:Cleaning up
18:49:42:FS00:Paused
18:49:42:FS01:Paused
18:49:53:Removing old file 'configs/config-20210310-180404.xml'
18:49:53:Saving configuration to config.xml
18:49:53:<config>
18:49:53:  <!-- Slot Control -->
18:49:53:  <power v='FULL'/>
18:49:53:
18:49:53:  <!-- User Information -->
18:49:53:  <passkey v='*****'/>
18:49:53:  <team v='36816'/>
18:49:53:  <user v='benas_s'/>
18:49:53:
18:49:53:  <!-- Folding Slots -->
18:49:53:  <slot id='0' type='CPU'>
18:49:53:    <paused v='true'/>
18:49:53:  </slot>
18:49:53:  <slot id='1' type='GPU'>
18:49:53:    <paused v='true'/>
18:49:53:  </slot>
18:49:53:</config>
20:08:01:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
GPU-Z sensors while folding:
Image
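
In the log excerpt above, slot FS01 is reported as "Paused" at 16:22:18 and "Unpaused" at 18:11:37, and again "Paused" at 18:49:42, so at least in this excerpt the idle stretches line up with the slot being paused, not with a core error. A rough Python sketch for listing such intervals from a saved log is given below. It assumes the `HH:MM:SS:FSnn:Paused` / `Unpaused` line format seen above, and the function name `paused_intervals` is hypothetical. Log timestamps carry no date, so a negative gap is treated as crossing midnight once.

```python
import re
from datetime import timedelta

# Matches client lines such as "16:22:18:FS01:Paused".
EVENT = re.compile(r"^(\d\d):(\d\d):(\d\d):FS(\d\d):(Paused|Unpaused)$")


def paused_intervals(lines, slot="01"):
    """Return (paused_at, unpaused_at, duration) tuples for one folding slot."""
    intervals = []
    paused_at = None
    for line in lines:
        match = EVENT.match(line.strip())
        if not match or match.group(4) != slot:
            continue
        stamp = timedelta(
            hours=int(match.group(1)),
            minutes=int(match.group(2)),
            seconds=int(match.group(3)),
        )
        if match.group(5) == "Paused":
            if paused_at is None:
                paused_at = stamp
        elif paused_at is not None:
            gap = stamp - paused_at
            if gap < timedelta(0):
                gap += timedelta(days=1)  # assume a single midnight rollover
            intervals.append((paused_at, stamp, gap))
            paused_at = None
    return intervals
```

Feeding it the lines of `log.txt` (for example `paused_intervals(open("log.txt").read().splitlines())`, where the file name is whatever the log was saved as) gives a quick list to compare against the times the fan goes quiet.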

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 3:57 pm
by gunnarre
The client should be upgraded to 7.6.21 due to a security issue. You might want to upgrade to the newest driver from Nvidia's website as well. What does the topmost part of the log look like if you restart the FAH client? It should contain lines like these:

Code:

*********************** Log Started 2021-03-17T15:56:10Z ***********************
(....)
15:56:10:            CPU: AMD Ryzen 5 3600 6-Core Processor
15:56:10:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
15:56:10:           CPUs: 12
15:56:10:         Memory: 15.61GiB
15:56:10:    Free Memory: 2.19GiB
15:56:10:        Threads: POSIX_THREADS
15:56:10:     OS Version: 5.4
15:56:10:    Has Battery: false
15:56:10:     On Battery: false
15:56:10:     UTC Offset: 1
15:56:10:            PID: 368002
15:56:10:            CWD: /var/lib/fahclient
15:56:10:             OS: Linux 5.4.0-67-generic x86_64
15:56:10:        OS Arch: AMD64
15:56:10:           GPUs: 1
15:56:10:          GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 TU116 [GeForce GTX 1660 SUPER]
15:56:10:  CUDA Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:7.5 Driver:11.2
15:56:10:OpenCL Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:1.2 Driver:460.39
15:56:10:***********************************************************************
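
The part of this header that matters most for a GPU slot is presumably the detection block: one `GPU n:` line per card plus matching `CUDA Device` and `OpenCL Device` lines. A short Python sketch for extracting those from a pasted header follows, assuming the line format shown above. The function name `summarize_detection` and the returned dictionary layout are illustrative only.

```python
import re

# e.g. "GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 TU116 [GeForce GTX 1660 SUPER]"
GPU_LINE = re.compile(
    r"\bGPU (\d+): Bus:(\d+) Slot:(\d+) Func:(\d+) (\w+):(\d+) (\S+) \[(.+?)\]"
)
# e.g. "CUDA Device 0: Platform:0 ..." or "OpenCL Device 0: Platform:0 ..."
DEVICE_LINE = re.compile(r"\b(CUDA|OpenCL) Device (\d+):")


def summarize_detection(lines):
    """Collect detected GPUs and count CUDA/OpenCL device lines in a log header."""
    summary = {"gpus": [], "cuda": 0, "opencl": 0}
    for line in lines:
        gpu = GPU_LINE.search(line)
        if gpu:
            summary["gpus"].append(
                {
                    "index": int(gpu.group(1)),
                    "bus": int(gpu.group(2)),
                    "vendor": gpu.group(5),
                    "chip": gpu.group(7),
                    "name": gpu.group(8),
                }
            )
            continue
        device = DEVICE_LINE.search(line)
        if device:
            summary[device.group(1).lower()] += 1
    return summary
```

If the `cuda` or `opencl` count comes back as zero while a GPU is listed, that would point toward a driver installation problem and away from the hardware.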

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 4:00 pm
by benas_s
After a restart the log looks like this. But it is folding now, so maybe I should wait until it stops folding.

Code:

*********************** Log Started 2021-03-17T15:59:00Z ***********************
15:59:00:Trying to access database...
15:59:00:Successfully acquired database lock
15:59:00:Read GPUs.txt
15:59:00:Enabled folding slot 00: PAUSED cpu:15 (by user)
15:59:00:Enabled folding slot 01: PAUSED gpu:0:TU116 [GeForce GTX 1660 Ti] (by user)
15:59:00:****************************** FAHClient ******************************
15:59:00:        Version: 7.6.13
15:59:00:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:59:00:      Copyright: 2020 foldingathome.org
15:59:00:       Homepage: https://foldingathome.org/
15:59:00:           Date: Apr 27 2020
15:59:00:           Time: 21:21:01
15:59:00:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
15:59:00:         Branch: master
15:59:00:       Compiler: Visual C++ 2008
15:59:00:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:59:00:       Platform: win32 10
15:59:00:           Bits: 32
15:59:00:           Mode: Release
15:59:00:           Args: --open-web-control
15:59:00:         Config: C:\Users\PC\AppData\Roaming\FAHClient\config.xml
15:59:00:******************************** CBang ********************************
15:59:00:           Date: Apr 24 2020
15:59:00:           Time: 17:07:55
15:59:00:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
15:59:00:         Branch: master
15:59:00:       Compiler: Visual C++ 2008
15:59:00:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:59:00:       Platform: win32 10
15:59:00:           Bits: 32
15:59:00:           Mode: Release
15:59:00:******************************* System ********************************
15:59:00:            CPU: AMD Ryzen 7 3700X 8-Core Processor
15:59:00:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
15:59:00:           CPUs: 16
15:59:00:         Memory: 31.92GiB
15:59:00:    Free Memory: 27.63GiB
15:59:00:        Threads: WINDOWS_THREADS
15:59:00:     OS Version: 6.2
15:59:00:    Has Battery: false
15:59:00:     On Battery: false
15:59:00:     UTC Offset: 2
15:59:00:            PID: 4544
15:59:00:            CWD: C:\Users\PC\AppData\Roaming\FAHClient
15:59:00:  Win32 Service: false
15:59:00:             OS: Windows 10 Enterprise
15:59:00:        OS Arch: AMD64
15:59:00:           GPUs: 1
15:59:00:          GPU 0: Bus:9 Slot:0 Func:0 NVIDIA:7 TU116 [GeForce GTX 1660 Ti]
15:59:00:  CUDA Device 0: Platform:0 Device:0 Bus:9 Slot:0 Compute:7.5 Driver:11.2
15:59:00:OpenCL Device 0: Platform:0 Device:0 Bus:9 Slot:0 Compute:1.2 Driver:461.9
15:59:00:******************************* libFAH ********************************
15:59:00:           Date: Apr 15 2020
15:59:00:           Time: 14:53:14
15:59:00:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
15:59:00:         Branch: master
15:59:00:       Compiler: Visual C++ 2008
15:59:00:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
15:59:00:       Platform: win32 10
15:59:00:           Bits: 32
15:59:00:           Mode: Release
15:59:00:***********************************************************************
15:59:00:<config>
15:59:00:  <!-- Slot Control -->
15:59:00:  <power v='FULL'/>
15:59:00:
15:59:00:  <!-- User Information -->
15:59:00:  <passkey v='*****'/>
15:59:00:  <team v='36816'/>
15:59:00:  <user v='benas_s'/>
15:59:00:
15:59:00:  <!-- Folding Slots -->
15:59:00:  <slot id='0' type='CPU'>
15:59:00:    <paused v='true'/>
15:59:00:  </slot>
15:59:00:  <slot id='1' type='GPU'>
15:59:00:    <paused v='true'/>
15:59:00:  </slot>
15:59:00:</config>
15:59:01:4:127.0.0.1:New Web session
15:59:32:FS00:Unpaused
15:59:32:FS01:Unpaused
15:59:32:WU01:FS01:Starting
15:59:32:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\PC\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 4544 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
15:59:32:WU01:FS01:Started FahCore on PID 10308
15:59:32:WU01:FS01:Core PID:7044
15:59:32:WU01:FS01:FahCore 0x22 started
15:59:32:WU00:FS00:Connecting to assign1.foldingathome.org:80
15:59:32:WU01:FS01:0x22:*********************** Log Started 2021-03-17T15:59:32Z ***********************
15:59:32:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
15:59:32:WU01:FS01:0x22:       Core: Core22
15:59:32:WU01:FS01:0x22:       Type: 0x22
15:59:32:WU01:FS01:0x22:    Version: 0.0.13
15:59:32:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:59:32:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
15:59:32:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
15:59:32:WU01:FS01:0x22:       Date: Sep 19 2020
15:59:32:WU01:FS01:0x22:       Time: 02:35:58
15:59:32:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
15:59:32:WU01:FS01:0x22:     Branch: core22-0.0.13
15:59:32:WU01:FS01:0x22:   Compiler: Visual C++ 2015
15:59:32:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:59:32:WU01:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
15:59:32:WU01:FS01:0x22:   Platform: win32 10
15:59:32:WU01:FS01:0x22:       Bits: 64
15:59:32:WU01:FS01:0x22:       Mode: Release
15:59:32:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:59:32:WU01:FS01:0x22:             <peastman@stanford.edu>
15:59:32:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10308 -checkpoint 15
15:59:32:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
15:59:32:WU01:FS01:0x22:             0 -gpu 0
15:59:32:WU01:FS01:0x22:************************************ libFAH ************************************
15:59:32:WU01:FS01:0x22:       Date: Sep 7 2020
15:59:32:WU01:FS01:0x22:       Time: 19:09:56
15:59:32:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
15:59:32:WU01:FS01:0x22:     Branch: HEAD
15:59:32:WU01:FS01:0x22:   Compiler: Visual C++ 2015
15:59:32:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:59:32:WU01:FS01:0x22:   Platform: win32 10
15:59:32:WU01:FS01:0x22:       Bits: 64
15:59:32:WU01:FS01:0x22:       Mode: Release
15:59:32:WU01:FS01:0x22:************************************ CBang *************************************
15:59:32:WU01:FS01:0x22:       Date: Sep 7 2020
15:59:32:WU01:FS01:0x22:       Time: 19:08:30
15:59:32:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
15:59:32:WU01:FS01:0x22:     Branch: HEAD
15:59:32:WU01:FS01:0x22:   Compiler: Visual C++ 2015
15:59:32:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:59:32:WU01:FS01:0x22:   Platform: win32 10
15:59:32:WU01:FS01:0x22:       Bits: 64
15:59:32:WU01:FS01:0x22:       Mode: Release
15:59:32:WU01:FS01:0x22:************************************ System ************************************
15:59:32:WU01:FS01:0x22:        CPU: AMD Ryzen 7 3700X 8-Core Processor
15:59:32:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
15:59:32:WU01:FS01:0x22:       CPUs: 16
15:59:32:WU01:FS01:0x22:     Memory: 31.92GiB
15:59:32:WU01:FS01:0x22:Free Memory: 27.59GiB
15:59:32:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
15:59:32:WU01:FS01:0x22: OS Version: 6.2
15:59:32:WU01:FS01:0x22:Has Battery: false
15:59:32:WU01:FS01:0x22: On Battery: false
15:59:32:WU01:FS01:0x22: UTC Offset: 2
15:59:32:WU01:FS01:0x22:        PID: 7044
15:59:32:WU01:FS01:0x22:        CWD: C:\Users\PC\AppData\Roaming\FAHClient\work
15:59:32:WU01:FS01:0x22:************************************ OpenMM ************************************
15:59:32:WU01:FS01:0x22:   Revision: 189320d0
15:59:32:WU01:FS01:0x22:********************************************************************************
15:59:32:WU01:FS01:0x22:Project: 17800 (Run 1, Clone 114, Gen 217)
15:59:32:WU01:FS01:0x22:Unit: 0x00000000000000000000000000000000
15:59:32:WU01:FS01:0x22:Digital signatures verified
15:59:32:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
15:59:32:WU01:FS01:0x22:Version 0.0.13
15:59:32:WU01:FS01:0x22:  Checkpoint write interval: 250000 steps (5%) [20 total]
15:59:32:WU01:FS01:0x22:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
15:59:32:WU01:FS01:0x22:  XTC frame write interval: 25000 steps (0.5%) [200 total]
15:59:32:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
15:59:32:WU01:FS01:0x22:There are 4 platforms available.
15:59:32:WU01:FS01:0x22:Platform 0: Reference
15:59:32:WU01:FS01:0x22:Platform 1: CPU
15:59:32:WU01:FS01:0x22:Platform 2: OpenCL
15:59:32:WU01:FS01:0x22:  opencl-device 0 specified
15:59:32:WU01:FS01:0x22:Platform 3: CUDA
15:59:32:WU01:FS01:0x22:  cuda-device 0 specified
15:59:33:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
15:59:33:WU00:FS00:Connecting to assign2.foldingathome.org:80
15:59:33:WU01:FS01:0x22:Attempting to create CUDA context:
15:59:33:WU01:FS01:0x22:  Configuring platform CUDA
15:59:33:WU00:FS00:Assigned to work server 206.223.170.146
15:59:33:WU00:FS00:Requesting new work unit for slot 00: READY cpu:15 from 206.223.170.146
15:59:33:WU00:FS00:Connecting to 206.223.170.146:8080
15:59:33:WU01:FS01:0x22:  Using CUDA and gpu 0
15:59:33:WU01:FS01:0x22:Completed 750000 out of 5000000 steps (15%)
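
One way to confirm the roughly 40-minute pauses from the log itself is to pull out the "Completed ... steps" lines for the GPU slot and look for large gaps between them. The following is only a rough sketch, not part of the FAH client: the function names and the 10-minute threshold are assumptions chosen for illustration, and it relies only on the line format visible in the log above.

```python
import re
from datetime import timedelta

# Matches client log lines of the form seen above, e.g.
# 15:59:33:WU01:FS01:0x22:Completed 750000 out of 5000000 steps (15%)
PROGRESS = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):WU\d+:FS(\d+):0x\w+:Completed (\d+) out of (\d+) steps"
)


def progress_events(lines, slot="01"):
    """Return (time_of_day, steps_done) pairs for one folding slot.

    The log only carries HH:MM:SS, so a timestamp that goes backwards
    is treated as a midnight rollover (an assumption of this sketch).
    """
    events = []
    day_offset = timedelta(0)
    previous = None
    for line in lines:
        match = PROGRESS.match(line.strip())
        if not match or match.group(4) != slot:
            continue
        hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
        stamp = timedelta(hours=hours, minutes=minutes, seconds=seconds) + day_offset
        if previous is not None and stamp < previous:
            day_offset += timedelta(days=1)
            stamp += timedelta(days=1)
        previous = stamp
        events.append((stamp, int(match.group(5))))
    return events


def find_stalls(events, max_gap=timedelta(minutes=10)):
    """Return (start, end) pairs where progress lines are more than max_gap apart."""
    return [
        (earlier, later)
        for (earlier, _), (later, _) in zip(events, events[1:])
        if later - earlier > max_gap
    ]
```

Feeding it the lines of the client log (for example `find_stalls(progress_events(open(path)))`, where `path` points at your own log file) should list the periods in which slot 01 reported no progress. Bear in mind that progress lines are normally a few minutes apart, so the threshold may need adjusting for slower projects.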

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 4:18 pm
by iero
Are your temps still high as we speak? Which model of the 1660 Ti are you using? 80 °C at 111 W of power draw seems awfully high to me.

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 4:21 pm
by benas_s
iero wrote:Are your temps still high as we speak? Which model of the 1660 Ti are you using? 80 °C at 111 W of power draw seems awfully high to me.
Yes, they are still that high. I had been folding nonstop for a year and suddenly started getting these high temps, high fan speeds, and occasional folding drops. The temperature was usually 70 °C, but now it's almost 80 °C. The model is an Asus 1660 Ti.

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 4:29 pm
by iero
Did you change anything in your setup? New fans? Removed fans? New case? Changed PCI-E slot? Opened the card for maintenance? Accidentally applied an overclock? Anything else that comes to mind?

I would suggest updating to the newest drivers.

https://www.nvidia.com/en-gb/drivers/results/171591/
You should also check the card itself for dust buildup. Are you sure that the temperature increase was sudden and not gradual over time?

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 5:31 pm
by benas_s
iero wrote:Did you change anything in your setup? New fans? Removed fans? New case? Changed PCI-E slot? Opened the card for maintenance? Accidentally applied an overclock? Anything else that comes to mind?

I would suggest updating to the newest drivers.

https://www.nvidia.com/en-gb/drivers/results/171591/
You should also check the card itself for dust buildup. Are you sure that the temperature increase was sudden and not gradual over time?
I can't say the problem wasn't gradual; I only noticed it today. I haven't changed my setup for half a year. I will try opening the case and cleaning out the dust, and upgrading the drivers and the FAH client to v.21.

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 6:34 pm
by iero
If you are knowledgeable enough, and have the appropriate tools, I would suggest a thermal paste change as well. *** if the warranty has expired and/or there isn't a warranty sticker on your card.

In any case, waiting to hear if you found the source of your problem.

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 10:32 pm
by HaloJones
Try with the case cover off. The GPU running at 81 °C is normal, as Nvidia cards just vary the fan speed to keep it at or below that temperature. If the fans are ramping up as you say, then it's probably overheating. Clean out the system and make sure it gets lots of fresh air.
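
To see how temperature, fan speed, and GPU load move together over one of these cycles, the readings can be sampled with `nvidia-smi` and its `--query-gpu` option. This is a minimal sketch under stated assumptions: the helper names are made up for illustration, and the 80 °C and 10 % thresholds are arbitrary values picked to match the symptoms described in this thread, not limits published by Nvidia.

```python
import subprocess

# Query fields passed to nvidia-smi; values come back as one CSV line per GPU.
FIELDS = [
    "timestamp",
    "temperature.gpu",
    "fan.speed",
    "power.draw",
    "utilization.gpu",
    "clocks.sm",
]


def parse_sample(csv_line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [value.strip() for value in csv_line.split(",")]
    return dict(zip(FIELDS, values))


def read_sample():
    """Ask nvidia-smi for the current readings of the first GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])


def classify(sample, hot_at=80.0, idle_below=10.0):
    """Label a sample 'hot', 'idle', 'ok', or 'unknown' (thresholds are assumptions)."""
    try:
        temperature = float(sample["temperature.gpu"])
        utilization = float(sample["utilization.gpu"])
    except ValueError:
        # Some cards report a field as unsupported instead of a number.
        return "unknown"
    if temperature >= hot_at:
        return "hot"
    if utilization < idle_below:
        return "idle"
    return "ok"
```

Calling `classify(read_sample())` every few seconds and writing the results to a file would show whether each drop in folding is preceded by a stretch of "hot" samples, which would point towards cooling rather than a failing card.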

Re: Is my Nvidia GPU dying? (driver 461.09)

Posted: Wed Mar 17, 2021 11:10 pm
by bruce
Also, check the fins of the heatsink. Dust can accumulate and gradually cause the heatsink's efficiency to degrade. A good treatment with canned air can help a lot.