Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 2:34 pm
by Theonlycure
HaloJones wrote:
Theonlycure wrote:I have an RTX 2080Ti, and WU 13400 is way too slow on it as well. I usually plow through the work units and get near max credit. This one, however, is a slug: only 45% finished, with an estimated 9 hours and 42 minutes left. This would not bother me except that the points don't reflect how much time and electricity I am expending. Estimated credit: 317635. Very sad.
Can you provide a little detail?

What OS?
What "client-type" do you have set? Advanced? Beta?
Ubuntu Server 18.04, with just the generic download for Ubuntu. I don't know if any of the following helps; I paused and then restarted.

Code:

11:30:15:WU02:FS02:0x22:*********************** Log Started 2020-04-27T11:30:15Z ***********************
11:30:15:WU02:FS02:0x22:*************************** Core22 Folding@home Core ***************************
11:30:15:WU02:FS02:0x22:       Type: 0x22
11:30:15:WU02:FS02:0x22:       Core: Core22
11:30:15:WU02:FS02:0x22:    Website: https://foldingathome.org/
11:30:15:WU02:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
11:30:15:WU02:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
11:30:15:WU02:FS02:0x22:             <rafal.wiewiora@choderalab.org>
11:30:15:WU02:FS02:0x22:       Args: -dir 02 -suffix 01 -version 706 -lifeline 65736 -checkpoint 15
11:30:15:WU02:FS02:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
11:30:15:WU02:FS02:0x22:             0 -gpu 0
11:30:15:WU02:FS02:0x22:     Config: <none>
11:30:15:WU02:FS02:0x22:************************************ Build *************************************
11:30:15:WU02:FS02:0x22:    Version: 0.0.5
11:30:15:WU02:FS02:0x22:       Date: Apr 22 2020
11:30:15:WU02:FS02:0x22:       Time: 03:57:11
11:30:15:WU02:FS02:0x22: Repository: Git
11:30:15:WU02:FS02:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
11:30:15:WU02:FS02:0x22:     Branch: HEAD
11:30:15:WU02:FS02:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
11:30:15:WU02:FS02:0x22:    Options: -std=c++11 -O3 -funroll-loops
11:30:15:WU02:FS02:0x22:   Platform: linux2 4.19.76-linuxkit
11:30:15:WU02:FS02:0x22:       Bits: 64
11:30:15:WU02:FS02:0x22:       Mode: Release
11:30:15:WU02:FS02:0x22:************************************ System ************************************
11:30:15:WU02:FS02:0x22:        CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
11:30:15:WU02:FS02:0x22:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
11:30:15:WU02:FS02:0x22:       CPUs: 64
11:30:15:WU02:FS02:0x22:     Memory: 62.85GiB
11:30:15:WU02:FS02:0x22:Free Memory: 2.12GiB
11:30:15:WU02:FS02:0x22:    Threads: POSIX_THREADS
11:30:15:WU02:FS02:0x22: OS Version: 4.15
11:30:15:WU02:FS02:0x22:Has Battery: false
11:30:15:WU02:FS02:0x22: On Battery: false
11:30:15:WU02:FS02:0x22: UTC Offset: -4
11:30:15:WU02:FS02:0x22:        PID: 65740
11:30:15:WU02:FS02:0x22:        CWD: /var/lib/fahclient/work
11:30:15:WU02:FS02:0x22:         OS: Linux 4.15.0-96-generic x86_64
11:30:15:WU02:FS02:0x22:    OS Arch: AMD64
11:30:15:WU02:FS02:0x22:********************************************************************************
11:30:15:WU02:FS02:0x22:Project: 13400 (Run 74, Clone 92, Gen 2)
11:30:15:WU02:FS02:0x22:Unit: 0x0000000212bc7d9a5ea3c3b9dc927870
11:30:15:WU02:FS02:0x22:Digital signatures verified
11:30:15:WU02:FS02:0x22:Folding@home GPU Core22 Folding@home Core
11:30:15:WU02:FS02:0x22:Version 0.0.5
11:30:15:WU02:FS02:0x22:  Found a checkpoint file
11:30:20:WU02:FS02:0x22:Completed 1450000 out of 2000000 steps (72%)
11:30:20:WU02:FS02:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
11:31:45:WU02:FS02:0x22:Completed 1460000 out of 2000000 steps (73%)
11:44:59:WU02:FS02:0x22:Completed 1480000 out of 2000000 steps (74%)
12:11:42:WU02:FS02:0x22:Completed 1500000 out of 2000000 steps (75%)
12:45:27:WU02:FS02:0x22:Completed 1520000 out of 2000000 steps (76%)
12:47:46:FS02:Paused
12:47:46:FS02:Shutting core down
12:47:46:WU02:FS02:0x22:Caught signal SIGINT(2) on PID 65740
12:47:46:WU02:FS02:0x22:Exiting, please wait. . .
12:47:46:WU02:FS02:0x22:Folding@home Core Shutdown: INTERRUPTED
12:47:46:WU02:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
13:04:35:FS02:Unpaused
13:04:35:WU02:FS02:Starting
13:04:35:WU02:FS02:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 02 -suffix 01 -version 706 -lifeline 53535 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
13:04:35:WU02:FS02:Started FahCore on PID 78839
13:04:35:WU02:FS02:Core PID:78843
13:04:35:WU02:FS02:FahCore 0x22 started
13:04:36:WU02:FS02:0x22:*********************** Log Started 2020-04-27T13:04:36Z ***********************
13:04:36:WU02:FS02:0x22:*************************** Core22 Folding@home Core ***************************
13:04:36:WU02:FS02:0x22:       Type: 0x22
13:04:36:WU02:FS02:0x22:       Core: Core22
13:04:36:WU02:FS02:0x22:    Website: https://foldingathome.org/
13:04:36:WU02:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
13:04:36:WU02:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
13:04:36:WU02:FS02:0x22:             <rafal.wiewiora@choderalab.org>
13:04:36:WU02:FS02:0x22:       Args: -dir 02 -suffix 01 -version 706 -lifeline 78839 -checkpoint 15
13:04:36:WU02:FS02:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
13:04:36:WU02:FS02:0x22:             0 -gpu 0
13:04:36:WU02:FS02:0x22:     Config: <none>
13:04:36:WU02:FS02:0x22:************************************ Build *************************************
13:04:36:WU02:FS02:0x22:    Version: 0.0.5
13:04:36:WU02:FS02:0x22:       Date: Apr 22 2020
13:04:36:WU02:FS02:0x22:       Time: 03:57:11
13:04:36:WU02:FS02:0x22: Repository: Git
13:04:36:WU02:FS02:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
13:04:36:WU02:FS02:0x22:     Branch: HEAD
13:04:36:WU02:FS02:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
13:04:36:WU02:FS02:0x22:    Options: -std=c++11 -O3 -funroll-loops
13:04:36:WU02:FS02:0x22:   Platform: linux2 4.19.76-linuxkit
13:04:36:WU02:FS02:0x22:       Bits: 64
13:04:36:WU02:FS02:0x22:       Mode: Release
13:04:36:WU02:FS02:0x22:************************************ System ************************************
13:04:36:WU02:FS02:0x22:        CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
13:04:36:WU02:FS02:0x22:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
13:04:36:WU02:FS02:0x22:       CPUs: 64
13:04:36:WU02:FS02:0x22:     Memory: 62.85GiB
13:04:36:WU02:FS02:0x22:Free Memory: 3.93GiB
13:04:36:WU02:FS02:0x22:    Threads: POSIX_THREADS
13:04:36:WU02:FS02:0x22: OS Version: 4.15
13:04:36:WU02:FS02:0x22:Has Battery: false
13:04:36:WU02:FS02:0x22: On Battery: false
13:04:36:WU02:FS02:0x22: UTC Offset: -4
13:04:36:WU02:FS02:0x22:        PID: 78843
13:04:36:WU02:FS02:0x22:        CWD: /var/lib/fahclient/work
13:04:36:WU02:FS02:0x22:         OS: Linux 4.15.0-96-generic x86_64
13:04:36:WU02:FS02:0x22:    OS Arch: AMD64
13:04:36:WU02:FS02:0x22:********************************************************************************
13:04:36:WU02:FS02:0x22:Project: 13400 (Run 74, Clone 92, Gen 2)
13:04:36:WU02:FS02:0x22:Unit: 0x0000000212bc7d9a5ea3c3b9dc927870
13:04:36:WU02:FS02:0x22:Digital signatures verified
13:04:36:WU02:FS02:0x22:Folding@home GPU Core22 Folding@home Core
13:04:36:WU02:FS02:0x22:Version 0.0.5

https://drive.google.com/file/d/1M9oTxk ... sp=sharing

Mod Edit: Added Code Tags - PantherX
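
For anyone who wants to put numbers on a slowdown like this, here is a minimal Python sketch that reads the "Completed N out of M steps" lines from a log and works out steps per second between consecutive lines. The function name `step_rates` and the regular expression are illustrative assumptions, not part of FAHClient; the sample lines are copied from the log above. On that excerpt, the first interval covers 10,000 steps in 85 seconds, while the last covers 20,000 steps in 2,025 seconds.

```python
import re
from datetime import datetime

# Matches progress lines such as:
# 11:30:20:WU02:FS02:0x22:Completed 1450000 out of 2000000 steps (72%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps"
)

SAMPLE = """\
11:30:20:WU02:FS02:0x22:Completed 1450000 out of 2000000 steps (72%)
11:31:45:WU02:FS02:0x22:Completed 1460000 out of 2000000 steps (73%)
11:44:59:WU02:FS02:0x22:Completed 1480000 out of 2000000 steps (74%)
12:11:42:WU02:FS02:0x22:Completed 1500000 out of 2000000 steps (75%)
12:45:27:WU02:FS02:0x22:Completed 1520000 out of 2000000 steps (76%)
"""


def step_rates(log_text):
    """Return (start, end, steps_per_second) for consecutive progress lines."""
    points = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            points.append((stamp, int(match.group(2))))

    rates = []
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds < 0:
            # The log only carries time of day, so assume a midnight rollover.
            seconds += 24 * 60 * 60
        if seconds > 0 and s1 > s0:
            rates.append(
                (t0.strftime("%H:%M:%S"), t1.strftime("%H:%M:%S"), (s1 - s0) / seconds)
            )
    return rates


if __name__ == "__main__":
    for start, end, rate in step_rates(SAMPLE):
        print(f"{start} -> {end}: {rate:.1f} steps/s")
```

The first interval starts right after a checkpoint resume, so treat it as a rough reference rather than a clean baseline.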

Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 2:39 pm
by Theonlycure
Weird. After I paused it, it came back at a faster speed. WTH.

Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 4:38 pm
by Joe_H
That has been seen by some after a driver crash and reset: the clocks on a GPU get "stuck" at a lower speed and do not go back to more normal speeds. Sometimes a reboot has been needed to reset things; other times just stopping and starting was enough. That might be the problem here, or something similar.
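
On NVIDIA cards, one way to check for this is to compare the current SM clock with the card's maximum while a WU is running. The sketch below is only a rough illustration: it assumes `nvidia-smi` is on the PATH, the 0.6 threshold is an arbitrary guess, the helper names are made up, and the numbers in the usage comment are invented sample values. An idle GPU will also report low clocks, so the result only means something while the GPU is folding.

```python
import subprocess

# Query fields documented by `nvidia-smi --help-query-gpu`.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,clocks.sm,clocks.max.sm",
    "--format=csv,noheader,nounits",
]


def parse_clocks(csv_text):
    """Parse 'index, current MHz, max MHz' rows; skip rows with missing values."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 3 and all(field.isdigit() for field in fields):
            rows.append(tuple(int(field) for field in fields))
    return rows


def looks_stuck(current_mhz, max_mhz, threshold=0.6):
    """Hypothetical heuristic: flag a clock far below the card's maximum."""
    return current_mhz < threshold * max_mhz


def check_gpus():
    """Run nvidia-smi and return (index, current, max, stuck?) per GPU."""
    output = subprocess.run(
        QUERY, capture_output=True, text=True, check=True
    ).stdout
    return [
        (index, current, maximum, looks_stuck(current, maximum))
        for index, current, maximum in parse_clocks(output)
    ]


# Example with invented values: parse_clocks("0, 1100, 2100") gives
# [(0, 1100, 2100)], and looks_stuck(1100, 2100) flags it as suspicious.
```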

Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 4:45 pm
by Kebast
Joe_H wrote:That has been seen by some after a driver crash and reset: the clocks on a GPU get "stuck" at a lower speed and do not go back to more normal speeds. Sometimes a reboot has been needed to reset things; other times just stopping and starting was enough. That might be the problem here, or something similar.
I noticed something like that happen on WU 16434 earlier today. The GPU clock dropped to ~1100 for seemingly no reason and never returned to the normal range. I was working at the time and didn't notice any obvious errors, and nothing showed in the log file. GPU clocks returned to normal once I started up a game and then quit.

Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 6:33 pm
by TPL
I'm running 13400. Still some 5 hours to go. I'll tell you what happens.

My GPU is a GTX 1650 Mobile / Max-Q. It can finish within the timeout, but without much margin: about 22 hours.

Re: 13400 assigned to GPUs that are way too slow

Posted: Mon Apr 27, 2020 7:09 pm
by Theonlycure
Joe_H wrote:That has been seen by some after a driver crash and reset: the clocks on a GPU get "stuck" at a lower speed and do not go back to more normal speeds. Sometimes a reboot has been needed to reset things; other times just stopping and starting was enough. That might be the problem here, or something similar.
Thanks, makes sense.

Re: 13400 assigned to GPUs that are way too slow

Posted: Tue Apr 28, 2020 3:06 am
by PantherX
Welcome to the F@H Forum Basti,
Basti wrote:Ran into this today, too...
Please note that your GPU was unfortunately too slow to complete the assigned WU. Hence, it reached the expiration date and was deleted:
07:45:53:WARNING:WU01:FS01:Past final deadline 2020-04-27T07:45:52Z, dumping

However, the project will not be assigned to any GPUs while the researchers investigate the data that has been gathered. Here are some details if you're keen: viewtopic.php?f=24&t=34786

Re: 13400 assigned to GPUs that are way too slow

Posted: Tue Apr 28, 2020 3:09 am
by bruce
When you report that a project is being assigned to a GPU that's way too slow, be sure you mention what GPU is doing the folding. Also, describe the PCIe interface. (Is the GPU getting its data at 1x?)

We can't help you if we can't tell what the problem is.

And, yes, P13400 is a challenging project. :eugeek:
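
On Linux, the negotiated PCIe link can usually be read from the `LnkSta` line of `lspci -vv`. The following is a minimal sketch, assuming `lspci` is installed and run with enough privileges to show the capability block (without root it often prints "access denied" instead). The function names are hypothetical, and the slot address has to come from your own `lspci` output.

```python
import re
import subprocess

# Matches both older and newer lspci output, for example:
#   LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- ...
#   LnkSta: Speed 8GT/s (ok), Width x16 (ok)
LNKSTA = re.compile(r"LnkSta:\s*Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link_status(lspci_text):
    """Return (speed in GT/s, lane width) from lspci -vv text, or None."""
    match = LNKSTA.search(lspci_text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def link_status_for_slot(slot):
    """Run `lspci -vv -s <slot>` and parse the negotiated link."""
    output = subprocess.run(
        ["lspci", "-vv", "-s", slot], capture_output=True, text=True, check=True
    ).stdout
    return parse_link_status(output)


if __name__ == "__main__":
    sample = "\t\tLnkSta:\tSpeed 2.5GT/s, Width x1, TrErr- Train- SlotClk+"
    print(parse_link_status(sample))  # (2.5, 1)
```

A width of x1 in the result would answer the "is the GPU getting its data at 1x?" question directly. Some cards drop the link speed at idle to save power, so it is best to check while a WU is running.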

Re: 13400 assigned to GPUs that are way too slow

Posted: Tue Apr 28, 2020 3:15 am
by TPL
It was OK for me: 22 hours 20 minutes.

Re: 13400 assigned to GPUs that are way too slow

Posted: Tue Apr 28, 2020 4:44 am
by Nuitari
Basti wrote:

Code:

08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Polaris11] (rev ff) (prog-if 00 [VGA controller])
        Subsystem: Sapphire Technology Limited Baffin [Radeon RX 550 640SP / RX 560/560X] (Radeon RX 550 640SP)
I have a regular RX560 (the non-OC version) that got that unit, and it barely squeaked by at 1.8 days, and it's a GPU that only does folding.

Interesting to see that you got that card version to fold. Whenever I tried on that one I'd get crashes and tons of errors.

Re: 13400 assigned to GPUs that are way too slow

Posted: Tue Apr 28, 2020 12:35 pm
by MaartenBaert
bruce wrote:When you report that a project is being assigned to a GPU that's way too slow, be sure you mention what GPU is doing the folding. Also, describe the PCIe interface. (Is the GPU getting its data at 1x?)

We can't help you if we can't tell what the problem is.

And, yes, P13400 is a challenging project. :eugeek:
For completeness' sake:

System 1:
CPU is Intel Core i7-4770 CPU @ 3.40GHz
GPU is Nvidia GeForce GTX 660, PCIe 3.0 16x, driver version 440.82
OS is Arch Linux, kernel 5.6.6
Time to complete WU on GPU: ~2.4 days

System 2:
CPU is Intel Xeon CPU E3-1271 v3 @ 3.60GHz
GPU is NVIDIA NVS 310, PCIe 2.0 16x, driver version 390.116
OS is CentOS 7, kernel 3.10.0
Time to complete WU on GPU: ~26.8 days

System 3:
CPU is Intel Xeon CPU E3-1270 v6 @ 3.80GHz
GPU is Nvidia GeForce GTX 1060, PCIe 3.0 16x, driver version 440.64
OS is CentOS 7, kernel 3.10.0
Time to complete WU on GPU: ~16.3 hours