WU 13416 low ppd long run time

Moderators: Site Moderators, FAHC Science Team

rjccosta1
Posts: 7
Joined: Tue Jul 07, 2020 8:22 am

WU 13416 low ppd long run time

Post by rjccosta1 »

I have never posted here before because I never had problems :D :D. I have been folding for 8 years now. For the last 3 months my average PPD has been 1.4M with an RTX 2070. I have an undervolt of -100 mV on the GPU. Most WUs finish in less than 3 hours.

The problem is that this WU type is taking 6 h to finish, which is very unusual. The average PPD is also only 800k-900k, roughly 36-43% less than usual. I reset the graphics card to stock settings and the average was exactly the same. My CPU is running at stock, with no PBO enabled. The temperatures and clock speeds of the GPU and CPU are exactly the same as usual.
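
As a quick sanity check on the size of the drop, here is a minimal Python sketch that uses only the PPD figures quoted above. The variable and function names are made up for illustration and are not part of any Folding@home tool.

```python
# Figures quoted in the post: the usual PPD and the range seen on the slow WUs.
usual_ppd = 1_400_000
slow_ppd_range = (800_000, 900_000)


def relative_drop(usual: float, observed: float) -> float:
    """Return the fractional drop of `observed` relative to `usual`."""
    return (usual - observed) / usual


drops = [relative_drop(usual_ppd, ppd) for ppd in slow_ppd_range]
for ppd, drop in zip(slow_ppd_range, drops):
    print(f"{ppd:>9,} PPD is {drop:.0%} below {usual_ppd:,} PPD")
```

With these numbers the drop works out to somewhere between a bit over a third and a bit under a half.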

Please find the first log lines here to start the discussion. I will post the rest when I finish a couple of WUs of this type. Please keep any requests polite and civilised, no need to bite. If I am wrong, just nudge me in the right direction :D.

Code:

*********************** Log Started 2020-07-07T07:51:41Z ***********************
07:51:41:Trying to access database...
07:51:41:Successfully acquired database lock
07:51:41:Read GPUs.txt
07:51:42:Enabled folding slot 01: PAUSED gpu:0:TU106 [GeForce RTX 2070] M 6497 (by user)
07:51:43:****************************** FAHClient ******************************
07:51:43:        Version: 7.6.13
07:51:43:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:51:43:      Copyright: 2020 foldingathome.org
07:51:43:       Homepage: https://foldingathome.org/
07:51:43:           Date: Apr 27 2020
07:51:43:           Time: 21:21:01
07:51:43:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:         Config: C:\Users\ricardo\AppData\Roaming\FAHClient\config.xml
07:51:43:******************************** CBang ********************************
07:51:43:           Date: Apr 24 2020
07:51:43:           Time: 17:07:55
07:51:43:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:******************************* System ********************************
07:51:43:            CPU: AMD Ryzen 5 3600 6-Core Processor
07:51:43:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
07:51:43:           CPUs: 12
07:51:43:         Memory: 15.95GiB
07:51:43:    Free Memory: 13.63GiB
07:51:43:        Threads: WINDOWS_THREADS
07:51:43:     OS Version: 6.2
07:51:43:    Has Battery: false
07:51:43:     On Battery: false
07:51:43:     UTC Offset: 1
07:51:43:            PID: 11628
07:51:43:            CWD: C:\Users\ricardo\AppData\Roaming\FAHClient
07:51:43:  Win32 Service: false
07:51:43:             OS: Windows 10 Enterprise
07:51:43:        OS Arch: AMD64
07:51:43:           GPUs: 1
07:51:43:          GPU 0: Bus:38 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2070] M 6497
07:51:43:  CUDA Device 0: Platform:0 Device:0 Bus:38 Slot:0 Compute:7.5 Driver:11.0
07:51:43:OpenCL Device 0: Platform:0 Device:0 Bus:38 Slot:0 Compute:1.2 Driver:451.48
07:51:43:******************************* libFAH ********************************
07:51:43:           Date: Apr 15 2020
07:51:43:           Time: 14:53:14
07:51:43:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:***********************************************************************
07:51:43:<config>
07:51:43:  <!-- Folding Core -->
07:51:43:  <checkpoint v='3'/>
07:51:43:
07:51:43:  <!-- Folding Slot Configuration -->
07:51:43:  <client-type v='advanced'/>
07:51:43:
07:51:43:  <!-- Network -->
07:51:43:  <proxy v=':8080'/>
07:51:43:
07:51:43:  <!-- Slot Control -->
07:51:43:  <pause-on-battery v='false'/>
07:51:43:  <power v='full'/>
07:51:43:
07:51:43:  <!-- User Information -->
07:51:43:  <passkey v='*****'/>
07:51:43:  <team v='35947'/>
07:51:43:  <user v='rjcman'/>
07:51:43:
07:51:43:  <!-- Folding Slots -->
07:51:43:  <slot id='1' type='GPU'>
07:51:43:    <paused v='true'/>
07:51:43:  </slot>
07:51:43:</config>
07:52:45:FS01:Unpaused
07:52:45:WU01:FS01:Starting
07:52:45:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\ricardo\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 11628 -checkpoint 3 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
07:52:45:WU01:FS01:Started FahCore on PID 15104
07:52:45:WU01:FS01:Core PID:9780
07:52:45:WU01:FS01:FahCore 0x22 started
07:52:46:WU01:FS01:0x22:*********************** Log Started 2020-07-07T07:52:45Z ***********************
07:52:46:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
07:52:46:WU01:FS01:0x22:       Core: Core22
07:52:46:WU01:FS01:0x22:       Type: 0x22
07:52:46:WU01:FS01:0x22:    Version: 0.0.11
07:52:46:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:52:46:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
07:52:46:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:49:16
07:52:46:WU01:FS01:0x22:   Revision: 22010df8a4db48db1b35d33e666b64d8ce48689d
07:52:46:WU01:FS01:0x22:     Branch: core22-0.0.11
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
07:52:46:WU01:FS01:0x22:             <peastman@stanford.edu>
07:52:46:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 15104 -checkpoint 3
07:52:46:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
07:52:46:WU01:FS01:0x22:             0 -gpu 0
07:52:46:WU01:FS01:0x22:************************************ libFAH ************************************
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:47:12
07:52:46:WU01:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
07:52:46:WU01:FS01:0x22:     Branch: HEAD
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:************************************ CBang *************************************
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:46:11
07:52:46:WU01:FS01:0x22:   Revision: f8529962055b0e7bde23e429f5072ff758089dee
07:52:46:WU01:FS01:0x22:     Branch: master
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:************************************ System ************************************
07:52:46:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
07:52:46:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
07:52:46:WU01:FS01:0x22:       CPUs: 12
07:52:46:WU01:FS01:0x22:     Memory: 15.95GiB
07:52:46:WU01:FS01:0x22:Free Memory: 12.27GiB
07:52:46:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
07:52:46:WU01:FS01:0x22: OS Version: 6.2
07:52:46:WU01:FS01:0x22:Has Battery: false
07:52:46:WU01:FS01:0x22: On Battery: false
07:52:46:WU01:FS01:0x22: UTC Offset: 1
07:52:46:WU01:FS01:0x22:        PID: 9780
07:52:46:WU01:FS01:0x22:        CWD: C:\Users\ricardo\AppData\Roaming\FAHClient\work
07:52:46:WU01:FS01:0x22:********************************************************************************
07:52:46:WU01:FS01:0x22:Project: 13416 (Run 1053, Clone 177, Gen 0)
07:52:46:WU01:FS01:0x22:Unit: 0x0000000012bc7d9a5f02af804fd165a7
07:52:46:WU01:FS01:0x22:Digital signatures verified
07:52:46:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:52:46:WU01:FS01:0x22:Version 0.0.11
07:52:46:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
07:52:46:WU01:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
07:52:46:WU01:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
07:52:46:WU01:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
07:52:58:WU01:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
07:53:43:Removing old file 'configs/config-20200701-081923.xml'
Last edited by rjccosta1 on Tue Jul 07, 2020 4:17 pm, edited 1 time in total.
rjccosta1
Posts: 7
Joined: Tue Jul 07, 2020 8:22 am

Re: WU 13416 low ppd long run time

Post by rjccosta1 »

I find all of this strange because I got another WU from project 13416 that was folding at the usual 1.4M PPD. This means that not all WUs in project 13416 are slow. The log below is from one of the 3 WUs that were very slow (800K PPD on the RTX 2070).

Please find the wu in question - Project: 13416 (Run 1053, Clone 177, Gen 0):

Code:

07:52:46:WU01:FS01:0x22:Project: 13416 (Run 1053, Clone 177, Gen 0)
07:52:46:WU01:FS01:0x22:Unit: 0x0000000012bc7d9a5f02af804fd165a7
07:52:46:WU01:FS01:0x22:Digital signatures verified
07:52:46:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:52:46:WU01:FS01:0x22:Version 0.0.11
07:52:46:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
07:52:46:WU01:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
07:52:46:WU01:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
07:52:46:WU01:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
07:52:58:WU01:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
07:53:43:Removing old file 'configs/config-20200701-081923.xml'
07:53:43:Saving configuration to config.xml
07:53:43:<config>
07:53:43:  <!-- Folding Core -->
07:53:43:  <checkpoint v='3'/>
07:53:43:
07:53:43:  <!-- Folding Slot Configuration -->
07:53:43:  <client-type v='advanced'/>
07:53:43:
07:53:43:  <!-- Network -->
07:53:43:  <proxy v=':8080'/>
07:53:43:
07:53:43:  <!-- Slot Control -->
07:53:43:  <pause-on-battery v='false'/>
07:53:43:  <power v='full'/>
07:53:43:
07:53:43:  <!-- User Information -->
07:53:43:  <passkey v='*****'/>
07:53:43:  <team v='35947'/>
07:53:43:  <user v='rjcman'/>
07:53:43:
07:53:43:  <!-- Folding Slots -->
07:53:43:  <slot id='1' type='GPU'/>
07:53:43:</config>
07:55:26:WU01:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
07:57:57:WU01:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
08:00:28:WU01:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
08:02:58:WU01:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
08:05:28:WU01:FS01:0x22:Completed 350000 out of 1000000 steps (35%)

10:41:13:WU01:FS01:0x22:Completed 990000 out of 1000000 steps (99%)
10:41:14:WU00:FS01:Connecting to assign1.foldingathome.org:80
10:41:14:WU00:FS01:Assigned to work server 18.188.125.154
10:41:14:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:TU106 [GeForce RTX 2070] M 6497 from 18.188.125.154
10:41:14:WU00:FS01:Connecting to 18.188.125.154:8080
10:41:15:WU00:FS01:Downloading 7.03MiB
10:41:21:WU00:FS01:Download 84.48%
10:41:21:WU00:FS01:Download complete
10:41:22:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13416 run:399 clone:130 gen:1 core:0x22 unit:0x0000000312bc7d9a5f00a7e7ea520da4
10:43:43:WU01:FS01:0x22:Completed 1000000 out of 1000000 steps (100%)
10:43:43:WU01:FS01:0x22:Average performance: 116.129 ns/day
10:43:47:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
10:43:47:WU01:FS01:0x22:Saving result file checkpointState.xml.bz2
10:43:47:WU01:FS01:0x22:Saving result file globals.csv
10:43:47:WU01:FS01:0x22:Saving result file positions.xtc
10:43:47:WU01:FS01:0x22:Saving result file science.log
10:43:47:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
10:43:48:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
10:43:48:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13416 run:1053 clone:177 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af804fd165a7
10:43:48:WU01:FS01:Uploading 5.83MiB to 18.188.125.154
10:43:48:WU01:FS01:Connecting to 18.188.125.154:8080
10:43:48:WU00:FS01:Starting
10:43:48:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\ricardo\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 11628 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
10:43:48:WU00:FS01:Started FahCore on PID 4832
10:43:48:WU00:FS01:Core PID:12212
10:43:48:WU00:FS01:FahCore 0x22 started
10:43:49:WU00:FS01:0x22:*********************** Log Started 2020-07-07T10:43:48Z ***********************
10:43:49:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
10:43:49:WU00:FS01:0x22:       Core: Core22
10:43:49:WU00:FS01:0x22:       Type: 0x22
10:43:49:WU00:FS01:0x22:    Version: 0.0.11
10:43:49:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
10:43:49:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
10:43:49:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:49:16
10:43:49:WU00:FS01:0x22:   Revision: 22010df8a4db48db1b35d33e666b64d8ce48689d
10:43:49:WU00:FS01:0x22:     Branch: core22-0.0.11
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
10:43:49:WU00:FS01:0x22:             <peastman@stanford.edu>
10:43:49:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 4832 -checkpoint 30
10:43:49:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
10:43:49:WU00:FS01:0x22:             0 -gpu 0
10:43:49:WU00:FS01:0x22:************************************ libFAH ************************************
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:47:12
10:43:49:WU00:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
10:43:49:WU00:FS01:0x22:     Branch: HEAD
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:************************************ CBang *************************************
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:46:11
10:43:49:WU00:FS01:0x22:   Revision: f8529962055b0e7bde23e429f5072ff758089dee
10:43:49:WU00:FS01:0x22:     Branch: master
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:************************************ System ************************************
10:43:49:WU00:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
10:43:49:WU00:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
10:43:49:WU00:FS01:0x22:       CPUs: 12
10:43:49:WU00:FS01:0x22:     Memory: 15.95GiB
10:43:49:WU00:FS01:0x22:Free Memory: 9.24GiB
10:43:49:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
10:43:49:WU00:FS01:0x22: OS Version: 6.2
10:43:49:WU00:FS01:0x22:Has Battery: false
10:43:49:WU00:FS01:0x22: On Battery: false
10:43:49:WU00:FS01:0x22: UTC Offset: 1
10:43:49:WU00:FS01:0x22:        PID: 12212
10:43:49:WU00:FS01:0x22:        CWD: C:\Users\ricardo\AppData\Roaming\FAHClient\work
10:43:49:WU00:FS01:0x22:********************************************************************************
10:43:49:WU00:FS01:0x22:Project: 13416 (Run 399, Clone 130, Gen 1)
10:43:49:WU00:FS01:0x22:Unit: 0x0000000312bc7d9a5f00a7e7ea520da4
10:43:49:WU00:FS01:0x22:Reading tar file core.xml
10:43:49:WU00:FS01:0x22:Reading tar file integrator.xml
10:43:49:WU00:FS01:0x22:Reading tar file state.xml.bz2
10:43:49:WU00:FS01:0x22:Reading tar file system.xml.bz2
10:43:49:WU00:FS01:0x22:Digital signatures verified
10:43:49:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
10:43:49:WU00:FS01:0x22:Version 0.0.11
10:43:49:WU00:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
10:43:49:WU00:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
10:43:49:WU00:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
10:43:49:WU00:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
10:43:54:WU01:FS01:Upload 10.72%
10:44:00:WU01:FS01:Upload 20.37%
10:44:01:WU00:FS01:0x22:Completed 0 out of 1000000 steps (0%)
10:44:06:WU01:FS01:Upload 31.09%
10:44:12:WU01:FS01:Upload 40.74%
10:44:18:WU01:FS01:Upload 51.46%
10:44:24:WU01:FS01:Upload 62.18%
10:44:30:WU01:FS01:Upload 72.90%
10:44:36:WU01:FS01:Upload 83.63%
10:44:42:WU01:FS01:Upload 94.35%
10:44:47:WU01:FS01:Upload complete
10:44:47:WU01:FS01:Server responded WORK_ACK (400)
10:44:47:WU01:FS01:Final credit estimate, 153414.00 points
10:44:47:WU01:FS01:Cleaning up
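
The progress lines in a log like the one above are enough for a rough estimate of the time per frame (TPF, the time for 1% of the WU) and of the PPD. The Python sketch below is only an illustration under stated assumptions: the regular expression matches the line format seen in this log, the helper name `frame_times` is hypothetical, and the PPD formula (credit times seconds per day divided by estimated run time) is a simple approximation that ignores how the bonus depends on the time since assignment.

```python
import re
from datetime import datetime

# Progress lines copied from the log above (Project 13416, Run 1053, Clone 177, Gen 0).
LOG = """\
07:55:26:WU01:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
07:57:57:WU01:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
08:00:28:WU01:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
08:02:58:WU01:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
"""

# Assumed line format: HH:MM:SS:WUxx:FSxx:0x22:Completed N out of M steps (P%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x22:Completed \d+ out of \d+ steps \((\d+)%\)"
)


def frame_times(log_text: str) -> list[float]:
    """Return the seconds elapsed between consecutive 1% progress lines."""
    stamps = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S")
            stamps.append((when, int(match.group(2))))

    deltas = []
    for (t0, p0), (t1, p1) in zip(stamps, stamps[1:]):
        if p1 - p0 != 1:
            continue  # skip gaps such as pauses or missing lines
        seconds = (t1 - t0).total_seconds()
        if seconds < 0:
            seconds += 86400  # the log crossed midnight
        deltas.append(seconds)
    return deltas


deltas = frame_times(LOG)
tpf = sum(deltas) / len(deltas)        # average seconds per 1% frame
est_runtime_s = tpf * 100              # estimated run time for the whole WU
credit = 153414.00                     # "Final credit estimate" from the log
est_ppd = credit * 86400 / est_runtime_s

print(f"TPF ~ {tpf:.0f} s, run time ~ {est_runtime_s / 3600:.1f} h, PPD ~ {est_ppd:,.0f}")
```

For this WU the sketch gives a TPF of about two and a half minutes and a PPD in the high 800k range, which is consistent with the 800k-900k figure reported for the slow WUs.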
Joe_H
Site Admin
Posts: 8226
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: WU 13416 low ppd long run time

Post by Joe_H »

There are multiple posts, including by the researcher running these projects, saying that the 134nn projects are showing more variance than normal projects. WUs from some runs will be significantly slower. The data from these is being analyzed to see whether a reason can be determined, in order to improve future project configuration and assignment.
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: WU 13416 low ppd long run time

Post by Sparkly »

Yeah, I see this too. Some of the 13416 WUs are insanely CPU hungry for some reason, even though they are running on the GPU, and each requires a full CPU core of its own to even be able to move forward, compared to other WUs in the same project that have the normal expected CPU load for their atom count.
Curt3g
Posts: 16
Joined: Sun Mar 29, 2020 4:27 pm

Re: WU 13416 low ppd long run time

Post by Curt3g »

I also just had a long running one, taking 12.5 hours (normally taking 4 hours). It was run:1291 clone:121 gen:0.

Just wanted to confirm, did I see in one of the posts from @JohnChodera that as long as the job results get successfully returned, we don't need to post abnormally long run times on the forum? Or maybe that pertained to switching to the latest core. Can't remember now.

Cheers,

Curt
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: WU 13416 low ppd long run time

Post by ajm »

As a general rule, if the WUs are returned, the researchers will be able to see how they behaved and I think it's not necessary to list them here, at least not systematically. But it's okay too, I'd say.
rjccosta1
Posts: 7
Joined: Tue Jul 07, 2020 8:22 am

Re: WU 13416 low ppd long run time

Post by rjccosta1 »

Thanks for the clarification. Conclusion: some WUs are very slow and give low PPD. Fair enough. I thought there was something really wrong with my CPU-GPU configuration. You have all been very helpful. We live in times of so much hate and negativity that it is surprising to see so many people rowing in the same direction. What a great community we have here. 1-0 for humanity.
Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: WU 13416 low ppd long run time

Post by Ichbin3 »

I doubt that the scientists can see whether some WUs take an abnormally long time.
There are so many reasons why a WU can take longer that are independent of its structure, like different GPUs, underclocking, power limiting, pausing, gaming while the GPU is folding, ...
I would be curious to hear a statement from the scientists.
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WU 13416 low ppd long run time

Post by bruce »

Projects have a variety of goals and so do the researchers. As a general rule, John Chodera is very careful to check each WU and if it contained an error, he'll evaluate that information carefully. Other researchers may not be specifically analyzing each one so carefully, but they do pay attention.

As far as some WUs taking several hours, that's not surprising. Some projects assign WUs that take several days. A single WU that takes a week is more efficient than a series of 14 WUs that each take 12 hrs. That's why people talk about Points Per Day. The processing time for the Covid Moonshot projects may often be very short, but that's not their main objective.
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: WU 13416 low ppd long run time

Post by JohnChodera »

Thanks for the reports, folks! We shifted from SARS-CoV-1 to SARS-CoV-2 Mpro retrospective benchmarks for the new batch of runs we just loaded into 13416-7, and it seems that this has unexpectedly caused an increase in WU compute time. We've adjusted the base credit upwards to compensate while we investigate what is going on.

> As a general rule, if the WUs are returned, the researchers will be able to see how they behaved and I think it's not necessary to list them here, at least not systematically.

I can confirm this is the case. Anything that's uploaded is available for us to analyze, and we periodically check to identify the major sources of issues and pinpoint specific WUs that are problematic, but it's good to hear about systematic issues (like this) or issues that do NOT get uploaded!

We're still experiencing a much greater RUN-to-RUN variation than expected. Each RUN here is a different ligand for Mpro (either SARS-CoV-1 or 2). The systems are nearly identical in size (number of atoms), so it's the composition of the workload that is causing variation. We're not sure why yet, but we hope to improve this for the next batch of projects (13418-9).

These projects are helping us support the COVID Moonshot (http://covid.postera.ai/covid), so huge thanks again for helping out as we keep progressing toward more potent inhibitors, and with some luck, a molecule we can put into clinical trials in the next few months.

~ John Chodera // MSKCC
Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: WU 13416 low ppd long run time

Post by Ichbin3 »

Well then.
Thanks for the clarification.
Normal TPF here is 00:01:04, like in 13416 (452, 12, 0)
13416 (669, 132, 0) 2080TI TPF 00:01:25
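
To put the two TPF values above side by side, here is a minimal Python sketch. The helper `tpf_seconds` is a hypothetical name, and the only inputs are the two TPF strings quoted in this post.

```python
def tpf_seconds(tpf: str) -> int:
    """Convert a TPF string in HH:MM:SS form to seconds."""
    hours, minutes, seconds = (int(part) for part in tpf.split(":"))
    return hours * 3600 + minutes * 60 + seconds


normal = tpf_seconds("00:01:04")   # 13416 (452, 12, 0)
slow = tpf_seconds("00:01:25")     # 13416 (669, 132, 0)
slowdown = slow / normal - 1       # fractional increase in time per frame

print(f"normal {normal} s, slow {slow} s, slowdown {slowdown:.0%}")
```

So the slower run needs roughly a third more time per frame on the same card.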
BobHehmann
Posts: 2
Joined: Wed May 06, 2020 6:52 pm

Re: WU 13416 low ppd long run time

Post by BobHehmann »

Just experienced this (project:13416 run:1051 clone:146 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af81adfa8240). I noticed on my GPU stats monitor that the GPU usage % was hovering around 62-64% for this WU, whereas I normally see GPU usage > 90%. GPU power draw was commensurately low, while all other GPU stats looked nominal. The GPU is a 2070 Super, running alongside an AMD 3900X cpu, also folding away. CPU utilization looked entirely normal for folding, while 13416 was running slowly on the GPU. Combined CPU & GPU "Total Estimated Points Per Day" was sitting around 1.5M while this WU ran (jogged?) - I'm usually showing 2.3-2.4M ppd CPU/GPU this week. Sitting at 2.5M ppd right now as I type.

I also found that another 13416 instance from earlier today crashed and restarted several times before finally giving up. The "slow" instance was the next WU I received, and it also crashed and restarted once during its extended run, but it did eventually complete successfully. I seem to recall several other 13416-related crashes over the last couple of days. Normally my GPU folding is rock solid. My OS is Win10 Pro 1909, with all patch levels current. I still have the relevant log files, so let me know if anything from the logs could be of service.

Cheers, Bob
Shirty
Posts: 49
Joined: Thu Jul 11, 2019 10:19 pm

Re: WU 13416 low ppd long run time

Post by Shirty »

I too have noticed this behaviour across a mix of high-end Nvidia cards (species 7 & 8). I seem to be getting these WUs on the majority of my cards, and it's shaved over 4 million points off my daily average. As long as science is getting done I can cope with that for a while, but it'd be nice to get more WUs suited to the hardware I'm donating to the cause.
psaam0001
Posts: 378
Joined: Mon May 18, 2020 2:02 am
Location: Ruckersville, Virginia, USA

Re: WU 13416 low ppd long run time

Post by psaam0001 »

My only questions are: 1) Will the expiration times be less than 3.5 days for a Moonshot WU? And 2) Will dumped units for this be reassigned?

Paul
Last edited by psaam0001 on Wed Jul 08, 2020 10:52 pm, edited 1 time in total.
Breach
Posts: 220
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: WU 13416 low ppd long run time

Post by Breach »

Same observations, 13416 (1297, 133, 1) for example - PPD is lower than usual by about 25%.

I've noticed that my GPU is loaded at 75%. If I stop my CPU slot (using 6 out of 8 threads) it goes up to 82-85%. Could it be that my CPU bottlenecks the GPU slot here?

[Edit: Would have helped to read previous posts - it seems that's right, some WUs are just too CPU hungry]
Last edited by Breach on Wed Jul 08, 2020 9:52 pm, edited 2 times in total.
Windows 11 x64 / 9800X3D PBO / 32GB DDR5 6400 1:1 / 5090 FE / Sennheiser 650 / PSU Corsair AX1600i