12129 is cruel to small GPUs

appepi
Posts: 117
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3) HP Z4G4 (3) ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3) Dell GTX 1080 NVIDIA P1000 (3)
Location: Sydney Australia

Post by appepi »

I was startled to discover one of my Z440/RTX2060 combinations "Finishing" a WU with 7+ hours still to run long after the others had completed their 9 hours of running during the "off peak" electricity rates. I let Project 12129 ramble on through the day consuming expensive electrons, since it would be a pity to spoil a 99.8% completion rate and once is an accident, as they say. However, a check at https://folding.lar.systems/projects/fo ... file/12129 shows that it isn't an accident at all.

Lars reports it running on everything from a 5090 (1/2 hour) down to a P1000 (48 hours), with TU106 2060s averaging 8.5 hours. Clearly Project 12129 belongs in the playground with the Big GPUs and should not be allowed to bully the little ones. If it comes my way again and wants to eat my lunch outside off-peak times at more than twice the cost while paying only 60% of the usual RTX 2060 PPD rate, it will be getting a 15-hour pause in the middle. And even then only if I am in a good mood.
muziqaz
Posts: 1912
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Post by muziqaz »

That project has a deadline of a couple of days. As long as GPUs finish those projects within that timeout, they will always be allowed.
8.5h is nothing. You cannot expect every project to be finished within an hour.
FAH Omega tester

Post by appepi »

Well, that particular run - PRCG [12129,6,52,0] - is still happening for another 2 hours, and while it will bring me about 1M points, it works out at around 1.3M PPD for a device that usually does 2.2M PPD, and it will have run for about 16 hours by the time it finishes, so I will be perfectly happy to donate the next one to ease the current shortage of projects for big GPUs. The thought of magically expensive toys begging for work is just too sad. :(
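
For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison. The function names are made up for illustration. The 1.3M and 2.2M PPD figures are the rough ones quoted above, and the 2x cost multiplier stands in for the "more than twice the cost" estimate from the first post, so the results are ballpark only.

```python
def ppd(points: float, hours: float) -> float:
    """Points per day for a work unit that earned `points` over `hours` of run time."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return points * 24.0 / hours


def relative_rate(actual_ppd: float, usual_ppd: float) -> float:
    """Fraction of the usual PPD that this work unit delivers."""
    return actual_ppd / usual_ppd


def points_per_cost(relative_ppd: float, cost_multiplier: float) -> float:
    """Points earned per unit of electricity cost, relative to a normal off-peak run."""
    return relative_ppd / cost_multiplier


if __name__ == "__main__":
    # Figures quoted in the thread (approximate).
    rate = relative_rate(1.3e6, 2.2e6)
    print(f"Relative PPD: {rate:.0%}")
    # Assumption: peak power costs 2x off-peak ("more than twice" in the first post).
    print(f"Points per unit cost vs. normal: {points_per_cost(rate, 2.0):.0%}")
```

With those inputs the relative rate comes out just under 60%, and once the peak tariff is factored in, each unit of electricity buys roughly 30% of the points it would on a typical off-peak work unit.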

EDIT: The plot thickens. We seem to be stuck in a loop re-running the last 10% after a crash. If it continues I'll have to dump it.

Code:

******************************* Date: 2025-08-14 *******************************
11:44:51:WU00:FS00:0x24:Completed 3850000 out of 5000000 steps (77%)
11:55:00:WU00:FS00:0x24:Completed 3900000 out of 5000000 steps (78%)
12:04:52:WU00:FS00:0x24:Completed 3950000 out of 5000000 steps (79%)
12:14:52:WU00:FS00:0x24:Completed 4000000 out of 5000000 steps (80%)
12:14:55:WU00:FS00:0x24:Checkpoint completed at step 4000000
12:24:52:WU00:FS00:0x24:Completed 4050000 out of 5000000 steps (81%)
12:34:59:WU00:FS00:0x24:Completed 4100000 out of 5000000 steps (82%)
12:44:55:WU00:FS00:0x24:Completed 4150000 out of 5000000 steps (83%)
12:54:54:WU00:FS00:0x24:Completed 4200000 out of 5000000 steps (84%)
13:05:04:WU00:FS00:0x24:Completed 4250000 out of 5000000 steps (85%)
13:05:08:WU00:FS00:0x24:Checkpoint completed at step 4250000
13:14:49:WU00:FS00:0x24:Completed 4300000 out of 5000000 steps (86%)
13:24:47:WU00:FS00:0x24:Completed 4350000 out of 5000000 steps (87%)
13:34:30:WU00:FS00:0x24:Completed 4400000 out of 5000000 steps (88%)
13:44:21:WU00:FS00:0x24:Completed 4450000 out of 5000000 steps (89%)
13:54:16:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)
13:54:19:WU00:FS00:0x24:Checkpoint completed at step 4500000
14:04:06:WU00:FS00:0x24:Completed 4550000 out of 5000000 steps (91%)
14:14:18:WU00:FS00:0x24:Completed 4600000 out of 5000000 steps (92%)
14:20:07:WU00:FS00:0x24:An exception occurred at step 4627874: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
14:20:07:WU00:FS00:0x24:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
14:20:07:WU00:FS00:0x24:Folding@home Core Shutdown: CORE_RESTART
14:20:14:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
14:20:14:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
14:20:14:WU00:FS00:Starting
14:20:14:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/openmm-core-24/windows-10-64bit/release/0x24-8.1.4/Core_24.fah/FahCore_24.exe -dir 00 -suffix 01 -version 706 -lifeline 3636 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
14:20:14:WU00:FS00:Started FahCore on PID 37616
14:20:14:WU00:FS00:Core PID:31288
14:20:14:WU00:FS00:FahCore 0x24 started
14:20:15:WU00:FS00:0x24:*********************** Log Started 2025-08-14T14:20:15Z ***********************
14:20:17:WU00:FS00:0x24:*************************** Core24 Folding@home Core ***************************
14:20:17:WU00:FS00:0x24:       Core: Core24
14:20:17:WU00:FS00:0x24:       Type: 0x24
14:20:17:WU00:FS00:0x24:    Version: 8.1.4
14:20:17:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:20:17:WU00:FS00:0x24:  Copyright: 2022 foldingathome.org
14:20:17:WU00:FS00:0x24:   Homepage: https://foldingathome.org/
14:20:17:WU00:FS00:0x24:       Date: Jul 25 2024
14:20:17:WU00:FS00:0x24:       Time: 05:42:49
14:20:17:WU00:FS00:0x24:   Revision: cf9f0139862b8945a2091772770e4631aac37792
14:20:17:WU00:FS00:0x24:     Branch: HEAD
14:20:17:WU00:FS00:0x24:   Compiler: Visual C++
14:20:17:WU00:FS00:0x24:    Options: $( /TP $) /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2
14:20:17:WU00:FS00:0x24:             /Zc:throwingNew /MT -DOPENMM_VERSION="\"8.1.1\"" /Ox /std:c++14
14:20:17:WU00:FS00:0x24:   Platform: win32 10
14:20:17:WU00:FS00:0x24:       Bits: 64
14:20:17:WU00:FS00:0x24:       Mode: Release
14:20:17:WU00:FS00:0x24:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
14:20:17:WU00:FS00:0x24:             <peastman@stanford.edu>
14:20:17:WU00:FS00:0x24:       Args: -dir 00 -suffix 01 -version 706 -lifeline 37616 -checkpoint 15
14:20:17:WU00:FS00:0x24:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
14:20:17:WU00:FS00:0x24:             nvidia -gpu 0 -gpu-usage 100
14:20:17:WU00:FS00:0x24:************************************ libFAH ************************************
14:20:17:WU00:FS00:0x24:       Date: Jul 25 2024
14:20:17:WU00:FS00:0x24:       Time: 05:23:50
14:20:17:WU00:FS00:0x24:   Revision: c7d2824a47eb025fa8cda8968c7a5e971585d90c
14:20:17:WU00:FS00:0x24:     Branch: HEAD
14:20:17:WU00:FS00:0x24:   Compiler: Visual C++
14:20:17:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
14:20:17:WU00:FS00:0x24:   Platform: win32 10
14:20:17:WU00:FS00:0x24:       Bits: 64
14:20:17:WU00:FS00:0x24:       Mode: Release
14:20:17:WU00:FS00:0x24:************************************ CBang *************************************
14:20:17:WU00:FS00:0x24:    Version: 1.7.2
14:20:17:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:20:17:WU00:FS00:0x24:        Org: Cauldron Development LLC
14:20:17:WU00:FS00:0x24:  Copyright: Cauldron Development LLC, 2003-2024
14:20:17:WU00:FS00:0x24:   Homepage: https://cauldrondevelopment.com/
14:20:17:WU00:FS00:0x24:    License: LGPL-2.1-or-later
14:20:17:WU00:FS00:0x24:       Date: Jul 25 2024
14:20:17:WU00:FS00:0x24:       Time: 05:22:43
14:20:17:WU00:FS00:0x24:   Revision: f1cd4c791e8c40a35dcfeab3ab85d910949cc0cb
14:20:17:WU00:FS00:0x24:     Branch: HEAD
14:20:17:WU00:FS00:0x24:   Compiler: Visual C++
14:20:17:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
14:20:17:WU00:FS00:0x24:   Platform: win32 10
14:20:17:WU00:FS00:0x24:       Bits: 64
14:20:17:WU00:FS00:0x24:       Mode: Release
14:20:17:WU00:FS00:0x24:************************************ System ************************************
14:20:17:WU00:FS00:0x24:        CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
14:20:17:WU00:FS00:0x24:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
14:20:17:WU00:FS00:0x24:       CPUs: 12
14:20:17:WU00:FS00:0x24:     Memory: 31.91GiB
14:20:17:WU00:FS00:0x24:Free Memory: 20.43GiB
14:20:17:WU00:FS00:0x24: OS Version: 10.0
14:20:17:WU00:FS00:0x24:Has Battery: false
14:20:17:WU00:FS00:0x24: On Battery: false
14:20:17:WU00:FS00:0x24:   Hostname: Z442
14:20:17:WU00:FS00:0x24: UTC Offset: 10
14:20:17:WU00:FS00:0x24:        PID: 31288
14:20:17:WU00:FS00:0x24:        CWD: C:\ProgramData\FAHClient\work
14:20:17:WU00:FS00:0x24:       Exec: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
14:20:17:WU00:FS00:0x24:************************************ OpenMM ************************************
14:20:17:WU00:FS00:0x24:    Version: 8.1.1
14:20:17:WU00:FS00:0x24:********************************************************************************
14:20:17:WU00:FS00:0x24:Project: 12129 (Run 6, Clone 52, Gen 0)
14:20:17:WU00:FS00:0x24:Digital signatures verified
14:20:17:WU00:FS00:0x24:Folding@home GPU Core24 Folding@home Core
14:20:17:WU00:FS00:0x24:Version 8.1.4
14:20:17:WU00:FS00:0x24:  Checkpoint write interval: 250000 steps (5%) [20 total]
14:20:17:WU00:FS00:0x24:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
14:20:17:WU00:FS00:0x24:  XTC frame write interval: 25000 steps (0.5%) [200 total]
14:20:17:WU00:FS00:0x24:  TRR frame write interval: disabled
14:20:17:WU00:FS00:0x24:  Global context and integrator variables write interval: disabled
14:20:17:WU00:FS00:0x24:There are 4 platforms available.
14:20:17:WU00:FS00:0x24:Platform 0: Reference
14:20:17:WU00:FS00:0x24:Platform 1: CPU
14:20:17:WU00:FS00:0x24:Platform 2: OpenCL
14:20:17:WU00:FS00:0x24:  opencl-device 0 specified
14:20:17:WU00:FS00:0x24:Platform 3: CUDA
14:20:17:WU00:FS00:0x24:  cuda-device 0 specified
14:20:56:WU00:FS00:0x24:Attempting to create CUDA context:
14:20:56:WU00:FS00:0x24:  Configuring platform CUDA
14:21:02:WU00:FS00:0x24:  Using CUDA on CUDA Platform and gpu 0
14:21:02:WU00:FS00:0x24:  GPU info: Platform: CUDA
14:21:02:WU00:FS00:0x24:  GPU info: PlatformIndex: 0
14:21:02:WU00:FS00:0x24:  GPU info: Device: NVIDIA GeForce RTX 2060
14:21:02:WU00:FS00:0x24:  GPU info: DeviceIndex: 0
14:21:02:WU00:FS00:0x24:  GPU info: Vendor: 0x10de
14:21:02:WU00:FS00:0x24:  GPU info: PCI: 02:00:00
14:21:02:WU00:FS00:0x24:  GPU info: Compute: 7.5
14:21:02:WU00:FS00:0x24:  GPU info: Driver: 12.7
14:21:02:WU00:FS00:0x24:  GPU info: GPU: true
14:21:02:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)
14:30:49:WU00:FS00:0x24:Completed 4550000 out of 5000000 steps (91%)
14:41:10:WU00:FS00:0x24:Completed 4600000 out of 5000000 steps (92%)
14:50:59:WU00:FS00:0x24:Completed 4650000 out of 5000000 steps (93%)
15:01:09:WU00:FS00:0x24:Completed 4700000 out of 5000000 steps (94%)
15:11:19:WU00:FS00:0x24:An exception occurred at step 4748836: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
15:11:20:WU00:FS00:0x24:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
15:11:20:WU00:FS00:0x24:Folding@home Core Shutdown: CORE_RESTART
15:11:25:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:11:25:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
15:11:25:WU00:FS00:Starting
15:11:25:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/openmm-core-24/windows-10-64bit/release/0x24-8.1.4/Core_24.fah/FahCore_24.exe -dir 00 -suffix 01 -version 706 -lifeline 3636 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
15:11:25:WU00:FS00:Started FahCore on PID 33384
15:11:25:WU00:FS00:Core PID:28820
15:11:25:WU00:FS00:FahCore 0x24 started
15:11:26:WU00:FS00:0x24:*********************** Log Started 2025-08-14T15:11:26Z ***********************
15:11:26:WU00:FS00:0x24:*************************** Core24 Folding@home Core ***************************
15:11:26:WU00:FS00:0x24:       Core: Core24
15:11:26:WU00:FS00:0x24:       Type: 0x24
15:11:26:WU00:FS00:0x24:    Version: 8.1.4
15:11:26:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:11:26:WU00:FS00:0x24:  Copyright: 2022 foldingathome.org
15:11:26:WU00:FS00:0x24:   Homepage: https://foldingathome.org/
15:11:26:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:26:WU00:FS00:0x24:       Time: 05:42:49
15:11:26:WU00:FS00:0x24:   Revision: cf9f0139862b8945a2091772770e4631aac37792
15:11:26:WU00:FS00:0x24:     Branch: HEAD
15:11:26:WU00:FS00:0x24:   Compiler: Visual C++
15:11:26:WU00:FS00:0x24:    Options: $( /TP $) /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2
15:11:26:WU00:FS00:0x24:             /Zc:throwingNew /MT -DOPENMM_VERSION="\"8.1.1\"" /Ox /std:c++14
15:11:26:WU00:FS00:0x24:   Platform: win32 10
15:11:26:WU00:FS00:0x24:       Bits: 64
15:11:26:WU00:FS00:0x24:       Mode: Release
15:11:26:WU00:FS00:0x24:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:11:26:WU00:FS00:0x24:             <peastman@stanford.edu>
15:11:26:WU00:FS00:0x24:       Args: -dir 00 -suffix 01 -version 706 -lifeline 33384 -checkpoint 15
15:11:26:WU00:FS00:0x24:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
15:11:26:WU00:FS00:0x24:             nvidia -gpu 0 -gpu-usage 100
15:11:26:WU00:FS00:0x24:************************************ libFAH ************************************
15:11:26:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:26:WU00:FS00:0x24:       Time: 05:23:50
15:11:26:WU00:FS00:0x24:   Revision: c7d2824a47eb025fa8cda8968c7a5e971585d90c
15:11:26:WU00:FS00:0x24:     Branch: HEAD
15:11:26:WU00:FS00:0x24:   Compiler: Visual C++
15:11:26:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:11:26:WU00:FS00:0x24:   Platform: win32 10
15:11:26:WU00:FS00:0x24:       Bits: 64
15:11:26:WU00:FS00:0x24:       Mode: Release
15:11:26:WU00:FS00:0x24:************************************ CBang *************************************
15:11:26:WU00:FS00:0x24:    Version: 1.7.2
15:11:26:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:11:26:WU00:FS00:0x24:        Org: Cauldron Development LLC
15:11:26:WU00:FS00:0x24:  Copyright: Cauldron Development LLC, 2003-2024
15:11:26:WU00:FS00:0x24:   Homepage: https://cauldrondevelopment.com/
15:11:27:WU00:FS00:0x24:    License: LGPL-2.1-or-later
15:11:27:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:27:WU00:FS00:0x24:       Time: 05:22:43
15:11:27:WU00:FS00:0x24:   Revision: f1cd4c791e8c40a35dcfeab3ab85d910949cc0cb
15:11:27:WU00:FS00:0x24:     Branch: HEAD
15:11:27:WU00:FS00:0x24:   Compiler: Visual C++
15:11:27:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:11:27:WU00:FS00:0x24:   Platform: win32 10
15:11:27:WU00:FS00:0x24:       Bits: 64
15:11:27:WU00:FS00:0x24:       Mode: Release
15:11:27:WU00:FS00:0x24:************************************ System ************************************
15:11:27:WU00:FS00:0x24:        CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
15:11:27:WU00:FS00:0x24:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
15:11:27:WU00:FS00:0x24:       CPUs: 12
15:11:27:WU00:FS00:0x24:     Memory: 31.91GiB
15:11:27:WU00:FS00:0x24:Free Memory: 20.38GiB
15:11:27:WU00:FS00:0x24: OS Version: 10.0
15:11:27:WU00:FS00:0x24:Has Battery: false
15:11:27:WU00:FS00:0x24: On Battery: false
15:11:27:WU00:FS00:0x24:   Hostname: Z442
15:11:27:WU00:FS00:0x24: UTC Offset: 10
15:11:27:WU00:FS00:0x24:        PID: 28820
15:11:27:WU00:FS00:0x24:        CWD: C:\ProgramData\FAHClient\work
15:11:27:WU00:FS00:0x24:       Exec: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
15:11:27:WU00:FS00:0x24:************************************ OpenMM ************************************
15:11:27:WU00:FS00:0x24:    Version: 8.1.1
15:11:27:WU00:FS00:0x24:********************************************************************************
15:11:27:WU00:FS00:0x24:Project: 12129 (Run 6, Clone 52, Gen 0)
15:11:27:WU00:FS00:0x24:Digital signatures verified
15:11:27:WU00:FS00:0x24:Folding@home GPU Core24 Folding@home Core
15:11:27:WU00:FS00:0x24:Version 8.1.4
15:11:27:WU00:FS00:0x24:  Checkpoint write interval: 250000 steps (5%) [20 total]
15:11:27:WU00:FS00:0x24:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
15:11:27:WU00:FS00:0x24:  XTC frame write interval: 25000 steps (0.5%) [200 total]
15:11:27:WU00:FS00:0x24:  TRR frame write interval: disabled
15:11:27:WU00:FS00:0x24:  Global context and integrator variables write interval: disabled
15:11:27:WU00:FS00:0x24:There are 4 platforms available.
15:11:27:WU00:FS00:0x24:Platform 0: Reference
15:11:27:WU00:FS00:0x24:Platform 1: CPU
15:11:27:WU00:FS00:0x24:Platform 2: OpenCL
15:11:27:WU00:FS00:0x24:  opencl-device 0 specified
15:11:27:WU00:FS00:0x24:Platform 3: CUDA
15:11:27:WU00:FS00:0x24:  cuda-device 0 specified
15:12:04:WU00:FS00:0x24:Attempting to create CUDA context:
15:12:04:WU00:FS00:0x24:  Configuring platform CUDA
15:12:10:WU00:FS00:0x24:  Using CUDA on CUDA Platform and gpu 0
15:12:10:WU00:FS00:0x24:  GPU info: Platform: CUDA
15:12:10:WU00:FS00:0x24:  GPU info: PlatformIndex: 0
15:12:10:WU00:FS00:0x24:  GPU info: Device: NVIDIA GeForce RTX 2060
15:12:10:WU00:FS00:0x24:  GPU info: DeviceIndex: 0
15:12:10:WU00:FS00:0x24:  GPU info: Vendor: 0x10de
15:12:10:WU00:FS00:0x24:  GPU info: PCI: 02:00:00
15:12:10:WU00:FS00:0x24:  GPU info: Compute: 7.5
15:12:10:WU00:FS00:0x24:  GPU info: Driver: 12.7
15:12:10:WU00:FS00:0x24:  GPU info: GPU: true
15:12:10:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)
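
To put numbers on the repeated restarts, here is a rough sketch that scans log lines like the ones above and reports how many steps were thrown away at each crash. The function name and the dictionary keys are made up for illustration. The sample lines are copied from the log, and the 250000-step checkpoint interval is the one the core prints at startup.

```python
import re

CHECKPOINT_RE = re.compile(r"Checkpoint completed at step (\d+)")
EXCEPTION_RE = re.compile(r"An exception occurred at step (\d+)")

SAMPLE_LOG = """\
13:54:19:WU00:FS00:0x24:Checkpoint completed at step 4500000
14:14:18:WU00:FS00:0x24:Completed 4600000 out of 5000000 steps (92%)
14:20:07:WU00:FS00:0x24:An exception occurred at step 4627874: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
14:21:02:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)
15:01:09:WU00:FS00:0x24:Completed 4700000 out of 5000000 steps (94%)
15:11:19:WU00:FS00:0x24:An exception occurred at step 4748836: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
""".splitlines()


def lost_work(lines, checkpoint_interval=250_000):
    """Return one record per crash: crash step, last checkpoint, and steps lost."""
    last_checkpoint = None
    crashes = []
    for line in lines:
        match = CHECKPOINT_RE.search(line)
        if match:
            last_checkpoint = int(match.group(1))
            continue
        match = EXCEPTION_RE.search(line)
        if match and last_checkpoint is not None:
            step = int(match.group(1))
            crashes.append({
                "step": step,
                "checkpoint": last_checkpoint,
                "lost": step - last_checkpoint,
                # How close the run was to writing its next checkpoint.
                "short_of_next": last_checkpoint + checkpoint_interval - step,
            })
    return crashes


if __name__ == "__main__":
    for crash in lost_work(SAMPLE_LOG):
        print(crash)
```

On the sample lines, the first crash discards 127,874 steps. The second discards 248,836 and lands only 1,164 steps short of the checkpoint due at 4,750,000, which is why the run falls all the way back to 90% both times. The two exceptions occur at different steps, which is at least consistent with an intermittent fault rather than one bad step, though two samples prove little.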
toTOW
Site Moderator
Posts: 6469
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Post by toTOW »

Code:

15:11:19:WU00:FS00:0x24:An exception occurred at step 4748836: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
15:11:20:WU00:FS00:0x24:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
15:11:20:WU00:FS00:0x24:Folding@home Core Shutdown: CORE_RESTART
15:11:25:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:11:25:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
Does it match a GPU driver reset in syslog / Windows Event Viewer?
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

Post by appepi »

@toTOW

I don't really know how or what to look for here, but Windows Event Viewer showed some events for FahCore_24. The headings are my typing; the rest is cut-and-pasted from Event Viewer.

WU is currently up to 98% and counting ...

Code:

Logged Application Error at 15/08/2025 12:20:07 AM (Sydney - UTC+10)
Compare FAH log "14:20:07:WU00:FS00:0x24:An exception occurred at step 4627874: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)"

Faulting application name: FahCore_24.exe, version: 0.0.0.0, time stamp: 0x66a1e5dd
Faulting module name: ucrtbase.dll, version: 10.0.19041.3636, time stamp: 0x81cf5d89
Exception code: 0xc0000409
Fault offset: 0x000000000007286e
Faulting process id: 0x8764
Faulting application start time: 0x01dc0cde272c9fec
Faulting application path: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
Faulting module path: C:\WINDOWS\System32\ucrtbase.dll
Report Id: 58f4efd0-86c8-44a9-b8dd-cc7b5accee39
Faulting package full name: 
Faulting package-relative application ID: 


Logged Application Error at 15/08/2025 1:11:20 AM (Sydney - UTC+10)
Compare FAH log "15:11:19:WU00:FS00:0x24:An exception occurred at step 4748836: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)"

Faulting application name: FahCore_24.exe, version: 0.0.0.0, time stamp: 0x66a1e5dd
Faulting module name: ucrtbase.dll, version: 10.0.19041.3636, time stamp: 0x81cf5d89
Exception code: 0xc0000409
Fault offset: 0x000000000007286e
Faulting process id: 0x7a38
Faulting application start time: 0x01dc0d268d3c5ba3
Faulting application path: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
Faulting module path: C:\WINDOWS\System32\ucrtbase.dll
Report Id: 0c3a8388-cf9a-44ec-b977-1bbe3d016da1
Faulting package full name: 
Faulting package-relative application ID:
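
The Event Viewer entries and the FAH log use different notations: local time versus UTC, hexadecimal versus decimal process IDs, and a signed versus unsigned exit code. Here is a small sketch for lining them up. The helper names are made up for illustration, the fixed UTC+10 offset is the one stated in the headings above, and treating "Faulting application start time" as a Windows FILETIME (100 ns ticks since 1601-01-01 UTC) is an assumption that happens to agree with the FAH log.

```python
from datetime import datetime, timedelta, timezone

SYDNEY = timezone(timedelta(hours=10))  # fixed offset, as given in the headings
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def event_time_to_utc(text: str) -> datetime:
    """Convert an Event Viewer local timestamp (dd/mm/yyyy h:mm:ss AM/PM) to UTC."""
    local = datetime.strptime(text, "%d/%m/%Y %I:%M:%S %p").replace(tzinfo=SYDNEY)
    return local.astimezone(timezone.utc)


def filetime_to_utc(hex_value: str) -> datetime:
    """Convert a hex FILETIME value to a UTC datetime (assumed format, see above)."""
    ticks = int(hex_value, 16)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def hex_to_int(hex_value: str) -> int:
    """Event Viewer shows process IDs in hex; the FAH log shows them in decimal."""
    return int(hex_value, 16)


def unsigned_exit_code(signed_code: int) -> str:
    """Render a signed 32-bit exit code the way Windows status codes are written."""
    return f"0x{signed_code & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    fmt = "%Y-%m-%d %H:%M:%S"
    print(event_time_to_utc("15/08/2025 1:11:20 AM").strftime(fmt))
    print(filetime_to_utc("0x01dc0d268d3c5ba3").strftime(fmt))
    print(hex_to_int("0x7a38"))
    print(unsigned_exit_code(-1073740791))
```

Run on the second entry, this gives 15:11:20 UTC for the error (one second after the exception in the FAH log), process ID 31288 (the "Core PID:31288" logged at 14:20:14), a start time of 14:20:14 UTC, and 0xc0000409 for the exit code FAHClient printed as -1073740791. These helpers only show that both logs describe the same FahCore_24.exe crashes. They do not show whether a GPU driver reset occurred.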

Post by appepi »

Now finished, all OK: slow upload (~230 MB), 886,194 points, and the RTX 2060 is on another project at its usual 2.1 MPPD. Summary extract from the 12129 log below.

Code:

15:03:33:WU01:FS00:Server responded WORK_ACK (400)
15:03:33:WU01:FS00:Final credit estimate, 68216.00 points
15:03:33:WU01:FS00:Cleaning up

15:03:39:WU00:FS00:Download 49.21%
15:03:45:WU00:FS00:Download 63.02%
15:03:51:WU00:FS00:Download 76.83%
15:03:56:WU00:FS00:Download complete
15:03:56:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:12129 run:6 clone:52 gen:0 core:0x24 unit:0x000000003400000006000000612f0000
15:03:56:WU00:FS00:Starting
15:03:56:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/openmm-core-24/windows-10-64bit/release/0x24-8.1.4/Core_24.fah/FahCore_24.exe -dir 00 -suffix 01 -version 706 -lifeline 3636 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
15:03:56:WU00:FS00:Started FahCore on PID 39704
15:03:56:WU00:FS00:Core PID:43484
15:03:56:WU00:FS00:FahCore 0x24 started
15:03:56:WU00:FS00:0x24:*********************** Log Started 2025-08-13T15:03:56Z ***********************
15:03:56:WU00:FS00:0x24:*************************** Core24 Folding@home Core ***************************
15:03:56:WU00:FS00:0x24:       Core: Core24
15:03:56:WU00:FS00:0x24:       Type: 0x24
15:03:56:WU00:FS00:0x24:    Version: 8.1.4
15:03:56:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:03:56:WU00:FS00:0x24:  Copyright: 2022 foldingathome.org
15:03:56:WU00:FS00:0x24:   Homepage: https://foldingathome.org/
15:03:56:WU00:FS00:0x24:       Date: Jul 25 2024
15:03:56:WU00:FS00:0x24:       Time: 05:42:49
15:03:56:WU00:FS00:0x24:   Revision: cf9f0139862b8945a2091772770e4631aac37792
15:03:56:WU00:FS00:0x24:     Branch: HEAD
15:03:56:WU00:FS00:0x24:   Compiler: Visual C++
15:03:56:WU00:FS00:0x24:    Options: $( /TP $) /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2
15:03:56:WU00:FS00:0x24:             /Zc:throwingNew /MT -DOPENMM_VERSION="\"8.1.1\"" /Ox /std:c++14
15:03:56:WU00:FS00:0x24:   Platform: win32 10
15:03:56:WU00:FS00:0x24:       Bits: 64
15:03:56:WU00:FS00:0x24:       Mode: Release
15:03:56:WU00:FS00:0x24:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:03:56:WU00:FS00:0x24:             <peastman@stanford.edu>
15:03:56:WU00:FS00:0x24:       Args: -dir 00 -suffix 01 -version 706 -lifeline 39704 -checkpoint 15
15:03:56:WU00:FS00:0x24:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
15:03:56:WU00:FS00:0x24:             nvidia -gpu 0 -gpu-usage 100
15:03:56:WU00:FS00:0x24:************************************ libFAH ************************************
15:03:56:WU00:FS00:0x24:       Date: Jul 25 2024
15:03:56:WU00:FS00:0x24:       Time: 05:23:50
15:03:56:WU00:FS00:0x24:   Revision: c7d2824a47eb025fa8cda8968c7a5e971585d90c
15:03:56:WU00:FS00:0x24:     Branch: HEAD
15:03:56:WU00:FS00:0x24:   Compiler: Visual C++
15:03:56:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:03:56:WU00:FS00:0x24:   Platform: win32 10
15:03:56:WU00:FS00:0x24:       Bits: 64
15:03:56:WU00:FS00:0x24:       Mode: Release
15:03:56:WU00:FS00:0x24:************************************ CBang *************************************
15:03:56:WU00:FS00:0x24:    Version: 1.7.2
15:03:56:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:03:56:WU00:FS00:0x24:        Org: Cauldron Development LLC
15:03:56:WU00:FS00:0x24:  Copyright: Cauldron Development LLC, 2003-2024
15:03:56:WU00:FS00:0x24:   Homepage: https://cauldrondevelopment.com/
15:03:56:WU00:FS00:0x24:    License: LGPL-2.1-or-later
15:03:56:WU00:FS00:0x24:       Date: Jul 25 2024
15:03:56:WU00:FS00:0x24:       Time: 05:22:43
15:03:56:WU00:FS00:0x24:   Revision: f1cd4c791e8c40a35dcfeab3ab85d910949cc0cb
15:03:56:WU00:FS00:0x24:     Branch: HEAD
15:03:56:WU00:FS00:0x24:   Compiler: Visual C++
15:03:56:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:03:56:WU00:FS00:0x24:   Platform: win32 10
15:03:56:WU00:FS00:0x24:       Bits: 64
15:03:56:WU00:FS00:0x24:       Mode: Release
15:03:56:WU00:FS00:0x24:************************************ System ************************************
15:03:56:WU00:FS00:0x24:        CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
15:03:56:WU00:FS00:0x24:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
15:03:56:WU00:FS00:0x24:       CPUs: 12
15:03:56:WU00:FS00:0x24:     Memory: 31.91GiB
15:03:56:WU00:FS00:0x24:Free Memory: 20.65GiB
15:03:56:WU00:FS00:0x24: OS Version: 10.0
15:03:56:WU00:FS00:0x24:Has Battery: false
15:03:56:WU00:FS00:0x24: On Battery: false
15:03:56:WU00:FS00:0x24:   Hostname: Z442
15:03:56:WU00:FS00:0x24: UTC Offset: 10
15:03:56:WU00:FS00:0x24:        PID: 43484
15:03:56:WU00:FS00:0x24:        CWD: C:\ProgramData\FAHClient\work
15:03:56:WU00:FS00:0x24:       Exec: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
15:03:56:WU00:FS00:0x24:************************************ OpenMM ************************************
15:03:56:WU00:FS00:0x24:    Version: 8.1.1
15:03:56:WU00:FS00:0x24:********************************************************************************
15:03:56:WU00:FS00:0x24:Project: 12129 (Run 6, Clone 52, Gen 0)
15:03:56:WU00:FS00:0x24:Reading tar file core.xml
15:03:56:WU00:FS00:0x24:Reading tar file integrator.xml
15:03:56:WU00:FS00:0x24:Reading tar file state.xml
15:03:58:WU00:FS00:0x24:Reading tar file system.xml
15:03:59:WU00:FS00:0x24:Digital signatures verified
15:03:59:WU00:FS00:0x24:Folding@home GPU Core24 Folding@home Core
15:03:59:WU00:FS00:0x24:Version 8.1.4
15:03:59:WU00:FS00:0x24:  Checkpoint write interval: 250000 steps (5%) [20 total]
15:03:59:WU00:FS00:0x24:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
15:03:59:WU00:FS00:0x24:  XTC frame write interval: 25000 steps (0.5%) [200 total]
15:03:59:WU00:FS00:0x24:  TRR frame write interval: disabled
15:03:59:WU00:FS00:0x24:  Global context and integrator variables write interval: disabled
15:04:00:WU00:FS00:0x24:There are 4 platforms available.
15:04:00:WU00:FS00:0x24:Platform 0: Reference
15:04:00:WU00:FS00:0x24:Platform 1: CPU
15:04:00:WU00:FS00:0x24:Platform 2: OpenCL
15:04:00:WU00:FS00:0x24:  opencl-device 0 specified
15:04:00:WU00:FS00:0x24:Platform 3: CUDA
15:04:00:WU00:FS00:0x24:  cuda-device 0 specified
15:04:41:WU00:FS00:0x24:Attempting to create CUDA context:
15:04:41:WU00:FS00:0x24:  Configuring platform CUDA
15:04:56:WU00:FS00:0x24:  Using CUDA on CUDA Platform and gpu 0
15:04:56:WU00:FS00:0x24:  GPU info: Platform: CUDA
15:04:56:WU00:FS00:0x24:  GPU info: PlatformIndex: 0
15:04:56:WU00:FS00:0x24:  GPU info: Device: NVIDIA GeForce RTX 2060
15:04:56:WU00:FS00:0x24:  GPU info: DeviceIndex: 0
15:04:56:WU00:FS00:0x24:  GPU info: Vendor: 0x10de
15:04:56:WU00:FS00:0x24:  GPU info: PCI: 02:00:00
15:04:56:WU00:FS00:0x24:  GPU info: Compute: 7.5
15:04:56:WU00:FS00:0x24:  GPU info: Driver: 12.7
15:04:56:WU00:FS00:0x24:  GPU info: GPU: true
15:04:56:WU00:FS00:0x24:Completed 0 out of 5000000 steps (0%)
15:04:57:WU00:FS00:0x24:Checkpoint completed at step 0
15:15:23:WU00:FS00:0x24:Completed 50000 out of 5000000 steps (1%)

...

15:01:09:WU00:FS00:0x24:Completed 4700000 out of 5000000 steps (94%)
15:11:19:WU00:FS00:0x24:An exception occurred at step 4748836: Error invoking kernel: CUDA_ERROR_UNKNOWN (999)
15:11:20:WU00:FS00:0x24:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
15:11:20:WU00:FS00:0x24:Folding@home Core Shutdown: CORE_RESTART
15:11:25:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:11:25:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
15:11:25:WU00:FS00:Starting
15:11:25:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/openmm-core-24/windows-10-64bit/release/0x24-8.1.4/Core_24.fah/FahCore_24.exe -dir 00 -suffix 01 -version 706 -lifeline 3636 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
15:11:25:WU00:FS00:Started FahCore on PID 33384
15:11:25:WU00:FS00:Core PID:28820
15:11:25:WU00:FS00:FahCore 0x24 started
15:11:26:WU00:FS00:0x24:*********************** Log Started 2025-08-14T15:11:26Z ***********************
15:11:26:WU00:FS00:0x24:*************************** Core24 Folding@home Core ***************************
15:11:26:WU00:FS00:0x24:       Core: Core24
15:11:26:WU00:FS00:0x24:       Type: 0x24
15:11:26:WU00:FS00:0x24:    Version: 8.1.4
15:11:26:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:11:26:WU00:FS00:0x24:  Copyright: 2022 foldingathome.org
15:11:26:WU00:FS00:0x24:   Homepage: https://foldingathome.org/
15:11:26:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:26:WU00:FS00:0x24:       Time: 05:42:49
15:11:26:WU00:FS00:0x24:   Revision: cf9f0139862b8945a2091772770e4631aac37792
15:11:26:WU00:FS00:0x24:     Branch: HEAD
15:11:26:WU00:FS00:0x24:   Compiler: Visual C++
15:11:26:WU00:FS00:0x24:    Options: $( /TP $) /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2
15:11:26:WU00:FS00:0x24:             /Zc:throwingNew /MT -DOPENMM_VERSION="\"8.1.1\"" /Ox /std:c++14
15:11:26:WU00:FS00:0x24:   Platform: win32 10
15:11:26:WU00:FS00:0x24:       Bits: 64
15:11:26:WU00:FS00:0x24:       Mode: Release
15:11:26:WU00:FS00:0x24:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:11:26:WU00:FS00:0x24:             <peastman@stanford.edu>
15:11:26:WU00:FS00:0x24:       Args: -dir 00 -suffix 01 -version 706 -lifeline 33384 -checkpoint 15
15:11:26:WU00:FS00:0x24:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
15:11:26:WU00:FS00:0x24:             nvidia -gpu 0 -gpu-usage 100
15:11:26:WU00:FS00:0x24:************************************ libFAH ************************************
15:11:26:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:26:WU00:FS00:0x24:       Time: 05:23:50
15:11:26:WU00:FS00:0x24:   Revision: c7d2824a47eb025fa8cda8968c7a5e971585d90c
15:11:26:WU00:FS00:0x24:     Branch: HEAD
15:11:26:WU00:FS00:0x24:   Compiler: Visual C++
15:11:26:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:11:26:WU00:FS00:0x24:   Platform: win32 10
15:11:26:WU00:FS00:0x24:       Bits: 64
15:11:26:WU00:FS00:0x24:       Mode: Release
15:11:26:WU00:FS00:0x24:************************************ CBang *************************************
15:11:26:WU00:FS00:0x24:    Version: 1.7.2
15:11:26:WU00:FS00:0x24:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:11:26:WU00:FS00:0x24:        Org: Cauldron Development LLC
15:11:26:WU00:FS00:0x24:  Copyright: Cauldron Development LLC, 2003-2024
15:11:26:WU00:FS00:0x24:   Homepage: https://cauldrondevelopment.com/
15:11:27:WU00:FS00:0x24:    License: LGPL-2.1-or-later
15:11:27:WU00:FS00:0x24:       Date: Jul 25 2024
15:11:27:WU00:FS00:0x24:       Time: 05:22:43
15:11:27:WU00:FS00:0x24:   Revision: f1cd4c791e8c40a35dcfeab3ab85d910949cc0cb
15:11:27:WU00:FS00:0x24:     Branch: HEAD
15:11:27:WU00:FS00:0x24:   Compiler: Visual C++
15:11:27:WU00:FS00:0x24:    Options: $( /TP $) /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
15:11:27:WU00:FS00:0x24:   Platform: win32 10
15:11:27:WU00:FS00:0x24:       Bits: 64
15:11:27:WU00:FS00:0x24:       Mode: Release
15:11:27:WU00:FS00:0x24:************************************ System ************************************
15:11:27:WU00:FS00:0x24:        CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
15:11:27:WU00:FS00:0x24:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
15:11:27:WU00:FS00:0x24:       CPUs: 12
15:11:27:WU00:FS00:0x24:     Memory: 31.91GiB
15:11:27:WU00:FS00:0x24:Free Memory: 20.38GiB
15:11:27:WU00:FS00:0x24: OS Version: 10.0
15:11:27:WU00:FS00:0x24:Has Battery: false
15:11:27:WU00:FS00:0x24: On Battery: false
15:11:27:WU00:FS00:0x24:   Hostname: Z442
15:11:27:WU00:FS00:0x24: UTC Offset: 10
15:11:27:WU00:FS00:0x24:        PID: 28820
15:11:27:WU00:FS00:0x24:        CWD: C:\ProgramData\FAHClient\work
15:11:27:WU00:FS00:0x24:       Exec: C:\ProgramData\FAHClient\cores\cores.foldingathome.org\openmm-core-24\windows-10-64bit\release\0x24-8.1.4\Core_24.fah\FahCore_24.exe
15:11:27:WU00:FS00:0x24:************************************ OpenMM ************************************
15:11:27:WU00:FS00:0x24:    Version: 8.1.1
15:11:27:WU00:FS00:0x24:********************************************************************************
15:11:27:WU00:FS00:0x24:Project: 12129 (Run 6, Clone 52, Gen 0)
15:11:27:WU00:FS00:0x24:Digital signatures verified
15:11:27:WU00:FS00:0x24:Folding@home GPU Core24 Folding@home Core
15:11:27:WU00:FS00:0x24:Version 8.1.4
15:11:27:WU00:FS00:0x24:  Checkpoint write interval: 250000 steps (5%) [20 total]
15:11:27:WU00:FS00:0x24:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
15:11:27:WU00:FS00:0x24:  XTC frame write interval: 25000 steps (0.5%) [200 total]
15:11:27:WU00:FS00:0x24:  TRR frame write interval: disabled
15:11:27:WU00:FS00:0x24:  Global context and integrator variables write interval: disabled
15:11:27:WU00:FS00:0x24:There are 4 platforms available.
15:11:27:WU00:FS00:0x24:Platform 0: Reference
15:11:27:WU00:FS00:0x24:Platform 1: CPU
15:11:27:WU00:FS00:0x24:Platform 2: OpenCL
15:11:27:WU00:FS00:0x24:  opencl-device 0 specified
15:11:27:WU00:FS00:0x24:Platform 3: CUDA
15:11:27:WU00:FS00:0x24:  cuda-device 0 specified
15:12:04:WU00:FS00:0x24:Attempting to create CUDA context:
15:12:04:WU00:FS00:0x24:  Configuring platform CUDA
15:12:10:WU00:FS00:0x24:  Using CUDA on CUDA Platform and gpu 0
15:12:10:WU00:FS00:0x24:  GPU info: Platform: CUDA
15:12:10:WU00:FS00:0x24:  GPU info: PlatformIndex: 0
15:12:10:WU00:FS00:0x24:  GPU info: Device: NVIDIA GeForce RTX 2060
15:12:10:WU00:FS00:0x24:  GPU info: DeviceIndex: 0
15:12:10:WU00:FS00:0x24:  GPU info: Vendor: 0x10de
15:12:10:WU00:FS00:0x24:  GPU info: PCI: 02:00:00
15:12:10:WU00:FS00:0x24:  GPU info: Compute: 7.5
15:12:10:WU00:FS00:0x24:  GPU info: Driver: 12.7
15:12:10:WU00:FS00:0x24:  GPU info: GPU: true
15:12:10:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)
15:22:23:WU00:FS00:0x24:Completed 4550000 out of 5000000 steps (91%)
15:32:25:WU00:FS00:0x24:Completed 4600000 out of 5000000 steps (92%)
15:42:47:WU00:FS00:0x24:Completed 4650000 out of 5000000 steps (93%)
15:52:59:WU00:FS00:0x24:Completed 4700000 out of 5000000 steps (94%)
16:03:05:WU00:FS00:0x24:Completed 4750000 out of 5000000 steps (95%)
16:03:09:WU00:FS00:0x24:Checkpoint completed at step 4750000
16:13:34:WU00:FS00:0x24:Completed 4800000 out of 5000000 steps (96%)
16:23:47:WU00:FS00:0x24:Completed 4850000 out of 5000000 steps (97%)
16:34:06:WU00:FS00:0x24:Completed 4900000 out of 5000000 steps (98%)
16:44:17:WU00:FS00:0x24:Completed 4950000 out of 5000000 steps (99%)
16:54:30:WU00:FS00:0x24:Completed 5000000 out of 5000000 steps (100%)
16:54:30:WU00:FS00:0x24:Average performance: 14.0214 ns/day
16:54:31:WU01:FS00:Connecting to assign1.foldingathome.org:80
16:54:32:WU01:FS00:Assigned to work server 158.130.118.23

...


16:54:41:WU00:FS00:0x24:Saving result file ..\logfile_01.txt
16:54:41:WU00:FS00:0x24:Saving result file checkpointIntegrator.xml
16:54:41:WU00:FS00:0x24:Saving result file checkpointState.xml
16:54:46:WU00:FS00:0x24:Saving result file positions.xtc
16:54:56:WU00:FS00:0x24:Saving result file science.log
16:54:56:WU00:FS00:0x24:Folding@home Core Shutdown: FINISHED_UNIT
16:54:57:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
16:54:57:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:12129 run:6 clone:52 gen:0 core:0x24 unit:0x000000003400000006000000612f0000
16:54:57:WU00:FS00:Uploading 235.87MiB to 128.174.73.74
16:54:57:WU00:FS00:Connecting to 128.174.73.74:8080
16:55:06:WU00:FS00:Upload 0.08%
16:55:12:WU00:FS00:Upload 0.66%
16:55:18:WU00:FS00:Upload 2.04%
16:55:24:WU00:FS00:Upload 3.39%
16:55:30:WU00:FS00:Upload 4.82%
16:55:36:WU00:FS00:Upload 6.20%
16:55:42:WU00:FS00:Upload 7.66%
16:55:48:WU00:FS00:Upload 8.88%
16:55:54:WU00:FS00:Upload 10.15%
16:56:00:WU00:FS00:Upload 11.13%
16:56:06:WU00:FS00:Upload 12.45%
16:56:12:WU00:FS00:Upload 13.86%
16:56:18:WU00:FS00:Upload 14.81%
16:56:24:WU00:FS00:Upload 16.24%
16:56:30:WU00:FS00:Upload 17.62%
16:56:36:WU00:FS00:Upload 19.08%
16:56:42:WU00:FS00:Upload 20.46%
16:56:48:WU00:FS00:Upload 21.86%

...

17:02:43:WU00:FS00:Upload 97.83%
17:02:49:WU00:FS00:Upload 99.23%
17:03:03:WU00:FS00:Upload complete
17:03:03:WU00:FS00:Server responded WORK_ACK (400)
17:03:03:WU00:FS00:Final credit estimate, 886194.00 points
17:03:03:WU00:FS00:Cleaning up
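
The progress lines in the log above are enough to estimate how long the whole WU would take at this pace. The following is a minimal sketch, not part of any Folding@home tool: the function names `parse_progress` and `projected_total_seconds` are made up for illustration, the sample holds only three of the log's progress lines, and the projection assumes the GPU keeps a constant pace and that the excerpt crosses midnight at most once (the log timestamps carry no date).

```python
import re
from datetime import datetime, timedelta

# Matches FahCore progress lines such as
# "15:22:23:WU00:FS00:0x24:Completed 4550000 out of 5000000 steps (91%)"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9A-Fa-f]+:"
    r"Completed (\d+) out of (\d+) steps"
)


def parse_progress(lines):
    """Return (timestamp, steps_done, total_steps) for each progress line."""
    points = []
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            points.append((stamp, int(match.group(2)), int(match.group(3))))
    return points


def projected_total_seconds(lines):
    """Extrapolate the pace between the first and last progress line
    to the full step count of the work unit."""
    points = parse_progress(lines)
    if len(points) < 2:
        raise ValueError("need at least two progress lines")
    (t0, s0, total), (t1, s1, _) = points[0], points[-1]
    if s1 <= s0:
        raise ValueError("no progress between first and last line")
    elapsed = (t1 - t0).total_seconds()
    if elapsed < 0:  # assumption: at most one midnight rollover
        elapsed += 24 * 3600
    return elapsed * total / (s1 - s0)


# Three progress lines copied from the log above (resume at 90%, then 91% and 100%).
SAMPLE = [
    "15:12:10:WU00:FS00:0x24:Completed 4500000 out of 5000000 steps (90%)",
    "15:22:23:WU00:FS00:0x24:Completed 4550000 out of 5000000 steps (91%)",
    "16:54:30:WU00:FS00:0x24:Completed 5000000 out of 5000000 steps (100%)",
]

print(timedelta(seconds=projected_total_seconds(SAMPLE)))  # 17:03:20
```

On these figures the last 10% took 1 h 42 min, which extrapolates to roughly 17 hours for the full 5,000,000 steps. That is only a projection from the final tenth of the run, so treat it as a ballpark.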
BobWilliams757
Posts: 563
Joined: Fri Apr 03, 2020 2:22 pm
Hardware configuration: ASRock X370M PRO4
Ryzen 2400G APU
16 GB DDR4-3200
MSI GTX 1660 Super Gaming X

Re: 12129 is cruel to small GPUs

Post by BobWilliams757 »

It is large for some of the lesser GPUs, but people can always pause it and resume if money spent on electricity is a major concern.

It's over 11 hours on a 1660 Super, but since I fold 24/7 I just let them run.

Maybe my GPU is closer to the "sweet spot" for GPU cores, as I still get 1.3M PPD or so and all runs I have done recently were over a million points for the work unit.
Fold them if you get them!
appepi
Posts: 117
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3) HP Z4G4 (3) ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3) Dell GTX 1080 NVIDIA P1000 (3)
Location: Sydney Australia

Re: 12129 is cruel to small GPUs

Post by appepi »

Hi again BobWilliams757

1. Smaller and elderly (or vintage) for sure, but definitely not "lesser". 8-) The three 2060's have completed more than 6000 WUs each and no doubt they had busy and interesting lives before they ended up on eBay and became Folding workhorses in appepi's stable.

2. The previous 16-hour run was on Z442's 2060, and just now (5:30am) I caught Z443's 2060 entertaining another 12129, which was 53% complete with 7 hours yet to run, so I'm looking at a 14-hour burn of electricity at about 60% of the usual PPD. We have thus moved from "once is an accident" to "twice is a coincidence", but luckily this will only extend into "shoulder" rates so I am letting it run.

3. I can't see any obvious reason why the 1660 Super would get usual performance on this WU while my 2060's don't, but then I don't see very far into any of these mysteries.
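
The run-length estimate in point 2 is simple proportional arithmetic: if a fraction of the WU is done and the client reports the time still to go, the total is the remaining time divided by the unfinished fraction. A minimal sketch, assuming a constant pace; the function name `projected_total_hours` is hypothetical:

```python
def projected_total_hours(percent_done, hours_remaining):
    """Estimate the full run time of a WU from its progress so far.

    Assumes the remaining (100 - percent_done)% will run at the same
    pace as the client's remaining-time estimate implies.
    """
    if not 0 <= percent_done < 100:
        raise ValueError("percent_done must be in [0, 100)")
    return hours_remaining / (1 - percent_done / 100)


# Figures from point 2 above: 53% complete with 7 hours still to run.
total = projected_total_hours(53, 7)
print(f"about {total:.1f} hours in total")
```

With 53% done and 7 hours left this comes to a little under 15 hours, in line with the roughly 14-hour figure quoted above.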

For the moment it doesn't matter because this weekend I am letting the attic devices run 24/2. The Indian Ocean dipole is drowning Sydney in record-breaking rain, and I have set up a clothesline in the attic and am using the waste heat from Folding as a clothes dryer. Also I am testing a new pre-loved Z4G4 from eBay, as part of my Post-W10-EOS planning. It is folding on 12 logical CPUs of a Xeon W-2145 (8C 16T) using Ubuntu 24.04 LTS on an NVMe stick that I used for a while in Z442, so I am getting used to long runs for minimal returns (because "user" Z442u dumped too many GPU WUs during viewtopic.php?p=368871#p368871 ). :(

PS: But Z441 and Z442 are NOT in the attic and they were BOTH trudging through 12129's, so they are paused for 15 hours.
PPS: A person could easily get paranoid: What are the chances that 4 out of 5 devices will pick up a 12129 WU when all those high end GPUs are supposedly gnashing their orthodontics with eagerness to chomp WUs in a few yoctoseconds? All three 2060s are now paused for 15 hours, and the 4th device is a 1080, notionally to take 19 hours for 1.2 MPPD as against its usual 1.4-1.8. Will let it run just to collect the data point. :roll:
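
The PPD figures traded in this thread follow from two numbers: the credit for the WU and the time it took. A minimal sketch of the conversion in both directions, with hypothetical helper names. It treats credit as fixed, whereas the real credit also depends on how quickly the result is returned, so a paused WU will earn less than this suggests.

```python
def points_per_day(credit_points, run_hours):
    """Convert one WU's credit and run time into a points-per-day rate."""
    return credit_points * 24 / run_hours


def credit_for(ppd, run_hours):
    """Invert the relation: credit implied by a PPD rate over a run."""
    return ppd * run_hours / 24


# GTX 1080 figures from the PPS above: 19 hours at a notional 1.2 MPPD.
print(f"implied credit: {credit_for(1.2e6, 19):,.0f} points")

# The finished RTX 2060 WU in the earlier log was credited 886,194 points;
# taking the 16-hour run quoted in point 2 gives its PPD.
print(f"2060 rate: {points_per_day(886194, 16):,.0f} PPD")
```

On those inputs the 1080 estimate implies about 950,000 points for the WU, and the 2060 run works out to about 1.33 million PPD.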