In other words, it runs at full speed to completion, and then we get the following messages:
04:21:10:I1:WU57:Saving result file ../logfile_01.txt
04:21:10:I1:WU57:Saving result file checkpointIntegrator.xml
04:21:10:I1:WU57:Saving result file checkpointState.xml.bz2
04:21:10:I1:WU57:Saving result file positions.xtc
04:21:10:I1:WU57:Saving result file science.log
04:21:10:I1:WU57:Saving result file xtcAtoms.csv.bz2
04:21:10:I1:WU57:Folding@home Core Shutdown: FINISHED_UNIT
And then it just... sits there. nvidia-smi shows that the core is still "running" on the GPU, but at 0% usage. That timestamp indicates that this WU finished 9 hours ago. Restarting the client causes it to load a new WU, but the old WU now shows as failed.
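For anyone checking the "9 hours" figure: it only works out if the client log timestamps are in UTC while nvidia-smi prints local time. That is my assumption, not something the client states here, but it matches the "UTC Offset: -4" line in the core log. A small sketch of the arithmetic, using the 04:21:10 log stamp and the 09:23:37 nvidia-smi time from this post (the function name `hours_since` is made up for the example):

```python
from datetime import datetime, timedelta, timezone

def hours_since(log_hms, local_now, utc_offset_hours):
    """Hours between a UTC HH:MM:SS log stamp and a local wall-clock time.

    Assumes the log stamp is UTC and is at most 24 hours old.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    now_utc = local_now.replace(tzinfo=tz).astimezone(timezone.utc)
    h, m, s = (int(x) for x in log_hms.split(":"))
    stamp = now_utc.replace(hour=h, minute=m, second=s, microsecond=0)
    if stamp > now_utc:
        # The stamp has no date, so it must belong to the previous UTC day.
        stamp -= timedelta(days=1)
    return (now_utc - stamp).total_seconds() / 3600

# Values taken from this post: log stamp, nvidia-smi local time, UTC offset.
elapsed = hours_since("04:21:10", datetime(2025, 5, 6, 9, 23, 37), -4)
print(f"{elapsed:.2f} hours")  # prints "9.04 hours"
```

If the client were logging in local time instead, the gap would be closer to 5 hours, so the exact figure depends on that assumption.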
Code:
wes@deathstar:~$ nvidia-smi
Tue May 6 09:23:37 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03              Driver Version: 575.51.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro P2200                   On  |   00000000:01:00.0 Off |                  N/A |
| 44%   31C    P8              4W /   75W |     175MiB /   5120MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           69626      C   ...4bit-release-8.2.0/FahCore_26        170MiB |
+-----------------------------------------------------------------------------------------+
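The same check can be scripted instead of read off the table. The sketch below asks nvidia-smi for its CSV output and flags any FahCore process that is still resident while GPU utilization reads 0%. The function names are mine, it assumes a single GPU as on this machine, and it relies on the `--query-compute-apps` and `--query-gpu` CSV options behaving on your driver as they do in the nvidia-smi documentation. A single 0% sample is only a hint, not proof of a hang.

```python
import subprocess

def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows (csv,noheader,nounits)."""
    apps = []
    for line in csv_text.strip().splitlines():
        try:
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
            apps.append({"pid": int(pid), "name": name.strip(),
                         "mem_mib": int(mem)})
        except ValueError:
            continue  # skip rows that do not have the expected shape
    return apps

def idle_fahcores(apps, gpu_util_percent):
    """Return FahCore processes that are resident while the GPU is idle."""
    if gpu_util_percent != 0:
        return []
    return [a for a in apps if "FahCore" in a["name"]]

def check():
    """Query nvidia-smi once; assumes one GPU, so the first line is used."""
    fmt = "--format=csv,noheader,nounits"
    apps_csv = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory", fmt],
        capture_output=True, text=True, check=True).stdout
    util_csv = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", fmt],
        capture_output=True, text=True, check=True).stdout
    util = int(util_csv.strip().splitlines()[0])
    return idle_fahcores(parse_compute_apps(apps_csv), util)
```

Calling `check()` a few times, some minutes apart, and getting the same PID back each time would be a stronger sign that the core is stuck rather than just between steps.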
Code:
*********************** Log Started 2025-05-05T22:48:20Z ***********************
*************************** Core26 Folding@home Core ***************************
Core: Core26
Type: 0x26
Version: 8.2.0
Author: Joseph Coffland <joseph@cauldrondevelopment.com>
Copyright: 2022 foldingathome.org
Homepage: https://foldingathome.org/
Date: Jan 7 2025
Time: 00:35:47
Revision: 4f149b599caa4725076ef2de3b47c8d7ce725787
Branch: HEAD
Compiler: GNU 7.5.0
Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
-fdata-sections -O3 -funroll-loops -fno-pie
-DOPENMM_VERSION="\"8.2.0\""
Platform: linux 6.8.0-1017-azure
Bits: 64
Mode: Release
Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
<peastman@stanford.edu>
Args: -dir uqwzvdZ49x1rE-bkUJEYtIJ7mlXg7TVdLQDTIeJ-HbA -suffix 01
-version 8.4.9 -lifeline 3588 -gpu-uuid
4980b18d-392b-58c5-5ee6-07f03d1988f1 -gpu-platform cuda -gpu-vendor
nvidia -cuda-platform 0 -cuda-device 0
************************************ libFAH ************************************
Date: Jan 7 2025
Time: 00:29:24
Revision: c7d2824a47eb025fa8cda8968c7a5e971585d90c
Branch: HEAD
Compiler: GNU 7.5.0
Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
-fdata-sections -O3 -funroll-loops -fno-pie
Platform: linux 6.8.0-1017-azure
Bits: 64
Mode: Release
************************************ CBang *************************************
Version: 1.7.2
Author: Joseph Coffland <joseph@cauldrondevelopment.com>
Org: Cauldron Development LLC
Copyright: Cauldron Development LLC, 2003-2024
Homepage: https://cauldrondevelopment.com/
License: LGPL-2.1-or-later
Date: Jan 7 2025
Time: 00:28:59
Revision: f1cd4c791e8c40a35dcfeab3ab85d910949cc0cb
Branch: HEAD
Compiler: GNU 7.5.0
Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
-fdata-sections -O3 -funroll-loops -fno-pie -fPIC
Platform: linux 6.8.0-1017-azure
Bits: 64
Mode: Release
************************************ System ************************************
CPU: Intel(R) Xeon(R) E-2244G CPU @ 3.80GHz
CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
CPUs: 8
Memory: 31.07GiB
Free Memory: 28.63GiB
OS Version: 6.1
Has Battery: false
On Battery: false
Hostname: deathstar
UTC Offset: -4
PID: 69626
CWD: /var/lib/fah-client/work
Exec: /var/lib/fah-client/cores/openmm-core-26/centos-7.9.2009-64bit/release/fahcore-26-centos-7.9.2009-64bit-release-8.2.0/FahCore_26
************************************ OpenMM ************************************
Version: 8.2.0
********************************************************************************
Project: 18243 (Run 367, Clone 2, Gen 4)
Reading tar file core.xml
Reading tar file integrator.xml
Reading tar file state.xml.bz2
Reading tar file system.xml.bz2
Digital signatures verified
Folding@home GPU Core26 Folding@home Core
Version 8.2
Checkpoint write interval: 50000 steps (2%) [50 total]
JSON viewer frame write interval: 25000 steps (1%) [100 total]
XTC frame write interval: 10000 steps (0.4%) [250 total]
TRR frame write interval: disabled
Global context and integrator variables write interval: disabled
There are 4 platforms available.
Platform 0: Reference
Platform 1: CPU
Platform 2: OpenCL
Platform 3: CUDA
cuda-device 0 specified
Attempting to create CUDA context:
Configuring platform CUDA
Using CUDA on CUDA Platform and gpu 0
GPU info: Platform: CUDA
GPU info: PlatformIndex: 0
GPU info: Device: Quadro P2200
GPU info: DeviceIndex: 0
GPU info: Vendor: 0x10de
GPU info: PCI: 01:00:00
GPU info: Compute: 6.1
GPU info: Driver: 12.9
GPU info: GPU: true
Completed 0 out of 2500000 steps (0%)
Checkpoint completed at step 0
Completed 25000 out of 2500000 steps (1%)
Completed 50000 out of 2500000 steps (2%)
Code:
Completed 2500000 out of 2500000 steps (100%)
Average performance: 43.0923 ns/day
Checkpoint completed at step 2500000
Saving result file ../logfile_01.txt
Saving result file checkpointIntegrator.xml
Saving result file checkpointState.xml.bz2
Saving result file positions.xtc
Saving result file science.log
Saving result file xtcAtoms.csv.bz2
Folding@home Core Shutdown: FINISHED_UNIT
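The mismatch here is that the core log reports FINISHED_UNIT while the PID from the same log (69626) is still listed by nvidia-smi. A rough sketch of how that could be tested from the log alone, assuming a Linux/POSIX system; the helper names are hypothetical, and PIDs can be reused after a process exits, so a live PID is only suggestive:

```python
import os
import re

def core_log_status(log_text):
    """Return (pid, finished) extracted from a FahCore log.

    pid is taken from the 'PID:' line of the System section (None if absent);
    finished is True if the FINISHED_UNIT shutdown line is present.
    """
    pid_match = re.search(r"^\s*PID:\s*(\d+)\s*$", log_text, re.MULTILINE)
    finished = "Folding@home Core Shutdown: FINISHED_UNIT" in log_text
    return (int(pid_match.group(1)) if pid_match else None, finished)

def pid_alive(pid):
    """Check whether a process with this PID exists (signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def finished_but_running(log_text):
    """True when the log says the unit finished yet the core PID still exists."""
    pid, finished = core_log_status(log_text)
    return finished and pid is not None and pid_alive(pid)
```

Fed the log above on the affected machine, `finished_but_running` would be expected to return True for as long as the stale FahCore_26 process hangs around.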