Feature Request: Pause at next checkpoint
-
- Posts: 73
- Joined: Sat Mar 21, 2020 3:56 pm
Feature Request: Pause at next checkpoint
Hello Folding Team,
I use the pause feature a lot. Sometimes because I want to do something else on my PC that requires cycles, sometimes because I just want this corner in my basement to cool down. Either way, I would like a better way to pause so that the work done since the last checkpoint isn't lost and doesn't have to be redone. Currently I have the checkpoints (CPs) set to five minutes, but even 4:59 worth of waste seems unnecessary to me.
I would like for another button to be added, "Pause at next checkpoint". This would encompass the checkpoints of every folding slot.
This way I could click the button and futz around until everything is paused, and then do my thing.
Also there's a whole "my PC crashes when it runs at 100% for too long and it comes back reporting BAD WORK_UNIT" story that I'm not going to go into too much detail about.
-
- Posts: 523
- Joined: Fri Mar 23, 2012 5:16 pm
Re: Feature Request: Pause at next checkpoint
Agreed, I would like this too.
-
- Posts: 78
- Joined: Wed Mar 25, 2020 2:39 am
- Location: Canada
Re: Feature Request: Pause at next checkpoint
I think it checkpoints when you pause it. The UI will go back to the last whole percentage point, but when you unpause it the "Completed steps" will show that you've kept your progress since then.
-
- Posts: 73
- Joined: Sat Mar 21, 2020 3:56 pm
Re: Feature Request: Pause at next checkpoint
I wish that were the case, Frogging101, but check out my log file from when I arbitrarily paused, waited a minute, then unpaused.
0x22 went from 68% to 65% after unpausing.
Code: Select all
03:15:19:WU00:FS00:0xa7:Completed 15000 out of 125000 steps (12%)
03:15:43:WU01:FS01:0x22:Completed 670000 out of 1000000 steps (67%)
03:16:22:WU00:FS00:0xa7:Completed 16250 out of 125000 steps (13%)
03:16:39:WU01:FS01:0x22:Completed 680000 out of 1000000 steps (68%)
03:17:03:FS00:Paused
03:17:03:FS01:Paused
03:17:03:FS00:Shutting core down
03:17:03:FS01:Shutting core down
03:17:03:WU01:FS01:0x22:WARNING:Console control signal 1 on PID 3584
03:17:03:WU00:FS00:0xa7:WARNING:Console control signal 1 on PID 16184
03:17:03:WU01:FS01:0x22:Exiting, please wait. . .
03:17:03:WU00:FS00:0xa7:Exiting, please wait. . .
03:17:04:WU01:FS01:0x22:Folding@home Core Shutdown: INTERRUPTED
03:17:04:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
03:17:05:Removing old file 'configs/config-20200410-193411.xml'
03:17:05:Saving configuration to config.xml
03:17:05:<config>
03:17:05: <!-- Folding Core -->
03:17:05: <checkpoint v='5'/>
03:17:05:
03:17:05: <!-- Network -->
03:17:05: <proxy v=':8080'/>
03:17:05:
03:17:05: <!-- Slot Control -->
03:17:05: <power v='MEDIUM'/>
03:17:05:
03:17:05: <!-- User Information -->
03:17:05: <passkey v='********************************'/>
03:17:05: <team v='64'/>
03:17:05: <user v='Crawdaddy79'/>
03:17:05:
03:17:05: <!-- Folding Slots -->
03:17:05: <slot id='0' type='CPU'>
03:17:05: <paused v='true'/>
03:17:05: </slot>
03:17:05: <slot id='1' type='GPU'>
03:17:05: <paused v='true'/>
03:17:05: </slot>
03:17:05:</config>
03:17:05:WU00:FS00:0xa7:Folding@home Core Shutdown: INTERRUPTED
03:17:05:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:17:21:FS00:Unpaused
03:17:21:FS01:Unpaused
03:17:21:WU01:FS01:Starting
03:17:21:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\crawd\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 705 -lifeline 10740 -checkpoint 5 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
03:17:21:WU01:FS01:Started FahCore on PID 1776
03:17:21:WU01:FS01:Core PID:14812
03:17:21:WU01:FS01:FahCore 0x22 started
03:17:21:WU01:FS01:0x22:*********************** Log Started 2020-04-11T03:17:21Z ***********************
03:17:21:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
03:17:21:WU01:FS01:0x22: Type: 0x22
03:17:21:WU01:FS01:0x22: Core: Core22
03:17:21:WU01:FS01:0x22: Website: https://foldingathome.org/
03:17:21:WU01:FS01:0x22: Copyright: (c) 2009-2018 foldingathome.org
03:17:21:WU01:FS01:0x22: Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
03:17:21:WU01:FS01:0x22: <rafal.wiewiora@choderalab.org>
03:17:21:WU01:FS01:0x22: Args: -dir 01 -suffix 01 -version 705 -lifeline 1776 -checkpoint 5
03:17:21:WU01:FS01:0x22: -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
03:17:21:WU01:FS01:0x22: Config: <none>
03:17:21:WU01:FS01:0x22:************************************ Build *************************************
03:17:21:WU01:FS01:0x22: Version: 0.0.2
03:17:21:WU01:FS01:0x22: Date: Dec 6 2019
03:17:21:WU01:FS01:0x22: Time: 21:30:31
03:17:21:WU01:FS01:0x22: Repository: Git
03:17:21:WU01:FS01:0x22: Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
03:17:21:WU01:FS01:0x22: Branch: HEAD
03:17:21:WU01:FS01:0x22: Compiler: Visual C++ 2008
03:17:21:WU01:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
03:17:21:WU01:FS01:0x22: Platform: win32 10
03:17:21:WU01:FS01:0x22: Bits: 64
03:17:21:WU01:FS01:0x22: Mode: Release
03:17:21:WU01:FS01:0x22:************************************ System ************************************
03:17:21:WU01:FS01:0x22: CPU: AMD Ryzen 7 2700X Eight-Core Processor
03:17:21:WU01:FS01:0x22: CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
03:17:21:WU01:FS01:0x22: CPUs: 16
03:17:21:WU01:FS01:0x22: Memory: 31.95GiB
03:17:21:WU01:FS01:0x22:Free Memory: 23.59GiB
03:17:21:WU01:FS01:0x22: Threads: WINDOWS_THREADS
03:17:21:WU01:FS01:0x22: OS Version: 6.2
03:17:21:WU01:FS01:0x22:Has Battery: false
03:17:21:WU01:FS01:0x22: On Battery: false
03:17:21:WU01:FS01:0x22: UTC Offset: -4
03:17:21:WU01:FS01:0x22: PID: 14812
03:17:21:WU01:FS01:0x22: CWD: C:\Users\crawd\AppData\Roaming\FAHClient\work
03:17:21:WU01:FS01:0x22: OS: Windows 10 Home
03:17:21:WU01:FS01:0x22: OS Arch: AMD64
03:17:21:WU01:FS01:0x22:********************************************************************************
03:17:21:WU01:FS01:0x22:Project: 11745 (Run 0, Clone 2225, Gen 26)
03:17:21:WU01:FS01:0x22:Unit: 0x000000388ca304f15e67f104dec31f90
03:17:21:WU01:FS01:0x22:Digital signatures verified
03:17:21:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
03:17:21:WU01:FS01:0x22:Version 0.0.2
03:17:22:WU01:FS01:0x22: Found a checkpoint file
03:17:22:WU00:FS00:Starting
03:17:22:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\crawd\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 705 -lifeline 10740 -checkpoint 5 -np 14
03:17:22:WU00:FS00:Started FahCore on PID 7020
03:17:22:WU00:FS00:Core PID:15444
03:17:22:WU00:FS00:FahCore 0xa7 started
03:17:22:WU00:FS00:0xa7:*********************** Log Started 2020-04-11T03:17:22Z ***********************
03:17:22:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
03:17:22:WU00:FS00:0xa7: Type: 0xa7
03:17:22:WU00:FS00:0xa7: Core: Gromacs
03:17:22:WU00:FS00:0xa7: Args: -dir 00 -suffix 01 -version 705 -lifeline 7020 -checkpoint 5 -np 14
03:17:22:WU00:FS00:0xa7:************************************ CBang *************************************
03:17:22:WU00:FS00:0xa7: Date: Oct 26 2019
03:17:22:WU00:FS00:0xa7: Time: 01:38:25
03:17:22:WU00:FS00:0xa7: Revision: c46a1a011a24143739ac7218c5a435f66777f62f
03:17:22:WU00:FS00:0xa7: Branch: master
03:17:22:WU00:FS00:0xa7: Compiler: Visual C++ 2008
03:17:22:WU00:FS00:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
03:17:22:WU00:FS00:0xa7: Platform: win32 10
03:17:22:WU00:FS00:0xa7: Bits: 64
03:17:22:WU00:FS00:0xa7: Mode: Release
03:17:22:WU00:FS00:0xa7:************************************ System ************************************
03:17:22:WU00:FS00:0xa7: CPU: AMD Ryzen 7 2700X Eight-Core Processor
03:17:22:WU00:FS00:0xa7: CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
03:17:22:WU00:FS00:0xa7: CPUs: 16
03:17:22:WU00:FS00:0xa7: Memory: 31.95GiB
03:17:22:WU00:FS00:0xa7:Free Memory: 23.53GiB
03:17:22:WU00:FS00:0xa7: Threads: WINDOWS_THREADS
03:17:22:WU00:FS00:0xa7: OS Version: 6.2
03:17:22:WU00:FS00:0xa7:Has Battery: false
03:17:22:WU00:FS00:0xa7: On Battery: false
03:17:22:WU00:FS00:0xa7: UTC Offset: -4
03:17:22:WU00:FS00:0xa7: PID: 15444
03:17:22:WU00:FS00:0xa7: CWD: C:\Users\crawd\AppData\Roaming\FAHClient\work
03:17:22:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
03:17:22:WU00:FS00:0xa7: Version: 0.0.18
03:17:22:WU00:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:17:22:WU00:FS00:0xa7: Copyright: 2019 foldingathome.org
03:17:22:WU00:FS00:0xa7: Homepage: https://foldingathome.org/
03:17:22:WU00:FS00:0xa7: Date: Oct 26 2019
03:17:22:WU00:FS00:0xa7: Time: 01:52:30
03:17:22:WU00:FS00:0xa7: Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
03:17:22:WU00:FS00:0xa7: Branch: master
03:17:22:WU00:FS00:0xa7: Compiler: Visual C++ 2008
03:17:22:WU00:FS00:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
03:17:22:WU00:FS00:0xa7: Platform: win32 10
03:17:22:WU00:FS00:0xa7: Bits: 64
03:17:22:WU00:FS00:0xa7: Mode: Release
03:17:22:WU00:FS00:0xa7:************************************ Build *************************************
03:17:22:WU00:FS00:0xa7: SIMD: avx_256
03:17:22:WU00:FS00:0xa7:********************************************************************************
03:17:22:WU00:FS00:0xa7:Project: 13870 (Run 0, Clone 529, Gen 59)
03:17:22:WU00:FS00:0xa7:Unit: 0x000000440d5262775e764918e9059201
03:17:22:WU00:FS00:0xa7:Digital signatures verified
03:17:22:WU00:FS00:0xa7:Reducing thread count from 14 to 13 to avoid domain decomposition with large prime factor 7
03:17:22:WU00:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
03:17:22:WU00:FS00:0xa7:Calling: mdrun -s frame59.tpr -o frame59.trr -x frame59.xtc -e frame59.edr -cpi state.cpt -cpt 5 -nt 12
03:17:22:WU00:FS00:0xa7:Steps: first=7375000 total=125000
03:17:24:WU00:FS00:0xa7:Completed 17072 out of 125000 steps (13%)
03:17:36:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
03:17:41:WU01:FS01:0x22:Completed 650000 out of 1000000 steps (65%)
03:17:41:WU01:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
03:17:46:WU00:FS00:0xa7:Completed 17500 out of 125000 steps (14%)
03:18:06:Removing old file 'configs/config-20200410-204117.xml'
03:18:06:Saving configuration to config.xml
03:18:06:<config>
03:18:06: <!-- Folding Core -->
03:18:06: <checkpoint v='5'/>
03:18:06:
03:18:06: <!-- Network -->
03:18:06: <proxy v=':8080'/>
03:18:06:
03:18:06: <!-- Slot Control -->
03:18:06: <power v='MEDIUM'/>
03:18:06:
03:18:06: <!-- User Information -->
03:18:06: <passkey v='********************************'/>
03:18:06: <team v='64'/>
03:18:06: <user v='Crawdaddy79'/>
03:18:06:
03:18:06: <!-- Folding Slots -->
03:18:06: <slot id='0' type='CPU'/>
03:18:06: <slot id='1' type='GPU'/>
03:18:06:</config>
03:18:38:WU01:FS01:0x22:Completed 660000 out of 1000000 steps (66%)
03:18:49:WU00:FS00:0xa7:Completed 18750 out of 125000 steps (15%)
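For anyone who wants to measure the rollback rather than eyeball it, here is a minimal Python sketch that compares the last step count logged before a pause with the first one logged after the resume. It assumes the line formats seen in the log above; the function name and regular expressions are illustrative and not part of FAHClient. Because the core keeps working between its last progress line and the pause, the real loss is somewhat larger than what this reports.

```python
import re

# Matches progress lines such as:
# 03:16:39:WU01:FS01:0x22:Completed 680000 out of 1000000 steps (68%)
PROGRESS = re.compile(
    r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):0x[0-9a-f]+:"
    r"Completed (\d+) out of (\d+) steps"
)
# Matches slot state lines such as "03:17:03:FS01:Paused".
PAUSED = re.compile(r"^\d\d:\d\d:\d\d:(FS\d+):Paused$")


def steps_lost_on_pause(lines):
    """Return {slot: steps lost} for each slot that was paused and resumed."""
    last_seen = {}     # slot -> last step count reported
    before_pause = {}  # slot -> step count reported just before the pause
    lost = {}
    for raw in lines:
        line = raw.strip()
        paused = PAUSED.match(line)
        if paused:
            slot = paused.group(1)
            if slot in last_seen:
                before_pause[slot] = last_seen[slot]
            continue
        progress = PROGRESS.match(line)
        if progress:
            slot, steps = progress.group(2), int(progress.group(3))
            if slot in before_pause:
                # First progress line after the resume: compare with the
                # pre-pause value; never report a negative loss.
                lost[slot] = max(0, before_pause.pop(slot) - steps)
            last_seen[slot] = steps
    return lost


# Lines copied from the log above.
SAMPLE = """
03:16:22:WU00:FS00:0xa7:Completed 16250 out of 125000 steps (13%)
03:16:39:WU01:FS01:0x22:Completed 680000 out of 1000000 steps (68%)
03:17:03:FS00:Paused
03:17:03:FS01:Paused
03:17:24:WU00:FS00:0xa7:Completed 17072 out of 125000 steps (13%)
03:17:41:WU01:FS01:0x22:Completed 650000 out of 1000000 steps (65%)
""".strip().splitlines()

print(steps_lost_on_pause(SAMPLE))  # {'FS00': 0, 'FS01': 30000}
```

On these lines the GPU slot (FS01) comes back 30000 steps, or 3%, behind where it was, while the CPU slot (FS00) resumes ahead of its last logged value.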
-
- Posts: 78
- Joined: Wed Mar 25, 2020 2:39 am
- Location: Canada
Re: Feature Request: Pause at next checkpoint
Interesting. Your CPU slot did save, though:
Code: Select all
03:16:22:WU00:FS00:0xa7:Completed 16250 out of 125000 steps (13%)
03:17:03:FS00:Paused
[...]
03:17:21:FS00:Unpaused
03:17:22:WU00:FS00:Starting
03:17:22:WU00:FS00:0xa7:Steps: first=7375000 total=125000
03:17:24:WU00:FS00:0xa7:Completed 17072 out of 125000 steps (13%)
Also, I've not seen a WU lose more than 1% before. I thought it at least checkpointed at every %. How odd.
-
- Site Admin
- Posts: 7937
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Feature Request: Pause at next checkpoint
GPU WUs pause at set percentages that are set by the researcher when the project is set up. Depending on the project, typical values are every 2-5%.
CPU WUs running on the A7 core write out a checkpoint at the time interval set in the client through FAHControl. Depending on how it is paused, the A7 core will attempt to write a checkpoint then as well.
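As a rough illustration of what a percentage-based checkpoint interval means for a pause, the Python sketch below computes where a WU would resume and how much progress it would lose. The function names are made up for this example, and it assumes checkpoints fall exactly on multiples of the interval, which the thread does not confirm for any particular project.

```python
def resume_point(progress_pct, checkpoint_every_pct):
    """Progress (in %) of the latest checkpoint at or before progress_pct.

    Assumes checkpoints are written at exact multiples of the interval.
    """
    if checkpoint_every_pct <= 0:
        raise ValueError("checkpoint interval must be positive")
    completed_intervals = int(progress_pct // checkpoint_every_pct)
    return completed_intervals * checkpoint_every_pct


def progress_lost(progress_pct, checkpoint_every_pct):
    """Progress (in %) that has to be redone after pausing at progress_pct."""
    return progress_pct - resume_point(progress_pct, checkpoint_every_pct)


for interval in (2, 2.5, 5):
    print(interval, resume_point(68, interval), progress_lost(68, interval))
```

The 68% to 65% rollback in the log earlier in the thread is what this predicts for a 5% interval, although that only shows the numbers are consistent, not what the project was actually configured with.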
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
-
- Posts: 78
- Joined: Wed Mar 25, 2020 2:39 am
- Location: Canada
Re: Feature Request: Pause at next checkpoint
Joe_H wrote: GPU WUs pause at set percentages that are set by the researcher when the project is set up. Depending on the project, typical values are every 2-5%.
That figures, actually, since GPU computing generally processes data in large chunks for efficiency.
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
- Location: Land Of The Long White Cloud
Re: Feature Request: Pause at next checkpoint
Crawdaddy79 wrote: ...Sometimes because I want to do something else on my PC that requires cycles...
Generally speaking, if you're folding on the CPU, it will hardly affect whatever other tasks you're running, because folding runs at a very low priority while most other tasks run at a higher one. I understand if you want to pause the GPU slot, since you may encounter screen lag. You can reduce or eliminate the screen lag by disabling hardware acceleration in the application (if it's supported) and/or disabling Windows animations.
Crawdaddy79 wrote: ..."my PC crashes when it runs at 100% for too long and it comes back reporting BAD WORK_UNIT" story that I'm not going to go into too much detail about.
That's an indication of something else. I have folded on my CPU at 100% for months without issues. The only time I would restart was for the monthly Windows updates; apart from that, it would fold day and night without crashing. If you would like us to investigate, please do share your log and as many details as possible about your system setup and usage.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
-
- Posts: 523
- Joined: Fri Mar 23, 2012 5:16 pm
Re: Feature Request: Pause at next checkpoint
The other day I needed to pause the GPU slot for a minute because I had to move the laptop's power plug (the old battery isn't strong enough to sustain a full load), and I lost around 45 minutes of work.
-
- Posts: 146
- Joined: Sun Jul 30, 2017 8:40 pm
Re: Feature Request: Pause at next checkpoint
Hence my general recommendation: never ever pause a GPU slot.
-
- Posts: 523
- Joined: Fri Mar 23, 2012 5:16 pm
Re: Feature Request: Pause at next checkpoint
foldinghomealone2 wrote: Hence my general recommendation: never ever pause a GPU slot.
Yeah, no, that's not realistic.
-
- Posts: 73
- Joined: Sat Mar 21, 2020 3:56 pm
Re: Feature Request: Pause at next checkpoint
Crawdaddy79 wrote: ..."my PC crashes when it runs at 100% for too long and it comes back reporting BAD WORK_UNIT" story that I'm not going to go into too much detail about.
PantherX wrote: That's an indication of something else. I have folded on my CPU at 100% for months without issues. The only time I would restart was for the monthly Windows updates; apart from that, it would fold day and night without crashing. If you would like us to investigate, please do share your log and as many details as possible about your system setup and usage.
Agreed: the CPU impact is minimal, and the CPU is 100% reliable when pegged at 100%. It's the GPU that I have issues with. It seems to become unstable if it runs at 100% for too long (60-90 minutes is OK, 120+ minutes is not), but that's digressing.
Joe_H wrote: GPU WUs pause at set percentages that are set by the researcher when the project is set up. Depending on the project, typical values are every 2-5%.
CPU WUs running on the A7 core write out a checkpoint at the time interval set in the client through FAHControl. Depending on how it is paused, the A7 core will attempt to write a checkpoint then as well.
This is good info. Thank you.
Re: Feature Request: Pause at next checkpoint
foldinghomealone2 wrote: Hence my general recommendation: never ever pause a GPU slot.
iceman1992 wrote: Yeah, no, that's not realistic.
A log notification about hitting a savepoint would be cool.
CPU: Ryzen 9 3900X (1x21 CPUs) ~ GPU: nVidia GeForce GTX 1660 Super (Asus)
-
- Posts: 523
- Joined: Fri Mar 23, 2012 5:16 pm
Re: Feature Request: Pause at next checkpoint
foldinghomealone2 wrote: Hence my general recommendation: never ever pause a GPU slot.
iceman1992 wrote: Yeah, no, that's not realistic.
uyaem wrote: A log notification about hitting a savepoint would be cool.
That would be (I would guess) the easiest update that could solve this problem.
Re: Feature Request: Pause at next checkpoint
I haven't looked at any of the new COVID projects, but before that, when I checked, a checkpoint every 2.5% was pretty much normal. A safe bet would be to pause just after a multiple of 5%.
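That rule of thumb can be checked against the log by hand, or with a small Python helper like the sketch below. It only looks at the "Completed N out of M steps" lines the client already prints; the function name is hypothetical, and the idea that a checkpoint lands on every multiple of 5% is an assumption from this thread, not something the log confirms.

```python
import re

# Progress lines look like:
# 03:17:41:WU01:FS01:0x22:Completed 650000 out of 1000000 steps (65%)
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")


def just_hit_multiple(line, every_pct=5):
    """True if the progress line sits exactly on a multiple of every_pct %.

    every_pct must be a positive integer. Integer arithmetic is used so
    that the rounded percentage in the log does not matter.
    """
    match = PROGRESS.search(line)
    if not match:
        return False
    steps, total = int(match.group(1)), int(match.group(2))
    # steps / total is a multiple of every_pct percent exactly when
    # steps * 100 is divisible by total * every_pct.
    return steps > 0 and (steps * 100) % (total * every_pct) == 0


line = "03:17:41:WU01:FS01:0x22:Completed 650000 out of 1000000 steps (65%)"
if just_hit_multiple(line):
    print("Probably a reasonable moment to pause this slot")
```

Pausing right after such a line should keep the loss small if the assumption holds, but the only real confirmation is the step count reported after unpausing.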