Downtime between WUs?
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760W Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Downtime between WUs?
Was a ram disk a solution you found with Google?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Downtime between WUs?
7im wrote:Was a ram disk a solution you found with Google?
In passing, mostly. It was reasonable to try it first since it was a very simple solution. Will post here later on how it went.
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Downtime between WUs?
EDIT: Not wanting to sound extreme, but is there any way to get rid of the 10 second sleep period that follows a finished folding?
Using a RAM disk indeed solves the issue and leads to streamlined performance...I would guess it shortened the overall transition period as well due to reduced I/O latency.
Code:
15:21:24:WU00:FS00:0xa4:Completed 490000 out of 500000 steps (98%)
15:23:07:WU00:FS00:0xa4:Completed 495000 out of 500000 steps (99%)
15:23:08:WU01:FS00:Connecting to assign3.stanford.edu:8080
15:23:10:WU01:FS00:News: Welcome to Folding@Home
15:23:10:WU01:FS00:Assigned to work server 171.67.108.58
15:23:10:WU01:FS00:Requesting new work unit for slot 00: RUNNING smp:12 from 171.67.108.58
15:23:10:WU01:FS00:Connecting to 171.67.108.58:8080
15:23:11:WU01:FS00:Downloading 478.52KiB
15:23:13:WU01:FS00:Download complete
15:23:13:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:OK project:8047 run:169 clone:12 gen:11 core:0xa4 unit:0x0000000f6652edca4fda1bedd712161d
15:23:13:WU01:FS00:Downloading project 8047 description
15:23:13:WU01:FS00:Connecting to fah-web.stanford.edu:80
15:23:14:WU01:FS00:Project 8047 description downloaded successfully
15:24:50:WU00:FS00:0xa4:Completed 500000 out of 500000 steps (100%)
15:24:51:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
15:25:01:WU00:FS00:0xa4:
15:25:01:WU00:FS00:0xa4:Finished Work Unit:
15:25:01:WU00:FS00:0xa4:- Reading up to 24721224 from "00/wudata_01.trr": Read 24721224
15:25:01:WU00:FS00:0xa4:trr file hash check passed.
15:25:01:WU00:FS00:0xa4:edr file hash check passed.
15:25:01:WU00:FS00:0xa4:logfile size: 25645
15:25:01:WU00:FS00:0xa4:Leaving Run
15:25:03:WU00:FS00:0xa4:- Writing 24754225 bytes of core data to disk...
15:25:06:WU00:FS00:0xa4:Done: 24753713 -> 19667030 (compressed to 79.4 percent)
15:25:06:WU00:FS00:0xa4: ... Done.
15:25:06:WU00:FS00:0xa4:- Shutting down core
15:25:06:WU00:FS00:0xa4:
15:25:06:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
15:25:06:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
15:25:06:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:7905 run:74 clone:0 gen:15 core:0xa4 unit:0x0000001100ac9c234e4d84c0ed2a54c3
15:25:06:WU00:FS00:Uploading 18.76MiB to 128.113.12.163
15:25:06:WU00:FS00:Connecting to 128.113.12.163:8080
15:25:06:WU01:FS00:Starting
15:25:06:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/ramdisk/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 701 -lifeline 2405 -checkpoint 30 -np 12
15:25:06:WU01:FS00:Started FahCore on PID 7801
15:25:06:Started thread 9 on PID 2405
15:25:06:WU01:FS00:Core PID:7805
15:25:06:WU01:FS00:FahCore 0xa4 started
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:*------------------------------*
15:25:07:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
15:25:07:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Preparing to commence simulation
15:25:07:WU01:FS00:0xa4:- Looking at optimizations...
15:25:07:WU01:FS00:0xa4:- Created dyn
15:25:07:WU01:FS00:0xa4:- Files status OK
15:25:07:WU01:FS00:0xa4:- Expanded 489490 -> 1142300 (decompressed 233.3 percent)
15:25:07:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=489490 data_size=1142300, decompressed_data_size=1142300 diff=0
15:25:07:WU01:FS00:0xa4:- Digital signature verified
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Project: 8047 (Run 169, Clone 12, Gen 11)
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Assembly optimizations on if available.
15:25:07:WU01:FS00:0xa4:Entering M.D.
15:25:12:WU00:FS00:Upload 28.99%
15:25:13:WU01:FS00:0xa4:Completed 0 out of 250000 steps (0%)
15:25:18:WU00:FS00:Upload 72.97%
15:25:25:WU00:FS00:Upload complete
15:25:25:WU00:FS00:Server responded WORK_ACK (400)
15:25:25:WU00:FS00:Final credit estimate, 4611.00 points
15:25:25:WU00:FS00:Cleaning up
15:25:27:WU01:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
15:25:42:WU01:FS00:0xa4:Completed 5000 out of 250000 steps (2%)
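For anyone who wants to measure the transition rather than eyeball it, here is a minimal Python sketch that reads the gap off a log excerpt. The sample lines are copied from the log above. The function name `transition_gap`, and the choice to measure from the last step report (100%) of one WU to the first step report (0%) of the next, are my own assumptions and not anything the client defines.

```python
import re
from datetime import datetime

# Excerpt copied from the log above (only the lines around the transition).
LOG = """\
15:24:50:WU00:FS00:0xa4:Completed 500000 out of 500000 steps (100%)
15:24:51:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
15:25:06:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
15:25:06:WU01:FS00:Starting
15:25:13:WU01:FS00:0xa4:Completed 0 out of 250000 steps (0%)
"""

# HH:MM:SS:WUnn:FSnn:<rest of message>
LINE = re.compile(r"^(\d\d:\d\d:\d\d):(WU\d+):FS\d+:(.*)$")
STEPS = re.compile(r"Completed (\d+) out of (\d+) steps")


def transition_gap(log_text):
    """Seconds from a WU's final step report to the next WU's first one.

    Hypothetical helper: returns None if no such pair is found.
    """
    finished_at = None
    for raw in log_text.splitlines():
        line = LINE.match(raw)
        if not line:
            continue
        stamp, _wu, rest = line.groups()
        steps = STEPS.search(rest)
        if not steps:
            continue
        done, total = int(steps.group(1)), int(steps.group(2))
        when = datetime.strptime(stamp, "%H:%M:%S")
        if done == total:
            finished_at = when          # previous WU hit 100%
        elif done == 0 and finished_at is not None:
            gap = (when - finished_at).total_seconds()
            if gap < 0:                 # log has no dates; assume midnight rollover
                gap += 24 * 3600
            return gap
    return None


print(transition_gap(LOG))
```

On this excerpt the gap runs from the 100% line at 15:24:50 to the 0% line at 15:25:13. The sleep=10000 pause (15:24:51 to 15:25:01) is only part of it; the rest is writing results and starting the next core.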
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760W Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Downtime between WUs?
You'll fold faster setting the checkpoint to 30 minutes than worrying about those 10 seconds you can't change.
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Downtime between WUs?
If I instead set the checkpoint interval to 0 in the config file (FAHControl only lets me go down to 3 minutes), will this turn it off altogether?
-
- Site Moderator
- Posts: 2850
- Joined: Mon Jul 18, 2011 4:44 am
- Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
- Location: Western Washington
Re: Downtime between WUs?
7im wrote:You'll fold faster setting the checkpoint to 30 minutes than worrying about those 10 seconds you can't change.
Makes sense, but of course the downside is that you may have to repeat 29 minutes of work if you resume from a checkpoint.
I have also noticed the 10 second delay. Simply out of curiosity, why is it there? What purpose does it serve?
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760W Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Downtime between WUs?
Jesse_V wrote:
7im wrote:You'll fold faster setting the checkpoint to 30 minutes than worrying about those 10 seconds you can't change.
Makes sense, but of course the downside is that you may have to repeat 29 minutes of work if you resume from a checkpoint.
I have also noticed the 10 second delay. Simply out of curiosity, why is it there? What purpose does it serve?
The downside is obvious. But anyone worrying about 10 seconds between work units obviously will NEVER turn the client or computer off! Otherwise even a single restart would negate many weeks of folding while trying to save those 10 seconds between work units, and so chasing those 10 seconds would be like pursuing an undomesticated waterfowl.
And again, the 10 seconds won't be changed, so csvanefalk will find it much more productive to find ways of never turning off the client or computer. And even if the 10 seconds could be changed, csvanefalk is still better off finding ways of never turning off the client or computer, like investing in a UPS.
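To put rough numbers on that trade-off, here is a minimal Python sketch. It takes the worst case mentioned earlier in the thread (29 minutes of repeated work after resuming from a 30-minute checkpoint) and asks how many 10-second pauses that equals. The function names are hypothetical, and the WU length of about three hours is an assumption, loosely based on WU00 in the log above advancing 1% in about 103 seconds; other projects run much shorter or longer.

```python
def transitions_to_break_even(lost_minutes, sleep_seconds=10):
    """How many WU transitions' worth of sleep one restart costs.

    lost_minutes: work repeated after resuming from the last checkpoint.
    sleep_seconds: fixed pause after each finished WU (10 s per the log).
    """
    return lost_minutes * 60 / sleep_seconds


def folding_days_to_break_even(lost_minutes, wu_hours, sleep_seconds=10):
    """Days of continuous folding needed to accumulate that many pauses.

    wu_hours is an assumed average WU duration, not a measured value.
    """
    transitions = transitions_to_break_even(lost_minutes, sleep_seconds)
    return transitions * wu_hours / 24


worst_case = transitions_to_break_even(29)           # 29 min = 1740 s
days = folding_days_to_break_even(29, wu_hours=3)    # assumed 3 h per WU
print(f"{worst_case:.0f} transitions, about {days:.1f} days of folding")
```

Under those assumptions a single worst-case restart costs as much as 174 pauses, or roughly three weeks of continuous folding at three hours per WU. With shorter WUs the break-even comes sooner, so treat the result as an order-of-magnitude estimate only.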

Last edited by 7im on Sun Jun 24, 2012 4:40 pm, edited 2 times in total.
-
- Site Admin
- Posts: 8002
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
- Location: W. MA
Re: Downtime between WUs?
Jesse_V wrote:I have also noticed the 10 second delay. Simply out of curiosity, why is it there? What purpose does it serve?
From prior version log messages, my assumption is that the delay is to allow all threads to complete and exit. It was more visible with older SMP cores, which used MPI for inter-thread communication and ran in separate processes.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760W Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Downtime between WUs?
Not all fahcore types and/or their threads shut down at the same rate. It's a shutdown/startup buffer between WUs, as Joe_H describes. There should be a ticket on this, if anyone is curious.
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Downtime between WUs?
A man's gotta optimize what a man's gotta optimize.
-
- Site Moderator
- Posts: 2850
- Joined: Mon Jul 18, 2011 4:44 am
- Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
- Location: Western Washington
Re: Downtime between WUs?
csvanefalk wrote:A man's gotta optimize what a man's gotta optimize.
If you're willing to make the effort, by all means please do, especially for F@h.

Re: Downtime between WUs?
The programmer put that 10 second pause in there for a reason. I don't know what it was, but I have a pretty good guess that somebody discovered that without it, there was a reasonably high probability that something could happen that would corrupt the WU as it was being written. Would you rather have an occasional WU be corrupted after reaching 100% and be discarded, or would you rather wait 10 seconds after WUs which are not corrupt?
Posting FAH's log:
How to provide enough info to get helpful support.