Re: Downtime between WUs?
Posted: Sun Jun 24, 2012 12:54 pm
Was a ram disk a solution you found with Google?
Community driven support forum for Folding@home
https://foldingforum.org/
7im wrote: Was a ram disk a solution you found with Google?
In passing, mostly. It was reasonable to try it first since it was a very simple solution. Will post here later on how it went.
Code:
15:21:24:WU00:FS00:0xa4:Completed 490000 out of 500000 steps (98%)
15:23:07:WU00:FS00:0xa4:Completed 495000 out of 500000 steps (99%)
15:23:08:WU01:FS00:Connecting to assign3.stanford.edu:8080
15:23:10:WU01:FS00:News: Welcome to Folding@Home
15:23:10:WU01:FS00:Assigned to work server 171.67.108.58
15:23:10:WU01:FS00:Requesting new work unit for slot 00: RUNNING smp:12 from 171.67.108.58
15:23:10:WU01:FS00:Connecting to 171.67.108.58:8080
15:23:11:WU01:FS00:Downloading 478.52KiB
15:23:13:WU01:FS00:Download complete
15:23:13:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:OK project:8047 run:169 clone:12 gen:11 core:0xa4 unit:0x0000000f6652edca4fda1bedd712161d
15:23:13:WU01:FS00:Downloading project 8047 description
15:23:13:WU01:FS00:Connecting to fah-web.stanford.edu:80
15:23:14:WU01:FS00:Project 8047 description downloaded successfully
15:24:50:WU00:FS00:0xa4:Completed 500000 out of 500000 steps (100%)
15:24:51:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
15:25:01:WU00:FS00:0xa4:
15:25:01:WU00:FS00:0xa4:Finished Work Unit:
15:25:01:WU00:FS00:0xa4:- Reading up to 24721224 from "00/wudata_01.trr": Read 24721224
15:25:01:WU00:FS00:0xa4:trr file hash check passed.
15:25:01:WU00:FS00:0xa4:edr file hash check passed.
15:25:01:WU00:FS00:0xa4:logfile size: 25645
15:25:01:WU00:FS00:0xa4:Leaving Run
15:25:03:WU00:FS00:0xa4:- Writing 24754225 bytes of core data to disk...
15:25:06:WU00:FS00:0xa4:Done: 24753713 -> 19667030 (compressed to 79.4 percent)
15:25:06:WU00:FS00:0xa4: ... Done.
15:25:06:WU00:FS00:0xa4:- Shutting down core
15:25:06:WU00:FS00:0xa4:
15:25:06:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
15:25:06:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
15:25:06:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:7905 run:74 clone:0 gen:15 core:0xa4 unit:0x0000001100ac9c234e4d84c0ed2a54c3
15:25:06:WU00:FS00:Uploading 18.76MiB to 128.113.12.163
15:25:06:WU00:FS00:Connecting to 128.113.12.163:8080
15:25:06:WU01:FS00:Starting
15:25:06:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/ramdisk/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 701 -lifeline 2405 -checkpoint 30 -np 12
15:25:06:WU01:FS00:Started FahCore on PID 7801
15:25:06:Started thread 9 on PID 2405
15:25:06:WU01:FS00:Core PID:7805
15:25:06:WU01:FS00:FahCore 0xa4 started
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:*------------------------------*
15:25:07:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
15:25:07:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Preparing to commence simulation
15:25:07:WU01:FS00:0xa4:- Looking at optimizations...
15:25:07:WU01:FS00:0xa4:- Created dyn
15:25:07:WU01:FS00:0xa4:- Files status OK
15:25:07:WU01:FS00:0xa4:- Expanded 489490 -> 1142300 (decompressed 233.3 percent)
15:25:07:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=489490 data_size=1142300, decompressed_data_size=1142300 diff=0
15:25:07:WU01:FS00:0xa4:- Digital signature verified
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Project: 8047 (Run 169, Clone 12, Gen 11)
15:25:07:WU01:FS00:0xa4:
15:25:07:WU01:FS00:0xa4:Assembly optimizations on if available.
15:25:07:WU01:FS00:0xa4:Entering M.D.
15:25:12:WU00:FS00:Upload 28.99%
15:25:13:WU01:FS00:0xa4:Completed 0 out of 250000 steps (0%)
15:25:18:WU00:FS00:Upload 72.97%
15:25:25:WU00:FS00:Upload complete
15:25:25:WU00:FS00:Server responded WORK_ACK (400)
15:25:25:WU00:FS00:Final credit estimate, 4611.00 points
15:25:25:WU00:FS00:Cleaning up
15:25:27:WU01:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
15:25:42:WU01:FS00:0xa4:Completed 5000 out of 250000 steps (2%)
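For anyone who wants to measure the gap rather than eyeball it, here is a minimal Python sketch that reads timestamps out of log lines shaped like the ones above. The function names (`parse`, `seconds_between`) and the marker strings are my own choices for illustration and not part of the client. It assumes both lines fall on the same day, since the log only carries a time of day.

```python
import re
from datetime import datetime

# Line shape taken from the log above: HH:MM:SS:WUnn:FSnn:message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(.*)$")


def parse(line):
    """Return (time, work unit, message), or None for lines of another shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, wu, _slot, message = m.groups()
    return datetime.strptime(stamp, "%H:%M:%S"), wu, message


def seconds_between(lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first
    later line containing end_marker. Returns None if either is missing.
    No midnight rollover handling (assumption: same-day log)."""
    start = None
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        when, _wu, message = parsed
        if start is None:
            if start_marker in message:
                start = when
        elif end_marker in message:
            return (when - start).total_seconds()
    return None


# A few lines copied from the log above.
sample = [
    "15:24:50:WU00:FS00:0xa4:Completed 500000 out of 500000 steps (100%)",
    "15:24:51:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000",
    "15:25:01:WU00:FS00:0xa4:",
    "15:25:01:WU00:FS00:0xa4:Finished Work Unit:",
    "15:25:06:Started thread 9 on PID 2405",
    "15:25:13:WU01:FS00:0xa4:Completed 0 out of 250000 steps (0%)",
]

# Last step of the old WU to first step of the new one.
gap = seconds_between(sample, "(100%)", "Completed 0 out of")
# The wrapper's sleep on its own.
sleep = seconds_between(sample, "sleep=10000", "Finished Work Unit:")
print(gap, sleep)
```

On these lines the total gap comes out at 23 seconds, of which 10 seconds is the `sleep=10000` pause. Lines without a `WUnn:FSnn` prefix, such as the "Started thread" one, are skipped.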
7im wrote: You'll fold faster setting the checkpoint to 30 minutes than worrying about those 10 seconds you can't change.
Makes sense, but of course the downside is that you may have to repeat 29 minutes of work if you resume from a checkpoint.
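To put rough numbers on that trade-off, here is a small Python sketch of the work lost when a run is interrupted and resumes from the last checkpoint. The function names are made up for illustration. The average figure rests on an assumption that is not from this thread: that an interruption is equally likely at any point between two checkpoints. It also ignores whatever time the core spends writing each checkpoint.

```python
def worst_case_rework_minutes(checkpoint_interval_min):
    """Upper bound: the interruption hits just before the next checkpoint."""
    return checkpoint_interval_min


def expected_rework_minutes(checkpoint_interval_min):
    """Average loss per interruption, ASSUMING the interruption time is
    uniformly distributed between two checkpoints."""
    return checkpoint_interval_min / 2


# Example intervals in minutes; 30 is the value discussed in this thread.
for interval in (3, 15, 30):
    print(
        f"checkpoint every {interval} min: "
        f"average loss {expected_rework_minutes(interval)} min, "
        f"worst case just under {worst_case_rework_minutes(interval)} min"
    )
```

With a 30-minute checkpoint the average loss under that assumption is 15 minutes per interruption, so how much this matters depends on how often the machine is actually stopped or restarted mid-WU.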
I have also noticed the 10 second delay. Simply out of curiosity, why is it there? What purpose does it serve?
Jesse_V wrote: I have also noticed the 10 second delay. Simply out of curiosity, why is it there? What purpose does it serve?
Judging from log messages in prior versions, my assumption is that the delay is there to allow all threads to complete and exit. It was more visible with older SMP cores, which used MPI for communication between threads that ran in separate processes.
csvanefalk wrote: A man's gotta optimize what a man's gotta optimize.
If you're willing to make the effort, by all means please do, especially for F@h.