Project: 3062 (Run 5, Clone 177, Gen 30) Hang, Resume, Good
Posted: Mon Nov 10, 2008 3:45 am
This WU hung at 71 percent after a "long 1-4 interactions" warning. I restarted the client; it resumed the WU from the 71 percent checkpoint and completed it successfully.
Log up to the hang and shutdown:
[06:01:41] + Closed connections
[06:01:46]
[06:01:46] + Processing work unit
[06:01:46] Work type a1 not eligible for variable processors
[06:01:46] Core required: FahCore_a1.exe
[06:01:46] Core found.
[06:01:46] Using generic mpiexec calls
[06:01:46] Working on queue slot 09 [November 9 06:01:46 UTC]
[06:01:46] + Working ...
[06:01:46] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 09 -checkpoint 15 -verbose -lifeline 1296 -version 623'
[06:01:47]
[06:01:47] *------------------------------*
[06:01:47] Folding@Home Gromacs SMP Core
[06:01:47] Version 1.74 (March 10, 2007)
[06:01:47]
[06:01:47] Preparing to commence simulation
[06:01:47] - Ensuring status. Please wait.
[06:01:47] - Starting from initial work packet
[06:01:47]
[06:01:47] Project: 3062 (Run 5, Clone 177, Gen 30)
[06:01:47]
[06:01:47] Assembly optimizations on if available.
[06:01:47] Entering M.D.
[06:02:04] percent)
[06:02:04] - Starting from initial work packet
[06:02:04]
[06:02:04] Project: 30Entering M.D.
[06:02:04] ne 177, Gen 30)
[06:02:04]
[06:02:04] Entering M.D.
[06:02:04] work packet
[06:02:04]
[06:02:04] Project: 3062 (Run 5, Clone 177, Gen 30)
[06:02:04]
[06:02:04] Entering M.D.
[06:02:11] xtra SSE boost OK.
[06:02:11] Writing local files
[06:02:11] Completed 0 out of 5000000 steps (0 percent)
[06:12:57] Writing local files
[06:12:57] Completed 50000 out of 5000000 steps (1 percent)
{snip}
[18:42:40] Writing local files
[18:42:40] Completed 3550000 out of 5000000 steps (71 percent)
[18:52:06] Warning: long 1-4 interactions
[20:46:01] - Autosending finished units... [November 9 20:46:01 UTC]
[20:46:01] Trying to send all finished work units
[20:46:01] + No unsent completed units remaining.
[20:46:01] - Autosend completed
[20:48:31] Killing all core threads
[20:48:31] Killing 3 cores
[20:48:31] Killing core 0
[20:48:31] Killing core 1
[20:48:31] Killing core 2
Folding@Home Client Shutdown at user request.
[20:48:31] ***** Got a SIGTERM signal (2)
[20:48:31] Killing all core threads
[20:48:31] Killing 3 cores
[20:48:31] Killing core 0
[20:48:31] Killing core 1
[20:48:31] Killing core 2
Folding@Home Client Shutdown.
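In the log above, frames normally arrive about every 11 minutes, but nothing is written between the warning at 18:52:06 and the autosend at 20:46:01. A rough sketch of how that silent gap could be measured from a saved log follows. The function names (`parse`, `gap_after_warning`) are made up for illustration and are not part of the client, and it assumes every relevant line starts with a `[HH:MM:SS]` stamp.

```python
import re
from datetime import timedelta

# Matches the "[HH:MM:SS] message" prefix used by the client log.
STAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s?(.*)$")

def parse(lines):
    """Yield (seconds_since_midnight, message) for each timestamped line."""
    for line in lines:
        m = STAMP.match(line)
        if m:
            h, mi, s, msg = m.groups()
            yield int(h) * 3600 + int(mi) * 60 + int(s), msg

def gap_after_warning(lines, needle="long 1-4 interactions"):
    """Return the silent gap after the first matching warning, or None."""
    entries = list(parse(lines))
    for (t0, msg), (t1, _) in zip(entries, entries[1:]):
        if needle in msg:
            # The stamps carry no date, so wrap around midnight.
            return timedelta(seconds=(t1 - t0) % 86400)
    return None

sample = [
    "[18:42:40] Completed 3550000 out of 5000000 steps (71 percent)",
    "[18:52:06] Warning: long 1-4 interactions",
    "[20:46:01] - Autosending finished units... [November 9 20:46:01 UTC]",
]
print(gap_after_warning(sample))  # 1:53:55
```

A gap many times longer than the usual frame time, right after that warning, is what the hang looked like here.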
And resuming:
--- Opening Log file [November 9 20:48:57 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.23 Beta R1
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\SMP\Folding@Home Windows SMP Client V1.01
Executable: C:\SMP\Folding@Home Windows SMP Client V1.01\Folding@home-Win32-x86.exe
Arguments: -verbosity 9 -smp
[20:48:57] - Ask before connecting: No
[20:48:57] - User name: anko1 (Team 47815)
[20:48:57] - User ID: XXXXX
[20:48:57] - Machine ID: 1
[20:48:57]
[20:48:57] Loaded queue successfully.
[20:48:57]
[20:48:57] - Autosending finished units... [November 9 20:48:57 UTC]
[20:48:57] + Processing work unit
[20:48:57] Trying to send all finished work units
[20:48:57] Work type a1 not eligible for variable processors
[20:48:57] + No unsent completed units remaining.
[20:48:57] Core required: FahCore_a1.exe
[20:48:57] - Autosend completed
[20:48:57] Core found.
[20:48:57] Using generic mpiexec calls
[20:48:57] Working on queue slot 09 [November 9 20:48:57 UTC]
[20:48:57] + Working ...
[20:48:57] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 09 -checkpoint 15 -verbose -lifeline 4748 -version 623'
[20:48:57]
[20:48:57] *------------------------------*
[20:48:57] Folding@Home Gromacs SMP Core
[20:48:57] Version 1.74 (March 10, 2007)
[20:48:57]
[20:48:57] Preparing to commence simulation
[20:48:57] - Ensuring status. Please wait.
[20:49:14] - Looking at optimizations...
[20:49:14] - Working with standard loops on this execution.
[20:49:14] - Previous termination of core was improper.
[20:49:14] - Going to use standard loops.
[20:49:14] - Files status OK
[20:49:14] - Expanded 608114 -> 3260637 (decompressed 536.1 percent)
[20:49:14]
[20:49:14] Project: 3062 (Run 5, Clone 177, Gen 30)
[20:49:14]
[20:49:14] Entering M.D.
[20:49:20] Calling FAH init
[20:49:21] mbda5_99sb
[20:49:21] Writing local files
[20:49:21] Completed 3550000 out of 5000000 steps (71 percent)
[20:49:21] Extra SSE boost OK.
[20:49:21]
[20:49:21] Completed 3550000 out of 5000000 steps (71 percent)
[20:49:21] Extra SSE boost OK.
[21:00:11] Writing local files
[21:00:11] Completed 3600000 out of 5000000 steps (72 percent)
{snip}
[02:05:06] Writing local files
[02:05:06] Completed 5000000 out of 5000000 steps (100 percent)
[02:05:06] Writing final coordinates.
[02:05:07] Past main M.D. loop
[02:05:07] Will end MPI now
[02:06:07]
[02:06:07] Finished Work Unit:
[02:06:07] - Reading up to 517488 from "work/wudata_09.arc": Read 517488
[02:06:07] - Reading up to 981740 from "work/wudata_09.xtc": Read 981740
[02:06:07] goefile size: 0
[02:06:07] logfile size: 150269
[02:06:07] Leaving Run
[02:06:09] - Writing 1852525 bytes of core data to disk...
[02:06:09] ... Done.
[02:06:09] - Failed to delete work/wudata_09.sas
[02:06:09] - Failed to delete work/wudata_09.goe
[02:06:09] Warning: check for stray files
[02:06:09] - Shutting down core
[02:06:09]
[02:06:09] Folding@home Core Shutdown: FINISHED_UNIT
[02:06:09]
[02:06:09] Folding@home Core Shutdown: FINISHED_UNIT
[02:08:13] CoreStatus = 64 (100)
[02:08:13] Unit 9 finished with 76 percent of time to deadline remaining.
[02:08:14] Updated performance fraction: 0.810441
[02:08:14] Sending work to server
[02:08:14] Project: 3062 (Run 5, Clone 177, Gen 30)
[02:08:14] + Attempting to send results [November 10 02:08:14 UTC]
[02:08:14] - Reading file work/wuresults_09.dat from core
[02:08:14] (Read 1852525 bytes from disk)
[02:08:14] Connecting to http://171.64.65.63:8080/
[02:08:48] Posted data.
[02:08:48] Initial: 0000; - Uploaded at ~53 kB/s
[02:08:48] - Averaged speed for that direction ~51 kB/s
[02:08:48] + Results successfully sent
[02:08:48] Thank you for your contribution to Folding@Home.
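For comparison with the hang, the resumed run took 20:49:21 to 21:00:11 for one 50000-step frame. Here is a small sketch for pulling frame times out of the "Completed ... steps" lines. The `frame_times` helper is hypothetical, and it assumes the stamp format shown in the logs above.

```python
import re

# Progress lines look like: "[21:00:11] Completed 3600000 out of 5000000 steps (72 percent)"
FRAME = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps")

def frame_times(lines):
    """Return seconds elapsed between consecutive distinct progress lines."""
    points = []
    for line in lines:
        m = FRAME.match(line)
        if m:
            h, mi, s, done, _total = map(int, m.groups())
            point = (h * 3600 + mi * 60 + s, done)
            # The log repeats some progress lines; keep one per step count.
            if not points or points[-1][1] != done:
                points.append(point)
    return [
        (t1 - t0) % 86400  # no date in the stamp, so wrap around midnight
        for (t0, _), (t1, _) in zip(points, points[1:])
    ]

sample = [
    "[20:49:21] Completed 3550000 out of 5000000 steps (71 percent)",
    "[20:49:21] Completed 3550000 out of 5000000 steps (71 percent)",
    "[21:00:11] Completed 3600000 out of 5000000 steps (72 percent)",
]
print(frame_times(sample))  # [650]
```

That is 650 seconds per frame, so a stretch of nearly two hours without a progress line is far outside normal.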