Project: 2665 (Run 2, Clone 39, Gen 30) *hung* at 99%
Posted: Sun Jul 20, 2008 9:42 am
OK guys, the above WU halted at 99%. I tried restarting the system, but it didn't resume. I was running the 64-bit SMP client from Notfred's folding CD. The log says the core shut down, and as far as I know there is no reason why this should have happened. Could a power spike have caused it?
The part of FAHlog.txt from the time it occurred is below, and the full log is at the bottom. Can anything be done to salvage this WU? I see there is no longer a FahCore_a1.exe in the folder on the USB flash drive. Is it removed when the core shuts down?
This is the log up to the time it shut down:
[01:15:03] Completed 237500 out of 250000 steps (95 percent)
[01:29:49] Writing local files
[01:29:49] Completed 240000 out of 250000 steps (96 percent)
[01:44:36] Writing local files
[01:44:37] Completed 242500 out of 250000 steps (97 percent)
[01:48:38] - Autosending finished units...
[01:48:38] Trying to send all finished work units
[01:48:38] + No unsent completed units remaining.
[01:48:38] - Autosend completed
[01:59:22] Writing local files
[01:59:23] Completed 245000 out of 250000 steps (98 percent)
[02:14:09] Writing local files
[02:14:09] Completed 247500 out of 250000 steps (99 percent)
[02:17:15]
[02:17:15] Folding@home Core Shutdown: INTERRUPTED
[07:48:38] - Autosending finished units...
[07:48:38] Trying to send all finished work units
[07:48:38] + No unsent completed units remaining.
[07:48:38] - Autosend completed
Additionally, this error message was displayed:
FahCore_a1.exe [549]: segfault at 9db840 rip 5cc669 rsp 407Fdd60 error 4
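For anyone reading that kernel line, here is a minimal sketch of how it can be picked apart. It assumes the standard x86 page-fault error code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode) and that the kernel prints the numbers in hex. The function name `decode_segfault` and the dictionary keys are made up for illustration. It only describes what kind of access faulted, not why the core crashed.

```python
import re

# Matches the message format shown in the post. Older x86-64 kernels print
# "rip"/"rsp"; newer ones print "ip"/"sp", so the leading "r" is optional.
SEGFAULT_RE = re.compile(
    r"(?P<name>\S+)\s*\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-fA-F]+) "
    r"r?ip (?P<ip>[0-9a-fA-F]+) r?sp (?P<sp>[0-9a-fA-F]+) "
    r"error (?P<err>[0-9a-fA-F]+)"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: error code is printed in hex
    return {
        "process": m.group("name"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


if __name__ == "__main__":
    msg = "FahCore_a1.exe [549]: segfault at 9db840 rip 5cc669 rsp 407Fdd60 error 4"
    print(decode_segfault(msg))
```

Under those assumptions, "error 4" reads as a user-mode read of an address that was not mapped.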
Any ideas - can this be salvaged or is it lost?
Full FAHlog.txt for info.
--- Opening Log file [July 19 02:48:37]
# SMP Client ##################################################################
###############################################################################
Folding@Home Client Version 6.02beta
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: /etc/folding/1
Executable: ./fah6
Arguments: -local -forceasm -verbosity 9 -smp
Warning:
By using the -forceasm flag, you are overriding
safeguards in the program. If you did not intend to
do this, please restart the program without -forceasm.
If work units are not completing fully (and particularly
if your machine is overclocked), then please discontinue
use of the flag.
[02:48:37] - Ask before connecting: No
[02:48:37] - User name: GreatGig (Team 132987)
[02:48:37] - User ID not found locally
[02:48:37] + Requesting User ID from server
[02:48:37] - Getting ID from AS:
[02:48:37] Connecting to http://assign.stanford.edu:8080/
[02:48:38] Posted data.
[02:48:38] Initial: 8F5D; - Received User ID = 5D8F39DD2CC5500F
[02:48:38] - Machine ID: 1
[02:48:38]
[02:48:38] Work directory not found. Creating...
[02:48:38] Could not open work queue, generating new queue...
[02:48:38] - Autosending finished units...
[02:48:38] Trying to send all finished work units
[02:48:38] + No unsent completed units remaining.
[02:48:38] - Autosend completed
[02:48:38] - Preparing to get new work unit...
[02:48:38] + Attempting to get work packet
[02:48:38] - Will indicate memory of 2001 MB
[02:48:38] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 11
[02:48:38] - Connecting to assignment server
[02:48:38] Connecting to http://assign.stanford.edu:8080/
[02:48:39] Posted data.
[02:48:39] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[02:48:39] + News From Folding@Home: Welcome to Folding@Home
[02:48:39] Loaded queue successfully.
[02:48:39] Connecting to http://171.64.65.64:8080/
[02:48:44] Posted data.
[02:48:44] Initial: 0000; - Receiving payload (expected size: 4818659)
[01:49:04] - Downloaded at ~4194302 kB/s
[01:49:04] - Averaged speed for that direction ~4194302 kB/s
[01:49:04] + Received work.
[01:49:04] + Closed connections
[01:49:04]
[01:49:04] + Processing work unit
[01:49:04] Core required: FahCore_a1.exe
[01:49:04] Core not found.
[01:49:04] - Core is not present or corrupted.
[01:49:04] - Attempting to download new core...
[01:49:04] + Downloading new core: FahCore_a1.exe
[01:49:04] Downloading core (/~pande/Linux/x86/Core_a1.fah from www.stanford.edu)
[01:49:04] Initial: AFDE; + 10240 bytes downloaded
*EDIT*
[01:49:14] Initial: 3A56; + 1484800 bytes downloaded
[01:49:14] Initial: D4FE; + 1490945 bytes downloaded
[01:49:14] Verifying core Core_a1.fah...
[01:49:14] Signature is VALID
[01:49:14]
[01:49:14] Trying to unzip core FahCore_a1.exe
[01:49:15] Decompressed FahCore_a1.exe (3625104 bytes) successfully
[01:49:15] + Core successfully engaged
[01:49:20]
[01:49:20] + Processing work unit
[01:49:20] Core required: FahCore_a1.exe
[01:49:20] Core found.
[01:49:20] Working on Unit 01 [July 19 01:49:20]
[01:49:20] + Working ...
[01:49:20] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 30 -forceasm -verbose -lifeline 525 -version 602'
[01:49:20]
[01:49:20] *------------------------------*
[01:49:20] Folding@Home Gromacs SMP Core
[01:49:20] Version 1.74 (November 27, 2006)
[01:49:20]
[01:49:20] Preparing to commence simulation
[01:49:20] - Ensuring status. Please wait.
[01:49:21] - Starting from initial work packet
[01:49:21]
[01:49:21] Project: 2665 (Run 2, Clone 39, Gen 30)
[01:49:21]
[01:49:21] Assembly optimizations on if available.
[01:49:21] Entering M.D.
[01:49:38] on if available.
[01:49:38] Entering M.D.
[01:49:45] G with glycosylations
[01:49:45] osylations
[01:49:45] Writing local files
[01:49:45] Extra SSE boost OK.
[01:49:46] al files
[01:49:46] Completed 0 out of 250000 steps (0 percent)
[02:04:34] Writing local files
[02:04:34] Completed 2500 out of 250000 steps (1 percent)
[02:19:23] Writing local files
[02:19:23] Completed 5000 out of 250000 steps (2 percent)
[02:34:11] Writing local files
*EDIT*
[01:15:03] Completed 237500 out of 250000 steps (95 percent)
[01:29:49] Writing local files
[01:29:49] Completed 240000 out of 250000 steps (96 percent)
[01:44:36] Writing local files
[01:44:37] Completed 242500 out of 250000 steps (97 percent)
[01:48:38] - Autosending finished units...
[01:48:38] Trying to send all finished work units
[01:48:38] + No unsent completed units remaining.
[01:48:38] - Autosend completed
[01:59:22] Writing local files
[01:59:23] Completed 245000 out of 250000 steps (98 percent)
[02:14:09] Writing local files
[02:14:09] Completed 247500 out of 250000 steps (99 percent)
[02:17:15]
[02:17:15] Folding@home Core Shutdown: INTERRUPTED
[07:48:38] - Autosending finished units...
[07:48:38] Trying to send all finished work units
[07:48:38] + No unsent completed units remaining.
[07:48:38] - Autosend completed
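For anyone who wants to pull the key facts out of a long FAHlog.txt like the one above, here is a rough sketch that finds the last "Completed ... steps" line and any "Core Shutdown" line. It relies only on the line formats visible in this log; the function name `summarize_log` and the file path in the usage part are hypothetical.

```python
import re

# Line formats taken from the log in this post.
PROGRESS_RE = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+) out of (\d+) steps"
)
SHUTDOWN_RE = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\] Folding@home Core Shutdown: (\w+)"
)


def summarize_log(lines):
    """Return (last_progress, shutdown).

    last_progress is (timestamp, steps_done, steps_total) or None.
    shutdown is (timestamp, status) for the last shutdown line, or None.
    """
    last_progress = None
    shutdown = None
    for raw in lines:
        line = raw.strip()
        m = PROGRESS_RE.match(line)
        if m:
            last_progress = (m.group(1), int(m.group(2)), int(m.group(3)))
            continue
        m = SHUTDOWN_RE.match(line)
        if m:
            shutdown = (m.group(1), m.group(2))
    return last_progress, shutdown


if __name__ == "__main__":
    # Hypothetical path; point it at your own client folder.
    with open("FAHlog.txt", encoding="utf-8", errors="replace") as fh:
        progress, shutdown = summarize_log(fh)
    print("last progress:", progress)
    print("core shutdown:", shutdown)
```

Run against the excerpt above, it would report the last progress line at 247500 of 250000 steps and a shutdown status of INTERRUPTED about three minutes later.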