7610 (151, 0, 0) and 7610 (147, 0, 0) Both EUEs
Posted: Wed Jun 15, 2011 10:13 am
Two PCs downloaded new cores this morning, started on 7610s, and both crashed with EUEs (see the log files below). Both PCs have run without fault for months, so I don't think the PCs are at fault and suspect bad units. Both PCs were shut down and restarted and are now running 7149 (14% completed) and 6950 (43% completed). I expected the 7610s to be picked up again, but both PCs went for new units. Can these two units be removed and examined for faults, please?
Thanks
Shunter
Logfile 7610 (151, 0, 0)
[03:16:38] Verifying core Core_a4.fah...
[03:16:38] Signature is VALID
[03:16:38]
[03:16:38] Trying to unzip core FahCore_a4.exe
[03:16:39] Decompressed FahCore_a4.exe (10057216 bytes) successfully
[03:16:44] + Core successfully engaged
[03:16:49]
[03:16:49] + Processing work unit
[03:16:49] Work type a4 not eligible for variable processors
[03:16:49] Core required: FahCore_a4.exe
[03:16:49] Core found.
[03:16:49] Working on queue slot 05 [June 15 03:16:49 UTC]
[03:16:49] + Working ...
[03:16:49] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a4.exe -dir work/ -suffix 05 -checkpoint 10 -forceasm -verbose -lifeline 3272 -version 629'
[03:16:49]
[03:16:49] *------------------------------*
[03:16:49] Folding@Home Gromacs GB Core
[03:16:49] Version 2.27 (Dec. 15, 2010)
[03:16:49]
[03:16:49] Preparing to commence simulation
[03:16:49] - Ensuring status. Please wait.
[03:16:49] Called DecompressByteArray: compressed_data_size=270125 data_size=644556, decompressed_data_size=644556 diff=0
[03:16:49] - Digital signature verified
[03:16:49]
[03:16:49] Project: 7610 (Run 151, Clone 0, Gen 0)
[03:16:49]
[03:16:49] Assembly optimizations on if available.
[03:16:49] Entering M.D.
[03:16:55] Mapping NT from 1 to 1
[03:16:55] Completed 0 out of 2000000 steps (0%)
[03:17:05] ed 0 out of 2000000 steps (0%)
[03:40:57] teps (1%)
[03:41:15] 0000 out of 2000000 steps (1%)
[04:04:56] teps (2%)
[04:05:16] 0000 out of 2000000 steps (2%)
[04:28:48] teps (3%)
[04:29:15] 0000 out of 2000000 steps (3%)
[04:52:43] teps (4%)
[04:53:17] 0000 out of 2000000 steps (4%)
[04:56:00] - Autosending finished units... [June 15 04:56:00 UTC]
[04:56:00] Trying to send all finished work units
[04:56:00] + No unsent completed units remaining.
[04:56:00] - Autosend completed
[05:16:57] steps (5%)
[05:17:34] 0000 out of 2000000 steps (5%)
[05:40:44] steps (6%)
[05:41:27] 0000 out of 2000000 steps (6%)
[06:04:44] steps (7%)
[06:05:32] 0000 out of 2000000 steps (7%)
[06:07:06] ave done -- stepsTotalG=2000000
[06:07:06] Work fraction=0.0697 steps=2000000.
[06:07:10] logfile size=12019 infoLength=12019 edr=0 trr=25
[06:07:10] logfile size: 12019 info=12019 bed=0 hdr=25
[06:07:10] - Writing 12557 bytes of core data to disk...
[06:07:10] Done: 12045 -> 4013 (compressed to 33.3 percent)
[06:07:10] ... Done.
[06:07:10]
[06:07:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[06:37:10] (compressed to 32.0 percent)
[06:37:10] ... Done.
[06:37:10]
[06:37:10] Folding@home Core Shutdown: EARLY_UNIT_END
[07:50:40] Killing all core threads
[07:50:40] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown at user request.
[07:50:40] ***** Got a SIGTERM signal (2)
[07:50:40] Killing all core threads
[07:50:40] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown.
Logfile 7610 (147, 0, 0)
[03:02:02] Verifying core Core_a4.fah...
[03:02:02] Signature is VALID
[03:02:02]
[03:02:02] Trying to unzip core FahCore_a4.exe
[03:02:04] Decompressed FahCore_a4.exe (10057216 bytes) successfully
[03:02:09] + Core successfully engaged
[03:02:14]
[03:02:14] + Processing work unit
[03:02:14] Work type a4 not eligible for variable processors
[03:02:14] Core required: FahCore_a4.exe
[03:02:14] Core found.
[03:02:14] Working on queue slot 09 [June 15 03:02:14 UTC]
[03:02:14] + Working ...
[03:02:14] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a4.exe -dir work/ -suffix 09 -checkpoint 15 -forceasm -verbose -lifeline 3104 -version 629'
[03:02:15]
[03:02:15] *------------------------------*
[03:02:15] Folding@Home Gromacs GB Core
[03:02:15] Version 2.27 (Dec. 15, 2010)
[03:02:15]
[03:02:15] Preparing to commence simulation
[03:02:15] - Assembly optimizations manually forced on.
[03:02:15] - Not checking prior termination.
[03:02:15] - Expanded 270054 -> 644556 (decompressed 238.6 percent)
[03:02:15] Called DecompressByteArray: compressed_data_size=270054 data_size=644556, decompressed_data_size=644556 diff=0
[03:02:15] - Digital signature verified
[03:02:15]
[03:02:15] Project: 7610 (Run 147, Clone 0, Gen 0)
[03:02:15]
[03:02:15] Assembly optimizations on if available.
[03:02:15] Entering M.D.
[03:02:21] Mapping NT from 1 to 1
[03:02:22] Completed 0 out of 2000000 steps (0%)
[03:44:17] Completed 20000 out of 2000000 steps (1%)
[04:26:08] Completed 40000 out of 2000000 steps (2%)
[04:32:24] mdrun returned 255
[04:32:24] Going to send back what have done -- stepsTotalG=2000000
[04:32:24] Work fraction=0.0214 steps=2000000.
[04:32:28] logfile size=10736 infoLength=10736 edr=0 trr=25
[04:32:28] logfile size: 10736 info=10736 bed=0 hdr=25
[04:32:28] - Writing 11274 bytes of core data to disk...
[04:32:28] Done: 10762 -> 3789 (compressed to 35.2 percent)
[04:32:28] ... Done.
[04:32:28]
[04:32:28] Folding@home Core Shutdown: EARLY_UNIT_END
[08:10:33] Killing all core threads
[08:10:33] Killing 2 cores
[08:10:33] Killing core 0
[08:10:33] Killing core 1
Folding@Home Client Shutdown at user request.
[08:10:33] ***** Got a SIGTERM signal (2)
[08:10:33] Killing all core threads
[08:10:33] Killing 2 cores
[08:10:33] Killing core 0
[08:10:33] Killing core 1
Folding@Home Client Shutdown.
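For anyone comparing logs like the two above, here is a minimal Python sketch that pulls out the Project (Run, Clone, Gen) line and any "Core Shutdown" status that follows it. The function name `summarize_log` and the regular expressions are my own assumptions based only on the line formats visible in these logs; this is not part of the Folding@home client and other client versions may format their lines differently.

```python
import re

# Patterns are assumptions derived from the log lines shown above.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def summarize_log(text):
    """Return one dict per work unit seen in the log text.

    Each dict holds the (project, run, clone, gen) tuple and the list of
    core shutdown statuses reported after that unit started.
    """
    units = []
    for line in text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            prcg = tuple(int(g) for g in match.groups())
            units.append({"prcg": prcg, "shutdowns": []})
            continue
        match = SHUTDOWN_RE.search(line)
        if match and units:
            # Attach the status to the most recently started unit.
            units[-1]["shutdowns"].append(match.group(1))
    return units


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py FAHlog.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for unit in summarize_log(handle.read()):
            print(unit["prcg"], unit["shutdowns"])
```

Statuses are attached to the most recent Project line, so a log covering several units keeps them separate. Shutdown lines that appear before any Project line are ignored.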