Project: 5775 (Run 4, Clone 12, Gen 35) : UNSTABLE_MACHINE
Posted: Sun Mar 29, 2009 4:41 pm
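This work unit failed 5 times: four times with "NANs detected on GPU" and once with "Self-test failure". After the fifth failure the client hit the EUE limit and paused for 24 hours.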
[10:58:01] Project: 5775 (Run 4, Clone 12, Gen 35)
[10:58:01]
[10:58:01] Assembly optimizations on if available.
[10:58:01] Entering M.D.
[10:58:07] Working on Protein
[10:58:09] Client config found, loading data.
[10:58:09] Starting GUI Server
[10:59:48] Completed 1%
[10:59:48] mdrun_gpu returned
[10:59:48] NANs detected on GPU
[10:59:48]
[10:59:48] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:59:51] CoreStatus = 7A (122)
[10:59:51] Sending work to server
[10:59:51] Project: 5775 (Run 4, Clone 12, Gen 35)
[10:59:51] - Read packet limit of 540015616... Set to 524286976.
[10:59:51] - Error: Could not get length of results file work/wuresults_05.dat
[10:59:51] - Error: Could not read unit 05 file. Removing from queue.
[...]
[10:59:59] Project: 5775 (Run 4, Clone 12, Gen 35)
[10:59:59]
[10:59:59] Assembly optimizations on if available.
[10:59:59] Entering M.D.
[11:00:05] Working on Protein
[11:00:07] Client config found, loading data.
[11:00:07] Starting GUI Server
[11:01:46] Completed 1%
[11:01:46] mdrun_gpu returned
[11:01:46] NANs detected on GPU
[11:01:46]
[11:01:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:01:49] CoreStatus = 7A (122)
[11:01:49] Sending work to server
[11:01:49] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:01:49] - Read packet limit of 540015616... Set to 524286976.
[11:01:49] - Error: Could not get length of results file work/wuresults_06.dat
[11:01:49] - Error: Could not read unit 06 file. Removing from queue.
[...]
[11:01:58] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:01:58]
[11:01:58] Assembly optimizations on if available.
[11:01:58] Entering M.D.
[11:02:04] Working on Protein
[11:02:06] Client config found, loading data.
[11:02:06] Starting GUI Server
[11:04:05] Completed 1%
[11:06:03] Completed 2%
[11:07:00] mdrun_gpu returned
[11:07:00] NANs detected on GPU
[11:07:00]
[11:07:00] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:07:04] CoreStatus = 7A (122)
[11:07:04] Sending work to server
[11:07:04] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:07:04] - Read packet limit of 540015616... Set to 524286976.
[11:07:04] - Error: Could not get length of results file work/wuresults_07.dat
[11:07:04] - Error: Could not read unit 07 file. Removing from queue.
[...]
[11:07:12] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:07:12]
[11:07:12] Assembly optimizations on if available.
[11:07:12] Entering M.D.
[11:07:19] Working on Protein
[11:07:21] mdrun_gpu returned
[11:07:21] Self-test failure
[11:07:21]
[11:07:21] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:07:25] CoreStatus = 7A (122)
[11:07:25] Sending work to server
[11:07:25] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:07:25] - Read packet limit of 540015616... Set to 524286976.
[11:07:25] - Error: Could not get length of results file work/wuresults_08.dat
[11:07:25] - Error: Could not read unit 08 file. Removing from queue.
[...]
[11:07:33] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:07:33]
[11:07:33] Assembly optimizations on if available.
[11:07:33] Entering M.D.
[11:07:40] Working on Protein
[11:07:42] Client config found, loading data.
[11:07:42] Starting GUI Server
[11:09:16] Completed 1%
[11:09:16] mdrun_gpu returned
[11:09:16] NANs detected on GPU
[11:09:16]
[11:09:16] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:09:19] CoreStatus = 7A (122)
[11:09:19] Sending work to server
[11:09:19] Project: 5775 (Run 4, Clone 12, Gen 35)
[11:09:19] - Read packet limit of 540015616... Set to 524286976.
[11:09:19] - Error: Could not get length of results file work/wuresults_09.dat
[11:09:19] - Error: Could not read unit 09 file. Removing from queue.
[11:09:19] EUE limit exceeded. Pausing 24 hours.
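
For anyone triaging a longer log of the same shape, here is a rough sketch of how the failures could be tallied by reason. It assumes that the first non-empty message after each "mdrun_gpu returned" line is the failure reason, which holds for the excerpt above but may not hold for every core version. The helper name tally_failures is made up for this example and is not part of the Folding@home client.

```python
import re
from collections import Counter

# Matches client log lines such as "[10:59:48] NANs detected on GPU".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")


def tally_failures(log_text):
    """Count the first non-empty message after each 'mdrun_gpu returned' line.

    Assumption: that message is the failure reason, as in the log above.
    """
    reasons = Counter()
    expect_reason = False
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            # Not a timestamped line (for example "[...]"); reset the state.
            expect_reason = False
            continue
        message = match.group(2).strip()
        if expect_reason and message:
            reasons[message] += 1
            expect_reason = False
        elif message == "mdrun_gpu returned":
            expect_reason = True
    return reasons


# Two attempts copied from the log above, used as sample input.
sample = """\
[10:59:48] Completed 1%
[10:59:48] mdrun_gpu returned
[10:59:48] NANs detected on GPU
[10:59:48]
[10:59:48] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:07:19] Working on Protein
[11:07:21] mdrun_gpu returned
[11:07:21] Self-test failure
"""

if __name__ == "__main__":
    for reason, count in tally_failures(sample).items():
        print(f"{count} x {reason}")
```

Running it over a full FAHlog.txt instead of the sample would give a quick count per error type, which can help show whether a unit fails the same way every time or in different ways.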