Project: 5745 (Run 0, Clone 59, Gen 228)

Posted: Sat Dec 19, 2009 12:11 am
by SidVicious
Another broken WU: it EUE'd five times, and then the client issued the dreaded "Pausing 24 hours."

[21:26:37] Project: 5745 (Run 0, Clone 59, Gen 228)
[21:26:37] 
[21:26:37] Assembly optimizations on if available.
[21:26:37] Entering M.D.
[21:26:43] Tpr hash work/wudata_00.tpr:  57746033 946412100 2250985240 2869848819 1138190429
[21:26:43] Working on Protein
[21:26:44] Client config found, loading data.
[21:26:44] Starting GUI Server
[21:26:47] mdrun_gpu returned 
[21:26:47] Nonzero force sum on GPU
[21:26:47] 
[21:26:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:26:49] CoreStatus = 7A (122)
[21:26:49] Sending work to server
[21:26:49] Project: 5745 (Run 0, Clone 59, Gen 228)
[21:26:49] - Read packet limit of 540015616... Set to 524286976.
[21:26:49] - Error: Could not get length of results file work/wuresults_00.dat
[21:26:49] - Error: Could not read unit 00 file. Removing from queue.
[21:26:49] EUE limit exceeded. Pausing 24 hours.

Re: Project: 5745 (Run 0, Clone 59, Gen 228)

Posted: Sun Jul 04, 2010 6:24 pm
by Tynat
This WU had already failed four times with UNSTABLE_MACHINE before the attempt logged below. While the job stack is being rebuilt, it's unknown whether restarting the GPU client would result in receiving the same WU.

[16:39:31] + Received work.
[16:39:31] Trying to send all finished work units
[16:39:31] + No unsent completed units remaining.
[16:39:31] + Closed connections
[16:39:36] 
[16:39:36] + Processing work unit
[16:39:36] Core required: FahCore_11.exe
[16:39:36] Core found.
[16:39:36] Working on queue slot 01 [July 4 16:39:36 UTC]
[16:39:36] + Working ...
[16:39:36] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 680 -version 623'

[16:39:37] 
[16:39:37] *------------------------------*
[16:39:37] Folding@Home GPU Core - Beta
[16:39:37] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[16:39:37] 
[16:39:37] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:39:37] Build host: amoeba
[16:39:37] Board Type: AMD
[16:39:37] Core      : 
[16:39:37] Preparing to commence simulation
[16:39:37] - Looking at optimizations...
[16:39:37] - Created dyn
[16:39:37] - Files status OK
[16:39:37] - Expanded 68539 -> 357580 (decompressed 521.7 percent)
[16:39:37] Called DecompressByteArray: compressed_data_size=68539 data_size=357580, decompressed_data_size=357580 diff=0
[16:39:37] - Digital signature verified
[16:39:37] 
[16:39:37] Project: 5745 (Run 0, Clone 59, Gen 228)
[16:39:37] 
[16:39:37] Assembly optimizations on if available.
[16:39:37] Entering M.D.
[16:39:43] Tpr hash work/wudata_01.tpr:  57746033 946412100 2250985240 2869848819 1138190429
[16:39:43] Working on Protein
[16:39:44] Client config found, loading data.
[16:39:44] Starting GUI Server
[16:39:54] mdrun_gpu returned 
[16:39:54] Nonzero force sum on GPU
[16:39:54] 
[16:39:54] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:39:57] CoreStatus = 7A (122)
[16:39:57] Sending work to server
[16:39:57] Project: 5745 (Run 0, Clone 59, Gen 228)
[16:39:57] - Read packet limit of 540015616... Set to 524286976.
[16:39:57] - Error: Could not get length of results file work/wuresults_01.dat
[16:39:57] - Error: Could not read unit 01 file. Removing from queue.
[16:39:57] EUE limit exceeded. Pausing 24 hours.
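
For anyone who wants to tally failures like these from their own log, here is a rough Python sketch. The function name `count_shutdowns` and both regular expressions are my own assumptions, based only on the log lines quoted in this thread; they are not part of the client, and other client or core versions may format these lines differently.

```python
import re
from collections import Counter

# Both patterns are assumptions derived from the log excerpts in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def count_shutdowns(log_text):
    """Count core shutdowns per work unit.

    Returns a Counter keyed by ((project, run, clone, gen), status).
    """
    counts = Counter()
    current = None  # most recent (project, run, clone, gen) seen in the log
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match and current is not None:
            counts[(current, match.group(1))] += 1
    return counts
```

Reading the whole log file into a string and passing it in gives one count per WU and shutdown status, which makes it easy to see whether a single WU keeps failing or the failures are spread across many WUs.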

Re: Project: 5745 (Run 0, Clone 59, Gen 228)

Posted: Sun Jul 04, 2010 8:01 pm
by bruce
The WU (P5745,R0,C59,G228) has been reported as a bad WU.