Project: 5746 (Run 3, Clone 98, Gen 562)

Posted: Sun Nov 22, 2009 8:04 pm
by bapriebe
This one bombs immediately (and repeatedly) with "Nonzero force sum on GPU":

Code:

[19:54:44] Project: 5746 (Run 3, Clone 98, Gen 562)
[19:54:44] 
[19:54:44] Assembly optimizations on if available.
[19:54:44] Entering M.D.
[19:54:50] Tpr hash work/wudata_07.tpr:  567491816 748991043 797875715 2239557274 2225617188
[19:54:50] Working on Protein
[19:54:51] Client config found, loading data.
[19:54:51] Starting GUI Server
[19:54:53] mdrun_gpu returned 
[19:54:53] Nonzero force sum on GPU
[19:54:53] 
[19:54:53] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:54:56] CoreStatus = 7A (122)
[19:54:56] Sending work to server
[19:54:56] Project: 5746 (Run 3, Clone 98, Gen 562)
[19:54:56] - Read packet limit of 540015616... Set to 524286976.
[19:54:56] - Error: Could not get length of results file work/wuresults_07.dat
[19:54:56] - Error: Could not read unit 07 file. Removing from queue.

Re: Project: 5746 (Run 3, Clone 98, Gen 562)

Posted: Fri Jan 08, 2010 12:33 am
by SidVicious
This broken WU is still out in the wild:

Code:

[14:37:33] Project: 5746 (Run 3, Clone 98, Gen 562)
[14:37:33] 
[14:37:33] Assembly optimizations on if available.
[14:37:33] Entering M.D.
[14:37:39] Tpr hash work/wudata_00.tpr:  567491816 748991043 797875715 2239557274 2225617188
[14:37:39] Working on Protein
[14:37:40] Client config found, loading data.
[14:37:40] Starting GUI Server
[14:37:43] mdrun_gpu returned 
[14:37:43] Nonzero force sum on GPU
[14:37:43] 
[14:37:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[14:37:45] CoreStatus = 7A (122)
[14:37:45] Sending work to server
[14:37:45] Project: 5746 (Run 3, Clone 98, Gen 562)
[14:37:45] - Read packet limit of 540015616... Set to 524286976.
[14:37:45] - Error: Could not get length of results file work/wuresults_00.dat
[14:37:45] - Error: Could not read unit 00 file. Removing from queue.
[14:37:45] EUE limit exceeded. Pausing 24 hours.
*Edit* And again:

Code:

[00:31:29] Project: 5746 (Run 3, Clone 98, Gen 562)
[00:31:29] 
[00:31:29] Assembly optimizations on if available.
[00:31:29] Entering M.D.
[00:31:35] Tpr hash work/wudata_05.tpr:  567491816 748991043 797875715 2239557274 2225617188
[00:31:35] Working on Protein
[00:31:36] Client config found, loading data.
[00:31:36] Starting GUI Server
[00:31:39] mdrun_gpu returned 
[00:31:39] Nonzero force sum on GPU
[00:31:39] 
[00:31:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:31:41] CoreStatus = 7A (122)
[00:31:41] Sending work to server
[00:31:41] Project: 5746 (Run 3, Clone 98, Gen 562)
[00:31:41] - Read packet limit of 540015616... Set to 524286976.
[00:31:41] - Error: Could not get length of results file work/wuresults_05.dat
[00:31:41] - Error: Could not read unit 05 file. Removing from queue.
[00:31:41] EUE limit exceeded. Pausing 24 hours.
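
For anyone who wants to check whether it is the same work unit failing each time, here is a rough Python sketch that pulls the Project/Run/Clone/Gen and the core shutdown status out of a client log. The function name `core_shutdowns` and both regular expressions are my own assumptions, inferred only from the log lines quoted in this thread. They are not an official description of the client's log format.

```python
import re

# Assumed patterns, based only on the log excerpts in this thread.
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def core_shutdowns(log_text):
    """Return (project, run, clone, gen, status) for each core shutdown line."""
    results = []
    current = None  # most recent Project/Run/Clone/Gen seen so far
    for line in log_text.splitlines():
        match = PRCG_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = STATUS_RE.search(line)
        if match and current is not None:
            results.append(current + (match.group(1),))
    return results
```

Feeding the results into `collections.Counter` shows how often each work unit shut down with each status, which makes a repeat offender like this one easy to spot.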

Re: Project: 5746 (Run 3, Clone 98, Gen 562)

Posted: Fri Jan 08, 2010 6:26 am
by bruce
I'll put this WU on the list to be pulled from circulation.