Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Thu Nov 20, 2008 11:47 am
by osgorth
Another one of the recent "instant EUE" work units. Can we please get these removed? They're killing my farm. :(

[11:42:20] Project: 5506 (Run 0, Clone 668, Gen 294)
[11:42:20] 
[11:42:20] Assembly optimizations on if available.
[11:42:20] Entering M.D.
[11:42:27] Working on p5506_supervillin_e1
[11:42:27] Client config found, loading data.
[11:42:28] Starting GUI Server
[11:42:28] mdrun_gpu returned 
[11:42:28] NANs detected on GPU
[11:42:28] 
[11:42:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:42:31] CoreStatus = 7A (122)
[11:42:31] Sending work to server

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Thu Nov 20, 2008 7:20 pm
by toTOW
That's a bad WU: 5 similar reports of failures :(

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Thu Nov 20, 2008 10:26 pm
by Drugless
Failed here again.

[22:07:38] Project: 5506 (Run 0, Clone 668, Gen 294)
[22:07:38] 
[22:07:38] Assembly optimizations on if available.
[22:07:38] Entering M.D.
[22:07:44] Working on p5506_supervillin_e1
[22:07:45] Client config found, loading data.
[22:07:45] mdrun_gpu returned 
[22:07:45] NANs detected on GPU
[22:07:45] 
[22:07:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:07:48] CoreStatus = 7A (122)
[22:07:48] Sending work to server
[22:07:48] Project: 5506 (Run 0, Clone 668, Gen 294)
[22:07:48] - Read packet limit of 540015616... Set to 524286976.
[22:07:48] - Error: Could not get length of results file work/wuresults_09.dat
[22:07:48] - Error: Could not read unit 09 file. Removing from queue.
[22:07:48] EUE limit exceeded. Pausing 24 hours.
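
The failure signature in these logs is consistent from report to report: the same Project/Run/Clone/Gen, "NANs detected on GPU", an UNSTABLE_MACHINE shutdown, and CoreStatus = 7A (122). As a rough sketch, assuming the log lines look exactly like the excerpts in this thread, a small script could pull those fields out of a pasted log. The function name `summarize_log` and the returned field names are made up for illustration and are not part of the client.

```python
import re

# Patterns match the line formats seen in the excerpts above (an assumption
# about the log format in general, not a documented client specification).
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_log(text):
    """Return the work unit identity and failure markers found in a log excerpt."""
    prcg = None
    status = None

    match = PRCG_RE.search(text)
    if match:
        # (project, run, clone, gen) as integers
        prcg = tuple(int(group) for group in match.groups())

    match = STATUS_RE.search(text)
    if match:
        # The log prints the status in hex, then decimal in parentheses.
        status = int(match.group(1), 16)

    return {
        "prcg": prcg,
        "core_status": status,
        "nan_on_gpu": "NANs detected on GPU" in text,
        "unstable_machine": "Core Shutdown: UNSTABLE_MACHINE" in text,
        "eue_pause": "EUE limit exceeded" in text,
    }


# Lines copied from the log in this post.
sample = """\
[22:07:38] Project: 5506 (Run 0, Clone 668, Gen 294)
[22:07:45] NANs detected on GPU
[22:07:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:07:48] CoreStatus = 7A (122)
[22:07:48] EUE limit exceeded. Pausing 24 hours.
"""

info = summarize_log(sample)
print(info)
```

Including the Project/Run/Clone/Gen tuple in a report is what lets others confirm they hit the same work unit rather than a hardware problem of their own.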

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Fri Nov 21, 2008 4:36 pm
by subego
Got me, too.

[06:56:26] Project: 5506 (Run 0, Clone 668, Gen 294)
[06:56:26] 
[06:56:26] Assembly optimizations on if available.
[06:56:26] Entering M.D.
[06:56:32] Working on p5506_supervillin_e1
[06:56:33] Client config found, loading data.
[06:56:33] mdrun_gpu returned 
[06:56:33] NANs detected on GPU
[06:56:33] 
[06:56:33] Folding@home Core Shutdown: UNSTABLE_MACHINE
[06:56:36] CoreStatus = 7A (122)
[06:56:36] Sending work to server
[06:56:36] Project: 5506 (Run 0, Clone 668, Gen 294)
[06:56:36] - Read packet limit of 540015616... Set to 524286976.
[06:56:36] - Error: Could not get length of results file work/wuresults_04.dat
[06:56:36] - Error: Could not read unit 04 file. Removing from queue.
[06:56:36] EUE limit exceeded. Pausing 24 hours.
[09:01:55] + Working...
[15:01:54] + Working...

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Fri Nov 21, 2008 9:17 pm
by subego
It's still being distributed. Just got me again.

[19:13:22] Project: 5506 (Run 0, Clone 668, Gen 294)
[19:13:22] 
[19:13:22] Assembly optimizations on if available.
[19:13:22] Entering M.D.
[19:13:29] Working on p5506_supervillin_e1
[19:13:29] Client config found, loading data.
[19:13:29] mdrun_gpu returned 
[19:13:29] NANs detected on GPU
[19:13:29] 
[19:13:29] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:13:32] CoreStatus = 7A (122)
[19:13:32] Sending work to server
[19:13:32] Project: 5506 (Run 0, Clone 668, Gen 294)
[19:13:32] - Read packet limit of 540015616... Set to 524286976.
[19:13:32] - Error: Could not get length of results file work/wuresults_06.dat
[19:13:32] - Error: Could not read unit 06 file. Removing from queue.
[19:13:32] EUE limit exceeded. Pausing 24 hours.

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Sat Nov 22, 2008 6:44 am
by Drugless
I'm still receiving this bad WU!

[23:41:51] Project: 5506 (Run 0, Clone 668, Gen 294)
[23:41:51] 
[23:41:51] Assembly optimizations on if available.
[23:41:51] Entering M.D.
[23:41:58] Working on p5506_supervillin_e1
[23:41:58] Client config found, loading data.
[23:41:58] Starting GUI Server
[23:41:58] mdrun_gpu returned 
[23:41:58] NANs detected on GPU
[23:41:58] 
[23:41:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[23:42:01] CoreStatus = 7A (122)
[23:42:01] Sending work to server
[23:42:01] Project: 5506 (Run 0, Clone 668, Gen 294)
[23:42:01] - Read packet limit of 540015616... Set to 524286976.
[23:42:01] - Error: Could not get length of results file work/wuresults_08.dat
[23:42:01] - Error: Could not read unit 08 file. Removing from queue.
[23:42:01] EUE limit exceeded. Pausing 24 hours.

Re: Project: 5506 (Run 0, Clone 668, Gen 294)

Posted: Sat Nov 22, 2008 5:13 pm
by VijayPande
Thanks for the reports and sorry for this hassle. We've manually removed this WU. We are looking into what needs to be done to better automate handling of this problem (it looks like an updated client will be needed to send the right info to the server).
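
To illustrate what automating this might involve, here is a minimal sketch that flags a work unit once it has failed for several distinct reporters. It is purely hypothetical and does not describe how the Folding@home servers actually work: the function `flag_bad_work_units`, the report format, and the threshold of 3 are all assumptions. The sample data mirrors the five logs posted in this thread.

```python
from collections import defaultdict


def flag_bad_work_units(reports, min_hosts=3):
    """Return work units that failed for at least `min_hosts` distinct reporters.

    `reports` is an iterable of (reporter, prcg, failed) tuples, where `prcg`
    is a (project, run, clone, gen) tuple. The threshold is an arbitrary
    illustrative value, not a real server setting.
    """
    failed_by = defaultdict(set)
    for reporter, prcg, failed in reports:
        if failed:
            # A set counts each reporter once, so repeat failures on the
            # same machine do not inflate the tally.
            failed_by[prcg].add(reporter)
    return sorted(
        prcg for prcg, reporters in failed_by.items()
        if len(reporters) >= min_hosts
    )


# One entry per log posted in this thread.
wu = (5506, 0, 668, 294)
reports = [
    ("osgorth", wu, True),
    ("Drugless", wu, True),
    ("subego", wu, True),
    ("subego", wu, True),
    ("Drugless", wu, True),
]

bad = flag_bad_work_units(reports)
print(bad)
```

Counting distinct reporters rather than raw failures is a deliberate choice in this sketch: a single unstable machine failing the same work unit repeatedly should not be enough to condemn it.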