Project: 5746 (Run 3, Clone 98, Gen 562)

Moderators: Site Moderators, FAHC Science Team

bapriebe
Posts: 44
Joined: Sun Apr 20, 2008 8:33 am
Hardware configuration: HP xw4600 workstation (4GB)+Q9650+Sapphire Vapor-X HD4890,
HP Z600 workstation (4GB)+2xXEON E5540+Sapphire HD5770,
HP ML350 server (4GB)+2xXEON E5520+Diamond HD3850
Location: Ottawa, Ontario

Post by bapriebe »

This one fails immediately (and repeatedly) with "Nonzero force sum on GPU":

[19:54:44] Project: 5746 (Run 3, Clone 98, Gen 562)
[19:54:44] 
[19:54:44] Assembly optimizations on if available.
[19:54:44] Entering M.D.
[19:54:50] Tpr hash work/wudata_07.tpr:  567491816 748991043 797875715 2239557274 2225617188
[19:54:50] Working on Protein
[19:54:51] Client config found, loading data.
[19:54:51] Starting GUI Server
[19:54:53] mdrun_gpu returned 
[19:54:53] Nonzero force sum on GPU
[19:54:53] 
[19:54:53] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:54:56] CoreStatus = 7A (122)
[19:54:56] Sending work to server
[19:54:56] Project: 5746 (Run 3, Clone 98, Gen 562)
[19:54:56] - Read packet limit of 540015616... Set to 524286976.
[19:54:56] - Error: Could not get length of results file work/wuresults_07.dat
[19:54:56] - Error: Could not read unit 07 file. Removing from queue.
SidVicious
Posts: 30
Joined: Sun Jan 13, 2008 10:14 pm

Post by SidVicious »

This broken WU is still out in the wild:

[14:37:33] Project: 5746 (Run 3, Clone 98, Gen 562)
[14:37:33] 
[14:37:33] Assembly optimizations on if available.
[14:37:33] Entering M.D.
[14:37:39] Tpr hash work/wudata_00.tpr:  567491816 748991043 797875715 2239557274 2225617188
[14:37:39] Working on Protein
[14:37:40] Client config found, loading data.
[14:37:40] Starting GUI Server
[14:37:43] mdrun_gpu returned 
[14:37:43] Nonzero force sum on GPU
[14:37:43] 
[14:37:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[14:37:45] CoreStatus = 7A (122)
[14:37:45] Sending work to server
[14:37:45] Project: 5746 (Run 3, Clone 98, Gen 562)
[14:37:45] - Read packet limit of 540015616... Set to 524286976.
[14:37:45] - Error: Could not get length of results file work/wuresults_00.dat
[14:37:45] - Error: Could not read unit 00 file. Removing from queue.
[14:37:45] EUE limit exceeded. Pausing 24 hours.
*Edit* And again:

[00:31:29] Project: 5746 (Run 3, Clone 98, Gen 562)
[00:31:29] 
[00:31:29] Assembly optimizations on if available.
[00:31:29] Entering M.D.
[00:31:35] Tpr hash work/wudata_05.tpr:  567491816 748991043 797875715 2239557274 2225617188
[00:31:35] Working on Protein
[00:31:36] Client config found, loading data.
[00:31:36] Starting GUI Server
[00:31:39] mdrun_gpu returned 
[00:31:39] Nonzero force sum on GPU
[00:31:39] 
[00:31:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:31:41] CoreStatus = 7A (122)
[00:31:41] Sending work to server
[00:31:41] Project: 5746 (Run 3, Clone 98, Gen 562)
[00:31:41] - Read packet limit of 540015616... Set to 524286976.
[00:31:41] - Error: Could not get length of results file work/wuresults_05.dat
[00:31:41] - Error: Could not read unit 05 file. Removing from queue.
[00:31:41] EUE limit exceeded. Pausing 24 hours.
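
For anyone who wants to check their own log for the same failure, here is a rough sketch that counts how often each work unit hits "Nonzero force sum on GPU". It assumes the log format shown in the excerpts above; the function name `count_failures` and the regular expression are illustrative only and are not part of the Folding@home client.

```python
import re
from collections import Counter

# Matches the work unit identifier as it appears in the log excerpts above.
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
ERROR_TEXT = "Nonzero force sum on GPU"


def count_failures(lines):
    """Count ERROR_TEXT occurrences per (project, run, clone, gen).

    Hypothetical helper: each error line is attributed to the most recent
    "Project: ..." line seen before it.
    """
    failures = Counter()
    current = None
    for line in lines:
        match = WU_RE.search(line)
        if match:
            current = tuple(int(group) for group in match.groups())
        elif ERROR_TEXT in line and current is not None:
            failures[current] += 1
    return failures


# A few lines copied from the first log in this thread.
sample = [
    "[19:54:44] Project: 5746 (Run 3, Clone 98, Gen 562)",
    "[19:54:53] mdrun_gpu returned ",
    "[19:54:53] Nonzero force sum on GPU",
    "[19:54:56] Project: 5746 (Run 3, Clone 98, Gen 562)",
]

for wu, count in count_failures(sample).items():
    print(wu, count)
```

To run it against a real log, pass the open file object to `count_failures` instead of `sample`. A work unit that shows up with a count above one is probably worth reporting.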
I'm doing science and I'm still folding
I feel FANTASTIC and I'm still folding
While you are dying I'll still be folding
and when you're dead I'll still be folding
STILL FOLDING, still folding
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Post by bruce »

I'll put this WU on the list to be pulled from circulation.