Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Thu Jul 22, 2010 7:37 pm
by rickoic
Well, my eVga 460 folded 92 consecutive WUs without a hitch. Then it downloaded Project: 10632 (Run 35, Clone 3, Gen 7), hiccuped 5 times, and got shut down for 24 hours because it was an UNSTABLE_MACHINE.

[19:06:32] + Results successfully sent
[19:06:32] Thank you for your contribution to Folding@Home.
[19:06:32] + Number of Units Completed: 3197

[19:06:36] - Preparing to get new work unit...
[19:06:36] Cleaning up work directory
[19:06:36] + Attempting to get work packet
[19:06:36] Passkey found
[19:06:36] Gpu type=3 species=30.
[19:06:36] - Connecting to assignment server
[19:06:37] - Successful: assigned to (171.67.108.20).
[19:06:37] + News From Folding@Home: Welcome to Folding@Home
[19:06:37] Loaded queue successfully.
[19:06:37] Gpu type=3 species=30.
[19:06:37] + Closed connections
[19:06:37] 
[19:06:37] + Processing work unit
[19:06:37] Core required: FahCore_15.exe
[19:06:37] Core found.
[19:06:37] Working on queue slot 05 [July 22 19:06:37 UTC]
[19:06:37] + Working ...
[19:06:37] 
[19:06:37] *------------------------------*
[19:06:37] Folding@Home GPU Core -- Beta
[19:06:37] Version 2.09 (Thu May 20 11:58:42 PDT 2010)
[19:06:37] 
[19:06:37] Build host: SimbiosNvdWin7
[19:06:37] Board Type: Nvidia
[19:06:37] Core      : 
[19:06:37] Preparing to commence simulation
[19:06:37] - Looking at optimizations...
[19:06:37] DeleteFrameFiles: successfully deleted file=work/wudata_05.ckp
[19:06:37] - Created dyn
[19:06:37] - Files status OK
[19:06:37] sizeof(CORE_PACKET_HDR) = 512 file=<>
[19:06:37] - Expanded 28881 -> 163067 (decompressed 564.6 percent)
[19:06:37] Called DecompressByteArray: compressed_data_size=28881 data_size=163067, decompressed_data_size=163067 diff=0
[19:06:37] - Digital signature verified
[19:06:37] 
[19:06:37] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:06:37] 
[19:06:37] Assembly optimizations on if available.
[19:06:37] Entering M.D.
[19:06:43] Tpr hash work/wudata_05.tpr:  2909006096 4081660995 2390861205 816812102 2296539644
[19:06:43] Working on 582 p2750_N68H_AM03
[19:06:43] Client config found, loading data.
[19:06:44] Starting GUI Server
[19:06:47] mdrun_gpu returned 
[19:06:47] NANs detected on GPU
[19:06:47] 
[19:06:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:06:51] CoreStatus = 7A (122)
[19:06:51] Sending work to server
[19:06:51] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:06:51] - Read packet limit of 540015616... Set to 524286976.
[19:06:51] - Error: Could not get length of results file work/wuresults_05.dat
[19:06:51] - Error: Could not read unit 05 file. Removing from queue.
[19:06:51] - Preparing to get new work unit...
[19:06:51] Cleaning up work directory
[19:06:51] + Attempting to get work packet
[19:06:51] Passkey found
[19:06:51] Gpu type=3 species=30.
[19:06:51] - Connecting to assignment server
[19:06:52] - Successful: assigned to (171.67.108.20).
[19:06:52] + News From Folding@Home: Welcome to Folding@Home
[19:06:52] Loaded queue successfully.
[19:06:52] Gpu type=3 species=30.
[19:06:53] + Closed connections
[19:06:58] 
[19:06:58] + Processing work unit
[19:06:58] Core required: FahCore_15.exe
[19:06:58] Core found.
[19:06:58] Working on queue slot 06 [July 22 19:06:58 UTC]
[19:06:58] + Working ...
[19:06:58] 
[19:06:58] *------------------------------*
[19:06:58] Folding@Home GPU Core -- Beta
[19:06:58] Version 2.09 (Thu May 20 11:58:42 PDT 2010)
[19:06:58] 
[19:06:58] Build host: SimbiosNvdWin7
[19:06:58] Board Type: Nvidia
[19:06:58] Core      : 
[19:06:58] Preparing to commence simulation
[19:06:58] - Looking at optimizations...
[19:06:58] DeleteFrameFiles: successfully deleted file=work/wudata_06.ckp
[19:06:58] - Created dyn
[19:06:58] - Files status OK
[19:06:58] sizeof(CORE_PACKET_HDR) = 512 file=<>
[19:06:58] - Expanded 28881 -> 163067 (decompressed 564.6 percent)
[19:06:58] Called DecompressByteArray: compressed_data_size=28881 data_size=163067, decompressed_data_size=163067 diff=0
[19:06:58] - Digital signature verified
[19:06:58] 
[19:06:58] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:06:58] 
[19:06:58] Assembly optimizations on if available.
[19:06:58] Entering M.D.
[19:07:04] Tpr hash work/wudata_06.tpr:  2909006096 4081660995 2390861205 816812102 2296539644
[19:07:04] Working on 582 p2750_N68H_AM03
[19:07:04] Client config found, loading data.
[19:07:04] Starting GUI Server
[19:07:08] mdrun_gpu returned 
[19:07:08] NANs detected on GPU
[19:07:08] 
[19:07:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:07:12] CoreStatus = 7A (122)
[19:07:12] Sending work to server
[19:07:12] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:12] - Read packet limit of 540015616... Set to 524286976.
[19:07:12] - Error: Could not get length of results file work/wuresults_06.dat
[19:07:12] - Error: Could not read unit 06 file. Removing from queue.
[19:07:12] - Preparing to get new work unit...
[19:07:12] Cleaning up work directory
[19:07:12] + Attempting to get work packet
[19:07:12] Passkey found
[19:07:12] Gpu type=3 species=30.
[19:07:12] - Connecting to assignment server
[19:07:12] - Successful: assigned to (171.67.108.20).
[19:07:12] + News From Folding@Home: Welcome to Folding@Home
[19:07:13] Loaded queue successfully.
[19:07:13] Gpu type=3 species=30.
[19:07:13] + Closed connections
[19:07:18] 
[19:07:18] + Processing work unit
[19:07:18] Core required: FahCore_15.exe
[19:07:18] Core found.
[19:07:18] Working on queue slot 07 [July 22 19:07:18 UTC]
[19:07:18] + Working ...
[19:07:18] 
[19:07:18] *------------------------------*
[19:07:18] Folding@Home GPU Core -- Beta
[19:07:18] Version 2.09 (Thu May 20 11:58:42 PDT 2010)
[19:07:18] 
[19:07:18] Build host: SimbiosNvdWin7
[19:07:18] Board Type: Nvidia
[19:07:18] Core      : 
[19:07:18] Preparing to commence simulation
[19:07:18] - Looking at optimizations...
[19:07:18] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[19:07:18] - Created dyn
[19:07:18] - Files status OK
[19:07:18] sizeof(CORE_PACKET_HDR) = 512 file=<>
[19:07:18] - Expanded 28881 -> 163067 (decompressed 564.6 percent)
[19:07:18] Called DecompressByteArray: compressed_data_size=28881 data_size=163067, decompressed_data_size=163067 diff=0
[19:07:18] - Digital signature verified
[19:07:18] 
[19:07:18] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:18] 
[19:07:18] Assembly optimizations on if available.
[19:07:18] Entering M.D.
[19:07:24] Tpr hash work/wudata_07.tpr:  2909006096 4081660995 2390861205 816812102 2296539644
[19:07:24] Working on 582 p2750_N68H_AM03
[19:07:24] Client config found, loading data.
[19:07:25] Starting GUI Server
[19:07:29] mdrun_gpu returned 
[19:07:29] NANs detected on GPU
[19:07:29] 
[19:07:29] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:07:32] CoreStatus = 7A (122)
[19:07:32] Sending work to server
[19:07:32] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:32] - Read packet limit of 540015616... Set to 524286976.
[19:07:32] - Error: Could not get length of results file work/wuresults_07.dat
[19:07:32] - Error: Could not read unit 07 file. Removing from queue.
[19:07:32] - Preparing to get new work unit...
[19:07:32] Cleaning up work directory
[19:07:32] + Attempting to get work packet
[19:07:32] Passkey found
[19:07:32] Gpu type=3 species=30.
[19:07:32] - Connecting to assignment server
[19:07:33] - Successful: assigned to (171.67.108.20).
[19:07:33] + News From Folding@Home: Welcome to Folding@Home
[19:07:33] Loaded queue successfully.
[19:07:33] Gpu type=3 species=30.
[19:07:33] + Closed connections
[19:07:38] 
[19:07:38] + Processing work unit
[19:07:38] Core required: FahCore_15.exe
[19:07:38] Core found.
[19:07:38] Working on queue slot 08 [July 22 19:07:38 UTC]
[19:07:38] + Working ...
[19:07:38] 
[19:07:38] *------------------------------*
[19:07:38] Folding@Home GPU Core -- Beta
[19:07:38] Version 2.09 (Thu May 20 11:58:42 PDT 2010)
[19:07:38] 
[19:07:38] Build host: SimbiosNvdWin7
[19:07:38] Board Type: Nvidia
[19:07:38] Core      : 
[19:07:38] Preparing to commence simulation
[19:07:38] - Looking at optimizations...
[19:07:38] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[19:07:38] - Created dyn
[19:07:38] - Files status OK
[19:07:38] sizeof(CORE_PACKET_HDR) = 512 file=<>
[19:07:38] - Expanded 28881 -> 163067 (decompressed 564.6 percent)
[19:07:38] Called DecompressByteArray: compressed_data_size=28881 data_size=163067, decompressed_data_size=163067 diff=0
[19:07:38] - Digital signature verified
[19:07:38] 
[19:07:38] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:38] 
[19:07:38] Assembly optimizations on if available.
[19:07:38] Entering M.D.
[19:07:44] Tpr hash work/wudata_08.tpr:  2909006096 4081660995 2390861205 816812102 2296539644
[19:07:44] Working on 582 p2750_N68H_AM03
[19:07:44] Client config found, loading data.
[19:07:45] Starting GUI Server
[19:07:48] mdrun_gpu returned 
[19:07:48] NANs detected on GPU
[19:07:48] 
[19:07:48] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:07:53] CoreStatus = 7A (122)
[19:07:53] Sending work to server
[19:07:53] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:53] - Read packet limit of 540015616... Set to 524286976.
[19:07:53] - Error: Could not get length of results file work/wuresults_08.dat
[19:07:53] - Error: Could not read unit 08 file. Removing from queue.
[19:07:53] - Preparing to get new work unit...
[19:07:53] Cleaning up work directory
[19:07:53] + Attempting to get work packet
[19:07:53] Passkey found
[19:07:53] Gpu type=3 species=30.
[19:07:53] - Connecting to assignment server
[19:07:53] - Successful: assigned to (171.67.108.20).
[19:07:53] + News From Folding@Home: Welcome to Folding@Home
[19:07:53] Loaded queue successfully.
[19:07:53] Gpu type=3 species=30.
[19:07:54] + Closed connections
[19:07:59] 
[19:07:59] + Processing work unit
[19:07:59] Core required: FahCore_15.exe
[19:07:59] Core found.
[19:07:59] Working on queue slot 09 [July 22 19:07:59 UTC]
[19:07:59] + Working ...
[19:07:59] 
[19:07:59] *------------------------------*
[19:07:59] Folding@Home GPU Core -- Beta
[19:07:59] Version 2.09 (Thu May 20 11:58:42 PDT 2010)
[19:07:59] 
[19:07:59] Build host: SimbiosNvdWin7
[19:07:59] Board Type: Nvidia
[19:07:59] Core      : 
[19:07:59] Preparing to commence simulation
[19:07:59] - Looking at optimizations...
[19:07:59] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[19:07:59] - Created dyn
[19:07:59] - Files status OK
[19:07:59] sizeof(CORE_PACKET_HDR) = 512 file=<>
[19:07:59] - Expanded 28881 -> 163067 (decompressed 564.6 percent)
[19:07:59] Called DecompressByteArray: compressed_data_size=28881 data_size=163067, decompressed_data_size=163067 diff=0
[19:07:59] - Digital signature verified
[19:07:59] 
[19:07:59] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:59] 
[19:07:59] Assembly optimizations on if available.
[19:07:59] Entering M.D.
[19:08:05] Tpr hash work/wudata_09.tpr:  2909006096 4081660995 2390861205 816812102 2296539644
[19:08:05] Working on 582 p2750_N68H_AM03
[19:08:05] Client config found, loading data.
[19:08:05] Starting GUI Server
[19:08:09] mdrun_gpu returned 
[19:08:09] NANs detected on GPU
[19:08:09] 
[19:08:09] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:08:13] CoreStatus = 7A (122)
[19:08:13] Sending work to server
[19:08:13] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:08:13] - Read packet limit of 540015616... Set to 524286976.
[19:08:13] - Error: Could not get length of results file work/wuresults_09.dat
[19:08:13] - Error: Could not read unit 09 file. Removing from queue.
[19:08:13] EUE limit exceeded. Pausing 24 hours.
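
For anyone who wants to tally failures like these without reading the whole log, here is a minimal sketch in Python. It assumes only the line formats visible in the log above ("Project: ..." and "Core Shutdown: UNSTABLE_MACHINE"); the function name count_unstable and the short sample text are made up for illustration and are not part of the Folding@Home client.

```python
import re
from collections import Counter

# Matches lines such as "[19:06:37] Project: 10632 (Run 35, Clone 3, Gen 7)"
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def count_unstable(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen).

    Hypothetical helper: it attributes each shutdown to the most recent
    "Project:" line seen before it in the log.
    """
    failures = Counter()
    current = None
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(group) for group in match.groups())
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            failures[current] += 1
    return failures


# Shortened sample built from lines in the log above (two of the five attempts).
sample = """\
[19:06:37] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:06:47] NANs detected on GPU
[19:06:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:06:51] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:06:58] Project: 10632 (Run 35, Clone 3, Gen 7)
[19:07:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

result = count_unstable(sample)
for wu, count in result.items():
    print(wu, count)
```

Run against the full log, the same approach should report five shutdowns for the one work unit, matching the point where the client hit its EUE limit.
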
Fold on
Rick

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Thu Jul 22, 2010 7:52 pm
by sortofageek
Project: 10632 (Run 35, Clone 3, Gen 7) has been successfully completed and returned by another folder, so it would seem the problem is not the WU.

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Thu Jul 22, 2010 8:29 pm
by rickoic
Well, I have 2 eVga 460s installed, so I swapped them. The 460 that was having problems with the WU is now folding happily on P10632 (R85, C2, G41), and the other 460 is choking on P10632 (R35, C3, G7) just like the first 460 was.

Fold on
Rick

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Thu Jul 22, 2010 8:58 pm
by bruce
OK, it's not the GPU and it's not the WU, so my money is on the drivers.

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Thu Jul 22, 2010 9:46 pm
by rickoic
Ok, I've now moved a 250 into slot 1, with the 460's in slots 2 and 3.

The 250 pulled a p5782 (R9, C25, G211) and failed on it 8 times before being put to sleep.

The 460 in slot 3 is continuing with P10632 (R85, C2, G41) with no problems.
The 460 in slot 2 is continuing with the P5782 (R10, C19, G225) that the 250 had started. Hopefully it will finish it and then pick up an A3.

Next step will be to run memtest on ram, then if ram checks out alright, might have to RMA motherboard.

Fold on
Rick

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Fri Jul 23, 2010 12:04 am
by rickoic
Ok, ran MEMTEST86 on my ram and got errors, but I don't think it's ram errors.
Have 6 sticks of 2GB Corsair ram installed.
Ran MEMTEST86 and got following errors:
Lowest error address: 0007F800000 2040.0MB
Highest error address:0007FC00000 2044.0MB

Removed the 1st stick, moved all the others forward, and put the 1st stick in as the 6th stick (ran MEMTEST86 again and got the same errors).
Did the same thing with the sticks of ram again until all had cycled into the 1st slot, and each time I ran MEMTEST86 I got the same errors.
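
As a quick check of the arithmetic behind those two addresses, here is a small Python sketch. It assumes the MEMTEST86 addresses are hexadecimal byte addresses and that "MB" means 1024 * 1024 bytes; the helper name address_to_mib is made up for illustration.

```python
MIB = 1024 * 1024  # bytes per MB as MEMTEST86 appears to report it (assumption)


def address_to_mib(hex_address):
    """Convert a hexadecimal byte address string to an offset in MB."""
    return int(hex_address, 16) / MIB


lowest = address_to_mib("0007F800000")
highest = address_to_mib("0007FC00000")

print(lowest, highest)   # offsets of the lowest and highest failing address
print(highest - lowest)  # width of the failing window in MB
```

Under those assumptions the conversion reproduces the 2040.0MB and 2044.0MB figures reported above, so the errors sit in one 4 MB window just below the 2 GB mark, and that window did not move when the sticks were rotated.
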

I think it's now a motherboard issue. Anyone else have any ideas?

Tks
Rick

Re: Project 10632 (Run 35, Clone 3, Gen 7)

Posted: Fri Jul 23, 2010 4:11 am
by PantherX
rickoic wrote:...The 460 in slot 2 is continuing with the P5782 (R10, C19, G225) that the 250 had started. Hopefully it will finish it and then pick up an A3...
That is your problem. A Fermi GPU will not process GPU2 WUs (WUs that use FahCore_11); it only processes GPU3 WUs (WUs that use FahCore_15). Maybe you can refer to my Guide and see if there is anything useful.