Project 5767 Multiple run clone gen's

Moderators: Site Moderators, FAHC Science Team

MtM
Posts: 1579
Joined: Fri Jun 27, 2008 2:20 pm
Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
Location: The Netherlands


Post by MtM »

Originally Posted by Wesleynator
I got one of the new WUs but it EUE'd after 16%. Has anyone else had an EUE problem with one of these? My 9800GT had been running the 511-point WUs with no problem.

[01:43:13] Project: 5766 (Run 4, Clone 4, Gen 0)
[01:43:13]
[01:43:13] Assembly optimizations on if available.
[01:43:13] Entering M.D.
[01:43:20] Working on Protein
[01:43:20] Client config found, loading data.
[01:43:21] Starting GUI Server
[01:44:14] Completed 1%
[01:45:06] Completed 2%
[01:45:57] Completed 3%
[01:46:49] Completed 4%
[01:47:41] Completed 5%
[01:48:32] Completed 6%
[01:49:24] Completed 7%
[01:50:16] Completed 8%
[01:51:07] Completed 9%
[01:51:59] Completed 10%
[01:52:51] Completed 11%
[01:53:42] Completed 12%
[01:54:34] Completed 13%
[01:55:26] Completed 14%
[01:56:18] Completed 15%
[01:57:09] Completed 16%
[01:57:17] Run: exception thrown during GuardedRun
[01:57:17] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[01:57:17] Going to send back what have done -- stepsTotalG=15000000
[01:57:17] Work fraction=0.1616 steps=15000000.
[01:57:22] logfile size=0 infoLength=0 edr=0 trr=23
[01:57:22] - Writing 642 bytes of core data to disk...
[01:57:22] Done: 130 -> 128 (compressed to 98.4 percent)
[01:57:22] ... Done.
[01:57:22]
[01:57:22] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:57:25] CoreStatus = 7A (122)
[01:57:25] Sending work to server
[01:57:25] Project: 5766 (Run 4, Clone 4, Gen 0)

Same on one of my 8800GTs.


Quote:
[11:47:18] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:47:20] CoreStatus = 7A (122)
[11:47:20] Sending work to server
[11:47:20] Project: 5767 (Run 11, Clone 29, Gen 5)
[11:47:20] - Error: Could not get length of results file work/wuresults_06.dat
[11:47:20] - Error: Could not read unit 06 file. Removing from queue.
[11:47:20] EUE limit exceeded. Pausing 24 hours.
As posted on the EOC forum. I asked the last poster which other EUEs he had, since the 24-hour pause doesn't happen after a single EUE alone :)
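
For anyone who wants to pull every failed Project/Run/Clone/Gen out of a log instead of hunting for them by hand, here is a rough Python sketch. It assumes the line formats seen in the log excerpts in this thread ("Project: N (Run R, Clone C, Gen G)" followed later by "UNSTABLE_MACHINE"); `PRCG` and `failed_units` are names made up for this example and are not part of the Folding@home client.

```python
import re

# Matches the "Project: N (Run R, Clone C, Gen G)" lines seen in the logs in this thread.
PRCG = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

def failed_units(log_text):
    """Return (project, run, clone, gen) for each unit that ended in UNSTABLE_MACHINE."""
    failures = []
    current = None  # most recent PRCG seen so far
    for line in log_text.splitlines():
        match = PRCG.search(line)
        if match:
            current = tuple(int(x) for x in match.groups())
        elif "UNSTABLE_MACHINE" in line and current is not None:
            failures.append(current)
    return failures

# Sample lines copied from a log posted in this thread.
sample = """\
[13:26:06] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:13] NANs detected on GPU
[13:26:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:26:17] Sending work to server
[13:26:17] Project: 5766 (Run 2, Clone 63, Gen 0)
"""
print(failed_units(sample))
```

The second "Project:" line printed while sending the work back does not create a duplicate entry, because a failure is only recorded when an UNSTABLE_MACHINE line is seen.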
Razor_FX_II
Posts: 19
Joined: Mon Aug 04, 2008 1:21 am
Hardware configuration: Proud Member of the [H]orde
Proud Member of the [H]ard DC Commandos

Re: Project 5767 Multiple run clone gen's

Post by Razor_FX_II »

After the error, the GPU2 client just stops, so I exit, delete the logs and queue files, and restart it to get it running again. So at the moment I don't have the others.
The next batch I get I'll copy up to the forum.
So far this last error was on Vista32, 178.24 drivers, 8800 GTS (G92) 512 MB, a card that has folded over 2k work units with no problems.
I also had one error this morning, on a 353-point work unit on my main folding rig: Vista64, 181.00 drivers, GTX 260, which has folded over 2k work units with no problems.


[13:26:06] Folding@Home GPU Core - Beta
[13:26:06] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[13:26:06] 
[13:26:06] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[13:26:06] Build host: amoeba
[13:26:06] Board Type: Nvidia
[13:26:06] Core      : 
[13:26:06] Preparing to commence simulation
[13:26:06] - Looking at optimizations...
[13:26:06] - Created dyn
[13:26:06] - Files status OK
[13:26:06] - Expanded 43942 -> 252912 (decompressed 575.5 percent)
[13:26:06] Called DecompressByteArray: compressed_data_size=43942 data_size=252912, decompressed_data_size=252912 diff=0
[13:26:06] - Digital signature verified
[13:26:06] 
[13:26:06] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:06] 
[13:26:06] Assembly optimizations on if available.
[13:26:06] Entering M.D.
[13:26:13] Working on Protein
[13:26:13] Client config found, loading data.
[13:26:13] mdrun_gpu returned 
[13:26:13] NANs detected on GPU
[13:26:13] 
[13:26:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:26:17] CoreStatus = 7A (122)
[13:26:17] Sending work to server
[13:26:17] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:17] - Read packet limit of 540015616... Set to 524286976.
[13:26:17] - Error: Could not get length of results file work/wuresults_03.dat
[13:26:17] - Error: Could not read unit 03 file. Removing from queue.
[13:26:17] EUE limit exceeded. Pausing 24 hours.
[13:56:05] - Autosending finished units... [December 24 13:56:05 UTC]
[13:56:05] Trying to send all finished work units
[13:56:05] + No unsent completed units remaining.
[13:56:05] - Autosend completed
What is "[13:26:13] NANs detected on GPU"?
MtM
Posts: 1579
Joined: Fri Jun 27, 2008 2:20 pm
Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
Location: The Netherlands

Re: Project 5767 Multiple run clone gen's

Post by MtM »

NaN = Not a Number. It is the floating-point value that results from an undefined calculation; exactly how it is handled depends on the programming language in use. Wikipedia has an article about it: http://en.wikipedia.org/wiki/NaN
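
To make that a bit more concrete, here is a small Python sketch of how NaN behaves under standard IEEE 754 floating-point rules. It only illustrates the general concept; how the GPU core actually detects NaNs is not shown anywhere in these logs, so nothing here should be read as the core's own check.

```python
import math

inf = float("inf")
nan = inf - inf  # an undefined operation yields NaN instead of a real number

# NaN is the only floating-point value that is not equal to itself.
print(nan == nan)

# Any further arithmetic on NaN stays NaN, so one bad value spreads
# through every later step of a calculation.
spread = nan * 2.0 + 1.0
print(math.isnan(spread))
```

That spreading is why a simulation cannot meaningfully continue once a NaN shows up: every value computed from it is also NaN.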
Razor_FX_II
Posts: 19
Joined: Mon Aug 04, 2008 1:21 am
Hardware configuration: Proud Member of the [H]orde
Proud Member of the [H]ard DC Commandos

Re: Project 5767 Multiple run clone gen's

Post by Razor_FX_II »

Just had this on Vista64 with 181.00 drivers and a GTX 260.


[21:27:39] 
[21:27:39] *------------------------------*
[21:27:39] Folding@Home GPU Core - Beta
[21:27:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:27:39] 
[21:27:39] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:27:39] Build host: amoeba
[21:27:39] Board Type: Nvidia
[21:27:39] Core      : 
[21:27:39] Preparing to commence simulation
[21:27:39] - Looking at optimizations...
[21:27:39] - Created dyn
[21:27:39] - Files status OK
[21:27:39] - Expanded 46678 -> 252912 (decompressed 541.8 percent)
[21:27:39] Called DecompressByteArray: compressed_data_size=46678 data_size=252912, decompressed_data_size=252912 diff=0
[21:27:39] - Digital signature verified
[21:27:39] 
[21:27:39] Project: 5768 (Run 8, Clone 47, Gen 8)
[21:27:39] 
[21:27:39] Assembly optimizations on if available.
[21:27:39] Entering M.D.
[21:27:46] Working on Protein
[21:27:46] Client config found, loading data.
[21:27:46] mdrun_gpu returned 
[21:27:46] NANs detected on GPU
[21:27:46] 
[21:27:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:27:49] CoreStatus = 7A (122)
[21:27:49] Sending work to server
[21:27:49] Project: 5768 (Run 8, Clone 47, Gen 8)
[21:27:49] - Read packet limit of 540015616... Set to 524286976.
[21:27:49] - Error: Could not get length of results file work/wuresults_04.dat
[21:27:49] - Error: Could not read unit 04 file. Removing from queue.
[21:27:49] EUE limit exceeded. Pausing 24 hours.
Last edited by Razor_FX_II on Thu Dec 25, 2008 10:55 am, edited 1 time in total.
Razor_FX_II
Posts: 19
Joined: Mon Aug 04, 2008 1:21 am
Hardware configuration: Proud Member of the [H]orde
Proud Member of the [H]ard DC Commandos

Re: Project 5767 Multiple run clone gen's

Post by Razor_FX_II »

EUE on Project: 5768 (Run 8, Clone 9, Gen 9), running Vista32, 181.00 drivers, 8800 GTS (G92) 512 MB (normally very stable).


[00:38:40] *------------------------------*
[00:38:40] Folding@Home GPU Core - Beta
[00:38:40] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:38:40] 
[00:38:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:38:40] Build host: amoeba
[00:38:40] Board Type: Nvidia
[00:38:40] Core      : 
[00:38:40] Preparing to commence simulation
[00:38:40] - Looking at optimizations...
[00:38:40] - Created dyn
[00:38:40] - Files status OK
[00:38:40] - Expanded 46616 -> 252912 (decompressed 542.5 percent)
[00:38:40] Called DecompressByteArray: compressed_data_size=46616 data_size=252912, decompressed_data_size=252912 diff=0
[00:38:40] - Digital signature verified
[00:38:40] 
[00:38:40] Project: 5768 (Run 8, Clone 9, Gen 9)
[00:38:40] 
[00:38:40] Assembly optimizations on if available.
[00:38:40] Entering M.D.
[00:38:46] Working on Protein
[00:38:47] Client config found, loading data.
[00:38:47] mdrun_gpu returned 
[00:38:47] NANs detected on GPU
[00:38:47] 
[00:38:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:38:51] CoreStatus = 7A (122)
[00:38:51] Sending work to server
[00:38:51] Project: 5768 (Run 8, Clone 9, Gen 9)
[00:38:51] - Read packet limit of 540015616... Set to 524286976.
[00:38:51] - Error: Could not get length of results file work/wuresults_01.dat
[00:38:51] - Error: Could not read unit 01 file. Removing from queue.
[00:38:51] EUE limit exceeded. Pausing 24 hours.