Project: 2665 (Run 0, Clone 106, Gen 6)

Posted: Fri May 23, 2008 12:34 pm
by Oldhat
Just having a few problems completing this unit.
Windows XP Pro
2 GB RAM, Intel E8400 (stock)

The first run failed.

Code:

[11:36:31] Writing local files
[11:36:31] Completed 105000 out of 250000 steps  (42 percent)
[11:40:10] Gromacs cannot continue further.
[11:40:10] Going to send back what have done.
[11:40:10] logfile size: 87625
[11:40:10] - Writing 88161 bytes of core data to disk...
[11:40:10]   ... Done.
[11:40:10] - Failed to delete work/wudata_01.sas
[11:40:10] - Failed to delete work/wudata_01.goe
[11:40:10] Warning:  check for stray files
[11:42:11] 
[11:42:11] Folding@home Core Shutdown: EARLY_UNIT_END
[11:42:11] 
[11:42:11] Folding@home Core Shutdown: EARLY_UNIT_END
[11:42:14] CoreStatus = 7B (123)
[11:42:14] Client-core communications error: ERROR 0x7b
[11:42:14] Deleting current work unit & continuing...
On the next run, I shut the client down and restarted it before it reached the point of the earlier failure, and got...

Code:

[19:40:04] Completed 67500 out of 250000 steps  (27 percent)
[19:48:32] Warning:  long 1-4 interactions
[19:48:32] Gromacs cannot continue further.
[19:48:32] Going to send back what have done.
[19:48:32] logfile size: 69039
[19:48:32] - Writing 69575 bytes of core data to disk...
[19:48:32]   ... Done.
[19:48:32] - Failed to delete work/wudata_02.arc
[19:48:32] Warning:  check for stray files
[19:50:32] 
[19:50:32] Folding@home Core Shutdown: EARLY_UNIT_END
[19:50:32] 
[19:50:32] Folding@home Core Shutdown: EARLY_UNIT_END
[19:50:35] CoreStatus = 7B (123)
[19:50:35] Client-core communications error: ERROR 0x7b
[19:50:35] Deleting current work unit & continuing...
After this, I was lucky enough to be assigned a 3064, which completed successfully, before getting the same unit again.
This time I got...

Code:

[04:44:27] Completed 82500 out of 250000 steps  (33 percent)
[04:54:11] Warning:  long 1-4 interactions
[04:54:11] Quit 101 - NaN detected: (ener[20])
[04:54:11] 
[04:54:11] Simulation instability has been encountered. The run has entered a
[04:54:11]   state from which no further progress can be made.
[04:54:11] This may be the correct result of the simulation, however if you
[04:54:11]   often see other project units terminating early like this
[04:54:11]   too, you may wish to check the stability of your computer (issues
[04:54:11]   such as high temperature, overclocking, etc.).
[04:54:11] Going to send back what have done.
[04:54:11] logfile size: 80224
[04:54:11] - Writing 80774 bytes of core data to disk...
[04:54:11]   ... Done.
[04:54:11] - Failed to delete work/wudata_04.arc
[04:54:11] Warning:  check for stray files
[04:56:11] 
[04:56:11] Folding@home Core Shutdown: EARLY_UNIT_END
[04:56:11] 
[04:56:11] Folding@home Core Shutdown: EARLY_UNIT_END
[04:56:15] CoreStatus = 7B (123)
[04:56:15] Client-core communications error: ERROR 0x7b
[04:56:15] Deleting current work unit & continuing...
[04:58:19] - Warning: Could not delete all work unit files (4): Core returned invalid code
[04:58:19] Trying to send all finished work units
[04:58:19] + No unsent completed units remaining.
Perhaps something is wrong with this unit?

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Posted: Fri May 23, 2008 2:25 pm
by kasson
Possibly. Multiple failures on a processor running at stock speed that is successfully completing other WUs argue that the WU is at least more finicky than most. However, the fact that it's failing at different points means that there's not one single place that reproducibly kills the WU.
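
For anyone who wants to check this on their own logs, here is a minimal sketch of the comparison, assuming each failed run's log is available as text. The function names are hypothetical, and the last "Completed ... steps" line is only the last progress report before the failure, not the exact failing step.

```python
import re

# Matches client log lines such as:
# [11:36:31] Completed 105000 out of 250000 steps  (42 percent)
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")


def last_completed_step(log_text):
    """Return (steps_done, total_steps) from the last progress line, or None."""
    matches = PROGRESS.findall(log_text)
    if not matches:
        return None
    done, total = matches[-1]
    return int(done), int(total)


def fails_at_same_point(logs):
    """True if every log's last reported progress point is identical."""
    points = {last_completed_step(log) for log in logs}
    return len(points) == 1


# Excerpts taken from the three failed runs posted above.
runs = [
    "[11:36:31] Completed 105000 out of 250000 steps  (42 percent)\n"
    "[11:40:10] Gromacs cannot continue further.",
    "[19:40:04] Completed 67500 out of 250000 steps  (27 percent)\n"
    "[19:48:32] Gromacs cannot continue further.",
    "[04:44:27] Completed 82500 out of 250000 steps  (33 percent)\n"
    "[04:54:11] Quit 101 - NaN detected: (ener[20])",
]

print([last_completed_step(r) for r in runs])
print(fails_at_same_point(runs))
```

For the three runs above this reports 105000, 67500 and 82500 steps, so the failures do not share a single point.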

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Posted: Sat May 24, 2008 12:25 am
by Oldhat
Thanks for the reply, kasson. If I get it again, perhaps I'll be lucky enough to get it on a different computer.

Cheers

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Posted: Sat May 24, 2008 6:48 pm
by Mobius0412
I was having multiple failures with a 2665 WU. I "sneakered" it to another box and successfully completed it. I tracked my problem down to a bad HDD. I replaced the drive with notfred's diskless CD and all seems to be working now.

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Posted: Sun May 25, 2008 5:06 am
by Oldhat
Thanks for the suggestion, Mobius0412. I monitor my folders at least twice a day, but will definitely be keeping a closer eye on that box. As kasson suggested, though, it may purely have been a particularly finicky WU, as that PC has successfully completed a 3064 and has just finished a different 2665.

Rather sadly, it is now 39% of the way through yet another one.

They definitely knock the PPD for six, even after the points boost.