Project: 2665 (Run 0, Clone 106, Gen 6)

Moderators: Site Moderators, FAHC Science Team

Oldhat
Posts: 30
Joined: Mon Dec 03, 2007 11:42 am
Location: Auckland

Post by Oldhat »

Just having a few problems completing this unit.
Windows XP Pro
2 GB RAM, Intel E8400 (stock)

The first run failed.

[11:36:31] Writing local files
[11:36:31] Completed 105000 out of 250000 steps  (42 percent)
[11:40:10] Gromacs cannot continue further.
[11:40:10] Going to send back what have done.
[11:40:10] logfile size: 87625
[11:40:10] - Writing 88161 bytes of core data to disk...
[11:40:10]   ... Done.
[11:40:10] - Failed to delete work/wudata_01.sas
[11:40:10] - Failed to delete work/wudata_01.goe
[11:40:10] Warning:  check for stray files
[11:42:11] 
[11:42:11] Folding@home Core Shutdown: EARLY_UNIT_END
[11:42:11] 
[11:42:11] Folding@home Core Shutdown: EARLY_UNIT_END
[11:42:14] CoreStatus = 7B (123)
[11:42:14] Client-core communications error: ERROR 0x7b
[11:42:14] Deleting current work unit & continuing...

On the next run I shut the client down and restarted it before the point of failure, and got:

[19:40:04] Completed 67500 out of 250000 steps  (27 percent)
[19:48:32] Warning:  long 1-4 interactions
[19:48:32] Gromacs cannot continue further.
[19:48:32] Going to send back what have done.
[19:48:32] logfile size: 69039
[19:48:32] - Writing 69575 bytes of core data to disk...
[19:48:32]   ... Done.
[19:48:32] - Failed to delete work/wudata_02.arc
[19:48:32] Warning:  check for stray files
[19:50:32] 
[19:50:32] Folding@home Core Shutdown: EARLY_UNIT_END
[19:50:32] 
[19:50:32] Folding@home Core Shutdown: EARLY_UNIT_END
[19:50:35] CoreStatus = 7B (123)
[19:50:35] Client-core communications error: ERROR 0x7b
[19:50:35] Deleting current work unit & continuing...

After this I was lucky enough to be assigned a 3064, which completed successfully, before being given the same unit again.
This time I got:

[04:44:27] Completed 82500 out of 250000 steps  (33 percent)
[04:54:11] Warning:  long 1-4 interactions
[04:54:11] Quit 101 - NaN detected: (ener[20])
[04:54:11] 
[04:54:11] Simulation instability has been encountered. The run has entered a
[04:54:11]   state from which no further progress can be made.
[04:54:11] This may be the correct result of the simulation, however if you
[04:54:11]   often see other project units terminating early like this
[04:54:11]   too, you may wish to check the stability of your computer (issues
[04:54:11]   such as high temperature, overclocking, etc.).
[04:54:11] Going to send back what have done.
[04:54:11] logfile size: 80224
[04:54:11] - Writing 80774 bytes of core data to disk...
[04:54:11]   ... Done.
[04:54:11] - Failed to delete work/wudata_04.arc
[04:54:11] Warning:  check for stray files
[04:56:11] 
[04:56:11] Folding@home Core Shutdown: EARLY_UNIT_END
[04:56:11] 
[04:56:11] Folding@home Core Shutdown: EARLY_UNIT_END
[04:56:15] CoreStatus = 7B (123)
[04:56:15] Client-core communications error: ERROR 0x7b
[04:56:15] Deleting current work unit & continuing...
[04:58:19] - Warning: Could not delete all work unit files (4): Core returned invalid code
[04:58:19] Trying to send all finished work units
[04:58:19] + No unsent completed units remaining.

Perhaps something is wrong with this unit?
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Post by kasson »

Possibly. Multiple failures on a processor running at stock speed that is successfully completing other WUs argue that the WU is at least more finicky than most. However, the fact that it's failing at different points means that there's not one single place that reproducibly kills the WU.
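
To make the "failing at different points" observation concrete, here is a minimal Python sketch that pulls the last reported progress before each failure out of a client log. It is not part of the Folding@home client. The function name `last_progress_before_failure` and the `sample` text are made up for illustration, and the sample is an abridged copy of lines from the three logs posted above. It assumes the core always prints "Going to send back what have done." when it gives up, which is true of these three logs but may not hold for every core version.

```python
import re

# Matches progress lines such as:
# [11:36:31] Completed 105000 out of 250000 steps  (42 percent)
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")

# Line printed when the core gives up on a unit (copied from the logs above;
# assumed, not guaranteed, to appear on every early end).
GIVE_UP = "Going to send back what have done."


def last_progress_before_failure(log_text):
    """Return a (steps_done, total_steps) pair for each failure in log_text."""
    failures = []
    last_seen = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            # Remember the most recent checkpoint the core reported.
            last_seen = (int(match.group(1)), int(match.group(2)))
        elif GIVE_UP in line and last_seen is not None:
            failures.append(last_seen)
            last_seen = None
    return failures


# Abridged lines from the three failed runs in this thread.
sample = """\
[11:36:31] Completed 105000 out of 250000 steps  (42 percent)
[11:40:10] Gromacs cannot continue further.
[11:40:10] Going to send back what have done.
[19:40:04] Completed 67500 out of 250000 steps  (27 percent)
[19:48:32] Warning:  long 1-4 interactions
[19:48:32] Going to send back what have done.
[04:44:27] Completed 82500 out of 250000 steps  (33 percent)
[04:54:11] Quit 101 - NaN detected: (ener[20])
[04:54:11] Going to send back what have done.
"""

for done, total in last_progress_before_failure(sample):
    print(f"failed after {done} of {total} steps ({100 * done // total} percent)")
```

On the sample this reports three different step counts (105000, 67500 and 82500 out of 250000), which is the pattern described above: no single step reproducibly kills the WU.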
Oldhat
Posts: 30
Joined: Mon Dec 03, 2007 11:42 am
Location: Auckland

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Post by Oldhat »

Thanks for the reply, kasson. If I get it again, perhaps I'll be lucky enough to get it on a different computer.

Cheers
Mobius0412
Posts: 15
Joined: Fri Jan 04, 2008 4:04 pm

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Post by Mobius0412 »

I was having multiple failures with a 2665 WU. I "sneakered" it to another box and successfully completed it. I tracked my problem down to a bad HDD. I replaced the drive with notfred's diskless CD and all seems to be working now.
Oldhat
Posts: 30
Joined: Mon Dec 03, 2007 11:42 am
Location: Auckland

Re: Project 2665 (Run 0, Clone 106, Gen 6)

Post by Oldhat »

Thanks for the suggestion, Mobius0412. I monitor my folders at least twice a day, but will definitely be keeping a closer eye on that box. As kasson suggested, though, it may purely have been a particularly finicky WU, as that PC has successfully completed a 3064 and has just finished a different 2665.

Rather sadly it is now at 39% of yet another one.

They definitely knock the PPD for six, even after the points boost.