Project 2653 Run 1, Clone 129, Gen 143 - EUE

Posted: Fri Feb 19, 2010 11:07 pm
by djchandler
Running the 6.29 beta client on Windows 7 Ultimate x64 (RTM). When running Core_A1, I consistently get "Working with standard loops on this execution. - Previous termination of core was improper." If an A3 core WU follows, it either hangs or shows a ridiculously long TPF until I reboot.

Hardware: Phenom II X4 @ 2.8 GHz, 4 GB Corsair XMS2 DDR2-1066 RAM, conservative overclock that is stable (IMHO) with the FSB increased ~3.5% (207 MHz, DDR2 clock 1103), CPU multiplier up to 14.5, discrete ATI 3450 graphics card, Gigabyte MA785GM-US2H mainboard with the HD 4200 IGP disabled (Hybrid CrossFire capable).
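As a rough check of those overclock numbers, here is a small sketch of the arithmetic. It assumes a stock reference clock of 200 MHz and a stock multiplier of 14 (200 x 14 = 2800 MHz); neither value is stated in the post, and the function name is only illustrative.

```python
# Assumptions (not stated in the post): stock reference clock and multiplier.
BASE_REF_MHZ = 200.0   # assumed stock reference ("FSB") clock
STOCK_MULT = 14.0      # assumed stock multiplier, 200 x 14 = 2800 MHz


def overclock_summary(ref_mhz, cpu_mult, rated_ddr=1066.0):
    """Return the effective clocks implied by a reference clock and multiplier."""
    ratio = ref_mhz / BASE_REF_MHZ
    cpu_mhz = ref_mhz * cpu_mult
    stock_cpu_mhz = BASE_REF_MHZ * STOCK_MULT
    return {
        "ref_increase_pct": (ratio - 1.0) * 100.0,
        "cpu_mhz": cpu_mhz,
        "cpu_increase_pct": (cpu_mhz / stock_cpu_mhz - 1.0) * 100.0,
        # Memory scales with the reference clock when the divider is unchanged.
        "ddr_rate": rated_ddr * ratio,
    }


summary = overclock_summary(ref_mhz=207.0, cpu_mult=14.5)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under those assumptions, the 3.5% reference clock bump combined with the 14.5 multiplier puts the CPU at roughly 3.0 GHz, about 7% over stock, so the CPU is pushed harder than the FSB figure alone suggests.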

[11:08:31] Working on queue slot 01 [February 19 11:08:31 UTC]
[11:08:31] + Working ...
[11:08:31] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 2892 -version 629'

[11:08:31] 
[11:08:31] *------------------------------*
[11:08:31] Folding@Home Gromacs SMP Core
[11:08:31] Version 1.74 (March 10, 2007)
[11:08:31] 
[11:08:31] Preparing to commence simulation
[11:08:31] - Ensuring status. Please wait.
[11:08:48] - Looking at optimizations...
[11:08:48] - Working with standard loops on this execution.
[11:08:48] - Previous termination of core was improper.
[11:08:48] - Files status OK
[11:08:51] - Expanded 2431290 -> 1289199- Starting from initial work pa- Starting from initial work packet
[11:08:51] 
[11:08:51] Project: 2653 (Run 1, Clone 129, Gen 143)
[11:08:51] 
[11:08:51] Entering M.D.
[11:08:57] Rejecting checkpoint
[11:08:58] SE boost OK.
[11:08:58] ein in POPCExtra SSE boost OK.
[11:08:58] 
[11:08:59] Extra SSE boost OK.
[11:08:59] Writing local files
[11:08:59] Completed 0 out of 500000 steps  (0 percent)
[11:19:09] Writing local files
[11:19:10] Completed 5000 out of 500000 steps  (1 percent)
[11:29:19] Writing local files
[11:29:19] Completed 10000 out of 500000 steps  (2 percent)
<snip>
[21:09:31] Completed 295000 out of 500000 steps  (59 percent)
[21:19:41] Writing local files
[21:19:41] Completed 300000 out of 500000 steps  (60 percent)
[21:29:52] Writing local files
[21:29:52] Completed 305000 out of 500000 steps  (61 percent)
[21:32:56] Quit 101 - NaN detected: (ener[16])
[21:32:56] 
[21:32:56] Simulation instability has been encountered. The run has entered a
[21:32:56]   state from which no further progress can be made.
[21:32:56] This may be the correct result of the simulation, however if you
[21:32:56]   often see other project units terminating early like this
[21:32:56]   too, you may wish to check the stability of your computer (issues
[21:32:56]   such as high temperature, overclocking, etc.).
[21:32:56] Going to send back what have done.
[21:32:56] logfile size: 8783
[21:32:56] - Writing 9333 bytes of core data to disk...
[21:32:56]   ... Done.
[21:34:56] 
[21:34:56] Folding@home Core Shutdown: EARLY_UNIT_END
[21:34:56] 
[21:34:56] Folding@home Core Shutdown: EARLY_UNIT_END
[21:34:59] CoreStatus = 7B (123)
[21:34:59] Sending work to server
[21:34:59] Project: 2653 (Run 1, Clone 129, Gen 143)


[21:34:59] + Attempting to send results [February 19 21:34:59 UTC]
[21:34:59] - Reading file work/wuresults_01.dat from core
[21:34:59]   (Read 9333 bytes from disk)
[21:34:59] Connecting to http://171.64.65.64:8080/
[21:35:00] Posted data.
[21:35:00] Initial: 0000; - Uploaded at ~10 kB/s
[21:35:00] - Averaged speed for that direction ~94 kB/s
[21:35:00] + Results successfully sent
[21:35:00] Thank you for your contribution to Folding@Home.
[21:35:04] - Warning: Could not delete all work unit files (1): Core returned invalid code
[21:35:04] Trying to send all finished work units
[21:35:04] + No unsent completed units remaining.
[21:35:04] - Preparing to get new work unit...
[21:35:04] Cleaning up work directory
[21:35:05] + Attempting to get work packet
[21:35:05] Passkey found
[21:35:05] - Will indicate memory of 1536 MB
[21:35:05] - Connecting to assignment server
[21:35:05] Connecting to http://assign.stanford.edu:8080/
[21:35:05] Posted data.
[21:35:05] Initial: ED82; - Successful: assigned to (130.237.232.140).
[21:35:05] + News From Folding@Home: Welcome to Folding@Home
[21:35:06] Loaded queue successfully.
[21:35:06] Connecting to http://130.237.232.140:8080/
[21:35:07] Posted data.
[21:35:07] Initial: 0000; - Receiving payload (expected size: 1799285)
[21:35:12] - Downloaded at ~351 kB/s
[21:35:12] - Averaged speed for that direction ~493 kB/s
[21:35:12] + Received work.
[21:35:12] Trying to send all finished work units
[21:35:12] + No unsent completed units remaining.
[21:35:12] + Closed connections
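For anyone who wants to read the TPF (time per frame) off a log like the one above, a minimal sketch follows. It assumes one frame is 1% of the work unit and that progress lines keep the "Completed N out of M steps" format shown in the log; the function name is hypothetical, and the three sample lines are copied from the log.

```python
import re
from datetime import datetime, timedelta

# Matches client progress lines such as:
# [11:19:10] Completed 5000 out of 500000 steps  (1 percent)
PROGRESS = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps")


def time_per_frame(log_text):
    """Return the time between consecutive 1% progress lines as timedeltas."""
    points = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            points.append((stamp, int(match.group(2)), int(match.group(3))))

    deltas = []
    for (t0, s0, total), (t1, s1, _) in zip(points, points[1:]):
        if s1 - s0 != total // 100:
            continue  # skip gaps in the log, e.g. around a <snip>
        delta = t1 - t0
        if delta < timedelta(0):
            delta += timedelta(days=1)  # timestamps wrapped past midnight
        deltas.append(delta)
    return deltas


sample = """\
[11:08:59] Completed 0 out of 500000 steps  (0 percent)
[11:19:10] Completed 5000 out of 500000 steps  (1 percent)
[11:29:19] Completed 10000 out of 500000 steps  (2 percent)
"""

frames = time_per_frame(sample)
average = sum(frames, timedelta(0)) / len(frames)
print("TPF per frame:", [str(d) for d in frames])
print("Average TPF:", average)
```

On the three sample lines this gives frames of 10 min 11 s and 10 min 9 s, so an average TPF of about 10 min 10 s for this WU before it failed.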

Re: Project 2653 Run 1, Clone 129, Gen 143 - EUE

Posted: Sat Feb 20, 2010 5:23 am
by bruce
For you, Project: 2653 (Run 1, Clone 129, Gen 143) got calculation errors at 61%. Someone else was able to complete the same WU when it was reassigned. That probably means your hardware is overclocked too much or not being cooled adequately (such as dust in the heatsink).

If you're overclocking, have you run StressCPU and memtest recently?

Re: Project 2653 Run 1, Clone 129, Gen 143 - EUE

Posted: Sat Feb 20, 2010 8:37 pm
by djchandler
Will check hardware OC, stress test and adjust. In the other thread I started, viewtopic.php?f=58&t=13533, Wrish pointed out this could be a RAM problem. Will take your recommendation and proceed accordingly.

EDIT: Found an error in my OC utility (Gigabyte EasyTune) settings. I had the BIOS set for RAM @ 2.1 V, so I assumed that was the default setting in the utility. Big mistake: the OC utility was taking it back to 1.85 V, so now all may be well. I don't know why that changed. Recovering from my sixth surgery in three years, so I'm blaming the meds (again).