
Project: 4446 (Run 520, Clone 1, Gen 9) - Starting at 22%?

Posted: Sat Mar 28, 2009 1:56 am
by PeterA
OK. This one's got me baffled. I finished one WU, started a new WU, then found out it's already partway finished. :?

Code:

[01:28:03] Folding@home Core Shutdown: FINISHED_UNIT
[01:28:07] CoreStatus = 64 (100)
[01:28:07] Sending work to server
[01:28:07] Project: 4441 (Run 228, Clone 2, Gen 10)
[01:28:07] - Read packet limit of 540015616... Set to 524286976.


[01:28:07] + Attempting to send results [March 28 01:28:07 UTC]
[01:28:08] + Results successfully sent
[01:28:08] Thank you for your contribution to Folding@Home.
[01:28:08] + Number of Units Completed: 106

[01:28:12] - Preparing to get new work unit...
[01:28:12] + Attempting to get work packet
[01:28:12] - Connecting to assignment server
[01:28:13] - Successful: assigned to (171.67.108.13).
[01:28:13] + News From Folding@Home: Welcome to Folding@Home
[01:28:13] Loaded queue successfully.
[01:28:15] + Closed connections
[01:28:15] 
[01:28:15] + Processing work unit
[01:28:15] Core required: FahCore_78.exe
[01:28:15] Core found.
[01:28:15] Working on queue slot 01 [March 28 01:28:15 UTC]
[01:28:15] + Working ...
[01:28:15] 
[01:28:15] *------------------------------*
[01:28:15] Folding@Home Gromacs Core
[01:28:15] Version 1.90 (March 8, 2006)
[01:28:15] 
[01:28:15] Preparing to commence simulation
[01:28:15] - Looking at optimizations...
[01:28:15] - Previous termination of core was improper.
[01:28:15] - Going to use standard loops.
[01:28:15] - Files status OK
[01:28:15] - Expanded 238029 -> 1167489 (decompressed 490.4 percent)
[01:28:15] 
[01:28:15] Project: 4446 (Run 520, Clone 1, Gen 9)
[01:28:15] 
[01:28:15] Entering M.D.
[01:28:36] (Starting from checkpoint)
[01:28:36] Protein: p4446_Seq45_Amber03
[01:28:36] 
[01:28:36] Writing local files
[01:30:07] Completed 327840 out of 1500000 steps  (22%)
Could it be that I started this one once and never finished it?

Re: Project: 4446 (Run 520, Clone 1, Gen 9) - Starting at 22%?

Posted: Sat Mar 28, 2009 2:31 am
by bruce
This is a known problem. Sometimes a WU that has an EUE fails to clean up all of its files. If one of those happens to be a checkpoint, then some time later another WU will be assigned the same queue position, find the checkpoint, and falsely assume it belongs to the new WU. The WU is trash.

This problem has been fixed for the SMP core and for the GPU core but apparently not yet for the x86 Gromacs core.
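The failure mode described above can be sketched in a few lines. This is an illustrative model only, not the real FahCore_78 logic: the `wudata_NN.ckp` file name and the resume check are assumptions for demonstration, but they show why a resume keyed only by queue slot will adopt a stale checkpoint left behind by an earlier, failed unit.

```python
import os
import tempfile

def start_work_unit(work_dir, queue_slot):
    # Hypothetical naming scheme: one checkpoint file per queue slot.
    ckpt = os.path.join(work_dir, f"wudata_{queue_slot:02d}.ckp")
    if os.path.exists(ckpt):
        # Note what is missing: no check that the checkpoint belongs to
        # THIS project/run/clone/gen. A leftover file from an EUE'd unit
        # in the same slot therefore resumes the new WU mid-stream.
        return f"(Starting from checkpoint in slot {queue_slot:02d})"
    return f"(Starting from step 0 in slot {queue_slot:02d})"

# Demo: a stale checkpoint sits in slot 01, slot 02 is clean.
demo = tempfile.mkdtemp()
open(os.path.join(demo, "wudata_01.ckp"), "w").close()
print(start_work_unit(demo, 1))  # (Starting from checkpoint in slot 01)
print(start_work_unit(demo, 2))  # (Starting from step 0 in slot 02)
```

The fix in the SMP and GPU cores presumably amounts to validating that a found checkpoint matches the current work unit before resuming from it.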

Re: Project: 4446 (Run 520, Clone 1, Gen 9) - Starting at 22%?

Posted: Sat Mar 28, 2009 2:33 am
by PeterA
Should I let it run its course, or should I remove it somehow?

Re: Project: 4446 (Run 520, Clone 1, Gen 9) - Starting at 22%?

Posted: Sat Mar 28, 2009 2:39 am
by bruce
I'd discard it. There's a pretty good chance the same WU will be reassigned to you and it will start from 0.

As long as you don't have any completed WUs still waiting to upload, stop the client, delete the entire WORK folder, and restart. If you do have anything still to upload, delete everything in WORK except the files named wuresults*.dat.
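The cleanup above can also be scripted. Here is a minimal sketch in Python; the WORK-folder layout is taken on faith from the post, the `clean_work_dir` helper is hypothetical, and you should still stop the client before running anything like it:

```python
import fnmatch
import os
import tempfile

def clean_work_dir(work_dir):
    """Delete everything in work_dir except pending results (wuresults*.dat)."""
    removed = []
    for name in os.listdir(work_dir):
        if fnmatch.fnmatch(name, "wuresults*.dat"):
            continue  # keep completed results still waiting to upload
        os.remove(os.path.join(work_dir, name))
        removed.append(name)
    return sorted(removed)

# Demo against a throwaway directory standing in for the client's WORK folder.
demo = tempfile.mkdtemp()
for f in ("wudata_01.dat", "wudata_01.ckp", "wuresults_02.dat"):
    open(os.path.join(demo, f), "w").close()

print(clean_work_dir(demo))      # ['wudata_01.ckp', 'wudata_01.dat']
print(sorted(os.listdir(demo)))  # ['wuresults_02.dat']
```

The pattern match is the whole point: the stale data and checkpoint files go, while any results that still need to reach the server survive.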