Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Fri Aug 06, 2010 6:42 pm
by DrSpalding
Hi,

I received an EUE (CoreStatus = 72 (114)) on the WU above, with partial work uploaded 00:44:21 UTC 5 August 2010. I suspect it is a bad WU:

[23:47:51] - Connecting to assignment server
[23:47:52] - Successful: assigned to (171.64.65.62).
[23:47:52] + News From Folding@Home: Welcome to Folding@Home
[23:47:52] Loaded queue successfully.
[23:47:53] + Closed connections
[23:47:53]
[23:47:53] + Processing work unit
[23:47:53] Core required: FahCore_78.exe
[23:47:53] Core found.
[23:47:53] Working on queue slot 01 [August 4 23:47:53 UTC]
[23:47:53] + Working ...
[23:47:53]
[23:47:53] *------------------------------*
[23:47:53] Folding@Home Gromacs Core
[23:47:53] Version 1.90 (March 8, 2006)
[23:47:53]
[23:47:53] Preparing to commence simulation
[23:47:53] - Looking at optimizations...
[23:47:53] - Created dyn
[23:47:53] - Files status OK
[23:47:54] - Expanded 519826 -> 2536325 (decompressed 487.9 percent)
[23:47:54] - Starting from initial work packet
[23:47:54]
[23:47:54] Project: 6503 (Run 19, Clone 16, Gen 72)
[23:47:54]
[23:47:54] Assembly optimizations on if available.
[23:47:54] Entering M.D.
[23:48:00] Protein: TR462_B_20 in water
[23:48:00]
[23:48:00] Writing local files
[23:48:00] Extra SSE boost OK.
[23:48:01] Writing local files
[23:48:01] Completed 0 out of 250000 steps  (0%)
[23:54:29] Writing local files
[23:54:29] Completed 2500 out of 250000 steps  (1%)
[00:00:59] Writing local files
[00:00:59] Completed 5000 out of 250000 steps  (2%)
[00:09:19] Writing local files
[00:09:19] Completed 7500 out of 250000 steps  (3%)
[00:17:03] Writing local files
[00:17:04] Completed 10000 out of 250000 steps  (4%)
[00:25:45] Writing local files
[00:25:45] Completed 12500 out of 250000 steps  (5%)
[00:33:22] Writing local files
[00:33:22] Completed 15000 out of 250000 steps  (6%)
[00:39:52] Writing local files
[00:39:52] Completed 17500 out of 250000 steps  (7%)
[00:44:18] Gromacs cannot continue further.
[00:44:18] Going to send back what have done.
[00:44:18] logfile size: 0
[00:44:18] Warning: Core could not open logfile.
[00:44:18] - Writing 536 bytes of core data to disk...
[00:44:18] Done: 24 -> 69 (compressed to 287.5 percent)
[00:44:18]   ... Done.
[00:44:18]
[00:44:18] Folding@home Core Shutdown: EARLY_UNIT_END
[00:44:21] CoreStatus = 72 (114)
[00:44:21] Sending work to server
[00:44:21] Project: 6503 (Run 19, Clone 16, Gen 72)


[00:44:21] + Attempting to send results [August 5 00:44:21 UTC]
[00:44:22] + Results successfully sent
[00:44:22] Thank you for your contribution to Folding@Home.

Re: Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Fri Aug 06, 2010 6:47 pm
by sortofageek
That one gave trouble to three folders, including you, but was completed successfully by a fourth person.

Re: Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Fri Aug 06, 2010 11:59 pm
by DrSpalding
That would worry me; having two different results suggests either a numerically unstable solution or a rounding or error calculation that is not working properly. Was there a difference in platform and/or CPU (AMD vs. Intel) that could account for the differences? This is a Core 2 Duo in a Dell laptop running Win7 x64.

Re: Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Sat Aug 07, 2010 12:02 am
by sortofageek
I don't have a way to compare the different computers, no. I wouldn't let it worry me, however, unless I continued to have problems completing WUs which others could complete.

Re: Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Sat Aug 07, 2010 5:45 am
by bruce
I can't determine the OS of one of the folks with an error, but everybody else was running Windows.

An important part of protein simulation is stochastic modeling of thermal effects (see "Brownian motion", though that's only part of the story). Different FahCores have different ways of dealing with the calculation of random events and the generation of random numbers. For some projects, a given WU is absolutely repeatable, while for other projects, the random motion may originate from a different starting point (seed) that generates a different but equivalent pattern of randomness. The latter can lead to a degree of unpredictability for any single WU, even though for that specific Project, the overall results are statistically valid. I don't know which situation describes this project, but it's one possible explanation.
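
To illustrate the seed idea, here is a toy sketch in Python. It is not how any FahCore works; the `simulate` function, the step count, and the Gaussian "thermal kick" are all assumptions made up for the example. It only shows that the same seed reproduces a run exactly, while different seeds give different individual runs whose statistics still agree.

```python
import random
import statistics


def simulate(seed, steps=1000, kick=1.0):
    """Toy 1-D random walk (hypothetical, for illustration only).

    Each step adds a Gaussian 'thermal kick' drawn from a generator
    that is initialised with the given seed.
    """
    rng = random.Random(seed)  # private generator, so runs do not interfere
    x = 0.0
    for _ in range(steps):
        x += rng.gauss(0.0, kick)
    return x


# Same seed: the run is exactly repeatable.
a = simulate(seed=42)
b = simulate(seed=42)

# Different seed: a different, but equally valid, trajectory.
c = simulate(seed=43)

# Many seeds: individual end points differ, yet the ensemble behaves
# as expected (mean near 0, spread near sqrt(steps) * kick).
finals = [simulate(seed=s) for s in range(500)]
mean = statistics.mean(finals)
spread = statistics.pstdev(finals)

print("same seed identical:", a == b)
print("different seed identical:", a == c)
print("ensemble mean:", round(mean, 2), "spread:", round(spread, 2))
```

In this toy model no single run is "the" answer; only the ensemble statistics are meaningful, which matches the point that an individual WU can behave unpredictably while the project as a whole stays statistically valid.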

Re: Project: 6503 (Run 19, Clone 16, Gen 72)

Posted: Sat Aug 07, 2010 5:36 pm
by DrSpalding
Thanks Bruce for the 'splanation. I suspected that there was likely a bit of randomness in the simulations and am happy to hear it doesn't come from floating point errors or rounding.