
Project: 5509 (Run 4, Clone 63, Gen 0) UM after completion

Posted: Tue Sep 09, 2008 1:25 am
by ^w^ing
This is the first large WU I have received (there were some large WUs a month or two ago, but only for about a day). It ran from 0 to 100% without an issue, but even after the hash checks passed it ended with 'Gromacs cannot continue further' » UNSTABLE_MACHINE. I don't see any reason why that should happen, and it has never happened before.

Code:

[20:37:59] + Processing work unit
[20:37:59] Core required: FahCore_11.exe
[20:37:59] Core found.
[20:37:59] Working on queue slot 03 [September 8 20:37:59 UTC]
[20:37:59] + Working ...
[20:37:59] - Calling '.\FahCore_11.exe -dir work/ -suffix 03 -priority 96 -checkpoint 30 -forceasm -verbose -lifeline 3688 -version 620'

[20:37:59] 
[20:37:59] *------------------------------*
[20:37:59] Folding@Home GPU Core - Beta
[20:37:59] Version 1.09 (Fri Aug 1 11:46:54 PDT 2008)
[20:37:59] 
[20:37:59] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:37:59] Build host: amoeba
[20:37:59] Board Type: Nvidia
[20:37:59] Core      : 
[20:37:59] Preparing to commence simulation
[20:37:59] - Assembly optimizations manually forced on.
[20:37:59] - Not checking prior termination.
[20:37:59] - Expanded 84771 -> 447228 (decompressed 527.5 percent)
[20:37:59] Called DecompressByteArray: compressed_data_size=84771 data_size=447228, decompressed_data_size=447228 diff=0
[20:37:59] - Digital signature verified
[20:37:59] 
[20:37:59] Project: 5509 (Run 4, Clone 63, Gen 0)
[20:37:59] 
[20:37:59] Assembly optimizations on if available.
[20:37:59] Entering M.D.
[20:38:05] Working on p5509_lam5w_330K_g91
[20:38:09] Client config found, loading data.
[20:38:09] Starting GUI Server
[20:41:42] Completed 1%
[20:45:12] Completed 2%
[20:48:42] Completed 3%
[20:52:10] Completed 4%
[20:55:33] Completed 5%
[20:58:59] Completed 6%
[21:02:24] Completed 7%
[21:05:49] Completed 8%
[21:09:13] Completed 9%
[21:12:39] Completed 10%
[21:15:25] Completed 11%
[21:17:36] Completed 12%
[21:19:52] Completed 13%
[21:22:04] Completed 14%
[21:24:17] Completed 15%
[21:26:27] Completed 16%
[21:28:42] Completed 17%
[21:31:18] Completed 18%
[21:33:37] Completed 19%
[21:35:52] Completed 20%
[21:38:08] Completed 21%
[21:40:22] Completed 22%
[21:42:43] Completed 23%
[21:44:59] Completed 24%
[21:47:14] Completed 25%
[21:50:44] Completed 26%
[21:53:01] Completed 27%
[21:55:18] Completed 28%
[21:57:35] Completed 29%
[21:59:52] Completed 30%
[22:02:17] Completed 31%
[22:04:36] Completed 32%
[22:06:55] Completed 33%
[22:09:25] Completed 34%
[22:11:55] Completed 35%
[22:14:15] Completed 36%
[22:16:32] Completed 37%
[22:18:48] Completed 38%
[22:21:07] Completed 39%
[22:23:31] Completed 40%
[22:25:47] Completed 41%
[22:28:05] Completed 42%
[22:30:23] Completed 43%
[22:33:03] Completed 44%
[22:35:55] Completed 45%
[22:38:33] Completed 46%
[22:41:12] Completed 47%
[22:43:50] Completed 48%
[22:46:11] Completed 49%
[22:48:36] Completed 50%
[22:50:53] Completed 51%
[22:53:11] Completed 52%
[22:55:31] Completed 53%
[22:57:52] Completed 54%
[23:00:13] Completed 55%
[23:02:38] Completed 56%
[23:04:55] Completed 57%
[23:07:41] Completed 58%
[23:10:01] Completed 59%
[23:13:49] Completed 60%
[23:18:23] Completed 61%
[23:22:12] Completed 62%
[23:25:47] Completed 63%
[23:29:23] Completed 64%
[23:32:58] Completed 65%
[23:36:30] Completed 66%
[23:40:05] Completed 67%
[23:43:52] Completed 68%
[23:47:58] Completed 69%
[23:51:37] Completed 70%
[23:55:50] Completed 71%
[23:59:30] Completed 72%
[00:03:47] Completed 73%
[00:06:34] Completed 74%
[00:08:50] Completed 75%
[00:11:04] Completed 76%
[00:13:20] Completed 77%
[00:15:35] Completed 78%
[00:17:51] Completed 79%
[00:20:06] Completed 80%
[00:22:21] Completed 81%
[00:24:35] Completed 82%
[00:26:49] Completed 83%
[00:29:04] Completed 84%
[00:31:18] Completed 85%
[00:33:33] Completed 86%
[00:35:48] Completed 87%
[00:38:04] Completed 88%
[00:40:17] Completed 89%
[00:42:29] Completed 90%
[00:44:44] Completed 91%
[00:46:58] Completed 92%
[00:49:13] Completed 93%
[00:51:27] Completed 94%
[00:53:42] Completed 95%
[00:55:56] Completed 96%
[00:58:11] Completed 97%
[01:00:25] Completed 98%
[01:02:37] Completed 99%
[01:04:48] Completed 100%
[01:05:48] 
[01:05:48] Finished Work Unit:
[01:05:48] - Reading up to 30216 from "work/wudata_03.trr": Read 30216
[01:05:48] trr file hash check passed.
[01:05:48] - Reading up to 201012 from "work/wudata_03.xtc": Read 201012
[01:05:48] xtc file hash check passed.
[01:05:48] edr file hash check passed.
[01:05:48] logfile size: 35687
[01:05:48] Gromacs cannot continue further.
[01:05:48] Going to send back what have done.
[01:05:49] logfile size: 35687 info=35687 bed=0 hdr=23
[01:05:49] - Writing 36223 bytes of core data to disk...
[01:05:49] Done: 35711 -> 7380 (compressed to 20.6 percent)
[01:05:49]   ... Done.
[01:05:49] 
[01:05:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:05:53] CoreStatus = 7A (122)
[01:05:53] Sending work to server
[01:05:53] Project: 5509 (Run 4, Clone 63, Gen 0)
[01:05:53] - Read packet limit of 540015616... Set to 524286976.


[01:05:53] + Attempting to send results [September 9 01:05:53 UTC]
[01:05:53] - Reading file work/wuresults_03.dat from core
[01:05:53]   (Read 7892 bytes from disk)
[01:05:53] Connecting to http://171.64.65.106:8080/
[01:05:54] Posted data.
[01:05:54] Initial: 0000; - Uploaded at ~8 kB/s
[01:05:54] - Averaged speed for that direction ~154 kB/s
[01:05:54] + Results successfully sent
[01:05:54] Thank you for your contribution to Folding@Home.
I wonder if this is going to happen with all the new large work units I get :e?:
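For anyone comparing logs like the one above, here is a minimal Python sketch that pulls out the time spent on each percent and the final core status. It is not part of the client; the function names `frame_times` and `core_result` are made up for illustration, and the patterns assume the exact line format shown in this log. In the log above the per-percent time swings between roughly 2 and 4.5 minutes, which this kind of script makes easy to spot.

```python
import re

# Patterns assume the line format seen in the pasted log.
PROGRESS = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+)%")
SHUTDOWN = re.compile(r"Folding@home Core Shutdown: (\w+)")
STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def frame_times(lines):
    """Return [(percent, seconds since the previous 'Completed' line), ...]."""
    out = []
    prev = None
    for line in lines:
        m = PROGRESS.match(line)
        if not m:
            continue
        h, mi, s, pct = map(int, m.groups())
        now = h * 3600 + mi * 60 + s
        if prev is not None:
            # Log timestamps carry no date, so wrap around midnight.
            out.append((pct, (now - prev) % 86400))
        prev = now
    return out


def core_result(lines):
    """Return (shutdown name, status code as int), or None for missing parts."""
    name = code = None
    for line in lines:
        m = SHUTDOWN.search(line)
        if m:
            name = m.group(1)
        m = STATUS.search(line)
        if m:
            code = int(m.group(1), 16)  # the log prints the code in hex first
    return name, code
```

Usage: read the client log into a list of lines (the file name depends on your install) and call `frame_times(lines)` and `core_result(lines)`.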

Re: Project: 5509 (Run 4, Clone 63, Gen 0) UM after completion

Posted: Tue Sep 09, 2008 9:15 am
by toTOW
It looks like someone else completed it without the error ... that's a bit strange indeed.

Re: Project: 5509 (Run 4, Clone 63, Gen 0) UM after completion

Posted: Tue Sep 09, 2008 4:53 pm
by ^w^ing
Project: 5509 (Run 9, Clone 254, Gen 0)

Exactly the same thing happened; reverting the OC to stock values didn't make any difference.

Project: 5513 (Run 1, Clone 67, Gen 0)
&
Project: 5512 (Run 2, Clone 72, Gen 0)

Both failed with 'mdrun_gpu returned -1' immediately after engaging the core.

I'm putting my GPU offline until this is resolved. I can finally have a nice sleep after two months of nonstop horrendous noise :lol:

Re: Project: 5509 (Run 4, Clone 63, Gen 0) UM after completion

Posted: Tue Sep 09, 2008 5:00 pm
by ^w^ing
PS: I backed up the
Project: 5509 (Run 9, Clone 254, Gen 0)
WU when it was at 99% and disconnected from the internet so that the client couldn't send results after a failure. I then tried finishing the unit with the OC (failed) and without the OC (failed). After that I gave up, reconnected, and wrote my previous post. I turned the OC on again and let the WU finish so that it could report the EUE and the WU could be reissued to someone else. But this time the unit finished correctly. I'm confused. I'm still taking the GPU offline, because I can't babysit the GPU client every 3 hours and retry a WU multiple times :e?:
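A rough sketch of the backup step described above, in Python. It assumes the client keeps the unit's state in a `work/` folder (as the log paths suggest) plus a `queue.dat` file in the client directory; both of those, and the `backup_wu` name, are assumptions for illustration, so check your own install first. Stop the client before copying, otherwise the files may be mid-write.

```python
import shutil
import time
from pathlib import Path


def backup_wu(client_dir, backup_root):
    """Copy work/ (and queue.dat, if present) into a timestamped folder."""
    client_dir = Path(client_dir)
    dest = Path(backup_root) / time.strftime("wu-backup-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True)

    # Assumed layout: <client_dir>/work holds the checkpoint and data files.
    shutil.copytree(client_dir / "work", dest / "work")

    # Assumed: queue.dat tracks the queue slots; skip it if it is not there.
    queue = client_dir / "queue.dat"
    if queue.exists():
        shutil.copy2(queue, dest / "queue.dat")
    return dest
```

Restoring is the reverse: with the client stopped, copy the saved `work/` folder and `queue.dat` back over the live ones.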

Re: Project: 5509 (Run 4, Clone 63, Gen 0) UM after completion

Posted: Thu Sep 11, 2008 1:20 am
by Outback_Jon
See this thread for some more information:

viewtopic.php?f=19&t=5495

Same family of WU, apparently.