Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Sat Jun 14, 2008 3:22 pm
by toaster8
Here is another self-deleting WU. At 74%, that is a lot of wasted computing time.

[19:56:56] Project: 3064 (Run 2, Clone 24, Gen 66)
[19:56:56] 
[19:56:56] Assembly optimizations on if available.
[19:56:56] Entering M.D.
[19:57:02] mdrunner cpfilename: 
[19:57:02] Protein: p3064_lambdaProtein: p3064_lambda5_2003Extra SSE boost OK.
[19:57:02] 
[19:57:02] Extra SSE boost OK.
[19:57:02] Writing local files
[19:57:02] Completed 0 out of 5000000 steps  (0 percent)
[20:10:51] Writing local files
[20:10:51] Completed 50000 out of 5000000 steps  (1 percent)
[20:25:47] Writing local files
[20:25:47] Completed 100000 out of 5000000 steps  (2 percent)
[20:41:00] Writing local files
[20:41:00] Completed 150000 out of 5000000 steps  (3 percent)
[20:56:13] Writing local files
[20:56:13] Completed 200000 out of 5000000 steps  (4 percent)
[21:11:26] Writing local files
[21:11:26] Completed 250000 out of 5000000 steps  (5 percent)
[21:26:39] Writing local files
[21:26:39] Completed 300000 out of 5000000 steps  (6 percent)
[21:41:52] Writing local files
[21:41:52] Completed 350000 out of 5000000 steps  (7 percent)
[21:57:03] Writing local files
[21:57:03] Completed 400000 out of 5000000 steps  (8 percent)
[22:02:33] - Autosending finished units...
[22:02:33] Trying to send all finished work units
[22:02:33] + No unsent completed units remaining.
[22:02:33] - Autosend completed
[22:12:20] Writing local files
[22:12:20] Completed 450000 out of 5000000 steps  (9 percent)
[22:27:33] Writing local files
[22:27:34] Completed 500000 out of 5000000 steps  (10 percent)
[22:42:44] Writing local files
[22:42:44] Completed 550000 out of 5000000 steps  (11 percent)
[22:58:04] Writing local files
[22:58:04] Completed 600000 out of 5000000 steps  (12 percent)
[23:13:30] Writing local files
[23:13:30] Completed 650000 out of 5000000 steps  (13 percent)
[23:28:46] Writing local files
[23:28:46] Completed 700000 out of 5000000 steps  (14 percent)
[23:43:59] Writing local files
[23:43:59] Completed 750000 out of 5000000 steps  (15 percent)
[23:59:10] Writing local files
[23:59:10] Completed 800000 out of 5000000 steps  (16 percent)
[00:14:28] Writing local files
[00:14:28] Completed 850000 out of 5000000 steps  (17 percent)
[00:29:45] Writing local files
[00:29:45] Completed 900000 out of 5000000 steps  (18 percent)
[00:44:57] Writing local files
[00:44:57] Completed 950000 out of 5000000 steps  (19 percent)
[01:00:10] Writing local files
[01:00:10] Completed 1000000 out of 5000000 steps  (20 percent)
[01:15:23] Writing local files
[01:15:23] Completed 1050000 out of 5000000 steps  (21 percent)
[01:30:35] Writing local files
[01:30:35] Completed 1100000 out of 5000000 steps  (22 percent)
[01:45:56] Writing local files
[01:45:56] Completed 1150000 out of 5000000 steps  (23 percent)
[01:59:22] Writing local files
[01:59:22] Completed 1200000 out of 5000000 steps  (24 percent)
[02:13:30] Writing local files
[02:13:30] Completed 1250000 out of 5000000 steps  (25 percent)
[02:28:12] Writing local files
[02:28:12] Completed 1300000 out of 5000000 steps  (26 percent)
[02:42:56] Writing local files
[02:42:56] Completed 1350000 out of 5000000 steps  (27 percent)
[02:57:47] Writing local files
[02:57:47] Completed 1400000 out of 5000000 steps  (28 percent)
[03:12:44] Writing local files
[03:12:44] Completed 1450000 out of 5000000 steps  (29 percent)
[03:27:31] Writing local files
[03:27:31] Completed 1500000 out of 5000000 steps  (30 percent)
[03:42:15] Writing local files
[03:42:15] Completed 1550000 out of 5000000 steps  (31 percent)
[03:56:58] Writing local files
[03:56:58] Completed 1600000 out of 5000000 steps  (32 percent)
[04:02:34] - Autosending finished units...
[04:02:34] Trying to send all finished work units
[04:02:34] + No unsent completed units remaining.
[04:02:34] - Autosend completed
[04:11:45] Writing local files
[04:11:46] Completed 1650000 out of 5000000 steps  (33 percent)
[04:26:28] Writing local files
[04:26:28] Completed 1700000 out of 5000000 steps  (34 percent)
[04:41:11] Writing local files
[04:41:11] Completed 1750000 out of 5000000 steps  (35 percent)
[04:55:53] Writing local files
[04:55:53] Completed 1800000 out of 5000000 steps  (36 percent)
[05:10:55] Writing local files
[05:10:55] Completed 1850000 out of 5000000 steps  (37 percent)
[05:25:45] Writing local files
[05:25:45] Completed 1900000 out of 5000000 steps  (38 percent)
[05:40:30] Writing local files
[05:40:30] Completed 1950000 out of 5000000 steps  (39 percent)
[05:55:17] Writing local files
[05:55:17] Completed 2000000 out of 5000000 steps  (40 percent)
[06:10:14] Writing local files
[06:10:14] Completed 2050000 out of 5000000 steps  (41 percent)
[06:25:03] Writing local files
[06:25:03] Completed 2100000 out of 5000000 steps  (42 percent)
[06:39:46] Writing local files
[06:39:46] Completed 2150000 out of 5000000 steps  (43 percent)
[06:54:29] Writing local files
[06:54:29] Completed 2200000 out of 5000000 steps  (44 percent)
[07:09:12] Writing local files
[07:09:12] Completed 2250000 out of 5000000 steps  (45 percent)
[07:23:55] Writing local files
[07:23:55] Completed 2300000 out of 5000000 steps  (46 percent)
[07:38:38] Writing local files
[07:38:38] Completed 2350000 out of 5000000 steps  (47 percent)
[07:53:23] Writing local files
[07:53:23] Completed 2400000 out of 5000000 steps  (48 percent)
[08:08:10] Writing local files
[08:08:10] Completed 2450000 out of 5000000 steps  (49 percent)
[08:22:58] Writing local files
[08:22:58] Completed 2500000 out of 5000000 steps  (50 percent)
[08:37:43] Writing local files
[08:37:43] Completed 2550000 out of 5000000 steps  (51 percent)
[08:52:37] Writing local files
[08:52:37] Completed 2600000 out of 5000000 steps  (52 percent)
[09:07:26] Writing local files
[09:07:26] Completed 2650000 out of 5000000 steps  (53 percent)
[09:22:23] Writing local files
[09:22:23] Completed 2700000 out of 5000000 steps  (54 percent)
[09:37:05] Writing local files
[09:37:05] Completed 2750000 out of 5000000 steps  (55 percent)
[09:51:47] Writing local files
[09:51:47] Completed 2800000 out of 5000000 steps  (56 percent)
[10:02:35] - Autosending finished units...
[10:02:35] Trying to send all finished work units
[10:02:35] + No unsent completed units remaining.
[10:02:35] - Autosend completed
[10:06:30] Writing local files
[10:06:30] Completed 2850000 out of 5000000 steps  (57 percent)
[10:21:13] Writing local files
[10:21:13] Completed 2900000 out of 5000000 steps  (58 percent)
[10:35:55] Writing local files
[10:35:55] Completed 2950000 out of 5000000 steps  (59 percent)
[10:50:39] Writing local files
[10:50:39] Completed 3000000 out of 5000000 steps  (60 percent)
[11:05:23] Writing local files
[11:05:23] Completed 3050000 out of 5000000 steps  (61 percent)
[11:20:10] Writing local files
[11:20:10] Completed 3100000 out of 5000000 steps  (62 percent)
[11:34:54] Writing local files
[11:34:54] Completed 3150000 out of 5000000 steps  (63 percent)
[11:49:38] Writing local files
[11:49:38] Completed 3200000 out of 5000000 steps  (64 percent)
[12:04:20] Writing local files
[12:04:20] Completed 3250000 out of 5000000 steps  (65 percent)
[12:19:06] Writing local files
[12:19:06] Completed 3300000 out of 5000000 steps  (66 percent)
[12:33:49] Writing local files
[12:33:49] Completed 3350000 out of 5000000 steps  (67 percent)
[12:48:33] Writing local files
[12:48:33] Completed 3400000 out of 5000000 steps  (68 percent)
[13:03:17] Writing local files
[13:03:17] Completed 3450000 out of 5000000 steps  (69 percent)
[13:17:59] Writing local files
[13:17:59] Completed 3500000 out of 5000000 steps  (70 percent)
[13:32:41] Writing local files
[13:32:41] Completed 3550000 out of 5000000 steps  (71 percent)
[13:47:23] Writing local files
[13:47:23] Completed 3600000 out of 5000000 steps  (72 percent)
[14:02:05] Writing local files
[14:02:05] Completed 3650000 out of 5000000 steps  (73 percent)
[14:16:49] Writing local files
[14:16:49] Completed 3700000 out of 5000000 steps  (74 percent)
[14:17:08] Warning:  long 1-4 interactions
[14:17:13] CoreStatus = 0 (0)
[14:17:13] Client-core communications error: ERROR 0x0
[14:17:13] Deleting current work unit & continuing...
[14:17:13] - Using generic /Applications/Folding@home.app/mpiexec
[14:21:40] - Warning: Could not delete all work unit files (3): Core returned invalid code
[14:21:40] Trying to send all finished work units
[14:21:40] + No unsent completed units remaining.
[14:21:40] - Preparing to get new work unit...
[14:21:40] + Attempting to get work packet
[14:21:40] - Will indicate memory of 4096 MB
[14:21:40] - Connecting to assignment server
[14:21:40] Connecting to http://assign.stanford.edu:8080/
[14:21:40] Posted data.
[14:21:40] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[14:21:40] + News From Folding@Home: Welcome to Folding@Home
[14:21:40] Loaded queue successfully.
[14:21:40] Connecting to http://171.64.65.56:8080/
[14:21:43] Posted data.
[14:21:43] Initial: 0000; - Receiving payload (expected size: 2425125)
[14:21:49] - Downloaded at ~394 kB/s
[14:21:49] - Averaged speed for that direction ~194 kB/s
[14:21:49] + Received work.
[14:21:49] + Closed connections

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Sat Jun 14, 2008 3:41 pm
by alancabler

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Sat Jun 14, 2008 9:11 pm
by toaster8
Does FAH want us to report these errors or not? And don't tell me to stop before I get to that error and then restart. I don't have time to sit and watch a WU's progress to stop it and restart it. The other two WUs completed without any problems, so it is unlikely to be a hardware issue.

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Mon Jun 16, 2008 5:52 am
by toaster8
And there it goes again, same error. Repetitive errors, no true direction for correcting them, and a complete lack of response from the Pande team are discouraging, to say the least. Why should I or anyone else bother to contribute computing power to a group that treats the volunteers like this? Is anyone actually trying to get to the bottom of these 0x0 errors, or is the only 'fix' to stop and restart? Because if it is, then that is no fix at all.

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Mon Jun 16, 2008 6:41 am
by 7im
Treats volunteers like what? Alancabler posted the WIKI entry for the only info we have on 0x0 errors. Would it really help if I asked a Pande Group member to repeat what has already been said? Probably not. And this is a beta client, with a disclaimer about not running the client if you don't have the tolerance to put up with a few problems now and again. So I'm not really sure what you are asking for.

This is what I use as a standard for reporting errors or not...

http://fahwiki.net/index.php/Early_Unit ... rting_EUEs

Thanks for reporting the error. That's about all we can do, besides listing a few more details about your system as noted in the Wiki. And as noted in the Wiki entry Alan posted above, 0x0 errors are of an unknown cause, which would obviously be difficult to fix, even with more details, though I'm sure they've tried to fix them and will continue to try.

Ultimately, how you handle the error is up to you. Stopping and restarting helps both you and the project. You get the points, and the project gets the work done. But again, if you don't have the time, that's okay also. If the next WU that errors out doesn't get returned either, Pande Group will have to deal with that too. I'm sure they appreciate your effort either way. Thanks again.

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Mon Jun 16, 2008 4:52 pm
by DanEnsign
toaster8,

Another thing to try is to run the standard client, rather than the SMP client. While we're having a lot of success with the SMP client science-wise, it would certainly be less of a headache for you to run the standard client.

Have fun,
Dan

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Mon Jun 16, 2008 5:04 pm
by bruce
One of the causes of error 0x0 is memory errors. If your memory is overclocked or if you've got a module that has an occasional error, the SMP cores are better at finding the problem than most memory diagnostics, though they don't have any way to report if that is actually what went wrong.

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Mon Jun 16, 2008 5:07 pm
by 7im
bruce wrote: One of the causes of error 0x0 is memory errors. If your memory is overclocked or if you've got a module that has an occasional error, the SMP cores are better at finding the problem than most memory diagnostics, though they don't have any way to report if that is actually what went wrong.
Even not setting the correct voltage for the memory in the BIOS can cause these problems. For example, all of the Corsair memory I buy defaults to 1.9v, but doesn't run right until I up the voltage to the recommended 2.1v. I just saw another post about the exact same thing. ;)

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Tue Jun 17, 2008 3:01 am
by toaster8
Thanks, but a hardware issue is highly unlikely as it is stock, that is, not overclocked and with no voltage changes. Other WUs run fine.

7im, yes, you're right, I should expect a few bugs with the client, but if they don't want feedback (reporting of errors) then why ask for it?

Dan, that's not an answer, that's you telling me that you can't be bothered.

The Wiki says: If the same WU is failing multiple times at the same spot then this WU is likely faulty and should be removed from circulation. These errors should get reported in the forum (include system specs, and project, run, clone and generation numbers).

So I am reporting multiple failures on the same WU and then getting it again.
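
For anyone who wants to check for the "same WU, same spot" case without reading the whole log by eye, here is a minimal sketch. It assumes the log lines look like the ones pasted at the top of this thread; the function names `find_failures` and `repeated_failures` are made up for illustration and are not part of the FAH client.

```python
import re

# Patterns match the log line formats shown in the log excerpt in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PERCENT_RE = re.compile(r"Completed \d+ out of \d+ steps\s+\((\d+) percent\)")
ERROR_RE = re.compile(r"Client-core communications error: (ERROR 0x[0-9a-fA-F]+)")


def find_failures(log_text):
    """Return one dict per core error: WU identity and last percent logged."""
    failures = []
    current_wu = None    # (project, run, clone, gen) of the WU in progress
    last_percent = None  # last "Completed ... (N percent)" seen for that WU
    for line in log_text.splitlines():
        m = PROJECT_RE.search(line)
        if m:
            current_wu = tuple(int(g) for g in m.groups())
            last_percent = None
            continue
        m = PERCENT_RE.search(line)
        if m:
            last_percent = int(m.group(1))
            continue
        m = ERROR_RE.search(line)
        if m:
            failures.append(
                {"wu": current_wu, "percent": last_percent, "error": m.group(1)}
            )
    return failures


def repeated_failures(failures):
    """Count failures per (WU, percent) and keep those seen more than once."""
    counts = {}
    for f in failures:
        key = (f["wu"], f["percent"])
        counts[key] = counts.get(key, 0) + 1
    return {key: n for key, n in counts.items() if n > 1}
```

Feeding it the full text of the client log and passing the result to `repeated_failures` gives the project, run, clone and generation numbers plus the percent at which the WU died, which are the details the Wiki asks for. It only looks at the log, so it cannot tell a bad WU from a hardware problem.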

Re: Project: 3064 (Run 2, Clone 24, Gen 66) ERROR 0x0

Posted: Tue Jun 17, 2008 5:48 am
by 7im
My example of the Corsair memory was at stock speeds also. What I am saying is that the default "stock" speeds and settings are NOT always the correct settings, and can sometimes cause problems.

Anyway, I looked up that WU, and no one else has returned it successfully yet, so we'll go with "bad WU" in this case. Thanks for reporting it.