
Project: 4403 (Run 9, Clone 8, Gen 2)

Posted: Thu Mar 13, 2008 5:28 am
by dgermann
I've been assigned this WU three times. Each time it aborts with a client-core communications error (ERROR 0x0), but never at the same step: the three runs got as far as 64%, 36%, and 53% before failing. The log fragments for the three failing runs are included below.

I've completed 3 other Project 4403 WUs on this CPU (along with 14 WUs from other projects, plus 19 more WUs on the other CPU):
Project: 4403 (Run 29, Clone 1, Gen 1)
Project: 4403 (Run 53, Clone 11, Gen 11)
Project: 4403 (Run 40, Clone 10, Gen 17)

Does this mean there's most likely a problem with this particular WU, or could there be something wrong with my machine (off-the-shelf Dell PowerEdge 840 with C2D E4500 2.2 GHz, running Ubuntu Linux 7.10)?

Thanks!
-Dan


[10:48:26] - Preparing to get new work unit...
[10:48:26] + Attempting to get work packet
[10:48:26] - Connecting to assignment server
[10:48:26] - Successful: assigned to (171.64.122.72).
[10:48:26] + News From Folding@Home: Welcome to Folding@Home
[10:48:26] Loaded queue successfully.
[10:48:28] + Closed connections
[10:48:28]
[10:48:28] + Processing work unit
[10:48:28] Core required: FahCore_81.exe
[10:48:28] Core found.
[10:48:28] Working on Unit 08 [March 1 10:48:28]
[10:48:28] + Working ...
[10:48:28]
[10:48:28] *------------------------------*
[10:48:28] Folding@Home Gromacs Simulated Tempering Core
[10:48:28] Version 1.00 (Dec 9, 2006)
[10:48:28]
[10:48:28] Preparing to commence simulation
[10:48:28] - Looking at optimizations...
[10:48:28] - Created dyn
[10:48:28] - Files status OK
[10:48:28] - Expanded 239425 -> 1168687 (decompressed 488.1 percent)
[10:48:28] - Starting from initial work packet
[10:48:28]
[10:48:28] Project: 4403 (Run 9, Clone 8, Gen 2)
[10:48:28]
[10:48:28] Assembly optimizations on if available.
[10:48:28] Entering M.D.
[10:48:34] Protein: p4403_Seq_42_unf_AMBER
[10:48:34]
[10:48:34] Writing local files
[10:48:56] Extra SSE boost OK.
[10:48:56] Writing local files
[10:48:56] Completed 0 out of 150000 steps  (0%)
[10:50:06] Writing local files
[10:50:06] Completed 1500 out of 150000 steps  (1%)
[10:51:16] Writing local files
[10:51:16] Completed 3000 out of 150000 steps  (2%)
[10:52:27] Writing local files
[10:52:27] Completed 4500 out of 150000 steps  (3%)
[10:53:37] Writing local files
[10:53:37] Completed 6000 out of 150000 steps  (4%)
[10:54:47] Writing local files
[10:54:47] Completed 7500 out of 150000 steps  (5%)
[10:55:58] Writing local files
[10:55:58] Completed 9000 out of 150000 steps  (6%)
[10:57:09] Writing local files
[10:57:09] Completed 10500 out of 150000 steps  (7%)
[10:58:19] Writing local files
[10:58:19] Completed 12000 out of 150000 steps  (8%)
[10:59:30] Writing local files
[10:59:30] Completed 13500 out of 150000 steps  (9%)
[11:00:40] Writing local files
[11:00:40] Completed 15000 out of 150000 steps  (10%)
[11:01:50] Writing local files
[11:01:50] Completed 16500 out of 150000 steps  (11%)
[11:03:01] Writing local files
[11:03:01] Completed 18000 out of 150000 steps  (12%)
[11:04:11] Writing local files
[11:04:11] Completed 19500 out of 150000 steps  (13%)
[11:05:22] Writing local files
[11:05:22] Completed 21000 out of 150000 steps  (14%)
[11:06:32] Writing local files
[11:06:32] Completed 22500 out of 150000 steps  (15%)
[11:07:43] Writing local files
[11:07:43] Completed 24000 out of 150000 steps  (16%)
[11:08:53] Writing local files
[11:08:53] Completed 25500 out of 150000 steps  (17%)
[11:10:03] Writing local files
[11:10:03] Completed 27000 out of 150000 steps  (18%)
[11:11:14] Writing local files
[11:11:14] Completed 28500 out of 150000 steps  (19%)
[11:12:24] Writing local files
[11:12:24] Completed 30000 out of 150000 steps  (20%)
[11:13:34] Writing local files
[11:13:34] Completed 31500 out of 150000 steps  (21%)
[11:14:45] Writing local files
[11:14:45] Completed 33000 out of 150000 steps  (22%)
[11:15:55] Writing local files
[11:15:55] Completed 34500 out of 150000 steps  (23%)
[11:17:06] Writing local files
[11:17:06] Completed 36000 out of 150000 steps  (24%)
[11:18:16] Writing local files
[11:18:16] Completed 37500 out of 150000 steps  (25%)
[11:19:26] Writing local files
[11:19:26] Completed 39000 out of 150000 steps  (26%)
[11:20:37] Writing local files
[11:20:37] Completed 40500 out of 150000 steps  (27%)
[11:21:47] Writing local files
[11:21:47] Completed 42000 out of 150000 steps  (28%)
[11:22:58] Writing local files
[11:22:58] Completed 43500 out of 150000 steps  (29%)
[11:24:08] Writing local files
[11:24:08] Completed 45000 out of 150000 steps  (30%)
[11:25:19] Writing local files
[11:25:19] Completed 46500 out of 150000 steps  (31%)
[11:26:29] Writing local files
[11:26:29] Completed 48000 out of 150000 steps  (32%)
[11:27:40] Writing local files
[11:27:40] Completed 49500 out of 150000 steps  (33%)
[11:28:50] Writing local files
[11:28:50] Completed 51000 out of 150000 steps  (34%)
[11:30:00] Writing local files
[11:30:00] Completed 52500 out of 150000 steps  (35%)
[11:31:11] Writing local files
[11:31:11] Completed 54000 out of 150000 steps  (36%)
[11:32:21] Writing local files
[11:32:21] Completed 55500 out of 150000 steps  (37%)
[11:33:32] Writing local files
[11:33:32] Completed 57000 out of 150000 steps  (38%)
[11:34:42] Writing local files
[11:34:42] Completed 58500 out of 150000 steps  (39%)
[11:35:53] Writing local files
[11:35:53] Completed 60000 out of 150000 steps  (40%)
[11:37:04] Writing local files
[11:37:04] Completed 61500 out of 150000 steps  (41%)
[11:38:14] Writing local files
[11:38:14] Completed 63000 out of 150000 steps  (42%)
[11:39:25] Writing local files
[11:39:25] Completed 64500 out of 150000 steps  (43%)
[11:40:35] Writing local files
[11:40:35] Completed 66000 out of 150000 steps  (44%)
[11:41:46] Writing local files
[11:41:46] Completed 67500 out of 150000 steps  (45%)
[11:42:56] Writing local files
[11:42:56] Completed 69000 out of 150000 steps  (46%)
[11:44:07] Writing local files
[11:44:07] Completed 70500 out of 150000 steps  (47%)
[11:45:18] Writing local files
[11:45:18] Completed 72000 out of 150000 steps  (48%)
[11:46:28] Writing local files
[11:46:28] Completed 73500 out of 150000 steps  (49%)
[11:47:39] Writing local files
[11:47:39] Completed 75000 out of 150000 steps  (50%)
[11:48:49] Writing local files
[11:48:49] Completed 76500 out of 150000 steps  (51%)
[11:50:00] Writing local files
[11:50:00] Completed 78000 out of 150000 steps  (52%)
[11:51:10] Writing local files
[11:51:10] Completed 79500 out of 150000 steps  (53%)
[11:52:21] Writing local files
[11:52:21] Completed 81000 out of 150000 steps  (54%)
[11:53:31] Writing local files
[11:53:31] Completed 82500 out of 150000 steps  (55%)
[11:54:42] Writing local files
[11:54:42] Completed 84000 out of 150000 steps  (56%)
[11:55:53] Writing local files
[11:55:53] Completed 85500 out of 150000 steps  (57%)
[11:57:03] Writing local files
[11:57:03] Completed 87000 out of 150000 steps  (58%)
[11:58:14] Writing local files
[11:58:14] Completed 88500 out of 150000 steps  (59%)
[11:59:24] Writing local files
[11:59:24] Completed 90000 out of 150000 steps  (60%)
[12:00:35] Writing local files
[12:00:35] Completed 91500 out of 150000 steps  (61%)
[12:01:46] Writing local files
[12:01:46] Completed 93000 out of 150000 steps  (62%)
[12:02:56] Writing local files
[12:02:56] Completed 94500 out of 150000 steps  (63%)
[12:04:07] Writing local files
[12:04:07] Completed 96000 out of 150000 steps  (64%)
[12:04:56] CoreStatus = 0 (0)
[12:04:56] Client-core communications error: ERROR 0x0
[12:04:56] Deleting current work unit & continuing...


[18:53:04] - Preparing to get new work unit...
[18:53:04] + Attempting to get work packet
[18:53:04] - Connecting to assignment server
[18:53:04] - Successful: assigned to (171.64.122.72).
[18:53:04] + News From Folding@Home: Welcome to Folding@Home
[18:53:04] Loaded queue successfully.
[18:53:06] + Closed connections
[18:53:06]
[18:53:06] + Processing work unit
[18:53:06] Core required: FahCore_81.exe
[18:53:06] Core found.
[18:53:06] Working on Unit 09 [March 12 18:53:06]
[18:53:06] + Working ...
[18:53:06]
[18:53:06] *------------------------------*
[18:53:06] Folding@Home Gromacs Simulated Tempering Core
[18:53:06] Version 1.00 (Dec 9, 2006)
[18:53:06]
[18:53:06] Preparing to commence simulation
[18:53:06] - Looking at optimizations...
[18:53:06] - Created dyn
[18:53:06] - Files status OK
[18:53:06] - Expanded 239448 -> 1168687 (decompressed 488.0 percent)
[18:53:06] - Starting from initial work packet
[18:53:06]
[18:53:06] Project: 4403 (Run 9, Clone 8, Gen 2)
[18:53:06]
[18:53:06] Assembly optimizations on if available.
[18:53:06] Entering M.D.
[18:53:12] Protein: p4403_Seq_42_unf_AMBER
[18:53:12]
[18:53:12] Writing local files
[18:53:34] Extra SSE boost OK.
[18:53:34] Writing local files
[18:53:34] Completed 0 out of 150000 steps  (0%)
[18:54:44] Writing local files
[18:54:44] Completed 1500 out of 150000 steps  (1%)
[18:55:55] Writing local files
[18:55:55] Completed 3000 out of 150000 steps  (2%)
[18:57:06] Writing local files
[18:57:06] Completed 4500 out of 150000 steps  (3%)
[18:58:16] Writing local files
[18:58:16] Completed 6000 out of 150000 steps  (4%)
[18:59:26] Writing local files
[18:59:26] Completed 7500 out of 150000 steps  (5%)
[19:00:37] Writing local files
[19:00:37] Completed 9000 out of 150000 steps  (6%)
[19:01:47] Writing local files
[19:01:47] Completed 10500 out of 150000 steps  (7%)
[19:02:58] Writing local files
[19:02:58] Completed 12000 out of 150000 steps  (8%)
[19:04:08] Writing local files
[19:04:08] Completed 13500 out of 150000 steps  (9%)
[19:05:19] Writing local files
[19:05:19] Completed 15000 out of 150000 steps  (10%)
[19:06:29] Writing local files
[19:06:29] Completed 16500 out of 150000 steps  (11%)
[19:07:40] Writing local files
[19:07:40] Completed 18000 out of 150000 steps  (12%)
[19:08:51] Writing local files
[19:08:51] Completed 19500 out of 150000 steps  (13%)
[19:10:01] Writing local files
[19:10:01] Completed 21000 out of 150000 steps  (14%)
[19:11:12] Writing local files
[19:11:12] Completed 22500 out of 150000 steps  (15%)
[19:12:23] Writing local files
[19:12:23] Completed 24000 out of 150000 steps  (16%)
[19:13:33] Writing local files
[19:13:33] Completed 25500 out of 150000 steps  (17%)
[19:14:44] Writing local files
[19:14:44] Completed 27000 out of 150000 steps  (18%)
[19:15:54] Writing local files
[19:15:54] Completed 28500 out of 150000 steps  (19%)
[19:17:05] Writing local files
[19:17:05] Completed 30000 out of 150000 steps  (20%)
[19:18:16] Writing local files
[19:18:16] Completed 31500 out of 150000 steps  (21%)
[19:19:26] Writing local files
[19:19:26] Completed 33000 out of 150000 steps  (22%)
[19:20:37] Writing local files
[19:20:37] Completed 34500 out of 150000 steps  (23%)
[19:21:48] Writing local files
[19:21:48] Completed 36000 out of 150000 steps  (24%)
[19:22:58] Writing local files
[19:22:58] Completed 37500 out of 150000 steps  (25%)
[19:24:09] Writing local files
[19:24:09] Completed 39000 out of 150000 steps  (26%)
[19:25:20] Writing local files
[19:25:20] Completed 40500 out of 150000 steps  (27%)
[19:26:31] Writing local files
[19:26:31] Completed 42000 out of 150000 steps  (28%)
[19:27:41] Writing local files
[19:27:41] Completed 43500 out of 150000 steps  (29%)
[19:28:52] Writing local files
[19:28:52] Completed 45000 out of 150000 steps  (30%)
[19:30:03] Writing local files
[19:30:03] Completed 46500 out of 150000 steps  (31%)
[19:31:13] Writing local files
[19:31:13] Completed 48000 out of 150000 steps  (32%)
[19:32:24] Writing local files
[19:32:24] Completed 49500 out of 150000 steps  (33%)
[19:33:35] Writing local files
[19:33:35] Completed 51000 out of 150000 steps  (34%)
[19:34:46] Writing local files
[19:34:46] Completed 52500 out of 150000 steps  (35%)
[19:35:56] Writing local files
[19:35:56] Completed 54000 out of 150000 steps  (36%)
[19:37:05] CoreStatus = 0 (0)
[19:37:05] Client-core communications error: ERROR 0x0
[19:37:05] Deleting current work unit & continuing...


[19:37:23] - Preparing to get new work unit...
[19:37:23] + Attempting to get work packet
[19:37:23] - Connecting to assignment server
[19:37:23] - Successful: assigned to (171.64.122.72).
[19:37:23] + News From Folding@Home: Welcome to Folding@Home
[19:37:23] Loaded queue successfully.
[19:37:25] + Closed connections
[19:37:30]
[19:37:30] + Processing work unit
[19:37:30] Core required: FahCore_81.exe
[19:37:30] Core found.
[19:37:30] Working on Unit 00 [March 12 19:37:30]
[19:37:30] + Working ...
[19:37:30]
[19:37:30] *------------------------------*
[19:37:30] Folding@Home Gromacs Simulated Tempering Core
[19:37:30] Version 1.00 (Dec 9, 2006)
[19:37:30]
[19:37:30] Preparing to commence simulation
[19:37:30] - Looking at optimizations...
[19:37:30] - Created dyn
[19:37:30] - Files status OK
[19:37:30] - Expanded 239448 -> 1168687 (decompressed 488.0 percent)
[19:37:30] - Starting from initial work packet
[19:37:30]
[19:37:30] Project: 4403 (Run 9, Clone 8, Gen 2)
[19:37:30]
[19:37:30] Assembly optimizations on if available.
[19:37:30] Entering M.D.
[19:37:36] Protein: p4403_Seq_42_unf_AMBER
[19:37:36]
[19:37:36] Writing local files
[19:37:58] Extra SSE boost OK.
[19:37:58] Writing local files
[19:37:58] Completed 0 out of 150000 steps  (0%)
[19:39:08] Writing local files
[19:39:08] Completed 1500 out of 150000 steps  (1%)
[19:40:19] Writing local files
[19:40:19] Completed 3000 out of 150000 steps  (2%)
[19:41:29] Writing local files
[19:41:29] Completed 4500 out of 150000 steps  (3%)
[19:42:39] Writing local files
[19:42:39] Completed 6000 out of 150000 steps  (4%)
[19:43:50] Writing local files
[19:43:50] Completed 7500 out of 150000 steps  (5%)
[19:45:00] Writing local files
[19:45:00] Completed 9000 out of 150000 steps  (6%)
[19:46:11] Writing local files
[19:46:11] Completed 10500 out of 150000 steps  (7%)
[19:47:21] Writing local files
[19:47:21] Completed 12000 out of 150000 steps  (8%)
[19:48:32] Writing local files
[19:48:32] Completed 13500 out of 150000 steps  (9%)
[19:49:42] Writing local files
[19:49:42] Completed 15000 out of 150000 steps  (10%)
[19:50:53] Writing local files
[19:50:53] Completed 16500 out of 150000 steps  (11%)
[19:52:03] Writing local files
[19:52:03] Completed 18000 out of 150000 steps  (12%)
[19:53:14] Writing local files
[19:53:14] Completed 19500 out of 150000 steps  (13%)
[19:54:24] Writing local files
[19:54:24] Completed 21000 out of 150000 steps  (14%)
[19:55:35] Writing local files
[19:55:35] Completed 22500 out of 150000 steps  (15%)
[19:56:46] Writing local files
[19:56:46] Completed 24000 out of 150000 steps  (16%)
[19:57:56] Writing local files
[19:57:56] Completed 25500 out of 150000 steps  (17%)
[19:59:07] Writing local files
[19:59:07] Completed 27000 out of 150000 steps  (18%)
[20:00:17] Writing local files
[20:00:17] Completed 28500 out of 150000 steps  (19%)
[20:01:28] Writing local files
[20:01:28] Completed 30000 out of 150000 steps  (20%)
[20:02:38] Writing local files
[20:02:38] Completed 31500 out of 150000 steps  (21%)
[20:03:49] Writing local files
[20:03:49] Completed 33000 out of 150000 steps  (22%)
[20:04:59] Writing local files
[20:04:59] Completed 34500 out of 150000 steps  (23%)
[20:06:10] Writing local files
[20:06:10] Completed 36000 out of 150000 steps  (24%)
[20:07:20] Writing local files
[20:07:20] Completed 37500 out of 150000 steps  (25%)
[20:08:31] Writing local files
[20:08:31] Completed 39000 out of 150000 steps  (26%)
[20:09:41] Writing local files
[20:09:41] Completed 40500 out of 150000 steps  (27%)
[20:10:52] Writing local files
[20:10:52] Completed 42000 out of 150000 steps  (28%)
[20:12:02] Writing local files
[20:12:02] Completed 43500 out of 150000 steps  (29%)
[20:13:13] Writing local files
[20:13:13] Completed 45000 out of 150000 steps  (30%)
[20:14:23] Writing local files
[20:14:23] Completed 46500 out of 150000 steps  (31%)
[20:15:34] Writing local files
[20:15:34] Completed 48000 out of 150000 steps  (32%)
[20:16:44] Writing local files
[20:16:44] Completed 49500 out of 150000 steps  (33%)
[20:17:55] Writing local files
[20:17:55] Completed 51000 out of 150000 steps  (34%)
[20:19:05] Writing local files
[20:19:05] Completed 52500 out of 150000 steps  (35%)
[20:20:16] Writing local files
[20:20:16] Completed 54000 out of 150000 steps  (36%)
[20:21:26] Writing local files
[20:21:26] Completed 55500 out of 150000 steps  (37%)
[20:22:37] Writing local files
[20:22:37] Completed 57000 out of 150000 steps  (38%)
[20:23:47] Writing local files
[20:23:47] Completed 58500 out of 150000 steps  (39%)
[20:24:58] Writing local files
[20:24:58] Completed 60000 out of 150000 steps  (40%)
[20:26:08] Writing local files
[20:26:08] Completed 61500 out of 150000 steps  (41%)
[20:27:19] Writing local files
[20:27:19] Completed 63000 out of 150000 steps  (42%)
[20:28:29] Writing local files
[20:28:29] Completed 64500 out of 150000 steps  (43%)
[20:29:40] Writing local files
[20:29:40] Completed 66000 out of 150000 steps  (44%)
[20:30:51] Writing local files
[20:30:51] Completed 67500 out of 150000 steps  (45%)
[20:32:01] Writing local files
[20:32:01] Completed 69000 out of 150000 steps  (46%)
[20:33:12] Writing local files
[20:33:12] Completed 70500 out of 150000 steps  (47%)
[20:34:22] Writing local files
[20:34:22] Completed 72000 out of 150000 steps  (48%)
[20:35:33] Writing local files
[20:35:33] Completed 73500 out of 150000 steps  (49%)
[20:36:43] Writing local files
[20:36:43] Completed 75000 out of 150000 steps  (50%)
[20:37:54] Writing local files
[20:37:54] Completed 76500 out of 150000 steps  (51%)
[20:39:05] Writing local files
[20:39:05] Completed 78000 out of 150000 steps  (52%)
[20:40:15] Writing local files
[20:40:15] Completed 79500 out of 150000 steps  (53%)
[20:41:14] CoreStatus = 0 (0)
[20:41:14] Client-core communications error: ERROR 0x0
[20:41:14] Deleting current work unit & continuing...
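
For anyone who wants to compare failure points across runs without reading the logs by eye, here is a minimal Python sketch. It assumes only the two line shapes visible in the fragments above ("Completed N out of M steps" and "Client-core communications error"); the function name `summarize_failures` and the output field names are made up for illustration and are not part of the Folding@Home client. The log timestamps carry no date, so the elapsed time is taken modulo 24 hours, which is an assumption that holds only when the gap is under a day.

```python
import re

# Line shapes copied from the log fragments above.
PROGRESS = re.compile(
    r"^\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps\s+\((\d+)%\)"
)
ERROR = re.compile(r"^\[(\d\d):(\d\d):(\d\d)\] Client-core communications error")


def to_seconds(h, m, s):
    """Convert an HH, MM, SS triple (strings) to seconds since midnight."""
    return int(h) * 3600 + int(m) * 60 + int(s)


def summarize_failures(log_text):
    """Return one dict per communications error found in log_text.

    Each dict records the last progress line seen before the error and how
    many seconds passed between that progress line and the error.
    """
    failures = []
    last = None  # (seconds since midnight, steps done, percent)
    for line in log_text.splitlines():
        m = PROGRESS.match(line)
        if m:
            last = (to_seconds(*m.group(1, 2, 3)), int(m.group(4)), int(m.group(6)))
            continue
        m = ERROR.match(line)
        if m and last is not None:
            elapsed = (to_seconds(*m.group(1, 2, 3)) - last[0]) % 86400
            failures.append({
                "last_step": last[1],
                "last_percent": last[2],
                "seconds_after_checkpoint": elapsed,
            })
            last = None  # reset for the next run in the same log
    return failures


# Tail of the first failing run, copied from the log above.
sample = """\
[12:04:07] Writing local files
[12:04:07] Completed 96000 out of 150000 steps  (64%)
[12:04:56] CoreStatus = 0 (0)
[12:04:56] Client-core communications error: ERROR 0x0
"""

for failure in summarize_failures(sample):
    print(failure)
```

Going by the timestamps in the three fragments, the errors came 49, 69, and 59 seconds after the last reported checkpoints (64%, 36%, and 53%). Each gap is shorter than the roughly 70 seconds between progress lines, so each run appears to have died partway through its next 1% block.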

Re: Project: 4403 (Run 9, Clone 8, Gen 2)

Posted: Thu Mar 13, 2008 4:20 pm
by 7im
Consider this an unusual WU. It has been returned by 6 different people, all with different point totals, so it does fail in different places, for reasons unknown. Thanks for the report.

Re: Project: 4403 (Run 9, Clone 8, Gen 2)

Posted: Thu Mar 13, 2008 10:48 pm
by anandhanju
7im wrote: ...It has been returned by 6 different people, all with different point totals, so it does fail in different places...
I'm curious to know how this happens. Of course, this being error 0x0 doesn't help, but I thought a bad WU is one where the given set of values (and the computations on them up to the point of failure) cause the "planet to fly off the trajectory," as someone said in the old forum (bruce, perhaps?). Anyway, isn't this violation of limits supposed to occur at the same stage (unless caused by hardware instabilities), or is there some randomness inherent to each folding simulation of the same WU? Or are there many kinds of bad WUs? Forgive me if this is a silly question.

Re: Project: 4403 (Run 9, Clone 8, Gen 2)

Posted: Fri Mar 14, 2008 3:37 am
by bruce
Continuing discussion of the science has been moved here.