Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Sat Apr 23, 2011 7:15 pm
by scutzi128
Another 6901 with the same issue as my other two. I have not been able to successfully submit a 6901 WU all month. All other WUs submit fine. Something is really wrong and it's a PITA. From here on out I'm killing all 6901s. No point in wasting electricity if it's not helping anyone out.

[07:15:46] - Preparing to get new work unit...
[07:15:46] Cleaning up work directory
[07:15:46] + Attempting to get work packet
[07:15:46] Passkey found
[07:15:46] - Connecting to assignment server
[07:15:46] - Successful: assigned to (130.237.232.237).
[07:15:46] + News From Folding@Home: Welcome to Folding@Home
[07:15:47] Loaded queue successfully.
[07:16:01] + Closed connections
[07:16:01] 
[07:16:01] + Processing work unit
[07:16:01] Core required: FahCore_a5.exe
[07:16:01] Core found.
[07:16:01] Working on queue slot 08 [April 22 07:16:01 UTC]
[07:16:01] + Working ...
[07:16:01] 
[07:16:01] *------------------------------*
[07:16:01] Folding@Home Gromacs SMP Core
[07:16:01] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[07:16:01] 
[07:16:01] Preparing to commence simulation
[07:16:01] - Looking at optimizations...
[07:16:01] - Created dyn
[07:16:01] - Files status OK
[07:16:03] - Expanded 24872435 -> 30796292 (decompressed 123.8 percent)
[07:16:03] Called DecompressByteArray: compressed_data_size=24872435 data_size=30796292, decompressed_data_size=30796292 diff=0
[07:16:03] - Digital signature verified
[07:16:03] 
[07:16:03] Project: 6901 (Run 6, Clone 10, Gen 26)
[07:16:03] 
[07:16:03] Assembly optimizations on if available.
[07:16:03] Entering M.D.
[07:16:09] Mapping NT from 12 to 12 
[07:16:11] Completed 0 out of 250000 steps  (0%)
[07:35:03] Completed 2500 out of 250000 steps  (1%)
[07:53:55] Completed 5000 out of 250000 steps  (2%)
[08:12:48] Completed 7500 out of 250000 steps  (3%)
[08:31:40] Completed 10000 out of 250000 steps  (4%)
[08:50:31] Completed 12500 out of 250000 steps  (5%)
[09:09:21] Completed 15000 out of 250000 steps  (6%)
[09:28:14] Completed 17500 out of 250000 steps  (7%)
[09:47:05] Completed 20000 out of 250000 steps  (8%)
[10:05:54] Completed 22500 out of 250000 steps  (9%)
[10:24:46] Completed 25000 out of 250000 steps  (10%)
[10:43:37] Completed 27500 out of 250000 steps  (11%)
[11:02:29] Completed 30000 out of 250000 steps  (12%)
[11:21:20] Completed 32500 out of 250000 steps  (13%)
[11:40:10] Completed 35000 out of 250000 steps  (14%)
[11:59:01] Completed 37500 out of 250000 steps  (15%)
[12:17:51] Completed 40000 out of 250000 steps  (16%)
[12:36:41] Completed 42500 out of 250000 steps  (17%)
[12:55:43] Completed 45000 out of 250000 steps  (18%)
[13:14:40] Completed 47500 out of 250000 steps  (19%)
[13:33:42] Completed 50000 out of 250000 steps  (20%)
[13:52:42] Completed 52500 out of 250000 steps  (21%)
[14:11:40] Completed 55000 out of 250000 steps  (22%)
[14:30:36] Completed 57500 out of 250000 steps  (23%)
[14:49:40] Completed 60000 out of 250000 steps  (24%)
[15:08:46] Completed 62500 out of 250000 steps  (25%)
[15:27:47] Completed 65000 out of 250000 steps  (26%)
[15:46:42] Completed 67500 out of 250000 steps  (27%)
[16:05:35] Completed 70000 out of 250000 steps  (28%)
[16:24:28] Completed 72500 out of 250000 steps  (29%)
[16:43:23] Completed 75000 out of 250000 steps  (30%)
[17:02:21] Completed 77500 out of 250000 steps  (31%)
[17:21:17] Completed 80000 out of 250000 steps  (32%)
[17:40:05] Completed 82500 out of 250000 steps  (33%)
[17:58:55] Completed 85000 out of 250000 steps  (34%)
[18:17:45] Completed 87500 out of 250000 steps  (35%)
[18:36:34] Completed 90000 out of 250000 steps  (36%)
[18:55:35] Completed 92500 out of 250000 steps  (37%)
[19:14:28] Completed 95000 out of 250000 steps  (38%)
[19:33:20] Completed 97500 out of 250000 steps  (39%)
[19:52:12] Completed 100000 out of 250000 steps  (40%)
[20:11:03] Completed 102500 out of 250000 steps  (41%)
[20:29:56] Completed 105000 out of 250000 steps  (42%)
[20:48:48] Completed 107500 out of 250000 steps  (43%)
[21:07:41] Completed 110000 out of 250000 steps  (44%)
[21:26:33] Completed 112500 out of 250000 steps  (45%)
[21:45:21] Completed 115000 out of 250000 steps  (46%)
[22:04:13] Completed 117500 out of 250000 steps  (47%)
[22:23:04] Completed 120000 out of 250000 steps  (48%)
[22:41:58] Completed 122500 out of 250000 steps  (49%)
[23:00:48] Completed 125000 out of 250000 steps  (50%)
[23:19:39] Completed 127500 out of 250000 steps  (51%)
[23:38:33] Completed 130000 out of 250000 steps  (52%)
[23:57:24] Completed 132500 out of 250000 steps  (53%)
[00:16:17] Completed 135000 out of 250000 steps  (54%)
[00:35:18] Completed 137500 out of 250000 steps  (55%)
[00:54:20] Completed 140000 out of 250000 steps  (56%)
[01:13:19] Completed 142500 out of 250000 steps  (57%)
[01:32:12] Completed 145000 out of 250000 steps  (58%)
[01:51:06] Completed 147500 out of 250000 steps  (59%)
[02:09:59] Completed 150000 out of 250000 steps  (60%)
[02:28:53] Completed 152500 out of 250000 steps  (61%)
[02:47:50] Completed 155000 out of 250000 steps  (62%)
[03:06:46] Completed 157500 out of 250000 steps  (63%)
[03:25:40] Completed 160000 out of 250000 steps  (64%)
[03:44:35] Completed 162500 out of 250000 steps  (65%)
[04:03:37] Completed 165000 out of 250000 steps  (66%)
[04:22:39] Completed 167500 out of 250000 steps  (67%)
[04:41:31] Completed 170000 out of 250000 steps  (68%)
[05:00:21] Completed 172500 out of 250000 steps  (69%)
[05:19:12] Completed 175000 out of 250000 steps  (70%)
[05:38:05] Completed 177500 out of 250000 steps  (71%)
[05:56:57] Completed 180000 out of 250000 steps  (72%)
[06:15:46] Completed 182500 out of 250000 steps  (73%)
[06:34:47] Completed 185000 out of 250000 steps  (74%)
[06:53:39] Completed 187500 out of 250000 steps  (75%)
[07:12:31] Completed 190000 out of 250000 steps  (76%)
[07:31:24] Completed 192500 out of 250000 steps  (77%)
[07:50:13] Completed 195000 out of 250000 steps  (78%)
[08:09:03] Completed 197500 out of 250000 steps  (79%)
[08:27:55] Completed 200000 out of 250000 steps  (80%)
[08:46:49] Completed 202500 out of 250000 steps  (81%)
[09:05:40] Completed 205000 out of 250000 steps  (82%)
[09:24:34] Completed 207500 out of 250000 steps  (83%)
[09:43:26] Completed 210000 out of 250000 steps  (84%)
[10:02:16] Completed 212500 out of 250000 steps  (85%)
[10:21:05] Completed 215000 out of 250000 steps  (86%)
[10:39:56] Completed 217500 out of 250000 steps  (87%)
[10:58:47] Completed 220000 out of 250000 steps  (88%)
[11:17:39] Completed 222500 out of 250000 steps  (89%)
[11:36:31] Completed 225000 out of 250000 steps  (90%)
[11:55:22] Completed 227500 out of 250000 steps  (91%)
[12:14:16] Completed 230000 out of 250000 steps  (92%)
[12:33:06] Completed 232500 out of 250000 steps  (93%)
[12:51:57] Completed 235000 out of 250000 steps  (94%)
[13:10:47] Completed 237500 out of 250000 steps  (95%)
[13:29:37] Completed 240000 out of 250000 steps  (96%)
[13:48:29] Completed 242500 out of 250000 steps  (97%)
[14:07:21] Completed 245000 out of 250000 steps  (98%)
[14:26:12] Completed 247500 out of 250000 steps  (99%)
[14:45:05] Completed 250000 out of 250000 steps  (100%)
[14:45:12] DynamicWrapper: Finished Work Unit: sleep=10000
[14:45:22] 
[14:45:22] Finished Work Unit:
[14:45:22] - Reading up to 52713120 from "work/wudata_08.trr": Read 52713120
[14:45:23] trr file hash check passed.
[14:45:23] - Reading up to 47101816 from "work/wudata_08.xtc": Read 47101816
[14:45:24] xtc file hash check passed.
[14:45:24] edr file hash check passed.
[14:45:24] logfile size: 212124
[14:45:24] Leaving Run
[14:45:24] - Writing 100197008 bytes of core data to disk...
[14:45:25]   ... Done.
[14:45:36] - Shutting down core
[14:45:36] 
[14:45:36] Folding@home Core Shutdown: FINISHED_UNIT
[14:45:37] CoreStatus = 64 (100)
[14:45:37] Sending work to server
[14:45:37] Project: 6901 (Run 6, Clone 10, Gen 26)


[14:45:37] + Attempting to send results [April 23 14:45:37 UTC]
[14:46:14] - Server reports problem with unit.
[14:46:14] - Preparing to get new work unit...
[14:46:14] Cleaning up work directory
[14:46:14] + Attempting to get work packet
[14:46:14] Passkey found
[14:46:14] - Connecting to assignment server
[14:46:14] - Successful: assigned to (130.237.232.237).
[14:46:14] + News From Folding@Home: Welcome to Folding@Home
[14:46:14] Loaded queue successfully.
[14:46:30] + Closed connections

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Sun Apr 24, 2011 11:49 am
by toTOW
I've asked Peter Kasson to take a look at your WU ... we've had some trouble like this (Server reports problem with unit.) with p6901 before, so I hope it's not the same problem coming back ...

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 12:32 am
by gwildperson
Have Dr. Kasson look at a couple more while he's at it. There does appear to be a problem with 6901:
viewtopic.php?f=19&t=18413

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 12:56 am
by ChelseaOilman
The question is what is different about your setup. Other people are able to submit these WUs without issues.

Your WU (P6901 R10 C4 G17) was added to the stats database on 2011-04-20 18:08:40 for 73198.2 points of credit.
Your WU (P6901 R6 C10 G26) was added to the stats database on 2011-04-24 13:12:14 for 102911 points of credit.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 4:53 pm
by kasson
We apologize for any problems that you are having with these work units. As a general principle, we typically do not check individual work units beyond the mod checks. As ChelseaOilman suggests, the fact that other people finished and submitted these work units suggests a problem either on the client end or the network rather than the server. We may make an exception in this case; however, in order to do so, we'll need your username and the IP address of the system that was requesting and returning the work unit. (It should be the *same* system requesting, completing, and returning. While some people are able to make "sneakernetting" work under some circumstances, it is not something we can provide support for.)

Some things to check on the client:
1--this server times out connections after 2 hours. If your upload is taking more than 2 hours, that may be the problem.
2--was this the first time your client got this work unit? Sometimes if you get a work unit, get it again, and then complete it, you may miss the original deadline. This is something that should be clarified.
3--are you completing and returning other bigadv WUs successfully? Are you completing and returning other non-bigadv SMP WUs successfully?

All this said, we do occasionally find a server configuration issue to check and are always grateful for such notification.

Once again, sorry you're having problems.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 7:08 pm
by zodac
I can answer those questions.

1) The log shows that the WU completed at 14:45:05, and the server reported an error about a minute later (14:46:14).

2) I do believe it is the first time this RCG was Folded. And would missing the deadline be relevant here? The issue isn't that there were no points credited, but that the server claimed there was an issue with the results.

3) Other -bigadv and normal SMP WUs are being completed and uploaded without issue. This error only occurs with P6901s (though not all P6901s).
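
Point 1 can be checked against the log in the first post. The following is a minimal Python sketch, assuming the v6 client log format shown there; the function name `upload_seconds` and the regular expression are illustrative and not part of any Folding@home tool. It measures the time between the upload attempt and the server's reply and compares it with the 2-hour timeout mentioned above.

```python
import re

# Matches the leading [HH:MM:SS] stamp of a v6 client log line.
STAMP_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")

# 2-hour connection timeout quoted in this thread, in seconds.
SERVER_TIMEOUT_S = 2 * 60 * 60


def upload_seconds(lines):
    """Seconds from '+ Attempting to send results' to the server's reply.

    Returns None if either line is missing. Log stamps carry no date,
    so a negative difference is treated as one midnight rollover.
    """
    start = None
    for line in lines:
        m = STAMP_RE.match(line)
        if not m:
            continue
        seconds = int(m[1]) * 3600 + int(m[2]) * 60 + int(m[3])
        text = m[4]
        if text.startswith("+ Attempting to send results"):
            start = seconds
        elif start is not None and text.startswith("- Server reports"):
            return (seconds - start) % 86400
    return None


# The two relevant lines from the posted log.
log = [
    "[14:45:37] + Attempting to send results [April 23 14:45:37 UTC]",
    "[14:46:14] - Server reports problem with unit.",
]
elapsed = upload_seconds(log)
print(elapsed, elapsed < SERVER_TIMEOUT_S)  # 37 True
```

For this log the server answered 37 seconds after the upload started, so the 2-hour timeout does not look like the cause here.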

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 7:16 pm
by codysluder
zodac wrote: 2) I do believe it is the first time this RCG was Folded. And would missing the deadline be relevant here? The issue isn't that there were no points credited, but that the server claimed there was an issue with the results.
That depends on what you mean by Deadline. Bonus points are awarded based on the Preferred Deadline. Once a WU passes the Final Deadline, the servers cannot accept it, even for zero points. The v6 client is designed to discard WUs that expire (relative to the final deadline) but you may have disabled that feature or you may be running V7 where the feature doesn't work yet.
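
The two deadlines can be sketched as a small classifier. This is only an illustration: the function name `classify_return`, the enum, the 4-day and 6-day deadline values, and the inclusive boundary handling are assumptions, not values or logic taken from the Folding@home servers. The timestamps come from the log in the first post, with the year taken from the posting date.

```python
from datetime import datetime, timedelta
from enum import Enum


class ReturnStatus(Enum):
    BEFORE_PREFERRED = "bonus points possible"
    BEFORE_FINAL = "accepted, but past the preferred deadline"
    EXPIRED = "past the final deadline, server cannot accept it"


def classify_return(started, returned, preferred, final):
    """Classify a returned WU by elapsed time against both deadlines.

    Assumption: a return exactly on a deadline still counts as on time.
    """
    elapsed = returned - started
    if elapsed > final:
        return ReturnStatus.EXPIRED
    if elapsed > preferred:
        return ReturnStatus.BEFORE_FINAL
    return ReturnStatus.BEFORE_PREFERRED


# Placeholder deadlines for illustration only, not confirmed p6901 values.
PREFERRED = timedelta(days=4)
FINAL = timedelta(days=6)

# "Working on queue slot 08 [April 22 07:16:01 UTC]"
started = datetime(2011, 4, 22, 7, 16, 1)
# "+ Attempting to send results [April 23 14:45:37 UTC]"
returned = datetime(2011, 4, 23, 14, 45, 37)

print(classify_return(started, returned, PREFERRED, FINAL).name)
```

With these placeholder deadlines the posted WU, returned about 31.5 hours after it started, falls well inside the preferred window.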

Did you PM Kasson with the answer to his request?
we'll need your username and the IP address of the system that was requesting and returning the work unit.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 7:40 pm
by zodac
While that might have been a possible explanation, this has happened too many times, with different WUs:
viewtopic.php?f=19&t=18413
viewtopic.php?f=19&t=18426
viewtopic.php?f=19&t=18425
viewtopic.php?f=19&t=18373
viewtopic.php?f=19&t=18405
viewtopic.php?f=19&t=18332

All resulting in the same "Server reports problem with unit" error.

And, no, I didn't PM him, but I did PM scutzi to pass on the info. :)

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Mon Apr 25, 2011 8:25 pm
by scutzi128
Thanks Z.

I do believe there is a problem somewhere. I have completed four 6901s on two different setups myself and all four have resulted in the same "server reports problems with unit" error. Now I could see how this could possibly be seen as a network timeout error, but I have a 35/35 connection and uploads usually only take around 2 min. The deadlines should not be missed either, as the computers are folding 24/7 and usually complete a bigadv in between 1.5 and 2 days. Also, I have been able to submit several 2684, 2685, 2686, and 6900 WUs with no issues. There are also several other people over on my folding forums reporting the same issues, and they too only have these issues with 6901s.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Tue Apr 26, 2011 1:32 pm
by kasson
We're looking at it. There seems to be a pattern where it's on work units that have previously failed.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Tue Apr 26, 2011 4:19 pm
by zodac
Thank you.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Tue Apr 26, 2011 8:56 pm
by kasson
The work units in question are being rejected due to corruption detected in the returned work unit. I'm continuing to look at this; I want to make sure that there's not a discrepancy in the way we detect such corruption.
Just to rule out any core-dependent issues, could you tell me which core version and OS are failing for you? I'll check that against successful returns.

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Tue Apr 26, 2011 9:07 pm
by kasson
PS just for context, this problem affected 51 returns out of 13460 during a ~1-month period. Some of the unsuccessful returns were duplicates.
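
As a quick check of scale, the figures above work out to a failure rate of a bit under 0.4%. A minimal Python sketch of the arithmetic follows; the function name `failure_rate` is illustrative only.

```python
def failure_rate(failed, total):
    """Fraction of returns that were rejected."""
    return failed / total


failed_returns = 51     # rejected returns quoted above
total_returns = 13460   # all returns in the ~1-month period

rate = failure_rate(failed_returns, total_returns)
print(f"{rate:.3%}")                          # 0.379%
print(round(total_returns / failed_returns))  # 264, i.e. about 1 return in 264
```

Because some of the unsuccessful returns were duplicates, the number of distinct affected work units would be lower than 51.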

Re: Project: 6901 (Run 6, Clone 10, Gen 26)

Posted: Tue May 03, 2011 5:48 pm
by scutzi128
This has happened to me on both Windows 7 64-bit and Ubuntu 10.10, both running the 6.34 client.