
I think FAH WU started anew.

Posted: Wed Jul 27, 2011 3:18 pm
by Jim78418
I'm not all that literate in the computer world, but from what I see in this log it appears that something happened and the WU started fresh when I started my computer this morning. Can someone interpret this log for me? If I'm doing something wrong I would like to know it. Thanks, Jim

Code:


Launch directory: C:\Users\Jimmy\AppData\Roaming\Folding@home-x86

[13:53:05] - Ask before connecting: No
[13:53:05] - User name: Jim78418 (Team 0)
[13:53:05] - User ID: 1D64B22657331BBE
[13:53:05] - Machine ID: 9
[13:53:05] Loaded queue successfully.
[13:53:05] Initialization complete
[13:53:05] + Processing work unit
[13:53:05] Core required: FahCore_a4.exe
[13:53:05] Core found.
[13:53:05] Working on queue slot 00 [July 26 13:53:05 UTC]
[13:53:05] + Working ...
[13:53:06] *------------------------------*
[13:53:06] Folding@Home Gromacs GB Core
[13:53:06] Version 2.27 (Dec. 15, 2010)
[13:53:06] Preparing to commence simulation
[13:53:06] - Looking at optimizations...
[13:53:06] - Files status OK
[13:53:07] - Expanded 55883 -> 207472 (decompressed 371.2 percent)
[13:53:07] Called DecompressByteArray: compressed_data_size=55883 data_size=207472, decompressed_data_size=207472 diff=0
[13:53:07] - Digital signature verified
[13:53:07] Project: 7007 (Run 0, Clone 111, Gen 3)
[13:53:07] Assembly optimizations on if available.
[13:53:07] Entering M.D.
[13:53:13] Using Gromacs checkpoints
[13:53:13] Mapping NT from 1 to 1 
[13:53:13] Resuming from checkpoint
[13:53:13] Verified work/wudata_00.log
[13:53:13] Verified work/wudata_00.trr
[13:53:13] Verified work/wudata_00.xtc
[13:53:14] Verified work/wudata_00.edr
[13:53:14] Completed 2700001 out of 10000000 steps  (27%)
[14:16:02] Completed 2800000 out of 10000000 steps  (28%)
[14:38:19] Completed 2900000 out of 10000000 steps  (29%)
[15:00:38] Completed 3000000 out of 10000000 steps  (30%)
[15:51:56] Completed 3100000 out of 10000000 steps  (31%)
[16:13:54] Completed 3200000 out of 10000000 steps  (32%)
[16:35:56] Completed 3300000 out of 10000000 steps  (33%)
[18:27:56] Completed 3400000 out of 10000000 steps  (34%)
[18:50:13] Completed 3500000 out of 10000000 steps  (35%)
[19:12:13] Completed 3600000 out of 10000000 steps  (36%)
[19:34:12] Completed 3700000 out of 10000000 steps  (37%)
[20:08:14] + Working...
[20:26:15] Completed 3800000 out of 10000000 steps  (38%)
[20:48:14] Completed 3900000 out of 10000000 steps  (39%)
[21:10:13] Completed 4000000 out of 10000000 steps  (40%)
[21:32:05] Completed 4100000 out of 10000000 steps  (41%)
[21:55:55] Completed 4200000 out of 10000000 steps  (42%)
[22:17:52] Completed 4300000 out of 10000000 steps  (43%)
[22:39:49] Completed 4400000 out of 10000000 steps  (44%)
[23:01:52] Completed 4500000 out of 10000000 steps  (45%)
[23:23:54] Completed 4600000 out of 10000000 steps  (46%)

--- Opening Log file [July 27 01:00:25 UTC] 

# Windows CPU Systray Edition #################################################

                       Folding@Home Client Version 6.23



Launch directory: C:\Users\Jimmy\AppData\Roaming\Folding@home-x86

[01:00:25] - Ask before connecting: No
[01:00:25] - User name: Jim78418 (Team 0)
[01:00:25] - User ID: 1D64B22657331BBE
[01:00:25] - Machine ID: 9
[01:00:25] Loaded queue successfully.
[01:00:25] Initialization complete
[01:00:25] + Processing work unit
[01:00:25] Core required: FahCore_a4.exe
[01:00:25] Core found.
[01:00:25] Working on queue slot 00 [July 27 01:00:25 UTC]
[01:00:25] + Working ...
[01:00:25] *------------------------------*
[01:00:25] Folding@Home Gromacs GB Core
[01:00:25] Version 2.27 (Dec. 15, 2010)
[01:00:25] Preparing to commence simulation
[01:00:25] - Ensuring status. Please wait.
[01:00:35] - Looking at optimizations...
[01:00:35] - Working with standard loops on this execution.
[01:00:35] - Previous termination of core was improper.
[01:00:35] - Files status OK
[01:00:35] - Expanded 55883 -> 207472 (decompressed 371.2 percent)
[01:00:35] Called DecompressByteArray: compressed_data_size=55883 data_size=207472, decompressed_data_size=207472 diff=0
[01:00:35] - Digital signature verified
[01:00:35] Project: 7007 (Run 0, Clone 111, Gen 3)
[01:00:35] Entering M.D.
[01:00:41] Using Gromacs checkpoints
[01:00:41] Mapping NT from 1 to 1 
[01:00:41] Resuming from checkpoint
[01:00:41] Verified work/wudata_00.log
[01:00:45] Verified work/wudata_00.trr
[01:00:46] Verified work/wudata_00.xtc
[01:00:46] Verified work/wudata_00.edr
[01:00:46] Completed 4600001 out of 10000000 steps  (46%)
[01:08:21] .
[01:08:21] Folding@home Core Shutdown: UNKNOWN_ERROR
[01:27:48] %)
[01:52:03] Completed 4800000 out of 10000000 steps  (48%)
[02:16:30] Completed 4900000 out of 10000000 steps  (49%)
[02:41:15] Completed 5000000 out of 10000000 steps  (50%)
[03:05:46] Completed 5100000 out of 10000000 steps  (51%)
[03:29:49] Completed 5200000 out of 10000000 steps  (52%)
[03:53:34] Completed 5300000 out of 10000000 steps  (53%)

--- Opening Log file [July 27 12:00:07 UTC] 

# Windows CPU Systray Edition #################################################

                       Folding@Home Client Version 6.23



Launch directory: C:\Users\Jimmy\AppData\Roaming\Folding@home-x86

[12:00:07] - Ask before connecting: No
[12:00:07] - User name: Jim78418 (Team 0)
[12:00:07] - User ID: 1D64B22657331BBE
[12:00:07] - Machine ID: 9
[12:00:07] Loaded queue successfully.
[12:00:07] Initialization complete
[12:00:07] + Processing work unit
[12:00:07] Core required: FahCore_a4.exe
[12:00:07] Core found.
[12:00:08] Working on queue slot 00 [July 27 12:00:08 UTC]
[12:00:08] + Working ...
[12:00:08] *------------------------------*
[12:00:08] Folding@Home Gromacs GB Core
[12:00:08] Version 2.27 (Dec. 15, 2010)
[12:00:08] Preparing to commence simulation
[12:00:08] - Ensuring status. Please wait.
[12:00:18] - Looking at optimizations...
[12:00:18] - Working with standard loops on this execution.
[12:00:20] - Created dyn
[12:00:20] - Files status OK
[12:00:20] - Expanded 55883 -> 207472 (decompressed 371.2 percent)
[12:00:20] Called DecompressByteArray: compressed_data_size=55883 data_size=207472, decompressed_data_size=207472 diff=0
[12:00:20] - Digital signature verified
[12:00:20] Project: 7007 (Run 0, Clone 111, Gen 3)
[12:00:20] Entering M.D.
[12:00:26] Mapping NT from 1 to 1 
[12:00:26] Completed 0 out of 10000000 steps  (0%)
[12:23:28] Completed 100000 out of 10000000 steps  (1%)
[13:32:56] Completed 200000 out of 10000000 steps  (2%)
[13:55:05] Completed 300000 out of 10000000 steps  (3%)
[14:17:13] Completed 400000 out of 10000000 steps  (4%)
[14:39:50] Completed 500000 out of 10000000 steps  (5%)
[15:02:29] Completed 600000 out of 10000000 steps  (6%)

Re: I think FAH WU started anew.

Posted: Wed Jul 27, 2011 4:41 pm
by Jesse_V
I don't think it's your fault. It just encountered an error and figured that it couldn't safely continue, so it started over. I'm sure that will complete without any issues now.

Re: I think FAH WU started anew.

Posted: Wed Jul 27, 2011 4:52 pm
by bruce
Jesse is right . . . it might be a one-time occurrence. At least in this case, FAH has successfully recovered by restarting the WU.

Nevertheless, something bad did happen here:

Code:

[01:00:46] Completed 4600001 out of 10000000 steps (46%)
[01:08:21] .
[01:08:21] Folding@home Core Shutdown: UNKNOWN_ERROR
[01:27:48] %)
[01:52:03] Completed 4800000 out of 10000000 steps (48%)

I've never seen that sort of error combination before, so I can only guess. The UNKNOWN_ERROR indicates that something happened outside of FAH's control. The corrupted log data is another indication that something is wrong, though the two events are not necessarily related. Whatever actually happened corrupted the data, and when you restarted, FAH detected the corruption and corrected the problem by restarting from the beginning of the WU.

I'd look carefully at how FAH is being started. Is it possible that it got started twice at the same time, such as once as a service and/or once from clicking a shortcut and/or once from a shortcut in the startup folder?

UNKNOWN_ERRORs sometimes come from genuine hardware errors, such as a filesystem problem, overclocking, a bad stick of RAM, or overheating. FAH does work your computer very hard and it might detect an error that doesn't show up by other means. For starters, I'd check my filesystem and see if any errors are detected by the OS.

Beyond that, I'd watch carefully to see if it was a one-time thing or if there is a pattern of failures.
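
If you want to check for a pattern without rereading the whole log by eye, a short script can count the trouble markers per client session. This is only a rough sketch in Python: the function name `count_error_markers` is made up for illustration, and it assumes the log keeps the same "--- Opening Log file [...]" session headers and the same two message strings that appear in the log posted above.

```python
import re
from collections import Counter

# Session header as it appears in the posted log, e.g.
# "--- Opening Log file [July 27 01:00:25 UTC]"
SESSION_RE = re.compile(r"^--- Opening Log file \[(.+?)\]")

# Message strings copied from the posted log; other client versions may word them differently.
MARKERS = (
    "Core Shutdown: UNKNOWN_ERROR",
    "Previous termination of core was improper",
)

def count_error_markers(lines):
    """Return {session label: Counter of marker -> number of occurrences}."""
    sessions = {}
    current = "(before first session header)"
    for line in lines:
        header = SESSION_RE.match(line.strip())
        if header:
            # A new client start: begin a fresh tally for this session.
            current = header.group(1)
            sessions.setdefault(current, Counter())
            continue
        for marker in MARKERS:
            if marker in line:
                sessions.setdefault(current, Counter())[marker] += 1
    return sessions
```

Feed it the lines of your log file (for example `open(path).readlines()`, where `path` is wherever your client keeps its log) and look for sessions with non-zero counts. One such session is probably a fluke; several in a row suggests something worth chasing.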

Re: I think FAH WU started anew.

Posted: Thu Jul 28, 2011 3:06 am
by Jim78418
Thanks Jesse and Bruce for taking a look. The program may have been started once when I booted my machine and then again by me, as I didn't see the icon on the taskbar. Sometimes when I boot up in the morning, FAH gives an error message saying something is using its port and it doesn't start, but if I start it manually then it will run fine... I thought I had missed the error message but maybe it was just the icon not showing for some reason.

FWIW, I don't think the log was corrupted. I didn't think it necessary to post the whole thing, so I only copied a part of it... assuming that is what you were referring to in your post, Bruce.

In any case, I won't worry about it and will watch this WU closely. If I encounter problems I'll give a shout.

Thanks again, Jim

Re: I think FAH WU started anew.

Posted: Thu Jul 28, 2011 5:23 pm
by bruce
The only corruption that I see is the message Completed 4700000 out of 10000000 steps (47%) which is (almost completely) missing from the log. Those three log entries suggest that two clients may have been trying to update the same log. That's not supposed to be possible, but the V6 code that "prevents" two clients from getting started has some limitations and that's fixed in the new V7 code.
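
For anyone who wants to spot this kind of thing automatically, here is a rough Python sketch that scans the "Completed N out of M steps" lines and flags two situations: a skipped progress report (like the missing 47% line) and a step count that goes backwards (like the restart at 0%). The function name `find_progress_anomalies` and the default `step` of 100000 are assumptions for illustration; the step size is taken from this particular WU, which reports every 100000 steps, and other projects may report at a different interval.

```python
import re

# Matches progress lines such as
# "[01:52:03] Completed 4800000 out of 10000000 steps  (48%)"
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")

def find_progress_anomalies(lines, step=100000):
    """Return a list of (kind, previous_steps, current_steps) tuples.

    kind is "gap" when more than one reporting interval passed between two
    consecutive progress lines, and "restart" when the step count went down.
    """
    anomalies = []
    previous = None
    for line in lines:
        match = PROGRESS_RE.search(line)
        if not match:
            continue  # not a progress line (or a mangled one like "%)")
        done = int(match.group(1))
        if previous is not None:
            if done < previous:
                anomalies.append(("restart", previous, done))
            elif done - previous > step:
                anomalies.append(("gap", previous, done))
        previous = done
    return anomalies
```

Run against the log in this thread, it should flag the jump from 4600001 to 4800000 as a gap and the drop from 5300000 to 0 as a restart. A gap by itself does not prove two clients were writing to the same log; it only tells you where to look.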