What just happened?

nss
Posts: 7
Joined: Thu Jul 11, 2013 6:14 am
Hardware configuration: Mid 2012 MacBook Pro, Mac OS 10.8: 8 GB 1600 MHz DDR3 Ram, 2.7 GHz quad-core Intel Core i7, 512 MB Intel Graphics 4000, Solid State Drive

What just happened?

Post by nss »

My progress on a project went from 88% to 15% (actually, as far as I can tell, it went to 0% and then back up to 15%, but I wasn't watching it at the time). I had closed my laptop when it was at 88% and then reopened it. Here are the logs:

21:42:45:WU01:FS00:0xa4:*------------------------------*
21:42:45:WU01:FS00:0xa4:Folding@Home Gromacs Core
21:42:45:WU01:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
21:42:45:WU01:FS00:0xa4:
21:42:45:WU01:FS00:0xa4:Preparing to commence simulation
21:42:45:WU01:FS00:0xa4:- Looking at optimizations...
21:42:45:WU01:FS00:0xa4:- Created dyn
21:42:45:WU01:FS00:0xa4:- Files status OK
21:42:45:WU01:FS00:0xa4:- Expanded 1025467 -> 2544640 (decompressed 248.1 percent)
21:42:45:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1025467 data_size=2544640, decompressed_data_size=2544640 diff=0
21:42:45:WU01:FS00:0xa4:- Digital signature verified
21:42:45:WU01:FS00:0xa4:
21:42:45:WU01:FS00:0xa4:Project: 8702 (Run 7, Clone 49, Gen 41)
21:42:45:WU01:FS00:0xa4:
21:42:45:WU01:FS00:0xa4:Assembly optimizations on if available.
21:42:45:WU01:FS00:0xa4:Entering M.D.
21:42:50:WU00:FS00:Upload 12.12%
21:42:51:WU01:FS00:0xa4:Mapping NT from 7 to 7 
21:42:51:WU01:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
21:42:56:WU00:FS00:Upload 19.69%
21:43:02:WU00:FS00:Upload 27.26%
21:43:08:WU00:FS00:Upload 37.86%
21:43:14:WU00:FS00:Upload 48.46%
21:43:20:WU00:FS00:Upload 59.07%
21:43:20:5:127.0.0.1:New Web connection
21:43:26:WU00:FS00:Upload 66.64%
21:43:32:WU00:FS00:Upload 77.24%
21:43:38:WU00:FS00:Upload 87.84%
21:43:44:WU00:FS00:Upload 98.44%
21:43:47:WU00:FS00:Upload complete
21:43:47:WU00:FS00:Server responded WORK_ACK (400)
21:43:47:WU00:FS00:Final credit estimate, 2393.00 points
21:43:47:WU00:FS00:Cleaning up
21:44:21:WU01:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
21:45:49:WU01:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
21:47:18:WU01:FS00:0xa4:Completed 7500 out of 250000 steps  (3%)
21:48:43:WU01:FS00:0xa4:Completed 10000 out of 250000 steps  (4%)
21:50:09:WU01:FS00:0xa4:Completed 12500 out of 250000 steps  (5%)
21:51:34:WU01:FS00:0xa4:Completed 15000 out of 250000 steps  (6%)
21:53:00:WU01:FS00:0xa4:Completed 17500 out of 250000 steps  (7%)
21:54:26:WU01:FS00:0xa4:Completed 20000 out of 250000 steps  (8%)
21:55:51:WU01:FS00:0xa4:Completed 22500 out of 250000 steps  (9%)
21:57:14:WU01:FS00:0xa4:Completed 25000 out of 250000 steps  (10%)
21:58:38:WU01:FS00:0xa4:Completed 27500 out of 250000 steps  (11%)
22:00:02:WU01:FS00:0xa4:Completed 30000 out of 250000 steps  (12%)
22:01:26:WU01:FS00:0xa4:Completed 32500 out of 250000 steps  (13%)
22:02:50:WU01:FS00:0xa4:Completed 35000 out of 250000 steps  (14%)
22:04:15:WU01:FS00:0xa4:Completed 37500 out of 250000 steps  (15%)
22:05:41:WU01:FS00:0xa4:Completed 40000 out of 250000 steps  (16%)
22:07:11:WU01:FS00:0xa4:Completed 42500 out of 250000 steps  (17%)
22:08:40:WU01:FS00:0xa4:Completed 45000 out of 250000 steps  (18%)
22:10:05:WU01:FS00:0xa4:Completed 47500 out of 250000 steps  (19%)
22:11:31:WU01:FS00:0xa4:Completed 50000 out of 250000 steps  (20%)
22:12:57:WU01:FS00:0xa4:Completed 52500 out of 250000 steps  (21%)
22:14:22:WU01:FS00:0xa4:Completed 55000 out of 250000 steps  (22%)
22:15:50:WU01:FS00:0xa4:Completed 57500 out of 250000 steps  (23%)
22:17:15:WU01:FS00:0xa4:Completed 60000 out of 250000 steps  (24%)
22:18:39:WU01:FS00:0xa4:Completed 62500 out of 250000 steps  (25%)
22:20:05:WU01:FS00:0xa4:Completed 65000 out of 250000 steps  (26%)
22:21:30:WU01:FS00:0xa4:Completed 67500 out of 250000 steps  (27%)
22:22:58:WU01:FS00:0xa4:Completed 70000 out of 250000 steps  (28%)
22:24:24:WU01:FS00:0xa4:Completed 72500 out of 250000 steps  (29%)
22:25:48:WU01:FS00:0xa4:Completed 75000 out of 250000 steps  (30%)
22:27:12:WU01:FS00:0xa4:Completed 77500 out of 250000 steps  (31%)
22:28:37:WU01:FS00:0xa4:Completed 80000 out of 250000 steps  (32%)
22:30:01:WU01:FS00:0xa4:Completed 82500 out of 250000 steps  (33%)
22:31:25:WU01:FS00:0xa4:Completed 85000 out of 250000 steps  (34%)
22:32:49:WU01:FS00:0xa4:Completed 87500 out of 250000 steps  (35%)
22:34:14:WU01:FS00:0xa4:Completed 90000 out of 250000 steps  (36%)
22:35:39:WU01:FS00:0xa4:Completed 92500 out of 250000 steps  (37%)
22:37:04:WU01:FS00:0xa4:Completed 95000 out of 250000 steps  (38%)
22:38:30:WU01:FS00:0xa4:Completed 97500 out of 250000 steps  (39%)
22:39:55:WU01:FS00:0xa4:Completed 100000 out of 250000 steps  (40%)
22:41:20:WU01:FS00:0xa4:Completed 102500 out of 250000 steps  (41%)
22:42:45:WU01:FS00:0xa4:Completed 105000 out of 250000 steps  (42%)
22:44:10:WU01:FS00:0xa4:Completed 107500 out of 250000 steps  (43%)
22:45:35:WU01:FS00:0xa4:Completed 110000 out of 250000 steps  (44%)
22:47:00:WU01:FS00:0xa4:Completed 112500 out of 250000 steps  (45%)
22:48:25:WU01:FS00:0xa4:Completed 115000 out of 250000 steps  (46%)
22:49:50:WU01:FS00:0xa4:Completed 117500 out of 250000 steps  (47%)
22:51:15:WU01:FS00:0xa4:Completed 120000 out of 250000 steps  (48%)
22:52:41:WU01:FS00:0xa4:Completed 122500 out of 250000 steps  (49%)
22:54:06:WU01:FS00:0xa4:Completed 125000 out of 250000 steps  (50%)
22:55:31:WU01:FS00:0xa4:Completed 127500 out of 250000 steps  (51%)
22:56:56:WU01:FS00:0xa4:Completed 130000 out of 250000 steps  (52%)
22:58:22:WU01:FS00:0xa4:Completed 132500 out of 250000 steps  (53%)
22:59:47:WU01:FS00:0xa4:Completed 135000 out of 250000 steps  (54%)
23:01:12:WU01:FS00:0xa4:Completed 137500 out of 250000 steps  (55%)
23:02:38:WU01:FS00:0xa4:Completed 140000 out of 250000 steps  (56%)
23:04:04:WU01:FS00:0xa4:Completed 142500 out of 250000 steps  (57%)
23:05:30:WU01:FS00:0xa4:Completed 145000 out of 250000 steps  (58%)
23:06:56:WU01:FS00:0xa4:Completed 147500 out of 250000 steps  (59%)
23:08:22:WU01:FS00:0xa4:Completed 150000 out of 250000 steps  (60%)
23:09:55:WU01:FS00:0xa4:Completed 152500 out of 250000 steps  (61%)
23:11:24:WU01:FS00:0xa4:Completed 155000 out of 250000 steps  (62%)
23:12:49:WU01:FS00:0xa4:Completed 157500 out of 250000 steps  (63%)
23:14:15:WU01:FS00:0xa4:Completed 160000 out of 250000 steps  (64%)
23:15:41:WU01:FS00:0xa4:Completed 162500 out of 250000 steps  (65%)
23:17:06:WU01:FS00:0xa4:Completed 165000 out of 250000 steps  (66%)
23:18:32:WU01:FS00:0xa4:Completed 167500 out of 250000 steps  (67%)
23:19:57:WU01:FS00:0xa4:Completed 170000 out of 250000 steps  (68%)
23:21:22:WU01:FS00:0xa4:Completed 172500 out of 250000 steps  (69%)
23:22:48:WU01:FS00:0xa4:Completed 175000 out of 250000 steps  (70%)
23:24:13:WU01:FS00:0xa4:Completed 177500 out of 250000 steps  (71%)
23:25:38:WU01:FS00:0xa4:Completed 180000 out of 250000 steps  (72%)
23:27:04:WU01:FS00:0xa4:Completed 182500 out of 250000 steps  (73%)
23:28:29:WU01:FS00:0xa4:Completed 185000 out of 250000 steps  (74%)
23:29:54:WU01:FS00:0xa4:Completed 187500 out of 250000 steps  (75%)
23:31:19:WU01:FS00:0xa4:Completed 190000 out of 250000 steps  (76%)
23:32:44:WU01:FS00:0xa4:Completed 192500 out of 250000 steps  (77%)
23:34:08:WU01:FS00:0xa4:Completed 195000 out of 250000 steps  (78%)
23:35:34:WU01:FS00:0xa4:Completed 197500 out of 250000 steps  (79%)
23:36:59:WU01:FS00:0xa4:Completed 200000 out of 250000 steps  (80%)
23:38:23:WU01:FS00:0xa4:Completed 202500 out of 250000 steps  (81%)
23:39:48:WU01:FS00:0xa4:Completed 205000 out of 250000 steps  (82%)
23:41:13:WU01:FS00:0xa4:Completed 207500 out of 250000 steps  (83%)
23:42:38:WU01:FS00:0xa4:Completed 210000 out of 250000 steps  (84%)
23:44:03:WU01:FS00:0xa4:Completed 212500 out of 250000 steps  (85%)
23:45:28:WU01:FS00:0xa4:Completed 215000 out of 250000 steps  (86%)
23:46:52:WU01:FS00:0xa4:Completed 217500 out of 250000 steps  (87%)
23:48:18:WU01:FS00:0xa4:Completed 220000 out of 250000 steps  (88%)
23:48:31:FS00:Shutting core down
******************************* Date: 2013-07-21 *******************************
03:52:55:WARNING:WU01:FS00:Detected clock skew (4 hours 04 mins), adjusting time estimates
03:52:59:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:53:00:WU01:FS00:Starting
03:53:00:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 01 -suffix 01 -version 703 -lifeline 75 -checkpoint 15 -np 7
03:53:00:WU01:FS00:Started FahCore on PID 1654
03:53:00:WU01:FS00:Core PID:1655
03:53:00:WU01:FS00:FahCore 0xa4 started
03:53:00:WU01:FS00:0xa4:
03:53:00:WU01:FS00:0xa4:*------------------------------*
03:53:00:WU01:FS00:0xa4:Folding@Home Gromacs Core
03:53:00:WU01:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
03:53:00:WU01:FS00:0xa4:
03:53:00:WU01:FS00:0xa4:Preparing to commence simulation
03:53:00:WU01:FS00:0xa4:- Looking at optimizations...
03:53:00:WU01:FS00:0xa4:- Files status OK
03:53:00:WU01:FS00:0xa4:- Expanded 1025467 -> 2544640 (decompressed 248.1 percent)
03:53:00:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1025467 data_size=2544640, decompressed_data_size=2544640 diff=0
03:53:00:WU01:FS00:0xa4:- Digital signature verified
03:53:00:WU01:FS00:0xa4:
03:53:00:WU01:FS00:0xa4:Project: 8702 (Run 7, Clone 49, Gen 41)
03:53:00:WU01:FS00:0xa4:
03:53:00:WU01:FS00:0xa4:Assembly optimizations on if available.
03:53:00:WU01:FS00:0xa4:Entering M.D.
03:53:06:WU01:FS00:0xa4:Using Gromacs checkpoints
03:53:06:WU01:FS00:0xa4:Mapping NT from 7 to 7 
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101E86B50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101F8CB50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101E03B50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101CFDB50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101D80B50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:fcSaveRestoreState: I/O failed dir=0, var=0000000101F09B50, varsize=20
03:53:06:WU01:FS00:0xa4:fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
03:53:06:WU01:FS00:0xa4:mdrun returnRde s3u
03:53:07:WARNING:WU01:FS00:FahCore returned: CORE_RESTART (98 = 0x62)
03:54:00:WU01:FS00:Starting
03:54:00:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 01 -suffix 01 -version 703 -lifeline 75 -checkpoint 15 -np 7
03:54:00:WU01:FS00:Started FahCore on PID 1660
03:54:00:WU01:FS00:Core PID:1661
03:54:00:WU01:FS00:FahCore 0xa4 started
03:54:00:WU01:FS00:0xa4:
03:54:00:WU01:FS00:0xa4:*------------------------------*
03:54:00:WU01:FS00:0xa4:Folding@Home Gromacs Core
03:54:00:WU01:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
03:54:00:WU01:FS00:0xa4:
03:54:00:WU01:FS00:0xa4:Preparing to commence simulation
03:54:00:WU01:FS00:0xa4:- Looking at optimizations...
03:54:00:WU01:FS00:0xa4:- Created dyn
03:54:00:WU01:FS00:0xa4:- Files status OK
03:54:00:WU01:FS00:0xa4:- Expanded 1025467 -> 2544640 (decompressed 248.1 percent)
03:54:00:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1025467 data_size=2544640, decompressed_data_size=2544640 diff=0
03:54:00:WU01:FS00:0xa4:- Digital signature verified
03:54:00:WU01:FS00:0xa4:
03:54:00:WU01:FS00:0xa4:Project: 8702 (Run 7, Clone 49, Gen 41)
03:54:00:WU01:FS00:0xa4:
03:54:00:WU01:FS00:0xa4:Assembly optimizations on if available.
03:54:00:WU01:FS00:0xa4:Entering M.D.
03:54:06:WU01:FS00:0xa4:Mapping NT from 7 to 7 
03:54:06:WU01:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
03:55:24:WU01:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
03:56:43:WU01:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
03:58:02:WU01:FS00:0xa4:Completed 7500 out of 250000 steps  (3%)
03:59:24:WU01:FS00:0xa4:Completed 10000 out of 250000 steps  (4%)
04:00:46:WU01:FS00:0xa4:Completed 12500 out of 250000 steps  (5%)
04:02:10:WU01:FS00:0xa4:Completed 15000 out of 250000 steps  (6%)
04:03:34:WU01:FS00:0xa4:Completed 17500 out of 250000 steps  (7%)
04:04:58:WU01:FS00:0xa4:Completed 20000 out of 250000 steps  (8%)
04:06:22:WU01:FS00:0xa4:Completed 22500 out of 250000 steps  (9%)
04:07:46:WU01:FS00:0xa4:Completed 25000 out of 250000 steps  (10%)
04:09:10:WU01:FS00:0xa4:Completed 27500 out of 250000 steps  (11%)
04:10:34:WU01:FS00:0xa4:Completed 30000 out of 250000 steps  (12%)
04:11:59:WU01:FS00:0xa4:Completed 32500 out of 250000 steps  (13%)
04:13:25:WU01:FS00:0xa4:Completed 35000 out of 250000 steps  (14%)
04:14:50:WU01:FS00:0xa4:Completed 37500 out of 250000 steps  (15%)
04:16:15:WU01:FS00:0xa4:Completed 40000 out of 250000 steps  (16%)
04:17:42:WU01:FS00:0xa4:Completed 42500 out of 250000 steps  (17%)
Mod Edit: Added Code tags around log file
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: What just happened?

Post by Joe_H »

It appears the folding core was interrupted, possibly by the sleep or hibernate function, in the middle of shutting down at 88%. Normally on restart the core would resume from the last checkpoint, but the errors on startup all look to be related to verifying the checkpoint. The core crashed during that verification, and on the next attempt it looks like it did not find any checkpoint file and restarted from the beginning. I would assume the checkpoint file was corrupted during the shutdown and closing of the laptop.
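To spot this kind of restart in a long log without reading every line, a small script can flag any point where the completed step count goes down. This is only a sketch: the regular expression is an assumption based on the progress lines in the log above, and `find_progress_resets` is a hypothetical helper, not part of the FAH client. Because the client reuses the WU01-style queue IDs, a brand-new work unit in the same slot would also be flagged.

```python
import re

# Assumed line shape, taken from the log above, e.g.
# "23:48:18:WU01:FS00:0xa4:Completed 220000 out of 250000 steps  (88%)"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):FS\d+:0x\w+:Completed (\d+) out of (\d+) steps"
)

def find_progress_resets(lines):
    """Return (timestamp, queue_id, previous_steps, new_steps) for each drop in progress."""
    last_steps = {}  # queue id -> most recent completed step count
    resets = []
    for line in lines:
        match = PROGRESS.match(line)
        if not match:
            continue  # not a progress line
        stamp, queue_id, steps = match.group(1), match.group(2), int(match.group(3))
        if queue_id in last_steps and steps < last_steps[queue_id]:
            resets.append((stamp, queue_id, last_steps[queue_id], steps))
        last_steps[queue_id] = steps
    return resets

sample = [
    "23:46:52:WU01:FS00:0xa4:Completed 217500 out of 250000 steps  (87%)",
    "23:48:18:WU01:FS00:0xa4:Completed 220000 out of 250000 steps  (88%)",
    "23:48:31:FS00:Shutting core down",
    "03:54:06:WU01:FS00:0xa4:Completed 0 out of 250000 steps  (0%)",
    "03:55:24:WU01:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)",
]
print(find_progress_resets(sample))  # [('03:54:06', 'WU01', 220000, 0)]
```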
nss

Re: What just happened?

Post by nss »

Hmm... would it be better for me to set it to "off" before shutting my laptop?
Joe_H

Re: What just happened?

Post by Joe_H »

23:48:31:FS00:Shutting core down
This line in the log, timestamped 13 seconds after the 88% progress report, shows that the client had already received a signal to shut down. If you did not explicitly shut it down at that point, a possible cause of that signal would be either disconnecting your laptop from external power, or using one of the settings that only folds when idle, in which case user activity paused the core. What I would expect to see in the log immediately after that message would be lines like these from pausing a WU on my laptop:
14:26:35:FS00:Shutting core down
14:26:40:WU01:FS00:0xa4:Client no longer detected. Shutting down core.
14:26:40:WU01:FS00:0xa4:
14:26:40:WU01:FS00:0xa4:Folding@home Core Shutdown: CLIENT_DIED
14:26:40:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Instead there is the message about clock skew, which comes from the laptop sleeping, followed a few seconds later by the core interrupted message in the log:

******************************* Date: 2013-07-21 *******************************
03:52:55:WARNING:WU01:FS00:Detected clock skew (4 hours 04 mins), adjusting time estimates
03:52:59:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
The client started a new folding core process immediately after.
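The timing can be checked directly from the stamps in the log. The sketch below uses a hypothetical helper, `log_gap`, and assumes the log stamps are plain HH:MM:SS values with at most one midnight between the two lines being compared.

```python
from datetime import datetime, timedelta

def log_gap(earlier, later):
    """Elapsed time between two HH:MM:SS stamps, assuming at most one midnight in between."""
    fmt = "%H:%M:%S"
    start = datetime.strptime(earlier, fmt)
    end = datetime.strptime(later, fmt)
    if end < start:
        # The later stamp belongs to the next day (the log prints a Date banner there).
        end += timedelta(days=1)
    return end - start

# 88% progress report -> "Shutting core down"
print(log_gap("23:48:18", "23:48:31"))  # 0:00:13
# "Shutting core down" -> clock skew warning after wake-up
print(log_gap("23:48:31", "03:52:55"))  # 4:04:24
```

The second gap lines up with the "4 hours 04 mins" reported in the clock skew warning, which is consistent with the laptop having been asleep for that whole period.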

So what happened to the checkpoint? I suspect it was either corrupted while the core process was partly shut down during the laptop sleep/hibernate state, or not completely written to disk at the point the new process started a few seconds after the laptop woke up.
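For anyone curious why a half-written file matters, here is a general illustration of how programs commonly guard against it: write to a temporary file, flush it to disk, then rename it over the old copy, and store a hash so a damaged file is detected on resume. This is not the FahCore_a4 implementation, which is not shown in this thread; `write_checkpoint`, `read_checkpoint`, and the file layout are assumptions made up for the example.

```python
import hashlib
import os
import tempfile

def write_checkpoint(path, payload):
    """Write payload plus its SHA-256 digest so that `path` never holds a partial file."""
    digest = hashlib.sha256(payload).digest()
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(digest + payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # swap in the new file in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

def read_checkpoint(path):
    """Return the payload, or None if the file is missing, truncated, or fails its digest."""
    try:
        with open(path, "rb") as f:
            blob = f.read()
    except FileNotFoundError:
        return None
    digest, payload = blob[:32], blob[32:]
    if len(digest) < 32 or hashlib.sha256(payload).digest() != digest:
        return None  # caller has to start over, as the core did here
    return payload
```

A reader that gets `None` back has no safe state to resume from, which matches what the log shows: the hash restore failed and the next attempt started at step 0.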

You can shut down folding by moving the slider to Off or by pausing the slot using FAHControl if you want, but the important thing is to give your laptop a few seconds to finish processing commands before closing the lid. The client will usually be okay suspended in the sleep state, so a sequence of sleeping the laptop and then disconnecting from power should also work, depending on your power management settings.
Jesse_V
Site Moderator
Posts: 2850
Joined: Mon Jul 18, 2011 4:44 am
Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
Location: Western Washington

Re: What just happened?

Post by Jesse_V »

Also, this is a pretty rare event. I've never had this happen before, and it may never occur for you again. Nevertheless, you can avoid it altogether with Joe_H's recommendations.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: What just happened?

Post by bruce »

Depending on your settings, there are a variety of things that can happen when you close your laptop. Under normal circumstances, the power should not be killed instantly and the OS should either do a shutdown or a hibernate. Both of those processes take several seconds and should never be interrupted. I can only guess what happened in your case. A failure to complete a "normal" shutdown can potentially corrupt portions of the file system, including the checkpoint. As Jesse_V said, that's pretty rare, but it is possible.

Setting the slider to OFF would give FAH a little longer to properly close its files, but it shouldn't be necessary. I'd look carefully at how long it takes your OS to properly close ALL programs, not just FAH.

I have an old laptop. The battery needs to be replaced and I decided to get a new laptop and relegate the old one to 24x7 folding while plugged in to the wall. Without a functional battery, it's a bit like a machine without a UPS in that it dies instantly if it's unplugged. It's vulnerable to the same sort of failure that I'm guessing you might have had.