
Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Thu Feb 02, 2012 3:07 am
by Adak
I have this on a 4-processor AMD SuperMicro board and keep getting the error below. I folded Run 1, Clone 24, Gen. 47 with no problems.

Code:

[05:41:50] + Closed connections
[05:41:50] 
[05:41:50] + Processing work unit
[05:41:50] Core required: FahCore_a5.exe
[05:41:50] Core found.
[05:41:50] Working on queue slot 04 [February 1 05:41:50 UTC]
[05:41:50] + Working ...
[05:41:50] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -verbose -lifeline 2324 -version 634'

[05:41:50] 
[05:41:50] *------------------------------*
[05:41:50] Folding@Home Gromacs SMP Core
[05:41:50] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[05:41:50] 
[05:41:50] Preparing to commence simulation
[05:41:50] - Looking at optimizations...
[05:41:50] - Created dyn
[05:41:50] - Files status OK
[05:41:57] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[05:41:57] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[05:41:58] - Digital signature verified
[05:41:58] 
[05:41:58] Project: 6904 (Run 1, Clone 25, Gen 48)
[05:41:58] 
[05:41:58] Assembly optimizations on if available.
[05:41:58] Entering M.D.
[05:42:07] Mapping NT from 64 to 64 
[05:42:15] Completed 0 out of 250000 steps  (0%)
[06:04:37] Completed 2500 out of 250000 steps  (1%)
[06:15:33]  NT from 64 to 64 
[06:15:59] fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.mdrun returned 3
[06:16:00] Gromacs detected an fcSaveRestoreState: I/O failed d..ir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Resuming Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:10] 
[06:16:10] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:16:10] CoreStatus = 62 (98)
[06:16:10] + Restarting core (settings changed) 
[06:16:10] 
[06:16:10] + Processing work unit
[06:16:10] Core required: FahCore_a5.exe
[06:16:10] Core found.
[06:16:10] Working on queue slot 04 [February 1 06:16:10 UTC]
[06:16:10] + Working ...
[06:16:10] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[06:16:10] 
[06:16:10] *------------------------------*
[06:16:10] Folding@Home Gromacs SMP Core
[06:16:10] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[06:16:10] 
[06:16:10] Preparing to commence simulation
[06:16:10] - Looking at optimizations...
[06:16:10] - Not checking prior termination.
[06:16:18] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[06:16:18] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[06:16:19] - Digital signature verified
[06:16:19] 
[06:16:19] Project: 6904 (Run 1, Clone 25, Gen 48)
[06:16:19] 
[06:16:19] Assembly optimizations on if available.
[06:16:19] Entering M.D.
[06:16:27] Mapping NT from 64 to 64 
[06:16:36] Completed 0 out of 250000 steps  (0%)
[06:39:00] Completed 2500 out of 250000 steps  (1%)
[06:50:15] /O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.Resuming from checkpoint
[06:50:15] fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.mdrun returned 3
[06:50:15] Gromacs detected an invafcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:26] 
[06:50:26] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:50:26] CoreStatus = 62 (98)
[06:50:26] + Restarting core (settings changed) 
[06:50:26] 
[06:50:26] + Processing work unit
[06:50:26] Core required: FahCore_a5.exe
[06:50:26] Core found.
[06:50:26] Working on queue slot 04 [February 1 06:50:26 UTC]
[06:50:26] + Working ...
[06:50:26] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[06:50:26] 
[06:50:26] *------------------------------*
[06:50:26] Folding@Home Gromacs SMP Core
[06:50:26] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[06:50:26] 
[06:50:26] Preparing to commence simulation
[06:50:26] - Looking at optimizations...
[06:50:26] - Not checking prior termination.
[06:50:34] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[06:50:34] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[06:50:35] - Digital signature verified
[06:50:35] 
[06:50:35] Project: 6904 (Run 1, Clone 25, Gen 48)
[06:50:35] 
[06:50:35] Assembly optimizations on if available.
[06:50:35] Entering M.D.
[06:50:43] Mapping NT from 64 to 64 
[06:50:52] Completed 0 out of 250000 steps  (0%)
[07:13:13] Completed 2500 out of 250000 steps  (1%)
[07:24:30] /O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.mdrun returned 3
[07:24:30] Gromacs detected an invalid checkpoint.  RestartifcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] ResumiCan't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:41] 
[07:24:41] Folding@home Core Shutdown: UNKNOWN_ERROR
[07:24:42] CoreStatus = 62 (98)
[07:24:42] + Restarting core (settings changed) 
[07:24:42] 
[07:24:42] + Processing work unit
[07:24:42] Core required: FahCore_a5.exe
[07:24:42] Core found.
[07:24:42] Working on queue slot 04 [February 1 07:24:42 UTC]
[07:24:42] + Working ...
[07:24:42] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[07:24:42] 
[07:24:42] *------------------------------*
[07:24:42] Folding@Home Gromacs SMP Core
[07:24:42] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[07:24:42] 
[07:24:42] Preparing to commence simulation
[07:24:42] - Looking at optimizations...
[07:24:42] - Not checking prior termination.
[07:24:49] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[07:24:49] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[07:24:50] - Digital signature verified
[07:24:50] 
[07:24:50] Project: 6904 (Run 1, Clone 25, Gen 48)
[07:24:50] 
[07:24:50] Assembly optimizations on if available.
[07:24:50] Entering M.D.
[07:24:59] Mapping NT from 64 to 64 
[07:25:07] Completed 0 out of 250000 steps  (0%)
[07:47:33] Completed 2500 out of 250000 steps  (1%)
[07:58:45] /O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.mdrun returned 3
[07:58:45] Gromacs detected an invfcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] ResumiCan't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:56] 
[07:58:56] Folding@home Core Shutdown: UNKNOWN_ERROR
[07:58:57] CoreStatus = 62 (98)
[07:58:57] + Restarting core (settings changed) 
[07:58:57] 
[07:58:57] + Processing work unit
[07:58:57] Core required: FahCore_a5.exe
[07:58:57] Core found.
[07:58:57] Working on queue slot 04 [February 1 07:58:57 UTC]
[07:58:57] + Working ...
[07:58:57] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[07:58:57] 
[07:58:57] *------------------------------*
[07:58:57] Folding@Home Gromacs SMP Core
[07:58:57] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[07:58:57] 
[07:58:57] Preparing to commence simulation
[07:58:57] - Looking at optimizations...
[07:58:57] - Not checking prior termination.
[07:59:05] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[07:59:05] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[07:59:06] - Digital signature verified
[07:59:06] 
[07:59:06] Project: 6904 (Run 1, Clone 25, Gen 48)
[07:59:06] 
[07:59:06] Assembly optimizations on if available.
[07:59:06] Entering M.D.
[07:59:14] Mapping NT from 64 to 64 
[07:59:22] Completed 0 out of 250000 steps  (0%)
[08:21:45] Completed 2500 out of 250000 steps  (1%)
[08:33:30] /O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
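
The log above repeats the same few messages many times, so the restart cycle is easier to see in a summary. Below is a minimal sketch, assuming only the "[HH:MM:SS] message" line shape shown in the log; the function name `summarize_fah_log` is hypothetical and is not part of the FAH client. It counts lines that contain each error message (not individual occurrences within a line) and records each core shutdown.

```python
import re
from collections import Counter

# Matches the "[HH:MM:SS] message" shape seen in the log above.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")


def summarize_fah_log(text):
    """Return (error line counts, list of (timestamp, shutdown reason))."""
    counts = Counter()
    shutdowns = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip blank lines and banner lines without a timestamp
        stamp, msg = match.groups()
        if "I/O failed" in msg:
            counts["fcSaveRestoreState: I/O failed"] += 1
        if "Can't open checkpoint file" in msg:
            counts["Can't open checkpoint file"] += 1
        if "Core Shutdown:" in msg:
            reason = msg.split("Core Shutdown:", 1)[1].strip()
            shutdowns.append((stamp, reason))
    return counts, shutdowns
```

Run against the client log text, this returns the two error counts plus one entry per core shutdown, which makes it easy to compare the shutdown times against the last "Completed ... steps" line before each failure.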

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Thu Feb 02, 2012 3:37 am
by Grandpa_01
I am guessing you shut it down during the run and started getting this error when you started it back up. The computer was probably shut down before it had a chance to finish saving the file. I have seen this on my 4P, but I always save a copy of the FAH folder to the desktop before shutting down. Fortunately, I was able to restart and complete the WU from the copy I saved to the desktop.
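
The habit described above, copying the FAH folder before a shutdown, can be scripted. This is a minimal sketch, assuming the client has already been stopped so the files are not changing during the copy; the function name `backup_fah_dir` and both paths in the usage note are hypothetical.

```python
import shutil
import time
from pathlib import Path


def backup_fah_dir(fah_dir, backup_root):
    """Copy the whole client folder into a new timestamped folder; return its path."""
    src = Path(fah_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such client folder: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree raises if dest already exists, so an older backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

For example, `backup_fah_dir("/home/user/fah", "/home/user/Desktop")` would leave a dated copy on the desktop. Restoring is the reverse: stop the client, then copy the saved folder back over the client folder before restarting.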

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Thu Feb 02, 2012 7:26 am
by Adak
There was no shutdown. This box runs headless and I'm sitting just a few feet away, so there was no reboot or power loss.

Here's the next work unit, doing the same thing. I'm going to reinstall Ubuntu and FAH. I thought it might be the drive acting up (it's not new), but it just passed an HD test. Maybe "the Kraken" is cracking up for some reason. I don't know why the FAH program would try to go back and open the checkpoint at all when it hasn't been turned off.

Code:

--- Opening Log file [February 2 03:55:24 UTC] 

# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/adak/fah
Executable: ./fah6
Arguments: -smp -bigadv -verbosity 9 

[03:55:24] - Ask before connecting: No
[03:55:24] - User name: Adak (Team 32)
[03:55:24] - User ID not found locally
[03:55:24] + Requesting User ID from server
[03:55:24] - Getting ID from AS: 
[03:55:24] Connecting to http://assign.stanford.edu:8080/
[03:55:25] Posted data.
[03:55:25] Initial: 8D4B; - Received User ID = 4B8DB6BC66222BE1
[03:55:25] - Machine ID: 1
[03:55:25] 
[03:55:25] Work directory not found. Creating...
[03:55:25] Could not open work queue, generating new queue...
[03:55:25] - Preparing to get new work unit...
[03:55:25] - Autosending finished units... [February 2 03:55:25 UTC]
[03:55:25] Cleaning up work directory
[03:55:25] Trying to send all finished work units
[03:55:25] + No unsent completed units remaining.
[03:55:25] - Autosend completed
[03:55:25] + Attempting to get work packet
[03:55:25] Passkey found
[03:55:25] - Will indicate memory of 32233 MB
[03:55:25] - Connecting to assignment server
[03:55:25] Connecting to http://assign.stanford.edu:8080/
[03:55:25] Posted data.
[03:55:25] Initial: ED82; - Successful: assigned to (130.237.232.237).
[03:55:25] + News From Folding@Home: Welcome to Folding@Home
[03:55:25] Loaded queue successfully.
[03:55:25] Sent data
[03:55:25] Connecting to http://130.237.232.237:8080/
[03:55:38] Posted data.
[03:55:38] Initial: 0000; - Receiving payload (expected size: 57245855)
[03:59:48] - Downloaded at ~223 kB/s
[03:59:48] - Averaged speed for that direction ~223 kB/s
[03:59:48] + Received work.
[03:59:48] + Closed connections
[03:59:48] 
[03:59:48] + Processing work unit
[03:59:48] Core required: FahCore_a5.exe
[03:59:48] Core found.
[03:59:48] Working on queue slot 01 [February 2 03:59:48 UTC]
[03:59:48] + Working ...
[03:59:48] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 01 -np 64 -checkpoint 30 -verbose -lifeline 9786 -version 634'

[03:59:48] 
[03:59:48] *------------------------------*
[03:59:48] Folding@Home Gromacs SMP Core
[03:59:48] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[03:59:48] 
[03:59:48] Preparing to commence simulation
[03:59:48] - Looking at optimizations...
[03:59:48] - Created dyn
[03:59:48] - Files status OK
[03:59:56] - Expanded 57245343 -> 71846524 (decompressed 50.4 percent)
[03:59:56] Called DecompressByteArray: compressed_data_size=57245343 data_size=71846524, decompressed_data_size=71846524 diff=0
[03:59:56] - Digital signature verified
[03:59:56] 
[03:59:56] Project: 6903 (Run 2, Clone 29, Gen 5)
[03:59:56] 
[03:59:57] Assembly optimizations on if available.
[03:59:57] Entering M.D.
[04:00:05] Mapping NT from 64 to 64 
[04:00:13] Completed 0 out of 250000 steps  (0%)
[04:16:08] Completed 2500 out of 250000 steps  (1%)
[04:32:31]  NT from 64 to 64 
[04:33:01] fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.mdrun returned 3
[04:33:01] Gromacs detectedfcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] ReCan't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:11] 
[04:33:11] Folding@home Core Shutdown: UNKNOWN_ERROR
[04:33:12] CoreStatus = 62 (98)
[04:33:12] + Restarting core (settings changed) 
[04:33:12] 
[04:33:12] + Processing work unit
[04:33:12] Core required: FahCore_a5.exe
[04:33:12] Core found.
[04:33:12] Working on queue slot 01 [February 2 04:33:12 UTC]
[04:33:12] + Working ...
[04:33:12] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 01 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 9786 -version 634'

[04:33:12] 
[04:33:12] *------------------------------*
[04:33:12] Folding@Home Gromacs SMP Core
[04:33:12] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[04:33:12] 
[04:33:12] Preparing to commence simulation
[04:33:12] - Looking at optimizations...
[04:33:12] - Not checking prior termination.
[04:33:19] - Expanded 57245343 -> 71846524 (decompressed 50.4 percent)
[04:33:19] Called DecompressByteArray: compressed_data_size=57245343 data_size=71846524, decompressed_data_size=71846524 diff=0
[04:33:20] - Digital signature verified
[04:33:20] 
[04:33:20] Project: 6903 (Run 2, Clone 29, Gen 5)
[04:33:20] 
[04:33:20] Assembly optimizations on if available.
[04:33:20] Entering M.D.
[04:33:29] Mapping NT from 64 to 64 
[04:33:36] Completed 0 out of 250000 steps  (0%)
[04:49:34] Completed 2500 out of 250000 steps  (1%)
[05:07:01] /O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
One thing that stands out to me is this:

Code: Select all

[03:55:38] Initial: 0000; - Receiving payload (expected size: 57245855)
[03:59:48] - Downloaded at ~223 kB/s
[03:59:48] - Averaged speed for that direction ~223 kB/s
[03:59:48] + Received work.
[03:59:48] + Closed connections
I'm on DSL service, and I can't begin to receive a 57 MB file in just ten seconds! And isn't 57 MB a bit large for a wu download?
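
For what it's worth, the timestamps in that excerpt run from 03:55:38 to 03:59:48, which is a little over four minutes rather than ten seconds. A quick sanity check in Python, assuming the client's "kB" means 1024 bytes (an assumption; the log doesn't say):

```python
from datetime import datetime

# Values copied from the log excerpt above.
payload_bytes = 57245855
start = datetime.strptime("03:55:38", "%H:%M:%S")
end = datetime.strptime("03:59:48", "%H:%M:%S")

elapsed_s = (end - start).total_seconds()

# Assumption: "kB" in the client log means 1024 bytes.
rate_kib_s = payload_bytes / elapsed_s / 1024

print(f"elapsed: {elapsed_s:.0f} s, average rate: {rate_kib_s:.1f} kB/s")
```

Under that assumption the average works out to roughly the ~223 kB/s the client reported, so the download itself looks consistent with the log.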

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Thu Feb 02, 2012 8:28 am
by Leonardo
If it's not a work unit problem, my first guess would be physical memory errors.

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Thu Feb 02, 2012 9:42 am
by Adak
Thanks, but the problem is with "the Kraken". I thought the same thing was likely, and ran some tests, but memory was fine.

This is what the Kraken does, according to Tear:

Code: Select all

Here's expected behaviour in the use case of interest:
1. Client downloads new WU (no checkpoint exists at this time)
2. Client starts the FahCore (no checkpoint exists at this time)
3. FahCore keeps on simulating
4. First checkpoint interval elapses (15 minutes in default config)
5. The Kraken detects completely written checkpoint
6. The Kraken restarts the FahCore
7. FahCore resumes from checkpoint
8. Hopefully, DLB gets engaged  (DLB is Dynamic Load Balancing)
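
The logs in this thread seem to fit that sequence. Each run was started with `-checkpoint 30`, and the errors show up a little over 30 minutes after the "Completed 0 out of 250000 steps" line. A minimal sketch of that arithmetic, with the timestamp pairs copied by hand from the logs above (the helper name `minutes_between` is just for illustration):

```python
from datetime import datetime

def minutes_between(start, end):
    """Return the minutes elapsed between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60

# (time of "Completed 0 ... steps", time of first restore-state error)
runs = [
    ("05:42:15", "06:15:59"),  # Project 6904, February 1
    ("04:00:13", "04:33:01"),  # Project 6903, first attempt
    ("04:33:36", "05:07:01"),  # Project 6903, after the core restart
]

for start, end in runs:
    print(f"{start} -> {end}: {minutes_between(start, end):.1f} min")
```

This doesn't prove the restart is the trigger, but all three gaps fall between 32 and 34 minutes, which lines up with a restart attempted shortly after the first checkpoint interval.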
And this is the problem, as Tear described it for Fedora:

Code: Select all

Upon installation a dedicated user gets created (called fahclient) and client
starts automatically (runs as 'fahclient'). It runs off /var/lib/fahclient/
directory. WUs, config files, FahCores, everything is kept there.

If you, however, at any time, stop the client (via 'service FAHClient stop')
and then start FAHControl from your terminal window or otherwise (while being
logged in as yourself) here's what's going to happen:

1) FAHControl will try to connect to FAHClient and will fail (client is not running)
2) FAHControl will start the client (that's what its default configuration is)
3) But note, you're not 'fahclient' anymore!
4) FAHClient doesn't even look at files in /var/lib/fahclient/ (no wonder, it
   can't access them anyway) and
5) creates fresh 'installation' in '.FAHClient' :!:
6) As it's a fresh installation, client will get a fresh WU, fresh FahCore, etc.
7) Bottom-line: you'll end up with two separate and independent installations:
   one in /var/lib/fahclient/ and another one in ~/.FAHClient

To prevent this issue from arising I'd recommend turning 'Autostart' feature
off in 'Preferences' and always running FAHClient as 'fahclient' user (I guess
one could call it a 'service mode').
Most of the above is true for Ubuntu, as well.
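
One way to avoid starting the client under the wrong account is to check the effective user before launching anything. A minimal sketch in Python, assuming the service account is named `fahclient` as in Tear's description; the function name `running_as` is made up for illustration and is not part of FAH or the Kraken:

```python
import os
import pwd

def running_as(expected_user, uid=None):
    """Return True if the given (or current effective) UID belongs to expected_user."""
    if uid is None:
        uid = os.geteuid()
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this UID, so it cannot be the expected account.
        return False
    return name == expected_user

if __name__ == "__main__":
    # Assumption: the client is supposed to run as 'fahclient'.
    if running_as("fahclient"):
        print("OK: running as fahclient")
    else:
        print("Refusing to start: not running as fahclient")
```

A wrapper script could run a check like this first and bail out instead of quietly creating a second installation under the wrong home directory.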

Of course, none of this was in the summary I read when installing the Kraken. So there I was, gleefully testing various BIOS options while folding, stopping and restarting the folding client each time, and eventually I got caught running it under my own user name instead of "fahclient".

That's when this whole mess started. I thought I had worked it out, but I hadn't, since I knew none of this about the Kraken.

So the wu is probably fine - almost 100% sure it is.

Thanks for all the input, however. In Linux, I can get off into the weeds rather quickly.

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Mon Feb 06, 2012 8:02 am
by PantherX
Nothing in the WU Database, so I have marked it for follow-up.

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Wed Apr 25, 2012 12:08 am
by sortofageek
My apologies for such a late follow-up but, just for the record, this one was completed successfully by two different donors.

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Posted: Wed Apr 25, 2012 1:48 pm
by Nathan_P
Adak wrote: I'm on DSL service, and I can't begin to receive a 57 MB file in just ten seconds! And isn't 57 MB a bit large for a wu download?
Just a further follow-up, as I missed it originally: 57 MB is the size of the 6904 WU. 8101 runs about 35 MB, but I can't remember the others. Needless to say, these are at the top end of size and go hand in hand with the -bigadv status.