Project: 2665 (Run 3, Clone 325, Gen 45) ERROR 0x7b
Posted: Tue Sep 23, 2008 1:06 pm
I've looked but couldn't see this one listed by anyone else, so I'll have to find out from you whether or not I need to do more testing/checking for problems at "my end".
Since it did a VERY early EUE (just over 1%) on my main machine, it was more convenient than usual to test it on the basically identical but somewhat lower-spec sister D9200, which is also folding. Only a little downtime was needed on the sister D9200 to test whether the EUE happened at the same point on that different set of hardware, so its folding time wasn't disrupted much. The only other variable that differs is the OS: on the sister machine I'm currently folding in a Vista 32 Home Premium pre-SP1 partition.
I've just used (-delete xx +) Qfix on the current work unit (07) in the client I returned to my main machine. Could/should I have used it on the previous unit (06), not for the points obviously, but for its potential use in helping you identify WU-specific issues by getting that data sent back to the server?
MAIN mach.;
[17:32:54] + Results successfully sent
[17:32:54] Thank you for your contribution to Folding@Home.
[17:32:54] + Number of Units Completed: 31
[17:32:54] + Sent 1 of 1 completed units to the server
[17:32:54] - Autosend completed
[17:32:58] Posted data.
[17:32:58] Initial: 0000; - Receiving payload (expected size: 4762417)
[17:33:10] - Downloaded at ~387 kB/s
[17:33:10] - Averaged speed for that direction ~301 kB/s
[17:33:10] + Received work.
[17:33:10] + Closed connections
[17:33:10]
[17:33:10] + Processing work unit
[17:33:10] Work type a1 not eligible for variable processors
[17:33:10] Core required: FahCore_a1.exe
[17:33:10] Core found.
[17:33:10] Using generic mpiexec calls
[17:33:10] Working on queue slot 06 [September 22 17:33:10 UTC]
[17:33:10] + Working ...
[17:33:10] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 4992 -version 622'
[17:33:10]
[17:33:10] *------------------------------*
[17:33:10] Folding@Home Gromacs SMP Core
[17:33:10] Version 1.74 (March 10, 2007)
[17:33:10]
[17:33:10] Preparing to commence simulation
[17:33:10] - Ensuring status. Please wait.
[17:33:27] - Looking at optimizations...
[17:33:27] - Working with standard loops on this execution.
[17:33:28] - Created dyn
[17:33:28] - Files status OK
[17:33:47] - Expanded 4761905 -> 24426905 (decompressed 512.9 percent)
[17:33:47] - Starting from initial work packet
[17:33:48]
[17:33:48] Project: 2665 (Run 3, Clone 325, Gen 45)
[17:33:48]
[17:33:48] g from initial work packet
[17:33:48]
[17:33:48] Project: 2665 (Run 3, Clone 325, Gen 45)
[17:33:48]
[17:33:49] Entering M.D.
[17:33:58] Protein: HGG in water
[17:33:58] Writing local files
[17:33:59] Extra SSE boost OK.
[17:34:08] ps (0 percent)
[17:49:09] Timered checkpoint triggered.
[17:56:48] Writing local files
[17:56:48] Completed 2500 out of 250000 steps (1 percent)
[18:09:28] ore data to disk...
[18:09:28] ... Done.
[18:09:28] - Failed to delete work/wudata_06.arc
[18:09:29] gfile size: 11376
[18:09:29] - Writing 11912 byte- Failed to delete work/wudata_06.sas
[18:09:29] - Failed to delete work/wudata_06.goe
[18:09:29] - Failed to delete work/wudata_06.xvg
[18:09:29] Warning: check for stray files
[18:09:29]
[18:09:29] Folding@home Core Shutdown: EARLY_UNIT_END
[18:09:29] Finalizing output
[18:09:29] r stray files
[18:11:29]
[18:11:29] Folding@home Core Shutdown: EARLY_UNIT_END
[18:11:29]
[18:11:29] Folding@home Core Shutdown: EARLY_UNIT_END
[18:11:33] CoreStatus = 7B (123)
[18:11:33] Client-core communications error: ERROR 0x7b
[18:11:33] This is a sign of more serious problems, shutting down.
--- Opening Log file [September 22 19:34:17 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.22 SMP Beta2
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\Program Files\Folding@Home Windows SMP Client V1.01
Executable: C:\Program Files\Folding@Home Windows SMP Client V1.01\fah6.22.exe
Arguments: -smp -verbosity 9
[19:34:17] - Ask before connecting: No
[19:34:17] - User name: al (Team 13505)
[19:34:17] - User ID: 1D00E77F41982A15
[19:34:17] - Machine ID: 1
[19:34:17]
[19:34:17] Loaded queue successfully.
[19:34:17]
[19:34:17] + Processing work unit
[19:34:17] - Autosending finished units... [September 22 19:34:17 UTC]
[19:34:17] Work type a1 not eligible for variable processors
[19:34:17] Core required: FahCore_a1.exe
[19:34:17] Trying to send all finished work units
[19:34:17] + No unsent completed units remaining.
[19:34:17] - Autosend completed
[19:34:17] Core found.
[19:34:17] Using generic mpiexec calls
[19:34:17] Working on queue slot 06 [September 22 19:34:17 UTC]
[19:34:17] + Working ...
[19:34:17] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 3832 -version 622'
[19:34:17]
[19:34:17] *------------------------------*
[19:34:17] Folding@Home Gromacs SMP Core
[19:34:17] Version 1.74 (March 10, 2007)
[19:34:17]
[19:34:17] Preparing to commence simulation
[19:34:17] - Ensuring status. Please wait.
[19:34:34] - Looking at optimizations...
[19:34:34] - Working with standard loops on this execution.
[19:34:34] - Previous termination of core was improper.
[19:34:34] - Going to use standard loops.
[19:34:34] - Files status OK
SISTER mach.;
--- Opening Log file [September 22 19:51:51 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.22 SMP Beta2
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\Users\Al\Desktop\Copy (2) of Folding@Home Windows SMP Client V1.01
Executable: C:\Users\Al\Desktop\Copy (2) of Folding@Home Windows SMP Client V1.01\fah6.22.exe
Arguments: -smp -verbosity 9
[19:51:51] - Ask before connecting: No
[19:51:51] - User name: al (Team 13505)
[19:51:51] - User ID: 29363BE33B18C144
[19:51:51] - Machine ID: 1
[19:51:51]
[19:51:51] Loaded queue successfully.
[19:51:51]
[19:51:51] + Processing work unit
[19:51:51] Work type a1 not eligible for variable processors
[19:51:51] Core required: FahCore_a1.exe
[19:51:51] - Autosending finished units... [September 22 19:51:51 UTC]
[19:51:51] Trying to send all finished work units
[19:51:51] + No unsent completed units remaining.
[19:51:51] - Autosend completed
[19:51:51] Core found.
[19:51:51] Using generic mpiexec calls
[19:51:51] Working on queue slot 07 [September 22 19:51:51 UTC]
[19:51:51] + Working ...
[19:51:51] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 4928 -version 622'
[19:51:51]
[19:51:51] *------------------------------*
[19:51:51] Folding@Home Gromacs SMP Core
[19:51:51] Version 1.74 (March 10, 2007)
[19:51:51]
[19:51:51] Preparing to commence simulation
[19:51:51] - Ensuring status. Please wait.
[19:52:09] - Looking at optimizations...
[19:52:09] - Working with standard loops on this execution.
[19:52:09] - Previous termination of core was improper.
[19:52:09] - Going to use standard loops.
[19:52:09] - Files status OK
[19:52:32] - Expanded 4761905 -> 24426905 (decompressed 512.9 percent)
[19:52:33]
[19:52:33] Project: 2665 (Run 3, Clone 325, Gen 45)
[19:52:33]
[19:52:34] Entering M.D.
[19:52:40] Rejecting checkpoint
[19:52:43]
[19:52:43] Writing local files
[19:52:43]
[19:52:43] Writing local files
[19:52:55] Extra SSE boost OK.
[19:52:56] Writing local files
[19:52:56] Completed 0 out of 250000 steps (0 percent)
[20:07:58] Timered checkpoint triggered.
[20:21:57] Writing local files
[20:21:58] Completed 2500 out of 250000 steps (1 percent)
[20:36:58] Timered checkpoint triggered.
[20:37:47] if you
[20:37:47] often see other project units terminating early like this
[20:37:47] too, you may wish to check the stability of your computer (issues
[20:37:47] such as high temperature, overclocking, etc.).
[20:37:47] Going to send back what have done.
[20:37:47] logfile size: 19546
[20:37:47] - Writing 18602 bytes of core data to disk...
[20:37:47] ... Done.
[20:37:47] early like this
[20:37:47] too, you may wish to check the stability of your computer (issues
[20:37:47] such as high temperature, overclocking, etc.).
[20:37:47] Going to send back what have done.
[20:37:47] logfile size: 18052
[20:37:47] - Writing 18602 bytes of core data to disk...
[20:37:47] ... Done.
[20:37:47] No C.P. to delete.
[20:37:47] - Failed to delete work/wudata_07.dyn
[20:37:47] Warning: check for stray files
[20:37:47]
[20:37:47] Folding@home Core Shutdown: EARLY_UNIT_END
[20:37:47]
[20:37:47] Folding@home Core Shutdown: EARLY_UNIT_END
[20:37:50] CoreStatus = 7B (123)
[20:37:50] Client-core communications error: ERROR 0x7b
[20:37:50] This is a sign of more serious problems, shutting down.
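For anyone who wants to check quickly whether the two machines failed at the same point, here is a rough Python sketch. It is not part of the F@H client or Qfix; the function name `last_progress_before_eue` and the two sample strings are made up for illustration, and the sample lines are simply copied from the logs above. It assumes the log keeps the `[HH:MM:SS] Completed N out of M steps` and `Folding@home Core Shutdown: EARLY_UNIT_END` line formats seen here, and it ignores the garbled interleaved lines.

```python
import re

# Progress lines as they appear in the logs above, e.g.
# [17:56:48] Completed 2500 out of 250000 steps (1 percent)
PROGRESS = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+Completed (\d+) out of (\d+) steps")
# Core shutdown lines, e.g.
# [18:09:29] Folding@home Core Shutdown: EARLY_UNIT_END
SHUTDOWN = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+Folding@home Core Shutdown: (\w+)")


def last_progress_before_eue(log_text):
    """Return (steps_done, total_steps, progress_time, eue_time) for the
    first EARLY_UNIT_END in the log, or None if the log contains no EUE.
    The first three fields are None if no progress line preceded the EUE."""
    last = (None, None, None)
    for line in log_text.splitlines():
        m = PROGRESS.match(line)
        if m:
            # Remember the most recent progress report seen so far.
            last = (int(m.group(2)), int(m.group(3)), m.group(1))
            continue
        m = SHUTDOWN.match(line)
        if m and m.group(2) == "EARLY_UNIT_END":
            return last + (m.group(1),)
    return None


# Sample lines copied from the MAIN and SISTER logs in this post.
MAIN_SAMPLE = """\
[17:56:48] Completed 2500 out of 250000 steps (1 percent)
[18:09:29] Folding@home Core Shutdown: EARLY_UNIT_END
"""
SISTER_SAMPLE = """\
[20:21:58] Completed 2500 out of 250000 steps (1 percent)
[20:37:47] Folding@home Core Shutdown: EARLY_UNIT_END
"""

if __name__ == "__main__":
    main = last_progress_before_eue(MAIN_SAMPLE)
    sister = last_progress_before_eue(SISTER_SAMPLE)
    print("main:  ", main)
    print("sister:", sister)
    # Same last reported step on both machines suggests a WU-specific problem
    # rather than a hardware one, though it is only a hint, not proof.
    print("same last reported step:", main[:2] == sister[:2])
```

On the lines above, both machines last reported 2500 of 250000 steps before the EUE. The log only reports progress every 2500 steps, so this says the failures fall in the same 1% window, not necessarily on the same step.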