Anybody else seeing failures with this? The box has attempted to run the unit three times with the same error at the same point.
The client I'm using is the extended 5.91 client. 6.22 doesn't play nice running two clients on a quad-core box for me, though after reading some of kasson's posts, I might have to upgrade.
Completed 250000 out of 250000 steps (100 percent)
[22:44:36] Writing final coordinates.
[22:44:37] Past main M.D. loop
[22:44:37] Will end MPI now
[22:45:37]
[22:45:37] Finished Work Unit:
[22:45:37] - Reading up to 21310704 from "work/wudata_01.arc": Read 21310704
[22:45:37] - Reading up to 559868 from "work/wudata_01.xtc": Read 559868
[22:45:37] goefile size: 0
[22:45:37] logfile size: 213173
[22:45:37] Leaving Run
[22:45:40] - Writing 22090117 bytes of core data to disk...
[22:45:40] ... Done.
[22:45:40] - Failed to delete work/wudata_01.sas
[22:45:40] - Failed to delete work/wudata_01.goe
[22:45:40] Warning: check for stray files
[22:45:40] - Shutting down core
[22:47:40]
[22:47:40] Folding@home Core Shutdown: FINISHED_UNIT
[22:47:40]
[22:47:40] Folding@home Core Shutdown: FINISHED_UNIT
[22:47:44] CoreStatus = 64 (100)
[22:47:44] Sending work to server
[22:47:44] + Attempting to send results
[22:52:08] + Results successfully sent
[22:52:08] Thank you for your contribution to Folding@Home.
[22:52:08] + Number of Units Completed: 5
[22:54:12] - Preparing to get new work unit...
[22:54:12] + Attempting to get work packet
[22:54:12] - Connecting to assignment server
[22:54:33] - Couldn't send HTTP request to server
[22:54:33] + Could not connect to Assignment Server
[22:54:34] - Successful: assigned to (171.64.65.64).
[22:54:34] + News From Folding@Home: Welcome to Folding@Home
[22:54:34] Loaded queue successfully.
[22:54:52] + Closed connections
[22:54:52]
[22:54:52] + Processing work unit
[22:54:52] Core required: FahCore_a1.exe
[22:54:52] Core found.
[22:54:52] Working on Unit 02 [August 19 22:54:52]
[22:54:52] + Working ...
[22:54:52]
[22:54:52] *------------------------------*
[22:54:52] Folding@Home Gromacs SMP Core
[22:54:52] Version 1.74 (March 10, 2007)
[22:54:52]
[22:54:52] Preparing to commence simulation
[22:54:52] - Ensuring status. Please wait.
[22:54:58] - Starting from initial work packet
[22:54:58]
[22:54:58] Project: 2665 (Run 2, Clone 791, Gen 30)
[22:54:58]
[22:54:58] Assembly optimizations on if available.
[22:54:58] Entering M.D.
[22:55:23] al work pa- Starting from initial work packet
[22:55:23]
[22:55:23] Project: 2665 (Run 2, Clone 791, Gen 30)
[22:55:23]
[22:55:24] Entering M.D.
[22:55:30] Rejecting checkpoint
[22:55:32] Protein: HGG with glycosylations
[22:55:32] Writing local files
[22:55:41] Extra SSE boost OK.
[22:55:41] Writing local files
[22:55:41] Completed 0 out of 250000 steps (0 percent)
[23:14:13] Writing local files
[23:14:13] Completed 2500 out of 250000 steps (1 percent)
[23:33:03] Writing local files
[23:33:03] Completed 5000 out of 250000 steps (2 percent)
[23:42:43] iles
[23:42:43]
[23:42:43] Folding@home Core Shutdown: EARLY_UNIT_END
[23:42:43] Finalizing output
[23:42:43]
[23:42:43] logfile size: 12967
[23:42:43] - Writing 13503 bytes of core data to disk...
[23:42:43] ... Done.
[23:42:43] - Failed to delete work/wudata_02.sas
[23:42:43] - Failed to delete work/wudata_02.goe
[23:42:43] Warning: check for stray files
[23:44:43]
[23:44:43] Folding@home Core Shutdown: EARLY_UNIT_END
[23:44:43]
[23:44:43] Folding@home Core Shutdown: EARLY_UNIT_END
[23:44:46] CoreStatus = 7B (123)
[23:44:46] Client-core communications error: ERROR 0x7b
[23:44:46] Deleting current work unit & continuing...
[23:46:50] - Preparing to get new work unit...
[23:46:50] + Attempting to get work packet
[23:46:50] - Connecting to assignment server
[23:47:11] - Couldn't send HTTP request to server
[23:47:11] + Could not connect to Assignment Server
[23:47:12] - Successful: assigned to (171.64.65.64).
[23:47:12] + News From Folding@Home: Welcome to Folding@Home
[23:47:12] Loaded queue successfully.
[23:47:28] + Closed connections
[23:47:33]
[23:47:33] + Processing work unit
[23:47:33] Core required: FahCore_a1.exe
[23:47:33] Core found.
[23:47:33] Working on Unit 03 [August 19 23:47:33]
[23:47:33] + Working ...
[23:47:33]
[23:47:33] *------------------------------*
[23:47:33] Folding@Home Gromacs SMP Core
[23:47:33] Version 1.74 (March 10, 2007)
[23:47:33]
[23:47:33] Preparing to commence simulation
[23:47:33] - Ensuring status. Please wait.
[23:47:39] - Starting from initial work packet
[23:47:39]
[23:47:39] Project: 2665 (Run 2, Clone 791, Gen 30)
[23:47:39]
[23:47:39] Assembly optimizations on if available.
[23:47:39] Entering M.D.
[23:48:03] percent)
[23:48:03] - Starting from initial work packet
[23:48:03]
[23:48:03] Project: 2665 (Run 2, Clone 791, Gen 30)
[23:48:03]
[23:48:05] Entering M.D.
[23:48:11] Rejecting checkpoint
[23:48:12] cosylations
[23:48:12] Writing local files
[23:48:13]
[23:48:13] Writing local files
[23:48:22] Extra SSE boost OK.
[23:48:22] Writing local files
[23:48:22] Completed 0 out of 250000 steps (0 percent)
[00:07:00] Writing local files
[00:07:01] Completed 2500 out of 250000 steps (1 percent)
[00:25:50] Writing local files
[00:25:50] Completed 5000 out of 250000 steps (2 percent)
[00:35:30]
[00:35:30] les
[00:35:30]
[00:35:30] Folding@home Core Shutdown: EARLY_UNIT_END
[00:35:30] Finalizing output
[00:35:30]
[00:35:30] logfile size: 12967
[00:35:30] - Writing 13503 bytes of core data to disk...
[00:35:30] ... Done.
[00:35:30] - Failed to delete work/wudata_03.arc
[00:35:30] - Failed to delete work/wudata_03.sas
[00:35:30] - Failed to delete work/wudata_03.goe
[00:35:30] Warning: check for stray files
[00:37:30]
[00:37:30] Folding@home Core Shutdown: EARLY_UNIT_END
[00:37:30]
[00:37:30] Folding@home Core Shutdown: EARLY_UNIT_END
[00:37:33] CoreStatus = 7B (123)
[00:37:33] Client-core communications error: ERROR 0x7b
[00:37:33] Deleting current work unit & continuing...
[00:39:37] - Preparing to get new work unit...
[00:39:37] + Attempting to get work packet
[00:39:37] - Connecting to assignment server
[00:39:58] - Couldn't send HTTP request to server
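For anyone who wants to check their own FAHlog.txt for the same pattern, here is a rough Python sketch. It is not part of the client: the function name `count_eue_per_unit` and both regexes are my own assumptions, inferred only from the log lines above, and other client versions may format things differently. It treats each `CoreStatus` line as the end of one attempt and counts how often a given Project/Run/Clone/Gen ended with status 7B.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt above (assumption: other
# client/core versions may print these lines differently).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def count_eue_per_unit(log_text):
    """Count attempts per (project, run, clone, gen) that ended with CoreStatus 7B."""
    failures = Counter()
    current = None  # work unit seen since the last CoreStatus line
    for line in log_text.splitlines():
        project = PROJECT_RE.search(line)
        if project:
            # The core prints the Project line twice per attempt; overwriting is harmless.
            current = tuple(int(g) for g in project.groups())
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            if int(status.group(1), 16) == 0x7B:
                failures[current] += 1
            current = None  # attempt finished, wait for the next Project line
    return failures
```

Feeding it the text of a log file (for example `count_eue_per_unit(open("FAHlog.txt").read())`) gives a counter keyed by work unit, so a unit that keeps failing shows up with a count above one.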
Project: 2665 (Run 2, Clone 791, Gen 30)
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
I have 3 reports for partial credits ...
If you want to report yours, use qfix: viewtopic.php?f=8&t=191
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
After the third failure, the client downloaded another WU and continued as if nothing had happened. Thanks for the info.
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
I am having the same problem with this unit: repeated EUEs at around 2-3%. I have done a 3065 without problems and am now doing a 2665 (Run 3, Clone 807, Gen 45); hopefully this one works. I was using 6.22 R3.
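To pin down the "2-3%" more precisely, the timestamps in the first post's log can be extrapolated. This is only a back-of-the-envelope sketch: it assumes the step rate stays constant between the last two progress lines and the shutdown, and the function names `seconds_between` and `estimate_failure_step` are made up for illustration.

```python
from datetime import datetime, timedelta


def seconds_between(start, end):
    """Seconds from start to end for HH:MM:SS log stamps, allowing one midnight rollover."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # log crossed midnight
    return delta.total_seconds()


def estimate_failure_step(prev_stamp, prev_step, last_stamp, last_step, fail_stamp):
    """Linearly extrapolate the step reached when the core shut down.

    Assumption: the core keeps the same steps-per-second rate it had
    between the last two "Completed N out of ..." lines.
    """
    rate = (last_step - prev_step) / seconds_between(prev_stamp, last_stamp)
    return last_step + rate * seconds_between(last_stamp, fail_stamp)


# Stamps taken from the first failed attempt in the log in the first post.
estimate = estimate_failure_step("23:14:13", 2500, "23:33:03", 5000, "23:42:43")
print(f"estimated failure near step {estimate:.0f} of 250000")
```

With those stamps the estimate lands a little past step 6280, roughly 2.5% of the run, which fits the reports that the unit dies at the same point every time.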
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
hmm, maybe a bad unit??
thanks guys!
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
nwkelley wrote: "hmm, maybe a bad unit?? thanks guys!"
Yes, probably.
Re: Project: 2665 (Run 2, Clone 791, Gen 30)
Yay, I'm having a fun week here. I got the same WU, and it failed the same way three times:
[16:09:18] *------------------------------*
[16:09:18] Folding@Home Gromacs SMP Core
[16:09:18] Version 1.74 (March 10, 2007)
[16:09:18]
[16:09:18] Preparing to commence simulation
[16:09:18] - Ensuring status. Please wait.
[16:09:23] - Starting from initial work packet
[16:09:23]
[16:09:23] Project: 2665 (Run 2, Clone 791, Gen 30)
[16:09:23]
[16:09:23] Assembly optimizations on if available.
[16:09:23] Entering M.D.
[16:09:43] percent)
[16:09:43] - Starting from initial work packet
[16:09:43]
[16:09:43] Project: 2665 (Run 2, Clone 791, Gen 30)
[16:09:43]
[16:09:45] Entering M.D.
[16:09:52] Rejecting checkpoint
[16:09:53] Protein: HGG with glycosylations
[16:09:53] Writing local files
[16:09:59] Extra SSE boost OK.
[16:10:00] Writing local files
[16:10:00] Completed 0 out of 250000 steps (0 percent)
[16:27:47] Writing local files
[16:27:47] Completed 2500 out of 250000 steps (1 percent)
[16:44:28] Writing local files
[16:44:28] Completed 5000 out of 250000 steps (2 percent)
[16:53:03] delete work/wudata_08.goe
[16:53:03] Warning: check for stray files
[16:53:03] ave done.
[16:53:03] logfile size: 12967
[16:53:03] - Writing 13503 bytes of core data to disk...
[16:53:03] ... Done.
[16:53:03] No C.P. to delete.
[16:53:03] - Failed to delete work/wudata_08.sas
[16:53:03] - Failed to delete work/wudata_08.goe
[16:53:03] Warning: check for stray files
[16:55:03]
[16:55:03] Folding@home Core Shutdown: EARLY_UNIT_END
[16:55:03]
[16:55:03] Folding@home Core Shutdown: EARLY_UNIT_END
[16:55:07] CoreStatus = 7B (123)
[16:55:07] Client-core communications error: ERROR 0x7b
[16:55:07] Deleting current work unit & continuing...
[16:57:10] - Preparing to get new work unit...
[16:57:10] + Attempting to get work packet
[16:57:10] - Connecting to assignment server
[16:57:11] - Successful: assigned to (171.64.65.64).
[16:57:11] + News From Folding@Home: Welcome to Folding@Home
[16:57:11] Loaded queue successfully.
[17:02:49] + Closed connections
[17:02:54]
[17:02:54] + Processing work unit
[17:02:54] Core required: FahCore_a1.exe
[17:02:54] Core found.
[17:02:54] Working on Unit 09 [September 10 17:02:54]
[17:02:54] + Working ...
[17:02:54]