daveb
by daveb » Sun May 18, 2008 3:06 pm
One of my machines has downloaded Project: 2620 (Run 83, Clone 3, Gen 0) several times in a row. Each time, the unit ends with an ERROR 0x79 almost immediately after engaging the core. I do not know if this is related, but I noticed that the data download for this unit was only ~10% of what I normally see on a p2620 (~340 kB vs. >3 MB). I finally deleted the work files from the folder and the computer picked up another unit.
Code:
# Windows Console Edition #####################################################
###############################################################################
Folding@Home Client Version 6.01beta2
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: E:\FAH
Executable: E:\FAH\FAH6.exe
Arguments: -local -verbosity 9
[13:28:26] - Ask before connecting: Yes
[13:28:26] - User name: daveb@xxxxxxxxxxxxxxxxx.com (Team 1971)
[13:28:26] - User ID: 1185E2FC37C99B5C
[13:28:26] - Machine ID: 3
[13:28:26]
[13:28:27] Loaded queue successfully.
[13:28:27] - Preparing to get new work unit...
[13:28:27] - Presenting message box asking to network.
[13:28:27] - Autosending finished units...
[13:28:27] Trying to send all finished work units
[13:28:27] + No unsent completed units remaining.
[13:28:27] - Autosend completed
[13:28:31] + Attempting to get work packet
[13:28:31] - Will indicate memory of 1015 MB
[13:28:31] - Detect CPU. Vendor: ®ˆÈ, Family: 6, Model: 10, Stepping: 8
[13:28:31] - Connecting to assignment server
[13:28:31] Connecting to http://assign.stanford.edu:8080/
[13:28:31] Posted data.
[13:28:31] Initial: 40AB; - Successful: assigned to (171.64.65.65).
[13:28:31] + News From Folding@Home: Welcome to Folding@Home
[13:28:31] Loaded queue successfully.
[13:28:31] Connecting to http://171.64.65.65:8080/
[13:28:32] Posted data.
[13:28:32] Initial: 0000; - Receiving payload (expected size: 345575)
[13:28:33] - Downloaded at ~337 kB/s
[13:28:33] - Averaged speed for that direction ~372 kB/s
[13:28:33] + Received work.
[13:28:35] + Connections closed: You may now disconnect
[13:28:35]
[13:28:35] + Processing work unit
[13:28:35] Core required: FahCore_78.exe
[13:28:35] Core found.
[13:28:35] Working on Unit 09 [May 18 13:28:35]
[13:28:35] + Working ...
[13:28:35] - Calling 'FahCore_78.exe -dir work/ -suffix 09 -checkpoint 15 -verbose -lifeline 3028 -version 601'
[13:28:36]
[13:28:36] *------------------------------*
[13:28:36] Folding@Home Gromacs Core
[13:28:36] Version 1.90 (March 8, 2006)
[13:28:36]
[13:28:36] Preparing to commence simulation
[13:28:36] - Looking at optimizations...
[13:28:36] - Created dyn
[13:28:36] - Files status OK
[13:28:37] - Expanded 345063 -> 1974633 (decompressed 572.2 percent)
[13:28:37] - Starting from initial work packet
[13:28:37]
[13:28:37] Project: 2620 (Run 83, Clone 3, Gen 0)
[13:28:37]
[13:28:38] Assembly optimizations on if available.
[13:28:38] Entering M.D.
[13:28:47] Protein: p2620_p1475_tet1_03_1 t= 20000.00000
[13:28:47]
[13:28:48] Writing local files
[13:28:51] Gromacs error.
[13:28:51]
[13:28:51] Folding@home Core Shutdown: UNKNOWN_ERROR
[13:29:00] CoreStatus = 79 (121)
[13:29:00] Client-core communications error: ERROR 0x79
[13:29:00] Deleting current work unit & continuing...
[13:29:26] Trying to send all finished work units
[13:29:26] + No unsent completed units remaining.
[13:29:26] - Preparing to get new work unit...
[13:29:26] - Presenting message box asking to network.
[13:29:29] + Attempting to get work packet
[13:29:29] - Will indicate memory of 1015 MB
[13:29:29] - Connecting to assignment server
[13:29:29] Connecting to http://assign.stanford.edu:8080/
[13:29:30] Posted data.
[13:29:30] Initial: 40AB; - Successful: assigned to (171.64.65.65).
[13:29:30] + News From Folding@Home: Welcome to Folding@Home
[13:29:30] Loaded queue successfully.
[13:29:30] Connecting to http://171.64.65.65:8080/
[13:29:30] Posted data.
[13:29:31] Initial: 0000; - Receiving payload (expected size: 345575)
[13:29:31] Conversation time very short, giving reduced weight in bandwidth avg
[13:29:31] - Downloaded at ~674 kB/s
[13:29:31] - Averaged speed for that direction ~406 kB/s
[13:29:31] + Received work.
[13:29:34] + Connections closed: You may now disconnect
[13:29:39]
[13:29:39] + Processing work unit
[13:29:39] Core required: FahCore_78.exe
[13:29:39] Core found.
[13:29:39] Working on Unit 00 [May 18 13:29:39]
[13:29:39] + Working ...
[13:29:39] - Calling 'FahCore_78.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 3028 -version 601'
[13:29:39]
[13:29:39] *------------------------------*
[13:29:39] Folding@Home Gromacs Core
[13:29:39] Version 1.90 (March 8, 2006)
[13:29:39]
[13:29:39] Preparing to commence simulation
[13:29:39] - Looking at optimizations...
[13:29:40] - Created dyn
[13:29:40] - Files status OK
[13:29:40] - Expanded 345063 -> 1974633 (decompressed 572.2 percent)
[13:29:40] - Starting from initial work packet
[13:29:42]
[13:29:42] Project: 2620 (Run 83, Clone 3, Gen 0)
[13:29:42]
[13:29:45] Assembly optimizations on if available.
[13:29:45] Entering M.D.
[13:29:54] Protein: p2620_p1475_tet1_03_1 t= 20000.00000
[13:29:54]
[13:29:58] Writing local files
[13:29:58] Gromacs error.
[13:29:58]
[13:29:58] Folding@home Core Shutdown: UNKNOWN_ERROR
[13:30:09] CoreStatus = 79 (121)
[13:30:09] Client-core communications error: ERROR 0x79
[13:30:09] Deleting current work unit & continuing...
[13:30:38] Trying to send all finished work units
[13:30:38] + No unsent completed units remaining.
[13:30:38] - Preparing to get new work unit...
[13:30:38] - Presenting message box asking to network.
[13:30:43] ***** Got a SIGTERM signal (2)
[13:30:43] Killing all core threads
Folding@Home Client Shutdown.
-Dave
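For anyone who wants to check a saved client log for the same pattern (one work unit being reissued and failing each time), here is a minimal sketch. It assumes the log lines look like the ones quoted above; the function name `count_failures` and the `SAMPLE_LOG` excerpt are illustrative only and not part of the Folding@Home client.

```python
import re
from collections import Counter

# Matches lines such as "Project: 2620 (Run 83, Clone 3, Gen 0)".
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

# Short excerpt copied from the log in this thread (two failed attempts).
SAMPLE_LOG = """\
[13:28:37] Project: 2620 (Run 83, Clone 3, Gen 0)
[13:28:48] Writing local files
[13:28:51] Gromacs error.
[13:29:00] CoreStatus = 79 (121)
[13:29:42] Project: 2620 (Run 83, Clone 3, Gen 0)
[13:29:58] Writing local files
[13:29:58] Gromacs error.
[13:30:09] CoreStatus = 79 (121)
"""

def count_failures(log_text):
    """Count "Gromacs error." lines per (project, run, clone, gen).

    An error is attributed to the most recently announced work unit.
    """
    failures = Counter()
    current = None
    for line in log_text.splitlines():
        match = WU_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        if current is not None and "Gromacs error." in line:
            failures[current] += 1
            current = None  # wait for the next work unit announcement
    return failures

if __name__ == "__main__":
    for (project, run, clone, gen), n in count_failures(SAMPLE_LOG).items():
        print(f"Project {project} (Run {run}, Clone {clone}, Gen {gen}): {n} failure(s)")
```

A work unit that shows up with a count of two or more is a reasonable candidate to report here with its Project/Run/Clone/Gen numbers.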
rbrandman
Pande Group Member
Posts: 22 Joined: Wed May 14, 2008 4:11 pm
by rbrandman » Sun May 18, 2008 3:37 pm
Thanks for your post, I've alerted the researcher in charge of this project.
Relly
daveb
by daveb » Fri Jun 20, 2008 3:09 pm
This unit has shown up again today on one of my computers with the same results: an unknown Gromacs error followed by a 0x79 error within seconds of engaging the core. After the error, the same unit downloads again and repeats the process. I eventually deleted the queue and it got a different unit.
Code:
[14:09:57] + Attempting to get work packet
[14:09:57] - Will indicate memory of 1015 MB
[14:09:57] - Detect CPU. Vendor: ®ˆÈ, Family: 6, Model: 10, Stepping: 8
[14:09:57] - Connecting to assignment server
[14:09:57] Connecting to http://assign.stanford.edu:8080/
[14:09:58] Posted data.
[14:09:58] Initial: 40AB; - Successful: assigned to (171.64.65.65).
[14:09:58] + News From Folding@Home: Welcome to Folding@Home
[14:09:58] Loaded queue successfully.
[14:09:58] Connecting to http://171.64.65.65:8080/
[14:09:58] Posted data.
[14:09:58] Initial: 0000; - Receiving payload (expected size: 345575)
[14:09:59] - Downloaded at ~337 kB/s
[14:09:59] - Averaged speed for that direction ~387 kB/s
[14:09:59] + Received work.
[14:10:00] + Connections closed: You may now disconnect
[14:10:00]
[14:10:00] + Processing work unit
[14:10:00] Core required: FahCore_78.exe
[14:10:00] Core found.
[14:10:00] Working on Unit 05 [June 20 14:10:00]
[14:10:00] + Working ...
[14:10:00] - Calling 'FahCore_78.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 264 -version 601'
[14:10:00]
[14:10:00] *------------------------------*
[14:10:00] Folding@Home Gromacs Core
[14:10:00] Version 1.90 (March 8, 2006)
[14:10:00]
[14:10:00] Preparing to commence simulation
[14:10:00] - Looking at optimizations...
[14:10:01] - Created dyn
[14:10:01] - Files status OK
[14:10:01] - Expanded 345063 -> 1974633 (decompressed 572.2 percent)
[14:10:02] - Starting from initial work packet
[14:10:02]
[14:10:02] Project: 2620 (Run 83, Clone 3, Gen 0)
[14:10:02]
[14:10:03] Assembly optimizations on if available.
[14:10:03] Entering M.D.
[14:10:13] Protein: p2620_p1475_tet1_03_1 t= 20000.00000
[14:10:13]
[14:10:14] Writing local files
[14:10:15] Gromacs error.
[14:10:15]
[14:10:15] Folding@home Core Shutdown: UNKNOWN_ERROR
[14:10:22] CoreStatus = 79 (121)
[14:10:22] Client-core communications error: ERROR 0x79
[14:10:22] Deleting current work unit & continuing...
[14:10:47] Trying to send all finished work units
[14:10:47] + No unsent completed units remaining.
[14:10:47] - Preparing to get new work unit...
[14:10:47] - Presenting message box asking to network.
[14:11:04] + Attempting to get work packet
[14:11:04] - Will indicate memory of 1015 MB
[14:11:04] - Connecting to assignment server
[14:11:04] Connecting to http://assign.stanford.edu:8080/
[14:11:04] Posted data.
[14:11:04] Initial: 40AB; - Successful: assigned to (171.64.65.65).
[14:11:04] + News From Folding@Home: Welcome to Folding@Home
[14:11:04] Loaded queue successfully.
[14:11:04] Connecting to http://171.64.65.65:8080/
[14:11:05] Posted data.
[14:11:05] Initial: 0000; - Receiving payload (expected size: 345575)
[14:11:06] - Downloaded at ~337 kB/s
[14:11:06] - Averaged speed for that direction ~377 kB/s
[14:11:06] + Received work.
[14:11:06] + Connections closed: You may now disconnect
[14:11:11]
[14:11:11] + Processing work unit
[14:11:11] Core required: FahCore_78.exe
[14:11:11] Core found.
[14:11:11] Working on Unit 06 [June 20 14:11:11]
[14:11:11] + Working ...
[14:11:11] - Calling 'FahCore_78.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 264 -version 601'
[14:11:12]
[14:11:12] *------------------------------*
[14:11:12] Folding@Home Gromacs Core
[14:11:12] Version 1.90 (March 8, 2006)
[14:11:12]
[14:11:12] Preparing to commence simulation
[14:11:12] - Looking at optimizations...
[14:11:12] - Created dyn
[14:11:12] - Files status OK
[14:11:13] - Expanded 345063 -> 1974633 (decompressed 572.2 percent)
[14:11:13] - Starting from initial work packet
[14:11:14]
[14:11:14] Project: 2620 (Run 83, Clone 3, Gen 0)
[14:11:14]
[14:11:15] Assembly optimizations on if available.
[14:11:15] Entering M.D.
[14:11:24] Protein: p2620_p1475_tet1_03_1 t= 20000.00000
[14:11:24]
[14:11:24] Writing local files
[14:11:25] Gromacs error.
[14:11:25]
[14:11:25] Folding@home Core Shutdown: UNKNOWN_ERROR
[14:11:28] ***** Got a SIGTERM signal (2)
[14:11:28] Killing all core threads
Folding@Home Client Shutdown.
Dave
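The unusually small download mentioned in the first post can also be checked from the log. The sketch below pulls the "expected size" values out of the client output and compares them with a typical size. The 3,000,000-byte figure is only an assumption taken from the ">3 MB" remark earlier in the thread, and the 25% threshold and the function names are arbitrary illustrative choices.

```python
import re

# Matches lines such as "Receiving payload (expected size: 345575)".
PAYLOAD_RE = re.compile(r"Receiving payload \(expected size: (\d+)\)")

def payload_sizes(log_text):
    """Return every expected payload size (in bytes) found in the log."""
    return [int(n) for n in PAYLOAD_RE.findall(log_text)]

def is_suspiciously_small(size_bytes, typical_bytes, threshold=0.25):
    """True if the payload is below `threshold` times the typical size."""
    return size_bytes < threshold * typical_bytes

if __name__ == "__main__":
    excerpt = (
        "[14:09:58] Initial: 0000; - Receiving payload (expected size: 345575)\n"
        "[14:11:05] Initial: 0000; - Receiving payload (expected size: 345575)\n"
    )
    typical = 3_000_000  # assumption: "normal" p2620 download per the first post
    for size in payload_sizes(excerpt):
        print(size, is_suspiciously_small(size, typical))
```

A small payload is not proof of a bad work unit by itself; it is just one more detail worth including in a report.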
rbrandman
Pande Group Member
Posts: 22 Joined: Wed May 14, 2008 4:11 pm
by rbrandman » Fri Jun 20, 2008 3:55 pm
Thanks for letting us know that you got the same WU and errors again. I'll alert the researcher in charge of the project.
Relly
daveb
by daveb » Mon Jun 23, 2008 2:13 am
The same unit showed up again today, with the same results. This time, it was repeatedly reissued to the machine even after I deleted the queue and the work folders.
Dave
nwkelley
Pande Group Member
Posts: 57 Joined: Wed May 14, 2008 9:43 pm
by nwkelley » Mon Jun 23, 2008 7:08 am
hi daveb, you've probably seen my post on the other thread about this particular work unit, but i've let peter know. thanks for keeping a close eye on things
nick
nwkelley
Pande Group Member
Posts: 57 Joined: Wed May 14, 2008 9:43 pm
by nwkelley » Mon Jun 23, 2008 7:09 am
(by the way he's out of town at a meeting, so don't get too frustrated if you get this one again before he decides what to do with it)
nwkelley
Pande Group Member
Posts: 57 Joined: Wed May 14, 2008 9:43 pm
by nwkelley » Mon Jun 23, 2008 11:18 pm
i've removed this work unit, great catch both of you, let me know if somehow it causes any trouble again! (it really shouldn't)
nick