Project: 2671 (R24, C41, G91) AND (R49, C98, G84)

Posted: Sun Sep 06, 2009 8:15 pm
by Oak37
Both work units reported CoreStatus = FF (255) as soon as they started. The client downloaded each unit three times, and every attempt failed the same way.

[17:53:19] Completed 237500 out of 250000 steps  (95%)
[18:00:44] Completed 240000 out of 250000 steps  (96%)
[18:08:09] Completed 242500 out of 250000 steps  (97%)
[18:15:35] Completed 245000 out of 250000 steps  (98%)
[18:22:59] Completed 247500 out of 250000 steps  (99%)
[18:30:23] Completed 250000 out of 250000 steps  (100%)
[18:30:24] DynamicWrapper: Finished Work Unit: sleep=10000
[18:30:34] 
[18:30:34] Finished Work Unit:
[18:30:34] - Reading up to 21167856 from "work/wudata_00.trr": Read 21167856
[18:30:35] trr file hash check passed.
[18:30:35] - Reading up to 27661808 from "work/wudata_00.xtc": Read 27661808
[18:30:35] xtc file hash check passed.
[18:30:35] edr file hash check passed.
[18:30:35] logfile size: 184580
[18:30:35] Leaving Run
[18:30:40] - Writing 49164028 bytes of core data to disk...
[18:30:41]   ... Done.
[18:30:46] - Shutting down core
[18:30:46] 
[18:30:46] Folding@home Core Shutdown: FINISHED_UNIT
[18:33:55] CoreStatus = 64 (100)
[18:33:55] Unit 0 finished with 83 percent of time to deadline remaining.
[18:33:55] Updated performance fraction: 0.823919
[18:33:55] Sending work to server
[18:33:55] Project: 2671 (Run 5, Clone 25, Gen 97)


[18:33:55] + Attempting to send results [September 6 18:33:55 UTC]
[18:33:55] - Reading file work/wuresults_00.dat from core
[18:33:55]   (Read 49164028 bytes from disk)
[18:33:55] Connecting to http://171.67.108.24:8080/
[18:54:30] Posted data.
[18:54:30] Initial: 0000; - Uploaded at ~38 kB/s
[18:54:33] - Averaged speed for that direction ~37 kB/s
[18:54:33] + Results successfully sent
[18:54:33] Thank you for your contribution to Folding@Home.
[18:54:33] + Number of Units Completed: 423

[18:54:35] - Warning: Could not delete all work unit files (0): Core file absent
[18:54:35] Trying to send all finished work units
[18:54:35] + No unsent completed units remaining.
[18:54:35] - Preparing to get new work unit...
[18:54:35] + Attempting to get work packet
[18:54:35] - Will indicate memory of 1988 MB
[18:54:35] - Connecting to assignment server
[18:54:35] Connecting to http://assign.stanford.edu:8080/
[18:54:41] Posted data.
[18:54:41] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:54:41] + News From Folding@Home: Welcome to Folding@Home
[18:54:41] Loaded queue successfully.
[18:54:41] Connecting to http://171.67.108.24:8080/
[18:54:49] Posted data.
[18:54:49] Initial: 0000; - Receiving payload (expected size: 1492620)
[18:55:03] - Downloaded at ~104 kB/s
[18:55:03] - Averaged speed for that direction ~99 kB/s
[18:55:03] + Received work.
[18:55:03] Trying to send all finished work units
[18:55:03] + No unsent completed units remaining.
[18:55:03] + Closed connections
[18:55:03] 
[18:55:03] + Processing work unit
[18:55:03] At least 4 processors must be requested.Core required: FahCore_a2.exe
[18:55:03] Core found.
[18:55:03] Working on queue slot 01 [September 6 18:55:03 UTC]
[18:55:03] + Working ...
[18:55:03] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[18:55:03] 
[18:55:03] *------------------------------*
[18:55:03] Folding@Home Gromacs SMP Core
[18:55:03] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[18:55:03] 
[18:55:03] Preparing to commence simulation
[18:55:03] - Ensuring status. Please wait.
[18:55:13] - Assembly optimizations manually forced on.
[18:55:13] - Not checking prior termination.
[18:55:13] - Expanded 1492108 -> 24057197 (decompressed 1612.2 percent)
[18:55:13] Called DecompressByteArray: compressed_data_size=1492108 data_size=24057197, decompressed_data_size=24057197 diff=0
[18:55:14] - Digital signature verified
[18:55:14] 
[18:55:14] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:55:14] 
[18:55:14] Assembly optimizations on if available.
[18:55:14] Entering M.D.
[18:55:42] Completed 0 out of 250000 steps  (0%)
[18:55:46] CoreStatus = FF (255)
[18:55:46] Sending work to server
[18:55:46] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:55:46] - Error: Could not get length of results file work/wuresults_01.dat
[18:55:46] - Error: Could not read unit 01 file. Removing from queue.
[18:55:46] Trying to send all finished work units
[18:55:46] + No unsent completed units remaining.
[18:55:46] - Preparing to get new work unit...
[18:55:46] + Attempting to get work packet
[18:55:46] - Will indicate memory of 1988 MB
[18:55:46] - Connecting to assignment server
[18:55:46] Connecting to http://assign.stanford.edu:8080/
[18:55:52] Posted data.
[18:55:52] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:55:52] + News From Folding@Home: Welcome to Folding@Home
[18:55:52] Loaded queue successfully.
[18:55:52] Connecting to http://171.67.108.24:8080/
[18:56:00] Posted data.
[18:56:00] Initial: 0000; - Receiving payload (expected size: 1492620)
[18:56:14] - Downloaded at ~104 kB/s
[18:56:14] - Averaged speed for that direction ~100 kB/s
[18:56:14] + Received work.
[18:56:14] Trying to send all finished work units
[18:56:14] + No unsent completed units remaining.
[18:56:14] + Closed connections
[18:56:19] 
[18:56:19] + Processing work unit
[18:56:19] At least 4 processors must be requested.Core required: FahCore_a2.exe
[18:56:19] Core found.
[18:56:19] Working on queue slot 02 [September 6 18:56:19 UTC]
[18:56:19] + Working ...
[18:56:19] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 02 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[18:56:19] 
[18:56:19] *------------------------------*
[18:56:19] Folding@Home Gromacs SMP Core
[18:56:19] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[18:56:19] 
[18:56:19] Preparing to commence simulation
[18:56:19] - Ensuring status. Please wait.
[18:56:29] - Assembly optimizations manually forced on.
[18:56:29] - Not checking prior termination.
[18:56:29] - Expanded 1492108 -> 24057197 (decompressed 1612.2 percent)
[18:56:29] Called DecompressByteArray: compressed_data_size=1492108 data_size=24057197, decompressed_data_size=24057197 diff=0
[18:56:30] - Digital signature verified
[18:56:30] 
[18:56:30] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:56:30] 
[18:56:30] Assembly optimizations on if available.
[18:56:30] Entering M.D.
[18:56:57] Completed 0 out of 250000 steps  (0%)
[18:57:02] CoreStatus = FF (255)
[18:57:02] Sending work to server
[18:57:02] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:57:02] - Error: Could not get length of results file work/wuresults_02.dat
[18:57:02] - Error: Could not read unit 02 file. Removing from queue.
[18:57:02] Trying to send all finished work units
[18:57:02] + No unsent completed units remaining.
[18:57:02] - Preparing to get new work unit...
[18:57:02] + Attempting to get work packet
[18:57:02] - Will indicate memory of 1988 MB
[18:57:02] - Connecting to assignment server
[18:57:02] Connecting to http://assign.stanford.edu:8080/
[18:57:08] Posted data.
[18:57:08] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:57:08] + News From Folding@Home: Welcome to Folding@Home
[18:57:08] Loaded queue successfully.
[18:57:08] Connecting to http://171.67.108.24:8080/
[18:57:16] Posted data.
[18:57:16] Initial: 0000; - Receiving payload (expected size: 1492620)
[18:57:23] - Downloaded at ~208 kB/s
[18:57:23] - Averaged speed for that direction ~121 kB/s
[18:57:23] + Received work.
[18:57:23] Trying to send all finished work units
[18:57:23] + No unsent completed units remaining.
[18:57:23] + Closed connections
[18:57:28] 
[18:57:28] + Processing work unit
[18:57:28] At least 4 processors must be requested.Core required: FahCore_a2.exe
[18:57:28] Core found.
[18:57:28] Working on queue slot 03 [September 6 18:57:28 UTC]
[18:57:28] + Working ...
[18:57:28] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 03 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[18:57:28] 
[18:57:28] *------------------------------*
[18:57:28] Folding@Home Gromacs SMP Core
[18:57:28] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[18:57:28] 
[18:57:28] Preparing to commence simulation
[18:57:28] - Ensuring status. Please wait.
[18:57:38] - Assembly optimizations manually forced on.
[18:57:38] - Not checking prior termination.
[18:57:38] - Expanded 1492108 -> 24057197 (decompressed 1612.2 percent)
[18:57:38] Called DecompressByteArray: compressed_data_size=1492108 data_size=24057197, decompressed_data_size=24057197 diff=0
[18:57:39] - Digital signature verified
[18:57:39] 
[18:57:39] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:57:39] 
[18:57:39] Assembly optimizations on if available.
[18:57:39] Entering M.D.
[18:58:06] Completed 0 out of 250000 steps  (0%)
[18:58:07] 
[18:58:07] Folding@home Core Shutdown: INTERRUPTED
[18:58:11] CoreStatus = FF (255)
[18:58:11] Sending work to server
[18:58:11] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:58:11] - Error: Could not get length of results file work/wuresults_03.dat
[18:58:11] - Error: Could not read unit 03 file. Removing from queue.
[18:58:11] Trying to send all finished work units
[18:58:11] + No unsent completed units remaining.
[18:58:11] - Preparing to get new work unit...
[18:58:11] + Attempting to get work packet
[18:58:11] - Will indicate memory of 1988 MB
[18:58:11] - Connecting to assignment server
[18:58:11] Connecting to http://assign.stanford.edu:8080/
[18:58:17] Posted data.
[18:58:17] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:58:17] + News From Folding@Home: Welcome to Folding@Home
[18:58:17] Loaded queue successfully.
[18:58:17] Connecting to http://171.67.108.24:8080/
[18:58:18] Posted data.
[18:58:18] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[18:58:18] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[18:58:27] + Attempting to get work packet
[18:58:27] - Will indicate memory of 1988 MB
[18:58:27] - Connecting to assignment server
[18:58:27] Connecting to http://assign.stanford.edu:8080/
[18:58:32] Posted data.
[18:58:32] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:58:32] + News From Folding@Home: Welcome to Folding@Home
[18:58:33] Loaded queue successfully.
[18:58:33] Connecting to http://171.67.108.24:8080/
[18:58:38] Posted data.
[18:58:38] Initial: 0000; - Receiving payload (expected size: 1497992)
[18:58:48] - Downloaded at ~146 kB/s
[18:58:48] - Averaged speed for that direction ~126 kB/s
[18:58:48] + Received work.
[18:58:48] Trying to send all finished work units
[18:58:48] + No unsent completed units remaining.
[18:58:48] + Closed connections
[18:58:53] 
[18:58:53] + Processing work unit
[18:58:53] At least 4 processors must be requested.Core required: FahCore_a2.exe
[18:58:53] Core found.
[18:58:53] Working on queue slot 04 [September 6 18:58:53 UTC]
[18:58:53] + Working ...
[18:58:53] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 04 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[18:58:53] 
[18:58:53] *------------------------------*
[18:58:53] Folding@Home Gromacs SMP Core
[18:58:53] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[18:58:53] 
[18:58:53] Preparing to commence simulation
[18:58:53] - Ensuring status. Please wait.
[18:59:02] - Assembly optimizations manually forced on.
[18:59:02] - Not checking prior termination.
[18:59:03] - Expanded 1497480 -> 24036661 (decompressed 1605.1 percent)
[18:59:03] Called DecompressByteArray: compressed_data_size=1497480 data_size=24036661, decompressed_data_size=24036661 diff=0
[18:59:03] - Digital signature verified
[18:59:03] 
[18:59:03] Project: 2671 (Run 49, Clone 98, Gen 84)
[18:59:03] 
[18:59:03] Assembly optimizations on if available.
[18:59:03] Entering M.D.
[18:59:31] Completed 0 out of 250000 steps  (0%)
[18:59:35] CoreStatus = FF (255)
[18:59:35] Sending work to server
[18:59:35] Project: 2671 (Run 49, Clone 98, Gen 84)
[18:59:35] - Error: Could not get length of results file work/wuresults_04.dat
[18:59:35] - Error: Could not read unit 04 file. Removing from queue.
[18:59:35] Trying to send all finished work units
[18:59:35] + No unsent completed units remaining.
[18:59:35] - Preparing to get new work unit...
[18:59:35] + Attempting to get work packet
[18:59:35] - Will indicate memory of 1988 MB
[18:59:35] - Connecting to assignment server
[18:59:35] Connecting to http://assign.stanford.edu:8080/
[18:59:41] Posted data.
[18:59:41] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[18:59:41] + News From Folding@Home: Welcome to Folding@Home
[18:59:41] Loaded queue successfully.
[18:59:41] Connecting to http://171.67.108.24:8080/
[18:59:47] Posted data.
[18:59:47] Initial: 0000; - Receiving payload (expected size: 1497992)
[19:00:06] - Downloaded at ~76 kB/s
[19:00:06] - Averaged speed for that direction ~116 kB/s
[19:00:06] + Received work.
[19:00:06] Trying to send all finished work units
[19:00:06] + No unsent completed units remaining.
[19:00:06] + Closed connections
[19:00:11] 
[19:00:11] + Processing work unit
[19:00:11] At least 4 processors must be requested.Core required: FahCore_a2.exe
[19:00:11] Core found.
[19:00:11] Working on queue slot 05 [September 6 19:00:11 UTC]
[19:00:11] + Working ...
[19:00:11] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 05 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[19:00:12] 
[19:00:12] *------------------------------*
[19:00:12] Folding@Home Gromacs SMP Core
[19:00:12] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[19:00:12] 
[19:00:12] Preparing to commence simulation
[19:00:12] - Ensuring status. Please wait.
[19:00:21] - Assembly optimizations manually forced on.
[19:00:21] - Not checking prior termination.
[19:00:22] - Expanded 1497480 -> 24036661 (decompressed 1605.1 percent)
[19:00:22] Called DecompressByteArray: compressed_data_size=1497480 data_size=24036661, decompressed_data_size=24036661 diff=0
[19:00:22] - Digital signature verified
[19:00:22] 
[19:00:22] Project: 2671 (Run 49, Clone 98, Gen 84)
[19:00:22] 
[19:00:22] Assembly optimizations on if available.
[19:00:22] Entering M.D.
[19:00:50] Completed 0 out of 250000 steps  (0%)
[19:00:54] CoreStatus = FF (255)
[19:00:54] Sending work to server
[19:00:54] Project: 2671 (Run 49, Clone 98, Gen 84)
[19:00:54] - Error: Could not get length of results file work/wuresults_05.dat
[19:00:54] - Error: Could not read unit 05 file. Removing from queue.
[19:00:54] Trying to send all finished work units
[19:00:54] + No unsent completed units remaining.
[19:00:54] - Preparing to get new work unit...
[19:00:54] + Attempting to get work packet
[19:00:54] - Will indicate memory of 1988 MB
[19:00:54] - Connecting to assignment server
[19:00:54] Connecting to http://assign.stanford.edu:8080/
[19:01:00] Posted data.
[19:01:00] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[19:01:00] + News From Folding@Home: Welcome to Folding@Home
[19:01:00] Loaded queue successfully.
[19:01:00] Connecting to http://171.67.108.24:8080/
[19:01:05] Posted data.
[19:01:05] Initial: 0000; - Receiving payload (expected size: 1497992)
[19:01:31] - Downloaded at ~56 kB/s
[19:01:31] - Averaged speed for that direction ~104 kB/s
[19:01:31] + Received work.
[19:01:31] Trying to send all finished work units
[19:01:31] + No unsent completed units remaining.
[19:01:31] + Closed connections
[19:01:36] 
[19:01:36] + Processing work unit
[19:01:36] At least 4 processors must be requested.Core required: FahCore_a2.exe
[19:01:36] Core found.
[19:01:36] Working on queue slot 06 [September 6 19:01:36 UTC]
[19:01:36] + Working ...
[19:01:36] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 06 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[19:01:36] 
[19:01:36] *------------------------------*
[19:01:36] Folding@Home Gromacs SMP Core
[19:01:36] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[19:01:36] 
[19:01:36] Preparing to commence simulation
[19:01:36] - Ensuring status. Please wait.
[19:01:45] - Assembly optimizations manually forced on.
[19:01:45] - Not checking prior termination.
[19:01:46] - Expanded 1497480 -> 24036661 (decompressed 1605.1 percent)
[19:01:46] Called DecompressByteArray: compressed_data_size=1497480 data_size=24036661, decompressed_data_size=24036661 diff=0
[19:01:46] - Digital signature verified
[19:01:46] 
[19:01:46] Project: 2671 (Run 49, Clone 98, Gen 84)
[19:01:46] 
[19:01:46] Assembly optimizations on if available.
[19:01:46] Entering M.D.
[19:02:14] Completed 0 out of 250000 steps  (0%)
[19:02:18] CoreStatus = FF (255)
[19:02:18] Sending work to server
[19:02:18] Project: 2671 (Run 49, Clone 98, Gen 84)
[19:02:18] - Error: Could not get length of results file work/wuresults_06.dat
[19:02:18] - Error: Could not read unit 06 file. Removing from queue.
[19:02:18] Trying to send all finished work units
[19:02:18] + No unsent completed units remaining.
[19:02:18] - Preparing to get new work unit...
[19:02:18] + Attempting to get work packet
[19:02:18] - Will indicate memory of 1988 MB
[19:02:18] - Connecting to assignment server
[19:02:18] Connecting to http://assign.stanford.edu:8080/
[19:02:24] Posted data.
[19:02:24] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[19:02:24] + News From Folding@Home: Welcome to Folding@Home
[19:02:24] Loaded queue successfully.
[19:02:24] Connecting to http://171.67.108.24:8080/
[19:02:25] Posted data.
[19:02:25] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[19:02:25] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[19:02:42] + Attempting to get work packet
[19:02:42] - Will indicate memory of 1988 MB
[19:02:42] - Connecting to assignment server
[19:02:42] Connecting to http://assign.stanford.edu:8080/
[19:02:48] Posted data.
[19:02:48] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[19:02:48] + News From Folding@Home: Welcome to Folding@Home
[19:02:48] Loaded queue successfully.
[19:02:48] Connecting to http://171.67.108.24:8080/
[19:02:55] Posted data.
[19:02:55] Initial: 0000; - Receiving payload (expected size: 4836564)
[19:04:20] - Downloaded at ~55 kB/s
[19:04:20] - Averaged speed for that direction ~94 kB/s
[19:04:20] + Received work.
[19:04:20] Trying to send all finished work units
[19:04:20] + No unsent completed units remaining.
[19:04:20] + Closed connections
[19:04:25] 
[19:04:25] + Processing work unit
[19:04:25] At least 4 processors must be requested.Core required: FahCore_a2.exe
[19:04:25] Core found.
[19:04:25] Working on queue slot 07 [September 6 19:04:25 UTC]
[19:04:25] + Working ...
[19:04:25] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 07 -checkpoint 30 -forceasm -verbose -lifeline 31528 -version 624'

[19:04:25] 
[19:04:25] *------------------------------*
[19:04:25] Folding@Home Gromacs SMP Core
[19:04:25] Version 2.09 (Sun Aug 30 03:43:28 CEST 2009)
[19:04:25] 
[19:04:25] Preparing to commence simulation
[19:04:25] - Ensuring status. Please wait.
[19:04:35] - Assembly optimizations manually forced on.
[19:04:35] - Not checking prior termination.
[19:04:36] - Expanded 4836052 -> 24031821 (decompressed 496.9 percent)
[19:04:36] Called DecompressByteArray: compressed_data_size=4836052 data_size=24031821, decompressed_data_size=24031821 diff=0
[19:04:36] - Digital signature verified
[19:04:36] 
[19:04:36] Project: 2671 (Run 26, Clone 86, Gen 97)
[19:04:36] 
[19:04:36] Assembly optimizations on if available.
[19:04:36] Entering M.D.
[19:04:45] Completed 0 out of 250000 steps  (0%)
[19:12:12] Completed 2500 out of 250000 steps  (1%)
[19:19:37] Completed 5000 out of 250000 steps  (2%)
[19:27:01] Completed 7500 out of 250000 steps  (3%)
[19:34:28] Completed 10000 out of 250000 steps  (4%)
[19:41:56] Completed 12500 out of 250000 steps  (5%)
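For anyone who wants to confirm the pattern without reading the whole log, here is a rough Python sketch that counts CoreStatus codes per Project (Run, Clone, Gen). It is only an illustration: the function name `tally_core_status` and the regular expressions are my own assumptions about the log layout shown above, not part of the client. It attributes each CoreStatus line to the most recent Project line before it, so a unit whose start is cut off from the excerpt is skipped.

```python
import re
from collections import Counter

# Assumed line shapes, taken from the log excerpt above.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-F]+) \((\d+)\)")


def tally_core_status(log_text):
    """Count CoreStatus codes per (project, run, clone, gen).

    Each CoreStatus line is attributed to the most recent Project line
    seen before it; status lines with no earlier Project line are ignored.
    """
    counts = Counter()
    current = None
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = STATUS_RE.search(line)
        if match and current is not None:
            counts[(current, match.group(1))] += 1
    return counts


# A few lines copied from the log above.
SAMPLE = """\
[18:55:14] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:55:46] CoreStatus = FF (255)
[18:55:46] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:56:30] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:57:02] CoreStatus = FF (255)
[18:59:03] Project: 2671 (Run 49, Clone 98, Gen 84)
[18:59:35] CoreStatus = FF (255)
"""

if __name__ == "__main__":
    for (prcg, status), n in sorted(tally_core_status(SAMPLE).items()):
        print(prcg, status, n)
```

Run against the full log, this should show three FF results for each of the two units in the thread title.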

Re: Project: 2671 (R24, C41, G91) AND (R49, C98, G84)

Posted: Mon Sep 07, 2009 1:15 am
by uncle fuzzy
That is the result of a bad WU meeting the 2.10 core. The small compressed data size is another indicator:
compressed_data_size=1492108
(the unit that finally ran in your log shows compressed_data_size=4836052).

The client should keep doing that until it gets a good WU.
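If you want to check a log for this quickly, something like the following Python sketch would do. Treat it as a rough aid, not an official tool: the names `compressed_sizes` and `suspicious` are made up for illustration, and the `SMALL_WU_BYTES` cutoff is purely a guess on my part, chosen only because it separates the sizes seen in this thread. It pairs each compressed_data_size value with the Project line that follows it in the log.

```python
import re

# Assumed line shapes, taken from the log in the first post.
SIZE_RE = re.compile(r"compressed_data_size=(\d+)")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

# Hypothetical cutoff; it is not a documented Folding@home value.
SMALL_WU_BYTES = 2_000_000


def compressed_sizes(log_text):
    """Map (project, run, clone, gen) to its compressed payload size.

    In the client log the DecompressByteArray line comes just before the
    Project line, so each size is attached to the next Project line seen.
    """
    sizes = {}
    pending = None
    for line in log_text.splitlines():
        match = SIZE_RE.search(line)
        if match:
            pending = int(match.group(1))
            continue
        match = PROJECT_RE.search(line)
        if match and pending is not None:
            sizes[tuple(int(g) for g in match.groups())] = pending
            pending = None
    return sizes


def suspicious(sizes, cutoff=SMALL_WU_BYTES):
    """Return the units whose compressed size is below the cutoff."""
    return sorted(unit for unit, size in sizes.items() if size < cutoff)


# Lines copied from the log in the first post.
SAMPLE = """\
[18:55:13] Called DecompressByteArray: compressed_data_size=1492108 data_size=24057197, decompressed_data_size=24057197 diff=0
[18:55:14] Project: 2671 (Run 24, Clone 41, Gen 91)
[18:59:03] Called DecompressByteArray: compressed_data_size=1497480 data_size=24036661, decompressed_data_size=24036661 diff=0
[18:59:03] Project: 2671 (Run 49, Clone 98, Gen 84)
[19:04:36] Called DecompressByteArray: compressed_data_size=4836052 data_size=24031821, decompressed_data_size=24031821 diff=0
[19:04:36] Project: 2671 (Run 26, Clone 86, Gen 97)
"""

if __name__ == "__main__":
    print(suspicious(compressed_sizes(SAMPLE)))
```

A small size is only a hint, so it is worth confirming against the CoreStatus result before reporting a unit as bad.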

Re: Project: 2671 (R24, C41, G91) AND (R49, C98, G84)

Posted: Mon Sep 07, 2009 9:14 am
by Oak37
Thanks uncle fuzzy, I didn't notice the small compressed data size.

Edit: I reported these in the "List of SMP WUs with the '1 core usage' issue" thread.

Re: Project: 2671 (R24, C41, G91) AND (R49, C98, G84)

Posted: Thu Sep 10, 2009 6:57 pm
by jevans64
R24, C41, G91 craps out with core 2.10 also. Got it on a couple of my VM boxes.