Project: 5742 (Run 1, Clone 42, Gen 619)

Moderators: Site Moderators, FAHC Science Team

valleton
Posts: 13
Joined: Tue Mar 24, 2009 5:22 pm
Hardware configuration: Intel E8600
Club3D Radeon HD4870 1GB
Location: Estonia

Project: 5742 (Run 1, Clone 42, Gen 619)

Post by valleton »

Most likely another broken WU.

[02:34:45] Project: 5742 (Run 1, Clone 42, Gen 619)
[02:34:45] 
[02:34:45] Assembly optimizations on if available.
[02:34:45] Entering M.D.
[02:34:51] Tpr hash work/wudata_09.tpr:  1907284785 89372323 585158592 825259943 4244972031
[02:34:52] Working on Protein
[02:34:52] Client config found, loading data.
[02:34:52] Starting GUI Server
[02:34:53] mdrun_gpu returned 
[02:34:53] Nonzero force sum on GPU
[02:34:53] 
[02:34:53] Folding@home Core Shutdown: UNSTABLE_MACHINE
[02:34:57] CoreStatus = 7A (122)
[02:34:57] Sending work to server
[02:34:57] Project: 5742 (Run 1, Clone 42, Gen 619)
[02:34:57] - Read packet limit of 540015616... Set to 524286976.
[02:34:57] - Error: Could not get length of results file work/wuresults_09.dat
[02:34:57] - Error: Could not read unit 09 file. Removing from queue.
[02:34:57] - Preparing to get new work unit...
[02:34:57] + Attempting to get work packet
[02:34:57] - Connecting to assignment server
[02:34:58] - Successful: assigned to (171.64.65.102).
[02:34:58] + News From Folding@Home: Welcome to Folding@Home
[02:34:58] Loaded queue successfully.
[02:35:00] + Closed connections
[02:35:05] 
[02:35:05] + Processing work unit
[02:35:05] Core required: FahCore_11.exe
[02:35:05] Core found.
[02:35:05] Working on queue slot 00 [November 26 02:35:05 UTC]
[02:35:05] + Working ...
[02:35:05] 
[02:35:05] *------------------------------*
[02:35:05] Folding@Home GPU Core - Beta
[02:35:05] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[02:35:05] 
[02:35:05] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[02:35:05] Build host: amoeba
[02:35:05] Board Type: AMD
[02:35:05] Core      : 
[02:35:05] Preparing to commence simulation
[02:35:05] - Looking at optimizations...
[02:35:05] - Created dyn
[02:35:05] - Files status OK
[02:35:05] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[02:35:05] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[02:35:05] - Digital signature verified
[02:35:05] 
[02:35:05] Project: 5742 (Run 1, Clone 42, Gen 619)
[02:35:05] 
[02:35:05] Assembly optimizations on if available.
[02:35:05] Entering M.D.
[02:35:12] Tpr hash work/wudata_00.tpr:  1907284785 89372323 585158592 825259943 4244972031
[02:35:12] Working on Protein
[02:35:12] Client config found, loading data.
[02:35:12] Starting GUI Server
[02:35:14] mdrun_gpu returned 
[02:35:14] Nonzero force sum on GPU
[02:35:14] 
[02:35:14] Folding@home Core Shutdown: UNSTABLE_MACHINE
[02:35:17] CoreStatus = 7A (122)
[02:35:17] Sending work to server
[02:35:17] Project: 5742 (Run 1, Clone 42, Gen 619)
[02:35:17] - Read packet limit of 540015616... Set to 524286976.
[02:35:17] - Error: Could not get length of results file work/wuresults_00.dat
[02:35:17] - Error: Could not read unit 00 file. Removing from queue.
[02:35:17] EUE limit exceeded. Pausing 24 hours.
bapriebe
Posts: 44
Joined: Sun Apr 20, 2008 8:33 am
Hardware configuration: HP xw4600 workstation (4GB)+Q9650+Sapphire Vapor-X HD4890,
HP Z600 workstation (4GB)+2xXEON E5540+Sapphire HD5770,
HP ML350 server (4GB)+2xXEON E5520+Diamond HD3850
Location: Ottawa, Ontario

Re: Project: 5742 (Run 1, Clone 42, Gen 619)

Post by bapriebe »

This WU killed one of my GPU clients in exactly the same way. Five attempts at it gave identical errors, as shown in the log below, and then the client was disabled for 24 hours.

[21:55:40] + Processing work unit
[21:55:40] Core required: FahCore_11.exe
[21:55:40] Core found.
[21:55:40] Working on queue slot 01 [November 26 21:55:40 UTC]
[21:55:40] + Working ...
[21:55:40] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 5480 -version 623'

[21:55:40] 
[21:55:40] *------------------------------*
[21:55:40] Folding@Home GPU Core - Beta
[21:55:40] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[21:55:40] 
[21:55:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:55:40] Build host: amoeba
[21:55:40] Board Type: AMD
[21:55:40] Core      : 
[21:55:40] Preparing to commence simulation
[21:55:40] - Looking at optimizations...
[21:55:40] - Created dyn
[21:55:40] - Files status OK
[21:55:40] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[21:55:40] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[21:55:40] - Digital signature verified
[21:55:40] 
[21:55:40] Project: 5742 (Run 1, Clone 42, Gen 619)
[21:55:40] 
[21:55:40] Assembly optimizations on if available.
[21:55:40] Entering M.D.
[21:55:46] Tpr hash work/wudata_01.tpr:  1907284785 89372323 585158592 825259943 4244972031
[21:55:46] Working on Protein
[21:55:47] Client config found, loading data.
[21:55:47] Starting GUI Server
[21:55:50] mdrun_gpu returned 
[21:55:50] Nonzero force sum on GPU
[21:55:50] 
[21:55:50] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:55:52] CoreStatus = 7A (122)
[21:55:52] Sending work to server
[21:55:52] Project: 5742 (Run 1, Clone 42, Gen 619)
[21:55:52] - Read packet limit of 540015616... Set to 524286976.
[21:55:52] - Error: Could not get length of results file work/wuresults_01.dat
[21:55:52] - Error: Could not read unit 01 file. Removing from queue.
Zagen30
Posts: 823
Joined: Tue Mar 25, 2008 12:45 am
Hardware configuration: Core i7 3770K @3.5 GHz (not folding), 8 GB DDR3 @2133 MHz, 2xGTX 780 @1215 MHz, Windows 7 Pro 64-bit running 7.3.6 w/ 1xSMP, 2xGPU

4P E5-4650 @3.1 GHz, 64 GB DDR3 @1333MHz, Ubuntu Desktop 13.10 64-bit

Project: 5742 (Run 1, Clone 42, Gen 619)

Post by Zagen30 »

I got this one several times, and it failed immediately every time. Other than this, my GPU (3850, running in WinXP Pro 32-bit) has been pretty stable:

[12:30:58] - Preparing to get new work unit...
[12:30:58] + Attempting to get work packet
[12:30:58] - Connecting to assignment server
[12:30:58] - Successful: assigned to (171.64.65.102).
[12:30:58] + News From Folding@Home: Welcome to Folding@Home
[12:30:59] Loaded queue successfully.
[12:31:00] + Closed connections
[12:31:00] 
[12:31:00] + Processing work unit
[12:31:00] Core required: FahCore_11.exe
[12:31:00] Core found.
[12:31:00] Working on queue slot 00 [December 26 12:31:00 UTC]
[12:31:00] + Working ...
[12:31:00] 
[12:31:00] *------------------------------*
[12:31:00] Folding@Home GPU Core - Beta
[12:31:00] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[12:31:00] 
[12:31:00] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[12:31:00] Build host: amoeba
[12:31:00] Board Type: AMD
[12:31:00] Core      : 
[12:31:00] Preparing to commence simulation
[12:31:00] - Looking at optimizations...
[12:31:00] - Created dyn
[12:31:00] - Files status OK
[12:31:00] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[12:31:00] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[12:31:00] - Digital signature verified
[12:31:00] 
[12:31:00] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:31:00] 
[12:31:00] Assembly optimizations on if available.
[12:31:00] Entering M.D.
[12:31:06] Tpr hash work/wudata_00.tpr:  1907284785 89372323 585158592 825259943 4244972031
[12:31:08] Working on Protein
[12:31:08] Client config found, loading data.
[12:31:08] Starting GUI Server
[12:31:17] mdrun_gpu returned 
[12:31:17] Nonzero force sum on GPU
[12:31:17] 
[12:31:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:31:21] CoreStatus = 7A (122)
[12:31:21] Sending work to server
[12:31:21] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:31:21] - Read packet limit of 540015616... Set to 524286976.
[12:31:21] - Error: Could not get length of results file work/wuresults_00.dat
[12:31:21] - Error: Could not read unit 00 file. Removing from queue.
[12:31:21] - Preparing to get new work unit...
[12:31:21] + Attempting to get work packet
[12:31:21] - Connecting to assignment server
[12:31:22] - Successful: assigned to (171.64.65.102).
[12:31:22] + News From Folding@Home: Welcome to Folding@Home
[12:31:22] Loaded queue successfully.
[12:31:23] + Closed connections
[12:31:28] 
[12:31:28] + Processing work unit
[12:31:28] Core required: FahCore_11.exe
[12:31:28] Core found.
[12:31:28] Working on queue slot 01 [December 26 12:31:28 UTC]
[12:31:28] + Working ...
[12:31:29] 
[12:31:29] *------------------------------*
[12:31:29] Folding@Home GPU Core - Beta
[12:31:29] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[12:31:29] 
[12:31:29] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[12:31:29] Build host: amoeba
[12:31:29] Board Type: AMD
[12:31:29] Core      : 
[12:31:29] Preparing to commence simulation
[12:31:29] - Looking at optimizations...
[12:31:29] - Created dyn
[12:31:29] - Files status OK
[12:31:29] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[12:31:29] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[12:31:29] - Digital signature verified
[12:31:29] 
[12:31:29] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:31:29] 
[12:31:29] Assembly optimizations on if available.
[12:31:29] Entering M.D.
[12:31:35] Tpr hash work/wudata_01.tpr:  1907284785 89372323 585158592 825259943 4244972031
[12:31:35] Working on Protein
[12:31:38] Client config found, loading data.
[12:31:38] Starting GUI Server
[12:31:47] mdrun_gpu returned 
[12:31:47] Nonzero force sum on GPU
[12:31:47] 
[12:31:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:31:50] CoreStatus = 7A (122)
[12:31:50] Sending work to server
[12:31:50] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:31:50] - Read packet limit of 540015616... Set to 524286976.
[12:31:50] - Error: Could not get length of results file work/wuresults_01.dat
[12:31:50] - Error: Could not read unit 01 file. Removing from queue.
[12:31:50] - Preparing to get new work unit...
[12:31:50] + Attempting to get work packet
[12:31:50] - Connecting to assignment server
[12:31:51] - Successful: assigned to (171.64.65.102).
[12:31:51] + News From Folding@Home: Welcome to Folding@Home
[12:31:51] Loaded queue successfully.
[12:31:52] + Closed connections
[12:31:57] 
[12:31:57] + Processing work unit
[12:31:57] Core required: FahCore_11.exe
[12:31:57] Core found.
[12:31:57] Working on queue slot 02 [December 26 12:31:57 UTC]
[12:31:57] + Working ...
[12:31:57] 
[12:31:57] *------------------------------*
[12:31:57] Folding@Home GPU Core - Beta
[12:31:57] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[12:31:57] 
[12:31:57] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[12:31:57] Build host: amoeba
[12:31:57] Board Type: AMD
[12:31:57] Core      : 
[12:31:57] Preparing to commence simulation
[12:31:57] - Looking at optimizations...
[12:31:57] - Created dyn
[12:31:57] - Files status OK
[12:31:57] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[12:31:57] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[12:31:57] - Digital signature verified
[12:31:57] 
[12:31:57] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:31:57] 
[12:31:58] Assembly optimizations on if available.
[12:31:58] Entering M.D.
[12:32:04] Tpr hash work/wudata_02.tpr:  1907284785 89372323 585158592 825259943 4244972031
[12:32:04] Working on Protein
[12:32:07] Client config found, loading data.
[12:32:08] Starting GUI Server
[12:32:17] mdrun_gpu returned 
[12:32:17] Nonzero force sum on GPU
[12:32:17] 
[12:32:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:32:19] CoreStatus = 7A (122)
[12:32:19] Sending work to server
[12:32:19] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:32:19] - Read packet limit of 540015616... Set to 524286976.
[12:32:19] - Error: Could not get length of results file work/wuresults_02.dat
[12:32:19] - Error: Could not read unit 02 file. Removing from queue.
[12:32:19] - Preparing to get new work unit...
[12:32:19] + Attempting to get work packet
[12:32:19] - Connecting to assignment server
[12:32:20] - Successful: assigned to (171.64.65.102).
[12:32:20] + News From Folding@Home: Welcome to Folding@Home
[12:32:20] Loaded queue successfully.
[12:32:21] + Closed connections
[12:32:26] 
[12:32:26] + Processing work unit
[12:32:26] Core required: FahCore_11.exe
[12:32:26] Core found.
[12:32:26] Working on queue slot 03 [December 26 12:32:26 UTC]
[12:32:26] + Working ...
[12:32:26] 
[12:32:26] *------------------------------*
[12:32:26] Folding@Home GPU Core - Beta
[12:32:26] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[12:32:26] 
[12:32:26] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[12:32:26] Build host: amoeba
[12:32:26] Board Type: AMD
[12:32:26] Core      : 
[12:32:26] Preparing to commence simulation
[12:32:26] - Looking at optimizations...
[12:32:27] - Created dyn
[12:32:27] - Files status OK
[12:32:27] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[12:32:27] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[12:32:27] - Digital signature verified
[12:32:27] 
[12:32:27] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:32:27] 
[12:32:27] Assembly optimizations on if available.
[12:32:27] Entering M.D.
[12:32:33] Tpr hash work/wudata_03.tpr:  1907284785 89372323 585158592 825259943 4244972031
[12:32:33] Working on Protein
[12:32:34] Client config found, loading data.
[12:32:34] Starting GUI Server
[12:32:43] mdrun_gpu returned 
[12:32:43] Nonzero force sum on GPU
[12:32:43] 
[12:32:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:32:46] CoreStatus = 7A (122)
[12:32:46] Sending work to server
[12:32:46] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:32:46] - Read packet limit of 540015616... Set to 524286976.
[12:32:46] - Error: Could not get length of results file work/wuresults_03.dat
[12:32:46] - Error: Could not read unit 03 file. Removing from queue.
[12:32:46] - Preparing to get new work unit...
[12:32:46] + Attempting to get work packet
[12:32:46] - Connecting to assignment server
[12:32:47] - Successful: assigned to (171.64.65.102).
[12:32:47] + News From Folding@Home: Welcome to Folding@Home
[12:32:47] Loaded queue successfully.
[12:32:48] + Closed connections
[12:32:53] 
[12:32:53] + Processing work unit
[12:32:53] Core required: FahCore_11.exe
[12:32:53] Core found.
[12:32:53] Working on queue slot 04 [December 26 12:32:53 UTC]
[12:32:53] + Working ...
[12:32:54] 
[12:32:54] *------------------------------*
[12:32:54] Folding@Home GPU Core - Beta
[12:32:54] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[12:32:54] 
[12:32:54] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[12:32:54] Build host: amoeba
[12:32:54] Board Type: AMD
[12:32:54] Core      : 
[12:32:54] Preparing to commence simulation
[12:32:54] - Looking at optimizations...
[12:32:54] - Created dyn
[12:32:54] - Files status OK
[12:32:54] - Expanded 70183 -> 360060 (decompressed 513.0 percent)
[12:32:54] Called DecompressByteArray: compressed_data_size=70183 data_size=360060, decompressed_data_size=360060 diff=0
[12:32:54] - Digital signature verified
[12:32:54] 
[12:32:54] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:32:54] 
[12:32:54] Assembly optimizations on if available.
[12:32:54] Entering M.D.
[12:33:00] Tpr hash work/wudata_04.tpr:  1907284785 89372323 585158592 825259943 4244972031
[12:33:00] Working on Protein
[12:33:01] Client config found, loading data.
[12:33:01] Starting GUI Server
[12:33:10] mdrun_gpu returned 
[12:33:10] Nonzero force sum on GPU
[12:33:10] 
[12:33:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:33:13] CoreStatus = 7A (122)
[12:33:13] Sending work to server
[12:33:13] Project: 5742 (Run 1, Clone 42, Gen 619)
[12:33:13] - Read packet limit of 540015616... Set to 524286976.
[12:33:13] - Error: Could not get length of results file work/wuresults_04.dat
[12:33:13] - Error: Could not read unit 04 file. Removing from queue.
[12:33:13] EUE limit exceeded. Pausing 24 hours.
[17:59:55] + Working...
[23:59:55] + Working...
After I restarted the client and it gave this WU another go, it finally got a new WU, which is going much better.
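
For anyone who wants to confirm that a log shows the same WU failing over and over (rather than different WUs failing, which would point at the hardware), the check can be scripted. This is only a rough sketch, assuming the log line format pasted in this thread; the function name `count_unstable_shutdowns` and the two regular expressions are my own illustration and are not part of the Folding@home client.

```python
import re
from collections import Counter

# Patterns are assumptions based on the log lines quoted in this thread.
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def count_unstable_shutdowns(log_lines):
    """Count UNSTABLE_MACHINE core shutdowns per (project, run, clone, gen)."""
    failures = Counter()
    current_wu = None
    for line in log_lines:
        wu = WU_RE.search(line)
        if wu:
            # Remember the most recently announced work unit.
            current_wu = tuple(int(g) for g in wu.groups())
            continue
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown and shutdown.group(1) == "UNSTABLE_MACHINE" and current_wu:
            # Attribute the shutdown to the work unit announced before it.
            failures[current_wu] += 1
    return failures


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python count_eue.py FAHlog.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for wu, n in count_unstable_shutdowns(fh).most_common():
            print("Project %d (Run %d, Clone %d, Gen %d): %d failure(s)" % (*wu, n))
```

If a single Project/Run/Clone/Gen accounts for all of the failures, as it does in the log above, the WU itself is the likely culprit.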
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 5742 (Run 1, Clone 42, Gen 619)

Post by bruce »

I've put this WU on the list to be suspended.