Project 5767 (Run 5- Clone 173- Gen 1013)

Moderators: Site Moderators, FAHC Science Team

ProtectedVoid
Posts: 6
Joined: Sat Mar 14, 2009 1:22 am

Project 5767 (Run 5- Clone 173- Gen 1013)

Post by ProtectedVoid »

This WU has been assigned to me repeatedly and has failed on each of the 5 GPUs it was assigned to, none of which have had problems processing other 576x projects. I'm wondering whether this is just a bad WU. If so, is there a way to prevent it from being reassigned to me?
toTOW
Site Moderator
Posts: 6359
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project 5767 (Run 5- Clone 173- Gen 1013)

Post by toTOW »

Could you post the log from one of the failures?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
ProtectedVoid
Posts: 6
Joined: Sat Mar 14, 2009 1:22 am

Re: Project 5767 (Run 5- Clone 173- Gen 1013)

Post by ProtectedVoid »

This is the log from the last time it happened. My other fahlog-prev.txt files only go back a few days, so I don't have them to post, though the error message is exactly the same. I keep a list of every WU that fails so I have a record to compare against should another one fail. In most cases, when a WU fails, I can process the same WU after stopping and restarting the client. I have had no such luck with this particular unit.

Please keep in mind that this particular WU has failed on several different cards, including my two GTS250s, which process these 576x WUs on a very regular basis without failing. In fact, those two cards have been so remarkably stable that I pay special attention if a WU does fail on them; it is very, very rare. I do not have any temperature problems on any of the cards. They stay between 68°C and 75°C while folding these WUs in an air-conditioned room.

Here's the Log:
#######################################################################################################################

[10:49:01] *------------------------------*
[10:49:01] Folding@Home GPU Core - Beta
[10:49:01] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:49:01] 
[10:49:01] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[10:49:01] Build host: amoeba
[10:49:01] Board Type: Nvidia
[10:49:01] Core      : 
[10:49:01] Preparing to commence simulation
[10:49:01] - Looking at optimizations...
[10:49:01] - Created dyn
[10:49:01] - Files status OK
[10:49:01] - Expanded 46576 -> 252912 (decompressed 543.0 percent)
[10:49:01] Called DecompressByteArray: compressed_data_size=46576 data_size=252912, decompressed_data_size=252912 diff=0
[10:49:01] - Digital signature verified
[10:49:01] 
[10:49:01] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:01] 
[10:49:01] Assembly optimizations on if available.
[10:49:01] Entering M.D.
[10:49:08] Working on Protein
[10:49:08] Client config found, loading data.
[10:49:08] mdrun_gpu returned 
[10:49:08] NANs detected on GPU
[10:49:08] 
[10:49:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:49:11] CoreStatus = 7A (122)
[10:49:11] Sending work to server
[10:49:11] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:11] - Read packet limit of 540015616... Set to 524286976.
[10:49:11] - Error: Could not get length of results file work/wuresults_03.dat
[10:49:11] - Error: Could not read unit 03 file. Removing from queue.
[10:49:11] - Preparing to get new work unit...
[10:49:11] + Attempting to get work packet
[10:49:11] - Connecting to assignment server
[10:49:12] - Successful: assigned to (171.67.108.11).
[10:49:12] + News From Folding@Home: Welcome to Folding@Home
[10:49:12] Loaded queue successfully.
[10:49:13] + Closed connections
[10:49:18] 
[10:49:18] + Processing work unit
[10:49:18] Core required: FahCore_11.exe
[10:49:18] Core found.
[10:49:18] Working on queue slot 04 [September 22 10:49:18 UTC]
[10:49:18] + Working ...
[10:49:18] 
[10:49:18] *------------------------------*
[10:49:18] Folding@Home GPU Core - Beta
[10:49:18] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:49:18] 
[10:49:18] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[10:49:18] Build host: amoeba
[10:49:18] Board Type: Nvidia
[10:49:18] Core      : 
[10:49:18] Preparing to commence simulation
[10:49:18] - Looking at optimizations...
[10:49:18] - Created dyn
[10:49:18] - Files status OK
[10:49:18] - Expanded 46576 -> 252912 (decompressed 543.0 percent)
[10:49:18] Called DecompressByteArray: compressed_data_size=46576 data_size=252912, decompressed_data_size=252912 diff=0
[10:49:18] - Digital signature verified
[10:49:18] 
[10:49:18] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:18] 
[10:49:18] Assembly optimizations on if available.
[10:49:18] Entering M.D.
[10:49:25] Working on Protein
[10:49:25] Client config found, loading data.
[10:49:25] mdrun_gpu returned 
[10:49:25] NANs detected on GPU
[10:49:25] 
[10:49:25] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:49:28] CoreStatus = 7A (122)
[10:49:28] Sending work to server
[10:49:28] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:28] - Read packet limit of 540015616... Set to 524286976.
[10:49:28] - Error: Could not get length of results file work/wuresults_04.dat
[10:49:28] - Error: Could not read unit 04 file. Removing from queue.
[10:49:28] - Preparing to get new work unit...
[10:49:28] + Attempting to get work packet
[10:49:28] - Connecting to assignment server
[10:49:29] - Successful: assigned to (171.67.108.11).
[10:49:29] + News From Folding@Home: Welcome to Folding@Home
[10:49:29] Loaded queue successfully.
[10:49:30] + Closed connections
[10:49:35] 
[10:49:35] + Processing work unit
[10:49:35] Core required: FahCore_11.exe
[10:49:35] Core found.
[10:49:35] Working on queue slot 05 [September 22 10:49:35 UTC]
[10:49:35] + Working ...
[10:49:35] 
[10:49:35] *------------------------------*
[10:49:35] Folding@Home GPU Core - Beta
[10:49:35] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:49:35] 
[10:49:35] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[10:49:35] Build host: amoeba
[10:49:35] Board Type: Nvidia
[10:49:35] Core      : 
[10:49:35] Preparing to commence simulation
[10:49:35] - Looking at optimizations...
[10:49:35] - Created dyn
[10:49:35] - Files status OK
[10:49:35] - Expanded 46576 -> 252912 (decompressed 543.0 percent)
[10:49:35] Called DecompressByteArray: compressed_data_size=46576 data_size=252912, decompressed_data_size=252912 diff=0
[10:49:35] - Digital signature verified
[10:49:35] 
[10:49:35] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:35] 
[10:49:35] Assembly optimizations on if available.
[10:49:35] Entering M.D.
[10:49:42] Working on Protein
[10:49:42] Client config found, loading data.
[10:49:42] mdrun_gpu returned 
[10:49:42] NANs detected on GPU
[10:49:42] 
[10:49:42] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:49:45] CoreStatus = 7A (122)
[10:49:45] Sending work to server
[10:49:45] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:45] - Read packet limit of 540015616... Set to 524286976.
[10:49:45] - Error: Could not get length of results file work/wuresults_05.dat
[10:49:45] - Error: Could not read unit 05 file. Removing from queue.
[10:49:45] - Preparing to get new work unit...
[10:49:45] + Attempting to get work packet
[10:49:45] - Connecting to assignment server
[10:49:46] - Successful: assigned to (171.67.108.11).
[10:49:46] + News From Folding@Home: Welcome to Folding@Home
[10:49:46] Loaded queue successfully.
[10:49:47] + Closed connections
[10:49:52] 
[10:49:52] + Processing work unit
[10:49:52] Core required: FahCore_11.exe
[10:49:52] Core found.
[10:49:52] Working on queue slot 06 [September 22 10:49:52 UTC]
[10:49:52] + Working ...
[10:49:52] 
[10:49:52] *------------------------------*
[10:49:52] Folding@Home GPU Core - Beta
[10:49:52] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:49:52] 
[10:49:52] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[10:49:52] Build host: amoeba
[10:49:52] Board Type: Nvidia
[10:49:52] Core      : 
[10:49:52] Preparing to commence simulation
[10:49:52] - Looking at optimizations...
[10:49:52] - Created dyn
[10:49:52] - Files status OK
[10:49:52] - Expanded 46576 -> 252912 (decompressed 543.0 percent)
[10:49:52] Called DecompressByteArray: compressed_data_size=46576 data_size=252912, decompressed_data_size=252912 diff=0
[10:49:52] - Digital signature verified
[10:49:52] 
[10:49:52] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:49:52] 
[10:49:52] Assembly optimizations on if available.
[10:49:52] Entering M.D.
[10:49:59] Working on Protein
[10:49:59] Client config found, loading data.
[10:49:59] mdrun_gpu returned 
[10:49:59] NANs detected on GPU
[10:49:59] 
[10:49:59] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:50:02] CoreStatus = 7A (122)
[10:50:02] Sending work to server
[10:50:02] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:50:02] - Read packet limit of 540015616... Set to 524286976.
[10:50:02] - Error: Could not get length of results file work/wuresults_06.dat
[10:50:02] - Error: Could not read unit 06 file. Removing from queue.
[10:50:02] - Preparing to get new work unit...
[10:50:02] + Attempting to get work packet
[10:50:02] - Connecting to assignment server
[10:50:03] - Successful: assigned to (171.67.108.11).
[10:50:03] + News From Folding@Home: Welcome to Folding@Home
[10:50:03] Loaded queue successfully.
[10:50:04] + Closed connections
[10:50:09] 
[10:50:09] + Processing work unit
[10:50:09] Core required: FahCore_11.exe
[10:50:09] Core found.
[10:50:09] Working on queue slot 07 [September 22 10:50:09 UTC]
[10:50:09] + Working ...
[10:50:09] 
[10:50:09] *------------------------------*
[10:50:09] Folding@Home GPU Core - Beta
[10:50:09] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:50:09] 
[10:50:09] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[10:50:09] Build host: amoeba
[10:50:09] Board Type: Nvidia
[10:50:09] Core      : 
[10:50:09] Preparing to commence simulation
[10:50:09] - Looking at optimizations...
[10:50:09] - Created dyn
[10:50:09] - Files status OK
[10:50:09] - Expanded 46576 -> 252912 (decompressed 543.0 percent)
[10:50:09] Called DecompressByteArray: compressed_data_size=46576 data_size=252912, decompressed_data_size=252912 diff=0
[10:50:09] - Digital signature verified
[10:50:09] 
[10:50:09] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:50:09] 
[10:50:09] Assembly optimizations on if available.
[10:50:09] Entering M.D.
[10:50:16] Working on Protein
[10:50:16] Client config found, loading data.
[10:50:16] mdrun_gpu returned 
[10:50:16] NANs detected on GPU
[10:50:16] 
[10:50:16] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:50:19] CoreStatus = 7A (122)
[10:50:19] Sending work to server
[10:50:19] Project: 5767 (Run 5, Clone 173, Gen 1013)
[10:50:19] - Read packet limit of 540015616... Set to 524286976.
[10:50:19] - Error: Could not get length of results file work/wuresults_07.dat
[10:50:19] - Error: Could not read unit 07 file. Removing from queue.
[10:50:19] EUE limit exceeded. Pausing 24 hours.
[16:39:33] + Working...

Folding@Home Client Shutdown.
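
For anyone who wants to keep a similar record of failing WUs without reading the log by hand, here is a minimal Python sketch that tallies core shutdown reasons per work unit. It assumes only the two line formats visible in the log above (the "Project: ... (Run, Clone, Gen)" line and the "Core Shutdown: ..." line); the function name `count_shutdowns` and the regular expressions are illustrative and not part of the Folding@Home client.

```python
import re
from collections import Counter

# Patterns based only on the line formats seen in the log above.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def count_shutdowns(log_lines):
    """Count core shutdown reasons per (project, run, clone, gen).

    Each shutdown line is attributed to the most recent "Project:" line
    that preceded it. Shutdowns seen before any project line are skipped.
    """
    counts = Counter()
    current_wu = None
    for line in log_lines:
        project = PROJECT_RE.search(line)
        if project:
            current_wu = tuple(int(g) for g in project.groups())
            continue
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown and current_wu is not None:
            counts[(current_wu, shutdown.group(1))] += 1
    return counts


if __name__ == "__main__":
    # A short excerpt in the same format as the log above.
    sample = [
        "[10:49:01] Project: 5767 (Run 5, Clone 173, Gen 1013)",
        "[10:49:08] Folding@home Core Shutdown: UNSTABLE_MACHINE",
        "[10:49:11] Project: 5767 (Run 5, Clone 173, Gen 1013)",
        "[10:49:18] Project: 5767 (Run 5, Clone 173, Gen 1013)",
        "[10:49:25] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    ]
    for (wu, reason), n in count_shutdowns(sample).items():
        print(wu, reason, n)
```

Running the same function over the lines of a saved log file would show at a glance whether one Run/Clone/Gen keeps coming back with the same shutdown reason, which is the pattern described in this thread.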