Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Thu Jul 30, 2009 7:05 pm
by bapriebe
Five times in a row I got a "SHAKE violations on GPU" error precisely 3 seconds after the GUI server started up. (Some say this can be caused by overheating. The 4890's fan is set to 100% and temperatures don't seem to rise above 66°C.) After I restarted the client, it went straight into another WU from project 5732 without incident.
[18:59:07] Folding@Home GPU Core - Beta
[18:59:07] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[18:59:07]
[18:59:07] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[18:59:07] Build host: amoeba
[18:59:07] Board Type: AMD
[18:59:07] Core :
[18:59:07] Preparing to commence simulation
[18:59:07] - Looking at optimizations...
[18:59:07] - Created dyn
[18:59:07] - Files status OK
[18:59:07] - Expanded 68495 -> 357580 (decompressed 522.0 percent)
[18:59:07] Called DecompressByteArray: compressed_data_size=68495 data_size=357580, decompressed_data_size=357580 diff=0
[18:59:07] - Digital signature verified
[18:59:07]
[18:59:07] Project: 5745 (Run 2, Clone 34, Gen 337)
[18:59:07]
[18:59:07] Assembly optimizations on if available.
[18:59:07] Entering M.D.
[18:59:13] Tpr hash work/wudata_03.tpr: 3292426534 3612594659 715314425 3296096959 2941335996
[18:59:13] Working on Protein
[18:59:14] Client config found, loading data.
[18:59:14] Starting GUI Server
[18:59:17] mdrun_gpu returned
[18:59:17] SHAKE violations on GPU
[18:59:17]
[18:59:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[18:59:19] CoreStatus = 7A (122)
[18:59:19] Sending work to server
[18:59:19] Project: 5745 (Run 2, Clone 34, Gen 337)
[18:59:19] - Read packet limit of 540015616... Set to 524286976.
[18:59:19] - Error: Could not get length of results file work/wuresults_03.dat
[18:59:19] - Error: Could not read unit 03 file. Removing from queue.
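For anyone who wants to check the timing in their own log rather than eyeball it, here is a rough sketch in Python. It assumes the log uses the `[HH:MM:SS] message` layout shown above; the function name `seconds_to_shake_failure` is made up for this example and is not part of any Folding@Home tool.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as "[18:59:14] Starting GUI Server".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")

def seconds_to_shake_failure(log_text):
    """Return the delay in seconds between each 'Starting GUI Server'
    line and the next 'SHAKE violations on GPU' line (hypothetical helper)."""
    delays = []
    gui_start = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        msg = m.group(2).strip()
        if msg == "Starting GUI Server":
            gui_start = stamp
        elif msg == "SHAKE violations on GPU" and gui_start is not None:
            delta = stamp - gui_start
            if delta < timedelta(0):
                # The log carries no date, so assume a single midnight rollover.
                delta += timedelta(days=1)
            delays.append(int(delta.total_seconds()))
            gui_start = None
    return delays

# Three lines copied from the log above.
sample = """\
[18:59:14] Starting GUI Server
[18:59:17] mdrun_gpu returned
[18:59:17] SHAKE violations on GPU
"""
print(seconds_to_shake_failure(sample))  # prints [3]
```

The timestamps only have one-second resolution, so a reported 3 seconds could be anywhere between roughly 2 and 4 seconds of wall time.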
Re: Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Sat Aug 01, 2009 12:11 am
by valleton
Same here with a 4870 1GB, but it's definitely not a temperature issue in my case.
EDIT: Vista SP1, Catalyst 9.5 (and no VPU recoveries on Vista)
[21:46:31] Folding@Home GPU Core - Beta
[21:46:31] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[21:46:31]
[21:46:31] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:46:31] Build host: amoeba
[21:46:31] Board Type: AMD
[21:46:31] Core :
[21:46:31] Preparing to commence simulation
[21:46:31] - Looking at optimizations...
[21:46:31] - Created dyn
[21:46:31] - Files status OK
[21:46:31] - Expanded 68495 -> 357580 (decompressed 522.0 percent)
[21:46:31] Called DecompressByteArray: compressed_data_size=68495 data_size=357580, decompressed_data_size=357580 diff=0
[21:46:31] - Digital signature verified
[21:46:31]
[21:46:31] Project: 5745 (Run 2, Clone 34, Gen 337)
[21:46:31]
[21:46:31] Assembly optimizations on if available.
[21:46:31] Entering M.D.
[21:46:37] Tpr hash work/wudata_04.tpr: 3292426534 3612594659 715314425 3296096959 2941335996
[21:46:37] Working on Protein
[21:46:37] Client config found, loading data.
[21:46:37] Starting GUI Server
[21:46:39] mdrun_gpu returned
[21:46:39] SHAKE violations on GPU
[21:46:39]
[21:46:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:46:43] CoreStatus = 7A (122)
[21:46:43] Sending work to server
[21:46:43] Project: 5745 (Run 2, Clone 34, Gen 337)
[21:46:43] - Read packet limit of 540015616... Set to 524286976.
[21:46:43] - Error: Could not get length of results file work/wuresults_04.dat
[21:46:43] - Error: Could not read unit 04 file. Removing from queue.
[21:46:43] - Preparing to get new work unit...
[21:46:43] + Attempting to get work packet
[21:46:43] - Connecting to assignment server
[21:46:44] - Successful: assigned to (171.64.65.102).
[21:46:44] + News From Folding@Home: Welcome to Folding@Home
[21:46:44] Loaded queue successfully.
[21:46:46] + Closed connections
[21:46:51]
[21:46:51] + Processing work unit
[21:46:51] Core required: FahCore_11.exe
[21:46:51] Core found.
[21:46:51] Working on queue slot 05 [July 31 21:46:51 UTC]
[21:46:51] + Working ...
[21:46:51]
[21:46:51] *------------------------------*
[21:46:51] Folding@Home GPU Core - Beta
[21:46:51] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[21:46:51]
[21:46:51] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:46:51] Build host: amoeba
[21:46:51] Board Type: AMD
[21:46:51] Core :
[21:46:51] Preparing to commence simulation
[21:46:51] - Looking at optimizations...
[21:46:51] - Created dyn
[21:46:51] - Files status OK
[21:46:51] - Expanded 68495 -> 357580 (decompressed 522.0 percent)
[21:46:51] Called DecompressByteArray: compressed_data_size=68495 data_size=357580, decompressed_data_size=357580 diff=0
[21:46:51] - Digital signature verified
[21:46:51]
[21:46:51] Project: 5745 (Run 2, Clone 34, Gen 337)
[21:46:51]
[21:46:51] Assembly optimizations on if available.
[21:46:51] Entering M.D.
[21:46:57] Tpr hash work/wudata_05.tpr: 3292426534 3612594659 715314425 3296096959 2941335996
[21:46:57] Working on Protein
[21:46:57] Client config found, loading data.
[21:46:57] Starting GUI Server
[21:46:59] mdrun_gpu returned
[21:46:59] SHAKE violations on GPU
[21:46:59]
[21:46:59] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:47:03] CoreStatus = 7A (122)
[21:47:03] Sending work to server
[21:47:03] Project: 5745 (Run 2, Clone 34, Gen 337)
[21:47:03] - Read packet limit of 540015616... Set to 524286976.
[21:47:03] - Error: Could not get length of results file work/wuresults_05.dat
[21:47:03] - Error: Could not read unit 05 file. Removing from queue.
[21:47:03] EUE limit exceeded. Pausing 24 hours.
Re: Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Sat Aug 01, 2009 3:23 am
by LIVESTRONG
I've seen a couple of threads about 48xx-series cards running into this issue. Have you overclocked the cards?
You can get an unstable machine with high overclocks even when temperatures are low and stable (below 80°C).
If clock speed is not the issue, perhaps check whether your voltages are adequate compared to other 48xx cards.
Re: Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Sat Aug 01, 2009 6:01 am
by susato
Thanks for the reports, folks. Similar early failures on otherwise stable rigs tend to point to the WU, rather than the hardware, as the problem.
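One way to make that reasoning concrete is to count how many different machines failed on the same Project/Run/Clone/Gen. The sketch below is only an illustration: it assumes each log prints a `Project: ... (Run ..., Clone ..., Gen ...)` line before the failure, as the logs in this thread do, and the function name and the idea of keying logs by poster name are made up for the example.

```python
import re
from collections import defaultdict

# Matches "Project: 5745 (Run 2, Clone 34, Gen 337)".
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

def failures_by_work_unit(logs_by_host):
    """Map each (project, run, clone, gen) to the set of hosts whose
    log shows a SHAKE failure on that work unit (hypothetical helper)."""
    hosts = defaultdict(set)
    for host, log_text in logs_by_host.items():
        current = None  # most recent work unit named in this log
        for line in log_text.splitlines():
            m = PRCG_RE.search(line)
            if m:
                current = tuple(int(g) for g in m.groups())
            elif "SHAKE violations on GPU" in line and current is not None:
                hosts[current].add(host)
    return dict(hosts)

# Lines copied from two of the logs posted in this thread.
logs = {
    "bapriebe": "[18:59:07] Project: 5745 (Run 2, Clone 34, Gen 337)\n"
                "[18:59:17] SHAKE violations on GPU\n",
    "valleton": "[21:46:31] Project: 5745 (Run 2, Clone 34, Gen 337)\n"
                "[21:46:39] SHAKE violations on GPU\n",
}
for wu, failed_hosts in failures_by_work_unit(logs).items():
    print(wu, sorted(failed_hosts))
```

A work unit that fails on several unrelated machines is more suspicious than one that fails repeatedly on a single machine, though a count like this is a hint and not proof.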
Re: Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Fri Aug 07, 2009 3:46 pm
by DrSpalding
You can add my machine to the pile on this WU as well. Same error: SHAKE violations on GPU.
Running Vista 64-bit with an ATI/AMD 48xx of some sort. It's definitely not overclocked, and the room temperature is 65°F this morning, so I don't suspect any overheating issues. I've never had any stability issues so far with this fairly new rig.
Re: Project: 5745 (Run 2, Clone 34, Gen 337)
Posted: Thu Aug 20, 2009 4:03 am
by vladh4x0r
Just got this work unit (P5745,R2,C34,G337), and had the same experience as others: "SHAKE violations on GPU" 2-3 seconds after starting. This is on a secondary 4830 that has crunched hundreds of other WUs so far with few complaints.