Rash of bad WU all 13000/13001
Moderators: Site Moderators, FAHC Science Team
Re: Bad WU: 13001 (Run 532, Clone 3, Gen 4)
As to the CPU threads: there is very little return from CPU folding compared with the GPU, and then there is the fan noise, etc.
As to the GPU OC: none. I can't see overclocking a GPU and then running it 24/7. This is my one and only rig, so no super-duper juice.
Re: Rash of bad WU all 13000/13001
The first time (I can't recall which WU), the WU stuck at 99.9% and I had to reboot.
I then deleted the Work folder and it picked up another very short WU (9xxx); this ran to completion.
It then picked up another 13000 WU, which locked up the GPU.
I rebooted and let the WU reload. Again today the GPU locked up.
I had to hard boot yet again and the WU reloaded at 80%.
I have yet to complete any 13000/13001 WU.
Currently I have all work paused, as there is no use in just running to less than 100%. Plus, I am not here when the GPU locks up, so I am not sure what may be happening. I do run GPU-Z. The CPU is water cooled but, as noted, not doing anything.
I will be glad to download the 14.4 driver if you think that will get me back to folding.
- 
				PantherX
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
 Windows 10 64-bit
 CPU:2/3/4/6 -> Intel i7-6700K
 GPU:1 -> Nvidia GTX 1080 Ti
 §
 Retired:
 2x Nvidia GTX 1070
 Nvidia GTX 675M
 Nvidia GTX 660 Ti
 Nvidia GTX 650 SC
 Nvidia GTX 260 896 MB SOC
 Nvidia 9600GT 1 GB OC
 Nvidia 9500M GS
 Nvidia 8800GTS 320 MB
 Intel Core i7-860
 Intel Core i7-3840QM
 Intel i3-3240
 Intel Core 2 Duo E8200
 Intel Core 2 Duo E6550
 Intel Core 2 Duo T8300
 Intel Pentium E5500
 Intel Pentium E5400
- Location: Land Of The Long White Cloud
- Contact:
Re: Bad WU: 13000 (Run 952, Clone 0, Gen 22)
Welcome to the F@H Forum pinetor,
Please note that if you don't want to fold on the CPU, simply remove the CPU Slot by following these instructions:
1) Open up Advanced Control (AKA FAHControl)
2) Click Configure
3) Select the Slots Tab
4) Select the appropriate Slot
5) Click Remove
6) Click Save
Thus, you will only have a single GPU Slot which you can fold on.
Moreover, please refrain from manually deleting the work folder since it delays the progress of science.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
- 
				PantherX
- Site Moderator
Re: Rash of bad WU all 13000/13001
pinetor wrote:
The first time (I can't recall which WU), the WU stuck at 99.9% and I had to reboot...

In the majority of cases, this is caused by the driver being reloaded by the OS. Can you search the Windows Event Log for messages related to driver reloading? Also, while you have stated that the GPU isn't overclocked, is it factory overclocked by chance? If so, it is possible that the factory overclock is unstable, so "down-clock" to the AMD stock frequencies.

pinetor wrote:
...It then picked up another 13000 WU, which locked up the GPU.
I rebooted and let the WU reload. Again today the GPU locked up.
I had to hard boot yet again and the WU reloaded at 80%.
I have yet to complete any 13000/13001 WU.
Currently I have all work paused, as there is no use in just running to less than 100%. Plus, I am not here when the GPU locks up, so I am not sure what may be happening. I do run GPU-Z. The CPU is water cooled but, as noted, not doing anything...

Could you please explain what "locked up the GPU" means? If you mean that the cursor moves very slowly across the screen, this is called screen lag. Screen lag can be caused by a combination of drivers and GPU model (among other factors). The reason is that the GPU processes work in a First In, First Out (FIFO) manner and, unlike the CPU, has no priority/scheduling system to manage tasks. If you encounter screen lag, the best solution is to configure the GPU to fold only when the system is idle (http://folding.stanford.edu/home/faq/fa ... ion/#ntoc3).

pinetor wrote:
...I will be glad to download the 14.4 driver if you think that will get me back to folding.

Donors have reported that the 14.4 WHQL driver improves folding performance over the 13.X WHQL driver series. Thus, you can try it out.
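As a rough illustration of the Event Log search suggested above, here is a minimal Python sketch that filters already-exported event records for entries that look like a display driver reset. The record keys (`source`, `event_id`, `message`) and the sample rows are hypothetical, and the event ID 4101 / source "Display" pairing is an assumption based on what Windows commonly logs for a driver timeout and recovery; check it against your own log.

```python
from typing import Dict, Iterable, List

# Assumption: event ID 4101 from source "Display" is the usual Windows record
# for a display driver timeout and recovery. Verify this on your own system.
DRIVER_RESET_EVENT_ID = "4101"
DRIVER_RESET_SOURCE = "display"


def find_driver_resets(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the records that look like display driver resets.

    Each record is a dict with the hypothetical keys "source", "event_id"
    and "message", e.g. rows loaded from an exported System log.
    """
    hits = []
    for rec in records:
        source = rec.get("source", "").strip().lower()
        event_id = rec.get("event_id", "").strip()
        message = rec.get("message", "").lower()
        # Match either the assumed ID/source pair or the wording of the message.
        is_known_id = source == DRIVER_RESET_SOURCE and event_id == DRIVER_RESET_EVENT_ID
        mentions_reset = "display driver" in message and "stopped responding" in message
        if is_known_id or mentions_reset:
            hits.append(rec)
    return hits


# Illustrative rows only; these are not taken from pinetor's system.
sample_rows = [
    {"source": "Display", "event_id": "4101",
     "message": "Display driver stopped responding and has successfully recovered."},
    {"source": "Service Control Manager", "event_id": "7036",
     "message": "A service entered the running state."},
]

for row in find_driver_resets(sample_rows):
    print(row["event_id"], row["message"])
```

If this turns up entries whose timestamps line up with the failed WUs, that would support the driver-reload explanation. An empty result does not rule it out, since a hard lock-up may leave nothing in the log.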
Re: Rash of bad WU all 13000/13001
Thanks for all the assistance.
I have updated Catalyst (the complete package) to the latest version.
I have also manually overridden the GPU fan speed to 55%. Generally it never gets above 40%, even after long hours at 98% load.
What I mean by GPU lock-up is that the screen shows a cyan "checkerboard pattern" and the entire system is non-responsive. I have seen a GPU go bad (during mining) and the situation seems similar, thus I conclude the GPU is locked up. Given that the CPU is doing very little (13% load) and memory use is below 3 GB (2.75 GB) out of 8 GB, I don't think those subsystems are to blame. I was able to fold about 560k worth of points before any problems popped up (I know that's not impressive, but it represents at least two weeks' worth of error-free folding).
Re: Bad WU: 13000 (Run 952, Clone 0, Gen 22)
Thank you for the welcome!
Deleting the work folder certainly would not be my first thought, but at the time the WU was stuck at 99% and my GPU (the slot assigned to it) was at 0% load. I let it sit this way for several hours (while browsing the forums). After a few more reboots, I gave up and deleted the folder. This DID get me a non-13000 WU, which then did run to completion. However, the next WU and all since then (on the GPU) have been 13000/13001 WUs.
Re: Bad WU: 13001 (Run 486, Clone 5, Gen 15)
P5-133XL wrote:
Hi pinetor (team 224497),
Your WU (P13001 R486 C5 G15) was added to the stats database on 2014-06-04 22:03:33 for 13869.6 points of credit.

Partial credit, in spite of the error. Presumably the WU has been reassigned to see if somebody else can complete it, but we'll have to wait until they have time to process it before we can report the final status.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Bad WU: 13000 (Run 952, Clone 0, Gen 22)
Having the GPU at 0% while the WU appears to be stuck at 99% has been reported many times. There are several possible causes, especially overclocking, overheating, MS Remote Desktop, or a Sleep state, all of which can reset the GPU, thereby stopping all progress. Unfortunately, the estimated progress continues to increase from whatever point the reset happened until it reaches 99% in the GUI, although the log stops reporting progress.
If you discover that the progress indications in the log stop and become unsynchronized with the GUI, you can manually recover by doing a Pause, followed by a Fold.
You'll need to rule out the possible causes until you find the one that is making the OS reset the GPU, and then prevent it from triggering another reset.
I see you have reported several WUs with the message "Bad State detected... attempting to resume from last good checkpoint". Other people are not having that problem, although we'll have to wait to confirm that others have successfully completed the same WUs. I have a strong hunch that those messages are an indication of the same problem as what I've called an OS-initiated GPU Reset. There's a good chance that FAH puts heavier computational demands on the GPU than SHA64, so GPUs which appeared to be stable are now demonstrably unstable.
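The check described above (log progress stops while the GUI keeps counting) can be sketched in Python. This is only a sketch: the log line shape in the comment is an assumption about how the client writes progress lines, the function names are made up for illustration, and the one-hour threshold is arbitrary.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed progress line shape (verify against your own log):
#   "20:03:33:WU00:FS01:0x17:Completed 250000 out of 2500000 steps (10%)"
PROGRESS_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):.*Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)
BAD_STATE_TEXT = "Bad State detected"  # message text quoted in this thread


def summarize_log(lines: Iterable[str]) -> Tuple[Optional[Tuple[int, int]], int]:
    """Return ((seconds_since_midnight, percent) of the last progress line, bad-state count)."""
    last_progress = None
    bad_states = 0
    for line in lines:
        if BAD_STATE_TEXT in line:
            bad_states += 1
        match = PROGRESS_RE.match(line)
        if match:
            hours, minutes, seconds, percent = (int(g) for g in match.groups())
            last_progress = (hours * 3600 + minutes * 60 + seconds, percent)
    return last_progress, bad_states


def looks_stalled(last_progress: Optional[Tuple[int, int]],
                  now_seconds: int,
                  max_gap_seconds: int = 3600) -> bool:
    """True if the last logged progress is older than max_gap_seconds.

    now_seconds must use the same clock as the log timestamps.
    The modulo handles a log that runs past midnight.
    """
    if last_progress is None:
        return False
    elapsed = (now_seconds - last_progress[0]) % 86400
    return elapsed > max_gap_seconds


# Illustrative lines only, not taken from a real log.
sample = [
    "20:03:33:WU00:FS01:0x17:Completed 250000 out of 2500000 steps (10%)",
    "20:10:00:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint",
]
progress, bad = summarize_log(sample)
print(progress, bad, looks_stalled(progress, now_seconds=progress[0] + 7200))
```

If `looks_stalled` is true while the GUI still shows the percentage climbing, that matches the situation described above, and a Pause followed by a Fold is the manual recovery.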
Re: Rash of bad WU all 13000/13001
Project 13000/13001 puts heavier demands on your system than many other applications.  FAH is very good at uncovering systems which have appeared to be stable until now but which, in fact, are marginal under high load.  
See also the answer I provided in one of your other topics.
viewtopic.php?f=19&t=26436&p=265705#p265705
- 
				PantherX
- Site Moderator
Re: Rash of bad WU all 13000/13001
pinetor wrote:
...What I mean by GPU lock-up is that the screen shows a cyan "checkerboard pattern" and the entire system is non-responsive. I have seen a GPU go bad (during mining) and the situation seems similar, thus I conclude the GPU is locked up. Given that the CPU is doing very little (13% load) and memory use is below 3 GB (2.75 GB) out of 8 GB, I don't think those subsystems are to blame. I was able to fold about 560k worth of points before any problems popped up (I know that's not impressive, but it represents at least two weeks' worth of error-free folding).

It sounds like a VRAM issue. Maybe your GPU is failing or encountering some serious hardware issue. As a test, can you run some GPU benchmarks (http://www.techpowerup.com/downloads/Benchmarking/) and see if you spot any visual artifacts (http://www.playtool.com/pages/artifacts/artifacts.html) or if the system locks up or crashes?
Re: Bad WU: 13000 (Run 952, Clone 0, Gen 22)
Bruce (bruce)
Again, thanks for your patience with me. I suspect you are correct that 13000 projects tax the GPU more than any other task set to it. I actually have this WU back and it has been running since I posted all this last night... and we have made it through the day!!! I am at 78% completion. The GPU is loaded at 98 to 99% and holding at 50 to 51C (according to GPU-Z). Either the new drivers or the fan at a constant 55% seems to be doing the trick.
I can rule out:
any OC (CPU, RAM, or GPU)
Remote Desktop
Sleep
However, I do have lots of the typical processes that might decide to break things: Synapse updates, Java, AMD, QNAP.
PS: I never did SHA64; too late to get into BTC. I did Scrypt, running up to 3 GPUs, but I sold them all except the lowly 7850.
Still, I think the load is the issue: either the driver could not handle it or there was just not enough cooling. I can bump the GPU fans up to 60%, but it's just too noisy.
Thanks again!
Re: Rash of bad WU all 13000/13001
and stop my folding???? (grins)
So far so good today... fingers crossed (new drivers/manual fan control).
- 
				7im
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Bad WU: 13001 (Run 532, Clone 3, Gen 4)
Depending on the model of your HD 7800, a lot of GPUs come factory overclocked these days.  It's easy to miss unless looking for it.
							How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: Bad WU: 13000 (Run 952, Clone 0, Gen 22)
You've reported several WUs that crashed in separate topics (which makes sense if the problem is associated with a specific WU). You have also opened a general topic about multiple p13000/13001 WUs. I'm going to merge them into a single topic, on the theory that they're all caused by the same sort of problem with your GPU hardware. The posts will be in chronological order, so they may appear to be intermixed, but at least all the answers that might be applicable will be in one place.
Re: Rash of bad WU all 13000/13001
Another folder was able to complete the WU successfully:
Hi xxxx (team xxxx),
Your WU (P13001 R486 C5 G15) was added to the stats database on 2014-06-24 04:03:47 for 17123 points of credit.
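For a sense of scale, the two credit figures quoted in this thread for the same WU (13869.6 points for the run that errored, 17123 for the completed run) can be compared with a few lines of Python. This is plain arithmetic on the quoted numbers; the thread does not say how partial credit is calculated, so the ratio is only indicative, and the function name is made up for illustration.

```python
def credit_fraction(partial_points: float, full_points: float) -> float:
    """Return partial credit as a fraction of the full credit."""
    if full_points <= 0:
        raise ValueError("full_points must be positive")
    return partial_points / full_points


# Figures quoted in this thread for P13001 R486 C5 G15.
fraction = credit_fraction(13869.6, 17123.0)
print(f"{fraction:.1%}")  # prints 81.0%
```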