9401 - Bad state detected

Moderators: Site Moderators, FAHC Science Team

Jim Saunders
Posts: 45
Joined: Fri Jan 03, 2014 4:53 am
Hardware configuration: A: i5 + 2 GTX 660
B: i5 + 2 GTX 670
C: i7 + GTX670

9401 - Bad state detected

Post by Jim Saunders »

Hi, I didn't see anything else in the thread on it while the project was in beta, but I got this:

Code:

02:01:41:WU00:FS01:Connecting to assign-GPU.stanford.edu:80
02:01:53:WU00:FS01:News: Welcome to Folding@Home
02:01:53:WU00:FS01:Assigned to work server 171.67.108.31
02:01:53:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:"GF116 [GeForce GT 610]" from 171.67.108.31
02:01:53:WU00:FS01:Connecting to 171.67.108.31:8080
02:01:57:WU00:FS01:Downloading 4.32MiB
02:03:07:WU00:FS01:Download complete
02:03:07:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9401 run:87 clone:0 gen:9 core:0x17 unit:0x0000000e6652edaf52eae313afb39c24
02:03:07:WU00:FS01:Downloading project 9401 description
02:03:07:WU00:FS01:Connecting to fah-web.stanford.edu:80
02:03:10:WU00:FS01:Project 9401 description downloaded successfully
02:06:37:WU00:FS01:Starting
02:06:37:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Jim/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 702 -lifeline 5804 -checkpoint 30 -gpu 0
02:06:37:WU00:FS01:Started FahCore on PID 6072
02:06:37:WU00:FS01:Core PID:6484
02:06:37:WU00:FS01:FahCore 0x17 started
02:06:37:WU00:FS01:0x17:*********************** Log Started 2014-02-12T02:06:37Z ***********************
02:06:37:WU00:FS01:0x17:Project: 9401 (Run 87, Clone 0, Gen 9)
02:06:37:WU00:FS01:0x17:Unit: 0x0000000e6652edaf52eae313afb39c24
02:06:37:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
02:06:37:WU00:FS01:0x17:Machine: 1
02:06:37:WU00:FS01:0x17:Reading tar file state.xml
02:06:38:WU00:FS01:0x17:Reading tar file system.xml
02:06:39:WU00:FS01:0x17:Reading tar file integrator.xml
02:06:39:WU00:FS01:0x17:Reading tar file core.xml
02:06:39:WU00:FS01:0x17:Digital signatures verified
02:06:39:WU00:FS01:0x17:Folding@home GPU core17
02:06:39:WU00:FS01:0x17:Version 0.0.52
02:10:49:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
02:10:49:WU00:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
02:27:37:WU00:FS01:0x17:Completed 50000 out of 5000000 steps (1%)
03:02:09:WU00:FS01:0x17:Completed 100000 out of 5000000 steps (2%)
03:02:09:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
03:35:19:WU00:FS01:0x17:Completed 150000 out of 5000000 steps (3%)
******************************** Date: 12/02/14 ********************************
03:52:06:WU00:FS01:0x17:Completed 200000 out of 5000000 steps (4%)
04:09:00:WU00:FS01:0x17:Completed 250000 out of 5000000 steps (5%)

This is a GTX 580 on Win7 at stock clocks, with SMP running on 7 cores of an i7-950. V7 indicates 33K PPD, HFM 22K; I'm not concerned about the score, but I wanted to pass this along in case it indicates something. As near as I can tell, the slot carries on as normal. I've never seen a log entry like it for any of the other projects (8018 and 8900 in recent memory), and another unit on a different GPU has run to 17% without a similar report.

Jim
Good science and heat for my basement you say?
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: 9401 - Bad state detected

Post by PantherX »

IIRC, this is a feature built into FahCore_17 which attempts to resolve some NaNs(?) before giving up on the WU. I encountered this issue once on my GPUs, and the WU completed successfully and was credited. If you can successfully fold the WU and upload it, you should be credited for it.
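
To illustrate the general idea of resuming from the last good checkpoint, here is a minimal Python sketch. It is not FahCore_17's actual code: the names run_with_checkpoints, MaxRetriesReached and make_flaky_step are made up for illustration, and the limit of 3 is only an assumption based on the three "Bad State detected" lines seen before the abort in the log further down this thread.

```python
import math


class MaxRetriesReached(Exception):
    """Hypothetical error raised once too many bad states have been seen."""


def run_with_checkpoints(step_fn, state, total_steps, check_every, max_retries=3):
    """Advance `state` step by step, rolling back whenever it turns into NaN.

    Assumption: a "bad state" is modelled as a NaN value, and the run aborts
    on the `max_retries`-th bad state. The real core may behave differently.
    """
    checkpoint = (0, state)  # (step index, state) of the last good checkpoint
    bad_states = 0
    step = 0
    while step < total_steps:
        state = step_fn(state, step)
        step += 1
        if step % check_every == 0 or step == total_steps:
            if math.isnan(state):
                bad_states += 1
                if bad_states >= max_retries:
                    raise MaxRetriesReached(f"{bad_states} bad states detected")
                step, state = checkpoint  # resume from the last good checkpoint
            else:
                checkpoint = (step, state)  # state is sane, save it
    return state, bad_states


def make_flaky_step(fail_at):
    """Build a toy step function that yields NaN once at each step in `fail_at`."""
    pending = set(fail_at)

    def step_fn(state, step):
        if step in pending:
            pending.discard(step)  # transient fault: only happens the first time
            return float("nan")
        return state + 1.0

    return step_fn


if __name__ == "__main__":
    final_state, bad = run_with_checkpoints(make_flaky_step({5}), 0.0, 20, 10)
    print(final_state, bad)
```

A transient fault (for example a one-time glitch) is recovered by the rollback, while a fault that recurs at the same point every time exhausts the retries and ends the run, which is roughly the difference between a flaky card and a genuinely bad WU.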
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Jim Saunders
Posts: 45
Joined: Fri Jan 03, 2014 4:53 am
Hardware configuration: A: i5 + 2 GTX 660
B: i5 + 2 GTX 670
C: i7 + GTX670

Re: 9401 - Bad state detected

Post by Jim Saunders »

Thanks, I figured it was something like this. If it goes sideways I'll report, but otherwise I see no reason to worry about it.

Jim
Good science and heat for my basement you say?
Jim Saunders
Posts: 45
Joined: Fri Jan 03, 2014 4:53 am
Hardware configuration: A: i5 + 2 GTX 660
B: i5 + 2 GTX 670
C: i7 + GTX670

Re: 9401 - Bad state detected

Post by Jim Saunders »

But then this happened:

Code:

02:10:49:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
02:10:49:WU00:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
02:27:37:WU00:FS01:0x17:Completed 50000 out of 5000000 steps (1%)
03:02:09:WU00:FS01:0x17:Completed 100000 out of 5000000 steps (2%)
03:02:09:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
03:35:19:WU00:FS01:0x17:Completed 150000 out of 5000000 steps (3%)
******************************** Date: 12/02/14 ********************************
03:52:06:WU00:FS01:0x17:Completed 200000 out of 5000000 steps (4%)
04:09:00:WU00:FS01:0x17:Completed 250000 out of 5000000 steps (5%)
04:38:36:WU00:FS01:0x17:Completed 300000 out of 5000000 steps (6%)
04:38:36:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
05:12:40:WU00:FS01:0x17:Completed 350000 out of 5000000 steps (7%)
05:12:40:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
05:12:41:WU00:FS01:0x17:Max number of retries reached. Aborting.
05:12:41:WU00:FS01:0x17:ERROR:exception: Max Retries Reached
05:12:41:WU00:FS01:0x17:Saving result file logfile_01.txt
05:12:41:WU00:FS01:0x17:Saving result file log.txt
05:12:41:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
05:12:41:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:12:41:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9401 run:87 clone:0 gen:9 core:0x17 unit:0x0000000e6652edaf52eae313afb39c24
05:12:41:WU00:FS01:Uploading 2.64KiB to 171.67.108.31
05:12:41:WU00:FS01:Connecting to 171.67.108.31:8080
05:12:41:WU01:FS01:Connecting to assign-GPU.stanford.edu:80
05:12:42:WU00:FS01:Upload complete
05:12:42:WU00:FS01:Server responded WORK_ACK (400)
05:12:42:WU00:FS01:Cleaning up

I'm not going to worry about it until it becomes a pattern; the other slot from above is up to 20% with no indications of the same problem.
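
For anyone who wants to check whether this is becoming a pattern, here is a rough Python sketch that counts "Bad State detected" lines per work unit and slot in a FAHClient log and flags the ones that hit the retry limit. The line format and message text are taken from the excerpt above; the function name summarize_bad_states and the returned dictionary layout are hypothetical, and other client versions may log differently.

```python
import re
from collections import Counter

# Matches lines such as "05:12:40:WU00:FS01:0x17:Bad State detected..."
# and "05:12:41:WARNING:WU00:FS01:FahCore returned: ...".
LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d):(?:WARNING:)?(WU\d+):(FS\d+):(.*)$")


def summarize_bad_states(log_text):
    """Return {(wu, slot): {"bad_states": n, "aborted": bool}} for a log excerpt."""
    bad = Counter()
    aborted = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # date banners and other lines without a WU/slot prefix
        _time, wu, slot, message = match.groups()
        key = (wu, slot)
        if "Bad State detected" in message:
            bad[key] += 1
        elif "Max number of retries reached" in message:
            aborted.add(key)
    return {key: {"bad_states": n, "aborted": key in aborted} for key, n in bad.items()}


SAMPLE = """\
03:02:09:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
04:38:36:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
05:12:40:WU00:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
05:12:41:WU00:FS01:0x17:Max number of retries reached. Aborting.
05:12:41:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

if __name__ == "__main__":
    for (wu, slot), info in summarize_bad_states(SAMPLE).items():
        print(wu, slot, info)
```

To run it against a real log, read the file into a string and pass that in place of SAMPLE.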

Jim
Good science and heat for my basement you say?
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 9401 - Bad state detected

Post by bruce »

Molecular simulation of folding involves a degree of randomness depending on the temperature of the sample being simulated. Unfortunately, that means that once in a while what we call a "bad WU" is issued, though nobody knows it's bad until somebody runs it. We hope that the "bad WUs" are weeded out during beta testing, but that is never certain since it's a probabilistic process.

The same WU was reassigned and somebody else encountered an error, too, so don't worry about trying to fix your system.
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: 9401 - Bad state detected

Post by Rel25917 »

Is it stock stock clocks or factory overclocked stock clocks? This can be a sign of a bit too much overclock. Core 17 is sensitive to memory OC. I had to reduce the memory speed on my EVGA Superclocked Titan to get rid of that error (but I'm running +15 MHz on core over the superclock speed). If you keep seeing it every now and then, you may need to try tweaking the memory speed.
Jim Saunders
Posts: 45
Joined: Fri Jan 03, 2014 4:53 am
Hardware configuration: A: i5 + 2 GTX 660
B: i5 + 2 GTX 670
C: i7 + GTX670

Re: 9401 - Bad state detected

Post by Jim Saunders »

So far it's been a one-off incident, on a unit I haven't seen much before; HFM didn't keep it in the log and I don't have any more running. If it happens again I'll pass it up, but I have no reason to think anything is wrong on my end. Off the top of my head, the card isn't one of the factory overclocked ones.

Jim
Good science and heat for my basement you say?
Ripshod
Posts: 19
Joined: Fri Nov 27, 2009 2:27 pm

Re: 9401 - Bad state detected

Post by Ripshod »

I'm struggling to get a stable machine with these 9401s. 8900s were absolutely fine with everything, including my overclocks. With 9401 it's just constant crashes. I'm now on a fresh install with zero modification (crikey it's slooooow) and the 13.12 WHQL drivers. Will report back here if I still have problems.

Nothing in the logs nor in the event viewer.
Last edited by Ripshod on Sun Feb 16, 2014 9:40 am, edited 1 time in total.
1090T / HD5770 / HD7950
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: 9401 - Bad state detected

Post by 7im »

Stability issues always mean too much overclock. Do you have this problem at OEM stock speeds?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Ripshod
Posts: 19
Joined: Fri Nov 27, 2009 2:27 pm

Re: 9401 - Bad state detected

Post by Ripshod »

Funnily enough yh, at stock everything and a fresh install.

Got it sorted. For some reason uninstalling 'CCC' and all the other stuff works. With just the basic drivers installed, everything is good again. Overclocks are fine now too.

Gotta say I didn't see that one coming!!
1090T / HD5770 / HD7950
Jim Saunders
Posts: 45
Joined: Fri Jan 03, 2014 4:53 am
Hardware configuration: A: i5 + 2 GTX 660
B: i5 + 2 GTX 670
C: i7 + GTX670

Re: 9401 - Bad state detected

Post by Jim Saunders »

As a postscript this GPU has demonstrated instability on P8900 WUs also; any criticism of P9401 should be considered in that context.

Jim
Good science and heat for my basement you say?