This may not be the WU fault- 66xx's [NVidia-8400GS]

Moderators: Site Moderators, FAHC Science Team

Meetinghouse
Posts: 9
Joined: Mon Mar 02, 2009 7:56 pm
Location: Massachusetts, USA

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Meetinghouse »

I have had UNSTABLE MACHINE messages on every 66xx WU my machine has ever tried to process going back to whenever they first went up for processing. I don't have a log file at the moment, but the queue info shows 2 in the last 10 attempts.

6606 (run 3, clone 714, gen 11) on 11/29/10
6600 (run 1, clone 152, gen 344) on 11/29/10

My GPU is an NVIDIA 9400GT, not an 8400GS, but it isn't just one person having the trouble.

All these failures are definitely affecting my scores, especially when the client shuts down for 24 hours.

I was told when I first inquired many months ago that my fan speed might need to be increased. My technician says it's already at max. I have had problems with another series too, but can't say for sure which one. I've been just quitting the process and restarting it to see if I can get a WU that won't fail.
Meetinghouse

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Meetinghouse »

Here's the log info I posted last March.

Projects 6600, 6601, 6604 UNSTABLE_MACHINE

Post by Meetinghouse » Tue Mar 16, 2010 3:52 am
I've been getting the types of errors shown in the log data at the end here, on some projects. I rebooted my machine this evening when I saw the errors again from today, so I only have current sessions in my log. Before I did that I saw that at least one project this morning completed without error, but this is the second time in recent days I've noticed the situation where it says it will pause for 24 hours for exceeding EUE limits.

I'm a bit overwhelmed by the technical discussions about these EUE errors so forgive my questions if they have been asked and answered elsewhere. I have no problem with my graphics card for other things and as I said, some work units seem to process okay. I've been running the GPU for at least a year without any problems, so I'm at a loss for what's happening all of a sudden. It appears that no work is getting processed and each WU is failing immediately. The specific WUs in my log from tonight are:
Project: 6600 (Run 0, Clone 841, Gen 1)
Project: 6601 (Run 0, Clone 509, Gen 5)
Project: 6601 (Run 1, Clone 853, Gen 1)
Project: 6604 (Run 0, Clone 707, Gen 7)
Project: 6604 (Run 0, Clone 858, Gen 1)
All of these failed with the same error and status codes.

In my queue status I can also see projects from earlier today, before I cold-started my machine:
Project: 6606 (Run 4, Clone 850, Gen 1)
Project: 6606 (Run 1, Clone 851, Gen 1)
Project: 6606 (Run 3, Clone 857, Gen 0)
Project: 6604 (Run 5, Clone 792, Gen 7)
Project: 6604 (Run 0, Clone 792, Gen 7)

My machine runs Windows XP SP3.
The graphics card is an NVIDIA GeForce 9400GT.
The CPU is an Intel Pentium Dual CPU E2180, 2.00 GHz.

Here is the last bit of the current log file. If there is more info needed and you can give me directions on how to get the info, I would be happy to provide it, but I don't know a lot about using some of these system checking programs I see mentioned. Thanks.

[06:57:44] + Attempting to send results [March 16 06:57:44 UTC]
[06:57:45] + Results successfully sent
[06:57:45] Thank you for your contribution to Folding@Home.
[06:57:49] - Preparing to get new work unit...
[06:57:49] + Attempting to get work packet
[06:57:49] - Connecting to assignment server
[06:57:50] - Successful: assigned to (171.64.65.61).
[06:57:50] + News From Folding@Home: Welcome to Folding@Home
[06:57:50] Loaded queue successfully.
[06:57:51] + Closed connections
[06:57:56]
[06:57:56] + Processing work unit
[06:57:56] Core required: FahCore_11.exe
[06:57:56] Core found.
[06:57:56] Working on queue slot 08 [March 16 06:57:56 UTC]
[06:57:56] + Working ...
[06:57:57]
[06:57:57] *------------------------------*
[06:57:57] Folding@Home GPU Core
[06:57:57] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[06:57:57]
[06:57:57] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[06:57:57] Build host: amoeba
[06:57:57] Board Type: Nvidia
[06:57:57] Core :
[06:57:57] Preparing to commence simulation
[06:57:57] - Looking at optimizations...
[06:57:57] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[06:57:57] - Created dyn
[06:57:57] - Files status OK
[06:57:57] - Expanded 73784 -> 383588 (decompressed 519.8 percent)
[06:57:57] Called DecompressByteArray: compressed_data_size=73784 data_size=383588, decompressed_data_size=383588 diff=0
[06:57:57] - Digital signature verified
[06:57:57]
[06:57:57] Project: 6604 (Run 0, Clone 858, Gen 1)
[06:57:57]
[06:57:57] Assembly optimizations on if available.
[06:57:57] Entering M.D.
[06:58:03] Tpr hash work/wudata_08.tpr: 352014151 1134679828 2100455267 517859258 4060641076
[06:58:03]
[06:58:03] Calling fah_main args: 14 usage=100
[06:58:03]
[06:58:03] Working on Protein
[06:58:04] mdrun_gpu returned
[06:58:04] Going to send back what have done -- stepsTotalG=0
[06:58:04] Work fraction=0.0000 steps=0.
[06:58:08] logfile size=9151 infoLength=9151 edr=0 trr=25
[06:58:08] + Opened results file
[06:58:08] - Writing 9689 bytes of core data to disk...
[06:58:08] Done: 9177 -> 3340 (compressed to 36.3 percent)
[06:58:08] ... Done.
[06:58:08] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[06:58:08]
[06:58:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[06:58:11] CoreStatus = 7A (122)
[06:58:11] Sending work to server
[06:58:11] Project: 6604 (Run 0, Clone 858, Gen 1)


[06:58:11] + Attempting to send results [March 16 06:58:11 UTC]
[06:58:11] + Results successfully sent
[06:58:11] Thank you for your contribution to Folding@Home.
[06:58:15] EUE limit exceeded. Pausing 24 hours.
[07:30:03] Opening ...
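
For anyone who wants to check a longer log for the same pattern, here is a minimal Python sketch (not part of the F@H client) that pairs each UNSTABLE_MACHINE shutdown with the most recent "Project:" line and counts the 24-hour pauses. The function name `summarize_log` and the regular expressions are assumptions written against the line formats in the log above; other client versions may log differently.

```python
import re

# Patterns are assumptions based on the log lines quoted in this post.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def summarize_log(lines):
    """Return (failures, pauses) for an iterable of log lines.

    failures: list of (project, run, clone, gen) tuples, one per
              UNSTABLE_MACHINE shutdown, taken from the last Project line seen.
    pauses:   number of "EUE limit exceeded" lines.
    """
    current = None
    failures = []
    pauses = 0
    for line in lines:
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match and match.group(1) == "UNSTABLE_MACHINE" and current is not None:
            failures.append(current)
        if "EUE limit exceeded" in line:
            pauses += 1
    return failures, pauses


if __name__ == "__main__":
    sample = [
        "[06:57:57] Project: 6604 (Run 0, Clone 858, Gen 1)",
        "[06:58:08] Folding@home Core Shutdown: UNSTABLE_MACHINE",
        "[06:58:11] CoreStatus = 7A (122)",
        "[06:58:11] Project: 6604 (Run 0, Clone 858, Gen 1)",
        "[06:58:15] EUE limit exceeded. Pausing 24 hours.",
    ]
    print(summarize_log(sample))
```

To run it on a real log, pass an open file object instead of the sample list; the file name and location depend on your install.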
new08
Posts: 188
Joined: Fri Jan 04, 2008 11:02 pm
Hardware configuration: Hewlett-Packard 1494 Win10 Build 1836
GeForce [MSI] GTX 950
Runs F@H Ver7.6.21
[As of Jan 2021]
Location: England

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

MH: A few tips, but I can't advise in depth on your setup.
Try the -oneunit flag in config, to stop endless loops while the 66xx units are about in force.
[A client restart resets the bar on downloads, but gets only one unit with this flag.]
That's if you can't block the relevant servers in your firewall.
NB: Not approved behaviour, generally, but it stops units bouncing back.

Try the 6.40r1 GPU client.
Try the flag -force nvidia_gpu80 , at the correct number for your card.

These helped for me; you will have to research a bit for your card type etc. It's a bug and not quite right yet, as I got a 66xx unit not long back that still bounced. Down to a minimum occurrence now, though.
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Nathan_P »

new08 wrote: MH: A few tips, but I can't advise in depth on your setup.
Try the -oneunit flag in config, to stop endless loops while the 66xx units are about in force.
[A client restart resets the bar on downloads, but gets only one unit with this flag.]
That's if you can't block the relevant servers in your firewall.
NB: Not approved behaviour, generally, but it stops units bouncing back.

Try the 6.40r1 GPU client.
Try the flag -force nvidia_gpu80 , at the correct number for your card.

These helped for me; you will have to research a bit for your card type etc. It's a bug and not quite right yet, as I got a 66xx unit not long back that still bounced. Down to a minimum occurrence now, though.
Some good tips there. A quick look on Wikipedia lists your 9400 GT in the same vein as an 8400 GS: a 16-shader card at the low end of the 9xxx range. As new08 says, use the latest client and see how you get on.
Meetinghouse

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Meetinghouse »

Thanks for these tips. I found GPU client v6.23 on the download page, but no 6.40. The current WU will finish overnight and then the client will pause; after that I'll download the new GPU client. My log shows GPU core v1.31, apparently not very current. One thing at a time; if I need to, I'll try the flag. I hate to firewall IP addresses. I have settings on my router for PS3 ports etc. and don't want to mess that up.
bretth603
Posts: 19
Joined: Wed Nov 18, 2009 4:32 pm

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bretth603 »

@new08,
new08 wrote: Strangely, Brett, on your query: after a goodish run, I got just what I described earlier.

This shows a 66xx unit going back instantly -and a 105xx coming in immediately.
This is certainly better than a while back, when I could wait seeing 30 hours of non runners.
Without knowing what the servers have waiting now, or what may have been done to the client to help, that's all I can report on the download situation.
The 8400 GS is now doing a steady 500+ ppd -and that's a ~10% boost to the GT240 results, of 4500ppd.
It's also possible you just happened to get randomly assigned "runners" each time until this one. The WU mix changes from time to time on the servers, and at 500ppd there is a long time between WUs. From what you've described, the servers are still assigning you WUs the 8400 GS cannot handle. IMHO this means it has not improved. If the servers were truly making assignments based on your hardware capabilities, then you shouldn't receive ANY 66xx or 101xx WUs.


@Meetinghouse,

So far the only proven cure for this affliction is to block the work servers that issue 66xx and 101xx WUs, which for the moment are 171.64.65.61 and 171.64.65.71. Unfortunately this also means you might go several days without a WU. If you're OK with your GPU crashing, then by all means keep on with what you're doing, because the 24-hour sleep might be more productive. For me a GPU crash sometimes causes my screen to go all funny and I have to reboot, forcing me to stop whatever else I'm doing.
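
Before blocking anything, it may help to confirm from your own log which work server handed out which projects. This is a minimal Python sketch, assuming the "assigned to (...)" and "Project:" line formats seen in the log posted earlier in this thread; `projects_by_server` and `SERVERS_NAMED_ABOVE` are hypothetical names, and the two addresses are simply the ones quoted above, which may change.

```python
import re

# Patterns are assumptions based on the log excerpt earlier in this thread.
ASSIGNED_RE = re.compile(r"assigned to \((\d{1,3}(?:\.\d{1,3}){3})\)")
PROJECT_RE = re.compile(r"Project: (\d+) \(")

# The two work servers named in the paragraph above (subject to change).
SERVERS_NAMED_ABOVE = {"171.64.65.61", "171.64.65.71"}


def projects_by_server(lines):
    """Map each assigning server IP to the set of project numbers seen after it."""
    server = None
    seen = {}
    for line in lines:
        match = ASSIGNED_RE.search(line)
        if match:
            server = match.group(1)
            continue
        match = PROJECT_RE.search(line)
        if match and server is not None:
            seen.setdefault(server, set()).add(int(match.group(1)))
    return seen


if __name__ == "__main__":
    sample = [
        "[06:57:50] - Successful: assigned to (171.64.65.61).",
        "[06:57:57] Project: 6604 (Run 0, Clone 858, Gen 1)",
    ]
    for ip, projects in projects_by_server(sample).items():
        note = "named above" if ip in SERVERS_NAMED_ABOVE else "not named above"
        print(ip, sorted(projects), note)
```

This only reports what the log already says; it does not block anything or change the client's behaviour.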

The 6.40r1 was a drop-in executable that was announced somewhere on the message boards (I forget where). That's why you didn't find it on the website. As far as I can tell it so far has no effect on this problem. Maybe that will change if PG someday changes the Work Server code to avoid assigning 66xx and 101xx WUs to our hardware.

@Everyone

As has been suggested, it's possible PG is just not expecting fewer than a certain number of shaders. That would explain why we have seen the exact same issue on the 8400 GS, ION, and 9400 GT. The number of shaders does not show up in "GPU Type", so AFAIK there is no way for PG to change the work servers just for us with the current version of the client.
Meetinghouse

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Meetinghouse »

Hmmm. Well, I'll download the newer GPU client, but I won't hold my breath. It doesn't crash my machine; it just does the 24-hour pause. Then there will be days on end with no 66xx or 101xx WUs, so I think I'll just continue to limp this machine along. The other 3 CPUs and the PS3 I've got running keep me moving in the stats, but the GPU isn't adding as much as I'd hoped.
Meetinghouse

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Meetinghouse »

Well, I guess it did crash my machine the other day when I tried to open a game, so I just need to remember to pause it for that game.
Nathan_P

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Nathan_P »

Meetinghouse wrote: Well, I guess it did crash my machine the other day when I tried to open a game, so I just need to remember to pause it for that game.
That happens to most of us. I tried to play Civ 3 the other day on my 8800GT and it rebooted the machine :shock: . My old 8800GTX, however, could do both. Maybe it's something to do with the new drivers?
bretth603

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bretth603 »

Sorry, by GPU crash I meant the GPU gets into a state where it no longer refreshes the screen properly. As soon as I try to move a window or do anything, the portion I am moving just pixellates and smears. The rest of the machine is still running fine, but it's difficult to use without a monitor! Most EUE WUs have no effect, but a few put the GPU in this funny state.
gwildperson
Posts: 450
Joined: Tue Dec 04, 2007 8:36 pm

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by gwildperson »

Well, consider the answers to these questions and then figure out what to do next.

1) Does the GPU only hang with these drivers?
2) Does the GPU only hang when a protein contains some Hydrogen atoms ;) or something like that?
3) Does the GPU only hang with the current version of FahCore_11? .... or of FahCore_15?
4) Does the GPU only hang when you overclock?
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

@Brett The 6.40r1 was a drop-in executable that was announced somewhere on the message boards (I forget where). That's why you didn't find it on the website. As far as I can tell it so far has no effect on this problem. Maybe that will change if PG someday changes the Work Server code to avoid assigning 66xx and 101xx WUs to our hardware.

My comment:
This was a link of mine referring to Bruce's pointer to the 6.40r1 client...
viewtopic.php?f=59&t=16471
but it has gone to ground, for whatever reason, and won't let me in!
I think it was even my own thread <g>
Even though Brett doesn't agree, I feel that the improvement since using it is too noticeable to be chance.
No idea why it's not more widely promoted, as it has run very well for me.

[NB: A slight reservation with the new client is that concurrent CPU client use is degraded, though that is so much less efficient anyway. I'm not concerned as a single producer.
There may be wider implications to this and multiple CPU clients that I'm not privy to...]

My final point to Brett is that, having seen only 2 examples of 66xx units [and one-offs] on the new client, I suspect this could be a gating of the download server to a more productive one. Just a thought...
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

I don't know why the source is suppressed, but this page gives the links to F@H Client for GPU3 ver 6.40r1
http://www.overclock.net/overclock-net- ... -40r1.html
Whichever one you use, don't forget to remove the '_systray' or '_console' from the name of the downloaded exe file when you copy it to your working folder[s], or it won't execute. Also remove other shortcuts to the older clients, especially if auto-starting.
It has been very stable for me.
NB: Note the rider about lower CUDA 1.0 cards in the post.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bruce »

new08 wrote: I don't know why the source is suppressed, but this page gives the links to F@H Client for GPU3 ver 6.40r1
http://www.overclock.net/overclock-net- ... -40r1.html
Whichever one you use, don't forget to remove the '_systray' or '_console' from the name of the downloaded exe file when you copy it to your working folder[s], or it won't execute. Also remove other shortcuts to the older clients, especially if auto-starting.
It has been very stable for me.
NB: Note the rider about lower CUDA 1.0 cards in the post.
I guess you didn't read viewtopic.php?f=59&t=16698&p=167548#p167548.

Beta testing is risky business, although the risk does vary from one version to the next. The problem is that you are encouraging folks to beta test a client which comes with a much higher risk than the public beta that's posted on the download page.

Anyone can CHOOSE to beta test and can CHOOSE to accept higher levels of risk, but only if they're informed of that choice, and neither you nor the person posting that link said anything about the risk. See viewtopic.php?f=2&t=10364

Beta testing needs to be an opt-in process where people are informed about the risks so they can make an intelligent choice. ANYBODY who wants to beta test can join the beta team once they acknowledge having read the warnings. Distributing unpublished links is not helping people; it gets people who have not made an informed choice into trouble.
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

Beta testing encouragement? This is unintentional; it's not flagged as beta.
It does, however, seem to enhance performance of the cards that will run it OK.
That was my intention, and if this is counter to any policy, then feel free to modify my post accordingly, Bruce.
If the new client is as good as it seems, then a Beta comment would be correct.
No, I'd not seen VJ's post on this, as you didn't link to it earlier, AFAIK.
This is a good resource for card capabilities & Beta issues: viewtopic.php?p=158851#p158849
and subsequent posts thereon.