This may not be the WU fault- 66xx's [NVidia-8400GS]


new08
Posts: 188
Joined: Fri Jan 04, 2008 11:02 pm
Hardware configuration: Hewlett-Packard 1494 Win10 Build 1836
GeForce [MSI] GTX 950
Runs F@H Ver7.6.21
[As of Jan 2021]
Location: England

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

OK, Brett- that's clear.
I suppose, if I was to deal with 'our' problem, I'd get a failed download to register, check the device asking for work [now possible for my card] and then switch to a known successful work unit, like wot 'appened [probably by accident, going by your server notes].
Unless, unless- they are already now capable of re-directing work..? Mmmm... ;)

Later Edit: The next completion went the same. [Getting another 105xx OK]
Notice the number of card checks* done at sending and getting new work. This looks more than previous- or am I just mardy? :)


[12:00:26] + Attempting to send results [October 30 12:00:26 UTC]
*[12:00:26] Gpu type=2 species=11.
[12:00:26] + Results successfully sent
[12:00:26] Thank you for your contribution to Folding@Home.
[12:00:30] - Preparing to get new work unit...
[12:00:30] Cleaning up work directory
[12:00:30] + Attempting to get work packet
*[12:00:30] Gpu type=2 species=11.
[12:00:30] - Connecting to assignment server
[12:00:31] - Successful: assigned to (171.64.65.61).
[12:00:31] + News From Folding@Home: Welcome to Folding@Home
[12:00:31] Loaded queue successfully.
*[12:00:31] Gpu type=2 species=11.
[12:00:33] + Closed connections
[12:00:38] 
Last edited by new08 on Wed Nov 03, 2010 7:10 pm, edited 1 time in total.
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

Your view was correct, Brett!
I think my hopes were a little high.
I got 3 consecutive 105xx units which ran straight off and then got stuck on the 171.64.65.71 Server again- which I had to block, due to wasting units.
A good thing then, that my new build P4 MoBo with GT240 is running now, at default clocks & gaining a decent 3333 ppd. :)
The whole rig at less than £100- 'pre-loved' as they say!
Last edited by new08 on Wed Nov 03, 2010 7:15 pm, edited 1 time in total.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bruce »

new08 wrote:OK, Brett- that's clear.
I suppose, if I was to deal with 'our' problem, I'd get a failed download to register, check the device asking for work [now possible for my card] and then switch to a known successful work unit, like wot 'appened -[ probably by accident] from your server notes.
Unless, unless- they are already now capable of re-directing work..? Mmmm... ;)

Later Edit: The next completion went the same. [Getting another 105xx OK]
Notice the number of card checks* done at sending and getting new work. This looks more than previous- or am I just mardy? :)


[12:00:26] + Attempting to send results [October 30 12:00:26 UTC]
*[12:00:26] Gpu type=2 species=11.
[12:00:26] + Results successfully sent
[12:00:26] Thank you for your contribution to Folding@Home.
[12:00:30] - Preparing to get new work unit...
[12:00:30] Cleaning up work directory
[12:00:30] + Attempting to get work packet
*[12:00:30] Gpu type=2 species=11.
[12:00:30] - Connecting to assignment server
[12:00:31] - Successful: assigned to (171.64.65.61).
[12:00:31] + News From Folding@Home: Welcome to Folding@Home
[12:00:31] Loaded queue successfully.
*[12:00:31] Gpu type=2 species=11.
[12:00:33] + Closed connections
[12:00:38] 
Mardy? I don't see anything wrong with the log you posted. Sure, there are more messages about species=11 than there probably needs to be, but that sort of thing can be cleaned up in the final code. The important thing is that you're successfully completing the WUs you're being assigned.
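To put a number on how often the client repeats the GPU identification line, a short script can count it. This is only a rough sketch: the excerpt is copied from the log quoted above, and the names `GPU_LINE` and `count_gpu_checks` are made up for illustration, not part of the F@H client.

```python
import re

# Excerpt from the log quoted above. The leading '*' marks are the
# poster's own annotations, not something the client writes.
LOG = """\
[12:00:26] + Attempting to send results [October 30 12:00:26 UTC]
*[12:00:26] Gpu type=2 species=11.
[12:00:26] + Results successfully sent
[12:00:30] + Attempting to get work packet
*[12:00:30] Gpu type=2 species=11.
[12:00:31] Loaded queue successfully.
*[12:00:31] Gpu type=2 species=11.
[12:00:33] + Closed connections
"""

# Matches the GPU identification line and captures type and species.
GPU_LINE = re.compile(r"Gpu type=(\d+) species=(\d+)\.")

def count_gpu_checks(log_text):
    """Return {(type, species): count} for every GPU identification line."""
    counts = {}
    for match in GPU_LINE.finditer(log_text):
        key = (int(match.group(1)), int(match.group(2)))
        counts[key] = counts.get(key, 0) + 1
    return counts

print(count_gpu_checks(LOG))
```

On this excerpt the card is identified three times within about five seconds, once per send, fetch and queue reload.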
bruce

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bruce »

bretth603 wrote:OK, now I understand what you're thinking.

The 171.64.65.61 Work Server has been serving up 105xx as well as 66xx WUs since before I started blocking IPs, though. I have them on a little notepad on my desk from a couple of months ago. Sorry you got your hopes up. :(

I really don't think the changes PG has put in place with this new client will help us, since they are only identifying CUDA Compute Capability, not individual hardware. We share the same CUDA Compute Capability with lots of other cards that have no problem with these WUs. But, hey, let's hope I'm wrong.
If NVidia says that a specific piece of hardware can perform certain types of calculations needed by a specific project, it's possible that Nvidia drivers are reporting the wrong information. It's also possible that your GPU is somehow "unique" and can't do what NVidia says it should be able to do (such as being overclocked or defective). FAH has to depend on NVidia knowing what the [card+drivers] can do and cannot do.

I'd really like to figure out which it is.

This topic was originally about projects 105xx as well as 66xx but now you're having trouble because you're blocking 171.64.65.71 which doesn't have either of those projects. (It has 101xx.) Maybe you're too quick on the trigger to block a server that has projects that happen to be incompatible with your overclock or something like that. Please post FAHlog showing why you decided to block that server. While you're at it, please save the data from your work file so the Pande Group can actually reproduce the problem that you seem to be having.
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

I won't be able to help much on that soon, Bruce, as I will switch rigs to the new one and give the 8400GS experiment a breather. If it was going to keep slogging away at the 750 ppd [GPU 650 ppd + CPU 100 max] then it would be worth keeping the power on. Over 24 hours out of service just knocks back production too much for the hassle.
The card runs sweet enough on good units, straight through, and with no temperature transients @ 99% utilisation.
No guarantee on this being enough, but my antennae detect the lack of design proving on this issue with early cards, difficult to do remotely.
What would be interesting to know is how many people could/would up their production on this card, if fitted - if this rather crucial problem was resolved?
bretth603
Posts: 19
Joined: Wed Nov 18, 2009 4:32 pm

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bretth603 »

bruce wrote: If NVidia says that a specific piece of hardware can perform certain types of calculations needed by a specific project, it's possible that Nvidia drivers are reporting the wrong information. It's also possible that your GPU is somehow "unique" and can't do what NVidia says it should be able to do (such as being overclocked or defective). FAH has to depend on NVidia knowing what the [card+drivers] can do and cannot do.

I'd really like to figure out which it is.

This topic was originally about projects 105xx as well as 66xx but now you're having trouble because you're blocking 171.64.65.71 which doesn't have either of those projects. (It has 101xx.) Maybe you're too quick on the trigger to block a server that has projects that happen to be incompatible with your overclock or something like that. Please post FAHlog showing why you decided to block that server. While you're at it, please save the data from your work file so the Pande Group can actually reproduce the problem that you seem to be having.
Hi Bruce,

The problem is with all 66xx and 101xx WUs. All of them crash instantly (EUE) on 8400 GS cards. 57xx and 105xx WUs run fine. It's not an overclock or stability issue; I've investigated overclocking, underclocking, cooling, memtestg80, etc. I'm not overclocking now but the card can stably run at 10% overclock on memory and 25% overclock on shaders. Even underclocked with a giant external A/C fan blowing on the card it still crashes instantly on those WUs.

We are blocking WSs 171.64.65.61 and 171.64.65.71 to avoid getting 66xx and 101xx WUs. It keeps the GPU from crashing, but unfortunately it also means I sometimes have to wait several days for a WU.
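As a rough way to keep track of which project families are affected, the grouping can be written down in a few lines. This is only a sketch: the two sets simply restate what donors have reported in this thread, not any official Pande Group list, and the function names are made up for illustration.

```python
# Families reported in this thread for the 8400 GS.
# Assumption: these come from donor reports only, not from PG.
REPORTED_FAILING = {"66xx", "101xx"}
REPORTED_RUNNING = {"57xx", "105xx"}

def project_family(project):
    """Map a project number such as 10111 to its family label '101xx'."""
    return f"{project // 100}xx"

def expected_outcome(project):
    """Say what the thread's reports suggest for this project number."""
    family = project_family(project)
    if family in REPORTED_FAILING:
        return "likely EUE"
    if family in REPORTED_RUNNING:
        return "likely OK"
    return "unknown"

# 10111 appears in the logs in this thread; 10501 is just an example number.
print(project_family(10111), expected_outcome(10111))
print(project_family(10501), expected_outcome(10501))
```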

Thanks for taking an interest. I would be happy to provide any assistance PG would like should they be interested.
Last edited by bretth603 on Wed Nov 03, 2010 9:03 pm, edited 2 times in total.
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

From previous page: [my post]..
[10:12:39] Calling fah_main args: 14 usage=100
[10:12:39]
[10:12:43] Working on Protein
[10:12:44] mdrun_gpu returned
[10:12:44] Going to send back what have done -- stepsTotalG=0
[10:12:44] Work fraction=0.0000 steps=0.
[10:12:47] logfile size=9159 infoLength=9159 edr=0 trr=25
[10:12:47] + Opened results file
[10:12:47] - Writing 9697 bytes of core data to disk...
[10:12:48] Done: 9185 -> 3339 (compressed to 36.3 percent)
[10:12:48] ... Done.

Looks like zero progress on the unit, with the logfile sent back after the mdrun problem [a common result]
Data has been going back!
bruce

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bruce »

new08 wrote:From previous page: [my post]..
[10:12:39] Calling fah_main args: 14 usage=100
[10:12:39]
[10:12:43] Working on Protein
[10:12:44] mdrun_gpu returned
[10:12:44] Going to send back what have done -- stepsTotalG=0
[10:12:44] Work fraction=0.0000 steps=0.
[10:12:47] logfile size=9159 infoLength=9159 edr=0 trr=25
[10:12:47] + Opened results file
[10:12:47] - Writing 9697 bytes of core data to disk...
[10:12:48] Done: 9185 -> 3339 (compressed to 36.3 percent)
[10:12:48] ... Done.

Looks like zero progress on the unit, with the logfile sent back after the mdrun problem [a common result]
Data has been going back!
Realistically, that post isn't helpful. I don't see any sign of a message saying Gpu type=2 species=11. If that was the older client you may be still reporting a problem that has already been fixed.

Compute_capability >=1.1 is not simply a question of what hardware you have. It also identifies certain driver capabilities and certain CUDA level requirements.

Certainly there was a problem, but the question remaining was whether it has been fixed or not, and that may mean you have to upgrade certain software features, too. I'm looking for information that clearly documents that a problem remains AFTER the new client is in use and I'm not seeing that information yet.
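One possible way to screen a log excerpt before posting it is sketched below. It rests on an assumption taken from the point above: that a "Gpu type=... species=..." line only shows up with the newer client. The function name and the two sample excerpts are illustrative; the sample lines follow the FAHlog format seen in this thread.

```python
def documents_problem_on_new_client(log_text):
    """True only if the log shows both a GPU species line and a failed core run."""
    # Assumption: the species line is written only by the newer client.
    has_species_line = "Gpu type=" in log_text and "species=" in log_text
    has_failure = "Core Shutdown: UNSTABLE_MACHINE" in log_text
    return has_species_line and has_failure

# No species line and no shutdown status: not enough to show anything.
OLD_STYLE = """\
[10:12:44] mdrun_gpu returned
[10:12:44] Work fraction=0.0000 steps=0.
"""

# Species line plus an UNSTABLE_MACHINE shutdown: the kind of evidence asked for.
NEW_STYLE = """\
[16:48:23] Gpu type=2 species=11.
[16:48:33] mdrun_gpu returned
[16:48:37] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

print(documents_problem_on_new_client(OLD_STYLE))
print(documents_problem_on_new_client(NEW_STYLE))
```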
bretth603

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bretth603 »

Log file below from my run with the new core. I had posted this in the thread for the new core but I'll repost here for convenience.

I don't see any work files. If they were created, then they must have been deleted by the client.


--- Opening Log file [October 27 16:48:23 UTC] 


# Windows GPU Systray Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.40r1

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\xxxxxxxx\Application Data\Folding@home-gpu
Arguments: -verbosity 9 

[16:48:23] - Ask before connecting: No
[16:48:23] - User name: xxxxxxx
[16:48:23] - User ID: xxxxxxx
[16:48:23] - Machine ID: 2
[16:48:23] 
[16:48:23] Gpu type=2 species=11.
[16:48:23] Loaded queue successfully.
[16:48:23] Initialization complete
[16:48:23] - Preparing to get new work unit...
[16:48:23] Cleaning up work directory
[16:48:23] - Autosending finished units... [October 27 16:48:23 UTC]
[16:48:23] Trying to send all finished work units
[16:48:23] + No unsent completed units remaining.
[16:48:23] - Autosend completed
[16:48:23] + Attempting to get work packet
[16:48:23] Passkey found
[16:48:23] - Will indicate memory of 3063 MB
[16:48:23] Gpu type=2 species=11.
[16:48:23] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 14, Stepping: 5
[16:48:23] - Connecting to assignment server
[16:48:23] Connecting to http://assign-GPU.stanford.edu:8080/
[16:48:24] Posted data.
[16:48:24] Initial: 40AB; - Successful: assigned to (171.64.65.71).
[16:48:24] + News From Folding@Home: Welcome to Folding@Home
[16:48:24] Loaded queue successfully.
[16:48:24] Gpu type=2 species=11.
[16:48:24] Sent data
[16:48:24] Connecting to http://171.64.65.71:8080/
[16:48:25] Posted data.
[16:48:25] Initial: 0000; - Receiving payload (expected size: 82294)
[16:48:26] - Downloaded at ~80 kB/s
[16:48:26] - Averaged speed for that direction ~96 kB/s
[16:48:26] + Received work.
[16:48:26] + Closed connections
[16:48:26] 
[16:48:26] + Processing work unit
[16:48:26] Core required: FahCore_11.exe
[16:48:26] Core found.
[16:48:26] Working on queue slot 07 [October 27 16:48:26 UTC]
[16:48:26] + Working ...
[16:48:26] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -nice 19 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 3372 -version 640'

[16:48:26] 
[16:48:26] *------------------------------*
[16:48:26] Folding@Home GPU Core
[16:48:26] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[16:48:26] 
[16:48:26] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:48:26] Build host: amoeba
[16:48:26] Board Type: Nvidia
[16:48:26] Core      : 
[16:48:26] Preparing to commence simulation
[16:48:26] - Looking at optimizations...
[16:48:26] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[16:48:26] - Created dyn
[16:48:26] - Files status OK
[16:48:26] - Expanded 81782 -> 421543 (decompressed 515.4 percent)
[16:48:26] Called DecompressByteArray: compressed_data_size=81782 data_size=421543, decompressed_data_size=421543 diff=0
[16:48:26] - Digital signature verified
[16:48:26] 
[16:48:26] Project: 10111 (Run 457, Clone 0, Gen 55)
[16:48:26] 
[16:48:26] Assembly optimizations on if available.
[16:48:26] Entering M.D.
[16:48:32] Tpr hash work/wudata_07.tpr:  564795167 4246505016 3975217984 3119005825 1735423598
[16:48:32] 
[16:48:32] Calling fah_main args: 14 usage=100
[16:48:32] 
[16:48:33] Working on 1174 p10111_ubiquitin_300K
[16:48:33] mdrun_gpu returned 
[16:48:33] Going to send back what have done -- stepsTotalG=0
[16:48:33] Work fraction=0.0000 steps=0.
[16:48:37] logfile size=9172 infoLength=9172 edr=0 trr=25
[16:48:37] + Opened results file
[16:48:37] - Writing 9710 bytes of core data to disk...
[16:48:37] Done: 9198 -> 3355 (compressed to 36.4 percent)
[16:48:37]   ... Done.
[16:48:37] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[16:48:37] 
[16:48:37] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:48:40] CoreStatus = 7A (122)
[16:48:40] Sending work to server
[16:48:40] Project: 10111 (Run 457, Clone 0, Gen 55)


[16:48:40] + Attempting to send results [October 27 16:48:40 UTC]
[16:48:40] - Reading file work/wuresults_07.dat from core
[16:48:40]   (Read 3867 bytes from disk)
[16:48:40] Gpu type=2 species=11.
[16:48:40] Connecting to http://171.64.65.71:8080/
[16:48:41] Posted data.
[16:48:41] Initial: 0000; - Uploaded at ~4 kB/s
[16:48:41] - Averaged speed for that direction ~57 kB/s
[16:48:41] + Results successfully sent
[16:48:41] Thank you for your contribution to Folding@Home.
[16:48:45] Trying to send all finished work units
[16:48:45] + No unsent completed units remaining.
[16:48:45] + Closed connections
[16:48:45] + Paused after finishing unit
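For anyone comparing several logs like the one above, the project and the shutdown status can be pulled out automatically. The sketch below assumes only the two line formats visible in the log ("Project: ... (Run ..., Clone ..., Gen ...)" and "Folding@home Core Shutdown: ..."); the names `PROJECT`, `SHUTDOWN` and `work_unit_outcomes` are invented for this example.

```python
import re

PROJECT = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN = re.compile(r"Folding@home Core Shutdown: (\w+)")

def work_unit_outcomes(log_text):
    """Pair each Core Shutdown status with the most recent Project line."""
    outcomes = []
    current = None
    for line in log_text.splitlines():
        project_match = PROJECT.search(line)
        if project_match:
            # Always keep the latest project seen; the client repeats the
            # Project line again when it sends results.
            current = tuple(int(g) for g in project_match.groups())
            continue
        shutdown_match = SHUTDOWN.search(line)
        if shutdown_match and current is not None:
            outcomes.append((current, shutdown_match.group(1)))
            current = None
    return outcomes

# Three lines taken from the log above.
SAMPLE = """\
[16:48:26] Project: 10111 (Run 457, Clone 0, Gen 55)
[16:48:37] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:48:40] Project: 10111 (Run 457, Clone 0, Gen 55)
"""

print(work_unit_outcomes(SAMPLE))
```

The second Project line in the sample is the repeat printed before upload, so it produces no extra entry.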
new08

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

OK Bruce- this is another example in more detail [3 attempts, followed by my FW blocking further work requests]:


--- Opening Log file [November 2 22:59:08 UTC] 

# Windows GPU Systray Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.40r1

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: D:\Documents and Settings\myname\Application Data\Folding@home-gpu

[22:59:08] - Ask before connecting: No
[22:59:08] - User name: new08 (Team 39340)
[22:59:08] - User ID: **************
[22:59:08] - Machine ID: 2
[22:59:08] 
[22:59:08] Gpu type=2 species=11.
[22:59:08] Loaded queue successfully.
[22:59:08] Initialization complete
[22:59:08] - Preparing to get new work unit...
[22:59:08] Cleaning up work directory
[22:59:08] + Attempting to get work packet
[22:59:08] Gpu type=2 species=11.
[22:59:08] - Connecting to assignment server
[22:59:13] - Successful: assigned to (171.64.65.71).
[22:59:13] + News From Folding@Home: Welcome to Folding@Home
[22:59:13] Loaded queue successfully.
[22:59:13] Gpu type=2 species=11.
[22:59:34] - Couldn't send HTTP request to server
[22:59:34] + Could not connect to Work Server
[22:59:34] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[22:59:42] + Attempting to get work packet
[22:59:42] Gpu type=2 species=11.
[22:59:42] - Connecting to assignment server
[22:59:46] - Successful: assigned to (171.64.65.71).
[22:59:46] + News From Folding@Home: Welcome to Folding@Home
[22:59:46] Loaded queue successfully.
[22:59:46] Gpu type=2 species=11.
[23:00:07] - Couldn't send HTTP request to server
[23:00:07] + Could not connect to Work Server
[23:00:07] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[23:00:20] + Attempting to get work packet
[23:00:20] Gpu type=2 species=11.
[23:00:20] - Connecting to assignment server
[23:00:21] - Successful: assigned to (171.64.65.71).
[23:00:21] + News From Folding@Home: Welcome to Folding@Home
[23:00:21] Loaded queue successfully.
[23:00:21] Gpu type=2 species=11.
[23:00:23] + Closed connections
[23:00:23] 
[23:00:23] + Processing work unit
[23:00:23] Core required: FahCore_11.exe
[23:00:23] Core found.
[23:00:23] Working on queue slot 07 [November 2 23:00:23 UTC]
[23:00:23] + Working ...
[23:00:25] 
[23:00:25] *------------------------------*
[23:00:25] Folding@Home GPU Core
[23:00:25] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[23:00:25] 
[23:00:25] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:00:25] Build host: amoeba
[23:00:25] Board Type: Nvidia
[23:00:25] Core      : 
[23:00:25] Preparing to commence simulation
[23:00:25] - Looking at optimizations...
[23:00:25] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[23:00:25] - Created dyn
[23:00:25] - Files status OK
[23:00:26] - Expanded 81791 -> 421543 (decompressed 515.3 percent)
[23:00:26] Called DecompressByteArray: compressed_data_size=81791 data_size=421543, decompressed_data_size=421543 diff=0
[23:00:26] - Digital signature verified
[23:00:26] 
[23:00:26] Project: 10111 (Run 317, Clone 1, Gen 16)
[23:00:26] 
[23:00:26] Assembly optimizations on if available.
[23:00:26] Entering M.D.
[23:00:32] Tpr hash work/wudata_07.tpr:  3474740447 3253851595 1993699476 3031274735 3062200486
[23:00:32] 
[23:00:32] Calling fah_main args: 14 usage=100
[23:00:32] 
[23:00:38] Working on 1174 p10111_ubiquitin_300K
[23:00:39] mdrun_gpu returned 
[23:00:39] Going to send back what have done -- stepsTotalG=0
[23:00:39] Work fraction=0.0000 steps=0.
[23:00:43] logfile size=9172 infoLength=9172 edr=0 trr=25
[23:00:43] + Opened results file
[23:00:43] - Writing 9710 bytes of core data to disk...
[23:00:44] Done: 9198 -> 3360 (compressed to 36.5 percent)
[23:00:44]   ... Done.
[23:00:44] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[23:00:47] 
[23:00:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[23:00:50] CoreStatus = 7A (122)
[23:00:50] Sending work to server
[23:00:50] Project: 10111 (Run 317, Clone 1, Gen 16)
[23:00:50] - Read packet limit of 540015616... Set to 524286976.


[23:00:50] + Attempting to send results [November 2 23:00:50 UTC]
[23:00:50] Gpu type=2 species=11.
[23:00:53] + Results successfully sent
[23:00:53] Thank you for your contribution to Folding@Home.
[23:00:57] - Preparing to get new work unit...
[23:00:57] Cleaning up work directory
[23:00:57] + Attempting to get work packet
[23:00:57] Gpu type=2 species=11.
[23:00:57] - Connecting to assignment server
[23:00:58] - Successful: assigned to (171.64.65.71).
[23:00:58] + News From Folding@Home: Welcome to Folding@Home
[23:00:58] Loaded queue successfully.
[23:00:58] Gpu type=2 species=11.
[23:01:00] + Closed connections
[23:01:05] 
[23:01:05] + Processing work unit
[23:01:05] Core required: FahCore_11.exe
[23:01:05] Core found.
[23:01:05] Working on queue slot 08 [November 2 23:01:05 UTC]
[23:01:05] + Working ...
[23:01:06] 
[23:01:06] *------------------------------*
[23:01:06] Folding@Home GPU Core
[23:01:06] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[23:01:06] 
[23:01:06] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:01:06] Build host: amoeba
[23:01:06] Board Type: Nvidia
[23:01:06] Core      : 
[23:01:06] Preparing to commence simulation
[23:01:06] - Looking at optimizations...
[23:01:06] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[23:01:06] - Created dyn
[23:01:06] - Files status OK
[23:01:06] - Expanded 81882 -> 421543 (decompressed 514.8 percent)
[23:01:06] Called DecompressByteArray: compressed_data_size=81882 data_size=421543, decompressed_data_size=421543 diff=0
[23:01:06] - Digital signature verified
[23:01:06] 
[23:01:06] Project: 10111 (Run 234, Clone 1, Gen 56)
[23:01:06] 
[23:01:06] Assembly optimizations on if available.
[23:01:06] Entering M.D.
[23:01:12] Tpr hash work/wudata_08.tpr:  3784691646 81043957 1579807727 3630863380 2547381512
[23:01:12] 
[23:01:12] Calling fah_main args: 14 usage=100
[23:01:12] 
[23:01:16] Working on 1174 p10111_ubiquitin_300K
[23:01:17] mdrun_gpu returned 
[23:01:17] Going to send back what have done -- stepsTotalG=0
[23:01:17] Work fraction=0.0000 steps=0.
[23:01:21] logfile size=9177 infoLength=9177 edr=0 trr=25
[23:01:21] + Opened results file
[23:01:21] - Writing 9715 bytes of core data to disk...
[23:01:21] Done: 9203 -> 3365 (compressed to 36.5 percent)
[23:01:21]   ... Done.
[23:01:21] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[23:01:25] 
[23:01:25] Folding@home Core Shutdown: UNSTABLE_MACHINE
[23:01:28] CoreStatus = 7A (122)
[23:01:28] Sending work to server
[23:01:28] Project: 10111 (Run 234, Clone 1, Gen 56)
[23:01:28] - Read packet limit of 540015616... Set to 524286976.


[23:01:28] + Attempting to send results [November 2 23:01:28 UTC]
[23:01:28] Gpu type=2 species=11.
[23:01:49] - Couldn't send HTTP request to server
[23:01:49] + Could not connect to Work Server (results)
[23:01:49]     (171.64.65.71:8080)
[23:01:49] + Retrying using alternative port
[23:02:10] - Couldn't send HTTP request to server
[23:02:10] + Could not connect to Work Server (results)
[23:02:10]     (171.64.65.71:80)
[23:02:10] - Error: Could not transmit unit 08 (completed November 2) to work server.
[23:02:10]   Keeping unit 08 in queue.
[23:02:10] Project: 10111 (Run 234, Clone 1, Gen 56)
[23:02:10] - Read packet limit of 540015616... Set to 524286976.


[23:02:10] + Attempting to send results [November 2 23:02:10 UTC]
[23:02:10] Gpu type=2 species=11.
[23:02:31] - Couldn't send HTTP request to server
[23:02:31] + Could not connect to Work Server (results)
[23:02:31]     (171.64.65.71:8080)
[23:02:31] + Retrying using alternative port
[23:02:35] + Results successfully sent
[23:02:35] Thank you for your contribution to Folding@Home.
[23:02:35] - Preparing to get new work unit...
[23:02:35] Cleaning up work directory
[23:02:35] + Attempting to get work packet
[23:02:35] Gpu type=2 species=11.
[23:02:35] - Connecting to assignment server
[23:02:36] - Successful: assigned to (171.64.65.71).
[23:02:36] + News From Folding@Home: Welcome to Folding@Home
[23:02:36] Loaded queue successfully.
[23:02:36] Gpu type=2 species=11.
[23:02:38] + Closed connections
[23:02:43] 
[23:02:43] + Processing work unit
[23:02:43] Core required: FahCore_11.exe
[23:02:43] Core found.
[23:02:43] Working on queue slot 09 [November 2 23:02:43 UTC]
[23:02:43] + Working ...
[23:02:43] 
[23:02:43] *------------------------------*
[23:02:43] Folding@Home GPU Core
[23:02:43] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[23:02:43] 
[23:02:43] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:02:43] Build host: amoeba
[23:02:43] Board Type: Nvidia
[23:02:43] Core      : 
[23:02:43] Preparing to commence simulation
[23:02:43] - Looking at optimizations...
[23:02:43] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[23:02:43] - Created dyn
[23:02:43] - Files status OK
[23:02:43] - Expanded 81817 -> 421543 (decompressed 515.2 percent)
[23:02:43] Called DecompressByteArray: compressed_data_size=81817 data_size=421543, decompressed_data_size=421543 diff=0
[23:02:43] - Digital signature verified
[23:02:43] 
[23:02:43] Project: 10111 (Run 529, Clone 3, Gen 8)
[23:02:43] 
[23:02:44] Assembly optimizations on if available.
[23:02:44] Entering M.D.
[23:02:50] Tpr hash work/wudata_09.tpr:  2698002448 6404310 319504577 1447482104 148196192
[23:02:50] 
[23:02:50] Calling fah_main args: 14 usage=100
[23:02:50] 
[23:02:54] Working on 1174 p10111_ubiquitin_300K
[23:02:54] mdrun_gpu returned 
[23:02:54] Going to send back what have done -- stepsTotalG=0
[23:02:54] Work fraction=0.0000 steps=0.
[23:02:58] logfile size=0 infoLength=0 edr=0 trr=25
[23:02:58] + Opened results file
[23:02:58] - Writing 637 bytes of core data to disk...
[23:02:58] Done: 125 -> 124 (compressed to 99.2 percent)
[23:02:58]   ... Done.
[23:02:59] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[23:03:01] 
[23:03:01] Folding@home Core Shutdown: UNSTABLE_MACHINE
[23:03:03] CoreStatus = 7A (122)
[23:03:03] Sending work to server
[23:03:03] Project: 10111 (Run 529, Clone 3, Gen 8)
[23:03:03] - Read packet limit of 540015616... Set to 524286976.


[23:03:03] + Attempting to send results [November 2 23:03:03 UTC]
[23:03:03] Gpu type=2 species=11.
[23:03:25] - Couldn't send HTTP request to server
[23:03:25] + Could not connect to Work Server (results)
[23:03:25]     (171.64.65.71:8080)
[23:03:25] + Retrying using alternative port
[23:03:46] - Couldn't send HTTP request to server
[23:03:46] + Could not connect to Work Server (results)
[23:03:46]     (171.64.65.71:80)
[23:03:46] - Error: Could not transmit unit 09 (completed November 2) to work server.
[23:03:46]   Keeping unit 09 in queue.
[23:03:46] Project: 10111 (Run 529, Clone 3, Gen 8)
[23:03:46] - Read packet limit of 540015616... Set to 524286976.


[23:03:46] + Attempting to send results [November 2 23:03:46 UTC]
[23:03:46] Gpu type=2 species=11.
[23:04:07] - Couldn't send HTTP request to server
[23:04:07] + Could not connect to Work Server (results)
[23:04:07]     (171.64.65.71:8080)
[23:04:07] + Retrying using alternative port
[23:04:27] - Couldn't send HTTP request to server
[23:04:27] + Could not connect to Work Server (results)
[23:04:27]     (171.64.65.71:80)
[23:04:27] - Error: Could not transmit unit 09 (completed November 2) to work server.
[23:04:27] - Read packet limit of 540015616... Set to 524286976.


[23:04:27] + Attempting to send results [November 2 23:04:27 UTC]
[23:04:27] Gpu type=2 species=11.
[23:04:28] + Results successfully sent
[23:04:28] Thank you for your contribution to Folding@Home.
[23:04:28]   Successfully sent unit 09 to Collection server.
[23:04:28] - Preparing to get new work unit...
[23:04:28] Cleaning up work directory
[23:04:28] + Attempting to get work packet
[23:04:28] Gpu type=2 species=11.
[23:04:28] - Connecting to assignment server
[23:04:29] - Successful: assigned to (171.64.65.71).
[23:04:29] + News From Folding@Home: Welcome to Folding@Home
[23:04:30] Loaded queue successfully.
[23:04:30] Gpu type=2 species=11.
[23:04:51] - Couldn't send HTTP request to server
[23:04:51] + Could not connect to Work Server
[23:04:51] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[23:05:00] + Attempting to get work packet
[23:05:00] Gpu type=2 species=11.
Hope this helps a bit more- tallies with Brett.
I've decommissioned that PC, for F@H -now the GT240 is blazing away @ 3K+ ppd. on a new PCIe capable rig.
If this problem is solved for the 8400GS I would consider bringing it back though, for another 20% per day!
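To see at a glance which Work Server the client keeps being sent to, the assignment lines in a log like the one above can be tallied. This is a minimal sketch that relies only on the "Successful: assigned to (IP)" line format shown in the logs; `ASSIGNED` and `assignments_per_server` are names chosen for the example.

```python
import re
from collections import Counter

# Captures the IPv4 address from an assignment line.
ASSIGNED = re.compile(r"Successful: assigned to \((\d+\.\d+\.\d+\.\d+)\)")

def assignments_per_server(log_text):
    """Count how many times each Work Server IP was assigned."""
    return Counter(ASSIGNED.findall(log_text))

# Assignment lines in the format seen in the logs in this thread.
SAMPLE = """\
[22:59:13] - Successful: assigned to (171.64.65.71).
[22:59:46] - Successful: assigned to (171.64.65.71).
[12:00:31] - Successful: assigned to (171.64.65.61).
"""

for server, count in assignments_per_server(SAMPLE).most_common():
    print(server, count)
```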
Last edited by new08 on Fri Nov 05, 2010 8:48 pm, edited 1 time in total.
bruce

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by bruce »

Thanks. That helps a lot.
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by Nathan_P »

bretth603 wrote:
bruce wrote: If NVidia says that a specific piece of hardware can perform certain types of calculations needed by a specific project, it's possible that Nvidia drivers are reporting the wrong information. It's also possible that your GPU is somehow "unique" and can't do what NVidia says it should be able to do (such as being overclocked or defective). FAH has to depend on NVidia knowing what the [card+drivers] can do and cannot do.

I'd really like to figure out which it is.

This topic was originally about projects 105xx as well as 66xx but now you're having trouble because you're blocking 171.64.65.71 which doesn't have either of those projects. (It has 101xx.) Maybe you're too quick on the trigger to block a server that has projects that happen to be incompatible with your overclock or something like that. Please post FAHlog showing why you decided to block that server. While you're at it, please save the data from your work file so the Pande Group can actually reproduce the problem that you seem to be having.
Hi Bruce,

The problem is with all 66xx and 101xx WUs. All of them crash instantly (EUE) on 8400 GS cards. 57xx and 105xx WUs run fine. It's not an overclock or stability issue; I've investigated overclocking, underclocking, cooling, memtestg80, etc. I'm not overclocking now but the card can stably run at 10% overclock on memory and 25% overclock on shaders. Even underclocked with a giant external A/C fan blowing on the card it still crashes instantly on those WUs.

We are blocking WSs 171.64.65.61 and 171.64.65.71 to avoid getting 66xx and 101xx WUs. It keeps the GPU from crashing, but unfortunately it also means I sometimes have to wait several days for a WU.

Thanks for taking an interest. I would be happy to provide any assistance PG would like should they be interested.

There was a thread a while ago that covered this. Basically, the very low-end NVidia cards cannot complete these projects. The suspected cause was an insufficient number of shaders on the cards to process the WU; however, this was never confirmed by either NVidia or PG. The assumption is based only on what the community has been able to deduce from the info given by various donors.

Once I can find the link I will add it to the bottom of this post.
new08
Posts: 188
Joined: Fri Jan 04, 2008 11:02 pm
Hardware configuration: Hewlett-Packard 1494 Win10 Build 1836
GeForce [MSI] GTX 950
Runs F@H Ver7.6.21
[As of Jan 2021]
Location: England

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

Yes Nathan, that history would throw some light on the topic.
If you're right about shaders, then only PG know which units will 'throw a wobbly' on a low shader card like the 8400GS.
At least current reports seem consistent as to what units these are -and which ones run fine.
new08
Posts: 188
Joined: Fri Jan 04, 2008 11:02 pm
Hardware configuration: Hewlett-Packard 1494 Win10 Build 1836
GeForce [MSI] GTX 950
Runs F@H Ver7.6.21
[As of Jan 2021]
Location: England

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

Things have improved somewhat.
Now that the new GPU3 client can recognize the card more reliably, it seems to be helping, unless the balance of WUs has altered a lot recently. I've been getting much more consistent downloads that will run on the 8400GS.
This recent log shows the 8400GS being flagged as Type 2 Species 11, getting two 66xx units that fail to start, and then getting a runner [a 105xx] the next time.

If it is intended to work like this, then this log may be of some use. Anyway, there's obviously no bar on accessing 66xx units initially.

NB: Client version is 6.40r1 [Systray]
[Log file is edited]

Code: Select all


--- Opening Log file [November 23 04:33:58 UTC] 


# Windows GPU Systray Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.40r1

                          http://folding.stanford.edu

###############################################################################
###############################################################################





[11:45:01] Working on Protein

*************       [11:45:02] mdrun_gpu returned   *************

[11:45:02] Going to send back what have done -- stepsTotalG=0
[11:45:02] Work fraction=0.0000 steps=0.
[11:45:06] logfile size=9154 infoLength=9154 edr=0 trr=25
[11:45:06] + Opened results file
[11:45:06] - Writing 9692 bytes of core data to disk...
[11:45:06] Done: 9180 -> 3345 (compressed to 36.4 percent)
[11:45:06]   ... Done.
[11:45:06] DeleteFrameFiles: successfully deleted file=work/wudata_06.ckp
[11:45:06] 
[11:45:06] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:45:15] CoreStatus = 7A (122)
[11:45:15] Sending work to server
[11:45:15] Project: 6603 (Run 7, Clone 822, Gen 482)


[11:45:15] + Attempting to send results [November 25 11:45:15 UTC]
[11:45:21] Gpu type=2 species=11.
[11:45:22] + Results successfully sent
[11:45:22] Thank you for your contribution to Folding@Home.
[11:45:26] - Preparing to get new work unit...
[11:45:26] Cleaning up work directory
[11:45:26] + Attempting to get work packet
[11:45:26] Gpu type=2 species=11.
[11:45:26] - Connecting to assignment server
[11:45:27] - Successful: assigned to (171.64.65.61).
[11:45:27] + News From Folding@Home: Welcome to Folding@Home
[11:45:27] Loaded queue successfully.
[11:45:27] Gpu type=2 species=11.
[11:45:29] + Closed connections
[11:45:34] 
[11:45:34] + Processing work unit
[11:45:34] Core required: FahCore_11.exe
[11:45:34] Core found.
[11:45:34] Working on queue slot 07 [November 25 11:45:34 UTC]
[11:45:34] + Working ...
[11:45:34] 
[11:45:34] *------------------------------*
[11:45:34] Folding@Home GPU Core
[11:45:34] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[11:45:34] 
[11:45:34] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[11:45:34] Build host: amoeba
[11:45:34] Board Type: Nvidia
[11:45:34] Core      : 
[11:45:34] Preparing to commence simulation
[11:45:34] - Looking at optimizations...
[11:45:34] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[11:45:34] - Created dyn
[11:45:34] - Files status OK
[11:45:34] - Expanded 73967 -> 383588 (decompressed 518.5 percent)
[11:45:34] Called DecompressByteArray: compressed_data_size=73967 data_size=383588, decompressed_data_size=383588 diff=0
[11:45:34] - Digital signature verified
[11:45:34] 
[11:45:34] Project: 6601 (Run 4, Clone 460, Gen 538)
[11:45:34] 
[11:45:34] Assembly optimizations on if available.
[11:45:34] Entering M.D.
[11:45:40] Tpr hash work/wudata_07.tpr:  642383820 538657200 2057568056 328885807 1027131843
[11:45:40] 
[11:45:40] Calling fah_main args: 14 usage=100
[11:45:40] 
[11:45:42] Working on Protein
[11:45:43] mdrun_gpu returned 
[11:45:43] Going to send back what have done -- stepsTotalG=0
[11:45:43] Work fraction=0.0000 steps=0.
[11:45:47] logfile size=9156 infoLength=9156 edr=0 trr=25
[11:45:47] + Opened results file
[11:45:47] - Writing 9694 bytes of core data to disk...
[11:45:47] Done: 9182 -> 3349 (compressed to 36.4 percent)
[11:45:47]   ... Done.
[11:45:47] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[11:45:49] 
[11:45:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[11:45:51] CoreStatus = 7A (122)
[11:45:51] Sending work to server
[11:45:51] Project: 6601 (Run 4, Clone 460, Gen 538)


[11:45:51] + Attempting to send results [November 25 11:45:51 UTC]
[11:45:51] Gpu type=2 species=11.
[11:45:52] + Results successfully sent
[11:45:52] Thank you for your contribution to Folding@Home.
[11:45:56] - Preparing to get new work unit...
[11:45:56] Cleaning up work directory
[11:45:56] + Attempting to get work packet
[11:45:56] Gpu type=2 species=11.
[11:45:56] - Connecting to assignment server
[11:45:57] - Successful: assigned to (171.64.65.61).
[11:45:57] + News From Folding@Home: Welcome to Folding@Home
[11:45:57] Loaded queue successfully.
[11:45:57] Gpu type=2 species=11.
[11:45:59] + Closed connections
[11:46:04] 
[11:46:04] + Processing work unit
[11:46:04] Core required: FahCore_11.exe
[11:46:04] Core found.
[11:46:04] Working on queue slot 08 [November 25 11:46:04 UTC]
[11:46:04] + Working ...
[11:46:04] 
[11:46:04] *------------------------------*
[11:46:04] Folding@Home GPU Core
[11:46:04] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[11:46:04] 
[11:46:04] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[11:46:04] Build host: amoeba
[11:46:04] Board Type: Nvidia
[11:46:04] Core      : 
[11:46:04] Preparing to commence simulation
[11:46:04] - Looking at optimizations...
[11:46:04] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[11:46:04] - Created dyn
[11:46:04] - Files status OK
[11:46:04] - Expanded 62796 -> 336076 (decompressed 535.1 percent)
[11:46:04] Called DecompressByteArray: compressed_data_size=62796 data_size=336076, decompressed_data_size=336076 diff=0
[11:46:04] - Digital signature verified
[11:46:04] 
[11:46:04] Project: 10516 (Run 6, Clone 876, Gen 127)
[11:46:04] 
[11:46:04] Assembly optimizations on if available.
[11:46:04] Entering M.D.
[11:46:10] Tpr hash work/wudata_08.tpr:  951600134 4197749379 2938061380 4174309697 3739342597
[11:46:10] 
[11:46:10] Calling fah_main args: 14 usage=100
[11:46:10] 
[11:46:12] Working on Protein
[11:46:21] Client config found, loading data.
[11:46:21] Starting GUI Server
[12:02:54] Completed 1%
[12:18:22] Completed 2%
[12:33:29] Completed 3%
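For anyone who wants to pull the same "which units failed, which one ran" summary out of their own FAHlog, here is a rough sketch in Python. It is not part of the client. The function name and the regular expressions are my own assumptions, based only on the line formats visible in the log above (the `Project:` line, the `Core Shutdown:` line and the `Completed N%` lines). It also assumes the `Project:` line that follows a shutdown names the unit that just shut down, as it does in this log.

```python
import re

# Patterns assumed from the log lines shown above; other client versions may differ.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")
PROGRESS_RE = re.compile(r"Completed (\d+)%")


def summarize_fahlog(text):
    """Return [(project_number, outcome), ...] in the order units appear."""
    outcomes = {}            # keyed by (project, run, clone, gen); keeps insertion order
    current = None           # unit named by the most recent Project line
    pending_shutdown = None  # shutdown status waiting for the Project line after it
    for line in text.splitlines():
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown:
            pending_shutdown = shutdown.group(1)
            continue
        project = PROJECT_RE.search(line)
        if project:
            current = tuple(int(g) for g in project.groups())
            outcomes.setdefault(current, "no result yet")
            if pending_shutdown is not None:
                outcomes[current] = pending_shutdown
                pending_shutdown = None
            continue
        progress = PROGRESS_RE.search(line)
        if progress and current is not None:
            outcomes[current] = "running (%s%%)" % progress.group(1)
    return [(unit[0], outcome) for unit, outcome in outcomes.items()]


sample = "\n".join([
    "[11:45:34] Project: 6601 (Run 4, Clone 460, Gen 538)",
    "[11:45:49] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[11:45:51] Project: 6601 (Run 4, Clone 460, Gen 538)",
    "[11:46:04] Project: 10516 (Run 6, Clone 876, Gen 127)",
    "[12:33:29] Completed 3%",
])
print(summarize_fahlog(sample))
# [(6601, 'UNSTABLE_MACHINE'), (10516, 'running (3%)')]
```

Keying on the full Run/Clone/Gen tuple rather than the project number alone keeps two different units from the same project apart.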
new08
Posts: 188
Joined: Fri Jan 04, 2008 11:02 pm
Hardware configuration: Hewlett-Packard 1494 Win10 Build 1836
GeForce [MSI] GTX 950
Runs F@H Ver7.6.21
[As of Jan 2021]
Location: England

Re: This may not be the WU fault- 66xx's [NVidia-8400GS]

Post by new08 »

Strangely, Brett, on your query: after a goodish run, I got just what I described earlier.

Code: Select all

[18:24:20] Project: 6602 (Run 6, Clone 601, Gen 417)
[18:24:20] 
[18:24:20] Assembly optimizations on if available.
[18:24:20] Entering M.D.
[18:24:27] Tpr hash work/wudata_07.tpr:  3019176968 3400642789 1886931439 1417760104 3915870578
[18:24:27] 
[18:24:27] Calling fah_main args: 14 usage=100
[18:24:27] 
[18:24:30] Working on Protein
[18:24:31] mdrun_gpu returned 
[18:24:31] Going to send back what have done -- stepsTotalG=0
[18:24:31] Work fraction=0.0000 steps=0.
[18:24:35] logfile size=9157 infoLength=9157 edr=0 trr=25
[18:24:35] + Opened results file
[18:24:35] - Writing 9695 bytes of core data to disk...
[18:24:35] Done: 9183 -> 3343 (compressed to 36.4 percent)
[18:24:35]   ... Done.
[18:24:35] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[18:24:37] 
[18:24:37] Folding@home Core Shutdown: UNSTABLE_MACHINE
[18:24:41] CoreStatus = 7A (122)
[18:24:41] Sending work to server
[18:24:41] Project: 6602 (Run 6, Clone 601, Gen 417)


[18:24:41] + Attempting to send results [December 2 18:24:41 UTC]
[18:24:41] Gpu type=2 species=11.
[18:24:41] + Results successfully sent
[18:24:41] Thank you for your contribution to Folding@Home.
[18:24:46] - Preparing to get new work unit...
[18:24:46] Cleaning up work directory
[18:24:46] + Attempting to get work packet
[18:24:46] Gpu type=2 species=11.
[18:24:46] - Connecting to assignment server
[18:24:47] - Successful: assigned to (171.64.65.61).
[18:24:47] + News From Folding@Home: Welcome to Folding@Home
[18:24:47] Loaded queue successfully.
[18:24:47] Gpu type=2 species=11.
[18:24:51] + Closed connections
[18:24:56] 
[18:24:56] + Processing work unit
[18:24:56] Core required: FahCore_11.exe
[18:24:56] Core found.
[18:24:56] Working on queue slot 08 [December 2 18:24:56 UTC]
[18:24:56] + Working ...
[18:24:56] 
[18:24:56] *------------------------------*
[18:24:56] Folding@Home GPU Core
[18:24:56] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[18:24:56] 
[18:24:56] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[18:24:56] Build host: amoeba
[18:24:56] Board Type: Nvidia
[18:24:56] Core      : 
[18:24:56] Preparing to commence simulation
[18:24:56] - Looking at optimizations...
[18:24:56] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[18:24:56] - Created dyn
[18:24:56] - Files status OK
[18:24:56] - Expanded 62149 -> 331840 (decompressed 533.9 percent)
[18:24:56] Called DecompressByteArray: compressed_data_size=62149 data_size=331840, decompressed_data_size=331840 diff=0
[18:24:56] - Digital signature verified
[18:24:56] 
[18:24:56] Project: 10514 (Run 8, Clone 976, Gen 124)
[18:24:56] 
[18:24:56] Assembly optimizations on if available.
[18:24:56] Entering M.D.
[18:25:02] Tpr hash work/wudata_08.tpr:  3748502429 4172367758 2215078451 431336339 1077127076
[18:25:02] 
[18:25:02] Calling fah_main args: 14 usage=100
[18:25:02] 
[18:25:06] Working on Protein
[18:25:16] Client config found, loading data.
[18:25:16] Starting GUI Server
This shows a 66xx unit going back instantly and a 105xx coming in immediately.
This is certainly better than a while back, when I could spend 30 hours waiting on non-runners.
Without knowing what the servers have waiting now, or what may have been done to the client to help, that's all I can report on the download situation.
The 8400 GS is now doing a steady 500+ ppd, which is a ~10% boost on top of the GT240's 4500 ppd.
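As a quick check on that percentage, using only the two figures quoted in this post (500 ppd for the 8400 GS and 4500 ppd for the GT240), with a helper name that is purely illustrative:

```python
def ppd_boost_percent(extra_ppd, base_ppd):
    """Extra points per day expressed as a percentage of the base card's output."""
    return 100.0 * extra_ppd / base_ppd


# Figures taken from the post: 8400 GS at 500 ppd, GT240 at 4500 ppd.
print(round(ppd_boost_percent(500, 4500), 1))  # 11.1
```

So "~10%" is a slight undersell; it works out to a little over 11%.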