New Assignment Server feedback/problem

Moderators: Site Moderators, FAHC Science Team

Gary480six
Posts: 93
Joined: Mon Jan 21, 2008 6:42 pm

Re: New Assignment Server feedback/problem

Post by Gary480six »

VijayPande wrote:ok, we'll set 9406, 13000 and 13001 to be beta only for Maxwell. That should keep them away from the Maxwell GPUs and still give us beta team feedback for what's going wrong w/those WUs.
Dr. Pande,

Can you also move the P10467 and P10469 core 17 work units back to beta only for Maxwell?

They are failing on my GTX 750Ti cards with the same 'Force RMSE error' that was crashing the P13000 and P13001 work.
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

Confirmed for 10467, 10468 and 10469.
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: New Assignment Server feedback/problem

Post by Nathan_P »

Kjetil wrote:Okay, running 78xx and 9202 now. Thanks.
Edit: Can you fix hfm.net too? It shows 0 points and core unknown on 78xx.
HFM is a 3rd party app and as such is unsupported by PG. Which version are you running? The latest version is 0.9.2. If you still have problems or have already upgraded, I would suggest contacting the developer, harlam357.
Kjetil
Posts: 175
Joined: Sat Apr 14, 2012 5:56 pm
Location: Stavanger Norway

Re: New Assignment Server feedback/problem

Post by Kjetil »

HFM.Net is not the problem (PS: I'm running 0.9.2). The problem is with http://fah-web.stanford.edu/psummaryC.html
Gary480six
Posts: 93
Joined: Mon Jan 21, 2008 6:42 pm

Re: New Assignment Server feedback/problem

Post by Gary480six »

Just a quick anecdotal update.

As of 1:30 PM Eastern, I'm now getting core 15 work on my GTX 750Ti cards rather than the failed core 17 work I was getting.

Version 7.4.4 client on Windows 7 systems with the GPU slot set to normal and the 340.xx video drivers.

I was issued a P7621, a P7622 and a P7623 work unit - and all seem to be Folding normally.

Still wish I could get back to the successful core 17 Folding production I had two weeks ago, but at least now I can be of some help to the science.
Tim_H
Posts: 14
Joined: Wed Sep 01, 2010 10:06 pm
Hardware configuration: *FAH001
Asus M2N-SLI Deluxe
Phenom II X4 940 BE at 3.5GHz
2x GTS 250
4x 1GB Patriot DDR2 800

*FAH002
MSI P7N Platinum
Xeon X3330
2x 9600GT
1x 8800GS
2x 2GB Kingston DDR2 1066

*Ogre
Gigabyte (model)
Core 2 Quad q9400
2x 2GB Gskill DDR2 800
Gts 250
Location: Regina, Canada

Re: New Assignment Server feedback/problem

Post by Tim_H »

Hi, I've been having some issues getting work since last night. I have restarted the machine and tried removing all flags, with no change.
I keep seeing "Empty work server assignment". I can ping the server, but I get no work.

Any help would be appreciated.

Code:

18:00:02:<config>
18:00:02:  <!-- Folding Core -->
18:00:02:  <core-priority v='low'/>
18:00:02:
18:00:02:  <!-- Logging -->
18:00:02:  <verbosity v='5'/>
18:00:02:
18:00:02:  <!-- Network -->
18:00:02:  <proxy v=':8080'/>
18:00:02:
18:00:02:  <!-- Remote Command Server -->
18:00:02:  <password v='*******'/>
18:00:02:
18:00:02:  <!-- User Information -->
18:00:02:  <passkey v='********************************'/>
18:00:02:  <team v='37412'/>
18:00:02:  <user v='Tim_H'/>
18:00:02:
18:00:02:  <!-- Folding Slots -->
18:00:02:  <slot id='0' type='CPU'>
18:00:02:    <client-type v='bigadv'/>
18:00:02:    <cpus v='48'/>
18:00:02:    <max-packet-size v='big'/>
18:00:02:  </slot>
18:00:02:</config>
18:01:19:WU00:FS00:Connecting to 171.67.108.200:8080
18:01:20:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': Empty work server assignment
18:01:20:WU00:FS00:Connecting to 171.64.65.121:80
18:02:23:WARNING:WU00:FS00:Failed to get assignment from '171.64.65.121:80': Failed to connect to 171.64.65.121:80: Connection timed out
18:02:23:ERROR:WU00:FS00:Exception: Could not get an assignment
18:05:34:WU00:FS00:Connecting to 171.67.108.200:8080
18:05:34:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': Empty work server assignment
18:05:34:WU00:FS00:Connecting to 171.64.65.121:80
18:06:37:WARNING:WU00:FS00:Failed to get assignment from '171.64.65.121:80': Failed to connect to 171.64.65.121:80: Connection timed out
18:06:37:ERROR:WU00:FS00:Exception: Could not get an assignment
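
For anyone who wants to tally these warnings across a longer log, here is a minimal Python sketch. It assumes the WARNING lines always have the layout seen in the log above; the pattern and the function name `summarize_failures` are illustrative and not part of the FAH client.

```python
import re
from collections import Counter

# Pattern for the "Failed to get assignment" warnings shown in the log above.
# Assumption: the message layout is exactly as it appears in this post.
FAILED = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}):WARNING:WU\d+:FS\d+:"
    r"Failed to get assignment from '(?P<server>[^']+)': (?P<reason>.+)$"
)


def summarize_failures(log_text):
    """Count (server, reason) pairs among failed-assignment warnings."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED.match(line.strip())
        if match:
            counts[(match["server"], match["reason"])] += 1
    return counts


# A few lines copied from the log above.
SAMPLE = """\
18:01:19:WU00:FS00:Connecting to 171.67.108.200:8080
18:01:20:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': Empty work server assignment
18:01:20:WU00:FS00:Connecting to 171.64.65.121:80
18:02:23:WARNING:WU00:FS00:Failed to get assignment from '171.64.65.121:80': Failed to connect to 171.64.65.121:80: Connection timed out
18:02:23:ERROR:WU00:FS00:Exception: Could not get an assignment
"""

if __name__ == "__main__":
    for (server, reason), n in summarize_failures(SAMPLE).items():
        print(f"{server}: {reason} (x{n})")
```

Grouping by server and reason makes it easier to see that the two servers fail in different ways: one answers with an empty assignment, the other times out.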
kimben777
Posts: 23
Joined: Mon Apr 07, 2014 1:21 pm
Hardware configuration: 1) 1090t @3.6ghz, corsair 850w psu, 16gb gskill, asus m4a89td pro mb, asus 780 ti
2) fx-8350 @4ghz, cooler master 1000w psu, 8gb gskill, asus m5a99fx pro mb, two asus 780 ti's
3) am3 x4 @3.4ghz, rosewill 650w psu, asus M5A78L-M/USB3 mb, 8gb gskill, asus 780 ti

Re: New Assignment Server feedback/problem

Post by kimben777 »

VijayPande wrote:ok, we'll set 9406, 13000 and 13001 to be beta only for Maxwell. That should keep them away from the Maxwell GPUs and still give us beta team feedback for what's going wrong w/those WUs.
I have picked up core 15 on all four of my 780 Tis with the advanced flag. I thought things were going to stay the same for Kepler. I'll switch to the beta flag now and see what happens.
johnim
Posts: 2
Joined: Thu Feb 06, 2014 7:33 pm

Re: New Assignment Server feedback/problem

Post by johnim »

Hi, thanks. Whatever you did, the 970s are folding on beta now. I'm using the 344.11 drivers.

Last edited by johnim on Mon Oct 06, 2014 6:56 pm, edited 1 time in total.
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

@johnim that's good news, but if you run beta you may get one of the problematic WUs unless they have been completely blocked for Maxwell users.

FYI, 10466-10469 should no longer be assigned on Maxwell at least in fah and advanced:

viewtopic.php?f=66&t=26405&start=30

Let's hope that whatever c17 projects are left are OK. I guess it's normal to start getting c15 units too now that the selection pool has been reduced.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: New Assignment Server feedback/problem

Post by bruce »

Gary480six wrote: Still wish I could get back to the successful core 17 Folding production I had two weeks ago, but at least now I can be of some help to the science.
We all wish that.

Core_17 uses some features which were not needed by Cores 15/16. Given a choice between WUs that do fold and WUs that do not, you're probably stuck with cores 15/16 for a while. There seem to be strong reasons to blame the drivers for not supporting the features they're supposed to support, and nobody has identified drivers which work. Even if somebody says "I'm successfully using drivers xxx.xx," results seem to depend on which class of GPU you're running, perhaps on which version of the OS you're running, and possibly on other factors.

At this point, there is no reason to believe that we're looking at a bug in the Assignment Server code. Rather, as changes were made to minimize the impact of the real AS bugs, attention was focused on problems that were already present but which had remained "under the radar."

Unfortunately, when dealing with bugs which might be in the drivers, which might be in the FahCore, or which might be in the latest hardware, it takes time to isolate each problem (since there may be several) and test the validity of the fix. If all of the bugs are in the FahCore (which I doubt) then Stanford can fix them. If some are in the drivers or the hardware (which I suspect), all Stanford can do is (A) wait for NV/ATI to create drivers which bypass the defective sections of the hardware, (B) wait for NV/ATI to fix their drivers, or (C) rewrite segments of the FahCore to bypass the errors encountered when the drivers fail to produce the correct results.

Note that "(A)" presumes that issuing a recall for defective hardware is not on the list, even if the automakers have sometimes used that option.
Biffa
Posts: 69
Joined: Sun Nov 16, 2008 11:40 pm
Hardware configuration: RTX2080Ti
Threadripper 1950X
Location: UK

Re: New Assignment Server feedback/problem

Post by Biffa »

VijayPande wrote:ok, we'll set 9406, 13000 and 13001 to be beta only for Maxwell. That should keep them away from the Maxwell GPUs and still give us beta team feedback for what's going wrong w/those WUs.
22:57:47:WU00:FS01:0x17:Project: 7814 (Run 57, Clone 0, Gen 14)

Running fine on Maxwell GTX970 now :)

Stock clocks (1076/1753), Win 8.1, Driver 344.16
Gary480six
Posts: 93
Joined: Mon Jan 21, 2008 6:42 pm

Re: New Assignment Server feedback/problem

Post by Gary480six »

More data for the diagnostic team:

My GTX 750Ti cards have all picked up P9201 core 17 work units. So some core 17 work is still available to my Maxwell cards - and the work does complete!


If it helps, in each case it is running on version 0.0.52 of core 17.


The configuration on all three systems is the same - version 7.4.4 client on Windows 7 systems with the GPU slot set to normal and the 340.52 video drivers. The GPUs are at stock clocks and I am not SMP Folding on these systems.

Also, two of the systems are Windows 7 Ultimate 64-bit and one is Windows 7 Professional 32-bit - yet they all seemed to be reacting exactly the same to the failed P13001 and P10467 series of core 17 work.


The only other factor I can think of, from the donor's side of the equation, is that perhaps a Windows Update fouled things up for some of the 'core 17 on Maxwell' Folding?

It's been about two weeks since I hit my peak Folding day. I looked at one of those boxes, and in that time I have installed about a dozen Windows updates and about 15 updates to Microsoft Security Essentials. (Yeah, I know, but it is just a Folding box.)

If the Pande Group wants to pursue updates as being the problem, I could probably cut and paste a list of those updates.
Biffa
Posts: 69
Joined: Sun Nov 16, 2008 11:40 pm
Hardware configuration: RTX2080Ti
Threadripper 1950X
Location: UK

Re: New Assignment Server feedback/problem

Post by Biffa »

It's not updates. It was an issue with some projects on Maxwell.

I've managed to complete the following projects on my 970:

22:57:47:WU00:FS01:0x17:Project: 7814 (Run 57, Clone 0, Gen 14)
2:51:16:WU00:FS01:0x17:Project: 9202 (Run 110, Clone 2, Gen 215)
21:47:39:WU00:FS01:0x18:Project: 10473 (Run 0, Clone 194, Gen 44)

But not getting any of the ones that were causing problems.
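
If it helps anyone collecting similar reports, log lines like the three above can be broken into their fields with a short Python sketch. The `WorkUnit` type and `parse_wu` function are hypothetical names for illustration, and the pattern assumes the "Project: N (Run R, Clone C, Gen G)" layout seen in this thread.

```python
import re
from typing import NamedTuple, Optional


class WorkUnit(NamedTuple):
    core: str      # core identifier as logged, e.g. "0x17"
    project: int
    run: int
    clone: int
    gen: int


# Assumption: the line layout matches the log excerpts quoted in this thread.
WU_LINE = re.compile(
    r"(0x[0-9a-fA-F]+):Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)


def parse_wu(line: str) -> Optional[WorkUnit]:
    """Return the work unit described by a log line, or None if it has none."""
    match = WU_LINE.search(line)
    if match is None:
        return None
    core, *numbers = match.groups()
    return WorkUnit(core, *map(int, numbers))


if __name__ == "__main__":
    line = "22:57:47:WU00:FS01:0x17:Project: 7814 (Run 57, Clone 0, Gen 14)"
    print(parse_wu(line))
```

Using `search` rather than `match` means the timestamp prefix does not have to be well formed for the project fields to be picked up.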
runpaint
Posts: 42
Joined: Sun Aug 24, 2014 3:58 am

Re: New Assignment Server feedback/problem

Post by runpaint »

I'm running all GTX 750 and 750 Ti cards. I haven't changed anything or updated the drivers, and 8 of them failed this week for the first time. I just restarted another one today (removed the GPU slot and added it back). I think it was working yesterday, but it might have been sitting there doing nothing for 2 or 3 days. They all run 24 hours a day, so I don't always remember to check each one daily. I went from 500,000 PPD to a little over 200,000, although that's partly because of all the core 15s.
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

runpaint wrote:I'm running all GTX 750 and 750 Ti cards. I haven't changed anything or updated the drivers, and 8 of them failed this week for the first time. I just restarted another one today (removed the GPU slot and added it back). I think it was working yesterday, but it might have been sitting there doing nothing for 2 or 3 days. They all run 24 hours a day, so I don't always remember to check each one daily. I went from 500,000 PPD to a little over 200,000, although that's partly because of all the core 15s.
Which projects failed? If they were WUs from 13000, 13001, 10467, 10468 or 10469, please dump them and you'll pick up working ones. If they were from other projects, please report them.
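
As a rough illustration of that rule of thumb, here is a tiny Python sketch. The project list is just the one named in this post and may be incomplete or out of date; `advice_for_failed_wu` is a made-up helper, not a client feature.

```python
# Projects named in this post as failing on Maxwell.
# Assumption: this list is only what the thread reports, not an official one.
PROBLEM_PROJECTS = {13000, 13001, 10467, 10468, 10469}


def advice_for_failed_wu(project: int) -> str:
    """Suggest what to do with a failed WU from the given project."""
    if project in PROBLEM_PROJECTS:
        return "dump"    # known-bad on Maxwell: dump it and fetch another
    return "report"      # not on the list: report the failure on the forum


if __name__ == "__main__":
    for project in (13001, 9201):
        print(project, advice_for_failed_wu(project))
```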