Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 12:24 pm
by JimboPalmer
Once you remove the GPU slot, you can set the CPU slot to 8

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 2:58 pm
by bruce
Right.

Until you upgrade your GPU or they fix that server, you can manually set your CPU for 8 processors rather than 7. Just remember to put it back if you start folding with a GPU again. The change from 7 to 8 will increase your CPU production very slightly, but probably the most significant reason for doing it is that some CPU projects are specifically excluded from 7.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 5:20 pm
by barebear
Hi Mike -- I removed the GPU slot -- not ready to invest in an upgrade

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 6:25 pm
by Napoleon
VijayPande wrote: We've been working to back up that machine and get all of the data off. I think the RAID is about to go and we'll retire that server. With it, that probably means the retirement of core11.
Would it be OK to force pre-Fermi GPUs to do core15 work?

I know core15 is on the way out as well, but at least the folding lifespan of older GPUs could be extended a bit. I won't go into details at this point, but I know it can be done and apparently quite a few pre-Fermi GPUs could fold core15 WUs. Apparently being the operative word... one wouldn't want to produce subtly corrupted results.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 7:46 pm
by 7im
No need to force anything.

As I recall, core_11 and core_15 had the same level of hardware requirements on NV cards. Series 8xxx and above.

And if so, then it comes down to whether any core_15 WUs are available, and whether any AS is configured to assign them to this lower-end hardware.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 8:04 pm
by bruce
Core 11 comes in two species. One is for CUDA and one is for CAL. Core15 is for CUDA but NV has gone through several versions of CUDA and they don't always play nicely together.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Thu Jul 31, 2014 8:20 pm
by Joe_H
As I recall, this was tried over a year ago. The Core_15 projects worked on some pre-Fermi cards, but not others. Which ones worked did not seem to follow any detectable pattern. The ones that failed usually did so with memtest errors; there was a long thread about that, as I recall.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Fri Aug 01, 2014 12:43 am
by Luscious
I understand the need to upgrade hardware. I understand that new GPUs are more efficient. And I also understand why Core 11 is being placed on the back burner.

But I think pulling the plug on Core 11 is premature considering all that has happened is a server outage. Is it really that difficult to get it back up and running on another box? And if it is running in a server environment, why isn't there any redundancy in cases like this when a certain server does go down?

I've had a pair of 9800GTS cards folding 24/7 for close to 5 years without a hiccup. Many other users out there also have perfectly working GPUs that are now shelved.

In my case I buy my hardware for the long term, keeping it for 5-6 years. If my best "insurance" for F@H is to invest in Maxwell architecture, I would skip the 750/750Ti and buy an upcoming 800 series card, but these aren't out yet, and will probably only hit shelves in the holiday season at the earliest.

So yes, I do intend to upgrade, but until I can, shouldn't Core 11 be kept running for a few more months? At least until the Maxwell stuff becomes mainstream and users like myself have a chance to lock into the newest architecture.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Fri Aug 01, 2014 2:47 am
by bruce
FAH has several forms of redundancy.

Data is RAIDed, but even RAID has an occasional failure. Maybe that's what happened here. (I don't know which RAID level is in use.)

Redundancy is also provided by having multiple servers, each with its own set of project(s), and if you can't get work from one of them, you can be assigned something from another server. In this case, most projects for Core_11 have finished and new projects use newer analysis methods (see Dr. Pande's comment above), so there aren't enough Core_11 projects to need more than one server ... hence a single point of failure, unless you have a newer GPU. :(

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Sat Aug 02, 2014 2:55 pm
by toTOW
I think the time has come to find a suitable BOINC project for my two 9800 GTX+ until I am able to replace them with high-end Maxwell-based GPUs and return to FAH ... I think it might be hard to find something really useful for such old GPUs :(

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Sat Aug 02, 2014 4:47 pm
by 7im
I did the same thing with an old P3 a few years ago. Took it off FAH and put it on a non-time-sensitive project. But I don't promote or name them here.

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Sat Aug 02, 2014 6:44 pm
by Ghetto_Child
If FahCore_11 is being retired, what will be the replacement for pre-Fermi GPUs like a GTX 295? Completing two GPU WUs per hour sounds acceptably efficient to me. How fast is the average GPU WU completed?

Re: 171.67.108.11 & 171.67.108.21 down

Posted: Sat Aug 02, 2014 6:56 pm
by toTOW
Unfortunately, nothing will replace Core 11 ... and pre-Fermi GPUs will just remain unsupported.

You'll have to set up a plan to replace them if you want to contribute to GPU folding again ... :(

Issues with 171.67.108.201 and 171.64.65.160

Posted: Mon Aug 04, 2014 3:09 pm
by katakaio
Hey folks,

Woke up this morning and found my CPU slot on 7.4.4 happily chugging away while the GPU slot threw the following errors:

Code:

15:04:47:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Failed to connect to 171.67.108.201:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
15:04:47:WU01:FS02:Connecting to 171.64.65.160:80
15:04:48:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
15:04:48:ERROR:WU01:FS02:Exception: Could not get an assignment
I pinged 171.67.108.201 with no success, although I could ping 171.64.65.160. I checked the server status page for the first time and was surprised to find neither server listed there. I'm definitely a rookie when it comes to investigating server issues, but I have done all the usual firewall/router/modem troubleshooting. Any ideas?

Thanks!
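
For anyone who wants to pull the failing servers out of a longer log, here is a minimal Python sketch. It assumes every line follows the layout seen in the lines above (time, an optional WARNING/ERROR level, a WU id, a slot id, then the message). The names `parse_line` and `failed_servers`, and the "INFO" label for lines without a level, are made up for illustration and are not part of the FAH client.

```python
import re

# Layout assumed from the log lines quoted above:
#   HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:message
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>WARNING|ERROR):)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Lines without an explicit level are labelled "INFO" here (our own label).
    fields["level"] = fields["level"] or "INFO"
    return fields


def failed_servers(lines):
    """Return the host:port of every 'Failed to get assignment' message, in order."""
    servers = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        found = re.search(r"Failed to get assignment from '([^']+)'", entry["message"])
        if found:
            servers.append(found.group(1))
    return servers


if __name__ == "__main__":
    sample = [
        "15:04:47:WU01:FS02:Connecting to 171.64.65.160:80",
        "15:04:48:ERROR:WU01:FS02:Exception: Could not get an assignment",
    ]
    for raw in sample:
        print(parse_line(raw))
```

Passing the four lines from the post to `failed_servers` should list both addresses, which makes it easier to compare them against the server status page.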

Re: Issues with 171.67.108.201 and 171.64.65.160

Posted: Mon Aug 04, 2014 4:41 pm
by Joe_H
katakaio wrote:

Code:

15:04:47:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Failed to connect to 171.67.108.201:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
15:04:47:WU01:FS02:Connecting to 171.64.65.160:80
15:04:48:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
15:04:48:ERROR:WU01:FS02:Exception: Could not get an assignment
I pinged 171.67.108.201 with no success, although I could ping 171.64.65.160. I checked the server status page for the first time and was surprised to find neither server listed there. I'm definitely a rookie when it comes to investigating server issues, but I have done all the usual firewall/router/modem troubleshooting. Any ideas?

Thanks!
The two servers you tried to check are assignment servers; a recent change to the Server Status page dropped them from what gets displayed. I have merged your post with others about the same problem: the Work Servers with assignments for older nVidia GPUs went down recently, and it appears they may not return.
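
Note that the two failures in the quoted log are different in kind: the first attempt got no response within the time limit, while the second was actively refused, which fits a host that answers ping but has nothing listening on port 80. Ping alone cannot tell these apart, so a TCP connection test is more informative. Below is a minimal Python sketch of such a test; the function name `probe` and its return labels are made up for illustration, and the 5-second timeout is an arbitrary choice.

```python
import socket


def probe(host, port=80, timeout=5.0):
    """Try one TCP connection and say how it ended.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection performs the TCP handshake; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on that port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all within the time limit.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host or a name lookup failure.
        return f"error: {exc}"


if __name__ == "__main__":
    # The two assignment servers named in the log above.
    for address in ("171.67.108.201", "171.64.65.160"):
        print(address, probe(address))
```

A "refused" result points at the server side (the service is not running), whereas a "timeout" could also come from a firewall or router on your end, so that is the case where local troubleshooting is still worth doing.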