Re: No appropriate work server was available [ATI]

Posted: Fri Jul 02, 2010 4:54 pm
by bruce
Jacko36 wrote:You really have to wonder: ATI cards that support OpenCL have been around for a couple of years now, and other distributed computing groups report a linear increase in performance with them, yet no effort has been made here to support them.

Then, the new gpu client which ONLY supports nvidia cards is released which just so happens to coincide with the release of the new nvidia cards.

And now all of a sudden there's no wu available for ATI cards.

Maybe money does speak the loudest after all.

This is just how i see it, maybe i'm wrong.
Welcome to the foldingforum, Jacko36.

Unfortunately, you're confusing two unrelated issues with each other.

1) I've been trying to find out why few WUs are available and get that problem fixed but I'm not having much success. When new WUs are provided, they disappear rapidly and I have no explanation.

2) There are several topics on this and other forums about OpenCL for FAH. While OpenCL may work for simple tasks, it still doesn't work well enough for FAH. It is poorly optimized and would run so much slower than the existing Brook-CAL version of FahCore_11 that nobody would want it. ATI, the Pande Group, and the OpenMM folks all want a good OpenCL core as much as or more than you do. They have been working hard to make OpenCL work for FAH, but at this point it's still not something that you'd want.

In some respects, you're right about the money -- but don't level that accusation at Stanford or at FAH. NVidia invested in CUDA; ATI did not. Both are investing in OpenCL.

It should be noted that the nVidia core doesn't work for OpenCL either. Not surprisingly, the FAH side of OpenMM that will eventually interface with OpenCL is quite similar to the FAH side of an interface with CUDA, and CUDA is well optimized for nVidia so making the new core work through CUDA is a reasonably small step in the right direction.

If ATI supported CUDA, I'm sure that there would be a new FahCore version that would work with it rather quickly, but I doubt that's going to happen. It has nothing to do with the hardware itself, but rather with the investment that nVidia made in developing the proprietary CUDA interface. In comparison, ATI's CAL and CTM require the development of much more external software, which is where Brook came in.

ATI can choose to license CUDA from nVidia (at an extremely high price, I'm sure) or the OpenCL folks can develop a version that approaches the efficiency of CUDA, but I don't expect either one very soon. At the present time, ATI is a strong competitor for gaming but is increasingly falling behind in the areas that include stream computing. Hopefully there will be some important developments in that area, but I don't know what they will be or when.

ATI work units

Posted: Fri Jul 02, 2010 5:04 pm
by Brucifer
So are we going to get ATI work units for either gpu2 or gpu3 or is this just resolving down to a cuda work unit only thing?

Thank you

Re: No appropriate work server was available [ATI]

Posted: Fri Jul 02, 2010 5:58 pm
by bruce
Brucifer wrote:So are we going to get ATI work units for either gpu2 or gpu3 or is this just resolving down to a cuda work unit only thing?

Thank you
Welcome to the foldingforum, Brucifer.

I've merged your question into a topic on the same subject. I believe you'll see most of your questions answered above.

Re: No appropriate work server was available [ATI]

Posted: Fri Jul 02, 2010 8:25 pm
by Tynat
This is kind of confusing. This thread appears to have been moved to the ATI forum, then moved back and in the process the "View first unread post" link got reset. I take it that the ATI forum needed a "shortcut"? :?
bruce wrote:1) I've been trying to find out why few WUs are available and get that problem fixed but I'm not having much success. When new WUs are provided, they disappear rapidly and I have no explanation.
Perhaps it has to do with all the starved ATI clients out there in the world taking a big gulp of the available WUs? Here, it takes a short time to finish a WU, and getting a new WU seems to have greatly improved; it's currently down to 1-3 attempts. Hopefully this will continue and improve.

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 3:39 am
by pfv
Wouldn't it be nice if Stanford were to post a message on the official news page saying "sorry, we are experiencing a shortage/issue with ATI WUs, we are working on it and will keep you updated every day"?
It looks like the ATI WU shortage/not available issue has been going on for over a week.
It appears that every time someone's client stops working (for one reason or another, such as the ATI WU shortage), many people spend cycles trying to troubleshoot (involving even more people once they post the issue), wondering whether it is their own rig, searching the forum threads for hints, etc. It's a waste of time for many people. A simple message from the Stanford team in charge of these units/clients would be very considerate.

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 5:00 am
by DBoone
I agree! It's gotten to the point that I don't know where to look anymore. Is it on the Stanford news page? In the announcements section of this forum? In one of the threads on the forum?

PG should pick a spot and use it!

OK, back to folding now....

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 7:59 am
by Tynat
My former ISP does that and they do it right on their front page for everyone to see. From network outages to server issues, whatever it is, you knew where to go in order to find out what's going on. When the issue was resolved, again they would post what happened. I think they mirror it on their TwitFace pages, too. Oh, and it's a former ISP only because I moved and they do not cover the new area.

Anyway, on to the current issue with getting a WU.

Since the last completed WU for the HD5870's GPU client, the following has happened:

After 1 attempt, received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE.
Received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE.
Received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE.
Received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE.
Received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE, EUE limit exceeded, restarted client.
Received Project: 5747 (Run 3, Clone 23, Gen 197), UNSTABLE_MACHINE.
After 4 attempts, received Project: 5747 (Run 3, Clone 16, Gen 1280), UNSTABLE_MACHINE.
After 2 attempts, received Project: 5733 (Run 0, Clone 72, Gen 259) and 5870m is back in business (for now).

It appears that the time waiting for a new WU has improved, but there seem to be more UNSTABLE_MACHINE errors than in the recent past.

Project: 5747 (Run 3, Clone 23, Gen 197) was reported.
Project: 5747 (Run 3, Clone 16, Gen 1280) was reported.

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 3:02 pm
by Racer43
I would tend to agree with you, Tynat. I have not had an UNSTABLE_MACHINE for a long time until last night; mine was 5744. Restarted the client and got a WU; now waiting again. I'm glad I'm not in Dr. Pande's shoes; monies in California for higher learning have been severely curtailed in the past 2 years, thus slowing down what the Dr and his group need to expand and improve. And Guv Terminator now is taking it out on the state employees by threatening to move all he can to minimum wage as leverage against the legislature for not getting a budget passed, and university employees are state workers. Talk about fun times for him. And we are on attempt #9. Anyone got a cheap GTX to sell me........LOL

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 5:01 pm
by toTOW
I've seen a lot of reports from ATI users about bad WUs recently ... I've sent an email to Pande Group.

Re: No appropriate work server was available [ATI]

Posted: Sat Jul 03, 2010 9:59 pm
by Tynat
Since the last completed WU for the HD3850's GPU client, the following has happened:

After 1 attempt, received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE.
Received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE.
Received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE.
Received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE.
Received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE, EUE limit exceeded, 5 hours later restarted client.
Received Project: 5745 (Run 2, Clone 13, Gen 153), UNSTABLE_MACHINE.
After 1 attempt, received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE.
Received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE.
Received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE.
Received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE, EUE limit exceeded, restarted client.
Received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE.
Received Project: 5743 (Run 4, Clone 22, Gen 199), UNSTABLE_MACHINE.
After 1 attempt, received Project: 5737 (Run 1, Clone 153, Gen 51) and 3850 is back in business (for now).

Since the last completed WU for the HD5870m's GPU client, the following has happened:

After 1 attempt, received Project: 5746 (Run 3, Clone 77, Gen 312), UNSTABLE_MACHINE.
Received Project: 5746 (Run 3, Clone 77, Gen 312), UNSTABLE_MACHINE.
Received Project: 5746 (Run 3, Clone 77, Gen 312), UNSTABLE_MACHINE.
Received Project: 5746 (Run 3, Clone 77, Gen 312), UNSTABLE_MACHINE.
Received Project: 5746 (Run 3, Clone 77, Gen 312), UNSTABLE_MACHINE, EUE limit exceeded, 2½ hours later restarted client.
After 3 attempts, received Project: 5732 (Run 3, Clone 269, Gen 0) and 5870m is back in business (for now).

Project: 5745 (Run 2, Clone 13, Gen 153) was reported.
Project: 5743 (Run 4, Clone 22, Gen 199) was already reported as bad on 2010-07-03 (Sat), 9:51 am.
Project: 5746 (Run 3, Clone 77, Gen 312) was already reported as bad on 2010-07-02 (Fri), 6:29 pm.

Note: Project: 5746 (Run 3, Clone 77, Gen 312) was first reported as bad on 2009-10-13 (Tue), 10:18 pm and again on the date and time above.
toTOW wrote:I've seen a lot of reports from ATI users about bad WUs recently ... I've sent an email to Pande Group.
Thanks toTOW. Had a rather uneventful run for quite some time up until these past couple of days. Things like this act as a reminder of how much money and wear on the equipment is involved. Sure would like to get back to set it and forget it. :lol:

Re: No appropriate work server was available [ATI]

Posted: Sun Jul 04, 2010 12:42 am
by DBoone
I'm taking the opportunity to determine whether SMP folding alone is more productive (points-wise) than SMP + GPU.

Do you guys need more info on which WUs are resulting in UNSTABLE_MACHINE errors?

Re: No appropriate work server was available [ATI]

Posted: Sun Jul 04, 2010 6:08 am
by Sahkolihaa
Project 5744 (Run 3, Clone 77, Gen 113) gave me 'UNSTABLE_MACHINE' on a HD3870.

Re: No appropriate work server was available [ATI]

Posted: Sun Jul 04, 2010 7:41 am
by Athlonite
gee gettin sick of seeing this msg after every single WU

[06:53:48] + Attempting to get work packet
[06:53:48] Gpu type=1 species=4.
[06:53:48] - Connecting to assignment server
[06:53:49] - Successful: assigned to (171.64.65.102).
[06:53:49] + News From Folding@Home: Welcome to Folding@Home
[06:53:50] Loaded queue successfully.
[06:53:50] Gpu type=1 species=4.
[06:53:50] + Could not connect to Work Server
[06:53:50] - Attempt #8 to get work failed, and no other work to do.
Waiting before retry.
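
For anyone who wants to count these failures rather than eyeball them, here is a minimal sketch in Python. It assumes the log lines look exactly like the excerpt above; the function name `failed_attempts` and the regular expression are illustrative and not part of the FAH client.

```python
import re

# Assumption: failed work requests are logged in the form shown in the excerpt,
# e.g. "[06:53:50] - Attempt #8 to get work failed, and no other work to do."
ATTEMPT_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] - Attempt #(\d+) to get work failed")


def failed_attempts(log_text):
    """Return (timestamp, attempt_number) pairs for each failed work request."""
    results = []
    for line in log_text.splitlines():
        match = ATTEMPT_RE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


sample = """\
[06:53:48] + Attempting to get work packet
[06:53:48] - Connecting to assignment server
[06:53:50] + Could not connect to Work Server
[06:53:50] - Attempt #8 to get work failed, and no other work to do.
"""

print(failed_attempts(sample))  # [('06:53:50', 8)]
```

Feeding it a whole log file instead of `sample` would give a quick picture of how often the client went without work.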

Re: No appropriate work server was available [ATI]

Posted: Sun Jul 04, 2010 7:42 am
by VijayPande
We've had some issues with the ATI GPU servers. I think we have a temporary fix which should give out some more WUs, but a more complete fix will likely come this week.

Re: No appropriate work server was available [ATI]

Posted: Sun Jul 04, 2010 7:45 am
by VijayPande
PS Looking at the server, I see there are still many assigns which are failing. This should get better in time, but as I mentioned above, the real fix will come next week (but this should help some donors).