Page 7 of 10

Re: Project 5801 issues.

Posted: Tue Oct 28, 2008 11:47 pm
by VijayPande
We've taken these off line until we can see what's up.

Re: Project 5801 issues.

Posted: Tue Oct 28, 2008 11:48 pm
by theo343

Re: Project 5801 issues.

Posted: Tue Oct 28, 2008 11:48 pm
by Naki
Here is my log:

[22:43:32] Project: 5801 (Run 1, Clone 43, Gen 0)
[22:43:32] Assembly optimizations on if available.
[22:43:32] Entering M.D.
[22:43:38] mdrun_gpu returned 
[22:43:38] Going to send back what have done -- stepsTotalG=0
[22:43:38] Work fraction=0.0000 steps=0.
[22:43:42] logfile size=0 infoLength=0 edr=0 trr=25
[22:43:42] - Writing 637 bytes of core data to disk...
[22:43:42] Done: 125 -> 124 (compressed to 99.2 percent)
[22:43:42]   ... Done.
[22:43:42] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:43:46] CoreStatus = 7A (122)
[22:43:46] Sending work to server
[22:43:46] Project: 5801 (Run 1, Clone 43, Gen 0)
[22:43:46] - Read packet limit of 540015616... Set to 524286976.

Re: Project 5801 issues.

Posted: Tue Oct 28, 2008 11:50 pm
by leexgx
Why does it force me to connect even though I URL-blocked that server? Very annoying. It should go to another server after so many fails.

Scratch that: the URL block on /.11:8080 /.11:80 worked. After failing 4 connections to it, it redirected me to the 5016 server.

I will remove the block from that server when the project is removed 100% from it. Going to sleep now. The server code could do with being tweaked a little to detect when every work unit is failing and stop handing them out, or at least the server that hands out the project server should not keep handing out the same server on every fail.
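
A rough sketch of the idea in the post above, not anything from the actual Folding@home server code: keep a rolling window of recent results per project and pause assignment once every result in the window is a failure, and skip servers a client has just failed on. The names `ProjectHealth`, `pick_server`, `window`, and `max_failure_rate` are all made up for illustration.

```python
from collections import deque


class ProjectHealth:
    """Rolling record of recent work-unit results for one project (hypothetical)."""

    def __init__(self, window=20, max_failure_rate=1.0):
        # True = work unit returned fine, False = work unit failed.
        self.results = deque(maxlen=window)
        self.window = window
        self.max_failure_rate = max_failure_rate

    def record(self, success):
        self.results.append(bool(success))

    def should_assign(self):
        # Keep assigning until the window is full, so a few early
        # failures do not pause a healthy project.
        if len(self.results) < self.window:
            return True
        failure_rate = self.results.count(False) / len(self.results)
        # With the default of 1.0 this pauses only when every recent
        # work unit failed.
        return failure_rate < self.max_failure_rate


def pick_server(servers, recently_failed):
    """Return the first server the client has not just failed on, else None."""
    for server in servers:
        if server not in recently_failed:
            return server
    return None
```

With `window=4`, four failures in a row make `should_assign()` return `False`, and a single success afterwards turns it back on. A lower `max_failure_rate` would pause the project sooner.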

Re: Project 5801 issues.

Posted: Tue Oct 28, 2008 11:58 pm
by toTOW
VijayPande wrote:We've taken these off line until we can see what's up.
Thank you.

I can now go to bed with peace of mind :)

Re: Project 5801 issues.

Posted: Wed Oct 29, 2008 12:01 am
by Saleen219
This is getting ridiculous as far as I'm concerned. Get it fixed already.

Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:06 am
by crosby
Glad these have been taken away - have had to restart 3 clients due to them

Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:07 am
by Iannis
same problem here 2 days now
on 9800GTX :(

Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:07 am
by theo343
Pity I can't reach half of my GPU clients, but I will get to them in the morning.

Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:13 am
by leexgx
The server was handing out those work units a little. I'll remove that block in 24 hrs (or whenever I come back home).

Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:24 am
by VijayPande
Sorry about the really nasty problem on this one. It was definitely strange since these WU's were QA'd before. I think this may be an issue where they were QA'd on an earlier core and 1.15 is causing issues.

Re: Project 5801 issues.

Posted: Wed Oct 29, 2008 12:32 am
by MoneyGuyBK
Welcome to the party toTOW ... I mean what feels like a funeral !!!
toTOW wrote:I feel alone, depressed and helpless :cry:
I have finally stopped getting the 5801s, did not do anything else except a restart, I got all 5506s....
God only knows how much PpD I lost and how much benefit Humanity missed today.


I am surprised that:
1) F@H released this WU in such a bad state :!:
However, more stumped that:
2) F@H has not chimed in here officially after 7 Pages of comments :(

EDIT/Added..... I see VP chimed in on the cause while I was writing.... Thanx VP


Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:42 am
by Insidious
VijayPande wrote:Sorry about the really nasty problem on this one. It was definitely strange since these WU's were QA'd before. I think this may be an issue where they were QA'd on an earlier core and 1.15 is causing issues.
Thanks Dr. Pande,

They are dying on 1.18 too.


Re: Project 5801 issues. [Should be Offline]

Posted: Wed Oct 29, 2008 12:55 am
by harlam357
Well, there lies part of the problem... poor QA. Can you honestly say that not even one p5801 WU was run on the most recent core before deploying them?

I'm a software developer... I won't go into details about the software I develop but suffice it to say an engineering design engine is the meat of the software. What was done here is akin to us developing an updated engine, then not running a single piece of data through it before releasing it out into the wild. Then when it fails we'll just shrug our shoulders and say... "Well, it worked on the previous version."

I understand resources are limited... failures happen... and the software is beta. As long as lessons are learned and processes are improved, then that's all we can ask for.

This recent string of debacles with the GPU2 core and WUs has really cast a shadow on what was, IMO, the best rollout in FAH history.

Re: Project 5801 issues.

Posted: Wed Oct 29, 2008 12:58 am
by VijayPande
PS In case you're curious:
MoneyGuyBK wrote: I am surprised that:
1) F@H released this WU in such a bad state :!:
This was beta tested before (this was a project # change due to a move onto a new server -- which was done to try to keep work around while the CS servers were down).
However, more stumped that:
2) F@H has not chimed in here officially after 7 Pages of comments :(
We keep an eye on the forum, but the first post was just a few hours ago. Due to staff having other responsibilities, our response will typically be on a time scale of hours, not minutes, for issues like this. I wish it could be faster, but that's what we're staffed to do at the moment.