Suggested bug fix: stuck proteins

Moderators: Site Moderators, FAHC Science Team

Nefarious
Posts: 21
Joined: Thu Dec 06, 2007 8:10 pm
Hardware configuration: Moderator for MacLife.com Forums / Folding @ Home section. Approximately 300 CPUs.

Suggested bug fix: stuck proteins

Post by Nefarious »

From time to time, a Folding computer on the team gets stuck. The owner has to delete the protein manually and then restart or re-install Folding.

A stuck protein should be detected automatically after a few days, and Folding should then reinitialize itself.
Moderator of the MacLife forum for Team 18.
MtM
Posts: 1579
Joined: Fri Jun 27, 2008 2:20 pm
Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
Location: The Netherlands

Re: Suggested bug fix: stuck proteins

Post by MtM »

Nefarious wrote: From time to time, a Folding computer on the team gets stuck. The owner has to delete the protein manually and then restart or re-install Folding.

A stuck protein should be detected automatically after a few days, and Folding should then reinitialize itself.
What do you mean by stuck?
  1. The WU EUEs and is downloaded again continuously, resulting in a 24h pause?
  2. The WU EUEs and corrupts the queue file?
  3. Something else entirely?
For 1, this is an assignment server issue, not a client issue, and cannot be fixed with a client update.
For 2, without knowing what caused the corruption, there is no solution that I can see.
If 3, please specify exactly what you mean by stuck.
Nefarious

Re: Suggested bug fix: stuck proteins

Post by Nefarious »

Usually a team member just deletes the files and re-installs without mentioning the exact bug. In one case mentioned recently, the team member mrreet2001 had not noticed the problem for a month.

At the very least, if Folding doesn't produce proteins in a certain number of days and there is no protein that is making progress, it's safe to say that the install or the servers are buggered.
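A minimal sketch of that rule in Python, not anything the client actually does. The function name `looks_stuck`, the two timestamps, and the three-day default are all assumptions made for illustration; how those timestamps would be obtained from a real install is left open.

```python
from datetime import datetime, timedelta
from typing import Optional


def looks_stuck(
    last_completed: Optional[datetime],
    last_progress: Optional[datetime],
    now: datetime,
    max_idle: timedelta = timedelta(days=3),  # hypothetical threshold
) -> bool:
    """Return True when no WU finished AND no WU advanced within max_idle.

    A timestamp of None means "never observed" and is treated as idle,
    so a brand-new install with no history would also be flagged.
    """

    def idle(timestamp: Optional[datetime]) -> bool:
        return timestamp is None or now - timestamp > max_idle

    # Both conditions from the post must hold: nothing produced, nothing progressing.
    return idle(last_completed) and idle(last_progress)
```

A client that finished nothing for a week but is still advancing a long WU would not be flagged, because the second condition fails.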
Moderator of the MacLife forum for Team 18.
MtM

Re: Suggested bug fix: stuck proteins

Post by MtM »

Without more specifics, it's not possible to have any meaningful debate about this.

If a WU that is being processed is past its final deadline, it has already been discarded, so again, without answers to the questions above there isn't much to be said :(

What you say about no WU being processed is not entirely true either: the 24h pause is deliberately in place to get the donor's attention so he can check for likely causes. Only when he does, and presents the information leading up to the problem, can a possible solution be looked for.

I would urge you to ask your team members not to delete those files but to post here with an explanation of the issue and the events leading up to it, so there can be a community effort to get them fixed. It's rather pointless to complain about an issue without presenting exact data on what it entails, as that can never lead to a viable solution, don't you agree?

For mrreet2001 ( nice name if you're Dutch like me :lol: ) an issue lasting a month sounds like a queue corruption/upload issue but without more data it's impossible to say :(
Pick2
Posts: 85
Joined: Fri Feb 13, 2009 12:38 pm
Hardware configuration: Linux & CPUs
Location: USA

Re: Suggested bug fix: stuck proteins

Post by Pick2 »

There was something about a "restart if hung" script here:
http://www.techreport.com/forums/viewto ... =9&t=63291
That thread also mentions notfred's "Diskless folding programs (based on linux)", which are here (you'll have to page down or search for "notfred"):
viewtopic.php?f=14&t=52
I believe notfred incorporates a "restart if hung" script in his programs. Recommended!
HTH
MtM

Re: Suggested bug fix: stuck proteins

Post by MtM »

He's a moderator on a Mac forum. Sure, VMs run on Macs, but it's not really that relevant.
Pick2

Re: Suggested bug fix: stuck proteins

Post by Pick2 »

The first link to the script will run on Linux or a Mac.
notfred's "Diskless folding" will run in a VM, from a USB flash drive, from a CD, or with PXE booting. It has a "restart if hung" script, which I have run separately, which will run on OS X or Linux and should do what the OP wants. Granted, it's a workaround, but it's all we have till Stanford comes up with a fix.
HTH
(BTW, I'm also a mod on another Mac forum, but I only fold with Linux nowadays :) )
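For readers who have not seen such a workaround, here is a rough Python sketch of the general "restart if hung" idea. It is not notfred's script or the one in the linked thread, whose contents are not shown here. It assumes the client keeps writing to a log file while healthy; `log_path`, `restart_command`, and the idle limit are hypothetical values you would supply yourself.

```python
import os
import subprocess
import time


def restart_if_hung(log_path, max_idle_seconds, restart_command,
                    now=None, runner=subprocess.run):
    """Run restart_command if log_path has not been modified recently.

    Returns True when a restart was triggered, False otherwise.
    A missing log is treated as "cannot tell" and nothing is done.
    """
    now = time.time() if now is None else now
    try:
        idle_seconds = now - os.path.getmtime(log_path)
    except OSError:
        return False  # no log to judge by, so leave the client alone
    if idle_seconds <= max_idle_seconds:
        return False  # log was written recently; assume the client is alive
    # Assumption: restart_command is whatever restarts the client on your system.
    runner(restart_command, check=False)
    return True
```

Run from cron or a similar scheduler, a check like this only treats the symptom. Note that the idle limit would need to be well above 24 hours, or the deliberate 24h pause mentioned earlier in the thread would be mistaken for a hang.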
MtM

Re: Suggested bug fix: stuck proteins

Post by MtM »

Hey, I didn't know the restart script ran on OS X. Sorry, my bad, and thanks for the correction :oops:

That's a good suggestion, thanks :biggrin: Hope I can remember it if I come across a similar post in the future!