Let me preface this by saying this is not a whine or blame thread. I am merely looking for a solution.
Problem: my daily credited points are about 10,000 lower than they apparently should be, as measured by FAHMon.
Considerations: I've been folding since the beginning. Currently I'm running a total of 20 clients, a mix of GPU2 Nvidia and quad-core SMP.
Last week I dismantled one of my platforms, selling most of the parts, but moving its dual-GPU card (9800GX2) to another computer with an open PCI-e slot. So the net loss in production should have been only that of the CPU that is no longer in my Folding array, a Q6600 that was producing 3000-4000 PPD. Instead, I am now down about 12,000-13,000 PPD from where I was before. Actual credited production is consistently 10,000 PPD or more lower than the production rates FAHMon shows. I have observed this disparity between estimated and credited production for a week now.
Troubleshooting and analysis:
1. I have checked the configuration of each client, ensuring that the user name and team number are correct for each. They are correct.
2. I have scoured the clients' logs for botched work units. Yes, there have been a number of failed Nvidia 575x work units, but only enough to account for maybe 1500 PPD, certainly not 10,000.
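For reference, here is a rough tally of the figures above as a small Python sketch. Taking the midpoint of each quoted range is an assumption made only for illustration, and `unexplained_ppd` is a hypothetical helper name, not part of any Folding@home tool.

```python
# Figures quoted in this post; midpoints stand in for the quoted ranges
# (an assumption for illustration only).
lost_cpu_ppd = (3000 + 4000) / 2           # Q6600 removed from the farm
observed_drop_ppd = (12000 + 13000) / 2    # drop in credited PPD since last week
failed_wu_ppd = 1500                       # estimated loss from failed 575x work units


def unexplained_ppd(observed_drop, *known_losses):
    """Return the part of the observed drop not covered by known losses."""
    return observed_drop - sum(known_losses)


remaining = unexplained_ppd(observed_drop_ppd, lost_cpu_ppd, failed_wu_ppd)
print(f"Unexplained shortfall: about {remaining:.0f} PPD")
```

Even after allowing for the removed CPU and the failed work units, most of the drop is still unaccounted for.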
What is happening? I've looked at this closely enough to believe it's not in my head.
What other troubleshooting or investigative steps can I take?
If any moderators or PG staff are reading, and you think checking the 'server' would be advisable, I am Folder Leonardo, Team 93. All my clients are registered so.
Something seemingly serious wrong with receiving credit
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: Something seemingly serious wrong with receiving credit
Question: how do you monitor them? Last 3 frames, all frames or effective rate? Effective rate should show the most realistic figures, as it uses the actual time it takes to complete a work unit from download to completion.
There have been server issues as well. I trust you looked for stuck WUs too (WUs in the queue that have not been uploaded are of course not credited either, and a WU worth a lot of points is going to create a large difference in actual PPD).
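To illustrate why the two kinds of estimate can disagree, here is a minimal Python sketch. It is not FAHMon's actual code: the function names, the assumption of 100 frames per work unit, and the example numbers are all hypothetical.

```python
SECONDS_PER_DAY = 86_400


def ppd_from_frame_time(credit, frame_seconds, frames_per_wu=100):
    """Extrapolate PPD from the time of one frame (assumes 100 frames per WU)."""
    wu_seconds = frame_seconds * frames_per_wu
    return credit * SECONDS_PER_DAY / wu_seconds


def effective_ppd(credit, download_to_completion_seconds):
    """PPD based on the whole time from download to completion."""
    return credit * SECONDS_PER_DAY / download_to_completion_seconds


# Hypothetical WU: 480 points, 60 s per frame while crunching,
# but 3 hours of wall-clock time between download and completion
# (for example because the client sat idle or waited on an upload).
last_frame = ppd_from_frame_time(480, 60)
effective = effective_ppd(480, 3 * 3600)
print(f"last frame: {last_frame:.0f} PPD, effective: {effective:.0f} PPD")
```

A frame-based figure only reflects how fast the client is crunching right now, so any idle or waiting time shows up only in the effective rate.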
-
- Posts: 260
- Joined: Tue Dec 04, 2007 5:09 am
- Hardware configuration: GPU slots on home-built, purpose-built PCs.
- Location: Eagle River, Alaska
Re: Something seemingly serious wrong with receiving credit
Monitoring: I've been using the last frame method in FAHMon for a year, so my measuring methodology is a constant, not a variable. I will, though, try a different means of monitoring, just in case something strange is going on.
I'm monitoring the 'farm' on two different computers. I have now set one FAHMon instance to monitor for current and the other to monitor for effective rate.
Work units stuck in the queue and not released to Stanford servers? I don't think so. I've performed spot checks on different clients and so far have observed that all completed work units were "successfully sent" to the server. I will perform more checks on this. For the life of me, I can't think of anything concerning client configurations that is different from before. The only differences, at least that I know of, are that there is one fewer CPU and that two GPUs from a former machine are now consolidated in another machine.
Re: Something seemingly serious wrong with receiving credit
There is a problem with the current update as well, though that can't influence average PPD for a whole week.
If you have a lot of Nvidia GPU2 clients on advmethods, you might have gotten a lot of 5900 WUs, for which PPD measurements using last frame are way off the mark (see the thread about it; I'm pretty worn out and was about to turn in, so I'm afraid I don't have the stamina to link it for you).
For the stuck WUs, don't look through the logs but use -queueinfo; it's much less time-consuming.
Re: Something seemingly serious wrong with receiving credit
The problem seems to have resolved itself. I wrote 'resolved itself' as I do not know the cause of the resolution or the cause of the original problem.
The amount of credit awarded for the completed units is now very close to the estimated production, by both the effective and 'last frame' calculations, as shown by FAHMon. Whatever the problem was, just BING, one day the documented production credit went right back up to where it theoretically should have been. The only things I can think of are that there were server problems (the catch-all scapegoat, I believe, no?) or that client queues weren't releasing finished work units.
As for the suggestion to use -queueinfo rather than reading through the logs to find stuck WUs: indeed, I will make use of that in the future.