Deadline/ETA mismatch with GPU WUs (Solved)

If you're new to FAH and need help getting started or you have very basic questions, start here.

Moderators: Site Moderators, FAHC Science Team

P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32

I am currently folding just on the 5x GTX 460's for aprox. 70K PPD
Location: Salem. OR USA

Re: Deadline/ETA mismatch with GPU WUs

Post by P5-133XL »

Trekkie wrote:So, could I get Windows to put up a notification that doesn't go away until I acknowledge it, if the driver resets?
Not that I know of.

However, if the driver reset occurred a long time ago and folding is currently running normally, I don't think it matters anymore. Whatever damage was done has likely already been dealt with by the client, and nothing you do now is going to help. The pause+unpause is useful only if you do it immediately upon reset (so you minimize the time wasted on a major failure) or if there is a sign of major failure, like the log not progressing anymore, and you need to restart it.

The event log is really only useful in identifying the cause of a major failure so you can address preventive measures like diminishing or removing an OC. No one expects the user to check the event logs every time they return from an absence.
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Deadline/ETA mismatch with GPU WUs

Post by Rel25917 »

If you don't have something like EVGA Precision or MSI Afterburner installed to monitor the card, get something. A quick look at the temps or % usage will let you know if it's running.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Deadline/ETA mismatch with GPU WUs

Post by bruce »

The Development team has been working on a capability to detect a GPU-Reset and do something similar to a manual Pause/Unpause automatically. That code is still being developed/tested and a final roll-out is sometime in the future (provided there are no undiscovered bugs in the code).
davidcoton
Posts: 1094
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Deadline/ETA mismatch with GPU WUs

Post by davidcoton »

bruce wrote: ... (provided there are no undiscovered bugs in the code).
:shock: Bruce, I think we need you to increase your Rumsfeldian studies. Undiscovered bugs will, unfortunately, not prevent software being released. The essential part of the testing part of the development process is to convert as many undiscovered bugs as possible to discovered bugs. Then there is a chance that they will be fixed before release -- though all too often they are ignored, or converted back to undiscovered by an attempted but incomplete or even incorrect fix. :roll:
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Deadline/ETA mismatch with GPU WUs

Post by bruce »

You're right, but what's important is the RATE at which undiscovered bugs are turned into discovered bugs. Whenever that happens, it tends to delay the distribution of the code. The code doesn't get released until the discovery rate approaches zero (but that takes a lot more words to explain).
Trekkie
Posts: 11
Joined: Thu Jul 31, 2014 4:41 pm

Re: Deadline/ETA mismatch with GPU WUs

Post by Trekkie »

Rel25917 wrote:If you dont have something like evga precision or msi afterburner installed...
I've got Afterburner; the GPU runs at 96 - 99% when F@H is set to Full, so it's working, just not hard enough. Which is strange, because apparently:
Napoleon wrote:Looking at the R250X Time Per Frame, it shouldn't have any problems returning WUs before Timeout even if you only folded for a few hours per day.
So why is my ETA still b0rken? Right now, it's still 4 days off (compared to deadline).
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: Deadline/ETA mismatch with GPU WUs

Post by 7im »

ETA is calculated from the date and time the WU was downloaded; the client then projects the ETA based on how much time has passed and how long the average frame time is. If the date/time was off or adjusted, it will skew the final estimate.

The next WU should work better.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Trekkie
Posts: 11
Joined: Thu Jul 31, 2014 4:41 pm

Re: Deadline/ETA mismatch with GPU WUs

Post by Trekkie »

7im wrote:ETA is calculated from the date and time the WU was downloaded; the client then projects the ETA based on how much time has passed and how long the average frame time is. If the date/time was off or adjusted, it will skew the final estimate. The next WU should work better.
OK, so provided that Napoleon was correct in saying my 250X is more than adequate, the current WU should complete in time? Note that I fold part-time - 12 to 16 hours, but still.

Also, I take it that folding part-time also skews the ETA?
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Deadline/ETA mismatch with GPU WUs

Post by Joe_H »

Yes, part-time folding does skew the ETA. It takes a few percent completion, especially with Core_17 GPU WUs, before the ETA will settle down after each restart. Based on the elapsed time between 1% and 2% in your log, that project will complete a bit over 3% every uninterrupted hour of processing. So overall, at 12 hours per day, it will take about 3 days. The preferred deadline for that project from the Project Summary is 8.9 days.
Trekkie
Posts: 11
Joined: Thu Jul 31, 2014 4:41 pm

Re: Deadline/ETA mismatch with GPU WUs

Post by Trekkie »

Joe_H wrote:Based on the elapsed time between 1% and 2% in your log, that project will complete a bit over 3% every uninterrupted hour of processing. So overall at 12 hours per day it will take about 3 days.
How would I go about calculating an estimate for myself from the logs? Not that I doubt your maths, it's that I've been able to leave the computer completely uninterrupted, and I have more up-to-date numbers from the log.

Also, do you know the scale for Full/Medium/Light? So far, I've been running F@H on Full, but that might have to change if I want to play a graphics-intensive game (this is a gaming PC). Or I could, y'know, turn down the graphics, but who builds their own gaming PC to turn around and do that? :P

Finally, are the calculations bound by the core clock? (I.e. would over-clocking be of any benefit?)

New weirdness: Afterburner shows a ~30 second drop in GPU usage, from almost 100% to as low as 20%, but I can't divine a reason for that from the F@H logs (the only GPU-intensive program running). Any ideas?
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Deadline/ETA mismatch with GPU WUs

Post by Rel25917 »

Full/Medium/Light affect when the GPU folds, but not how hard; it is either off or on. Most just turn the GPU slot off when gaming. Overclocking can help, but Core 17 is very sensitive and will start trashing units real easy, so if you try it, keep a very close eye on it. The drops in GPU usage are likely the checkpoints being written by the core, if you are only seeing them every 2-5% of the work unit.
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Deadline/ETA mismatch with GPU WUs

Post by Joe_H »

Trekkie wrote:How would I go about calculating an estimate for myself from the logs?
Just look at the time interval between each percentage recorded in the log for a particular WU in its folding slot. In your case I looked at your log file and saw that the interval was about 19 minutes, or just about 3% per hour. The rest is simple math: 3% per hour times 12 hours per day times 3 days is 108%.
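As a rough sketch of that arithmetic in Python: the two timestamps below are made up (chosen to be 19 minutes apart, matching the interval mentioned above), and the function names are only illustrative, not anything provided by the FAH client. You would substitute the times shown next to two consecutive percentages in your own log.

```python
from datetime import datetime, timedelta


def percent_per_hour(t_start: str, t_end: str, percent_gained: float = 1.0) -> float:
    """Progress rate from two HH:MM:SS timestamps that bracket `percent_gained` percent."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(t_end, fmt) - datetime.strptime(t_start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the interval crossed midnight
    if delta == timedelta(0):
        raise ValueError("timestamps must differ")
    return percent_gained / (delta.total_seconds() / 3600)


def days_to_finish(rate_pct_per_hour: float, hours_per_day: float,
                   percent_remaining: float = 100.0) -> float:
    """Days needed to fold the remaining percentage at a given daily folding time."""
    return percent_remaining / (rate_pct_per_hour * hours_per_day)


# Hypothetical timestamps for the 1% and 2% log lines, 19 minutes apart.
rate = percent_per_hour("10:00:00", "10:19:00")
days = days_to_finish(rate, hours_per_day=12)
print(f"{rate:.2f}% per hour, about {days:.1f} days at 12 h/day")
```

This only uses uninterrupted folding time, so it sidesteps the skew that pausing introduces into the client's own ETA. Pass a smaller `percent_remaining` if the WU is already partly done.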
Trekkie
Posts: 11
Joined: Thu Jul 31, 2014 4:41 pm

Re: Deadline/ETA mismatch with GPU WUs

Post by Trekkie »

Rel25917 wrote:Overclocking can help but core 17 is very sensitive and will start trashing units real easy...
OK, I'll steer clear of it for this WU, but do you know what exactly it's sensitive to? On the face of it, I can't see how a GPU running over-clocked at 1 GHz would appear any different to a program from the same GPU running at a stock 1000 MHz. Cycles are cycles, non? (Edit: I can see how changing the clock speed mid-WU would cause problems, however.)
Rel25917 wrote:Most just turn the gpu slot off when gaming.
Hmmm, I can't seem to get the GPU slot to pause independently of the CPU, not even with the Advanced Control.
Rel25917 wrote:The drops in gpu usage are likely the checkpoints being written by the core if you are only seeing them every 2-5% of the work unit.
OK, so just so I can confirm that's the cause, how do I change the log's verbosity level (temporarily, of course) and what level would show the checkpoints?
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Deadline/ETA mismatch with GPU WUs

Post by Joe_H »

More correctly, Core_17 is not so much sensitive to overclocking as able to detect any calculation error caused by that overclocking. That error detection will cause the WU to fail.

As for pausing a single slot in FAHControl, right-click on the slot under the Folding Slots status area and a menu will come up that includes the option to pause the slot.

As for changing the log verbosity level, that will not show the checkpoints being done. You will have to look in the work folder for the WU being processed by the GPU slot. Another option would be to monitor data I/O being written to your drive; there should be a spike when the checkpoint is written.
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Deadline/ETA mismatch with GPU WUs

Post by Rel25917 »

Cycles are not just cycles. Each GPU family has its designed speeds, and you can't compare them to another family's (NVIDIA Fermi to Kepler to Maxwell). Even within a family you can't compare a GTX 660 to a 670 to a 680. Each individual chip is also different; some will overclock very well and some not at all. If one GPU is running stock at 1 GHz and another is OC'd to 1 GHz, they can't be the same GPU, or they would both be stock at 1 GHz or both OC'd to 1 GHz, yes?

To find out when a Core 17 unit will checkpoint, you need to go into the work folder for that unit, then the 01 folder, then look at the log.txt file in there and find this info:

Reading core settings...
Total number of steps: 5000000
XTC write frequency: 125000

A little math (125000 / 5000000) and you know you get a checkpoint every 2.5%. This varies by project and could be almost any number, but most are 2-5%.
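If you would rather not do the division by hand, here is a minimal Python sketch. It assumes, as described above, that a checkpoint is written each time the "XTC write frequency" number of steps has elapsed, and that log.txt contains the two lines exactly as quoted; the function name is made up for illustration.

```python
import re

# The excerpt quoted above; in practice, read the text of log.txt instead.
LOG_EXCERPT = """\
Reading core settings...
Total number of steps: 5000000
XTC write frequency: 125000
"""


def checkpoint_interval_percent(log_text: str) -> float:
    """Percent of the WU between checkpoints, from the core settings lines."""
    steps = re.search(r"Total number of steps:\s*(\d+)", log_text)
    freq = re.search(r"XTC write frequency:\s*(\d+)", log_text)
    if not steps or not freq:
        raise ValueError("core settings not found in log text")
    return 100.0 * int(freq.group(1)) / int(steps.group(1))


interval = checkpoint_interval_percent(LOG_EXCERPT)
print(f"checkpoint roughly every {interval}% of the work unit")
```

For the numbers above this gives 2.5%, so a short dip in GPU usage about once every 2.5% of progress would fit the checkpoint explanation.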