Why isn't TPF averaged (or is it)?

Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Why isn't TPF averaged (or is it)?

Post by Breach »

Background: I have started maintaining a per-WU Excel table with various details. One of the items I am calculating is a WU average points-per-hour metric, which I calculate as: Total Estimated points (with no interruptions) / 24 x TPF x 100. That's fine, but I am doing this on the assumption that the TPF stabilizes at some point (more or less). As it turns out, that's a very wrong assumption. For example, on PRCG 8089,1852,3,33 the TPF estimate jumps between 1 min 02 secs and 1 min 33 secs on every frame (with the ETA being 1 h 15 mins and 1 h 52 mins respectively). In this particular case that's not a problem, as there's a pattern and I can still calculate the average TPF.

My more general question is: why isn't the reported TPF, in principle, an average calculated from the previous TPFs? It seems to me the TPF reported is the TPF of just the last frame, which can lead to confusing ETAs.

Thanks.

PS
No, there's no extra load influencing this - I have dedicated 6 out of 8 threads and there's plenty of system idle headroom.
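
For anyone following along, here is a rough Python sketch of the arithmetic involved. It assumes a WU has 100 frames (one per 1% of progress); the function names, the 5000-point WU and the 27% progress figure are made up for illustration and are not taken from any F@H tool.

```python
FRAMES_PER_WU = 100  # assumption: one frame per 1% of progress


def points_per_hour(total_points: float, tpf_seconds: float) -> float:
    """Points per hour for an uninterrupted WU running at a constant TPF."""
    wu_hours = tpf_seconds * FRAMES_PER_WU / 3600
    return total_points / wu_hours


def eta_seconds(tpf_seconds: float, percent_done: int) -> float:
    """Remaining run time if every remaining frame takes tpf_seconds."""
    return tpf_seconds * (FRAMES_PER_WU - percent_done)


if __name__ == "__main__":
    # Hypothetical WU worth 5000 points, evaluated at the two TPF readings.
    for tpf in (62, 93):
        pph = points_per_hour(5000, tpf)
        eta_min = eta_seconds(tpf, percent_done=27) / 60
        print(f"TPF {tpf}s: {pph:.1f} points/hour, ETA {eta_min:.0f} min")
```

Since the ETA here is just the TPF times the remaining frames, a TPF that flips between two values makes the ETA flip by the same ratio.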
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Why isn't TPF averaged (or is it)?

Post by PantherX »

Is there any reason to manually maintain the Excel sheet? If not, then I would recommend that you use HFM.NET (http://code.google.com/p/hfm-net/), which does work with V7. However, it may not report the Slots type correctly. I am unsure of this since I use V7.2.6 with HFM.NET and it works just fine for my purpose. Do note that HFM.NET calculates the TPF as an average over the last 3 complete frames, and you can change it to other methods as well.

FAHControl does display the average TPF and not the last TPF. Am not sure how many frames it uses to calculate the average TPF.

FahCores report the actual TPF. One reason for the variation is the different forces that need to be calculated. These may change with each frame, which might contribute to the TPF fluctuation.

Do note that you can't rule out interruptions by the system unless you have set up core affinity (by using 3rd party tools). With core affinity set, it will lock those processes to those cores and the OS scheduler will not be able to use those locked cores.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Why isn't TPF averaged (or is it)?

Post by bruce »

I have wondered about many of those same questions and I have not been able to get any good answers. I have a couple of suggestions that might or might not be true but you can bounce them against the data and see if anything useful turns out.

First, every frame is NOT identical. There are periodic interruptions (like writing a checkpoint) which don't happen at the same frequency as the TPF.

Second, even if it seems like calculating a step with N atoms should take the same amount of time, that is simply NOT true. Search the forum for "folding event".

Third, if an average is used, and the protein shape changes appreciably or another task in your computer changes, the delay before the number stabilizes is highly dependent on how many samples are included. Donors want a TPF that converges quickly (say they added two more CPU-cores or overclocked their GPU) yet they want a stable prediction -- and those two concepts contradict each other -- so no prediction of future results will always be satisfactory.
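
To make that trade-off concrete, here is a small Python sketch. It uses a plain moving average as a stand-in (this is not what any F@H client is confirmed to do), and the frame times and function names are hypothetical.

```python
from collections import deque


def moving_average(samples, window):
    """Yield the mean of the last `window` samples after each new sample."""
    recent = deque(maxlen=window)
    for value in samples:
        recent.append(value)
        yield sum(recent) / len(recent)


def frames_to_converge(samples, window, change_at, target, tolerance=1.0):
    """Count the frames after index `change_at` needed before the estimate
    is within `tolerance` seconds of `target`; None if it never gets there."""
    for i, estimate in enumerate(moving_average(samples, window)):
        if i >= change_at and abs(estimate - target) <= tolerance:
            return i - change_at + 1
    return None


if __name__ == "__main__":
    # Hypothetical run: 20 frames at 90 s, then 20 frames at 60 s after an upgrade.
    tpf = [90.0] * 20 + [60.0] * 20
    for window in (1, 3, 10):
        print(window, frames_to_converge(tpf, window, change_at=20, target=60.0))
```

A window of 1 reacts after a single frame but passes every fluctuation straight through, while a window of 10 is much steadier and needs ten frames at the new speed before it reports it.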

During the development of the 3rd party tool HFM, the pros and cons of several methods were evaluated and in the end, a choice of three methods was left to the user. V7 does not give you a choice, and apparently is using a fourth method.

You'll notice that there have been several V7 tickets opened on this topic, but since they have no influence on science, only on ("worthless"?) points, and since it's called "estimated" anyway, things like this do not get a lot of attention from the Developers, who have plenty of more important things to be worrying about.
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: Why isn't TPF averaged (or is it)?

Post by Breach »

Thanks. I am unfamiliar with HFM and will check it out now

Are you sure that FAHControl displays TPF as an average? I'm seeing 1% 1:02 2% 1:33 3% 1:02 4% 1:33 etc. I am at 64% now (1:35), 65% (1:03) - if it was a true average it would have stabilized by now, though I guess it's still possible...

I have locked affinity for FAH to the first 3 cores and set it with realtime priority - no changes as far as TPF behavior is concerned. I don't understand why that would be needed though? I have no other programs with affinity set - if there's an available thread on 8 why would the scheduler interrupt the FAH core running on 1 to run it there...?

[Update:]
So:

1. Not sure whether HFM calculates average TPF or uses the delta from the log file - still it matches my manual calculations:

20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps  (1%) - N/A
20:17:33:WU02:FS01:0xa4:Completed 5000 out of 250000 steps  (2%) - 1m 15 sec
20:18:49:WU02:FS01:0xa4:Completed 7500 out of 250000 steps  (3%) - 1m 16 sec
20:20:04:WU02:FS01:0xa4:Completed 10000 out of 250000 steps  (4%) - 1m 15 sec
20:21:20:WU02:FS01:0xa4:Completed 12500 out of 250000 steps  (5%) - 1m 16 sec
20:22:35:WU02:FS01:0xa4:Completed 15000 out of 250000 steps  (6%) - 1m 15 sec
20:23:51:WU02:FS01:0xa4:Completed 17500 out of 250000 steps  (7%) - 1m 16 sec
20:25:07:WU02:FS01:0xa4:Completed 20000 out of 250000 steps  (8%) - 1m 16 sec
20:26:23:WU02:FS01:0xa4:Completed 22500 out of 250000 steps  (9%) - 1m 16 sec
20:27:38:WU02:FS01:0xa4:Completed 25000 out of 250000 steps  (10%) - 1m 15 sec
20:28:54:WU02:FS01:0xa4:Completed 27500 out of 250000 steps  (11%) - 1m 16 sec

There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?

2. This gets even stranger - given these figures in the log file, I can't understand what kind of logic the FAHClient would use to come up with these 1m 02 sec / 1m 33 sec TPF figures...? Sure, the average of these is about 1m 18 sec, but...
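
For reference, the per-frame deltas can be pulled from the log with a few lines of Python rather than by hand. This is only a sketch: the regular expression is an assumption based on the log lines quoted above, and frame_times is a made-up name, not part of FAHClient or HFM.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as
# "20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps  (1%)"
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):.*Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)


def frame_times(log_lines):
    """Return the seconds elapsed between consecutive 'Completed' lines."""
    stamps = []
    for line in log_lines:
        match = LINE_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    deltas = []
    for earlier, later in zip(stamps, stamps[1:]):
        if later < earlier:
            # Timestamps carry no date; assume a single midnight rollover.
            later += timedelta(days=1)
        deltas.append((later - earlier).total_seconds())
    return deltas


if __name__ == "__main__":
    sample = [
        "20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps  (1%)",
        "20:17:33:WU02:FS01:0xa4:Completed 5000 out of 250000 steps  (2%)",
        "20:18:49:WU02:FS01:0xa4:Completed 7500 out of 250000 steps  (3%)",
        "20:20:04:WU02:FS01:0xa4:Completed 10000 out of 250000 steps  (4%)",
    ]
    deltas = frame_times(sample)
    print(deltas, sum(deltas) / len(deltas))
```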
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: Why isn't TPF averaged (or is it)?

Post by Breach »

bruce wrote:
Third, if an average is used, and the protein shape changes appreciably or another task in your computer changes, the delay before the number stabilizes is highly dependent on how many samples are included. Donors want a TPF that converges quickly (say they added two more CPU-cores or overclocked their GPU) yet they want a stable prediction -- and those two concepts contradict each other -- so no prediction of future results will always be satisfactory.
Thanks. I suppose it comes down to that. Current speed vs. average speed per km/mile. Still, a TPF/ETA which jumps up and down every frame in the FAHClient 7 is not helping either case... though I guess it's only an issue in specific projects.

[Update]

Just to update before I move on with my life ;-) On another WU:

From the logs:

0-1%: 15:26
1-2%: 15:23
2-3%: 15:31
3-4%: 15:28

Average: 15:27

HFM reports the same figure so it seems it is indeed calculating a TPF average. FAHControl reports a TPF of 15:30... at least it's not jumping -/+ 30% between frames on this WU. ;-)
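
The averaging itself is simple enough to check in Python. The helper names below are made up for this sketch; the four frame times are the ones from the log above.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: float) -> str:
    """Format seconds as 'M:SS', rounded to the nearest second."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"


def mean_tpf(frames):
    """Arithmetic mean of a list of 'M:SS' frame times, as 'M:SS'."""
    secs = [to_seconds(frame) for frame in frames]
    return to_mmss(sum(secs) / len(secs))


if __name__ == "__main__":
    print(mean_tpf(["15:26", "15:23", "15:31", "15:28"]))  # prints 15:27
```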
PantherX

Re: Why isn't TPF averaged (or is it)?

Post by PantherX »

Breach wrote:...Are you sure that FAHControl displays TPF as an average? I'm seeing 1% 1:02 2% 1:33 3% 1:02 4% 1:33 etc. I am at 64% now (1:35), 65% (1:03) - if it was a true average it would have stabilized by now, though I guess it's still possible...
Technically, a mathematical model is being used and I am unaware of what it is. For the sake of simplicity, I used the term "Average" since it is easy to understand and, eventually, FAHControl gets it correct (or within a few seconds of it). It does maintain a history of the Project, so the next time you download a WU from a previously folded Project, it will display the TPF automatically and update it from the actual WU if needed.
Breach wrote:...I have locked affinity for FAH to the first 3 cores and set it with realtime priority - no changes as far as TPF behavior is concerned. I don't understand why that would be needed though? I have no other programs with affinity set - if there's an available thread on 8 why would the scheduler interrupt the FAH core running on 1 to run it there...?...
The scheduler has its own logic that I am unaware of. Quite some time ago, I read that if you use a 3rd party affinity program, you can lock all non-FAH programs to particular CPU(s) and lock FAH to the free ones. One may see a difference in the TPF but, generally, this is not that widely used.
Breach wrote:...There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?...
Edit -> Preference -> Options:
Calculate PPD Based on: 4 Options available from the drop down menu
Calculate Bonus Based on: 3 Options available from the drop down menu
Jesse_V
Site Moderator
Posts: 2850
Joined: Mon Jul 18, 2011 4:44 am
Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
Location: Western Washington

Re: Why isn't TPF averaged (or is it)?

Post by Jesse_V »

According to https://fah-web.stanford.edu/projects/F ... ticket/395
As of v7.1.44 the client tries to measure frames by watching the frame number change. If the core is shut down or the machine is hibernated, the core will detect this and adjust its estimate accordingly. It saves the last three frame measures and uses the median value for its predictions. This both allows the estimates to adjust quickly (on a per frame basis) and provides some smoothing for abnormal values.
To be clear the client is not taking the mean of the last three. It is taking the median value.
https://fah-web.stanford.edu/projects/F ... ticket/581 is an enhancement request to choose the PPD calculation algorithm.
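
To illustrate what "median of the last three" does in practice, here is a minimal Python sketch of the idea as described in the ticket. It is not the client's actual code; the class name and the sample frame times are made up for illustration.

```python
from collections import deque
from statistics import median


class MedianOfThree:
    """Keeps the last three frame measurements and reports their median."""

    def __init__(self):
        self.recent = deque(maxlen=3)

    def add(self, frame_seconds: float) -> float:
        """Record one frame time and return the current TPF estimate."""
        self.recent.append(frame_seconds)
        return median(self.recent)


if __name__ == "__main__":
    # One abnormal frame (300 s) is ignored by the median.
    outlier = MedianOfThree()
    print([outlier.add(t) for t in (75, 76, 300, 75, 76)])

    # Hypothetical measurements that alternate are NOT smoothed:
    # the median of (62, 93, 62) is 62 and of (93, 62, 93) is 93.
    alternating = MedianOfThree()
    print([alternating.add(t) for t in (62, 93, 62, 93, 62)])
```

So a median of three hides a single abnormal frame, but if the measured frame times themselves alternate between two values, the estimate alternates with them. Whether the client's measurements actually alternate like that is not something this thread establishes.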
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Lets end it together.
EXT64
Posts: 323
Joined: Mon Apr 09, 2012 11:54 pm

Re: Why isn't TPF averaged (or is it)?

Post by EXT64 »

On certain projects, FAHControl does not agree with the log (and thus core) on WU progress. It instead oscillates around the true TPF (low-high-low-...). This has been mentioned as a bug several times but as it was not deemed a high priority it has never been fixed or explained.

See viewtopic.php?f=88&t=23960
Flaschie
Posts: 69
Joined: Sun Mar 11, 2012 5:52 pm

Re: Why isn't TPF averaged (or is it)?

Post by Flaschie »

PantherX wrote:
Breach wrote:...There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?...
Edit -> Preference -> Options:
Calculate PPD Based on: 4 Options available from the drop down menu
Calculate Bonus Based on: 3 Options available from the drop down menu
If this is something you check often, then it may be nice to know the shortcuts for toggling: "Alt+P" for PPD and "Alt+O" for bonus.
Qinsp
Posts: 216
Joined: Sun Oct 17, 2010 2:34 pm

Re: Why isn't TPF averaged (or is it)?

Post by Qinsp »

With my machines, once I get them running sweet, the actual TPF variation is small.

The 7.3.6 TPF estimation code is broken. Bad. Think train wreck. Ignore it. Look at the log file.
Quality Inspection - Corona, CA, USA
Dimensional Inspection Laboratory
Pat McSwain, President
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Why isn't TPF averaged (or is it)?

Post by bruce »

The FAH Client design specifies that it does NOT look at the log ... so you know more about the TPF than it does.