ETA and PPD estimation are very inaccurate

Moderators: Site Moderators, FAHC Science Team

Zac67
Posts: 10
Joined: Mon Oct 16, 2023 11:15 am

ETA and PPD estimation are very inaccurate

Post by Zac67 »

(8.3.18 running here)

The ETA and PPD estimations are miles off for fresh units. Not only does the client forget about previous WU speeds from the same project (like 7.x), but it also shows ludicrous figures after a reboot.

Additionally, the algorithm seems to be flawed: its calculation apparently includes periods when the WU wasn't being processed at all.

Example (lufah output every 5 minutes):

Project  CPUs  GPUs  Core  Status        Progress    PPD       ETA      Deadline
--------------------------------------------------------------------------------
16525    0     1     0x23  Running           0.0%    3177931   14h 24m  6d 23h  
16525    0     1     0x23  Running           1.2%    8598032   7h 19m   6d 23h  
16525    0     1     0x23  Running           2.6%    10025687  6h 31m   6d 23h  
16525    0     1     0x23  Running           4.0%    10648839  6h 10m   6d 23h  
16525    0     1     0x23  Running           5.4%    10837803  6h 0m    6d 23h  
16525    0     1     0x23  Running           6.8%    10979237  5h 52m   6d 23h  
16525    0     1     0x23  Running           8.2%    11141595  5h 43m   6d 23h  
16525    0     1     0x23  Running           9.6%    11407828  5h 32m   6d 23h  
16525    0     1     0x23  Running          11.3%    11889863  5h 17m   6d 23h  
16525    0     1     0x23  Running          12.8%    12079660  5h 9m    6d 23h  
16525    0     1     0x23  Running          14.5%    12337759  4h 59m   6d 23h  
16525    0     1     0x23  Running          16.0%    12519788  4h 50m   6d 23h  
16525    0     1     0x23  Running          17.6%    12598785  4h 44m   6d 22h  
16525    0     1     0x23  Running          19.2%    12773290  4h 36m   6d 22h       <-- more than 1 hour required to settle on a realistic estimate
16525    0     1     0x23  Finishing        20.7%    12848371  4h 29m   6d 22h  
16525    0     1     0x23  Finishing        22.3%    12980588  4h 22m   6d 22h  
16525    0     1     0x23  Finishing        23.9%    13025810  4h 16m   6d 22h  
16525    0     1     0x23  Finishing        25.5%    13106309  4h 10m   6d 22h  
16525    0     1     0x23  Finishing        27.1%    13186656  4h 4m    6d 22h  
16525    0     1     0x23  Finishing        28.6%    13225159  3h 58m   6d 22h  
16525    0     1     0x23  Finishing        30.2%    13296736  3h 52m   6d 22h  
16525    0     1     0x23  Finishing        31.8%    13324747  3h 46m   6d 22h  
16525    0     1     0x23  Finishing        33.4%    13385101  3h 40m   6d 22h  
16525    0     1     0x23  Finishing        34.9%    13416464  3h 35m   6d 22h  
16525    0     1     0x23  Finishing        36.5%    13464246  3h 29m   6d 21h  
16525    0     1     0x23  Finishing        38.1%    13515389  3h 23m   6d 21h  
16525    0     1     0x23  Finishing        39.7%    13534909  3h 18m   6d 21h  
16525    0     1     0x23  Finishing        41.3%    13579257  3h 12m   6d 21h  
16525    0     1     0x23  Finishing        42.9%    13606195  3h 7m    6d 21h  
16525    0     1     0x23  Finishing        44.5%    13644678  3h 1m    6d 21h  
16525    0     1     0x23  Finishing        46.1%    13682295  2h 55m   6d 21h  
16525    0     1     0x23  Finishing        47.7%    13696969  2h 50m   6d 21h  
16525    0     1     0x23  Finishing        49.3%    13722187  2h 45m   6d 21h  
16525    0     1     0x23  Finishing        50.8%    13734827  2h 40m   6d 21h  
16525    0     1     0x23  Finishing        52.4%    13754878  2h 34m   6d 21h  
16525    0     1     0x23  Finishing        54.0%    13777099  2h 29m   6d 21h  
16525    0     1     0x23  Finishing        55.6%    13774977  2h 24m   6d 20h  
16525    0     1     0x23  Finishing        57.1%    13791944  2h 19m   6d 20h  
[system reboot]
16525    0     1     0x23  Finishing        66.0%    300769304  5m 52s   6d 20h     <-- wait - what???
16525    0     1     0x23  Finishing        67.8%    195683020  8m 25s   6d 19h 
16525    0     1     0x23  Finishing        69.0%    166724306  9m 24s   6d 19h 
16525    0     1     0x23  Finishing        71.0%    129792511  11m 9s   6d 19h 
16525    0     1     0x23  Finishing        72.0%    116539239  11m 52s  6d 19h 
16525    0     1     0x23  Finishing        74.0%    98053485  12m 57s  6d 19h  
16525    0     1     0x23  Finishing        75.1%    84892957  14m 10s  6d 19h  
16525    0     1     0x23  Finishing        77.0%    79418755  13m 52s  6d 19h  
16525    0     1     0x23  Finishing        78.8%    70930549  14m 11s  6d 19h  
16525    0     1     0x23  Finishing        80.0%    67120301  14m 1s   6d 19h  
16525    0     1     0x23  Finishing        82.0%    61277542  13m 43s  6d 19h  
16525    0     1     0x23  Finishing        83.0%    58533218  13m 27s  6d 19h  
16525    0     1     0x23  Finishing        85.0%    54146224  12m 45s  6d 19h  
16525    0     1     0x23  Finishing        86.0%    51987210  12m 18s  6d 18h  
16525    0     1     0x23  Finishing        88.0%    48782482  11m 10s  6d 18h  
16525    0     1     0x23  Finishing        89.5%    46030121  10m 15s  6d 18h  
16525    0     1     0x23  Finishing        91.0%    44681952  9m 2s    6d 18h  
16525    0     1     0x23  Finishing        93.0%    42500737  7m 21s   6d 18h  
16525    0     1     0x23  Finishing        94.0%    41346782  6m 26s   6d 18h  
16525    0     1     0x23  Finishing        96.0%    39567845  4m 27s   6d 18h  
16525    0     1     0x23  Finishing        97.4%    37942897  3m 0s    6d 18h  
16525    0     1     0x23  Finishing        99.0%    37129375  1m 10s   6d 18h  
16525    0     1     0x23  Uploading         0.0%    35770239  0s       6d 18h      <-- client unable to cope even after hours
Also, it would be nice for the status to show the estimated points (credit) for the current WU, as 7.x did. The client obviously uses that estimate in its PPD calculation (you can back-calculate it from PPD and ETA), but it should show it outright.
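For reference, the back-calculation can be sketched in Python as below. This is only a sketch under two assumptions that the thread does not confirm: that the client extrapolates total run time linearly from progress and ETA, and that PPD is the estimated credit divided by that total run time in days. The function names are hypothetical.

```python
def parse_duration(text: str) -> float:
    """Convert a lufah-style duration such as '2h 19m' or '5m 52s' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    return float(sum(int(part[:-1]) * units[part[-1]] for part in text.split()))


def implied_credit(ppd: float, progress: float, eta: str) -> float:
    """Credit estimate implied by a PPD figure.

    Assumes total run time = ETA / (1 - progress), i.e. linear extrapolation.
    """
    total_seconds = parse_duration(eta) / (1.0 - progress)
    return ppd * total_seconds / 86400.0


# Last row before the reboot: 57.1% done, PPD 13791944, ETA 2h 19m.
print(round(implied_credit(13_791_944, 0.571, "2h 19m")))
```

For the last row before the reboot this gives roughly 3.1 million points, which is in the same range as the credit the WU eventually received.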
Last edited by Zac67 on Thu Nov 14, 2024 9:32 am, edited 1 time in total.
calxalot
Site Moderator
Posts: 1139
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: ETA and PPD estimation are very inaccurate

Post by calxalot »

You can modify shown columns using
https://v8-4.foldingathome.org/
It is backwards compatible with v8.3.

Estimates are usually off until ~3% done.
Estimates currently use the current time and the assignment time, so they are affected by paused time.
Other estimate techniques may be used in the future.
There are several issues related to estimates filed against web control.
https://github.com/FoldingAtHome/fah-we ... tet/issues
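To illustrate why paused time skews an estimate based on assignment time, here is a minimal Python sketch. It assumes a plain linear extrapolation; the function names are hypothetical and this is not the client's actual code.

```python
def eta_from_wall_clock(progress: float, assigned_at: float, now: float) -> float:
    """Remaining seconds, extrapolated from time since assignment (pauses included)."""
    elapsed = now - assigned_at
    return elapsed * (1.0 - progress) / progress


def eta_from_run_time(progress: float, run_seconds: float) -> float:
    """Remaining seconds, extrapolated from time actually spent computing."""
    return run_seconds * (1.0 - progress) / progress


# Hypothetical WU: 1 h of computing reached 20%, then it sat paused for 2 h.
HOUR = 3600.0
wall = eta_from_wall_clock(0.20, assigned_at=0.0, now=3 * HOUR)
run = eta_from_run_time(0.20, run_seconds=1 * HOUR)
print(f"wall-clock ETA: {wall / HOUR:.1f} h, run-time ETA: {run / HOUR:.1f} h")
```

In this made-up case the wall-clock variant predicts 12 h remaining while the run-time variant predicts 4 h, so a pause makes the first kind of estimate too pessimistic until enough running time has accumulated.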

The client does keep track of "run time" of previous projects.
I believe run time here is completion time - assignment time.
It does not load any v7 data.

Failed network operations are retried with an exponential backoff delay.

What OS are you using?
Zac67
Posts: 10
Joined: Mon Oct 16, 2023 11:15 am

Re: ETA and PPD estimation are very inaccurate

Post by Zac67 »

Sorry for being thick, but I can't see where to change displayed columns on https://v8-4.foldingathome.org/ after logging in.

If you look at the table above, at 6.8% the estimate (10,979,237) is still nearly 20% lower than the 13,791,944 just before rebooting. The machine was largely idle and there was no background load that might have caused that. The actual credit given (3,016,665) calculates to a PPD of 12,629,123 but that small gap can be explained by the reboot delay.

After the reboot, the estimate (300,769,304) is more than +2000% off and is still +200% off (35,770,239) nearly two hours later. Obviously, there is something very wrong there. My guess is that the previous work was calculated with 0 time instead of from the assignment time.
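That guess can be checked roughly against the table. The Python sketch below assumes the ETA is a linear extrapolation, ETA = elapsed * (1 - progress) / progress, and solves for the elapsed time the client appears to be using. Both the formula and the function names are assumptions, not taken from the client source.

```python
def parse_duration(text: str) -> float:
    """Convert a lufah-style duration such as '2h 19m' or '5m 52s' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    return float(sum(int(part[:-1]) * units[part[-1]] for part in text.split()))


def implied_elapsed(progress: float, eta: str) -> float:
    """Elapsed seconds implied by progress and ETA under linear extrapolation."""
    return parse_duration(eta) * progress / (1.0 - progress)


before = implied_elapsed(0.571, "2h 19m")  # last row before the reboot
after = implied_elapsed(0.660, "5m 52s")   # first row after the reboot
print(f"before reboot: {before / 3600:.2f} h, after reboot: {after / 60:.1f} min")
```

Under that assumption the last row before the reboot implies about 3.1 h of elapsed time, close to the roughly 185 minutes covered by the 5-minute samples, whereas the first row after the reboot implies only about 11 minutes. That is consistent with the elapsed time being restarted around the reboot while the progress was kept, although it does not prove it.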

> The client does keep track of "run time" of previous projects.

No, mine does not. It starts with the same, extremely low estimate until it jumps from e.g. 0.2% to 1.0% and then constantly climbs until it roughly fits at ~10% - see the table. It doesn't matter how many WUs from the same project have already run, even on the same day.
calxalot
Site Moderator
Posts: 1139
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: ETA and PPD estimation are very inaccurate

Post by calxalot »

You can set columns from upper right account settings > appearance
https://v8-4.foldingathome.org/account/appearance
Drag and reorder labels in top box.

Previous run time data, when it exists, is in client.db.
If run time is not known, I think the client starts by assuming it will be equal to expiration interval.

The client sometimes gets confused after a WU is started close to an integer percent completion.
When the next percent is reached, it suddenly has a huge estimated speed that slowly drifts down.
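One way such a spike could arise is sketched below in Python. This is a hypothetical model, not the client's code: it assumes the estimator only sees whole-percent progress, so a WU resumed at 56.9% is treated as having started at 56%, and the first percent tick is credited to a very short time span.

```python
import math


def apparent_speedup(start_pct: float, rate_pct_per_s: float, seconds: float) -> float:
    """Ratio of estimated to true speed when only whole percents are visible."""
    baseline = math.floor(start_pct)  # what the estimator believes it started at
    current = math.floor(start_pct + rate_pct_per_s * seconds)
    estimated_rate = (current - baseline) / seconds
    return estimated_rate / rate_pct_per_s


# Hypothetical WU resumed at 56.9%, truly progressing 1% every 5 minutes.
for seconds in (60, 360, 1560, 6060):
    print(seconds, round(apparent_speedup(56.9, 1 / 300, seconds), 2))
```

In this model the estimated speed starts at about five times the true speed and decays toward the true value as more time accumulates, which matches the "huge speed that slowly drifts down" pattern qualitatively.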
Zac67
Posts: 10
Joined: Mon Oct 16, 2023 11:15 am

Re: ETA and PPD estimation are very inaccurate

Post by Zac67 »

> You can set columns from upper right account settings > appearance

Excellent, a well-hidden column config! But there's no Estimated Credit or similar, just Base Credit.

> Previous run time data, when it exists, is in client.db.

Doesn't seem to exist - yet. I hope that's WIP.

The client gets confused quite a lot. I just wanted to point out that problem as v7 was much more accurate.
calxalot
Site Moderator
Posts: 1139
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: ETA and PPD estimation are very inaccurate

Post by calxalot »

client.db will be in the client working directory. On Linux I think that’s ~fah-client for the service install. The location is printed in the log at launch.
calxalot
Site Moderator
Posts: 1139
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: ETA and PPD estimation are very inaccurate

Post by calxalot »

client.db is currently a sqlite3 file, so it’s easy to peek at. Please don’t modify anything. It may be locked while the client runs. I don’t know if you can read a locked file.
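A minimal Python sketch for peeking at the file without writing to it, using the standard sqlite3 module and a read-only URI. The function name is hypothetical, the path is whatever your log reports, and whether this works while the client holds a lock is not something the sketch can promise.

```python
import sqlite3
import sys
from contextlib import closing
from pathlib import Path


def table_row_counts(db_path: str) -> dict:
    """Return {table name: row count} for a SQLite file opened read-only."""
    # mode=ro asks SQLite to refuse any write to the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python peek_db.py /path/to/client.db  (script name is arbitrary)
    for table, count in table_row_counts(sys.argv[1]).items():
        print(f"{table}: {count} rows")
```

Working on a copy of client.db taken while the client is stopped sidesteps the locking question entirely.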
Zac67
Posts: 10
Joined: Mon Oct 16, 2023 11:15 am

Re: ETA and PPD estimation are very inaccurate

Post by Zac67 »

Already tried looking at the db - there are four tables: config, cores, groups, units (empty). None of them holds any information about project performance or similar.

Possibly related: the TPF column in https://v8-4.foldingathome.org/ displays ??? when folding. On the Work Units tab, only the current WU is displayed; no finished WUs are listed, neither from a previous client run nor from the current one.