Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:11 am
by ajm
It would be nice to have some current stats expressed in more universal values than the "points", at least for FAH's overall performance. I don't know, peta/teraFLOPS might be appropriate?
It was published recently that FAH is now faster than the world's top 7 supercomputers combined: https://www.extremetech.com/extreme/308 ... s-combined
This kind of stat (real-time, if possible?) would help in finding support for FAH at a more institutional level. And for users, it would also be more down-to-earth than the points, somehow.

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:18 am
by anandhanju

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:19 am
by Artemios
The following stats are posted on the F@H site:

https://stats.foldingathome.org/os

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:34 am
by ajm
Thanks!

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 6:19 am
by VAcharonD1
If I understand correctly, those stats are based on the capacity of all clients that submitted work in the last 50 days ("active CPUs and GPUs"). I don't think it's the kind of real-time data the OP is looking for because not all of those machines will be folding 24/7, some have dropped out for various reasons, some could be double-counted, and there were some very large resources added in the last month but quickly removed (e.g. the points peak on 3/20). It measures what FAH is capable of, but I don't think we have a measure of how much we are actually doing, other than the points.

Of course if I am grossly wrong I welcome correction.

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 6:56 am
by anandhanju
https://api.foldingathome.org/os?days=1 gives you data for clients that have returned results in the past day. For the most part, this should eliminate duplicates (e.g., an uninstall and a reinstall). There isn't a way to separate out those that do not fold the entire day but this gives a good idea.
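
For anyone who wants to compare different windows, here is a minimal sketch that only builds the query URL. The function name `os_stats_url` is made up for illustration, and nothing is assumed about the format of the response, so the sketch stops short of fetching or parsing anything.

```python
from urllib.parse import urlencode

# Endpoint taken from the link above; only the "days" parameter is used here.
BASE_URL = "https://api.foldingathome.org/os"


def os_stats_url(days: int) -> str:
    """Build the OS stats URL for a reporting window of `days` days.

    Hypothetical helper: it only formats the URL and makes no
    assumption about what the server returns.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return f"{BASE_URL}?{urlencode({'days': days})}"


if __name__ == "__main__":
    # One-day window versus the 50-day window mentioned earlier in the thread.
    print(os_stats_url(1))
    print(os_stats_url(50))
```

Comparing the one-day and 50-day windows side by side is probably the quickest way to see how much the longer window overstates what is folding right now.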

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 6:58 am
by ajm
Hm, I suspect that VAcharonD1 is right. There's that sentence: "FLOPS per core is estimated." It looks more like a broad estimate than a statistic of the effective performance.
Maybe it should be done the other way around: not from the number and hardware of donors, but from the results that scientists are actually getting?

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 4:37 pm
by Joe_H
That is what they are estimating from. They have estimates of how many floating-point operations completing a WU takes, depending on its size and how many time steps are involved, and can back out an estimate of the total being done per second over all the returns.
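
As a rough illustration of that arithmetic (not the project's actual code), the sketch below sums the estimated work in each returned WU and divides by the length of the reporting window. The `ReturnedWU` class, its field names, and the numbers are all made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ReturnedWU:
    """Hypothetical record for one returned work unit."""
    flop_per_step: float  # assumed estimate of floating-point operations per time step
    steps: int            # number of time steps in the WU


def estimated_flops(returns: Iterable[ReturnedWU], window_seconds: float) -> float:
    """Average FLOPS over the window: total estimated operations / seconds."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total_operations = sum(wu.flop_per_step * wu.steps for wu in returns)
    return total_operations / window_seconds


if __name__ == "__main__":
    # Two made-up WUs returned within a one-day window.
    returns = [ReturnedWU(2e9, 1_000_000), ReturnedWU(4e9, 500_000)]
    one_day = 24 * 60 * 60
    print(f"{estimated_flops(returns, one_day):.3e} FLOPS averaged over the day")
```

Note that this gives an average over the window, so short bursts above or below that rate are smoothed out.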

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 4:58 pm
by Nert
I'm with OP, but recognize that this sort of thing is in the lowest priority "nice to have" category. I could imagine a speedometer visualization updated hourly showing # of Tera/Exa flops currently being donated. That would be really cool and generate continuing buzz around the project. In the meantime, let's hope that FAH is successful in fully simulating all 20 proteins in SARS-CoV-2. When that happens, I'm hoping that the recognition resulting from that will make additional resources available that could make something "cool" like real time stats available.

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:01 pm
by ajm
Joe_H wrote:That is what they are estimating from. They have estimates of how many floating-point operations completing a WU takes, depending on its size and how many time steps are involved, and can back out an estimate of the total being done per second over all the returns.
OK, thank you for taking the time!
Can we then say that this stat: https://stats.foldingathome.org/os shows the theoretical capacity of FAH, just as Summit's theoretical capacity, for example, is 200 petaFLOPS, whereas FAH hasn't been (can't be?) clocked the way Summit has been, at 148.6 petaFLOPS?

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:16 pm
by ajm
Nert wrote:I'm with OP, but recognize that this sort of thing is in the lowest priority "nice to have" category. I could imagine a speedometer visualization updated hourly showing # of Tera/Exa flops currently being donated. That would be really cool and generate continuing buzz around the project. In the meantime, let's hope that FAH is successful in fully simulating all 20 proteins in SARS-CoV-2. When that happens, I'm hoping that the recognition resulting from that will make additional resources available that could make something "cool" like real time stats available.
Yes, agreed 100%. It's just that I'm trying to put together a case for gathering those resources. I'm looking for solid arguments in favor of distributed supercomputing in general and FAH in particular. There are now more processors at large than anybody can gather in a supercomputer. I think it's the future of the sector, at least for the next 2-3 decades. But it needs convincing.

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 5:38 pm
by Neil-B
From my relatively non-technical perspective it might be reasonable to suggest that the one-day stats linked above are a near correlation to a "clocked" figure, whereas the longer time period estimates err towards a correlation with theoretical potential … The one-day figures are estimates of the actual FLOPS needed to complete the work actually done during that period (if I understand it correctly, they average the "highs and lows" over the 24hr period, so at times the real rate may peak higher or lower), so FAH has effectively been "clocked"?

I have watched HPC vendors argue over benchmarking - Each seems to prefer their own subtle definitions and distinctions of how benchmarks are defined and measured - and observed that system performance against real tasks can have significant variance against such benchmarked tests … FAH is real performance against the task that is required - the figures might be inferred estimates of FLOPs but to me that is still a healthy benchmark.

… but irrespective of how things are actually measured I would love to see (at some point when everything is less loaded) some real time visualisations of various performance metrics with periodic min/max as that would be quite cool (and no doubt non-trivial to actually make happen) … whether it will ever be "worth" doing this is thankfully not my call :)

Re: Statistics (not the usual...)

Posted: Tue Mar 31, 2020 6:23 pm
by Nert
ajm wrote:It's just that I'm trying to put together a case for gathering those resources. I'm looking for solid arguments in favor of distributed supercomputing in general and FAH in particular.
I think the use case will come after FAH delivers a big win for COVID-19. If that happens, I think that resources available for the project will expand a lot. One of the risks is that the success won't be credited to FAH. My concern is that some pharmaceutical company uses the results to develop a treatment for infected patients, but doesn't properly credit the project with providing the research that led to the treatment.
Neil-B wrote: … whether it will ever be "worth" doing this is thankfully not my call
Yup. :D ... appreciate your other comments as well.

Re: Statistics (not the usual...)

Posted: Tue May 05, 2020 8:14 am
by PantherX
I came across this link which has historic information regarding F@H OS Stats from 2013-12-15 to 2017-07-09 and thought it might be insightful: https://app.johanssonrobotics.com/folding/