ideas on speeding up stats
Moderators: Site Moderators, FAHC Science Team
-
- Pande Group Member
- Posts: 2058
- Joined: Fri Nov 30, 2007 6:25 am
- Location: Stanford
ideas on speeding up stats
We have been looking at how we can speed up our stats updates. Looking at our db, we see that there's a primary culprit behind slow stats updates: the "Contributions by team and project" pages. These are the pages like
http://fah-web.stanford.edu/cgi-bin/mai ... range=4000
We think we should stop updating these pages so that the stats updates can run much faster. We have some ideas for how to get similar functionality by using other tables we have. This functionality would not list the number of WUs by project as the pages do now, but would allow donors to query a project number to find out how many WUs were completed.
Slow stats updates are going to be a problem unless we deal with this now, so we'd like to get moving on this. Please give us your opinion over the next week. Thanks!
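For illustration only, the on-demand lookup described above could look roughly like the following. This is a minimal sketch, not the actual stats database: the table name `credited_wus`, its columns, the donor names, and the project numbers are all made-up assumptions. The point is that a count is computed only when a donor asks for one project, instead of regenerating a full per-project page on every stats update.

```python
import sqlite3

# Hypothetical schema: one row per credited work unit. The table and
# column names are illustrative assumptions, not the real stats database.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE credited_wus (donor TEXT, team INTEGER, project INTEGER)"
    )
    # An index on (donor, project) lets a single lookup avoid a full table scan.
    conn.execute("CREATE INDEX idx_donor_project ON credited_wus (donor, project)")
    rows = [
        ("alice", 1, 1001),  # made-up donors and project numbers
        ("alice", 1, 1001),
        ("alice", 1, 1002),
        ("bob", 1, 1001),
    ]
    conn.executemany("INSERT INTO credited_wus VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

def wus_for_project(conn, donor, project):
    """Count the work units a donor completed for one project, on demand."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM credited_wus WHERE donor = ? AND project = ?",
        (donor, project),
    )
    return cur.fetchone()[0]

conn = build_demo_db()
print(wus_for_project(conn, "alice", 1001))
```

The trade-off is that a donor has to know which project number to ask about, which is the limitation raised in the replies.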
Re: ideas on speeding up stats
I seldom use these pages, mainly to look up projects I folded a while ago, so I probably won't remember the exact project numbers unless I see them. These pages definitely add to the FAH experience though.
Can I suggest another option?
Would it help your servers if these stats are updated less frequently, like once per week for instance?
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
- Contact:
Re: ideas on speeding up stats
I use these pages ... I'd like to be able to keep using this functionality, but I don't care if they're updated less frequently than the other stats ...
-
- Posts: 118
- Joined: Mon Mar 03, 2008 3:11 am
- Hardware configuration: Intel Core2 Quad Q9300 (Intel P35 chipset)
Radeon 3850, 512MB model (Catalyst 8.10)
Windows XP, SP2
- Location: Syracuse, NY
Re: ideas on speeding up stats
Does the code process everyone in the database every time, or is it able to only spend CPU cycles on donors who have submitted new results, and skip over inactive ones? I know you don't like to edit the database, but perhaps you could somehow separate donors and teams who have not been heard from in a very long time, such that very old results are archived, but not part of the processing. These unchanging data points would also not be packaged with the Lists that 3rd party sites download every day.
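A minimal sketch of the incremental approach asked about here, under assumptions: the in-memory `totals` dictionary and the `(donor, points)` result format are hypothetical stand-ins, not how the real stats code is known to work. It only shows that the work per update can scale with the number of new results rather than with the number of donors on record.

```python
def apply_new_results(totals, new_results):
    """Update running totals using only newly submitted results.

    totals:      dict mapping donor -> {"points": int, "wus": int}
    new_results: iterable of (donor, points) tuples received since the last update
    Returns the set of donors whose totals changed.
    """
    touched = set()
    for donor, points in new_results:
        # Donors with no new results are never visited at all.
        entry = totals.setdefault(donor, {"points": 0, "wus": 0})
        entry["points"] += points
        entry["wus"] += 1
        touched.add(donor)
    return touched

# Hypothetical data: one long-inactive donor and one active donor.
totals = {"old_donor": {"points": 500, "wus": 3}}
touched = apply_new_results(totals, [("active_donor", 100), ("active_donor", 50)])
print(sorted(touched))
```

Only the donors returned in `touched` would need their pages or list entries regenerated; the inactive donor's record is left alone.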
Core2 Quad/Q9300, Radeon 3850/512MB (WinXP SP2)
-
- Posts: 390
- Joined: Sun Dec 02, 2007 4:53 am
- Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.
Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that.
- Location: UK
- Contact:
Re: ideas on speeding up stats
Ideas:
1. Update the main stats page once per week as suggested (Sunday morning 3am??).
2. Update main stats once per day, and make it possible for the FAH client to download the relevant team/user data when it picks up a workunit, ready to be displayed in the MyFolding.html file. Maybe as an option in the cfg.
3. Make only short stats. The one only open to top 2k teams.
4. Do away with the stats pages and add more detail to the daily_user/team files, then create an offline reader to parse them so those who want to read their stats can.
Personally I use EOC more than I use Stanford.
-
- Posts: 460
- Joined: Sun Dec 02, 2007 10:15 pm
- Location: Michigan
Re: ideas on speeding up stats
I do use those pages, but you can dump them if you provide that alternate look-up system.
Proud to crash my machines as a Beta Tester!
Re: ideas on speeding up stats
I regularly use the (team member) summary page to see the number of points and work units that I have contributed to the team (and to print off certificates for major milestones!) but I do not use the breakdown by project at all. Hope this helps!
Note: Obviously I can get the statistics from third-party sites but they tend to lag behind Stanford and I cannot get the certificates from anywhere else (as far as I know)!
-
- Posts: 2948
- Joined: Sun Dec 02, 2007 4:36 am
- Hardware configuration: Machine #1:
Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).
Machine #2:
Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.
Machine 3:
Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32
I am currently folding just on the 5x GTX 460's for approx. 70K PPD
- Location: Salem, OR, USA
Re: ideas on speeding up stats
The only reason I access the Stanford stats is to check the number of CPUs. What I'd like is the ability to search the database to see the result of what was turned in, so I could compare logs against points/results. Sometimes the PPD that FAHMon reports is not even close to what the stats say I'm getting, and I have too many machines folding to keep track of them or to feel comfortable asking someone to look them all up.
Re: ideas on speeding up stats
I use the team page list with all the team members (not the one with just the first 1000 members) - that is very important, but for my purposes, updating it once a day would be all I need.
As for the project numbers & WUs completed - I've very rarely checked what projects I've done (haven't for over a year, except for yesterday when I checked on what the GPU2 client had worked on), so once a week would be fine, but entering the project number to see how much was done would be ok too.
The NCIX Forum Folding Team
-
- Posts: 14
- Joined: Sun Dec 02, 2007 7:32 pm
Re: ideas on speeding up stats
I like looking at them every once in a while. Would it be possible to only update those pages once a day/every other day?
Re: ideas on speeding up stats
Please don't get rid of them.
I keep an extensive personal stats record, and doing so counts a lot towards my folding pleasure. It adds colour to my folding.
The speed issue is no big deal, having stats every 3 hours instead of two would be OK.
The most important thing with stats is that they reflect the folding done. After 13 months of folding, 19 WUs (1.6%) are uncredited; more reliability should be aimed for.
Re: ideas on speeding up stats
First, I question why this poll exists. If anything the stats site should include more information and it should update more often.
As for speed I find it hard to believe that a program such as folding supported by Stanford University can’t afford a proper server. I also find it even harder to believe that with all the computer courses Stanford has a proper programmer can’t be found to maintain not only the servers but help with the code problems as well.
Just today I read this:
kasson wrote: "We try to be as transparent as possible with released projects, code, etc. We don't like publicizing things that are still under development precisely because of the fluid nature of that development--plans change, projects get delayed, we find bugs, etc. But requests for more communication are understandable and appreciated.
"One note about the quad-core issue: the performance of A1 work units on quad-core machines is something that took us by surprise. We expected much more efficient utilization. We've been working very hard to improve this, and we anticipate releasing an update to the A2 core in the near future that has very close to full utilization of all four (or more) cores. [One of our rare pre-release announcements.]
"One other response: we understand that many folders use points yield as a way of assessing the scientific impact of their contributions. We try to keep things as consistent as we can, but there are challenges both of inter-machine variation and of balancing points/effort and points/science."
How on earth can a staff member be surprised by something that was supposed to be tested?
If anything, we the folders need more accountability, not less.
-
- Posts: 170
- Joined: Sun Dec 02, 2007 12:45 pm
- Location: Oklahoma
Re: ideas on speeding up stats
BillR wrote: "As for speed I find it hard to believe that a program such as folding supported by Stanford University can’t afford a proper server. I also find it even harder to believe that with all the computer courses Stanford has a proper programmer can’t be found to maintain not only the servers but help with the code problems as well."
Do you have any idea of the costs associated with this program? Perhaps your endeavors aren't limited by finances, or by trying to figure out how to do something that's never been done before. If so, then why don't you write a check...
BillR wrote: "How on earth can a staff member be surprised by something that was supposed to be tested?"
I don't think you understand the issue... Stanford was surprised to find (during testing, since this had never been done before) how the Quad processors underperformed. There weren't Quad processors before, either. Who knew. This is all cutting-edge stuff. "Surprises" abound.
Ps In case you hadn't noticed, some of the most (if not The Most) illustrious names in modern programming are hard at work on various FAH clients and cores and myriad other FAH issues.
If you know of any Stanford (or other) students/coders who are competent in Molecular Dynamics and SMP (or GPU or just DC) programming, just let Pande Group know... no wait, they're there already.
Pps Sheesh
Facts are not truth. Facts are merely facets of the shining diamond of truth.
-
- Posts: 118
- Joined: Mon Mar 03, 2008 3:11 am
- Hardware configuration: Intel Core2 Quad Q9300 (Intel P35 chipset)
Radeon 3850, 512MB model (Catalyst 8.10)
Windows XP, SP2
- Location: Syracuse, NY
Re: ideas on speeding up stats
BillR wrote: "First, I question why this poll exists. If anything the stats site should include more information and it should update more often.
"As for speed I find it hard to believe that a program such as folding supported by Stanford University can’t afford a proper server. I also find it even harder to believe that with all the computer courses Stanford has a proper programmer can’t be found to maintain not only the servers but help with the code problems as well."
Stats have nothing to do with folding proteins. Pande Group also isn't made of money.
They're trying to improve performance by dropping an unpopular feature. Did you sign up here just to crap on everything? Go away.
Core2 Quad/Q9300, Radeon 3850/512MB (WinXP SP2)
-
- Posts: 357
- Joined: Mon Dec 03, 2007 4:36 pm
- Hardware configuration: Q9450 OC @ 3.2GHz (Win7 Home Premium) - SMP2
E7500 OC @ 3.66GHz (Windows Home Server) - SMP2
i5-3750k @ 3.8GHz (Win7 Pro) - SMP2
- Location: University of Birmingham, UK
Re: ideas on speeding up stats
BillR wrote: "As for speed I find it hard to believe that a program such as folding supported by Stanford University can’t afford a proper server."
http://fah-web.stanford.edu/serverstat.html
The Pande Group's main interest is in making sure that the whole network has enough units to keep processing consistently. The money they have goes on new servers to feed the clients with units (see the link above to get an idea of how many servers that requires), rather than on stats. For some people the stats are important, but without the work they would be irrelevant because nothing would happen. The Pande Group is more concerned (quite rightly IMO) with furthering the science than with spending endless hours tracking down tiny little bugs. On the whole, this works well, and only recently have there been any genuinely major problems. At the same time, there have been unit shortages, so their efforts have gone back into creating new units to keep 270,000+ machines happy and folding. Give them a break.
As for the cores not working as expected, they are still in testing. Ever wondered why the clients have "beta" in the version names? The SMP client is in beta, and always has been. It wasn't as if they released the A1 core as a full, finished product and were then surprised by the performance - the methods are still being refined. Properly multithreaded code is still in its infancy, and this is bleeding-edge programming. If anything the Pande Group should be congratulated for even attempting this, never mind getting near-perfect CPU usage at only the second attempt (the new A2 core), as the beta team are reporting.
Last edited by John Naylor on Wed May 14, 2008 4:32 pm, edited 1 time in total.
Folding whatever I'm sent since March 2006. Beta testing since October 2006. www.FAH-Addict.net Administrator since August 2009.