Stats Update Constantly Unavailable

Moderators: Site Moderators, FAHC Science Team

ericbarch
Posts: 4
Joined: Sun Oct 19, 2008 5:10 am

Stats Update Constantly Unavailable

Post by ericbarch »

Just had a quick question about why the stats server seems to go down almost every hour for an update... is it always like this? (I'm new to F@H.) Thanks!
torswin
Posts: 21
Joined: Mon Mar 24, 2008 5:02 pm

Re: Stats Update Constantly Unavailable

Post by torswin »

Here's a good place to get updates:
http://folding.typepad.com/news/

This one is relevant:
http://folding.typepad.com/news/2008/10 ... -soon.html
bbtkd
Posts: 1
Joined: Fri Aug 22, 2008 9:16 pm

Re: Stats Update Constantly Unavailable

Post by bbtkd »

I suspect what he was asking is why it is frequently unavailable because they are updating stats.

The way they have written the stats database, they must lock some or all of it while they are updating which seems to me to take a majority of the time. I believe it occurs hourly and seems to take a long time. Don't take this as a complaint per se, but isn't there some way to update the stats without locking it for so much of the time?
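As a rough illustration of the question above, here is a minimal sketch of one database reading old values while another connection is in the middle of an update. It is not how the Stanford stats database works; the thread never says what software it runs. SQLite, its write-ahead-log mode, the `stats` table, and the donor name are all assumptions chosen only because they fit in a few lines.

```python
import os
import sqlite3
import tempfile


def demo_read_during_update():
    """Return (credit seen while an update is pending, credit seen after commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stats.db")  # hypothetical database file

        writer = sqlite3.connect(path)
        # Write-ahead logging lets readers keep working while a write is pending.
        writer.execute("PRAGMA journal_mode=WAL")
        writer.execute("CREATE TABLE stats (name TEXT PRIMARY KEY, credit INTEGER)")
        writer.execute("INSERT INTO stats VALUES ('donor', 100)")
        writer.commit()

        reader = sqlite3.connect(path)
        query = "SELECT credit FROM stats WHERE name = 'donor'"

        # Begin the update. The change stays invisible to other
        # connections until the writer commits.
        writer.execute("UPDATE stats SET credit = credit + 50 WHERE name = 'donor'")
        during = reader.execute(query).fetchall()[0][0]

        writer.commit()
        after = reader.execute(query).fetchall()[0][0]

        reader.close()
        writer.close()
        return during, after
```

Calling `demo_read_during_update()` returns the old credit for the mid-update read and the new credit once the commit lands. Whether anything similar would scale to a database as large as the real one is a separate question that this toy example does not answer.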
DreadedOne509
Posts: 13
Joined: Sun Dec 02, 2007 1:29 pm

Re: Stats Update Constantly Unavailable

Post by DreadedOne509 »

I don't think there is a way to keep the stats unlocked while they are updating and still have accurate information reflected.
If they did, the information would only be half accurate at best since there are always work units being returned for credit.
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Stats Update Constantly Unavailable

Post by codysluder »

Suppose there are THREE copies of the stats database. Suppose all three copies are identical. The web page points to Copy A which is currently being used to provide answers to people who want to check their stats. Start updating copy B. When it has been updated, copy it to copy C. Switch the web interface so that copy B provides answers to stats queries. (The changeover is instantaneous, without any down-time.) Start updating copy C. Copy it to Copy A and then make Copy C active. Update copy A. Etc.
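The rotation described above can be sketched roughly as follows. This is only a toy model under stated assumptions: plain dictionaries stand in for the three database copies, and the class name `RotatingStats` and its methods are hypothetical, not anything from the actual stats server.

```python
class RotatingStats:
    """Three in-memory copies standing in for three copies of the stats database."""

    def __init__(self, initial):
        # All three copies start out identical.
        self.copies = [dict(initial) for _ in range(3)]
        self.active = 0  # index of the copy that answers stats queries

    def query(self, donor):
        """Answer a stats lookup from the currently active copy."""
        return self.copies[self.active].get(donor, 0)

    def update(self, new_credits):
        """Apply one stats update without touching the active copy."""
        nxt = (self.active + 1) % 3    # copy to update offline
        spare = (self.active + 2) % 3  # copy that will be updated next round

        # Update the offline copy while the active copy keeps serving queries.
        for donor, credit in new_credits.items():
            self.copies[nxt][donor] = self.copies[nxt].get(donor, 0) + credit

        # Duplicate the fresh copy so the next update starts from current data.
        self.copies[spare] = dict(self.copies[nxt])

        # Switch the web interface over; this is a single assignment.
        self.active = nxt
```

Each call to `update` advances the active copy by one (A, then B, then C, then back to A), and queries never read a copy that is partway through an update. The sketch ignores exactly the costs raised in the question that follows: the storage for three full copies and the time needed to duplicate one of them.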

How big is the database, and what would it cost to store three copies rather than one?
WangFeiHong
Posts: 47
Joined: Mon Oct 27, 2008 1:40 pm

Re: Stats Update Constantly Unavailable

Post by WangFeiHong »

I agree with codysluder.

What is preventing us from having live updates?

Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Stats Update Constantly Unavailable

Post by bruce »

WangFeiHong wrote:I agree with codysluder.

What is preventing us from having live updates?

Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
It's not the CPUs that matter. The stats database is HUGE. That's why it takes so long to update. Cody's suggestion means a couple more large RAID arrays would be needed, too. Moreover, if something bad happens during an update, recovery of the corrected data becomes more and more difficult every time you make the process more complex. Live updates might mean data recovery might not be possible, or might take many, many days.
WangFeiHong
Posts: 47
Joined: Mon Oct 27, 2008 1:40 pm

Re: Stats Update Constantly Unavailable

Post by WangFeiHong »

Kakaostats uses:

232  Intel DQ35JO motherboard
408  Intel Quad Q6600
099  450W PS
354  2 X 2 GB memory
331  2 Seagate 400 GB 7,200 rpm 16 MB cache in Raid 1

EOC uses some 8-core server with 2x15krpm seagate drives.

On the Stanford side they probably have more details though fewer graphs, but is it really necessary to record who completed every single WU? Just wondering whether it can be streamlined... Though I must say having more updated stats should probably be a low priority.
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: Stats Update Constantly Unavailable

Post by VijayPande »

WangFeiHong wrote:I agree with codysluder.

What is preventing us from having live updates?

Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
3rd party stats are easy to run since we've done "all the hard work" in our stats update. We need to take all the raw data from the servers, parse and validate it, and then enter it into multiple databases, including the WU db that mods use to check whether a given WU has been completed (the 3rd party sites don't have that). Just the mods WU db alone is a huge endeavor, since we need to have a record for each individual WU.

Updating the stats db server could help quite a bit. If I had $100,000 to spend on it, we could get a machine with lots of cores and *LOTS* of RAM, ideally enough to hold the relevant parts of db in RAM simultaneously (e.g. 32 to 64GB RAM -- large size due to the WU db). With that, I think the stats update would be lightning fast and thus we could allow simultaneous web access during updates.

However, I have new server funds allocated to increase server reliability and increase storage space. Those seem to be higher priorities.
Xilikon
Posts: 155
Joined: Sun Dec 02, 2007 1:34 pm

Re: Stats Update Constantly Unavailable

Post by Xilikon »

VijayPande wrote:
WangFeiHong wrote:I agree with codysluder.

What is preventing us from having live updates?

Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
3rd party stats are easy to run since we've done "all the hard work" in our stats update. We need to take all the raw data from the servers, parse and validate it, and then enter it into multiple databases, including the WU db that mods use to check whether a given WU has been completed (the 3rd party sites don't have that). Just the mods WU db alone is a huge endeavor, since we need to have a record for each individual WU.

Updating the stats db server could help quite a bit. If I had $100,000 to spend on it, we could get a machine with lots of cores and *LOTS* of RAM, ideally enough to hold the relevant parts of db in RAM simultaneously (e.g. 32 to 64GB RAM -- large size due to the WU db). With that, I think the stats update would be lightning fast and thus we could allow simultaneous web access during updates.

However, I have new server funds allocated to increase server reliability and increase storage space. Those seem to be higher priorities.
Wise choices, stats is less important than having work units sent and received reliably.
anandhanju
Posts: 522
Joined: Mon Dec 03, 2007 4:33 am
Location: Australia

Re: Stats Update Constantly Unavailable

Post by anandhanju »

Can FAH/Stanford only process the stats/internal lookup database updates and provide a more detailed feed to the 3rd party stats sites? This way, FAH can shut down its stats servers whenever it wants to, while 3rd party sites can continue to show slightly outdated stats for 4 hours. I haven't seen them complain about live updates as I expect that to be a relatively less computationally expensive operation (from VP's post). What I'm trying to get to is that FAH can close down its internet frontend for stats and leave it to 3rd party sites, provided they are willing and capable of handling the increased user connections.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Stats Update Constantly Unavailable

Post by bruce »

anandhanju wrote:Can FAH/Stanford only process the stats/internal lookup database updates and provide a more detailed feed to the 3rd party stats sites? This way, FAH can shut down its stats servers whenever it wants to, while 3rd party sites can continue to show slightly outdated stats for 4 hours. I haven't seen them complain about live updates as I expect that to be a relatively less computationally expensive operation (from VP's post). What I'm trying to get to is that FAH can close down its internet frontend for stats and leave it to 3rd party sites, provided they are willing and capable of handling the increased user connections.
The page that says "come back later" could have links to 3rd party stats sites, but that puts Stanford in a position of recommending sites that it has no control over... besides the issue of expressing a preference... plus the links do need occasional maintenance.