Stats Update Constantly Unavailable
Posted: Sun Oct 19, 2008 5:18 am
Just had a quick question about why the stats server seems to go down almost every hour for an update... is it always like this? (I'm new to F@H.) Thanks!
Community driven support forum for Folding@home
https://foldingforum.org/
It's not the CPUs that matter. The stats database is HUGE. That's why it takes so long to update. Cody's suggestion means a couple more large RAID arrays would be needed, too. Moreover, if something bad happens during an update, recovering the corrected data becomes more and more difficult every time you make the process more complex. With live updates, data recovery might not be possible, or might take many, many days.
WangFeiHong wrote:
I agree with codysluder.
What is preventing us from having live updates?
Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
232 Intel DQ35JO motherboard
408 Intel Quad Q6600
099 450W PS
354 2 X 2 GB memory
331 2 Seagate 400 GB 7,200 rpm 16 MB cache in Raid 1
3rd party stats are easy to run since we've done "all the hard work" in our stats update. We need to take all the raw data from the servers, parse and validate it, and then enter it into multiple databases, including the WU db that mods use to check whether a given WU has been completed (the 3rd party sites don't have that). Just the mods' WU db alone is a huge endeavor, since we need to have a record for each individual WU.
WangFeiHong wrote:
I agree with codysluder.
What is preventing us from having live updates?
Bandwidth? I don't think so since it's all LAN there.
Processing speed? Can use codysluder's suggestion, after all Pande Lab has like a few hundred CPUs running SMPs now, surely they can afford 2 for stats???
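The update steps described in the reply above (take the raw data, parse and validate it, then enter it into several databases, including a per-WU record) can be sketched roughly as follows. This is only an illustration, not the actual Stanford code: the record layout, the table names (`wu_log`, `donor_stats`) and the helpers `make_db`, `parse` and `update_stats` are all made up for the example, and SQLite stands in for whatever database the real stats server uses.

```python
import sqlite3

# Hypothetical raw record format: "donor,team,wu_id,credit".
# This is NOT the real F@H server log layout.
RAW = [
    "alice,1,p2665_r1_c2_g3,1760",
    "bob,1,p2665_r1_c2_g4,1760",
    "alice,1,p2665_r1_c2_g3,1760",  # duplicate WU, rejected by the WU db
    "carol,x,broken",               # malformed, rejected by the parser
]

def make_db():
    """Create an in-memory database with a per-WU table and a totals table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE wu_log (wu_id TEXT PRIMARY KEY, donor TEXT, credit INTEGER);
        CREATE TABLE donor_stats (donor TEXT PRIMARY KEY, team INTEGER,
                                  credit INTEGER, wus INTEGER);
    """)
    return conn

def parse(line):
    """Split one raw line into a record dict, or return None if malformed."""
    parts = line.split(",")
    if len(parts) != 4:
        return None
    donor, team, wu_id, credit = parts
    if not (team.isdigit() and credit.isdigit()):
        return None
    return {"donor": donor, "team": int(team), "wu_id": wu_id, "credit": int(credit)}

def update_stats(conn, lines):
    """Parse, validate and load one batch. Returns (accepted, rejected)."""
    accepted = rejected = 0
    with conn:  # one transaction: committed at the end, rolled back on an uncaught error
        for line in lines:
            rec = parse(line)
            if rec is None:
                rejected += 1
                continue
            try:
                # One row per WU; the primary key rejects a WU reported twice.
                conn.execute(
                    "INSERT INTO wu_log (wu_id, donor, credit) VALUES (?, ?, ?)",
                    (rec["wu_id"], rec["donor"], rec["credit"]),
                )
            except sqlite3.IntegrityError:
                rejected += 1
                continue
            # Make sure the donor has a totals row, then add this WU to it.
            conn.execute(
                "INSERT OR IGNORE INTO donor_stats (donor, team, credit, wus) "
                "VALUES (?, ?, 0, 0)",
                (rec["donor"], rec["team"]),
            )
            conn.execute(
                "UPDATE donor_stats SET credit = credit + ?, wus = wus + 1 "
                "WHERE donor = ?",
                (rec["credit"], rec["donor"]),
            )
            accepted += 1
    return accepted, rejected

if __name__ == "__main__":
    db = make_db()
    print(update_stats(db, RAW))
    print(db.execute("SELECT * FROM donor_stats ORDER BY donor").fetchall())
```

Even in this toy version every WU costs one insert plus an update to the totals, which gives some feel for why the per-WU database dominates the work. A 3rd party site that only republishes totals skips that part entirely.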
Wise choices; stats are less important than having work units sent and received reliably.
VijayPande wrote:
3rd party stats are easy to run since we've done "all the hard work" in our stats update. We need to take all the raw data from the servers, parse and validate it, and then enter it into multiple databases, including the WU db that mods use to check whether a given WU has been completed (the 3rd party sites don't have that). Just the mods' WU db alone is a huge endeavor, since we need to have a record for each individual WU.
Updating the stats db server could help quite a bit. If I had $100,000 to spend on it, we could get a machine with lots of cores and *LOTS* of RAM, ideally enough to hold the relevant parts of the db in RAM simultaneously (e.g. 32 to 64GB RAM -- large size due to the WU db). With that, I think the stats update would be lightning fast and thus we could allow simultaneous web access during updates.
However, I have new server funds allocated to increase server reliability and increase storage space. Those seem to be higher priorities.
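For a feel of where a figure like 32 to 64GB could come from, here is a back-of-the-envelope sketch. The thread gives no record counts or record sizes, so every input below is invented for illustration only: the function `db_ram_gib`, the 150 bytes per record and the 1.5x index overhead are assumptions, not F@H numbers.

```python
def db_ram_gib(n_records, bytes_per_record, index_overhead=1.5):
    """Rough RAM (in GiB) needed to keep a table plus its indexes in memory.

    index_overhead is a guessed multiplier for indexes and bookkeeping.
    """
    return n_records * bytes_per_record * index_overhead / 2**30

if __name__ == "__main__":
    # Illustrative inputs only, not real F@H figures.
    for n in (100_000_000, 200_000_000, 300_000_000):
        print(f"{n:>11,} WU records -> {db_ram_gib(n, 150):.1f} GiB")
```

The point is only that with one record per WU, the total scales linearly with the number of WUs ever returned, so a per-WU table reaches tens of gigabytes long before the per-donor totals do.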
The page that says "come back later" could have links to 3rd party stats sites, but that puts Stanford in the position of recommending sites that it has no control over, besides the issue of expressing a preference. The links would also need occasional maintenance.
anandhanju wrote:
Can FAH/Stanford only process the stats/internal lookup database updates and provide a more detailed feed to the 3rd party stats sites? This way, FAH can shut down its stats servers whenever it wants to, while 3rd party sites can continue to show slightly outdated stats for 4 hours. I haven't seen them complain about live updates, as I expect that to be a relatively less computationally expensive operation (from VP's post). What I'm trying to get at is that FAH can close down its internet frontend for stats and leave it to 3rd party sites, provided they are willing and capable of handling the increased user connections.
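One way to picture the "feed" idea from the quoted suggestion: after each update, the stats server writes a snapshot file that 3rd party sites fetch, and the old snapshot stays readable until the new one is complete. The sketch below is an assumption-laden illustration, not how the real F@H feed works. The function `publish_feed`, the JSON layout and the field names are hypothetical. The only real mechanism relied on is that `os.replace` swaps the file in a single step, so a reader sees either the old feed or the new one, never a half-written file.

```python
import json
import os
import tempfile
import time

def publish_feed(rows, path):
    """Write a stats snapshot to `path` without exposing a partial file."""
    payload = {"generated_at": int(time.time()), "donors": rows}
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
        os.replace(tmp, path)  # the old snapshot stays visible until this call
    except BaseException:
        os.unlink(tmp)  # clean up the temp file if anything failed
        raise
    return payload

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        feed = os.path.join(d, "stats_feed.json")
        publish_feed([{"donor": "alice", "credit": 1760, "wus": 1}], feed)
        with open(feed) as f:
            print(json.load(f)["donors"])
```

With something like this, the frontend can go offline during an update while consumers keep serving the last snapshot. The `generated_at` field lets them show how stale it is.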