how much data has FAH generated
Moderators: Site Moderators, FAHC Science Team
I was just wondering: how much data have we generated?
Re: how much data has FAH generated
In what measurement?
Re: how much data has FAH generated
The last reliable measurement that I've heard comes from 2009.
http://en.fah-addict.net/articles/artic ... e-home.php
The Folding@Home project has been taken to an unprecedented scale: it is the largest distributed computing project in the world in terms of raw power. On our side, we contributors dedicate a small portion of our drives to the project's data files (no more than the current WU and any pending work items); the rest lives at Stanford. Each WU and result file is carefully preserved, and the results for a given project are then combined to create the protein videos the project has released.
All of this data is kept on storage servers at Stanford. People speak of more than 400 TB of valuable scientific data. Such storage is very expensive, however, and the computing power of the project keeps growing; the PS3 and GPU clients have only increased the need for storage space.
The principle of Storage@Home is simple: data derived from the WUs of Folding clients is sent to your PC. When a server needs data that you are mirroring, your computer is contacted and the data is uploaded.
However, this system requires some forward planning. First, data must be stored redundantly on multiple clients: it would be disastrous to lose simulation data because John Smith's hard drive crashed. Redundancy also allows load balancing, which gives the servers better data availability. Encryption, data signatures, and a digital fingerprint ensure that the content has not been modified or damaged, and that the sender is authorised by Stanford.
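As a rough illustration of the signature/fingerprint idea described above (this is a hypothetical sketch, not the actual Storage@Home protocol; the key name and chunk contents are made up), a keyed digest lets the server detect both corruption and unauthorised senders:

```python
# Hypothetical sketch of the integrity check described above -- NOT the
# real Storage@Home protocol. A mirrored chunk is stored together with a
# keyed fingerprint; on retrieval, the server recomputes the fingerprint
# to detect tampering, damage, or an unauthorised sender.
import hashlib
import hmac

SECRET_KEY = b"stanford-issued-key"  # assumption: a key issued to authorised clients

def sign_chunk(data: bytes) -> str:
    """Produce a keyed fingerprint for a data chunk."""
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()

def verify_chunk(data: bytes, fingerprint: str) -> bool:
    """True only if the chunk is unmodified and was signed with the key."""
    return hmac.compare_digest(sign_chunk(data), fingerprint)

chunk = b"simulation frame 42"
tag = sign_chunk(chunk)
assert verify_chunk(chunk, tag)            # intact chunk passes
assert not verify_chunk(b"tampered", tag)  # modified chunk fails
```

Storing the same chunk (with its fingerprint) on several clients then gives the redundancy and load balancing the article mentions.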
F@h is now the top computing platform on the planet, and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
Re: how much data has FAH generated
k1wi: I was thinking in terms of GB/TB, etc.
Re: how much data has FAH generated
Is everything since the very beginning really stored?
I have a vague recollection that some PG member (perhaps Vijay himself?) mentioned that some results had become outdated and were deleted. I don't trust my memory on this one, though; it could be that people merely discussed the possibility of discarding some results.
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
Re: how much data has FAH generated
My memory is vague, too, but I think I remember a discussion of remote off-line storage. Certainly Stanford has purchased a lot of RAID, but I sincerely doubt that all of the data is on-line.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: how much data has FAH generated
The quick, approximate answer is likely PB but not EB.
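For a sense of what those unit prefixes mean (using decimal SI prefixes; the 400 TB figure quoted earlier in the thread is from 2009):

```python
# Decimal SI storage prefixes: 1 PB = 1000 TB, 1 EB = 1000 PB.
TB, PB, EB = 10**12, 10**15, 10**18

print(400 * TB / PB)  # the 2009 figure of 400 TB is 0.4 PB
print(EB // TB)       # an exabyte is a million terabytes
```

So "PB but not EB" means somewhere between a thousand and a million terabytes.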
The intent is to retain analyzed data, but it is not always feasible to retain all "primary" data from the start of the project. We always try to retain as much data as we think feasible and as might be useful to other scientists in the future.
I won't speak for Dr. Pande regarding his intentions for the archival practices of the Stanford scientists, though.