171.67.108.54 (Fermi GPU P6802/3/4 series) Observations

Moderators: Site Moderators, FAHC Science Team

GreyWhiskers
Posts: 660
Joined: Mon Oct 25, 2010 5:57 am
Hardware configuration: a) Main unit
Sandybridge in HAF922 w/200 mm side fan
--i7 2600K@4.2 GHz
--ASUS P8P67 DeluxeB3
--4GB ADATA 1600 RAM
--750W Corsair PS
--2x Seagate Hybrid 750 & 500 GB
--WD Caviar Black 1TB
--EVGA 660GTX-Ti FTW - Signature 2 GPU@ 1241 Boost
--MSI GTX560Ti @900MHz
--Win7Home64; FAH V7.3.2; 327.23 drivers

b) 2004 HP a475c desktop, 1 core Pent 4 HT@3.2 GHz; Mem 2GB;HDD 160 GB;Zotac GT430PCI@900 MHz
WinXP SP3-32 FAH v7.3.6 301.42 drivers - GPU slot only

c) 2005 Toshiba M45-S551 laptop w/2 GB mem, 160GB HDD;Pent M 740 CPU @ 1.73 GHz
WinXP SP3-32 FAH v7.3.6 [Receiving Core A4 work units]
d) 2011 lappy-15.6"-1920x1080;i7-2860QM,2.5;IC Diamond Thermal Compound;GTX 560M 1,536MB u/c@700;16GB-1333MHz RAM;HDD:500GBHyb w/ 4GB SSD;Win7HomePrem64;320.18 drivers FAH 7.4.2ß
Location: Saratoga, California USA

171.67.108.54 (Fermi GPU P6802/3/4 series) Observations

Post by GreyWhiskers »

[Image: chart of WUs Available vs WUs Received on 171.67.108.54]

Every Fermi GPU WU I've received between 30 June and today, 22 July, has been from the P6802/3/4 series; some 220 of them by now.

Looking at the server stats, this project seems to be doing "just in time" reprocessing of completed WUs into new WUs along the project trajectories. The profile looks different from other projects I've followed: very few WUs Available, considering the frantic pace of assignments. The stats show this server handling roughly 30% or more of the GPU assignments.

[What follows is my pure conjecture about the project design from the server stats.]

When I produce these charts, I don't usually scale the WU Available and WU Received axes the same, since there are usually many, many more available than received. Here, though, they are scaled the same.

It looks like a large batch of WUs was generated to start the series out: 4,798 per the stats. Then, as the weighting steered more assignments to these WUs after the p6801 series apparently ran out, the reservoir drained.

In the last week or so it seems to have reached a steady state: roughly 850 WUs available against an average return of about 400 WUs every 35 minutes.

The back end must be reprocessing those returned WUs just in time to keep barely enough in the reservoir to match the assignments.
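A toy model of that dynamic, using only the numbers from this post (starting batch of 4,798 WUs, ~400 assignments per 35-minute stats interval; the regeneration rates are my assumptions, not measurements):

```python
def simulate(start, assign_rate, regen_rate, steps):
    """Toy just-in-time reservoir model: each stats interval, WUs are
    assigned out of the reservoir and returned WUs are reprocessed into
    new Gens that refill it. Returns the reservoir level per interval."""
    reservoir = start
    history = []
    for _ in range(steps):
        assigned = min(reservoir, assign_rate)  # can't assign what isn't there
        reservoir -= assigned
        reservoir += regen_rate                 # returns reprocessed into new WUs
        history.append(reservoir)
    return history

# Drain phase: assignments outpace regeneration, so the initial 4,798 falls.
drain = simulate(start=4798, assign_rate=400, regen_rate=300, steps=30)
# Steady state: regeneration matches assignments, so ~850 holds level.
steady = simulate(start=850, assign_rate=400, regen_rate=400, steps=30)
print(drain[-1], steady[-1])  # → 1798 850
```

Whenever regeneration equals the assignment rate, the reservoir holds wherever it happens to be, which matches the flat right-hand side of the chart.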

No big message here, just observations from a "staring at goats" retiree who tries to understand the project.

Thanks, PG, for keeping it interesting.
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.54 (Fermi GPU P6802/3/4 series) Observations

Post by bruce »

Consider that turn-around time matters enough to FAH that they're willing to award large bonuses to the SMP client based on how quickly a WU is returned . . . and there are indications the same concept applies to GPU and Uniprocessor WUs, though a similar time-based bonus system is not in place for GPU and is being phased in gradually for Uniprocessor projects.
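For reference, the SMP quick-return bonus is commonly described as a square-root multiplier on base points; the sketch below shows that general shape only (the constant k and the exact details vary per project, and the value used here is an assumption, not the official implementation):

```python
import math

def smp_points(base_points, k, deadline_days, elapsed_days):
    """Sketch of a quick-return bonus: faster returns earn a larger
    multiplier, and the award never drops below the base points.
    k is a per-project constant (hypothetical value used below)."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

# Returning in 1 day vs 4 days against a 6-day deadline (k = 0.75 assumed):
fast = smp_points(600, 0.75, 6.0, 1.0)
slow = smp_points(600, 0.75, 6.0, 4.0)
print(round(fast), round(slow))
```

The key property is the one bruce describes: every extra hour a WU sits on a machine shrinks the multiplier, so points track the value of fast turnaround.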

If somebody downloads a WU and keeps it X hours longer than necessary to process it, that X is a penalty to the science, whether or not it's reflected in the points.

But . . . the total time between Gens isn't just the time a donor spends processing the WU; it also includes uploading, downloading, and sitting on the server waiting to be assigned to someone. The server end of the process can contribute its own penalty delays, so a just-in-time server-side generation process is ideal. I've never seen it presented so clearly, but the right side of your graph shows a server that is tuned just about as well as it can be.
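That gen-to-gen latency is just a sum of components, which makes the tuning target easy to see (all of the hour figures below are invented for illustration, not measurements):

```python
def gen_latency(server_wait_h, download_h, processing_h, idle_h, upload_h):
    """Total time between consecutive Gens of one trajectory, in hours.
    Everything except processing is overhead that the server (wait) and
    the donor (idle) can squeeze toward zero."""
    return server_wait_h + download_h + processing_h + idle_h + upload_h

# Illustrative numbers only: a well-tuned just-in-time server keeps the
# server-side wait near zero, so overhead shrinks to transfer time.
tuned = gen_latency(server_wait_h=0.1, download_h=0.1, processing_h=6.0,
                    idle_h=0.0, upload_h=0.2)
slack = gen_latency(server_wait_h=12.0, download_h=0.1, processing_h=6.0,
                    idle_h=8.0, upload_h=0.2)
print(tuned, slack)  # → 6.4 26.3
```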

Keeping barely enough WUs in the reservoir to match the assignments is the goal, but there needs to be a safety margin to accommodate changes. Beyond a small reservoir on each server, that margin normally comes from running more than one set of projects on more than one server. The most important projects run near the just-in-time limit, while other projects with a lower priority (Weight) hold enough WUs that the Assignment Server can find something to assign even if everybody asks for work at the same time.
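A weight-based fallback like that can be sketched as a weighted random pick over servers that still have WUs in stock (the server names, weights, and counts here are invented; this is not the actual Assignment Server logic):

```python
import random

def pick_server(servers, rng=random.random):
    """Pick among servers that still have WUs, with probability
    proportional to each server's Weight. Because empty servers are
    filtered out first, lower-priority projects automatically absorb
    demand whenever the just-in-time servers run dry."""
    stocked = [s for s in servers if s["wus_available"] > 0]
    if not stocked:
        return None  # nothing to assign anywhere
    total = sum(s["weight"] for s in stocked)
    r = rng() * total
    for s in stocked:
        r -= s["weight"]
        if r <= 0:
            return s
    return stocked[-1]  # guard against floating-point leftovers

# Hypothetical pool: a high-weight just-in-time server plus a deep backup.
servers = [
    {"name": "171.67.108.54", "weight": 3.0, "wus_available": 850},
    {"name": "backup-a",      "weight": 1.0, "wus_available": 5000},
]
print(pick_server(servers)["name"])
```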

New projects go through several steps: normally (1) the project is loaded on the server, (2) some testing is done in the Pande lab, (3) the project is opened to beta testing (a limited number of people), (4) it is opened to advmethods testing (many more people), and (5) it is released to everybody. I didn't check the dates, but that probably explains the abrupt changes in WU availability (though I would expect to see one more abrupt change of smaller magnitude).

There can also be other reasonably abrupt changes, depending on how many servers are providing work for Fermi clients and on the Weight assigned to the various sources of WUs. (All this assumes the number of active clients changes very slowly. That's not unreasonable, considering that some 9,500 active NV clients (10-day moving average) are returning WUs.)