Proposal to the PG: Optimizing Performance using v7
Posted: Sat Nov 19, 2011 4:12 am
I have been thinking about the Pande Group's goals and policies, and the reasons behind their actions. There have been some recent changes to -bigadv that have led to extensive discussions about these reasons. It seemed that both the PG and the donors were debating performance, which involved discussions of deadlines and minimum system requirements. I thought about the implications for lower classes of WUs. To me, this was emphasized by the v7 client auto-configuring a setup in an apparent attempt to gain the most scientific ground using that computer, while making F@h much easier for a layman user. This user may not necessarily pay attention to some of the more subtle aspects of F@h, such as the deadlines that come into play especially for the high-performance clients that v7 installs by default. I propose a compromise that I believe finds the middle ground and should maximize scientific production for the newcomer. As this is relevant to my proposal, let me first list where I believe the Pande Group stands, so that I can explain why I believe this could be in their best interest.
It seems that the Pande Group makes their decisions on the following grounds:
1. Computers should be utilized to the greatest reasonable degree possible (emphasizing user choice of course)
2. Each type of Work Unit should be run on the most appropriate hardware available
3. Reasonable efforts should be made to minimize user interference or maintenance
4. Exceeding deadlines is detrimental to science and should be discouraged/prevented
5. Completing WUs significantly before the deadline is preferred
6. F@h should be as simple as possible for the donor
7. Changes to F@h should minimize donor disruptions while maintaining the above
8. F@h should be competitive, fun, and slightly addictive, while focusing on the above
9. Any proposed change must be examined in the long term for its cost-benefit ratio
I propose that the Pande Group implement a feature in v7 that allows the slot configuration to be automatically changed based on the measured performance of a small set of WUs. My idea is similar to the "Performance Percentages" idea in v6 that simply divided the amount of time it took the user to complete the WU by the WU's deadline. However, I can see how that could be an inaccurate measurement, since it only measures one WU. My idea, described in detail below, aims to use multiple WUs to get a better sample of the system's performance. This performance is tied not only to raw hardware speed but also to the system's availability hours. This measurement would all happen on v7's end, and should not require a server upgrade, which makes implementation much easier.
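To make the difference between the two measurements concrete, here is a rough sketch in Python. None of this is actual v7 code: the `CompletedWU` record, its fields, and both function names are made up for illustration. It only assumes that the client knows how long each WU took and what its deadline was.

```python
from dataclasses import dataclass


@dataclass
class CompletedWU:
    """Hypothetical record of one finished WU (not a real v7 structure)."""
    hours_taken: float     # wall-clock time from download to upload
    deadline_hours: float  # deadline assigned to this WU


def performance_percentage(wu):
    """Single-WU measure: time taken as a percentage of the deadline."""
    return 100.0 * wu.hours_taken / wu.deadline_hours


def sample_summary(wus):
    """Multi-WU measure: average percentage plus a count of deadlines met."""
    percentages = [performance_percentage(wu) for wu in wus]
    return {
        "mean_percentage": sum(percentages) / len(percentages),
        "met_deadline": sum(1 for p in percentages if p <= 100.0),
        "total": len(wus),
    }
```

Because the time is measured on the wall clock, a fast machine that is only switched on a few hours a day scores just as poorly as a slow one that runs around the clock, which is the availability point made above.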
Ever since F@h evolved from the screensaver, the uniprocessor has been the main recommended client. While it does do useful work, it acts as a set-and-forget client, and requires almost no interaction or maintenance by the user. However, if the new v7 client detects the appropriate hardware, it installs SMP+GPU. Compared to the uniprocessor, these two clients are very scientifically productive, use the most system resources, and have the shortest deadlines. As soon as v7 goes public, assuming the install policy does not change, high-performance clients will be installed by default. The v7 client does give a small description of what SMP and GPU are, but does not explain deadlines or system impact. It doesn’t need to, because that would be overly complicated for a user who just wants to donate to this interesting project he's heard about. I'm not implying that he should be forced into the uniprocessor, since 7im has informed me earlier that there are many people on the forums asking why their systems aren't fully utilized. Thus, I think there is an easy choice.
I propose that the v7 client contain another option, the "Auto" feature, which is selected by default. Users who are familiar with F@h can still jump to "SMP+GPU" if they wish, but "Auto" is tailored to newcomers. Although it knows that the computer system is capable of SMP+GPU, it installs only SMP. While SMP does have shorter deadlines, it is very scientifically productive, so it would be useful on computers that have the performance and availability to meet the deadlines. The CPU does an excellent job at backing off for the user's other applications. The v7 client should run 10 SMP WUs in this way, and record how many met the deadline. If 7 or fewer met the deadline, it uninstalls SMP and installs three uniprocessors (assuming the SMP is on 4 threads). I say "three" rather than "four" because it’s been my experience that three uniprocessors generate more heat (and thus a louder and faster fan) than -smp 4, and this makes sense to me because I can imagine the data overhead in moving information to/from the independent CPU cores. These uniprocessors have much longer deadlines, so this allows the donor to make scientific contributions based on their computer's performance. In addition, when this change happens there should be some sort of message displayed in the client (not a popup) explaining that their configuration was automatically changed based on their performance statistics. They should also have the option to easily undo the action if they feel their habits have changed. There's no way for v7 to determine if they later qualify for SMP and upgrade them, since SMP WUs are different, change over time, and it's difficult to determine how long one would take to run on their system without actually running it to completion. Benchmarks have been proposed before and then officially rejected. Thus it is by far easier to downgrade than to upgrade a configuration.
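The downgrade rule above could look something like the following sketch. This is only my illustration, not anything that exists in v7: the function name, the dictionary keys, and the action labels are all hypothetical. The post only specifies three uniprocessors for a 4-thread SMP slot, so the "threads minus one" generalization is an assumption on my part.

```python
TRIAL_SIZE = 10          # number of SMP WUs to measure before deciding
MIN_MET_TO_KEEP_SMP = 8  # 7 or fewer met deadlines triggers the downgrade


def decide_after_trial(met_deadline, smp_threads=4):
    """Decide what "Auto" does once the SMP trial WUs are in.

    met_deadline is a list of booleans, one per completed SMP WU,
    True if that WU was returned before its deadline.
    """
    if len(met_deadline) < TRIAL_SIZE:
        return {"action": "keep_measuring"}

    met = sum(met_deadline[-TRIAL_SIZE:])
    if met < MIN_MET_TO_KEEP_SMP:
        # Assumption: one fewer uniprocessor slot than SMP threads,
        # which gives three slots for the 4-thread case described above.
        slots = ["uniprocessor"] * max(1, smp_threads - 1)
        return {
            "action": "downgrade",
            "slots": slots,
            "notify_user": True,  # in-client message, not a popup
            "undoable": True,     # donor can easily revert the change
        }
    return {"action": "keep_smp", "slots": ["smp"]}
```

Note that there is deliberately no "upgrade" branch back from uniprocessor to SMP, for the reasons given above.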
Now, if it is determined that 8 or more of the rolling 10 SMP WUs make the deadline, then GPU should be installed as well if it’s possible. Nevertheless, it too should be measured. So if 7 out of a rolling 10 don't make the deadline, then the GPU slot should be uninstalled, again showing the aforementioned message. It's a good idea to delay the installation of the GPU slot like this, as donors may be caught off guard by every computing component suddenly generating heat and revving the fan up at the same time. Better to start at a reasonable level, and go up gradually from there.
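A "rolling 10" is easy to keep with a fixed-length queue, so here is a small sketch of the staged GPU step. Again, the class and function names are hypothetical and nothing here comes from the v7 source. The GPU removal threshold follows the literal wording above (7 of the last 10 GPU WUs missing the deadline) and is left as a parameter in case a different cutoff makes more sense.

```python
from collections import deque


class RollingDeadlineTracker:
    """Remembers whether each of the most recent WUs met its deadline."""

    def __init__(self, window=10):
        self.results = deque(maxlen=window)  # oldest result drops off automatically

    def record(self, met):
        self.results.append(bool(met))

    def full(self):
        return len(self.results) == self.results.maxlen

    def met(self):
        return sum(self.results)

    def missed(self):
        return len(self.results) - self.met()


def should_add_gpu(smp_tracker, gpu_available, min_met=8):
    """Add the GPU slot only after SMP has proven itself over a full window."""
    return gpu_available and smp_tracker.full() and smp_tracker.met() >= min_met


def should_remove_gpu(gpu_tracker, max_missed=7):
    """Remove the GPU slot when too many of its recent WUs missed the deadline."""
    return gpu_tracker.full() and gpu_tracker.missed() >= max_missed
```

Requiring a full window before either decision means nothing changes on the donor's machine until at least 10 WUs of that type have been completed.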
Now, given its resource impact, -bigadv should remain completely optional as it stands, so at least for now it is outside the scope of this proposal.
Both of these automatic configurations should leave the computer with the best possible setup given the donor's hardware usage. Back to my list of PG reasons, this completes #1. #2 is also covered because I'm sure there's been internal discussion which led to the SMP+GPU default installation by v7. #3 is covered by SMP, but to a lesser degree by GPU since GPUs have difficulty with priority processing. This setup also reduces the violations of #4, while still focusing on #1-3. It does not handle #5, but v7 could be modified to do this if it is a big enough issue. Although the automatic configuration changes could be a bit confusing to a donor who was carefully watching, I think that since most of the decision-making is automated, #6 is reasonably fulfilled. As I have mentioned, those who are familiar with F@h's clients can still choose the setup that is best for them. So long as they are aware of that, #7 should still be fine. No real change to #8, although without changes to computer habits PPD is basically maximized, which would be encouraging since v7 now includes obvious links to the stats pages. As for #9, that's not really my call, but as I have noted the servers shouldn't need to be changed, and most of the work would be up to v7, which given its flexible design should be up to the task.
I realize that I’ve said something close to this before, but after further consideration I feel that this is a more thorough idea. I hope that the Pande Group reviews and considers my proposal, and I welcome further comments and suggestions.
Thank you for your time in reading all of this,
Jesse V.