Project (WU) allocation question
Moderators: Site Moderators, FAHC Science Team
Project (WU) allocation question
I fold on a 1080 Ti. I generally see three sizes of WUs: around 50k points (e.g. project 9415), around 60k (e.g. 9431), and those over 100k. Over the weekend I experienced some issues with high-point work units failing (GPU overclock, I think).
Prior to this weekend I got mostly small and medium WUs. This weekend I got an influx of the large ones, I would guesstimate 50% or more. The large ones yield a higher PPD, and I like this.
Hence my question: is there a mechanism that gives users smaller WUs as a consequence of having failed large WUs, or something similar?
with regards
- Site Admin
- Posts: 7937
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2 - Location: W. MA
Re: Project (WU) allocation question
ikek wrote: bump.
Please read the Forum Policies post; bumping is not to be used here. Posted here as a reminder to other posters as well. Bump post deleted.
And to answer your question: no, there is no such policy. Your client will be assigned WUs that are available and can be processed on your system.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: Project (WU) allocation question
Thank you for your reply. It is nice to know the answer to my query. I asked simply because I have a widely varying PPD on my GPU (200-300k per 24h cycle) and seemingly "only" get projects 9414/9415/9431 rather than those nice big ones.
If I violated the forum policy, then I apologize for violating said policy through the use of a bump. However, I am not certain I did violate the forum rules.
From the forum rules: "Bumping or Spam-like posts: Repeated meaningless posts or other excessive posting. We do not require that every post be precisely on-topic, but we expect you to use good judgment."
There could be an ambiguity in both the forum rules and your post as to whether "bumping" means a single bump or several. Putting bumping and spam-like posts in the same sentence can also give the impression that a single bump is OK, within reason, but repeated bumps are seen as spam. A follow-up post (bump) about a week after the original post is hardly excessive, nor is it repetitive. Further, the text after the colon also explicitly refers to the plural. Now, I will not ask a moderator to read the forum rules, but I would implore the forum moderators to have a look at the wording of the above quote and, if necessary, amend it.
My interpretation of the above quote, from the forum rules, is that I have not been in violation of the forum rules/policy, because said sentence permits the singular. Further, I showed exemplary judgement with regard to the bump by waiting about a week (I think; hard to say, since my post was removed) before posting the solitary bump. Nor can it be described as repetitive.
Further, I find it uncouth to quote my post and then delete the post you referenced. Only you and I know that the post was quoted in its entirety. If you found it worthy of quoting, then it has sufficient worth to be left in the thread, regardless of it violating the forum rules, and especially given the manner in which you used it. If it is standard protocol to remove an erring post and simultaneously reference it, without the public knowing whether it was completely or selectively quoted, then I am astonished. It is simply bad form and bad practice.
I also question the manner in which the quote is used to dress me down, and how said dressing down is used as a reminder for the forum as a whole. I do not mind the reference to the forum rules, that's on me, but the open reminder to all others is unnecessary. In essence, what you are stating is: "Hey! Look at this idiot, don't be like him, and everything will be OK!" Obviously I exaggerate, but this could have been handled by removing the post and sending a PM with the same text as in the above post.
With regards
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: Project (WU) allocation question
Not a dressing down, just good advice: take this to PMs! Later in the sentence you quote is "We do not require that every post be precisely on-topic" but this sure isn't about folding or how to fold.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Re: Project (WU) allocation question
ikek wrote: I fold on a 1080 Ti. I generally see three sizes of WUs: around 50k points (e.g. project 9415), around 60k (e.g. 9431), and those over 100k. Over the weekend I experienced some issues with high-point work units failing (GPU overclock, I think).
One of FAH's goals is to align the points with scientific value, which includes both the complexity of the computation and the speed with which it is completed. Another goal is to find an alignment which works for any of the supported GPUs. Both are difficult to do.
Considering the second, and looking at the 1080 Ti compared to other GPUs: the 1080 Ti is more efficient with WUs of around 100k atoms and less efficient with smaller proteins. That means that an overclocked GPU which is stable on a small protein is more likely to become unstable on a large WU. In contrast, a less powerful GPU is better at producing nice PPD with those smaller proteins. That calls into question the ability to establish a "fair" PPD value that's linear for both classes of GPUs, since the production is decidedly non-linear.
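To make that non-linearity concrete, here is a toy comparison with entirely invented throughput numbers; it is only meant to illustrate why one points value cannot scale linearly for both a high-end and a low-end card.
Code: Select all

# Invented numbers, purely to illustrate the non-linear production described above.
# Values are arbitrary "work per day" figures for one small and one large WU.
small_wu = {"fast GPU": 900, "slow GPU": 500}   # fast card partly idle on a small protein
large_wu = {"fast GPU": 600, "slow GPU": 120}   # fast card fully loaded on a large protein

def speed_ratio(per_card):
    return per_card["fast GPU"] / per_card["slow GPU"]

print("small WU: fast card is %.1fx the slow card" % speed_ratio(small_wu))  # ~1.8x
print("large WU: fast card is %.1fx the slow card" % speed_ratio(large_wu))  # ~5.0x
# Any single credit value that looks "fair" on the large WU will over- or
# under-reward one of the two cards on the small WU.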
As far as specific assignments are concerned, several servers at mskcc.org (IPs 140.xx.xx.xx) were down, which probably accounts for your observation of a change in the variety of assignments.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Project (WU) allocation question
Bruce, thank you for your informative reply. If you do not mind, I will ask a few follow-up questions:
bruce wrote: One of FAH's goals is to align the points with scientific value, which includes both the complexity of the computation and the speed with which it is completed. Another goal is to find an alignment which works for any of the supported GPUs. Both are difficult to do.
Considering the second, and looking at the 1080 Ti compared to other GPUs: the 1080 Ti is more efficient with WUs of around 100k atoms and less efficient with smaller proteins. That means that an overclocked GPU which is stable on a small protein is more likely to become unstable on a large WU. In contrast, a less powerful GPU is better at producing nice PPD with those smaller proteins. That calls into question the ability to establish a "fair" PPD value that's linear for both classes of GPUs, since the production is decidedly non-linear.
As far as specific assignments are concerned, several servers at mskcc.org (IPs 140.xx.xx.xx) were down, which probably accounts for your observation of a change in the variety of assignments.
1. In the current range of GPUs (sticking with Nvidia), from e.g. a 1050 Ti at the low end to a Titan variant at the high end, do all of these qualify for both smaller and larger work units as long as they can meet the deadline set by the work unit? Put differently, is there an allocation mechanism that pushes bigger work units to quicker GPUs, or is it more or less a free-for-all?
2. If there is no allocation mechanism, would this not be an inefficient process for the project as a whole?
3. This might be a dumb question: does PPD have to be fair? Say a higher-end card produces 10x the effort for the overall project relative to a slower card, why can it not get 10x the PPD?
4. Regarding smaller proteins (projects): from your post I read that slower GPUs are better than quicker GPUs at folding the smaller projects (in terms of PPD?). Is this correct? Will a quick GPU not always be faster than a slower GPU? If a quick GPU is not always faster, why are these projects allocated to the higher-end cards at all when they comparatively achieve less? I guess this ties into the issue of linearity(?).
JimboPalmer wrote: Not a dressing down, just good advice: take this to PMs! Later in the sentence you quote is "We do not require that every post be precisely on-topic" but this sure isn't about folding or how to fold.
I will be brief. The only reason I chose to reply in the thread rather than take it to a PM was that I got the response in the public domain. Period. Further, what I truly objected to was having my post deleted and then quoted. There are many words that can describe such acts, and it is certainly not good form (here I am being diplomatic). Whether it was a dressing down is probably a matter of perspective, given that it was left in public.
with regards
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: Project (WU) allocation question
ikek wrote: 1. In the current range of GPUs (sticking with Nvidia), from e.g. a 1050 Ti at the low end to a Titan variant at the high end, do all of these qualify for both smaller and larger work units as long as they can meet the deadline set by the work unit? Put differently, is there an allocation mechanism that pushes bigger work units to quicker GPUs, or is it more or less a free-for-all?
2. If there is no allocation mechanism, would this not be an inefficient process for the project as a whole?
3. This might be a dumb question: does PPD have to be fair? Say a higher-end card produces 10x the effort for the overall project relative to a slower card, why can it not get 10x the PPD?
4. Regarding smaller proteins (projects): from your post I read that slower GPUs are better than quicker GPUs at folding the smaller projects (in terms of PPD?). Is this correct? Will a quick GPU not always be faster than a slower GPU? If a quick GPU is not always faster, why are these projects allocated to the higher-end cards at all when they comparatively achieve less? I guess this ties into the issue of linearity(?).
1) There is a mess of clients, cores and servers through which a new idea has to be implemented. Watch your idea (already started) spread through F@H:
a) About 6 months ago I started noticing numbers after the GPU name in GPUs.txt
[GeForce GTX 1080] 8873
[GeForce GTX 1070] 6463
(I have no idea what this number represents, ideally it is the minimum number of atoms in a protein worth folding on this GPU, perhaps scaled by a thousand)
Only some cards are so marked, but new cards seem to be getting a performance number of some sort; regard this as step one.
b) That number is now sent back to the server as part of the "end of this WU, start of that WU" sequence; regard this as step two.
c) In some way, the researcher needs to communicate how large a protein he is working on in this WU: it could be guessed from file size, he may specify the number of atoms, etc. (due to the problems I will address in 4, we cannot trust the researcher).
d) With the power of the GPU and the WU size, the server code could match WUs to the best GPU (a rough sketch of what such matching might look like follows after point 4 below).
I have no evidence c or d have been done. But I am not sure how I would get evidence.
2) The software developers have that issue with multiple clients, multiple cores, and servers all over the world. Getting an idea from inspiration to completion is measured in years (based on projects I was involved in, a minimum of 4 years, although I only ever had one server).
3) Fairness means that a GPU that is twice as powerful should get twice the points per day. But as you see, it is hard to pick a reference point that is fair to all: some slower cards will fail to complete a WU if it is too big, and some faster cards will have unused shaders if it is too small.
4) Correct, and this is why we can't 'trust' the researchers: they will want the fastest cards, even if that means a card is not fully utilized and getting lower points. The strategy above, where GPUs.txt gives the card's strength and the server then picks the WU best suited to that card, seems to take human 'greed' out of the decision. (Some donors will still fold intermittently, throwing all the calculations off.)
If no larger WU is available, the top-end cards should still get smaller WUs, and low-power cards larger WUs, but ideally this will be rarer.
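Here is the rough matching sketch promised under d). It is purely speculative on my part, not how the assignment servers actually work: it just pairs the largest pending WU with the strongest idle GPU, using GFLOPS-style numbers like those in GPUs.txt. The 1080 and 1070 ratings are the ones quoted above; the third card and all atom counts are invented.
Code: Select all

# Hypothetical sketch of step d): hand big WUs to strong GPUs.
gpus = [("GeForce GTX 1080", 8873), ("GeForce GTX 1070", 6463), ("GeForce GTX 1050 Ti", 2100)]
work_units = [("project A", 220_000), ("project B", 110_000), ("project C", 62_000)]  # atoms (invented)

def assign(gpus, work_units):
    """Greedy matching: strongest idle GPU gets the largest pending WU."""
    by_power = sorted(gpus, key=lambda g: g[1], reverse=True)
    by_size = sorted(work_units, key=lambda w: w[1], reverse=True)
    return list(zip(by_power, by_size))

for (gpu, gflops), (project, atoms) in assign(gpus, work_units):
    print(f"{gpu} ({gflops}) <- {project} ({atoms:,} atoms)")
# A real implementation would also have to respect deadlines, project priority
# and server availability, which is part of why this is harder than it looks.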
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Re: Project (WU) allocation question
1) The fundamental concept is that FAH will make use of whatever hardware is made available by donors. If a certain CPU or GPU is capable of processing a WU within that WU's deadline, then it should be assigned work. When FAH originated, there was so much work to be done that they didn't care whether you had fast hardware or slow hardware ... they'd give you an assignment. That's still pretty much true.
Projects were given a priority, meaning that projects which were considered more critical were chosen for assignment more frequently than projects which were considered less critical. The objective was still to maximize the time that WUs were being processed rather than waiting to be assigned -- without regard to the type of equipment that was chosen for the assignment. The process of maximizing active processing had to work as new projects were added and older ones were completed, and especially when servers went offline or became available again.
2) FAH looks at the assignment process from the server side, which leads to a rather different perspective than what you see as an individual donor. They don't consider the history or capability of an individual donor, even though individual points are reported.
The bonus system was added to discourage someone from downloading an assignment and then turning their computer off overnight, which distorted the picture of whether a WU was actually being worked on. It also discouraged those who managed to run multiple WUs at once (originally with HyperThreading) with less processing power per WU than allocating those same resources to individual WUs. FAH emphasizes faster turn-around time rather than higher CPU utilization. (A rough worked example of the bonus formula, as commonly documented, appears after point 4 below.)
3) As I have implied already, they do NOT attempt to assign based on comparisons between Donor hardware. If your hardware is 10x faster than my hardware, it's still important that we're both given a useful assignment. You'll be awarded more than 10x as many points as I will, no matter which project is assigned to either of us.
It would take a major redesign of the assignment system to come up with a way to list each project that's instantaneously available and each GPU/CPU that's seeking work at this moment and continuously re-optimize that part of the assignment process. It's true that some things would work better if they did, but it's not realistic with the development resources that FAH would be able to allocate to that project.
Consider the fact that when FAH is initially installed, the default configuration often creates a slot that's configured for 7 CPUs -- and we've known there are no assignments for 7 CPUs for quite some time ... In your case, the objective is still to make sure you get an assignment that keeps your system busy without regard to how it compares to some other available assignment that you'd rather have.
4) If your GPU is less efficient on some projects than on other projects, that's an issue for NVidia to solve. In the past, that was a major issue of contention between Donors who had ATI vs. NV preferences -- back when all projects were much smaller. ATI's internal design required that each of the shaders processed the same sequence of operations, whereas NV's design allowed more flexibility in what individual shaders did. This meant that the hardware efficiency varied with protein size. FAH didn't respond by designing a system that customized assignments by protein size, so I doubt they will do it now ... and base it on the type of NV GPU you have.
(It would have been a lot easier to consider ATI vs. NV, since that has always been reported to the server, than to somehow rank the variations between the various NV GPUs, which aren't really known by the servers.)
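As promised under 2), here is a rough worked example of the quick-return bonus. The formula shown is the one commonly documented for FAH's bonus points (final credit = base credit x max(1, sqrt(k x timeout / elapsed))); the project numbers fed into it below (base credit, k-factor, timeout) are purely hypothetical.
Code: Select all

import math

def qrb_points(base_points, k_factor, timeout_days, elapsed_days):
    """Quick-return bonus as commonly documented:
    final = base * max(1, sqrt(k * timeout / elapsed))."""
    bonus = math.sqrt(k_factor * timeout_days / elapsed_days)
    return base_points * max(1.0, bonus)

# Hypothetical project: 10,000 base points, k = 26, 3-day timeout.
for days in (0.15, 0.5, 1.0, 3.0):
    print(f"returned in {days:>4} days -> {qrb_points(10000, 26, 3, days):>8.0f} points")

The point is simply that credit rises steeply with faster turn-around, which is why return time matters more than squeezing out the last bit of utilization.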
Posting FAH's log:
How to provide enough info to get helpful support.
- Site Admin
- Posts: 7937
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2 - Location: W. MA
Re: Project (WU) allocation question
JimboPalmer wrote: a) About 6 months ago I started noticing numbers after the GPU name in GPUs.txt
[GeForce GTX 1080] 8873
[GeForce GTX 1070] 6463
(I have no idea what this number represents, ideally it is the minimum number of atoms in a protein worth folding on this GPU, perhaps scaled by a thousand)
Only some cards are so marked, but new cards seem to be getting a performance number of some sort, regard this as step one.
The number is the card's performance listed in single-precision GFLOPS. It is currently not used; Bruce has been adding it for possible future use by the servers during WU assignment.
As for what is used, the listing for a card consists of four fields - the device ID, GPU type (AMD or nVidia), GPU species, and description. The last is currently just cosmetic; it is only used for display purposes in the client. Further information on the format of GPUs.txt entries can be found here - https://fah.stanford.edu/projects/FAHCl ... tOfGPUsTxt.
Currently, different assignment preferences are associated with the GPU species. However, that lumps, for example, all Pascal cards together and does not let the assignment servers distinguish between a 1060 and a 1080. And since this is just a preference, if no WUs with the highest preference for a type of card are available, the assignment will go to the next available type of project that can be processed.
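For anyone curious how those fields might be consumed programmatically, here is a minimal parsing sketch. It assumes colon-separated entries containing exactly the four fields listed above, with JimboPalmer's GFLOPS-style number at the end of the description; the sample lines are illustrative only, so treat the linked GPUs.txt documentation as authoritative (the real file may include additional fields, e.g. a vendor ID).
Code: Select all

import re

# Illustrative sample only -- real GPUs.txt entries may differ.
SAMPLE = """\
0x1b80:2:8:GP104 [GeForce GTX 1080] 8873
0x1b81:2:8:GP104 [GeForce GTX 1070] 6463
"""

def parse_gpus_txt(text):
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        device_id, gpu_type, species, description = line.split(":", 3)
        # Trailing number, when present, is the single-precision GFLOPS figure.
        match = re.search(r"(\d+)\s*$", description)
        entries.append({
            "device_id": device_id,
            "type": gpu_type,
            "species": species,
            "description": description,
            "gflops": int(match.group(1)) if match else None,
        })
    return entries

for gpu in parse_gpus_txt(SAMPLE):
    print(gpu["description"], "-> species", gpu["species"], "/", gpu["gflops"], "GFLOPS")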
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: Project (WU) allocation question
I started this before the last two posts, so it could be a bit out of place, but I'll leave it as is. Edit over.
JimboPalmer, thank you for the lengthy response; it was an interesting read.
I think I should preface my writing by saying that I see things from a client perspective and that I have limited to no knowledge of how the "behind-the-curtain" elements function. Also, I am generally talking about GPU folding.
1. I took it for granted that there is/was a centralised allocation mechanism that sluices WUs to the appropriate available resource. One that gives big stuff to quick cards and so forth (not to simplify too much).
3. I understand that finding a common denominator to establish the distribution of award points for work can be difficult, especially as it has to take generations of GPUs into account and a host of other variables. I am sure it could be seen as elitist, and it probably is, that projects like 9414/9415, which give me roughly 50k points on a 1080 Ti, are perceived as small - hence less desirable - when the very same WUs would be a daunting challenge for comparatively slower cards. And there is the question of how to balance this.
I do not think I have questioned the points distribution for a project per se; rather, I have questioned the distribution of work units. Ideally I would prefer my system to receive projects like 174413, which utilize the donated processing power fully. Over a 24h cycle I get between 10 and 18 work units. Roughly, if I get 15 or more WUs then I go under 1m PPD, and I have more 24h cycles nearer 18 work units than 10. So there is some potential for improvement as seen from a client perspective.
4. I found this a fascinating tidbit.
with regards
edit: first line of post.
Re: Project (WU) allocation question
@bruce
bruce wrote: ...
1 & 2) For prioritised projects (perhaps 9414), is there, in addition to the normal points distribution, a further priority bonus? The reason I ask is that there is a balance between the needs of the users (folders) and the project as a whole. If I could choose, then I would prefer not to fold project 9414, assessing solely on my PPD return. I can have a fluctuation of up to 300k points per 24h cycle, which I find a little excessive.
Further, I wonder about the first-come, first-served approach. If a slow GPU that gets 200-300k PPD would spend 10 hours or more on a WU from e.g. project 11432, would it not be in the interest of the project to allocate said WU to a quicker GPU, like e.g. a 1080 Ti, that spends a little over three hours on the same work unit? Is it not fair to assume that the longer a GPU spends on a work unit, the higher the probability that something will interrupt the completion of the WU, thus possibly affecting overall project progress? I am obviously cherry-picking the circumstances, but I find the overall approach was perhaps an excellent solution when FAH was in its infancy, yet perhaps less so today. Then again, there are probably a bunch of variables I am not aware of that make the issue significantly more complex.
It might be a dumb question, but why does the assignment process not take the individual donor's history and capability into account when allocating work? Is the relevant information not already available?
3) Regarding the 10x PPD, I personally see the deviation in PPD (per 24h) as more of an issue than how much my hardware gets compared to something else. Perhaps the 1080 Ti and Titan are more vulnerable in this regard(?).
Is there a white- and/or blacklist for individual projects? How controversial would it be to propose that, say, project 11432 (bigger WUs) is only assigned to, say, Nvidia 1070 or better GPUs (and AMD equivalents)?
4) Is it still the case that hardware efficiency varies with protein size? That seems a possibility on high-end cards like the 1080 Ti, though I am wary of using efficiency as a metric. Could it not be that, under some conditions, some GPUs have more available resources than the specified task demands and hence only allocate a portion of their available computing power?
I wonder about the portion you put in parentheses. From my reading of the FAHControl log, the log includes a set of parameters that, from what I can comprehend, should enable ranking of the tiers of graphics cards, irrespective of whether it is a red or a green GPU. The server side knows which GPU is in use for a specific slot, it knows which projects have been allocated, and it knows when a WU was allocated and when it was completed (from the next WU request), and so forth. Based on this information from a large set of users, it should be possible to take a statistical approach to which GPU tier gets which set of available WUs for optimum allocation efficiency. Obviously other rules would also be included, e.g. prioritising certain projects, accounting for shortages, and so forth.
Perhaps my programming naivety, or rather the absence of programming experience, is coloring my perception of the issue at hand.
Again thank you for an interesting reply.
with regards
PS: is there somewhere I can adjust the timer for the automated log out? I got logged out when writing the post.
Re: Project (WU) allocation question
Historically, there have been a number of FAHCores, with (very) rare instances where changes were made to enhance the types of science that could be done. There have also been new FAHCores which add support for new devices.
Originally, the CPU core was written for a single CPU core, since early home computers contained CPUs that supported only one processor. A new version supported multi-threaded operation; a new version supported SSE2; a new version now supports AVX. At some point, the uniprocessor core was merged with the multiprocessor core. At some point, non-SSE2 CPUs were rare enough that it was no longer necessary to support both SSE2 and non-SSE2 processors. Currently, there is support both for AVX hardware and non-AVX hardware.
The first GPU core worked only on early ATI hardware. Later, a version was created that supported early NV hardware, and code was later developed that worked with Fermi or better. Some FAHCores have worked with CUDA (NV only) while others worked with OpenCL. (OpenCL is supported on either ATI or NV, and until recently, Apple didn't provide sufficient support for OpenCL for it to be used by FAH.)
At all times, it was necessary for the FAHClient to report enough information for the servers to provide access to a version of a FAHCore that worked on the designated hardware. Currently, that's important for CPUs, since there are a lot of CPUs with AVX support as well as a lot with only SSE2 support. The current FAHCore for GPUs needs only OpenCL, so it's being used on ALL GPUs. If a future FAHCore happens to be compiled to use CUDA, it will be necessary to differentiate between GPUs that support CUDA and those which do not. (That is already detected by distinguishing between ATI and NV Fermi-or-better.)
The priority of a project has been used by the server to alter the percentage of the active projects that are being assigned. I'm not sure if that server feature is currently active, but it's not a donor-controlled feature and it does not affect the points. Baseline points are based on the quantity of work in the WU, on the assumption that if something requires more ops, it doesn't matter whether they're completed on your hardware or mine. Bonuses are awarded based entirely on the elapsed time from when a WU is assigned until it is returned.
Keeping track of an individual donor's history is another developmental enhancement that is unlikely to be allocated sufficient development resources. If N new donors are added, total performance goes up, and the value of N doesn't need to be very big to exceed the gains from the enhancements you're asking for.
The variables that you're not taking into account are: (A) available development resources are extremely limited; (B) the FAHClient doesn't gather the data the servers would need; (C) useful contributions by slow GPUs are scientifically valuable too, even if they earn fewer points.
ikek wrote: Is there a white- and/or blacklist for individual projects?
No.
Assignments are based on which FAHCore is required, as stated above. A project owner can reconfigure his project to avoid Windows/Linux/Mac or to avoid NV/ATI. The question (s)he has to answer is "Can this class of hardware help my project, or should it be excluded?" The whitelist/blacklist is intended exclusively to exclude GPUs that cannot be supported.
In the situation a few days ago, ASSUMING the owners of projects like 9414/9415 had excluded a class of GPUs including your system, and then the servers with larger projects went off-line for a few days to rebuild the RAID, assignments to that class of GPUs would probably have lapsed, leaving a lot of idle resources for several days.
ikek wrote: Could it not be that, under some conditions, some GPUs have more available resources than the specified task demands and hence only allocate a portion of their available computing power?
Possibly. I don't know enough about what the GPU designer put into his design. Maybe something is saturating (other than the GFLOPS of the shaders). FAH passes the data to be processed into the OpenCL API and waits for it to be completed. How it's processed by the (hardware + drivers) is up to NVidia.
Posting FAH's log:
How to provide enough info to get helpful support.
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: Project (WU) allocation question
ikek wrote: I think I should preface my writing by saying that I see things from a client perspective and that I have limited to no knowledge of how the "behind-the-curtain" elements function. Also, I am generally talking about GPU folding.
(This post contains a lot of jargon and acronyms, sorry. If you are curious, they are defined on Wikipedia.)
In 1986-1988* I had the experience of writing BASIC for 300 IBM PS/2 30 PCs spread across Canada to talk to COBOL on an IBM S/36 near Toronto, which in turn talked to CICS and RJE on an IBM 3080 in the same room. You will notice that this is simpler than F@H, but it kept me busy writing for two years. We hear of at least Assignment Servers, Collection Servers, Work Servers, and a Stats Server; the exact role of each we can only guess. It is also clear that while many of these are at Stanford, they are spread across many campuses. (I think all are plural except the single Stats Server, which is at Stanford.)
I am just a donor in F@H but as an old programmer who has written for PCs to talk to servers, I have (hopefully) intelligent guesses.
*This being the '80s, the PCs had 1200 baud dial-up modems; the S/36 talked at 56k to an X.25 PAD and communicated with the 3080 via LU6.2 56k Bisync. I compressed data so that the average (non-interactive) phone call lasted under 1 minute, and only sent data via RJE to or from the IBM 3080 overnight. (The good old days were not good, just slow.)
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Re: Project (WU) allocation question
@JimboPalmer: I understand that organizing and running a project that is currently at nearly 100 x86 TFLOPS is complex machinery, though I perhaps do not fully understand the complexities. As with many things in life.
Thank you for an informative reply.
bruce wrote: Keeping track of an individual donor's history is another developmental enhancement that is unlikely to be allocated sufficient development resources. If N new donors are added, total performance goes up, and the value of N doesn't need to be very big to exceed the gains from the enhancements you're asking for.
Since there is, publicly at least, no information about the distribution of hardware tiers within a hardware category, it is impossible to quantify whether N+x new donors would be a better approach than optimising for the existing folders. The project is currently running at around 98 x86 TFLOPS (http://folding.stanford.edu/stats/os). By way of comparison, a 10 percent efficiency optimisation would probably require a significant amount of recruiting to match.
Sticking with GPUs (and my case is possibly an outlier in the grand scheme of things): I have up to 300k PPD variance, and a hypothetical maximum PPD that is 500k higher than my lower 24h threshold. For 1080 Tis this means that, in the best case, optimising these cards so they run at their full capacity equals one new folding card for every 3 cards optimised. I run between a low of 900k (24h) and 1.2m; in the momentary PPD estimates I have been as high as 1.4m, but I have never received enough of those WUs to get to that level of production.
For every 1,000 1080 Tis with the same PPD characteristics and the same variation in PPD, that could be as much as 300-500m extra PPD per 24h cycle. If 1080s are in the same situation, then the magnitude increases. I'm cherry-picking the numbers, and the 1080 Ti could be the only card in this situation, but there is room for improvement. An allocation mechanism seems to be a possible solution.
I have tried the stats page on the FAH site and struggle with it. The extremeoverclocking.com stats page is good but not perfect. There are many motivations for folding; some even like seeing their stats, or fold because of them. That FAH does not provide good statistical information about its users, for its users, is unfortunate. Further, imagine something similar to Steam's hardware survey but tailored for folding. How often has somebody asked "if I get X CPU/GPU, how good will it be?" How many hardware-related questions could be answered by referring to a stats page? It would be useful for the folders, perhaps not for the team behind FAH.
For instance, if I could look up a table showing that the 1080 has a 24h PPD of, say, 800k +/- 50k, and the 1080 Ti 1m with a low of 900k and a high of 1.2m (with a hypothetical PPD even higher), then my spending decision would be informed. Today there are some doubtful numbers available on the internet rather than anything statistically significant. Would it not be in the interest of the FAH project to have such information readily available to the end users, its donors?
bruce wrote: The variables that you're not taking into account are: (A) available development resources are extremely limited; (B) the FAHClient doesn't gather the data the servers would need; (C) useful contributions by slow GPUs are scientifically valuable too, even if they earn fewer points.
A) I did not go into the issue of resources and their prioritisation in my last post, since it is hard to comment on the resource situation without knowing anything about it, nor about what a desired or proposed feature would cost, etc. This is not to say that features the end user might appreciate should not be raised.
B) That the FAHClient doesn't gather the GPU ID alongside whether it is AMD or Nvidia is not the same as it not being possible? I assume the information is available to the FAHClient, since it is to be found in the FAHClient log, and hence it should be possible to send it to the server side.
C) While it might appear that I am only thinking of my own sick mother (a 1080 Ti), that is because it is the GPU I have. A point is a point, equal in terms of scientific value regardless of whether it comes from a low-end or a high-end piece of hardware; I have no issue with this. What I question is why there is not a focus on delivering the most suitable work unit to the specific class of GPU. The answer I think I have gotten relates to available resources, and that is a fair position.
bruce wrote: In the situation a few days ago, ASSUMING the owners of projects like 9414/9415 had excluded a class of GPUs including your system, and then the servers with larger projects went off-line for a few days to rebuild the RAID, assignments to that class of GPUs would probably have lapsed, leaving a lot of idle resources for several days.
With a layered set of WU distribution rules this could be avoided. Hypothetically, would it not be possible to have a set of rules with tiers that take effect if the conditions surrounding the primary rules are compromised? A hypothetical project X (or group of projects) accepts only 1080 Ti or better cards as long as those collectively provide a minimum of 100m PPD (24h). If that tier (1080 Ti or better) does not deliver enough PPD, then the next tier of cards gains access to the project, and so forth. If project X is down, the available resources are temporarily redirected to a less demanding set of projects.
With such a mechanism, the loss in computing power would be, as far as I can see, limited to the time between the server going down and the client side (or another server) asking whether it is up and ready for work. Then the contingency rules take effect, and so forth. It is obviously not necessarily an easy thing to implement.
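To make the layered idea concrete, here is a toy sketch. The tier names, the aggregate PPD figures and the 100m threshold are all hypothetical, and this is not how the assignment servers actually work; it only shows the fallback logic described above.
Code: Select all

# Hypothetical tier fallback for a demanding "project X"; all numbers are invented.
TIERS = [
    {"name": "tier 1 (1080 Ti or better)", "aggregate_ppd": 80_000_000},
    {"name": "tier 2 (1070/1080)", "aggregate_ppd": 150_000_000},
    {"name": "tier 3 (everything else)", "aggregate_ppd": 400_000_000},
]
MIN_PPD_FOR_PROJECT_X = 100_000_000  # the 100m PPD (24h) threshold from the example above

def eligible_tiers(tiers, minimum_ppd, project_online=True):
    """Open access tier by tier until the enrolled tiers cover the minimum PPD.
    If the project's server is down, nobody is assigned to it."""
    if not project_online:
        return []  # resources fall back to less demanding projects
    opened, total = [], 0
    for tier in tiers:
        opened.append(tier["name"])
        total += tier["aggregate_ppd"]
        if total >= minimum_ppd:
            break
    return opened

print(eligible_tiers(TIERS, MIN_PPD_FOR_PROJECT_X))
# -> ['tier 1 (1080 Ti or better)', 'tier 2 (1070/1080)']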
One last thing: I was looking on the FAH website for white papers (or similar material) that cover the technical aspects of the project and detailed information in general. I could not find any white papers. I found plenty of information on setting up my system and so forth, but little to nothing on the backend stuff.
Re: Project (WU) allocation question
Are these the papers you're looking for?
http://folding.stanford.edu/category/papers/
Some (most?) of the links at the top of the forum pages are out-of-date or simply wrong. Try the links in the following post which may help when looking for data:
viewtopic.php?p=292520#p292520
I used the Home link and clicked on News to find the Papers link.
Sometimes the folding.stanford.edu domain will time out or fail and all you can do is try later.