Not the second unit download CPU
Posted: Mon Jun 13, 2016 8:47 am
by Zarck
I have the following problem: once my first CPU unit has finished calculating, it is sent. The display then shows that the next unit is being downloaded, but nothing happens and the unit never loads. Is there a solution?
https://www.dropbox.com/s/t3hbr2s3f9wm8 ... 2.png?dl=0
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 9:52 am
by mmonnin
Change your CPU slot from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 9:54 am
by Zarck
mmonnin wrote: Change your CPU slot from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
I do not understand the answer ...
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 10:53 am
by artoar_11
mmonnin wrote: Change your CPU slot from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
Change your CPU cores from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
Configure -> Slots -> cpu -> CPUs (from 11 to 10 cores)
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 11:52 am
by Zarck
Thank you for the answer.
It's good now.
It's weird: I did not have to make this change before!
@+
*_*
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 7:29 pm
by bruce
The problem is not new, but it's not well-known, and the steps taken by the Stanford servers to hide the problem from you are gradually changing.
Some numbers of CPUs work well, some don't work at all, and some are simply unreliable. Those numbers which are unreliable are gradually being excluded.
Calculate the factors of your CPU count (e.g., 10 = 5 * 2 * 1). If any of the factors is greater than 6, you can expect problems. When you added a GPU, FAH reserved 1 CPU to support it, leaving 11 (a prime number > 6). It should have reserved 2, leaving 10 for your CPU slot.
FAH plans to provide an automatic solution "soon" so that you won't have to make similar manual changes.
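The factor check described above can be sketched in a few lines of Python. This is only an illustration, not anything FAH ships: it assumes "factors" means prime factors, and the names `prime_factors` and `looks_safe` are made up for this example.

```python
def prime_factors(n):
    """Return the prime factors of n (with repetition), smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors


def looks_safe(threads, limit=6):
    """True if no prime factor of `threads` exceeds `limit` (assumed rule)."""
    return all(p <= limit for p in prime_factors(threads))


for threads in (10, 11, 12, 14):
    print(threads, prime_factors(threads), looks_safe(threads))
```

With this reading, 10 and 12 pass while 11 and 14 fail, which matches the advice to move from 11 to 10. Passing this check is no guarantee that the servers will actually assign work for that count.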
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 8:25 pm
by ChristianVirtual
I always wanted to know what the meaningful values for the CPU slot setting are, so I made a little Python script:
Code:
    # Collect every product of three factors, each between 1 and 6.
    values = []
    for x in range(1, 7):
        for y in range(1, 7):
            for z in range(1, 7):
                c = x * y * z
                if c not in values:
                    values.append(c)
    values.sort()
    print(values)
Result:
Code:
[1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80, 90, 96, 100, 108, 120, 125, 144, 150, 180, 216]
Rule of thumb: pick the highest number from the list which is <= (number of CPU threads - number of GPU slots).
For example, 6 cores / 12 HT - 1 GPU = 11; the highest number in the list that does not exceed 11 is 10, so that's the suggested CPU: setting ...
I wish I could test CPU:216 setup here at my home
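The rule of thumb above could be turned into a small helper. This is a rough sketch only: `candidate_counts` and `suggest_cpu_threads` are hypothetical names, and the list is just the one generated by the script above, not an official FAH list.

```python
def candidate_counts(max_factor=6):
    """All products of three factors, each between 1 and max_factor."""
    values = set()
    for x in range(1, max_factor + 1):
        for y in range(1, max_factor + 1):
            for z in range(1, max_factor + 1):
                values.add(x * y * z)
    return sorted(values)


def suggest_cpu_threads(total_threads, gpu_slots):
    """Highest listed value <= total_threads - gpu_slots, or None if none fits."""
    available = total_threads - gpu_slots
    usable = [c for c in candidate_counts() if c <= available]
    return max(usable) if usable else None


# 6 cores / 12 HT with 1 GPU slot, as in the example above
print(suggest_cpu_threads(12, 1))  # 10
```

The helper subtracts one thread per GPU slot, as in the rule of thumb, and then falls back to the nearest listed value below that.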
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 8:31 pm
by Nathan_P
ChristianVirtual wrote:
I wish I could test CPU:216 setup here at my home
That would be 9 slots at 24 threads each, about 1.8m PPD if you can stand the heat and power draw - oh to run an old 6903 or 8102 on such a beast
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 9:42 pm
by Joe_H
One limitation of that list of numbers: many of the odd numbers higher than 20 are also excluded by the servers. In theory they might be usable, but the decision to exclude them was made at some point in the past because of the limited opportunity to test runs at those settings.
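To see which entries that affects, here is a small Python sketch. It is deliberately conservative: it assumes every odd value above 20 is excluded, whereas the post only says "many" are. The function names are made up for illustration.

```python
def candidate_counts(max_factor=6):
    """All products of three factors, each between 1 and max_factor."""
    values = set()
    for x in range(1, max_factor + 1):
        for y in range(1, max_factor + 1):
            for z in range(1, max_factor + 1):
                values.add(x * y * z)
    return sorted(values)


def drop_large_odd(counts, threshold=20):
    """Keep even values, and odd values only up to `threshold` (assumed cutoff)."""
    return [c for c in counts if c % 2 == 0 or c <= threshold]


full = candidate_counts()
kept = drop_large_odd(full)
print("dropped:", sorted(set(full) - set(kept)))
print("kept:", kept)
```

Under that assumption only 25, 27, 45, 75 and 125 fall off the list, so the practical impact is small for most machines.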
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 10:33 pm
by 7im
Joe_H wrote: One limitation of that list of numbers: many of the odd numbers higher than 20 are also excluded by the servers. In theory they might be usable, but the decision to exclude them was made at some point in the past because of the limited opportunity to test runs at those settings.
Well, someone needs to get their crap together and figure out what GROMACS officially supports, what it doesn't, and clean up the servers and clients with the new "acceptable settings." Because guess what, we have donors with CPU counts into the 100s, and if you want to flatly turn away that kind of power because PG doesn't have their ducks lined up, that's a sad day for the project.
Re: Not the second unit download CPU
Posted: Mon Jun 13, 2016 10:43 pm
by mmonnin
artoar_11 wrote: mmonnin wrote: Change your CPU slot from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
Change your CPU cores from 11 to 10. 11 is a prime and those WUs are not being sent by some servers any more.
Configure -> Slots -> cpu -> CPUs (from 11 to 10 cores)
Threads.
Re: Not the second unit download CPU
Posted: Tue Jun 14, 2016 2:17 am
by bruce
7im wrote: Well, someone needs to get their crap together and figure out what GROMACS officially supports, what it doesn't, and clean up the servers and clients with the new "acceptable settings." Because guess what, we have donors with CPU counts into the 100s, and if you want to flatly turn away that kind of power because PG doesn't have their ducks lined up, that's a sad day for the project.
I agree (somewhat). I have been unable to find an unofficial statement from GROMACS -- probably because nobody has tested all of the combinations with all of the possible proteins. My statement limiting the values to those on Mr. Virtual's list is my best guess, not a proven fact. If anybody tries a number that's NOT on that list and it produces reliable results over a number of different proteins, be sure to let me know.
Actually, for a long time, 7 was included in neither the "good" list nor the "bad" list, and a number of different proteins did work -- but many did not. Projects could be manually excluded if they demonstrated a non-zero failure rate.
That just shows that it's really a challenge to PROVE what's on the good/bad lists.
Another way of looking at the issue: for any hardware that might be available to FAH which allows the use of all of its CPU threads (i.e., no threads dedicated to GPUs or other things), what CPU slot configurations might be available?
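One rough way to explore that question in Python is shown below. It rests on two assumptions that are not established anywhere in this thread: that all slots get the same thread count, and that the only usable per-slot counts are those on the generated list above. `equal_slot_splits` is a made-up name.

```python
def candidate_counts(max_factor=6):
    """All products of three factors, each between 1 and max_factor."""
    values = set()
    for x in range(1, max_factor + 1):
        for y in range(1, max_factor + 1):
            for z in range(1, max_factor + 1):
                values.add(x * y * z)
    return sorted(values)


def equal_slot_splits(total_threads):
    """Return (slot_count, threads_per_slot) pairs that use every thread."""
    return [
        (total_threads // size, size)
        for size in candidate_counts()
        if size <= total_threads and total_threads % size == 0
    ]


for slots, size in equal_slot_splits(216):
    print(slots, "slot(s) of", size, "threads")
```

For 216 threads this includes the 9 slots of 24 threads mentioned earlier in the thread. For a count that is not on the list, such as 28, it only finds splits into slots of 1, 2 or 4 threads, which suggests unequal slots would be worth considering too.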