Search found 6 matches

by beuk
Sun Apr 19, 2020 1:24 pm
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

Re: [Solved] Gromacs Error

Dear Neil-B,

Thank you for this feedback.

I will add more 32-core nodes this week, so I will let them run without slots (so the default, 1x32 cores) and share the error with you when it happens, in case it helps with debugging :)
by beuk
Fri Apr 17, 2020 8:22 am
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

Re: [Solved] Gromacs Error

Dear _r2w_ben, No, we removed the GPUs from our SMP node to use them somewhere else (and also because Nvidia dropped Linux driver support for our GPUs. CUDA drivers for them can only be found on Windows 10 today, so we now also have Windows 10 compute nodes alongside our CentOS ones :roll: ). It seems FAH alw...
by beuk
Wed Apr 15, 2020 7:57 am
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

Re: [Solved] Gromacs Error

Dear Portella,

Many thanks for this! I am learning Singularity, so this will be a good topic to start with :-)

With my best regards

Beuk
by beuk
Sun Apr 12, 2020 2:05 pm
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

Re: Gromacs Error

Dear Portella, Many thanks for this feedback. So I will concentrate on slim nodes, then fat nodes, and consider nodes without interconnect first. I will update the topic subject to solved. :) By the way, in case it helps someone, we are using a homemade open-source cluster stack, based on Ansible,...
by beuk
Sun Apr 12, 2020 12:49 pm
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

Re: Gromacs Error

Dear All, Many thanks for these answers. I will reduce to 16-core slots, since even our 32-core nodes face this issue. It seems that FAHClient retries the same WU indefinitely when it fails, leaving random nodes stuck in a loop. I will create a cron job to detect that and restart FAHClient + erase work d...
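
For anyone wanting to try the same cron idea, here is a minimal sketch of the detection step only, in Python. It is not the script used on this cluster. The failure pattern, the function name `stuck_work_units` and the threshold are all assumptions: the real log wording is not shown in this post, so adapt the pattern to what your client log actually prints.

```python
import re
from collections import Counter

# Hypothetical failure marker. The real client log wording is not shown in
# the post, so replace this pattern with whatever your log actually prints.
FAILURE_PATTERN = re.compile(r"WU(\d+):.*FAILED", re.IGNORECASE)


def stuck_work_units(log_lines, threshold=5):
    """Return the WU ids whose failure marker appears at least `threshold` times."""
    counts = Counter()
    for line in log_lines:
        match = FAILURE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(wu for wu, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Made-up sample lines, only to show the call; they are not real log output.
    sample = ["00:00:01:WU00:FS00:FAILED"] * 5 + ["00:00:02:WU01:FS00:running"]
    print(stuck_work_units(sample))
```

A cron job could run this over the tail of the log and only trigger a restart when the returned list is not empty; the restart and cleanup themselves are left out, since they depend on how the client is installed.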
by beuk
Fri Apr 10, 2020 2:36 pm
Forum: Issues with a specific WU
Topic: [Solved] Gromacs Error
Replies: 13
Views: 2271

[Solved] Gromacs Error

Hi, We are running F@H on our cluster at the Fabrique du Loch - FabLab, Auray, France, and we have an issue with one of our nodes. While slim nodes (16 cores/32 GB RAM, I think Charmm is running on these) run OK, our SMP node (64/32 cores/1 TB RAM) faces issues with Gromacs parameters: 14:28:54:WU00:F...