Dear Neil-B,
Thank you for this feedback.
I will add more 32-core nodes this week and let them run without slots (so the default 1x32 cores). I will share the error with you when it happens, in case it helps with debugging.
Search found 6 matches
- Sun Apr 19, 2020 1:24 pm
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
- Fri Apr 17, 2020 8:22 am
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
Re: [Solved] Gromacs Error
Dear _r2w_ben, No, we removed the GPUs from our SMP node to use them somewhere else (and also because Nvidia dropped Linux driver support for our GPUs; CUDA drivers can only be found for Windows 10 today, so we now also have Windows 10 compute nodes alongside our CentOS ones :roll: ). It seems FAH alw...
- Wed Apr 15, 2020 7:57 am
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
Re: [Solved] Gromacs Error
Dear Portella,
Many thanks for this! I am learning Singularity, so this will be a good topic to start with.
With my best regards
Beuk
- Sun Apr 12, 2020 2:05 pm
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
Re: Gromacs Error
Dear Portella, Many thanks for this feedback. I will concentrate on slim nodes, then fat nodes, and consider nodes without interconnect first. I will update the topic subject to solved. :) By the way, in case it helps someone, we are using a home-made open-source cluster stack, based on Ansible,...
- Sun Apr 12, 2020 12:49 pm
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
Re: Gromacs Error
Dear All, Many thanks for these answers. I will reduce to 16-core slots, since even our 32-core nodes face this issue. It seems that FAHClient retries the same WU indefinitely when it fails, leaving random nodes stuck in a loop. I will create a cron job to detect that and restart FAHClient + erase work d...
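A minimal sketch of the detection step such a cron job could run, in case it helps someone. It assumes only the `HH:MM:SS:WUnn:` line prefix visible in the log excerpt in this topic. The function name `find_stuck_work_units`, the `failure_marker` string, and the sample lines are hypothetical placeholders, not real client output. Restarting the client and erasing the work directory are left to the cron job itself.

```python
import re
from collections import Counter

# Matches the "HH:MM:SS:WUnn:" prefix seen in the log excerpt in this topic.
WU_PREFIX = re.compile(r"^\d{2}:\d{2}:\d{2}:(WU\d+):")


def find_stuck_work_units(log_lines, failure_marker, threshold=3):
    """Return sorted WU ids that logged failure_marker at least `threshold` times.

    failure_marker is a caller-supplied substring (an assumption here):
    use whatever text your own log prints when a WU fails.
    """
    failures = Counter()
    for line in log_lines:
        match = WU_PREFIX.match(line)
        if match and failure_marker in line:
            failures[match.group(1)] += 1
    return sorted(wu for wu, count in failures.items() if count >= threshold)


# Made-up sample lines for illustration only (not real client output).
sample_log = [
    "14:28:54:WU00:EXAMPLE_FAILURE",
    "14:29:02:WU00:EXAMPLE_FAILURE",
    "14:29:10:WU01:EXAMPLE_FAILURE",
    "14:29:15:WU00:EXAMPLE_FAILURE",
    "14:29:30:WU01:still running",
]

stuck = find_stuck_work_units(sample_log, "EXAMPLE_FAILURE", threshold=3)
print(stuck)
```

Counting failures per WU id rather than per node lets a node with several slots report only the slot that is actually looping. The threshold of 3 is arbitrary and would need tuning against real logs.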
- Fri Apr 10, 2020 2:36 pm
- Forum: Issues with a specific WU
- Topic: [Solved] Gromacs Error
- Replies: 13
- Views: 2271
[Solved] Gromacs Error
Hi, We are running F@H on our cluster at the Fabrique du Loch - FabLab, Auray, France, and we have an issue with one of our nodes. While the slim nodes (16 cores/32 GB RAM; I think Charmm is running on these) run OK, our SMP node (64/32 cores/1 TB RAM) faces issues with Gromacs parameters: 14:28:54:WU00:F...