Moderators: Site Moderators, FAHC Science Team
gw666
Posts: 14 Joined: Thu Apr 09, 2020 8:53 am
by gw666 » Thu May 21, 2020 1:25 pm
I'm backfilling on a batch system that limits each job's RAM usage. Unfortunately, many folding jobs are terminated because they use too much RAM. Does the --memory option do any good here?
Code:
*********************** Log Started 2020-05-21T13:02:06Z ***********************
13:02:06:****************************** FAHClient ******************************
13:02:06: Version: 7.6.9
13:02:06: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:02:06: Copyright: 2020 foldingathome.org
13:02:06: Homepage: https://foldingathome.org/
13:02:06: Date: Apr 17 2020
13:02:06: Time: 18:11:26
13:02:06: Revision: 398c2b17fa535e0cc6c9d10856b2154c32771646
13:02:06: Branch: master
13:02:06: Compiler: GNU 8.3.0
13:02:06: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
13:02:06: -funroll-loops -fno-pie
13:02:06: Platform: linux2 4.19.0-5-amd64
13:02:06: Bits: 64
13:02:06: Mode: Release
13:02:06: Args: --user=DESY-ZN_GPU --team=38188
13:02:06: --passkey=******************************** --gpu=true
13:02:06: --smp=false --max-units=1 --exit-when-done=true
13:02:06: --web-enable=false --command-enable=false --memory=5368709120
13:02:06: Config: /batch/60848470.17.gpu.q/config.xml
13:02:06:******************************** CBang ********************************
13:02:06: Date: Apr 17 2020
13:02:06: Time: 18:10:13
13:02:06: Revision: 2fb0be7809c5e45287a122ca5fbc15b5ae859a3b
13:02:06: Branch: master
13:02:06: Compiler: GNU 8.3.0
13:02:06: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
13:02:06: -funroll-loops -fno-pie -fPIC
13:02:06: Platform: linux2 4.19.0-5-amd64
13:02:06: Bits: 64
13:02:06: Mode: Release
13:02:06:******************************* System ********************************
13:02:06: CPU: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
13:02:06: CPU ID: GenuineIntel Family 6 Model 85 Stepping 4
13:02:06: CPUs: 64
13:02:06: Memory: 376.38GiB
13:02:06: Free Memory: 180.65GiB
13:02:06: Threads: POSIX_THREADS
13:02:06: OS Version: 3.10
13:02:06: Has Battery: false
13:02:06: On Battery: false
13:02:06: UTC Offset: 2
13:02:06: PID: 98732
It did detect the amount of system memory, but how do I tell it to use only 5 GiB of it?
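As a fallback I'm considering enforcing the cap from outside the client with a `ulimit` wrapper in the batch script. This is only a sketch I haven't tested with FAHClient, and note that `ulimit -v` caps virtual address space, which is not the same accounting as the batch system's RSS limit:

```shell
# Convert a GiB value to the KiB units that `ulimit -v` expects.
gib_to_ulimit_kib() {
  echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical wrapper for the batch script: cap the subshell (and the
# client it execs) at 5 GiB of virtual memory. The FAHClient invocation
# here stands in for my usual arguments:
#   ( ulimit -v "$(gib_to_ulimit_kib 5)"; exec FAHClient --gpu=true --smp=false )
gib_to_ulimit_kib 5   # prints 5242880
```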
Foliant
Posts: 104 Joined: Wed May 13, 2020 4:39 pm
Location: Bavaria
by Foliant » Thu May 21, 2020 2:42 pm
Hello,
the FAHClient help output lists:
Code:
memory <string>
Override memory, in bytes, reported to Folding@home.
As there are also arguments left over from older versions, I don't know whether this one still works, but you might give it a try.
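If the command-line flag is ignored, the same option should also be settable in config.xml. This fragment is equally untested on my side, and the byte value (5 GiB) is only an example:

```xml
<config>
  <!-- Override the memory, in bytes, reported to Folding@home.
       5368709120 bytes = 5 GiB. -->
  <memory v='5368709120'/>
</config>
```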
Regards
Patrick
24/7
1x i5 3470 @2Cores
1x GTX750 (GM107)
2x GTX750Ti (GM107)
gw666
Posts: 14 Joined: Thu Apr 09, 2020 8:53 am
by gw666 » Thu May 21, 2020 7:00 pm
Foliant wrote: Hello,
the FAHClient help output lists:
Code:
memory <string>
Override memory, in bytes, reported to Folding@home.
As there are also arguments left over from older versions, I don't know whether this one still works, but you might give it a try.
Regards
Patrick
I had already used that option; see my original message. It didn't produce any visible change in the log file.
bruce
Posts: 20824 Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
by bruce » Thu May 21, 2020 7:23 pm
I would not expect any visible change in the log file.
There is a min-memory setting on the server which limits assignments of a project to machines that don't have enough resources. The only visible change would be that you'd get the dreaded "Exception: Could not get an assignment" message, or maybe "Failed to get assignment from 'ip.ip,ip.ip': No WUs available for this configuration".
Either way, you can't tell WHY you didn't get an assignment.
I've never used that setting, but I think I'll experiment. Maybe it can be used inside or outside of a <slot>...</slot> block. If it works inside, I could limit one GPU to small-memory assignments but not the other one; that might keep my slower GPU from getting really big projects.
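If slot-level use turns out to be allowed, I'd expect the config to look roughly like this. This is entirely hypothetical until I've tried it, including whether <memory> is accepted inside a <slot> element at all; the byte values are just examples:

```xml
<config>
  <!-- Machine-wide: what the client reports by default -->
  <memory v='11811160064'/>  <!-- 11 GiB -->

  <slot id='0' type='GPU'/>
  <slot id='1' type='GPU'>
    <!-- Hypothetical per-slot override for the slower GPU -->
    <memory v='4294967296'/>  <!-- 4 GiB -->
  </slot>
</config>
```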
gw666
Posts: 14 Joined: Thu Apr 09, 2020 8:53 am
by gw666 » Sun May 24, 2020 1:33 pm
It looks like the --memory option doesn't work for me: I set it to 4468709120, but the batch system still killed my job with this error:
Code:
Reason: job 60953324.264 exceeds job master hard limit "h_rss" (4791971840.00000 > limit:4509715660.80000) - initiate terminate method;
So 4509715660 bytes is the limit I set for the batch job, which is even higher than the --memory value, yet the job was still killed.
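Converting those byte counts to GiB (1 GiB = 1073741824 bytes) makes the relationship easier to see:

```shell
# Peak RSS reported by the batch system, the h_rss limit, and my --memory value
for bytes in 4791971840 4509715660 4468709120; do
  awk -v b="$bytes" 'BEGIN { printf "%d bytes = %.2f GiB\n", b, b / 1073741824 }'
done
```

So the job peaked at about 4.46 GiB, above the 4.20 GiB batch limit, even though --memory was set to roughly 4.16 GiB. That fits bruce's explanation that --memory only changes what is reported to the servers, not how much the client actually uses.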