Search found 14 matches
- Sun May 24, 2020 1:33 pm
- Forum: V7.6.x Public Release Windows/Linux/MacOS X
- Topic: Linux: limit RAM usage
- Replies: 4
- Views: 1043
Re: Linux: limit RAM usage
It looks like the --memory option doesn't work for me. I set it to 4468709120, but the batch system killed my job with this error: Reason: job 60953324.264 exceeds job master hard limit "h_rss" (4791971840.00000 > limit:4509715660.80000) - initiate terminate method; So 4509715660 is t...
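The three byte counts in this post can be compared directly. A minimal sketch in Python, using only the values quoted above; the variable and function names are illustrative and not part of FAHClient or the batch system:

```python
# Values quoted in the post above, all in bytes.
memory_option = 4468709120     # value passed to --memory
h_rss_limit = 4509715660.8     # batch system hard limit "h_rss"
observed_rss = 4791971840.0    # usage reported when the job was killed

MIB = 1024 ** 2


def overshoot_mib(used, limit):
    """Return how far `used` exceeds `limit`, in MiB (negative if under)."""
    return (used - limit) / MIB


print(f"h_rss limit minus --memory: {overshoot_mib(h_rss_limit, memory_option):.1f} MiB")
print(f"usage above h_rss limit:    {overshoot_mib(observed_rss, h_rss_limit):.1f} MiB")
print(f"usage above --memory:       {overshoot_mib(observed_rss, memory_option):.1f} MiB")
```

The job ended up well above both the --memory value and the h_rss limit, which fits the poster's observation that the option did not cap actual usage.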
- Thu May 21, 2020 7:00 pm
- Forum: V7.6.x Public Release Windows/Linux/MacOS X
- Topic: Linux: limit RAM usage
- Replies: 4
- Views: 1043
Re: Linux: limit RAM usage
Hello, from the Help of FAHClient you can get: memory <string> Override memory, in bytes, reported to Folding@home. As there are also arguments left from older versions I don't know if this would work but you might give it a try. Regards Patrick
I had used that option, see my original message. It d...
- Thu May 21, 2020 1:25 pm
- Forum: V7.6.x Public Release Windows/Linux/MacOS X
- Topic: Linux: limit RAM usage
- Replies: 4
- Views: 1043
Linux: limit RAM usage
I'm backfilling on a batch system that limits the job's RAM usage. Unfortunately, many folding jobs are terminated because they use too much RAM. Does the --memory option do any good? *********************** Log Started 2020-05-21T13:02:06Z *********************** 13:02:06:**************************...
- Mon Apr 20, 2020 2:13 pm
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
bruce wrote: Why are you setting --cuda_index=0? The current folding slots don't really care what cuda-index is used because cuda, itself, is never used.
Thanks for pointing it out, I've removed the option.
- Mon Apr 20, 2020 7:24 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
I'm not yet happy, there have been several cases where the program was idle for 4 hours without getting a WU. I would've preferred the program to exit in that case.
- Fri Apr 17, 2020 2:04 pm
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
I had another look at the options and think that you have to use this one in addition to exit-when-done set to true: max-units <integer=0> Process at most this number of units, then pause. This might be how you run it: /usr/bin/FAHClient --user=ANALY_MANC_GPU --team=38188 --gpu=true --cuda-ind...
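A rough sketch of how the suggested combination could be assembled from a wrapper script. It assumes that exit-when-done and max-units can be passed in the same --name=value form as the options visible in the quoted command; the max_units value of 1 is a hypothetical placeholder, and the part of the original command that is cut off is deliberately left out:

```python
import shlex


def build_fah_command(user, team, max_units, binary="/usr/bin/FAHClient"):
    """Build an argument list combining exit-when-done with max-units.

    Assumption: both options accept the --name=value form used by the
    other options in the quoted command line.
    """
    return [
        binary,
        f"--user={user}",
        f"--team={team}",
        "--gpu=true",
        "--exit-when-done=true",
        f"--max-units={max_units}",  # hypothetical value chosen by the caller
    ]


cmd = build_fah_command("ANALY_MANC_GPU", 38188, max_units=1)
print(shlex.join(cmd))  # shell-quoted form, e.g. for a batch job script
```

Keeping the arguments as a list means the same value can be handed to subprocess.run without going through a shell.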
- Thu Apr 16, 2020 9:18 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
I had another look at the options and think that you have to use this one in addition to exit-when-done set to true: max-units <integer=0> Process at most this number of units, then pause. This might be how you run it: /usr/bin/FAHClient --user=ANALY_MANC_GPU --team=38188 --gpu=true --cuda-ind...
- Wed Apr 15, 2020 10:19 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
It should have exited if the correct command was given. It seems that you might be using a wrong command: exit-when-done <boolean=false> Exit when all slots are paused. Try this instead: --finish Finish all current work units, send the results, then exit. I think --finish is for exiting an already r...
- Wed Apr 15, 2020 8:57 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
Re: problem with exit-when-done
I had the same command on a different node. It has finished a few WUs:
21:35:20:WU01:FS00:Upload 70.73%
21:35:26:WU01:FS00:Upload 81.37%
21:35:30:WU00:FS00:Connecting to 65.254.110.245:8080
21:35:30:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this config...
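For a backfill wrapper that wants to react to these messages, the log lines can be split into fields. This is only a sketch: the field layout is inferred from the few lines quoted in these posts (time, optional WARNING, optional WUnn and FSnn, then the message), so other line shapes may exist that it does not handle, and the function names are made up for illustration:

```python
import re

# Layout inferred from the quoted log lines only; an assumption, not a spec.
LOG_LINE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d):"
    r"(?:(?P<level>WARNING):)?"
    r"(?:(?P<wu>WU\d+):)?"
    r"(?:(?P<slot>FS\d+):)?"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def is_failed_assignment(line):
    """True if the line reports that no work unit could be assigned."""
    fields = parse_log_line(line)
    return bool(fields) and fields["message"].startswith("Failed to get assignment")


sample = [
    "21:35:26:WU01:FS00:Upload 81.37%",
    "21:35:30:WU00:FS00:Connecting to 65.254.110.245:8080",
    "21:35:30:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available",
]
for line in sample:
    print(is_failed_assignment(line), parse_log_line(line))
```

A wrapper could count consecutive failed assignments and stop the client after a threshold of its own choosing.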
- Wed Apr 15, 2020 7:35 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: problem with exit-when-done
- Replies: 12
- Views: 3114
problem with exit-when-done
Hi everyone, I'm trying to do some backfilling on a GPU farm, e.g. starting some GPU load if available and exiting if no work units are available. I am using FAHClient on Ubuntu 18.04. config.xml looks like this:
<config>
  <!-- Folding Slots -->
  <slot id='0' type='GPU'/>
</config>
The full command li...
- Wed Apr 15, 2020 6:55 am
- Forum: V7.5.1 Public Release Windows/Linux/MacOS X [deprecated]
- Topic: FAHControl won't install on CentOS 8
- Replies: 8
- Views: 2638
Re: FAHControl won't install on CentOS 8
There is no longer a /usr/bin/python on EL8, only /usr/bin/python2 and /usr/bin/python3, so the package deps cannot be resolved. The package must be adapted for EL8 or installed with --nodeps.
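A quick way to see which interpreter names a given host actually provides is sketched below. Note that shutil.which searches the current PATH rather than /usr/bin specifically, and the helper name is illustrative:

```python
import shutil


def find_python_interpreters(names=("python", "python2", "python3")):
    """Map each interpreter name to its path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_python_interpreters().items():
        print(f"{name}: {path or 'not found'}")
```

On a host like the one described, the unversioned python entry would be expected to come back as not found while the versioned names resolve.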
- Wed Apr 15, 2020 6:51 am
- Forum: New Donors start here
- Topic: High Throughput Resources
- Replies: 10
- Views: 1793
Re: High Throughput Resources
Thank you again for the notes. Bit of progress: jobs are running and completing ok but uploads fail with:
01:39:40:WU00:FS00:Connecting to 155.247.166.219:8080
01:39:40:All slots are done, exiting
01:39:40:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
01:39:41:...
- Thu Apr 09, 2020 1:20 pm
- Forum: Problems with NVidia drivers
- Topic: Please use only one of the GPUs
- Replies: 3
- Views: 1822
Re: Please use only one of the GPUs
This is a bug in FAHClient, it sees 6 GPUs in HW but only one OpenCL device 0. So it should only create one GPU slot. As a workaround you need to edit the config.xml file manually and delete the other GPU slots.
How do these slots correspond? In this case, the automatically generated config.xml lo...
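The manual edit suggested in the quote could also be scripted. The sketch below is an assumption-laden illustration: the sample config is hypothetical and simply extends the single-slot form shown in the exit-when-done topic, and keeping the first GPU slot is an arbitrary choice, since which slot maps to the working device is exactly the open question in this post:

```python
import xml.etree.ElementTree as ET


def keep_first_gpu_slot(config_xml: str) -> str:
    """Return the config XML with every GPU slot except the first removed."""
    root = ET.fromstring(config_xml)
    gpu_slots = [s for s in root.findall("slot") if s.get("type") == "GPU"]
    for extra in gpu_slots[1:]:
        root.remove(extra)
    return ET.tostring(root, encoding="unicode")


# Hypothetical input, not the poster's actual generated file.
example = """<config>
  <slot id='0' type='GPU'/>
  <slot id='1' type='GPU'/>
  <slot id='2' type='GPU'/>
</config>"""

print(keep_first_gpu_slot(example))
```

Any non-slot settings in the file are left untouched, but it is still worth stopping the client and keeping a backup of config.xml before rewriting it.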
- Thu Apr 09, 2020 12:03 pm
- Forum: Problems with NVidia drivers
- Topic: Please use only one of the GPUs
- Replies: 3
- Views: 1822
Please use only one of the GPUs
Hi everyone, I'm trying to do some backfilling on a farm machine, just like the friends at CERN are doing. My setup is Scientific Linux 7.7 on x86_64, the machines all have two Xeon CPUs and 6 or 8 Nvidia GPUs of several generations, in this example 6 NVidia Tesla P4. I'm using the latest CUDA 10.2....