Running FAHClient on cloud resources on temporary VMs
Hello!
I work in an organization that has a private cloud for its business needs.
The cloud is never 100% busy; there are always some resources available: 40-100 CPU cores.
We have an idea to use these "free" resources for Folding@home.
The operating system on the VMs is CentOS 7.
Unfortunately, we cannot let VMs run continuously for more than a week.
What we can do is create a new VM, complete exactly one work unit, and delete the VM. If there are still free resources available, repeat.
What I need right now is a command like:
FAHClient --amount-of-workunits=1 --user=username --team=12345 --passkey=***** --gpu=false --cpu-usage=100
This command should request a work unit and, once that unit is done, exit with code 0.
I did not find anything like that in the FAHClient help. I tried the cycles option, but it does something different.
Basically, there are two questions:
1. Is that use case with a cloud useful for the Folding@home project? (VMs created in the cloud and removed after one work unit is done)
2. If the first answer is yes, how can we restrict the number of work units done by FAHClient during one run?
Re: Running FAHClient on cloud resources on temporary VMs
Welcome to the F@H Forum Gavelock,
This is the option that would work for you:
Code: Select all
max-units <integer=0>
Process at most this number of units, then pause.
You can experiment with the value, say 3 WUs, which could potentially be finished within 1 week, assuming that the VM runs 24/7 and has multiple CPUs to fold with.
Since you're using company-owned hardware, please ensure that you have permission (usually written) from the person authorized to make such decisions (Internal IT, CTO, etc.). Folding on CPUs is valuable and important scientific work, so whatever your business can contribute would be appreciated.
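As a rough sketch of how this option might be passed on the command line, assuming the `--name=value` form that the command in the question already uses: the helper below only assembles the arguments. `fah_bounded_args` is a made-up name for illustration, and the GPU and CPU flags are copied from the question rather than recommended values.

```shell
#!/bin/bash
# Hypothetical helper: print FAHClient arguments for a run limited to N work units.
# Assumption: options are given as --name=value, as in the question above.
fah_bounded_args() {
  local max_units="${1:-1}"   # default: a single work unit
  # Reject anything that is not a positive integer (0 is the "no limit" default).
  case "$max_units" in
    ''|*[!0-9]*|0)
      echo "max-units must be a positive integer" >&2
      return 1
      ;;
  esac
  echo "--max-units=${max_units} --gpu=false --cpu-usage=100"
}

# Example (not run here):
#   FAHClient $(fah_bounded_args 3) --user=username --team=12345
```

Keeping the limit in one place makes it easy to try 1, 2 or 3 units per VM and compare how long each VM lives.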
Re: Running FAHClient on cloud resources on temporary VMs
Hi PantherX,
Thank you for your answer!
I have started a test run with the max-units parameter.
The company is interested in helping COVID-19 research projects. Right now this is just a request to study whether we can participate in F@H. Once that is done, I hope we will run real campaigns.
Re: Running FAHClient on cloud resources on temporary VMs
I think it would be better to run a script on your servers that pauses the VMs once your free resources drop below a certain number of threads.
Run one major VM running FAH on multiple cores, plus a few smaller ones (say 4 to 8 cores each) that you can easily pause.
That way you don't have to set up and reload each VM.
As long as the (average) WU is able to continue within ~8-14 hours (on average hardware: a ~3 GHz quad core or better), it should make the deadline.
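The pause/resume idea could be sketched roughly like this. It is only the decision logic: the thresholds, the function name `decide_action` and the VM name are hypothetical, and the `virsh` calls in the comment assume a libvirt/KVM host, which this thread does not confirm.

```shell
#!/bin/bash
# Hypothetical policy: decide whether the small folding VMs should run,
# based on how many CPU threads are currently idle on the host.
# Usage: decide_action <total_threads> <busy_threads> <min_free_threads>
decide_action() {
  local total="$1" busy="$2" min_free="$3"
  local free=$(( total - busy ))
  if [ "$free" -lt "$min_free" ]; then
    echo "pause"    # business load needs the cores: pause the folding VMs
  else
    echo "resume"   # enough idle threads: let the folding VMs continue
  fi
}

# Example wiring (assumptions: libvirt/KVM host, VM named fah-small-1):
#   case "$(decide_action "$(nproc)" 30 8)" in
#     pause)  virsh suspend fah-small-1 ;;
#     resume) virsh resume  fah-small-1 ;;
#   esac
```

A paused VM still has to resume early enough for its WU to meet the deadline, so a real script would probably also track how long each VM has been paused.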
Re: Running FAHClient on cloud resources on temporary VMs
Instead of a full VM just for FAH, you could also run FAH inside a Docker container that runs inside any other VM.
It is probably somewhat less efficient, but much less work to set up every time.
There are several Dockerfiles on Docker Hub to use as inspiration.
Re: Running FAHClient on cloud resources on temporary VMs
If you're planning on using Docker, have a look here (https://github.com/FoldingAtHome/containers). If you're planning to use VMware, then have a look here (https://flings.vmware.com/vmware-applia ... lding-home). Please note that the VMware appliance isn't officially supported.
Re: Running FAHClient on cloud resources on temporary VMs
See also enhancement suggestion: https://github.com/FoldingAtHome/fah-issues/issues/1474
Re: Running FAHClient on cloud resources on temporary VMs
See also the issue about managing preemptible VMs: https://github.com/FoldingAtHome/fah-issues/issues/1458
Re: Running FAHClient on cloud resources on temporary VMs
Hi Gavelock,
I have a similar situation, and I would like to ask which configuration you are using for your VMs (vCPU, RAM, disk).
Like you, we are testing a private cloud deployment (KVM clusters), and our next step is to find the best VM configuration for maximum performance.
Thanks in advance
Re: Running FAHClient on cloud resources on temporary VMs
The most stable CPU counts are 2, 4, 8, 12 and 16. RAM should be what the OS needs plus a bit more, as F@H isn't RAM-intensive when folding on CPU only. For storage, a faster disk means less time writing checkpoints, but F@H isn't disk-heavy: it only touches the disk when reading/writing checkpoints and when packing/unpacking WUs to be sent/received.
Re: Running FAHClient on cloud resources on temporary VMs
24 and 32 are also pretty rock solid, so if your VMs can scale to that, they will complete WUs much faster: depending on the underlying hardware and the specific project, probably in the 45-minute to 4-hour window.
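To illustrate how the thread counts mentioned in this post and the previous one might be applied when sizing a VM, here is a minimal sketch. The function name `pick_threads` and the fallback to a single thread are assumptions made for the example, not something FAHClient provides.

```shell
#!/bin/bash
# Thread counts reported as stable in this thread (2-16, plus 24 and 32).
STABLE_COUNTS="2 4 8 12 16 24 32"

# Print the largest stable count that fits into the given number of free cores.
# Falls back to 1 when fewer than 2 cores are free.
pick_threads() {
  local free="$1" best=1 n
  for n in $STABLE_COUNTS; do
    if [ "$n" -le "$free" ]; then
      best="$n"
    fi
  done
  echo "$best"
}

# Example: with 40 idle cores, one VM would get 32 of them.
#   pick_threads 40
```

Calling it again on the remaining cores (8 in this example) gives the size of the next VM, so a pool of 40 idle cores would be split into a 32-thread and an 8-thread VM.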
Re: Running FAHClient on cloud resources on temporary VMs
Hello PeterGarlic,
We are using 1 CPU core, 4 GB RAM and a 15 GB disk with CentOS 7 for that task. We chose this so that even the smallest pieces of free CPU resources can be filled.
Kind regards,
Re: Running FAHClient on cloud resources on temporary VMs
Hello PantherX,
Thanks again for your help with FAHClient. I would like to share some information about our solution.
In our institute (Joint Institute for Nuclear Research) we have a cloud. Some other members of our institute also have clouds. These clouds are partially used to run batch jobs on either dedicated or free resources. A batch job here is a shell script to be executed. All the clouds are joined together with DIRAC Interware, an open-source platform used in science to organize distributed heterogeneous systems and run High Throughput Computing workloads on them. When jobs for cloud resources appear, DIRAC spawns VMs on the available clouds. After contextualization, each VM asks the central DIRAC service for one job, and DIRAC sends it one job from the queue. When the job is done, the VM asks DIRAC to delete it. If there are still jobs in the queue, DIRAC tries to spawn new VMs on the freed resources.
So the task was to create a shell script that runs FAHClient as a job and finishes after the FAH work unit is completed. The shell script for the job is super simple:
Code: Select all
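#!/bin/bash
# Arguments: $1 = F@H user name, $2 = passkey
# Trace every command; note that this also writes the passkey to the job output.
set -x
echo "$1"
echo "$2"
# Fold exactly one work unit on the CPU, then exit so that the VM can be deleted.
FAHClient --cause=covid-19 --user="$1" --team=265602 --passkey="$2" --gpu=false --cycles=-1 --cpu-usage=100 --exit-when-done --max-units=1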
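One possible hardening of a job script like this, offered only as a sketch: check the two arguments before starting the client, and log the passkey length instead of the passkey itself. The function name `check_job_args`, the script name `job.sh` and the exit code are illustrative choices, not part of the setup described here.

```shell
#!/bin/bash
# Sketch of an argument check for a job script like the one above.
# Assumption: the job receives exactly two arguments, user name and passkey.
check_job_args() {
  if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: job.sh <user> <passkey>" >&2
    return 64   # EX_USAGE from sysexits.h
  fi
  # Log the user, but only the length of the passkey, to keep it out of job logs.
  echo "user=$1 passkey_length=${#2}"
}

# Example (not run here):
#   check_job_args "$@" || exit $?
#   FAHClient --user="$1" --passkey="$2" --max-units=1 --exit-when-done
```

Failing early means the VM is released right away instead of starting a client without a user name or passkey.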
Our team is Joint Institute for Nuclear Research, ID: 265602. It has been 3 months since the start of this activity. The team is ranked around 7000 and has received 23M credits (https://stats.foldingathome.org/team/265602). We are happy that idle resources are now used for a good cause.
Thank you PantherX for your help!
Thanks, everybody, for your reactions to this thread! It was a surprise for me when I came here today. I found very interesting suggestions and ideas.
Kind regards,
Igor Pelevanyuk
Re: Running FAHClient on cloud resources on temporary VMs
Thank you for helping out with the research.
The advantage of running one WU at a time like you do, compared to running it as an interruptible instance, is that you are more likely to complete the work within the timeout. So that is preferable if the alternative is to have the instance paused for days, which might cause the effort to be wasted.
On a very heterogeneous platform, doing it like you have, with just one CPU thread per instance, may indeed be the way to go. But also note that CPU folding takes good advantage of multi-threading, so for most VM hosts it might be better to run just one or a few instances with, say, 8 or 16 threads at low priority to use idle resources.
Re: Running FAHClient on cloud resources on temporary VMs
... so, something to consider: whilst a multi-core VM will tie up more cores, it will do so for a shorter time. Even just 2 or 4 cores will significantly assist the science by returning WUs quicker than 2x or 4x single-core VMs.
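A back-of-the-envelope illustration of that point, under two loudly stated assumptions: one WU takes 480 minutes on a single core (an invented figure), and run time shrinks linearly with the number of cores (real scaling is usually somewhat worse). Total throughput is the same in both setups; what changes is how soon each result comes back.

```shell
#!/bin/bash
# Assumptions (illustrative only): one WU takes 480 minutes on a single core,
# and run time shrinks linearly with the number of cores.
WU_MINUTES_ONE_CORE=480

# N single-core VMs working on N WUs in parallel: every WU returns at the end.
mean_return_single_core() {
  echo "$WU_MINUTES_ONE_CORE"
}

# One N-core VM working through N WUs one after another.
mean_return_multi_core() {
  local cores="$1"
  local per_wu=$(( WU_MINUTES_ONE_CORE / cores ))
  local total=0 i=1
  while [ "$i" -le "$cores" ]; do
    total=$(( total + i * per_wu ))   # the i-th WU is returned after i * per_wu
    i=$(( i + 1 ))
  done
  echo $(( total / cores ))           # mean return time in minutes
}
```

With these numbers, `mean_return_multi_core 4` gives 300 minutes against 480 for four single-core VMs, and the first result is back after 120 minutes instead of 480.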