10000 cores to donate for 3 weeks. Help me set up FAHControl
Posted: Wed Mar 25, 2020 9:33 am
by hopoffbaby
Hi All,
I have some excess HPC resource in the region of 10,000 cores that I wish to donate for the next 3 weeks.
I have created a docker container running fahclient on centos7. I want to be able to monitor my deployment in bulk.
The FAHControl client looks like it lets me add remote servers, but I really don't want to add 1500+ servers by hand. Can this be scripted, or is there a config file I can update somewhere?
Cheers
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Wed Mar 25, 2020 12:17 pm
by Neil-B
A few links to threads that might "help" until some of the core team can get to your question:
viewtopic.php?f=61&t=33018
viewtopic.php?f=61&t=30563
viewtopic.php?f=61&t=32949
The last one has a similar situation to yourself mid thread I believe and may indicate good people to contact?
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Wed Mar 25, 2020 2:32 pm
by scerbera
There is also Rosetta@home on BOINC, which is CPU-only work, and they have plenty at the moment. CPU slots here are idle more often than not; hoping that will soon change.
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Wed Mar 25, 2020 7:37 pm
by treckin
I've been getting steady CPU:18 jobs, currently working on
https://apps.foldingathome.org/project?p=14409
I have a couple of CPU:8 MacBook Pros set up as well and they seem to idle mostly, so it could be that the 4.4GHz 20T setup is getting love from the assignment servers based on estimated points?
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Wed Mar 25, 2020 7:41 pm
by Jesse_V
treckin wrote: I've been getting steady CPU:18 jobs, currently working on
https://apps.foldingathome.org/project?p=14409
I have a couple of CPU:8 MacBook Pros set up as well and they seem to idle mostly, so it could be that the 4.4GHz 20T setup is getting love from the assignment servers based on estimated points?
Some projects are set to run on CPUs with more than 16 cores and it's also possible that you have less competition for those projects. Could also be just luck, I suppose!
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Thu Mar 26, 2020 12:37 am
by treckin
Jesse_V wrote: treckin wrote: I've been getting steady CPU:18 jobs, currently working on
https://apps.foldingathome.org/project?p=14409
I have a couple of CPU:8 MacBook Pros set up as well and they seem to idle mostly, so it could be that the 4.4GHz 20T setup is getting love from the assignment servers based on estimated points?
Some projects are set to run on CPUs with more than 16 cores and it's also possible that you have less competition for those projects. Could also be just luck, I suppose!
I now have:
cpu:18
cpu:12
cpu:8
All running WUs for 14590:
https://apps.foldingathome.org/project?p=14590
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Thu Mar 26, 2020 12:49 am
by Jesse_V
Excellent, I'm truly glad to see that it's working now!
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Thu Mar 26, 2020 9:19 pm
by hopoffbaby
Hi All,
I now have a good config on the client side. I created this Dockerfile:
Code:
FROM centos:7
# Install the FAHClient RPM, then create an unprivileged user to run it
RUN yum install -y wget && \
    cd /tmp && \
    wget https://download.foldingathome.org/releases/public/release/fahclient/centos-6.7-64bit/v7.5/fahclient-7.5.1-1.x86_64.rpm && \
    yum install -y /tmp/fahclient-7.5.1-1.x86_64.rpm && \
    groupadd -g 9999 appuser && \
    useradd -r -u 9999 -g appuser appuser
WORKDIR /tmp
USER appuser
# Port used by FAHControl for remote connections
EXPOSE 36330
COPY --chown=9999:9999 config.xml /tmp/config.xml
ENTRYPOINT ["/usr/bin/FAHClient"]
This allows me to run in a sandboxed area.
I then use this config file:
Code:
<config>
  <!-- Folding Slots -->
  <slot id='0' type='CPU'>
    <cpus v='2'/>
  </slot>
  <slot id='1' type='CPU'>
    <cpus v='2'/>
  </slot>
  <slot id='2' type='CPU'>
    <cpus v='4'/>
  </slot>
  <fold-anon v='false'/>
  <command-allow v='xxxxxxxxxxx'/>
  <password v='xxxxxxxxxxx'/>
  <user v='xxxxxxxxxx'/>
  <passkey v='xxxxxxxxxxxx'/>
  <team v='xxxxxxxxxxxxxxxx'/>
  <allow v='xxxxxxx'/>
  <client-threads v='8'/>
  <idle-seconds v='0'/>
  <max-packet-size v='big'/>
  <priority v='low'/>
  <max-shutdown-wait v='5'/>
  <next-unit-percentage v='90'/>
  <stall-detection-enabled v='true'/>
</config>
Which all works fine and allows me to connect the FAHControl app to my remote deployments.
The problem is that I have 1560 machines I will run this on, and I would like to add them all to FAHControl. Is there a way I can manipulate the database directly? I see there is a database.py included as part of the RPM, and I think it could be done through that somehow.
I took a look at the links, but from what I can see they are about connecting FAHControl to a FAHClient, which I can already do. What I am interested in is adding remote hosts in bulk.
Any ideas?
Cheers
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Thu Mar 26, 2020 9:27 pm
by _r2w_ben
hopoffbaby wrote:
Code:
<config>
<!-- Folding Slots -->
<slot id='0' type='CPU'>
<cpus v='2'/>
</slot>
<slot id='1' type='CPU'>
<cpus v='2'/>
</slot>
<slot id='2' type='CPU'>
<cpus v='4'/>
</slot>
Any particular reason why you want to run 3 slots? If these are 8 core machines, this will need 3 work units to keep all cores occupied. Given the many reports of insufficient work, this would increase time spent idling waiting for a new work unit.
Fast returns are highly valued and promoted by the Quick Return Bonus. One slot with 8 CPUs would require fewer work units to keep busy and result in far more points.
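As a rough illustration of why fast returns matter: the Quick Return Bonus is usually described as scaling a work unit's base points by sqrt(k * deadline / elapsed), never dropping below the base points. The sketch below assumes that form. The base points, k factor, and deadline are made-up numbers, not values from any real project, and the function name is hypothetical.

```python
import math

def qrb_points(base_points, k, deadline_days, elapsed_days):
    """Estimated credit for one work unit under the assumed bonus formula."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    # The bonus multiplier never reduces credit below the base points.
    return base_points * max(1.0, bonus)

# Hypothetical work unit: 1000 base points, k = 0.75, 4-day deadline.
fast = qrb_points(1000, 0.75, 4.0, 1.0)  # returned after 1 day
slow = qrb_points(1000, 0.75, 4.0, 4.0)  # returned right at the deadline
print(f"fast return: {fast:.0f} points, slow return: {slow:.0f} points")
```

Under this assumption, the same work unit earns noticeably more when it is finished sooner, which is the argument for one 8-CPU slot over several small ones.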
Edit: The code to start saving a new connection in FAHControl begins here. The only API information I could find relates to FAHClient. You're more interested in interacting with FAHControl.
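If the goal is bulk monitoring rather than FAHControl itself, one option is to talk to each FAHClient's command port directly. The following is a minimal sketch, assuming the client accepts plain-text commands on port 36330 (the port exposed in the Dockerfile above) and that `auth`, `queue-info`, and `exit` behave as I understand them; check the command names against the FAHClient documentation before relying on this. The function name `query_client` is hypothetical.

```python
import socket

def query_client(host, password, command="queue-info", port=36330, timeout=10.0):
    """Send one command to a FAHClient command socket and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Assumed protocol: authenticate, run one query, then ask the
        # client to close the connection so we can read until EOF.
        request = f"auth {password}\n{command}\nexit\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # connection closed by the client
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Looping this over a list of 1560 addresses and collecting the replies would give a crude fleet overview. If a client does not close the connection after `exit`, the call ends with a timeout error instead of returning.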
Re: 10000 cores to donate for 3 weeks. Help me set up FAHCon
Posted: Thu Mar 26, 2020 11:07 pm
by Jesse_V
For each machine, add the following two lines to its config.xml:
<allow v='10.0.0.0/8 127.0.0.1'/>
<password v='PASSWORD_GOES_HERE'/>
Of course, replace the subnet and password with the subnet of your LAN and a password of your choice. Leave the 127.0.0.1 entry in there.
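To sanity-check which machines a given allow value would cover, here is a small sketch using Python's ipaddress module. It only mirrors my reading of the setting (a space-separated list of CIDR subnets and single addresses). It is not FAHClient's own matching code, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

def is_allowed(client_ip, allow_value):
    """Return True if client_ip falls inside any entry of the allow list."""
    ip = ipaddress.ip_address(client_ip)
    for entry in allow_value.split():
        # A bare address such as 127.0.0.1 is treated as a /32 network.
        if ip in ipaddress.ip_network(entry, strict=False):
            return True
    return False

allow = "10.0.0.0/8 127.0.0.1"
print(is_allowed("10.4.5.6", allow))     # a host inside the 10.x.x.x LAN
print(is_allowed("192.168.1.5", allow))  # a host outside the allowed subnet
```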
Then on the FAHControl that you want to use to rule them all, you can put in the IP address and password for each one. FAHControl saves its own settings to ~/.FAHClient/FAHControl.db, which is a SQLite3 file, so if you wanted to modify that (on your own) then you could deploy all those remote configurations quickly that way too.
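A minimal sketch of the SQLite route, with loud caveats: the table name `clients` and the columns `name`, `address`, `port`, and `password` are assumptions on my part, so inspect the real schema first (the `show_schema` helper below prints it) and adjust the insert to match. Work on a copy of the file with FAHControl closed. Both function names are hypothetical.

```python
import sqlite3

def show_schema(db_path):
    """Return the CREATE statements stored in the database, for inspection."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT sql FROM sqlite_master WHERE sql IS NOT NULL")
        return [row[0] for row in rows]
    finally:
        conn.close()

def bulk_add_clients(db_path, hosts, password, port=36330, table="clients"):
    """Insert one row per host; return the table's row count afterwards."""
    # Assumed columns: name, address, port, password. The host address is
    # reused as the display name.
    rows = [(host, host, port, password) for host in hosts]
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                f"INSERT OR REPLACE INTO {table} "
                "(name, address, port, password) VALUES (?, ?, ?, ?)",
                rows,
            )
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        return count
    finally:
        conn.close()
```

For example, `bulk_add_clients("FAHControl.db", [f"10.0.0.{i}" for i in range(1, 255)], "PASSWORD_GOES_HERE")` would register 254 hosts in one pass, assuming the schema matches.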