Do you need help?

Moderators: Site Moderators, FAHC Science Team

uyaem
Posts: 219
Joined: Sat Mar 21, 2020 7:35 pm
Location: Esslingen, Germany

Re: Do you need help?

Post by uyaem »

Connor_Horak wrote:It seems some of my projects are not sending and I am not receiving new projects. I have started a new team (243339) and I have processed twice as many Work Units but have half the points of the other team member. I have a significant leg up on him in terms of both CPU and GPU processing power. Is everything just overloaded because of COVID-19 or is there something I can do to increase my machine's productivity?
Unless you see something other than "Failed to get assignment from 'xxx': No WUs available for this configuration" in your logs, I'd say that's it.
Also note that the stats servers are quite behind on the data that they show.
CPU: Ryzen 9 3900X (1x21 CPUs) ~ GPU: nVidia GeForce GTX 1660 Super (Asus)
Connor_Horak
Posts: 3
Joined: Sun Mar 22, 2020 8:43 pm

Re: Do you need help?

Post by Connor_Horak »

Joe_H wrote:Welcome back.

For the log you post, we do not need the entire log. If you post the beginning section, about 200 lines, which shows the software, hardware, and folding setup info, along with a section that shows the problem, that is enough.

There have been some problems with the stats keeping up; a big kick was given to the server yesterday evening to get some things going again. There may still be some delay in posting. The other question is whether you have a passkey entered into the client, since that does affect the points.

Beyond that, yes, everything is just overloaded. The request rate for WUs was less than 10,000 an hour two weeks ago; at times now the rate of sending out work can get to 100,000 an hour, and there are more requests that can't be filled. They are working on making more servers available, but that takes a bit of time to set up.

And yes, you can use the same user ID and passkey on multiple machines.



I am not currently using a passkey.

Code: Select all

*********************** Log Started 2020-03-22T20:58:02Z ***********************
20:58:02:************************* Folding@home Client *************************
20:58:02:        Website: https://foldingathome.org/
20:58:02:      Copyright: (c) 2009-2018 foldingathome.org
20:58:02:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:58:02:           Args: --open-web-control
20:58:02:         Config: C:\Users\conno_9d9pq71\AppData\Roaming\FAHClient\config.xml
20:58:02:******************************** Build ********************************
20:58:02:        Version: 7.5.1
20:58:02:           Date: May 11 2018
20:58:02:           Time: 13:06:32
20:58:02:     Repository: Git
20:58:02:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
20:58:02:         Branch: master
20:58:02:       Compiler: Visual C++ 2008
20:58:02:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
20:58:02:       Platform: win32 10
20:58:02:           Bits: 32
20:58:02:           Mode: Release
20:58:02:******************************* System ********************************
20:58:02:            CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
20:58:02:         CPU ID: GenuineIntel Family 6 Model 158 Stepping 12
20:58:02:           CPUs: 8
20:58:02:         Memory: 31.82GiB
20:58:02:    Free Memory: 17.36GiB
20:58:02:        Threads: WINDOWS_THREADS
20:58:02:     OS Version: 6.2
20:58:02:    Has Battery: false
20:58:02:     On Battery: false
20:58:02:     UTC Offset: -5
20:58:02:            PID: 20096
20:58:02:            CWD: C:\Users\conno_9d9pq71\AppData\Roaming\FAHClient
20:58:02:             OS: Windows 10 Home
20:58:02:        OS Arch: AMD64
20:58:02:           GPUs: 1
20:58:02:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:8 TU104 [GeForce RTX 2080 Rev. A]
20:58:02:                 10068
20:58:02:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
20:58:02:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:441.12
20:58:02:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:2.1 Driver:26.20
20:58:02:  Win32 Service: false
20:58:02:***********************************************************************
20:58:02:<config>
20:58:02:  <!-- Folding Core -->
20:58:02:  <core-priority v='low'/>
20:58:02:
20:58:02:  <!-- Network -->
20:58:02:  <proxy v=':8080'/>
20:58:02:
20:58:02:  <!-- Slot Control -->
20:58:02:  <pause-on-battery v='false'/>
20:58:02:  <power v='full'/>
20:58:02:
20:58:02:  <!-- User Information -->
20:58:02:  <team v='243339'/>
20:58:02:  <user v='Connor_Horak'/>
20:58:02:
20:58:02:  <!-- Folding Slots -->
20:58:02:  <slot id='0' type='CPU'/>
20:58:02:  <slot id='1' type='GPU'/>
20:58:02:</config>
20:58:02:Trying to access database...
Last edited by Joe_H on Sun Mar 22, 2020 11:14 pm, edited 1 time in total.
Reason: added Code tags to log
Connor_Horak
Posts: 3
Joined: Sun Mar 22, 2020 8:43 pm

Re: Do you need help?

Post by Connor_Horak »

uyaem wrote:
Connor_Horak wrote:It seems some of my projects are not sending and I am not receiving new projects. I have started a new team (243339) and I have processed twice as many Work Units but have half the points of the other team member. I have a significant leg up on him in terms of both CPU and GPU processing power. Is everything just overloaded because of COVID-19 or is there something I can do to increase my machine's productivity?
Unless you see something other than "Failed to get assignment from 'xxx': No WUs available for this configuration" in your logs, I'd say that's it.
Also note that the stats servers are quite behind on the data that they show.

I kind of figured

22:15:23:WARNING:WU03:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
22:15:23:ERROR:WU03:FS00:Exception: Could not get an assignment

is more than half of my log right now.
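A quick way to check how much of a log these messages take up is to count them. The following is only a sketch: it assumes the line format shown in the two log lines above, and the helper name `no_wu_fraction` is made up for illustration, not part of the FAH client.

```python
import re

# Matches warning lines of the form quoted above, e.g.
# 22:15:23:WARNING:WU03:FS00:Failed to get assignment from '<ip:port>': No WUs available ...
NO_WU = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WARNING:WU\d+:FS\d+:"
    r"Failed to get assignment from '[^']+': No WUs available"
)


def no_wu_fraction(lines):
    """Return (matching lines, non-empty lines, fraction) for an iterable of log lines."""
    lines = [ln for ln in lines if ln.strip()]
    hits = sum(1 for ln in lines if NO_WU.match(ln))
    total = len(lines)
    return hits, total, (hits / total if total else 0.0)


# Small in-memory sample built from the lines quoted in this thread.
sample = [
    "22:15:23:WARNING:WU03:FS00:Failed to get assignment from "
    "'18.218.241.186:80': No WUs available for this configuration",
    "22:15:23:ERROR:WU03:FS00:Exception: Could not get an assignment",
    "20:58:02:Trying to access database...",
    "",
]
hits, total, fraction = no_wu_fraction(sample)
print(f"{hits} of {total} lines are 'No WUs available' warnings ({fraction:.0%})")
```

To run it against a real log, pass an open file object instead of `sample`; each warning is normally followed by an ERROR line, so the pair together accounts for roughly twice the counted fraction.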
fangfufu
Posts: 93
Joined: Thu Jan 01, 2009 3:26 am
Hardware configuration: 4 cores on a Intel(R) Core(TM) i7-8700K
Location: Cambridge, United Kingdom

Re: Do you need help?

Post by fangfufu »

fangfufu wrote:Have you guys thought about contacting Linus Tech Tips? They are literally holding a competition for running Folding@Home. Linus himself has a few servers with 100TB of storage space.

https://linustechtips.com/main/topic/11 ... -covid-19/
Turns out LMG is hosting a FAH WU server. lol.
https://youtu.be/KU4qOebhkfs?t=544
Folding with 4 cores on Intel(R) Core(TM) i7-8700K

I first started folding back in the Google Compute days!
mikeestacio
Posts: 22
Joined: Tue Mar 17, 2020 3:34 pm

Re: Do you need help?

Post by mikeestacio »

fangfufu wrote:
fangfufu wrote:Have you guys thought about contacting Linus Tech Tips? They are literally holding a competition for running Folding@Home. Linus himself has a few servers with 100TB of storage space.

https://linustechtips.com/main/topic/11 ... -covid-19/
Turns out LMG is hosting a FAH WU server. lol.
https://youtu.be/KU4qOebhkfs?t=544
Linus and Jay aren't maintaining a 6ft distance :shock:
jklos
Posts: 1
Joined: Mon Mar 23, 2020 4:16 pm

Re: Do you need help?

Post by jklos »

I'm happy to donate server resources. I can host a work unit server, for instance, on a machine with 40 TB of space and a gigabit of bandwidth. I have quite a few servers which can handle large amounts of email. I understand that both the foldingforum org email and the passkey email have had issues. I'd be happy to host and/or help with email servers.

I don't see any discussions about offering these kinds of services, though... Is this an appropriate place? Is another place more appropriate?
lazyacevw
Posts: 35
Joined: Tue Mar 17, 2020 8:12 pm

Re: Do you need help?

Post by lazyacevw »

EternalPainSkylar wrote:
sukritsingh wrote:Hi all! If you are interested in contributing hardware or server side power to our effort, please reach out to me at sukrit.singh@wustl.edu.
Generally the minimum requirements we have for setting up a work server are:
Debian Linux machines with 8GiB RAM, 100TiB SSD storage, and 1GiB/s network.

If you think you could help out please send me an email! Thanks so much for your support and help!
Oof. I only have 1Gb/s and can't afford that much SSD storage. A gibibyte is about 8.59 gigabits, right? Maybe I should sell my car and ask my ISP for 8 additional connections :lol:
I'm not sure which parts of sukritsingh's post were typos, but my first thought went to the SSD space. For consumer SSDs it's around $100/TB, so 100TB of SSDs would run around $10,000. Plus, you would need servers that support the roughly 100 SATA connections (say, 4 SANs?). With Azure you would need 4 of their E80 disks (32 TB Standard SSDs) at a cost of $9830 a month.

Other than that, I have a few dozen TB of space and a 1Gbps connection.

I'm sure sukritsingh is overworked right now! If the reqs are Debian Linux machine with 8GiB RAM, 100GB SSD storage, and 1Gb/s network, I could see quite a few more donated WS.
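The unit conversion and cost figures in this exchange can be checked with a few lines of arithmetic. This is only a worked sketch: the prices are the rough figures quoted in the posts above, and the 32 TB per-disk size is taken from the post as stated.

```python
import math

# 1 GiB/s expressed in gigabits per second (1 GiB = 2**30 bytes, 1 Gb = 1e9 bits).
GIB_BYTES = 2**30
gib_per_s_as_gbit = GIB_BYTES * 8 / 1e9

# Number of 1 Gb/s links needed to reach 1 GiB/s, and how many of those are extra.
links_needed = math.ceil(gib_per_s_as_gbit / 1.0)
additional_links = links_needed - 1

# Consumer SSD estimate: about $100 per TB for 100 TB.
ssd_total_usd = 100 * 100

# Number of 32 TB disks needed to cover 100 TB.
disks_needed = math.ceil(100 / 32)

print(f"1 GiB/s = {gib_per_s_as_gbit:.2f} Gb/s, {additional_links} additional 1 Gb/s links")
print(f"100 TB of consumer SSD: about ${ssd_total_usd:,}; 32 TB disks needed: {disks_needed}")
```

The numbers agree with the posts: about 8.59 Gb/s, 8 additional connections, roughly $10,000 of SSDs, and 4 disks.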
nic
Posts: 2
Joined: Wed Mar 18, 2020 11:04 pm
Hardware configuration: CPUs:
Xeon E5-2680 v2 (cpu:9)
Ryzen 7 2700X (cpu:9)
i7-8700T (cpu:9)
i5-4500 (cpu:3)

GPUs:
RX 5700 XT
GTX 1070
GTX 1050
Location: Virginia, USA

Re: Do you need help?

Post by nic »

lazyacevw wrote: I'm not sure which parts of sukritsingh's post were typos, but my first thought went to the SSD space. For consumer SSDs it's around $100/TB, so 100TB of SSDs would run around $10,000. Plus, you would need servers that support the roughly 100 SATA connections (say, 4 SANs?). With Azure you would need 4 of their E80 disks (32 TB Standard SSDs) at a cost of $9830 a month.

Other than that, I have a few dozen TB of space and a 1Gbps connection.

I'm sure sukritsingh is overworked right now! If the reqs are Debian Linux machine with 8GiB RAM, 100GB SSD storage, and 1Gb/s network, I could see quite a few more donated WS.
WU response could likely be better handled by Azure's Blob storage APIs. That permits significantly cheaper $/GB and allows you to offload the IO performance scaling onto Azure.

Plugging into their pricing calculator: 100TB with 86,400,000 (120,000 WU assignments/hr * 24 * 30) write operations and twice as many read operations comes out to $3017/mo with $2087 of that being the raw cost of storage itself.

I'm not sure how long WUs are kept around or how frequently they are pulled, but if we can move the storage tier down to Cold or Archival, that price can go down a lot.
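For anyone who wants to redo the estimate with different assumptions, the operation counts can be reproduced as below. This is just the arithmetic from the post; the dollar amounts are the calculator outputs quoted above, not recomputed from any price list.

```python
# Inputs taken from the post above.
assignments_per_hour = 120_000
hours_per_month = 24 * 30

# One write per assignment, and twice as many reads as writes.
writes_per_month = assignments_per_hour * hours_per_month
reads_per_month = 2 * writes_per_month

# Quoted calculator results in USD per month.
total_monthly = 3017
storage_monthly = 2087
non_storage_monthly = total_monthly - storage_monthly
storage_share = storage_monthly / total_monthly

print(f"writes/month: {writes_per_month:,}")
print(f"reads/month:  {reads_per_month:,}")
print(f"non-storage cost: ${non_storage_monthly}/mo, storage share: {storage_share:.0%}")
```

Since raw storage is about two thirds of the quoted total, a cheaper storage tier is where most of the savings would come from.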
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Do you need help?

Post by Joe_H »

The 100 TB in his post was not a typo, just a simplification of a requirement list I have also seen. That stated 50-100 TB.

Multiple active projects can go through a lot of storage and quickly. As needed they do transfer off the data to other storage. Some of that movement is to other machines for analysis of the data. They do keep the actual WU data around for a long time, part of that is so that other researchers can look at it if wanted.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
walli
Posts: 1
Joined: Wed Mar 25, 2020 3:33 am

Re: Do you need help?

Post by walli »

It's not about the storage capacity, it's about the (entire) storage having to be SSD/flash...?
lazyacevw
Posts: 35
Joined: Tue Mar 17, 2020 8:12 pm

Re: Do you need help?

Post by lazyacevw »

That was my point exactly. I could put up 100TB but it would be on 7200 RPM drives. I only have about 10TB of SSD based storage. I'm not a millionaire!
robputt
Posts: 1
Joined: Tue Mar 31, 2020 7:41 pm

Re: Do you need help?

Post by robputt »

Good Morning,

I would like to throw in my 2 cents here, I know my post may not hold much weight as it's my first ever post. However, I've never needed to post before, Folding@Home has always just worked for me. I have skimmed a majority of this thread although I will admit I have skipped a few posts, so apologies if this has already been covered before...

I personally think that it's time for a wider re-architecture of Folding@Home's backend to be more "cloudy". Simply building assignment and work servers in various places is fine, although the requirements for these are quite heavy and expensive. Consuming other cloud services may be more effective and allow Folding@Home to scale horizontally in a better fashion.

My current understanding:
- The client makes a request to the assignment server which provides a work server in a round-robin style fashion.
- The client requests work from the work server provided by the assignment server
- The client does the work and sends the results back to the work server
- Work servers need huge disks to store the work unit and the results
- Work servers at the moment are highly strained due to the number of new donors and this causes work units to stop being handed out and clients sit doing nothing
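The flow described in the list above can be sketched as a toy model. Everything here is an assumption for illustration: the class names, the in-memory lists, and the server names `ws-a` and `ws-b` are made up, and this is not how the real assignment or work servers are implemented.

```python
from itertools import cycle


class AssignmentServer:
    """Hands out work-server names in round-robin order."""

    def __init__(self, work_servers):
        self._ring = cycle(work_servers)

    def assign(self):
        return next(self._ring)


class WorkServer:
    """Holds pending work units and the results returned for them."""

    def __init__(self, name, work_units):
        self.name = name
        self.pending = list(work_units)
        self.results = {}

    def get_work(self):
        # None models the "No WUs available" case.
        return self.pending.pop(0) if self.pending else None

    def return_result(self, wu, result):
        self.results[wu] = result


def client_cycle(assignment_server, servers_by_name):
    """One client round: get assigned, fetch work, 'compute', send the result back."""
    ws = servers_by_name[assignment_server.assign()]
    wu = ws.get_work()
    if wu is None:
        return ws.name, None
    ws.return_result(wu, f"result-of-{wu}")
    return ws.name, wu


servers = {"ws-a": WorkServer("ws-a", ["wu1"]), "ws-b": WorkServer("ws-b", [])}
assigner = AssignmentServer(list(servers))
outcomes = [client_cycle(assigner, servers) for _ in range(4)]
print(outcomes)
```

Round-robin keeps sending clients to `ws-b` even though it has no work, which is the idle-client symptom described in the last bullet.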

My observations:
- Although Folding@Home is an amazing distributed system from the client/compute perspective the backend is largely old school in architecture
- Although you can scale work servers they have heavy requirements making them unattractive

My thoughts:
- Work units and results need to be stored on some form of distributed, HA object store such as S3, Swift, Ceph
- Pending work units should be queued using a proper queue mechanism such as RabbitMQ cluster
- Instead of having many work servers each with distinct jobs and storage, these should be replaced by a stateless work microservice which interfaces with said queue and object stores in a load-balanced pool
- When the load gets heavy work API nodes can be scaled horizontally and quickly based on demand
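A minimal sketch of the idea in the list above, using in-memory stand-ins: `queue.Queue` plays the role of a message queue such as RabbitMQ and a plain `dict` plays the role of an object store such as S3, Swift, or Ceph. The `WorkService` class and the key layout (`wu/...`, `result/...`) are hypothetical, chosen only to show that the service nodes themselves hold no state.

```python
import queue


class WorkService:
    """Stateless front end: all state lives in the shared queue and object store."""

    def __init__(self, pending, store):
        self.pending = pending  # shared queue of work-unit ids
        self.store = store      # shared object store: key -> bytes

    def get_work(self):
        """Hand out the next pending work unit, or None if the queue is empty."""
        try:
            wu_id = self.pending.get_nowait()
        except queue.Empty:
            return None
        return wu_id, self.store[f"wu/{wu_id}"]

    def put_result(self, wu_id, data):
        """Store a returned result next to its work unit."""
        self.store[f"result/{wu_id}"] = data


pending = queue.Queue()  # stand-in for a real message queue
store = {}               # stand-in for a real object store
for wu_id in ("wu1", "wu2"):
    store[f"wu/{wu_id}"] = b"input for " + wu_id.encode()
    pending.put(wu_id)

# Two interchangeable nodes behind a (not modelled) load balancer.
node_a = WorkService(pending, store)
node_b = WorkService(pending, store)

first = node_a.get_work()
second = node_b.get_work()
node_b.put_result(first[0], b"done")  # any node can accept any result
third = node_a.get_work()
print(first, second, third)
```

Because neither node keeps anything locally, adding or removing nodes only changes how many requests can be served at once, which is the horizontal scaling the last bullet refers to.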

In terms of the cost, of course, it comes with a cost. I'd hope that the likes of AWS / Azure / GCP would donate the cloud resources to make this happen, but if this isn't possible these large providers often aren't the most cost-effective and a small OpenStack cloud in house would allow such delivery if enough hardware can be purchased to build it out.

I know the donor spike is likely to tail off after COVID-19 has been resolved, however having a cloud-based work unit management that scales properly and in a cloudy fashion would allow Folding@Home to retract to its previous size (automatically if desired) and then scale back up again in case (when) we need it again in the future.

I have very limited time available at the moment, however, I would be more than happy to provide architectural/cloud advice as required, although may not be able to do much hands-on work.

Best Regards,

Rob
alxbelu
Posts: 105
Joined: Sat Mar 14, 2020 6:28 pm

Re: Do you need help?

Post by alxbelu »

robputt wrote: I personally think that it's time for a wider re-architecture of Folding@Home's backend to be more "cloudy". Simply building assignment and work servers in various places is fine, although the requirements for these are quite heavy and expensive. Consuming other cloud services may be more effective and allow Folding@Home to scale horizontally in a better fashion.
While I definitely agree, and at the same time understand that the team is still putting out fires, I feel that with limited effort it should be possible to add some simple load-balancing logic (to replace the round-robin strategy) to the assignment servers? For example, on the AS, restricting the number of WS X designations to Y clients over period Z. The limits could differ for each WS depending on its capabilities, so that for example older/more restricted servers would only get say 500 requests per 2 minutes, where higher performance ones might be able to handle 1000 requests per 2 minutes. (Obviously I don't know actual reasonable numbers here, but the team should have a fair grasp on the avg. time needed to serve each client and thus be able to calculate the optimal utilization)

Guess I'm not sure the clients could currently handle a potential case of a "no available WS"-reply from an AS; but a hacky way around it could simply be to assign the client to 127.0.0.1 and let it time-out/reject itself.

(Then there should really be an upper limit to the client retry timer, but recognizing that would probably require a new client, that sounds like more effort with limited impact in the current situation)
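The per-server limit idea above (at most Y assignments to WS X per period Z) could look roughly like this. It is a sketch under stated assumptions: the class name `LimitedAssigner`, the fixed-window counting, and the server names are all made up for illustration, and the limits must be positive numbers.

```python
import time


class LimitedAssigner:
    """Assigns clients to work servers without exceeding each server's per-window limit."""

    def __init__(self, limits, window_seconds, clock=time.monotonic):
        self.limits = dict(limits)  # work server -> max assignments per window (must be > 0)
        self.window = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.counts = {ws: 0 for ws in self.limits}

    def assign(self):
        """Return a work server name, or None when every server is at its limit."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # Start a new window and reset all counters.
            self.window_start = now
            self.counts = {ws: 0 for ws in self.limits}
        # Choose the server that has used the smallest share of its limit.
        best = min(self.limits, key=lambda ws: self.counts[ws] / self.limits[ws])
        if self.counts[best] >= self.limits[best]:
            return None  # the "no available WS" reply discussed above
        self.counts[best] += 1
        return best


# Tiny demo with a fake clock; real limits might be e.g. 500 and 1000 per 120 seconds.
now = [0.0]
assigner = LimitedAssigner({"old": 1, "fast": 2}, window_seconds=120, clock=lambda: now[0])
results = [assigner.assign() for _ in range(4)]
now[0] = 120.0
after_reset = assigner.assign()
print(results, after_reset)
```

Picking the least-utilised server rather than the next one in a ring means the faster server receives proportionally more clients, and the `None` result is the case the client would need to handle (or that the 127.0.0.1 workaround would paper over).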
Official F@H Twitter (frequently updated): https://twitter.com/foldingathome
Official F@H Facebook: https://www.facebook.com/Foldinghome-136059519794607/

(I'm not affiliated with the F@H Team, just promoting these channels for official updates)
lazyacevw
Posts: 35
Joined: Tue Mar 17, 2020 8:12 pm

Re: Do you need help?

Post by lazyacevw »

robputt wrote:Good Morning,

I would like to throw in my 2 cents here, I know my post may not hold much weight as it's my first ever post. However, I've never needed to post before, Folding@Home has always just worked for me. I have skimmed a majority of this thread although I will admit I have skipped a few posts, so apologies if this has already been covered before... Rob
Excellent analysis. Unfortunately, without people forming teams and working over time to implement it, this probably won't be implemented anytime soon. As alxbelu mentioned, the current team only has enough in-house resources to put out fires. There is some hope that partnerships might be formed with Azure, AWS, etc. where they could donate a few short-term teams to help implement such an architecture, but that would be like winning the lottery. The only surefire way forward is to donate resources the best you can. I am also slammed at work, so I am not in a position to volunteer right now. In the meantime, all I can do is spin up 64-core servers to replace individual clients in an effort to reduce the number of client requests, and try to help out in the forums.
Tohya
Posts: 48
Joined: Thu Feb 07, 2008 12:41 am

Re: Do you need help?

Post by Tohya »

https://twitter.com/foldingathome/statu ... 3412453378
Interested in helping us develop and enhance the F@h architecture? Come join the development team at our F@h Fireside Chat on Thursday, April 9th, 4-5pm EDT! To receive a Discord invite link, fill out this form: https://tinyurl.com/firesidedev