3.21.157.11 overloaded?

Moderators: Site Moderators, FAHC Science Team

Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 3.21.157.11 overloaded?

Post by Neil-B »

The WU needs to return to the WS that deployed it so the next gen can be created ... for the most part under normal loads this happens seamlessly ... there is an option for the researchers to specify one or more CSs which can temporarily hold the WU until the WS can receive it - but it still has to go back to the WS.

So the process already exists and works if a CS has been set - but actually that just moves the problem, as now the WS is trying to receive both the folders' contributions and work from the CS ... Originally I believe this option was designed for WS failures or service outages - not to balance load.
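As a rough illustration of that return path, here is a minimal Python sketch. It is not the actual FaH server code: the class names, the `accepting` flag, and the `upload_result` and `forward` helpers are all hypothetical, made up only to show that a CS holds a result temporarily and that the result still has to reach the original WS.

```python
from dataclasses import dataclass, field


@dataclass
class WorkServer:
    """Hypothetical stand-in for a WS; `accepting` models whether it can take uploads."""
    name: str
    accepting: bool = True
    received: list = field(default_factory=list)

    def receive(self, wu_id: str) -> bool:
        # A stressed or offline WS refuses the upload.
        if not self.accepting:
            return False
        self.received.append(wu_id)
        return True


@dataclass
class CollectionServer:
    """Hypothetical stand-in for a CS that only holds results temporarily."""
    name: str
    held: list = field(default_factory=list)  # (wu_id, WorkServer) pairs

    def hold(self, wu_id: str, ws: WorkServer) -> None:
        self.held.append((wu_id, ws))

    def forward(self) -> int:
        """Try to pass held results on to their WS; return how many got through."""
        still_held = []
        forwarded = 0
        for wu_id, ws in self.held:
            if ws.receive(wu_id):
                forwarded += 1
            else:
                still_held.append((wu_id, ws))
        self.held = still_held
        return forwarded


def upload_result(wu_id, ws, cs=None):
    """Return where the result ended up: 'ws', 'cs' (held), or 'retry'."""
    if ws.receive(wu_id):
        return "ws"
    if cs is not None:
        cs.hold(wu_id, ws)  # parked, but the WS still has to receive it later
        return "cs"
    return "retry"
```

Note that `forward()` lands on the same `receive()` call as a direct upload, which is why using a CS moves the load rather than removing it.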

There are no doubt ways to re-architect the whole way the FaH infrastructure works ... but an easier solution is to work on balancing out the loads so the servers aren't under stress - and iirc this is a huge server to get balanced correctly.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: 3.21.157.11 overloaded?

Post by PantherX »

Neil-B wrote:...Originally I believe this option was designed for WS failures or service outages - not to balance load...
That's correct. Given that the original design is about 20 years old, there's technical debt and legacy decisions which need to be addressed. Work was in the pipeline but that got sidetracked by the pandemic and it will take time for things to get back on track. In the meantime, we can all fold COVID WUs to help out everyone :)

BTW, a new V8 client was already in development (no ETA) before the pandemic arrived, so there will be an opportunity to "start fresh". Keep in mind that V7 was the first proper client written from the ground up and has aged well (about 10 years), whereas V1 to V6 were all written by researchers. I am not sure if V8 is a fresh rewrite or not, but my guess is that it might be a new code base. Time will tell what happens.

bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 3.21.157.11 overloaded?

Post by bruce »

Some of your assumptions are weak.

For downloading, the client reports your hardware description to the Assignment Server (AS). The AS looks at the list of Work Servers (WS) that have WUs that can be assigned to your hardware and chooses one. The process then continues between that WS and your client. If there happens to be only one WS with compatible WUs, then that's the server that will be assigned. If there are several, other factors are considered. If there are zero, then you'll get an error saying (in essence) that there are no WUs that can be assigned to your client.

If some WS are off-line, so be it. If one has a temporary shortage, that might change as soon as somebody uploads a completed WU and the WS can generate Gen (N+1).
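
The assignment logic described above can be sketched roughly as follows. This is only an illustration, not the real AS implementation: the `WorkServer` fields, the plain-string hardware classes, and the `assign` function are hypothetical, and since the "other factors" are not spelled out here, the tie-break (prefer the WS with the most ready WUs) is just a placeholder assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkServer:
    """Hypothetical view of a WS as the AS might see it."""
    name: str
    online: bool
    wus_by_hardware: dict  # hardware class -> number of WUs ready to assign

    def available_for(self, hardware: str) -> int:
        # An offline WS contributes nothing, whatever it has queued.
        return self.wus_by_hardware.get(hardware, 0) if self.online else 0


def assign(hardware: str, servers: list) -> Optional[WorkServer]:
    """Pick a WS for the reported hardware, or None if nothing is assignable."""
    candidates = [ws for ws in servers if ws.available_for(hardware) > 0]
    if not candidates:
        return None  # the client sees "no WUs can be assigned"
    if len(candidates) == 1:
        return candidates[0]
    # Placeholder for the unspecified "other factors": most ready WUs wins.
    return max(candidates, key=lambda ws: ws.available_for(hardware))
```

In this sketch a temporary shortage ends as soon as a WS's count for that hardware class rises above zero, which is what happens when a completed WU is uploaded and Gen (N+1) is generated.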