7.6.13 FAHControl connects to the client and gets thrown out
Posted: Wed May 06, 2020 6:57 pm
by JTorset
After upgrading to version 7.6.13, FAHControl connects to the client, gets thrown out immediately, and repeats this endlessly. Telnet shows the same behavior: it connects successfully and the FAH greeting text appears, then the connection is lost.
Strangely enough, the WebControl is working normally and without the 'refreshing no information' bug that previously happened in Opera, Chrome and new Edge. Old Edge did not have this bug.
I tried downgrading to an earlier version, but there too FAHControl gets thrown out of the client after connecting. So I am not sure whether the installer allows installing a version earlier than the one already on the machine, or whether a config file somewhere is left unchanged by the downgrade and needs editing.
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Wed May 06, 2020 7:15 pm
by JimboPalmer
I can't make heads or tails of what you are experiencing.
Can you please post the first 200 lines of the log where it shows your configuration?
viewtopic.php?f=24&t=26036
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Thu May 07, 2020 5:28 pm
by JTorset
Here is the start of the log; see below.
There is no problem folding or connecting the web control to the client. Only FAHControl and telnet fail to connect properly to the client.
*********************** Log Started 2020-05-07T05:52:13Z ***********************
05:52:13:Trying to access database...
05:52:13:Successfully acquired database lock
05:52:13:Read GPUs.txt
05:52:14:Enabled folding slot 01: READY gpu:0:Vega 10 XL/XT [Radeon RX Vega 56/64]
05:52:14:Enabled folding slot 02: READY cpu:31
05:52:14:****************************** FAHClient ******************************
05:52:14: Version: 7.6.13
05:52:14: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:52:14: Copyright: 2020 foldingathome.org
05:52:14: Homepage: https://foldingathome.org/
05:52:14: Date: Apr 27 2020
05:52:14: Time: 21:21:01
05:52:14: Revision: 5a652817f46116b6e135503af97f18e094414e3b
05:52:14: Branch: master
05:52:14: Compiler: Visual C++ 2008
05:52:14: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:14: Platform: win32 10
05:52:14: Bits: 32
05:52:14: Mode: Release
05:52:14: Config: C:\Users\John\AppData\Roaming\FAHClient\config.xml
05:52:14:******************************** CBang ********************************
05:52:14: Date: Apr 24 2020
05:52:14: Time: 17:07:55
05:52:14: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
05:52:14: Branch: master
05:52:14: Compiler: Visual C++ 2008
05:52:14: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:14: Platform: win32 10
05:52:14: Bits: 32
05:52:14: Mode: Release
05:52:14:******************************* System ********************************
05:52:14: CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
05:52:14: CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
05:52:14: CPUs: 32
05:52:14: Memory: 31.87GiB
05:52:14: Free Memory: 25.07GiB
05:52:14: Threads: WINDOWS_THREADS
05:52:14: OS Version: 6.2
05:52:14: Has Battery: false
05:52:14: On Battery: false
05:52:14: UTC Offset: 2
05:52:14: PID: 24964
05:52:14: CWD: C:\Users\John\AppData\Roaming\FAHClient
05:52:14: Win32 Service: false
05:52:14: OS: Windows 10 Enterprise
05:52:14: OS Arch: AMD64
05:52:14: GPUs: 1
05:52:14: GPU 0: Bus:79 Slot:0 Func:0 AMD:5 Vega 10 XL/XT [Radeon RX Vega 56/64]
05:52:14: CUDA: Not detected: Failed to open dynamic library 'nvcuda.dll': The
05:52:14: specified module could not be found.
05:52:14:
05:52:14:OpenCL Device 0: Platform:0 Device:0 Bus:79 Slot:0 Compute:1.2 Driver:3004.8
05:52:14:******************************* libFAH ********************************
05:52:14: Date: Apr 15 2020
05:52:14: Time: 14:53:14
05:52:14: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
05:52:14: Branch: master
05:52:14: Compiler: Visual C++ 2008
05:52:14: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:14: Platform: win32 10
05:52:14: Bits: 32
05:52:14: Mode: Release
05:52:14:***********************************************************************
05:52:14:<config>
05:52:14: <!-- Folding Core -->
05:52:14: <checkpoint v='30'/>
05:52:14: <core-priority v='low'/>
05:52:14:
05:52:14: <!-- Folding Slot Configuration -->
05:52:14: <client-type v='bigadv'/>
05:52:14:
05:52:14: <!-- HTTP Server -->
05:52:14: <allow v='192.168.1.21'/>
05:52:14:
05:52:14: <!-- Network -->
05:52:14: <proxy v=':8080'/>
05:52:14:
05:52:14: <!-- Remote Command Server -->
05:52:14: <command-allow-no-pass v='192.168.1.21'/>
05:52:14:
05:52:14: <!-- Slot Control -->
05:52:14: <pause-on-battery v='false'/>
05:52:14: <power v='full'/>
05:52:14:
05:52:14: <!-- User Information -->
05:52:14: <passkey v='*****'/>
05:52:14: <team v='238979'/>
05:52:14: <user v='JTorset'/>
05:52:14:
05:52:14: <!-- Work Unit Control -->
05:52:14: <next-unit-percentage v='100'/>
05:52:14:
05:52:14: <!-- Folding Slots -->
05:52:14: <slot id='1' type='GPU'/>
05:52:14: <slot id='2' type='CPU'>
05:52:14: <cpus v='31'/>
05:52:14: </slot>
05:52:14:</config>
05:52:14:WU02:FS01:Starting
05:52:14:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\John\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 24964 -checkpoint 30 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
05:52:14:WU02:FS01:Started FahCore on PID 25096
05:52:14:WU02:FS01:Core PID:25124
05:52:14:WU02:FS01:FahCore 0x22 started
05:52:14:WU00:FS02:Starting
05:52:14:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\John\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 706 -lifeline 24964 -checkpoint 30 -np 31
05:52:14:WU00:FS02:Started FahCore on PID 25144
05:52:14:WU00:FS02:Core PID:25168
05:52:14:WU00:FS02:FahCore 0xa7 started
05:52:15:WU02:FS01:0x22:*********************** Log Started 2020-05-07T05:52:14Z ***********************
05:52:15:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
05:52:15:WU02:FS01:0x22: Type: 0x22
05:52:15:WU02:FS01:0x22: Core: Core22
05:52:15:WU02:FS01:0x22: Website: https://foldingathome.org/
05:52:15:WU02:FS01:0x22: Copyright: (c) 2009-2018 foldingathome.org
05:52:15:WU02:FS01:0x22: Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
05:52:15:WU02:FS01:0x22: <rafal.wiewiora@choderalab.org>
05:52:15:WU02:FS01:0x22: Args: -dir 02 -suffix 01 -version 706 -lifeline 25096 -checkpoint 30
05:52:15:WU02:FS01:0x22: -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
05:52:15:WU02:FS01:0x22: Config: <none>
05:52:15:WU02:FS01:0x22:************************************ Build *************************************
05:52:15:WU02:FS01:0x22: Version: 0.0.5
05:52:15:WU02:FS01:0x22: Date: Apr 22 2020
05:52:15:WU02:FS01:0x22: Time: 04:42:59
05:52:15:WU02:FS01:0x22: Repository: Git
05:52:15:WU02:FS01:0x22: Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
05:52:15:WU02:FS01:0x22: Branch: HEAD
05:52:15:WU02:FS01:0x22: Compiler: Visual C++ 2008
05:52:15:WU02:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:15:WU02:FS01:0x22: Platform: win32 10
05:52:15:WU02:FS01:0x22: Bits: 64
05:52:15:WU02:FS01:0x22: Mode: Release
05:52:15:WU02:FS01:0x22:************************************ System ************************************
05:52:15:WU02:FS01:0x22: CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
05:52:15:WU02:FS01:0x22: CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
05:52:15:WU02:FS01:0x22: CPUs: 64
05:52:15:WU02:FS01:0x22: Memory: 31.87GiB
05:52:15:WU02:FS01:0x22:Free Memory: 25.04GiB
05:52:15:WU02:FS01:0x22: Threads: WINDOWS_THREADS
05:52:15:WU02:FS01:0x22: OS Version: 6.2
05:52:15:WU02:FS01:0x22:Has Battery: false
05:52:15:WU02:FS01:0x22: On Battery: false
05:52:15:WU02:FS01:0x22: UTC Offset: 2
05:52:15:WU02:FS01:0x22: PID: 25124
05:52:15:WU02:FS01:0x22: CWD: C:\Users\John\AppData\Roaming\FAHClient\work
05:52:15:WU02:FS01:0x22: OS: Windows 10 Pro for Workstations
05:52:15:WU02:FS01:0x22: OS Arch: AMD64
05:52:15:WU02:FS01:0x22:********************************************************************************
05:52:15:WU02:FS01:0x22:Project: 16443 (Run 0, Clone 248, Gen 18)
05:52:15:WU02:FS01:0x22:Unit: 0x0000001680fccb015eaa001ae6350c68
05:52:15:WU02:FS01:0x22:Digital signatures verified
05:52:15:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
05:52:15:WU02:FS01:0x22:Version 0.0.5
05:52:15:WU00:FS02:0xa7:*********************** Log Started 2020-05-07T05:52:14Z ***********************
05:52:15:WU00:FS02:0xa7:************************** Gromacs Folding@home Core ***************************
05:52:15:WU00:FS02:0xa7: Type: 0xa7
05:52:15:WU00:FS02:0xa7: Core: Gromacs
05:52:15:WU00:FS02:0xa7: Args: -dir 00 -suffix 01 -version 706 -lifeline 25144 -checkpoint 30 -np
05:52:15:WU00:FS02:0xa7: 31
05:52:15:WU00:FS02:0xa7:************************************ CBang *************************************
05:52:15:WU00:FS02:0xa7: Date: Oct 26 2019
05:52:15:WU00:FS02:0xa7: Time: 01:38:25
05:52:15:WU00:FS02:0xa7: Revision: c46a1a011a24143739ac7218c5a435f66777f62f
05:52:15:WU00:FS02:0xa7: Branch: master
05:52:15:WU00:FS02:0xa7: Compiler: Visual C++ 2008
05:52:15:WU00:FS02:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:15:WU00:FS02:0xa7: Platform: win32 10
05:52:15:WU00:FS02:0xa7: Bits: 64
05:52:15:WU00:FS02:0xa7: Mode: Release
05:52:15:WU00:FS02:0xa7:************************************ System ************************************
05:52:15:WU00:FS02:0xa7: CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
05:52:15:WU00:FS02:0xa7: CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
05:52:15:WU00:FS02:0xa7: CPUs: 64
05:52:15:WU00:FS02:0xa7: Memory: 31.87GiB
05:52:15:WU00:FS02:0xa7:Free Memory: 25.03GiB
05:52:15:WU00:FS02:0xa7: Threads: WINDOWS_THREADS
05:52:15:WU00:FS02:0xa7: OS Version: 6.2
05:52:15:WU00:FS02:0xa7:Has Battery: false
05:52:15:WU00:FS02:0xa7: On Battery: false
05:52:15:WU00:FS02:0xa7: UTC Offset: 2
05:52:15:WU00:FS02:0xa7: PID: 25168
05:52:15:WU00:FS02:0xa7: CWD: C:\Users\John\AppData\Roaming\FAHClient\work
05:52:15:WU00:FS02:0xa7:******************************** Build - libFAH ********************************
05:52:15:WU00:FS02:0xa7: Version: 0.0.18
05:52:15:WU00:FS02:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:52:15:WU00:FS02:0xa7: Copyright: 2019 foldingathome.org
05:52:15:WU00:FS02:0xa7: Homepage: https://foldingathome.org/
05:52:15:WU00:FS02:0xa7: Date: Oct 26 2019
05:52:15:WU00:FS02:0xa7: Time: 01:52:30
05:52:15:WU00:FS02:0xa7: Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
05:52:15:WU00:FS02:0xa7: Branch: master
05:52:15:WU00:FS02:0xa7: Compiler: Visual C++ 2008
05:52:15:WU00:FS02:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
05:52:15:WU00:FS02:0xa7: Platform: win32 10
05:52:15:WU00:FS02:0xa7: Bits: 64
05:52:15:WU00:FS02:0xa7: Mode: Release
05:52:15:WU00:FS02:0xa7:************************************ Build *************************************
05:52:15:WU00:FS02:0xa7: SIMD: avx_256
05:52:15:WU00:FS02:0xa7:********************************************************************************
05:52:15:WU00:FS02:0xa7:Project: 14217 (Run 524, Clone 0, Gen 19)
05:52:15:WU00:FS02:0xa7:Unit: 0x00000014cedfaa925eab7635545095f8
05:52:15:WU00:FS02:0xa7:Digital signatures verified
05:52:15:WU00:FS02:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
05:52:15:WU00:FS02:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpt 30 -nt 30
05:52:15:WU02:FS01:0x22: Found a checkpoint file
05:52:15:WU00:FS02:0xa7:Steps: first=1187500 total=62500
05:52:19:WU00:FS02:0xa7:Completed 1 out of 62500 steps (0%)
05:52:27:WU02:FS01:0x22:Completed 1780000 out of 5000000 steps (35%)
05:52:27:WU02:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
05:52:41:WU00:FS02:0xa7:Completed 625 out of 62500 steps (1%)
05:53:02:WU00:FS02:0xa7:Completed 1250 out of 62500 steps (2%)
05:53:24:WU00:FS02:0xa7:Completed 1875 out of 62500 steps (3%)
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Thu May 07, 2020 6:11 pm
by Rel25917
You need to leave 127.0.0.1 in the allowed ip addresses.
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Thu May 07, 2020 7:09 pm
by PantherX
Please note that the following flag was discontinued several years ago so you can safely remove it from your system:
05:52:14: <client-type v='bigadv'/>
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Thu May 07, 2020 7:48 pm
by bruce
05:52:14: <allow v='192.168.1.21'/>
05:52:14: <command-allow-no-pass v='192.168.1.21'/>
You removed the required permissions for 127.0.0.1 which is the local FAHControl. They should read 127.0.0.1,192.168.1.21 or something like that.
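The check described above can be sketched in Python. This is only an illustration: the helper name `missing_loopback` is made up, the two option names are copied from the config dump in the log, and the sketch assumes the `v` attribute is a plain comma- or space-separated list of addresses. It does an exact string match, so it does not understand ranges or masks.

```python
import xml.etree.ElementTree as ET

LOOPBACK = "127.0.0.1"
# Option names as they appear in the config dump in the posted log.
OPTIONS = ("allow", "command-allow-no-pass")


def missing_loopback(config_xml):
    """Return the options that are set but do not list 127.0.0.1.

    Hypothetical helper: exact string match only, no range or mask handling.
    """
    root = ET.fromstring(config_xml)
    missing = []
    for name in OPTIONS:
        element = root.find(name)
        if element is None:
            continue  # option not set in this file; nothing to check here
        # Accept commas or whitespace as separators (an assumption).
        addresses = element.get("v", "").replace(",", " ").split()
        if LOOPBACK not in addresses:
            missing.append(name)
    return missing


# The two settings as posted in the log.
posted = """<config>
  <allow v='192.168.1.21'/>
  <command-allow-no-pass v='192.168.1.21'/>
</config>"""

# The same settings with 127.0.0.1 added, as suggested in this post.
suggested = """<config>
  <allow v='127.0.0.1,192.168.1.21'/>
  <command-allow-no-pass v='127.0.0.1,192.168.1.21'/>
</config>"""

print(missing_loopback(posted))     # ['allow', 'command-allow-no-pass']
print(missing_loopback(suggested))  # []
```

To run it against a real file, read C:\Users\John\AppData\Roaming\FAHClient\config.xml (the path shown in the log) into a string and pass that to the function.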
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Thu May 07, 2020 11:38 pm
by JimboPalmer
Not your problem, I just noticed it.
05:52:15:WU00:FS02:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
05:52:15:WU00:FS02:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpt 30 -nt 30
You have your system trying to use 31 CPU threads (one is used to feed data to the GPU)
You may wish to consider setting it to 30 or 27 CPUs instead as they have no large prime factors. 27 is safer than 30 as 30 still has a factor of 5.
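One quick way to compare candidate thread counts is to look at the largest prime factor of each. A minimal Python sketch, assuming plain trial division is good enough for numbers this small; the function name is made up for illustration, and this is not how FahCore_a7 itself chooses a thread count.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after dividing out the small factors is itself prime.
        largest = n
    return largest


for threads in (27, 30, 31, 32):
    print(threads, largest_prime_factor(threads))
# 27 3
# 30 5
# 31 31
# 32 2
```

31 is itself prime, which matches the log line where the core drops from 31 to 30 threads.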
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Sat May 09, 2020 10:21 pm
by JTorset
PantherX wrote:Please note that the following flag was discontinued several years ago so you can safely remove it from your system:
05:52:14: <client-type v='bigadv'/>
Thank you for your information.
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Sat May 09, 2020 10:43 pm
by JTorset
JimboPalmer wrote:Not your problem, I just noticed it.
05:52:15:WU00:FS02:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
05:52:15:WU00:FS02:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpt 30 -nt 30
You have your system trying to use 31 CPU threads (one is used to feed data to the GPU)
You may wish to consider setting it to 30 or 27 CPUs instead as they have no large prime factors. 27 is safer than 30 as 30 still has a factor of 5.
Domain decomposition by a prime number > 3 seems only to be a Linux issue. I am running on Windows. I have 33 threads for the GPU.
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Sat May 09, 2020 10:48 pm
by bruce
Domain decomposition is a potential problem for the CPU analysis done with FAHCore_a7. Your GPU uses FAHCore_2* which handles things very differently.
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Sat May 09, 2020 10:48 pm
by PantherX
JTorset wrote:...Domain decomposition by a prime number > 3 seems only to be a Linux issue. I am running on Windows. I have 33 threads for the GPU.
Please note that this is a GROMACS issue, i.e. it impacts all the OSes that FahCore_a7 runs on. It would be interesting to see your log file to verify whether FahCore_a7 is using 33 CPUs or rounding down to a different number.
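A rough Python sketch of how a saved log could be checked for this. The two patterns are based only on the lines quoted earlier in this thread ("Reducing thread count from 31 to 30 ..." and "Calling: mdrun ... -nt 30"); other core versions may word these lines differently, and the function names are made up for illustration.

```python
import re

# Patterns modelled on the two FahCore_a7 lines quoted in this thread.
REDUCTION = re.compile(r"Reducing thread count from (\d+) to (\d+)")
MDRUN_NT = re.compile(r"Calling: mdrun\b.*?-nt (\d+)")


def thread_reductions(log_text):
    """Return (requested, used) pairs for every thread-count reduction logged."""
    return [(int(a), int(b)) for a, b in REDUCTION.findall(log_text)]


def mdrun_thread_counts(log_text):
    """Return the -nt value from every logged mdrun call."""
    return [int(n) for n in MDRUN_NT.findall(log_text)]


# The two lines as they appear in the log posted above.
sample = """05:52:15:WU00:FS02:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
05:52:15:WU00:FS02:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpt 30 -nt 30
"""

print(thread_reductions(sample))    # [(31, 30)]
print(mdrun_thread_counts(sample))  # [30]
```

If the first list comes back empty for a slot, the core presumably ran with the thread count it was given; the -nt value shows what mdrun was actually started with.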
Re: 7.6.13 FAHControl connects to the client and gets thrown
Posted: Sun May 10, 2020 9:05 am
by Neil-B
JTorset wrote:Domain decomposition by a prime number > 3 seems only to be a Linux issue. I am running on Windows. I have 33 threads for the GPU.
Domain decomposition happens on Windows as well (honestly, I have seen it) … 33 threads for the GPU is a little more than it needs (one thread is usually fine), but obviously if you want to keep a fair bit of CPU for other tasks that is perfectly OK … If you are looking to maximise your folding you have more choices: on Windows, 32-thread slots work well, and you could have a second slot of maybe 27 threads, leaving the rest of the threads for other work and servicing the GPU.
It will run "warmer" than a single 31-thread slot … and there may be AMD gurus on the forums who could advise on the best setup for your (rather nice) CPU … Some CPUs work best (more efficiently, cooler, etc., without too much drop in performance) with no threading on the cores, as this can allow the cores to boost higher even with fewer threads, making up much of the performance hit from halving the threads.