3.21.157.11 overloaded?
Posted: Tue Aug 11, 2020 10:21 am
I am having serious trouble with the work server 3.21.157.11.
One client is not getting any work units from it, despite initially connecting. After the connection is established, the client just sits there for hours and nothing happens.
The log file was grabbed at 10:15Z and its last entry is from 08:39:49, so the client had been sitting in this state for over an hour and a half.
Another client is unable to upload a finished unit to 3.21.157.11. Every attempt either stalls at 0.92% with "Transfer failed" or ends in a connection timeout.
It looks like the server is overloaded, or doesn't have enough bandwidth. Connecting to it with a web browser sometimes gives me the "Work Server Version something" page, but only after a very long wait, and sometimes the connection times out.
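To put numbers on what the browser shows, a small probe that times a plain TCP connect to the two ports the client tries (8080, then 80) might help. This is only a sketch: the `probe` helper and its 10-second default timeout are my own choices, not anything taken from FAHClient, and a successful connect says nothing about whether the server then answers the actual request.

```python
import socket
import time


def probe(host, port, timeout=10.0):
    """Attempt one TCP connect and report how long it took.

    Returns (ok, seconds_elapsed, error_text); error_text is "" on success.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


# Example call (not executed here), for the server from this post:
#   for port in (8080, 80):
#       print(port, probe("3.21.157.11", port))
```

Running the probe a few times, some minutes apart, should show whether the slow or failed connects are consistent or only intermittent.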
Is there anything else I should try, or is this a problem with the server that is known already?
Cheers,
HG
Log from the first client, which is not receiving any work units:
*********************** Log Started 2020-08-11T08:38:46Z ***********************
08:38:46:Trying to access database...
08:38:46:Successfully acquired database lock
08:38:46:Read GPUs.txt
08:38:46:Enabled folding slot 00: READY cpu:3
08:38:46:****************************** FAHClient ******************************
08:38:46: Version: 7.6.13
08:38:46: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:38:46: Copyright: 2020 foldingathome.org
08:38:46: Homepage: https://foldingathome.org/
08:38:46: Date: Apr 27 2020
08:38:46: Time: 21:20:45
08:38:46: Revision: 5a652817f46116b6e135503af97f18e094414e3b
08:38:46: Branch: master
08:38:46: Compiler: GNU 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)
08:38:46: Options: -std=c++11 -O3 -funroll-loops -mmacosx-version-min=10.7
08:38:46: -Wno-unused-local-typedefs -stdlib=libc++
08:38:46: Platform: darwin 19.2.0
08:38:46: Bits: 64
08:38:46: Mode: Release
08:38:46: Args: --user=XXXXX --team=XXXXX
08:38:46: --passkey=******************************** --gpu=false --smp=true
08:38:46: --cpus=3 --chdir /Users/bernd/FAH --log-color=false --password
08:38:46: ******** --pause-on-start=false
08:38:46: Config: /Users/bernd/FAH/config.xml
08:38:46:******************************** CBang ********************************
08:38:46: Date: Apr 24 2020
08:38:46: Time: 17:07:50
08:38:46: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
08:38:46: Branch: master
08:38:46: Compiler: GNU 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)
08:38:46: Options: -std=c++11 -O3 -funroll-loops -mmacosx-version-min=10.7
08:38:46: -Wno-unused-local-typedefs -stdlib=libc++ -fPIC
08:38:46: Platform: darwin 19.2.0
08:38:46: Bits: 64
08:38:46: Mode: Release
08:38:46:******************************* System ********************************
08:38:46: CPU: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
08:38:46: CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
08:38:46: CPUs: 4
08:38:46: Memory: 8.00GiB
08:38:46:Free Memory: 18.86MiB
08:38:46: Threads: POSIX_THREADS
08:38:46: OS Version: 10.13
08:38:46:Has Battery: false
08:38:46: On Battery: false
08:38:46: UTC Offset: 2
08:38:46: PID: 18009
08:38:46: CWD: /Users/bernd
08:38:46: OS: Darwin 17.7.0 x86_64
08:38:46: OS Arch: AMD64
08:38:46: GPUs: 0
08:38:46: CUDA: Not detected: Failed to open dynamic library 'libcuda.dylib':
08:38:46: dlopen(libcuda.dylib, 1): image not found
08:38:46: OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.dylib':
08:38:46: dlopen(libOpenCL.dylib, 1): image not found
08:38:46:******************************* libFAH ********************************
08:38:46: Date: Apr 15 2020
08:38:46: Time: 14:43:28
08:38:46: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
08:38:46: Branch: master
08:38:46: Compiler: GNU 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)
08:38:46: Options: -std=c++11 -O3 -funroll-loops -mmacosx-version-min=10.7
08:38:46: -Wno-unused-local-typedefs -stdlib=libc++
08:38:46: Platform: darwin 19.2.0
08:38:46: Bits: 64
08:38:46: Mode: Release
08:38:46:***********************************************************************
08:38:46:<config>
08:38:46: <!-- Network -->
08:38:46: <proxy v=':8080'/>
08:38:46:
08:38:46: <!-- Work Unit Control -->
08:38:46: <next-unit-percentage v='100'/>
08:38:46:
08:38:46: <!-- Folding Slots -->
08:38:46: <slot id='0' type='CPU'/>
08:38:46:</config>
08:38:46:WU00:FS00:Connecting to assign1.foldingathome.org:80
08:38:47:WU00:FS00:Assigned to work server 69.94.66.7
08:38:47:WU00:FS00:Requesting new work unit for slot 00: READY cpu:3 from 69.94.66.7
08:38:47:WU00:FS00:Connecting to 69.94.66.7:8080
08:38:47:ERROR:WU00:FS00:Exception: Server did not assign work unit
08:38:48:WU00:FS00:Connecting to assign1.foldingathome.org:80
08:38:48:WU00:FS00:Assigned to work server 3.21.157.11
08:38:48:WU00:FS00:Requesting new work unit for slot 00: READY cpu:3 from 3.21.157.11
08:38:48:WU00:FS00:Connecting to 3.21.157.11:8080
08:39:08:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
08:39:48:WU00:FS00:Connecting to assign1.foldingathome.org:80
08:39:48:WARNING:WU00:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
08:39:48:WU00:FS00:Connecting to assign2.foldingathome.org:80
08:39:49:WU00:FS00:Assigned to work server 3.21.157.11
08:39:49:WU00:FS00:Requesting new work unit for slot 00: READY cpu:3 from 3.21.157.11
08:39:49:WU00:FS00:Connecting to 3.21.157.11:8080
Log from the second client, which is unable to upload its finished unit to 3.21.157.11:
*********************** Log Started 2020-08-11T06:22:00Z ***********************
06:22:00:Trying to access database...
06:22:00:Successfully acquired database lock
06:22:00:Read GPUs.txt
06:22:00:WARNING:Exception: Failed to open '/proc/bus/pci/devices': Failed to open '/proc/bus/pci/devices': No such file or directory: iostream error: No such file or directory
06:22:00:Enabled folding slot 00: READY cpu:24
06:22:00:****************************** FAHClient ******************************
06:22:00: Version: 7.6.13
06:22:00: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:22:00: Copyright: 2020 foldingathome.org
06:22:00: Homepage: https://foldingathome.org/
06:22:00: Date: Apr 28 2020
06:22:00: Time: 04:20:16
06:22:00: Revision: 5a652817f46116b6e135503af97f18e094414e3b
06:22:00: Branch: master
06:22:00: Compiler: GNU 8.3.0
06:22:00: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
06:22:00: -fno-pie
06:22:00: Platform: linux2 4.19.0-5-amd64
06:22:00: Bits: 64
06:22:00: Mode: Release
06:22:00: Args: --user=XXXXX --team=XXXXX
06:22:00: --passkey=******************************** --gpu=false --smp=true
06:22:00: --cpus=24 --log-color=false --allow=127.0.0.1 192.168.1.0/24
06:22:00: --web-allow=127.0.0.1 192.168.1.0/24 --chdir /home/bernd/FAH
06:22:00: Config: /home/bernd/FAH/config.xml
06:22:00:******************************** CBang ********************************
06:22:00: Date: Apr 25 2020
06:22:00: Time: 00:07:53
06:22:00: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
06:22:00: Branch: master
06:22:00: Compiler: GNU 8.3.0
06:22:00: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
06:22:00: -fno-pie -fPIC
06:22:00: Platform: linux2 4.19.0-5-amd64
06:22:00: Bits: 64
06:22:00: Mode: Release
06:22:00:******************************* System ********************************
06:22:00: CPU: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz
06:22:00: CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
06:22:00: CPUs: 24
06:22:00: Memory: 39.99GiB
06:22:00:Free Memory: 2.22GiB
06:22:00: Threads: POSIX_THREADS
06:22:00: OS Version: 3.11
06:22:00:Has Battery: false
06:22:00: On Battery: false
06:22:00: UTC Offset: 2
06:22:00: PID: 26760
06:22:00: CWD: /home/bernd
06:22:00: OS: Linux 3.11.6 x86_64
06:22:00: OS Arch: AMD64
06:22:00: GPUs: 0
06:22:00: CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
06:22:00: libcuda.so: cannot open shared object file: No such file or
06:22:00: directory
06:22:00: OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.so':
06:22:00: libOpenCL.so: cannot open shared object file: No such file or
06:22:00: directory
06:22:00:******************************* libFAH ********************************
06:22:00: Date: Apr 15 2020
06:22:00: Time: 21:43:24
06:22:00: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
06:22:00: Branch: master
06:22:00: Compiler: GNU 8.3.0
06:22:00: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
06:22:00: -fno-pie
06:22:00: Platform: linux2 4.19.0-5-amd64
06:22:00: Bits: 64
06:22:00: Mode: Release
06:22:00:***********************************************************************
06:22:00:<config>
06:22:00: <!-- Network -->
06:22:00: <proxy v=':8080'/>
06:22:00:
06:22:00: <!-- Slot Control -->
06:22:00: <power v='full'/>
06:22:00:
06:22:00: <!-- Folding Slots -->
06:22:00: <slot id='0' type='CPU'/>
06:22:00:</config>
06:22:00:WU00:FS00:Starting
06:22:00:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/USERNAME/FAH/cores/cores.foldingathome.org/lin/64bit-sse2/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 706 -lifeline 26760 -checkpoint 15 -np 24
06:22:00:WU00:FS00:Started FahCore on PID 9768
06:22:00:WU00:FS00:Core PID:7241
06:22:00:WU00:FS00:FahCore 0xa7 started
06:22:01:WU00:FS00:0xa7:*********************** Log Started 2020-08-11T06:22:00Z ***********************
06:22:01:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
06:22:01:WU00:FS00:0xa7: Type: 0xa7
06:22:01:WU00:FS00:0xa7: Core: Gromacs
06:22:01:WU00:FS00:0xa7: Args: -dir 00 -suffix 01 -version 706 -lifeline 9768 -checkpoint 15 -np
06:22:01:WU00:FS00:0xa7: 24
06:22:01:WU00:FS00:0xa7:************************************ CBang *************************************
06:22:01:WU00:FS00:0xa7: Date: Nov 27 2019
06:22:01:WU00:FS00:0xa7: Time: 11:26:54
06:22:01:WU00:FS00:0xa7: Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
06:22:01:WU00:FS00:0xa7: Branch: master
06:22:01:WU00:FS00:0xa7: Compiler: GNU 8.3.0
06:22:01:WU00:FS00:0xa7: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
06:22:01:WU00:FS00:0xa7: -fno-pie -fPIC
06:22:01:WU00:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
06:22:01:WU00:FS00:0xa7: Bits: 64
06:22:01:WU00:FS00:0xa7: Mode: Release
06:22:01:WU00:FS00:0xa7:************************************ System ************************************
06:22:01:WU00:FS00:0xa7: CPU: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz
06:22:01:WU00:FS00:0xa7: CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
06:22:01:WU00:FS00:0xa7: CPUs: 24
06:22:01:WU00:FS00:0xa7: Memory: 39.99GiB
06:22:01:WU00:FS00:0xa7:Free Memory: 2.22GiB
06:22:01:WU00:FS00:0xa7: Threads: POSIX_THREADS
06:22:01:WU00:FS00:0xa7: OS Version: 3.11
06:22:01:WU00:FS00:0xa7:Has Battery: false
06:22:01:WU00:FS00:0xa7: On Battery: false
06:22:01:WU00:FS00:0xa7: UTC Offset: 2
06:22:01:WU00:FS00:0xa7: PID: 7241
06:22:01:WU00:FS00:0xa7: CWD: /home/bernd/FAH/work
06:22:01:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
06:22:01:WU00:FS00:0xa7: Version: 0.0.19
06:22:01:WU00:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:22:01:WU00:FS00:0xa7: Copyright: 2019 foldingathome.org
06:22:01:WU00:FS00:0xa7: Homepage: https://foldingathome.org/
06:22:01:WU00:FS00:0xa7: Date: Nov 26 2019
06:22:01:WU00:FS00:0xa7: Time: 00:41:43
06:22:01:WU00:FS00:0xa7: Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
06:22:01:WU00:FS00:0xa7: Branch: master
06:22:01:WU00:FS00:0xa7: Compiler: GNU 8.3.0
06:22:01:WU00:FS00:0xa7: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
06:22:01:WU00:FS00:0xa7: -fno-pie
06:22:01:WU00:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
06:22:01:WU00:FS00:0xa7: Bits: 64
06:22:01:WU00:FS00:0xa7: Mode: Release
06:22:01:WU00:FS00:0xa7:************************************ Build *************************************
06:22:01:WU00:FS00:0xa7: SIMD: sse2
06:22:01:WU00:FS00:0xa7:********************************************************************************
06:22:01:WU00:FS00:0xa7:Project: 14703 (Run 213, Clone 0, Gen 122)
06:22:01:WU00:FS00:0xa7:Unit: 0x0000008503159d0b5eb159232f247f05
06:22:01:WU00:FS00:0xa7:Digital signatures verified
06:22:01:WU00:FS00:0xa7:Calling: mdrun -s frame122.tpr -o frame122.trr -cpi state.cpt -cpt 15 -nt 24
06:22:01:WU00:FS00:0xa7:Steps: first=0 total=250000
06:22:05:WU00:FS00:0xa7:Completed 10552 out of 250000 steps (4%)
[...]
07:41:23:WU00:FS00:0xa7:Completed 250000 out of 250000 steps (100%)
07:41:25:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
07:41:25:WU00:FS00:0xa7:Saving result file dhdl.xvg
07:41:25:WU00:FS00:0xa7:Saving result file frame122.trr
07:41:25:WU00:FS00:0xa7:Saving result file md.log
07:41:25:WU00:FS00:0xa7:Saving result file pullf.xvg
07:41:25:WU00:FS00:0xa7:Saving result file pullx.xvg
07:41:25:WU00:FS00:0xa7:Saving result file science.log
07:41:25:WU00:FS00:0xa7:Saving result file traj_comp.xtc
07:41:25:WU00:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
07:41:25:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
07:41:26:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
07:41:26:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
07:41:26:WU00:FS00:Connecting to 3.21.157.11:8080
07:42:41:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
07:42:41:WU00:FS00:Connecting to 3.21.157.11:80
07:42:59:WU00:FS00:Upload 0.92%
07:42:59:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
07:42:59:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
07:42:59:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
07:42:59:WU00:FS00:Connecting to 3.21.157.11:8080
07:44:14:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
07:44:14:WU00:FS00:Connecting to 3.21.157.11:80
07:46:49:WU00:FS00:Upload 0.92%
07:46:49:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
07:46:49:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
07:46:49:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
07:46:49:WU00:FS00:Connecting to 3.21.157.11:8080
07:48:04:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
07:48:04:WU00:FS00:Connecting to 3.21.157.11:80
07:49:19:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
07:49:19:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
07:49:19:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
07:49:19:WU00:FS00:Connecting to 3.21.157.11:8080
08:00:26:WU00:FS00:Upload 0.92%
08:00:26:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
08:00:26:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
08:00:26:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
08:00:26:WU00:FS00:Connecting to 3.21.157.11:8080
08:01:41:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:01:41:WU00:FS00:Connecting to 3.21.157.11:80
08:02:56:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
08:04:40:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
08:04:40:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
08:04:40:WU00:FS00:Connecting to 3.21.157.11:8080
08:05:55:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:05:55:WU00:FS00:Connecting to 3.21.157.11:80
08:07:11:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
08:11:32:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
08:11:32:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
08:11:32:WU00:FS00:Connecting to 3.21.157.11:8080
08:12:47:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:12:47:WU00:FS00:Connecting to 3.21.157.11:80
08:14:02:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
08:22:37:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
08:22:37:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
08:22:37:WU00:FS00:Connecting to 3.21.157.11:8080
08:22:56:WU00:FS00:Upload 0.92%
08:22:56:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
08:40:34:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
08:40:34:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
08:40:34:WU00:FS00:Connecting to 3.21.157.11:8080
08:41:49:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:41:49:WU00:FS00:Connecting to 3.21.157.11:80
08:43:04:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
09:09:36:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
09:09:36:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
09:09:36:WU00:FS00:Connecting to 3.21.157.11:8080
09:10:51:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
09:10:51:WU00:FS00:Connecting to 3.21.157.11:80
09:12:06:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
09:56:35:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14703 run:213 clone:0 gen:122 core:0xa7 unit:0x0000008503159d0b5eb159232f247f05
09:56:35:WU00:FS00:Uploading 6.82MiB to 3.21.157.11
09:56:35:WU00:FS00:Connecting to 3.21.157.11:8080
09:57:50:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
09:57:50:WU00:FS00:Connecting to 3.21.157.11:80
09:59:05:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 3.21.157.11:80: Connection timed out
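In the log above, the gap between upload attempts grows from about a minute and a half to roughly 47 minutes, so the client is clearly backing off between retries. For anyone who wants to check the same pattern in their own log, here is a minimal sketch that pulls the timestamps of the "Sending unit results" lines and prints the gaps. The function names are made up for this post, the sample lines are abbreviated copies of the first three attempts above, and the code assumes the log does not cross midnight.

```python
import re
from datetime import datetime

# Matches FAHClient lines such as
# "07:41:26:WU00:FS00:Sending unit results: id:00 state:SEND ..."
ATTEMPT = re.compile(r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:Sending unit results")


def attempt_times(lines):
    """Return the timestamp of every 'Sending unit results' line."""
    times = []
    for line in lines:
        match = ATTEMPT.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Seconds between consecutive attempts (no midnight rollover handling)."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# Abbreviated copies of the first three upload attempts from the log above.
sample = [
    "07:41:26:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR",
    "07:41:26:WU00:FS00:Uploading 6.82MiB to 3.21.157.11",
    "07:42:59:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR",
    "07:46:49:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR",
]
print(gaps_in_seconds(attempt_times(sample)))
```

To run it on a full log, pass the open log file instead of `sample`; a file object iterates line by line, so nothing else needs to change.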