Issues with mskcc1
Posted: Fri Jan 14, 2022 6:10 pm
I noticed all my slots across 5 systems transitioning to "Failed", and rebooting did not help.
The clients were either failing to connect at all, or connecting and receiving short responses.
The logs showed:
17:54:18:WU00:FS00:Requesting new work unit for slot 00: gpu:10:0 TU104 [GeForce RTX 2070 SUPER] 8218 from 54.157.202.86
17:54:18:WU00:FS00:Connecting to 54.157.202.86:8080
17:54:48:ERROR:WU00:FS00:Exception: Not connected
Changing the Client Preference from "COVID" to "Any" resulted in no longer being assigned to mskcc1, and the slots started getting work.
So it looks like there might be an issue with mskcc1 where it has jobs to give out but can't assign them, and the client spins away trying and eventually drops to a failed state.
OS: Ubuntu 18.04.3 LTS; NVIDIA Driver: 460.91.03; Client: 7.6.21
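For anyone who wants to check whether their own clients are hitting the same server, here is a rough sketch that counts "Not connected" errors per address. It assumes log lines in the same format as the excerpt in this post; the function name and regular expressions are my own and not part of the client.

```python
import re
from collections import Counter

# Matches lines like "17:54:18:WU00:FS00:Connecting to 54.157.202.86:8080"
CONNECT_RE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+:FS\d+):Connecting to (\S+)")
# Matches lines like "17:54:48:ERROR:WU00:FS00:Exception: Not connected"
ERROR_RE = re.compile(r"^\d\d:\d\d:\d\d:ERROR:(WU\d+:FS\d+):Exception: Not connected")


def count_failed_connections(lines):
    """Count 'Not connected' errors per server address (hypothetical helper)."""
    last_target = {}      # "WUxx:FSxx" tag -> most recent address it tried
    failures = Counter()  # address -> number of failed connection attempts
    for line in lines:
        m = CONNECT_RE.match(line)
        if m:
            last_target[m.group(1)] = m.group(2)
            continue
        m = ERROR_RE.match(line)
        if m and m.group(1) in last_target:
            failures[last_target[m.group(1)]] += 1
    return failures


# The three log lines quoted in this post
sample = [
    "17:54:18:WU00:FS00:Requesting new work unit for slot 00: gpu:10:0 TU104 [GeForce RTX 2070 SUPER] 8218 from 54.157.202.86",
    "17:54:18:WU00:FS00:Connecting to 54.157.202.86:8080",
    "17:54:48:ERROR:WU00:FS00:Exception: Not connected",
]
failures = count_failed_connections(sample)
for address, count in failures.items():
    print(address, count)
```

To run it against a real log, pass the open log file (or any iterable of lines) instead of `sample`. It only pairs each error with the last address the same WU/slot tried, so treat the counts as a rough indicator.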