8.1.13 dies suddenly

Posted: Sat Jul 20, 2024 8:14 pm
by JL_678
Hi,

I have been running 8.1.13 for a while now. It was a bear to set up, so I have not upgraded. I have 5 nodes running, and all worked perfectly until 7/17/24, when all of them stopped getting jobs. Here is a log snippet:

01:37:35:I1:OUT2483:> GET https://api.foldingathome.org/gpus HTTP/1.1
01:37:35:I3:Connecting to api.foldingathome.org:443
01:37:36:I1:OUT2483:< api.foldingathome.org:443 HTTP/1.1 200 HTTP_OK
01:37:36:E :Exception: Failed to open dynamic library 'libcuda.so': libcuda.so: cannot open shared object file: No such file or directory
01:37:36:E :Exception: Failed to open dynamic library 'libOpenCL.so': libOpenCL.so: cannot open shared object file: No such file or directory
01:37:37:I1::WU969:Requesting WU assignment
01:37:37:I1:OUT2484:> POST https://assign6.foldingathome.org/api/assign HTTP/1.1
01:37:37:I3:Connecting to assign6.foldingathome.org:443
01:37:38:I1:OUT2484:< assign6.foldingathome.org:443 HTTP/1.1 503 HTTP_SERVICE_UNAVAILABLE
01:37:38:E ::WU969:HTTP_SERVICE_UNAVAILABLE: {"error":"No appropriate assignment"}
01:37:38:I1::WU969:Retry #408 in 8 mins 32 secs
01:46:10:I1::WU969:Requesting WU assignment
01:46:10:I1:OUT2485:> POST https://assign1.foldingathome.org/api/assign HTTP/1.1
01:46:10:I3:Connecting to assign1.foldingathome.org:443
01:46:10:I1:OUT2485:< assign1.foldingathome.org:443 HTTP/1.1 503 HTTP_SERVICE_UNAVAILABLE

The CUDA stuff is curious because I am not running with a GPU. Not sure why it is even there. All systems are running Intel CPUs, if that matters. Is this perhaps related to 8.1.13 being deprecated? I can upgrade, but it took so much work to set up 8.1.13 that I am worried about making the change.

Re: 8.1.13 dies suddenly

Posted: Sat Jul 20, 2024 8:21 pm
by Joe_H
If these systems running CPU jobs have fewer than 4 CPU threads available, see this topic: viewtopic.php?t=41716. The server hosting the projects that assign WUs to CPU thread counts below 4 is currently down.
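
For anyone checking several nodes at once, here is a minimal Python sketch that tallies the two kinds of error lines in a log like the one above and reports the latest retry number. It assumes the log format shown in the first post; the function name `summarize` and the category labels are made up for illustration and are not part of the Folding@home client.

```python
import re
from collections import Counter

# Matches the retry counter in lines such as "Retry #408 in 8 mins 32 secs".
RETRY_RE = re.compile(r"Retry #(\d+)")


def summarize(log_text):
    """Count known error kinds in a log snippet and return the latest retry number.

    The category labels are hypothetical names chosen for this sketch.
    """
    counts = Counter()
    last_retry = None
    for line in log_text.splitlines():
        if "No appropriate assignment" in line:
            # The assignment server had no work matching this client.
            counts["no_assignment"] += 1
        elif "Failed to open dynamic library" in line:
            # GPU library lookup failed (libcuda.so / libOpenCL.so).
            counts["missing_gpu_library"] += 1
        match = RETRY_RE.search(line)
        if match:
            last_retry = int(match.group(1))
    return counts, last_retry


# Lines copied from the log snippet in the first post.
SAMPLE = """\
01:37:36:E :Exception: Failed to open dynamic library 'libcuda.so': libcuda.so: cannot open shared object file: No such file or directory
01:37:36:E :Exception: Failed to open dynamic library 'libOpenCL.so': libOpenCL.so: cannot open shared object file: No such file or directory
01:37:38:E ::WU969:HTTP_SERVICE_UNAVAILABLE: {"error":"No appropriate assignment"}
01:37:38:I1::WU969:Retry #408 in 8 mins 32 secs
"""

if __name__ == "__main__":
    counts, last_retry = summarize(SAMPLE)
    print(dict(counts), last_retry)
```

A high retry number together with "No appropriate assignment" responses suggests the client is reaching the assignment servers and simply has no matching work. In the snippet above, the WU request still goes out right after the library errors, so those do not appear to be what is blocking assignment.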

Re: 8.1.13 dies suddenly

Posted: Mon Jul 22, 2024 1:25 am
by JL_678
That explains it. Thank you.