gpus.json gets messed with?
Posted: Tue Aug 27, 2024 9:39 pm
Sometime earlier this month (8/24) my GPU, a T600, was no longer recognized. I thought the problem was on my end, possibly as a result of kernel updates, so I reinstalled the client. No dice.
I took a hard look at the error messages in the log at these lines:
***************************snippet of log****************************************************
15:10:27:W :OpenCL not supported: clGetPlatformIDs() returned -1001
15:10:27:W :CUDA not supported: cuInit() returned 999
15:10:27:I3:gpus = {
15:10:27:I3: "gpu:01:00:00": {"vendor": 4318, "device": 8122, "type": "nvidia", "supported": false, "description": "TU117GLM [T600 Mobile]"}
15:10:27:I3:}
**********************************************************************************************
It is correct that there is no TU117GLM [T600 Mobile] in gpus.json. Recall that I wrote "my GPU was no longer recognized." It appears that sometime this month gpus.json was updated and the record for my GPU was changed.
clinfo identifies this GPU as 'NVIDIA T600 Laptop GPU', so it is correct that gpus.json does not contain a record with the description field value "TU117GLM [T600 Mobile]".
*******************snippet of clinfo output********************************
Platform Name NVIDIA CUDA
Number of devices 1
Device Name NVIDIA T600 Laptop GPU
Device Vendor NVIDIA Corporation
Device Vendor ID 0x10de
Device Version OpenCL 3.0 CUDA
*******************************************************************************
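For anyone comparing the two snippets: the log prints the vendor and device IDs in decimal, while clinfo prints the vendor ID in hex. A minimal sketch of the conversion, using only the values shown above (the helper name to_pci_hex is made up for illustration):

```python
def to_pci_hex(value: int) -> str:
    """Format a decimal PCI ID the way clinfo and lspci show it (hypothetical helper)."""
    return f"0x{value:04x}"

# Decimal values as printed in the client log above.
log_vendor = 4318
log_device = 8122

# Hex vendor ID as printed by clinfo above.
clinfo_vendor = int("0x10de", 16)

if __name__ == "__main__":
    print(to_pci_hex(log_vendor))       # vendor ID in hex
    print(to_pci_hex(log_device))       # device ID in hex
    print(log_vendor == clinfo_vendor)  # do the log and clinfo agree on the vendor?
```

So vendor 4318 in the log and 0x10de in clinfo are the same ID; only the description strings differ.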
As an expedient I edited the description field in gpus.json for the suspect record ("vendor": 4318, "device": 8122, "type": "nvidia") to the value that clinfo returns, i.e. "NVIDIA T600 Laptop GPU". Now everything is just peachy.
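The same edit can be scripted rather than done by hand. This is only a rough sketch, and it assumes gpus.json is a JSON array of records with "vendor", "device", "type" and "description" fields, which I have not verified against every client version. The function names and the path in the comment are made up for illustration, so back up the file first.

```python
import json
from pathlib import Path

def set_description(records, vendor, device, description):
    """Set the description on every record matching vendor and device.

    Assumes records is a list of dicts (an assumed gpus.json layout).
    Returns the number of records changed.
    """
    changed = 0
    for rec in records:
        if rec.get("vendor") == vendor and rec.get("device") == device:
            rec["description"] = description
            changed += 1
    return changed

def patch_file(path, vendor, device, description):
    """Load the JSON file, patch matching records, and write it back only if something changed."""
    p = Path(path)
    records = json.loads(p.read_text())
    changed = set_description(records, vendor, device, description)
    if changed:
        p.write_text(json.dumps(records, indent=2))
    return changed

# Example call (the path is a placeholder, not the client's real location):
# patch_file("gpus.json", 4318, 8122, "NVIDIA T600 Laptop GPU")
```

Matching on the numeric vendor and device IDs rather than on the description string means the script still finds the record if the description text changes again.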
So, what happened to gpus.json?