
bad_cast when running FAHclient on fedora 33

Posted: Thu Dec 24, 2020 4:07 pm
by asdfghjkl
Hi,
I am trying to run FAHClient on Fedora 33, and it fails with a bad_cast exception.

FAHClient /etc/fahclient/config.xml
15:59:46:Read GPUs.txt
terminate called after throwing an instance of 'std::bad_cast'
what(): std::bad_cast
Aborted (core dumped)

I tried to run it under gdb and it gives the following stack trace:

#0  0x00007ffff7c919d5 in raise () from /lib64/libc.so.6
#1  0x00007ffff7c7a8a4 in abort () from /lib64/libc.so.6
#2  0x00007ffff452f926 in __gnu_cxx::__verbose_terminate_handler() [clone .cold] ()
   from /lib64/libstdc++.so.6
#3  0x00007ffff453b1ac in __cxxabiv1::__terminate(void (*)()) () from /lib64/libstdc++.so.6
#4  0x00007ffff453b217 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007ffff453b4c9 in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x00007ffff4532222 in std::__throw_bad_cast() () from /lib64/libstdc++.so.6
#7  0x00007ffff79ea110 in std::__check_facet<std::ctype<char> > (__f=0x0)
    at /usr/include/c++/4.8.2/bits/basic_ios.h:49
#8  std::basic_ios<char, std::char_traits<char> >::widen (this=<optimized out>, __c=10 '\n')
    at /usr/include/c++/4.8.2/bits/basic_ios.h:444
#9  std::istream::getline (__n=5120, __s=0x7fffffff92c0 "", this=0x7fffffff90b0)
    at /usr/include/c++/4.8.2/istream:428
#10 Intel::OpenCL::Utils::GetModulePathName (
    modulePtr=0x7ffff79fd323 <Intel::OpenCL::Utils::ConfigFile::ReadFile(std::string const&, Intel::OpenCL::Utils::ConfigFile&)::__FUNCTION__>, fileName=fileName@entry=0x7fffffffa9f0 "", strLen=4095)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/utils/cl_sys_utils/cl_sys_info_linux.cpp:262
#11 0x00007ffff79ea12d in Intel::OpenCL::Utils::GetModuleDirectoryImp (addr=<optimized out>, 
    szModuleDir=0x7fffffffa9f0 "", strLen=<optimized out>)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/utils/cl_sys_utils/cl_sys_info_linux.cpp:205
#12 0x00007ffff79e3590 in Intel::OpenCL::Utils::ConfigFile::ReadFile (fileName="cl.cfg", cfg=...)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/utils/cl_sys_utils/cl_config.cpp:297
#13 0x00007ffff79e402e in Intel::OpenCL::Utils::ConfigFile::ConfigFile (this=0x7fffffffbb00, 
    filename="cl.cfg", delimiter=..., comment="", 
    sentry="hc\367\000\000\000\000\000\316\027\235\367\377\177\000\000\000\000\000\000\000\000\000\000@\273\377\377\377\177", '\000' <repeats 14 times>, "\377\177\000\000\000\000\000\000\000\000\000\000\b\273\377\377\377\177\000\000\b\273\377\377\377\177\000\000\000\000\000\000\000\000\000\000\070e\367\000\000\000\000\000\070b\367\000\000\000\000\000hc\367\000\000\000\000\000\377\377\377\377\000\000\000\000\367\020\000\000\000\000\000\000\350\306\302\367\377\177\000\000\001\000\000\000\000\000\000\000\210\335\377\377\377\177\000\000\230\335\377\377\377\177\000\000\210\310\302\367\377\177\000\000\000\000\000\000\000\000\000\000\304\344\217\367\377\177\000\000\350\306\302\367\377\177\000\000ߨ\217\367\377\177\000\000\210\310\302\367\377\177\000\000"...)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/utils/cl_sys_utils/cl_config.cpp:220
#14 0x00007ffff79e85d9 in Intel::OpenCL::Utils::FrameworkUserLogger::FrameworkUserLogger (
    this=0x7ffff7c449c0 <Intel::OpenCL::Utils::FrameworkUserLogger::Instance()::instance>)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/utils/cl_sys_utils/cl_user_logger.cpp:161
#15 0x00007ffff78fe4c4 in Intel::OpenCL::Utils::FrameworkUserLogger::Instance ()
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/framework/ocl_config.cpp:52
#16 0x00007ffff78fa8df in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/framework/ocl_config.cpp:58
#17 _GLOBAL__sub_I_ocl_config.cpp(void) ()
    at /netbatch/donb41412_00/runDir/93/20180921_000000/llvm/projects/opencl/framework/ocl_config.cpp:60
#18 0x00007ffff7fe18ee in call_init.part () from /lib64/ld-linux-x86-64.so.2
#19 0x00007ffff7fe19d8 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#20 0x00007ffff7d91095 in _dl_catch_exception () from /lib64/libc.so.6
#21 0x00007ffff7fe5e35 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#22 0x00007ffff7d91038 in _dl_catch_exception () from /lib64/libc.so.6
#23 0x00007ffff7fe566e in _dl_open () from /lib64/ld-linux-x86-64.so.2
#24 0x00007ffff7f8539c in dlopen_doit () from /lib64/libdl.so.2
#25 0x00007ffff7d91038 in _dl_catch_exception () from /lib64/libc.so.6
#26 0x00007ffff7d91103 in _dl_catch_error () from /lib64/libc.so.6
#27 0x00007ffff7f85bd9 in _dlerror_run () from /lib64/libdl.so.2
#28 0x00007ffff7f85428 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#29 0x00007ffff742e0ee in _open_driver () from /lib64/libOpenCL.so
#30 0x00007ffff7434d15 in _initClIcd_real () from /lib64/libOpenCL.so
#31 0x00007ffff7435804 in clGetPlatformIDs () from /lib64/libOpenCL.so
#32 0x0000000000650423 in ?? ()
#33 0x0000000000482569 in ?? ()
#34 0x00000000004c272f in ?? ()
#35 0x000000000043efc0 in ?? ()
#36 0x0000000000510194 in ?? ()
#37 0x0000000000522fcc in ?? ()
#38 0x00000000004380d4 in ?? ()
#39 0x000000000042dcd7 in ?? ()
#40 0x00007ffff7c7c1e2 in __libc_start_main () from /lib64/libc.so.6
#41 0x000000000042d0f1 in ?? ()
#42 0x00007fffffffdd78 in ?? ()
#43 0x00007ffff7ffe020 in ?? () from /lib64/ld-linux-x86-64.so.2
#44 0x0000000000000001 in ?? ()
#45 0x00007fffffffe0e7 in ?? ()
#46 0x0000000000000000 in ?? ()
My guess is that the Intel integrated GPU is not supported, so I tried the additional argument --gpu=false, but it doesn't help; I still get the same bad_cast error.

Any help troubleshooting this problem would be much appreciated.
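
A backtrace like the one above can also be captured non-interactively, which keeps gdb's pager prompts out of the output. This is only a minimal sketch: `capture_bt` is a hypothetical helper name, and it assumes gdb is installed and on the PATH.

```shell
# Run a program under gdb in batch mode and print a backtrace when it stops.
# Usage: capture_bt PROGRAM [ARGS...]
capture_bt() {
    # -batch  : no interactive prompts, exit when the commands are done
    # -ex run : start the program
    # -ex bt  : print the backtrace once the program has stopped (e.g. on abort)
    # --args  : everything after this is the program and its arguments
    gdb -batch -ex run -ex bt --args "$@"
}

# Example, using the command line from the post above:
# capture_bt FAHClient /etc/fahclient/config.xml > backtrace.txt 2>&1
```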

Mod Edit: Added Code Tags - PantherX

Re: bad_cast when running FAHclient on fedora 33

Posted: Thu Dec 24, 2020 6:15 pm
by psaam0001
Did you try updating your Intel iGPU drivers? That would be a good first step. The other would be to make sure you are using version 7.6.21 of FAHClient.

One of the more experienced forum members with an Intel iGPU (on a Fedora system) may be able to help you more with the driver upgrade process, as I'm primarily using NVIDIA GPUs.

Paul

Re: bad_cast when running FAHclient on fedora 33

Posted: Thu Dec 24, 2020 6:48 pm
by asdfghjkl
Thanks for your fast response.

I am using the i915 driver, version 5.9.15-200.fc33.x86_64

lspci -k | grep -EA3 'VGA|3D|Display'
00:02.0 VGA compatible controller: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)
DeviceName: Onboard IGD
Subsystem: Dell Device 078c
Kernel driver in use: i915

modinfo i915
......
vermagic: 5.9.15-200.fc33.x86_64 SMP mod_unload
......

I am not sure whether I need to try with a different driver.

I downloaded FAHClient this morning, it is version 7.6.21.

FAHClient --version
7.6.21

Re: bad_cast when running FAHclient on fedora 33

Posted: Thu Dec 24, 2020 6:54 pm
by bruce
asdfghjkl wrote: I am not sure whether I need to try with a different driver.
Intel iGPUs are very particular about drivers, and they're still being beta tested. I would recommend you leave the iGPU disabled at least until you're able to fold with your CPU (and with an AMD or NVIDIA GPU, if you have one).

Re: bad_cast when running FAHclient on fedora 33

Posted: Fri Dec 25, 2020 12:31 am
by PantherX
This is what I know:
Trusted Source wrote: ...for Linux: "For those looking for Linux graphics display support, it is recommended to use the software provided by your operating system distribution vendor." However, for "compute", we have: https://dgpu-docs.intel.com/index.html

Re: bad_cast when running FAHclient on fedora 33

Posted: Tue Dec 29, 2020 3:43 am
by asdfghjkl
By the way, is there any way to prevent FAHClient from probing for a GPU? I tried --gpu=false, but it still probes for a GPU and hits the bad_cast exception. I cannot just remove the OpenCL runtime, as I need it for other programs.

I don't think so.

We don't know what hardware/software you have so anything I suggest is strictly a guess.

The OpenCL that comes from Intel is not the same as the one that comes from other vendors, such as NVIDIA. If you have the NVIDIA OpenCL runtime (it comes with the NVIDIA drivers if you get them from NVIDIA), I suspect FAH will try to use it instead of the Intel version, and it might allow FAHClient to believe you have no GPUs. It might also be useful to temporarily remove (rename?) the Intel OpenCL runtime.
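
One way to try the rename idea without touching the runtime itself is to rename its ICD registration file, so the OpenCL loader stops loading that driver. The following is a rough sketch, not a tested fix for this crash. It assumes the loader reads `*.icd` files from `/etc/OpenCL/vendors` (the usual location on Linux). The function names are hypothetical, and the actual name of the Intel `.icd` file on your system has to be read from the listing.

```shell
# Directory where the ICD loader usually looks for vendor registrations.
# Override VENDOR_DIR if your distribution uses a different path (assumption).
VENDOR_DIR="${VENDOR_DIR:-/etc/OpenCL/vendors}"

# Print each registered ICD file and the driver library named on its first line.
list_icds() {
    for icd in "$VENDOR_DIR"/*.icd; do
        [ -e "$icd" ] || continue   # glob matched nothing
        printf '%s -> %s\n' "$icd" "$(head -n 1 "$icd")"
    done
}

# Temporarily disable one registration by renaming it so it no longer
# matches *.icd. Needs root when VENDOR_DIR is under /etc.
disable_icd() {
    mv -- "$VENDOR_DIR/$1" "$VENDOR_DIR/$1.disabled"
}

# Undo disable_icd.
enable_icd() {
    mv -- "$VENDOR_DIR/$1.disabled" "$VENDOR_DIR/$1"
}

list_icds
```

Since renaming affects every OpenCL program on the machine, remember to run `enable_icd` afterwards. The `_initClIcd_real` frame in the backtrace suggests the loader is ocl-icd; if so, its documentation describes an `OCL_ICD_VENDORS` environment variable that points the loader at a different directory for a single process, which may be a less intrusive test, though whether FAHClient honors it is a guess.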

Does FAHClient get far enough to produce the top of FAH's log.txt?

Setting gpus='false' might also help.