FahCore 23 broken on Fedora 39

wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

FahCore 23 broken on Fedora 39

Post by wdanwatts »

When I run the software for any length of time, it gives me a core 23 project and then I get this:

Code:

03:56:50:WU00:FS00:Starting
03:56:50:WU00:FS00:Removing old file 'work/00/logfile_01-20231201-032446.txt'
03:56:50:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -dir 00 -suffix 01 -version 706 -lifeline 1612 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
03:56:50:WU00:FS00:Started FahCore on PID 40865
03:56:50:WU00:FS00:Core PID:40869
03:56:50:WU00:FS00:FahCore 0x23 started
03:56:51:WU00:FS00:0x23:*********************** Log Started 2023-12-01T03:56:50Z ***********************
03:56:51:WU00:FS00:0x23:*************************** Core23 Folding@home Core ***************************
03:56:51:WU00:FS00:0x23:       Core: Core23
03:56:51:WU00:FS00:0x23:       Type: 0x23
03:56:51:WU00:FS00:0x23:    Version: 8.0.3
03:56:51:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:56:51:WU00:FS00:0x23:  Copyright: 2022 foldingathome.org
03:56:51:WU00:FS00:0x23:   Homepage: https://foldingathome.org/
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:28:22
03:56:51:WU00:FS00:0x23:   Revision: 199cb870317d05441d0a301287d9ef61254fa32b
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
03:56:51:WU00:FS00:0x23:             -DOPENMM_VERSION="\"8.0.0\""
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
03:56:51:WU00:FS00:0x23:             <peastman@stanford.edu>
03:56:51:WU00:FS00:0x23:       Args: -dir 00 -suffix 01 -version 706 -lifeline 40865 -checkpoint 30
03:56:51:WU00:FS00:0x23:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
03:56:51:WU00:FS00:0x23:             0 -gpu 0
03:56:51:WU00:FS00:0x23:************************************ libFAH ************************************
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:27:48
03:56:51:WU00:FS00:0x23:   Revision: 112c2234abe20611a05652defc3c7f854cbf927f
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:************************************ CBang *************************************
03:56:51:WU00:FS00:0x23:    Version: 1.7.2
03:56:51:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:56:51:WU00:FS00:0x23:        Org: Cauldron Development LLC
03:56:51:WU00:FS00:0x23:  Copyright: Cauldron Development LLC, 2003-2023
03:56:51:WU00:FS00:0x23:   Homepage: https://cauldrondevelopment.com/
03:56:51:WU00:FS00:0x23:    License: GPL 2+
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:27:30
03:56:51:WU00:FS00:0x23:   Revision: eae4b58965bdd4d54ea9eb77972674352b37a547
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:************************************ System ************************************
03:56:51:WU00:FS00:0x23:        CPU: AMD Phenom(tm) II X2 545 Processor
03:56:51:WU00:FS00:0x23:     CPU ID: AuthenticAMD Family 16 Model 4 Stepping 2
03:56:51:WU00:FS00:0x23:       CPUs: 2
03:56:51:WU00:FS00:0x23:     Memory: 3.81GiB
03:56:51:WU00:FS00:0x23:Free Memory: 744.96MiB
03:56:51:WU00:FS00:0x23:    Threads: POSIX_THREADS
03:56:51:WU00:FS00:0x23: OS Version: 6.5
03:56:51:WU00:FS00:0x23:Has Battery: false
03:56:51:WU00:FS00:0x23: On Battery: false
03:56:51:WU00:FS00:0x23: UTC Offset: -6
03:56:51:WU00:FS00:0x23:        PID: 40869
03:56:51:WU00:FS00:0x23:        CWD: /var/lib/fahclient/work
03:56:51:WU00:FS00:0x23:       Exec: /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23
03:56:51:WU00:FS00:0x23:************************************ OpenMM ************************************
03:56:51:WU00:FS00:0x23:    Version: 8.0.0
03:56:51:WU00:FS00:0x23:********************************************************************************
03:56:51:WU00:FS00:0x23:Project: 12248 (Run 0, Clone 105, Gen 10)
03:56:51:WU00:FS00:0x23:Digital signatures verified
03:56:51:WU00:FS00:0x23:Folding@home GPU Core23 Folding@home Core
03:56:51:WU00:FS00:0x23:Version 8.0.3
03:56:51:WU00:FS00:0x23:  Checkpoint write interval: 50000 steps (2%) [50 total]
03:56:51:WU00:FS00:0x23:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
03:56:51:WU00:FS00:0x23:  XTC frame write interval: 25000 steps (1%) [100 total]
03:56:51:WU00:FS00:0x23:  Global context and integrator variables write interval: disabled
03:56:51:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
It continually cycles. I can stop this behavior by removing the GPU slot and re-entering it. That lasts until another core 23 job is assigned.
I'm running Fedora Linux 39 (Workstation Edition) on an AMD Phenom™ II X2 545 × 2 with an NVIDIA GeForce GTX 1660 SUPER GPU. It had been producing about 1 million 'points' per day.
Running the command suggested in another thread

Code:

./var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
gets a reply

Code:

 error while loading shared libraries: libOpenMM.so.8.0: cannot open shared object file: No such file or directory
even though the 'offending' library is in that folder. Removing the Core_23 directory (which causes a new one to be rebuilt) makes no difference.
Is this core cursed, or does Fedora 39 have a folding problem?
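One way to narrow down a loader error like the one above is to ask ldd which dependencies of the core binary cannot be resolved. The following is only a sketch: missing_libs is a hypothetical helper name, and it assumes ldd marks unresolved entries with "=> not found", as glibc's ldd does.

```shell
# Print the names of shared libraries that ldd reports as unresolved.
# Reads ldd output on stdin; unresolved entries look like
# "libfoo.so => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (core_dir is the directory shown in the log above):
#   core_dir=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah
#   ldd "$core_dir/FahCore_23" | missing_libs
```

An empty result would suggest that the loader can find everything when started this way, which points away from a plain missing-library problem.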
bikeaddict
Posts: 210
Joined: Sun May 03, 2020 1:20 am

Re: FahCore 23 broken on Fedora 39

Post by bikeaddict »

Core 23 has been running on my three Fedora 39 systems with no errors.

It could be a dependency problem with missing libraries on some systems. I wouldn't think a lack of execute permission on the libraries would cause that error message.

Output from running ldd on libOpenMM.so.8.0:

Code:

ldd libOpenMM.so.8.0
ldd: warning: you do not have execution permission for `./libOpenMM.so.8.0'
	linux-vdso.so.1 (0x00007ffc377f6000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f934e25f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f934e25a000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f934e255000)
	libstdc++.so.6 => /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/././libstdc++.so.6 (0x00007f934dc1d000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f934db3c000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f934e22f000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f934d95a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f934e278000)
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

I'll see what "ldd libOpenMM.so.8.0" reports on my machine this evening.
hmacdope
Scientist
Posts: 2
Joined: Tue Aug 15, 2023 11:55 pm

Re: FahCore 23 broken on Fedora 39

Post by hmacdope »

Not always the recommended solution, but you can try manually adding the core dir to LD_LIBRARY_PATH like so:

export LD_LIBRARY_PATH=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah:$LD_LIBRARY_PATH
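A slightly more careful variant of the export above, as a sketch: if LD_LIBRARY_PATH is empty, appending ":$LD_LIBRARY_PATH" leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. The helper name prepend_path is hypothetical. Setting the variable for a single command also avoids changing the whole shell session.

```shell
# Prepend a directory to a colon-separated search path without leaving an
# empty entry behind (ld.so reads an empty entry as the current directory).
prepend_path() {
    dir=$1
    current=$2
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$dir" "$current"
    else
        printf '%s\n' "$dir"
    fi
}

# Usage: scope the change to one command instead of exporting it.
#   core_dir=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah
#   LD_LIBRARY_PATH=$(prepend_path "$core_dir" "${LD_LIBRARY_PATH:-}") \
#       "$core_dir/FahCore_23" -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
```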
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

Code:

ldd libOpenMM.so.8.0
	linux-vdso.so.1 (0x00007ffd5fdac000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007ff22250f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007ff22250a000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff222505000)
	libstdc++.so.6 => /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/././libstdc++.so.6 (0x00007ff221e1d000)
	libm.so.6 => /lib64/libm.so.6 (0x00007ff222424000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff2223fe000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ff221c3b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff22253c000)
Setting execute permission on the library did no good.
bikeaddict
Posts: 210
Joined: Sun May 03, 2020 1:20 am

Re: FahCore 23 broken on Fedora 39

Post by bikeaddict »

The libOpenMM message may not be the real problem. Something is causing the INTERRUPTED (102 = 0x66). Check for crashes at the bottom of

Code:

journalctl -p err
or scan through /var/log/messages for anything related to files in /var/lib/fahclient/ or FahCore_23 or libOpenMM.so.8.0.
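The scan described above could be sketched as a small filter. fah_lines is a hypothetical helper name; the three patterns are just the names mentioned in this thread, and /var/log/messages may not exist on every install.

```shell
# Keep only log lines that mention the F@H client directory, the core
# binary, or the OpenMM library named in the loader error.
# With no file arguments the function reads stdin.
fah_lines() {
    grep -E 'fahclient|FahCore_23|libOpenMM' "$@"
}

# Usage:
#   journalctl -p err -b --no-pager | fah_lines
#   fah_lines /var/log/messages
```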
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

It appears that a build-id is missing or wrong.

Code:

Dec 02 08:47:46 192-168-1-12 systemd-coredump[74592]: [🡕] Process 74588 (FahCore_23) of user 976 dumped core.
                                                      
                                                      Module libOpenMMPME.so without build-id.
                                                      Module libOpenMMCUDA.so without build-id.
                                                      Module libOpenMMOpenCL.so without build-id.
                                                      Module libOpenMMCPU.so without build-id.
                                                      Module libstdc++.so.6 without build-id.
                                                      Module libz.so.1 from rpm zlib-1.2.13-4.fc39.x86_64
                                                      Module libbz2.so.1.0 without build-id.
                                                      Module libexpat.so.1 from rpm expat-2.5.0-3.fc39.x86_64
                                                      Module libcrypto.so.1.1 without build-id.
                                                      Module libssl.so.1.1 without build-id.
                                                      Module libOpenMM.so.8.0 without build-id.
                                                      Module FahCore_23 without build-id.
                                                      Stack trace of thread 74588:
                                                      #0  0x00007f30fc0a48c6 _GLOBAL__sub_I_CpuPmeKernels.cpp (libOpenMMPME.so + 0x88c6)
                                                      #1  0x00007f31003a6237 call_init (ld-linux-x86-64.so.2 + 0x5237)
                                                      #2  0x00007f31003a632d _dl_init (ld-linux-x86-64.so.2 + 0x532d)
                                                      #3  0x00007f31003a25c2 __GI__dl_catch_exception (ld-linux-x86-64.so.2 + 0x15c2)
                                                      #4  0x00007f31003aceec dl_open_worker (ld-linux-x86-64.so.2 + 0xbeec)
                                                      #5  0x00007f31003a2523 __GI__dl_catch_exception (ld-linux-x86-64.so.2 + 0x1523)
How is that repaired?
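The excerpt above does not show which signal ended the process, and that is usually the more telling detail. As a sketch, assuming coredumpctl info prints a "Signal:" field the way current systemd versions do (crash_signal is a hypothetical helper name):

```shell
# Print the value of the "Signal:" field from `coredumpctl info` output
# read on stdin, for example "4 (ILL)" or "11 (SEGV)".
crash_signal() {
    awk -F': *' '/^ *Signal:/ { print $2; exit }'
}

# Usage:
#   coredumpctl info FahCore_23 | crash_signal
```

On Linux x86, signal 4 (ILL) means the CPU was asked to run an instruction it does not implement, while 11 (SEGV) is a memory fault.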
bikeaddict
Posts: 210
Joined: Sun May 03, 2020 1:20 am

Re: FahCore 23 broken on Fedora 39

Post by bikeaddict »

It may be trying to execute instructions like AVX or SSE4.x that aren't supported on a CPU as old as an AMD Phenom II X2.
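Whether a given CPU advertises these instruction sets can be checked from the kernel's flag list. A minimal sketch, assuming the usual Linux /proc/cpuinfo layout and the kernel's flag spellings (sse4_1, sse4_2, avx); has_cpu_flag is a hypothetical helper name. It matches whole flag names, because a substring search for "sse4" would also hit sse4a, the AMD-specific extension that the Phenom II does have.

```shell
# Report whether a CPU feature flag appears in a cpuinfo-style file.
# The second argument defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Take the first "flags" line, put one token per line, match exactly.
    if grep -m1 '^flags' "$file" | tr ' \t' '\n\n' | grep -qx "$flag"; then
        echo yes
    else
        echo no
    fi
}

# Usage:
#   for f in sse4_1 sse4_2 avx; do echo "$f: $(has_cpu_flag "$f")"; done
```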
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

Have such instructions been recently added? This problem has only shown up this year.
bikeaddict
Posts: 210
Joined: Sun May 03, 2020 1:20 am

Re: FahCore 23 broken on Fedora 39

Post by bikeaddict »

From what I can tell, Core22 uses OpenMM 7.7 and Core23 uses OpenMM 8.0. The documentation for both versions says the OpenCL platform requires SSE 4.1. Maybe something changed in Core23 or OpenMM 8.0 so that it now executes SSE 4.1 instructions, which causes the crash.

It would be up to the F@H core developer(s) to determine if this can be fixed and whether they are willing to do so.

But the cost of picking up more modern hardware with SSE4.x and AVX support is minimal. I previously used an HP Z240 Tower and a Dell Precision T3600 workstation with 6-pin to 8-pin PCIe adapters that could easily power a GTX 1660 Super. These are now available for $100 or less on eBay, FB Marketplace or Craigslist. See greenpcgamers.com and their YouTube channel for HOWTOs.
toTOW
Site Moderator
Posts: 6394
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: FahCore 23 broken on Fedora 39

Post by toTOW »

bikeaddict wrote: Sun Dec 03, 2023 2:18 am It may be trying to execute instructions like AVX or SSE4.x that aren't supported on a CPU as old as an AMD Phenom II X2.
Don't worry about this, the core is running perfectly fine on my old first generation i7s (920 and 860).

This is definitely a library issue. Fedora 39 is very new ...

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
bikeaddict
Posts: 210
Joined: Sun May 03, 2020 1:20 am

Re: FahCore 23 broken on Fedora 39

Post by bikeaddict »

toTOW wrote: Mon Dec 04, 2023 7:46 pm Don't worry about this, the core is running perfectly fine on my old first generation i7s (920 and 860).

This is definitely a library issue. Fedora 39 is very new ...
The i7-920 and i7-860 support SSE 4.1 and 4.2, but the Phenom II doesn't.

My three Fedora 39 systems have had no problems for a month.

From looking at objdump output of libOpenMMPME.so, there are SSE 4.1 instructions in OpenMM 8.0, but not in OpenMM 7.7. Not sure if the F@H core devs modified version 7.7 to compile without SSE 4.1 or what.
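A rough way to repeat that kind of check, as a sketch only: the mnemonic list below is a small, non-exhaustive sample of SSE 4.1 instructions, count_sse41 is a hypothetical helper name, and a match in a symbol name would be counted too.

```shell
# Count disassembly lines (read on stdin) that contain one of a sample of
# SSE 4.1 mnemonics. -w matches whole words only, so VEX-encoded AVX forms
# such as "vroundps" are not counted.
count_sse41() {
    grep -Ewc 'roundps|roundpd|roundss|roundsd|pmulld|ptest|dpps|blendvps|pblendvb|pextrd|pinsrd|pminsd|pmaxsd' || true
}

# Usage:
#   objdump -d libOpenMMPME.so | count_sse41
```

A non-zero count only shows that such instructions are present in the library, not that they are reached on every code path.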
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

Could we get a "Don't use Core ZZ" flag?
muziqaz
Posts: 1042
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: FahCore 23 broken on Fedora 39

Post by muziqaz »

wdanwatts wrote: Tue Dec 05, 2023 3:05 am Could we get a "Don't use Core ZZ" flag?
No,

but we'll see if we can get core_23 disabled for older CPUs
FAH Omega tester
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: FahCore 23 broken on Fedora 39

Post by wdanwatts »

:D