FahCore 23 broken on Fedora 39

Posted: Fri Dec 01, 2023 4:11 am
by wdanwatts
When I run the software for any length of time, it gives me a core 23 project and then I get this:

Code:

03:56:50:WU00:FS00:Starting
03:56:50:WU00:FS00:Removing old file 'work/00/logfile_01-20231201-032446.txt'
03:56:50:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -dir 00 -suffix 01 -version 706 -lifeline 1612 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
03:56:50:WU00:FS00:Started FahCore on PID 40865
03:56:50:WU00:FS00:Core PID:40869
03:56:50:WU00:FS00:FahCore 0x23 started
03:56:51:WU00:FS00:0x23:*********************** Log Started 2023-12-01T03:56:50Z ***********************
03:56:51:WU00:FS00:0x23:*************************** Core23 Folding@home Core ***************************
03:56:51:WU00:FS00:0x23:       Core: Core23
03:56:51:WU00:FS00:0x23:       Type: 0x23
03:56:51:WU00:FS00:0x23:    Version: 8.0.3
03:56:51:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:56:51:WU00:FS00:0x23:  Copyright: 2022 foldingathome.org
03:56:51:WU00:FS00:0x23:   Homepage: https://foldingathome.org/
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:28:22
03:56:51:WU00:FS00:0x23:   Revision: 199cb870317d05441d0a301287d9ef61254fa32b
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
03:56:51:WU00:FS00:0x23:             -DOPENMM_VERSION="\"8.0.0\""
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
03:56:51:WU00:FS00:0x23:             <peastman@stanford.edu>
03:56:51:WU00:FS00:0x23:       Args: -dir 00 -suffix 01 -version 706 -lifeline 40865 -checkpoint 30
03:56:51:WU00:FS00:0x23:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
03:56:51:WU00:FS00:0x23:             0 -gpu 0
03:56:51:WU00:FS00:0x23:************************************ libFAH ************************************
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:27:48
03:56:51:WU00:FS00:0x23:   Revision: 112c2234abe20611a05652defc3c7f854cbf927f
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:************************************ CBang *************************************
03:56:51:WU00:FS00:0x23:    Version: 1.7.2
03:56:51:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:56:51:WU00:FS00:0x23:        Org: Cauldron Development LLC
03:56:51:WU00:FS00:0x23:  Copyright: Cauldron Development LLC, 2003-2023
03:56:51:WU00:FS00:0x23:   Homepage: https://cauldrondevelopment.com/
03:56:51:WU00:FS00:0x23:    License: GPL 2+
03:56:51:WU00:FS00:0x23:       Date: Aug 3 2023
03:56:51:WU00:FS00:0x23:       Time: 08:27:30
03:56:51:WU00:FS00:0x23:   Revision: eae4b58965bdd4d54ea9eb77972674352b37a547
03:56:51:WU00:FS00:0x23:     Branch: HEAD
03:56:51:WU00:FS00:0x23:   Compiler: GNU 7.5.0
03:56:51:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
03:56:51:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
03:56:51:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
03:56:51:WU00:FS00:0x23:       Bits: 64
03:56:51:WU00:FS00:0x23:       Mode: Release
03:56:51:WU00:FS00:0x23:************************************ System ************************************
03:56:51:WU00:FS00:0x23:        CPU: AMD Phenom(tm) II X2 545 Processor
03:56:51:WU00:FS00:0x23:     CPU ID: AuthenticAMD Family 16 Model 4 Stepping 2
03:56:51:WU00:FS00:0x23:       CPUs: 2
03:56:51:WU00:FS00:0x23:     Memory: 3.81GiB
03:56:51:WU00:FS00:0x23:Free Memory: 744.96MiB
03:56:51:WU00:FS00:0x23:    Threads: POSIX_THREADS
03:56:51:WU00:FS00:0x23: OS Version: 6.5
03:56:51:WU00:FS00:0x23:Has Battery: false
03:56:51:WU00:FS00:0x23: On Battery: false
03:56:51:WU00:FS00:0x23: UTC Offset: -6
03:56:51:WU00:FS00:0x23:        PID: 40869
03:56:51:WU00:FS00:0x23:        CWD: /var/lib/fahclient/work
03:56:51:WU00:FS00:0x23:       Exec: /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23
03:56:51:WU00:FS00:0x23:************************************ OpenMM ************************************
03:56:51:WU00:FS00:0x23:    Version: 8.0.0
03:56:51:WU00:FS00:0x23:********************************************************************************
03:56:51:WU00:FS00:0x23:Project: 12248 (Run 0, Clone 105, Gen 10)
03:56:51:WU00:FS00:0x23:Digital signatures verified
03:56:51:WU00:FS00:0x23:Folding@home GPU Core23 Folding@home Core
03:56:51:WU00:FS00:0x23:Version 8.0.3
03:56:51:WU00:FS00:0x23:  Checkpoint write interval: 50000 steps (2%) [50 total]
03:56:51:WU00:FS00:0x23:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
03:56:51:WU00:FS00:0x23:  XTC frame write interval: 25000 steps (1%) [100 total]
03:56:51:WU00:FS00:0x23:  Global context and integrator variables write interval: disabled
03:56:51:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
It continually cycles. I can stop this behavior by removing the GPU slot and re-entering it. That lasts until another core 23 job is assigned.
I'm running Fedora Linux 39 (Workstation Edition) on an AMD Phenom™ II X2 545 × 2 with an NVIDIA GeForce GTX 1660 SUPER GPU. It had been running ~1 million 'points' per day.
Running the command suggested in another thread

Code:

./var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
gets a reply

Code:

 error while loading shared libraries: libOpenMM.so.8.0: cannot open shared object file: No such file or directory
even though the 'offending' library is in that folder. Removing the Core_23 directory (which causes a new one to be rebuilt) does not change the behavior.
Is this core cursed, or does Fedora 39 have a folding problem?

Re: FahCore 23 broken on Fedora 39

Posted: Fri Dec 01, 2023 11:46 am
by bikeaddict
Core 23 has been running on my three Fedora 39 systems with no errors.

It could be a dependency problem with missing libraries on some systems. I wouldn't think a lack of execute permissions on the libraries would cause that error message.

Output from running ldd on libOpenMM.so.8.0:

Code:

ldd libOpenMM.so.8.0
ldd: warning: you do not have execution permission for `./libOpenMM.so.8.0'
	linux-vdso.so.1 (0x00007ffc377f6000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f934e25f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f934e25a000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f934e255000)
	libstdc++.so.6 => /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/././libstdc++.so.6 (0x00007f934dc1d000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f934db3c000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f934e22f000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f934d95a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f934e278000)

Re: FahCore 23 broken on Fedora 39

Posted: Fri Dec 01, 2023 6:53 pm
by wdanwatts
I'll see what "ldd libOpenMM.so.8.0" reports on my machine this evening.

Re: FahCore 23 broken on Fedora 39

Posted: Fri Dec 01, 2023 10:21 pm
by hmacdope
Not always the recommended solution, but you can try manually adding the core dir to LD_LIBRARY_PATH like so:

export LD_LIBRARY_PATH=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah:$LD_LIBRARY_PATH
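
A minimal sketch of the same idea, assuming the core directory shown in the log earlier in this thread. It sets the variable for a single ldd run rather than exporting it for the whole session, and lists any libraries the loader still cannot resolve. missing_libs is a hypothetical helper name, not part of the F@H client.

```shell
#!/bin/sh
# Core directory copied from the log in this thread; adjust if yours differs.
CORE_DIR=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah

# missing_libs (hypothetical helper): read ldd output on stdin and print
# only the entries the dynamic loader could not resolve.
missing_libs() {
    grep 'not found' || true
}

if [ -x "$CORE_DIR/FahCore_23" ]; then
    # Setting the variable on the command line limits it to this one ldd run.
    LD_LIBRARY_PATH="$CORE_DIR" ldd "$CORE_DIR/FahCore_23" | missing_libs
fi
```

If this prints nothing, every dependency resolves with the core directory on the search path. That result alone does not explain why the core exits with INTERRUPTED.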

Re: FahCore 23 broken on Fedora 39

Posted: Sat Dec 02, 2023 4:58 pm
by wdanwatts

Code:

ldd libOpenMM.so.8.0
	linux-vdso.so.1 (0x00007ffd5fdac000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007ff22250f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007ff22250a000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff222505000)
	libstdc++.so.6 => /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/././libstdc++.so.6 (0x00007ff221e1d000)
	libm.so.6 => /lib64/libm.so.6 (0x00007ff222424000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff2223fe000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ff221c3b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff22253c000)
Setting execute permissions on the library did no good.

Re: FahCore 23 broken on Fedora 39

Posted: Sat Dec 02, 2023 6:13 pm
by bikeaddict
The libOpenMM message may not be the real problem. Something is causing the INTERRUPTED (102 = 0x66). Check for crashes at the bottom of

Code:

journalctl -p err
or scan through /var/log/messages for anything related to files in /var/lib/fahclient/ or FahCore_23 or libOpenMM.so.8.0.
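
One way to narrow that search, sketched under the assumption that the system logs to the systemd journal and records crashes with systemd-coredump. fah_related is a hypothetical helper name; the patterns are the ones mentioned above.

```shell
#!/bin/sh
# fah_related (hypothetical helper): keep only log lines that mention the
# client directory, the core binary, or the OpenMM libraries.
fah_related() {
    grep -E 'fahclient|FahCore_23|libOpenMM' || true
}

# Error-level messages from the current boot, filtered to the folding client.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -p err -b --no-pager 2>/dev/null | fah_related
fi

# Crashes recorded by systemd-coredump for the core binary, if any.
if command -v coredumpctl >/dev/null 2>&1; then
    coredumpctl --no-pager list FahCore_23 2>/dev/null || true
fi
```

If the list shows entries, coredumpctl info FahCore_23 prints the details of the matching crashes, including a stack trace where one is available.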

Re: FahCore 23 broken on Fedora 39

Posted: Sun Dec 03, 2023 1:47 am
by wdanwatts
It appears that a build-id is missing or wrong.

Code:

Dec 02 08:47:46 192-168-1-12 systemd-coredump[74592]: [🡕] Process 74588 (FahCore_23) of user 976 dumped core.
                                                      
                                                      Module libOpenMMPME.so without build-id.
                                                      Module libOpenMMCUDA.so without build-id.
                                                      Module libOpenMMOpenCL.so without build-id.
                                                      Module libOpenMMCPU.so without build-id.
                                                      Module libstdc++.so.6 without build-id.
                                                      Module libz.so.1 from rpm zlib-1.2.13-4.fc39.x86_64
                                                      Module libbz2.so.1.0 without build-id.
                                                      Module libexpat.so.1 from rpm expat-2.5.0-3.fc39.x86_64
                                                      Module libcrypto.so.1.1 without build-id.
                                                      Module libssl.so.1.1 without build-id.
                                                      Module libOpenMM.so.8.0 without build-id.
                                                      Module FahCore_23 without build-id.
                                                      Stack trace of thread 74588:
                                                      #0  0x00007f30fc0a48c6 _GLOBAL__sub_I_CpuPmeKernels.cpp (libOpenMMPME.so + 0x88c6)
                                                      #1  0x00007f31003a6237 call_init (ld-linux-x86-64.so.2 + 0x5237)
                                                      #2  0x00007f31003a632d _dl_init (ld-linux-x86-64.so.2 + 0x532d)
                                                      #3  0x00007f31003a25c2 __GI__dl_catch_exception (ld-linux-x86-64.so.2 + 0x15c2)
                                                      #4  0x00007f31003aceec dl_open_worker (ld-linux-x86-64.so.2 + 0xbeec)
                                                      #5  0x00007f31003a2523 __GI__dl_catch_exception (ld-linux-x86-64.so.2 + 0x1523)
How is that repaired?

Re: FahCore 23 broken on Fedora 39

Posted: Sun Dec 03, 2023 2:18 am
by bikeaddict
It may be trying to execute instructions like AVX or SSE4.x that aren't supported on a CPU as old as an AMD Phenom II X2.
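
For reference, a quick way to see which of these instruction sets the kernel reports for a CPU is to look at the flags line in /proc/cpuinfo. The sketch below assumes an x86 Linux system; cpu_has_flag is a hypothetical helper name. Whole-word matching is used so that, for example, avx does not match avx2.

```shell
#!/bin/sh
# cpu_has_flag (hypothetical helper): succeed if the named flag appears as a
# whole word on the first "flags" line. The optional second argument is the
# file to read and defaults to /proc/cpuinfo.
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

for flag in sse4_1 sse4_2 avx; do
    if cpu_has_flag "$flag"; then
        echo "$flag: supported"
    else
        echo "$flag: not reported"
    fi
done
```

On Linux the flag names are sse4_1 and sse4_2. The sse4a flag is a separate AMD extension and does not imply either of them.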

Re: FahCore 23 broken on Fedora 39

Posted: Sun Dec 03, 2023 5:58 pm
by wdanwatts
Have such instructions been added recently? This problem has only shown up this year.

Re: FahCore 23 broken on Fedora 39

Posted: Sun Dec 03, 2023 6:52 pm
by bikeaddict
From what I can tell, Core22 uses OpenMM 7.7 and Core23 uses OpenMM 8.0. The documentation for both versions says the OpenCL platform requires SSE 4.1. Maybe something was changed in Core23 or OpenMM 8.0 so that it now executes SSE 4.1 instructions, which would cause a crash on a CPU without them.

It would be up to the F@H core developer(s) to determine if this can be fixed and whether they are willing to do so.

But the cost of picking up more modern hardware with SSE4.x and AVX support is minimal. I previously used an HP Z240 Tower and a Dell Precision T3600 workstation with 6-pin to 8-pin PCIe adapters that could easily power a GTX 1660 Super. These are now available for $100 or less on eBay, FB Marketplace or Craigslist. See greenpcgamers.com and their YouTube channel for HOWTOs.

Re: FahCore 23 broken on Fedora 39

Posted: Mon Dec 04, 2023 7:46 pm
by toTOW
bikeaddict wrote: Sun Dec 03, 2023 2:18 am It may be trying to execute instructions like AVX or SSE4.x that aren't supported on a CPU as old as an AMD Phenom II X2.
Don't worry about this, the core is running perfectly fine on my old first generation i7s (920 and 860).

This is definitely a library issue. Fedora 39 is very new ...

Re: FahCore 23 broken on Fedora 39

Posted: Mon Dec 04, 2023 9:03 pm
by bikeaddict
toTOW wrote: Mon Dec 04, 2023 7:46 pm Don't worry about this, the core is running perfectly fine on my old first generation i7s (920 and 860).

This is definitely a library issue. Fedora 39 is very new ...
The i7-920 and i7-860 support SSE 4.1 and 4.2, but the Phenom II doesn't.

My three Fedora 39 systems have had no problems for a month.

From looking at objdump output of libOpenMMPME.so, there are SSE 4.1 instructions in OpenMM 8.0, but not in OpenMM 7.7. Not sure if the F@H core devs modified version 7.7 to compile without SSE 4.1 or what.
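
A rough sketch of that kind of check, assuming objdump (from binutils) is installed and that libOpenMMPME.so sits in the core directory shown in the logs earlier in this thread. count_sse41 is a hypothetical helper name, and the mnemonic list is only a sample of SSE 4.1 instructions, not the full set.

```shell
#!/bin/sh
# Core directory copied from the log in this thread; adjust if yours differs.
CORE_DIR=/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah
LIB="$CORE_DIR/libOpenMMPME.so"

# count_sse41 (hypothetical helper): read a disassembly on stdin and print the
# number of lines containing one of a few SSE 4.1 mnemonics. Whole-word
# matching keeps AVX forms such as vroundps out of the count.
count_sse41() {
    grep -cwE 'pblendw|pblendvb|blendps|blendvps|dpps|insertps|pinsrd|pextrd|pmulld|ptest|roundps|roundss|roundsd|pmovzxbd' || true
}

if [ -r "$LIB" ] && command -v objdump >/dev/null 2>&1; then
    objdump -d "$LIB" | count_sse41
fi
```

A non-zero count only shows that such instructions exist in the file. A library can still choose a code path at run time, so this does not prove they are executed on every CPU.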

Re: FahCore 23 broken on Fedora 39

Posted: Tue Dec 05, 2023 3:05 am
by wdanwatts
Could we get a "Don't use Core ZZ" flag?

Re: FahCore 23 broken on Fedora 39

Posted: Sat Dec 09, 2023 3:48 pm
by muziqaz
wdanwatts wrote: Tue Dec 05, 2023 3:05 am Could we get a "Don't use Core ZZ" flag?
No, but we'll see if we can get core_23 disabled for older CPUs.

Re: FahCore 23 broken on Fedora 39

Posted: Sat Dec 09, 2023 6:32 pm
by wdanwatts
:D