NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Sat Dec 28, 2024 4:38 pm
by damage
Hi All,
Thanks for stopping by and looking at my post.
Does anyone fold on NixOS? I have 3 issues that may need fixing before I can use the GPU on v8.3.18 (although they also happened on previous versions):
1) HTTP errors:
16:22:42:I1:OUT186:< HTTP/1.1 503 HTTP_SERVICE_UNAVAILABLE
16:22:42:E :OUT186:HTTP_SERVICE_UNAVAILABLE: {"error":{"message":"Please wait","code":503}}
2) Bad work units for multiple WU:
16:22:25:W :WU475537:Core returned BAD_WORK_UNIT (114)
3) "Does not belong" errors:
16:14:22:E :WU with client ID (variable ID for all messages) does not belong client (common ID for all messages)
I also get a SIGINT on a process: it looks like the GPU WU gets downloaded and then quickly fails. I thought I'd tackle the issues above first as a possible cause, since with NixOS the last piece of the puzzle could be challenging.
I'm happy to upload logs, but I'm hoping for a steer on the (hopefully) easy issues so I can sort those out and focus on what remains!
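For anyone wanting to pull just the warnings and errors out of a long client log, here is a minimal sketch. It assumes the line layout seen in the excerpts above (time, a two-character level such as "I1", "W " or "E ", then the message); the function name `problems` is made up for illustration and the layout is inferred from these few lines, not taken from client documentation.

```python
import re

# Assumed layout, inferred from the lines quoted in this post:
#   HH:MM:SS:<level>:<rest>, where <level> is two characters ("I1", "W ", "E ").
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):([A-Z][0-9 ]):(.*)$")


def problems(lines):
    """Return (time, level, message) for every warning or error line."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a client log line in the assumed layout
        time, level, rest = match.groups()
        level = level.strip()
        if level in ("W", "E"):
            found.append((time, level, rest))
    return found


sample = [
    "16:22:42:I1:OUT186:< HTTP/1.1 503 HTTP_SERVICE_UNAVAILABLE",
    '16:22:42:E :OUT186:HTTP_SERVICE_UNAVAILABLE: {"error":{"message":"Please wait","code":503}}',
    "16:22:25:W :WU475537:Core returned BAD_WORK_UNIT (114)",
]
for entry in problems(sample):
    print(entry)
```

Feeding it a whole log file (for example `problems(open(path))`, with `path` pointing at your own log) gives a short list to paste into a post instead of the full log.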
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Sun Dec 29, 2024 9:49 am
by muziqaz
Hi, your username is well known within FAH. You have been failing massive, and I mean massive, amounts of work units over the past several months, to the point where the failure rate was in the tens of thousands a day. I tried to contact you on one of the forums I noticed you were on (was it level1techs?), but there was no response. So we made the decision to blacklist your username, because otherwise researchers would fail their whole project because of this. I mean, look at your log: WU475537. That is 475 thousand WUs, 99% of them failed.
So anyways, above is for your question number 1.
Question 2 requires your full system config, including installed drivers and clinfo output.
Please update your clients to the latest beta (soon to be fully released, hopefully) from foldingathome.org/beta.
I hope you have paused all your FAHClients for the moment, until this can be resolved.
Also, please go into your fahclient Web interface, click on the log icon within it (for the PC, not for the slot), enter WU475537 in the search bar, punch Enter, and paste the log here, please.
And we can go from there.
Welcome to the forum, what took you so long?
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Sun Dec 29, 2024 7:42 pm
by toTOW
NixOS ... sounds like an exotic distribution ... how close is it to one that is supported by FAH ?
But as muziqaz said, we need to see logs from the client.
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Mon Dec 30, 2024 12:05 pm
by damage
Starting out with sincerest apologies - I didn't realise the issues I'd caused - everything is now stopped.
Machine details (from fahclient web interface)
Machine
Hostname - Damage
OS - linux
Client Version - 8.3.18
OS Version - 6.6
Build Mode - Release
Revision - (blank)
Has Battery - false
On Battery - false
CPU
Description - AMD Ryzen 9 3900X 12-Core Processor
Cores - 24
Type - amd64
gpu:14:00:00
Description - gfx1030
Vendor - amd
Supported - true
UUID - (blank)
PCI Device ID - 0x73af
PCI Vendor ID - 0x1002
OpenCL - supported
Compute - 2.0
Driver - 3602.0
CUDA - unsupported
Output of clinfo:
Code:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3602.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx1030
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0
Driver Version 3602.0 (HSA1.1,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) AMD Radeon RX 6900 XT
Device PCI-e ID (AMD) 0x73af
Device Topology (AMD) PCI-E, 0000:0e:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 40
SIMD per compute unit (AMD) 4
SIMD width (AMD) 32
SIMD instruction width (AMD) 1
Max clock frequency 2720MHz
Graphics IP (AMD) 10.3
Device Partition (core)
Max number of sub-devices 40
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple (kernel) 32
Wavefront width (AMD) 32
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 17163091968 (15.98GiB)
Global free memory (AMD) 16541696 (15.78GiB) 16541696 (15.78GiB)
Global memory channels (AMD) 8
Global memory banks per channel (AMD) 4
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 14588628168 (13.59GiB)
Unified memory for Host and Device No
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 14588628168 (13.59GiB)
Preferred total size of global vars 17163091968 (15.98GiB)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 8192 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 16384x16384x8192 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 64
Max number of pipe args 16
Max active pipe reservations 16
Max pipe packet size 1703726280 (1.587GiB)
Local memory type Local
Local memory size 65536 (64KiB)
Local memory size per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 14588628168 (13.59GiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 262144 (256KiB)
Max size 8388608 (8MiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Number of P2P devices (AMD) 0
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 0ns (Wed Dec 31 19:00:00 1969)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
Number of async queues (AMD) 8
Max real-time compute queues (AMD) 8
Max real-time compute units (AMD) 40
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
clCreateContext(NULL, ...) [default] Success [AMD]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1030
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1030
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1030
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.3.2
ICD loader Profile OpenCL 3.0
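The clinfo output above shows that the OpenCL platform is visible from clinfo's own process. A separately packaged binary may resolve shared libraries differently, so one cheap check is whether the OpenCL loader can be opened by name from a given environment. A minimal sketch, assuming the usual Linux soname `libOpenCL.so.1`; the helper name `can_load` is made up for illustration, and this only tests library loading, not whether any particular program will find a GPU.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic linker can open library_name from this process."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# "libOpenCL.so.1" is the usual soname of the OpenCL ICD loader on Linux.
print("OpenCL loader loadable:", can_load("libOpenCL.so.1"))
```

The result is only meaningful if the script runs with the same environment (user, library search path) as the process you are trying to diagnose.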
Now the log for WU475537 (log-20241228-182120.txt for later reference). This is from the log files as I couldn't find it on the web interface...
Code:
16:22:22:I1:WU475537:Requesting WU assignment for user Damage team 242858
16:22:22:I1:OUT175:> POST https://pllwskifah2.mskcc.org/api/results HTTP/1.1
16:22:22:I1:OUT176:> POST https://assign5.foldingathome.org/api/assign HTTP/1.1
16:22:22:I1:WU473934:Caught signal SIGINT(2) on PID 767519
16:22:22:I1:WU473934:Exiting, please wait. . .
16:22:22:I1:OUT176:< HTTP/1.1 200 HTTP_OK
16:22:22:I1:WU475537:Received WU assignment IMMUoeIKeOI-rMtHcoVrdv1OT1Zp236-AfG2rhYUzQE
16:22:22:I1:WU475537:Downloading WU
16:22:22:I1:OUT175:< HTTP/1.1 200 HTTP_OK
16:22:22:I1:WU475536:Credited
16:22:22:I1:OUT177:> POST https://pllwskifah2.mskcc.org/api/assign HTTP/1.1
16:22:23:I1:WU473934:Completed 1848016 out of 5000000 steps (36%)
16:22:23:I1:WU473934:Folding@home Core Shutdown: INTERRUPTED
16:22:23:I1:WU473934:Core returned INTERRUPTED (102)
16:22:23:I3:Removing old file 'work/-Kj2n-3HB3ncCLxJxxE8p4jtQgwROlsAFyrs2bHif-o/logfile_01-20241228-161322.txt'
16:22:23:I3:WU473934:Running FahCore: /var/lib/private/foldingathome/cores/fahcore-a8-lin-64bit-avx2_256-0.0.12/FahCore_a8 -dir -Kj2n-3HB3ncCLxJxxE8p4jtQgwROlsAFyrs2bHif-o -suffix 01 -version 8.3.18 -lifeline 765710 -np 22
16:22:23:I3:WU473934:Started FahCore on PID 767555
16:22:23:I1:OUT177:< HTTP/1.1 200 HTTP_OK
16:22:23:I1:WU475537:Received WU P17650 R79 C8 G187
16:22:24:I3:WU475537:Running FahCore: /var/lib/private/foldingathome/cores/openmm-core-23/centos-7.9.2009-64bit/release/fahcore-23-centos-7.9.2009-64bit-release-8.0.3/FahCore_23 -dir IMMUoeIKeOI-rMtHcoVrdv1OT1Zp236-AfG2rhYUzQE -suffix 01 -version 8.3.18 -lifeline 765710 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
16:22:24:I3:WU475537:Started FahCore on PID 767559
16:22:24:I1:WU473934:*********************** Log Started 2024-12-28T16:22:23Z ***********************
16:22:24:I1:WU473934:************************** Gromacs Folding@home Core ***************************
16:22:24:I1:WU473934: Core: Gromacs
16:22:24:I1:WU473934: Type: 0xa8
16:22:24:I1:WU473934: Version: 0.0.12
16:22:24:I1:WU473934: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:22:24:I1:WU473934: Copyright: 2020 foldingathome.org
16:22:24:I1:WU473934: Homepage: https://foldingathome.org/
16:22:24:I1:WU473934: Date: Jan 16 2021
16:22:24:I1:WU473934: Time: 19:24:44
16:22:24:I1:WU473934: Compiler: GNU 8.3.0
16:22:24:I1:WU473934: Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
16:22:24:I1:WU473934: -fdata-sections -O3 -funroll-loops -fno-pie
16:22:24:I1:WU473934: Platform: linux2 4.15.0-128-generic
16:22:24:I1:WU473934: Bits: 64
16:22:24:I1:WU473934: Mode: Release
16:22:24:I1:WU473934: SIMD: avx2_256
16:22:24:I1:WU473934: OpenMP: ON
16:22:24:I1:WU473934: CUDA: OFF
16:22:24:I1:WU473934: Args: -dir -Kj2n-3HB3ncCLxJxxE8p4jtQgwROlsAFyrs2bHif-o -suffix 01
16:22:24:I1:WU473934: -version 8.3.18 -lifeline 765710 -np 22
16:22:24:I1:WU473934:************************************ libFAH ************************************
16:22:24:I1:WU473934: Date: Jan 16 2021
16:22:24:I1:WU473934: Time: 19:21:38
16:22:24:I1:WU473934: Compiler: GNU 8.3.0
16:22:24:I1:WU473934: Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
16:22:24:I1:WU473934: -fdata-sections -O3 -funroll-loops -fno-pie
16:22:24:I1:WU473934: Platform: linux2 4.15.0-128-generic
16:22:24:I1:WU473934: Bits: 64
16:22:24:I1:WU473934: Mode: Release
16:22:24:I1:WU473934:************************************ CBang *************************************
16:22:24:I1:WU473934: Date: Jan 16 2021
16:22:24:I1:WU473934: Time: 19:21:24
16:22:24:I1:WU473934: Compiler: GNU 8.3.0
16:22:24:I1:WU473934: Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
16:22:24:I1:WU473934: -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
16:22:24:I1:WU473934: Platform: linux2 4.15.0-128-generic
16:22:24:I1:WU473934: Bits: 64
16:22:24:I1:WU473934: Mode: Release
16:22:24:I1:WU473934:************************************ System ************************************
16:22:24:I1:WU473934: CPU: AMD Ryzen 9 3900X 12-Core Processor
16:22:24:I1:WU473934: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:22:24:I1:WU473934: CPUs: 24
16:22:24:I1:WU473934: Memory: 62.71GiB
16:22:24:I1:WU473934:Free Memory: 4.45GiB
16:22:24:I1:WU473934: Threads: POSIX_THREADS
16:22:24:I1:WU473934: OS Version: 6.6
16:22:24:I1:WU473934:Has Battery: false
16:22:24:I1:WU473934: On Battery: false
16:22:24:I1:WU473934: UTC Offset: -5
16:22:24:I1:WU473934: PID: 767555
16:22:24:I1:WU473934: CWD: /var/lib/private/foldingathome/work
16:22:24:I1:WU473934:********************************************************************************
16:22:24:I1:WU473934:Project: 12421 (Run 91, Clone 0, Gen 107)
16:22:24:I1:WU473934:Unit: 0x00000000000000000000000000000000
16:22:24:I1:WU473934:Digital signatures verified
16:22:24:I1:WU473934:Calling: mdrun -c frame107.gro -s frame107.tpr -x frame107.xtc -cpi state.cpt -cpt 5 -nt 22 -ntmpi 1
16:22:24:I1:WU473934:Steps: first=535000000 total=540000000
16:22:24:I1:WU475537:*********************** Log Started 2024-12-28T16:22:24Z ***********************
16:22:24:I1:WU475537:*************************** Core23 Folding@home Core ***************************
16:22:24:I1:WU475537: Core: Core23
16:22:24:I1:WU475537: Type: 0x23
16:22:24:I1:WU475537: Version: 8.0.3
16:22:24:I1:WU475537: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:22:24:I1:WU475537: Copyright: 2022 foldingathome.org
16:22:24:I1:WU475537: Homepage: https://foldingathome.org/
16:22:24:I1:WU475537: Date: Aug 3 2023
16:22:24:I1:WU475537: Time: 08:28:22
16:22:24:I1:WU475537: Revision: 199cb870317d05441d0a301287d9ef61254fa32b
16:22:24:I1:WU475537: Branch: HEAD
16:22:24:I1:WU475537: Compiler: GNU 7.5.0
16:22:24:I1:WU475537: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
16:22:24:I1:WU475537: -fdata-sections -O3 -funroll-loops -fno-pie
16:22:24:I1:WU475537: -DOPENMM_VERSION="\"8.0.0\""
16:22:24:I1:WU475537: Platform: linux 5.15.0-1041-azure
16:22:24:I1:WU475537: Bits: 64
16:22:24:I1:WU475537: Mode: Release
16:22:24:I1:WU475537:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
16:22:24:I1:WU475537: <peastman@stanford.edu>
16:22:24:I1:WU475537: Args: -dir IMMUoeIKeOI-rMtHcoVrdv1OT1Zp236-AfG2rhYUzQE -suffix 01
16:22:24:I1:WU475537: -version 8.3.18 -lifeline 765710 -gpu-vendor amd -opencl-platform 0
16:22:24:I1:WU475537: -opencl-device 0 -gpu 0
16:22:24:I1:WU475537:************************************ libFAH ************************************
16:22:24:I1:WU475537: Date: Aug 3 2023
16:22:24:I1:WU475537: Time: 08:27:48
16:22:24:I1:WU475537: Revision: 112c2234abe20611a05652defc3c7f854cbf927f
16:22:24:I1:WU475537: Branch: HEAD
16:22:24:I1:WU475537: Compiler: GNU 7.5.0
16:22:24:I1:WU475537: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
16:22:24:I1:WU475537: -fdata-sections -O3 -funroll-loops -fno-pie
16:22:24:I1:WU475537: Platform: linux 5.15.0-1041-azure
16:22:24:I1:WU475537: Bits: 64
16:22:24:I1:WU475537: Mode: Release
16:22:24:I1:WU475537:************************************ CBang *************************************
16:22:24:I1:WU475537: Version: 1.7.2
16:22:24:I1:WU475537: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:22:24:I1:WU475537: Org: Cauldron Development LLC
16:22:24:I1:WU475537: Copyright: Cauldron Development LLC, 2003-2023
16:22:24:I1:WU475537: Homepage: https://cauldrondevelopment.com/
16:22:24:I1:WU475537: License: GPL 2+
16:22:24:I1:WU475537: Date: Aug 3 2023
16:22:24:I1:WU475537: Time: 08:27:30
16:22:24:I1:WU475537: Revision: eae4b58965bdd4d54ea9eb77972674352b37a547
16:22:24:I1:WU475537: Branch: HEAD
16:22:24:I1:WU475537: Compiler: GNU 7.5.0
16:22:24:I1:WU475537: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
16:22:24:I1:WU475537: -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
16:22:24:I1:WU475537: Platform: linux 5.15.0-1041-azure
16:22:24:I1:WU475537: Bits: 64
16:22:24:I1:WU475537: Mode: Release
16:22:24:I1:WU475537:************************************ System ************************************
16:22:24:I1:WU475537: CPU: AMD Ryzen 9 3900X 12-Core Processor
16:22:24:I1:WU475537: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:22:24:I1:WU475537: CPUs: 24
16:22:24:I1:WU475537: Memory: 62.71GiB
16:22:24:I1:WU475537:Free Memory: 4.44GiB
16:22:24:I1:WU475537: Threads: POSIX_THREADS
16:22:24:I1:WU475537: OS Version: 6.6
16:22:24:I1:WU475537:Has Battery: false
16:22:24:I1:WU475537: On Battery: false
16:22:24:I1:WU475537: UTC Offset: -5
16:22:24:I1:WU475537: PID: 767559
16:22:24:I1:WU475537: CWD: /var/lib/private/foldingathome/work
16:22:24:I1:WU475537: Exec: /var/lib/private/foldingathome/cores/openmm-core-23/centos-7.9.2009-64bit/release/fahcore-23-centos-7.9.2009-64bit-release-8.0.3/FahCore_23
16:22:24:I1:WU475537:************************************ OpenMM ************************************
16:22:24:I1:WU475537: Version: 8.0.0
16:22:24:I1:WU475537:********************************************************************************
16:22:24:I1:WU475537:Project: 17650 (Run 79, Clone 8, Gen 187)
16:22:24:I1:WU475537:Reading tar file core.xml
16:22:24:I1:WU475537:Reading tar file integrator.xml.bz2
16:22:24:I1:WU475537:Reading tar file state.xml.bz2
16:22:24:I1:WU475537:Reading tar file system.xml.bz2
16:22:24:I1:WU475537:Digital signatures verified
16:22:24:I1:WU475537:Folding@home GPU Core23 Folding@home Core
16:22:24:I1:WU475537:Version 8.0.3
16:22:24:I1:WU475537: Checkpoint write interval: 125000 steps (5%) [20 total]
16:22:24:I1:WU475537: JSON viewer frame write interval: 25000 steps (1%) [100 total]
16:22:24:I1:WU475537: XTC frame write interval: 12500 steps (0.5%) [200 total]
16:22:24:I1:WU475537: Global context and integrator variables write interval: disabled
16:22:24:I1:WU475537:There are 2 platforms available.
16:22:24:I1:WU475537:Platform 0: Reference
16:22:24:I1:WU475537:Platform 1: CPU
16:22:24:I1:WU475537:opencl-device was set but OpenCL platform could not be found.
16:22:24:I1:WU475537:ERROR:126: Neither CUDA nor OpenCL is available.
16:22:24:I1:WU475537:Saving result file ../logfile_01.txt
16:22:24:I1:WU475537:Saving result file science.log
16:22:24:I1:WU475537:Folding@home Core Shutdown: BAD_WORK_UNIT
16:22:25:W :WU475537:Core returned BAD_WORK_UNIT (114)
16:22:25:I1:Default:Added new work unit: cpus:0 gpus:gpu:14:00:00
16:22:25:I1:WU475537:Uploading WU results
16:22:25:I1:WU475538:Requesting WU assignment for user Damage team 242858
16:22:25:I1:OUT178:> POST https://pllwskifah2.mskcc.org/api/results HTTP/1.1
16:22:25:I1:OUT179:> POST https://assign6.foldingathome.org/api/assign HTTP/1.1
16:22:25:I1:WU473934:Caught signal SIGINT(2) on PID 767555
16:22:25:I1:WU473934:Exiting, please wait. . .
16:22:25:I1:OUT179:< HTTP/1.1 200 HTTP_OK
16:22:25:I1:WU475538:Received WU assignment we7x3ytIDjUfDuQunSfrlcKaJLfVREhlpvE9d_EvbTY
16:22:25:I1:WU475538:Downloading WU
16:22:25:I1:OUT178:< HTTP/1.1 200 HTTP_OK
16:22:25:I1:WU475537:Credited
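To see how each work unit in a log like the one above ended, the "Core returned ..." lines can be collected and counted. This is a rough sketch that assumes the message format shown in this log; `core_returns` is an illustrative name, not part of the client.

```python
import re
from collections import Counter

# Matches e.g. ":WU475537:Core returned BAD_WORK_UNIT (114)" anywhere in a line.
CORE_RETURN_RE = re.compile(r":(WU\d+):Core returned (\w+) \((\d+)\)")


def core_returns(lines):
    """Return (work_unit, status, code) for each 'Core returned' line."""
    results = []
    for line in lines:
        match = CORE_RETURN_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2), int(match.group(3))))
    return results


sample = [
    "16:22:23:I1:WU473934:Core returned INTERRUPTED (102)",
    "16:22:24:I1:WU475537:ERROR:126: Neither CUDA nor OpenCL is available.",
    "16:22:25:W :WU475537:Core returned BAD_WORK_UNIT (114)",
]
returns = core_returns(sample)
counts = Counter(status for _, status, _ in returns)
print(returns)
print(counts)
```

Run over a full day's log, the counter gives a quick picture of how many units ended as BAD_WORK_UNIT compared with other outcomes.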
Other points - thanks for the welcome
For the latest version, should that be the 8.4 series? Using versions other than the packaged ones is more awkward, so I'd like to target the correct one. Also I'll ask the maintainer to keep a lookout for the full release.
As for taking so long? The last time I scoured the forums NixOS seemed the lonely child and I wasn't sure I'd stick with the OS at all or spare the time. I still don't have the time, but it turns out I can be stubborn...
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Mon Dec 30, 2024 2:42 pm
by muziqaz
latest beta as in v8.4.9
www.foldingathome.org/beta
This version introduced stricter and more visible WU failures (partly due to your activity). So now users will be able to identify failing resource groups or computers. That client also stops further downloads of new WUs if there are 5 (or something like that) failures on that device.
Your drivers seem to be fine, for all intents and purposes. I see one of the failed WUs was using core23; failures there are not uncommon due to GLIBC issues, however other users tend not to go into 100s of thousands of failed WUs.
I'm a bit confused about the purpose of NixOS. What is it based on? The reason I ask is to understand how current your libraries are.
Let's say Linux Mint 22 is based on Ubuntu 24.04, so I know that Mint 22 is the latest iteration with the latest packages.
Or Linux Mint 21 is based on Ubuntu 22.04, which has older packages and libraries. I know this does not help anything, other than us figuring out how current your system/distro is. I would guess that NixOS is using a relatively old Linux base. If it is possible to upgrade, do so; that will help to sort out the failure issue. The culprit is a GLIBC version mismatch: core23 and core24 need a newer GLIBC version than what you currently have on your system.
For the moment I would suggest just folding on the CPU, since we have no workaround for core failures on Linux. Your CPU should be able to receive WUs to fold, as only GPU project owners have blacklisted your username. Once we have a solution to the failure issue, we will remove the limitations.
To stop folding on your GPU, either pause it or untick it completely in the options.
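The "stop downloading after a few failures" behaviour mentioned above can be pictured roughly as follows. This is only a sketch of the idea, not the client's actual code: the class name, the limit of 5, and the choice to count consecutive (rather than total) failures are all assumptions.

```python
class FailureLimiter:
    """Pause new downloads after `limit` consecutive failures (hypothetical policy)."""

    def __init__(self, limit=5):
        self.limit = limit
        self.consecutive_failures = 0

    def record(self, succeeded):
        """Record the outcome of one work unit."""
        if succeeded:
            self.consecutive_failures = 0  # a good result resets the streak
        else:
            self.consecutive_failures += 1

    @property
    def paused(self):
        """True once the failure streak has reached the limit."""
        return self.consecutive_failures >= self.limit


limiter = FailureLimiter(limit=5)
for _ in range(5):
    limiter.record(succeeded=False)
print("paused:", limiter.paused)
```

With a guard like this, a device that cannot run a core at all would stop after a handful of failed units instead of cycling through them indefinitely.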
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Mon Dec 30, 2024 4:12 pm
by damage
toTOW wrote: ↑Sun Dec 29, 2024 7:42 pm
NixOS ... sounds like an exotic distribution ... how close is it to one that is supported by FAH ?
But as muziqaz said, we need to see logs from the client.
NixOS is the exemplar distribution using Nix - a tool for package management and system config that aims to be reproducible, declarative, and reliable.
When it works, it is awesome. There are cases when it is a severe pain, but I like the idea, and learning new things, so I have stuck with it... although I do sometimes feel I am the only FAH user with NixOS.
It is worth taking a look at nixos.org - even if just to convince yourself you never want to go near it!
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Mon Dec 30, 2024 4:32 pm
by muziqaz
I did have a look at the website earlier and back when we first noticed your activities.
While you say you might be the only NixOS user in FAH, your team says otherwise. There are quite a few other users who joined the team, maybe they are also on this OS?
In regards to FAH and Linux, FAH likes simple distros which have good driver support and are more popular. Ubuntu is the number one choice, but Mint is not bad if one does not like Ubuntu's direction.
Re: NixOS, AMD GPU and v8 (specifically v8.3.18)
Posted: Mon Dec 30, 2024 4:33 pm
by damage
muziqaz wrote: ↑Mon Dec 30, 2024 2:42 pm
I'm a bit confused of the purpose of NixOS, what is it based on? the reason I ask is to understand how current your libraries are.
I use the unstable branch of NixOS, a rolling release. In terms of libraries, given that glibc is mentioned elsewhere on the forum, I am currently on version 2.4. The package was bumped from 8.3.7 to 8.3.18 3 months ago.
I believe it should be possible to specify the version of any libraries required in the packaging process, although that is more advanced than my current knowledge. I will investigate...
I will try and get the current package updated as well.
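When comparing glibc versions, strings such as "2.4" and "2.40" are easy to mix up if read as decimals, so comparing them as integer tuples is safer. A minimal sketch: `platform.libc_ver()` is from the Python standard library, while the helper names and the required version "2.28" are placeholders for illustration, not a documented requirement of any FAH core.

```python
import platform


def parse_version(text):
    """Turn '2.40' into (2, 40) so versions compare numerically per component."""
    return tuple(int(part) for part in text.split("."))


def glibc_at_least(required, found=None):
    """Return True/False, or None if the running libc is not reported as glibc."""
    if found is None:
        lib, found = platform.libc_ver()
        if lib != "glibc" or not found:
            return None
    return parse_version(found) >= parse_version(required)


# "2.28" is a placeholder requirement chosen only for the example.
print("2.40 meets 2.28:", glibc_at_least("2.28", found="2.40"))
print("2.4 meets 2.28:", glibc_at_least("2.28", found="2.4"))
print("this system meets 2.28:", glibc_at_least("2.28"))
```

Note that `platform.libc_ver()` reports the libc the Python interpreter itself is linked against, which on a Nix system may differ from what another packaged binary ends up using.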