[SOLVED] Folding with NVidia CUDA on Gentoo

Posted: Mon Mar 23, 2020 7:39 am
by jzroth
Hi, fairly new folder here. I'm running folding@home on Gentoo, and I'm having trouble getting it to recognize my Nvidia GeForce GTX 750 Ti. Folding on the CPU alone works fine: the client starts up, receives work units, and solves them. Now I'm trying to do GPU computing as well, so I emerged dev-util/nvidia-cuda-toolkit and dev-util/nvidia-cuda-sdk with the opencl USE flag, then rebuilt sci-biology/foldingathome. I can do

Code:

eselect opencl list
and see that the nvidia implementation of opencl is in fact selected.

However, after running

Code:

./FAHClient --configure
and saying yes to GPU, I see this on starting up FAHClient:

Code:

./FAHClient: /opt/foldingathome/libssl.so.10: no version information available (required by ./FAHClient)
./FAHClient: /opt/foldingathome/libcrypto.so.10: no version information available (required by ./FAHClient)
./FAHClient: /opt/foldingathome/libcrypto.so.10: no version information available (required by ./FAHClient)
07:31:46:INFO(1):Read GPUs.txt
07:31:46:************************* Folding@home Client *************************
07:31:46:    Website: https://foldingathome.org/
07:31:46:  Copyright: (c) 2009-2018 foldingathome.org
07:31:46:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:31:46:       Args:
07:31:46:     Config: /opt/foldingathome/config.xml
07:31:46:******************************** Build ********************************
07:31:46:    Version: 7.5.1
07:31:46:       Date: May 12 2018
07:31:46:       Time: 22:51:07
07:31:46: Repository: Git
07:31:46:   Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
07:31:46:     Branch: master
07:31:46:   Compiler: GNU 4.4.7 20120313 (Red Hat 4.4.7-18)
07:31:46:    Options: -std=gnu++98 -O3 -funroll-loops
07:31:46:   Platform: linux2 4.14.0-3-amd64
07:31:46:       Bits: 64
07:31:46:       Mode: Release
07:31:46:******************************* System ********************************
07:31:46:        CPU: AMD FX(tm)-8320 Eight-Core Processor
07:31:46:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
07:31:46:       CPUs: 8
07:31:46:     Memory: 13.57GiB
07:31:46:Free Memory: 515.02MiB
07:31:46:    Threads: POSIX_THREADS
07:31:46: OS Version: 4.19
07:31:46:Has Battery: false
07:31:46: On Battery: false
07:31:46: UTC Offset: -7
07:31:46:        PID: 30251
07:31:46:        CWD: /opt/foldingathome
07:31:46:         OS: Linux 4.19.97-gentoo-x86_64 x86_64
07:31:46:    OS Arch: AMD64
07:31:46:       GPUs: 1
07:31:46:      GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:4 GM107 [GeForce GTX 750 Ti] 1306
07:31:46:       CUDA: Not detected: cuInit() returned 100
07:31:46:     OpenCL: Not detected: clGetPlatformIDs() returned -1001
07:31:46:***********************************************************************
07:31:46:<config>
07:31:46:  <!-- Folding Slots -->
07:31:46:  <slot id='0' type='CPU'/>
07:31:46:  <slot id='1' type='GPU'/>
07:31:46:</config>
07:31:46:Trying to access database...
07:31:46:Successfully acquired database lock
No protocol specified
07:31:46:Enabled folding slot 00: READY cpu:6
07:31:46:Enabled folding slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] 1306
07:31:46:ERROR:No compute devices matched GPU #0 NVIDIA:4 GM107 [GeForce GTX 750 Ti] 1306.  You may need to update your graphics drivers.
For some reason it's not detecting OpenCL or CUDA, and it's telling me I may have to update my graphics drivers (they're at the latest version). Does anyone here know what I need to install/configure so folding@home can start doing GPU computing?
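
A note for anyone reading the two "Not detected" lines in that log: 100 is the value of CUDA_ERROR_NO_DEVICE in the CUDA driver API, and -1001 is CL_PLATFORM_NOT_FOUND_KHR from the OpenCL ICD loader extension. The sketch below only pulls such lines out of a log and names the codes. The function names explain_code and scan_log are made up for illustration and are not part of FAHClient, and only the two codes seen in this thread are covered.

```shell
#!/bin/sh
# Map the numeric codes from FAHClient's startup banner to their symbolic names.
# Only the two codes seen in this thread are handled; anything else is "unknown".
explain_code() {
    # $1 = API call name as printed in the log, $2 = numeric return code
    case "$1:$2" in
        "cuInit:100")             echo "CUDA_ERROR_NO_DEVICE" ;;
        "clGetPlatformIDs:-1001") echo "CL_PLATFORM_NOT_FOUND_KHR" ;;
        *)                        echo "unknown" ;;
    esac
}

# Read a log on stdin and print one line per "<call>() returned <code>" entry.
scan_log() {
    sed -n 's/.*Not detected: \([A-Za-z]*\)() returned \(-\{0,1\}[0-9]*\).*/\1 \2/p' |
    while read -r call code; do
        echo "$call returned $code: $(explain_code "$call" "$code")"
    done
}

# Example input: the two detection lines from the log in this post.
scan_log <<'EOF'
07:31:46:       CUDA: Not detected: cuInit() returned 100
07:31:46:     OpenCL: Not detected: clGetPlatformIDs() returned -1001
EOF
```

Both codes say that the calling process could not see a usable device or platform. That can happen even with a working driver, for instance when the user running the client has no access to the /dev/nvidia* device nodes.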

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Mon Mar 23, 2020 4:22 pm
by kostuek
You could try to hunt down the problem with glxinfo: https://dri.freedesktop.org/wiki/glxinfo/ Somehow it looks like the graphics driver is not installed.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Mon Mar 23, 2020 9:44 pm
by katakaio
On Ubuntu and other Linux distros, it's often necessary to install the OpenCL headers and libraries if you don't install the proprietary drivers. For me, installing ocl-icd-opencl-dev worked like a charm.

It's not clear to me that installing the CUDA toolkit and SDK would get you those headers and libs, so that's my best guess.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Mon Mar 23, 2020 10:14 pm
by bruce
I believe you have to install the proprietary drivers. Doing so while X is running is likely to clobber your GUI. Find an Ubuntu-supported package for the NVidia proprietary drivers.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Tue Mar 24, 2020 7:58 am
by jzroth
Proprietary drivers are already installed and running (I have "NVIDIA Corporation" as my GLX vendor string). Also, Gentoo installs the headers for every package automatically (it has to, because it's a source-based distro).

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Tue Mar 24, 2020 8:47 am
by jzroth
Also, clinfo shows my OpenCL implementation is working:

Code:

jacob@gaia ~ $ clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 10.2.131
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     GeForce GTX 750 Ti
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  440.59
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 01:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               5
  Max clock frequency                             1110MHz
  Compute Capability (NV)                         5.0
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
(... and so on; the output continues for a while)

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Tue Mar 24, 2020 10:53 am
by andreassen51
Just a thought (I tend to think out loud, excuse me): when I had problems getting darktable to use the GPU, I had to rebuild the kernel and then install the Nvidia driver again. After a reboot, darktable behaved as if the problem had never been there.

With that sorted I could start folding at once.

Regards.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Tue Mar 24, 2020 5:11 pm
by dbosso
I'm seeing something similar here and would love to find a solution. Can you show me what you have for these lines in clinfo, as I think they indicate where the problem is:

Code:

# clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 10.2.141
  Platform Profile                                FULL_PROFILE
...
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [NV]
  clCreateContext(NULL, ...) [default]            Success [NV]
Specifically, the line "clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform" seems wrong.

Thanks.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Tue Mar 24, 2020 9:34 pm
by jzroth
I'm seeing that too. Here's the full clinfo:

Code:

Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 10.2.131
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     GeForce GTX 750 Ti
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  440.59
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 01:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               5
  Max clock frequency                             1110MHz
  Compute Capability (NV)                         5.0
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2096168960 (1.952GiB)
  Error Correction support                        No
  Max memory allocation                           524042240 (499.8MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        122880 (120KiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             4096x4096x4096 pixels
    Max number of read image args                 256
    Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     9
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [NV]
  clCreateContext(NULL, ...) [default]            Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
Down there at the bottom I have the same thing as you. Not sure what it means, though.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Wed Mar 25, 2020 2:07 pm
by dbosso
I don't think it means anything, but I think I just solved the issue on my system.

After some more searching and reading I found this helpful page: https://www.darktable.org/usermanual/en ... ystem.html

Then I noticed:

Code:

# ls -l /dev/nvidia*
crw-rw---- 1 root video 195,   0 Mar 24 09:49 /dev/nvidia0
crw-rw---- 1 root video 195, 255 Mar 24 09:49 /dev/nvidiactl
crw-rw-rw- 1 root root  244,   0 Mar 24 09:49 /dev/nvidia-uvm
crw-rw-rw- 1 root root  244,   1 Mar 24 09:49 /dev/nvidia-uvm-tools
I added the foldingathome user to the video group and that fixed the problem for me.
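
The listing above shows /dev/nvidia0 and /dev/nvidiactl as mode crw-rw---- with owner root and group video, so only root and members of video can open them. A minimal sketch for checking this on another machine follows. It assumes the client runs as a user named foldingathome, as in this thread; FAH_USER and has_group are names chosen for this example only.

```shell
#!/bin/sh
# Assumption: the client runs as the user "foldingathome" (as in this thread);
# set FAH_USER in the environment if your setup differs.
FAH_USER="${FAH_USER:-foldingathome}"

# has_group GROUP "LIST": succeed if GROUP is one of the space-separated names in LIST.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if id "$FAH_USER" >/dev/null 2>&1; then
    # id -nG prints the names of all groups the user belongs to.
    groups_of_user=$(id -nG "$FAH_USER")
    if has_group video "$groups_of_user"; then
        echo "$FAH_USER is in the video group"
    else
        echo "$FAH_USER is NOT in the video group (try: usermod -aG video $FAH_USER)"
    fi
else
    echo "user $FAH_USER does not exist on this machine"
fi

# Show owner, group and mode of each NVIDIA device node, if any are present.
for dev in /dev/nvidia*; do
    [ -e "$dev" ] || continue
    ls -l "$dev"
done
```

A new group membership only applies to processes started after the change, so the client has to be restarted before it can open the device nodes.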

Hopefully this helps!

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Wed Mar 25, 2020 7:22 pm
by jzroth
Seems to have worked for me too! It's detecting CUDA and OpenCL now.
I won't know for sure that it works until work units are available again, though.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Wed Mar 25, 2020 7:40 pm
by bruce
This looks like it might be the same issue:
HOWTO: How I got my R9 290 folding on Linux

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Thu Mar 26, 2020 3:14 am
by jzroth
bruce wrote: This looks like it might be the same issue:
HOWTO: How I got my R9 290 folding on Linux
That looks like it was a different issue - mine was just that I needed to add user "foldingathome" to group "video" in order to detect the CUDA and OpenCL devices on startup.

However, upon solving that problem I've encountered a new one. When FoldingAtHome manages to download work units for my GPU, I get this:

Code:

02:47:18:WU01:FS01:Download complete
02:47:18:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11763 run:0 clone:2125 gen:6 core:0x22 unit:0x0000001480
02:47:18:WU01:FS01:Starting
02:47:18:WU01:FS01:Running FahCore: /opt/foldingathome/FAHCoreWrapper /opt/foldingathome/cores/cores.foldingathome.org/v7/lin/64bit/Co
0 -cuda-device 0 -gpu 0
02:47:18:WU01:FS01:Started FahCore on PID 63455
02:47:18:WU01:FS01:Core PID:63459
02:47:18:WU01:FS01:FahCore 0x22 started
02:47:19:WU01:FS01:0x22:*********************** Log Started 2020-03-26T02:47:18Z ***********************
02:47:19:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
02:47:19:WU01:FS01:0x22:       Type: 0x22
02:47:19:WU01:FS01:0x22:       Core: Core22
02:47:19:WU01:FS01:0x22:    Website: https://foldingathome.org/
02:47:19:WU01:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
02:47:19:WU01:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
02:47:19:WU01:FS01:0x22:             <rafal.wiewiora@choderalab.org>
02:47:19:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 705 -lifeline 63455 -checkpoint 15
02:47:19:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
02:47:19:WU01:FS01:0x22:             0 -gpu 0
02:47:19:WU01:FS01:0x22:     Config: <none>
02:47:19:WU01:FS01:0x22:************************************ Build *************************************
02:47:19:WU01:FS01:0x22:    Version: 0.0.2
02:47:19:WU01:FS01:0x22:       Date: Dec 6 2019
02:47:19:WU01:FS01:0x22:       Time: 21:20:17
02:47:19:WU01:FS01:0x22: Repository: Git
02:47:19:WU01:FS01:0x22:   Revision: f87d92b58abdf7e6bf2e173cfbc4dc3e837c7042
02:47:19:WU01:FS01:0x22:     Branch: core22
02:47:19:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
02:47:19:WU01:FS01:0x22:    Options: -std=gnu++98 -O3 -funroll-loops
02:47:19:WU01:FS01:0x22:   Platform: linux2 4.9.87-linuxkit-aufs
02:47:19:WU01:FS01:0x22:       Bits: 64
02:47:19:WU01:FS01:0x22:       Mode: Release
02:47:19:WU01:FS01:0x22:************************************ System ************************************
02:47:19:WU01:FS01:0x22:        CPU: AMD FX(tm)-8320 Eight-Core Processor
02:47:19:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
02:47:19:WU01:FS01:0x22:       CPUs: 8
02:47:19:WU01:FS01:0x22:     Memory: 13.57GiB
02:47:19:WU01:FS01:0x22:Free Memory: 2.06GiB
02:47:19:WU01:FS01:0x22:    Threads: POSIX_THREADS
02:47:19:WU01:FS01:0x22: OS Version: 4.19
02:47:19:WU01:FS01:0x22:Has Battery: false
02:47:19:WU01:FS01:0x22: On Battery: false
02:47:19:WU01:FS01:0x22: UTC Offset: -7
02:47:19:WU01:FS01:0x22:        PID: 63459
02:47:19:WU01:FS01:0x22:        CWD: /opt/foldingathome/work
02:47:19:WU01:FS01:0x22:         OS: Linux 4.19.97-gentoo-x86_64 x86_64
02:47:19:WU01:FS01:0x22:    OS Arch: AMD64
02:47:19:WU01:FS01:0x22:********************************************************************************
02:47:19:WU01:FS01:0x22:Project: 11763 (Run 0, Clone 2125, Gen 6)
02:47:19:WU01:FS01:0x22:Unit: 0x0000001480fccb0a5e6d814448d9d0cf
02:47:19:WU01:FS01:0x22:Reading tar file core.xml
02:47:19:WU01:FS01:0x22:Reading tar file integrator.xml
02:47:19:WU01:FS01:0x22:Reading tar file state.xml
02:47:19:WU01:FS01:0x22:Reading tar file system.xml
02:47:19:WU01:FS01:0x22:Digital signatures verified
02:47:19:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
02:47:19:WU01:FS01:0x22:Version 0.0.2
02:47:19:WU01:FS01:0x22:ERROR:exception: There is no registered Platform called "OpenCL"
02:47:19:WU01:FS01:0x22:Saving result file ../logfile_01.txt
02:47:19:WU01:FS01:0x22:Saving result file science.log
It keeps failing to start computing on the GPU with 'ERROR:exception: There is no registered Platform called "OpenCL"'. So, looks like I haven't completely solved it yet.

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Thu Mar 26, 2020 11:57 pm
by Ninpo
I believe you need the uvm USE flag for the nvidia driver for OpenCL to work with FAH.

Code:

Installed versions:  440.64(0/440) (X acpi driver kms libglvnd tools uvm -compat -gtk3 -multilib -static-libs -wayland ABI_MIPS="-n32 -n64 -o32" ABI_RISCV="-lp64 -lp64d" ABI_S390="-32 -64" ABI_X86="32 64 -x32" KERNEL="linux -FreeBSD")
That along with having foldingathome in the video group got everything working for me today.
There's also:

Code:

<config>
  <!-- Slot Control -->
  <power v='MEDIUM'/>

  <!-- User Information -->
  <passkey v='APASSKEY'/>
  <team v='TEAMNUMBER'/>
  <user v='Ninpo'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'>
    <paused v='true'/>
  </slot>
</config>
in my config file at the moment. (The GPU slot is currently paused because my desktop is unusable while folding on it, but it does work.)
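
For the uvm suggestion, a quick check is whether the nvidia_uvm kernel module and the /dev/nvidia-uvm device node are present. The sketch below is one way to look. module_loaded is a helper name invented here, and the check is only a hint: the module is often loaded on demand, so "not loaded" right after boot does not prove the driver was built without uvm.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME appears in the first column of FILE,
# which is expected to be in /proc/modules format. FILE defaults to /proc/modules.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit (found ? 0 : 1) }' "${2:-/proc/modules}"
}

if [ -r /proc/modules ] && module_loaded nvidia_uvm; then
    echo "nvidia_uvm is loaded"
else
    echo "nvidia_uvm is not currently loaded (was nvidia-drivers built with USE=uvm?)"
fi

# The device node is a character device when the module is set up.
if [ -c /dev/nvidia-uvm ]; then
    echo "/dev/nvidia-uvm exists"
else
    echo "/dev/nvidia-uvm is missing"
fi
```

On Gentoo, the flag can be made persistent with a line such as "x11-drivers/nvidia-drivers uvm" in a file under /etc/portage/package.use/ (the file name is up to you), followed by re-emerging the driver.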

Re: Gentoo: Neither CUDA nor OpenCL detected despite install

Posted: Sun Mar 29, 2020 8:19 pm
by jzroth
OK, I've managed to get it working. Here's everything I did:

1. Uninstall sci-biology/foldingathome
2. Delete the /opt/foldingathome directory
3. Emerge nvidia-drivers with these USE flags:

Code:

+X
+abi_x86_32
+acpi
+driver
+kms
+libglvnd
+multilib
+static-libs
+tools
+uvm
4. Emerge nvidia-cuda-sdk with these USE flags:

Code:

+cuda
+doc
+examples
+mpi
+opencl
5. Emerge nvidia-cuda-toolkit (I'm not sure this is actually necessary, but I did it anyway)
6. Verify that

Code:

eselect opencl list
shows the nvidia implementation is selected
7. Reboot the computer
8. Emerge sci-biology/foldingathome
9. Add user foldingathome to the video group with

Code:

usermod -aG video foldingathome
10. cd to the /opt/foldingathome directory
11. Run config with

Code:

sudo -u foldingathome ./FAHClient --configure
and choose to use the GPU
12. Run foldingathome with

Code:

sudo -u foldingathome ./FAHClient
Now it's folding on my GTX 750 Ti on Gentoo!
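
For reference, the package steps above can be written down as a dry-run script. This is only a sketch under some assumptions: the driver package is taken to be x11-drivers/nvidia-drivers, the USE flags are passed through the environment as a one-off instead of /etc/portage/package.use, and the names run and fah_gpu_setup are invented for this example. With DRY_RUN=1 it only prints the commands; the interactive configure and run steps are left to be typed by hand.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed.
# Change to 0 and run as root to actually execute them.
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# fah_gpu_setup STAGE: STAGE is "before-reboot" or "after-reboot".
fah_gpu_setup() {
    case "$1" in
        before-reboot)
            run emerge --unmerge sci-biology/foldingathome
            run rm -rf /opt/foldingathome
            # USE flags as listed in the steps above.
            run env USE="X abi_x86_32 acpi driver kms libglvnd multilib static-libs tools uvm" \
                emerge x11-drivers/nvidia-drivers
            run env USE="cuda doc examples mpi opencl" emerge dev-util/nvidia-cuda-sdk
            run emerge dev-util/nvidia-cuda-toolkit
            # Check by eye that the nvidia implementation is the selected one.
            run eselect opencl list
            ;;
        after-reboot)
            run emerge sci-biology/foldingathome
            run usermod -aG video foldingathome
            ;;
        *)
            echo "usage: fah_gpu_setup before-reboot|after-reboot" >&2
            return 2
            ;;
    esac
}

fah_gpu_setup before-reboot
fah_gpu_setup after-reboot
```

In dry-run mode the printed lines lose their quoting, so treat the output as a checklist, not as something to paste back into a shell.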