How does FAH "Detect" CUDA?

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

Druco
Posts: 1
Joined: Thu Apr 30, 2015 10:19 am

How does FAH "Detect" CUDA?

Post by Druco »

System:
i5-4590
8GB RAM
GTX 760 GPU

OpenSUSE 13.2

I have installed the 346.46 drivers and the CUDA Development package from the NVIDIA website, and the example programs work, so I assume that CUDA is working. FAH still reports that it isn't detecting CUDA:

Code:

*********************** Log Started 2015-05-01T22:35:40Z ***********************
22:35:40:************************* Folding@home Client *************************
22:35:40:    Website: http://folding.stanford.edu/
22:35:40:  Copyright: (c) 2009-2014 Stanford University
22:35:40:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:35:40:       Args: --child --lifeline 5522 /etc/fahclient/config.xml --run-as
22:35:40:             fahclient --pid-file=/var/run/fahclient.pid --daemon
22:35:40:     Config: /etc/fahclient/config.xml
22:35:40:******************************** Build ********************************
22:35:40:    Version: 7.4.4
22:35:40:       Date: Mar 4 2014
22:35:40:       Time: 12:01:17
22:35:40:    SVN Rev: 4130
22:35:40:     Branch: fah/trunk/client
22:35:40:   Compiler: GNU 4.1.2 20080704 (Red Hat 4.1.2-46)
22:35:40:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
22:35:40:             -fno-unsafe-math-optimizations -msse2
22:35:40:   Platform: linux2 2.6.18-164.11.1.el5
22:35:40:       Bits: 64
22:35:40:       Mode: Release
22:35:40:******************************* System ********************************
22:35:40:        CPU: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
22:35:40:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
22:35:40:       CPUs: 4
22:35:40:     Memory: 7.74GiB
22:35:40:Free Memory: 5.24GiB
22:35:40:    Threads: POSIX_THREADS
22:35:40: OS Version: 3.16
22:35:40:Has Battery: false
22:35:40: On Battery: false
22:35:40: UTC Offset: -7
22:35:40:        PID: 5524
22:35:40:        CWD: /var/lib/fahclient
22:35:40:         OS: Linux 3.16.7-21-default x86_64
22:35:40:    OS Arch: AMD64
22:35:40:       GPUs: 1
22:35:40:      GPU 0: NVIDIA:3 GK104 [GeForce GTX 760]
22:35:40:       CUDA: Not detected
22:35:40:***********************************************************************

My question is: how does FAH try to detect CUDA? Since the CUDA examples are working, I must have some environment variable, permission, or location set incorrectly, but without knowing what the detection algorithm is, I am having a hard time debugging it.

I know from other posts that if CUDA is not present the client will try to load an OpenCL WU, but I've been trying that for days and all I ever get are BAD_WORK_UNIT errors until it gives up:

Code:

22:35:40:Switching to user fahclient
22:35:40:Trying to access database...
22:35:42:Successfully acquired database lock
22:35:42:Enabled folding slot 01: READY gpu:0:GK104 [GeForce GTX 760]
22:35:42:WU00:FS01:Connecting to 171.67.108.200:80
22:35:43:WU00:FS01:Assigned to work server 171.64.65.56
22:35:43:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GK104 [GeForce GTX 760] from 171.64.65.56
22:35:43:WU00:FS01:Connecting to 171.64.65.56:8080
22:35:43:WU00:FS01:Downloading 885.46KiB
22:35:44:WU00:FS01:Download complete
22:35:44:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9411 run:544 clone:0 gen:71 core:0x17 unit:0x00000052ab40413854d27bf36ddc9191
22:35:44:WU00:FS01:Starting
22:35:44:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 00 -suffix 01 -version 704 -lifeline 5524 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
22:35:44:WU00:FS01:Started FahCore on PID 5539
22:35:44:WU00:FS01:Core PID:5543
22:35:44:WU00:FS01:FahCore 0x17 started
22:35:44:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:35:44:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9411 run:544 clone:0 gen:71 core:0x17 unit:0x00000052ab40413854d27bf36ddc9191
22:35:44:WU00:FS01:Uploading 1.82KiB to 171.64.65.56
22:35:44:WU00:FS01:Connecting to 171.64.65.56:8080
22:35:45:WU00:FS01:Upload complete
22:35:45:WU01:FS01:Connecting to 171.67.108.200:80
22:35:45:WU00:FS01:Server responded WORK_ACK (400)
22:35:45:WU00:FS01:Cleaning up

This continues until the client gives up, so I would like to be able to at least try a CUDA WU.

Thank you.
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: How does FAH "Detect" CUDA?

Post by JimboPalmer »

I am no expert, but I do not think any CUDA WUs are available for Linux. I thought they were all OpenCL?
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: How does FAH "Detect" CUDA?

Post by 7im »

The OpenCL work units are closely tied, by CUDA version, to options in the CUDA code. That is why the client checks for a CUDA version even though it doesn't use CUDA directly.

I don't know how the detection works. I just know that in all but one case, no CUDA means no folding. There are several openSUSE threads on this forum. Please review those for a solution.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
everyman
Posts: 27
Joined: Fri Aug 08, 2008 4:15 am
Hardware configuration: Toshiba X205-SLi1 using nVidia CUDA drivers version 177.35

Re: How does FAH "Detect" CUDA?

Post by everyman »

FAH detects CUDA based on the kernel modules you have installed for your hardware. Driver packages from third parties (such as Linux distros) might work OK for 3D games or desktop effects, but I have never gotten OpenCL or CUDA working with them. It is almost a requirement that Nvidia users download and install the drivers directly from Nvidia and according to their instructions.

Once you have done that, you can run "sudo lsmod | grep nvidia" to see if all of the required modules are loaded. There should be two of them, nvidia and nvidia-uvm (lsmod lists the latter as nvidia_uvm).
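
The module check above can also be scripted. This is only a minimal sketch, assuming a Linux system where /proc/modules is readable. The function name missing_modules is made up for illustration, and the REQUIRED list simply mirrors the two modules named in this post (the kernel reports nvidia-uvm with an underscore); it does not come from FAH documentation.

```python
# Sketch only: report which of the expected NVIDIA kernel modules are not loaded.
# REQUIRED mirrors the two modules named in the post; it is an assumption,
# not something taken from FAH documentation.
REQUIRED = ("nvidia", "nvidia_uvm")


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required module names absent from /proc/modules-style text."""
    loaded = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            loaded.add(fields[0])
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError as err:
        print("cannot read /proc/modules:", err)
    else:
        missing = missing_modules(text)
        print("missing modules:", ", ".join(missing) if missing else "none")
```

If nvidia_uvm shows up as missing, loading it (for example with modprobe as root) before starting the client would be the next thing to try.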


E
"In Theory there is no difference between Theory and Practice. In Practice there is."
DJViking
Posts: 41
Joined: Tue Apr 19, 2016 1:39 pm

Re: How does FAH "Detect" CUDA?

Post by DJViking »

This thread has been dormant for a while. I suspect it is the same problem I have on OpenSUSE. viewtopic.php?f=89&t=28751
I had installed the Nvidia proprietary drivers, which include the libraries for CUDA and OpenCL, but the user fahclient did not get access to the GPU. As a result, folding on the GPU did not work.
My own user had been added to the group video, which I think is required to access the GPU. When I started FAHClient as my own user, it detected CUDA and was able to fold with the GPU.
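
One way to check the access problem described above is to look at the permissions on the GPU device nodes from the account that runs the client. The following is a rough sketch, not a definitive test: it assumes the NVIDIA driver exposes its nodes as /dev/nvidia*, it only applies the classic owner/group/other rules (no ACLs, no special case for root), and allows_read_write is a hypothetical helper name.

```python
import glob
import os
import stat


def allows_read_write(mode, owner_uid, owner_gid, uid, gids):
    """Classic Unix owner/group/other check; ignores ACLs and root's override."""
    if uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif owner_gid in gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (mode & needed) == needed


if __name__ == "__main__":
    uid = os.geteuid()
    gids = set(os.getgroups()) | {os.getegid()}
    # Assumption: the driver's device nodes match /dev/nvidia*.
    nodes = sorted(glob.glob("/dev/nvidia*"))
    if not nodes:
        print("no /dev/nvidia* nodes found")
    for node in nodes:
        try:
            info = os.stat(node)
        except OSError as err:
            print(node, "cannot stat:", err)
            continue
        ok = allows_read_write(info.st_mode, info.st_uid, info.st_gid, uid, gids)
        print(node, oct(stat.S_IMODE(info.st_mode)),
              "accessible" if ok else "NOT accessible")
```

Running it once as your own user and once as the service account (for example with sudo -u fahclient) should show whether the two accounts see the device differently.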
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: How does FAH "Detect" CUDA?

Post by bruce »

Is there an entry in the path or an environment variable that points to the library(s)?

(I don't know how NVidia does it; these are a couple of methods I've seen used.)
DJViking
Posts: 41
Joined: Tue Apr 19, 2016 1:39 pm

Re: How does FAH "Detect" CUDA?

Post by DJViking »

bruce wrote:Is there an entry in the path or an environment variable that points to the library(s)?
It is not access to the library that is the problem, but access to the graphics device. Any program can use any library on the system, but what use is the library without access to the device?
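
The two failure modes discussed here (library not found versus device not usable) can be told apart with a small probe. This is a sketch under stated assumptions, and it is not claimed to be how FAHClient itself detects CUDA: it loads the driver library under the name libcuda.so.1 and calls the CUDA driver API functions cuInit and cuDriverGetVersion through ctypes. The helper name probe_cuda is hypothetical.

```python
import ctypes


def probe_cuda(libname="libcuda.so.1"):
    """Return (ok, message) describing how far a CUDA driver probe gets."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError as err:
        # The dynamic loader could not find or open the library.
        return False, "library not loadable: %s" % err
    status = lib.cuInit(0)
    if status != 0:
        # The library is present, but the driver or device could not be initialised.
        return False, "library loaded, but cuInit failed with CUresult %d" % status
    version = ctypes.c_int(0)
    status = lib.cuDriverGetVersion(ctypes.byref(version))
    if status != 0:
        return False, "cuDriverGetVersion failed with CUresult %d" % status
    # The version is encoded as 1000 * major + 10 * minor.
    major, minor = version.value // 1000, (version.value % 1000) // 10
    return True, "CUDA driver API %d.%d" % (major, minor)


if __name__ == "__main__":
    print(probe_cuda())
```

If the probe succeeds for your own user but cuInit fails when run as fahclient, that points at device access rather than at the library path.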
toTOW
Site Moderator
Posts: 6359
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: How does FAH "Detect" CUDA?

Post by toTOW »

Could be a permission issue... can you try to run the client as root/sudo?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.