Struggling to fold on my Vega 56
Posted: Thu Aug 22, 2019 10:32 pm
by tmontney
Edit: Cross-post here:
https://askubuntu.com/questions/1167968 ... figuration
Followed this for my Ubuntu 18.04 x64 headless (no GUI/desktop packages) build:
https://gpuopen.com/vega-frontier-insta ... he-driver/
Added the slot via telnet with "slot-add gpu". (I don't have access to FAHControl at the moment.) The slot is stuck in status READY, and the log is filled with "Failed to start core: OpenCL device matching slot 3 not found". The log also has "No compute devices matched GPU #2 AMD:5 [Radeon Rx vega]". I have 2 other GPUs, both NVIDIA, and they are working fine.
lspci shows "05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XT [Radeon RX Vega 64] (rev c3)".
What have I missed?
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 1:00 am
by JimboPalmer
[I am not a Linux user, but when one shows up, having logs won't hurt]
Posting the first 200 lines of the log, where it lists your configuration, would help us 'see' the problem.
viewtopic.php?f=24&t=26036 has hints for many OSes.
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 6:38 am
by bruce
It sounds like the OpenCL drivers for your GPU have not been installed.
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 1:58 pm
by tmontney
Originally when I configured my NVIDIA GPUs, I installed ocl-icd-opencl-dev and nvidia-driver-390. I figured I already had OpenCL as it's the multi-vendor package. I then tried to install mesa-opencl-icd. After a reboot, it caused all my slots to act like my Vega. However, it only says "No compute devices matched" for my Vega still, so perhaps it really is driver related. Not sure how I'm missing the drivers.
Code:
********************** Log Started 2019-08-23T13:44:31Z ***********************
13:44:31:************************* Folding@home Client *************************
13:44:31: Website: https://foldingathome.org/
13:44:31: Copyright: (c) 2009-2018 foldingathome.org
13:44:31: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:44:31: Args: --child --lifeline 672 /etc/fahclient/config.xml --run-as
13:44:31: fahclient --pid-file=/var/run/fahclient.pid --daemon
13:44:31: Config: /etc/fahclient/config.xml
13:44:31:******************************** Build ********************************
13:44:31: Version: 7.5.1
13:44:31: Date: May 11 2018
13:44:31: Time: 19:59:04
13:44:31: Repository: Git
13:44:31: Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
13:44:31: Branch: master
13:44:31: Compiler: GNU 6.3.0 20170516
13:44:31: Options: -std=gnu++98 -O3 -funroll-loops
13:44:31: Platform: linux2 4.14.0-3-amd64
13:44:31: Bits: 64
13:44:31: Mode: Release
13:44:31:******************************* System ********************************
13:44:31: CPU: Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz
13:44:31: CPU ID: GenuineIntel Family 6 Model 158 Stepping 11
13:44:31: CPUs: 4
13:44:31: Memory: 7.46GiB
13:44:31: Free Memory: 7.04GiB
13:44:31: Threads: POSIX_THREADS
13:44:31: OS Version: 4.15
13:44:31: Has Battery: false
13:44:31: On Battery: false
13:44:31: UTC Offset: -5
13:44:31: PID: 674
13:44:31: CWD: /var/lib/fahclient
13:44:31: OS: Linux 4.15.0-58-generic x86_64
13:44:31: OS Arch: AMD64
13:44:31: GPUs: 3
13:44:31: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:6 GM200 [GeForce GTX 980 Ti] 5632
13:44:31: GPU 1: Bus:2 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
13:44:31: GPU 2: Bus:5 Slot:0 Func:0 AMD:5 [Radeon Rx vega]
13:44:31:CUDA Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:6.1 Driver:9.1
13:44:31:CUDA Device 1: Platform:0 Device:1 Bus:1 Slot:0 Compute:5.2 Driver:9.1
13:44:31: OpenCL: Not detected: clGetDeviceIDs() returned -1
13:44:31:***********************************************************************
13:44:31:<config>
13:44:31: <!-- Folding Core -->
13:44:31: <checkpoint v='30'/>
13:44:31: <core-priority v='low'/>
13:44:31:
13:44:31: <!-- Folding Slot Configuration -->
13:44:31: <cause v='ALZHEIMERS'/>
13:44:31:
13:44:31: <!-- Network -->
13:44:31: <proxy v=':8080'/>
13:44:31:
13:44:31: <!-- Remote Command Server -->
13:44:31: <password v='***********'/>
13:44:31:
13:44:31: <!-- Slot Control -->
13:44:31: <pause-on-battery v='false'/>
13:44:31: <power v='full'/>
13:44:31:
13:44:31: <!-- User Information -->
13:44:31: <passkey v='********************************'/>
13:44:31: <team v='224497'/>
13:44:31: <user v='tmontney'/>
13:44:31:
13:44:31: <!-- Folding Slots -->
13:44:31: <slot id='0' type='CPU'/>
13:44:31: <slot id='1' type='GPU'/>
13:44:31: <slot id='2' type='GPU'/>
13:44:31: <slot id='3' type='GPU'/>
13:44:31:</config>
13:44:31:Switching to user fahclient
13:44:31:Trying to access database...
13:44:31:Successfully acquired database lock
13:44:31:Enabled folding slot 00: READY cpu:1
13:44:31:Enabled folding slot 01: READY gpu:0:GM200 [GeForce GTX 980 Ti] 5632
13:44:31:Enabled folding slot 02: READY gpu:1:GP104 [GeForce GTX 1070] 6463
13:44:31:Enabled folding slot 03: READY gpu:2:[Radeon Rx vega]
13:44:31:ERROR:No compute devices matched GPU #2 AMD:5 [Radeon Rx vega]. You may need to update your graphics drivers.
13:44:31:WU02:FS00:Starting
13:44:31:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/AVX/$
13:44:31:WU02:FS00:Started FahCore on PID 688
13:44:31:WU02:FS00:Core PID:692
13:44:31:WU02:FS00:FahCore 0xa7 started
13:44:31:WU04:FS01:Starting
13:44:31:ERROR:WU04:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually.
13:44:31:WU03:FS03:Starting
13:44:31:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually.
13:44:31:WU01:FS02:Starting
13:44:31:ERROR:WU01:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually.
13:44:31:WU04:FS01:Starting
13:44:31:ERROR:WU04:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually.
13:44:31:WU03:FS03:Starting
13:44:31:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually.
13:44:31:WU01:FS02:Starting
13:44:31:ERROR:WU01:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually.
13:44:31:WU02:FS00:0xa7:*********************** Log Started 2019-08-23T13:44:31Z ***********************
13:44:31:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
13:44:31:WU02:FS00:0xa7: Type: 0xa7
13:44:31:WU02:FS00:0xa7: Core: Gromacs
13:44:31:WU02:FS00:0xa7: Website: https://foldingathome.org/
13:44:31:WU02:FS00:0xa7: Copyright: (c) 2009-2018 foldingathome.org
13:44:31:WU02:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:44:31:WU02:FS00:0xa7: Args: -dir 02 -suffix 01 -version 705 -lifeline 688 -checkpoint 30 -np 1
13:44:31:WU02:FS00:0xa7: Config: <none>
13:44:31:WU02:FS00:0xa7:************************************ Build *************************************
13:44:31:WU02:FS00:0xa7: Version: 0.0.17
13:44:31:WU02:FS00:0xa7: Date: Apr 27 2018
13:44:31:WU02:FS00:0xa7: Time: 19:09:21
13:44:31:WU02:FS00:0xa7: Repository: Git
13:44:31:WU02:FS00:0xa7: Revision: 21359963583d09ec2063ef946399441c4df4ccd7
13:44:31:WU02:FS00:0xa7: Branch: master
13:44:31:WU02:FS00:0xa7: Compiler: GNU 6.3.0 20170516
13:44:31:WU02:FS00:0xa7: Options: -std=gnu++98 -O3 -funroll-loops
13:44:31:WU02:FS00:0xa7: Platform: linux2 4.14.0-3-amd64
13:44:31:WU02:FS00:0xa7: Bits: 64
13:44:31:WU02:FS00:0xa7: Mode: Release
13:44:31:WU02:FS00:0xa7: SIMD: avx_256
13:44:31:WU02:FS00:0xa7:************************************ System ************************************
13:44:31:WU02:FS00:0xa7: CPU: Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz
13:44:31:WU02:FS00:0xa7: CPU ID: GenuineIntel Family 6 Model 158 Stepping 11
13:44:31:WU02:FS00:0xa7: CPUs: 4
13:44:31:WU02:FS00:0xa7: Memory: 7.46GiB
13:44:31:WU02:FS00:0xa7:Free Memory: 7.02GiB
13:44:31:WU02:FS00:0xa7: Threads: POSIX_THREADS
13:44:31:WU02:FS00:0xa7: OS Version: 4.15
13:44:31:WU02:FS00:0xa7:Has Battery: false
13:44:31:WU02:FS00:0xa7: On Battery: false
13:44:31:WU02:FS00:0xa7: UTC Offset: -5
13:44:31:WU02:FS00:0xa7: PID: 692
13:44:31:WU02:FS00:0xa7: CWD: /var/lib/fahclient/work
13:44:31:WU02:FS00:0xa7: OS: Linux 4.15.0-58-generic x86_64
13:44:31:WU02:FS00:0xa7: OS Arch: AMD64
13:44:31:WU02:FS00:0xa7:********************************************************************************
13:44:31:WU02:FS00:0xa7:Project: 14153 (Run 10, Clone 347, Gen 140)
13:44:31:WU02:FS00:0xa7:Unit: 0x000000ab0002894b5c6e8ef8729ab0b5
13:44:31:WU02:FS00:0xa7:Digital signatures verified
13:44:31:WU02:FS00:0xa7:Calling: mdrun -s frame140.tpr -o frame140.trr -cpi state.cpt -cpt 30 -nt 1
13:44:31:WU02:FS00:0xa7:Steps: first=700000000 total=5000000
13:44:31:WU02:FS00:0xa7:Completed 3537662 out of 5000000 steps (70%)
13:45:31:WU04:FS01:Starting
13:45:31:ERROR:WU04:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually.
13:45:31:WU03:FS03:Starting
13:45:31:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually.
13:45:31:WU01:FS02:Starting
After removing mesa-opencl-icd and restarting FAHClient, the NVIDIA slots folded again. Perhaps it wasn't a fan of multiple OpenCL packages.
clinfo also does not list my Vega.
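For anyone comparing their own log against this one, the detection lines in the System section are the ones that matter: three GPUs are listed, two CUDA devices are found, and OpenCL is not detected at all. A minimal Python sketch that pulls just those lines out of a FAHClient log follows. The function name and the sample constant are made up for illustration, and the line prefixes it looks for are assumed from the log pasted above, not from any FAHClient documentation.

```python
import re

# Sample lines copied from the log pasted above.
SAMPLE_LOG = """\
13:44:31: GPUs: 3
13:44:31: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:6 GM200 [GeForce GTX 980 Ti] 5632
13:44:31: GPU 1: Bus:2 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
13:44:31: GPU 2: Bus:5 Slot:0 Func:0 AMD:5 [Radeon Rx vega]
13:44:31:CUDA Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:6.1 Driver:9.1
13:44:31:CUDA Device 1: Platform:0 Device:1 Bus:1 Slot:0 Compute:5.2 Driver:9.1
13:44:31: OpenCL: Not detected: clGetDeviceIDs() returned -1
13:44:31:ERROR:WU04:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually.
"""


def summarize_detection(log_text):
    """Group the GPU, CUDA and OpenCL detection lines of a FAHClient log.

    Hypothetical helper: the prefixes matched here are taken from the
    log in this thread and may differ in other client versions.
    """
    summary = {"gpus": [], "cuda": [], "opencl": []}
    for line in log_text.splitlines():
        # Drop the leading HH:MM:SS: timestamp, then surrounding spaces.
        body = re.sub(r"^\d{1,2}:\d{2}:\d{2}:", "", line).strip()
        if re.match(r"GPU \d+:", body):
            summary["gpus"].append(body)
        elif body.startswith("CUDA Device"):
            summary["cuda"].append(body)
        elif body.startswith("OpenCL"):
            summary["opencl"].append(body)
    return summary


if __name__ == "__main__":
    for kind, lines in summarize_detection(SAMPLE_LOG).items():
        print(kind, len(lines))
        for entry in lines:
            print("   ", entry)
```

If the "opencl" group only contains a "Not detected" line while GPUs are listed, the client sees the hardware on the PCI bus but has no OpenCL platform to run it on, which is the situation in this thread.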
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 2:18 pm
by tmontney
Running FAHClient --lspci generates blank data:
Code:
4:17:42:INFO(1):Read GPUs.txt
VendorID:DeviceID:PCI Bus:PCI Slot:PCI function:Vendor Name:Description
0x8086:0x3e1f:0:0:0:Intel Corporation:
0x8086:0x1901:0:1:0:Intel Corporation:
0x8086:0x3e91:0:2:0:Intel Corporation:
0x8086:0xa379:0:18:0:Intel Corporation:
0x8086:0xa36d:0:20:0:Intel Corporation:
0x8086:0xa36f:0:20:2:Intel Corporation:
0x8086:0xa360:0:22:0:Intel Corporation:
0x8086:0xa352:0:23:0:Intel Corporation:
0x8086:0xa33c:0:28:0:Intel Corporation:
0x8086:0xa33e:0:28:6:Intel Corporation:
0x8086:0xa332:0:29:0:Intel Corporation:
0x8086:0xa308:0:31:0:Intel Corporation:
0x8086:0xa348:0:31:3:Intel Corporation:
0x8086:0xa323:0:31:4:Intel Corporation:
0x8086:0xa324:0:31:5:Intel Corporation:
0x8086:0x15bc:0:31:6:Intel Corporation:
0x10de:0x17c8:1:0:0:NVIDIA Corporation:
0x10de:0x0fb0:1:0:1:NVIDIA Corporation:
0x10de:0x1b81:2:0:0:NVIDIA Corporation:
0x10de:0x10f0:2:0:1:NVIDIA Corporation:
0x1022:0x1470:3:0:0:Advanced Micro Devices, Inc. [AMD]:
0x1022:0x1471:4:0:0:Advanced Micro Devices, Inc. [AMD]:
0x1002:0x687f:5:0:0:Advanced Micro Devices, Inc. [AMD/ATI]:
0x1002:0xaaf8:5:0:1:Advanced Micro Devices, Inc. [AMD/ATI]:
0x8086:0x2526:6:0:0:Intel Corporation:
It's also bound to /sys/bus/pci/drivers/amdgpu.
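The description column is empty, but the vendor and device IDs are still usable. As a rough sketch, the listing can be filtered down to the GPU entries in Python. This assumes the colon-separated column order shown in the header line, that 0x10de and 0x1002 are the only GPU vendor IDs of interest here, and that PCI function 0 is the display function (function 1 on these cards appears to be the audio device). The names `find_gpus` and `SAMPLE_LSPCI` are invented for the example.

```python
# A subset of the FAHClient --lspci output pasted above.
SAMPLE_LSPCI = """\
VendorID:DeviceID:PCI Bus:PCI Slot:PCI function:Vendor Name:Description
0x8086:0x3e91:0:2:0:Intel Corporation:
0x10de:0x17c8:1:0:0:NVIDIA Corporation:
0x10de:0x0fb0:1:0:1:NVIDIA Corporation:
0x10de:0x1b81:2:0:0:NVIDIA Corporation:
0x10de:0x10f0:2:0:1:NVIDIA Corporation:
0x1022:0x1470:3:0:0:Advanced Micro Devices, Inc. [AMD]:
0x1022:0x1471:4:0:0:Advanced Micro Devices, Inc. [AMD]:
0x1002:0x687f:5:0:0:Advanced Micro Devices, Inc. [AMD/ATI]:
0x1002:0xaaf8:5:0:1:Advanced Micro Devices, Inc. [AMD/ATI]:
"""

# Vendor IDs as they appear in the listing above.
GPU_VENDORS = {"0x10de": "NVIDIA", "0x1002": "AMD/ATI"}


def find_gpus(lspci_text):
    """Return one dict per GPU-vendor device at PCI function 0."""
    gpus = []
    for line in lspci_text.splitlines():
        parts = line.strip().split(":")
        # Skip the header, log lines, and anything too short to parse.
        if len(parts) < 5 or not parts[0].startswith("0x"):
            continue
        vendor, device = parts[0].lower(), parts[1].lower()
        bus, slot, func = int(parts[2]), int(parts[3]), int(parts[4])
        # Assumption: function 0 is the display function of the card.
        if vendor in GPU_VENDORS and func == 0:
            gpus.append({
                "vendor_name": GPU_VENDORS[vendor],
                "device": device,
                "bus": bus,
                "slot": slot,
            })
    return gpus


if __name__ == "__main__":
    for gpu in find_gpus(SAMPLE_LSPCI):
        print(gpu)
```

For the sample above this leaves three entries, on buses 1, 2 and 5, which lines up with the "GPU 0/1/2" bus numbers in the client log. So the client finds the Vega on the bus; what is missing is an OpenCL device to match it to.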
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 7:40 pm
by bruce
Since you've already tried more than one OpenCL driver, found it wanting, and removed it, there's one more thing you can try. Go to www.khronos.org and download their OpenCL drivers. That might work. In either case, report back.
Re: Struggling to fold on my Vega 56
Posted: Fri Aug 23, 2019 9:26 pm
by tmontney
Thanks. I'll try that. If I can't get this working eventually, I'm just going to ditch the card in favor of another NVIDIA. No reason to torture myself.
Re: Struggling to fold on my Vega 56
Posted: Sun Aug 25, 2019 10:05 am
by toTOW
Mixing AMD and NV cards in the same machine is often a bad idea when it comes to OpenCL drivers ...
Re: Struggling to fold on my Vega 56
Posted: Sun Aug 25, 2019 7:25 pm
by bruce
toTOW wrote:Mixing AMD and NV cards in the same machine is often a bad idea when it comes to OpenCL drivers ...
"often" is the operative word here.
It is possible to make them work together, but it's not a simple process.
Re: Struggling to fold on my Vega 56
Posted: Mon Aug 26, 2019 3:55 pm
by tmontney
True, but it's clearly possible. I would assume if you can get AMD and NVIDIA to work together, it shouldn't really matter which model you were using. I guess I'll find out tho.
If I do get it to work, I'll write up my steps. Should be pinned somewhere.
toTOW wrote:Mixing AMD and NV cards in the same machine is often a bad idea when it comes to OpenCL drivers ...
I swear I'd seen it previously, but I'll admit, I didn't do much research. I'm definitely considering swapping this for a comparable NVIDIA card. It's just nice to have options. If I see a card on sale, I can buy it regardless of whether it's NVIDIA or AMD.
Re: Struggling to fold on my Vega 56
Posted: Mon Aug 26, 2019 4:25 pm
by tmontney
bruce wrote:Since you've already tried more than one OpenCL driver, found it wanting, and removed it, there's one more thing you can try. Go to http://www.khronos.org and download their OpenCL drivers. That might work. In either case, report back.
Don't suppose you have any tips on this? I'm stuck after building. What do I do with the build folder? How do I link up so the OS/FAH knows where OpenCL is?
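On the "how does the OS know where OpenCL is" part: on Linux the OpenCL ICD loader conventionally finds vendor drivers through small text files in /etc/OpenCL/vendors, each naming the vendor library to load. A hedged Python sketch for listing what is registered there is below. The function name is invented for the example, and the default directory is the usual convention, which a particular loader build could override.

```python
from pathlib import Path


def list_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the library named on its first line.

    Assumption: vendors_dir is the directory the installed ICD loader
    actually reads; /etc/OpenCL/vendors is the common default on Linux.
    """
    root = Path(vendors_dir)
    if not root.is_dir():
        return {}
    found = {}
    for icd in sorted(root.glob("*.icd")):
        lines = icd.read_text(errors="replace").strip().splitlines()
        found[icd.name] = lines[0].strip() if lines else ""
    return found


if __name__ == "__main__":
    icds = list_icds()
    if not icds:
        print("No ICD files found; no OpenCL vendor driver is registered.")
    for name, library in icds.items():
        print(f"{name}: {library}")
```

If only an NVIDIA entry shows up, the AMD OpenCL driver has not registered itself, which would be consistent with clinfo not listing the Vega.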
Re: Struggling to fold on my Vega 56
Posted: Tue Aug 27, 2019 4:44 pm
by foldy
Guess this is not a Khronos issue. The OpenCL interface is already installed, since the NVIDIA GPUs are detected now. You need to install the AMD driver:
https://www.amd.com/de/support/kb/relea ... stallation
https://drivers.amd.com/drivers/linux/a ... .04.tar.xz
Maybe you also need to download the FAH GPUs.txt manually if it is not up to date:
https://apps.foldingathome.org/GPUs.txt