
Possible problem adding second graphics card

Posted: Wed Feb 15, 2017 11:15 pm
by CeeVee
I added a GTX1070 as a second graphics card today, the first was a GTX1060.
I seem to be folding OK but there are two possible issues that I've noticed.
The first, as in the log below, seems to show 3 GPUs, with the third as an unknown.
When I added the GPU I simply went to configure -> slots and added a GPU, leaving the index as -1.
I did add a next-unit-percentage 100 to the slot but that was the only amendment I made.
The second issue is that the FAH client seems to be displaying the details for the two GPU slots in the wrong order on the 'status' pane.
The numbers for slot id 01 look like the 1070 and the numbers for slot id 02 look like the 1060 (based on the estimated PPD for each slot).
I'm running on Linux with Nvidia driver 370.28 installed.
As I said, I seem to be folding OK. Should I be concerned with either of these issues?




Code:

*********************** Log Started 2017-02-15T14:18:25Z ***********************
14:18:25:************************* Folding@home Client *************************
14:18:25:    Website: http://folding.stanford.edu/
14:18:25:  Copyright: (c) 2009-2014 Stanford University
14:18:25:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:18:25:       Args: --child --lifeline 1595 /etc/fahclient/config.xml --run-as
14:18:25:             fahclient --pid-file=/var/run/fahclient.pid --daemon
14:18:25:     Config: /etc/fahclient/config.xml
14:18:25:******************************** Build ********************************
14:18:25:    Version: 7.4.4
14:18:25:       Date: Mar 4 2014
14:18:25:       Time: 12:02:38
14:18:25:    SVN Rev: 4130
14:18:25:     Branch: fah/trunk/client
14:18:25:   Compiler: GNU 4.4.7
14:18:25:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
14:18:25:             -fno-unsafe-math-optimizations -msse2
14:18:25:   Platform: linux2 3.2.0-1-amd64
14:18:25:       Bits: 64
14:18:25:       Mode: Release
14:18:25:******************************* System ********************************
14:18:25:        CPU: AMD Phenom(tm) II X4 905e Processor
14:18:25:     CPU ID: AuthenticAMD Family 16 Model 4 Stepping 2
14:18:25:       CPUs: 4
14:18:25:     Memory: 7.71GiB
14:18:25:Free Memory: 7.27GiB
14:18:25:    Threads: POSIX_THREADS
14:18:25: OS Version: 4.4
14:18:25:Has Battery: false
14:18:25: On Battery: false
14:18:25: UTC Offset: 0
14:18:25:        PID: 1597
14:18:25:        CWD: /var/lib/fahclient
14:18:25:         OS: Linux 4.4.0-21-generic x86_64
14:18:25:    OS Arch: AMD64
14:18:25:       GPUs: 3
14:18:25:      GPU 0: NVIDIA:5 GP106 [GeForce GTX 1060 3GB]
14:18:25:      GPU 1: NVIDIA:5 GP104 [GeForce GTX 1070]
14:18:25:      GPU 2: UNSUPPORTED: NV3 [PCI]
14:18:25:       CUDA: 6.1
14:18:25:CUDA Driver: 8000
14:18:25:***********************************************************************
14:18:25:<config>
14:18:25:  <!-- Client Control -->
14:18:25:  <fold-anon v='true'/>
14:18:25:
14:18:25:  <!-- HTTP Server -->
14:18:25:  <allow v='127.0.0.1 192.168.1.51'/>
14:18:25:
14:18:25:  <!-- Network -->
14:18:25:  <proxy v=':8080'/>
14:18:25:
14:18:25:  <!-- Remote Command Server -->
14:18:25:  <command-allow-no-pass v='127.0.0.1 192.168.1.51'/>
14:18:25:
14:18:25:  <!-- Slot Control -->
14:18:25:  <power v='full'/>
14:18:25:
14:18:25:  <!-- User Information -->
14:18:25:  <passkey v='********************************'/>
14:18:25:  <team v='232392'/>
14:18:25:  <user v='Lion'/>
14:18:25:
14:18:25:  <!-- Folding Slots -->
14:18:25:  <slot id='0' type='CPU'>
14:18:25:    <next-unit-percentage v='100'/>
14:18:25:  </slot>
14:18:25:  <slot id='1' type='GPU'>
14:18:25:    <next-unit-percentage v='100'/>
14:18:25:  </slot>
14:18:25:</config>
14:18:25:Switching to user fahclient
14:18:25:Trying to access database...
14:18:25:Successfully acquired database lock
14:18:25:Enabled folding slot 00: READY cpu:3
14:18:25:Enabled folding slot 01: READY gpu:0:GP106 [GeForce GTX 1060 3GB]
14:18:25:WU01:FS01:Starting
14:18:25:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1597 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
14:18:25:WU01:FS01:Started FahCore on PID 1607
14:18:25:WU01:FS01:Core PID:1611
14:18:25:WU01:FS01:FahCore 0x21 started
14:18:25:WU00:FS00:Starting
14:18:25:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 1597 -checkpoint 15 -np 3
14:18:25:WU00:FS00:Started FahCore on PID 1614
14:18:25:WU00:FS00:Core PID:1618
14:18:25:WU00:FS00:FahCore 0xa4 started
14:18:25:WU01:FS01:0x21:*********************** Log Started 2017-02-15T14:18:25Z ***********************
14:18:25:WU01:FS01:0x21:Project: 11403 (Run 7, Clone 42, Gen 132)
14:18:25:WU01:FS01:0x21:Unit: 0x000000c28ca304f255ed4f7e04b4aceb
14:18:25:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
14:18:25:WU01:FS01:0x21:Machine: 1
14:18:25:WU01:FS01:0x21:Digital signatures verified
14:18:25:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
14:18:25:WU01:FS01:0x21:Version 0.0.18
14:18:25:WU01:FS01:0x21:  Found a checkpoint file
14:18:26:WU00:FS00:0xa4:
14:18:26:WU00:FS00:0xa4:*------------------------------*
14:18:26:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
14:18:26:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
14:18:26:WU00:FS00:0xa4:
14:18:26:WU00:FS00:0xa4:Preparing to commence simulation
14:18:26:WU00:FS00:0xa4:- Looking at optimizations...
14:18:26:WU00:FS00:0xa4:- Files status OK
14:18:26:WU00:FS00:0xa4:- Expanded 1943133 -> 6219524 (decompressed 320.0 percent)
14:18:26:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1943133 data_size=6219524, decompressed_data_size=6219524 diff=0
14:18:26:WU00:FS00:0xa4:- Digital signature verified
14:18:26:WU00:FS00:0xa4:
14:18:26:WU00:FS00:0xa4:Project: 11659 (Run 64, Clone 0, Gen 65)
14:18:26:WU00:FS00:0xa4:
14:18:26:WU00:FS00:0xa4:Assembly optimizations on if available.
14:18:26:WU00:FS00:0xa4:Entering M.D.
14:18:32:WU00:FS00:0xa4:Using Gromacs checkpoints
14:18:33:WU00:FS00:0xa4:Resuming from checkpoint
14:18:33:WU00:FS00:0xa4:Verified 00/wudata_01.log
14:18:33:WU00:FS00:0xa4:Verified 00/wudata_01.trr
14:18:33:WU00:FS00:0xa4:Verified 00/wudata_01.xtc
14:18:33:WU00:FS00:0xa4:Verified 00/wudata_01.edr
14:18:33:WU00:FS00:0xa4:Completed 716660 out of 1250000 steps  (57%)
14:18:59:WU01:FS01:0x21:Completed 4125000 out of 5000000 steps (82%)
14:18:59:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
14:20:19:WU01:FS01:0x21:Completed 4150000 out of 5000000 steps (83%)
14:22:42:WU01:FS01:0x21:Completed 4200000 out of 5000000 steps (84%)
14:23:20:Adding folding slot 02: READY gpu:1:GP104 [GeForce GTX 1070]
14:23:20:Saving configuration to /etc/fahclient/config.xml
14:23:20:<config>
14:23:20:  <!-- Client Control -->
14:23:20:  <fold-anon v='true'/>
14:23:20:
14:23:20:  <!-- HTTP Server -->
14:23:20:  <allow v='127.0.0.1 192.168.1.51'/>
14:23:20:
14:23:20:  <!-- Network -->
14:23:20:  <proxy v=':8080'/>
14:23:20:
14:23:20:  <!-- Remote Command Server -->
14:23:20:  <command-allow-no-pass v='127.0.0.1 192.168.1.51'/>
14:23:20:
14:23:20:  <!-- Slot Control -->
14:23:20:  <power v='full'/>
14:23:20:
14:23:20:  <!-- User Information -->
14:23:20:  <passkey v='********************************'/>
14:23:20:  <team v='232392'/>
14:23:20:  <user v='Lion'/>
14:23:20:
14:23:20:  <!-- Folding Slots -->
14:23:20:  <slot id='0' type='CPU'>
14:23:20:    <next-unit-percentage v='100'/>
14:23:20:  </slot>
14:23:20:  <slot id='1' type='GPU'>
14:23:20:    <next-unit-percentage v='100'/>
14:23:20:  </slot>
14:23:20:  <slot id='2' type='GPU'>
14:23:20:    <next-unit-percentage v='100'/>
14:23:20:  </slot>
14:23:20:</config>
14:23:20:FS00:Shutting core down
14:23:21:WU02:FS02:Connecting to 171.67.108.45:80
14:23:21:WU02:FS02:Assigned to work server 140.163.4.243
14:23:21:WU02:FS02:Requesting new work unit for slot 02: READY gpu:1:GP104 [GeForce GTX 1070] from 140.163.4.243
14:23:21:WU02:FS02:Connecting to 140.163.4.243:8080
14:23:21:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
14:23:22:WU02:FS02:Downloading 2.67MiB
14:23:22:WU00:FS00:Starting
14:23:22:WARNING:WU00:FS00:Changed SMP threads from 3 to 2 this can cause some work units to fail
Mod edit: added Code tags to log file

Re: Possible problem adding second graphics card

Posted: Thu Feb 16, 2017 1:26 am
by SteveWillis
I'm sure others know more about this, but:
From what I've read, the extra GPU is actually the audio function of your card, which somehow shows up as another card.
I've had the same, very annoying, problem with the details of different slots being mixed up. I think that is fixed in the beta version of fahcontrol; I don't see the problem on my rig using it.
Neither issue is anything to worry about.

Re: Possible problem adding second graphics card

Posted: Thu Feb 16, 2017 2:22 am
by JimboPalmer
Over time, I have heard many people claim that if they re-install the software after they add new hardware, it finds the new hardware much more reliably.

(I fold on laptops, so I never add new hardware myself)

Re: Possible problem adding second graphics card

Posted: Thu Feb 16, 2017 2:59 am
by ChristianVirtual
You can also try the new beta version of FAH, 7.4.16, which has improved hardware detection, especially for your case...

Re: Possible problem adding second graphics card

Posted: Thu Feb 16, 2017 8:36 am
by FldngForGrandparents
What I have done with this same issue was to delete the /var/lib/fahclient/work directory and restart the app. You will lose all your current work but it will download the work and associate it with the correct GPU. This seemed to work each time.

Re: Possible problem adding second graphics card

Posted: Thu Feb 16, 2017 11:59 am
by ChristianVirtual
FldngForGrandparents wrote:What I have done with this same issue was to delete the /var/lib/fahclient/work directory and restart the app. You will lose all your current work but it will download the work and associate it with the correct GPU. This seemed to work each time.
That's a rather unfriendly approach, as it leaves the WUs to run into their deadlines. It might also harm your eligibility for the Quick Return Bonus.

It is better to set all slots to finish and wait until they are all done, or at least to delete the slot properly so the server knows you will not continue. The WU then gets reassigned faster, with less delay.

Once slots are empty you could delete the folder you mentioned or just reinstall.

Re: Possible problem adding second graphics card

Posted: Fri Feb 17, 2017 1:09 am
by Leonardo
Over time, I have heard many people claim that if they re-install the software after they add new hardware, it finds the new hardware much more reliably.
Yes, but also, if the FAH reinstallation does not provide the solution, perform a clean reinstallation of the Nvidia GPU drivers.

Re: Possible problem adding second graphics card

Posted: Sat Feb 18, 2017 7:58 pm
by bruce
On my NON-laptop systems, I've added/relocated/removed a number of GPUs. I've never found a good reason to delete the Work folder on either Windows or Linux.

Code:

14:18:25:      GPU 0: NVIDIA:5 GP106 [GeForce GTX 1060 3GB]
14:18:25:      GPU 1: NVIDIA:5 GP104 [GeForce GTX 1070]
14:18:25:      GPU 2: UNSUPPORTED: NV3 [PCI]
Case 1: Adding a GPU.
GPUs are categorized into groups of similar types. In your case, you have two GPUs that are recognized as "NVIDIA:5". Suppose the existing slot was running a WU when you added the second GPU. Even if the in-process WU resumed on the new GPU and the newly created empty slot ended up associated with the earlier GPU, it wouldn't matter, because as far as FAH is concerned the two GPUs are interchangeable. (Speed might be different, but processing capabilities are the same.)
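
As a rough illustration of that grouping, the Python sketch below collects the GPU lines from the posted log by their type label (the `NVIDIA:5` part). The regular expression and the function name `group_gpus_by_class` are assumptions made for this example; this is not how FAHClient itself does the classification.

```python
import re
from collections import defaultdict

# Matches client log lines such as:
#   14:18:25:      GPU 0: NVIDIA:5 GP106 [GeForce GTX 1060 3GB]
# Lines like "GPU 2: UNSUPPORTED: NV3 [PCI]" carry no "<vendor>:<number>"
# label and are therefore skipped.
GPU_LINE = re.compile(r"GPU (\d+): (\S+:\d+) (.+)$")


def group_gpus_by_class(log_lines):
    """Return {type label: [(gpu index, description), ...]}."""
    groups = defaultdict(list)
    for line in log_lines:
        match = GPU_LINE.search(line)
        if match:
            index, gpu_class, description = match.groups()
            groups[gpu_class].append((int(index), description))
    return dict(groups)


log = [
    "14:18:25:      GPU 0: NVIDIA:5 GP106 [GeForce GTX 1060 3GB]",
    "14:18:25:      GPU 1: NVIDIA:5 GP104 [GeForce GTX 1070]",
    "14:18:25:      GPU 2: UNSUPPORTED: NV3 [PCI]",
]
groups = group_gpus_by_class(log)
for gpu_class, members in groups.items():
    print(gpu_class, members)
```

Both cards land under the same `NVIDIA:5` key, which is the sense in which they are interchangeable here.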

Case 2: Removing a GPU.
Now suppose you remove one GPU and restart the client. You'll have two slots and probably two WUs. FAHClient will recognize that one slot and its WU no longer have any hardware to process that WU, so it has to make a choice: delete the WU or leave it to be processed. Since your two GPUs are similar, it enqueues the "extra" WU on the existing slot and resumes processing one of them, proceeding to the other when it finishes the first. If one of your GPUs happened to be an "NVIDIA 4" or an "ATI 5", you would potentially have a WU which your hardware couldn't process, so the only option is to delete the WU and report it as dumped. (If I have a situation like that, I use FINISH for the slot of the GPU I'm planning to remove.)
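
The choice described above can be modeled in a few lines of Python. This is only a sketch of the behaviour as described in this post, not FAHClient's actual code; the function name `handle_orphaned_wu` and the colon-style labels such as `NVIDIA:4` are assumptions for the example.

```python
def handle_orphaned_wu(wu_class, remaining_gpu_classes):
    """Decide what happens to a WU whose GPU has been removed.

    wu_class: type label of the GPU the WU was assigned to, e.g. "NVIDIA:5".
    remaining_gpu_classes: type labels of the GPUs still installed.
    """
    if wu_class in remaining_gpu_classes:
        # A similar GPU is still present, so the WU can wait its turn there.
        return "enqueue"
    # No remaining hardware can process this WU.
    return "dump"


print(handle_orphaned_wu("NVIDIA:5", ["NVIDIA:5"]))  # similar GPU remains
print(handle_orphaned_wu("NVIDIA:4", ["NVIDIA:5"]))  # no compatible GPU left
```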

There are a lot of possibilities that may not be handled properly. Hardware changes are relatively rare and it's certain that somebody can find a case that hasn't been considered during the design and testing of FAHClient. Ordinarily, I PAUSE all GPU processing when I make hardware changes and then proceed carefully.

The first time FAHClient is run, it automatically creates slots and then proceeds to use them. It doesn't do that (again) when you restart FAH. Manually adding a slot (as you have done) does a pretty good job of accomplishing the same thing.

It's possible that your UNSUPPORTED GPU is the sound-card functionality of your GPU, but it's more likely that it's the embedded GPU associated with your CPU, which can be used for a display but not for OpenCL or CUDA. Post the output from FAHClient --lspci | grep NVIDIA.

Re: Possible problem adding second graphics card

Posted: Sun Feb 19, 2017 4:39 am
by CeeVee
Output from FAHClient --lspci | grep NVIDIA:
0x10de:0x1c02:NVIDIA Corporation:
0x10de:0x10f1:NVIDIA Corporation:
0x10de:0x1b81:NVIDIA Corporation:
0x10de:0x10f0:NVIDIA Corporation:NV3 [PCI]
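
For anyone comparing this kind of output across machines, here is a small Python sketch that splits each line into its fields. It assumes the colon-separated layout seen above (vendor ID, device ID, vendor name, optional description); the names `PciEntry` and `parse_lspci_line` are made up for this example.

```python
from typing import NamedTuple


class PciEntry(NamedTuple):
    vendor_id: int
    device_id: int
    vendor: str
    description: str  # empty when the line has nothing after the last colon


def parse_lspci_line(line):
    """Split one 'vendor:device:vendor name:description' line into fields."""
    vendor_id, device_id, vendor, description = line.strip().split(":", 3)
    return PciEntry(int(vendor_id, 16), int(device_id, 16), vendor, description)


output = """\
0x10de:0x1c02:NVIDIA Corporation:
0x10de:0x10f1:NVIDIA Corporation:
0x10de:0x1b81:NVIDIA Corporation:
0x10de:0x10f0:NVIDIA Corporation:NV3 [PCI]
"""
entries = [parse_lspci_line(line) for line in output.splitlines()]

# Only entries that carry a description are listed here.
for entry in entries:
    if entry.description:
        print(f"{entry.vendor_id:#06x}:{entry.device_id:#06x} {entry.description}")
```

With the four lines above, only the `0x10f0` device carries a description, and it is the same `NV3 [PCI]` label that the log shows for the unsupported "GPU 2".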

The CPU I've got on this system is an AMD FX-8320E, which I'm fairly sure doesn't come with an inbuilt GPU.
The motherboard is an Asus Sabertooth which again doesn't have any onboard graphics.
I'm folding on both GPUs and the CPU, getting circa 1 million PPD, so the issue doesn't seem to be slowing anything down.
I've got a set of planned upgrades covering the next year or so, which means I'll be adding GPUs to various machines that are already running FAH on CPUs only. I guess I'm going to be the outlier in the use-cases <G>.

Re: Possible problem adding second graphics card

Posted: Mon Feb 20, 2017 6:13 pm
by 7im
This device was causing some hardware detection issues in an earlier client version, and so was purposely blacklisted. That's the only reason it shows up as a GPU, even though it's an Audio Controller. I think it is safe to remove it from the list completely.

10f0 GP104 High Definition Audio Controller

But it's not causing any install issues, other than a little confusion. FYI, one should ALWAYS reinstall the client when changing hardware configs, to make sure the full hardware autodetection runs against the changed hardware. Run the new beta version; it has much better GPU hardware detection than any previous version, IMO. GPUs.txt file updated and posted.

Re: Possible problem adding second graphics card

Posted: Wed Feb 22, 2017 1:35 am
by CeeVee
If I re-install the client, won't I lose any part-finished WUs though? I thought that was a big no-no, as the WUs then go into limbo until they hit their end-date.

Re: Possible problem adding second graphics card

Posted: Wed Feb 22, 2017 5:45 pm
by 7im
CeeVee wrote:If I re-install the client, won't I lose any part-finished WUs though? I thought that was a big no-no, as the WUs then go into limbo until they hit their end-date.
Reinstalling doesn't specifically mess with the work unit, but if the slots move around, it can potentially cause a problem. It is always safer to "finish" a work unit, then upgrade.

To be fair, losing one work unit to add a new, faster GPU is probably acceptable to most people, on balance. But it is not one of the "best practices" :twisted:

Re: Possible problem adding second graphics card

Posted: Wed Feb 22, 2017 6:02 pm
by bruce
I've lost an occasional WU when I made changes to my system but not from reinstalling. If I've removed or added a GPU or something like that, there are going to be configuration changes and I'm careful about incomplete WUs -- either by setting them to FINISH or by temporarily adding a "pause-on-start" setting of true. After the client is restarted, I examine the slot configurations before starting to fold. Then I remove the pause-on-start.
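
As a sketch of the temporary "pause-on-start" step, the Python below adds or removes that option in a config.xml string. It assumes the option is written in the same `<name v='value'/>` style as the other top-level settings in the posted config (for example `<power v='full'/>`); the helper name `set_pause_on_start` is made up for this example, and the client would presumably need to be stopped while the file is edited.

```python
import xml.etree.ElementTree as ET


def set_pause_on_start(config_xml, enabled):
    """Return config_xml with <pause-on-start v='true'/> added or removed."""
    root = ET.fromstring(config_xml)
    # Drop any existing setting so the option never appears twice.
    for existing in root.findall("pause-on-start"):
        root.remove(existing)
    if enabled:
        ET.SubElement(root, "pause-on-start", {"v": "true"})
    return ET.tostring(root, encoding="unicode")


# Trimmed-down stand-in for /etc/fahclient/config.xml.
config = "<config><power v='full'/><slot id='1' type='GPU'/></config>"

paused = set_pause_on_start(config, True)     # before the hardware change
resumed = set_pause_on_start(paused, False)   # after checking the slots
print(paused)
print(resumed)
```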

I test every beta configuration so I run into situations that others probably do not. In rare instances, upgrading from one version to another upgrades the client's data structures. When it does that, if something goes wrong and I want to go back to an earlier version, the partly completed WUs cannot be migrated back to that earlier version so I have to finish them. (Anyway, you won't be attempting that.)

If I REMOVE a GPU, my system probably won't be able to resume work on that WU so I always finish whatever the soon-to-be-removed GPU is working on.