FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 4:27 pm
by PortableNuke
I've recently installed FAH on a Dell Precision T3610 with dual nVidia Quadro K600 graphics cards and DirectX 11.

When the folding client starts (either because it detects that the system is idle or because I tell it to run all the time), the CPU core process runs without a problem but the FahCore_21 processes for the GPUs immediately terminate (with what appears to be a heap corruption error). Similarly, if I run the FahBench app, it terminates in the same way.

Interestingly, if I set up the vsjitdebugger to intercede when it detects a crash and I then attach Visual Studio and tell it to continue execution of the app, it seems to work OK - it does throw some bad_cast exceptions and an OpenMMException exception but the app runs after all of that.

I've seen in other posts that there have been issues with the nVidia drivers before (my system currently has version 377.11 installed) and that when folk rolled back to an earlier version of the drivers the problem would disappear, so I'm wondering if that approach might work here. My only problem is that I have no idea which driver version I should roll back to.

Does anyone have any suggestions?

P.S. Thanks for any replies.

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 4:32 pm
by bruce
I think this may help: viewtopic.php?f=24&t=29633

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 4:59 pm
by PortableNuke
bruce wrote:I think this may help: viewtopic.php?f=24&t=29633
Unfortunately not. It seems that I'm currently running Core 0.0.18 and it's not resolving the issue:
  • 12:30:28:WU00:FS02:Starting
    12:30:28:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/pamateer/AppData/Roaming/FAHClient/cores/fahwebx.stanford.edu/cores/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 7944 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
    12:30:28:WU00:FS02:Started FahCore on PID 4308
    12:30:28:WU00:FS02:Core PID:7248
    12:30:28:WU00:FS02:FahCore 0x21 started
    12:30:29:WU00:FS02:0x21:*********************** Log Started 2017-04-03T12:30:28Z ***********************
    12:30:29:WU00:FS02:0x21:Project: 11418 (Run 2, Clone 238, Gen 24)
    12:30:29:WU00:FS02:0x21:Unit: 0x000000218ca304f158af65042dae5a1a
    12:30:29:WU00:FS02:0x21:CPU: 0x00000000000000000000000000000000
    12:30:29:WU00:FS02:0x21:Machine: 2
    12:30:29:WU00:FS02:0x21:Reading tar file core.xml
    12:30:29:WU00:FS02:0x21:Reading tar file system.xml
    12:30:29:WU00:FS02:0x21:Reading tar file integrator.xml
    12:30:29:WU00:FS02:0x21:Reading tar file state.xml
    12:30:29:WARNING:WU02:FS01:FahCore returned: FAILED_1 (0 = 0x0)
    12:30:29:WU02:FS01:Starting
    12:30:29:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/pamateer/AppData/Roaming/FAHClient/cores/fahwebx.stanford.edu/cores/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 02 -suffix 01 -version 704 -lifeline 7944 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
    12:30:29:WU02:FS01:Started FahCore on PID 5352
    12:30:29:WU02:FS01:Core PID:6680
    12:30:29:WU02:FS01:FahCore 0x21 started
    12:30:29:WU04:FS00:0xa4:Using Gromacs checkpoints
    12:30:29:WU04:FS00:0xa4:Mapping NT from 10 to 10
    12:30:30:WU00:FS02:0x21:Digital signatures verified
    12:30:30:WU00:FS02:0x21:Folding@home GPU Core21 Folding@home Core
    12:30:30:WU00:FS02:0x21:Version 0.0.18
    12:30:30:WU02:FS01:0x21:*********************** Log Started 2017-04-03T12:30:29Z ***********************
    12:30:30:WU02:FS01:0x21:Project: 11402 (Run 4, Clone 19, Gen 445)
    12:30:30:WU02:FS01:0x21:Unit: 0x000002628ca304f255ed4e6e906835e8
    12:30:30:WU02:FS01:0x21:CPU: 0x00000000000000000000000000000000
    12:30:30:WU02:FS01:0x21:Machine: 1
    12:30:30:WU02:FS01:0x21:Digital signatures verified
    12:30:30:WU02:FS01:0x21:Folding@home GPU Core21 Folding@home Core
    12:30:30:WU02:FS01:0x21:Version 0.0.18
    12:30:30:WU04:FS00:0xa4:Resuming from checkpoint
    12:30:30:WU04:FS00:0xa4:Verified 04/wudata_01.log
    12:30:30:WU04:FS00:0xa4:Verified 04/wudata_01.trr
    12:30:30:WU04:FS00:0xa4:Verified 04/wudata_01.xtc
    12:30:30:WU04:FS00:0xa4:Verified 04/wudata_01.edr
    12:30:30:WU04:FS00:0xa4:Completed 218210 out of 1250000 steps (17%)
    12:30:33:WARNING:WU00:FS02:FahCore returned: FAILED_1 (0 = 0x0)
    12:30:34:WARNING:WU02:FS01:FahCore returned: FAILED_1 (0 = 0x0)
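A log like the one above can be summarised per slot to confirm that only the GPU slots (FS01 and FS02 here) are failing. The following is a minimal sketch, assuming the client log keeps the `HH:MM:SS:WARNING:WUnn:FSnn:FahCore returned: ...` line format seen in this thread; the function name `count_core_failures` is hypothetical and not part of FAHClient.

```python
import re
from collections import Counter

# Matches client log lines such as:
#   12:30:33:WARNING:WU00:FS02:FahCore returned: FAILED_1 (0 = 0x0)
# The format is taken from the log in this thread, not from a documented spec.
FAILURE_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}):WARNING:(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"FahCore returned: (?P<result>\S+)"
)


def count_core_failures(log_text):
    """Return a Counter mapping (slot, result) to the number of occurrences."""
    failures = Counter()
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            failures[(match["slot"], match["result"])] += 1
    return failures


if __name__ == "__main__":
    sample = """\
12:30:29:WARNING:WU02:FS01:FahCore returned: FAILED_1 (0 = 0x0)
12:30:30:WU04:FS00:0xa4:Resuming from checkpoint
12:30:33:WARNING:WU00:FS02:FahCore returned: FAILED_1 (0 = 0x0)
12:30:34:WARNING:WU02:FS01:FahCore returned: FAILED_1 (0 = 0x0)
"""
    for (slot, result), count in sorted(count_core_failures(sample).items()):
        print(f"{slot}: {result} x{count}")
```

Lines from the CPU slot (FS00) never match, so a slot that is absent from the result did not report a core failure in the text scanned.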

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 6:58 pm
by JimboPalmer
Is it important that he is showing a Fermi sub-directory but has a Kepler GPU?

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 8:22 pm
by foldy
If it is 2x Quadro K600 (96 shaders) they are too slow for folding anyway. Or do you mean Quadro K6000?

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 9:57 pm
by PortableNuke
foldy wrote:If it is 2x Quadro K600 (96 shaders) they are too slow for folding anyway. Or do you mean Quadro K6000?
No, I do mean the K600 - it's a workstation rather than a gaming rig, but the way I see it any extra assistance that I can get from them is better than nothing, especially if I'm only using the system 40 hours in every 168.

Of course, they do have to run the software, which isn't happening at the moment.

Does JimboPalmer's reply have any relevance to the problem I'm seeing?
JimboPalmer wrote:Is it important that he is showing a Fermi sub-directory but has a Kepler GPU?

Re: FahCore_21 terminates on startup

Posted: Mon Apr 03, 2017 11:05 pm
by Joe_H
As far as I know, the Fermi subdirectory is generic for Fermi and later GPUs from nVidia, to distinguish them from the pre-Fermi devices. Separate designations for Kepler, Maxwell and Pascal have not been used for the folding core. Information as to GPU type does get passed as part of the WU request, though.

Wikipedia lists the K600 as having 192 cores, so it may be able to fold fast enough. As for drivers, nVidia's support of the Quadros has sometimes been reported to be picky about the versions used. You do need a recent version installed, with the OpenCL and CUDA support installed as well.

Re: FahCore_21 terminates on startup

Posted: Tue Apr 04, 2017 11:26 am
by PortableNuke
Joe_H wrote:As for drivers, nVidia's support of the Quadros sometimes has been reported to be picky about versions used. You do need a recent version installed with the OpenCL and Cuda support installed as well.
Well, as far as nVidia are concerned I have the latest drivers (377.11), and looking at the driver details I can see that both the OpenCL (version 1.2.11) and CUDA (version 6.14.13.7711) drivers are present.

There's definitely something amiss with the OpenCL end of things though, because GPU-Z also crashes on startup, and when restarted it displays a dialog stating that it crashed during OpenCL detection and asks if I would like to re-enable OpenCL detection.

Re: FahCore_21 terminates on startup

Posted: Tue Apr 04, 2017 12:56 pm
by PortableNuke
OK, so I've made a little progress with this.

I downloaded the OpenCL Device Query sample from nVidia, seeing as GPU-Z reported the problem as occurring during device discovery.

The 32-bit version of the sample runs fine and reports the two graphics cards present.
The 64-bit version crashes with a heap corruption error - just as the FahCore_21 and GPU-Z executables do.

I am running 64-bit Windows 7 SP1, so I'd expect both versions of the app to run.

Re: FahCore_21 terminates on startup

Posted: Tue Apr 04, 2017 1:59 pm
by PortableNuke
Good news. I have managed to get the system running.

Once I played around with the x86 and x64 versions of the oclDeviceQuery app, I was able to see that the x86 version was loading nvopencl.dll but the x64 version was loading IntelOpenCL64.dll. Some further digging revealed two entries under the [HKLM]\SOFTWARE\Khronos\OpenCL\Vendors key, as follows:
C:\Windows\System32\nvopencl.dll = DWORD:00000000
IntelOpenCL64.dll = DWORD:00000000

Deleting the second, Intel-related entry (and the related entry under Wow6432Node) solved the problem, and all three cores (CPU and 2x GPU) now run OK.
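For anyone wanting to inspect the same keys before editing the registry, here is a minimal read-only sketch using Python's standard `winreg` module. It assumes the two key paths named above and, as I understand the OpenCL ICD loader convention, that a DWORD of 0 marks an entry as enabled. The helper names `read_vendor_entries` and `summarize` are hypothetical. The script only lists entries; it does not delete anything.

```python
import ntpath

try:
    import winreg  # standard library, but only available on Windows
except ImportError:
    winreg = None

# Keys seen by 64-bit and 32-bit processes respectively (paths from this thread).
VENDOR_KEYS = (
    r"SOFTWARE\Khronos\OpenCL\Vendors",
    r"SOFTWARE\Wow6432Node\Khronos\OpenCL\Vendors",
)


def read_vendor_entries(subkey):
    """Return [(value_name, data), ...] for one Vendors key (Windows only)."""
    entries = []
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0, access) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            entries.append((name, data))
    return entries


def summarize(entries):
    """Describe each entry: the DLL, whether it is enabled (assumed: DWORD 0),
    and whether the value name is a full path or a bare file name."""
    return [
        {"dll": name, "enabled": data == 0, "full_path": ntpath.isabs(name)}
        for name, data in entries
    ]


if __name__ == "__main__":
    if winreg is None:
        print("winreg is unavailable; this listing only works on Windows.")
    else:
        for subkey in VENDOR_KEYS:
            print(f"HKLM\\{subkey}")
            try:
                rows = summarize(read_vendor_entries(subkey))
            except FileNotFoundError:
                print("  (key not present)")
                continue
            for row in rows:
                print(f"  {row['dll']}  enabled={row['enabled']}  "
                      f"full_path={row['full_path']}")
```

On the system described here, this would be expected to show the nVidia entry with a full path and the Intel entry as a bare file name. Whether that difference is related to the crash is not established in this thread; the output is only a starting point for deciding which entry to remove.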

Admittedly the GPUs are not what you'd call stellar in terms of processing as foldy predicted, but at least they're working now! :)

Re: FahCore_21 terminates on startup

Posted: Tue Apr 04, 2017 4:17 pm
by bruce
PortableNuke wrote:Unfortunately not. It seems that I'm currently running Core 0.0.18 and it's not resolving the issue
Obviously running the Intel OpenCL was your problem. You can choose to force an update or you can wait for a WU that requires the latest version of core_21.

To force the update, pause all GPU slots.
Delete (or rename) C:/Users/pamateer/AppData/Roaming/FAHClient/cores/fahwebx.stanford.edu/cores/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe.
Start ONE GPU and wait for the new version to download before starting the other one(s).
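The rename step above could be scripted roughly as follows. This is a sketch only: the `.old` suffix and the function name `set_aside_core` are my own choices, and it assumes all GPU slots have already been paused so that the file is not in use.

```python
import sys
from pathlib import Path


def set_aside_core(core_exe):
    """Rename the core executable to <name>.old and return the new path.

    The client should then fetch a fresh copy when a GPU slot next starts.
    """
    core_exe = Path(core_exe)
    if not core_exe.is_file():
        raise FileNotFoundError(f"No core executable at {core_exe}")
    backup = core_exe.with_name(core_exe.name + ".old")
    if backup.exists():
        raise FileExistsError(f"Backup already exists: {backup}")
    core_exe.rename(backup)
    return backup


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python set_aside_core.py <full path to FahCore_21.exe>")
    print(f"Renamed to {set_aside_core(sys.argv[1])}")
```

Renaming rather than deleting keeps the old executable available in case the download fails; pass the full path to FahCore_21.exe for your own user profile as the only argument.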

Re: FahCore_21 terminates on startup

Posted: Tue May 16, 2017 6:49 pm
by b-morgan
I have an extra folder in the path:

C:\Users\Brad\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\beta\Core_21.fah\FahCore_21.exe

Should I delete it (i.e. move Core_21.fah up one level)?

Re: FahCore_21 terminates on startup

Posted: Tue May 16, 2017 6:59 pm
by Joe_H
b-morgan wrote:I have an extra folder in the path:

C:\Users\Brad\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\beta\Core_21.fah\FahCore_21.exe

Should I delete it (i.e. move Core_21.fah up one level)?
No, the directory paths are determined by the client and the projects. That is the correct path to be used if the client is set for beta projects.