GPU WU task appears then suddenly disappears

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

Nope.avi
Posts: 15
Joined: Wed Jan 29, 2020 9:13 pm

GPU WU task appears then suddenly disappears

Post by Nope.avi »

When I try to run a GPU work unit, I briefly get a task (for example, an 11759 work unit), and then a few seconds later it disappears, leaving the following log.
I've reinstalled the drivers directly from Nvidia's website, but nothing changes. What can I do to fix this?


*********************** Log Started 2020-03-16T22:01:27Z ***********************
22:01:27:************************* Folding@home Client *************************
22:01:27:        Website: https://foldingathome.org/
22:01:27:      Copyright: (c) 2009-2018 foldingathome.org
22:01:27:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:01:27:           Args: 
22:01:27:         Config: C:\Users\Gemini\AppData\Roaming\FAHClient\config.xml
22:01:27:******************************** Build ********************************
22:01:27:        Version: 7.5.1
22:01:27:           Date: May 11 2018
22:01:27:           Time: 13:06:32
22:01:27:     Repository: Git
22:01:27:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
22:01:27:         Branch: master
22:01:27:       Compiler: Visual C++ 2008
22:01:27:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
22:01:27:       Platform: win32 10
22:01:27:           Bits: 32
22:01:27:           Mode: Release
22:01:27:******************************* System ********************************
22:01:27:            CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
22:01:27:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
22:01:27:           CPUs: 8
22:01:27:         Memory: 7.73GiB
22:01:27:    Free Memory: 5.78GiB
22:01:27:        Threads: WINDOWS_THREADS
22:01:27:     OS Version: 6.2
22:01:27:    Has Battery: true
22:01:27:     On Battery: false
22:01:27:     UTC Offset: -6
22:01:27:            PID: 3280
22:01:27:            CWD: C:\Users\Gemini\AppData\Roaming\FAHClient
22:01:27:             OS: Windows 10 Enterprise
22:01:27:        OS Arch: AMD64
22:01:27:           GPUs: 1
22:01:27:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GK107 [GeForce GT 750M]
22:01:27:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:10.1
22:01:27:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31
22:01:27:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
22:01:27:  Win32 Service: false
22:01:27:***********************************************************************
22:01:27:<config>
22:01:27:  <!-- Network -->
22:01:27:  <proxy v=':8080'/>
22:01:27:
22:01:27:  <!-- Slot Control -->
22:01:27:  <power v='full'/>
22:01:27:
22:01:27:  <!-- User Information -->
22:01:27:  <team v='225187'/>
22:01:27:  <user v='Nope.avi'/>
22:01:27:
22:01:27:  <!-- Folding Slots -->
22:01:27:  <slot id='1' type='GPU'/>
22:01:27:</config>
22:01:27:Trying to access database...
22:01:27:Successfully acquired database lock
22:01:27:Enabled folding slot 01: READY gpu:0:GK107 [GeForce GT 750M]
22:01:27:WU00:FS01:Connecting to 65.254.110.245:8080
22:01:28:WU00:FS01:Assigned to work server 128.252.203.10
22:01:28:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GK107 [GeForce GT 750M] from 128.252.203.10
22:01:28:WU00:FS01:Connecting to 128.252.203.10:8080
22:01:49:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
22:01:49:WU00:FS01:Connecting to 128.252.203.10:80
22:02:10:ERROR:WU00:FS01:Exception: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
22:02:11:WU00:FS01:Connecting to 65.254.110.245:8080
22:02:11:WU00:FS01:Assigned to work server 140.163.4.241
22:02:11:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GK107 [GeForce GT 750M] from 140.163.4.241
22:02:11:WU00:FS01:Connecting to 140.163.4.241:8080
22:03:58:WU00:FS01:Downloading 7.92MiB
22:04:05:WU00:FS01:Download 2.37%
22:04:11:WU00:FS01:Download 3.94%
22:04:17:WU00:FS01:Download 10.25%
22:04:23:WU00:FS01:Download 15.78%
22:04:31:WU00:FS01:Download 22.87%
22:04:42:WU00:FS01:Download 26.82%
22:04:49:WU00:FS01:Download 31.55%
22:04:55:WU00:FS01:Download 37.07%
22:05:01:WU00:FS01:Download 43.38%
22:05:07:WU00:FS01:Download 45.75%
22:05:13:WU00:FS01:Download 50.48%
22:05:19:WU00:FS01:Download 54.42%
22:05:25:WU00:FS01:Download 61.52%
22:05:33:WU00:FS01:Download 66.26%
22:05:39:WU00:FS01:Download 72.57%
22:05:46:WU00:FS01:Download 76.51%
22:05:53:WU00:FS01:Download 83.61%
22:06:00:WU00:FS01:Download 88.34%
22:06:06:WU00:FS01:Download 92.28%
22:06:12:WU00:FS01:Download 95.44%
22:06:17:WU00:FS01:Download complete
22:06:17:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:11745 run:0 clone:1119 gen:1 core:0x22 unit:0x000000048ca304f15e67ed7f4b9a0374
22:06:17:WU00:FS01:Starting
22:06:17:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Gemini\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 705 -lifeline 3280 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
22:06:17:WU00:FS01:Started FahCore on PID 7692
22:06:17:WU00:FS01:Core PID:5828
22:06:17:WU00:FS01:FahCore 0x22 started
22:06:18:WU00:FS01:0x22:*********************** Log Started 2020-03-16T22:06:17Z ***********************
22:06:18:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
22:06:18:WU00:FS01:0x22:       Type: 0x22
22:06:18:WU00:FS01:0x22:       Core: Core22
22:06:18:WU00:FS01:0x22:    Website: https://foldingathome.org/
22:06:18:WU00:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
22:06:18:WU00:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
22:06:18:WU00:FS01:0x22:             <rafal.wiewiora@choderalab.org>
22:06:18:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 705 -lifeline 7692 -checkpoint 15
22:06:18:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
22:06:18:WU00:FS01:0x22:             0 -gpu 0
22:06:18:WU00:FS01:0x22:     Config: <none>
22:06:18:WU00:FS01:0x22:************************************ Build *************************************
22:06:18:WU00:FS01:0x22:    Version: 0.0.2
22:06:18:WU00:FS01:0x22:       Date: Dec 6 2019
22:06:18:WU00:FS01:0x22:       Time: 21:30:31
22:06:18:WU00:FS01:0x22: Repository: Git
22:06:18:WU00:FS01:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
22:06:18:WU00:FS01:0x22:     Branch: HEAD
22:06:18:WU00:FS01:0x22:   Compiler: Visual C++ 2008
22:06:18:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
22:06:18:WU00:FS01:0x22:   Platform: win32 10
22:06:18:WU00:FS01:0x22:       Bits: 64
22:06:18:WU00:FS01:0x22:       Mode: Release
22:06:18:WU00:FS01:0x22:************************************ System ************************************
22:06:18:WU00:FS01:0x22:        CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
22:06:18:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
22:06:18:WU00:FS01:0x22:       CPUs: 8
22:06:18:WU00:FS01:0x22:     Memory: 7.73GiB
22:06:18:WU00:FS01:0x22:Free Memory: 5.59GiB
22:06:18:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
22:06:18:WU00:FS01:0x22: OS Version: 6.2
22:06:18:WU00:FS01:0x22:Has Battery: true
22:06:18:WU00:FS01:0x22: On Battery: false
22:06:18:WU00:FS01:0x22: UTC Offset: -6
22:06:18:WU00:FS01:0x22:        PID: 5828
22:06:18:WU00:FS01:0x22:        CWD: C:\Users\Gemini\AppData\Roaming\FAHClient\work
22:06:18:WU00:FS01:0x22:         OS: Windows 10 Pro
22:06:18:WU00:FS01:0x22:    OS Arch: AMD64
22:06:18:WU00:FS01:0x22:********************************************************************************
22:06:18:WU00:FS01:0x22:Project: 11745 (Run 0, Clone 1119, Gen 1)
22:06:18:WU00:FS01:0x22:Unit: 0x000000048ca304f15e67ed7f4b9a0374
22:06:18:WU00:FS01:0x22:Reading tar file core.xml
22:06:18:WU00:FS01:0x22:Reading tar file integrator.xml
22:06:18:WU00:FS01:0x22:Reading tar file state.xml
22:06:19:WU00:FS01:0x22:Reading tar file system.xml
22:06:20:WU00:FS01:0x22:Digital signatures verified
22:06:20:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
22:06:20:WU00:FS01:0x22:Version 0.0.2
22:06:27:WU00:FS01:0x22:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
22:06:27:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
22:06:27:WU00:FS01:0x22:Saving result file science.log
22:06:27:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
22:06:28:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:06:28:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:11745 run:0 clone:1119 gen:1 core:0x22 unit:0x000000048ca304f15e67ed7f4b9a0374
22:06:28:WU00:FS01:Uploading 2.63KiB to 140.163.4.241
22:06:28:WU00:FS01:Connecting to 140.163.4.241:8080
22:06:29:WU01:FS01:Connecting to 65.254.110.245:8080
22:06:29:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
22:06:29:WU01:FS01:Connecting to 18.218.241.186:80
22:06:31:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
22:06:31:ERROR:WU01:FS01:Exception: Could not get an assignment
22:06:31:WU01:FS01:Connecting to 65.254.110.245:8080
22:06:32:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
22:06:32:WU01:FS01:Connecting to 18.218.241.186:80
22:06:32:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
22:06:32:ERROR:WU01:FS01:Exception: Could not get an assignment
22:06:35:WU00:FS01:Upload 100.00%
22:07:31:WU01:FS01:Connecting to 65.254.110.245:8080
22:07:32:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
22:07:32:WU01:FS01:Connecting to 18.218.241.186:80
22:07:32:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
22:07:32:ERROR:WU01:FS01:Exception: Could not get an assignment
22:08:01:WU00:FS01:Upload complete
22:08:01:WU00:FS01:Server responded WORK_ACK (400)
22:08:01:WU00:FS01:Cleaning up
22:09:08:WU01:FS01:Connecting to 65.254.110.245:8080
22:09:10:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
22:09:10:WU01:FS01:Connecting to 18.218.241.186:80
22:09:11:WU01:FS01:Assigned to work server 128.252.203.10
22:09:11:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GK107 [GeForce GT 750M] from 128.252.203.10
22:09:11:WU01:FS01:Connecting to 128.252.203.10:8080
22:09:47:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
Last edited by Nope.avi on Tue Mar 17, 2020 3:35 am, edited 1 time in total.
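
For anyone skimming a long log like the one above, here is a minimal sketch that pulls out only the ERROR and WARNING lines. It assumes the line layout seen in this log (`HH:MM:SS:` followed either directly by the level, or by a `WUnn:FSnn:0xNN:` prefix and then the level). The function name `problem_lines` is made up for illustration and is not part of FAHClient.

```python
import re

# Client-level messages: "HH:MM:SS:ERROR:..." or "HH:MM:SS:WARNING:..."
CLIENT_LEVEL = re.compile(r"^(\d{2}:\d{2}:\d{2}):(ERROR|WARNING):(.*)$")

# Core-level messages carry a work unit / slot / core prefix first, e.g.
# "22:06:27:WU00:FS01:0x22:ERROR:exception: ..."
CORE_LEVEL = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9a-fA-F]+:(ERROR|WARNING):(.*)$"
)


def problem_lines(log_text):
    """Return (time, level, message) tuples for ERROR/WARNING lines."""
    results = []
    for line in log_text.splitlines():
        match = CLIENT_LEVEL.match(line) or CORE_LEVEL.match(line)
        if match:
            results.append((match.group(1), match.group(2), match.group(3)))
    return results


if __name__ == "__main__":
    sample = "\n".join([
        "22:06:20:WU00:FS01:0x22:Version 0.0.2",
        "22:06:27:WU00:FS01:0x22:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)",
        "22:06:28:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
    ])
    for time, level, message in problem_lines(sample):
        print(time, level, message)
```

Run against the log above, this would surface the `clGetDeviceInfo (-5)` exception and the `BAD_WORK_UNIT` warning alongside the connection and assignment failures, which makes it easier to tell the GPU problem apart from the server-side noise.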
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: GPU WU task appears then suddenly disappears

Post by Joe_H »

Were your drivers downloaded and installed through Windows Update or directly from nVidia? The drivers from MS don't include the OpenCL support needed, and their updates often remove that support if it is already installed.
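
One quick way to check this from the log itself is to look at the System section the client prints at startup. The sketch below assumes the `OpenCL Device N: Platform:… Device:… … Driver:…` line format shown in the log above. The helper name `opencl_devices` is hypothetical. An empty result would suggest the client found no OpenCL device at startup. A non-empty result only shows that a device was detected, not that the driver works correctly.

```python
import re

# Matches the startup lines the client prints for each detected OpenCL
# device, e.g.
# "22:01:27:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31"
OPENCL_DEVICE = re.compile(
    r"OpenCL Device (\d+): Platform:(\d+) Device:(\d+) .*Driver:(\S+)"
)


def opencl_devices(log_text):
    """Return one dict per OpenCL device line found in the log text."""
    devices = []
    for line in log_text.splitlines():
        match = OPENCL_DEVICE.search(line)
        if match:
            devices.append({
                "index": int(match.group(1)),
                "platform": int(match.group(2)),
                "device": int(match.group(3)),
                "driver": match.group(4),
            })
    return devices


if __name__ == "__main__":
    header = "\n".join([
        "22:01:27:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:10.1",
        "22:01:27:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31",
        "22:01:27:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19",
    ])
    for device in opencl_devices(header):
        print(device)
```

In the log posted above, two OpenCL devices are listed (driver 425.31 on platform 0 and driver 20.19 on platform 1), so the client did detect OpenCL at startup.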

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU WU task appears then suddenly disappears

Post by bruce »

22:06:27:WU00:FS01:0x22:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
...
22:06:27:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT

is the result of a GPU error. I don't have detailed information about error (-5), but it is possibly a driver issue (as explained above) or perhaps a true hardware problem.

After that error, you were unable to get a new assignment due to the congestion on FAH's server sites. We are under what's effectively a DoS attack from all of the wonderful people who truly want to help.
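
On the `clGetDeviceInfo (-5)` line quoted above: assuming the core is reporting the raw OpenCL status code (an assumption, the log doesn't say so), the standard OpenCL headers define -5 as `CL_OUT_OF_RESOURCES`. The sketch below decodes the first few codes from that header. The function name `decode_cl_error` is made up for illustration. The name alone doesn't say whether the cause is the driver, the hardware, or the work unit.

```python
import re

# First few status codes as defined in the Khronos OpenCL header (cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Looks for an OpenCL call name followed by a numeric code in parentheses,
# as in "clGetDeviceInfo (-5)".
CL_CALL = re.compile(r"(cl[A-Z]\w*) \((-?\d+)\)")


def decode_cl_error(line):
    """Return (call, code, name) for a log line, or None if nothing matches."""
    match = CL_CALL.search(line)
    if match is None:
        return None
    code = int(match.group(2))
    return match.group(1), code, OPENCL_ERRORS.get(code, "unknown")


if __name__ == "__main__":
    line = ("22:06:27:WU00:FS01:0x22:ERROR:exception: "
            "Error initializing context: clGetDeviceInfo (-5)")
    print(decode_cl_error(line))
```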
Nope.avi
Posts: 15
Joined: Wed Jan 29, 2020 9:13 pm

Re: GPU WU task appears then suddenly disappears

Post by Nope.avi »

Joe_H wrote:Were your drivers downloaded and installed through Windows Update or directly from nVidia? The drivers from MS don't include the OpenCL support needed, and their updates often remove that support if it is already installed.
I have reinstalled the drivers directly from Nvidia. But I'm still getting the same errors.
JiiPee
Posts: 59
Joined: Sun Mar 09, 2008 4:09 pm
Location: FINLAND

Re: GPU WU task appears then suddenly disappears

Post by JiiPee »

There is something broken in core22 and the corona projects; I'm getting the same issue. Sometimes I can fold for a short time, but then the WU crashes.

https://pastebin.com/2DgYRRta I had to put it to pastebin because it was too large to post here.

This is the full log after a restart. As you can see, some WUs fail instantly with an OpenCL error, and other WUs fold for a short time and then fail.

The card is a Vega 56 with liquid cooling, running at stock now because I thought it was a memory OC issue.
I'm also on the latest drivers, 20.2.2.
Currently this folding slot is sitting in the red failed state, probably because of too many failed WUs in a row, and I'm not sure I even want to turn it on and try again, because it's been this same shitshow since the corona WUs were released. Core22 was working quite OK with the test WUs; I was still seeing more failures than with core21, but at least it got some of them done. Now all corona WUs just fail on my GPU.

Summary of log:
11747 fail to OpenCL error
11742 WU fail
11760 WU fail
11762 WU fail
11750 WU fail
11752 fail to OpenCL error
11748 WU fail
11764 fail to OpenCL error
11759 fail to OpenCL error

After that, the slot went into the failed state.
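
The summary above can be tallied with a short sketch. The project numbers and failure kinds are copied from the list, and the variable names are only illustrative.

```python
from collections import Counter

# (project, failure kind) pairs, in the order listed in the summary above.
failures = [
    (11747, "OpenCL error"),
    (11742, "WU fail"),
    (11760, "WU fail"),
    (11762, "WU fail"),
    (11750, "WU fail"),
    (11752, "OpenCL error"),
    (11748, "WU fail"),
    (11764, "OpenCL error"),
    (11759, "OpenCL error"),
]

# Count how many work units failed in each way.
counts = Counter(kind for _, kind in failures)

if __name__ == "__main__":
    for kind, count in counts.most_common():
        print(f"{kind}: {count}")
    print(f"total failed WUs: {len(failures)}")
```

That is nine failed work units in a row, split between immediate OpenCL errors and WUs that folded briefly before failing.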
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU WU task appears then suddenly disappears

Post by bruce »

No, it's not Core22 that's the problem. Your system is unstable.

I suspect your system is excessively overclocked. FAH often puts a greater load on the hardware than whatever benchmarking tool you used to establish stability. I'll bet it will run fine at stock clock rates.
JiiPee
Posts: 59
Joined: Sun Mar 09, 2008 4:09 pm
Location: FINLAND

Re: GPU WU task appears then suddenly disappears

Post by JiiPee »

bruce wrote:No, it's not Core22 that's the problem. Your system is unstable.

I suspect your system is excessively overclocked. FAH often puts a greater load on the hardware than whatever benchmarking tool you used to establish stability. I'll bet it will run fine at stock clock rates.
No, my system is at DEFAULT CLOCKS. And I just ran LuxMark OpenCL for hours without any issues; it even hit the GPU harder than folding does.

Did you even read what I said? I took off the memory OC because I thought it was the issue, but it's not.

Also, why does it fold fine with core21? Core22 was also working with the test WUs.
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: GPU WU task appears then suddenly disappears

Post by Joe_H »

The test WU's were simpler and do not load a GPU as much. Non-test Core_22 WU's can load a GPU more than Core_21.
Nope.avi
Posts: 15
Joined: Wed Jan 29, 2020 9:13 pm

Re: GPU WU task appears then suddenly disappears

Post by Nope.avi »

bruce wrote:No, it's not Core22 that's the problem. Your system is unstable.

I suspect your system is excessively overclocked. FAH often puts a greater load on the hardware than whatever benchmarking tool you used to establish stability. I'll bet it will run fine at stock clock rates.

My system is a laptop that runs at stock clocks too. I've been using it to fold anonymously for FAH for years. I'm also seeing the same thing on a well-ventilated AMD desktop system.