Client crashes constantly running GPU load on Radeon VII
Posted: Thu Mar 26, 2020 2:09 am
It's pretty straightforward: if I turn on GPU folding, the system crashes 5 to 15 minutes in, every time. No bluescreen, it just goes straight to black. I admittedly had a slightly underpowered power supply, but replacing it hasn't helped. Changing drivers, closing other programs, changing BIOS settings, and moving RAM around made no difference. It can CPU fold for literal days, nothing else crashes, the RAM tests good, and nothing is overheating. Setup is:
GigaByte X4700 Gaming 7 Rev. 1.1 MB
G.Skill Platinum Silver 3600 RAM @ 32GB
Ryzen 7 1700X
Radeon VII with v106 firmware
Latest Win10
Latest log below. It doesn't seem like it's generating logs when it crashes though, because I don't see where it says I turned the GPU on, but I definitely did.
*********************** Log Started 2020-03-26T01:27:08Z ***********************
01:27:08:************************* Folding@home Client *************************
01:27:08: Website: https://foldingathome.org/
01:27:08: Copyright: (c) 2009-2018 foldingathome.org
01:27:08: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:27:08: Args: --open-web-control
01:27:08: Config: C:\Users\Caboose\AppData\Roaming\FAHClient\config.xml
01:27:08:******************************** Build ********************************
01:27:08: Version: 7.5.1
01:27:08: Date: May 11 2018
01:27:08: Time: 13:06:32
01:27:08: Repository: Git
01:27:08: Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
01:27:08: Branch: master
01:27:08: Compiler: Visual C++ 2008
01:27:08: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:08: Platform: win32 10
01:27:08: Bits: 32
01:27:08: Mode: Release
01:27:08:******************************* System ********************************
01:27:08: CPU: AMD Ryzen 7 1700X Eight-Core Processor
01:27:08: CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
01:27:08: CPUs: 16
01:27:08: Memory: 31.95GiB
01:27:08: Free Memory: 26.77GiB
01:27:08: Threads: WINDOWS_THREADS
01:27:08: OS Version: 6.2
01:27:08: Has Battery: false
01:27:08: On Battery: false
01:27:08: UTC Offset: -4
01:27:08: PID: 16168
01:27:08: CWD: C:\Users\Caboose\AppData\Roaming\FAHClient
01:27:08: OS: Windows 10 Enterprise
01:27:08: OS Arch: AMD64
01:27:08: GPUs: 1
01:27:08: GPU 0: Bus:12 Slot:0 Func:0 AMD:5 Vega 20 [Radeon VII]
01:27:08: CUDA: Not detected: Failed to open dynamic library 'nvcuda.dll': The
01:27:08: specified module could not be found.
01:27:08:
01:27:08:OpenCL Device 0: Platform:0 Device:0 Bus:12 Slot:0 Compute:1.2 Driver:3004.8
01:27:08: Win32 Service: false
01:27:08:***********************************************************************
01:27:08:<config>
01:27:08: <!-- User Information -->
01:27:08: <user v='Caboose'/>
01:27:08:
01:27:08: <!-- Folding Slots -->
01:27:08: <slot id='0' type='CPU'/>
01:27:08: <slot id='1' type='GPU'>
01:27:08: <paused v='true'/>
01:27:08: </slot>
01:27:08:</config>
01:27:08:Trying to access database...
01:27:08:Successfully acquired database lock
01:27:08:Enabled folding slot 00: READY cpu:14
01:27:08:Enabled folding slot 01: PAUSED gpu:0:Vega 20 [Radeon VII] (by user)
01:27:08:WU00:FS00:Starting
01:27:08:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Caboose\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 705 -lifeline 16168 -checkpoint 15 -np 14
01:27:08:WU00:FS00:Started FahCore on PID 1876
01:27:08:WU00:FS00:Core PID:7276
01:27:08:WU00:FS00:FahCore 0xa7 started
01:27:08:WU00:FS00:0xa7:*********************** Log Started 2020-03-26T01:27:08Z ***********************
01:27:08:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
01:27:08:WU00:FS00:0xa7: Type: 0xa7
01:27:08:WU00:FS00:0xa7: Core: Gromacs
01:27:08:WU00:FS00:0xa7: Args: -dir 00 -suffix 01 -version 705 -lifeline 1876 -checkpoint 15 -np
01:27:08:WU00:FS00:0xa7: 14
01:27:08:WU00:FS00:0xa7:************************************ CBang *************************************
01:27:08:WU00:FS00:0xa7: Date: Oct 26 2019
01:27:08:WU00:FS00:0xa7: Time: 01:38:25
01:27:08:WU00:FS00:0xa7: Revision: c46a1a011a24143739ac7218c5a435f66777f62f
01:27:08:WU00:FS00:0xa7: Branch: master
01:27:08:WU00:FS00:0xa7: Compiler: Visual C++ 2008
01:27:09:WU00:FS00:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:09:WU00:FS00:0xa7: Platform: win32 10
01:27:09:WU00:FS00:0xa7: Bits: 64
01:27:09:WU00:FS00:0xa7: Mode: Release
01:27:09:WU00:FS00:0xa7:************************************ System ************************************
01:27:09:WU00:FS00:0xa7: CPU: AMD Ryzen 7 1700X Eight-Core Processor
01:27:09:WU00:FS00:0xa7: CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
01:27:09:WU00:FS00:0xa7: CPUs: 16
01:27:09:WU00:FS00:0xa7: Memory: 31.95GiB
01:27:09:WU00:FS00:0xa7:Free Memory: 26.74GiB
01:27:09:WU00:FS00:0xa7: Threads: WINDOWS_THREADS
01:27:09:WU00:FS00:0xa7: OS Version: 6.2
01:27:09:WU00:FS00:0xa7:Has Battery: false
01:27:09:WU00:FS00:0xa7: On Battery: false
01:27:09:WU00:FS00:0xa7: UTC Offset: -4
01:27:09:WU00:FS00:0xa7: PID: 7276
01:27:09:WU00:FS00:0xa7: CWD: C:\Users\Caboose\AppData\Roaming\FAHClient\work
01:27:09:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
01:27:09:WU00:FS00:0xa7: Version: 0.0.18
01:27:09:WU00:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:27:09:WU00:FS00:0xa7: Copyright: 2019 foldingathome.org
01:27:09:WU00:FS00:0xa7: Homepage: https://foldingathome.org/
01:27:09:WU00:FS00:0xa7: Date: Oct 26 2019
01:27:09:WU00:FS00:0xa7: Time: 01:52:30
01:27:09:WU00:FS00:0xa7: Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
01:27:09:WU00:FS00:0xa7: Branch: master
01:27:09:WU00:FS00:0xa7: Compiler: Visual C++ 2008
01:27:09:WU00:FS00:0xa7: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:09:WU00:FS00:0xa7: Platform: win32 10
01:27:09:WU00:FS00:0xa7: Bits: 64
01:27:09:WU00:FS00:0xa7: Mode: Release
01:27:09:WU00:FS00:0xa7:************************************ Build *************************************
01:27:09:WU00:FS00:0xa7: SIMD: avx_256
01:27:09:WU00:FS00:0xa7:********************************************************************************
01:27:09:WU00:FS00:0xa7:Project: 14311 (Run 17, Clone 11, Gen 26)
01:27:09:WU00:FS00:0xa7:Unit: 0x0000001f0002894b5df2b5d25711a2dd
01:27:09:WU00:FS00:0xa7:Digital signatures verified
01:27:09:WU00:FS00:0xa7:Reducing thread count from 14 to 13 to avoid domain decomposition with large prime factor 7
01:27:09:WU00:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
01:27:09:WU00:FS00:0xa7:Calling: mdrun -s frame26.tpr -o frame26.trr -cpi state.cpt -cpt 15 -nt 12
01:27:09:WU00:FS00:0xa7:Steps: first=13000000 total=500000
01:27:10:WU00:FS00:0xa7:Completed 292276 out of 500000 steps (58%)
01:27:16:8:127.0.0.1:New Web connection
01:27:27:FS00:Finishing
01:28:42:WU00:FS00:0xa7:Completed 295000 out of 500000 steps (59%)
01:29:30:FS00:Shutting core down
01:29:30:WU00:FS00:0xa7:WARNING:Console control signal 1 on PID 7276
01:29:30:WU00:FS00:0xa7:Exiting, please wait. . .
01:29:31:Clean exit
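For anyone wanting to check the same thing in their own log, here is a rough Python sketch that pulls the slot states out of the "Enabled folding slot" lines. The line format is copied from the log above; the function name `slot_states` and the regular expression are my own assumptions, not anything provided by the FAH client. On this log it reports slot 01 as PAUSED, which fits what I noticed: this session never shows the GPU slot being started.

```python
import re

# Pattern for lines such as (format taken from the log above):
# 01:27:08:Enabled folding slot 01: PAUSED gpu:0:Vega 20 [Radeon VII] (by user)
# The regex itself is an assumption, not an official log grammar.
SLOT_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:Enabled folding slot (\d+): (\w+) (cpu|gpu):(.*)$"
)


def slot_states(log_text):
    """Return {slot_id: (state, kind, detail)} from 'Enabled folding slot' lines."""
    states = {}
    for line in log_text.splitlines():
        match = SLOT_RE.match(line.strip())
        if match:
            slot, state, kind, detail = match.groups()
            states[slot] = (state, kind, detail.strip())
    return states


# Three lines copied from the log in this post.
sample = """01:27:08:Enabled folding slot 00: READY cpu:14
01:27:08:Enabled folding slot 01: PAUSED gpu:0:Vega 20 [Radeon VII] (by user)
01:27:08:WU00:FS00:Starting"""

for slot, (state, kind, detail) in slot_states(sample).items():
    print(slot, kind, state, detail)
```

To run it against a full log, read the file into a string and pass that to `slot_states` instead of `sample`.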