Crashes on FahCore_22
Posted: Fri Oct 23, 2020 12:19 pm
I seem to be getting random crashes on the GPU side of things. About every third day I get a pop-up saying that FahCore_22 has crashed. Of course it doesn't restart until I close the pop-up, which in the most recent case was 12 hours later. Here is where the log shows the crash; it has been the same error each time. Is there anything I can do in these instances, or is it just something to live with?
23:46:25:WU00:FS01:0x22:An exception occurred at step 1053057: Error invoking kernel: CUDA_ERROR_ILLEGAL_ADDRESS (700)
23:46:25:WU00:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
23:46:25:WU00:FS01:0x22:Folding@home Core Shutdown: CORE_RESTART
******************************* Date: 2020-10-23 *******************************
******************************* Date: 2020-10-23 *******************************
12:09:17:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
12:09:17:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
12:09:17:WU00:FS01:Starting
12:09:17:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Mike\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 4304 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
12:09:17:WU00:FS01:Started FahCore on PID 2976
12:09:17:WU00:FS01:Core PID:8804
12:09:17:WU00:FS01:FahCore 0x22 started
12:09:17:WU00:FS01:0x22:*********************** Log Started 2020-10-23T12:09:17Z ***********************
12:09:17:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
12:09:17:WU00:FS01:0x22: Core: Core22
12:09:17:WU00:FS01:0x22: Type: 0x22
12:09:17:WU00:FS01:0x22: Version: 0.0.13
12:09:17:WU00:FS01:0x22: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
12:09:17:WU00:FS01:0x22: Copyright: 2020 foldingathome.org
12:09:17:WU00:FS01:0x22: Homepage: https://foldingathome.org/
12:09:17:WU00:FS01:0x22: Date: Sep 19 2020
12:09:17:WU00:FS01:0x22: Time: 02:35:58
12:09:17:WU00:FS01:0x22: Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
12:09:17:WU00:FS01:0x22: Branch: core22-0.0.13
12:09:17:WU00:FS01:0x22: Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22: -DOPENMM_GIT_HASH="\"189320d0\""
12:09:17:WU00:FS01:0x22: Platform: win32 10
12:09:17:WU00:FS01:0x22: Bits: 64
12:09:17:WU00:FS01:0x22: Mode: Release
12:09:17:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
12:09:17:WU00:FS01:0x22: <peastman@stanford.edu>
12:09:17:WU00:FS01:0x22: Args: -dir 00 -suffix 01 -version 706 -lifeline 2976 -checkpoint 15
12:09:17:WU00:FS01:0x22: -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
12:09:17:WU00:FS01:0x22: 0 -gpu 0
12:09:17:WU00:FS01:0x22:************************************ libFAH ************************************
12:09:17:WU00:FS01:0x22: Date: Sep 7 2020
12:09:17:WU00:FS01:0x22: Time: 19:09:56
12:09:17:WU00:FS01:0x22: Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
12:09:17:WU00:FS01:0x22: Branch: HEAD
12:09:17:WU00:FS01:0x22: Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22: Platform: win32 10
12:09:17:WU00:FS01:0x22: Bits: 64
12:09:17:WU00:FS01:0x22: Mode: Release
12:09:17:WU00:FS01:0x22:************************************ CBang *************************************
12:09:17:WU00:FS01:0x22: Date: Sep 7 2020
12:09:17:WU00:FS01:0x22: Time: 19:08:30
12:09:17:WU00:FS01:0x22: Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
12:09:17:WU00:FS01:0x22: Branch: HEAD
12:09:17:WU00:FS01:0x22: Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22: Platform: win32 10
12:09:17:WU00:FS01:0x22: Bits: 64
12:09:17:WU00:FS01:0x22: Mode: Release
12:09:17:WU00:FS01:0x22:************************************ System ************************************
12:09:17:WU00:FS01:0x22: CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
12:09:17:WU00:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 26 Stepping 4
12:09:17:WU00:FS01:0x22: CPUs: 8
12:09:17:WU00:FS01:0x22: Memory: 23.99GiB
12:09:17:WU00:FS01:0x22:Free Memory: 19.41GiB
12:09:17:WU00:FS01:0x22: Threads: WINDOWS_THREADS
12:09:17:WU00:FS01:0x22: OS Version: 6.2
12:09:17:WU00:FS01:0x22:Has Battery: false
12:09:17:WU00:FS01:0x22: On Battery: false
12:09:17:WU00:FS01:0x22: UTC Offset: -4
12:09:17:WU00:FS01:0x22: PID: 8804
12:09:17:WU00:FS01:0x22: CWD: C:\Users\Mike\AppData\Roaming\FAHClient\work
12:09:17:WU00:FS01:0x22:************************************ OpenMM ************************************
12:09:17:WU00:FS01:0x22: Revision: 189320d0
12:09:17:WU00:FS01:0x22:********************************************************************************
12:09:17:WU00:FS01:0x22:Project: 17309 (Run 0, Clone 6791, Gen 0)
12:09:17:WU00:FS01:0x22:Unit: 0x0000000012bc7d9a5f91cc5ca4ca0346
12:09:17:WU00:FS01:0x22:Digital signatures verified
12:09:17:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
12:09:17:WU00:FS01:0x22:Version 0.0.13
12:09:17:WU00:FS01:0x22: Checkpoint write interval: 62500 steps (5%) [20 total]
12:09:17:WU00:FS01:0x22: JSON viewer frame write interval: 12500 steps (1%) [100 total]
12:09:17:WU00:FS01:0x22: XTC frame write interval: 125000 steps (10%) [10 total]
12:09:17:WU00:FS01:0x22: Global context and integrator variables write interval: disabled
12:09:17:WU00:FS01:0x22:There are 4 platforms available.
12:09:17:WU00:FS01:0x22:Platform 0: Reference
12:09:17:WU00:FS01:0x22:Platform 1: CPU
12:09:17:WU00:FS01:0x22:Platform 2: OpenCL
12:09:17:WU00:FS01:0x22: opencl-device 0 specified
12:09:17:WU00:FS01:0x22:Platform 3: CUDA
12:09:17:WU00:FS01:0x22: cuda-device 0 specified
12:09:41:WU00:FS01:0x22:Attempting to create CUDA context:
12:09:41:WU00:FS01:0x22: Configuring platform CUDA
12:09:48:WU00:FS01:0x22: Using CUDA and gpu 0
12:09:48:WU00:FS01:0x22:Completed 1000000 out of 1250000 steps (80%)
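
For anyone reading the log above, here is a minimal Python sketch of the arithmetic behind two of its lines. It converts the signed return code from the "FahCore returned" line into its unsigned hex form, and it works out which checkpoint the core would be expected to resume from, given the 62500-step checkpoint interval the log reports. The function names `to_unsigned32` and `last_checkpoint` are made up for illustration and are not part of FAHClient; the idea that the core resumes from the most recent multiple of the interval is an assumption, although it agrees with the "Completed 1000000 out of 1250000 steps (80%)" line.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit process exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def last_checkpoint(step: int, interval: int) -> int:
    """Return the largest multiple of `interval` that is <= `step`.

    Assumption: checkpoints are written at exact multiples of the interval.
    """
    return (step // interval) * interval


# Values copied from the log above.
exit_code = -1073740791      # "FahCore returned: UNKNOWN_ENUM (...)"
crash_step = 1053057         # "An exception occurred at step 1053057"
interval = 62500             # "Checkpoint write interval: 62500 steps (5%)"
total_steps = 1250000        # "out of 1250000 steps"

status_hex = hex(to_unsigned32(exit_code))
resume_step = last_checkpoint(crash_step, interval)
lost_steps = crash_step - resume_step
resume_percent = 100 * resume_step // total_steps

print(status_hex)        # 0xc0000409
print(resume_step)       # 1000000
print(lost_steps)        # 53057
print(resume_percent)    # 80
```

The hex value 0xC0000409 is listed in the Windows NTSTATUS tables as STATUS_STACK_BUFFER_OVERRUN, so the wrapper is reporting that the core process itself was terminated by Windows rather than exiting cleanly. Under the assumption above, each crash like this one costs about 53057 steps of recomputation, plus however long the pop-up sits unacknowledged.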