Project 18448 issue

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

TonyStewart14
Posts: 66
Joined: Fri Jan 06, 2012 6:37 am

Project 18448 issue

Post by TonyStewart14 »

I just had the following issue with this work unit. I am running an RTX 3070 Ti GPU with the -advanced flag. While this work unit was running, my web browser crashed repeatedly.

Code:

17:25:49:WU01:FS02:Connecting to assign1.foldingathome.org:80
17:25:49:WU01:FS02:Assigned to work server 129.32.209.202
17:25:49:WU01:FS02:Requesting new work unit for slot 02: gpu:1:0 GA104 [GeForce RTX 3070 Ti] from 129.32.209.202
17:25:49:WU01:FS02:Connecting to 129.32.209.202:8080
17:25:52:WU01:FS02:Downloading 57.00MiB
17:25:53:WU01:FS02:Download complete
17:25:53:WU01:FS02:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:18448 run:11 clone:33 gen:1254 core:0x22 unit:0x00000021000004e6000048100000000b
17:25:53:WU01:FS02:Starting
17:25:53:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.20/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 34756 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
17:25:53:WU01:FS02:Started FahCore on PID 556
17:25:53:WU01:FS02:Core PID:27252
17:25:53:WU01:FS02:FahCore 0x22 started
17:25:54:WU01:FS02:0x22:*********************** Log Started 2023-06-18T17:25:53Z ***********************
17:25:54:WU01:FS02:0x22:*************************** Core22 Folding@home Core ***************************
17:25:54:WU01:FS02:0x22:       Core: Core22
17:25:54:WU01:FS02:0x22:       Type: 0x22
17:25:54:WU01:FS02:0x22:    Version: 0.0.20
17:25:54:WU01:FS02:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:25:54:WU01:FS02:0x22:  Copyright: 2020 foldingathome.org
17:25:54:WU01:FS02:0x22:   Homepage: https://foldingathome.org/
17:25:54:WU01:FS02:0x22:       Date: Jan 20 2022
17:25:54:WU01:FS02:0x22:       Time: 01:15:36
17:25:54:WU01:FS02:0x22:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
17:25:54:WU01:FS02:0x22:     Branch: HEAD
17:25:54:WU01:FS02:0x22:   Compiler: Visual C++
17:25:54:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:25:54:WU01:FS02:0x22:             -DOPENMM_VERSION="\"7.7.0\""
17:25:54:WU01:FS02:0x22:   Platform: win32 10
17:25:54:WU01:FS02:0x22:       Bits: 64
17:25:54:WU01:FS02:0x22:       Mode: Release
17:25:54:WU01:FS02:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:25:54:WU01:FS02:0x22:             <peastman@stanford.edu>
17:25:54:WU01:FS02:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 556 -checkpoint 15
17:25:54:WU01:FS02:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
17:25:54:WU01:FS02:0x22:             nvidia -gpu 0 -gpu-usage 100
17:25:54:WU01:FS02:0x22:************************************ libFAH ************************************
17:25:54:WU01:FS02:0x22:       Date: Jan 20 2022
17:25:54:WU01:FS02:0x22:       Time: 01:14:17
17:25:54:WU01:FS02:0x22:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
17:25:54:WU01:FS02:0x22:     Branch: HEAD
17:25:54:WU01:FS02:0x22:   Compiler: Visual C++
17:25:54:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:25:54:WU01:FS02:0x22:   Platform: win32 10
17:25:54:WU01:FS02:0x22:       Bits: 64
17:25:54:WU01:FS02:0x22:       Mode: Release
17:25:54:WU01:FS02:0x22:************************************ CBang *************************************
17:25:54:WU01:FS02:0x22:       Date: Jan 20 2022
17:25:54:WU01:FS02:0x22:       Time: 01:13:20
17:25:54:WU01:FS02:0x22:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
17:25:54:WU01:FS02:0x22:     Branch: HEAD
17:25:54:WU01:FS02:0x22:   Compiler: Visual C++
17:25:54:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:25:54:WU01:FS02:0x22:   Platform: win32 10
17:25:54:WU01:FS02:0x22:       Bits: 64
17:25:54:WU01:FS02:0x22:       Mode: Release
17:25:54:WU01:FS02:0x22:************************************ System ************************************
17:25:54:WU01:FS02:0x22:        CPU: AMD Ryzen 9 7950X3D 16-Core Processor
17:25:54:WU01:FS02:0x22:     CPU ID: AuthenticAMD Family 25 Model 97 Stepping 2
17:25:54:WU01:FS02:0x22:       CPUs: 32
17:25:54:WU01:FS02:0x22:     Memory: 31.14GiB
17:25:54:WU01:FS02:0x22:Free Memory: 17.69GiB
17:25:54:WU01:FS02:0x22:    Threads: WINDOWS_THREADS
17:25:54:WU01:FS02:0x22: OS Version: 6.2
17:25:54:WU01:FS02:0x22:Has Battery: false
17:25:54:WU01:FS02:0x22: On Battery: false
17:25:54:WU01:FS02:0x22: UTC Offset: -5
17:25:54:WU01:FS02:0x22:        PID: 27252
17:25:54:WU01:FS02:0x22:        CWD: C:\ProgramData\FAHClient\work
17:25:54:WU01:FS02:0x22:************************************ OpenMM ************************************
17:25:54:WU01:FS02:0x22:    Version: 7.7.0
17:25:54:WU01:FS02:0x22:********************************************************************************
17:25:54:WU01:FS02:0x22:Project: 18448 (Run 11, Clone 33, Gen 1254)
17:25:54:WU01:FS02:0x22:Reading tar file core.xml
17:25:54:WU01:FS02:0x22:Reading tar file integrator.xml
17:25:54:WU01:FS02:0x22:Reading tar file state.xml
17:25:54:WU01:FS02:0x22:Reading tar file system.xml
17:25:54:WU01:FS02:0x22:Digital signatures verified
17:25:54:WU01:FS02:0x22:Folding@home GPU Core22 Folding@home Core
17:25:54:WU01:FS02:0x22:Version 0.0.20
17:25:54:WU01:FS02:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
17:25:54:WU01:FS02:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
17:25:54:WU01:FS02:0x22:  XTC frame write interval: 2500000 steps (1e+02%) [1 total]
17:25:54:WU01:FS02:0x22:  Global context and integrator variables write interval: disabled
17:25:54:WU01:FS02:0x22:There are 4 platforms available.
17:25:54:WU01:FS02:0x22:Platform 0: Reference
17:25:54:WU01:FS02:0x22:Platform 1: CPU
17:25:54:WU01:FS02:0x22:Platform 2: OpenCL
17:25:54:WU01:FS02:0x22:  opencl-device 0 specified
17:25:54:WU01:FS02:0x22:Platform 3: CUDA
17:25:54:WU01:FS02:0x22:  cuda-device 0 specified
17:25:58:WU01:FS02:0x22:Attempting to create CUDA context:
17:25:58:WU01:FS02:0x22:  Configuring platform CUDA
17:26:01:WU01:FS02:0x22:  Using CUDA and gpu 0
17:26:01:WU01:FS02:0x22:Completed 0 out of 2500000 steps (0%)
17:26:01:WU01:FS02:0x22:Checkpoint completed at step 0
17:26:54:WU01:FS02:0x22:Completed 25000 out of 2500000 steps (1%)
17:27:47:WU01:FS02:0x22:Completed 50000 out of 2500000 steps (2%)
17:27:47:WU01:FS02:0x22:Checkpoint completed at step 50000
17:28:40:WU01:FS02:0x22:Completed 75000 out of 2500000 steps (3%)
17:29:33:WU01:FS02:0x22:Completed 100000 out of 2500000 steps (4%)
17:29:33:WU01:FS02:0x22:Checkpoint completed at step 100000
17:29:41:WU01:FS02:0x22:An exception occurred at step 103662: bad allocation
17:29:41:WU01:FS02:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
17:29:41:WU01:FS02:0x22:Folding@home Core Shutdown: CORE_RESTART
17:29:41:WARNING:WU01:FS02:FahCore returned: CORE_RESTART (98 = 0x62)
17:29:42:WU01:FS02:Starting
17:29:42:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.20/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 34756 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
17:29:42:WU01:FS02:Started FahCore on PID 19340
17:29:42:WU01:FS02:Core PID:39736
17:29:42:WU01:FS02:FahCore 0x22 started
17:29:42:WU01:FS02:0x22:*********************** Log Started 2023-06-18T17:29:42Z ***********************
17:29:42:WU01:FS02:0x22:*************************** Core22 Folding@home Core ***************************
17:29:42:WU01:FS02:0x22:       Core: Core22
17:29:42:WU01:FS02:0x22:       Type: 0x22
17:29:42:WU01:FS02:0x22:    Version: 0.0.20
17:29:42:WU01:FS02:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:29:42:WU01:FS02:0x22:  Copyright: 2020 foldingathome.org
17:29:42:WU01:FS02:0x22:   Homepage: https://foldingathome.org/
17:29:42:WU01:FS02:0x22:       Date: Jan 20 2022
17:29:42:WU01:FS02:0x22:       Time: 01:15:36
17:29:42:WU01:FS02:0x22:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
17:29:42:WU01:FS02:0x22:     Branch: HEAD
17:29:42:WU01:FS02:0x22:   Compiler: Visual C++
17:29:42:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:29:42:WU01:FS02:0x22:             -DOPENMM_VERSION="\"7.7.0\""
17:29:42:WU01:FS02:0x22:   Platform: win32 10
17:29:42:WU01:FS02:0x22:       Bits: 64
17:29:42:WU01:FS02:0x22:       Mode: Release
17:29:42:WU01:FS02:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:29:42:WU01:FS02:0x22:             <peastman@stanford.edu>
17:29:42:WU01:FS02:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 19340 -checkpoint 15
17:29:42:WU01:FS02:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
17:29:42:WU01:FS02:0x22:             nvidia -gpu 0 -gpu-usage 100
17:29:42:WU01:FS02:0x22:************************************ libFAH ************************************
17:29:42:WU01:FS02:0x22:       Date: Jan 20 2022
17:29:42:WU01:FS02:0x22:       Time: 01:14:17
17:29:42:WU01:FS02:0x22:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
17:29:42:WU01:FS02:0x22:     Branch: HEAD
17:29:42:WU01:FS02:0x22:   Compiler: Visual C++
17:29:42:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:29:42:WU01:FS02:0x22:   Platform: win32 10
17:29:42:WU01:FS02:0x22:       Bits: 64
17:29:42:WU01:FS02:0x22:       Mode: Release
17:29:42:WU01:FS02:0x22:************************************ CBang *************************************
17:29:42:WU01:FS02:0x22:       Date: Jan 20 2022
17:29:42:WU01:FS02:0x22:       Time: 01:13:20
17:29:42:WU01:FS02:0x22:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
17:29:42:WU01:FS02:0x22:     Branch: HEAD
17:29:42:WU01:FS02:0x22:   Compiler: Visual C++
17:29:42:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
17:29:42:WU01:FS02:0x22:   Platform: win32 10
17:29:42:WU01:FS02:0x22:       Bits: 64
17:29:42:WU01:FS02:0x22:       Mode: Release
17:29:42:WU01:FS02:0x22:************************************ System ************************************
17:29:42:WU01:FS02:0x22:        CPU: AMD Ryzen 9 7950X3D 16-Core Processor
17:29:42:WU01:FS02:0x22:     CPU ID: AuthenticAMD Family 25 Model 97 Stepping 2
17:29:42:WU01:FS02:0x22:       CPUs: 32
17:29:42:WU01:FS02:0x22:     Memory: 31.14GiB
17:29:42:WU01:FS02:0x22:Free Memory: 17.53GiB
17:29:42:WU01:FS02:0x22:    Threads: WINDOWS_THREADS
17:29:42:WU01:FS02:0x22: OS Version: 6.2
17:29:42:WU01:FS02:0x22:Has Battery: false
17:29:42:WU01:FS02:0x22: On Battery: false
17:29:42:WU01:FS02:0x22: UTC Offset: -5
17:29:42:WU01:FS02:0x22:        PID: 39736
17:29:42:WU01:FS02:0x22:        CWD: C:\ProgramData\FAHClient\work
17:29:42:WU01:FS02:0x22:************************************ OpenMM ************************************
17:29:42:WU01:FS02:0x22:    Version: 7.7.0
17:29:42:WU01:FS02:0x22:********************************************************************************
17:29:42:WU01:FS02:0x22:Project: 18448 (Run 11, Clone 33, Gen 1254)
17:29:42:WU01:FS02:0x22:Digital signatures verified
17:29:42:WU01:FS02:0x22:Folding@home GPU Core22 Folding@home Core
17:29:42:WU01:FS02:0x22:Version 0.0.20
17:29:42:WU01:FS02:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
17:29:42:WU01:FS02:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
17:29:42:WU01:FS02:0x22:  XTC frame write interval: 2500000 steps (1e+02%) [1 total]
17:29:42:WU01:FS02:0x22:  Global context and integrator variables write interval: disabled
17:29:42:WU01:FS02:0x22:There are 4 platforms available.
17:29:42:WU01:FS02:0x22:Platform 0: Reference
17:29:42:WU01:FS02:0x22:Platform 1: CPU
17:29:42:WU01:FS02:0x22:Platform 2: OpenCL
17:29:42:WU01:FS02:0x22:  opencl-device 0 specified
17:29:42:WU01:FS02:0x22:Platform 3: CUDA
17:29:42:WU01:FS02:0x22:  cuda-device 0 specified
17:29:46:WU01:FS02:0x22:Attempting to create CUDA context:
17:29:46:WU01:FS02:0x22:  Configuring platform CUDA
17:29:47:WU01:FS02:0x22:  Using CUDA and gpu 0
17:29:47:WU01:FS02:0x22:Completed 100000 out of 2500000 steps (4%)
17:30:40:WU01:FS02:0x22:Completed 125000 out of 2500000 steps (5%)
17:31:33:WU01:FS02:0x22:Completed 150000 out of 2500000 steps (6%)
17:31:33:WU01:FS02:0x22:Checkpoint completed at step 150000
17:32:26:WU01:FS02:0x22:Completed 175000 out of 2500000 steps (7%)
17:33:18:WU01:FS02:0x22:Completed 200000 out of 2500000 steps (8%)
17:33:19:WU01:FS02:0x22:Checkpoint completed at step 200000
17:34:11:WU01:FS02:0x22:Completed 225000 out of 2500000 steps (9%)
17:35:04:WU01:FS02:0x22:Completed 250000 out of 2500000 steps (10%)
17:35:05:WU01:FS02:0x22:Checkpoint completed at step 250000
17:35:58:WU01:FS02:0x22:Completed 275000 out of 2500000 steps (11%)
17:36:52:WU01:FS02:0x22:Completed 300000 out of 2500000 steps (12%)
17:36:53:WU01:FS02:0x22:Checkpoint completed at step 300000
17:37:45:WU01:FS02:0x22:Completed 325000 out of 2500000 steps (13%)
17:38:14:FS01:Paused
17:38:14:FS02:Paused
17:38:14:FS02:Shutting core down
17:38:14:WU01:FS02:0x22:WARNING:Console control signal 1 on PID 39736
17:38:14:WU01:FS02:0x22:Exiting, please wait. . .
17:38:14:WU01:FS02:0x22:Folding@home Core Shutdown: INTERRUPTED
17:38:15:WU01:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
17:38:20:Removing old file 'configs/config-20230325-034624.xml'
17:38:20:Saving configuration to config.xml
17:38:20:<config>
17:38:20:  <!-- Folding Slot Configuration -->
17:38:20:  <cause v='ALZHEIMERS'/>
17:38:20:  <client-type v='advanced'/>
17:38:20:
17:38:20:  <!-- Network -->
17:38:20:  <proxy v=':8080'/>
17:38:20:
17:38:20:  <!-- Slot Control -->
17:38:20:  <power v='full'/>
17:38:20:
17:38:20:  <!-- User Information -->
17:38:20:  <passkey v='*****'/>
17:38:20:  <team v='215762'/>
17:38:20:  <user v='JackEnright'/>
17:38:20:
17:38:20:  <!-- Folding Slots -->
17:38:20:  <slot id='1' type='GPU'>
17:38:20:    <paused v='true'/>
17:38:20:    <pci-bus v='18'/>
17:38:20:    <pci-slot v='0'/>
17:38:20:  </slot>
17:38:20:  <slot id='2' type='GPU'>
17:38:20:    <paused v='true'/>
17:38:20:    <pci-bus v='1'/>
17:38:20:    <pci-slot v='0'/>
17:38:20:  </slot>
17:38:20:</config>
17:39:05:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
17:39:17:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
17:39:17:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
17:39:17:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
bollix47
Posts: 2958
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project 18448 issue

Post by bollix47 »

It is 'normal' for the core to restart after an error; it will try up to 3 times before giving up and moving on.
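
For anyone who wants to see how often a work unit has restarted, here is a minimal Python sketch that counts the "FahCore returned" lines per work unit and slot. The regular expression is only inferred from the two return lines in the log above; `count_core_returns` is a made-up helper name, not part of the client.

```python
import re
from collections import Counter

# Pattern inferred from log lines such as:
#   17:29:41:WARNING:WU01:FS02:FahCore returned: CORE_RESTART (98 = 0x62)
#   17:38:15:WU01:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
RETURN_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?:WARNING:|ERROR:)?(WU\d+):(FS\d+):FahCore returned: (\w+)"
)


def count_core_returns(log_lines):
    """Count FahCore return codes, keyed by (work unit, slot, code)."""
    counts = Counter()
    for line in log_lines:
        match = RETURN_RE.match(line.strip())
        if match:
            counts[match.groups()] += 1
    return counts


# Sample lines copied from the log in the first post.
sample = [
    "17:29:41:WARNING:WU01:FS02:FahCore returned: CORE_RESTART (98 = 0x62)",
    "17:29:42:WU01:FS02:Starting",
    "17:38:15:WU01:FS02:FahCore returned: INTERRUPTED (102 = 0x66)",
]
counts = count_core_returns(sample)
print(counts[("WU01", "FS02", "CORE_RESTART")])
```

In the posted log there is only one CORE_RESTART for WU01 before the slot was paused, so the retry limit was not reached there.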

btw client-type=advanced is defunct and hasn't been valid for some time .... the only valid client-type is beta and that is only used by folders who have joined the beta team.
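
If it helps, a quick way to spot a stale setting is to look for the client-type element in config.xml. The sketch below is Python and only assumes the element layout shown in the config dump in the log above (with the timestamp prefixes stripped); `client_type_warning` is a hypothetical helper, and the "only beta is valid" rule is taken from this post, not from any client documentation.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the config dump from the log, timestamps removed.
SAMPLE_CONFIG = """
<config>
  <!-- Folding Slot Configuration -->
  <cause v='ALZHEIMERS'/>
  <client-type v='advanced'/>
  <power v='full'/>
</config>
"""


def client_type_warning(config_xml):
    """Return a warning string if client-type is set to anything but 'beta'."""
    root = ET.fromstring(config_xml)
    node = root.find("client-type")
    if node is None:
        return None  # option not set, nothing to flag
    value = node.get("v", "")
    if value == "beta":
        return None
    return f"client-type '{value}' is no longer valid; remove it from config.xml"


print(client_type_warning(SAMPLE_CONFIG))
```

Removing the option through the client's own configuration screen is probably safer than editing the file by hand while the client is running.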
TonyStewart14
Posts: 66
Joined: Fri Jan 06, 2012 6:37 am

Re: Project 18448 issue

Post by TonyStewart14 »

bollix47 wrote: Sun Jun 18, 2023 6:59 pm btw client-type=advanced is defunct and hasn't been valid for some time .... the only valid client-type is beta and that is only used by folders who have joined the beta team.
Good to know... apparently I should post/lurk here more often... :oops:

I'm still getting WUs from this project, but at least the client is now catching them and automatically dumping them as faulty, rather than me having to dump them manually as I did before starting this thread. One of them also made my screen go black and forced a hard restart, so I'll stop folding for now until this is resolved.
bollix47
Posts: 2958
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project 18448 issue

Post by bollix47 »

The following is just an fyi and may not apply to you but I'm including it here in case it's applicable.

I saw that you posted you have 32 GB of RAM, and some might say "that's plenty, I don't need a pagesys" (i.e. a page file, pagefile.sys).

I've seen some of the new projects use around 20 GB for the core alone and I've seen them display the same symptoms that you just posted.

I made sure my pagesys was active and at least 2 to 3 times the size of my ram .... I never saw those symptoms again. :wink:
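
As a rough back-of-the-envelope check, here is the arithmetic in Python using the figures from the log above (31.14 GiB total, 17.69 GiB free) and my ballpark of about 20 GB for the core. It assumes the "bad allocation" message is a failed host memory allocation and treats free RAM plus page file as the memory available to the core, which is a simplification of how Windows actually accounts for memory. `commit_headroom_gib` is just an illustrative name.

```python
def commit_headroom_gib(free_ram_gib, pagefile_gib, core_need_gib):
    """Memory left over (negative means a shortfall) under the simple
    model: available = free RAM + page file."""
    return free_ram_gib + pagefile_gib - core_need_gib


TOTAL_RAM_GIB = 31.14  # "Memory" line in the posted log
FREE_RAM_GIB = 17.69   # "Free Memory" line in the posted log
CORE_NEED_GIB = 20.0   # rough figure from this post, treating GB as GiB

# No page file at all versus a page file of 2x RAM.
no_pagefile = commit_headroom_gib(FREE_RAM_GIB, 0.0, CORE_NEED_GIB)
with_pagefile = commit_headroom_gib(FREE_RAM_GIB, 2 * TOTAL_RAM_GIB, CORE_NEED_GIB)

print(f"no page file:     {no_pagefile:+.2f} GiB")
print(f"2x RAM page file: {with_pagefile:+.2f} GiB")
```

Under those assumptions the machine comes up a couple of GiB short without a page file and has plenty of room with one, which would fit the symptoms.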
TonyStewart14
Posts: 66
Joined: Fri Jan 06, 2012 6:37 am

Re: Project 18448 issue

Post by TonyStewart14 »

bollix47 wrote: Sun Jun 18, 2023 8:26 pm I made sure my pagesys was active and at least 2 to 3 times the size of my ram .... I never saw those symptoms again. :wink:
I'm glad you brought that up after reading the log, just like with the -advanced flag. I originally disabled the pagefile for security reasons due to an AMD processor vulnerability, but I also have BitLocker on, so I assume that would mitigate the risk to the point where it's not a significant issue.

I'm surprised a GPU project would use that much system RAM, but if that's the case I'd definitely be open to enabling the pagefile, or even filling the other two RAM slots on my motherboard.
[Ars] For Caitlin
Posts: 43
Joined: Sun Jan 06, 2008 11:06 pm
Hardware configuration: Two homebuilt rigs. Radeon 6900XT. Nvidia 2060 Super.

Re: Project 18448 issue

Post by [Ars] For Caitlin »

FWIW, this project and its best friend 18449 crash my Windows 10/6900XT box, usually at the first checkpoint. By crash I mean a black screen and reboot. Once rebooted, the project usually runs to completion.