WU 13456

Posted: Thu Jul 29, 2021 3:22 pm
by WhitehawkEQ
I'm getting missing VC .dll messages when WU 13456 is assigned to my GTX 1080 cards, all 10 of them. My 2 RTX 2080 cards handle it fine.
This WU comes from WS 54.157.202.86.

Re: WU 13456

Posted: Thu Jul 29, 2021 4:01 pm
by JohnChodera
Oh no!

Can you give us any more information about which DLLs it thinks are missing? And perhaps a log showing which version of the core is being downloaded or used (should be 0.0.14)?

Thanks so much for helping us track this down.

~ John Chodera // MSKCC

Re: WU 13456

Posted: Thu Jul 29, 2021 4:02 pm
by aetch
It's not an issue with the work unit.
Core 22 version 0.0.14 requires a runtime package which may, or may not, have been installed on your computer.
viewtopic.php?f=108&t=37348

Re: WU 13456

Posted: Thu Jul 29, 2021 4:19 pm
by JohnChodera
Thanks, @aetch! We _should_ be distributing any runtime packages that are necessary for 0.0.14---if not, that's a bug, and on us to fix!

Thanks for the pointer!

~ John Chodera // MSKCC

Re: WU 13456

Posted: Thu Jul 29, 2021 4:33 pm
by Gary480six
John,

In my case it was MSVCP140.dll and then api-ms-win-crt-runtime-l1-1-0.dll that were missing. (It's been different .dlls for others.)

On one computer I had to install several Microsoft packages to get that P13456 to Fold:

Microsoft Visual C++ 2015-2019 Redistributable
Windows Update KB2999226-x64
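
One way to check for these runtime DLLs ahead of time is simply to try loading them. The sketch below is only an illustration, assuming Python on Windows. `missing_dlls` and `REQUIRED_DLLS` are hypothetical names, the list holds just the two DLLs named above (others have reported different ones), and none of this is something the FAH client itself does.

```python
import ctypes
import sys

# Only the two DLLs named in this post; other machines reported different ones.
REQUIRED_DLLS = ["MSVCP140.dll", "api-ms-win-crt-runtime-l1-1-0.dll"]


def _load_windows_dll(name):
    # ctypes.WinDLL raises OSError when the DLL cannot be found or loaded.
    ctypes.WinDLL(name)


def missing_dlls(names, loader=None):
    """Return the subset of `names` that the loader fails to load."""
    if loader is None:
        if sys.platform != "win32":
            raise RuntimeError("the default loader only works on Windows")
        loader = _load_windows_dll
    missing = []
    for name in names:
        try:
            loader(name)
        except OSError:
            missing.append(name)
    return missing
```

On a Windows machine, `missing_dlls(REQUIRED_DLLS)` returning an empty list would suggest both DLLs are present. The `loader` parameter exists only so the function can be exercised on other platforms.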

Re: WU 13456

Posted: Thu Jul 29, 2021 9:06 pm
by toTOW
See my answer (and John's too) here: viewtopic.php?p=352578#p352578

Re: WU 13456

Posted: Thu Jul 29, 2021 10:50 pm
by WhitehawkEQ
I wonder if maybe updating the video drivers might add the .dll's? I say this because the 2 RTX 2080 cards don't get this error as they are using the latest drivers.

Did some digging: all of my GTX 1080 cards are using core22 0.0.13, while the RTX 2080 cards are using core22 0.0.14. I'm going to update the video driver on 1 of my PCs with a GTX 1080 and see what happens.

p.s. forgot to add client V7.4.4
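
For anyone else checking which core version each slot is running: the "Running FahCore" line in the client log carries the version in the core's path (for example `.../22-0.0.14/Core_22.fah/...`). A minimal sketch for pulling it out, assuming that path layout holds for other versions too; `core_version_from_log_line` is a hypothetical helper, not part of the client.

```python
import re

# Matches the "<core type>-<version>/Core_<n>.fah" part of the FahCore path,
# accepting either forward slashes or backslashes as separators.
CORE_PATH_RE = re.compile(r"[/\\](\d+)-(\d+(?:\.\d+)+)[/\\]Core_\d+\.fah")


def core_version_from_log_line(line):
    """Return (core_type, version) from a 'Running FahCore' line, else None."""
    if "Running FahCore" not in line:
        return None
    match = CORE_PATH_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Running it over every line of log.txt and grouping by the `FSxx` slot field would show which slots are on 0.0.13 and which are on 0.0.14.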

Re: WU 13456

Posted: Fri Jul 30, 2021 4:58 am
by Tashgan
Hi,

I get another error message, this one relating to CUDA:

03:35:56:WU00:FS01:0x22:Failed to create CUDA context:
03:35:56:WU00:FS01:0x22:Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

I use an RTX 3070 Ti with the latest driver (471.41) on Windows 10 21H1.

Code:

...
03:34:29:WU00:FS01:Connecting to assign1.foldingathome.org:80
03:34:30:WU00:FS01:Assigned to work server 54.157.202.86
03:34:30:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 GA104 [GeForce RTX 3070 Ti] from 54.157.202.86
03:34:30:WU00:FS01:Connecting to 54.157.202.86:8080
03:34:33:WU00:FS01:Downloading 6.91MiB
03:34:35:WU00:FS01:Download complete
03:34:35:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13456 run:509 clone:1 gen:0 core:0x22 unit:0x000000010000000000003490000001fd
03:35:47:WU00:FS01:Starting
03:35:47:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Markus\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.14/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 12852 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
03:35:47:WU00:FS01:Started FahCore on PID 14340
03:35:47:WU00:FS01:Core PID:2052
03:35:47:WU00:FS01:FahCore 0x22 started
03:35:48:WU00:FS01:0x22:*********************** Log Started 2021-07-30T03:35:47Z ***********************
03:35:48:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
03:35:48:WU00:FS01:0x22:       Core: Core22
03:35:48:WU00:FS01:0x22:       Type: 0x22
03:35:48:WU00:FS01:0x22:    Version: 0.0.14
03:35:48:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:35:48:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
03:35:48:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
03:35:48:WU00:FS01:0x22:       Date: Jun 17 2021
03:35:48:WU00:FS01:0x22:       Time: 16:56:23
03:35:48:WU00:FS01:0x22:   Revision: 3eae048d03ef0e039e125221739933c8d8190daf
03:35:48:WU00:FS01:0x22:     Branch: HEAD
03:35:48:WU00:FS01:0x22:   Compiler: Visual C++
03:35:48:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
03:35:48:WU00:FS01:0x22:             -DOPENMM_VERSION="\"7.5.1\""
03:35:48:WU00:FS01:0x22:   Platform: win32 10
03:35:48:WU00:FS01:0x22:       Bits: 64
03:35:48:WU00:FS01:0x22:       Mode: Release
03:35:48:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
03:35:48:WU00:FS01:0x22:             <peastman@stanford.edu>
03:35:48:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 14340 -checkpoint 15
03:35:48:WU00:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
03:35:48:WU00:FS01:0x22:             nvidia -gpu 0 -gpu-usage 100
03:35:48:WU00:FS01:0x22:************************************ libFAH ************************************
03:35:48:WU00:FS01:0x22:       Date: Jun 17 2021
03:35:48:WU00:FS01:0x22:       Time: 16:55:36
03:35:48:WU00:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
03:35:48:WU00:FS01:0x22:     Branch: HEAD
03:35:48:WU00:FS01:0x22:   Compiler: Visual C++
03:35:48:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
03:35:48:WU00:FS01:0x22:   Platform: win32 10
03:35:48:WU00:FS01:0x22:       Bits: 64
03:35:48:WU00:FS01:0x22:       Mode: Release
03:35:48:WU00:FS01:0x22:************************************ CBang *************************************
03:35:48:WU00:FS01:0x22:       Date: Jun 17 2021
03:35:48:WU00:FS01:0x22:       Time: 16:54:53
03:35:48:WU00:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
03:35:48:WU00:FS01:0x22:     Branch: HEAD
03:35:48:WU00:FS01:0x22:   Compiler: Visual C++
03:35:48:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
03:35:48:WU00:FS01:0x22:   Platform: win32 10
03:35:48:WU00:FS01:0x22:       Bits: 64
03:35:48:WU00:FS01:0x22:       Mode: Release
03:35:48:WU00:FS01:0x22:************************************ System ************************************
03:35:48:WU00:FS01:0x22:        CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
03:35:48:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 165 Stepping 5
03:35:48:WU00:FS01:0x22:       CPUs: 12
03:35:48:WU00:FS01:0x22:     Memory: 31.90GiB
03:35:48:WU00:FS01:0x22:Free Memory: 26.41GiB
03:35:48:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
03:35:48:WU00:FS01:0x22: OS Version: 6.2
03:35:48:WU00:FS01:0x22:Has Battery: false
03:35:48:WU00:FS01:0x22: On Battery: false
03:35:48:WU00:FS01:0x22: UTC Offset: 2
03:35:48:WU00:FS01:0x22:        PID: 2052
03:35:48:WU00:FS01:0x22:        CWD: C:\Users\Markus\AppData\Roaming\FAHClient\work
03:35:48:WU00:FS01:0x22:************************************ OpenMM ************************************
03:35:48:WU00:FS01:0x22:    Version: 7.5.1
03:35:48:WU00:FS01:0x22:********************************************************************************
03:35:48:WU00:FS01:0x22:Project: 13456 (Run 509, Clone 1, Gen 0)
03:35:48:WU00:FS01:0x22:Unit: 0x00000000000000000000000000000000
03:35:48:WU00:FS01:0x22:Reading tar file core.xml
03:35:48:WU00:FS01:0x22:Reading tar file integrator.xml.bz2
03:35:48:WU00:FS01:0x22:Reading tar file state.xml.bz2
03:35:48:WU00:FS01:0x22:Reading tar file system.xml.bz2
03:35:48:WU00:FS01:0x22:Digital signatures verified
03:35:48:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
03:35:48:WU00:FS01:0x22:Version 0.0.14
03:35:48:WU00:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
03:35:48:WU00:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
03:35:48:WU00:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
03:35:48:WU00:FS01:0x22:  Global context and integrator variables write interval: 25000 steps (2.5%) [40 total]
03:35:48:WU00:FS01:0x22:There are 4 platforms available.
03:35:48:WU00:FS01:0x22:Platform 0: Reference
03:35:48:WU00:FS01:0x22:Platform 1: CPU
03:35:48:WU00:FS01:0x22:Platform 2: OpenCL
03:35:48:WU00:FS01:0x22:  opencl-device 0 specified
03:35:48:WU00:FS01:0x22:Platform 3: CUDA
03:35:48:WU00:FS01:0x22:  cuda-device 0 specified
03:35:56:WU00:FS01:0x22:Attempting to create CUDA context:
03:35:56:WU00:FS01:0x22:  Configuring platform CUDA
03:35:56:WU00:FS01:0x22:Failed to create CUDA context:
03:35:56:WU00:FS01:0x22:Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
03:35:56:WU00:FS01:0x22:Attempting to create OpenCL context:
03:35:56:WU00:FS01:0x22:  Configuring platform OpenCL
03:36:01:WU00:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0
03:36:01:WU00:FS01:0x22:Completed 0 out of 1000000 steps (0%)
03:36:02:WU00:FS01:0x22:Checkpoint completed at step 0
03:37:17:WU00:FS01:0x22:Completed 10000 out of 1000000 steps (1%)
...

Re: WU 13456

Posted: Fri Jul 30, 2021 10:53 am
by Neil-B
I am seeing the same issue as Tashgan: RTX 3070, latest driver, Windows 10 Pro build 19043, fully patched. Reinstalling the NVIDIA drivers and rebooting didn't resolve the issue.

Edit:

Back-checked the logs. Apologies, this has been happening since the new core was downloaded. I should have spotted it sooner, but it doesn't show as an "error" in the Advanced Control error filter, and the OpenCL fallback was running with higher ppd than the 18108 WUs the kit had been running on CUDA, so I didn't check or question whether there was an issue. I should have spotted that it was Moonshot and circa 600k lower ppd than those usually are!

Additional Edit:

There also appears to be another CUDA-related issue reported here: https://foldingforum.org/viewtopic.php? ... 05#p352604. I've also PM'd John to give him a heads-up on these posts.

Re: WU 13456

Posted: Fri Jul 30, 2021 3:16 pm
by JohnChodera
@Tashgan @Neil-B: Thanks for the heads up!
Thankfully, it looks like it falls back to OpenCL smoothly.

Any chance either of you have the CUDA Toolkit installed? If so, which version(s)?

~ John Chodera // MSKCC

Re: WU 13456

Posted: Fri Jul 30, 2021 3:24 pm
by Neil-B
No CUDA Toolkit installed here. I'd guess this may be happening to a lot of folders, as the failover to OpenCL means it isn't immediately obvious unless one is monitoring logs/ppd carefully.
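
Since the fallback is silent, a small log scan can flag it. This is a rough sketch based only on the message text in Tashgan's log above ("Failed to create CUDA context", "Error compiling program", "Using OpenCL"); `find_cuda_fallbacks` is a hypothetical helper, and other core versions may word these lines differently.

```python
import re

# Core 0x22 log line: HH:MM:SS:WUnn:FSnn:0x22:<message>
CORE_LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d):(WU\d+):(FS\d+):0x22:(.*)$")


def find_cuda_fallbacks(lines):
    """Return (wu, slot, error) for each unit that failed CUDA and then ran on OpenCL."""
    pending = {}  # (wu, slot) -> error text, or None until the error line is seen
    fallbacks = []
    for line in lines:
        match = CORE_LINE_RE.match(line.strip())
        if match is None:
            continue
        _, wu, slot, message = match.groups()
        key = (wu, slot)
        text = message.strip()
        if text.startswith("Failed to create CUDA context"):
            pending[key] = None
        elif key in pending and pending[key] is None and text.startswith("Error compiling program"):
            pending[key] = text
        elif key in pending and text.startswith("Using OpenCL"):
            fallbacks.append((wu, slot, pending.pop(key) or "unknown error"))
        elif text.startswith("Using CUDA"):
            # CUDA context was created after all, so there is nothing to report.
            pending.pop(key, None)
    return fallbacks
```

Feeding it the lines of log.txt (for example `find_cuda_fallbacks(open(path, encoding="utf-8"))`, with `path` pointing at your own log) gives one entry per work unit that quietly dropped to OpenCL.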

Re: WU 13456

Posted: Fri Jul 30, 2021 3:55 pm
by Tashgan
I have no cuda toolkit installed.

Re: WU 13456

Posted: Fri Jul 30, 2021 5:03 pm
by WhitehawkEQ
Here is a log from 1 of my GTX 1080 cards:

Code:

09:07:39:WU01:FS01:0x22:Checkpoint completed at step 2375000
09:09:35:WU01:FS01:0x22:Completed 2400000 out of 2500000 steps (96%)
09:11:32:WU01:FS01:0x22:Completed 2425000 out of 2500000 steps (97%)
09:13:29:WU01:FS01:0x22:Completed 2450000 out of 2500000 steps (98%)
09:15:25:WU01:FS01:0x22:Completed 2475000 out of 2500000 steps (99%)
09:17:19:WU01:FS01:0x22:Completed 2500000 out of 2500000 steps (100%)
09:17:19:WU01:FS01:0x22:Average performance: 14.8709 ns/day
09:17:20:WU00:FS01:Connecting to 65.254.110.245:80
09:17:20:WU00:FS01:Assigned to work server 54.157.202.86
09:17:20:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:GP104 [GeForce GTX 1080] 8873 from 54.157.202.86
09:17:20:WU00:FS01:Connecting to 54.157.202.86:8080
09:17:20:WU01:FS01:0x22:Checkpoint completed at step 2500000
09:17:21:WU00:FS01:Downloading 6.93MiB
09:17:22:WU00:FS01:Download complete
09:17:23:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13456 run:591 clone:43 gen:0 core:0x22 unit:0x0000002b00000000000034900000024f
09:17:29:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
09:17:29:WU01:FS01:0x22:Saving result file checkpointIntegrator.xml.bz2
09:17:29:WU01:FS01:0x22:Saving result file checkpointState.xml.bz2
09:17:29:WU01:FS01:0x22:Saving result file positions.xtc
09:17:29:WU01:FS01:0x22:Saving result file science.log
09:17:29:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
09:17:30:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
09:17:30:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17804 run:86 clone:38 gen:200 core:0x22 unit:0x00000026000000c80000458c00000056
09:17:30:WU01:FS01:Uploading 7.93MiB to 207.53.233.146
09:17:30:WU00:FS01:Starting
09:17:30:WU01:FS01:Connecting to 207.53.233.146:8080
09:17:30:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/xxxxxxxx/AppData/Roaming/FAHClient/cores/cores.foldingathome.org/win/64bit/22-0.0.14/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 704 -lifeline 11440 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
09:17:30:WU00:FS01:Started FahCore on PID 12348
09:17:31:WU00:FS01:Core PID:6312
09:17:31:WU00:FS01:FahCore 0x22 started
09:17:36:WU01:FS01:Upload 67.75%
09:17:38:WU01:FS01:Upload complete
09:17:38:WU01:FS01:Server responded WORK_ACK (400)
09:17:38:WU01:FS01:Final credit estimate, 131943.00 points
09:17:38:WU01:FS01:Cleaning up
******************************* Date: 2021-07-30 *******************************
After downloading 13456, it hangs up.

Re: WU 13456

Posted: Fri Jul 30, 2021 5:27 pm
by JohnChodera
Thanks for the feedback!

We built core22 0.0.14 against CUDA 10.2, and I realize that support for RTX 30x0s was only added in CUDA 11.1:
https://developer.nvidia.com/blog/cuda- ... 30-series/

We likely need to build 0.0.15 against the latest CUDA 11 to ensure CUDA will work with the latest, fastest GPUs.
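
To illustrate why the same build works on some cards and not others, here is a small lookup sketch. The compute capability numbers and per-toolkit limits are taken from NVIDIA's public documentation rather than from this thread, so treat them as assumptions to verify; `toolkit_can_target` is a hypothetical helper and the tables cover only the cards and toolkit versions mentioned here.

```python
# Highest compute capability each CUDA toolkit release can compile for
# (assumed from NVIDIA's documentation; only the versions discussed here).
MAX_COMPUTE_CAPABILITY = {
    (10, 2): (7, 5),
    (11, 0): (8, 0),
    (11, 1): (8, 6),
}

# Compute capability of the cards mentioned in this thread (same caveat).
GPU_COMPUTE_CAPABILITY = {
    "GeForce GTX 1080": (6, 1),
    "GeForce RTX 2080": (7, 5),
    "GeForce RTX 3070 Ti": (8, 6),
}


def toolkit_can_target(toolkit_version, gpu_name):
    """True if the toolkit's compiler accepts the GPU's architecture."""
    # Tuples compare element by element, so (8, 6) <= (7, 5) is False.
    return GPU_COMPUTE_CAPABILITY[gpu_name] <= MAX_COMPUTE_CAPABILITY[toolkit_version]
```

Under these assumptions, `toolkit_can_target((10, 2), "GeForce RTX 3070 Ti")` is `False`, which would be consistent with nvrtc rejecting the `--gpu-architecture` value, while `toolkit_can_target((11, 1), "GeForce RTX 3070 Ti")` is `True`.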

@WhitehawkEQ: I don't see an issue in that log---did you mean to send another log snippet?

~ John Chodera // MSKCC

Re: WU 13456

Posted: Fri Jul 30, 2021 5:50 pm
by Neil-B
@JohnChodera: Interesting about the CUDA version numbers. IIRC 0.0.13 reported it was using CUDA on my 3070. Did 0.0.14 step back a CUDA version, or was it my imagination? Or has it moved to a version that explicitly blocks the 30x0 series?

Edit: Just checked, and 0.0.13 did report/imply CUDA was in use:

Code:

12:10:51:WU00:FS01:0x22:Version 0.0.13
12:10:51:WU00:FS01:0x22:  Checkpoint write interval: 250000 steps (5%) [20 total]
12:10:51:WU00:FS01:0x22:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
12:10:51:WU00:FS01:0x22:  XTC frame write interval: 50000 steps (1%) [100 total]
12:10:51:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
12:10:51:WU00:FS01:0x22:There are 4 platforms available.
12:10:51:WU00:FS01:0x22:Platform 0: Reference
12:10:51:WU00:FS01:0x22:Platform 1: CPU
12:10:51:WU00:FS01:0x22:Platform 2: OpenCL
12:10:51:WU00:FS01:0x22:  opencl-device 0 specified
12:10:51:WU00:FS01:0x22:Platform 3: CUDA
12:10:51:WU00:FS01:0x22:  cuda-device 0 specified
12:10:55:WU00:FS01:0x22:Attempting to create CUDA context:
12:10:55:WU00:FS01:0x22:  Configuring platform CUDA
12:11:59:WU00:FS01:0x22:  Using CUDA and gpu 0
12:11:59:WU00:FS01:0x22:Completed 0 out of 5000000 steps (0%)