CPU unstable when folding, stable in every other application
Posted: Wed Apr 08, 2020 7:18 am
by JacobSmith
I have an Intel i7-8700K running at 4.8 GHz. It reaches 76 °C in the Intel Extreme Tuning Utility stress test and passes that test. If I use the default 11/12 threads in FAH, my PC crashes. Reducing the number of threads used in FAH to 8/12 works OK. Can someone explain to me why folding proteins is so much more stressful on my CPU than a stress test? I think the default number of threads should be reduced to avoid this issue.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 7:34 am
by PantherX
Can you please post your log file? Make sure it includes the System configuration section at the start of the log (viewtopic.php?f=61&t=26036).
My guess is that the "crash" at 11 CPUs happens because 11 is a large prime number. 10 CPUs is a multiple of 5, and not many Projects play nicely with that. 9 CPUs might work, but 8 CPUs will definitely work on most Projects.
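To make the thread-count guess above concrete, here is a minimal Python sketch (not part of the FAH client) that reports the largest prime factor of a thread count and picks the nearest lower count built only from small factors. The cutoff `RISKY_PRIME = 5` is an assumption taken from the remark about 10 CPUs above; which counts a given Project actually accepts can differ.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n must be at least 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    factor = 2
    # Divide out each factor in turn; whatever remains above 1 is prime.
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        largest = n
    return largest


# Assumption for illustration: counts containing a prime factor of 5 or
# more are treated as risky. This is not an official FAH rule.
RISKY_PRIME = 5


def safe_thread_count(available: int, risky_prime: int = RISKY_PRIME) -> int:
    """Largest count <= available whose prime factors are all below risky_prime."""
    for count in range(available, 1, -1):
        if largest_prime_factor(count) < risky_prime:
            return count
    return 1


if __name__ == "__main__":
    for threads in (12, 11, 10, 9, 8):
        print(threads, "-> largest prime factor", largest_prime_factor(threads))
    print("suggested count for 11 threads:", safe_thread_count(11))
```

With this cutoff, 11 available threads give 9, and raising the cutoff to 7 gives 10. Neither result guarantees that a particular Project will run.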
Regarding the temperature, F@H uses the AVX2 instruction set:
https://en.wikipedia.org/wiki/Advanced_ ... Extensions
These instructions bring a significant increase in folding performance and were heavily requested years ago by the folding community to speed up CPU folding. Since the CPU is working harder, it produces more heat. Trouble cooling the CPU is not caused by the software; it is a hardware matter, most commonly insufficient CPU cooling or overclocking.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 7:38 am
by JacobSmith
*********************** Log Started 2020-04-08T07:30:31Z ***********************
07:30:31:************************* Folding@home Client *************************
07:30:31: Website: https://foldingathome.org/
07:30:31: Copyright: (c) 2009-2018 foldingathome.org
07:30:31: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:30:31: Args: --open-web-control
07:30:31: Config: C:\Users\JACOB\AppData\Roaming\FAHClient\config.xml
07:30:31:******************************** Build ********************************
07:30:31: Version: 7.5.1
07:30:31: Date: May 11 2018
07:30:31: Time: 13:06:32
07:30:31: Repository: Git
07:30:31: Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
07:30:31: Branch: master
07:30:31: Compiler: Visual C++ 2008
07:30:31: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:30:31: Platform: win32 10
07:30:31: Bits: 32
07:30:31: Mode: Release
07:30:31:******************************* System ********************************
07:30:31: CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
07:30:31: CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
07:30:31: CPUs: 12
07:30:31: Memory: 15.93GiB
07:30:31: Free Memory: 13.10GiB
07:30:31: Threads: WINDOWS_THREADS
07:30:31: OS Version: 6.2
07:30:31: Has Battery: false
07:30:31: On Battery: false
07:30:31: UTC Offset: -4
07:30:31: PID: 10256
07:30:31: CWD: C:\Users\JACOB\AppData\Roaming\FAHClient
07:30:31: OS: Windows 10 Home
07:30:31: OS Arch: AMD64
07:30:31: GPUs: 1
07:30:31: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:8 GP102 [GeForce GTX 1080 Ti] 11380
07:30:31: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:11.0
07:30:31:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:445.75
07:30:31: Win32 Service: false
07:30:31:***********************************************************************
07:30:31:<config>
07:30:31: <!-- Network -->
07:30:31: <proxy v=':8080'/>
07:30:31:
07:30:31: <!-- Slot Control -->
07:30:31: <power v='full'/>
07:30:31:
07:30:31: <!-- User Information -->
07:30:31: <passkey v='********************************'/>
07:30:31: <team v='224497'/>
07:30:31: <user v='JacobSmith'/>
07:30:31:
07:30:31: <!-- Folding Slots -->
07:30:31: <slot id='0' type='CPU'>
07:30:31: <cpus v='8'/>
07:30:31: <paused v='true'/>
07:30:31: </slot>
07:30:31: <slot id='1' type='GPU'>
07:30:31: <paused v='true'/>
07:30:31: </slot>
07:30:31:</config>
07:30:31:Trying to access database...
07:30:31:Successfully acquired database lock
07:30:31:Enabled folding slot 00: PAUSED cpu:8 (by user)
07:30:31:Enabled folding slot 01: PAUSED gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 (by user)
07:30:34:10:127.0.0.1:New Web connection
07:30:46:FS00:Unpaused
07:30:46:FS01:Unpaused
07:30:46:WU02:FS01:Starting
07:30:46:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\JACOB\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 705 -lifeline 10256 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
07:30:46:WU02:FS01:Started FahCore on PID 1940
07:30:46:WU02:FS01:Core PID:9076
07:30:46:WU02:FS01:FahCore 0x22 started
07:30:47:WU02:FS01:0x22:*********************** Log Started 2020-04-08T07:30:46Z ***********************
07:30:47:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
07:30:47:WU02:FS01:0x22: Type: 0x22
07:30:47:WU02:FS01:0x22: Core: Core22
07:30:47:WU02:FS01:0x22: Website: https://foldingathome.org/
07:30:47:WU02:FS01:0x22: Copyright: (c) 2009-2018 foldingathome.org
07:30:47:WU02:FS01:0x22: Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
07:30:47:WU02:FS01:0x22: <rafal.wiewiora@choderalab.org>
07:30:47:WU02:FS01:0x22: Args: -dir 02 -suffix 01 -version 705 -lifeline 1940 -checkpoint 15
07:30:47:WU02:FS01:0x22: -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
07:30:47:WU02:FS01:0x22: 0 -gpu 0
07:30:47:WU02:FS01:0x22: Config: <none>
07:30:47:WU02:FS01:0x22:************************************ Build *************************************
07:30:47:WU02:FS01:0x22: Version: 0.0.2
07:30:47:WU02:FS01:0x22: Date: Dec 6 2019
07:30:47:WU02:FS01:0x22: Time: 21:30:31
07:30:47:WU02:FS01:0x22: Repository: Git
07:30:47:WU02:FS01:0x22: Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
07:30:47:WU02:FS01:0x22: Branch: HEAD
07:30:47:WU02:FS01:0x22: Compiler: Visual C++ 2008
07:30:47:WU02:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:30:47:WU02:FS01:0x22: Platform: win32 10
07:30:47:WU02:FS01:0x22: Bits: 64
07:30:47:WU02:FS01:0x22: Mode: Release
07:30:47:WU02:FS01:0x22:************************************ System ************************************
07:30:47:WU02:FS01:0x22: CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
07:30:47:WU02:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
07:30:47:WU02:FS01:0x22: CPUs: 12
07:30:47:WU02:FS01:0x22: Memory: 15.93GiB
07:30:47:WU02:FS01:0x22:Free Memory: 13.62GiB
07:30:47:WU02:FS01:0x22: Threads: WINDOWS_THREADS
07:30:47:WU02:FS01:0x22: OS Version: 6.2
07:30:47:WU02:FS01:0x22:Has Battery: false
07:30:47:WU02:FS01:0x22: On Battery: false
07:30:47:WU02:FS01:0x22: UTC Offset: -4
07:30:47:WU02:FS01:0x22: PID: 9076
07:30:47:WU02:FS01:0x22: CWD: C:\Users\JACOB\AppData\Roaming\FAHClient\work
07:30:47:WU02:FS01:0x22: OS: Windows 10 Home
07:30:47:WU02:FS01:0x22: OS Arch: AMD64
07:30:47:WU02:FS01:0x22:********************************************************************************
07:30:47:WU02:FS01:0x22:Project: 11777 (Run 0, Clone 2799, Gen 26)
07:30:47:WU02:FS01:0x22:Unit: 0x0000002d287234c95e73c43b1323d69c
07:30:47:WU02:FS01:0x22:Digital signatures verified
07:30:47:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:30:47:WU02:FS01:0x22:Version 0.0.2
07:30:47:WU02:FS01:0x22: Found a checkpoint file
07:30:51:WU02:FS01:0x22:Completed 1400000 out of 2000000 steps (70%)
07:30:51:WU02:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
07:30:53:WU00:FS00:Connecting to 65.254.110.245:8080
07:30:54:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
07:30:54:WU00:FS00:Connecting to 18.218.241.186:80
07:30:55:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:30:55:ERROR:WU00:FS00:Exception: Could not get an assignment
07:30:55:WU00:FS00:Connecting to 65.254.110.245:8080
07:30:55:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
07:30:55:WU00:FS00:Connecting to 18.218.241.186:80
07:30:56:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:30:56:ERROR:WU00:FS00:Exception: Could not get an assignment
07:31:32:Saving configuration to config.xml
07:31:32:<config>
07:31:32: <!-- Network -->
07:31:32: <proxy v=':8080'/>
07:31:32:
07:31:32: <!-- Slot Control -->
07:31:32: <power v='full'/>
07:31:32:
07:31:32: <!-- User Information -->
07:31:32: <passkey v='********************************'/>
07:31:32: <team v='224497'/>
07:31:32: <user v='JacobSmith'/>
07:31:32:
07:31:32: <!-- Folding Slots -->
07:31:32: <slot id='0' type='CPU'>
07:31:32: <cpus v='8'/>
07:31:32: </slot>
07:31:32: <slot id='1' type='GPU'/>
07:31:32:</config>
07:31:42:WU02:FS01:0x22:Completed 1420000 out of 2000000 steps (71%)
07:31:55:WU00:FS00:Connecting to 65.254.110.245:8080
07:31:55:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
07:31:55:WU00:FS00:Connecting to 18.218.241.186:80
07:31:55:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:31:55:ERROR:WU00:FS00:Exception: Could not get an assignment
07:32:32:WU02:FS01:0x22:Completed 1440000 out of 2000000 steps (72%)
07:33:25:WU02:FS01:0x22:Completed 1460000 out of 2000000 steps (73%)
07:33:32:WU00:FS00:Connecting to 65.254.110.245:8080
07:33:32:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
07:33:32:WU00:FS00:Connecting to 18.218.241.186:80
07:33:32:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:33:32:ERROR:WU00:FS00:Exception: Could not get an assignment
07:34:16:WU02:FS01:0x22:Completed 1480000 out of 2000000 steps (74%)
07:35:08:WU02:FS01:0x22:Completed 1500000 out of 2000000 steps (75%)
07:36:02:WU02:FS01:0x22:Completed 1520000 out of 2000000 steps (76%)
07:36:09:WU00:FS00:Connecting to 65.254.110.245:8080
07:36:09:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
07:36:09:WU00:FS00:Connecting to 18.218.241.186:80
07:36:10:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:36:10:ERROR:WU00:FS00:Exception: Could not get an assignment
07:36:54:WU02:FS01:0x22:Completed 1540000 out of 2000000 steps (77%)
07:37:48:WU02:FS01:0x22:Completed 1560000 out of 2000000 steps (78%)
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 7:40 am
by JacobSmith
I do not see how this is a hardware issue. My CPU is stable in every other application, therefore it must be a software issue.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 7:49 am
by PantherX
I don't see any CPU crashes in the log posted above. If you want, you can look through previous logs for the error message from when your CPU was attempting to fold on 11 CPUs.
This thread provides a nice overview of the different types of stress tests, the impact of AVX2 on hardware, and how to deal with it:
https://forums.tomshardware.com/threads ... x.3414412/
In essence, unless the software is instructing the hardware to do something it is not expected to do, it is a hardware issue. In this case, the heat can be addressed by upgrading the cooling system, ensuring the system has adequate ventilation, lowering the CPU clock, etc.
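For going through previous logs as suggested above, a small Python sketch like the following can pull out the WARNING and ERROR lines for one folding slot. The line layout is taken from the log posted in this thread; the function name `find_problems` and the file name in the usage comment are illustrative assumptions, not part of the FAH client.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Matches client log lines such as
#   07:30:55:ERROR:WU00:FS00:Exception: Could not get an assignment
# Lines without an ERROR/WARNING level (progress messages etc.) do not match.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?P<level>ERROR|WARNING):"
    r"(?:WU\d+:)?(?:FS(?P<slot>\d+):)?"
    r"(?P<message>.*)$"
)


def find_problems(
    lines: Iterable[str], slot: Optional[str] = None
) -> List[Tuple[str, str, str]]:
    """Return (time, level, message) for each WARNING/ERROR line.

    If slot is given (for example "00" for FS00), only lines tagged
    with that folding slot are returned.
    """
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        if slot is not None and match.group("slot") != slot:
            continue
        hits.append(
            (match.group("time"), match.group("level"), match.group("message"))
        )
    return hits


# Usage sketch (the path is an assumption; point it at one of your saved logs):
# with open("log.txt", encoding="utf-8", errors="replace") as handle:
#     for time, level, message in find_problems(handle, slot="00"):
#         print(time, level, message)
```

Note that if the whole PC crashes rather than just the work unit, the log may simply stop without any error line, so an empty result does not rule out a problem.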
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 10:01 am
by HaloJones
JacobSmith wrote: I do not see how this is a hardware issue. My CPU is stable in every other application, therefore it must be a software issue.
Hundreds of thousands of people use this software without issue, so I'm sorry, it must not be a software issue.
Your hardware may well pass stress tests. This isn't a test. It checks what it is doing as it goes along, and if it detects an anomaly it dumps the work. "Close" isn't good enough for science.
8th gen Intel CPUs are notorious for high temperatures and poor internal TIM between the CPU die and the IHS. Try removing your overclock and see if it can then complete a unit.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 10:29 am
by HugoNotte
Could it be that the number of threads / cores, in your case 11, is the problem? Maybe try to change the CPU slot according to this post:
viewtopic.php?f=61&t=34046&p=323142&hilit=large+prime#p323142
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 10:42 am
by Neil-B
JacobSmith wrote: I do not see how this is a hardware issue. My CPU is stable in every other application, therefore it must be a software issue.
FAH is one of the most CPU-intensive software packages there is - actually far more intensive than many "stress tests", which tend to mimic the style of load that other software puts on CPUs (just more) … and for GPUs it is possibly even more intense.
A modern, well-maintained computer should have no issues running all CPU cores flat out 24/7/365 - if the cooling system struggles to cope with this, then the system should throttle back the core speeds until a balance is found … Obviously, for some people the impacts of this (mostly noise) may not be what they want, and they may choose to limit how many cores FAH uses to manage this.
Having said that, there are realities (a few listed here):
1) Laptops tend to have really iffy cooling - many use the same heat pipe for both CPU and GPU - and the manufacturers tend not to expect either/both CPU and GPU to run flat out - they may get very noisy and still not manage to control heat even with the fan at full speed and full throttling - this can cause issues … This is not an issue with the software, simply "poor" design and/or use of the laptop in a way the designers didn't consider … Laptops also tend to clog their ventilation systems fairly quickly, which can exacerbate the issues.
2) Overclocking (even factory overclocking) tends to be benchmarked as stable using less intensive software … This simply means that when issues occur with FAH, the system is "over" clocked and needs stepping back - fan profiles might also need adjusting (these tend to be restricted so as to be "nice and quiet", which can make them a significant part of the issue).
3) Some old kit can simply be suffering from age-related issues (especially with some of the older cores), and if there is a weakness it is likely to be exposed by the FAH software.
So is it a hardware issue or a software issue? … Both, depending on how you look at it, I guess … It is a software issue in that the FAH software is designed/optimised to use as much CPU as you let it (and as you have found, there are ways to restrict this), which for most machines will cause no issues - for some this will be too much, and so yes, FAH (running as designed) may be just too much for the machine … It is a hardware issue in that the hardware is simply not able to manage itself the way it should - all machines should be able to cool themselves at maximum load (but some can't, either through weak design or through being overclocked to the limit of their cooling at lower loads).
All I can offer by way of observation is that if your machine can run the software at a lower number of cores, then it won't be the software that is crashing your machine … When it crashes at higher core counts, the software is doing nothing to your machine that it isn't doing at lower core counts, except working it even harder - there must therefore be some form of instability (usually cooling or power related) that means it cannot manage the extra load in the way it should be able to.
Your situation may not be helped by a known "feature" of your CPU …
https://www.techpowerup.com/forums/thre ... ds.238287/
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 11:04 am
by JacobSmith
nope, it's a software issue
my cooling is really good
the i7-8700K is an excellent processor
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 12:44 pm
by Methoraptor
JacobSmith wrote: nope, it's a software issue
my cooling is really good
the i7-8700K is an excellent processor
I'm new to folding but I have some experience with stress tests. You know why PC builders test their builds using an actual game as well as showing you CPU-Z stats? It's because stress test programs are artificial and don't quite measure up to a genuine workload. Clean up your attitude, dude; these people are trying to help you.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 5:06 pm
by Joe_H
Yes, the i7-8700K is an excellent processor, if you manage to win the silicon lottery. Many suffer from poor placement of the TIM compound Intel applied between the chip and its "lid", and occasionally a lid a bit off from its proper location. And the properties Intel chose for the TIM emphasized lifespan, not thermal conductivity.
Re: CPU unstable when folding, stable in every other applica
Posted: Wed Apr 08, 2020 6:51 pm
by HugoNotte
Maybe just to highlight how much thermal stress F@H AVX WUs put on a processor: I am running F@H on a laptop with an i5 Ivy Bridge processor and a GT630M GPU. The CPU slot uses 2 threads, and the GPU slot 1 more. That should leave 1 thread free, and the CPU runs at around 85% load when just folding. While folding, I did a system update in the background (I run Linux). While updating the kernel, the CPU load went to 100%, and from experience I know the kernel update puts 100% load on all 4 threads for something like 40 - 50 seconds, even when the system is otherwise idle. While folding, the CPU temperature is at around 87 degrees with 85% load. As soon as the kernel update took over the 4 threads and loaded the CPU to 100%, the temperature dropped to 76 degrees!
I have run a couple of benchmark and stress programs on both my laptop and desktop under Windows and find that the temperatures usually stay noticeably cooler when those programs push the CPU to 100% than when running F@H or certain BOINC projects; I think Rosetta@home is one of the more demanding ones. These AVX WUs make temperatures rise significantly.
The above scenario happened on a CPU running 2 AVX threads. Your CPU is running 11 AVX threads, which should make an even bigger difference.
If it's not a thermal problem, have a look at the large prime issue regarding the number of threads. It may be better to split your CPU into 2 slots.
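As a rough illustration of the two-slot idea, the Python sketch below looks for a way to divide a thread budget into two slot sizes that each contain only small prime factors. The allowed primes (2 and 3) and the preference for the most balanced split are assumptions made for this example, not FAH rules, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def has_only_small_factors(n: int, primes: Tuple[int, ...] = (2, 3)) -> bool:
    """True if n is at least 1 and is a product of the given primes only."""
    if n < 1:
        return False
    for prime in primes:
        while n % prime == 0:
            n //= prime
    return n == 1


def split_into_two_slots(
    total: int, primes: Tuple[int, ...] = (2, 3)
) -> Optional[Tuple[int, int]]:
    """Return (a, b) with a + b == total and a >= b >= 1, where both sizes
    contain only the allowed prime factors; None if no such split exists.

    The most balanced split is tried first.
    """
    for first in range((total + 1) // 2, total):
        second = total - first
        if has_only_small_factors(first, primes) and has_only_small_factors(
            second, primes
        ):
            return first, second
    return None


if __name__ == "__main__":
    print(split_into_two_slots(11))
```

For 11 threads this yields slots of 8 and 3 threads. Whether two smaller CPU slots fold as efficiently as one larger slot is a separate question that this sketch does not address.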