FAH Client errors

tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

FAH Client errors

Post by tulanebarandgrill »

First off, I see that I am getting messages from FS01 (this is my GPU slot) starting 0x21; does this mean I have a really old client?

Secondly, I'm getting these BAD_WORK_UNIT errors that started a few days ago.

Here's an example starting with what looks like the first sign of trouble:

Code:

21:53:35:WARNING:FS01:Size of positions 5845 does not match topology 5844
21:54:57:WU02:FS00:0xa7:Completed 2300000 out of 5000000 steps (46%)
21:57:06:WU01:FS01:0x21:Completed 4300000 out of 5000000 steps (86%)
21:57:06:WU02:FS00:0xa7:Completed 2350000 out of 5000000 steps (47%)
21:59:15:WU02:FS00:0xa7:Completed 2400000 out of 5000000 steps (48%)
22:00:41:WU01:FS01:0x21:Completed 4350000 out of 5000000 steps (87%)
22:01:25:WU02:FS00:0xa7:Completed 2450000 out of 5000000 steps (49%)
22:02:35:WU01:FS01:0x21:ERROR:exception: Error downloading array energyBuffer: clEnqueueReadBuffer (-5)
22:02:36:WU01:FS01:0x21:Saving result file logfile_01.txt
22:02:36:WU01:FS01:0x21:Saving result file log.txt
22:02:36:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
22:02:37:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:02:37:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11718 run:0 clone:223 gen:396 core:0x21 unit:0x0000023b8ca304e75b846a3f442a51ba
22:02:37:WU01:FS01:Uploading 19.50KiB to 140.163.4.231
22:02:37:WU01:FS01:Upload complete
22:02:37:WU01:FS01:Server responded WORK_ACK (400)
22:02:37:WU01:FS01:Cleaning up

Here's my startup log:

Code:

00:19:41:WU00:FS01:Starting
00:19:41:WU00:FS01:Core PID:5140
00:19:41:WU00:FS01:FahCore 0x21 started
00:19:41:WU01:FS00:Starting
00:19:41:WARNING:WU01:FS00:AS lowered CPUs from 14 to 13
00:19:41:WU01:FS00:Core PID:6100
00:19:41:WU01:FS00:FahCore 0xa7 started
00:19:42:WU00:FS01:0x21:*********************** Log Started 2019-06-13T00:19:42Z ***********************
00:19:42:WU00:FS01:0x21:Project: 14166 (Run 1, Clone 73, Gen 56)
00:19:42:WU00:FS01:0x21:Unit: 0x000000570002894c5c38bfefb36ff53e
00:19:42:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
00:19:42:WU00:FS01:0x21:Machine: 1
00:19:42:WU00:FS01:0x21:Digital signatures verified
00:19:42:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
00:19:42:WU00:FS01:0x21:Version 0.0.18
00:19:42:WU00:FS01:0x21:  Found a checkpoint file
00:19:42:WU01:FS00:0xa7:*********************** Log Started 2019-06-13T00:19:42Z ***********************
00:19:42:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
00:19:42:WU01:FS00:0xa7:       Type: 0xa7
00:19:42:WU01:FS00:0xa7:       Core: Gromacs
00:19:42:WU01:FS00:0xa7:    Website: https://foldingathome.org/
00:19:42:WU01:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
00:19:42:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
00:19:42:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 705 -lifeline 1840 -checkpoint 6 -np 13
00:19:42:WU01:FS00:0xa7:     Config: <none>
00:19:42:WU01:FS00:0xa7:************************************ Build *************************************
00:19:42:WU01:FS00:0xa7:    Version: 0.0.17
00:19:42:WU01:FS00:0xa7:       Date: Apr 27 2018
00:19:42:WU01:FS00:0xa7:       Time: 16:19:36
00:19:42:WU01:FS00:0xa7: Repository: Git
00:19:42:WU01:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
00:19:42:WU01:FS00:0xa7:     Branch: master
00:19:42:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
00:19:42:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
00:19:42:WU01:FS00:0xa7:   Platform: win32 10
00:19:42:WU01:FS00:0xa7:       Bits: 64
00:19:42:WU01:FS00:0xa7:       Mode: Release
00:19:42:WU01:FS00:0xa7:       SIMD: avx_256
00:19:42:WU01:FS00:0xa7:************************************ System ************************************
00:19:42:WU01:FS00:0xa7:        CPU: Unknown
00:19:42:WU01:FS00:0xa7:     CPU ID: 
00:19:42:WU01:FS00:0xa7:       CPUs: 16
00:19:42:WU01:FS00:0xa7:     Memory: 63.91GiB
00:19:42:WU01:FS00:0xa7:Free Memory: 60.13GiB
00:19:42:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
00:19:42:WU01:FS00:0xa7: OS Version: 6.2
00:19:42:WU01:FS00:0xa7:Has Battery: false
00:19:42:WU01:FS00:0xa7: On Battery: false
00:19:42:WU01:FS00:0xa7: UTC Offset: -5
00:19:42:WU01:FS00:0xa7:        PID: 6100
00:19:42:WU01:FS00:0xa7:        CWD: C:\Users\pt\AppData\Roaming\FAHClient\work
00:19:42:WU01:FS00:0xa7:         OS: Windows 10 Pro
00:19:42:WU01:FS00:0xa7:    OS Arch: AMD64
00:19:42:WU01:FS00:0xa7:********************************************************************************
00:19:42:WU01:FS00:0xa7:Project: 14153 (Run 17, Clone 278, Gen 143)
00:19:42:WU01:FS00:0xa7:Unit: 0x000000a50002894b5c6e03b0533e1b1c
00:19:42:WU01:FS00:0xa7:Digital signatures verified
00:19:42:WU01:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
00:19:42:WU01:FS00:0xa7:Calling: mdrun -s frame143.tpr -o frame143.trr -cpi state.cpt -cpt 6 -nt 12
00:19:42:WU01:FS00:0xa7:Steps: first=715000000 total=5000000
00:19:42:WU01:FS00:0xa7:Completed 676242 out of 5000000 steps (13%)
00:19:43:WU00:FS01:0x21:Completed 5750000 out of 12500000 steps (46%)
00:19:43:WU00:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900

Any ideas, folks?

Mod edit: added Code tags to log listings, please use them from the Full Editor
- TulaneBaG
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAH Client errors

Post by bruce »

tulanebarandgrill wrote:First off, I see that I am getting messages from FS01 (this is my GPU slot) starting 0x21; does this mean I have a really old client?
No. The notation 0x21 means that the WU you're working on is running FAHCore_21.exe, which is the standard GPU analysis software. You're also running FAHCore_a7.exe, which is the standard CPU analysis software, and FAHClient.exe, which is the standard back-end control software. Each one comes with its own revision number. The revision of your client is not included in the information you posted.
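
To make the log prefixes concrete, here is a minimal Python sketch that splits a line into its work unit, folding slot and core fields. The field layout is an assumption inferred only from the excerpts posted in this thread, and `parse_line` and `core_executable` are hypothetical helper names, not part of any FAH tool.

```python
import re

# Assumed layout, inferred from the log excerpts in this thread:
#   HH:MM:SS:[WARNING|ERROR:][WUnn:]FSnn:[0xNN:]message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>WARNING|ERROR):)?"
    r"(?:WU(?P<wu>\d{2}):)?"
    r"FS(?P<slot>\d{2}):"
    r"(?:0x(?P<core>[0-9a-fA-F]{2}):)?"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for a slot-related log line, else None."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def core_executable(core):
    """Map a core tag such as '21' or 'a7' to the executable name used above."""
    return f"FAHCore_{core}.exe"


if __name__ == "__main__":
    sample = "21:57:06:WU01:FS01:0x21:Completed 4300000 out of 5000000 steps (86%)"
    fields = parse_line(sample)
    print(fields["slot"], core_executable(fields["core"]), fields["message"])
```

Lines that carry no slot prefix, such as the "Saving configuration to config.xml" line, are deliberately not matched and come back as `None`.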

The error Error downloading array energyBuffer: clEnqueueReadBuffer (-5) is a report from your GPU's drivers indicating that the GPU has had a serious error and cannot continue. The most common cause of this error is GPU overclocking although it has also been attributed to specific revisions of driver software. Some projects are more sensitive to overclocking and this can be corrected by resetting your GPU to standard clock rates. Are you running the latest drivers from NVidia?
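
For reference, the number in parentheses is an OpenCL status code, and -5 is `CL_OUT_OF_RESOURCES` in the standard OpenCL headers. The sketch below pulls the call name and code out of a log line and looks the code up in a small table. The table is only a partial excerpt, the message format is assumed from the single line posted above, and `decode_opencl_error` is a hypothetical helper name.

```python
import re

# Partial excerpt of the standard OpenCL status codes (Khronos cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -36: "CL_INVALID_COMMAND_QUEUE",
}

# Assumed format: "<OpenCL call> (<status code>)" at the end of the message.
CALL_RE = re.compile(r"\b(cl[A-Z]\w*) \((-?\d+)\)")


def decode_opencl_error(line):
    """Return (call, code, symbolic name) or None if no OpenCL error is present."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    call, code = match.group(1), int(match.group(2))
    return call, code, OPENCL_ERRORS.get(code, "unknown")


if __name__ == "__main__":
    line = ("22:02:35:WU01:FS01:0x21:ERROR:exception: Error downloading "
            "array energyBuffer: clEnqueueReadBuffer (-5)")
    print(decode_opencl_error(line))
```

The symbolic name does not identify a cause by itself; it only confirms that the failure was reported by the OpenCL layer while reading results back from the GPU.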
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: FAH Client errors

Post by tulanebarandgrill »

Thanks for the reply.

I am running a pretty recent version of GPU driver. I just installed Windows on this machine about 10 days ago and downloaded everything fresh. Maybe it's too fresh.

FAH Control version is 7.5.1.

Also, I've seen other times where I had this error and immediately before there was a message about the clock being too far off or something like that. I checked and the system clock was 5 minutes fast. I think I have fixed the issue though; seems to have run without incident for most of today.

Yes here it is ...

Code:

16:30:01:WU00:FS01:Starting
16:30:01:WU00:FS01:Core PID:14888
16:30:01:WU00:FS01:FahCore 0x21 started
16:30:01:WU01:FS00:Starting
16:30:01:WARNING:WU01:FS00:AS lowered CPUs from 14 to 13
16:30:01:WU01:FS00:Core PID:7684
16:30:01:WU01:FS00:FahCore 0xa7 started
16:30:02:WU00:FS01:0x21:*********************** Log Started 2019-06-10T16:30:01Z ***********************
16:30:02:WU00:FS01:0x21:Project: 14177 (Run 75, Clone 4, Gen 75)
16:30:02:WU00:FS01:0x21:Unit: 0x000000680002894c5cae6ac3fdcd8e78
16:30:02:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
16:30:02:WU00:FS01:0x21:Machine: 1
16:30:02:WU00:FS01:0x21:Digital signatures verified
16:30:02:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
16:30:02:WU00:FS01:0x21:Version 0.0.18
16:30:02:WU00:FS01:0x21:  Found a checkpoint file
16:30:02:WU01:FS00:0xa7:*********************** Log Started 2019-06-10T16:30:01Z ***********************
16:30:02:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
16:30:02:WU01:FS00:0xa7:       Type: 0xa7
16:30:02:WU01:FS00:0xa7:       Core: Gromacs
16:30:02:WU01:FS00:0xa7:    Website: https://foldingathome.org/
16:30:02:WU01:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
16:30:02:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:30:02:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 705 -lifeline 9228 -checkpoint 6 -np 13
16:30:02:WU01:FS00:0xa7:     Config: <none>
16:30:02:WU01:FS00:0xa7:************************************ Build *************************************
16:30:02:WU01:FS00:0xa7:    Version: 0.0.17
16:30:02:WU01:FS00:0xa7:       Date: Apr 27 2018
16:30:02:WU01:FS00:0xa7:       Time: 16:19:36
16:30:02:WU01:FS00:0xa7: Repository: Git
16:30:02:WU01:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
16:30:02:WU01:FS00:0xa7:     Branch: master
16:30:02:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
16:30:02:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
16:30:02:WU01:FS00:0xa7:   Platform: win32 10
16:30:02:WU01:FS00:0xa7:       Bits: 64
16:30:02:WU01:FS00:0xa7:       Mode: Release
16:30:02:WU01:FS00:0xa7:       SIMD: avx_256
16:30:02:WU01:FS00:0xa7:************************************ System ************************************
16:30:02:WU01:FS00:0xa7:        CPU: Unknown
16:30:02:WU01:FS00:0xa7:     CPU ID: 
16:30:02:WU01:FS00:0xa7:       CPUs: 16
16:30:02:WU01:FS00:0xa7:     Memory: 63.91GiB
16:30:02:WU01:FS00:0xa7:Free Memory: 60.02GiB
16:30:02:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
16:30:02:WU01:FS00:0xa7: OS Version: 6.2
16:30:02:WU01:FS00:0xa7:Has Battery: false
16:30:02:WU01:FS00:0xa7: On Battery: false
16:30:02:WU01:FS00:0xa7: UTC Offset: -5
16:30:02:WU01:FS00:0xa7:        PID: 7684
16:30:02:WU01:FS00:0xa7:        CWD: C:\Users\pt\AppData\Roaming\FAHClient\work
16:30:02:WU01:FS00:0xa7:         OS: Windows 10 Pro
16:30:02:WU01:FS00:0xa7:    OS Arch: AMD64
16:30:02:WU01:FS00:0xa7:********************************************************************************
16:30:02:WU01:FS00:0xa7:Project: 14153 (Run 9, Clone 344, Gen 140)
16:30:02:WU01:FS00:0xa7:Unit: 0x0000009f0002894b5c6e8ef9a00b9b45
16:30:02:WU01:FS00:0xa7:Digital signatures verified
16:30:02:WU01:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
16:30:02:WU01:FS00:0xa7:Calling: mdrun -s frame140.tpr -o frame140.trr -cpi state.cpt -cpt 6 -nt 12
16:30:02:WU01:FS00:0xa7:Steps: first=700000000 total=5000000
16:30:02:WU01:FS00:0xa7:Completed 1719322 out of 5000000 steps (34%)
16:30:04:WU00:FS01:0x21:Completed 7250000 out of 25000000 steps (29%)
16:30:04:WU00:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
16:35:21:Saving configuration to config.xml
16:35:21:<config>
16:35:21:  <!-- Folding Core -->
16:35:21:  <checkpoint v='6'/>
16:35:21:
16:35:21:  <!-- Folding Slot Configuration -->
16:35:21:  <cause v='ALZHEIMERS'/>
16:35:21:
16:35:21:  <!-- Logging -->
16:35:21:  <verbosity v='2'/>
16:35:21:
16:35:21:  <!-- Network -->
16:35:21:  <proxy v=':8080'/>
16:35:21:
16:35:21:  <!-- User Information -->
16:35:21:  <passkey v='********************************'/>
16:35:21:  <team v='233664'/>
16:35:21:  <user v='TulaneBaG'/>
16:35:21:
16:35:21:  <!-- Folding Slots -->
16:35:21:  <slot id='0' type='CPU'/>
16:35:21:  <slot id='1' type='GPU'/>
16:35:21:</config>
16:35:21:WARNING:WU00:FS01:Detected clock skew (5 mins 00 secs), I/O delay, laptop hibernation or other slowdown noted, adjusting time estimates
16:35:21:WARNING:WU01:FS00:Detected clock skew (5 mins 00 secs), I/O delay, laptop hibernation or other slowdown noted, adjusting time estimates
16:35:21:WU00:FS01:0x21:ERROR:exception: Error downloading array energyBuffer: clEnqueueReadBuffer (-5)
16:35:22:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
16:35:22:WU00:FS01:0x21:Saving result file logfile_01.txt
16:35:22:WU00:FS01:0x21:Saving result file log.txt
16:35:22:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
16:35:22:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
16:35:22:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:14177 run:75 clone:4 gen:75 core:0x21 unit:0x000000680002894c5cae6ac3fdcd8e78
16:35:22:WU00:FS01:Uploading 3.31KiB to 155.247.166.220
16:35:22:WU00:FS01:Upload complete
16:35:22:WU00:FS01:Server responded WORK_ACK (400)
16:35:22:WU00:FS01:Cleaning up
- TulaneBaG
foldy
Posts: 2040
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: FAH Client errors

Post by foldy »

Detected clock skew: maybe that means your machine's battery is low, which makes the clock run too slow. If so, you can replace the mainboard battery with a new CR2032.
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: FAH Client errors

Post by Joe_H »

tulanebarandgrill wrote:Yes here it is ...
Actually you still have not posted the starting section of your log. As the directions in bruce's link state, either click on the Refresh button and scroll to the very beginning of the log, or navigate to the log file on your computer and open the log.txt file in a text editor. Information is included in the first 150-200 lines that is not repeated in the startup of a WU as you have shown.
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: FAH Client errors

Post by tulanebarandgrill »

Ok, to foldy: this is not a laptop, and there is no battery. However, the root cause of my clock running FAST (not slow) is that NTP wasn't working.

My real question though was can this cause the error after it? Because otherwise it seems an odd thing for the client to complain about.

In response to Joe's comments, I refreshed the log and am posting the beginning section ...

Code:

*********************** Log Started 2019-06-12T16:43:14Z ***********************
16:43:14:************************* Folding@home Client *************************
16:43:14:        Website: https://foldingathome.org/
16:43:14:      Copyright: (c) 2009-2018 foldingathome.org
16:43:14:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:43:14:           Args: 
16:43:14:         Config: C:\Users\pt\AppData\Roaming\FAHClient\config.xml
16:43:14:******************************** Build ********************************
16:43:14:        Version: 7.5.1
16:43:14:           Date: May 11 2018
16:43:14:           Time: 13:06:32
16:43:14:     Repository: Git
16:43:14:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
16:43:14:         Branch: master
16:43:14:       Compiler: Visual C++ 2008
16:43:14:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
16:43:14:       Platform: win32 10
16:43:14:           Bits: 32
16:43:14:           Mode: Release
16:43:14:******************************* System ********************************
16:43:14:            CPU: Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
16:43:14:         CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
16:43:14:           CPUs: 16
16:43:14:         Memory: 63.91GiB
16:43:14:    Free Memory: 60.93GiB
16:43:14:        Threads: WINDOWS_THREADS
16:43:14:     OS Version: 6.2
16:43:14:    Has Battery: false
16:43:14:     On Battery: false
16:43:14:     UTC Offset: -5
16:43:14:            PID: 12904
16:43:14:            CWD: C:\Users\pt\AppData\Roaming\FAHClient
16:43:14:             OS: Windows 10 Enterprise
16:43:14:        OS Arch: AMD64
16:43:14:           GPUs: 1
16:43:14:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:5 GM204 [GeForce GTX 970]
16:43:14:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.2 Driver:9.1
16:43:14:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:388.13
16:43:14:  Win32 Service: false
16:43:14:***********************************************************************
16:43:14:<config>
16:43:14:  <!-- Folding Core -->
16:43:14:  <checkpoint v='6'/>
16:43:14:
16:43:14:  <!-- Folding Slot Configuration -->
16:43:14:  <cause v='ALZHEIMERS'/>
16:43:14:
16:43:14:  <!-- Logging -->
16:43:14:  <verbosity v='2'/>
16:43:14:
16:43:14:  <!-- Network -->
16:43:14:  <proxy v=':8080'/>
16:43:14:
16:43:14:  <!-- User Information -->
16:43:14:  <passkey v='********************************'/>
16:43:14:  <team v='233664'/>
16:43:14:  <user v='TulaneBaG'/>
16:43:14:
16:43:14:  <!-- Folding Slots -->
16:43:14:  <slot id='0' type='CPU'>
16:43:14:    <paused v='true'/>
16:43:14:  </slot>
16:43:14:  <slot id='1' type='GPU'>
16:43:14:    <paused v='true'/>
16:43:14:  </slot>
16:43:14:</config>
16:43:14:Trying to access database...
16:43:14:Enabled folding slot 00: PAUSED cpu:14 (by user)
16:43:14:Enabled folding slot 01: PAUSED gpu:0:GM204 [GeForce GTX 970] (by user)
16:45:49:FS00:Unpaused
16:45:49:FS01:Unpaused
16:45:49:WU01:FS01:Starting
16:45:49:WU01:FS01:Core PID:7468
16:45:49:WU01:FS01:FahCore 0x21 started
16:45:49:WU02:FS00:Starting
16:45:49:WARNING:WU02:FS00:AS lowered CPUs from 14 to 13
16:45:49:WU02:FS00:Core PID:9708
16:45:49:WU02:FS00:FahCore 0xa7 started
16:45:49:WU01:FS01:0x21:*********************** Log Started 2019-06-12T16:45:49Z ***********************
16:45:49:WU01:FS01:0x21:Project: 11718 (Run 0, Clone 223, Gen 396)
16:45:49:WU01:FS01:0x21:Unit: 0x0000023b8ca304e75b846a3f442a51ba
16:45:49:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
16:45:49:WU01:FS01:0x21:Machine: 1
16:45:49:WU01:FS01:0x21:Digital signatures verified
16:45:49:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
16:45:49:WU01:FS01:0x21:Version 0.0.18
16:45:50:WU02:FS00:0xa7:*********************** Log Started 2019-06-12T16:45:49Z ***********************
16:45:50:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
16:45:50:WU02:FS00:0xa7:       Type: 0xa7
16:45:50:WU02:FS00:0xa7:       Core: Gromacs
16:45:50:WU02:FS00:0xa7:    Website: https://foldingathome.org/
16:45:50:WU02:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
16:45:50:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:45:50:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 705 -lifeline 5152 -checkpoint 6 -np 13
16:45:50:WU02:FS00:0xa7:     Config: <none>
16:45:50:WU02:FS00:0xa7:************************************ Build *************************************
16:45:50:WU02:FS00:0xa7:    Version: 0.0.17
16:45:50:WU02:FS00:0xa7:       Date: Apr 27 2018
16:45:50:WU02:FS00:0xa7:       Time: 16:19:36
16:45:50:WU02:FS00:0xa7: Repository: Git
16:45:50:WU02:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
16:45:50:WU02:FS00:0xa7:     Branch: master
16:45:50:WU02:FS00:0xa7:   Compiler: Visual C++ 2008
16:45:50:WU02:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
16:45:50:WU02:FS00:0xa7:   Platform: win32 10
16:45:50:WU02:FS00:0xa7:       Bits: 64
16:45:50:WU02:FS00:0xa7:       Mode: Release
16:45:50:WU02:FS00:0xa7:       SIMD: avx_256
16:45:50:WU02:FS00:0xa7:************************************ System ************************************
16:45:50:WU02:FS00:0xa7:        CPU: Unknown
16:45:50:WU02:FS00:0xa7:     CPU ID: 
16:45:50:WU02:FS00:0xa7:       CPUs: 16
16:45:50:WU02:FS00:0xa7:     Memory: 63.91GiB
16:45:50:WU02:FS00:0xa7:Free Memory: 60.67GiB
16:45:50:WU02:FS00:0xa7:    Threads: WINDOWS_THREADS
16:45:50:WU02:FS00:0xa7: OS Version: 6.2
16:45:50:WU02:FS00:0xa7:Has Battery: false
16:45:50:WU02:FS00:0xa7: On Battery: false
16:45:50:WU02:FS00:0xa7: UTC Offset: -5
16:45:50:WU02:FS00:0xa7:        PID: 9708
16:45:50:WU02:FS00:0xa7:        CWD: C:\Users\pt\AppData\Roaming\FAHClient\work
16:45:50:WU02:FS00:0xa7:         OS: Windows 10 Pro
16:45:50:WU02:FS00:0xa7:    OS Arch: AMD64
16:45:50:WU02:FS00:0xa7:********************************************************************************
16:45:50:WU02:FS00:0xa7:Project: 13823 (Run 830, Clone 1, Gen 28)
16:45:50:WU02:FS00:0xa7:Unit: 0x0000001d80fccb095c8ff4fb019cfbdf
16:45:50:WU02:FS00:0xa7:Digital signatures verified
16:45:50:WU02:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
16:45:50:WU02:FS00:0xa7:Calling: mdrun -s frame28.tpr -o frame28.trr -x frame28.xtc -cpi state.cpt -cpt 6 -nt 12
16:45:50:WU02:FS00:0xa7:Steps: first=3500000 total=125000
16:45:52:WU02:FS00:0xa7:Completed 5202 out of 125000 steps (4%)
16:45:57:WU01:FS01:0x21:Completed 0 out of 5000000 steps (0%)
16:45:57:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
16:46:17:Saving configuration to config.xml
16:46:17:<config>
16:46:17:  <!-- Folding Core -->
16:46:17:  <checkpoint v='6'/>
16:46:17:
16:46:17:  <!-- Folding Slot Configuration -->
16:46:17:  <cause v='ALZHEIMERS'/>
16:46:17:
16:46:17:  <!-- Logging -->
16:46:17:  <verbosity v='2'/>
16:46:17:
16:46:17:  <!-- Network -->
16:46:17:  <proxy v=':8080'/>
16:46:17:
16:46:17:  <!-- User Information -->
16:46:17:  <passkey v='********************************'/>
16:46:17:  <team v='233664'/>
16:46:17:  <user v='TulaneBaG'/>
16:46:17:
16:46:17:  <!-- Folding Slots -->
16:46:17:  <slot id='0' type='CPU'/>
16:46:17:  <slot id='1' type='GPU'/>
16:46:17:</config>
- TulaneBaG
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: FAH Client errors

Post by Joe_H »

foldy is not referring to a battery you would find in a laptop, but to the battery on your main board that keeps the clock and other settings going when your system is not plugged in. It is sometimes called the Time of Year clock battery.

Yes, it does sometimes cause issues with the client, but what you are seeing are just warnings. Basically, the client sees the clock time jump and reports that as the time slewing. It affects estimates for the time to complete a frame, the entire WU, and the points that will be credited.
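
One common way to notice a wall-clock jump is to compare how far the wall clock has moved against a monotonic clock over the same interval. The sketch below illustrates that idea in Python; it is not taken from FAHClient, whose actual detection logic is not shown in this thread, and the function names and the 10-second threshold are hypothetical.

```python
import time


def sample():
    """Take a paired reading of the wall clock and the monotonic clock."""
    return time.time(), time.monotonic()


def clock_skew(wall_start, mono_start, wall_now, mono_now):
    """Seconds by which the wall clock moved more (or less) than real elapsed time."""
    return (wall_now - wall_start) - (mono_now - mono_start)


def is_skewed(skew_seconds, threshold=10.0):
    """Hypothetical threshold: ignore small differences between the two clocks."""
    return abs(skew_seconds) > threshold


if __name__ == "__main__":
    wall0, mono0 = sample()
    time.sleep(1)
    skew = clock_skew(wall0, mono0, *sample())
    print(f"skew: {skew:.3f} s, flagged: {is_skewed(skew)}")
```

With synthetic readings where the wall clock advances 360 seconds while only 60 seconds really elapse, `clock_skew` returns 300 seconds, the same size as the "5 mins 00 secs" in the warning above.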

One additional suggestion, please return the logging verbosity to the default of 3. Higher values do not provide useful information except during beta tests of the client on rare occasions. Lower settings for the verbosity start to remove important messages for diagnosing issues.
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: FAH Client errors

Post by tulanebarandgrill »

No battery on the main board; this one has a supercap. Anyway, my clock gained 5 minutes over two days; this is a new computer, and it was NTP that was failing. It's not jumping but slowly falling out of sync. The NTP fix has solved that issue, but I do appreciate your explanation of how it impacts the FAH client.

How do I change the logging verbosity? I did not alter it. This was just recently installed.

Also I'm still wondering what this is:

00:44:39:WARNING:FS01:Size of positions 5845 does not match topology 5844
- TulaneBaG
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: FAH Client errors

Post by JimboPalmer »

"00:44:39:WARNING:FS01:Size of positions 5845 does not match topology 5844"
That is a harmless message that seems to be spurious.

If it were me, I would set CPUs to 12. The software dislikes large prime numbers and multiples of them. You have 14, which is a multiple of 7, but the GTX 970 uses 1, so you have 13 for CPU folding.
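
The arithmetic can be sketched in a few lines of Python. The rule used here, "step down until no prime factor is larger than 3", is an assumption based only on the log line "Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3"; the real core may apply a different rule, and both function names are hypothetical. The sketch takes the count after any GPU deduction as its input, since the thread does not settle exactly when that deduction is applied.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 if n is 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        largest = n
    return largest


def friendly_thread_count(count, max_prime=3):
    """Step the count down until it has no prime factor above max_prime."""
    while count > 1 and largest_prime_factor(count) > max_prime:
        count -= 1
    return count


if __name__ == "__main__":
    configured = 14
    after_gpu = configured - 1  # matches "AS lowered CPUs from 14 to 13"
    print(after_gpu, "->", friendly_thread_count(after_gpu))
```

Under this assumed rule, 13 steps down to 12, which agrees with the posted log, while a configured value of 12 is left unchanged.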

In Advanced Control, choose the Configuration tab, then the Slots tab. Edit the CPU item and select 12 CPUs.

Then choose the Advanced tab and set Verbosity to 3.
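
For readers who want to see what those two changes would look like in config.xml, here is a hedged Python sketch that applies them to a trimmed copy of the configuration posted earlier in this thread. The `verbosity` element appears in the posted config; the per-slot `cpus` element is an assumption modelled on the `<paused v='true'/>` slot option shown there. Making the change through FAHControl as described above is the safer route, and `apply_settings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the config posted in this thread.
CONFIG = """<config>
  <verbosity v='2'/>
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
</config>"""


def set_option(parent, name, value):
    """Set <name v='value'/> under parent, creating the element if missing."""
    node = parent.find(name)
    if node is None:
        node = ET.SubElement(parent, name)
    node.set("v", str(value))


def apply_settings(xml_text, cpus, verbosity):
    """Return the parsed config with verbosity and CPU-slot thread count set."""
    root = ET.fromstring(xml_text)
    set_option(root, "verbosity", verbosity)
    for slot in root.findall("slot"):
        if slot.get("type") == "CPU":
            set_option(slot, "cpus", cpus)  # assumed option name
    return root


if __name__ == "__main__":
    updated = apply_settings(CONFIG, cpus=12, verbosity=3)
    print(ET.tostring(updated, encoding="unicode"))
```

The GPU slot is left untouched, so only the CPU slot gains a thread-count option.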

Best of luck!
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: FAH Client errors

Post by tulanebarandgrill »

jimbopalmer - I don't follow your math. There are two slots, one with the 970 and one with the 14 CPU cores. Are you saying that somehow the GPU slot uses one of the CPU cores?

I checked advanced tab, it is set to 2 right now which seems to be lower than 3.
- TulaneBaG
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAH Client errors

Post by bruce »

Each GPU will require one CPU (thread) to move data to/from it.
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: FAH Client errors

Post by JimboPalmer »

tulanebarandgrill wrote:Are you saying that somehow the GPU slot uses one of the CPU cores ?
Yes, the GPU needs a CPU thread to get data, as it cannot itself read the internet or even the local hard drive. All data needs to be loaded into main memory, then transferred across the PCI-E bus to the memory in the graphics card. So F@H devotes a thread to this housekeeping. (F@H can't tell a thread from a CPU core)
Joe_H
Site Admin
Posts: 8002
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: FAH Client errors

Post by Joe_H »

tulanebarandgrill wrote:I checked advanced tab, it is set to 2 right now which seems to be lower than 3.
Yes, the default is 3 for logging verbosity. Somehow it was changed in your client, possibly inadvertently. A lower number there means fewer messages show up in your log.txt file or in the log window of FAHControl.

However, as I stated earlier, please do not raise it above the default value of 3. The messages that would show up for values of 4 or 5 are only useful to the developers for debugging at this point. Even they leave those messages out almost all of the time.
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: FAH Client errors

Post by tulanebarandgrill »

@Bruce: I see, so no matter what I will always have an odd number of cores after the GPU bit is taken. Or should I perhaps configure the client for 13 cores so that it will use 12? Setting it to 12 would just put me at 11 cores for folding.

In other news, there is nothing more satisfying than this:

01:14:43:WU00:FS00:Cleaning up
01:14:59:WU02:FS01:0x21:Completed 2250000 out of 5000000 steps (45%)
01:15:53:WU01:FS00:0xa7:Completed 25000 out of 2500000 steps (1%)
01:17:05:WU01:FS00:0xa7:Completed 50000 out of 2500000 steps (2%)
01:17:54:WU02:FS01:0x21:Completed 2300000 out of 5000000 steps (46%)
01:18:17:WU01:FS00:0xa7:Completed 75000 out of 2500000 steps (3%)
01:19:28:WU01:FS00:0xa7:Completed 100000 out of 2500000 steps (4%)
01:20:40:WU01:FS00:0xa7:Completed 125000 out of 2500000 steps (5%)
01:20:48:WU02:FS01:0x21:Completed 2350000 out of 5000000 steps (47%)
01:21:52:WU01:FS00:0xa7:Completed 150000 out of 2500000 steps (6%)
01:23:04:WU01:FS00:0xa7:Completed 175000 out of 2500000 steps (7%)
01:23:42:WU02:FS01:0x21:Completed 2400000 out of 5000000 steps (48%)
01:24:15:WU01:FS00:0xa7:Completed 200000 out of 2500000 steps (8%)
01:25:27:WU01:FS00:0xa7:Completed 225000 out of 2500000 steps (9%)
01:26:36:WU02:FS01:0x21:Completed 2450000 out of 5000000 steps (49%)
01:26:39:WU01:FS00:0xa7:Completed 250000 out of 2500000 steps (10%)
01:27:51:WU01:FS00:0xa7:Completed 275000 out of 2500000 steps (11%)
01:29:02:WU01:FS00:0xa7:Completed 300000 out of 2500000 steps (12%)
01:29:30:WU02:FS01:0x21:Completed 2500000 out of 5000000 steps (50%)
01:30:14:WU01:FS00:0xa7:Completed 325000 out of 2500000 steps (13%)
01:31:25:WU01:FS00:0xa7:Completed 350000 out of 2500000 steps (14%)
01:32:26:WU02:FS01:0x21:Completed 2550000 out of 5000000 steps (51%)
01:32:37:WU01:FS00:0xa7:Completed 375000 out of 2500000 steps (15%)
01:33:49:WU01:FS00:0xa7:Completed 400000 out of 2500000 steps (16%)
01:35:01:WU01:FS00:0xa7:Completed 425000 out of 2500000 steps (17%)
01:35:20:WU02:FS01:0x21:Completed 2600000 out of 5000000 steps (52%)
01:36:13:WU01:FS00:0xa7:Completed 450000 out of 2500000 steps (18%)
01:37:24:WU01:FS00:0xa7:Completed 475000 out of 2500000 steps (19%)
01:38:13:WU02:FS01:0x21:Completed 2650000 out of 5000000 steps (53%)
01:38:36:WU01:FS00:0xa7:Completed 500000 out of 2500000 steps (20%)
01:39:48:WU01:FS00:0xa7:Completed 525000 out of 2500000 steps (21%)
01:41:00:WU01:FS00:0xa7:Completed 550000 out of 2500000 steps (22%)
01:41:07:WU02:FS01:0x21:Completed 2700000 out of 5000000 steps (54%)
01:42:11:WU01:FS00:0xa7:Completed 575000 out of 2500000 steps (23%)
01:43:23:WU01:FS00:0xa7:Completed 600000 out of 2500000 steps (24%)
01:44:01:WU02:FS01:0x21:Completed 2750000 out of 5000000 steps (55%)
01:44:35:WU01:FS00:0xa7:Completed 625000 out of 2500000 steps (25%)
01:45:47:WU01:FS00:0xa7:Completed 650000 out of 2500000 steps (26%)
01:46:57:WU02:FS01:0x21:Completed 2800000 out of 5000000 steps (56%)
01:46:59:WU01:FS00:0xa7:Completed 675000 out of 2500000 steps (27%)
01:48:10:WU01:FS00:0xa7:Completed 700000 out of 2500000 steps (28%)
01:49:22:WU01:FS00:0xa7:Completed 725000 out of 2500000 steps (29%)
01:49:51:WU02:FS01:0x21:Completed 2850000 out of 5000000 steps (57%)
01:50:34:WU01:FS00:0xa7:Completed 750000 out of 2500000 steps (30%)
01:51:46:WU01:FS00:0xa7:Completed 775000 out of 2500000 steps (31%)
01:52:45:WU02:FS01:0x21:Completed 2900000 out of 5000000 steps (58%)
01:52:58:WU01:FS00:0xa7:Completed 800000 out of 2500000 steps (32%)
01:54:10:WU01:FS00:0xa7:Completed 825000 out of 2500000 steps (33%)
01:55:21:WU01:FS00:0xa7:Completed 850000 out of 2500000 steps (34%)
01:55:38:WU02:FS01:0x21:Completed 2950000 out of 5000000 steps (59%)
01:56:33:WU01:FS00:0xa7:Completed 875000 out of 2500000 steps (35%)
01:57:45:WU01:FS00:0xa7:Completed 900000 out of 2500000 steps (36%)
01:58:32:WU02:FS01:0x21:Completed 3000000 out of 5000000 steps (60%)
01:58:57:WU01:FS00:0xa7:Completed 925000 out of 2500000 steps (37%)
02:00:09:WU01:FS00:0xa7:Completed 950000 out of 2500000 steps (38%)
02:01:21:WU01:FS00:0xa7:Completed 975000 out of 2500000 steps (39%)
02:01:28:WU02:FS01:0x21:Completed 3050000 out of 5000000 steps (61%)
02:02:33:WU01:FS00:0xa7:Completed 1000000 out of 2500000 steps (40%)
02:03:45:WU01:FS00:0xa7:Completed 1025000 out of 2500000 steps (41%)
02:04:22:WU02:FS01:0x21:Completed 3100000 out of 5000000 steps (62%)
02:04:57:WU01:FS00:0xa7:Completed 1050000 out of 2500000 steps (42%)
02:06:08:WU01:FS00:0xa7:Completed 1075000 out of 2500000 steps (43%)
02:07:16:WU02:FS01:0x21:Completed 3150000 out of 5000000 steps (63%)
02:07:20:WU01:FS00:0xa7:Completed 1100000 out of 2500000 steps (44%)
02:08:32:WU01:FS00:0xa7:Completed 1125000 out of 2500000 steps (45%)
02:09:44:WU01:FS00:0xa7:Completed 1150000 out of 2500000 steps (46%)
02:10:10:WU02:FS01:0x21:Completed 3200000 out of 5000000 steps (64%)
02:10:56:WU01:FS00:0xa7:Completed 1175000 out of 2500000 steps (47%)
02:12:08:WU01:FS00:0xa7:Completed 1200000 out of 2500000 steps (48%)
02:13:04:WU02:FS01:0x21:Completed 3250000 out of 5000000 steps (65%)
02:13:20:WU01:FS00:0xa7:Completed 1225000 out of 2500000 steps (49%)
02:14:32:WU01:FS00:0xa7:Completed 1250000 out of 2500000 steps (50%)
02:15:44:WU01:FS00:0xa7:Completed 1275000 out of 2500000 steps (51%)
02:15:59:WU02:FS01:0x21:Completed 3300000 out of 5000000 steps (66%)
02:16:56:WU01:FS00:0xa7:Completed 1300000 out of 2500000 steps (52%)
02:18:08:WU01:FS00:0xa7:Completed 1325000 out of 2500000 steps (53%)
02:18:53:WU02:FS01:0x21:Completed 3350000 out of 5000000 steps (67%)
02:19:20:WU01:FS00:0xa7:Completed 1350000 out of 2500000 steps (54%)
02:20:32:WU01:FS00:0xa7:Completed 1375000 out of 2500000 steps (55%)
02:21:45:WU01:FS00:0xa7:Completed 1400000 out of 2500000 steps (56%)
02:21:47:WU02:FS01:0x21:Completed 3400000 out of 5000000 steps (68%)
02:22:56:WU01:FS00:0xa7:Completed 1425000 out of 2500000 steps (57%)
02:24:09:WU01:FS00:0xa7:Completed 1450000 out of 2500000 steps (58%)
02:24:40:WU02:FS01:0x21:Completed 3450000 out of 5000000 steps (69%)
02:25:22:WU01:FS00:0xa7:Completed 1475000 out of 2500000 steps (59%)
02:26:34:WU01:FS00:0xa7:Completed 1500000 out of 2500000 steps (60%)
02:27:32:WU02:FS01:0x21:Completed 3500000 out of 5000000 steps (70%)
02:27:46:WU01:FS00:0xa7:Completed 1525000 out of 2500000 steps (61%)
- TulaneBaG
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAH Client errors

Post by bruce »

If you set it to 12 it should remain there. If it gets reduced to 11, then we have a new bug.

It looks like maybe you start with 16 and maybe you have 2 GPUs (but you still have not posted the top couple pages of FAH's log, so I have to guess) which gives you 14 to work with. Personally, I would probably create a second CPU slot. Set one to 8 and the other to 6 if you have a total of 14 after deductions are taken for GPUs.
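
As a rough check on the 8 and 6 suggestion, the Python sketch below lists every way to split a total into two slot sizes whose only prime factors are 2 and 3. That "no prime factor above 3" criterion is an assumption taken from the "prime number > 3" wording in the posted log, not a documented rule, and both function names are hypothetical.

```python
def has_only_small_primes(n, max_prime=3):
    """True if n has no prime factor larger than max_prime."""
    for p in range(2, max_prime + 1):
        while n % p == 0:
            n //= p
    return n == 1


def friendly_splits(total, max_prime=3):
    """All (a, b) with a <= b, a + b == total, and both counts 'friendly'."""
    return [
        (a, total - a)
        for a in range(1, total // 2 + 1)
        if has_only_small_primes(a, max_prime)
        and has_only_small_primes(total - a, max_prime)
    ]


if __name__ == "__main__":
    print(friendly_splits(14))
```

For a total of 14 this yields (2, 12) and (6, 8), so the 8 and 6 split is the more balanced of the two options under that assumption.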

I have sent an inquiry to Development asking for an explanation of what the AS is doing, as it seems to have changed since I first understood it.