Re: 16926 - Some sort of loops with this CPU WU

Posted: Tue Dec 01, 2020 7:10 am
by VAcharonD1
Some more logs, this time from Linux with a 12-thread slot (the log shows the CPU count being lowered from 12 to 3). I've also seen it happen on a regular 3-thread slot.

Code:

*********************** Log Started 2020-12-01T07:03:43Z ***********************
07:03:43:FS00:Initialized folding slot 00: cpu:12
07:03:43:WU02:FS00:Starting
07:03:43:WARNING:WU02:FS00:AS lowered CPUs from 12 to 3
07:03:43:WU02:FS00:Removing old file 'work/02/logfile_01-20201201-043902.txt'
07:03:43:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 104733 -checkpoint 15 -np 3
07:03:43:WU02:FS00:Started FahCore on PID 104758
07:03:43:WU02:FS00:Core PID:104762
07:03:43:WU02:FS00:FahCore 0xa8 started
07:03:44:WU02:FS00:0xa8:*********************** Log Started 2020-12-01T07:03:43Z ***********************
07:03:44:WU02:FS00:0xa8:************************** Gromacs Folding@home Core ***************************
07:03:44:WU02:FS00:0xa8:       Core: Gromacs
07:03:44:WU02:FS00:0xa8:       Type: 0xa8
07:03:44:WU02:FS00:0xa8:    Version: 0.0.9
07:03:44:WU02:FS00:0xa8:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:03:44:WU02:FS00:0xa8:  Copyright: 2020 foldingathome.org
07:03:44:WU02:FS00:0xa8:   Homepage: https://foldingathome.org/
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:15:07
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:       SIMD: avx2_256
07:03:44:WU02:FS00:0xa8:     OpenMP: ON
07:03:44:WU02:FS00:0xa8:       CUDA: OFF
07:03:44:WU02:FS00:0xa8:       Args: -dir 02 -suffix 01 -version 706 -lifeline 104758 -checkpoint 15 -np
07:03:44:WU02:FS00:0xa8:             3
07:03:44:WU02:FS00:0xa8:************************************ libFAH ************************************
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:12:00
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:************************************ CBang *************************************
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:11:46
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:************************************ System ************************************
07:03:44:WU02:FS00:0xa8:        CPU: Intel(R) Xeon(R) E-2286M CPU @ 2.40GHz
07:03:44:WU02:FS00:0xa8:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 13
07:03:44:WU02:FS00:0xa8:       CPUs: 16
07:03:44:WU02:FS00:0xa8:     Memory: 15.48GiB
07:03:44:WU02:FS00:0xa8:Free Memory: 7.62GiB
07:03:44:WU02:FS00:0xa8:    Threads: POSIX_THREADS
07:03:44:WU02:FS00:0xa8: OS Version: 5.9
07:03:44:WU02:FS00:0xa8:Has Battery: true
07:03:44:WU02:FS00:0xa8: On Battery: false
07:03:44:WU02:FS00:0xa8: UTC Offset: -6
07:03:44:WU02:FS00:0xa8:        PID: 104762
07:03:44:WU02:FS00:0xa8:        CWD: /var/lib/fahclient/work
07:03:44:WU02:FS00:0xa8:********************************************************************************
07:03:44:WU02:FS00:0xa8:Project: 16926 (Run 84, Clone 453, Gen 1)
07:03:44:WU02:FS00:0xa8:Unit: 0x000000068120d1cc5fbd3725dfba0f2c
07:03:44:WU02:FS00:0xa8:Reading tar file core.xml
07:03:44:WU02:FS00:0xa8:Reading tar file frame1.tpr
07:03:44:WU02:FS00:0xa8:Digital signatures verified
07:03:44:WU02:FS00:0xa8:Calling: mdrun -c frame1.gro -s frame1.tpr -x frame1.xtc -cpt 15 -nt 3 -ntmpi 1
07:03:44:WU02:FS00:0xa8:Steps: first=0 total=0
07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
07:03:45:WU02:FS00:Starting
07:03:45:WARNING:WU02:FS00:AS lowered CPUs from 12 to 3
07:03:45:WU02:FS00:Removing old file 'work/02/logfile_01-20201201-044002.txt'
07:03:45:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 104733 -checkpoint 15 -np 3
07:03:45:WU02:FS00:Started FahCore on PID 104778
07:03:45:WU02:FS00:Core PID:104782
07:03:45:WU02:FS00:FahCore 0xa8 started
07:03:46:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Mod Edit: Changed Quote Tags To Code Tags - PantherX
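
The restart loop in the log above shows up as repeated "FahCore returned: INTERRUPTED (102 = 0x66)" lines for the same WU and slot, about a second apart. As a rough way to spot it without reading the log by hand, here is a minimal Python sketch that counts those lines. The function names and the threshold of 2 are my own assumptions for illustration, not anything provided by the FAH client.

```python
import re
from collections import Counter

# Matches client log lines such as
# "07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)"
RETURN_RE = re.compile(
    r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):FahCore returned: (\w+) "
    r"\((\d+) = 0x[0-9a-fA-F]+\)"
)

# Three lines copied from the log posted above.
SAMPLE_LOG = """\
07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
07:03:45:WU02:FS00:Starting
07:03:46:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
"""


def count_core_returns(log_text):
    """Count 'FahCore returned' lines per (work unit, slot, status)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RETURN_RE.match(line.strip())
        if match:
            wu, slot, status, _code = match.groups()
            counts[(wu, slot, status)] += 1
    return counts


def looks_like_loop(log_text, threshold=2):
    """Return True if any WU was INTERRUPTED at least `threshold` times.

    The threshold is an arbitrary (hypothetical) cut-off; tune it to taste.
    """
    return any(
        status == "INTERRUPTED" and n >= threshold
        for (_wu, _slot, status), n in count_core_returns(log_text).items()
    )
```

To use it, read the client log file into a string and pass it to looks_like_loop(). A single INTERRUPTED line can be normal (for example after pausing a slot), which is why the sketch only flags repeats.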

Re: 16926 - Some sort of loops with this CPU WU

Posted: Tue Dec 01, 2020 4:56 pm
by Lockheed_Tvr
Linux noob here. Can someone tell me how to dump a work unit in Linux? I'm on Ubuntu 18.04. My googling did not return anything helpful.

Do I delete the core or the work?

How do I get around not having permission to rm?

Re: 16926 - Some sort of loops with this CPU WU

Posted: Tue Dec 01, 2020 5:09 pm
by Yeroon
You should be able to do it through the GUI file manager. I don't recall any permissions issues, but there is always sudo if needed.
WUs on a default installation (at least on mine) are located at /var/lib/fahclient/work/

Find the offending WU's numbered folder, delete it or move it to another location, then restart the slot.
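
If you would rather move the WU folder aside than delete it, something like the following Python sketch could do it. It assumes the default work directory mentioned above; the function name and the backup location are hypothetical, and the script will most likely need to be run with sudo because the work directory usually belongs to the fahclient user. Pausing the slot before moving anything is my assumption of the safe order, not something confirmed in this thread.

```python
import shutil
from pathlib import Path


def set_aside_work_unit(work_dir, wu_id, backup_dir):
    """Move work_dir/wu_id into backup_dir and return the new path.

    work_dir   : e.g. "/var/lib/fahclient/work" (default install, per this thread)
    wu_id      : the numbered WU folder, e.g. "02"
    backup_dir : any directory you choose (hypothetical); created if missing
    """
    src = Path(work_dir) / wu_id
    if not src.is_dir():
        raise FileNotFoundError(f"no such work unit folder: {src}")

    dest_root = Path(backup_dir)
    dest_root.mkdir(parents=True, exist_ok=True)

    dest = dest_root / wu_id
    if dest.exists():
        # Refuse to overwrite an earlier backup of the same folder number.
        raise FileExistsError(f"backup already exists: {dest}")

    shutil.move(str(src), str(dest))
    return dest
```

For example, set_aside_work_unit("/var/lib/fahclient/work", "02", "/tmp/fah-dumped") would move folder 02 out of the way. Restart the slot afterwards, as described above.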

Re: 16926 - Some sort of loops with this CPU WU

Posted: Tue Dec 01, 2020 6:59 pm
by Maddog
Do not use sudo in a GUI; it can mess up permissions.
In the file manager, navigate to /var/lib/fahclient/work/ . Right-click on the work folder and it should give you a dropdown menu.
Choose "Open as administrator". There should then be no problem deleting the WU folder.

Re: 16926 - Some sort of loops with this CPU WU

Posted: Tue Dec 01, 2020 9:49 pm
by Lockheed_Tvr
Sort of odd. Half the boxes allowed me to just delete the work folder from the GUI; on the others it had to be removed from the terminal. There was no "open as administrator" option in the right-click menu. Running "sudo rm -r 02" and then entering my password worked from the command line. Linux continues to confound me.

Re: 16926 - Some sort of loops with this CPU WU

Posted: Wed Dec 02, 2020 6:25 am
by gunnarre
16926 (78,49,1) dumped on Linux Mint 20, 8-thread CPU.

Re: 16926 - Some sort of loops with this CPU WU

Posted: Wed Dec 02, 2020 9:52 pm
by kjk
Yeah, seems I've also been stuck with 16926 (90, 634, 6) for a while. Just noticed today that the CPU was idling. My CPU slot is set to 8. I'm on 7.6.21 and Fedora 32 (linux2 5.8.0-1-amd64).
So I'm dumping it.
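
For anyone collecting the affected units from this thread: the reports give them either in the log form "Project: 16926 (Run 84, Clone 453, Gen 1)" or in the short form "16926 (78,49,1)". A small Python sketch that pulls the project/run/clone/gen numbers out of either form might look like this. The PRCG class and parse_prcg function are names I made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class PRCG(NamedTuple):
    project: int
    run: int
    clone: int
    gen: int


# Log form: "Project: 16926 (Run 84, Clone 453, Gen 1)"
LOG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
# Short form used in posts: "16926 (78,49,1)" or "16926 (90, 634, 6)"
SHORT_RE = re.compile(r"\b(\d+) \((\d+),\s*(\d+),\s*(\d+)\)")


def parse_prcg(text: str) -> Optional[PRCG]:
    """Return the first project/run/clone/gen found in text, or None."""
    match = LOG_RE.search(text) or SHORT_RE.search(text)
    if match is None:
        return None
    return PRCG(*(int(group) for group in match.groups()))
```

Feeding it each report line gives tuples that can be sorted or de-duplicated before passing the list on to the project owner.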