beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Mon Feb 19, 2018 7:02 pm
Hi
When I receive an A4 core work unit, it fails and then retries with the same error (see below). A7 core work units fold without any problems. Can someone advise me on what to do?
Code:
*********************** Log Started 2018-02-18T15:07:00Z ***********************
15:07:00:************************* Folding@home Client *************************
15:07:00: Website: http://folding.stanford.edu/
15:07:00: Copyright: (c) 2009-2014 Stanford University
15:07:00: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:07:00: Args: --child --lifeline 2829 /etc/fahclient/config.xml --run-as
15:07:00: fahclient --pid-file=/var/run/fahclient.pid --daemon
15:07:00: Config: /etc/fahclient/config.xml
15:07:00:******************************** Build ********************************
15:07:00: Version: 7.4.4
15:07:00: Date: Mar 4 2014
15:07:00: Time: 12:02:38
15:07:00: SVN Rev: 4130
15:07:00: Branch: fah/trunk/client
15:07:00: Compiler: GNU 4.4.7
15:07:00: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
15:07:00: -fno-unsafe-math-optimizations -msse2
15:07:00: Platform: linux2 3.2.0-1-amd64
15:07:00: Bits: 64
15:07:00: Mode: Release
15:07:00:******************************* System ********************************
15:07:00: CPU: Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
15:07:00: CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
15:07:00: CPUs: 8
15:07:00: Memory: 7.73GiB
15:07:00:Free Memory: 4.70GiB
15:07:00: Threads: POSIX_THREADS
15:07:00: OS Version: 4.14
15:07:00:Has Battery: false
15:07:00: On Battery: false
15:07:00: UTC Offset: 1
15:07:00: PID: 2831
15:07:00: CWD: /var/lib/fahclient
15:07:00: OS: Linux 4.14.0-3-amd64 x86_64
15:07:00: OS Arch: AMD64
15:07:00: GPUs: 2
15:07:00: GPU 0: NVIDIA:7 GP104 [GeForce GTX 1070] 6463
15:07:00: GPU 1: UNSUPPORTED: NV3 [PCI]
15:07:00: CUDA: 6.1
15:07:00:CUDA Driver: 9000
15:07:00:***********************************************************************
Code:
18:57:20:Adding folding slot 00: READY cpu:6
18:57:20:Removing old file 'configs/config-20180219-060551.xml'
18:57:20:Saving configuration to /etc/fahclient/config.xml
18:57:20:<config>
18:57:20: <!-- Client Control -->
18:57:20: <fold-anon v='true'/>
18:57:20:
18:57:20: <!-- Folding Slot Configuration -->
18:57:20: <gpu v='false'/>
18:57:20:
18:57:20: <!-- Network -->
18:57:20: <proxy v=':8080'/>
18:57:20:
18:57:20: <!-- User Information -->
18:57:20: <passkey v='********************************'/>
18:57:20: <user v='jonasvejlin'/>
18:57:20:
18:57:20: <!-- Folding Slots -->
18:57:20: <slot id='1' type='GPU'/>
18:57:20: <slot id='0' type='CPU'/>
18:57:20:</config>
18:57:21:WU01:FS00:Connecting to 171.67.108.45:8080
18:57:21:WU01:FS00:Assigned to work server 134.139.52.3
18:57:21:WU01:FS00:Requesting new work unit for slot 00: READY cpu:6 from 134.139.52.3
18:57:21:WU01:FS00:Connecting to 134.139.52.3:8080
18:57:22:ERROR:WU01:FS00:Exception: Server did not assign work unit
18:57:22:WU01:FS00:Connecting to 171.67.108.45:8080
18:57:23:WU01:FS00:Assigned to work server 171.67.108.158
18:57:23:WU01:FS00:Requesting new work unit for slot 00: READY cpu:6 from 171.67.108.158
18:57:23:WU01:FS00:Connecting to 171.67.108.158:8080
18:57:24:WU01:FS00:Downloading 806.52KiB
18:57:25:WU01:FS00:Download complete
18:57:26:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9031 run:372 clone:1 gen:1076 core:0xa4 unit:0x00000490ab436c9e5698349f4ad0e894
18:57:53:WU01:FS00:Downloading core from http://fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah
18:57:53:WU01:FS00:Connecting to fahwebx.stanford.edu:80
18:57:54:WU01:FS00:FahCore a4: Downloading 2.56MiB
18:57:57:WU01:FS00:FahCore a4: Download complete
18:57:57:WU01:FS00:Valid core signature
18:57:57:WU01:FS00:Unpacked 5.98MiB to cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4
18:57:57:WU01:FS00:Starting
18:57:57:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 704 -lifeline 627 -checkpoint 15 -np 6
18:57:57:WU01:FS00:Started FahCore on PID 26255
18:57:57:WU01:FS00:Core PID:26259
18:57:57:WU01:FS00:FahCore 0xa4 started
18:57:57:WU01:FS00:0xa4:
18:57:57:WU01:FS00:0xa4:*------------------------------*
18:57:57:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
18:57:57:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
18:57:57:WU01:FS00:0xa4:
18:57:57:WU01:FS00:0xa4:Preparing to commence simulation
18:57:57:WU01:FS00:0xa4:- Looking at optimizations...
18:57:57:WU01:FS00:0xa4:- Created dyn
18:57:57:WU01:FS00:0xa4:- Files status OK
18:57:57:WU01:FS00:0xa4:- Expanded 825367 -> 1397768 (decompressed 169.3 percent)
18:57:57:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825367 data_size=1397768, decompressed_data_size=1397768 diff=0
18:57:57:WU01:FS00:0xa4:- Digital signature verified
18:57:57:WU01:FS00:0xa4:
18:57:57:WU01:FS00:0xa4:Project: 9031 (Run 372, Clone 1, Gen 1076)
18:57:57:WU01:FS00:0xa4:
18:57:57:WU01:FS00:0xa4:Assembly optimizations on if available.
18:57:57:WU01:FS00:0xa4:Entering M.D.
18:58:03:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
18:58:04:WU01:FS00:Starting
18:58:04:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 704 -lifeline 627 -checkpoint 15 -np 6
18:58:04:WU01:FS00:Started FahCore on PID 26263
18:58:04:WU01:FS00:Core PID:26267
18:58:04:WU01:FS00:FahCore 0xa4 started
18:58:04:WU01:FS00:0xa4:
18:58:04:WU01:FS00:0xa4:*------------------------------*
18:58:04:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
18:58:04:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
18:58:04:WU01:FS00:0xa4:
18:58:04:WU01:FS00:0xa4:Preparing to commence simulation
18:58:04:WU01:FS00:0xa4:- Ensuring status. Please wait.
18:58:13:WU01:FS00:0xa4:- Looking at optimizations...
18:58:13:WU01:FS00:0xa4:- Working with standard loops on this execution.
18:58:13:WU01:FS00:0xa4:- Previous termination of core was improper.
18:58:13:WU01:FS00:0xa4:- Files status OK
18:58:13:WU01:FS00:0xa4:- Expanded 825367 -> 1397768 (decompressed 169.3 percent)
18:58:13:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825367 data_size=1397768, decompressed_data_size=1397768 diff=0
18:58:13:WU01:FS00:0xa4:- Digital signature verified
18:58:13:WU01:FS00:0xa4:
18:58:13:WU01:FS00:0xa4:Project: 9031 (Run 372, Clone 1, Gen 1076)
18:58:13:WU01:FS00:0xa4:
18:58:13:WU01:FS00:0xa4:Entering M.D.
18:58:20:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
bruce
Posts: 20824 Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
by bruce » Mon Feb 19, 2018 7:16 pm
How did you start FAHClient? The script is supposed to start automatically as a service when you reboot, but if that isn't working, try
> sudo /etc/init.d/FAHClient start
INTERRUPTED isn't really an error; it generally means that you closed the terminal that you used to start it.
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Mon Feb 19, 2018 7:29 pm
It starts automatically after I add a CPU slot with 6 CPU cores from FAHControl. So how can something interrupt it when it runs as a service? If it helps, I also have a GPU slot, configured with FAHControl as well, that folds without problems. And it happens only with A4, not with A7.
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Mon Feb 19, 2018 9:06 pm
I tried installing Folding@home 7.4.16, where it starts as a service, but I still have the same problem. It starts correctly, but then it gets interrupted and tries again with the same error, over and over again. I have also tried different numbers of cores, with no change.
bruce
Posts: 20824 Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
by bruce » Mon Feb 19, 2018 9:18 pm
Is Project: 9031 (Run 372, Clone 1, Gen 1076) the only A4 project that has failed? What other projects have been assigned?
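One way to answer this from the log itself is to tally the "FahCore returned" lines per core and project. The following is only a rough sketch: the two regular expressions are written against the line formats visible in the logs posted in this thread, the function name is made up for illustration, and other client versions may format these lines differently.

```python
import re
from collections import Counter

# Patterns modelled on lines such as:
#   18:57:57:WU01:FS00:0xa4:Project: 9031 (Run 372, Clone 1, Gen 1076)
#   18:58:03:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
PROJECT_RE = re.compile(
    r":(WU\d+):FS\d+:(0x[0-9a-f]+):Project: (\d+) "
    r"\(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
RETURN_RE = re.compile(
    r":(WU\d+):FS\d+:FahCore returned: (\w+) \((\d+) = 0x[0-9a-f]+\)"
)


def tally_core_results(lines):
    """Count FahCore return codes, keyed by (core, project, result)."""
    current = {}        # work unit id -> (core, project) last announced
    counts = Counter()
    for line in lines:
        match = PROJECT_RE.search(line)
        if match:
            current[match.group(1)] = (match.group(2), int(match.group(3)))
            continue
        match = RETURN_RE.search(line)
        if match:
            core, project = current.get(match.group(1), ("unknown", None))
            counts[(core, project, match.group(2))] += 1
    return counts
```

To use it, pass the lines of the client log, for example `tally_core_results(open("log.txt"))` run from the client's working directory (the file name here is an assumption). Each key in the result shows which core and project a given return code belongs to, which makes it easier to see whether only A4 projects come back as INTERRUPTED.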
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Tue Feb 20, 2018 5:30 am
As far as I can see, every project that uses A4 fails like that.
Here is a project that uses A4:
Code:
*********************** Log Started 2018-02-20T05:24:10Z ***********************
05:24:10:************************* Folding@home Client *************************
05:24:10: Website: http://folding.stanford.edu/
05:24:10: Copyright: (c) 2009-2016 Stanford University
05:24:10: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:24:10: Args: --child --lifeline 680 /etc/fahclient/config.xml --run-as
05:24:10: fahclient --pid-file=/var/run/fahclient.pid --daemon
05:24:10: Config: /etc/fahclient/config.xml
05:24:10:******************************** Build ********************************
05:24:10: Version: 7.4.16
05:24:10: Date: Jan 6 2017
05:24:10: Time: 08:08:33
05:24:10: Repository: Git
05:24:10: Revision: e12187cbb0bd6937c067b9749af011374563b7b9
05:24:10: Branch: master
05:24:10: Compiler: GNU 4.9.2
05:24:10: Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
05:24:10: -fno-unsafe-math-optimizations -msse2
05:24:10: Platform: linux2 4.8.0-2-amd64
05:24:10: Bits: 64
05:24:10: Mode: Release
05:24:10:******************************* System ********************************
05:24:10: CPU: Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
05:24:10: CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
05:24:10: CPUs: 8
05:24:10: Memory: 7.73GiB
05:24:10: Free Memory: 7.31GiB
05:24:10: Threads: POSIX_THREADS
05:24:10: OS Version: 4.14
05:24:10: Has Battery: false
05:24:10: On Battery: false
05:24:10: UTC Offset: 1
05:24:10: PID: 683
05:24:10: CWD: /var/lib/fahclient
05:24:10: OS: Linux 4.14.0-3-amd64 x86_64
05:24:10: OS Arch: AMD64
05:24:10: GPUs: 1
05:24:10: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
05:24:10: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:9.0
05:24:10:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:384.111
05:24:10:***********************************************************************
05:24:10:<config>
05:24:10: <!-- Client Control -->
05:24:10: <fold-anon v='true'/>
05:24:10:
05:24:10: <!-- Network -->
05:24:10: <proxy v=':8080'/>
05:24:10:
05:24:10: <!-- User Information -->
05:24:10: <passkey v='********************************'/>
05:24:10: <user v='jonasvejlin'/>
05:24:10:
05:24:10: <!-- Folding Slots -->
05:24:10: <slot id='0' type='CPU'>
05:24:10: <cpus v='2'/>
05:24:10: </slot>
05:24:10:</config>
05:24:10:Switching to user fahclient
05:24:10:Trying to access database...
05:24:10:Successfully acquired database lock
05:24:10:Enabled folding slot 00: READY cpu:2
05:24:10:WU00:FS00:Starting
05:24:10:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:24:10:WU00:FS00:Started FahCore on PID 693
05:24:10:WU00:FS00:Core PID:697
05:24:10:WU00:FS00:FahCore 0xa4 started
05:24:11:WU00:FS00:0xa4:
05:24:11:WU00:FS00:0xa4:*------------------------------*
05:24:11:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:24:11:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:24:11:WU00:FS00:0xa4:
05:24:11:WU00:FS00:0xa4:Preparing to commence simulation
05:24:11:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:24:20:WU00:FS00:0xa4:- Looking at optimizations...
05:24:20:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:24:20:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:24:20:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:24:20:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:24:20:WU00:FS00:0xa4:- Digital signature verified
05:24:20:WU00:FS00:0xa4:
05:24:20:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:24:20:WU00:FS00:0xa4:
05:24:20:WU00:FS00:0xa4:Entering M.D.
05:24:26:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:24:26:WU00:FS00:Starting
05:24:26:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:24:26:WU00:FS00:Started FahCore on PID 1020
05:24:26:WU00:FS00:Core PID:1024
05:24:26:WU00:FS00:FahCore 0xa4 started
05:24:27:WU00:FS00:0xa4:
05:24:27:WU00:FS00:0xa4:*------------------------------*
05:24:27:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:24:27:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:24:27:WU00:FS00:0xa4:
05:24:27:WU00:FS00:0xa4:Preparing to commence simulation
05:24:27:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:24:36:WU00:FS00:0xa4:- Looking at optimizations...
05:24:36:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:24:36:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:24:36:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:24:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:24:36:WU00:FS00:0xa4:- Digital signature verified
05:24:36:WU00:FS00:0xa4:
05:24:36:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:24:36:WU00:FS00:0xa4:
05:24:36:WU00:FS00:0xa4:Entering M.D.
05:24:42:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:25:26:WU00:FS00:Starting
05:25:26:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:25:26:WU00:FS00:Started FahCore on PID 1671
05:25:26:WU00:FS00:Core PID:1675
05:25:26:WU00:FS00:FahCore 0xa4 started
05:25:27:WU00:FS00:0xa4:
05:25:27:WU00:FS00:0xa4:*------------------------------*
05:25:27:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:25:27:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:25:27:WU00:FS00:0xa4:
05:25:27:WU00:FS00:0xa4:Preparing to commence simulation
05:25:27:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:25:36:WU00:FS00:0xa4:- Looking at optimizations...
05:25:36:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:25:36:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:25:36:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:25:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:25:36:WU00:FS00:0xa4:- Digital signature verified
05:25:36:WU00:FS00:0xa4:
05:25:36:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:25:36:WU00:FS00:0xa4:
05:25:36:WU00:FS00:0xa4:Entering M.D.
05:25:42:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:26:26:WU00:FS00:Starting
05:26:26:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:26:26:WU00:FS00:Started FahCore on PID 1859
05:26:26:WU00:FS00:Core PID:1863
05:26:26:WU00:FS00:FahCore 0xa4 started
05:26:27:WU00:FS00:0xa4:
05:26:27:WU00:FS00:0xa4:*------------------------------*
05:26:27:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:26:27:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:26:27:WU00:FS00:0xa4:
05:26:27:WU00:FS00:0xa4:Preparing to commence simulation
05:26:27:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:26:36:WU00:FS00:0xa4:- Looking at optimizations...
05:26:36:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:26:36:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:26:36:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:26:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:26:36:WU00:FS00:0xa4:- Digital signature verified
05:26:36:WU00:FS00:0xa4:
05:26:36:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:26:36:WU00:FS00:0xa4:
05:26:36:WU00:FS00:0xa4:Entering M.D.
05:26:42:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:27:26:WU00:FS00:Starting
05:27:26:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:27:26:WU00:FS00:Started FahCore on PID 1915
05:27:26:WU00:FS00:Core PID:1919
05:27:26:WU00:FS00:FahCore 0xa4 started
05:27:27:WU00:FS00:0xa4:
05:27:27:WU00:FS00:0xa4:*------------------------------*
05:27:27:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:27:27:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:27:27:WU00:FS00:0xa4:
05:27:27:WU00:FS00:0xa4:Preparing to commence simulation
05:27:27:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:27:36:WU00:FS00:0xa4:- Looking at optimizations...
05:27:36:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:27:36:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:27:36:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:27:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:27:36:WU00:FS00:0xa4:- Digital signature verified
05:27:36:WU00:FS00:0xa4:
05:27:36:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:27:36:WU00:FS00:0xa4:
05:27:36:WU00:FS00:0xa4:Entering M.D.
05:27:42:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:28:26:WU00:FS00:Starting
05:28:26:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 683 -checkpoint 15 -np 2
05:28:26:WU00:FS00:Started FahCore on PID 2063
05:28:26:WU00:FS00:Core PID:2067
05:28:26:WU00:FS00:FahCore 0xa4 started
05:28:27:WU00:FS00:0xa4:
05:28:27:WU00:FS00:0xa4:*------------------------------*
05:28:27:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:28:27:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:28:27:WU00:FS00:0xa4:
05:28:27:WU00:FS00:0xa4:Preparing to commence simulation
05:28:27:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:28:36:WU00:FS00:0xa4:- Looking at optimizations...
05:28:36:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:28:36:WU00:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
05:28:36:WU00:FS00:0xa4:- Expanded 887891 -> 2072336 (decompressed 233.3 percent)
05:28:36:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=887891 data_size=2072336, decompressed_data_size=2072336 diff=0
05:28:36:WU00:FS00:0xa4:- Digital signature verified
05:28:36:WU00:FS00:0xa4:
05:28:36:WU00:FS00:0xa4:Project: 8633 (Run 1, Clone 34, Gen 95)
05:28:36:WU00:FS00:0xa4:
05:28:36:WU00:FS00:0xa4:Entering M.D.
05:28:42:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Tue Feb 20, 2018 5:53 am
As far as I can tell from
https://github.com/FoldingAtHome/fah-issues/issues/1157
there might be a bug in libfah/OpenMM that causes the problem. Do you think that is the case this time? That could explain why I only see the problem with A4 and not with A7.
Joe_H
Site Admin
Posts: 7927 Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4 MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA
by Joe_H » Tue Feb 20, 2018 6:31 am
No, the A4 core does not use OpenMM, it uses Gromacs. So does the A7 core, just a later revision of Gromacs.
The possibilities I can think of are having a corrupted A4 executable or missing a support library needed to support it running.
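A quick way to test the missing-library theory is to run `ldd` on the core binary and look for entries marked "not found". The sketch below assumes `ldd` is installed and that the core is a dynamically linked executable; both function names are made up for illustration. If the core is statically linked, `ldd` reports that instead and the list simply comes back empty.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_core(path):
    """Run ldd on the given binary and list unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)
```

For example, `check_core("/var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4")` uses the path shown in the log above. An empty list does not prove the executable is intact; it only rules out unresolved shared libraries.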
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Tue Feb 20, 2018 6:44 am
I had also assumed that the core was the problem, so I tried the following:
1: Remove the CPU slot while letting the GPU slot continue
2: Delete /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/AVX/Core_a4.fah/FahCore_a4
3: Add the CPU slot again.
The result is the same as before. When I receive an A4 work unit (this time with a freshly downloaded core), it still fails, and when I receive an A7 work unit, it folds as it should. What support library could be missing? Did I refresh the core the right way?
bollix47
Posts: 2957 Joined: Sun Dec 02, 2007 5:04 am
Location: Canada
by bollix47 » Tue Feb 20, 2018 11:16 am
beer wrote: /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/AVX/Core_a4.fah/FahCore_a4
There is no AVX in the path to the a4 core. If you really tried to delete the core at that path, nothing would have changed.
The correct path looks like:
/var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4
Your log above does show the correct path.
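To avoid guessing at the path, the cores directory can be searched for the binary before anything is deleted. This is a minimal sketch using only the Python standard library; the default root is taken from the log above, and the function name is made up for illustration.

```python
from pathlib import Path

# Root taken from the log in this thread; adjust it if your client
# keeps its data elsewhere.
CORES_ROOT = Path("/var/lib/fahclient/cores")


def find_core_binaries(root=CORES_ROOT, name="FahCore_a4"):
    """Return every regular file called `name` below `root`, sorted."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())
```

Calling `find_core_binaries()` lists the copies that actually exist on disk, so a path with an extra component such as AVX shows up as absent rather than being deleted "successfully" with no effect.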
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Tue Feb 20, 2018 11:48 am
Oops, I made a mistake when writing the path in the post above. It should have been /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/Core_a4.fah/FahCore_a4 and I could see in my logs that the core is downloaded again when I re-add a CPU slot.
beer
Posts: 179 Joined: Tue Dec 13, 2011 11:18 am
by beer » Tue Feb 20, 2018 4:46 pm
Hi
I have just installed Debian sid (before, it was Debian stable) on my machine. I installed FAH in exactly the same way. As far as I can see, I should be running 4.14.17, which (as I can see from the bug report) has already received a fix for the problem.
I did notice something in the log I have not seen before:
Code:
16:35:49:Started thread 15 on PID 662
16:36:07:WARNING:FS01:Size of positions 392 does not match topology 389
Is this related to me not being able to run A4-cores?
bruce
Posts: 20824 Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
by bruce » Tue Feb 20, 2018 6:49 pm
We may learn something else by looking deeper into another log file.
Immediately after the error, pause the CPU slot. Note the number of the WU that's running.
(The log shown above says 05:24:36:WU00:FS00:0xa4. I'm not interested in FS00 [the slot number] but in WU00, and that number changes regularly.) In the /work folder, which is in the same directory as the log file, you'll find a subdirectory called 00, matching whatever WU00 says. Post the information from another log file you'll find there.
Note: When FAHClient determines a WU has failed or is finished, that file and the entire directory will be deleted, so if you can't find it, you didn't pause the WU at the precise moment you needed to. Try again later.
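The mapping from a log line to its work directory can be sketched as follows. The client directory default comes from the CWD line in the logs above, and the function name is made up for illustration. The name of the log file inside the work directory is not given in this thread, so the sketch stops at locating the directory.

```python
import re
from pathlib import Path

# Matches the work unit number in prefixes such as "05:24:36:WU00:FS00:".
WU_RE = re.compile(r":WU(\d+):FS\d+:")


def work_dir_for(log_line, client_dir="/var/lib/fahclient"):
    """Return the work subdirectory named after the WU number in a log line."""
    match = WU_RE.search(log_line)
    if match is None:
        return None
    return Path(client_dir) / "work" / match.group(1)
```

Once the slot is paused, something like `sorted(p.name for p in work_dir_for(line).iterdir())` lists what is in that directory. Reading it will probably require root or the fahclient user, since the client runs under its own account.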
JimboPalmer
Posts: 2522 Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA
by JimboPalmer » Tue Feb 20, 2018 8:19 pm
We do not see FS01 in your logs, but I am assuming it is your GPU slot and that "Size of positions 392 does not match topology 389" is a GPU warning unrelated to this issue. My GTX 1050 Ti issues this warning at every percentage point while it is running.