10 million steps? What should I do?

scm2000
Posts: 26
Joined: Sun Mar 15, 2020 12:13 am

10 million steps? What should I do?

Post by scm2000 »

I've got a work unit running in the 13821 project that has 10 million steps.
I'm 4 percent of the way through it, and that seems to have taken all day so far, so it will not finish in time.

Should I kill the work unit? Will that affect my standing? And how, on a Mac, do you go about killing a work unit?
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 10 million steps? What should I do?

Post by Neil-B »

It will be a malformed WU. Can you post a log showing the start of the WU? That will let a report be put in to the team. Once that's up, someone will give details of the best way to dump/delete the WU … viewtopic.php?f=24&t=26036
scm2000
Posts: 26
Joined: Sun Mar 15, 2020 12:13 am

Re: 10 million steps? What should I do?

Post by scm2000 »

I rebooted, so I don't have the log from the beginning, but here is the relevant section:

20:59:33:FS00:Unpaused
20:59:33:WU00:FS00:Starting
20:59:33:WU00:FS00:Running FahCore: /usr/local/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/cores.foldingathome.org/v7/osx/64bit/avx/Core_a7.fah/FahCore_a7" -dir 00 -suffix 01 -version 705 -lifeline 103 -checkpoint 15 -np 7
20:59:33:WU00:FS00:Started FahCore on PID 1180
20:59:33:WU00:FS00:Core PID:1181
20:59:33:WU00:FS00:FahCore 0xa7 started
20:59:34:WU00:FS00:0xa7:*********************** Log Started 2020-04-02T20:59:33Z ***********************
20:59:34:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
20:59:34:WU00:FS00:0xa7:       Type: 0xa7
20:59:34:WU00:FS00:0xa7:       Core: Gromacs
20:59:34:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 1180 -checkpoint 15 -np 7
20:59:34:WU00:FS00:0xa7:************************************ CBang *************************************
20:59:34:WU00:FS00:0xa7:       Date: Oct 26 2019
20:59:34:WU00:FS00:0xa7:       Time: 03:00:53
20:59:34:WU00:FS00:0xa7:   Revision: 3b1c887e9f30a608262e0d62833b273e843f7c1b
20:59:34:WU00:FS00:0xa7:     Branch: master
20:59:34:WU00:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)
20:59:34:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -mmacosx-version-min=10.7
20:59:34:WU00:FS00:0xa7:             -Wno-unused-local-typedefs -stdlib=libc++ -fPIC
20:59:34:WU00:FS00:0xa7:   Platform: darwin 19.0.0
20:59:34:WU00:FS00:0xa7:       Bits: 64
20:59:34:WU00:FS00:0xa7:       Mode: Release
20:59:34:WU00:FS00:0xa7:************************************ System ************************************
20:59:34:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
20:59:34:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
20:59:34:WU00:FS00:0xa7:       CPUs: 8
20:59:34:WU00:FS00:0xa7:     Memory: 16.00GiB
20:59:34:WU00:FS00:0xa7:Free Memory: 11.19GiB
20:59:34:WU00:FS00:0xa7:    Threads: POSIX_THREADS
20:59:34:WU00:FS00:0xa7: OS Version: 10.15
20:59:34:WU00:FS00:0xa7:Has Battery: true
20:59:34:WU00:FS00:0xa7: On Battery: false
20:59:34:WU00:FS00:0xa7: UTC Offset: -4
20:59:34:WU00:FS00:0xa7:        PID: 1181
20:59:34:WU00:FS00:0xa7:        CWD: /Library/Application Support/FAHClient/work
20:59:34:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
20:59:34:WU00:FS00:0xa7:    Version: 0.0.18
20:59:34:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:59:34:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
20:59:34:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
20:59:34:WU00:FS00:0xa7:       Date: Oct 26 2019
20:59:34:WU00:FS00:0xa7:       Time: 03:06:33
20:59:34:WU00:FS00:0xa7:   Revision: fcc08f30b8997509aaba3a213354c363f474e056
20:59:34:WU00:FS00:0xa7:     Branch: master
20:59:34:WU00:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.8)
20:59:34:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -mmacosx-version-min=10.7
20:59:34:WU00:FS00:0xa7:             -Wno-unused-local-typedefs -stdlib=libc++
20:59:34:WU00:FS00:0xa7:   Platform: darwin 19.0.0
20:59:34:WU00:FS00:0xa7:       Bits: 64
20:59:34:WU00:FS00:0xa7:       Mode: Release
20:59:34:WU00:FS00:0xa7:************************************ Build *************************************
20:59:34:WU00:FS00:0xa7:       SIMD: avx_256
20:59:34:WU00:FS00:0xa7:********************************************************************************
20:59:34:WU00:FS00:0xa7:Project: 13821 (Run 704, Clone 0, Gen 83)
20:59:34:WU00:FS00:0xa7:Unit: 0x0000006b80fccb095c8838705874501f
20:59:34:WU00:FS00:0xa7:Digital signatures verified
20:59:34:WU00:FS00:0xa7:Reducing thread count from 7 to 6 to avoid domain decomposition by a prime number > 3
20:59:34:WU00:FS00:0xa7:Calling: mdrun -s frame83.tpr -o frame83.trr -x frame83.xtc -cpi state.cpt -cpt 15 -nt 6
20:59:34:WU00:FS00:0xa7:Steps: first=10375000 total=10375000
20:59:35:WU00:FS00:0xa7:Completed 454972 out of 10375000 steps (4%)
20:59:43:Removing old file 'configs/config-20200322-025556.xml'
Mod Edit: Added Code Tags - PantherX
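
For anyone wanting to check a similar WU, here is a rough sketch of how the progress line in the log above could be turned into a time estimate. The function names are made up for illustration, the 24 hours is an assumption standing in for "all day", and the extrapolation assumes the step rate stays constant, which a real WU may not do.

```python
import re

# Matches FahCore progress lines such as
# "...:Completed 454972 out of 10375000 steps (4%)"
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")


def parse_progress(line):
    """Return (completed, total) from a progress line, or None if absent."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def estimate_total_hours(completed, total, elapsed_hours):
    """Linear extrapolation: assumes the step rate stays constant."""
    if completed <= 0:
        raise ValueError("no completed steps to extrapolate from")
    return elapsed_hours * total / completed


line = "20:59:35:WU00:FS00:0xa7:Completed 454972 out of 10375000 steps (4%)"
completed, total = parse_progress(line)

# 24.0 hours elapsed is an assumption, not a value taken from the log.
total_days = estimate_total_hours(completed, total, 24.0) / 24.0
print(f"{completed}/{total} steps, roughly {total_days:.1f} days in total")
```

At that rate the whole WU would need on the order of three weeks, which is why it cannot finish in time.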
Joe_H
Site Admin
Posts: 8130
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: 10 million steps? What should I do?

Post by Joe_H »

Yes, dump the WU. I have reported it and it should get removed from distribution.

Since you are on OS X, I can just copy and paste directions I gave earlier today to another person who got one of these oversized WUs:

To dump the WU, first pause folding. Since you are on OS X, go to the folder /Library/Application Support/FAHClient/. Inside that folder, open the folder named 'work' and move the folder named '00' to the Trash. You may be prompted for an admin password; I forget which files in this folder are protected by being owned by a different user. Then you can start up the client, and it will do cleanup after detecting the work files are gone. It should then request a new WU.
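
If you would rather script the move than use Finder, the steps above could be sketched roughly as follows. This is only an illustration: `dump_work_unit` is a hypothetical helper and not part of the FAH client, it assumes folding is already paused, and on a real install the move may fail without admin rights for the ownership reason mentioned above. The demo runs against temporary folders and does not touch the real work folder.

```python
import shutil
import tempfile
from pathlib import Path

# Work folder path quoted in the directions above. For real use, pass this
# as work_dir (with folding paused); the demo below does not use it.
FAH_WORK_DIR = Path("/Library/Application Support/FAHClient/work")


def dump_work_unit(work_dir, trash_dir, slot="00", dry_run=True):
    """Move work_dir/slot into trash_dir and return the destination path.

    Hypothetical helper. With dry_run=True nothing is moved.
    """
    source = Path(work_dir) / slot
    if not source.is_dir():
        raise FileNotFoundError(f"no work folder at {source}")
    destination = Path(trash_dir) / f"FAH-work-{slot}"
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    if not dry_run:
        shutil.move(str(source), str(destination))
    return destination


# Demo on throwaway folders that mimic the layout work/00.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    trash = Path(tmp) / "trash"
    (work / "00").mkdir(parents=True)
    trash.mkdir()
    moved_to = dump_work_unit(work, trash, dry_run=False)
    print("moved to:", moved_to.name)
```

Leaving `dry_run=True` as the default means a first run only reports where the folder would go.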
scm2000
Posts: 26
Joined: Sun Mar 15, 2020 12:13 am

Re: 10 million steps? What should I do?

Post by scm2000 »

The only thing in Application Support is: FAHControl.db
scm2000
Posts: 26
Joined: Sun Mar 15, 2020 12:13 am

Re: 10 million steps? What should I do?

Post by scm2000 »

OK, all set. It's back running a normal work unit again.