Did the log indicate such behavior? It might be a wrong estimate, or it might be a checkpoint issue; I'm not sure. I've never changed the SMP number mid-WU (in fact, I've never changed it at all).
You might want to post this in a new thread, though.
Points right for my system?
-
- Site Admin
- Posts: 7990
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
- Location: W. MA
Re: Points right for my system?
p24601 wrote:
Hi, my CPU is now doing project 7809 and the ETA was (I think) 2.75 days. I had 1.05 days left when I turned SMP back from 2 to -1.
I checked this morning and the ETA was 3.75 days. What?! I've now set SMP to 2 again, shut down the client, restarted, waited a few minutes, and the ETA is now 20 hours 46 minutes.
So what happened?

I would have to see the log to be more sure of the reasons. Potential reasons include a corrupted checkpoint that made the WU start over from the beginning, or just the known issue of the TPF and ETA calculations not necessarily being accurate.
Are you still running GPU folding on your system? SMP folding is very sensitive to synchronization between the threads running on the different cores. Your 6570 will use most of a CPU core just feeding the GPU folding process, so the SMP process will be severely impacted with a setting of -1 (4 cores). That would explain the ETA of over 3 days after the first change. If you coincidentally managed to corrupt the checkpoint on the second restart, then the WU could start over from the beginning.
-
- Posts: 5
- Joined: Thu Apr 19, 2012 4:17 pm
- Hardware configuration: Core2 Quad Core Q6600 2.40GHz 4GB Windows 7
Sapphire HD6570 2GB
Re: Points right for my system?
Thanks for everybody's help! Since it's back to normal, I won't worry about it.
Anyway, here's (I think) the log:
*********************** Log Started 2012-05-13T11:11:05Z ***********************
11:11:07:WU02:FS01:Starting
11:11:07:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
11:11:07:WU02:FS01:Started FahCore on PID 7440
11:11:08:WU02:FS01:Core PID:7488
11:11:08:WU02:FS01:FahCore 0xa4 started
11:11:08:WU02:FS01:0xa4:
11:11:08:WU02:FS01:0xa4:*------------------------------*
11:11:08:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
11:11:08:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
11:11:08:WU02:FS01:0xa4:
11:11:08:WU02:FS01:0xa4:Preparing to commence simulation
11:11:08:WU02:FS01:0xa4:- Looking at optimizations...
11:11:08:WU02:FS01:0xa4:- Files status OK
11:11:09:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
11:11:09:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
11:11:09:WU02:FS01:0xa4:- Digital signature verified
11:11:09:WU02:FS01:0xa4:
11:11:09:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
11:11:09:WU02:FS01:0xa4:
11:11:09:WU02:FS01:0xa4:Assembly optimizations on if available.
11:11:09:WU02:FS01:0xa4:Entering M.D.
11:11:15:WU02:FS01:0xa4:Using Gromacs checkpoints
11:11:15:WU02:FS01:0xa4:Mapping NT from 2 to 2
11:11:16:WU02:FS01:0xa4:Resuming from checkpoint
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.log
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.trr
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.edr
11:11:17:WU02:FS01:0xa4:Completed 1045380 out of 1500000 steps (69%)
11:15:06:FS01:Paused
11:15:06:FS01:Shutting core down
11:15:10:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
13:07:20:FS01:Unpaused
13:07:20:WU02:FS01:Starting
13:07:20:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
13:07:20:WU02:FS01:Started FahCore on PID 6352
13:07:20:WU02:FS01:Core PID:7712
13:07:20:WU02:FS01:FahCore 0xa4 started
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:*------------------------------*
13:07:21:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
13:07:21:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Preparing to commence simulation
13:07:21:WU02:FS01:0xa4:- Looking at optimizations...
13:07:21:WU02:FS01:0xa4:- Files status OK
13:07:21:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
13:07:21:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
13:07:21:WU02:FS01:0xa4:- Digital signature verified
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Assembly optimizations on if available.
13:07:21:WU02:FS01:0xa4:Entering M.D.
13:07:27:WU02:FS01:0xa4:Using Gromacs checkpoints
13:07:27:WU02:FS01:0xa4:Mapping NT from 2 to 2
13:07:28:WU02:FS01:0xa4:Resuming from checkpoint
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.log
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.trr
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.edr
13:07:28:WU02:FS01:0xa4:Completed 1045380 out of 1500000 steps (69%)
13:19:53:WU02:FS01:0xa4:Completed 1050000 out of 1500000 steps (70%)
13:29:54:FS01:Paused
13:29:54:FS01:Shutting core down
13:30:04:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
14:17:58:FS01:Unpaused
14:17:58:WU02:FS01:Starting
14:17:58:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
14:17:58:WU02:FS01:Started FahCore on PID 1140
14:17:58:WU02:FS01:Core PID:6952
14:17:58:WU02:FS01:FahCore 0xa4 started
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:*------------------------------*
14:17:59:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
14:17:59:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Preparing to commence simulation
14:17:59:WU02:FS01:0xa4:- Looking at optimizations...
14:17:59:WU02:FS01:0xa4:- Files status OK
14:17:59:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
14:17:59:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
14:17:59:WU02:FS01:0xa4:- Digital signature verified
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Assembly optimizations on if available.
14:17:59:WU02:FS01:0xa4:Entering M.D.
14:18:05:WU02:FS01:0xa4:Using Gromacs checkpoints
14:18:05:WU02:FS01:0xa4:Mapping NT from 2 to 2
14:18:06:WU02:FS01:0xa4:Resuming from checkpoint
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.log
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.trr
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.edr
14:18:07:WU02:FS01:0xa4:Completed 1050980 out of 1500000 steps (70%)
14:55:28:WU02:FS01:0xa4:Completed 1065000 out of 1500000 steps (71%)
15:37:23:WU02:FS01:0xa4:Completed 1080000 out of 1500000 steps (72%)
16:19:25:WU02:FS01:0xa4:Completed 1095000 out of 1500000 steps (73%)
17:00:14:WU02:FS01:0xa4:Completed 1110000 out of 1500000 steps (74%)
******************************** Date: 13/05/12 ********************************
17:40:40:WU02:FS01:0xa4:Completed 1125000 out of 1500000 steps (75%)
18:24:47:WU02:FS01:0xa4:Completed 1140000 out of 1500000 steps (76%)
cross/'kros/ n: a thing they nail people to.
I Am The Way To A Forsaken People...
-
- Site Moderator
- Posts: 2850
- Joined: Mon Jul 18, 2011 4:44 am
- Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
- Location: Western Washington
Re: Points right for my system?
IIRC, in v6, if you changed the number of cores SMP could use during processing, the WU would restart for scientific reliability reasons or something like that. Not sure if v7 has the same behavior, though. It might...
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
-
- Site Admin
- Posts: 7990
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
- Location: W. MA
Re: Points right for my system?
Jesse_V wrote:
IIRC, in v6, if you changed the number of cores SMP could use during processing, the WU would restart for scientific reliability reasons or something like that. Not sure if v7 has the same behavior, though. It might...

I never saw that happen in either V6 or V7 using the OS X versions of the client. It could have done that for some specific WU/core/client-version combinations, but as best I can tell it was not a general thing. Of course, those reports could have been mistaking a restart from the beginning, caused by a corrupted checkpoint that happened to coincide with the SMP change, for this behavior.
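For anyone who wants to check their own log for a restart from the beginning, here is a rough Python sketch. It only relies on the "Starting" and "Completed N out of M steps" lines that appear in the log posted in this thread. The function names are made up for illustration, and the rule it applies (a start whose first reported step count is lower than the previous start's means progress was lost) is my assumption, not something the client documents.

```python
import re

# Line formats are taken from the log posted in this thread.
START = re.compile(r":WU\d+:FS\d+:Starting$")
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")

def first_progress_per_start(lines):
    """Return the first reported step count after each 'Starting' line."""
    firsts = []
    waiting = False
    for line in lines:
        line = line.rstrip()
        if START.search(line):
            waiting = True
            continue
        m = PROGRESS.search(line)
        if m and waiting:
            firsts.append(int(m.group(1)))
            waiting = False
    return firsts

def lost_progress(firsts):
    """Indices of starts that began below the previous start's step count.

    Assumption: a healthy resume never begins earlier than the previous
    resume point, so a lower value suggests the WU went back (possibly to 0).
    """
    return [i for i in range(1, len(firsts)) if firsts[i] < firsts[i - 1]]
```

On the log above, the three starts resume at 1045380, 1045380 and 1050980 steps, so nothing is flagged: that WU was resuming from its checkpoint each time, not starting over.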
Re: Points right for my system?
Waiting "a few minutes" isn't long enough to calculate a new ETA. If the current WU is K% completed, wait until it reaches at least (K+3)% or (K+4)%.
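To put numbers on that, here is a minimal Python sketch that estimates the remaining time from the "Completed N out of M steps" lines in a log, and refuses to give an answer until at least 3% of the WU has gone by between the first and last sample. This is not how the client computes its own ETA; the function names and the `min_percent` parameter are just illustrative. It also assumes the lines come from one uninterrupted run on the same day (no pauses, no midnight crossing).

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as:
# 14:55:28:WU02:FS01:0xa4:Completed 1065000 out of 1500000 steps (71%)
PROGRESS = re.compile(r"^(\d\d:\d\d:\d\d):.*Completed (\d+) out of (\d+) steps")

def parse_progress(lines):
    """Return (time, steps_done, total_steps) for every progress line."""
    points = []
    for line in lines:
        m = PROGRESS.match(line)
        if m:
            t = datetime.strptime(m.group(1), "%H:%M:%S")
            points.append((t, int(m.group(2)), int(m.group(3))))
    return points

def estimate_remaining(lines, min_percent=3.0):
    """Estimate time left, or None if too little progress has been observed."""
    points = parse_progress(lines)
    if len(points) < 2:
        return None
    (t0, s0, total), (t1, s1, _) = points[0], points[-1]
    if (s1 - s0) * 100.0 / total < min_percent:
        return None  # fewer than min_percent points of progress: too noisy
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        return None  # samples out of order or spanning midnight
    return timedelta(seconds=elapsed * (total - s1) / (s1 - s0))
```

Fed the 14:55:28 (71%) and 18:24:47 (76%) lines from the log posted above, this gives roughly 16 hours 45 minutes remaining. Fed only the 69% and 70% lines, it returns `None`, which is the point: that is not enough progress to trust.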
Posting FAH's log:
How to provide enough info to get helpful support.