Points right for my system?


iceman1992
Posts: 523
Joined: Fri Mar 23, 2012 5:16 pm

Re: Points right for my system?

Post by iceman1992 »

Did the log indicate such behavior? It might be a wrong estimate, or it might be a checkpoint issue. I'm not sure. I've never changed the SMP number mid-WU (in fact, I've never changed it at all).

You might want to post this in a new thread, though.
Joe_H
Site Admin
Posts: 7990
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Points right for my system?

Post by Joe_H »

p24601 wrote: Hi, my CPU is now doing project 7809 and the ETA was (I think) 2.75 days. I had 1.05 days left when I turned SMP back from 2 to -1.
I checked this morning and the ETA was 3.75 days :shock: What?! I've now set SMP to 2 again, shut down the client, restarted, waited a few minutes, and the ETA is now 20 hours 46 minutes :o

So what happened?

I would have to see the log to be more sure of the reasons. Possible causes include a corrupted checkpoint that forced the WU to start over from the beginning, or just the known issue that the TPF and ETA calculations are not always accurate.

Are you still running GPU folding on your system? SMP folding is very sensitive to synchronization between the threads running on the different cores. Your 6570 will use most of a CPU core just feeding the GPU folding process, so the SMP process will be severely impacted with a setting of -1 (4 cores). That would explain the ETA of over 3 days after the first change. If you coincidentally managed to corrupt the checkpoint on the second restart, then the WU could start over from the beginning.
p24601
Posts: 5
Joined: Thu Apr 19, 2012 4:17 pm
Hardware configuration: Core2 Quad Core Q6600 2.40GHz 4GB Windows 7
Sapphire HD6570 2GB

Re: Points right for my system?

Post by p24601 »

Thanks for everybody's help! Since it's back to normal, I won't worry about it.
Anyway, here's (I think) the log:

*********************** Log Started 2012-05-13T11:11:05Z ***********************
11:11:07:WU02:FS01:Starting
11:11:07:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
11:11:07:WU02:FS01:Started FahCore on PID 7440
11:11:08:WU02:FS01:Core PID:7488
11:11:08:WU02:FS01:FahCore 0xa4 started
11:11:08:WU02:FS01:0xa4:
11:11:08:WU02:FS01:0xa4:*------------------------------*
11:11:08:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
11:11:08:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
11:11:08:WU02:FS01:0xa4:
11:11:08:WU02:FS01:0xa4:Preparing to commence simulation
11:11:08:WU02:FS01:0xa4:- Looking at optimizations...
11:11:08:WU02:FS01:0xa4:- Files status OK
11:11:09:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
11:11:09:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
11:11:09:WU02:FS01:0xa4:- Digital signature verified
11:11:09:WU02:FS01:0xa4:
11:11:09:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
11:11:09:WU02:FS01:0xa4:
11:11:09:WU02:FS01:0xa4:Assembly optimizations on if available.
11:11:09:WU02:FS01:0xa4:Entering M.D.
11:11:15:WU02:FS01:0xa4:Using Gromacs checkpoints
11:11:15:WU02:FS01:0xa4:Mapping NT from 2 to 2 
11:11:16:WU02:FS01:0xa4:Resuming from checkpoint
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.log
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.trr
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
11:11:16:WU02:FS01:0xa4:Verified 02/wudata_01.edr
11:11:17:WU02:FS01:0xa4:Completed 1045380 out of 1500000 steps  (69%)
11:15:06:FS01:Paused
11:15:06:FS01:Shutting core down
11:15:10:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
13:07:20:FS01:Unpaused
13:07:20:WU02:FS01:Starting
13:07:20:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
13:07:20:WU02:FS01:Started FahCore on PID 6352
13:07:20:WU02:FS01:Core PID:7712
13:07:20:WU02:FS01:FahCore 0xa4 started
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:*------------------------------*
13:07:21:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
13:07:21:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Preparing to commence simulation
13:07:21:WU02:FS01:0xa4:- Looking at optimizations...
13:07:21:WU02:FS01:0xa4:- Files status OK
13:07:21:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
13:07:21:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
13:07:21:WU02:FS01:0xa4:- Digital signature verified
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
13:07:21:WU02:FS01:0xa4:
13:07:21:WU02:FS01:0xa4:Assembly optimizations on if available.
13:07:21:WU02:FS01:0xa4:Entering M.D.
13:07:27:WU02:FS01:0xa4:Using Gromacs checkpoints
13:07:27:WU02:FS01:0xa4:Mapping NT from 2 to 2 
13:07:28:WU02:FS01:0xa4:Resuming from checkpoint
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.log
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.trr
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
13:07:28:WU02:FS01:0xa4:Verified 02/wudata_01.edr
13:07:28:WU02:FS01:0xa4:Completed 1045380 out of 1500000 steps  (69%)
13:19:53:WU02:FS01:0xa4:Completed 1050000 out of 1500000 steps  (70%)
13:29:54:FS01:Paused
13:29:54:FS01:Shutting core down
13:30:04:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
14:17:58:FS01:Unpaused
14:17:58:WU02:FS01:Starting
14:17:58:WU02:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/[private]/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 701 -lifeline 7036 -checkpoint 15 -np 2
14:17:58:WU02:FS01:Started FahCore on PID 1140
14:17:58:WU02:FS01:Core PID:6952
14:17:58:WU02:FS01:FahCore 0xa4 started
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:*------------------------------*
14:17:59:WU02:FS01:0xa4:Folding@Home Gromacs GB Core
14:17:59:WU02:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Preparing to commence simulation
14:17:59:WU02:FS01:0xa4:- Looking at optimizations...
14:17:59:WU02:FS01:0xa4:- Files status OK
14:17:59:WU02:FS01:0xa4:- Expanded 2079470 -> 5386224 (decompressed 259.0 percent)
14:17:59:WU02:FS01:0xa4:Called DecompressByteArray: compressed_data_size=2079470 data_size=5386224, decompressed_data_size=5386224 diff=0
14:17:59:WU02:FS01:0xa4:- Digital signature verified
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Project: 7809 (Run 0, Clone 26, Gen 72)
14:17:59:WU02:FS01:0xa4:
14:17:59:WU02:FS01:0xa4:Assembly optimizations on if available.
14:17:59:WU02:FS01:0xa4:Entering M.D.
14:18:05:WU02:FS01:0xa4:Using Gromacs checkpoints
14:18:05:WU02:FS01:0xa4:Mapping NT from 2 to 2 
14:18:06:WU02:FS01:0xa4:Resuming from checkpoint
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.log
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.trr
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.xtc
14:18:06:WU02:FS01:0xa4:Verified 02/wudata_01.edr
14:18:07:WU02:FS01:0xa4:Completed 1050980 out of 1500000 steps  (70%)
14:55:28:WU02:FS01:0xa4:Completed 1065000 out of 1500000 steps  (71%)
15:37:23:WU02:FS01:0xa4:Completed 1080000 out of 1500000 steps  (72%)
16:19:25:WU02:FS01:0xa4:Completed 1095000 out of 1500000 steps  (73%)
17:00:14:WU02:FS01:0xa4:Completed 1110000 out of 1500000 steps  (74%)
******************************** Date: 13/05/12 ********************************
17:40:40:WU02:FS01:0xa4:Completed 1125000 out of 1500000 steps  (75%)
18:24:47:WU02:FS01:0xa4:Completed 1140000 out of 1500000 steps  (76%)
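
For anyone who wants to check frame times by hand, here is a minimal Python sketch that pulls the "Completed ... steps" lines out of a log like the one above and reports how long each full 1% frame took. The function name `frame_times` and the regular expression are assumptions based only on the line format visible in this log; they are not part of FAHClient.

```python
import re

# Matches lines such as:
# 14:55:28:WU02:FS01:0xa4:Completed 1065000 out of 1500000 steps  (71%)
# (pattern is an assumption based on the log format shown in this thread)
STEP_LINE = re.compile(r"^(\d\d):(\d\d):(\d\d):.*Completed (\d+) out of (\d+) steps")


def frame_times(log_lines):
    """Return the seconds taken by each full 1% frame found in the log."""
    points = []
    for line in log_lines:
        m = STEP_LINE.match(line)
        if m:
            h, mi, s, steps, total = map(int, m.groups())
            points.append((h * 3600 + mi * 60 + s, steps, total))

    times = []
    for (t0, s0, total), (t1, s1, _) in zip(points, points[1:]):
        # Keep only consecutive entries exactly one frame (1% of the WU) apart.
        # In this log that skips the partial frames right after a resume.
        if (s1 - s0) * 100 == total:
            times.append((t1 - t0) % 86400)  # modulo covers a midnight rollover
    return times


sample = [
    "14:18:07:WU02:FS01:0xa4:Completed 1050980 out of 1500000 steps  (70%)",
    "14:55:28:WU02:FS01:0xa4:Completed 1065000 out of 1500000 steps  (71%)",
    "15:37:23:WU02:FS01:0xa4:Completed 1080000 out of 1500000 steps  (72%)",
    "16:19:25:WU02:FS01:0xa4:Completed 1095000 out of 1500000 steps  (73%)",
    "17:00:14:WU02:FS01:0xa4:Completed 1110000 out of 1500000 steps  (74%)",
    "******************************** Date: 13/05/12 ********************************",
    "17:40:40:WU02:FS01:0xa4:Completed 1125000 out of 1500000 steps  (75%)",
    "18:24:47:WU02:FS01:0xa4:Completed 1140000 out of 1500000 steps  (76%)",
]

for secs in frame_times(sample):
    print(f"{secs // 60} min {secs % 60} s")
```

On the excerpt above this gives frame times of roughly 40 to 44 minutes with -np 2.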
cross/'kros/ n: a thing they nail people to.
I Am The Way To A Forsaken People...
Jesse_V
Site Moderator
Posts: 2850
Joined: Mon Jul 18, 2011 4:44 am
Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
Location: Western Washington

Re: Points right for my system?

Post by Jesse_V »

IIRC, in v6, if you changed the number of cores SMP could use while a WU was processing, the WU would restart for scientific reliability reasons or something like that. Not sure if v7 has the same behavior, though. It might...
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
Joe_H
Site Admin
Posts: 7990
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Points right for my system?

Post by Joe_H »

Jesse_V wrote: IIRC, in v6, if you changed the number of cores SMP could use while a WU was processing, the WU would restart for scientific reliability reasons or something like that. Not sure if v7 has the same behavior, though. It might...

I never saw that happen in either V6 or V7 using the OS X versions of the client. It could have happened for some specific WU/core/client-version combinations, but as best I can tell it was not a general thing. Of course, reports could have been confusing it with a restart from the beginning due to a corrupted checkpoint that coincided with the SMP change.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Points right for my system?

Post by bruce »

Waiting "a few minutes" isn't long enough to calculate a new ETA. If the current WU is K% completed, wait until it reaches at least (K+3)% or (K+4)%.
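
To make that rule of thumb concrete, here is a rough Python sketch of an ETA estimate that refuses to answer until a few full frames have been timed. The function `estimate_eta` and its `min_frames` parameter are hypothetical names for illustration, not how the client actually computes its ETA. The frame times in the example are the differences between the 71% to 76% timestamps in the log posted earlier in this thread.

```python
from statistics import mean


def estimate_eta(percent_done, frame_secs, min_frames=3):
    """Estimate remaining seconds from measured per-frame (1%) times.

    Returns None until at least `min_frames` full frames have been timed,
    mirroring the advice to wait for (K+3)% before trusting an ETA.
    """
    if len(frame_secs) < min_frames:
        return None
    return (100 - percent_done) * mean(frame_secs)


# Seconds per 1% frame, taken from the 71%..76% lines of the posted log.
frames = [2515, 2522, 2449, 2426, 2647]

print(estimate_eta(70, frames[:1]))  # only one frame timed, so no estimate yet

eta = estimate_eta(76, frames)
print(f"{eta / 3600:.1f} hours left")
```

With frames of around 40 minutes each, a few minutes after a restart does not cover even one full frame, so any ETA shown at that point is based on stale or partial timing.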