Project: 6900 (Run 43, Clone 9, Gen 40)

Moderators: Site Moderators, FAHC Science Team

SergeantHop
Posts: 9
Joined: Fri Dec 25, 2009 3:37 am

Project: 6900 (Run 43, Clone 9, Gen 40)

Post by SergeantHop »

Code:

# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\Folding\Desktop\Folding\CPU
Executable: C:\Users\Folding\Desktop\Folding\CPU\F@H.exe
Arguments: -smp 11 -smp -bigadv -verbosity 9 

[13:59:35] - Ask before connecting: No
[13:59:35] - User name: SergeantHop (Team 37726)
[13:59:35] - User ID: 2ED8816D60CD120A
[13:59:35] - Machine ID: 1
[13:59:35] 
[13:59:35] Loaded queue successfully.
[13:59:35] 
[13:59:35] - Autosending finished units... [April 30 13:59:35 UTC]
[13:59:35] + Processing work unit
[13:59:35] Trying to send all finished work units
[13:59:36] Core required: FahCore_a5.exe
[13:59:36] + No unsent completed units remaining.
[13:59:36] Core found.
[13:59:36] - Autosend completed
[13:59:36] Working on queue slot 01 [April 30 13:59:36 UTC]
[13:59:36] + Working ...
[13:59:36] - Calling '.\FahCore_a5.exe -dir work/ -nice 19 -suffix 01 -np 12 -checkpoint 5 -verbose -lifeline 4324 -version 634'

[13:59:36] 
[13:59:37] *------------------------------*
[13:59:37] Folding@Home Gromacs SMP Core
[13:59:37] Version 2.27 (Mar 12, 2010)
[13:59:37] 
[13:59:37] Preparing to commence simulation
[13:59:37] - Ensuring status. Please wait.
[13:59:46] - Looking at optimizations...
[13:59:46] - Working with standard loops on this execution.
[13:59:46] - Previous termination of core was improper.
[13:59:46] - Files status OK
[13:59:50] - Expanded 24868002 -> 30796293 (decompressed 123.8 percent)
[13:59:50] Called DecompressByteArray: compressed_data_size=24868002 data_size=30796293, decompressed_data_size=30796293 diff=0
[13:59:50] - Digital signature verified
[13:59:50] 
[13:59:50] Project: 6900 (Run 43, Clone 9, Gen 40)
[13:59:50] 
[13:59:50] Entering M.D.
[13:59:56] Mapping NT from 12 to 12 
[13:59:59] Completed 0 out of 250000 steps  (0%)
[14:00:05] CoreStatus = C0000005 (-1073741819)
[14:00:05] Client-core communications error: ERROR 0xc0000005
[14:00:05] Deleting current work unit & continuing...
[14:00:19] Trying to send all finished work units
[14:00:19] + No unsent completed units remaining.
[14:00:19] - Preparing to get new work unit...
[14:00:19] Cleaning up work directory
[14:00:19] + Attempting to get work packet
[14:00:19] Passkey found
[14:00:19] - Will indicate memory of 6135 MB
[14:00:19] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 12, Stepping: 2
[14:00:19] - Connecting to assignment server
[14:00:19] Connecting to http://assign.stanford.edu:8080/
[14:00:19] Posted data.
[14:00:19] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:00:19] + News From Folding@Home: Welcome to Folding@Home
[14:00:19] Loaded queue successfully.
[14:00:19] Sent data
[14:00:19] Connecting to http://130.237.232.237:8080/
[14:00:26] Posted data.
[14:00:26] Initial: 0000; - Receiving payload (expected size: 24867563)

====================================================================================================================


[22:40:09] 
[22:40:09] + Processing work unit
[22:40:09] Core required: FahCore_a5.exe
[22:40:09] Core found.
[22:40:09] Working on queue slot 08 [May 2 22:40:09 UTC]
[22:40:09] + Working ...
[22:40:09] 
[22:40:09] *------------------------------*
[22:40:09] Folding@Home Gromacs SMP Core
[22:40:09] Version 2.27 (Mar 12, 2010)
[22:40:09] 
[22:40:09] Preparing to commence simulation
[22:40:09] - Looking at optimizations...
[22:40:09] - Created dyn
[22:40:09] - Files status OK
[22:40:13] - Expanded 24868002 -> 30796293 (decompressed 123.8 percent)
[22:40:13] Called DecompressByteArray: compressed_data_size=24868002 data_size=30796293, decompressed_data_size=30796293 diff=0
[22:40:14] - Digital signature verified
[22:40:14] 
[22:40:14] Project: 6900 (Run 43, Clone 9, Gen 40)
[22:40:14] 
[22:40:14] Assembly optimizations on if available.
[22:40:14] Entering M.D.
[22:40:20] Mapping NT from 11 to 10 
[22:40:22] Completed 0 out of 250000 steps  (0%)
[02:07:42] CoreStatus = C0000005 (-1073741819)
[02:07:42] Client-core communications error: ERROR 0xc0000005
[02:07:42] Deleting current work unit & continuing...
[02:07:56] - Preparing to get new work unit...
[02:07:56] Cleaning up work directory
[02:08:02] + Attempting to get work packet
[02:08:02] Passkey found
[02:08:02] - Connecting to assignment server
[02:08:02] - Successful: assigned to (130.237.232.141).
[02:08:02] + News From Folding@Home: Welcome to Folding@Home
[02:08:02] Loaded queue successfully.

=====================================================================================================================


[02:32:59] Trying to unzip core FahCore_a5.exe
[02:33:00] Decompressed FahCore_a5.exe (9326080 bytes) successfully
[02:33:05] + Core successfully engaged
[02:33:10] 
[02:33:10] + Processing work unit
[02:33:10] Core required: FahCore_a5.exe
[02:33:10] Core found.
[02:33:10] Working on queue slot 01 [May 3 02:33:10 UTC]
[02:33:10] + Working ...
[02:33:10] 
[02:33:10] *------------------------------*
[02:33:10] Folding@Home Gromacs SMP Core
[02:33:10] Version 2.27 (Mar 12, 2010)
[02:33:10] 
[02:33:10] Preparing to commence simulation
[02:33:10] - Looking at optimizations...
[02:33:10] - Created dyn
[02:33:10] - Files status OK
[02:33:15] - Expanded 24868002 -> 30796293 (decompressed 123.8 percent)
[02:33:15] Called DecompressByteArray: compressed_data_size=24868002 data_size=30796293, decompressed_data_size=30796293 diff=0
[02:33:15] - Digital signature verified
[02:33:15] 
[02:33:15] Project: 6900 (Run 43, Clone 9, Gen 40)
[02:33:15] 
[02:33:15] Assembly optimizations on if available.
[02:33:15] Entering M.D.
[02:33:21] Mapping NT from 11 to 10 
[02:33:23] Completed 0 out of 250000 steps  (0%)
[02:33:24] Gromacs cannot continue further.
[02:33:24] Going to send back what have done -- stepsTotalG=250000
[02:33:24] Work fraction=-1.#IND steps=250000.
[02:42:02] logfile size=0 infoLength=0 edr=0 trr=23
[02:42:02] logfile size: 0 info=0 bed=0 hdr=23
[02:42:02] - Writing 640 bytes of core data to disk...
[02:42:04] CoreStatus = C0000005 (-1073741819)
[02:42:04] Client-core communications error: ERROR 0xc0000005
[02:42:04] Deleting current work unit & continuing...

=====================================================================================================================
This is from two different machines, one running stock for one of the instances. 100% stable with all other units.
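For anyone reading the log: the two numbers on each CoreStatus line are the same value written two ways. C0000005 is the 32-bit hex code (on Windows this is the access-violation status), and -1073741819 is that same code read as a signed integer. A minimal Python sketch of the conversion follows; the helper name to_signed32 is made up for illustration and is not part of the client.

```python
def to_signed32(hex_code: str) -> int:
    """Interpret a 32-bit hex status code as a signed (two's complement) integer."""
    value = int(hex_code, 16)
    # Values with the top bit set are negative when read as signed 32-bit.
    if value >= 0x80000000:
        value -= 0x100000000
    return value


# The hex code from the CoreStatus lines in the log above.
print(to_signed32("C0000005"))
```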
k1wi
Posts: 909
Joined: Tue Sep 22, 2009 10:48 pm

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by k1wi »

Are both machines using -smp 11? Using a prime number of threads can lead to errors.

http://foldingforum.org/viewtopic.php?p=143042#p143042
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by bruce »

The WU (P6900,R43,C9,G40) has been reported as a bad WU.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by 7im »

Yes, it's tough to tell how many threads are being used... 11 or 12...

Executable: C:\Users\Folding\Desktop\Folding\CPU\F@H.exe
Arguments: -smp 11 -smp -bigadv -verbosity 9


Looking lower in the log file, I guess -smp wins out over -smp 11, even when 11 comes first in the list.

[13:59:36] - Calling '.\FahCore_a5.exe -dir work/ -nice 19 -np 12 -checkpoint 5 -verbose -lifeline 4324 -version 634'
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
SergeantHop
Posts: 9
Joined: Fri Dec 25, 2009 3:37 am

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by SergeantHop »

It is in fact running on -smp 11, just like it always has. This is the only unit that it errors out on like this.

The only reason it has both -smp and -smp 11 is because it wouldn't download smp units with just the -smp 11 flag; adding the second, generic one solved that issue.

Also, it totally only runs on 11 cores. Utilization is around 92%.
k1wi
Posts: 909
Joined: Tue Sep 22, 2009 10:48 pm

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by k1wi »

Is it possible that it's generating 12 threads but only actually folding on 11 of your computer's threads? That might be the issue and could lead to lower performance as the extra thread has to battle to 'catch up'...

There is no reason that it should not work on -smp 11, unless Stanford have stopped allowing -smp 11. Having to use both -smp 11 and -smp suggests that there is an issue that needs to be resolved...

What are you also running that you are leaving that extra core spare for?
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by bruce »

k1wi wrote: There is no reason that it should not work on -smp 11, unless Stanford have stopped allowing -smp 11. Having to use both -smp 11 and -smp suggests that there is an issue that needs to be resolved...
You've got it backward. The reason that Stanford is placing restrictions on folding with prime numbers of threads is that Gromacs isn't designed to work on less than a complete machine. Some WUs may work with 11 threads, but the failure rate is going to be higher and there's nothing that can be done about it except to force folks not to use numbers like that. If you override those protections, you're doing it at your own risk and Stanford cannot support you.
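As a rough illustration of the prime-number point, here is a minimal Python sketch that flags the thread counts mentioned in this thread. It only tests primality; the exact rule the client or server applies is not spelled out here, so treat the check and the helper name is_prime as assumptions for illustration.

```python
def is_prime(n: int) -> bool:
    """Return True if n is a prime number (simple trial division)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Thread counts that come up in this thread: -smp 7, -smp 11, and the
# 10 and 12 that appear in the log.
for threads in (7, 10, 11, 12):
    note = "prime, higher failure risk per the post above" if is_prime(threads) else "not prime"
    print(threads, note)
```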
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by 7im »

It gets even stranger... I didn't read far enough down the log...

Code:

[22:40:20] Mapping NT from 11 to 10
This makes it look like it's only folding on 10 cores, and the prime thing doesn't apply.

Bad WU (as bruce already noted). :twisted:
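To compare the thread counts across runs without reading the whole log by eye, something like the following Python sketch could pull out the "-np" value from the core launch line and the "Mapping NT from X to Y" values. The sample lines are copied from the first post; the regular expressions and the function name thread_counts are assumptions for illustration, and what the core does internally when it remaps NT is not documented in this thread.

```python
import re

# Sample lines copied from the log in the first post.
LOG_LINES = [
    "[13:59:36] - Calling '.\\FahCore_a5.exe -dir work/ -nice 19 -suffix 01 "
    "-np 12 -checkpoint 5 -verbose -lifeline 4324 -version 634'",
    "[13:59:56] Mapping NT from 12 to 12 ",
    "[22:40:20] Mapping NT from 11 to 10 ",
]

NP_RE = re.compile(r"-np (\d+)")
MAP_RE = re.compile(r"Mapping NT from (\d+) to (\d+)")


def thread_counts(lines):
    """Yield (kind, values) for each line that reports a thread count."""
    for line in lines:
        m = MAP_RE.search(line)
        if m:
            yield ("mapped", (int(m.group(1)), int(m.group(2))))
            continue
        m = NP_RE.search(line)
        if m:
            yield ("requested", (int(m.group(1)),))


for kind, values in thread_counts(LOG_LINES):
    print(kind, values)
```

On the sample lines this shows the first run launched with 12 threads and the later run mapped from 11 down to 10, which is the discrepancy being discussed.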
k1wi
Posts: 909
Joined: Tue Sep 22, 2009 10:48 pm

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by k1wi »

bruce wrote:
k1wi wrote: There is no reason that it should not work on -smp 11, unless Stanford have stopped allowing -smp 11. Having to use both -smp 11 and -smp suggests that there is an issue that needs to be resolved...
You've got it backward. The reason that Stanford is placing restrictions on folding with prime numbers of threads is that Gromacs isn't designed to work on less than a complete machine. Some WUs may work with 11 threads, but the failure rate is going to be higher and there's nothing that can be done about it except to force folks not to use numbers like that. If you override those protections, you're doing it at your own risk and Stanford cannot support you.
Right, I was saying that the issue needs to be resolved with the user, not Stanford... Anyway, as 7im says, it looks like a dodgy WU.
SergeantHop
Posts: 9
Joined: Fri Dec 25, 2009 3:37 am

Re: Project: 6900 (Run 43, Clone 9, Gen 40)

Post by SergeantHop »

I'm running 2 fermi cards, and it lags to high hell if I don't leave a core free. I've also always either run -smp 11 or -smp 7 on a quad, never with any issue. It has always got better PPD for me, so I tend to use it, and it's never crashed, except with this unit.