Project 6023 (Run 1, Clone 36, Gen 23)

Moderators: Site Moderators, FAHC Science Team

Phantom
Posts: 23
Joined: Mon Dec 03, 2007 2:14 am
Location: teammacosx.org

Project 6023 (Run 1, Clone 36, Gen 23)

Post by Phantom »

Got a strange one here, and I'd recommend someone keep an eye out for it. It only uses 75% of the CPU on a dual C2D Mac Pro running the latest Snow Leopard (10.6.2).

It never reaches 1% completion; it just sits at 0% for hours. Frame times are normally around 7 minutes. I terminated it manually at 16:55:37 in the log below.

Code:

[10:42:29] - Preparing to get new work unit...
[10:42:29] Cleaning up work directory
[10:42:29] + Attempting to get work packet
[10:42:29] Passkey found
[10:42:29] - Will indicate memory of 8192 MB
[10:42:29] - Connecting to assignment server
[10:42:29] Connecting to http://assign.stanford.edu:8080/
[10:42:30] Posted data.
[10:42:30] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[10:42:30] + News From Folding@Home: Welcome to Folding@Home
[10:42:30] Loaded queue successfully.
[10:42:30] Sent data
[10:42:30] Connecting to http://171.64.65.54:8080/
[10:42:30] Posted data.
[10:42:30] Initial: 0000; - Receiving payload (expected size: 1768098)
[10:42:32] - Downloaded at ~863 kB/s
[10:42:32] - Averaged speed for that direction ~750 kB/s
[10:42:32] + Received work.
[10:42:32] Trying to send all finished work units
[10:42:32] + No unsent completed units remaining.
[10:42:32] + Closed connections
[10:42:32] 
[10:42:32] + Processing work unit
[10:42:32] Core required: FahCore_a3.exe
[10:42:32] Core found.
[10:42:32] Working on queue slot 04 [February 26 10:42:32 UTC]
[10:42:32] + Working ...
[10:42:32] - Calling './FahCore_a3.exe -dir work/ -nice 19 -suffix 04 -np 4 -checkpoint 15 -forceasm -verbose -lifeline 284 -version 629'

[10:42:32] 
[10:42:32] *------------------------------*
[10:42:32] Folding@Home Gromacs SMP Core
[10:42:32] Version 2.13 (Dec 9 2009)
[10:42:32] 
[10:42:32] Preparing to commence simulation
[10:42:32] - Assembly optimizations manually forced on.
[10:42:32] - Not checking prior termination.
[10:42:33] - Expanded 1767586 -> 1967109 (decompressed 111.2 percent)
[10:42:33] Called DecompressByteArray: compressed_data_size=1767586 data_size=1967109, decompressed_data_size=1967109 diff=0
[10:42:33] - Digital signature verified
[10:42:33] 
[10:42:33] Project: 6023 (Run 1, Clone 36, Gen 23)
[10:42:33] 
[10:42:33] Assembly optimizations on if available.
[10:42:33] Entering M.D.
[10:42:39] Completed 0 out of 500000 steps  (0%)
[14:26:42] - Autosending finished units... [February 26 14:26:42 UTC]
[14:26:42] Trying to send all finished work units
[14:26:42] + No unsent completed units remaining.
[14:26:42] - Autosend completed
[16:55:37] ***** Got a SIGTERM signal (15)
[16:55:37] Killing all core threads

Folding@Home Client Shutdown.

I've stopped the WU, restarted the system, and resumed the WU; however, it still doesn't move past 0%. Nothing unusual shows up in the log.

FahCore_a3.exe was running with 7 threads and consuming only 75% of my CPUs (shows as 300% in Activity Monitor). Most of the other FahCore_a3 WUs seem to run with 8 threads and consume nearly 100% of each CPU on my system (shows as >390% in Activity Monitor).

Dumped WU after no progress. New WU was assigned.
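
For anyone who wants to catch this kind of stall without watching the log by hand, here is a minimal sketch in Python. It assumes the log format shown above (a `[HH:MM:SS]` stamp on each line and `Completed N out of M steps` progress lines). The function names and the threshold of three normal frame times are my own choices for illustration, not anything the client provides.

```python
import re

STAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]")
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")


def seconds_since_progress(log_lines):
    """Return (last_step, gap_seconds), or None if no progress line exists.

    gap_seconds is the time between the last 'Completed ...' line and the
    last timestamped line. The log has no dates, so a gap that crosses
    midnight is handled only once (modulo 24 hours).
    """
    last_progress = None  # (seconds_of_day, step)
    last_seen = None      # seconds_of_day of the newest stamped line
    for line in log_lines:
        stamp = STAMP.match(line)
        if not stamp:
            continue
        hours, minutes, seconds = map(int, stamp.groups())
        now = hours * 3600 + minutes * 60 + seconds
        last_seen = now
        progress = PROGRESS.search(line)
        if progress:
            last_progress = (now, int(progress.group(1)))
    if last_progress is None:
        return None
    gap = (last_seen - last_progress[0]) % 86400
    return last_progress[1], gap


def is_stalled(log_lines, normal_frame_s=7 * 60, factor=3):
    """Flag a WU whose last progress line is older than factor * frame time.

    The default of 3 frame times is an arbitrary assumption; tune it.
    """
    result = seconds_since_progress(log_lines)
    return result is not None and result[1] > factor * normal_frame_s


if __name__ == "__main__":
    excerpt = [
        "[10:42:33] Entering M.D.",
        "[10:42:39] Completed 0 out of 500000 steps  (0%)",
        "[14:26:42] - Autosend completed",
        "[16:55:37] ***** Got a SIGTERM signal (15)",
    ]
    print(seconds_since_progress(excerpt), is_stalled(excerpt))
```

On the excerpt from my log, the gap between the 0% line and the SIGTERM works out to a bit over six hours, far beyond the usual 7-minute frame time.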
Wrish
Posts: 74
Joined: Thu Jan 28, 2010 5:09 am

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by Wrish »

I'm a little confused. You have 4 cores (2 dual core processors) but A3 spawns 7 or 8 threads? I'm unfamiliar with Snow Leopard but wonder how you know it's 7 threads. Do check that when your client is stopped, there is no A3 process still hanging around. A3 should spawn 4 threads unless you force it with -smp 8, and if no other process is hogging a CPU core, it should use close to 400% CPU.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by bruce »

FahCore_a3 will spawn one worker thread per CPU-core (unless you specify otherwise) plus several other threads that do almost nothing except wait for a timer to tell them to run a brief task and go back to sleep. For example, there's a thread that waits 6 hours before checking the queue to see if there are any WUs that still need to be uploaded.

In *nix, it's easy to see the threads. In Windows, the threads are normally hidden within a smaller number of processes.
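
To make the thread count less mysterious, here is a toy model in Python of the layout described above: one worker per core plus a housekeeping thread that sleeps on a timer. It is only an illustration and not how FahCore_a3 is actually written. The names (`start_threads`, `upload-timer`) are made up, and the workers just block instead of computing.

```python
import threading


def start_threads(n_workers, timer_interval_s=6 * 3600):
    """Start n_workers worker threads plus one mostly idle timer thread.

    Returns (stop_event, threads). Set the event to shut everything down.
    """
    stop = threading.Event()

    def worker():
        # Stand-in for the compute loop; a real core would crunch numbers
        # here until told to stop.
        stop.wait()

    def housekeeping():
        # Sleep for the interval, run a brief task, go back to sleep.
        # Event.wait() returns True once stop is set, which ends the loop.
        while not stop.wait(timer_interval_s):
            pass  # e.g. check the queue for unsent WUs

    threads = [
        threading.Thread(target=worker, name=f"worker-{i}", daemon=True)
        for i in range(n_workers)
    ]
    threads.append(
        threading.Thread(target=housekeeping, name="upload-timer", daemon=True)
    )
    for thread in threads:
        thread.start()
    return stop, threads


if __name__ == "__main__":
    stop, threads = start_threads(4)
    print(len(threads), "threads started for 4 workers")
    stop.set()
    for thread in threads:
        thread.join()
```

A thread listing would show five threads here even though only four could ever keep a core busy, which is why the thread count alone says little about CPU usage.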
Phantom
Posts: 23
Joined: Mon Dec 03, 2007 2:14 am
Location: teammacosx.org

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by Phantom »

The evil WU was recently reassigned to me. Exact same symptoms: stuck at 0% completion and using only 300% of an otherwise idle Mac Pro. Unfortunately, I wasn't around to catch it earlier and have only just noticed the repeat.

Dumping that bad WU again. (Hope it will successfully complete for some other slightly different configuration -- or that it gets removed from the work queue...)

Fold on!
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by bruce »

Was that the same machine or another one? If it's the same one, what other WUs were completed between your first report and this one?
Phantom
Posts: 23
Joined: Mon Dec 03, 2007 2:14 am
Location: teammacosx.org

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by Phantom »

Bruce -- Yes. Same machine. Between the two instances of the evil WU, the machine processed the following WUs:

Assigned @ [February 26 17:35:48 UTC] Project: 6012 (Run 0, Clone 145, Gen 73) --> CoreStatus = 64
(Note that this WU was erroneously reassigned to this same machine below, even though successfully completed and results returned)

Assigned @ [February 27 05:45:50 UTC] Project: 6015 (Run 0, Clone 53, Gen 56) --> CoreStatus = 0
(Note that this WU was correctly reassigned to this same machine below, AND was then successfully completed and results returned)

Assigned @ [February 27 05:48:37 UTC] Project: 6012 (Run 0, Clone 145, Gen 73) --> CoreStatus = 0
(Strange, seeing that the WU was correctly completed and returned above)

Assigned @ [February 27 05:51:32 UTC] Project: 6015 (Run 0, Clone 53, Gen 56) --> CoreStatus = 64

Assigned @ [February 27 18:18:09 UTC] Project: 6023 (Run 1, Clone 36, Gen 23) *** 2nd assignment of subject WU

Do these times correspond to any WU assignment issues that you've noticed on your end?

Other than this little apparent hiccup in the subject WU and these "interesting" WU assignments, the machine has been continually crunching through a mix of A1, A2, and A3 WUs.
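
In case it helps anyone check their own history for repeat assignments, here is a rough Python sketch that groups lines by Project/Run/Clone/Gen and reports any WU seen more than once. It assumes lines shaped like my summary above (`Project: P (Run R, Clone C, Gen G)`); the function name is made up for illustration.

```python
import re
from collections import Counter

WU = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def repeated_assignments(lines):
    """Return {(project, run, clone, gen): count} for WUs seen more than once."""
    seen = Counter()
    for line in lines:
        match = WU.search(line)
        if match:
            seen[tuple(int(field) for field in match.groups())] += 1
    return {wu: count for wu, count in seen.items() if count > 1}


if __name__ == "__main__":
    history = [
        "Assigned @ [February 26 17:35:48 UTC] Project: 6012 (Run 0, Clone 145, Gen 73) --> CoreStatus = 64",
        "Assigned @ [February 27 05:45:50 UTC] Project: 6015 (Run 0, Clone 53, Gen 56) --> CoreStatus = 0",
        "Assigned @ [February 27 05:48:37 UTC] Project: 6012 (Run 0, Clone 145, Gen 73) --> CoreStatus = 0",
        "Assigned @ [February 27 05:51:32 UTC] Project: 6015 (Run 0, Clone 53, Gen 56) --> CoreStatus = 64",
        "Assigned @ [February 27 18:18:09 UTC] Project: 6023 (Run 1, Clone 36, Gen 23)",
    ]
    for wu, count in repeated_assignments(history).items():
        print(wu, "assigned", count, "times")
```

Run against the five lines above, this flags the 6012 and 6015 WUs as assigned twice each. The 6023 WU only shows up once here because its first assignment is outside this list.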
ikerekes
Posts: 94
Joined: Thu Nov 13, 2008 4:18 pm
Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.4.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Location: Calgary, Canada

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by ikerekes »

Got this killer WU for the second time in 2 days (:
Had to delete it and move on.
toTOW
Site Moderator
Posts: 6395
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by toTOW »

I marked this WU as a bad one.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
hrsetrdr
Posts: 112
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by hrsetrdr »

I just got a P6023 (Run 1, Clone 119, Gen 37) which looks to be progressing normally, on a C2D E6300 @ 2.0 GHz, Linux kernel 2.6.31-14. TPF = 16 mins 23 sec.
Folding rig: Supermicro X9DRD-7LN4F-JBOD | (2) Xeon E5-2670 | 128GB DDR3 ECC Registered

Phantom
Posts: 23
Joined: Mon Dec 03, 2007 2:14 am
Location: teammacosx.org

Re: Project 6023 (Run 1, Clone 36, Gen 23)

Post by Phantom »

I also successfully fold P6023 -- just not the evil WU in question. Currently, I've got Project 6023 (Run 0, Clone 81, Gen 45) folding on one of my Mac Minis (C2D T7600 @ 2.33GHz) with TPF=15 mins 45 sec.

Beware the evil WU! ;-)