
Project 2677 R25 C35 G32

Posted: Fri Aug 14, 2009 6:06 pm
by BrokenWolf
Another 1.5 MB download that is running on one core only. The log shows the previous WU completing with 59 percent of the time to deadline remaining.

This WU was downloaded over 12 hours ago and still needs 6 days 14 hours+ to complete (per FahMon 2.3.99.1). It will not make the deadline.



[05:16:30] Folding@home Core Shutdown: FINISHED_UNIT
[05:18:10] CoreStatus = 64 (100)
[05:18:10] Unit 9 finished with 59 percent of time to deadline remaining.
[05:18:10] Updated performance fraction: 0.589174
[05:18:10] Sending work to server
[05:18:10] Project: 2669 (Run 6, Clone 75, Gen 199)


[05:18:10] + Attempting to send results [August 14 05:18:10 UTC]
[05:18:10] - Reading file work/wuresults_09.dat from core
[05:18:10]   (Read 25965870 bytes from disk)
[05:18:10] Connecting to http://171.64.65.56:8080/
[05:18:23] Posted data.
[05:18:23] Initial: 0000; - Uploaded at ~1267 kB/s
[05:18:30] - Averaged speed for that direction ~1084 kB/s
[05:18:30] + Results successfully sent
[05:18:30] Thank you for your contribution to Folding@Home.
[05:18:30] + Number of Units Completed: 103

[05:19:11] - Warning: Could not delete all work unit files (9): Core file absent
[05:19:11] Trying to send all finished work units
[05:19:11] + No unsent completed units remaining.
[05:19:11] - Preparing to get new work unit...
[05:19:11] + Attempting to get work packet
[05:19:11] - Will indicate memory of 2007 MB
[05:19:11] - Connecting to assignment server
[05:19:11] Connecting to http://assign.stanford.edu:8080/
[05:19:11] Posted data.
[05:19:11] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[05:19:11] + News From Folding@Home: Welcome to Folding@Home
[05:19:11] Loaded queue successfully.
[05:19:11] Connecting to http://171.64.65.56:8080/
[05:19:18] Posted data.
[05:19:18] Initial: 0000; - Receiving payload (expected size: 1493917)
[05:19:20] - Downloaded at ~729 kB/s
[05:19:20] - Averaged speed for that direction ~1093 kB/s
[05:19:20] + Received work.
[05:19:20] Trying to send all finished work units
[05:19:20] + No unsent completed units remaining.
[05:19:20] + Closed connections
[05:19:20] 
[05:19:20] + Processing work unit
[05:19:20] At least 4 processors must be requested.Core required: FahCore_a2.exe
[05:19:20] Core found.
[05:19:20] Working on queue slot 00 [August 14 05:19:20 UTC]
[05:19:20] + Working ...
[05:19:20] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 00 -priority 96 -checkpoint 15 -verbose -lifeline 21604 -version 624'

[05:19:20] 
[05:19:20] *------------------------------*
[05:19:20] Folding@Home Gromacs SMP Core
[05:19:20] Version 2.08 (Mon May 18 14:47:42 PDT 2009)
[05:19:20] 
[05:19:20] Preparing to commence simulation
[05:19:20] - Ensuring status. Please wait.
[05:19:20] Called DecompressByteArray: compressed_data_size=1493405 data_size=24054765, decompressed_data_size=24054765 diff=0
[05:19:21] - Digital signature verified
[05:19:21] 
[05:19:21] Project: 2677 (Run 25, Clone 35, Gen 32)
[05:19:21] 
[05:19:21] Assembly optimizations on if available.
[05:19:21] Entering M.D.
[05:19:30] Run 25, Clone 35, Gen 32)
[05:19:30] 
[05:19:30] Entering M.D.
[05:20:02] Completed 0 out of 250000 steps  (0%)
[05:57:54] - Autosending finished units... [August 14 05:57:54 UTC]
[05:57:54] Trying to send all finished work units
[05:57:54] + No unsent completed units remaining.
[05:57:54] - Autosend completed
[07:03:26] Completed 2500 out of 250000 steps  (1%)
[08:47:03] Completed 5000 out of 250000 steps  (2%)
[10:30:40] Completed 7500 out of 250000 steps  (3%)
[11:57:53] - Autosending finished units... [August 14 11:57:53 UTC]
[11:57:53] Trying to send all finished work units
[11:57:53] + No unsent completed units remaining.
[11:57:53] - Autosend completed
[12:14:18] Completed 10000 out of 250000 steps  (4%)
[13:57:56] Completed 12500 out of 250000 steps  (5%)
[15:41:35] Completed 15000 out of 250000 steps  (6%)
[17:25:10] Completed 17500 out of 250000 steps  (7%)
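
For anyone who wants to sanity-check the FahMon estimate straight from FAHlog.txt, here is a rough sketch in Python. It is not part of the client or of FahMon; the function name `estimate_remaining` and the regular expression are my own assumptions, and it simply extrapolates the average pace between the first and last "Completed ... steps" lines, assuming the pace stays constant and that consecutive progress lines are less than 24 hours apart.

```python
import re
from datetime import timedelta

# Matches progress lines such as:
# [07:03:26] Completed 2500 out of 250000 steps  (1%)
PROGRESS = re.compile(
    r"\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps"
)


def estimate_remaining(log_lines):
    """Extrapolate time left from the first and last progress lines.

    Hypothetical helper: returns a timedelta, or None when the log
    does not contain at least two progress lines with different
    step counts.
    """
    samples = []      # (seconds since first log day, steps done)
    day_offset = 0    # grows by 86400 each time the clock wraps
    previous = None
    total_steps = None

    for line in log_lines:
        match = PROGRESS.search(line)
        if not match:
            continue
        hours, minutes, seconds, done, total = map(int, match.groups())
        clock = hours * 3600 + minutes * 60 + seconds
        # The log only prints HH:MM:SS, so a smaller value than the
        # previous one is treated as a rollover past midnight.
        if previous is not None and clock < previous:
            day_offset += 86400
        previous = clock
        samples.append((clock + day_offset, done))
        total_steps = total

    if len(samples) < 2:
        return None
    (t_first, done_first), (t_last, done_last) = samples[0], samples[-1]
    if done_last <= done_first:
        return None

    seconds_per_step = (t_last - t_first) / (done_last - done_first)
    remaining = seconds_per_step * (total_steps - done_last)
    return timedelta(seconds=round(remaining))


if __name__ == "__main__":
    with open("FAHlog.txt") as log_file:
        print(estimate_remaining(log_file))
```

Fed the progress lines above, this comes out at a bit over six and a half days, which is in the same range as the FahMon figure.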

Re: Project 2677 R25 C35 G32

Posted: Fri Aug 14, 2009 7:11 pm
by susato
Thanks for the heads-up Tim -

I expect that you're running these with the -smp flag, pure and simple.
Of course it is supposed to default to 4 processors but what if it missed that call?
Next time you get one, perhaps you could try stopping it and starting it with the -smp 4 flag to see if that does anything different?

Re: Project 2677 R25 C35 G32

Posted: Fri Aug 14, 2009 7:36 pm
by 314159
Hmm,

[05:19:20] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 00 -priority 96 -checkpoint 15 -verbose -lifeline 21604 -version 624'

I cannot duplicate this behavior on the single C2D that is running Version 2.08.
Staying with 2.07 on the remaining C2Ds until we determine what is going on. :ewink:

More info would be available in the console vs. FAHlog.txt. Have you looked for anomalies there, BW?

Re: Project 2677 R25 C35 G32

Posted: Fri Aug 14, 2009 9:10 pm
by BrokenWolf
I just wanted to make sure that this WU was listed as one of the bad ones. As you may have noticed, I have been harping quite a bit recently about these one-core SMP WUs. I wanted to make sure it got listed as bad so that no one else would get this WU assigned to them.

BW

Re: Project 2677 R25 C35 G32

Posted: Fri Aug 14, 2009 9:13 pm
by MtM
Yeah, I know, BW. It's a very good thing as well, and I would actually expect more reports from more people about this (I can't imagine people would miss their installs running on one core only!), but it's not been the storm I was a bit afraid of.