
Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Mon Aug 08, 2011 12:32 pm
by artoar_11
Could a moderator please look into this WU?

Code:

[11:34:14] Sending work to server
[11:34:14] Project: 6040 (Run 0, Clone 39, Gen 225)

[11:34:14] + Attempting to send results [August 8 11:34:14 UTC]
[11:34:14] - Reading file work/wuresults_09.dat from core
[11:34:14]   (Read 62698314 bytes from disk)
[11:34:14] Connecting to http://171.64.65.54:8080/
[11:37:53] Posted data.
[11:37:54] Initial: 0000; - Uploaded at ~277 kB/s
[11:37:55] - Averaged speed for that direction ~290 kB/s
[11:37:55] + Results successfully sent
[11:37:55] Thank you for your contribution to Folding@Home.
[11:37:55] + Number of Units Completed: 138

[11:38:00] Trying to send all finished work units
[11:38:00] + No unsent completed units remaining.
[11:38:00] - Preparing to get new work unit...
[11:38:00] Cleaning up work directory
[11:38:00] + Attempting to get work packet
[11:38:00] Passkey found
[11:38:00] - Will indicate memory of 4072 MB
[11:38:00] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[11:38:00] - Connecting to assignment server
[11:38:00] Connecting to http://assign.stanford.edu:8080/
[11:38:01] Posted data.
[11:38:01] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[11:38:01] + News From Folding@Home: Welcome to Folding@Home
[11:38:02] Loaded queue successfully.
[11:38:02] Sent data
[11:38:02] Connecting to http://171.64.65.54:8080/
[11:38:03] Posted data.
[11:38:03] Initial: 0000; - Receiving payload (expected size: 1765273)
[11:38:09] - Downloaded at ~287 kB/s
[11:38:09] - Averaged speed for that direction ~312 kB/s
[11:38:09] + Received work.
[11:38:09] Trying to send all finished work units
[11:38:09] + No unsent completed units remaining.
[11:38:09] + Closed connections
[11:38:09] 
[11:38:09] + Processing work unit
[11:38:09] Core required: FahCore_a3.exe
[11:38:09] Core found.
[11:38:09] Working on queue slot 00 [August 8 11:38:09 UTC]
[11:38:09] + Working ...
[11:38:09] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 00 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 3048 -version 634'

[11:38:09] 
[11:38:09] *------------------------------*
[11:38:09] Folding@Home Gromacs SMP Core
[11:38:09] Version 2.27 (Dec. 15, 2010)
[11:38:09] 
[11:38:09] Preparing to commence simulation
[11:38:09] - Looking at optimizations...
[11:38:09] - Created dyn
[11:38:09] - Files status OK
[11:38:09] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[11:38:09] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:38:09] - Digital signature verified
[11:38:09] 
[11:38:09] Project: 6053 (Run 1, Clone 194, Gen 357)
[11:38:09] 
[11:38:09] Assembly optimizations on if available.
[11:38:09] Entering M.D.
[11:38:15] Mapping NT from 4 to 4 
[11:38:15] Completed 0 out of 500000 steps  (0%)
[11:38:23] CoreStatus = C0000029 (-1073741783)
[11:38:23] Client-core communications error: ERROR 0xc0000029
[11:38:23] Deleting current work unit & continuing...

..........................................
[11:38:36] - Connecting to assignment server
[11:38:36] Connecting to http://assign.stanford.edu:8080/
[11:38:37] Posted data.
[11:38:37] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[11:38:37] + News From Folding@Home: Welcome to Folding@Home
[11:38:37] Loaded queue successfully.
[11:38:37] Sent data
[11:38:37] Connecting to http://171.64.65.54:8080/
[11:38:38] Posted data.
[11:38:38] Initial: 0000; - Receiving payload (expected size: 1765273)
[11:38:45] - Downloaded at ~246 kB/s
[11:38:45] - Averaged speed for that direction ~299 kB/s
[11:38:45] + Received work.
[11:38:45] + Closed connections
[11:38:50] 
[11:38:50] + Processing work unit
[11:38:50] Core required: FahCore_a3.exe
[11:38:50] Core found.
[11:38:50] Working on queue slot 01 [August 8 11:38:50 UTC]
[11:38:50] + Working ...
[11:38:50] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 01 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 3048 -version 634'

[11:38:50] 
[11:38:50] *------------------------------*
[11:38:50] Folding@Home Gromacs SMP Core
[11:38:50] Version 2.27 (Dec. 15, 2010)
[11:38:50] 
[11:38:50] Preparing to commence simulation
[11:38:50] - Looking at optimizations...
[11:38:50] - Created dyn
[11:38:50] - Files status OK
[11:38:50] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[11:38:50] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:38:50] - Digital signature verified
[11:38:50] 
[11:38:50] Project: 6053 (Run 1, Clone 194, Gen 357)
[11:38:50] 
[11:38:50] Assembly optimizations on if available.
[11:38:50] Entering M.D.
[11:38:56] Mapping NT from 4 to 4 
[11:38:56] Completed 0 out of 500000 steps  (0%)
[11:39:08] CoreStatus = C0000029 (-1073741783)
[11:39:08] Client-core communications error: ERROR 0xc0000029
[11:39:08] Deleting current work unit & continuing...
[11:39:20] Trying to send all finished work units

................................
[11:39:36] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 3048 -version 634'

[11:39:36] 
[11:39:36] *------------------------------*
[11:39:36] Folding@Home Gromacs SMP Core
[11:39:36] Version 2.27 (Dec. 15, 2010)
[11:39:36] 
[11:39:36] Preparing to commence simulation
[11:39:36] - Looking at optimizations...
[11:39:36] - Created dyn
[11:39:36] - Files status OK
[11:39:36] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[11:39:36] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:39:36] - Digital signature verified
[11:39:36] 
[11:39:36] Project: 6053 (Run 1, Clone 194, Gen 357)
[11:39:36] 
[11:39:36] Assembly optimizations on if available.
[11:39:36] Entering M.D.
[11:39:42] Mapping NT from 4 to 4 
[11:39:42] Completed 0 out of 500000 steps  (0%)
[11:39:48] CoreStatus = C0000029 (-1073741783)
[11:39:48] Client-core communications error: ERROR 0xc0000029
[11:39:48] - Attempting to download new core...
[11:39:48] + Downloading new core: FahCore_a3.exe
[11:39:48] Downloading core (/~pande/Win32/x86/Core_a3.fah from www.stanford.edu)
[11:39:48] Initial: AFDE; + 10240 bytes downloaded
[11:39:49] Initial: 0E39; + 20480 bytes downloaded
[11:39:50] Initial: FD05; + 30720 bytes downloaded

....................................
[11:49:57] Initial: 0000; - Receiving payload (expected size: 1765273)
[11:50:04] - Downloaded at ~246 kB/s
[11:50:04] - Averaged speed for that direction ~251 kB/s
[11:50:04] + Received work.
[11:50:04] + Closed connections
[11:50:09] 
[11:50:09] + Processing work unit
[11:50:09] Core required: FahCore_a3.exe
[11:50:09] Core found.
[11:50:09] Working on queue slot 02 [August 8 11:50:09 UTC]
[11:50:09] + Working ...
[11:50:09] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 3048 -version 634'

[11:50:09] 
[11:50:09] *------------------------------*
[11:50:09] Folding@Home Gromacs SMP Core
[11:50:09] Version 2.27 (Dec. 15, 2010)
[11:50:09] 
[11:50:09] Preparing to commence simulation
[11:50:09] - Looking at optimizations...
[11:50:09] - Created dyn
[11:50:09] - Files status OK
[11:50:10] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[11:50:10] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:50:10] - Digital signature verified
[11:50:10] 
[11:50:10] Project: 6053 (Run 1, Clone 194, Gen 357)
[11:50:10] 
[11:50:10] Assembly optimizations on if available.
[11:50:10] Entering M.D.
[11:50:16] Mapping NT from 4 to 4 
[11:50:16] Completed 0 out of 500000 steps  (0%)
[11:50:25] CoreStatus = C0000029 (-1073741783)
[11:50:25] Client-core communications error: ERROR 0xc0000029
[11:50:25] Deleting current work unit & continuing...
[11:50:37] Trying to send all finished work units
[11:50:37] + No unsent completed units remaining.
[11:50:37] - Preparing to get new work unit...
[11:50:37] Cleaning up work directory


[11:51:19] + Processing work unit
[11:51:19] Core required: FahCore_a3.exe
[11:51:19] Core found.
[11:51:19] Working on queue slot 03 [August 8 11:51:19 UTC]
[11:51:19] + Working ...
[11:51:19] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 03 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 3048 -version 634'

[11:51:19] 
[11:51:19] *------------------------------*
[11:51:19] Folding@Home Gromacs SMP Core
[11:51:19] Version 2.27 (Dec. 15, 2010)
[11:51:19] 
[11:51:19] Preparing to commence simulation
[11:51:19] - Looking at optimizations...
[11:51:19] - Created dyn
[11:51:19] - Files status OK
[11:51:19] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[11:51:19] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:51:19] - Digital signature verified
[11:51:19] 
[11:51:19] Project: 6053 (Run 1, Clone 194, Gen 357)
[11:51:19] 
[11:51:19] Assembly optimizations on if available.
[11:51:19] Entering M.D.
[11:51:25] Mapping NT from 4 to 4 
[11:51:25] Completed 0 out of 500000 steps  (0%)
[11:51:31] Killing all core threads
[11:51:31] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[11:51:31] ***** Got a SIGTERM signal (2)
[11:51:31] Killing all core threads
[11:51:31] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.
After many unsuccessful attempts to start this WU, I changed the Machine ID.

Code:

--- Opening Log file [August 8 11:53:54 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: E:\Tempo\Arto's\_SMP_FAH-v. 6.34
Executable: E:\Tempo\Arto's\_SMP_FAH-v. 6.34\FAH6.34-win32-SMP.exe
Arguments: -smp -verbosity 9 -advmethods 

[11:53:54] - Ask before connecting: No
[11:53:54] - User name: artoar_home (Team 32435)
[11:53:54] - User ID: 3FADD50A009EB311
[11:53:54] - Machine ID: 3
[11:53:54] 
[11:53:54] Work directory not found. Creating...
[11:53:54] Could not open work queue, generating new queue...
[11:53:54] - Preparing to get new work unit...
[11:53:54] - Autosending finished units... [August 8 11:53:54 UTC]
[11:53:54] Cleaning up work directory
[11:53:54] Trying to send all finished work units
[11:53:54] + Attempting to get work packet
[11:53:54] + No unsent completed units remaining.
[11:53:54] Passkey found
[11:53:54] - Autosend completed
[11:53:54] - Will indicate memory of 4072 MB
[11:53:54] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[11:53:54] - Connecting to assignment server
[11:53:54] Connecting to http://assign.stanford.edu:8080/
[11:53:55] Posted data.
[11:53:55] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[11:53:55] + News From Folding@Home: Welcome to Folding@Home
[11:53:55] Loaded queue successfully.
[11:53:55] Sent data
[11:53:55] Connecting to http://171.64.65.54:8080/
[11:53:56] Posted data.
[11:53:56] Initial: 0000; - Receiving payload (expected size: 1765200)
[11:54:03] - Downloaded at ~246 kB/s
[11:54:03] - Averaged speed for that direction ~246 kB/s
[11:54:03] + Received work.
[11:54:03] + Closed connections
[11:54:03] 
[11:54:03] + Processing work unit
[11:54:03] Core required: FahCore_a3.exe
[11:54:03] Core found.
[11:54:03] Working on queue slot 01 [August 8 11:54:03 UTC]
[11:54:03] + Working ...
[11:54:03] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 01 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 5040 -version 634'

[11:54:03] 
[11:54:03] *------------------------------*
[11:54:03] Folding@Home Gromacs SMP Core
[11:54:03] Version 2.27 (Dec. 15, 2010)
[11:54:03] 
[11:54:03] Preparing to commence simulation
[11:54:03] - Looking at optimizations...
[11:54:03] - Created dyn
[11:54:03] - Files status OK
[11:54:04] - Expanded 1764688 -> 2253729 (decompressed 127.7 percent)
[11:54:04] Called DecompressByteArray: compressed_data_size=1764688 data_size=2253729, decompressed_data_size=2253729 diff=0
[11:54:04] - Digital signature verified
[11:54:04] 
[11:54:04] Project: 6053 (Run 1, Clone 61, Gen 360)
[11:54:04] 
[11:54:04] Assembly optimizations on if available.
[11:54:04] Entering M.D.
[11:54:10] Mapping NT from 4 to 4 
[11:54:10] Completed 0 out of 500000 steps  (0%)
[11:57:36] Completed 5000 out of 500000 steps  (1%)
[12:01:03] Completed 10000 out of 500000 steps  (2%)
[12:04:30] Completed 15000 out of 500000 steps  (3%)
[12:07:56] Completed 20000 out of 500000 steps  (4%)
[12:11:19] Completed 25000 out of 500000 steps  (5%)

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Mon Aug 08, 2011 7:09 pm
by 7im
I'll ask a mod to check this for you.

Mod Edit: Multiple failures, marked as a bad work unit.

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 7:33 am
by Teddy
Instant failure of the work unit here as well. If it's marked as bad, how come it is still being handed out?

Code:

[23:10:24] Folding@Home Gromacs SMP Core
[23:10:24] Version 2.27 (Dec. 15, 2010)
[23:10:24] 
[23:10:24] Preparing to commence simulation
[23:10:24] - Assembly optimizations manually forced on.
[23:10:24] - Not checking prior termination.
[23:10:24] - Expanded 1764761 -> 2253729 (decompressed 127.7 percent)
[23:10:24] Called DecompressByteArray: compressed_data_size=1764761 data_size=2253729, decompressed_data_size=2253729 diff=0
[23:10:24] - Digital signature verified
[23:10:24] 
[23:10:24] Project: 6053 (Run 1, Clone 194, Gen 357)
[23:10:24] 
[23:10:24] Assembly optimizations on if available.
[23:10:24] Entering M.D.
[23:10:30] Mapping NT from 4 to 4 
[23:10:31] Completed 0 out of 500000 steps  (0%)
[00:14:52] - Autosending finished units... [August 9 00:14:52 UTC]
[00:14:52] Trying to send all finished work units
[00:14:52] + No unsent completed units remaining.
[00:14:52] - Autosend completed
[06:14:52] - Autosending finished units... [August 9 06:14:52 UTC]
[06:14:52] Trying to send all finished work units
[06:14:52] + No unsent completed units remaining.
[06:14:52] - Autosend completed
[07:27:49] Killing all core threads
[07:27:49] Killing 2 cores
[07:27:49] Killing core 0
[07:27:49] Killing core 1

Folding@Home Client Shutdown at user request.
[07:27:49] ***** Got a SIGTERM signal (2)
[07:27:49] Killing all core threads
[07:27:49] Killing 2 cores
[07:27:49] Killing core 0
[07:27:49] Killing core 1

Folding@Home Client Shutdown.
Many hours wasted...


Mod Edit: Added Code Tags - PantherX

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 8:29 am
by artoar_11
I received this WU at [August 8 11:38:09 UTC]. The [Preferred (days)] value listed for this project is 3 days. Why did the server give the same WU to another donor only a few hours later (Teddy, August 8 23:10:24 UTC)?
It has always been said that a WU is given to only one donor. I have a little doubt that this is always so :?

Thanks 7im and Mod :)

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 3:36 pm
by bruce
Stanford runs a script at 8am (or 15:00 UTC) which collects any Mod reports such as the one above and actually processes the STOP function. The report was made soon after midnight PDT, so the WU wasn't stopped until about ½ hr ago, some 7½ hours later.
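To make the timing concrete, here is a small sketch of that delay. Only the 15:00 UTC run time comes from the post above; the 00:30 PDT report time is an assumption chosen to match "soon after midnight", and the function name is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# Fixed UTC-7 offset; this is PDT, which is what applies in August.
PDT = timezone(timedelta(hours=-7), "PDT")
# Daily run time of the script, as quoted in the post (15:00 UTC).
SCRIPT_RUN_UTC = time(15, 0)


def delay_until_next_run(report: datetime) -> timedelta:
    """Time from a Mod report until the next daily 15:00 UTC script run."""
    report_utc = report.astimezone(timezone.utc)
    run = datetime.combine(report_utc.date(), SCRIPT_RUN_UTC, tzinfo=timezone.utc)
    if run <= report_utc:
        # Today's run has already happened, so the report waits for tomorrow's.
        run += timedelta(days=1)
    return run - report_utc


# Assumed report time (hypothetical): 00:30 PDT on 2011-08-09.
report = datetime(2011, 8, 9, 0, 30, tzinfo=PDT)
delay = delay_until_next_run(report)
print(delay)  # 7:30:00
```

A report filed just after a run would wait almost a full day, which is why a bad WU can keep going out for hours after a Mod marks it.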

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 7:33 pm
by artoar_11
bruce, thanks for the reply.
My English is a little weak, for which I apologize.
My question was different. I received this WU on August 8. The next donor should receive the same WU only after 3 days (if I do not return it). Why did the server not wait those 3 days (the Preferred time in days)?

Thanks in advance.

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 8:58 pm
by PantherX
The server didn't wait because it already had multiple failure reports, so it distributed the WU more than once. Hence two donors may have gotten the same WU before the Preferred Deadline.

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Tue Aug 09, 2011 11:57 pm
by bruce
PantherX wrote: The server didn't wait because it already had multiple failure reports, so it distributed the WU more than once. Hence two donors may have gotten the same WU before the Preferred Deadline.
The WU was originally distributed on 2011-08-07 just before 6pm, Stanford time. You were the first one to report the problem.

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Wed Aug 10, 2011 12:08 am
by bruce
artoar_11 wrote: My question was different. I received this WU on August 8. The next donor should receive the same WU only after 3 days (if I do not return it). Why did the server not wait those 3 days (the Preferred time in days)?
The server waits until either the WU is returned or 3 days, whichever occurs first. After this type of error, the client does return the WU for zero credit, so it's immediately available for reassignment.

You were not the first to receive this WU or to return an error report to the server. The WU was originally distributed before 6pm, Stanford time, on 2011-08-07. You were the first one to report the problem, and by then it had been sent out many times.

The same thing happens with WUs which are completed successfully. If the project has a 3-day Preferred Deadline and WUs are actually completed in an average of 24 hours, then three Generations of the WU can be completed in the same time that the server would wait for a lost WU. That's why we constantly stress that it's important to return the results as soon as you can.
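The rule described above can be sketched roughly as follows. This is only an illustration of "returned or 3 days, whichever occurs first", not the actual server code; the function names and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Preferred Deadline quoted in this thread for the project.
PREFERRED_DEADLINE = timedelta(days=3)


def available_for_reassignment(assigned: datetime,
                               returned: Optional[datetime]) -> datetime:
    """Earliest time the WU can go out again: return time or deadline, whichever is first."""
    deadline = assigned + PREFERRED_DEADLINE
    if returned is None:
        # Lost WU: nothing came back, so the server waits out the deadline.
        return deadline
    return min(returned, deadline)


def generations_in_window(avg_completion: timedelta,
                          window: timedelta = PREFERRED_DEADLINE) -> int:
    """Whole Generations that fit in the time the server would wait for a lost WU."""
    return window // avg_completion


# Hypothetical timestamps: a WU returned with an error ten minutes after assignment.
assigned = datetime(2011, 8, 8, 12, 0)
returned = assigned + timedelta(minutes=10)
print(available_for_reassignment(assigned, returned))  # 2011-08-08 12:10:00
print(available_for_reassignment(assigned, None))      # 2011-08-11 12:00:00
print(generations_in_window(timedelta(hours=24)))      # 3
```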

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Wed Aug 10, 2011 7:23 am
by artoar_11
bruce wrote: The server waits until either the WU is returned or 3 days, whichever occurs first. After this type of error, the client does return the WU for zero credit, so it's immediately available for reassignment.
Yes, I understand.

[11:38:56] Completed 0 out of 500000 steps (0%)
[11:39:08] CoreStatus = C0000029 (-1073741783)
[11:39:08] Client-core communications error: ERROR 0xc0000029
[11:39:08] Deleting current work unit & continuing...
[11:39:20] Trying to send all finished work units

The server already has the information that the WU was returned. I had forgotten about this report :)
I wanted a detailed answer because I sometimes answer questions from colleagues on our team.

It was useful to me. Thank you.
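For anyone comparing the two numbers on the CoreStatus line, they are the same 32-bit value printed once in hex and once as a signed integer. A minimal sketch of the conversion (the helper name is hypothetical):

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the signed reading is the value minus 2**32.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


status = int("C0000029", 16)              # hex form from the CoreStatus line
print(to_signed32(status))                # -1073741783
print(f"0x{status & 0xFFFFFFFF:08x}")     # 0xc0000029
```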

Re: Project: 6053 (Run 1, Clone 194, Gen 357)

Posted: Wed Aug 10, 2011 11:27 am
by bruce
bruce wrote: You were not the first to receive this WU or to return an error report to the server. The WU was originally distributed before 6pm, Stanford time, on 2011-08-07. You were the first one to report the problem, and by then it had been sent out many times.
Well, whatever is wrong with this WU, it has exhibited more than one type of failure. Your failure report was here on the forum but it was not in the database. Others must have had some other type of failure which did make a report to the stats. When the client says [11:39:08] Deleting current work unit & continuing... there is nothing to upload. The message [11:39:20] Trying to send all finished work units is unrelated and the fact that the two occurred only 12 seconds apart is coincidental. I can find no stats record for this WU from any of the machines that appear to be yours.
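If you want to check your own log for repeated failures of one WU before posting, something like the sketch below may help. It assumes the v6 client line formats shown in this thread, the function name is hypothetical, and it only inspects the local log: it says nothing about whether the server ever received a report.

```python
import re
from collections import Counter

# Patterns taken from the log lines quoted in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+)")


def count_core_failures(log_lines, status="C0000029"):
    """Count lines with the given CoreStatus, keyed by the most recent (P, R, C, G)."""
    failures = Counter()
    current = None
    for line in log_lines:
        project = PROJECT_RE.search(line)
        if project:
            current = tuple(int(g) for g in project.groups())
            continue
        code = STATUS_RE.search(line)
        if code and current is not None and code.group(1).upper() == status.upper():
            failures[current] += 1
    return failures


sample = [
    "[11:38:09] Project: 6053 (Run 1, Clone 194, Gen 357)",
    "[11:38:15] Completed 0 out of 500000 steps  (0%)",
    "[11:38:23] CoreStatus = C0000029 (-1073741783)",
    "[11:38:50] Project: 6053 (Run 1, Clone 194, Gen 357)",
    "[11:39:08] CoreStatus = C0000029 (-1073741783)",
    "[11:54:04] Project: 6053 (Run 1, Clone 61, Gen 360)",
    "[11:57:36] Completed 5000 out of 500000 steps  (1%)",
]
failures = count_core_failures(sample)
print(failures[(6053, 1, 194, 357)])  # 2
```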