Project 2665 : (Run 2, Clone 256 Gen 39)

Posted: Sun Oct 05, 2008 3:57 pm
by bossa_nova
I have received this WU 7 times in a row, and it always stops at 78%. How can I get the server to stop sending me the same WU?

Code:

[06:18:05] Completed 195000 out of 250000 steps  (78 percent)
[06:23:18] Warning:  long 1-4 interactions
[06:31:27] - Autosending finished units... [October 5 06:31:27 UTC]
[06:31:27] Trying to send all finished work units
[06:31:27] + No unsent completed units remaining.
[06:31:27] - Autosend completed
[09:18:05] At least 3 hours since checkpoint written...
[09:20:05] 
[09:20:05] Folding@home Core Shutdown: EARLY_UNIT_END
[09:20:05] 
[09:20:05] Folding@home Core Shutdown: EARLY_UNIT_END
[09:20:09] CoreStatus = 7B (123)
[09:20:09] Client-core communications error: ERROR 0x7b
[09:20:09] Deleting current work unit & continuing...
[09:20:09] Using generic mpiexec calls
[09:22:12] - Warning: Could not delete all work unit files (4): Core returned invalid code
[09:22:12] Trying to send all finished work units
[09:22:12] + No unsent completed units remaining.
[09:22:12] - Preparing to get new work unit...
[09:22:12] + Attempting to get work packet
[09:22:12] - Will indicate memory of 3324 MB
[09:22:12] - Connecting to assignment server
[09:22:12] Connecting to http://assign.stanford.edu:8080/
[09:22:13] Posted data.
[09:22:13] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[09:22:13] + News From Folding@Home: Welcome to Folding@Home
[09:22:13] Loaded queue successfully.
[09:22:13] Connecting to http://171.64.65.64:8080/
[09:22:18] Posted data.
[09:22:18] Initial: 0000; - Receiving payload (expected size: 4812189)
[09:22:34] - Downloaded at ~293 kB/s
[09:22:34] - Averaged speed for that direction ~287 kB/s
[09:22:34] + Received work.
[09:22:34] + Closed connections
[09:22:39] 
[09:22:39] + Processing work unit
[09:22:39] Work type a1 not eligible for variable processors
[09:22:39] Core required: FahCore_a1.exe
[09:22:39] Core found.
[09:22:39] Using generic mpiexec calls
[09:22:39] Working on queue slot 05 [October 5 09:22:39 UTC]
[09:22:39] + Working ...
[09:22:39] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 7172 -version 622'

[09:22:39] 
[09:22:39] *------------------------------*
[09:22:39] Folding@Home Gromacs SMP Core
[09:22:39] Version 1.74 (March 10, 2007)
[09:22:39] 
[09:22:39] Preparing to commence simulation
[09:22:39] - Ensuring status. Please wait.
[09:22:46] - Starting from initial work packet
[09:22:46] 
[09:22:46] Project: 2665 (Run 2, Clone 256, Gen 39)
[09:22:46] 
[09:22:46] Assembly optimizations on if available.
[09:22:46] Entering M.D.
[09:23:09]  percent)
[09:23:10] - Starting from initial work packet
[09:23:10] 
[09:23:10] Project: 2665 (Run 2, Clone 256, Gen 39)
[09:23:10] 
[09:23:11] Entering M.D.
[09:23:17] Rejecting checkpoint
[09:23:19] Protein: HGG with glycosylations
[09:23:19] Writing local files
[09:23:29] Extra SSE boost OK.
[09:23:29] Writing local files
[09:23:29] Completed 0 out of 250000 steps  (0 percent)

Re: Project 2665 : (Run 2, Clone 256 Gen 39)

Posted: Sun Oct 05, 2008 6:33 pm
by anko1
If you run qfix as soon as you see an EUE, in theory you won't get the same WU back, and you also get some points for the work you completed.
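
To spot an EUE quickly, a small script can scan the client log for the lines quoted in the first post. This is only a rough sketch, not part of the client or of qfix: the function names `count_eues` and `repeated_eues` are made up for illustration, and the patterns assume the log looks like the excerpt above (a `Project: ... (Run, Clone, Gen)` line, and `EARLY_UNIT_END` printed once or twice per shutdown). Other client or core versions may log differently.

```python
import re

# Patterns based only on the log lines quoted in this thread (assumption:
# other client/core versions may format these lines differently).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
TIMESTAMP_RE = re.compile(r"^\[\d\d:\d\d:\d\d\]\s*")


def count_eues(log_lines):
    """Count EARLY_UNIT_END shutdowns per (project, run, clone, gen).

    Shutdowns seen before any Project line are counted under the key None.
    """
    counts = {}
    current = None          # work unit most recently announced in the log
    prev_was_eue = False    # lets a doubled shutdown message count once
    for line in log_lines:
        message = TIMESTAMP_RE.sub("", line.strip())
        if not message:
            continue  # timestamp-only lines carry no information
        match = PROJECT_RE.search(message)
        if match:
            current = tuple(int(g) for g in match.groups())
        if "EARLY_UNIT_END" in message:
            if not prev_was_eue:
                counts[current] = counts.get(current, 0) + 1
            prev_was_eue = True
        else:
            prev_was_eue = False
    return counts


def repeated_eues(log_lines, threshold=2):
    """Return the work units that hit EUE at least `threshold` times."""
    return {
        wu: n
        for wu, n in count_eues(log_lines).items()
        if wu is not None and n >= threshold
    }
```

You would feed it the lines of the client log (assumed here to be the FAHlog.txt file in the client folder), for example `repeated_eues(open("FAHlog.txt", errors="replace"))`. It only reports what the log already shows; it does not contact the server or change the queue.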

Re: Project 2665 : (Run 2, Clone 256 Gen 39)

Posted: Sun Oct 05, 2008 6:51 pm
by toTOW
There's no data for this WU in the DB.

You can submit your partial results using qfix: viewtopic.php?f=8&t=191

Re: Project 2665 : (Run 2, Clone 256 Gen 39)

Posted: Mon Oct 06, 2008 3:12 pm
by bossa_nova
Qfix worked, and I was finally able to get a different WU after 6 EUEs of the same one. Thank you both for your help.