Project 6012 (Run 2, Clone 262, Gen 13)
Posted: Sun Mar 28, 2010 9:15 pm
Another a3core WU failed early in the folding. This one did not make it to 4%. I will have to remember this as Black Sunday.
Log file is as follows:
I attempted to return the "remains" to Stanford, but that failed. Apparently, the Client is in such a rush to "fail" that it does not write the work files correctly: the log shows it could not get the length of work/wuresults_07.dat and removed the unit from the queue. The outcome is that no info was transferred to PG by the formal route, and there is no possibility of any points being credited, however meager. Log file covering the return attempt is as follows:
[19:14:57] - Will indicate memory of 1024 MB
[19:14:57] - Connecting to assignment server
[19:14:57] Connecting to http://assign.stanford.edu:8080/
[19:15:00] Posted data.
[19:15:00] Initial: ED82; - Successful: assigned to (130.237.232.140).
[19:15:00] + News From Folding@Home: Welcome to Folding@Home
[19:15:00] Loaded queue successfully.
[19:15:00] Sent data
[19:15:00] Connecting to http://130.237.232.140:8080/
[19:15:02] Posted data.
[19:15:02] Initial: 0000; - Receiving payload (expected size: 1797039)
[19:25:36] - Downloaded at ~2 kB/s
[19:25:36] - Averaged speed for that direction ~3 kB/s
[19:25:36] + Received work.
[19:25:36] Trying to send all finished work units
[19:25:36] + No unsent completed units remaining.
[19:25:36] + Connections closed: You may now disconnect
[19:25:41]
[19:25:41] + Processing work unit
[19:25:41] Core required: FahCore_a3.exe
[19:25:41] Core found.
[19:25:41] Working on queue slot 07 [March 28 19:25:41 UTC]
[19:25:41] + Working ...
[19:25:41] - Calling './FahCore_a3.exe -dir work/ -nice 19 -suffix 07 -np 2 -checkpoint 15 -verbose -lifeline 6642 -version 629'
[19:25:41]
[19:25:41] *------------------------------*
[19:25:41] Folding@Home Gromacs SMP Core
[19:25:41] Version 2.17 (Mar 7 2010)
[19:25:41]
[19:25:41] Preparing to commence simulation
[19:25:41] - Looking at optimizations...
[19:25:41] - Created dyn
[19:25:41] - Files status OK
[19:25:41] - Expanded 1796527 -> 2078149 (decompressed 115.6 percent)
[19:25:41] Called DecompressByteArray: compressed_data_size=1796527 data_size=2078149, decompressed_data_size=2078149 diff=0
[19:25:41] - Digital signature verified
[19:25:41]
[19:25:41] Project: 6012 (Run 2, Clone 262, Gen 13)
[19:25:41]
[19:25:41] Assembly optimizations on if available.
[19:25:41] Entering M.D.
Starting 2 threads
NNODES=2, MYRANK=1, HOSTNAME=thread #1
NNODES=2, MYRANK=0, HOSTNAME=thread #0
Reading file work/wudata_07.tpr, VERSION 4.0.99_development_20090605 (single precision)
Note: tpx file_version 68, software version 70
Making 1D domain decomposition 2 x 1 x 1
starting mdrun 'Protein in POPC'
7000000 steps, 14000.0 ps (continuing from step 6500000, 13000.0 ps).
[19:25:48] Completed 0 out of 500000 steps (0%)
[19:49:33] Completed 5000 out of 500000 steps (1%)
[20:12:55] Completed 10000 out of 500000 steps (2%)
[20:36:11] Completed 15000 out of 500000 steps (3%)
-------------------------------------------------------
Program mdrun, VERSION 4.0.99-dev-20100305
Source code file: /Users/kasson/a3_devnew/gromacs/src/mdlib/pme.c, line: 563
Fatal error:
24 particles communicated to PME node 1 are more than a cell length out of the domain decomposition cell of their charge group in dimension x
For more information and tips for trouble shooting please check the GROMACS website at
http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
[20:39:04] mdrun returned 255
[20:39:04] Going to send back what have done -- stepsTotalG=500000
[20:39:04] Work fraction=0.0312 steps=500000.
[20:39:05] CoreStatus = 0 (0)
[20:39:05] Sending work to server
[20:39:05] Project: 6012 (Run 2, Clone 262, Gen 13)
[20:39:05] - Error: Could not get length of results file work/wuresults_07.dat
[20:39:05] - Error: Could not read unit 07 file. Removing from queue.
[20:39:05] Trying to send all finished work units
[20:39:05] + No unsent completed units remaining.
[20:39:05] - Preparing to get new work unit...
[20:39:06] > Press "c" to connect to the server to download unit
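For anyone who wants to cross-check the numbers, here is a rough sketch (not part of the client) that pulls the last progress line and the reported work fraction out of the log lines quoted above. The regular expressions and the function name `summarize` are my own assumptions, based only on the lines in this post, so treat them as illustrative. It just confirms that a work fraction of 0.0312 on 500000 steps lands between the 3% and 4% marks.

```python
import re

# Lines copied from the log in this post. The patterns below are an
# assumption about the log layout, inferred only from this excerpt.
LOG = """\
[20:36:11] Completed 15000 out of 500000 steps (3%)
[20:39:04] mdrun returned 255
[20:39:04] Work fraction=0.0312 steps=500000.
[20:39:05] - Error: Could not get length of results file work/wuresults_07.dat
"""

def summarize(log_text):
    """Return (last reported step, total steps, steps implied by work fraction)."""
    completed = re.findall(r"Completed (\d+) out of (\d+) steps", log_text)
    fraction = re.search(r"Work fraction=([0-9.]+) steps=(\d+)", log_text)
    last_step, total = (int(x) for x in completed[-1])
    # Steps actually done when the core gave up, per the reported fraction.
    implied = round(float(fraction.group(1)) * int(fraction.group(2)))
    return last_step, total, implied

last_step, total, implied = summarize(LOG)
print(last_step, total, implied)
print(f"{100 * implied / total:.2f}% done at failure")
```

So the core died roughly 600 steps after the 3% progress line, which matches the gap between the 20:36:11 and 20:39:04 timestamps.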
Until the next time...... :-)