Project: 2609 (Run 0, Clone 492, Gen 0)
Posted: Wed Jan 02, 2008 12:11 am
				
My machine downloaded this WU, crashed on it, deleted it, downloaded it again, crashed again, deleted it again, and is now running a 3062 WU successfully.
One instance of the cycle is quoted below:
[18:15:40] + Results successfully sent
[18:15:40] Thank you for your contribution to Folding@Home.
[18:15:40] + Number of Units Completed: 67
[18:19:49] - Warning: Could not delete all work unit files (7): Core returned invalid code
[18:19:49] Trying to send all finished work units
[18:19:49] + No unsent completed units remaining.
[18:19:49] - Preparing to get new work unit...
[18:19:49] + Attempting to get work packet
[18:19:49] - Will indicate memory of 4000 MB
[18:19:49] - Connecting to assignment server
[18:19:49] Connecting to http://assign.stanford.edu:8080/
[18:19:50] Posted data.
[18:19:50] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[18:19:50] + News From Folding@Home: Welcome to Folding@Home
[18:19:50] Loaded queue successfully.
[18:19:50] Connecting to http://171.64.65.56:8080/
[18:19:53] Posted data.
[18:19:53] Initial: 0000; - Receiving payload (expected size: 132347)
[18:19:58] - Downloaded at ~25 kB/s
[18:19:58] - Averaged speed for that direction ~110 kB/s
[18:19:58] + Received work.
[18:19:58] Trying to send all finished work units
[18:19:58] + No unsent completed units remaining.
[18:19:58] + Closed connections
[18:19:58]
[18:19:58] + Processing work unit
[18:19:58] Core required: FahCore_a1.exe
[18:19:58] Core found.
[18:19:58] Working on Unit 08 [January 1 18:19:58]
[18:19:58] + Working ...
[18:19:58] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 08 -checkpoint 15 -verbose -lifeline 4834 -version 600'
[18:19:58]
[18:19:58] *------------------------------*
[18:19:58] Folding@Home Gromacs SMP Core
[18:19:58] Version 1.74 (September 24, 2007)
[18:19:58]
[18:19:58] Preparing to commence simulation
[18:19:58] - Ensuring status. Please wait.
[18:19:58] - Starting from initial work packet
[18:19:58]
[18:19:58] Project: 2609 (Run 0, Clone 492, Gen 0)
[18:19:58]
[18:19:58] Assembly optimizations on if available.
[18:19:58] Entering M.D.
[18:20:15] 8 percent)
[18:20:16] cket
[18:20:16]
[18:20:16] Project: 2609 (Run 0, Clone 492, Gen 0)
[18:20:16]
[18:20:16] Entering M.D.
[18:20:16] one 492, Gen 0)
[18:20:16]
[18:20:16] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=klasseng.local
NNODES=4, MYRANK=2, HOSTNAME=klasseng.local
NNODES=4, MYRANK=1, HOSTNAME=klasseng.local
NNODES=4, MYRANK=3, HOSTNAME=klasseng.local
NODEID=1 argc=15
NODEID=0 argc=15
NODEID=2 argc=15
NODEID=3 argc=15
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2004, The GROMACS development team,
check out http://www.gromacs.org for more information.
This inclusion of Gromacs code in the Folding@Home Core is under
a special license (see http://folding.stanford.edu/gromacs.html)
specially granted to Stanford by the copyright holders. If you
are interested in using Gromacs, visit http://www.gromacs.org where
you can download a free version of Gromacs under
the terms of the GNU General Public License (GPL) as published
by the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
[18:20:22] mdrunner cpfilename:
[18:20:22] Rejecting checkpoint
FahCore_a1.exe(24288,0x1801600) malloc: *** vm_allocate(size=3222790144) failed (error code=3)
FahCore_a1.exe(24288,0x1801600) malloc: *** error: can't allocate region
FahCore_a1.exe(24288,0x1801600) malloc: *** set a breakpoint in szone_error to debug
-------------------------------------------------------
Program Core_A1.exe, VERSION 3.3
Source code file: smalloc.c, line: 113
Fatal error:
calloc for ir->opts.nrdf (nelem=-1341786129, elsize=4, file tpxio.c, line 492)
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 0, will try to stop all the nodes
[18:20:22] Gromacs error.
[18:20:22]
[18:20:22] Folding@home Core Shutdown: UNKNOWN_ERROR
[0]0:Return code = 121
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[18:20:27] CoreStatus = 79 (121)
[18:20:27] Client-core communications error: ERROR 0x79
[18:20:27] Deleting current work unit & continuing...
[18:24:53] - Warning: Could not delete all work unit files (8): Core returned invalid code
[18:24:53] Trying to send all finished work units
[18:24:53] + No unsent completed units remaining.
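
A side note on the numbers in the fatal error, offered as a guess rather than a confirmed diagnosis: the negative nelem suggests the core read a garbage element count from the work unit file. Assuming the allocation size is computed in 32-bit unsigned arithmetic and the page size is 4096 bytes (both are my assumptions, neither is stated in the log), nelem * elsize wraps around to about 3.2 GB, which lines up with the vm_allocate size that malloc reports. A quick Python check of that arithmetic:

```python
# Values copied from the log above.
nelem = -1341786129          # from the "calloc for ir->opts.nrdf" line
elsize = 4                   # from the same line
reported = 3222790144        # from the "vm_allocate(size=...)" line

# Assumptions (not stated in the log): 32-bit size arithmetic, 4096-byte pages.
WORD = 2 ** 32
PAGE = 4096

# Byte count after wrapping the product into an unsigned 32-bit value.
wrapped = (nelem * elsize) % WORD

# Round up to the next multiple of the page size.
rounded = -(-wrapped // PAGE) * PAGE

print(wrapped, rounded, rounded == reported)
```

If the assumptions hold, the rounded value equals the size in the malloc message, which would point to a bad value read from the work unit rather than the machine actually running out of memory.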
