Project: 2669 (Run 14, Clone 166, Gen 38)

Posted: Wed Jul 08, 2009 4:47 am
by klasseng
This one wouldn't even start:

[03:42:01] + Processing work unit
[03:42:01] At least 4 processors must be requested; read 1.
[03:42:01] Core required: FahCore_a2.exe
[03:42:01] Core found.
[03:42:01] - Using generic ./mpiexec
[03:42:01] Working on queue slot 03 [July 2 03:42:01 UTC]
[03:42:01] + Working ...
[03:42:01] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 03 -checkpoint 20 -verbose -lifeline 881 -version 624'

[03:42:01] 
[03:42:01] *------------------------------*
[03:42:01] Folding@Home Gromacs SMP Core
[03:42:01] Version 2.07 (Sun Apr 19 14:29:51 PDT 2009)
[03:42:01] 
[03:42:01] Preparing to commence simulation
[03:42:01] - Ensuring status. Please wait.
[03:42:01] Files status OK
[03:42:03] - Expanded 4831120 -> 23973957 (decompressed 496.2 percent)
[03:42:03] Called DecompressByteArray: compressed_data_size=4831120 data_size=23973957, decompressed_data_size=23973957 diff=0
[03:42:04] - Digital signature verified
[03:42:04] 
[03:42:04] Project: 2669 (Run 14, Clone 166, Gen 38)
[03:42:04] 
[03:42:04] Assembly optimizations on if available.
[03:42:04] Entering M.D.
[03:42:13] un 14, Clone 166, Gen 38)
[03:42:13] 
[03:42:14] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=FrontRow.local
NNODES=4, MYRANK=1, HOSTNAME=FrontRow.local
NNODES=4, MYRANK=2, HOSTNAME=FrontRow.local
NNODES=4, MYRANK=3, HOSTNAME=FrontRow.local
NODEID=0 argc=20
NODEID=2 argc=20
NODEID=3 argc=20
NODEID=1 argc=20
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                 :-)  VERSION 4.0.99_development_20090307  (-:


      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.


                                :-)  mdrun  (-:

Reading file work/wudata_03.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 64

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22848 system'
9750001 steps,  19500.0 ps (continuing from step 9250002,  18500.0 ps).
[03:42:25] Completed 0 out of 499999 steps  (0%)

t = 18500.005 ps: Water molecule starting at atom 135034 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.005 ps: Water molecule starting at atom 137209 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.007 ps: Water molecule starting at atom 135034 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.007 ps: Water molecule starting at atom 137209 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.009 ps: Water molecule starting at atom 101140 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.009 ps: Water molecule starting at atom 110524 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.009 ps: Water molecule starting at atom 45346 can not be settled.
Check for bad contacts and/or reduce the timestep.
[03:42:27] 
[03:42:27] Folding@home Core Shutdown: INTERRUPTED
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Quit
[03:42:31] CoreStatus = 66 (102)
[03:42:31] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[03:42:31] Killing all core threads

Folding@Home Client Shutdown.

Re: Project: 2669 (Run 14, Clone 166, Gen 38)

Posted: Wed Jul 08, 2009 10:33 am
by toTOW
That one looks like a bad WU :( ... I guess you can trash it.

Re: Project: 2669 (Run 14, Clone 166, Gen 38)

Posted: Thu Jul 09, 2009 11:28 pm
by ChasR
Same thing here:

Too many steps:

[19:35:55] Project: 2669 (Run 14, Clone 166, Gen 38)
[19:35:55]
[19:35:55] Entering M.D.
[19:36:05] Completed 0 out of 499999 steps (0%)

Re: Project: 2669 (Run 14, Clone 166, Gen 38)

Posted: Fri Jul 10, 2009 12:30 pm
by susato
Wouldn't start for me either... water molecules wouldn't settle.

No need for the mods database to tell us this one's bad. PM sent.
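
For anyone who wants to check their own log for the same symptom, here is a rough Python sketch that counts the "can not be settled" warnings per atom. The function name `settle_failures` and the regular expression are my own assumptions, based only on the log lines posted in this thread; other core versions may word the message differently.

```python
import re
from collections import Counter

# Pattern taken from the warnings in the log above (an assumption:
# other core versions may format this message differently).
SETTLE_RE = re.compile(
    r"t = ([\d.]+) ps: Water molecule starting at atom (\d+) can not be settled"
)

def settle_failures(log_text):
    """Count settle warnings per starting atom index (hypothetical helper)."""
    return Counter(int(m.group(2)) for m in SETTLE_RE.finditer(log_text))

if __name__ == "__main__":
    sample = (
        "t = 18500.005 ps: Water molecule starting at atom 135034 can not be settled.\n"
        "Check for bad contacts and/or reduce the timestep.\n"
        "t = 18500.007 ps: Water molecule starting at atom 135034 can not be settled.\n"
    )
    for atom, count in settle_failures(sample).items():
        print(f"atom {atom}: {count} warning(s)")
```

Several different atoms failing within the first few steps, as in the first log, points at the work unit rather than the machine.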

Project 2669 (Run 14, Clone 166, Gen 38)

Posted: Mon Jul 13, 2009 6:34 am
by Phantom

# Mac OS X SMP Console Edition ################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /Users/mini5/Library/InCrease/unit1
Executable: /Users/mini5/Library/InCrease/unit1/fah6
Arguments: -local -advmethods -forceasm -verbosity 9 -smp  

[06:20:00] - Ask before connecting: No
[06:20:00] - User name: Phantom (Team 1971)
[06:20:00] - User ID: xxxxxxxxxxxxxxxxxxxxx
[06:20:00] - Machine ID: 1
[06:20:00] 
[06:20:00] Loaded queue successfully.
[06:20:00] 
[06:20:00] - Autosending finished units... [06:20:00]
[06:20:00] + Processing work unit
[06:20:00] At least 4 processors must be requested.[06:20:00] + No unsent completed units remaining.
Core required: FahCore_a2.exe
[06:20:00] Core found.
[06:20:00] - Using generic ./mpiexec
[06:20:00] Working on queue slot 08 [July 13 06:20:00 UTC]
[06:20:00] + Working ...
[06:20:00] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 08 -checkpoint 15 -forceasm -verbose -lifeline 33281 -version 624'

[06:20:00] 
[06:20:00] *------------------------------*
[06:20:00] Folding@Home Gromacs SMP Core
[06:20:00] Version 2.07 (Sun Apr 19 14:29:51 PDT 2009)
[06:20:00] 
[06:20:00] Preparing to commence simulation
[06:20:00] - Ensuring status. Please wait.
[06:20:01] Called DecompressByteArray: compressed_data_size=4831120 data_size=23973957, decompressed_data_size=23973957 diff=0
[06:20:01] - Digital signature verified
[06:20:01] 
[06:20:01] Project: 2669 (Run 14, Clone 166, Gen 38)
[06:20:01] 
[06:20:02] Assembly optimizations on if available.
[06:20:02] Entering M.D.
[06:20:12]  on if available.
[06:20:12] Entering M.D.
[06:20:22] Completed 0 out of 499999 steps  (0%)
[06:20:29] CoreStatus = 1 (1)
[06:20:29] Sending work to server
[06:20:29] Project: 2669 (Run 14, Clone 166, Gen 38)
[06:20:29] - Error: Could not get length of results file work/wuresults_08.dat
[06:20:29] - Error: Could not read unit 08 file. Removing from queue.

The error repeats at the same point every time this WU is reissued and downloaded. Restarting the computer did not help.

Re: Project: 2669 (Run 14, Clone 166, Gen 38)

Posted: Mon Jul 13, 2009 8:32 pm
by susato
Running the client with the -delete xx flag should help, where xx is the queue slot number (08 in your log). (In InCrease, add it to the Extra Arguments field in Launch Preferences.)

If you have to change your machineID to get rid of it, use Commands > Edit Unit Configuration in InCrease, or if you are using the Console client on its own, run it with the -configonly flag from the Terminal window.
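
If it helps with finding the slot number, here is a small Python sketch that pulls the most recent queue slot and Project/Run/Clone/Gen out of a client log. The helper name `last_slot_and_unit` is made up for illustration, and the patterns assume the log wording shown in this thread ("Working on queue slot NN" and "Project: P (Run R, Clone C, Gen G)").

```python
import re

# Patterns copied from the log lines in this thread (an assumption:
# other client versions may word these lines differently).
SLOT_RE = re.compile(r"Working on queue slot (\d+)")
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

def last_slot_and_unit(log_text):
    """Return (slot, (project, run, clone, gen)) for the last WU in the log.

    Either element is None if the log has no matching line.
    Hypothetical helper, not part of the Folding@home client.
    """
    slots = SLOT_RE.findall(log_text)
    units = PRCG_RE.findall(log_text)
    slot = slots[-1] if slots else None
    unit = tuple(int(n) for n in units[-1]) if units else None
    return slot, unit

if __name__ == "__main__":
    sample = (
        "[06:20:00] Working on queue slot 08 [July 13 06:20:00 UTC]\n"
        "[06:20:01] Project: 2669 (Run 14, Clone 166, Gen 38)\n"
    )
    slot, unit = last_slot_and_unit(sample)
    print("slot:", slot, "unit:", unit)
```

The slot is kept as a string so the leading zero from the log is preserved.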