Project 2653 Run 29 Clone 155 Gen 148 - Missing work files

Moderators: Site Moderators, FAHC Science Team

Adam2013
Posts: 2
Joined: Mon Feb 01, 2010 8:15 pm

Post by Adam2013 »

I was folding along just fine (Project 2653 Run 29 Clone 155 Gen 148) until the client finished this WU. I happened to be checking FahMon at the time, and this is what I saw in the logfile:

Code: Select all

[15:37:09] Completed 500000 out of 500000 steps  (100 percent)
[15:37:09] Writing final coordinates.
[15:37:10] Past main M.D. loop
[15:37:10] Will end MPI now
[15:38:10] 
[15:38:10] Finished Work Unit:
[15:38:10] - Reading up to 3729840 from "work/wudata_08.arc": Read 3729840
[15:38:10] - Reading up to 1788552 from "work/wudata_08.xtc": Read 1788552
[15:38:10] goefile size: 0
[15:38:10] logfile size: 43199
[15:38:10] Leaving Run
[15:38:11] - Writing 5565991 bytes of core data to disk...
[15:38:11]   ... Done.
[15:38:11] - Failed to delete work/wudata_08.sas
[15:38:11] - Failed to delete work/wudata_08.goe
[15:38:11] Warning:  check for stray files
[15:38:11] - Shutting down core
[15:40:05] Killing all core threads
[15:40:05] Killing 4 cores
[15:40:05] Killing core 0
[15:40:05] Killing core 1
[15:40:05] Killing core 2
[15:40:05] Killing core 3

Folding@Home Client Shutdown at user request.
[15:40:05] ***** Got a SIGTERM signal (2)
[15:40:05] Killing all core threads
[15:40:05] Killing 4 cores
[15:40:05] Killing core 0
[15:40:05] Killing core 1
[15:40:05] Killing core 2
[15:40:05] Killing core 3

Folding@Home Client Shutdown.
It stopped the core, and I thought the "Failed to delete" lines were telling me to delete those two files myself because the client couldn't. I did, and then I got this error the next time I started that client:

Code: Select all

[15:41:39] *------------------------------*
[15:41:39] Folding@Home Gromacs SMP Core
[15:41:39] Version 1.74 (March 10, 2007)
[15:41:39] 
[15:41:39] Preparing to commence simulation
[15:41:39] - Ensuring status. Please wait.
[15:41:56] - Looking at optimizations...
[15:41:56] - Working with standard loops on this execution.
[15:41:56] - Created dyn
[15:41:56] - Files status OK
[15:41:56] 
[15:41:56] Folding@home Core Shutdown: MISSING_WORK_FILES
[15:41:56] Finalizing output
[15:42:14] Killing all core threads
[15:42:14] Killing 4 cores
[15:42:14] Killing core 0
[15:42:14] Killing core 1
[15:42:14] Killing core 2
[15:42:14] Killing core 3

Folding@Home Client Shutdown at user request.
[15:42:14] ***** Got a SIGTERM signal (2)
[15:42:14] Killing all core threads
[15:42:14] Killing 4 cores
[15:42:14] Killing core 0
[15:42:14] Killing core 1
[15:42:14] Killing core 2
[15:42:14] Killing core 3

Folding@Home Client Shutdown.
If anyone has any information or needs more, please let me know. I have already accepted the fact that I will probably be losing this WU :|

Adam
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 2653 Run 29 Clone 155 Gen 148 - Missing work files

Post by bruce »

Those "Failed to delete" and "check for stray files" messages should be ignored. One fundamental rule is that you should not manipulate any of FAH's files. The client is designed to run unattended and to correct whatever might keep it from continuing to work. There are a few cases where you can help it along, but there are certainly a lot more cases where you can get into trouble, so it's best to do nothing.

You'll find those messages in many different situations. Here are two cases: one where there was an error and another where the WU completed successfully. The important thing to note is that the WU is not actually completed until you get the CoreStatus = xx (yy) message. Stopping the client or manipulating any files before you get a CoreStatus message is risky, and in your case it caused the loss of a completed WU.

Code: Select all

[07:01:28]  Writing 30533 bytes of core data to disk...
[07:01:28]   ... Done.
[07:01:28] - Failed to delete work/wudata_04.chk
[07:01:28] - Failed to delete work/wudata_04.goe
[07:01:28] Warning:  check for stray files
[07:03:28]
[07:03:28] Folding@home Core Shutdown: EARLY_UNIT_END
[07:03:28] Finalizing output
[07:03:31] CoreStatus = 1 (1)

Code: Select all

[14:55:58] - Writing 22087214 bytes of core data to disk...
[14:56:00]   ... Done.
[14:56:00] - Failed to delete work/wudata_02.sas
[14:56:00] - Failed to delete work/wudata_02.goe
[14:56:00] Warning:  check for stray files
[14:56:00] - Shutting down core
[14:58:00]
[14:58:00] Folding@home Core Shutdown: FINISHED_UNIT
[14:58:06] CoreStatus = 64 (100)
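
If you'd rather check for that CoreStatus line with a script than by eye, here is a minimal sketch. It is not part of the FAH client. It only assumes the log line looks like the two examples above, and the function name find_core_status is made up for illustration. In both examples the first value reads as the hex form of the second, but the sketch keeps it as plain text and does not rely on that.

```python
import re

# Pattern based only on the two example lines above, e.g.
# "[14:58:06] CoreStatus = 64 (100)". The exact format is an assumption.
CORE_STATUS_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\s+CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)"
)

def find_core_status(log_lines):
    """Return (timestamp, first_value, second_value) for the last
    CoreStatus line in log_lines, or None if there is none yet."""
    result = None
    for line in log_lines:
        match = CORE_STATUS_RE.match(line.strip())
        if match:
            result = (match.group(1), match.group(2), int(match.group(3)))
    return result

# Tail of a log where the core has reported back to the client.
finished = [
    "[14:58:00] Folding@home Core Shutdown: FINISHED_UNIT",
    "[14:58:06] CoreStatus = 64 (100)",
]
# Tail of a log where the core is still shutting down: leave it alone.
still_busy = [
    "[15:38:11] Warning:  check for stray files",
    "[15:38:11] - Shutting down core",
]

print(find_core_status(finished))
print(find_core_status(still_busy))
```

If it returns None, the core hasn't reported back yet, so don't stop the client or touch anything in the work folder.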
Adam2013
Posts: 2
Joined: Mon Feb 01, 2010 8:15 pm

Re: Project 2653 Run 29 Clone 155 Gen 148 - Missing work files

Post by Adam2013 »

Thanks Bruce,
After I started my client, it took a while, but it started on a new WU. It seems to have trouble connecting to 171.64.65.56, but I'll post in a different thread if it keeps happening.

Adam