P2605 Run11 clone 497 Generation 71
Posted: Thu Jul 10, 2008 3:56 pm
It's World Taekwondo Federation time again. This unit seemed to complete successfully. Here is the nohup.out output:
After that, it sat there like a sausage for 25 minutes. So I verified that my network connection was still active, and that there were no fah6 processes still using the processor, and killed the parent.
qfix shows:
But ./fah6 -send 5 -verbosity 9 gives me:
Nor will ./fah6 -smp start another unit:
I am sorry to say this is urgent, as I am leaving for a month's holiday tomorrow afternoon, and I shall lose about 40 WUs if we are unable to solve this problem. ATM, my thought would be to take the results file and save it somewhere else, then use qdclear to nix the work directory, so I can get the show on the road.
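The "save the results somewhere else" step could be sketched roughly as below. This is only an illustration, not part of the client or of qdclear: the function name `backup_client_state` and the backup folder naming are hypothetical, and including `queue.dat` alongside the `work` directory is an assumption about which files are worth keeping. It only copies; it never deletes or changes anything in the client directory.

```python
import shutil
import time
from pathlib import Path


def backup_client_state(client_dir, backup_root, extra_files=("queue.dat",)):
    """Copy work/ and a few named files into a new timestamped folder.

    Hypothetical helper: nothing in client_dir is modified or removed.
    """
    client_dir = Path(client_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"fah-backup-{stamp}"
    # Refuse to reuse an existing folder so an older backup is never overwritten.
    dest.mkdir(parents=True, exist_ok=False)

    work = client_dir / "work"
    if work.is_dir():
        # Copies e.g. work/wuresults_05.dat along with everything else in work/.
        shutil.copytree(work, dest / "work")

    for name in extra_files:
        src = client_dir / name
        if src.is_file():
            shutil.copy2(src, dest / name)  # copy2 also keeps timestamps

    return dest
```

Calling `backup_client_state("/home/mike/Download", "/home/mike/fah-backups")` would be one way to use it before touching the work directory; the second path is made up for the example.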
Thx in advance for any advice from Taekwondo blackbelts or folding experts.
Mike
nohup.out output:
[14:45:57] Writing local files
[14:45:57] Completed 495000 out of 500000 steps (99 percent)
[14:58:05] Writing local files
[14:58:05] Completed 500000 out of 500000 steps (100 percent)
M E G A - F L O P S A C C O U N T I N G
Parallel run - timing based on wallclock.
RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
NF=No Forces
Computing: M-Number M-Flops % of Flops
----[14:58:05] Writing final coordinates.
-------------------------------------------------------------------
VdW(T) 1364641.594393 73690646.097222 16.6
RF Coul 545869.157252 18013682.189316 4.1
RF Coul [W3] 2257.835336 221267.862928 0.0
RF Coul + VdW(T) 617564.629043 40141700.887795 9.0
RF Coul + VdW(T) [W3] 280820.440646 36506657.283980 8.2
RF Coul + VdW(T) [W3-W3] 769776.706149 246328545.967680 55.5
Outer nonbonded loop 237847.898865 2378478.988650 0.5
1,4 nonbonded interactions 1059.002118 95310.190620 0.0
NS-Pairs 501069.584250 10522461.269250 2.4
Reset In Box 3878.527569 34906.748121 0.0
Shift-X 77469.154938 464814.929628 0.1
CG-CoM 1767.035340 51244.024860 0.0
Sum Forces 116353.732707 116353.732707 0.0
Bonds 13273.526547 570761.641521 0.1
Angles 15621.031242 2546228.092446 0.6
Propers 5386.510773 1233510.967017 0.3
Impropers 1008.002016 209664.419328 0.0
RB-Dihedrals 14140.528281 3492710.485407 0.8
Virial 38838.577677 699094.398186 0.2
Update 38784.577569 1202321.904639 0.3
Stop-CM 38784.500000 387845.000000 0.1
P-Coupling 38784.577569 232707.465414 0.1
Calc-Ekin 38784.655138 1047185.688726 0.2
Constraint-V 38784.577569 232707.465414 0.1
Constraint-Vir 25212.050424 605089.210176 0.1
Settle 8404.016808 2714497.428984 0.6
-----------------------------------------------------------------------
Total 443740394.340015 100.0
-----------------------------------------------------------------------
NODE (s) Real (s) (%)
Time: 71349.000 71349.000 100.0
19h49:09
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 50.189 6.219 1.211 19.819
[14:58:05] Past main M.D. loop
[14:58:05] Will end MPI now
[14:59:05]
[14:59:05] Finished Work Unit:
[14:59:05] - Reading up to 3723552 from "work/wudata_05.arc": Read 3723552
[14:59:05] - Reading up to 1779300 from "work/wudata_05.xtc": Read 1779300
[14:59:05] goefile size: 0
[14:59:05] logfile size: 16925
[14:59:05] Leaving Run
[14:59:08] - Writing 5524177 bytes of core data to disk...
[14:59:08] ... Done.
[14:59:09] - Shutting down core
[14:59:09]
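As a sanity check that the run really did finish, the timing figures in the log above agree with each other. A small sketch of the arithmetic, using only numbers copied from the log (the helper names are made up for illustration):

```python
def format_hms(seconds):
    """Format a wallclock duration in seconds the way the log prints it."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h{minutes:02d}:{secs:02d}"


def hours_per_ns(ns_per_day):
    """Convert simulation speed in ns/day to hours needed per ns."""
    return 24.0 / ns_per_day


wallclock_s = 71349.0   # "Time:" line
ns_per_day = 1.211      # "Performance:" line

print(format_hms(wallclock_s))              # 19h49:09
print(round(hours_per_ns(ns_per_day), 2))   # 19.82

# Simulated time implied by the two figures: very close to 1 ns.
simulated_ns = (wallclock_s / 3600.0) / hours_per_ns(ns_per_day)
print(round(simulated_ns, 2))               # 1.0
```

The log reports 19.819 hour/ns, while 24 / 1.211 gives about 19.818; the small difference is presumably because the log works from unrounded values.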
qfix shows:
entry 6, status 0, address 171.64.65.56:8080
entry 7, status 0, address 171.64.65.56:8080
entry 8, status 0, address 171.64.65.56:8080
entry 9, status 0, address 171.64.65.56:8080
entry 0, status 0, address 171.64.65.56:8080
entry 1, status 0, address 171.64.65.56:8080
entry 2, status 0, address 171.64.65.56:8080
entry 3, status 0, address 171.64.65.56:8080
entry 4, status 0, address 171.64.65.56:8080
entry 5, status 1, address 171.64.65.56:8080
Found results <work/wuresults_05.dat>: proj 2605, run 11, clone 497, gen 71
-- queue entry: proj 2605, run 11, clone 497, gen 71
-- queue entry isn't empty
File is OK
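For anyone comparing several qfix runs, the entry lines above are regular enough to pick apart with a short script. This is a sketch that assumes the line format is exactly as shown in this output; it makes no claim about what each status value means and only separates zero from non-zero.

```python
import re

# Matches lines such as: entry 5, status 1, address 171.64.65.56:8080
QFIX_ENTRY = re.compile(r"entry (\d+), status (\d+), address (\S+)")


def parse_qfix_entries(text):
    """Return (entry, status, address) tuples for every entry line found."""
    entries = []
    for line in text.splitlines():
        match = QFIX_ENTRY.search(line)
        if match:
            entries.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return entries


def nonzero_status(entries):
    """Keep only the queue entries whose status is not 0."""
    return [entry for entry in entries if entry[1] != 0]


sample = """entry 4, status 0, address 171.64.65.56:8080
entry 5, status 1, address 171.64.65.56:8080
File is OK"""

print(nonzero_status(parse_qfix_entries(sample)))
# [(5, 1, '171.64.65.56:8080')]
```

In the output above, entry 5 is the only one with a non-zero status, and it is the one qfix ties to work/wuresults_05.dat.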
Output of ./fah6 -send 5 -verbosity 9:
Launch directory: /home/mike/Download
Executable: ./fah6
Arguments: -send 5 -verbosity 9
[15:52:05] - Ask before connecting: No
[15:52:05] - User name: Dutchmm (Team 31574)
[15:52:05] - User ID: 2C9583CC4EA765F8
[15:52:05] - Machine ID: 1
[15:52:05]
[15:52:05] Loaded queue successfully.
[15:52:05] Attempting to return result(s) to server...
[15:52:05] - Warning: Asked to send unfinished unit to server
[15:52:05] - Failed to send unit 05 to server
[15:52:05] ***** Got a SIGTERM signal (15)
[15:52:05] Killing all core threads
Folding@Home Client Shutdown.
Output of ./fah6 -smp:
Launch directory: /home/mike/Download
Executable: ./fah6
Arguments: -smp
[15:53:24] - Ask before connecting: No
[15:53:24] - User name: Dutchmm (Team 31574)
[15:53:24] - User ID: 2C9583CC4EA765F8
[15:53:24] - Machine ID: 1
[15:53:24]
[15:53:24] Loaded queue successfully.
[15:53:24]
[15:53:24] + Processing work unit
[15:53:24] Core required: FahCore_a1.exe
[15:53:24] Core found.
[15:53:24] Working on Unit 05 [July 10 15:53:24]
[15:53:24] + Working ...
[15:53:25]
[15:53:25] *------------------------------*
[15:53:25] Folding@Home Gromacs SMP Core
[15:53:25] Version 1.74 (November 27, 2006)
[15:53:25]
[15:53:25] Preparing to commence simulation
[15:53:25] - Ensuring status. Please wait.
[15:53:42] - Looking at optimizations...
[15:53:42] - Working with standard loops on this execution.
[15:53:42] Examination of work files indicates 8 consecutive improper terminations of core.
[15:53:42] Finalizing output
[15:53:42] - Starting from initial work packet
[15:53:42]
[15:53:42] Project: 0 (Run 0, Clone 0, Gen 0)
[15:53:42]
[15:53:42] Error: Could not write local file. Exiting.
[15:53:42] - Shutting down core
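With logs this long, it can help to pull out just the lines that look like trouble. The following is a rough sketch, assuming every client line starts with a `[hh:mm:ss]` timestamp as in the output above; the keyword list is a guess based on the messages seen in this thread, not an official list of client errors.

```python
import re

# Client log lines look like: [15:53:42] Error: Could not write local file. Exiting.
LOG_LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")

# Assumed keywords, taken from the messages quoted in this thread.
KEYWORDS = ("error", "warning", "failed", "improper termination")


def flag_problem_lines(log_text):
    """Return (timestamp, message) pairs for lines containing a keyword."""
    flagged = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines without a timestamp
        timestamp, message = match.groups()
        if any(keyword in message.lower() for keyword in KEYWORDS):
            flagged.append((timestamp, message))
    return flagged


sample_log = """[15:53:42] - Working with standard loops on this execution.
[15:53:42] Examination of work files indicates 8 consecutive improper terminations of core.
[15:53:42] Error: Could not write local file. Exiting.
[15:53:42] - Shutting down core"""

for timestamp, message in flag_problem_lines(sample_log):
    print(timestamp, message)
```

Run against the two client logs above, a filter like this would surface the "unfinished unit" warning, the failed send, the improper terminations count, and the "Could not write local file" error.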