Two WUs are running on my Quad 6600 Linux boxes.
Their unitinfo.txt files are both 172,316,322 bytes.
# cut -c 1-60 unitinfo.txt
Current Work Unit
-----------------
Name: Gromacs
Tag: P2674R3C99G21
Download time: November 12 15:48:59
Due time: November 15 15:48:59
Progress: 1723161600% [||||||||||||||||||||||||||||||||||||
# grep Completed fah.log | tail -5
[17:24:55] Completed 37509 out of 249999 steps (15%)
[17:31:18] Completed 40009 out of 249999 steps (16%)
[17:37:41] Completed 42509 out of 249999 steps (17%)
[17:44:03] Completed 45009 out of 249999 steps (18%)
[17:50:29] Completed 47509 out of 249999 steps (19%)
# cut -c 1-60 unitinfo.txt
Current Work Unit
-----------------
Name: Gromacs
Tag: P2674R3C19G21
Download time: November 12 15:05:55
Due time: November 15 15:05:55
Progress: 1723161600% [||||||||||||||||||||||||||||||||||||
[root@quad1 fah6]# grep Completed fah.log | tail -5
[17:24:13] Completed 55009 out of 249999 steps (22%)
[17:30:31] Completed 57509 out of 249999 steps (23%)
[17:36:48] Completed 60009 out of 249999 steps (24%)
[17:43:04] Completed 62509 out of 249999 steps (25%)
[17:49:21] Completed 65009 out of 249999 steps (26%)
172 meg unitinfo.txt / Progress: 1723161600%
Re: 172 meg unitinfo.txt / Progress: 1723161600%
There have been some reports of this recently.
The client does need some fixing.
Re: 172 meg unitinfo.txt / Progress: 1723161600%
I like how the file size matches the % in unitinfo.txt. This means there is no cap check in the client (if the % is over 100, something is wrong), so the pipe character is repeated non-stop through the file.
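A minimal sketch of the kind of cap check meant here. The helper name `progress_line` and the 50-character bar width are assumptions for illustration only; the client's real code and bar format are not shown in this thread.

```python
def progress_line(percent: int, width: int = 50) -> str:
    """Build a 'Progress: N% [|||   ]' line with a bounded bar.

    Hypothetical helper: any value outside 0..100 is reported as invalid
    instead of being drawn, so the line can never grow without limit.
    """
    if percent < 0 or percent > 100:
        # Cap check: a percentage over 100 means something upstream is wrong.
        return f"Progress: invalid ({percent}%)"
    filled = percent * width // 100
    return f"Progress: {percent}% [" + "|" * filled + " " * (width - filled) + "]"


print(progress_line(19))
print(progress_line(1723161600))
```

With a check like this the line stays at a fixed length for valid values, and the bogus value from the first post is flagged rather than rendered.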
-
- Pande Group Member
- Posts: 2058
- Joined: Fri Nov 30, 2007 6:25 am
- Location: Stanford
Re: 172 meg unitinfo.txt / Progress: 1723161600%
We'll take a look.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
-
- Posts: 366
- Joined: Tue Feb 12, 2008 7:33 am
- Hardware configuration: Running exclusively Linux headless blades. All are dedicated crunching machines.
- Location: SE Michigan, USA
Re: 172 meg unitinfo.txt / Progress: 1723161600%
Here's two more
86 KB unitinfo.txt / Progress: 859018%
# cut -c 1-60 unitinfo.txt
Current Work Unit
-----------------
Name: Gromacs
Tag: P2673R1C10G17
Download time: November 12 16:04:09
Due time: November 18 16:04:09
Progress: 859018% [||||||||||||||||||||||||||||||||||||||||
# ls -alt unitinfo.txt
-rw-r--r-- 1 root root 86,059 Nov 12 16:15 unitinfo.txt
# grep Completed fah.log | tail -5
[20:25:27] Completed 105010 out of 500001 steps (21%)
[20:37:52] Completed 110010 out of 500001 steps (22%)
[20:50:17] Completed 115010 out of 500001 steps (23%)
[21:02:42] Completed 120010 out of 500001 steps (24%)
[21:15:08] Completed 125010 out of 500001 steps (25%)
172 meg unitinfo.txt / Progress: 1723161600%
# cut -c 1-60 unitinfo.txt
Current Work Unit
-----------------
Name: Gromacs
Tag: P2674R1C157G21
Download time: November 12 17:57:58
Due time: November 15 17:57:58
Progress: 1723161600% [||||||||||||||||||||||||||||||||||||
# ls -alt unitinfo.txt
-rw-r--r-- 1 root root 172,316,323 Nov 12 16:23 unitinfo.txt
# grep Completed fah.log | tail -5
[21:03:39] Completed 92509 out of 249999 steps (37%)
[21:08:40] Completed 95009 out of 249999 steps (38%)
[21:13:41] Completed 97509 out of 249999 steps (39%)
[21:18:42] Completed 100009 out of 249999 steps (40%)
[21:23:42] Completed 102509 out of 249999 steps (41%)
-
- Posts: 471
- Joined: Mon Dec 03, 2007 6:20 am
- Location: Amsterdam
Re: 172 meg unitinfo.txt / Progress: 1723161600%
Xilikon wrote: I like how the file size matches the % in unitinfo.txt. This means there is no cap check in the client (if the % is over 100, something is wrong), so the pipe character is repeated non-stop through the file.

There is a | character for each % reported as completed in unitinfo.txt. I counted them when first analyzing this bug, when it caused FCI to slow to a crawl (it stored the progress bar in an XML file, and parsing an XML file with one or more progress bars of several MB is not recommended).
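For anyone who wants to repeat the count on their own file, here is a rough sketch that reads the file in chunks, so a unitinfo.txt of over a hundred MB does not have to be loaded in one go. The function name `count_pipes` and the chunk size are assumptions for illustration, not part of any FAH tool.

```python
def count_pipes(path: str, chunk_size: int = 1 << 20) -> int:
    """Count '|' bytes in a file, reading it chunk by chunk."""
    total = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break  # end of file reached
            # A single byte can never straddle a chunk boundary,
            # so per-chunk counts simply add up.
            total += chunk.count(b"|")
    return total
```

Calling `count_pipes("unitinfo.txt")` returns the number of bar characters, which can then be compared with the percentage on the Progress line.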

I've only seen this happen with projects for the a2 core. I have a copy of the folding directory with such a case saved for testing.
Dick Howell's wuinfo (a utility that parses the work/wuinfo_??.dat files) shows the following for this case:
Code:
index 6:
Core: Core_a2
Name: Gromacs
Progress: 1718012% (4295029 of 250 steps)
The corresponding work/logfile_06.txt:
Code:
*------------------------------*
Folding@Home Gromacs SMP Core
Version 2.01 (Wed Aug 13 13:11:25 PDT 2008)
Preparing to commence simulation
- Ensuring status. Please wait.
- Assembly optimizations manually forced on.
- Not checking prior termination.
- Expanded 4838305 -> 24033653 (decompressed 496.7 percent)
Called DecompressByteArray: compressed_data_size=4838305 data_size=24033653, decompressed_data_size=24033653 diff=0
- Digital signature verified
Project: 2670 (Run 5, Clone 11, Gen 17)
Assembly optimizations on if available.
Entering M.D.
Completed 2510 out of 250001 steps (1%)
Completed 5010 out of 250001 steps (2%)
[....]
Completed 60010 out of 250001 steps (24%)
Completed 62510 out of 250001 steps (25%)
Code:
qd released 7 September 2008 (fr 071); qd info 10 November 2008 (update-qd.pl)
qd executed Wed Nov 12 22:20:35 CET 2008 (Wed Nov 12 21:20:35 UTC 2008)
Queue version 5.01
Current index: 6
[...]
Index 6: folding now 1920.00 pts (104.433 pt/hr) 3.92 X min speed; 25% complete
server: 171.67.108.24:8080; project: 2670
Folding: run 5, clone 11, generation 17; benchmark 0; misc: 500, 200
Project: 2670 (Run 5, Clone 11, Gen 17)
issue: Tue Oct 28 12:20:38 2008; begin: Tue Oct 28 12:21:00 2008
expect: Wed Oct 29 06:44:05 2008; due: Fri Oct 31 12:21:00 2008 (3 days)
preferred: Fri Oct 31 12:21:00 2008 (3 days)
core URL: http://www.stanford.edu/~pande/Linux/x86/Core_a2.fah (V2.01)
CPU: 1,0 x86; OS: 4,0 Linux
smp cores: 4; cores to use: 4
tag: P2670R5C11G17
flops: 1061161541 (1061.161541 megaflops)
assignment info (le): Tue Oct 28 12:20:37 2008; BBC399E2
CS: 171.67.108.25; P limit: 524286976
user: [DPC]_Fatal_Error_Group0smoking2000; team: 92; ID: 9E3B81209D0E757D; mach ID: 1
work/wudata_06.dat file size: 4838817; WU type: Folding@Home
Results successfully sent: Fri Jun 6 16:28:16 2008
Average download rate 377.409 KB/s (u=4); upload rate 65.087 KB/s (u=4)
Performance fraction 0.750157 (u=4)
Average pph: 75.427, ppd: 1810.25, ppw: 12671.7, ppy: 661176
Code:
$ grep step work/wudata_06.log
nsteps = 250001
init_step = 4250000
em_stepsize = 0.01
fc_stepsize = 0
will use an extra communication step for exclusion forces for Reaction-Field
Charge group distribution at step 4250000: 13534 19290 16635 15187
DD step 4250009 vol min/aver 1.000 load imb.: force 23.9%
DD step 4250999 vol min/aver 0.754 load imb.: force 9.0%
Writing checkpoint, step 4251370 at Tue Oct 28 12:26:20 2008
DD step 4251999 vol min/aver 0.775 load imb.: force 5.2%
Writing checkpoint, step 4252470 at Tue Oct 28 12:31:22 2008
[...]
DD step 4308999 vol min/aver 0.719 load imb.: force 11.4%
Writing checkpoint, step 4309560 at Tue Oct 28 16:46:20 2008
DD step 4309999 vol min/aver 0.773 load imb.: force 3.0%
Writing checkpoint, step 4310950 at Tue Oct 28 16:51:20 2008
DD step 4310999 vol min/aver 0.780 load imb.: force 3.7%
DD step 4311999 vol min/aver 0.773 load imb.: force 3.5%
Writing checkpoint, step 4312370 at Tue Oct 28 16:56:19 2008
Code:
DD step 4294999 vol min/aver 0.799 load imb.: force 5.0%
Step Time Lambda
4295000 8590.00000 0.00000
Energies (kJ/mol)
Bond Angle Ryckaert-Bell. LJ-14 Coulomb-14
1.76815e+04 4.78336e+04 4.69625e+04 2.17307e+04 3.00686e+05
LJ (SR) Disper. corr. Coulomb (SR) RF excl. Potential
1.96170e+05 -1.70938e+04 -2.29351e+06 -2.00248e+05 -1.87979e+06
Kinetic En. Total Energy Temperature Pressure (bar) Cons. rmsd ()
3.98393e+05 -1.48140e+06 3.12924e+02 -2.40168e+02 3.67236e-06
Writing checkpoint, step 4295270 at Tue Oct 28 15:51:21 2008
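As a quick check of the arithmetic, the wuinfo figure of 1718012% is what you get from dividing the reported step count by the reported total. The sketch below only reworks numbers already quoted above. The second calculation, which subtracts init_step and uses nsteps as the total, is one possible reading of what a sane value would look like, not a confirmed diagnosis of the client bug, and the helper name `percent` is made up for this example.

```python
def percent(steps_done: int, total_steps: int) -> int:
    """Percentage complete, rounded to the nearest whole number."""
    return round(100 * steps_done / total_steps)


# Figures exactly as reported by wuinfo: "4295029 of 250 steps".
reported = percent(4295029, 250)

# Assumed reading: count steps relative to init_step and use nsteps
# from the work unit log as the total.
init_step = 4250000
nsteps = 250001
relative = percent(4295029 - init_step, nsteps)

print(reported)  # 1718012
print(relative)  # 18
```

The first result matches the wuinfo output, which suggests the runaway percentage comes from an absolute step number being divided by a far too small total.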