New release of FCI: v1.6
It seems like forever since the last release (v1.5.1), but that was only a few months ago (v1.5.1 was released Wed Nov 18, 2009). It has been far too long since the last major release (v1.5 was released Wed Apr 15, 2009 9:13 pm). Approximately 9 months is not very good in the release early, release often department. You can have a baby in that time frame, although you may not want to release early and often in that case.
This release resembles the v1.3 release (Sat Mar 14, 2009) a bit, introducing some major new features. But this release took much longer to get done. At least I beat Duke Nukem Forever.
New Feature: FahChart-like TPF graphs for the current WU using jQuery Flot
This is the first feature of FCI that requires JavaScript. I've long sought a way to integrate FahCharts features into FCI; it was a unique application that allowed you to inspect the TPF at each completed step in the graph. Now you can do this in FCI too, thanks to the wonderful jQuery Flot plotting library. You can even click through to the line in the FAHlog on which the step in question was completed (good for inspecting peaks).
These TPF graphs are displayed on the client page, and on the project page if the trajectory a client is working on is selected.
New Feature: Trajectory tracking for projects
In addition to the individual projects, each trajectory for a project is now also tracked in the expanded known-projects data. Each project now has its own known-project XML file in which the general project information is stored, just as it is in the known-projects XML file listing all known projects. But the project-specific known-project XML file also stores a list of the trajectories the FCI clients have worked on. If a current.xyz file is available, the project images will be stored per trajectory as well.
A project page for which the trajectory is specified (by clicking a trajectory link) will now highlight the client running the trajectory, and the TPF graph if available.
This feature is not fully implemented yet, because the TPF data for the trajectory is not stored for completed WUs, only for the currently active WUs. This will be addressed in a future release of FCI, where a client list will also be shown for completed WUs.
New Feature: Queue History RRD graphs
RRD graphs are now also generated for the following queue values:
- Current PPD
- Speed
- Progress (%)
- Flops
- Performance Fraction
- Average PPD
All values are for the current work unit, except the Performance Fraction and Average PPD, which are queue averages over multiple WUs.
New Feature: FCI server support on OpenSUSE
With the release of OpenSUSE 11.2 it's now possible to get all FCI server dependencies installed, so I got to update the supported OSes in the start post. In previous releases everything except libapreq2 was available; a working build is now available thanks to the openSUSE Build Service.
Improved Functionality: Slightly more intelligent FCI client
The FCI client now knows about a few known issues and handles them a bit more intelligently. It will no longer upload the current.xyz if it likely doesn't belong to the work unit the FAH client is currently working on. This can happen with some SMP cores which no longer generate the current.xyz, while it is also one of the files the FahCores don't (always) clean up. So if the current.xyz was not modified between the start of the current WU and the time the FCI client runs, it will be skipped.
The reason why a file is skipped for upload is now displayed too, and can be one of the following:
- Nonexistent, the file simply does not exist.
- Unreadable, the file cannot be read.
- Commandline Override, --skip-<file> is used on the commandline.
- Configuration Override, skip-<file> is set in fci-client.conf.
- Unverifiable (current.xyz specific), there is no usable qd info to determine if the current.xyz likely belongs to the current WU or not.
- Not Current (current.xyz specific), the file was not modified between the start of the current WU and the time the FCI client runs.
- Filesize Limit (unitinfo.txt specific), the filesize exceeds 512 KB (150-200 is normal).
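The current.xyz checks above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual fci-client.pl code (which is Perl): the function name, the order of the checks, and the use of Unix timestamps are assumptions, and the two override reasons are left out.

```python
import os
import tempfile
from typing import Optional


def current_xyz_skip_reason(path: str, wu_start: Optional[float], now: float) -> Optional[str]:
    """Return a skip reason for current.xyz, or None if it may be uploaded.

    wu_start is the start time of the current WU as derived from qd
    (None when no usable qd info is available); now is the time the
    client runs. Both are Unix timestamps (an assumption of this sketch).
    """
    if not os.path.exists(path):
        return "Nonexistent"
    if not os.access(path, os.R_OK):
        return "Unreadable"
    if wu_start is None:
        # Without qd info there is no way to tell which WU the file belongs to.
        return "Unverifiable"
    mtime = os.path.getmtime(path)
    if not (wu_start <= mtime <= now):
        # Untouched since before this WU started: likely left over from an older WU.
        return "Not Current"
    return None
```

A file last modified before the WU started is reported as "Not Current", while one modified during the WU passes all checks and returns None.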
GPU detection is now also supported on Linux, FreeBSD and Mac OS X. On Linux, if you have an Nvidia GPU and the nvidia-settings utility installed, the client will send more detailed information to the FCI server, including the GPU core temp, amount of VRAM and driver version. For other GPUs only the name reported by lspci for the device is used.
Improved Functionality: Cleaner upload output
Mostly a cosmetic change, prettifying the server response sent to the FCI client after upload processing. But the upload processing is handled more uniformly behind the scenes now too.
Improved Functionality: Automatic Apache2 detection on Gentoo
Gentoo is the only distro that does not include the Apache version in the SERVER_SOFTWARE environment variable, on which FCI relies to detect whether Apache 2.x is used. Without a version, FCI still assumes that Apache 1.3.x is used, although apache2 is more common these days. I recently stumbled upon the magical MOD_PERL_API_VERSION variable, which allows automatic detection to work on Gentoo too. No need to modify index.pl to set $apache2 manually anymore.
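The detection idea can be sketched like this. It is a Python illustration only; index.pl is Perl and its real logic may differ. The function name and the exact SERVER_SOFTWARE pattern are assumptions, and the sketch relies on mod_perl 2 (which only runs under Apache 2.x) setting MOD_PERL_API_VERSION to 2.

```python
import re
from typing import Mapping


def uses_apache2(env: Mapping[str, str]) -> bool:
    """Guess whether the server is Apache 2.x from its environment variables."""
    # Most distros report something like "Apache/2.2.14 (Unix) ...".
    match = re.match(r"Apache/(\d+)", env.get("SERVER_SOFTWARE", ""))
    if match:
        return int(match.group(1)) >= 2
    # No version in SERVER_SOFTWARE (the Gentoo case): fall back to the
    # mod_perl API version.
    if env.get("MOD_PERL_API_VERSION", "") == "2":
        return True
    # Nothing to go on: keep the old assumption of Apache 1.3.x.
    return False
```

In the real script the environment would come from the running server; here a plain mapping is passed in so the logic can be tried in isolation.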
Improved Functionality: New Client State Markers
Client hanging (h), this marker is shown for clients where no progress (on the currently active WU) was detected for the time of the last 2 frames. That is, the marker is set when the time qd was executed is later than the time of the last detected progress (completed frame) plus the time it took to complete the last 2 frames. This way of classifying a client as hung can produce false positives when qd is run during a slow frame (one taking much longer than the last two frames), but this is almost always corrected after the next FCI update (when more progress has been made). So when you see this marker for one of your clients, you'll want to check its FAHlog.txt to see if it's justified; you may also want to check on the client locally, since more progress may have been made since its last FCI client upload. The tooltip of the marker will show the time since the last progress, e.g. "No progress for 1 hour 3 minutes 53 seconds". This marker is also known as the Shadowtester marker, after the user who requested it some time ago now (Fri Mar 27, 2009).
Unable to get work (w), this marker is shown for clients whose last lines in the FAHlog.txt are "Attempt X to get work failed, and no other work to do" and "Waiting before retry". This allows you to see why your clients are not showing any progress and linger at the bottom of the Expected WUs list.
EUE limit exceeded (e), this marker is shown for clients whose last line in the FAHlog.txt shows "EUE limit exceeded. Pausing 24 hours", which is usually preceded by a not so nice UNSTABLE_MACHINE event. This one is for rhavern, who requested it some time ago now too (Tue Apr 14, 2009).
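The rule behind the client hanging (h) marker can be written out as a small sketch. This is illustrative Python, not the FCI source; the function names, the use of seconds, and the handling of clients with fewer than two completed frames are assumptions.

```python
from typing import Sequence


def is_hanging(qd_time: float, last_progress: float, frame_times: Sequence[float]) -> bool:
    """True if qd ran later than the last progress plus the last two frame times."""
    if len(frame_times) < 2:
        # Not enough history to judge (an assumption of this sketch).
        return False
    allowed = sum(frame_times[-2:])
    return qd_time > last_progress + allowed


def stall_tooltip(seconds: int) -> str:
    """Format the time since last progress in the style of the marker tooltip."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    parts = []
    for value, unit in ((hours, "hour"), (minutes, "minute"), (secs, "second")):
        if value:
            parts.append(f"{value} {unit}{'' if value == 1 else 's'}")
    if not parts:
        parts.append("0 seconds")
    return "No progress for " + " ".join(parts)
```

With the last two frames taking 100 and 120 seconds, a client whose last frame completed at t=700 is flagged once qd runs after t=920. This also shows where false positives come from: a single frame that takes longer than the previous two combined trips the check.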
Improved Functionality: Installer now sets the default apache group for all supported OSes
Previously you were required to invoke install.pl using --group <groupname> to have the permissions set correctly for any OS other than those using 'www-data' (Debian, Ubuntu & Mepis). The installation instructions mentioned which groupname was used by default on each supported OS; this knowledge is now also available in the installer itself.
The installer has also been updated to create the parts of the /usr/local/ tree it needs if they don't exist on Mac OS X. Apple has learned not to pollute /usr/local/ with its system installed software as it did in Tiger, but removing it entirely is a bit drastic.
Bugfix: Team 0 now supported
This release has several team 0 fixes, and should now properly support members of this team.
Fixing this bug made me see some of the creative usernames (containing special characters) used in team 0, giving the XML parser a hard time. It was kind of silly to convert the teamstats text files to XML anyway. It was initially done for consistency with the other code, but adding the extra XML tags to the teamstats data only hurts performance. So we just parse the text file now, and have stopped using the XML file. The XML teamstats files will be deleted on the first update with the new version of fci-update-stanford-files.pl, so upgrades to this new version clean up these files automatically.
Bugfix: Begun date used when Issued date is not set
The new v5 work servers initially didn't set the issue date as reported by qd anymore, so FCI cannot rely on the existence of an issue timestamp anymore. In cases where no issue timestamp is known, the begin timestamp is used.
This mostly affected the Assigned Projects page, which now uses either the Issued date or the Begun date, whichever is available (the tooltip shows which value is used), and to a lesser extent the Projects page, which initially displayed the Issued date but now uses the Begun date.
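The fallback amounts to the following. This is a minimal Python sketch under assumed names; FCI's own code is Perl, and treating an unset timestamp as None or 0 is an assumption.

```python
from typing import Optional, Tuple


def assignment_date(issued: Optional[int], begun: Optional[int]) -> Optional[Tuple[int, str]]:
    """Pick the timestamp to display, plus a label for the tooltip."""
    if issued:
        return issued, "Issued"
    if begun:
        # Work servers that do not report an issue date: use the begin date.
        return begun, "Begun"
    return None
```

Returning the label alongside the timestamp lets the page tell the user which of the two values it is showing.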
Bugfix: fci-update-eoc-stats.pl
Fixed the return value and verbose output in save_file(), so the error condition of an unwritable file is properly handled. Also, when the stats for a user cannot be downloaded, the script now continues to the next username instead of quitting the program.
Not a bug, but a feature (although the lack of this feature could be considered a bug): I've added the --sleep <n>/-s <n> option. Use it to set the number of seconds to sleep between requests to the EOC server. The minimum sleep time is 1 second.
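For illustration, the option handling could look like this in Python (the real script is Perl). The default value and the choice to raise values below 1 to the minimum, rather than reject them, are assumptions of this sketch.

```python
import argparse
from typing import List


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Sketch of the --sleep/-s option")
    parser.add_argument("-s", "--sleep", type=int, default=1, metavar="n",
                        help="seconds to sleep between requests to the EOC server")
    args = parser.parse_args(argv)
    # Enforce the 1 second minimum so the EOC server is never hit back to back.
    args.sleep = max(1, args.sleep)
    return args
```

The download loop would then sleep for args.sleep seconds after each per-user request.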
Minor Changes
fci-update-xml-files.pl Added support for extracting the client type from the qd data, which was added in the qd released 5 December 2009 (fr 076). Previously only the client.cfg was used.
The unitinfo.txt is no longer parsed (this was done if certain values could not be parsed from the qd data), the file is now only provided for personal inspection if it was uploaded in the first place.
Arguments in the client.cfg (extra_parms) and those used on the commandline (logged in FAHlog.txt on Arguments: lines) are now merged instead of concatenated, so double arguments are no longer displayed in the web interface.
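The difference between merging and concatenating can be shown with a short Python sketch. It is not the FCI code: the function name is made up, and deduplicating token by token is a simplification that ignores flags taking a separate value.

```python
from typing import List


def merge_arguments(cfg_args: str, cmdline_args: str) -> List[str]:
    """Combine client.cfg and commandline arguments, keeping each token once."""
    merged: List[str] = []
    for arg in cfg_args.split() + cmdline_args.split():
        if arg not in merged:
            merged.append(arg)
    return merged
```

For example, merging "-smp -verbosity 9" from client.cfg with "-verbosity 9 -smp -bigadv" from the commandline gives -smp, -verbosity, 9 and -bigadv once each, where plain concatenation would list -smp and -verbosity 9 twice.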
fci-update-jmol-projects.pl Added mapping for new cores (A3 => GRO-A3, A4 => GRO-A4, PM => ProtoMol), and handle empty contact columns by setting the value to 'n/a' as used in FCI (Jmol uses 'NA').
convert-rrd.pl The script I used to convert all of FCI's RRDs from 32-bit to 64-bit is included in the scripts/extra/ directory of the FCI tarball. It may be useful to others who need to migrate RRDs created on a 32-bit server to a 64-bit one.
Since this release includes a new version of fci-client.pl, it's highly recommended that any FCI clients you admin are updated to this new release.
The latest FCI release (and previous releases) can be downloaded from the project website, or you can use the direct link.
For the next release, I'm planning to include support for the new (BigAdv) bonus scheme, but for this I need testers with dual quad machines to test an experimental qd release that I'm preparing for that. I'm hoping that jimerickson, Magic-Michael, and possibly others are willing to help out with testing this. Please get in touch via PM if you want to help test the new qd with bigadv support.
Beta team members using FCI should also get in touch via PM to get the super secret info on how to include the beta projects in FCI so the beta status is indicated on the project page. Since the other psummary pages are not public yet (although the links leaked in an early version of the new project summary pages), I assume this information is still considered for beta team members only.