
Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Sun Jan 31, 2010 5:25 pm
by ChelseaOilman
patonb wrote:See, FaHmon has those values.... That's what boggles me...
You're talking about 2 different programs written by 2 different programmers. They do things differently. What's so hard to understand about that? The programmer of the first client monitoring program we all used to use, Electron Microscope, got so fed up with the moving target that PandeGroup provides that he gave up. And IMHO, if you haven't made a PayPal contribution to any of these people spending their spare time writing these monitoring programs, you don't have the right to complain much. :wink:

You're probably better off posting your issues in the HFM.NET Google Group Harlam set up.
http://code.google.com/p/hfm-net/

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Tue Feb 02, 2010 9:26 am
by toTOW
I think I have a little bug with HFM... here is a screenshot:

Image

And the corresponding client log:

Code:

[03:50:40] *------------------------------*
[03:50:40] Folding@Home Gromacs Core
[03:50:40] Version 1.90 (March 8, 2006)
[03:50:40] 
[03:50:40] Preparing to commence simulation
[03:50:40] - Looking at optimizations...
[03:50:40] - Created dyn
[03:50:40] - Files status OK
[03:50:45] - Expanded 2201911 -> 15081709 (decompressed 684.9 percent)
[03:50:45] - Starting from initial work packet
[03:50:45] 
[03:50:45] Project: 2495 (Run 108, Clone 22, Gen 0)
[03:50:45] 
[03:50:45] Assembly optimizations on if available.
[03:50:45] Entering M.D.
[03:50:53] Protein: system
[03:50:53] 
[03:50:53] Writing local files
[03:50:56] Extra SSE boost OK.
[03:50:58] Writing local files
[03:50:58] Completed 0 out of 250000 steps  (0%)
[04:05:58] Timered checkpoint triggered.
[04:21:58] Timered checkpoint triggered.
[04:36:59] Timered checkpoint triggered.
[04:40:13] Writing local files
[04:40:13] Completed 2500 out of 250000 steps  (1%)
[04:55:14] Timered checkpoint triggered.
[05:11:15] Timered checkpoint triggered.
[05:26:15] Timered checkpoint triggered.
[05:29:27] Writing local files
[05:29:27] Completed 5000 out of 250000 steps  (2%)
[05:44:27] Timered checkpoint triggered.
[06:00:28] Timered checkpoint triggered.
[06:15:28] Timered checkpoint triggered.
[06:18:41] Writing local files
[06:18:42] Completed 7500 out of 250000 steps  (3%)
[06:33:43] Timered checkpoint triggered.
[06:49:42] Timered checkpoint triggered.
[07:02:06] - Autosending finished units... [February 2 07:02:06 UTC]
[07:02:06] Trying to send all finished work units
[07:02:06] + No unsent completed units remaining.
[07:02:06] - Autosend completed
[07:04:43] Timered checkpoint triggered.
[07:07:57] Writing local files
[07:07:57] Completed 10000 out of 250000 steps  (4%)
[07:22:58] Timered checkpoint triggered.
[07:38:58] Timered checkpoint triggered.
[07:53:59] Timered checkpoint triggered.
[07:57:11] Writing local files
[07:57:11] Completed 12500 out of 250000 steps  (5%)
[08:12:13] Timered checkpoint triggered.
[08:27:19] Timered checkpoint triggered.
[08:42:23] Timered checkpoint triggered.
[08:48:52] Writing local files
[08:48:52] Completed 15000 out of 250000 steps  (6%)
[09:03:53] Timered checkpoint triggered.
[09:18:53] Timered checkpoint triggered.
Any ideas?

Edit: I restarted the client, and that fixed it. I guess it's related to this part of the log from early this morning:

Code:

[03:35:45] - Calling '.\FahCore_b4.exe -dir work/ -suffix 08 -priority 96 -checkpoint 15 -verbose -lifeline 604 -version 623'

[03:35:51] CoreStatus = FF (255)
[03:35:51] Sending work to server
[03:35:51] Project: 10000 (Run 290, Clone 0, Gen 38)
[03:35:51] - Read packet limit of 540015616... Set to 524286976.
[03:35:51] - Error: Could not get length of results file work/wuresults_08.dat
[03:35:51] - Error: Could not read unit 08 file. Removing from queue.
[03:35:51] Trying to send all finished work units
[03:35:51] + No unsent completed units remaining.
[03:35:51] - Preparing to get new work unit...
[03:35:51] + Attempting to get work packet
[03:35:51] - Will indicate memory of 2038 MB
[03:35:51] - Connecting to assignment server
[03:35:51] Connecting to http://assign.stanford.edu:8080/
[03:35:52] Posted data.
[03:35:52] Initial: 4A81; - Successful: assigned to (129.74.85.48).
[03:35:52] + News From Folding@Home: Welcome to Folding@Home
[03:35:52] Loaded queue successfully.
[03:35:52] Connecting to http://129.74.85.48:8080/
[03:35:52] Posted data.
[03:35:52] Initial: 0000; - Receiving payload (expected size: 65023)
[03:35:53] - Downloaded at ~63 kB/s
[03:35:53] - Averaged speed for that direction ~123 kB/s
[03:35:53] + Received work.
[03:35:53] Trying to send all finished work units
[03:35:53] + No unsent completed units remaining.
[03:35:53] + Closed connections
[03:35:58] 
[03:35:58] + Processing work unit
[03:35:58] Core required: FahCore_b4.exe
[03:35:58] Core found.
[03:35:58] Working on queue slot 09 [February 2 03:35:58 UTC]
[03:35:58] + Working ...
[03:35:58] - Calling '.\FahCore_b4.exe -dir work/ -suffix 09 -priority 96 -checkpoint 15 -verbose -lifeline 604 -version 623'

[03:36:02] CoreStatus = FF (255)
[03:36:02] Sending work to server
[03:36:02] Project: 10000 (Run 1497, Clone 0, Gen 4)
[03:36:02] - Read packet limit of 540015616... Set to 524286976.
[03:36:02] - Error: Could not get length of results file work/wuresults_09.dat
[03:36:02] - Error: Could not read unit 09 file. Removing from queue.
[03:36:02] Trying to send all finished work units
[03:36:02] + No unsent completed units remaining.
[...]
[03:46:58] - Preparing to get new work unit...
[03:46:58] + Attempting to get work packet
[03:46:58] - Will indicate memory of 2038 MB
[03:46:58] - Connecting to assignment server
[03:46:58] Connecting to http://assign.stanford.edu:8080/
[03:47:03] Posted data.
[03:47:03] Initial: 4A81; - Successful: assigned to (129.74.85.48).
[03:47:03] + News From Folding@Home: Welcome to Folding@Home
[03:47:03] Loaded queue successfully.
[03:47:03] Connecting to http://129.74.85.48:8080/
[03:47:04] Posted data.
[03:47:04] Initial: 0000; - Receiving payload (expected size: 64747)
[03:47:04] Conversation time very short, giving reduced weight in bandwidth avg
[03:47:04] - Downloaded at ~126 kB/s
[03:47:04] - Averaged speed for that direction ~78 kB/s
[03:47:04] + Received work.
[03:47:04] Trying to send all finished work units
[03:47:04] + No unsent completed units remaining.
[03:47:04] + Closed connections
[03:47:09] 
[03:47:09] + Processing work unit
[03:47:09] Core required: FahCore_b4.exe
[03:47:09] Core found.
[03:47:09] Working on queue slot 09 [February 2 03:47:09 UTC]
[03:47:09] + Working ...
[03:47:09] - Calling '.\FahCore_b4.exe -dir work/ -suffix 09 -priority 96 -checkpoint 15 -verbose -lifeline 604 -version 623'

[03:47:14] CoreStatus = FF (255)
[03:47:14] Sending work to server
[03:47:14] Project: 10002 (Run 2, Clone 0, Gen 15)
[03:47:14] - Read packet limit of 540015616... Set to 524286976.
[03:47:14] - Error: Could not get length of results file work/wuresults_09.dat
[03:47:14] - Error: Could not read unit 09 file. Removing from queue.
[03:47:14] Trying to send all finished work units
[03:47:14] + No unsent completed units remaining.
[03:47:14] - Preparing to get new work unit...
[03:47:14] + Attempting to get work packet

P5781 give wrong PPD in HFM.net

Posted: Sun Feb 07, 2010 6:14 pm
by P5-133XL
There is something wrong with the psummary for p5781. They all give me 1M+ PPD in HFM.net. Theoretically it could be an HFM.net issue, but it doesn't happen with any other projects, so I'm assuming that the problem is with psummary rather than HFM.net.

Re: P5781 give wrong PPD in HFM.net

Posted: Sun Feb 07, 2010 8:19 pm
by bruce
I see nothing wrong with psummary.

The PPD calculation is only accurate if HFM.net can estimate the total time to complete the WU. P5781 is a non-bonus WU, so the calculation should be easy. Look at the times for, say, 10 frames of 1%. Multiply by 10 to get the estimated time for 100% and convert to decimal days. Divide the points for that WU (in this case, 783) by the decimal days to get a PPD.
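
As a rough sketch of that arithmetic (Python, purely illustrative; the function name and the 15-minute sample time are made up for the example, only the 783 points come from the post above):

```python
SECONDS_PER_DAY = 86_400


def estimate_ppd(points, sample_seconds, sample_frames=10, total_frames=100):
    """Estimate points per day (PPD) for a non-bonus WU.

    points         -- credit for the WU (783 for P5781, per the post above)
    sample_seconds -- wall-clock time taken by `sample_frames` frames of 1% each
    """
    if sample_seconds <= 0 or sample_frames <= 0:
        raise ValueError("sample time and frame count must be positive")
    # Scale the sample up to the whole WU (10 frames x 10 = 100%).
    wu_seconds = sample_seconds * total_frames / sample_frames
    # Convert to decimal days, then divide the points by that.
    wu_days = wu_seconds / SECONDS_PER_DAY
    return points / wu_days


# Hypothetical sample: 10 frames took 15 minutes (900 seconds).
print(round(estimate_ppd(783, 15 * 60), 1))  # 7516.8
```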

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 2:23 am
by tonic
Is the bonus calculation for the A3 WUs not correct? I'm doing around 3min per frame on Project 6012 (470 pointer) and it's saying my PPD is only 1728 which seems incredibly low. Guessing it's not calculating the bonus? This is on HFM.net 0.4.8 Beta.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 2:47 am
by harlam357
To everyone... sorry I haven't been around. The best way to find me is posting in the HFM Google Group.

http://groups.google.com/group/hfm-net
bollix47 wrote:EDIT: Managed to get past the above by installing mono 2.4 from: http://directhex.mfgames.com/hardy.html
It's still a little fragile .... e.g. if I click on any of clients 9 thru 12(sorted by ETA) the program immediately closes .... doesn't do that for clients 1 thru 8. Anyway, at least it now does come up and I was able to import the clients from FahMon and the bonus is being calculated on the a3 and bigadv a2 clients which is what I was looking for.
Yeah, Mono just doesn't "work" like the .NET Framework. I'm going to be buying a package soon that will allow me to test and debug directly on Mono/Linux right from Visual Studio. This should go a long way in making Mono/Linux support better in future releases.

Have you tried Mono 2.6? It was released recently. Alas, I haven't had time. One of the fixes on the list was much improved DataGridView support, which is the control HFM uses to display clients... so I'm crossing my fingers.
patonb wrote:
Nathan_P wrote:
They haven't been listed on Psummary yet
If it wasn't, then explain why fahmon had it all correctly? Took 3hrs for it to come out in hfm after fahmon displayed it.
bruce wrote:The psummary data is updated dynamically so if a project is available for a while and then the server goes down or for some other reason the project is suspended, the data will no longer appear on-line. I think that FahMon stores the information locally so it keeps it even after the project has finished. (HFM may or may not do the same thing.) In any case, you won't have the information in HFM unless you update the data at a time when the project is showing in psummary.
HFM does cache Project information. Please see the following post in the HFM Google Group.

http://groups.google.com/group/hfm-net/ ... d6577853e#
AZBrandon wrote:Thanks ChasR, once I got a new WU, it was back to reporting all information again. Seems to be something that may work or fail based on WU, but hey - good to hear the developer still updates the code. It seemed like fahmon was pretty much abandoned at this point.
Zagen30 wrote:Forgive me if this has been mentioned before- I did look to see if anyone else had reported it and didn't find anything.

HFM can get confused if the client starts up and tries to send a previously completed WU. For instance, earlier today my GPU client wouldn't send a WU, and this is what happened when I started the client up next:

The 5767 was the one that had already been completed, and the client was working on the 5781. HFM thought that the client was doing a 5767, and as such the PPD calculation was way off (of course, since the low 5780's aren't in the psummary, it now tells me I'm getting 0 PPD, but that's not HFM's fault). I unfortunately didn't get a screenshot.

*EDIT*

After checking the Google group Harlam has, it seems he already knows about this bug.
To AZBrandon and Zagen30...

HFM parses pretty much the entire FAHlog file on refresh, and when it does it tries to determine the boundaries of where WUs begin. This is a tough chore, as you can see, and a much more "detailed" approach than any other application has taken, as I think most others just parse the end of the FAHlog file. There are so many variations to the logs and it's unpredictable at times. I'd have to say that users run into this kind of issue 1 out of 100 times. So no, it's not perfect. All I can do is fix situations when they are brought to my attention. Next time you have one of these issues, please send me your log files (harlam357 at gmail dot com). I'm going to implement a feature that will make sending me logs for errant clients somewhat automatic, but it's going to take some time.

Thanks for that log snippet Zagen30. I'll fix that issue and add it to my test suite. To be honest, I've never seen the client try to autosend at that juncture. So yes, in that case, at that point in the log, HFM captures the first "Project: " string it sees and it thinks that is the Project.
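
A minimal sketch of one way to avoid that (not HFM's actual code, just an illustration in Python): skip any "Project: " line that directly follows "Sending work to server", since that one belongs to the unit being uploaded. The function name and the assumption that those two lines are always adjacent are mine; the sample lines are stitched together from the two log excerpts posted earlier in this thread.

```python
import re

# Matches lines such as "Project: 2495 (Run 108, Clone 22, Gen 0)".
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def current_project(log_lines):
    """Return (project, run, clone, gen) of the WU being worked on, or None.

    A "Project: " line that immediately follows "Sending work to server"
    is treated as describing the unit being uploaded and is ignored.
    """
    result = None
    previous_was_send = False
    for line in log_lines:
        match = PROJECT_RE.search(line)
        if match and not previous_was_send:
            # Later matches overwrite earlier ones, so the last WU wins.
            result = tuple(int(group) for group in match.groups())
        previous_was_send = "Sending work to server" in line
    return result


sample_lines = [
    "[03:35:51] Sending work to server",
    "[03:35:51] Project: 10000 (Run 290, Clone 0, Gen 38)",
    "[03:50:45] Project: 2495 (Run 108, Clone 22, Gen 0)",
]
print(current_project(sample_lines))  # (2495, 108, 22, 0)
```

Taking the first "Project: " line it sees would report 10000 here. Real logs have more variations than this handles, so treat it as a starting point only.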
P5-133XL wrote:There is something wrong with the psummary for p5781. They all give me 1M+ PPD in HFM.net. Theoretically it could be an HFM.net issue, but it doesn't happen with any other projects, so I'm assuming that the problem is with psummary rather than HFM.net.
I had another user have this issue with his GPUs. He said he had some faulty information from the psummary. Not sure if the psummary was actually gunked for a short period or what was going on. If you're still having the issue, can you check the benchmarks for the project in question and see if the k factor value is greater than 0?

Otherwise, please see this post in the HFM Google Group and try replacing the ProjectInfo.tab file manually.

http://groups.google.com/group/hfm-net/ ... d6577853e#

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 2:51 am
by patonb
Thanks for the answers.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 3:30 am
by P5-133XL
I did check the benchmarks for p5781 and found the k-factor to be 5782. I deleted the ProjectInfo.tab file, re-downloaded the psummary and no more problems. Thanks ChasR for the PM and Harlam357 for replying here.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 12:26 pm
by ChasR
ChasR, :D

You're Welcome.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Wed Feb 10, 2010 7:01 pm
by harlam357
P5-133XL wrote:I did check the benchmarks for p5781 and found the k-factor to be 5782. I deleted the ProjectInfo.tab file, re-downloaded the psummary and no more problems. Thanks ChasR for the PM and Harlam357 for replying here.
Something must have been up with the psummary page for a while.... obviously the k factor for p5781 should be zero and somehow it got parsed out as the next project number. Will make some further strides to validate my parsing routine in upcoming releases.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Thu Feb 11, 2010 9:59 am
by bollix47
harlam357 wrote:Yeah, Mono just doesn't "work" like the .NET Framework. I'm going to be buying a package soon that will allow me to test and debug directly on Mono/Linux right from Visual Studio. This should go a long way in making Mono/Linux support better in future releases.

Have you tried Mono 2.6? It was released recently. Alas, I haven't had time. One of the fixes on the list was much improved DataGridView support, which is the control HFM uses to display clients... so I'm crossing my fingers.
Thanks for the 2.6 suggestion. It works much better. However, there are still times when the program crashes if I click on a different client. As long as I don't, the program appears to be stable indefinitely.

Here's the terminal info after a crash:

Code:

System.NullReferenceException: Object reference not set to an instance of an object
  at System.Windows.Forms.Document.AlignCaret (Boolean changeCaretTag) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.RichTextBox.set_SelectionColor (Color value) [0x00000] in <filename unknown>:0 
  at HFM.Classes.RichTextBoxWrapper.DoLineHighlight (Int32 lineIndex, Color color) [0x00000] in <filename unknown>:0 
  at HFM.Classes.RichTextBoxWrapper.HighlightLines () [0x00000] in <filename unknown>:0 
  at (wrapper remoting-invoke-with-check) HFM.Classes.RichTextBoxWrapper:HighlightLines ()
  at HFM.Forms.frmMain.PreferenceSet_ColorLogFileChanged (System.Object sender, System.EventArgs e) [0x00000] in <filename unknown>:0 
  at HFM.Forms.frmMain.SetLogLines (HFM.Instances.ClientInstance Instance, IList`1 logLines) [0x00000] in <filename unknown>:0 
  at HFM.Forms.frmMain.queueControl_QueueIndexChanged (System.Object sender, HFM.Classes.QueueIndexChangedEventArgs e) [0x00000] in <filename unknown>:0 
  at HFM.Classes.QueueControl.OnQueueIndexChanged (HFM.Classes.QueueIndexChangedEventArgs e) [0x00000] in <filename unknown>:0 
  at HFM.Classes.QueueControl.cboQueueIndex_SelectedIndexChanged (System.Object sender, System.EventArgs e) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.ComboBox.OnSelectedIndexChanged (System.EventArgs e) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.ComboBox.SetSelectedIndex (Int32 value, Boolean supressAutoScroll) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.ComboBox.set_SelectedIndex (Int32 value) [0x00000] in <filename unknown>:0 
  at HFM.Classes.QueueControl.SetQueue (IQueueBase qBase, ClientType type, Boolean vm) [0x00000] in <filename unknown>:0 
  at (wrapper remoting-invoke-with-check) HFM.Classes.QueueControl:SetQueue (HFM.Framework.IQueueBase,HFM.Framework.ClientType,bool)
  at HFM.Forms.frmMain.ClientInstances_SelectedInstanceChanged (System.Object sender, System.EventArgs e) [0x00000] in <filename unknown>:0 
  at HFM.Instances.InstanceCollection.OnSelectedInstanceChanged (System.EventArgs e) [0x00000] in <filename unknown>:0 
  at HFM.Instances.InstanceCollection.set_SelectedInstance (HFM.Instances.ClientInstance value) [0x00000] in <filename unknown>:0 
  at HFM.Instances.InstanceCollection.SetCurrentInstance (IList SelectedClients) [0x00000] in <filename unknown>:0 
  at HFM.Forms.frmMain.<.ctor>b__3 (System.Object , System.Windows.Forms.DataGridViewCellEventArgs ) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.DataGridView.OnRowEnter (System.Windows.Forms.DataGridViewCellEventArgs e) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.DataGridView.SetCurrentCellAddressCore (Int32 columnIndex, Int32 rowIndex, Boolean setAnchorCellAddress, Boolean validateCurrentCell, Boolean throughMouseClick) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.DataGridView.OnMouseDown (System.Windows.Forms.MouseEventArgs e) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.Control.WmLButtonDown (System.Windows.Forms.Message& m) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.Control.WndProc (System.Windows.Forms.Message& m) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.DataGridView.WndProc (System.Windows.Forms.Message& m) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.Control+ControlWindowTarget.OnMessage (System.Windows.Forms.Message& m) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.Control+ControlNativeWindow.WndProc (System.Windows.Forms.Message& m) [0x00000] in <filename unknown>:0 
  at System.Windows.Forms.NativeWindow.WndProc (IntPtr hWnd, Msg msg, IntPtr wParam, IntPtr lParam) [0x00000] in <filename unknown>:0 
My main reason for trying was to get the bonus info for PPD and Credit. Both features appear to be working correctly for all clients.

Thanks for HFM. :ewink:

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Thu Feb 11, 2010 7:32 pm
by harlam357
Thanks for that stack trace. It will prove invaluable, I'm sure.

I've entered an Issue on this and will attempt to resolve it before the 0.5.0 release.

Thank you for using HFM!

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Fri Feb 19, 2010 7:19 pm
by JPinTO
I'm running HFM.net v0.4.8 - Beta. On fast i7 machines, the GRO-A3 cores are reporting much higher than actual PPD and credits.

For instance, I have a machine on P6021 with frame time: 03:13 and it's reporting 12748 credit and 57070 PPD.

The calculator http://www.linuxforge.net/bonuscalc2.php shows 3574 credit and 16000 PPD for P6021 at a frame time of 03:13.

Please advise.

- JP

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Fri Feb 19, 2010 7:39 pm
by ChasR
It isn't happening on all i7s. Mine is showing 3:02/frame, 17,363 ppd, 3658 credit, which is pretty close to correct. Delete the Projectinfo.tab file and download projects from Stanford. I think that will fix things for you.

Re: HFM.NET - Client Monitoring Application for Folding@Home

Posted: Sat Feb 20, 2010 4:15 pm
by JPinTO
I tried that, but I'm still showing 62000 PPD on a 03:13 frame time. Thanks for the suggestion.

- JP