Posted data, but not credited

Moderators: Site Moderators, FAHC Science Team

hrsetrdr
Posts: 112
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Posted data, but not credited

Post by hrsetrdr »

Code: Select all

[16:41:03] Project: 6900 (Run 54, Clone 13, Gen 48)


[16:41:03] + Attempting to send results [January 21 16:41:03 UTC]
[16:41:03] - Reading file work/wuresults_08.dat from core
[16:41:03]   (Read 100191141 bytes from disk)
[16:41:03] Connecting to http://130.237.232.141:80/
[16:59:17] Posted data.
[16:59:17] Initial: 0000; - Uploaded at ~89 kB/s
[16:59:17] - Averaged speed for that direction ~89 kB/s
[16:59:17] - Server reports problem with unit.
This WU hasn't shown up in my stats yet, and I'm wondering why. I don't recall seeing the message "Server reports problem with unit" before. I did 'transplant' this WU from another machine; perhaps I forgot to delete the machinedependent.dat from the original machine?
Folding rig:Supermicro X9DRD-7LN4F-JBOD | (2) Xeon E5-2670 | 128GB DDR3 ECC Registered

Install Folding@Home on Linux without Python dependancy issues
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Posted data, but not credited

Post by bruce »

The message "Server reports problem with unit." is another way of saying "The WU you just submitted contains corrupt data and we can't give you credit for it." :(

There's a good chance that the corruption happened when you moved and restarted the WU but somehow the corruption wasn't detected at that time. In other words, it's probably related to your invalid checkpoint problem.
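For anyone who wants to check their own logs for this message, here is a minimal Python sketch that pairs the rejection with the work unit it refers to. It assumes the v6 client log format shown in this thread and that the most recent "Project:" line identifies the unit being uploaded; the function name `find_rejected_uploads` is made up for illustration and is not part of the client.

```python
import re

# Matches lines such as "[16:41:03] Project: 6900 (Run 54, Clone 13, Gen 48)".
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
REJECTED = "Server reports problem with unit"


def find_rejected_uploads(log_text):
    """Return (project, run, clone, gen) tuples for uploads the server rejected.

    Assumption: the most recent "Project:" line before the rejection
    message identifies the work unit, as in the log excerpt in this thread.
    """
    rejected = []
    last_project = None
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            last_project = tuple(int(group) for group in match.groups())
        elif REJECTED in line and last_project is not None:
            rejected.append(last_project)
    return rejected


sample = """\
[16:41:03] Project: 6900 (Run 54, Clone 13, Gen 48)
[16:41:03] + Attempting to send results [January 21 16:41:03 UTC]
[16:59:17] Posted data.
[16:59:17] - Server reports problem with unit.
"""
print(find_rejected_uploads(sample))
```

This only tells you which unit was rejected. It cannot tell you why, since the log does not record the reason.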
hrsetrdr
Posts: 112
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Re: Posted data, but not credited

Post by hrsetrdr »

bruce wrote:The message "Server reports problem with unit." is another way of saying "The WU you just submitted contains corrupt data and we can't give you credit for it." :(

There's a good chance that the corruption happened when you moved and restarted the WU but somehow the corruption wasn't detected at that time. In other words, it's probably related to your invalid checkpoint problem.

This is a different WU, not the one that suffered the invalid checkpoint issue. Here is the entire log for the session:

Code: Select all

--- Opening Log file [January 21 03:27:56 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/tnthomas/fah
Executable: ./fah6
Arguments: -smp 8 -bigbeta -verbosity 9 -oneunit 

[03:27:56] - Ask before connecting: No
[03:27:56] - User name: hrsetrdr (Team 32)
[03:27:56] - User ID not found locally
[03:27:56] + Requesting User ID from server
[03:27:56] - Getting ID from AS: 
[03:27:56] Connecting to http://assign.stanford.edu:8080/
[03:28:17] - Couldn't send HTTP request to server
[03:28:17] + Could not connect to Primary Assignment Server for ID
[03:28:17] Connecting to http://assign2.stanford.edu:80/
[03:28:17] Posted data.
[03:28:17] Initial: 3B0C; - Received User ID = C3B82FE736D98ED
[03:28:17] - Machine ID: 7
[03:28:17] 
[03:28:17] Loaded queue successfully.
[03:28:17] 
[03:28:17] - Autosending finished units... [January 21 03:28:17 UTC]
[03:28:17] + Processing work unit
[03:28:17] Trying to send all finished work units
[03:28:17] Core required: FahCore_a5.exe
[03:28:17] + No unsent completed units remaining.
[03:28:17] Core found.
[03:28:17] - Autosend completed
[03:28:17] Working on queue slot 08 [January 21 03:28:17 UTC]
[03:28:17] + Working ...
[03:28:17] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 08 -np 8 -checkpoint 15 -verbose -lifeline 2692 -version 634'

[03:28:17] 
[03:28:17] *------------------------------*
[03:28:17] Folding@Home Gromacs SMP Core
[03:28:17] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[03:28:17] 
[03:28:17] Preparing to commence simulation
[03:28:17] - Ensuring status. Please wait.
[03:28:27] - Looking at optimizations...
[03:28:27] - Working with standard loops on this execution.
[03:28:27] - Previous termination of core was improper.
[03:28:27] - Files status OK
[03:28:29] - Expanded 24875109 -> 30796292 (decompressed 123.8 percent)
[03:28:29] Called DecompressByteArray: compressed_data_size=24875109 data_size=30796292, decompressed_data_size=30796292 diff=0
[03:28:29] - Digital signature verified
[03:28:29] 
[03:28:29] Project: 6900 (Run 54, Clone 13, Gen 48)
[03:28:29] 
[03:28:29] Entering M.D.
[03:28:35] Using Gromacs checkpoints
[03:28:36] Mapping NT from 8 to 8 
[03:28:41] Resuming from checkpoint
[03:28:41] Verified work/wudata_08.log
[03:28:41] Verified work/wudata_08.trr
[03:28:42] Verified work/wudata_08.xtc
[03:28:42] Verified work/wudata_08.edr
[03:28:43] Completed 203580 out of 250000 steps  (81%)
[03:53:13] Completed 205000 out of 250000 steps  (82%)
[04:35:48] Completed 207500 out of 250000 steps  (83%)
[05:18:32] Completed 210000 out of 250000 steps  (84%)
[06:01:13] Completed 212500 out of 250000 steps  (85%)
[06:43:54] Completed 215000 out of 250000 steps  (86%)
[07:26:36] Completed 217500 out of 250000 steps  (87%)
[08:09:12] Completed 220000 out of 250000 steps  (88%)
[08:51:53] Completed 222500 out of 250000 steps  (89%)
[09:28:17] - Autosending finished units... [January 21 09:28:17 UTC]
[09:28:17] Trying to send all finished work units
[09:28:17] + No unsent completed units remaining.
[09:28:17] - Autosend completed
[09:34:27] Completed 225000 out of 250000 steps  (90%)
[10:17:09] Completed 227500 out of 250000 steps  (91%)
[10:59:45] Completed 230000 out of 250000 steps  (92%)
[11:42:19] Completed 232500 out of 250000 steps  (93%)
[12:24:55] Completed 235000 out of 250000 steps  (94%)
[13:07:34] Completed 237500 out of 250000 steps  (95%)
[13:50:14] Completed 240000 out of 250000 steps  (96%)
[14:32:54] Completed 242500 out of 250000 steps  (97%)
[15:15:33] Completed 245000 out of 250000 steps  (98%)
[15:28:17] - Autosending finished units... [January 21 15:28:17 UTC]
[15:28:17] Trying to send all finished work units
[15:28:17] + No unsent completed units remaining.
[15:28:17] - Autosend completed
[15:58:09] Completed 247500 out of 250000 steps  (99%)
[16:40:36] Completed 250000 out of 250000 steps  (100%)
[16:40:45] DynamicWrapper: Finished Work Unit: sleep=10000
[16:40:55] 
[16:40:55] Finished Work Unit:
[16:40:55] - Reading up to 52713120 from "work/wudata_08.trr": Read 52713120
[16:40:55] trr file hash check passed.
[16:40:55] - Reading up to 47102852 from "work/wudata_08.xtc": Read 47102852
[16:40:56] xtc file hash check passed.
[16:40:56] edr file hash check passed.
[16:40:56] logfile size: 205221
[16:40:56] Leaving Run
[16:40:56] - Writing 100191141 bytes of core data to disk...
[16:40:56]   ... Done.
[16:41:02] - Shutting down core
[16:41:02] 
[16:41:02] Folding@home Core Shutdown: FINISHED_UNIT
[16:41:03] CoreStatus = 64 (100)
[16:41:03] Sending work to server
[16:41:03] Project: 6900 (Run 54, Clone 13, Gen 48)


[16:41:03] + Attempting to send results [January 21 16:41:03 UTC]
[16:41:03] - Reading file work/wuresults_08.dat from core
[16:41:03]   (Read 100191141 bytes from disk)
[16:41:03] Connecting to http://130.237.232.141:80/
[16:59:17] Posted data.
[16:59:17] Initial: 0000; - Uploaded at ~89 kB/s
[16:59:17] - Averaged speed for that direction ~89 kB/s
[16:59:17] - Server reports problem with unit.
[16:59:17] Trying to send all finished work units
[16:59:17] + No unsent completed units remaining.
[16:59:17] + -oneunit flag given and have now finished a unit. Exiting.- Preparing to get new work unit...
[16:59:17] Cleaning up work directory
[16:59:17] ***** Got a SIGTERM signal (15)
[16:59:17] Killing all core threads
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Posted data, but not credited

Post by bruce »

hrsetrdr wrote:This is a different WU, not the one that suffered the invalid checkpoint issue. Here is the entire log for the session:

Code: Select all

...
[03:28:41] Resuming from checkpoint
[03:28:41] Verified work/wudata_08.log
[03:28:41] Verified work/wudata_08.trr
[03:28:42] Verified work/wudata_08.xtc
[03:28:42] Verified work/wudata_08.edr
[03:28:43] Completed 203580 out of 250000 steps  (81%)
...
Not really. If it's the entire log, why does it start at 81%? ;)

My point is that if corruption occurred when you shut down at 81%, and the "Verified work/wudata_08.*" checks failed to catch it when you restarted, it MIGHT be related. All we really know about this WU, though, is that the server detected corruption.
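A quick way to see whether a log covers a WU from the beginning is to look at its first progress line. The Python sketch below does that, assuming the "Completed N out of M steps" format shown above; `first_progress` and `starts_mid_unit` are illustrative names, not client functions.

```python
import re

# Matches lines such as "[03:28:43] Completed 203580 out of 250000 steps  (81%)".
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")


def first_progress(log_text):
    """Return (completed, total) from the first progress line, or None."""
    match = PROGRESS_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def starts_mid_unit(log_text):
    """True if the first progress line shows the WU was already under way."""
    progress = first_progress(log_text)
    return progress is not None and progress[0] > 0


sample = """\
[03:28:41] Resuming from checkpoint
[03:28:43] Completed 203580 out of 250000 steps  (81%)
[03:53:13] Completed 205000 out of 250000 steps  (82%)
"""
print(first_progress(sample), starts_mid_unit(sample))
```

If `starts_mid_unit` returns True, the earlier part of the WU ran in a different session (or on a different machine), and that session's log would be needed to see the whole history.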
ChasR
Posts: 402
Joined: Sun Dec 02, 2007 5:36 am
Location: Atlanta, GA

Re: Posted data, but not credited

Post by ChasR »

Moving a WU from one machine to another has, for some time, resulted in the server reporting a problem with the WU. In my recent experience, this happens every time a WU is transferred from one machine to another. There may be a workaround, but I don't do it often enough to retain the memory. I assumed it was caused by server code to prevent folks from cherry-picking on one machine and completing them on another via sneakernetting. Is this not the case?
ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Posted data, but not credited

Post by ChelseaOilman »

How did you get 80% of the way through a WU without having a user ID#?
[03:27:56] - User ID not found locally
[03:27:56] + Requesting User ID from server
[03:27:56] - Getting ID from AS:
[03:27:56] Connecting to http://assign.stanford.edu:8080/
[03:28:17] - Couldn't send HTTP request to server
[03:28:17] + Could not connect to Primary Assignment Server for ID
[03:28:17] Connecting to http://assign2.stanford.edu:80/
[03:28:17] Posted data.
[03:28:17] Initial: 3B0C; - Received User ID = C3B82FE736D98ED
If that isn't the user ID# that downloaded the WU, you're unlikely to receive any points for it.
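To spot this in a log without reading every line, something like the following Python sketch could help. It assumes the exact message wording quoted above; `identity_events` is a hypothetical helper name.

```python
import re

NO_LOCAL_ID = "User ID not found locally"
# Matches lines such as "... - Received User ID = C3B82FE736D98ED".
RECEIVED_ID_RE = re.compile(r"Received User ID = ([0-9A-F]+)")


def identity_events(log_text):
    """Summarise User ID events in a client log.

    requested_new_id -- True if the client reported no local User ID
    received_ids     -- every User ID the log says was received
    """
    return {
        "requested_new_id": NO_LOCAL_ID in log_text,
        "received_ids": RECEIVED_ID_RE.findall(log_text),
    }


sample = """\
[03:27:56] - User ID not found locally
[03:27:56] + Requesting User ID from server
[03:28:17] Initial: 3B0C; - Received User ID = C3B82FE736D98ED
"""
print(identity_events(sample))
```

A log alone only shows that an ID was re-requested. Whether the new ID matches the one that downloaded the WU would have to be checked against the log from the original machine.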
hrsetrdr
Posts: 112
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Re: Posted data, but not credited

Post by hrsetrdr »

ChasR wrote:Moving a WU from one machine to another has, for some time, resulted in the server reporting a problem with the WU. In my recent experience, this happens every time a WU is transferred from one machine to another. There may be a workaround, but I don't do it often enough to retain the memory. I assumed it was caused by server code to prevent folks from cherry-picking on one machine and completing them on another via sneakernetting. Is this not the case?
No 'cherrypicking', but yes, sneakernetting: I moved a WU off machine A, which was due for maintenance, over to machine B to finish and upload it. As for the user ID, it used to be a trouble-avoidance practice to delete machinedependent.dat before starting the WU on the destination machine; perhaps this is no longer desirable.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Posted data, but not credited

Post by 7im »

With the newer security measures (in effect for months, if not more than a year), the work unit must be returned with the same Machine ID as the Machine ID that downloaded the work unit. If not, no bonus for sure, and you might not get any points. Deleting that .dat file has the opposite effect.

To receive full credit, the Machine ID must be the same when downloading the WU as when uploading the finished WU.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
hrsetrdr
Posts: 112
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Re: Posted data, but not credited

Post by hrsetrdr »

Thanks 7im, guess I'm a bit out-of-date along those lines; sure won't be doing that again. :roll: