GPU WU Assignment Error

Posted: Wed Jun 24, 2009 4:29 pm
by Oldhat
Hi,

Feel free to move this to a more appropriate location, if needed.

I wanted to advise that I received the same work unit at approximately the same time on two clients.
Just to throw some more confusion into the mix, I received it on each GPU of a gtx295.

I imagine that it is purely a "blue moon" event, but thought I would bring it to your attention anyway.

e8200, 4096MB RAM, nVidia gtx295 dual GPU, Windows XP SP3, nVidia 185.85, GPU Systray 6.23

Code:

--- Opening Log file [June 14 12:42:54 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\user\Application Data\Folding@home-gpu
Arguments: -gpu 0 

[12:42:54] - Ask before connecting: No
[12:42:54] - User name: iCiT (Team 24)

              --------------- SNIP ---------------

[13:07:49] Completed 96%
[13:08:37] Completed 97%
[13:09:24] Completed 98%
[13:10:12] Completed 99%
[13:11:00] Completed 100%
[13:11:00] Successful run
[13:11:00] DynamicWrapper: Finished Work Unit: sleep=10000
[13:11:10] Reserved 79052 bytes for xtc file; Cosm status=0
[13:11:10] Allocated 79052 bytes for xtc file
[13:11:10] - Reading up to 79052 from "work/wudata_06.xtc": Read 79052
[13:11:10] Read 79052 bytes from xtc file; available packet space=786351412
[13:11:10] xtc file hash check passed.
[13:11:10] Reserved 23472 23472 786351412 bytes for arc file=<work/wudata_06.trr> Cosm status=0
[13:11:10] Allocated 23472 bytes for arc file
[13:11:10] - Reading up to 23472 from "work/wudata_06.trr": Read 23472
[13:11:10] Read 23472 bytes from arc file; available packet space=786327940
[13:11:10] trr file hash check passed.
[13:11:10] Allocated 560 bytes for edr file
[13:11:10] Read bedfile
[13:11:10] edr file hash check passed.
[13:11:10] Allocated 31088 bytes for logfile
[13:11:10] Read logfile
[13:11:10] GuardedRun: success in DynamicWrapper
[13:11:10] GuardedRun: done
[13:11:10] Run: GuardedRun completed.
[13:11:15] - Writing 134684 bytes of core data to disk...
[13:11:15] Done: 134172 -> 111277 (compressed to 82.9 percent)
[13:11:15]   ... Done.
[13:11:15] - Shutting down core 
[13:11:15] 
[13:11:15] Folding@home Core Shutdown: FINISHED_UNIT
[13:11:19] CoreStatus = 64 (100)
[13:11:19] Sending work to server
[13:11:19] Project: 5758 (Run 10, Clone 892, Gen 5)
[13:11:19] - Read packet limit of 540015616... Set to 524286976.


[13:11:19] + Attempting to send results [June 24 13:11:19 UTC]
[13:11:22] + Results successfully sent
[13:11:22] Thank you for your contribution to Folding@Home.
[13:11:22] + Number of Units Completed: 162

[13:11:26] - Preparing to get new work unit...
[13:11:26] + Attempting to get work packet
[13:11:26] - Connecting to assignment server
[13:11:27] - Successful: assigned to (171.64.65.20).
[13:11:27] + News From Folding@Home: Welcome to Folding@Home
[13:11:27] Loaded queue successfully.
[13:11:28] + Closed connections
[13:11:28] 
[13:11:28] + Processing work unit
[13:11:28] Core required: FahCore_14.exe
[13:11:28] Core found.
[13:11:28] Working on queue slot 07 [June 24 13:11:28 UTC]
[13:11:28] + Working ...
[13:11:29] 
[13:11:29] *------------------------------*
[13:11:29] Folding@Home GPU Core - Beta
[13:11:29] Version 1.25 (Mon Mar 2 19:49:32 PST 2009)
[13:11:29] 
[13:11:29] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[13:11:29] Build host: vspm46
[13:11:29] Board Type: Nvidia
[13:11:29] Core      : 
[13:11:29] Preparing to commence simulation
[13:11:29] - Looking at optimizations...
[13:11:29] - Created dyn
[13:11:29] - Files status OK
[13:11:29] - Expanded 68628 -> 357580 (decompressed 521.0 percent)
[13:11:29] Called DecompressByteArray: compressed_data_size=68628 data_size=357580, decompressed_data_size=357580 diff=0
[13:11:29] - Digital signature verified
[13:11:29] 
[13:11:29] Project: 5911 (Run 8, Clone 29, Gen 17)
[13:11:29] 
[13:11:29] Assembly optimizations on if available.
[13:11:29] Entering M.D.
[13:11:35] Tpr hash work/wudata_07.tpr:  1071287597 3807907054 2957579036 2370395049 2009215850
[13:11:35] Working on Protein
[13:11:36] Client config found, loading data.
[13:11:36] Starting GUI Server
[13:15:30] Completed 1%
[13:19:39] Completed 2%
[13:23:29] Completed 3%
[13:27:27] Completed 4%

              --------------- SNIP ---------------

[15:39:58] Completed 36%
[15:44:06] Completed 37%
[15:48:14] Completed 38%

Folding@Home Client Shutdown.
The details from the other log:

Code:

--- Opening Log file [June 14 12:42:58 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\user\Application Data\Folding@home-gpu 2
Arguments: -gpu 1 

[12:42:58] - Ask before connecting: No
[12:42:58] - User name: iCiT (Team 24)

              --------------- SNIP ---------------

[13:06:49] Completed 95%
[13:07:37] Completed 96%
[13:08:25] Completed 97%
[13:09:13] Completed 98%
[13:10:01] Completed 99%
[13:10:49] Completed 100%
[13:10:49] Successful run
[13:10:49] DynamicWrapper: Finished Work Unit: sleep=10000
[13:10:59] Reserved 79040 bytes for xtc file; Cosm status=0
[13:10:59] Allocated 79040 bytes for xtc file
[13:10:59] - Reading up to 79040 from "work/wudata_08.xtc": Read 79040
[13:10:59] Read 79040 bytes from xtc file; available packet space=786351424
[13:10:59] xtc file hash check passed.
[13:10:59] Reserved 23472 23472 786351424 bytes for arc file=<work/wudata_08.trr> Cosm status=0
[13:10:59] Allocated 23472 bytes for arc file
[13:10:59] - Reading up to 23472 from "work/wudata_08.trr": Read 23472
[13:10:59] Read 23472 bytes from arc file; available packet space=786327952
[13:10:59] trr file hash check passed.
[13:10:59] Allocated 560 bytes for edr file
[13:10:59] Read bedfile
[13:10:59] edr file hash check passed.
[13:10:59] Allocated 31138 bytes for logfile
[13:10:59] Read logfile
[13:10:59] GuardedRun: success in DynamicWrapper
[13:10:59] GuardedRun: done
[13:10:59] Run: GuardedRun completed.
[13:11:03] - Writing 134722 bytes of core data to disk...
[13:11:03] Done: 134210 -> 111245 (compressed to 82.8 percent)
[13:11:03]   ... Done.
[13:11:03] - Shutting down core 
[13:11:03] 
[13:11:03] Folding@home Core Shutdown: FINISHED_UNIT
[13:11:08] CoreStatus = 64 (100)
[13:11:08] Sending work to server
[13:11:08] Project: 5758 (Run 4, Clone 854, Gen 5)
[13:11:08] - Read packet limit of 540015616... Set to 524286976.


[13:11:08] + Attempting to send results [June 24 13:11:08 UTC]
[13:11:10] + Results successfully sent
[13:11:10] Thank you for your contribution to Folding@Home.
[13:11:10] + Number of Units Completed: 133

[13:11:14] - Preparing to get new work unit...
[13:11:14] + Attempting to get work packet
[13:11:14] - Connecting to assignment server
[13:11:15] - Successful: assigned to (171.64.65.20).
[13:11:15] + News From Folding@Home: Welcome to Folding@Home
[13:11:15] Loaded queue successfully.
[13:11:17] + Closed connections
[13:11:17] 
[13:11:17] + Processing work unit
[13:11:17] Core required: FahCore_14.exe
[13:11:17] Core found.
[13:11:17] Working on queue slot 09 [June 24 13:11:17 UTC]
[13:11:17] + Working ...
[13:11:17] 
[13:11:17] *------------------------------*
[13:11:17] Folding@Home GPU Core - Beta
[13:11:17] Version 1.25 (Mon Mar 2 19:49:32 PST 2009)
[13:11:17] 
[13:11:17] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[13:11:17] Build host: vspm46
[13:11:17] Board Type: Nvidia
[13:11:17] Core      : 
[13:11:17] Preparing to commence simulation
[13:11:17] - Looking at optimizations...
[13:11:17] - Created dyn
[13:11:17] - Files status OK
[13:11:17] - Expanded 68628 -> 357580 (decompressed 521.0 percent)
[13:11:17] Called DecompressByteArray: compressed_data_size=68628 data_size=357580, decompressed_data_size=357580 diff=0
[13:11:17] - Digital signature verified
[13:11:17] 
[13:11:17] Project: 5911 (Run 8, Clone 29, Gen 17)
[13:11:17] 
[13:11:17] Assembly optimizations on if available.
[13:11:17] Entering M.D.
[13:11:23] Tpr hash work/wudata_09.tpr:  1071287597 3807907054 2957579036 2370395049 2009215850
[13:11:23] Working on Protein
[13:11:24] Client config found, loading data.
[13:11:24] Starting GUI Server
[13:15:18] Completed 1%
[13:19:27] Completed 2%
[13:23:17] Completed 3%
[13:27:15] Completed 4%
[13:31:16] Completed 5%
[13:35:14] Completed 6%
[13:39:22] Completed 7%
[13:43:20] Completed 8%

              --------------- SNIP ---------------

[15:39:46] Completed 36%
[15:43:55] Completed 37%
[15:48:03] Completed 38%

Folding@Home Client Shutdown.
Cheers

Re: GPU WU Assignment Error

Posted: Thu Jun 25, 2009 10:41 am
by toTOW
Oldhat wrote: I wanted to advise that I received the same work unit at approximately the same time on two clients.
This is a known behavior that might happen when two clients request a WU at the same time ...
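The race toTOW describes can be sketched in a few lines. This is an assumed illustration, not the actual F@H assignment-server code: if "pick the next unassigned WU" is a read followed by a separate remove, two near-simultaneous requests can both read the same queue head; making the read-and-remove atomic (here with a lock) prevents the duplicate. The WU names are made up.

```python
import threading

pending = ["WU-A", "WU-B", "WU-C"]   # hypothetical WU identifiers
assigned = []
lock = threading.Lock()

def assign(client_id):
    # Holding the lock makes read-and-remove one atomic step; without
    # it, two clients arriving together could both read pending[0]
    # before either removed it, and both would receive the same WU.
    with lock:
        wu = pending.pop(0)
    assigned.append((client_id, wu))

threads = [threading.Thread(target=assign, args=(i,)) for i in range(2)]
for t in threads: t.start()
for t in threads: t.join()
```

With the lock, the two clients always get distinct WUs; a server under heavy load with a wider window between the read and the remove is where the duplicate assignment could slip through.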

Re: GPU WU Assignment Error

Posted: Thu Jun 25, 2009 3:06 pm
by AZBrandon
toTOW wrote:
Oldhat wrote: I wanted to advise that I received the same work unit at approximately the same time on two clients.
This is a known behavior that might happen when two clients request a WU at the same time ...
Not to nitpick, but according to the logs he posted, the WUs were requested 12 seconds apart. Before that seems like "the same time," consider what Dr. Pande reported in another thread: 217188 WUs were posted for the GPU2 series of projects in a special stats update yesterday, which, based on what we were able to gather, amounted to about two days' worth of WUs.

217188 / 48 ≈ 4525 WU per hour
4525 / 60 ≈ 75 WU per minute

That means more than one WU is assigned per second. If WUs are being handed out at a rate of roughly one every 800 milliseconds, then a 12,000 ms separation should have been plenty to avoid a double assignment, which makes this occurrence seem even odder. Then again, given that they just started a server upgrade last week, I'd personally avoid jumping to any conclusions for at least a week or two until the upgrades are completed.
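The arithmetic above can be double-checked in a few lines of Python, using the 217188 WUs reported over roughly two days (48 hours):

```python
# Back-of-the-envelope check of the assignment rate discussed above.
total_wus = 217188
hours = 48

per_hour = total_wus / hours        # ≈ 4525 WU per hour
per_minute = per_hour / 60          # ≈ 75 WU per minute
interval_ms = 3_600_000 / per_hour  # ≈ 796 ms between assignments

print(round(per_hour), round(per_minute), round(interval_ms))
```

The implied gap of roughly 800 ms between assignments is indeed well under the 12-second separation seen in the two logs.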

Re: GPU WU Assignment Error

Posted: Thu Jun 25, 2009 4:53 pm
by 7im
Too much of that log was snipped out. We can't even tell if those two clients are using the same Machine ID or not. And we know dupe MIDs will cause this for sure.

Re: GPU WU Assignment Error

Posted: Thu Jun 25, 2009 5:57 pm
by bruce
AZBrandon wrote: That means more than one WU is assigned per second. If WUs are being handed out at a rate of roughly one every 800 milliseconds, then a 12,000 ms separation should have been plenty to avoid a double assignment, which makes this occurrence seem even odder. Then again, given that they just started a server upgrade last week, I'd personally avoid jumping to any conclusions for at least a week or two until the upgrades are completed.
If all of the WUs that were re-credited were from a single server, then they're pretty interesting numbers. If they were spread over a number of servers, then you might need to recalculate.

The "same time" probably includes the time from when the server decides to assign a WU to you until it receives a confirmation from your machine that the WU was successfully downloaded -- which will obviously depend on the speed of your connection.

There are a few other reasons why a WU might be assigned more than once. As long as you don't have a conflict in MachineID/UserID, though, I expect that you'll get credit for both of them.

Re: GPU WU Assignment Error

Posted: Thu Jun 25, 2009 7:38 pm
by AZBrandon
Hey, as long as the work gets done, and done accurately, a little duplication isn't going to hurt anything. Like you said, both donors should get credit, so it's a glitch of some kind, but a fail-safe kind of glitch: Stanford gets the work done, and the donors get credit. Everything else is just about improving efficiency, and we know from recent posts by Dr. Pande that they are continually working on improving pretty much every aspect of the project, end to end.