
Project: 5773 (Run 10, Clone 421, Gen 2)

Posted: Tue May 05, 2009 4:46 am
by Amaruk
WU died immediately, killing both the core and client.

Card is 8800 GT 256MB G92 at vendor clocks - 700/750/1700 (2 of 3)

X2 5600 @ 2.9 GHz, 4 GB PC2-6400, XP Pro32 SP2, MSI K9A2 Platinum, 6.23 Systray, 181.20 drivers

Code:

[00:07:32] + Attempting to get work packet
[00:07:32] - Will indicate memory of 2815 MB
[00:07:32] - Connecting to assignment server
[00:07:32] Connecting to http://assign-GPU.stanford.edu:8080/
[00:07:33] Posted data.
[00:07:33] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[00:07:33] + News From Folding@Home: GPU folding beta
[00:07:33] Loaded queue successfully.
[00:07:33] Connecting to http://171.64.65.106:8080/
[00:07:33] Posted data.
[00:07:33] Initial: 0000; - Receiving payload (expected size: 68451)
[00:07:34] - Downloaded at ~66 kB/s
[00:07:34] - Averaged speed for that direction ~87 kB/s
[00:07:34] + Received work.
[00:07:34] Trying to send all finished work units
[00:07:34] + No unsent completed units remaining.
[00:07:34] + Closed connections
[00:07:39] 
[00:07:39] + Processing work unit
[00:07:39] Core required: FahCore_11.exe
[00:07:39] Core found.
[00:07:39] Working on queue slot 05 [May 5 00:07:39 UTC]
[00:07:39] + Working ...
[00:07:39] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -priority 96 -checkpoint 15 -verbose -lifeline 2184 -version 623'

[00:07:39] 
[00:07:39] *------------------------------*
[00:07:39] Folding@Home GPU Core - Beta
[00:07:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:07:39] 
[00:07:39] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:07:39] Build host: amoeba
[00:07:39] Board Type: Nvidia
[00:07:39] Core      : 
[00:07:39] Preparing to commence simulation
[00:07:39] - Looking at optimizations...
[00:07:39] - Created dyn
[00:07:39] - Files status OK
[00:07:39] - Expanded 67939 -> 350980 (decompressed 516.6 percent)
[00:07:39] Called DecompressByteArray: compressed_data_size=67939 data_size=350980, decompressed_data_size=350980 diff=0
[00:07:39] - Digital signature verified
[00:07:39] 
[00:07:39] Project: 5773 (Run 10, Clone 421, Gen 2)
[00:07:39] 
[00:07:39] Assembly optimizations on if available.
[00:07:39] Entering M.D.
[00:07:45] mdrun_gpu returned 
[00:07:45] Going to send back what have done -- stepsTotalG=0
[00:07:45] Work fraction=0.0000 steps=0.
[00:07:49] logfile size=4996 infoLength=4996 edr=0 trr=25
[00:07:49] - Writing 5534 bytes of core data to disk...
[00:07:49] Done: 5022 -> 1881 (compressed to 37.4 percent)
[00:07:49]   ... Done.
[00:07:49] 
[00:07:49] Folding@home Core Shutdown: UNSTABLE_MACHINE

At this point a familiar pop-up titled Microsoft Visual C++ Runtime Library appeared, saying:

Runtime Error!

Program: C:\Documents and Settings\etc...

This application has requested the Runtime to terminate in an unusual way.
Please contact the application's support team for more information.


Clicking OK brought up another popup with the heading FahCore_11.exe - Application error, saying:

The exception unknown software exception (0x40000015) occurred in the application at location 0x005236ee.

Click on OK to terminate program


After clicking OK, the client had this to say about my killing the core:

Code:

[01:33:31] CoreStatus = 3 (3)
[01:33:31] Client-core communications error: ERROR 0x3
[01:33:31] This is a sign of more serious problems, shutting down.

This led to a popup titled Folding@home, saying:

Folding@home has run into a serious error running the core, and will shutdown.


Clicking OK kills the client. Here is the restart.

Code:

--- Opening Log file [May 5 01:36:06 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Hammersmythe\Application Data\Folding@home-gpu2
Arguments: -gpu 1 -verbosity 9 -forcegpu nvidia_g80 

[01:36:06] - Ask before connecting: No
[01:36:06] - User name: Amaruk (Team 50625)
[01:36:06] - User ID: C*************5
[01:36:06] - Machine ID: 3
[01:36:06] 
[01:36:06] Loaded queue successfully.
[01:36:06] Initialization complete
[01:36:06] 
[01:36:06] + Processing work unit
[01:36:06] Core required: FahCore_11.exe
[01:36:06] Core found.
[01:36:06] - Autosending finished units... [May 5 01:36:06 UTC]
[01:36:06] Trying to send all finished work units
[01:36:06] + No unsent completed units remaining.
[01:36:06] - Autosend completed
[01:36:06] Working on queue slot 05 [May 5 01:36:06 UTC]
[01:36:06] + Working ...
[01:36:06] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -priority 96 -checkpoint 15 -verbose -lifeline 780 -version 623'

[01:36:07] 
[01:36:07] *------------------------------*
[01:36:07] Folding@Home GPU Core - Beta
[01:36:07] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:36:07] 
[01:36:07] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:36:07] Build host: amoeba
[01:36:07] Board Type: Nvidia
[01:36:07] Core      : 
[01:36:07] Preparing to commence simulation
[01:36:07] - Looking at optimizations...
[01:36:07] - Created dyn
[01:36:07] - Files status OK
[01:36:07] Error: Missing work file=<>
[01:36:07] 
[01:36:07] Folding@home Core Shutdown: MISSING_WORK_FILES
[01:36:11] CoreStatus = 74 (116)
[01:36:11] The core could not find the work files specified. Removing from queue
[01:36:11] Deleting current work unit & continuing...
[01:36:15] Trying to send all finished work units
[01:36:15] + No unsent completed units remaining.
[01:36:15] - Preparing to get new work unit...
[01:36:15] + Attempting to get work packet
[01:36:15] - Will indicate memory of 2815 MB
[01:36:15] - Detect CPU. Vendor: AuthenticAMD, Family: 15, Model: 11, Stepping: 2
[01:36:15] - Connecting to assignment server
[01:36:15] Connecting to http://assign-GPU.stanford.edu:8080/
[01:36:15] Posted data.
[01:36:15] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[01:36:15] + News From Folding@Home: GPU folding beta
[01:36:15] Loaded queue successfully.
[01:36:15] Connecting to http://171.64.65.106:8080/
[01:36:15] Posted data.
[01:36:15] Initial: 0000; - Receiving payload (expected size: 68451)
[01:36:16] - Downloaded at ~66 kB/s
[01:36:16] - Averaged speed for that direction ~83 kB/s
[01:36:16] + Received work.
[01:36:16] + Closed connections
[01:36:21] 
[01:36:21] + Processing work unit
[01:36:21] Core required: FahCore_11.exe
[01:36:21] Core found.
[01:36:21] Working on queue slot 06 [May 5 01:36:21 UTC]
[01:36:21] + Working ...
[01:36:21] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -priority 96 -checkpoint 15 -verbose -lifeline 780 -version 623'

[01:36:21] 
[01:36:21] *------------------------------*
[01:36:21] Folding@Home GPU Core - Beta
[01:36:21] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:36:21] 
[01:36:21] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:36:21] Build host: amoeba
[01:36:21] Board Type: Nvidia
[01:36:21] Core      : 
[01:36:21] Preparing to commence simulation
[01:36:21] - Looking at optimizations...
[01:36:21] - Created dyn
[01:36:21] - Files status OK
[01:36:21] - Expanded 67939 -> 350980 (decompressed 516.6 percent)
[01:36:21] Called DecompressByteArray: compressed_data_size=67939 data_size=350980, decompressed_data_size=350980 diff=0
[01:36:21] - Digital signature verified
[01:36:21] 
[01:36:21] Project: 5773 (Run 10, Clone 421, Gen 2)
[01:36:21] 
[01:36:21] Assembly optimizations on if available.
[01:36:21] Entering M.D.
[01:36:28] Working on Protein
[01:36:29] Client config found, loading data.
[01:36:29] Starting GUI Server
[01:39:12] Completed 1%
[01:41:54] Completed 2%
[01:44:36] Completed 3%
[01:47:18] Completed 4%
[01:50:00] Completed 5%
[01:52:42] Completed 6%
[01:55:24] Completed 7%
[01:58:06] Completed 8%
[02:00:48] Completed 9%
[02:03:30] Completed 10%
[02:06:12] Completed 11%
[02:08:53] Completed 12%
[02:11:36] Completed 13%
[02:14:18] Completed 14%
[02:17:01] Completed 15%
[02:19:43] Completed 16%
[02:22:26] Completed 17%
[02:25:08] Completed 18%
[02:27:51] Completed 19%
[02:30:33] Completed 20%
[02:33:16] Completed 21%
[02:35:57] Completed 22%
[02:38:38] Completed 23%
[02:41:19] Completed 24%
[02:44:00] Completed 25%
[02:46:42] Completed 26%
[02:49:25] Completed 27%
[02:52:08] Completed 28%
[02:54:50] Completed 29%
[02:57:33] Completed 30%
[03:00:16] Completed 31%
[03:02:59] Completed 32%
[03:05:42] Completed 33%
[03:08:25] Completed 34%
[03:11:09] Completed 35%
[03:13:52] Completed 36%
[03:16:35] Completed 37%
[03:19:18] Completed 38%
[03:22:01] Completed 39%
[03:24:44] Completed 40%
[03:26:05] Run: exception thrown during GuardedRun
[03:26:05] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[03:26:05] Going to send back what have done -- stepsTotalG=20000000
[03:26:05] Work fraction=0.4050 steps=20000000.
[03:26:09] logfile size=18855 infoLength=18855 edr=0 trr=23
[03:26:09] - Writing 19391 bytes of core data to disk...
[03:26:09] Done: 18879 -> 4904 (compressed to 25.9 percent)
[03:26:09]   ... Done.
[03:26:09] 
[03:26:09] Folding@home Core Shutdown: EARLY_UNIT_END
[03:26:13] CoreStatus = 72 (114)
[03:26:13] Sending work to server
[03:26:13] Project: 5773 (Run 10, Clone 421, Gen 2)
[03:26:13] - Read packet limit of 540015616... Set to 524286976.


[03:26:13] + Attempting to send results [May 5 03:26:13 UTC]
[03:26:13] - Reading file work/wuresults_06.dat from core
[03:26:13]   (Read 5416 bytes from disk)
[03:26:13] Connecting to http://171.64.65.106:8080/
[03:26:13] Posted data.
[03:26:13] Initial: 0000; Conversation time very short, giving reduced weight in bandwidth avg
[03:26:13] - Uploaded at ~12 kB/s
[03:26:13] - Averaged speed for that direction ~57 kB/s
[03:26:13] + Results successfully sent
[03:26:13] Thank you for your contribution to Folding@Home.

So... was anyone else able to fold it?


Also, if this looks familiar, it's because the same thing happened yesterday on a different GPU:

viewtopic.php?f=19&t=9815

Only difference is this one EUE'd. :(

Re: Project: 5773 (Run 10, Clone 421, Gen 2)

Posted: Tue May 05, 2009 5:22 am
by bruce
Something is really strange. According to the stats, you downloaded this WU at 2009-05-04 17:34:30 and uploaded it at 2009-05-04 20:07:04, about 2h33m later, and you earned partial credit, which is certainly not what is shown in that copy of FAHlog. My guess is that you've got multiple clients running on that machine and you failed to give each one their own directory. One client is processing the WU reasonably successfully and the other is aborting.

Please give the User ID (at least the first 8 digits) plus the Machine ID associated with each client on that machine. Then look some 6 or 8 lines earlier in FAHlog and include the Launch directory: xxxxx. Then list several of the most recent WUs completed by each client.
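
For anyone who wants to run the same check on their own logs, here is a rough Python sketch of it. It assumes only the header layout visible in the FAHlog excerpts in this thread (a "Launch directory:" line and a "Machine ID:" line); the function names client_identity and find_conflicts are made up for illustration and are not part of the client.

```python
import re
from collections import defaultdict

# Patterns based on the FAHlog header lines quoted in this thread.
LAUNCH_RE = re.compile(r"^Launch directory:\s*(.+?)\s*$", re.MULTILINE)
MACHINE_RE = re.compile(r"Machine ID:\s*(\d+)")


def client_identity(log_text):
    """Return (launch_directory, machine_id) from one FAHlog header, or None."""
    launch = LAUNCH_RE.search(log_text)
    machine = MACHINE_RE.search(log_text)
    if launch is None or machine is None:
        return None
    return launch.group(1), int(machine.group(1))


def find_conflicts(identities):
    """Report launch directories or Machine IDs claimed by more than one client."""
    by_dir = defaultdict(list)
    by_id = defaultdict(list)
    for index, (directory, machine_id) in enumerate(identities):
        by_dir[directory.lower()].append(index)  # Windows paths are case-insensitive
        by_id[machine_id].append(index)
    return {
        "directories": {d: c for d, c in by_dir.items() if len(c) > 1},
        "machine_ids": {m: c for m, c in by_id.items() if len(c) > 1},
    }


# Hypothetical two-client setup that shares one directory by mistake.
log_a = r"""Launch directory: C:\FAH\gpu
[14:46:26] - Machine ID: 2"""
log_b = r"""Launch directory: C:\FAH\GPU
[01:36:06] - Machine ID: 3"""

identities = [client_identity(log_a), client_identity(log_b)]
conflicts = find_conflicts(identities)
print(conflicts["directories"])  # non-empty: both clients point at the same folder
```

An empty result for both keys means every client has its own directory and its own Machine ID.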

Re: Project: 5773 (Run 10, Clone 421, Gen 2)

Posted: Tue May 05, 2009 6:56 am
by Amaruk
bruce wrote: According to the stats, you downloaded this WU at 2009-05-04 17:34:30 and uploaded it at 2009-05-04 20:07:04
Time of first download: [00:07:34] + Received work.

Time of second download (restart): [01:36:16] + Received work.

Returned to server at: [03:26:13] + Results successfully sent

My guess is that the server uses the time of the first download.


00:07:34 UTC = 17:07:34 PDT vs 17:34:30, +26:56

03:26:13 UTC = 20:26:13 PDT vs 20:07:04, -19:09

Interesting :?
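
The two offsets above can be reproduced with a short Python sketch. It assumes the stats page reports PDT (UTC-7), which is the conversion used above, and takes the log dates from the "May 5 ... UTC" stamps in the FAHlog; offset_vs_stats and fmt are just illustrative helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: stats times are PDT (UTC-7); FAHlog times are UTC.
PDT = timezone(timedelta(hours=-7))
STAMP = "%Y-%m-%d %H:%M:%S"


def offset_vs_stats(log_utc, stats_pdt):
    """Return stats time minus log time as a timedelta."""
    log_time = datetime.strptime(log_utc, STAMP).replace(tzinfo=timezone.utc)
    stats_time = datetime.strptime(stats_pdt, STAMP).replace(tzinfo=PDT)
    return stats_time - log_time


def fmt(delta):
    """Format a timedelta as signed minutes:seconds."""
    secs = int(delta.total_seconds())
    sign = "+" if secs >= 0 else "-"
    minutes, seconds = divmod(abs(secs), 60)
    return f"{sign}{minutes}:{seconds:02d}"


download = offset_vs_stats("2009-05-05 00:07:34", "2009-05-04 17:34:30")
upload = offset_vs_stats("2009-05-05 03:26:13", "2009-05-04 20:07:04")
print(fmt(download))  # +26:56
print(fmt(upload))    # -19:09
```

So the stats show the download about 27 minutes later than the log does, and the upload about 19 minutes earlier.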
bruce wrote: My guess is that you've got multiple clients running on that machine and you failed to give each one their own directory. One client is processing the WU reasonably successfully and the other is aborting.
I've had multiple clients running for almost a year. Rock solid before upgrading to 6.23.
bruce wrote: Please give the User ID (at least the first 8 digits) plus the Machine ID associated with each client on that machine. Then look some 6 or 8 lines earlier in FAHlog and include the Launch directory: xxxxx. Then list several of the most recent WUs completed by each client.
Sure thing.


GPU 1:

Code:

--- Opening Log file [May 4 14:46:26 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Hammersmythe\Application Data\Folding@home-gpu
Arguments: -gpu 0 -verbosity 9 -forcegpu nvidia_g80 

[14:46:26] - Ask before connecting: No
[14:46:26] - User name: Amaruk (Team 50625)
[14:46:26] - User ID: C75791562E71B25
[14:46:26] - Machine ID: 2
And here is a list of WUs, oldest first.

Project: 5900 (Run 11, Clone 789, Gen 4)

Project: 5900 (Run 5, Clone 824, Gen 5)

Project: 5774 (Run 3, Clone 349, Gen 1) [3%]

Project: 5769 (Run 1, Clone 213, Gen 63)

Project: 5779 (Run 14, Clone 89, Gen 1)

Project: 5778 (Run 3, Clone 361, Gen 2)

Project: 5774 (Run 4, Clone 56, Gen 1) [current]


GPU 2:

Code:

--- Opening Log file [May 5 01:36:06 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Hammersmythe\Application Data\Folding@home-gpu2
Arguments: -gpu 1 -verbosity 9 -forcegpu nvidia_g80 

[01:36:06] - Ask before connecting: No
[01:36:06] - User name: Amaruk (Team 50625)
[01:36:06] - User ID: C75791562E71B25
[01:36:06] - Machine ID: 3
And its WUs:

Project: 5756 (Run 7, Clone 70, Gen 31)

Project: 5769 (Run 14, Clone 148, Gen 58)

Project: 5755 (Run 3, Clone 63, Gen 185) [died immediately]

Project: 5767 (Run 6, Clone 117, Gen 43)

Project: 5766 (Run 7, Clone 413, Gen 46)

Project: 5765 (Run 0, Clone 390, Gen 171)

Project: 5900 (Run 10, Clone 570, Gen 21) [17%]

Project: 5904 (Run 0, Clone 237, Gen 71)

Project: 5773 (Run 10, Clone 407, Gen 2) [2%]

Project: 5773 (Run 10, Clone 421, Gen 2) [died immediately, killing core & client]

Project: 5773 (Run 10, Clone 421, Gen 2) [40%]

Project: 5773 (Run 4, Clone 130, Gen 3) [current]


GPU 3:

Code:

--- Opening Log file [May 5 02:48:51 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Hammersmythe\Application Data\Folding@home-gpu3
Arguments: -gpu 2 -verbosity 9 -forcegpu nvidia_g80 

[02:48:51] - Ask before connecting: No
[02:48:51] - User name: Amaruk (Team 50625)
[02:48:51] - User ID: C75791562E71B25
[02:48:51] - Machine ID: 4
And its WUs:

Project: 5765 (Run 7, Clone 342, Gen 56)

Project: 5766 (Run 11, Clone 113, Gen 382)

Project: 5753 (Run 13, Clone 148, Gen 363)

Project: 5750 (Run 5, Clone 97, Gen 206)

Project: 5900 (Run 6, Clone 388, Gen 68) [36%]

Project: 5900 (Run 1, Clone 987, Gen 1) [33%]

Project: 5900 (Run 3, Clone 883, Gen 5)

Project: 5774 (Run 13, Clone 165, Gen 1)

Project: 5774 (Run 13, Clone 150, Gen 0)

Project: 5904 (Run 10, Clone 915, Gen 19) [current]

Re: Project: 5773 (Run 10, Clone 421, Gen 2)

Posted: Tue May 05, 2009 1:59 pm
by bruce
Project: 5773 (Run 10, Clone 421, Gen 2) [40%]
Hi Amaruk (team 50625),
Your WU (P5773 R10 C421 G2) was added to the stats database on 2009-05-04 20:10:36 for 311.02 points of credit.

311/768 = 40.5% so the points are consistent with your report. As you suspected, I don't see anything out of place in any of your logs. They don't provide any new explanation for what happened on your machine or why someone else was able to complete the same WU.
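
As a quick cross-check of that arithmetic, here is a small Python sketch. It assumes the full-WU value of 768 points implied by the division above and assumes partial credit scales linearly with the "Work fraction=0.4050" line from the FAHlog; the exact rule the stats server applies is not shown in this thread.

```python
BASE_POINTS = 768.0      # full credit implied by 311/768 (assumption)
CREDITED = 311.02        # points recorded in the stats database
WORK_FRACTION = 0.4050   # "Work fraction" reported by the core at the EUE


def partial_credit(work_fraction, base_points=BASE_POINTS):
    """Assumed linear partial credit: fraction of work done times base points."""
    return work_fraction * base_points


fraction_from_credit = CREDITED / BASE_POINTS
print(f"{fraction_from_credit:.1%}")            # 40.5%
print(round(partial_credit(WORK_FRACTION), 2))  # 311.04
```

The two figures differ by about 0.02 points, so the credit lines up with the work fraction in the log to within rounding.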

The reported times are another anomaly for which I have no explanation. :? :?:

Re: Project: 5773 (Run 10, Clone 421, Gen 2)

Posted: Wed May 06, 2009 5:30 am
by Amaruk
bruce wrote:The reported times are another anomaly for which I have no explanation. :? :?:
That makes two of us! :lol:

Thanks for your time and consideration. :)