
Can you delete a failed work unit?

Posted: Tue Jun 11, 2013 12:13 am
by rygaroo
I have one GPU work unit that failed as soon as the client started working on it, and it has failed to send back its results for the past 3 days. I was hoping it would delete itself after the expiration date passed, but it still shows up and keeps trying to send 632B to 171.67.108.xx. The log is shown in the image and also pasted below:

http://i.imgur.com/2ioNF2h.png (direct link in case the image doesn't load properly)

Code:

23:44:46:WU02:FS00:0x11:Completed 41%
23:45:12:WU01:FS00:Sending unit results: id:01 state:SEND error:DUMPED project:5769 run:7 clone:91 gen:6082 core:0x11 unit:0x13ae792251b146e417c2005b00071689
23:45:12:WU01:FS00:Uploading 632B to 171.67.108.11
23:45:12:WU01:FS00:Connecting to 171.67.108.11:8080
23:45:12:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to read stream
23:45:12:WU01:FS00:Trying to send results to collection server
23:45:12:WU01:FS00:Uploading 632B to 171.67.108.25
23:45:12:WU01:FS00:Connecting to 171.67.108.25:8080
23:45:33:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
23:45:33:WU01:FS00:Connecting to 171.67.108.25:80
23:45:35:WU02:FS00:0x11:Completed 42%
23:45:54:ERROR:WU01:FS00:Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
23:46:24:WU02:FS00:0x11:Completed 43%
23:47:14:WU02:FS00:0x11:Completed 44%
23:48:04:WU02:FS00:0x11:Completed 45%
23:48:53:WU02:FS00:0x11:Completed 46%
23:49:42:WU02:FS00:0x11:Completed 47%
23:50:31:WU02:FS00:0x11:Completed 48%
23:51:20:WU02:FS00:0x11:Completed 49%
23:52:09:WU02:FS00:0x11:Completed 50%
23:52:58:WU02:FS00:0x11:Completed 51%
23:53:47:WU00:FS01:0xa4:Completed 570000 out of 1500000 steps  (38%)
23:53:47:WU02:FS00:0x11:Completed 52%
23:54:37:WU02:FS00:0x11:Completed 53%
23:55:27:WU02:FS00:0x11:Completed 54%
23:56:17:WU01:FS00:Sending unit results: id:01 state:SEND error:DUMPED project:5769 run:7 clone:91 gen:6082 core:0x11 unit:0x13ae792251b146e417c2005b00071689
23:56:18:WU01:FS00:Uploading 632B to 171.67.108.11
23:56:18:WU01:FS00:Connecting to 171.67.108.11:8080
23:56:18:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to read stream
23:56:18:WU01:FS00:Trying to send results to collection server
23:56:18:WU01:FS00:Uploading 632B to 171.67.108.25
23:56:18:WU01:FS00:Connecting to 171.67.108.25:8080
23:56:20:WU02:FS00:0x11:Completed 55%
23:56:39:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
23:56:39:WU01:FS00:Connecting to 171.67.108.25:80
23:57:00:ERROR:WU01:FS00:Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
23:57:12:WU02:FS00:0x11:Completed 56%
Mod Edit: Added Code Tags - PantherX
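
For anyone trying to spot the same pattern in their own log, here is a minimal sketch that counts how often each work unit retries its upload. It assumes the line layout seen in the log above (`HH:MM:SS:WUnn:FSnn:Sending unit results: ... error:...`); the function name `count_send_attempts` is made up for this example and is not part of FAHClient.

```python
import re
from collections import Counter

# Matches lines such as:
# 23:45:12:WU01:FS00:Sending unit results: id:01 state:SEND error:DUMPED ...
# The layout is inferred from the log pasted above, not from any official spec.
SEND_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WU(\d+):FS(\d+):Sending unit results: .*\berror:(\w+)"
)


def count_send_attempts(log_lines):
    """Count upload attempts per (work unit, slot, error code)."""
    counts = Counter()
    for line in log_lines:
        match = SEND_RE.match(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

A work unit that keeps reappearing with the same error code, like WU01 with `DUMPED` here, is the one whose queue number you would note down.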

Re: Can you delete a failed work unit?

Posted: Tue Jun 11, 2013 1:13 am
by tlg
viewtopic.php?f=67&t=21564

I think it's the second post down.
P5-133XL wrote: Record the number of the work queue for the particular WU. Stop v7. Go to Start -> FAHClient -> Data Directory -> work and delete the folder that has the same number as the work queue. Restart v7. The client will clean itself up and get a new WU.
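
If you would rather script the folder deletion than do it by hand, something like the sketch below could work. It is only an illustration: the function name `delete_work_unit_folder` is made up, and it assumes the folder under `work` is named exactly like the queue number in the log (for example `01` for WU01). It does not stop or restart the client, so do that yourself before and after.

```python
import shutil
from pathlib import Path


def delete_work_unit_folder(data_dir, queue_id):
    """Delete <data_dir>/work/<queue_id>. Return True if a folder was removed.

    Assumption: the work folder name equals the queue number as text,
    e.g. "01". Run this only while the client is stopped.
    """
    # Accept plain digits only, so a typo cannot point outside work/.
    if not queue_id.isdigit():
        return False
    target = Path(data_dir) / "work" / queue_id
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Call it with the path that the Data Directory shortcut opens and the queue number you recorded, then restart v7 so it can clean up and fetch a new WU.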

Re: Can you delete a failed work unit?

Posted: Tue Jun 11, 2013 1:23 am
by PantherX
Welcome to the F@H Forum rygaroo,

Please note that this is a reported bug (https://fah.stanford.edu/projects/FAHClient/ticket/1030), so hopefully it will be fixed in a future version of V7. Thanks for your report.

Re: Can you delete a failed work unit?

Posted: Tue Jun 11, 2013 9:30 am
by rygaroo
Thanks for the help, that did the trick!