171.64.65.54 - Already sending work

Moderators: Site Moderators, FAHC Science Team

tourbound129
Posts: 21
Joined: Mon Aug 23, 2010 11:11 pm

171.64.65.54 - Already sending work

Post by tourbound129 »

My SMP client has completed its last 2 WUs, but when it tries to send them it reports "Already sending work" and then "Sent 0 of 2 completed units to the server".

[11:52:40] Completed 490000 out of 500000 steps (98%)
[12:00:31] Completed 495000 out of 500000 steps (99%)
[12:08:20] Completed 500000 out of 500000 steps (100%)
[12:08:22] DynamicWrapper: Finished Work Unit: sleep=10000
[12:08:31]
[12:08:31] Finished Work Unit:
[12:08:31] - Reading up to 3700128 from "work/wudata_06.trr": Read 3700128
[12:08:31] trr file hash check passed.
[12:08:31] edr file hash check passed.
[12:08:31] logfile size: 58980
[12:08:31] Leaving Run
[12:08:36] - Writing 3794660 bytes of core data to disk...
[12:08:36] ... Done.
[12:08:37] - Shutting down core
[12:08:37]
[12:08:37] Folding@home Core Shutdown: FINISHED_UNIT
[12:08:39] CoreStatus = 64 (100)
[12:08:39] Unit 6 finished with 90 percent of time to deadline remaining.
[12:08:39] Updated performance fraction: 0.884859
[12:08:39] Sending work to server
[12:08:39] - Already sending work
[12:08:39] Trying to send all finished work units
[12:08:39] - Already sending work
[12:08:39] - Already sending work
[12:08:39] + Sent 0 of 2 completed units to the server
[12:08:39] - Preparing to get new work unit...
[12:08:39] Cleaning up work directory
[12:08:40] + Attempting to get work packet
[12:08:40] Passkey found
[12:08:40] - Will indicate memory of 3582 MB
[12:08:40] - Connecting to assignment server
[12:08:40] Connecting to http://assign.stanford.edu:8080/
[12:08:41] Posted data.
[12:08:41] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[12:08:41] + News From Folding@Home: Welcome to Folding@Home
[12:08:41] Loaded queue successfully.
[12:08:41] Sent data
[12:08:41] Connecting to http://171.64.65.54:8080/
[12:08:42] Posted data.
[12:08:42] Initial: 0000; - Receiving payload (expected size: 1764113)
[12:08:48] - Downloaded at ~287 kB/s
[12:08:48] - Averaged speed for that direction ~256 kB/s
[12:08:48] + Received work.
[12:08:48] Trying to send all finished work units
[12:08:48] - Already sending work
[12:08:48] - Already sending work
[12:08:48] + Sent 0 of 2 completed units to the server
[12:08:48] + Closed connections
[12:08:48]
[12:08:48] + Processing work unit
[12:08:48] Core required: FahCore_a3.exe
[12:08:48] Core found.
[12:08:48] Working on queue slot 07 [September 7 12:08:48 UTC]
[12:08:48] + Working ...
[12:08:48] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 07 -np 4 -checkpoint 15 -verbose -lifeline 8076 -version 630'

[12:08:48]
[12:08:48] *------------------------------*
[12:08:48] Folding@Home Gromacs SMP Core
[12:08:48] Version 2.22 (Mar 12, 2010)
[12:08:48]
[12:08:48] Preparing to commence simulation
[12:08:48] - Looking at optimizations...
[12:08:48] - Created dyn
[12:08:48] - Files status OK
[12:08:49] - Expanded 1763601 -> 2248557 (decompressed 127.4 percent)
[12:08:49] Called DecompressByteArray: compressed_data_size=1763601 data_size=2248557, decompressed_data_size=2248557 diff=0
[12:08:49] - Digital signature verified
[12:08:49]
[12:08:49] Project: 6064 (Run 0, Clone 13, Gen 165)
[12:08:49]
[12:08:49] Assembly optimizations on if available.
[12:08:49] Entering M.D.
[12:08:55] Completed 0 out of 500000 steps (0%)
[12:16:43] Completed 5000 out of 500000 steps (1%)
[12:24:21] Completed 10000 out of 500000 steps (2%)
[12:32:17] Completed 15000 out of 500000 steps (3%)
[12:40:10] Completed 20000 out of 500000 steps (4%)
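
For anyone scanning a long log for the same symptom, here is a minimal Python sketch that pulls out the upload summary lines and keeps the ones where fewer units were sent than were completed. It assumes the line format shown in the log above; the function names are made up for illustration and are not part of the client.

```python
import re

# Matches client log lines such as:
#   [12:08:39] + Sent 0 of 2 completed units to the server
SENT_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] \+ Sent (\d+) of (\d+) completed units")


def upload_summaries(log_text):
    """Return (timestamp, sent, completed) for every upload summary line."""
    results = []
    for line in log_text.splitlines():
        m = SENT_RE.match(line.strip())
        if m:
            results.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return results


def stalled_uploads(log_text):
    """Keep only the summaries where fewer units were sent than were completed."""
    return [s for s in upload_summaries(log_text) if s[1] < s[2]]


# Sample lines copied from the log in this post.
sample = """\
[12:08:39] - Already sending work
[12:08:39] + Sent 0 of 2 completed units to the server
[12:08:48] + Received work.
[12:08:48] + Sent 0 of 2 completed units to the server
"""

print(stalled_uploads(sample))
```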
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: 171.64.65.54 - Already sending work

Post by 7im »

Please post the PRCG numbers of the "already sending" work units so that someone can check their history.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
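
PRCG stands for Project, Run, Clone, Gen. The client prints these on a "Project:" line whenever it starts a work unit, so they can be collected from a log with a short script. This is only a sketch that assumes the "Project: 6064 (Run 0, Clone 13, Gen 165)" line format seen in the log above; the helper names are hypothetical.

```python
import re

# Matches the PRCG line the client writes, e.g.
#   [12:08:49] Project: 6064 (Run 0, Clone 13, Gen 165)
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def extract_prcg(log_text):
    """Return each distinct (project, run, clone, gen) tuple in order of appearance."""
    seen = []
    for m in PRCG_RE.finditer(log_text):
        prcg = tuple(int(g) for g in m.groups())
        if prcg not in seen:
            seen.append(prcg)
    return seen


def short_form(prcg):
    """Format a PRCG tuple compactly, e.g. '6064 (0,13,165)'."""
    project, run, clone, gen = prcg
    return f"{project} ({run},{clone},{gen})"


line = "[12:08:49] Project: 6064 (Run 0, Clone 13, Gen 165)"
for prcg in extract_prcg(line):
    print(short_form(prcg))
```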
tourbound129
Posts: 21
Joined: Mon Aug 23, 2010 11:11 pm

Re: 171.64.65.54 - Already sending work

Post by tourbound129 »

6065 (0,159,39)
2633 (6,19,16)

Those are the two completed WUs giving the "Already sending work" message.

Thanks for the reply and the help.
lanbrown
Posts: 104
Joined: Thu Jul 09, 2009 1:21 am

Re: 171.64.65.54 - Already sending work

Post by lanbrown »

Do you have more than one machine? If so, did you just copy the client from one to the other? Have you checked to see what WU the other machine is working on? It sounds like you have more than one machine and that they are both getting the same WU. In essence, both machines look identical to Stanford and thus you are only getting credit for one return.
tourbound129
Posts: 21
Joined: Mon Aug 23, 2010 11:11 pm

Re: 171.64.65.54 - Already sending work

Post by tourbound129 »

lanbrown wrote: Do you have more than one machine? If so, did you just copy the client from one to the other? Have you checked to see what WU the other machine is working on? It sounds like you have more than one machine and that they are both getting the same WU. In essence, both machines look identical to Stanford and thus you are only getting credit for one return.
No, I only have one machine running SMP and GPU. The GPU is sending just fine and receiving credit. The SMP is the problem right now. It can receive work (crunching another WU right now) but just started with this problem of "already sending work". It had sent 24 WU's in a row without a problem until yesterday when this came up.

As best as I can tell I have not received any credit for these WU's.
lanbrown
Posts: 104
Joined: Thu Jul 09, 2009 1:21 am

Re: 171.64.65.54 - Already sending work

Post by lanbrown »

Stop the client, run it with -queueinfo, and post the results. Let's see whether it has really sent them or is just holding on to them.
tourbound129
Posts: 21
Joined: Mon Aug 23, 2010 11:11 pm

Re: 171.64.65.54 - Already sending work

Post by tourbound129 »

Slot 05 and 06 are the projects in question.

--- Opening Log file [September 7 17:16:35 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.30

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\nkc\FAH
Executable: fah6.exe
Arguments: -queueinfo -smp -verbosity 9 

[17:16:35] - Ask before connecting: No
[17:16:35] - User name: pgatour (Team 55280)
[17:16:35] - User ID: 46A1A8753D7559ED
[17:16:35] - Machine ID: 1
[17:16:35] 
[17:16:35] Loaded queue successfully.
[17:16:35] Printing Queue Information
Current Queue: 
Slot 08  Empty/Deleted
Project: 6057 (Run 0, Clone 121, Gen 55), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 2 08:42:31
Finished date: September 2 23:30:53

Slot 09  Empty/Deleted
Project: 6014 (Run 0, Clone 159, Gen 265), Core: a3
Work server: 130.237.232.140:8080
Collection server: 130.237.165.141
Download date: September 2 23:31:55
Finished date: September 3 14:50:53

Slot 00  Empty/Deleted
Project: 6065 (Run 0, Clone 81, Gen 158), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 3 14:57:17
Finished date: September 4 05:08:09

Slot 01  Empty/Deleted
Project: 6012 (Run 1, Clone 48, Gen 205), Core: a3
Work server: 130.237.232.140:8080
Collection server: 130.237.165.141
Download date: September 4 05:09:39
Finished date: September 4 18:52:56

Slot 02  Empty/Deleted
Project: 2633 (Run 10, Clone 3, Gen 11), Core: a3
Work server: 171.67.108.24:8080
Collection server: 171.67.108.25
Download date: September 4 19:00:00
Finished date: September 5 02:01:03

Slot 03  Empty/Deleted
Project: 6067 (Run 1, Clone 117, Gen 6), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 5 02:21:16
Finished date: September 5 15:31:33

Slot 04  Empty/Deleted
Project: 6054 (Run 1, Clone 197, Gen 15), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 5 15:32:40
Finished date: September 6 05:29:04

Slot 05  Done     
Project: 2633 (Run 6, Clone 19, Gen 16), Core: a3
Work server: 171.67.108.24:8080
Collection server: 171.67.108.25
Download date: September 6 05:30:07
Finished date: September 6 12:51:12
Failed uploads: 1

Slot 06  Done     
Project: 6065 (Run 0, Clone 159, Gen 39), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 6 21:59:56
Finished date: September 7 12:08:39

Slot 07 *Ready    
Project: 6064 (Run 0, Clone 13, Gen 165), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: September 7 12:08:48
Deadline date: September 13 12:08:48

PF: 0.884859 based on last 4 slot(s)
[17:16:35] ***** Got a SIGTERM signal (2)
[17:16:35] Killing all core threads
[17:16:35] Killing 4 cores
[17:16:35] Killing core 0
[17:16:35] Killing core 1
[17:16:35] Killing core 2
[17:16:35] Killing core 3

Folding@Home Client Shutdown.

lanbrown
Posts: 104
Joined: Thu Jul 09, 2009 1:21 am

Re: 171.64.65.54 - Already sending work

Post by lanbrown »

They have not been sent. I would stop the client and issue fah6.exe -smp -verbosity 9 -send 05 and see if it can successfully send it. Post the log of what happens. You might also try rebooting first and then trying the upload manually. It is almost like it is still trying to send the WU's.
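
To make that reading of the queue dump repeatable, here is a rough Python sketch that parses -queueinfo output and lists the slots still marked "Done". It assumes the layout of the dump posted above, and it assumes (inferred from this thread, not from any documentation) that "Done" means finished but not yet uploaded, while "Empty/Deleted" means already returned. The function names are invented for the example.

```python
import re

SLOT_RE = re.compile(r"^Slot (\d{2})\s+\*?(\S+)")  # "*" marks the current slot
PROJECT_RE = re.compile(r"^Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
FAILED_RE = re.compile(r"^Failed uploads: (\d+)")


def parse_queue(dump):
    """Turn -queueinfo text into a list of dicts, one per slot."""
    slots = []
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        m = SLOT_RE.match(line)
        if m:
            current = {"slot": m.group(1), "status": m.group(2),
                       "prcg": None, "failed_uploads": 0}
            slots.append(current)
            continue
        if current is None:
            continue  # header lines before the first slot
        m = PROJECT_RE.match(line)
        if m:
            current["prcg"] = tuple(int(g) for g in m.groups())
            continue
        m = FAILED_RE.match(line)
        if m:
            current["failed_uploads"] = int(m.group(1))
    return slots


def pending(slots):
    """Slots whose status is 'Done' (assumed: finished but not yet uploaded)."""
    return [s for s in slots if s["status"] == "Done"]


# Abbreviated excerpt of the dump posted earlier in this thread.
dump = """\
Slot 04  Empty/Deleted
Project: 6054 (Run 1, Clone 197, Gen 15), Core: a3
Slot 05  Done
Project: 2633 (Run 6, Clone 19, Gen 16), Core: a3
Failed uploads: 1
Slot 06  Done
Project: 6065 (Run 0, Clone 159, Gen 39), Core: a3
Slot 07 *Ready
Project: 6064 (Run 0, Clone 13, Gen 165), Core: a3
"""

for s in pending(parse_queue(dump)):
    print(s["slot"], s["prcg"], "failed uploads:", s["failed_uploads"])
```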
tourbound129
Posts: 21
Joined: Mon Aug 23, 2010 11:11 pm

Re: 171.64.65.54 - Already sending work

Post by tourbound129 »

When I restarted the client with -queueinfo, it said "attempting to send results" and then went ahead and continued working on the current WU. 14 minutes later I got "Sent 2 of 2 completed units to the server. Autosend complete".

Not sure what the issue was, but at least it seems to have resolved itself.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: 171.64.65.54 - Already sending work

Post by 7im »

The client does an auto send of all completed work units each time the client starts. -send 05 is redundant.

I haven't seen that error before. I'm guessing that one of the 2 completed WUs was already trying to upload when the auto send timer was triggered. And since the upload was already in progress, you got that already sending message.

The client is programmed to be self correcting, and would have likely sorted this out eventually. But good to see you gave it a kick in the right direction. ;)
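
To illustrate the guess above, here is a small Python sketch of a "try-lock" guard: a second send attempt that arrives while an upload is still in progress backs off with a message instead of starting a duplicate transfer. This is purely an assumption about how such a guard could work, not the client's actual code; the Uploader class and its messages are hypothetical and only mimic the log wording.

```python
import threading


class Uploader:
    """Hypothetical guard that allows only one upload pass at a time."""

    def __init__(self):
        self._sending = threading.Lock()

    def send_all(self, units, upload):
        # Non-blocking acquire: if another pass holds the lock, give up at once.
        if not self._sending.acquire(blocking=False):
            return "Already sending work"
        try:
            sent = sum(1 for u in units if upload(u))
            return f"Sent {sent} of {len(units)} completed units to the server"
        finally:
            self._sending.release()


uploader = Uploader()
started, finish = threading.Event(), threading.Event()


def slow_upload(unit):
    """Simulates a transfer that stays in progress until 'finish' is set."""
    started.set()
    finish.wait(timeout=5)
    return True


results = []
worker = threading.Thread(
    target=lambda: results.append(uploader.send_all(["slot 05"], slow_upload)))
worker.start()
started.wait(timeout=5)          # first upload is now in progress

# A second pass (think: the autosend timer) fires during the first one.
second = uploader.send_all(["slot 05", "slot 06"], lambda unit: True)

finish.set()                     # let the first upload complete
worker.join()
print(second)
print(results[0])
```

Under this model, if the first transfer never finishes, every later attempt would keep getting the same message until the process holding the lock is restarted.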
gwildperson
Posts: 450
Joined: Tue Dec 04, 2007 8:36 pm

Re: 171.64.65.54 - Already sending work

Post by gwildperson »

In a different topic, Kasson said "We just restarted the server code; hopefully that will help with any "stuck" transfers."

I propose the following theory: Your machine was one of those "stuck" transfers and restarting your client successfully reduced the number clogging the server by one.