171.64.65.64 and 171.67.108.25 - Can't upload

Moderators: Site Moderators, FAHC Science Team

HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

171.64.65.64 and 171.67.108.25 - Can't upload

Post by HaloJones »

I've got SMP a3 clients queueing up unable to connect to upload work.

Code: Select all

[21:39:24] - Couldn't send HTTP request to server
[21:39:24] + Could not connect to Work Server (results)
[21:39:24]     (171.67.108.25:8080)
[21:39:24] + Retrying using alternative port
[21:39:24] - Couldn't send HTTP request to server
[21:39:24] + Could not connect to Work Server (results)
[21:39:24]     (171.67.108.25:80)
[21:39:24]   Could not transmit unit 02 to Collection server; keeping in queue.

Code: Select all

[21:57:30] - Couldn't send HTTP request to server
[21:57:30] Resuming from checkpoint
[21:57:30] Verified work/wudata_01.log
[21:57:30] + Could not connect to Work Server (results)
[21:57:30] Verified work/wudata_01.trr
[21:57:30] Verified work/wudata_01.edr
[21:57:30]     (171.64.65.64:8080)
[21:57:30] Completed 36538 out of 500000 steps  (7%)
[21:57:30] + Retrying using alternative port
[21:57:30] - Couldn't send HTTP request to server
[21:57:30] + Could not connect to Work Server (results)
[21:57:30]     (171.64.65.64:80)
[21:57:31] - Error: Could not transmit unit 05 (completed March 2) to work server.


[21:57:31] + Attempting to send results [March 3 21:57:31 UTC]
[21:57:31] - Couldn't send HTTP request to server
[21:57:31] + Could not connect to Work Server (results)
[21:57:31]     (171.67.108.25:8080)
[21:57:31] + Retrying using alternative port
[21:57:31] - Couldn't send HTTP request to server
[21:57:31] + Could not connect to Work Server (results)
[21:57:31]     (171.67.108.25:80)
[21:57:31]   Could not transmit unit 05 to Collection server; keeping in queue.
[21:57:31] Project: 2653 (Run 33, Clone 18, Gen 145)


[21:57:31] + Attempting to send results [March 3 21:57:31 UTC]
[21:57:38] - Couldn't send HTTP request to server
[21:57:38] + Could not connect to Work Server (results)
[21:57:38]     (171.64.65.64:8080)
[21:57:38] + Retrying using alternative port
[21:57:38] - Couldn't send HTTP request to server
[21:57:38] + Could not connect to Work Server (results)
[21:57:38]     (171.64.65.64:80)
[21:57:38] - Error: Could not transmit unit 07 (completed March 2) to work server.


[21:57:38] + Attempting to send results [March 3 21:57:38 UTC]
[21:57:45] - Couldn't send HTTP request to server
[21:57:45] + Could not connect to Work Server (results)
[21:57:45]     (171.67.108.25:8080)
[21:57:45] + Retrying using alternative port
[21:57:45] - Couldn't send HTTP request to server
[21:57:45] + Could not connect to Work Server (results)
[21:57:45]     (171.67.108.25:80)
[21:57:45]   Could not transmit unit 07 to Collection server; keeping in queue.
[21:57:45] Project: 2653 (Run 33, Clone 72, Gen 145)


[21:57:45] + Attempting to send results [March 3 21:57:45 UTC]
[21:57:51] - Couldn't send HTTP request to server
[21:57:51] + Could not connect to Work Server (results)
[21:57:51]     (171.64.65.64:8080)
[21:57:51] + Retrying using alternative port
[21:57:51] - Couldn't send HTTP request to server
[21:57:51] + Could not connect to Work Server (results)
[21:57:51]     (171.64.65.64:80)
[21:57:51] - Error: Could not transmit unit 09 (completed March 2) to work server.


[21:57:51] + Attempting to send results [March 3 21:57:51 UTC]
[21:57:57] - Couldn't send HTTP request to server
[21:58:02] + Could not connect to Work Server (results)
[21:58:02]     (171.67.108.25:8080)
[21:58:02] + Retrying using alternative port
[21:58:02] - Couldn't send HTTP request to server
[21:58:02] + Could not connect to Work Server (results)
[21:58:02]     (171.67.108.25:80)
[21:58:02]   Could not transmit unit 09 to Collection server; keeping in queue.
The clients have been working for weeks, and without any changes or even a reboot, they're no longer connecting. The browser seems fine, the server status page looks OK, and disabling the firewall makes no difference. Stumped and losing bonus points!

Any ideas?
single 1070

HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: Can't upload to 171.67.108.25

Post by HaloJones »

By the way, both machines are happily uploading GPU clients. It's just the smp clients which are stuck.
single 1070

bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Can't upload to 171.64.65.64

Post by bruce »

The client always attempts to upload to a Work Server first, and if that fails, it reverts to a Collection Server. When there are problems with Work Servers, they often lead to an overload of a Collection Server which can only be fixed by correcting the problem with the Work Server. 171.67.108.25 is a collection server, so we need to change the title of the report to the other server.

Apparently that is 171.64.65.64
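
The retry order described above, as it also appears in the posted log, can be sketched roughly like this. It is only an illustration inferred from the log lines (Work Server on 8080, then 80, then Collection Server on 8080, then 80), not the actual client code; `upload_targets`, `first_accepting` and the `try_send` callback are hypothetical names.

```python
from typing import Callable, Iterator, Optional, Tuple

# Ports as seen in the posted log: 8080 first, then the "alternative port" 80.
PRIMARY_PORT = 8080
ALTERNATIVE_PORT = 80


def upload_targets(work_server: str, collection_server: str) -> Iterator[Tuple[str, int]]:
    """Yield (host, port) pairs in the order the log suggests they are tried."""
    for host in (work_server, collection_server):
        for port in (PRIMARY_PORT, ALTERNATIVE_PORT):
            yield host, port


def first_accepting(
    targets: Iterator[Tuple[str, int]],
    try_send: Callable[[str, int], bool],
) -> Optional[Tuple[str, int]]:
    """Return the first target for which try_send succeeds, or None.

    try_send is a stand-in for the real upload attempt (an assumption here).
    None corresponds to the unit being kept in the queue.
    """
    for host, port in targets:
        if try_send(host, port):
            return host, port
    return None
```

With both servers refusing connections, `first_accepting` returns `None`, which matches the "keeping in queue" lines in the log.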
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by HaloJones »

OK, so is it 171.64.65.64 that's overloaded? Whatever, I now have a whole bunch of un-uploaded (hey! new word) results.

By the way, since these are a3 units, the server's refusal to accept them means I risk missing the deadline, which affects not just these units and their bonus points but also future units and their bonus points...

...not forgetting that Stanford wants these returned ASAP which I'm trying to do.
single 1070

HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by HaloJones »

Attempting to hit http://171.64.65.64 gets this:

file:///C:/DOCUME~1/MIKE~1.HOM/LOCALS~1/Temp/tfpvclhD-1.part

which opens in Firefox and shows OK, but not immediately. First it asks me what to open the file with.
single 1070

HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by HaloJones »

Any word on what's happening?
single 1070

bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by bruce »

The "OK" that you got from the server indicates that you can connect to it, even if the data isn't in the form that allows Firefox to open it automatically.
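
If you want to repeat that check without a browser, a plain TCP connection test on the ports from your log does roughly the same job. This is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper name, and a successful connection only shows the port is reachable, not that the server will accept an upload.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (addresses and ports taken from the log above):
#   can_connect("171.64.65.64", 8080)
#   can_connect("171.64.65.64", 80)
```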
toTOW
Site Moderator
Posts: 6443
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by toTOW »

PS : 171.64.65.64 only serves old A1 SMP WUs (such as p2653 and 2665), so you don't have to worry about your bonus here.

But it doesn't explain why you can't send work :(

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: 171.64.65.64 and 171.67.108.25 - Can't upload

Post by HaloJones »

Curiouser and curiouser. Here's the output from one machine:

Current Queue:
Slot 05 Done
Project: 2653 (Run 32, Clone 165, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:42:57
Finished date: March 2 23:46:03
Failed uploads: 23

Slot 06 Empty/Deleted
Project: 2653 (Run 32, Clone 163, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:46:20
Finished date: March 2 23:48:47

Slot 07 Done
Project: 2653 (Run 33, Clone 18, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:49:24
Finished date: March 2 23:52:25
Failed uploads: 19

Slot 08 Empty/Deleted
Project: 2665 (Run 3, Clone 138, Gen 195), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:52:55
Finished date: January 1 00:00:00

Slot 09 Done
Project: 2653 (Run 33, Clone 72, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:54:35
Finished date: March 2 23:57:37
Failed uploads: 16

Slot 00 Empty/Deleted
Project: 6025 (Run 0, Clone 89, Gen 31), Core: a3
Work server: 171.64.65.54:8080
Collection server: 171.67.108.25
Download date: March 2 23:58:46
Finished date: March 3 20:08:49

Slot 01 Empty/Deleted
Project: 6015 (Run 0, Clone 104, Gen 54), Core: a3
Work server: 130.237.232.140:8080
Collection server: 130.237.162.125
Download date: March 3 20:13:55
Finished date: March 4 17:38:47

Slot 02 Empty/Deleted
Project: 6011 (Run 0, Clone 14, Gen 36), Core: a3
Work server: 130.237.232.140:8080
Collection server: 130.237.162.125
Download date: March 4 17:43:08
Finished date: March 4 18:07:59

Slot 03 Done
Project: 2653 (Run 32, Clone 142, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Download date: March 2 23:37:07
Finished date: March 2 23:39:37
Failed uploads: 27

Slot 04 *Ready
Project: 6015 (Run 1, Clone 151, Gen 36), Core: a3
Work server: 130.237.232.140:8080
Collection server: 130.237.162.125
Download date: March 4 18:09:45
Deadline date: March 10 18:09:45.

Note that the units showing Done but not uploaded are all a1 units, each completed in a typical time of a couple of minutes. Methinks something has gone screwy. Either way, I see no great harm here and will simply delete them from the queue.
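
For anyone with a longer queue, something like the following could pick out the stuck slots from a pasted queue dump. It is a rough sketch that assumes the text layout shown above (a `Slot NN Status` line followed by `Project:` and `Failed uploads:` lines); `parse_queue` and `stuck_slots` are names made up for this example, and the sample is a shortened copy of the dump in this post.

```python
import re

SLOT_RE = re.compile(r"^Slot (\d+)\s+(.+)$")
CORE_RE = re.compile(r"Core: (\w+)")
FAILED_RE = re.compile(r"^Failed uploads: (\d+)$")


def parse_queue(text):
    """Turn a pasted queue dump into a list of per-slot dicts."""
    slots = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SLOT_RE.match(line)
        if match:
            current = {
                "slot": match.group(1),
                "status": match.group(2).strip(),
                "core": None,
                "failed_uploads": 0,
            }
            slots.append(current)
            continue
        if current is None:
            continue  # skip anything before the first Slot line
        match = CORE_RE.search(line)
        if match:
            current["core"] = match.group(1)
        match = FAILED_RE.match(line)
        if match:
            current["failed_uploads"] = int(match.group(1))
    return slots


def stuck_slots(slots):
    """Slots that finished but have at least one failed upload."""
    return [s for s in slots if s["status"] == "Done" and s["failed_uploads"] > 0]


SAMPLE = """
Slot 05 Done
Project: 2653 (Run 32, Clone 165, Gen 145), Core: a1
Work server: 171.64.65.64:8080
Collection server: 171.67.108.25
Failed uploads: 23

Slot 04 *Ready
Project: 6015 (Run 1, Clone 151, Gen 36), Core: a3
Work server: 130.237.232.140:8080
"""
```

Calling `stuck_slots(parse_queue(SAMPLE))` returns only slot 05, the a1 unit with failed uploads.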
single 1070
