171.64.65.56 not responding

Moderators: Site Moderators, FAHC Science Team

road-runner
Posts: 227
Joined: Sun Dec 02, 2007 4:01 am
Location: Willis, Texas

Re: 171.64.65.56 not responding

Post by road-runner »

I have several that can't send in the WUs...

Code:

[06:20:10] Folding@home Core Shutdown: FINISHED_UNIT
[06:23:23] CoreStatus = 64 (100)
[06:23:23] Sending work to server
[06:23:23] Project: 2669 (Run 11, Clone 128, Gen 166)


[06:23:23] + Attempting to send results [November 2 06:23:23 UTC]
[06:23:24] - Couldn't send HTTP request to server
[06:23:24]   (Got status 503)
[06:23:24] + Could not connect to Work Server (results)
[06:23:24]     (171.64.65.56:8080)
[06:23:24] + Retrying using alternative port
[06:23:24] - Couldn't send HTTP request to server
[06:23:24]   (Got status 503)
[06:23:24] + Could not connect to Work Server (results)
[06:23:24]     (171.64.65.56:80)
[06:23:24] - Error: Could not transmit unit 01 (completed November 2) to work server.
[06:23:24]   Keeping unit 01 in queue.
[06:23:24] Project: 2669 (Run 11, Clone 128, Gen 166)


[06:23:24] + Attempting to send results [November 2 06:23:24 UTC]
[06:23:24] - Couldn't send HTTP request to server
[06:23:24]   (Got status 503)
[06:23:24] + Could not connect to Work Server (results)
[06:23:24]     (171.64.65.56:8080)
[06:23:24] + Retrying using alternative port
[06:23:24] - Couldn't send HTTP request to server
[06:23:24]   (Got status 503)
[06:23:24] + Could not connect to Work Server (results)
[06:23:24]     (171.64.65.56:80)
[06:23:24] - Error: Could not transmit unit 01 (completed November 2) to work server.


[06:23:24] + Attempting to send results [November 2 06:23:24 UTC]
[06:23:25] - Couldn't send HTTP request to server
[06:23:25]   (Got status 503)
[06:23:25] + Could not connect to Work Server (results)
[06:23:25]     (171.67.108.25:8080)
[06:23:25] + Retrying using alternative port
[06:23:25] - Couldn't send HTTP request to server
[06:23:25]   (Got status 503)
[06:23:25] + Could not connect to Work Server (results)
[06:23:25]     (171.67.108.25:80)
[06:23:25]   Could not transmit unit 01 to Collection server; keeping in queue.
[06:23:25] - Preparing to get new work unit...
[06:23:25] Cleaning up work directory
[06:23:26] + Attempting to get work packet
[06:23:26] - Connecting to assignment server
[06:23:26] - Successful: assigned to (171.67.108.22).
[06:23:26] + News From Folding@Home: Welcome to Folding@Home
[06:23:26] Loaded queue successfully.
[06:24:41] Project: 2669 (Run 11, Clone 128, Gen 166)


[06:24:41] + Attempting to send results [November 2 06:24:41 UTC]
[06:24:41] - Couldn't send HTTP request to server
[06:24:41]   (Got status 503)
[06:24:41] + Could not connect to Work Server (results)
[06:24:41]     (171.64.65.56:8080)
[06:24:41] + Retrying using alternative port
[06:24:41] - Couldn't send HTTP request to server
[06:24:41]   (Got status 503)
[06:24:41] + Could not connect to Work Server (results)
[06:24:41]     (171.64.65.56:80)
[06:24:41] - Error: Could not transmit unit 01 (completed November 2) to work server.


[06:24:41] + Attempting to send results [November 2 06:24:41 UTC]
[06:24:41] - Couldn't send HTTP request to server
[06:24:41]   (Got status 503)
[06:24:41] + Could not connect to Work Server (results)
[06:24:41]     (171.67.108.25:8080)
[06:24:41] + Retrying using alternative port
[06:24:42] - Couldn't send HTTP request to server
[06:24:42]   (Got status 503)
[06:24:42] + Could not connect to Work Server (results)
[06:24:42]     (171.67.108.25:80)
[06:24:42]   Could not transmit unit 01 to Collection server; keeping in queue.
[06:24:42] + Closed connections
[06:24:42] 
[06:24:42] + Processing work unit
[06:24:42] Core required: FahCore_a2.exe
[06:24:42] Core found.
[06:24:42] Working on queue slot 02 [November 2 06:24:42 UTC]
[06:24:42] + Working ...
[06:24:42] 
[06:24:42] *------------------------------*
[06:24:42] Folding@Home Gromacs SMP Core
[06:24:42] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:24:42] 
[06:24:42] Preparing to commence simulation
[06:24:42] - Ensuring status. Please wait.
[06:24:45] Called DecompressByteArray: compressed_data_size=30331959 data_size=159726549, decompressed_data_size=159726549 diff=0
[06:24:46] - Digital signature verified
[06:24:46] 
[06:24:46] Project: 2682 (Run 2, Clone 12, Gen 2)
[06:24:46] 
[06:24:46] Assembly optimizations on if available.
[06:24:46] Entering M.D.
[06:24:57]  (Run 2, Clone 12, Gen 2)
[06:24:57] 
[06:24:57] Entering M.D.
[07:00:08] pleted 2500 out of 250000 steps  (1%)
[07:34:08] Completed 5000 out of 250000 steps  (2%)
[08:08:09] Completed 7500 out of 250000 steps  (3%)
[08:42:09] Completed 10000 out of 250000 steps  (4%)
[09:16:08] Completed 12500 out of 250000 steps  (5%)
[09:50:09] Completed 15000 out of 250000 steps  (6%)
[10:24:11] Completed 17500 out of 250000 steps  (7%)
[10:58:13] Completed 20000 out of 250000 steps  (8%)
[11:32:15] Completed 22500 out of 250000 steps  (9%)
[12:06:16] Completed 25000 out of 250000 steps  (10%)
[12:19:20] Project: 2669 (Run 11, Clone 128, Gen 166)


[12:19:24] + Attempting to send results [November 2 12:19:24 UTC]
[12:19:27] - Couldn't send HTTP request to server
[12:19:27]   (Got status 503)
[12:19:27] + Could not connect to Work Server (results)
[12:19:27]     (171.64.65.56:8080)
[12:19:27] + Retrying using alternative port
[12:19:28] - Couldn't send HTTP request to server
[12:19:28]   (Got status 503)
[12:19:28] + Could not connect to Work Server (results)
[12:19:28]     (171.64.65.56:80)
[12:19:28] - Error: Could not transmit unit 01 (completed November 2) to work server.


[12:19:28] + Attempting to send results [November 2 12:19:28 UTC]
[12:19:28] - Couldn't send HTTP request to server
[12:19:28]   (Got status 503)
[12:19:28] + Could not connect to Work Server (results)
[12:19:28]     (171.67.108.25:8080)
[12:19:28] + Retrying using alternative port
[12:19:28] - Couldn't send HTTP request to server
[12:19:28]   (Got status 503)
[12:19:28] + Could not connect to Work Server (results)
[12:19:28]     (171.67.108.25:80)
[12:19:28]   Could not transmit unit 01 to Collection server; keeping in queue.
[12:40:17] Completed 27500 out of 250000 steps  (11%)
[13:14:18] Completed 30000 out of 250000 steps  (12%)
preet.to
Posts: 19
Joined: Sun Dec 16, 2007 3:20 pm

Re: 171.64.65.56 not responding

Post by preet.to »

It seems almost random if you get a WU or not. Some of mine did, the others remain idle.

Now my problem is that finished WUs are staying in the queue so long that they are expiring, so PPD=0. What will be done about these?

Server status has no hint of a problem.
rickoic
Posts: 320
Joined: Sat May 23, 2009 4:49 pm
Hardware configuration: eVga x299 DARK 2070 Super, eVGA 2080, eVga 1070, eVga 2080 Super
MSI x399 eVga 2080, eVga 1070, eVga 1070, GT970
Location: Mississippi near Memphis, Tn

Re: 171.64.65.56 not responding

Post by rickoic »

I uploaded a 2681 from 1930 to 2011 Sunday, but have been catching the 2662-2677 WUs ever since.

Fold on
Rick
I'm folding because Dec 2005 I had radical prostate surgery.
Lost brother to spinal cancer, brother-in-law to prostate cancer.
Several 1st cousins lost and a few who have survived.
JadeMiner
Posts: 3
Joined: Tue Jul 22, 2008 9:27 am

Re: 171.64.65.56 not responding

Post by JadeMiner »

171.64.65.56 ... reject, reject, reject.

I'm amazed that in late 2009 anybody could let a server go down like this.

Honestly makes me wonder.

Does all this folding stuff really help anybody? These guys can't even receive files.

Seriously reconsidering all of this electricity I use every month.

Has folding ever helped a single person?
rickoic
Posts: 320
Joined: Sat May 23, 2009 4:49 pm
Hardware configuration: eVga x299 DARK 2070 Super, eVGA 2080, eVga 1070, eVga 2080 Super
MSI x399 eVga 2080, eVga 1070, eVga 1070, GT970
Location: Mississippi near Memphis, Tn

Re: 171.64.65.56 not responding

Post by rickoic »

I'm glad all the Stanford people have thick skins. I for one appreciate everything they are doing and realize that on their budget they can't be there 24 hours a day to babysit their servers. If my PCs don't catch the WU that I want, I figure there is an additional need for the WU I did catch, and so continue to fold it. The only exception is when one dies on me 2-3 times on what I know is a stable machine.

Not sure that anyone has been medically helped by what has been done as yet, but just pushing the envelope further along a path is an improvement, and who knows, that next work unit that you fold might be the one with the golden bullet of information that will help someone.

Big hand to Dr. Vijay and all the staff at Stanford.

Fold on
Rick
uncle fuzzy
Posts: 460
Joined: Sun Dec 02, 2007 10:15 pm
Location: Michigan

Re: 171.64.65.56 not responding

Post by uncle fuzzy »

Grandpa_01 wrote: What's your secret, uncle fuzzy?
Pure luck, and offering sacrifices to the SMP idol I keep in the corner shrine. 8-)

Although I got a new WU, this last completed one won't go home.

edit- I need to make another sacrifice. Trying to force the upload, I seem to have lost the completed WU and trashed the one I was working on. After 5 failed downloads, I was sent to another server and am folding again. Player 3, notfred's, 4-core on Q6600@3.4
Proud to crash my machines as a Beta Tester!

stevew
Posts: 42
Joined: Mon Dec 03, 2007 11:53 pm
Hardware configuration: Mac Pro 8-core 2.26 GHz 2009, 12 GB.
iMac i7 3.4 GHz, 4 GB.
Location: Team Hack-A-Day

171.64.65.56 not OK = 503 error

Post by stevew »

Since Oct 31 my 2 Mac/Intels running SMP WUs have had problems with server 171.64.65.56, getting status 503. Toggling one machine's PrefPane on and off managed to send a WU and get one new one. Now the 2nd machine is stuck: no upload and no work. Entering http://171.64.65.56 in a browser gets nothing, not OK.

[16:27:26] + Could not connect to Work Server (results)
[16:27:26] (171.64.65.56:8080)
[16:27:26] + Retrying using alternative port
[16:27:33] - Couldn't send HTTP request to server
[16:27:33] + Could not connect to Work Server (results)
[16:27:33] (171.64.65.56:80)
[16:27:33] - Error: Could not transmit unit 04 (completed November 2) to work server.
BrokenWolf
Posts: 126
Joined: Sat Aug 02, 2008 3:08 am

Re: 171.64.65.56 not responding

Post by BrokenWolf »

I have quite a few that are unable to return completed WU's. When browsing to the addresses I do not get an OK.

Code:

[16:39:52] Loaded queue successfully.
[16:39:52] Attempting to return result(s) to server...
[16:39:52] Trying to send all finished work units
[16:39:52] Project: 2662 (Run 1, Clone 132, Gen 46)


[16:39:52] + Attempting to send results [November 2 16:39:52 UTC]
[16:39:52] - Reading file work/wuresults_01.dat from core
[16:39:52]   (Read 26799556 bytes from disk)
[16:39:52] Connecting to http://171.64.65.56:8080/
[16:39:59] - Couldn't send HTTP request to server
[16:39:59]   (Got status 502)
[16:39:59] + Could not connect to Work Server (results)
[16:39:59]     (171.64.65.56:8080)
[16:39:59] + Retrying using alternative port
[16:39:59] Connecting to http://171.64.65.56:80/
[16:40:07] - Couldn't send HTTP request to server
[16:40:07]   (Got status 502)
[16:40:07] + Could not connect to Work Server (results)
[16:40:07]     (171.64.65.56:80)
[16:40:07] - Error: Could not transmit unit 01 (completed November 2) to work server.
[16:40:07] - 5 failed uploads of this unit.


[16:40:07] + Attempting to send results [November 2 16:40:07 UTC]
[16:40:07] - Reading file work/wuresults_01.dat from core
[16:40:07]   (Read 26799556 bytes from disk)
[16:40:07] Connecting to http://171.67.108.25:8080/
[16:40:13] - Couldn't send HTTP request to server
[16:40:13]   (Got status 502)
[16:40:13] + Could not connect to Work Server (results)
[16:40:13]     (171.67.108.25:8080)
[16:40:13] + Retrying using alternative port
[16:40:13] Connecting to http://171.67.108.25:80/
[16:40:31] - Couldn't send HTTP request to server
[16:40:31]   (Got status 502)
[16:40:31] + Could not connect to Work Server (results)
[16:40:31]     (171.67.108.25:80)
[16:40:31]   Could not transmit unit 01 to Collection server; keeping in queue.
[16:40:31] + Sent 0 of 1 completed units to the server
[16:40:31] - Failed to send all units to server
[16:40:31] ***** Got a SIGTERM signal (15)
[16:40:31] Killing all core threads
coolamasta
Posts: 10
Joined: Mon Mar 30, 2009 9:19 am
Hardware configuration: 2 x Q6600 @ 3GHz
1 x E6600 @ stock
3 x 9800GT's
1 x 8800GTX
Location: England

Re: 171.64.65.56 not responding

Post by coolamasta »

I've got 7 SMP clients all waiting for work now, and at least 5 of them have work to send back :(
preet.to
Posts: 19
Joined: Sun Dec 16, 2007 3:20 pm

Re: 171.64.65.56 not responding

Post by preet.to »

Running FAH is a large endeavour. They rely on us to donate bits and we rely on them for server support. So for that I am grateful.

What I am upset about is the lack of communication on this matter. I don't care if this takes two weeks to fix; I can wait. But tell us what the problem is and what to expect.

Meanwhile all my held units are expired. I have lost a number of days of production.
Pick2
Posts: 85
Joined: Fri Feb 13, 2009 12:38 pm
Hardware configuration: Linux & CPUs
Location: USA

Re: 171.64.65.56 not responding

Post by Pick2 »

I've got 5 out of 12 waiting for work, with a few more getting close to done. I have 8 WUs waiting to get sent up. I hope this gets straightened out soon.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.56 not responding

Post by bruce »

theo343 wrote: Have you ever thought about setting up proxy WU servers instead of sending all the network and data transfer load to one location? You should seriously think about setting up a proxy-oriented model of the WU servers. I guess this can be accomplished by renting resources in server centers around the world. There should be a proxy on each continent for each type of client. These proxies are the ones that should, in a timely manner, send client results back to you, get new batches of WUs and assign them to the clients.

Right now your model is too vulnerable and you should start thinking of branching out.

I hope Stanford can give us an updated ETA on the operational status of the new servers. (still no proxy-oriented model, but at least an improvement)

FAH does have a proxy server model. It's certainly not ideal, since the proxy (known as a collection server) only accepts uploads, but the fundamental problem is that all of the hardware that's presently available is overloaded, both the Work Server and the Collection Servers. The new hardware is critical.
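
For anyone trying to follow the logs posted in this thread, the upload order they show (Work Server on port 8080, then port 80, then the Collection Server on the same two ports) can be sketched roughly like this. This is only an illustration of the order seen in the logs, not the client's actual code; upload_targets, try_upload and the send callable are hypothetical names made up for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def upload_targets(work_server: str, collection_server: str,
                   ports: Sequence[int] = (8080, 80)) -> List[Endpoint]:
    """List endpoints in the order the logs show them being tried."""
    return [(host, port)
            for host in (work_server, collection_server)
            for port in ports]


def try_upload(targets: Sequence[Endpoint],
               send: Callable[[str, int], int]) -> Optional[Endpoint]:
    """Try each endpoint in turn; return the first that answers 200.

    `send` is a hypothetical callable that returns an HTTP status code.
    None means every attempt failed and the unit stays in the queue.
    """
    for host, port in targets:
        if send(host, port) == 200:
            return (host, port)
    return None


# Addresses taken from the logs in this thread.
targets = upload_targets("171.64.65.56", "171.67.108.25")

# Simulate the situation in the logs: every endpoint answers 503.
result = try_upload(targets, lambda host, port: 503)
print(result)  # None, so the unit is kept in the queue
```

With both servers overloaded, all four attempts fail, which matches the "keeping in queue" lines in the logs.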
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.56 not responding

Post by bruce »

preet.to wrote: Server status has no hint of a problem.
Hint: NetLoad=200 is a problem. That's all this server can handle at one time. Everyone else is turned away.
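
A toy model of what a connection cap like this means in practice, assuming (from the logs posted here) that a full server answers with status 503. CappedServer and its methods are hypothetical names for illustration only and say nothing about how the real server software is written.

```python
class CappedServer:
    """Toy model of a server that serves a fixed number of clients at once."""

    def __init__(self, max_connections: int = 200) -> None:
        self.max_connections = max_connections
        self.active = 0

    def connect(self) -> int:
        """Return 200 if a slot is free, otherwise 503 (assumed status)."""
        if self.active >= self.max_connections:
            return 503
        self.active += 1
        return 200

    def disconnect(self) -> None:
        """Free one slot when a client finishes its transfer."""
        if self.active > 0:
            self.active -= 1


server = CappedServer(max_connections=200)

# 250 clients show up at the same moment; only the first 200 get a slot.
statuses = [server.connect() for _ in range(250)]
accepted = statuses.count(200)
rejected = statuses.count(503)
print(accepted, rejected)  # 200 50

# Once one transfer finishes, the next client gets in.
server.disconnect()
next_status = server.connect()
```

The status page can look healthy in this state because the server really is working; it is just full.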
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: 171.64.65.56 not responding

Post by kasson »

As bruce notes, the server is talking to a lot of clients at once right now. The server is functioning, but more clients want to talk to it than it has capacity for. This should improve as it clears the backlog.
rickoic
Posts: 320
Joined: Sat May 23, 2009 4:49 pm
Hardware configuration: eVga x299 DARK 2070 Super, eVGA 2080, eVga 1070, eVga 2080 Super
MSI x399 eVga 2080, eVga 1070, eVga 1070, GT970
Location: Mississippi near Memphis, Tn

Re: 171.64.65.56 not responding

Post by rickoic »

I wonder if any thought has gone into putting the current server time in a column on the server status page. That would tell for sure whether it was updating.

Fold on
Rick
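
A minimal sketch of how such a timestamp could be used, assuming the status page exposed the server's current time. No such column exists as far as this thread shows, and is_stale and the 30-minute threshold are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def is_stale(page_time: datetime, now: datetime,
             max_age: timedelta = timedelta(minutes=30)) -> bool:
    """Return True if the timestamp on the page is older than max_age."""
    return now - page_time > max_age


# Hypothetical values: what the page reports versus the reader's clock (UTC).
now = datetime(2009, 11, 2, 16, 40, tzinfo=timezone.utc)
old_page = datetime(2009, 11, 2, 12, 19, tzinfo=timezone.utc)
fresh_page = datetime(2009, 11, 2, 16, 30, tzinfo=timezone.utc)

stale_old = is_stale(old_page, now)      # page has not updated for hours
stale_fresh = is_stale(fresh_page, now)  # page updated ten minutes ago
```

A stale timestamp would show that the status page itself had stopped updating, rather than that the server was fine.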