
Re: 171.67.108.22

Posted: Thu Aug 11, 2011 9:45 pm
by bruce
DrSpalding wrote: In any case, 171.67.108.22 is not accepting the WUs either.
171.67.108.22 was overloaded. Dr. Pande said he would look into it. About that same time, it stopped being overloaded and a number of WUs have been uploaded since then.

Your log shows an upload attempt from before the problem was corrected but your client has not retried since it was fixed.
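One way to check when the client last retried is to scan the log for its upload-attempt lines. The following is only a minimal sketch, assuming log lines of the form `[HH:MM:SS] + Attempting to send results [Month D HH:MM:SS UTC]`; the function name `last_upload_attempt` and the sample lines are illustrative and not part of the client.

```python
import re

# Matches lines such as:
# [21:47:26] + Attempting to send results [August 10 21:47:26 UTC]
ATTEMPT_RE = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}\] \+ Attempting to send results \[(.+?) UTC\]"
)


def last_upload_attempt(log_lines):
    """Return the timestamp text of the last upload attempt, or None."""
    last = None
    for line in log_lines:
        match = ATTEMPT_RE.match(line.strip())
        if match:
            last = match.group(1)
    return last


# Illustrative sample in the assumed log format.
sample = [
    "[21:47:26] + Attempting to send results [August 10 21:47:26 UTC]",
    "[22:25:44] - 13 failed uploads of this unit.",
    "[22:25:44] + Attempting to send results [August 11 22:25:44 UTC]",
]
print(last_upload_attempt(sample))
```

If the last attempt predates the time the server was fixed, the client simply has not retried yet.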

Re: 171.67.108.22

Posted: Fri Aug 12, 2011 12:11 am
by DrSpalding
It uploaded the P2684 WU about 14:00 PDT. Thanks for looking into it.

Re: 171.67.108.22

Posted: Fri Aug 12, 2011 1:26 am
by stevew
Still trying to send
WU end:

Code:

[21:47:26] Project: 2684 (Run 6, Clone 0, Gen 102)


[21:47:26] + Attempting to send results [August 10 21:47:26 UTC]
[21:47:26] - Reading file work/wuresults_02.dat from core
[21:47:27]   (Read 95708495 bytes from disk)
[21:47:27] Connecting to http://171.67.108.22:8080/
Last try:

Code:

[22:16:26] - Couldn't send HTTP request to server
[22:16:26] + Could not connect to Work Server (results)
[22:16:26]     (171.67.108.22:8080)
[22:16:26] + Retrying using alternative port
[22:16:26] Connecting to http://171.67.108.22:80/
[22:25:44] - Couldn't send HTTP request to server
[22:25:44] + Could not connect to Work Server (results)
[22:25:44]     (171.67.108.22:80)
[22:25:44] - Error: Could not transmit unit 02 (completed August 10) to work server.
[22:25:44] - 13 failed uploads of this unit.


[22:25:44] + Attempting to send results [August 11 22:25:44 UTC]

Re: 171.67.108.22

Posted: Fri Aug 12, 2011 1:41 am
by bruce
Successful uploads are running about 20 per hour. Does that server respond to http://171.67.108.22 or http://171.67.108.22:8080?

When I open those URLs, Firefox does not display the OK page, but it can download a file containing <html><b>OK</b></html>, which is correct. (You really don't need to do that as long as you don't get an error message.)
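The same check can be done without a browser. This is a rough sketch using only the Python standard library; the helper names `looks_ok` and `probe` are made up for illustration, and the exact-match test on the body is an assumption based on the OK page quoted above.

```python
import http.client
import urllib.request


def looks_ok(body: bytes) -> bool:
    """True if the response body is the bare OK page quoted above."""
    return body.strip().lower() == b"<html><b>ok</b></html>"


def probe(host: str, port: int, timeout: float = 10.0) -> bool:
    """Fetch http://host:port/ and report whether it returned the OK page."""
    url = f"http://{host}:{port}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return looks_ok(resp.read(1024))
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and malformed replies all count as "not OK".
        return False
```

Calling `probe("171.67.108.22", 8080)` and `probe("171.67.108.22", 80)` covers both ports. As other posts in this topic show, an OK here only means the server answers small requests; it does not guarantee that a large upload will go through.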

Re: 171.67.108.22

Posted: Sat Aug 13, 2011 12:42 pm
by stevew
[Probably not Stanford server but OS X and Wine]
The 2684 WU never successfully uploaded to this server, and neither did the following 6900 WU upload to its Work Server. When I was assigned another 2684 WU I stopped. Each of the WUs, on completion and on subsequent upload attempts, would fail to upload completely. The sending would stop after about 10 minutes, and on this machine normal uploads each took 18+ minutes. True since Jan-8-2010. I'll miss the 1,000,000+ point months and hope for a Mac native bigadv client.

There has been only one recent change: the OS X 10.7 install on the day of release. Best guess is that Wine under 10.7 does not work well. I do not have the ability to diagnose the error. Installed VirtualBox and now folding with Ubuntu 64-bit 2.6.37-30, "-smp 16 -bigadv -verbose 9". A 6900 WU is running at about 3.33 x min with ~27 min frame times. Hopefully the upload process will succeed.

Unable to upload to 171.67.108.22

Posted: Wed Sep 07, 2011 2:11 pm
by TarogStar
I have been unable to upload to this server for 2 days now, and I am almost finished with another WU. It also appears to use a reject server as the backup. I have never had problems with this computer uploading before, and I have other machines on the network uploading their WUs fine. The status page lists this server as full Accepting. I seem to be able to ping fine and visit the Folding@home pages fine as well. Is there anything wrong on the server end, or must the problem be on my end?

Code:

[18:55:01] + Attempting to get work packet
[18:55:01] Passkey found
[18:55:01] - Connecting to assignment server
[18:55:01] - Successful: assigned to (130.237.232.141).
[18:55:01] + News From Folding@Home: Welcome to Folding@Home
[18:55:01] Loaded queue successfully.
[19:37:01] + Could not connect to Work Server
[19:37:01] - Attempt #3  to get work failed, and no other work to do.
Waiting before retry.
[19:37:23] + Attempting to get work packet
[19:37:23] Passkey found
[19:37:23] - Connecting to assignment server
[19:37:24] - Successful: assigned to (171.67.108.22).
[19:37:24] + News From Folding@Home: Welcome to Folding@Home
[19:37:24] Loaded queue successfully.
[19:37:55] + Closed connections
[19:38:00] 
[19:38:00] + Processing work unit
[19:38:00] Core required: FahCore_a5.exe
[19:38:00] Core found.
[19:38:00] Working on queue slot 08 [September 3 19:38:00 UTC]
[19:38:00] + Working ...
[19:38:00] 
[19:38:00] *------------------------------*
[19:38:00] Folding@Home Gromacs SMP Core
[19:38:00] Version 2.27 (Mar 12, 2010)
[19:38:00] 
[19:38:00] Preparing to commence simulation
[19:38:00] - Looking at optimizations...
[19:38:00] - Created dyn
[19:38:00] - Files status OK
[19:38:13] - Expanded 25461599 -> 31941441 (decompressed 125.4 percent)
[19:38:13] Called DecompressByteArray: compressed_data_size=25461599 data_size=31941441, decompressed_data_size=31941441 diff=0
[19:38:13] - Digital signature verified
[19:38:13] 
[19:38:13] Project: 2686 (Run 1, Clone 19, Gen 151)
[19:38:13] 
[19:38:13] Assembly optimizations on if available.
[19:38:13] Entering M.D.
[19:38:19] Mapping NT from 16 to 16 
[19:38:25] Completed 0 out of 250000 steps  (0%)

Processing here....

[17:10:55] Completed 250000 out of 250000 steps  (100%)
[17:11:10] DynamicWrapper: Finished Work Unit: sleep=10000
[17:11:20] 
[17:11:20] Finished Work Unit:
[17:11:20] - Reading up to 52713120 from "work/wudata_08.trr": Read 52713120
[17:11:20] trr file hash check passed.
[17:11:20] - Reading up to 42829432 from "work/wudata_08.xtc": Read 42829432
[17:11:21] xtc file hash check passed.
[17:11:21] edr file hash check passed.
[17:11:21] logfile size: 209456
[17:11:21] Leaving Run
[17:11:22] - Writing 95923392 bytes of core data to disk...
[17:11:23]   ... Done.
[17:11:31] - Shutting down core
[17:11:31] 
[17:11:31] Folding@home Core Shutdown: FINISHED_UNIT
[17:11:34] CoreStatus = 64 (100)
[17:11:34] Sending work to server
[17:11:34] Project: 2686 (Run 1, Clone 19, Gen 151)


[17:11:34] + Attempting to send results [September 5 17:11:34 UTC]
[17:11:58] - Couldn't send HTTP request to server
[17:11:58] + Could not connect to Work Server (results)
[17:11:58]     (171.67.108.22:8080)
[17:11:58] + Retrying using alternative port
[17:12:21] - Couldn't send HTTP request to server
[17:12:21] + Could not connect to Work Server (results)
[17:12:21]     (171.67.108.22:80)
[17:12:21] - Error: Could not transmit unit 08 (completed September 5) to work server.
[17:12:21]   Keeping unit 08 in queue.
[17:12:21] Project: 2686 (Run 1, Clone 19, Gen 151)


[17:12:21] + Attempting to send results [September 5 17:12:21 UTC]
[17:12:46] - Couldn't send HTTP request to server
[17:12:46] + Could not connect to Work Server (results)
[17:12:46]     (171.67.108.22:8080)
[17:12:46] + Retrying using alternative port
[17:13:09] - Couldn't send HTTP request to server
[17:13:09] + Could not connect to Work Server (results)
[17:13:09]     (171.67.108.22:80)
[17:13:09] - Error: Could not transmit unit 08 (completed September 5) to work server.


[17:13:09] + Attempting to send results [September 5 17:13:09 UTC]
[17:13:10] - Couldn't send HTTP request to server
[17:13:10] + Could not connect to Work Server (results)
[17:13:10]     (171.67.108.25:8080)
[17:13:10] + Retrying using alternative port
[17:13:11] - Couldn't send HTTP request to server
[17:13:11] + Could not connect to Work Server (results)
[17:13:11]     (171.67.108.25:80)
[17:13:11]   Could not transmit unit 08 to Collection server; keeping in queue.
[17:13:11] - Preparing to get new work unit...
[17:13:11] Cleaning up work directory
[17:13:11] + Attempting to get work packet
[17:13:11] Passkey found
[17:13:11] - Connecting to assignment server
[17:13:12] - Successful: assigned to (130.237.232.141).
[17:13:12] + News From Folding@Home: Welcome to Folding@Home
[17:13:12] Loaded queue successfully.
[17:13:12] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[17:13:21] + Attempting to get work packet
[17:13:21] Passkey found
[17:13:21] - Connecting to assignment server
[17:13:22] - Successful: assigned to (130.237.232.141).
[17:13:22] + News From Folding@Home: Welcome to Folding@Home
[17:13:22] Loaded queue successfully.
[17:13:39] Project: 2686 (Run 1, Clone 19, Gen 151)

Re: Unable to upload to 171.67.108.22

Posted: Wed Sep 07, 2011 2:39 pm
by TarogStar
Another WU just finished, and it sent successfully, but the old WU is still in that loop, and I now have a new WU from that same server, successfully downloaded.

Any help is appreciated.

Code:

[13:57:44] Completed 247500 out of 250000 steps  (99%)
[14:16:39] Project: 2686 (Run 1, Clone 19, Gen 151)


[14:16:39] + Attempting to send results [September 7 14:16:39 UTC]
[14:17:03] - Couldn't send HTTP request to server
[14:17:03] + Could not connect to Work Server (results)
[14:17:03]     (171.67.108.22:8080)
[14:17:03] + Retrying using alternative port
[14:17:25] - Couldn't send HTTP request to server
[14:17:25] + Could not connect to Work Server (results)
[14:17:25]     (171.67.108.22:80)
[14:17:25] - Error: Could not transmit unit 08 (completed September 5) to work server.


[14:17:25] + Attempting to send results [September 7 14:17:25 UTC]
[14:17:26] - Couldn't send HTTP request to server
[14:17:26] + Could not connect to Work Server (results)
[14:17:26]     (171.67.108.25:8080)
[14:17:26] + Retrying using alternative port
[14:17:27] - Couldn't send HTTP request to server
[14:17:27] + Could not connect to Work Server (results)
[14:17:27]     (171.67.108.25:80)
[14:17:27]   Could not transmit unit 08 to Collection server; keeping in queue.
[14:26:04] Completed 250000 out of 250000 steps  (100%)
[14:26:19] DynamicWrapper: Finished Work Unit: sleep=10000
[14:26:29] 
[14:26:29] Finished Work Unit:
[14:26:29] - Reading up to 52713120 from "work/wudata_09.trr": Read 52713120
[14:26:29] trr file hash check passed.
[14:26:29] - Reading up to 47064208 from "work/wudata_09.xtc": Read 47064208
[14:26:30] xtc file hash check passed.
[14:26:30] edr file hash check passed.
[14:26:30] logfile size: 204864
[14:26:30] Leaving Run
[14:26:31] - Writing 100150132 bytes of core data to disk...
[14:26:32]   ... Done.
[14:26:39] - Shutting down core
[14:26:39] 
[14:26:39] Folding@home Core Shutdown: FINISHED_UNIT
[14:26:43] CoreStatus = 64 (100)
[14:26:43] Sending work to server
[14:26:43] Project: 6900 (Run 43, Clone 4, Gen 36)


[14:26:43] + Attempting to send results [September 7 14:26:43 UTC]
[14:27:26] + Results successfully sent
[14:27:26] Thank you for your contribution to Folding@Home.
[14:27:26] + Number of Units Completed: 133

[14:27:33] Project: 2686 (Run 1, Clone 19, Gen 151)


[14:27:33] + Attempting to send results [September 7 14:27:33 UTC]
[14:27:56] - Couldn't send HTTP request to server
[14:27:56] + Could not connect to Work Server (results)
[14:27:56]     (171.67.108.22:8080)
[14:27:56] + Retrying using alternative port
[14:28:19] - Couldn't send HTTP request to server
[14:28:19] + Could not connect to Work Server (results)
[14:28:19]     (171.67.108.22:80)
[14:28:19] - Error: Could not transmit unit 08 (completed September 5) to work server.


[14:28:19] + Attempting to send results [September 7 14:28:19 UTC]
[14:28:21] - Couldn't send HTTP request to server
[14:28:21] + Could not connect to Work Server (results)
[14:28:21]     (171.67.108.25:8080)
[14:28:21] + Retrying using alternative port
[14:28:22] - Couldn't send HTTP request to server
[14:28:22] + Could not connect to Work Server (results)
[14:28:22]     (171.67.108.25:80)
[14:28:22]   Could not transmit unit 08 to Collection server; keeping in queue.
[14:28:22] - Preparing to get new work unit...
[14:28:22] Cleaning up work directory
[14:28:22] + Attempting to get work packet
[14:28:22] Passkey found
[14:28:22] - Connecting to assignment server
[14:28:22] - Successful: assigned to (171.67.108.22).
[14:28:22] + News From Folding@Home: Welcome to Folding@Home
[14:28:22] Loaded queue successfully.
[14:28:41] Project: 2686 (Run 1, Clone 19, Gen 151)


[14:28:41] + Attempting to send results [September 7 14:28:41 UTC]
[14:29:04] - Couldn't send HTTP request to server
[14:29:04] + Could not connect to Work Server (results)
[14:29:04]     (171.67.108.22:8080)
[14:29:04] + Retrying using alternative port
[14:29:27] - Couldn't send HTTP request to server
[14:29:27] + Could not connect to Work Server (results)
[14:29:27]     (171.67.108.22:80)
[14:29:27] - Error: Could not transmit unit 08 (completed September 5) to work server.


[14:29:27] + Attempting to send results [September 7 14:29:27 UTC]
[14:29:29] - Couldn't send HTTP request to server
[14:29:29] + Could not connect to Work Server (results)
[14:29:29]     (171.67.108.25:8080)
[14:29:29] + Retrying using alternative port
[14:29:30] - Couldn't send HTTP request to server
[14:29:30] + Could not connect to Work Server (results)
[14:29:30]     (171.67.108.25:80)
[14:29:30]   Could not transmit unit 08 to Collection server; keeping in queue.
[14:29:30] + Closed connections
[14:29:30] 
[14:29:30] + Processing work unit
[14:29:30] Core required: FahCore_a5.exe
[14:29:30] Core found.
[14:29:30] Working on queue slot 00 [September 7 14:29:30 UTC]
[14:29:30] + Working ...
[14:29:30] 
[14:29:30] *------------------------------*
[14:29:30] Folding@Home Gromacs SMP Core
[14:29:30] Version 2.27 (Mar 12, 2010)
[14:29:30] 
[14:29:30] Preparing to commence simulation
[14:29:30] - Looking at optimizations...
[14:29:30] - Created dyn
[14:29:30] - Files status OK
[14:29:43] - Expanded 24822107 -> 30791309 (decompressed 124.0 percent)
[14:29:43] Called DecompressByteArray: compressed_data_size=24822107 data_size=30791309, decompressed_data_size=30791309 diff=0
[14:29:43] - Digital signature verified
[14:29:43] 
[14:29:43] Project: 2684 (Run 11, Clone 8, Gen 102)
[14:29:43] 
[14:29:43] Assembly optimizations on if available.
[14:29:43] Entering M.D.
[14:29:50] Mapping NT from 16 to 16 
[14:29:57] Completed 0 out of 250000 steps  (0%)

Re: Unable to upload to 171.67.108.22

Posted: Wed Sep 07, 2011 4:38 pm
by bruce
TarogStar wrote: I have been unable to upload to this server for 2 days now, and I am almost finished with another WU. It also appears to use a reject server as the backup. . . .
I've merged your posts with an ongoing topic on the same server. It's possibly the same issue that has been discussed earlier in this new topic. I'll make sure the Pande Group is aware of the issue.

The issue with Collection Servers which are supposed to be there as a backup is a long-standing problem which is moving (slowly) toward a final solution.

Re: 171.67.108.22

Posted: Fri Sep 09, 2011 2:31 pm
by TarogStar
The WU finally sent today, after losing all the bonus points. Is that server overworked, or did the Pande Group do something to fix it? I have another WU from the same server due to be completed in about 14 hours. Is there anything I can do on my end if this happens again? Do I just need to keep trying to resend that unit over and over by restarting the client until it sends? It looks like that server has been accepting WUs the whole time that I couldn't send them...

Re: 171.67.108.22

Posted: Fri Sep 09, 2011 3:54 pm
by bruce
There's really nothing you can do on your end.

On Sunday-Monday of this week there were some problems with both servers 130.237.232.141 and 130.237.232.237. (See other topics in this forum.) Those servers distribute similar projects to the ones found on 171.67.108.22. To fix the problem, one of those servers needed to be taken off-line for a few days. With all servers functioning normally, they run very close to optimum capacity, both providing good service and using server resources wisely. With one server off-line, the remaining servers attempt to handle the additional workload but the workload is a challenge for them and the service level suffers somewhat.

Re: 171.67.108.22

Posted: Tue Sep 13, 2011 12:14 pm
by stevew
http://171.67.108.22:8080/ and :80/ show "OK" when addressed in browser; however, upload has failed 5 times to both ports 8080 and 80.

-rw-r--r-- 1 mp8 staff 95190450 Sep 12 20:49 work/wuresults_08.dat (MST)

Code:

[02:49:14] Unit 8 finished with 68 percent of time to deadline remaining.
[02:49:14] Updated performance fraction: 0.894693
[02:49:14] Sending work to server
[02:49:14] Project: 2685 (Run 0, Clone 1, Gen 168)


[02:49:14] + Attempting to send results [September 13 02:49:14 UTC]
[02:49:14] - Reading file work/wuresults_08.dat from core
[02:49:14]   (Read 95190450 bytes from disk)
[02:49:14] Connecting to http://171.67.108.22:8080/
[02:57:17] - Couldn't send HTTP request to server
[02:57:17] + Could not connect to Work Server (results)
[02:57:17]     (171.67.108.22:8080)
Now 17% complete on a Project: 6900 (Run 81, Clone 7, Gen 3) from 130.237.232.141:8080

[Edit: at 12:33 UTC the 2685 WU is being sent to never-never land, 171.67.108.25, which does not exist.]

Re: 171.67.108.22

Posted: Tue Sep 13, 2011 6:07 pm
by bruce
If uploading to the primary work server fails, the client is designed to send the data to the collection server, whether that collection server is operational or not. Nobody can do anything about that part until the new collection server code is fully operational.
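The fallback order seen in the logs in this topic (work server on port 8080, then port 80, then the collection server on the same two ports) can be sketched roughly as follows. This is not the client's actual code; `upload_targets`, `send_with_fallback` and the `try_send` callback are hypothetical names used only to illustrate the order, and the sketch ignores the client's repeated attempts against the work server.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Endpoint = Tuple[str, int]


def upload_targets(work_server: str, collection_server: str,
                   ports: Iterable[int] = (8080, 80)) -> List[Endpoint]:
    """Work server on each port first, then the collection server."""
    ports = tuple(ports)
    return [(host, port)
            for host in (work_server, collection_server)
            for port in ports]


def send_with_fallback(targets: Iterable[Endpoint],
                       try_send: Callable[[str, int], bool]) -> Optional[Endpoint]:
    """Return the first endpoint that accepts the upload, or None if all fail."""
    for host, port in targets:
        if try_send(host, port):
            return (host, port)
    return None  # caller keeps the unit in the queue and retries later


# Addresses taken from the logs in this topic.
targets = upload_targets("171.67.108.22", "171.67.108.25")
# A stand-in sender that always fails, as in the logs above.
result = send_with_fallback(targets, lambda host, port: False)
print(targets, result)
```

Note that the collection server is tried whether or not it is operational, which is why the logs show attempts against 171.67.108.25 that can never succeed.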

As far as the communications problems between you and 171.67.108.22 are concerned, I have no way to determine whether the problem is in your installation, with your ISP, with Stanford's ISP, or with Stanford's server . . . except to check that the server is actively receiving projects from others. During the past three hours, the acceptance rate for uploads has been about 10 WUs per hour. By itself, that number means very little (unless it's zero), but over 24 hours it does seem to vary between 4 and 17, so I'd have to guess that's pretty close to the rate at which these projects are being completed.

This WU was reissued and somebody else has completed it so as far as the science is concerned, the WU has been completed. For you to get credit, though, your copy will need to be successfully uploaded.

Re: 171.67.108.22

Posted: Tue Sep 13, 2011 10:42 pm
by bruce
Another topic, possibly on the same issue: viewtopic.php?f=19&t=19604#p195325

Re: 171.67.108.22

Posted: Wed Sep 14, 2011 2:49 pm
by SKeptical_Thinker
I'm seeing the same issue:

Code:

14:37:07:Connecting to 171.67.108.22:8080
14:37:07:WARNING: WorkServer connection failed on port 8080 trying 80
14:37:07:Connecting to 171.67.108.25:80
14:37:07:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
14:37:07:WARNING: Exception: Failed to send results to work server: Failed to read stream
14:37:07:Trying to send results to collection server
14:37:07:Unit 00: Uploading 12.98KiB
14:37:07:Connecting to 171.67.108.25:8080
14:37:07:WARNING: WorkServer connection failed on port 8080 trying 80
14:37:07:Connecting to 171.67.108.25:80
14:37:07:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
14:38:44:Sending unit results: id:01 state:SEND project:2684 run:9 clone:6 gen:90 core:0xa5 unit:0x1dc4b1114e6f9b86005a000609007c0a
14:38:44:Unit 01: Uploading 12.98KiB
14:38:44:Connecting to 171.67.108.22:8080
14:38:44:WARNING: Exception: Failed to send results to work server: Failed to read stream
14:38:44:Trying to send results to collection server
14:38:44:Unit 01: Uploading 12.98KiB
14:38:44:Connecting to 171.67.108.25:8080
14:38:44:Sending unit results: id:00 state:SEND project:2684 run:9 clone:6 gen:90 core:0xa5 unit:0x56cdfabb4e6fa30e005a000609007c0a
14:38:44:Unit 00: Uploading 12.98KiB
14:38:44:Connecting to 171.67.108.22:8080
14:38:44:WARNING: WorkServer connection failed on port 8080 trying 80
14:38:44:Connecting to 171.67.108.25:80
14:38:44:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
14:38:44:WARNING: Exception: Failed to send results to work server: Failed to read stream
14:38:44:Trying to send results to collection server
14:38:44:Unit 00: Uploading 12.98KiB
14:38:44:Connecting to 171.67.108.25:8080
14:38:44:WARNING: WorkServer connection failed on port 8080 trying 80
14:38:44:Connecting to 171.67.108.25:80
14:38:44:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
The server replies "OK" when I connect to it.

Re: 171.67.108.22

Posted: Wed Sep 14, 2011 4:12 pm
by stevew
Glad that the WU was completed by someone and the science done. I tried restarting the client to force another delivery with Firewall off and trashed the 2685 and the 6900 that was being folded. Cleaned up and doing another 6900, dang.