171.64.65.56 ???

Moderators: Site Moderators, FAHC Science Team

lobuxracer
Posts: 18
Joined: Mon Aug 18, 2008 5:49 am

Re: 171.64.65.56 ???

Post by lobuxracer »

You are not alone. HTTP 503 here too.
susato
Site Moderator
Posts: 511
Joined: Fri Nov 30, 2007 4:57 am
Location: Team MacOSX

Re: 171.64.65.56 ???

Post by susato »

Netload's back at 200. Every time Dr. K. bumps it, it heads right back to 200 connections and trouble. Good to hear about the upcoming equipment upgrade.
ThunderRd
Posts: 78
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: 171.64.65.56 ???

Post by ThunderRd »

susato wrote: Netload's back at 200. Every time Dr. K. bumps it, it heads right back to 200 connections and trouble. Good to hear about the upcoming equipment upgrade.
Actually it's at 202 ATM.

An additional server would be welcome. I've got 5 machines waiting for work now, 4 of them for over 12 hours.
snapshot
Posts: 132
Joined: Thu Apr 09, 2009 7:25 pm
Location: Wiltshire, UK

Re: 171.64.65.56 ???

Post by snapshot »

I've got two WUs waiting for upload with their bonus points frittering away...
Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: 171.64.65.56 ???

Post by Tobit »

65.56 needs another kick please. At the time of this post, net load is over 200 again. :(
Datsun 1600
Posts: 33
Joined: Mon May 05, 2008 2:42 am

Re: 171.64.65.56 ???

Post by Datsun 1600 »

How can I get permanently assigned to Server 171.64.65.54? I have no problems with that one, but whenever I get assigned a WU from Server 171.64.65.56 and go to return it, that server is usually overloaded. GRRRRRRRRRRR
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: 171.64.65.56 ???

Post by VijayPande »

I'm sorry about this. We're monitoring the situation and have a longer-term solution (new servers, with work distributed amongst them), but that will take some time to get online (sorry it's taking so long).
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
7up1n3
Posts: 68
Joined: Sun Dec 02, 2007 2:55 am

Re: 171.64.65.56 ???

Post by 7up1n3 »

Getting 503s here too now.

Code:

[13:11:12] Completed 2000000 out of 2000000 steps  (100%)
[13:11:12] DynamicWrapper: Finished Work Unit: sleep=10000
[13:11:23] 
[13:11:23] Finished Work Unit:
[13:11:23] - Reading up to 687408 from "work/wudata_05.trr": Read 687408
[13:11:23] trr file hash check passed.
[13:11:23] - Reading up to 42672364 from "work/wudata_05.xtc": Read 42672364
[13:11:23] xtc file hash check passed.
[13:11:23] edr file hash check passed.
[13:11:23] logfile size: 279301
[13:11:23] Leaving Run
[13:11:23] - Writing 43641409 bytes of core data to disk...
[13:11:24]   ... Done.
[13:11:25] - Shutting down core
[13:11:25] 
[13:11:25] Folding@home Core Shutdown: FINISHED_UNIT
[13:11:29] CoreStatus = 64 (100)
[13:11:29] Unit 5 finished with 91 percent of time to deadline remaining.
[13:11:29] Updated performance fraction: 0.673357
[13:11:29] Sending work to server
[13:11:29] Project: 6701 (Run 56, Clone 11, Gen 58)


[13:11:29] + Attempting to send results [October 5 13:11:29 UTC]
[13:11:29] - Reading file work/wuresults_05.dat from core
[13:11:29]   (Read 43641409 bytes from disk)
[13:11:29] Connecting to http://171.64.65.56:8080/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.64.65.56:8080)
[13:11:29] + Retrying using alternative port
[13:11:29] Connecting to http://171.64.65.56:80/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.64.65.56:80)
[13:11:29] - Error: Could not transmit unit 05 (completed October 5) to work server.
[13:11:29] - 1 failed uploads of this unit.
[13:11:29]   Keeping unit 05 in queue.
[13:11:29] Trying to send all finished work units
[13:11:29] Project: 6701 (Run 56, Clone 11, Gen 58)


[13:11:29] + Attempting to send results [October 5 13:11:29 UTC]
[13:11:29] - Reading file work/wuresults_05.dat from core
[13:11:29]   (Read 43641409 bytes from disk)
[13:11:29] Connecting to http://171.64.65.56:8080/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.64.65.56:8080)
[13:11:29] + Retrying using alternative port
[13:11:29] Connecting to http://171.64.65.56:80/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.64.65.56:80)
[13:11:29] - Error: Could not transmit unit 05 (completed October 5) to work server.
[13:11:29] - 2 failed uploads of this unit.


[13:11:29] + Attempting to send results [October 5 13:11:29 UTC]
[13:11:29] - Reading file work/wuresults_05.dat from core
[13:11:29]   (Read 43641409 bytes from disk)
[13:11:29] Connecting to http://171.67.108.25:8080/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.67.108.25:8080)
[13:11:29] + Retrying using alternative port
[13:11:29] Connecting to http://171.67.108.25:80/
[13:11:29] - Couldn't send HTTP request to server
[13:11:29]   (Got status 503)
[13:11:29] + Could not connect to Work Server (results)
[13:11:29]     (171.67.108.25:80)
[13:11:29]   Could not transmit unit 05 to Collection server; keeping in queue.
[13:11:29] + Sent 0 of 1 completed units to the server
[13:11:29] - Preparing to get new work unit...
[13:11:29] Cleaning up work directory
[13:11:31] + Attempting to get work packet
[13:11:31] Passkey found
[13:11:31] - Will indicate memory of 12285 MB
[13:11:31] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 10
[13:11:31] - Connecting to assignment server
[13:11:31] Connecting to http://assign.stanford.edu:8080/
[13:11:32] Posted data.
[13:11:32] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[13:11:32] + News From Folding@Home: Welcome to Folding@Home
[13:11:32] Loaded queue successfully.
[13:11:32] Sent data
[13:11:32] Connecting to http://171.64.65.54:8080/
[13:11:33] Posted data.
[13:11:33] Initial: 0000; - Receiving payload (expected size: 1765609)
[13:11:35] - Downloaded at ~862 kB/s
[13:11:35] - Averaged speed for that direction ~615 kB/s
[13:11:35] + Received work.
[13:11:35] Trying to send all finished work units
[13:11:35] Project: 6701 (Run 56, Clone 11, Gen 58)


[13:11:35] + Attempting to send results [October 5 13:11:35 UTC]
[13:11:35] - Reading file work/wuresults_05.dat from core
[13:11:35]   (Read 43641409 bytes from disk)
[13:11:35] Connecting to http://171.64.65.56:8080/
[13:11:36] - Couldn't send HTTP request to server
[13:11:36]   (Got status 503)
[13:11:36] + Could not connect to Work Server (results)
[13:11:36]     (171.64.65.56:8080)
[13:11:36] + Retrying using alternative port
[13:11:36] Connecting to http://171.64.65.56:80/
[13:11:36] - Couldn't send HTTP request to server
[13:11:36]   (Got status 503)
[13:11:36] + Could not connect to Work Server (results)
[13:11:36]     (171.64.65.56:80)
[13:11:36] - Error: Could not transmit unit 05 (completed October 5) to work server.
[13:11:36] - 3 failed uploads of this unit.


[13:11:36] + Attempting to send results [October 5 13:11:36 UTC]
[13:11:36] - Reading file work/wuresults_05.dat from core
[13:11:36]   (Read 43641409 bytes from disk)
[13:11:36] Connecting to http://171.67.108.25:8080/
[13:11:36] - Couldn't send HTTP request to server
[13:11:36]   (Got status 503)
[13:11:36] + Could not connect to Work Server (results)
[13:11:36]     (171.67.108.25:8080)
[13:11:36] + Retrying using alternative port
[13:11:36] Connecting to http://171.67.108.25:80/
[13:11:36] - Couldn't send HTTP request to server
[13:11:36]   (Got status 503)
[13:11:36] + Could not connect to Work Server (results)
[13:11:36]     (171.67.108.25:80)
[13:11:36]   Could not transmit unit 05 to Collection server; keeping in queue.
[13:11:36] + Sent 0 of 1 completed units to the server
[13:11:36] + Closed connections
[13:11:36] 
[13:11:36] + Processing work unit
[13:11:36] Core required: FahCore_a3.exe
[13:11:36] Core found.
[13:11:36] Working on queue slot 06 [October 5 13:11:36 UTC]
[13:11:36] + Working ...
[13:11:36] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 06 -np 8 -checkpoint 15 -verbose -lifeline 2668 -version 630'

[13:11:36] 
[13:11:36] *------------------------------*
[13:11:36] Folding@Home Gromacs SMP Core
[13:11:36] Version 2.22 (Mar 12, 2010)
[13:11:36] 
[13:11:36] Preparing to commence simulation
[13:11:36] - Looking at optimizations...
[13:11:36] - Created dyn
[13:11:36] - Files status OK
[13:11:36] - Expanded 1765097 -> 2251569 (decompressed 127.5 percent)
[13:11:36] Called DecompressByteArray: compressed_data_size=1765097 data_size=2251569, decompressed_data_size=2251569 diff=0
[13:11:36] - Digital signature verified
[13:11:36] 
[13:11:36] Project: 6055 (Run 1, Clone 160, Gen 45)
[13:11:36] 
[13:11:36] Assembly optimizations on if available.
[13:11:36] Entering M.D.
[13:11:43] Completed 0 out of 500000 steps  (0%)
What happens to WUs that fail to upload like this? Do they expire when the deadline passes? Is the bonus frittered away?
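For anyone wanting to tally failures like these in their own log, here is a minimal sketch in Python. It assumes the line format seen in the log above ("Connecting to http://host:port/" followed later by "(Got status NNN)"). The function name count_http_failures and the regular expressions are illustrative only and are not part of the client.

```python
import re
from collections import Counter

# Patterns matched against the log format pasted above (an assumption:
# other client versions may word these lines differently).
CONNECT_RE = re.compile(r"Connecting to http://([\w.-]+):(\d+)/")
STATUS_RE = re.compile(r"\(Got status (\d+)\)")


def count_http_failures(log_text):
    """Count "(Got status NNN)" lines per (host, port, status).

    Each status line is attributed to the most recent "Connecting to" line.
    Connections that never report a status are not counted.
    """
    failures = Counter()
    current = None  # (host, port) of the connection being attempted
    for line in log_text.splitlines():
        connect = CONNECT_RE.search(line)
        if connect:
            current = (connect.group(1), int(connect.group(2)))
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            failures[(current[0], current[1], int(status.group(1)))] += 1
            current = None
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < FAHlog.txt
    for (host, port, code), n in sorted(count_http_failures(sys.stdin.read()).items()):
        print(f"{host}:{port} status {code}: {n}")
```

Grouping by host and port makes it easy to see whether only one work server is refusing uploads or whether the collection server is refusing them as well.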
Rage3D Admin ~ The Fighting 300 ~ Team Rage3D Folding
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: 171.64.65.56 ???

Post by PantherX »

7up1n3 wrote: ...What happens to WUs that fail to upload like this? Do they expire when the deadline passes? Is the bonus frittered away?
In this case, your bonus points will be reduced according to the delay caused by the Server. If you cross the Preferred Deadline, you will be assigned Base Credits. If you cross the Final Deadline, you will not get any credits and the Client will delete the WU and move on. Sorry, but that is the way it currently works. Hopefully when new SMP Servers are added, this will be history and we will all be happy.
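The three tiers described above can be sketched roughly as follows. This is only an illustration: the function name, the use of days as the unit, and the bonus_credit callback are assumptions, and the actual bonus formula is not given in this thread.

```python
def credit_for_upload(days_elapsed, preferred_deadline, final_deadline,
                      base_credit, bonus_credit):
    """Illustrative credit tiers; not the project's actual scoring code.

    days_elapsed       -- time from assignment to successful upload
    preferred_deadline -- days after which only base credit is awarded
    final_deadline     -- days after which no credit is awarded
    base_credit        -- base points for the work unit
    bonus_credit       -- hypothetical callable: days_elapsed -> points,
                          used only before the preferred deadline
    """
    if days_elapsed > final_deadline:
        return 0.0            # past the Final Deadline: no credit
    if days_elapsed > preferred_deadline:
        return base_credit    # past the Preferred Deadline: base only
    return bonus_credit(days_elapsed)


if __name__ == "__main__":
    # Made-up numbers, purely to exercise the three branches.
    def toy_bonus(days):
        return 1000.0 * (1.0 + 1.0 / max(days, 0.1))

    for days in (0.5, 2.0, 5.0, 9.0):
        print(days, credit_for_upload(days, 4.0, 6.0, 1000.0, toy_bonus))
```

The point of the sketch is that an upload stuck behind a 503 keeps increasing days_elapsed, so the result can slide from the bonus tier to base credit and eventually to nothing.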
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: 171.64.65.56 ???

Post by codysluder »

VijayPande wrote: I'm sorry about this. We're monitoring the situation and have a longer-term solution (new servers, with work distributed amongst them), but that will take some time to get online (sorry it's taking so long).
I understand how long it takes to get new servers online, but there's a real conflict between how long that's going to take and how long it takes for our QRBonus to decay. Are there any short-term solutions that can help? Why can't at least some of the work be distributed, as has been suggested here?
Datsun 1600 wrote: How can I get permanently assigned to Server 171.64.65.54? I have no problems with that one, but whenever I get assigned a WU from Server 171.64.65.56 and go to return it, that server is usually overloaded. GRRRRRRRRRRR
7up1n3
Posts: 68
Joined: Sun Dec 02, 2007 2:55 am

Re: 171.64.65.56 ???

Post by 7up1n3 »

PantherX wrote:
7up1n3 wrote: ...What happens to WUs that fail to upload like this? Do they expire when the deadline passes? Is the bonus frittered away?
In this case, your bonus points will be reduced according to the delay caused by the Server. If you cross the Preferred Deadline, you will be assigned Base Credits. If you cross the Final Deadline, you will not get any credits and the Client will delete the WU and move on. Sorry, but that is the way it currently works. Hopefully when new SMP Servers are added, this will be history and we will all be happy.
Wow. It shouldn't be that difficult to read the code detailing the beginning and end of the processing time and reward the contribution accordingly, so that users aren't penalized for server-side issues.
Rage3D Admin ~ The Fighting 300 ~ Team Rage3D Folding
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: 171.64.65.56 ???

Post by codysluder »

You're assuming that all failure to upload problems are server-side issues. That may be true much of the time, and it's certainly true right now, but I'll bet that folks would find a way to cheat. I have not tried it, but what happens if you adjust your clock to show that the WU finished one minute after you downloaded it, then correct the clock before you upload the WU, blaming all of your processing time on a server outage?

The points cannot be based on any clock other than the server clock.
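To make that concrete, here is a small sketch of why a client-reported finish time can't be trusted. The record layout and the names WorkUnitRecord, trusted_elapsed and claimed_elapsed are made up for illustration; nothing here is taken from the actual server code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class WorkUnitRecord:
    issued_at: datetime              # server clock: when the WU was assigned
    received_at: datetime            # server clock: when the result arrived
    client_claimed_finish: datetime  # client clock: the user can set this freely


def trusted_elapsed(record):
    """Elapsed time measured only with timestamps the server recorded itself."""
    return record.received_at - record.issued_at


def claimed_elapsed(record):
    """Elapsed time if the server believed the client's own clock."""
    return record.client_claimed_finish - record.issued_at


if __name__ == "__main__":
    issued = datetime(2000, 1, 1, 12, 0)  # arbitrary date for the example
    record = WorkUnitRecord(
        issued_at=issued,
        received_at=issued + timedelta(days=3),
        # Clock set back so the WU appears finished one minute after download.
        client_claimed_finish=issued + timedelta(minutes=1),
    )
    print("server-measured:", trusted_elapsed(record))
    print("client-claimed: ", claimed_elapsed(record))
```

If points were based on the claimed value, three days of processing would look like one minute of work plus a long "outage", which is exactly the loophole described above.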

Stanford said that they recognized the possibility of server problems and that they'd do their best to maintain the servers but it was a risk that you'd have to accept. Are you saying that you don't believe that they are making a sincere effort to correct the problems? If so, I disagree with you.
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: 171.64.65.56 ???

Post by codysluder »

Assuming that the problems with this server will continue until a new server comes on-line, some folks might actually benefit by temporarily suspending SMP. Multiple uniprocessor clients normally earn less PPD than SMP, but you don't have to reduce the PPD for SMP very much before multiple clients win out because of their reliability. At least it's an alternative to consider.
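A rough way to check the break-even point for your own machine is sketched below. The function names and every number in the example are hypothetical; plug in the PPD figures you actually observe.

```python
def smp_break_even_fraction(smp_ppd, uni_ppd, n_clients):
    """Fraction of nominal SMP PPD at which n uniprocessor clients tie it."""
    if smp_ppd <= 0:
        raise ValueError("smp_ppd must be positive")
    return (n_clients * uni_ppd) / smp_ppd


def better_choice(smp_ppd, uni_ppd, n_clients, smp_fraction_retained):
    """Compare SMP PPD, scaled by the fraction surviving upload delays,
    against the combined PPD of n uniprocessor clients."""
    effective_smp = smp_ppd * smp_fraction_retained
    if n_clients * uni_ppd > effective_smp:
        return "uniprocessor"
    return "smp"


if __name__ == "__main__":
    # Made-up figures: one SMP client at 10000 PPD versus
    # eight uniprocessor clients at 1000 PPD each.
    print(smp_break_even_fraction(10000, 1000, 8))
    print(better_choice(10000, 1000, 8, smp_fraction_retained=0.7))
```

With those made-up figures, the uniprocessor clients come out ahead as soon as upload delays cost the SMP client more than a fifth of its nominal PPD.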
Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: 171.64.65.56 ???

Post by Tobit »

Please kick it again; net load is over 200 yet again.
7up1n3
Posts: 68
Joined: Sun Dec 02, 2007 2:55 am

Re: 171.64.65.56 ???

Post by 7up1n3 »

codysluder wrote: You're assuming that all failure to upload problems are server-side issues.
I'm not assuming that at all. But I am assuming that, when server issues do arise, a system could be implemented to address them. This has been done in the past, with mass point credits as users are "caught up", and would simply need to be adjusted to accommodate the bonus system.