vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
Moderators: Site Moderators, FAHC Science Team
-
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
- Contact:
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
Is maintenance done on these servers?
-
- Posts: 522
- Joined: Mon Dec 03, 2007 4:33 am
- Location: Australia
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
mikesmusic is still unable to send queued results: viewtopic.php?p=66220#p66220 Can anyone please check if the server is fine?
-
- Posts: 438
- Joined: Mon Dec 03, 2007 1:31 am
- Hardware configuration: Old Faithful CPU: Windows Graphical 5.03; Intel Pentium 4 Processor 540 (3.2GHz) HT; Windows XP
Big Red: Windows SMP Console 6.29; Windows GPU console 6.20r1; Intel Q9450 2.66G; ASUS P5Q 775 P45; [BFG 9800GTX+ old graphics card] NVidia GeForce 8800 GTX [as of 5/9/09]; Windows XP Pro SP3
Lenovo Think Pad: Windows 6.29 w/ SMP; Windows GPU Console 6.20r1 systray; Intel QX9300; NVIDIA Quadro FX-3700M; Windows XP Professional
- Location: SF Peninsula
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
171.64.122.72 is in reject now. Just lost two units: the 4419 that expired and the unit that got a special exit when the program tried to auto send the expired unit. Of the 6 WUs 4419-21 (or so, they're on different machines) that I've gotten, only one has been returned. I'm going to delete the others before they kill units in process too.
-
- Posts: 438
- Joined: Mon Dec 03, 2007 1:31 am
- Location: SF Peninsula
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
I see that 171.64.122.72 is back up now, but it has a really high CPU load (6.86) and a heavy net load (171). Also, the DL (days left on the tape?) is at zero, so I don't know if that is affecting things too.
-
- Posts: 8
- Joined: Tue Sep 09, 2008 8:22 pm
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
My two work units, project 4418, completed 30 Oct and 1 Nov, still won't upload.
Neither (171.67.108.17:8080) nor (171.64.122.72:8080) will respond.
Code:
[06:44:48] + Attempting to send results
[06:45:06] - Couldn't send HTTP request to server
[06:45:06] + Could not connect to Work Server (results)
[06:45:06] (171.67.108.17:8080)
[06:45:06] Could not transmit unit 07 to Collection server; keeping in queue.
[08:05:44] Writing local files
[08:05:44] Completed 820000 out of 2000000 steps (41)
[10:26:44] Writing local files
[10:26:44] Completed 840000 out of 2000000 steps (42)
[12:45:10] + Attempting to send results
[12:47:43] Writing local files
[12:47:43] Completed 860000 out of 2000000 steps (43)
[12:52:31] - Couldn't send HTTP request to server
[12:52:31] + Could not connect to Work Server (results)
[12:52:31] (171.64.122.72:8080)
[12:52:31] - Error: Could not transmit unit 06 (completed October 30) to work server.
[12:52:31] + Attempting to send results
[12:52:32] - Couldn't send HTTP request to server
[12:52:32] + Could not connect to Work Server (results)
[12:52:32] (171.67.108.17:8080)
[12:52:32] Could not transmit unit 06 to Collection server; keeping in queue.
[12:52:32] + Attempting to send results
[12:59:54] - Couldn't send HTTP request to server
[12:59:54] + Could not connect to Work Server (results)
[12:59:54] (171.64.122.72:8080)
[12:59:54] - Error: Could not transmit unit 07 (completed November 1) to work server.
[12:59:54] + Attempting to send results
[12:59:54] - Couldn't send HTTP request to server
[12:59:54] + Could not connect to Work Server (results)
[12:59:54] (171.67.108.17:8080)
[12:59:54] Could not transmit unit 07 to Collection server; keeping in queue.
[15:08:41] Writing local files
[15:08:41] Completed 880000 out of 2000000 steps (44)
[17:31:16] Writing local files
[17:31:16] Completed 900000 out of 2000000 steps (45)
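For anyone wanting to see at a glance which servers a client keeps failing against, here is a minimal Python sketch that tallies the "Could not connect" events per address. It assumes the log format shown above (a "Could not connect to Work Server" line followed by a line holding the address in parentheses); the function name `count_failed_connections` and the `ADDRESS_LINE` pattern are made up for this example and are not part of the Folding@Home client.

```python
import re
from collections import Counter

# Matches the address line the client prints after a failed connection,
# e.g. "[12:52:31] (171.64.122.72:8080)". The host part may be empty.
ADDRESS_LINE = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s+\((\S*):(\d+)\)\s*$")

def count_failed_connections(log_text):
    """Count 'Could not connect' events per server address in a client log."""
    failures = Counter()
    pending = False  # True right after a "Could not connect" line
    for line in log_text.splitlines():
        if "Could not connect to Work Server" in line:
            pending = True
            continue
        match = ADDRESS_LINE.match(line.strip())
        if pending and match:
            host = match.group(1) or "<blank>"
            failures[f"{host}:{match.group(2)}"] += 1
        pending = False
    return failures

# Sample lines copied from the log above.
sample = """\
[12:52:31] + Could not connect to Work Server (results)
[12:52:31] (171.64.122.72:8080)
[12:52:31] - Error: Could not transmit unit 06 (completed October 30) to work server.
[12:52:32] + Could not connect to Work Server (results)
[12:52:32] (171.67.108.17:8080)
[12:52:32] Could not transmit unit 06 to Collection server; keeping in queue.
"""

for address, count in count_failed_connections(sample).most_common():
    print(address, count)
```

Running it over a whole FAHlog.txt instead of the sample should show whether the failures are concentrated on one work server or spread across several.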
-
- Posts: 438
- Joined: Mon Dec 03, 2007 1:31 am
- Location: SF Peninsula
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
171.64.122.72 has a monstrously high net load of 445 and a CPU load of 3.42 with an assignment weight of 9%. Could we maybe turn off all assigning until units come home? I have some outstanding from October.
-
- Posts: 8
- Joined: Tue Sep 09, 2008 8:22 pm
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
That server status page is beyond the ken of this mere mortal. Of the 30 or so servers supposedly accepting jobs for the 'classic' clients, only about two currently have "% Ass 80" in double figures: good ol' vsp05 (171.64.122.72) and VSPMF93 (171.65.103.160). The net load of vsp05 is way beyond anything else. I cannot even ping vsp05 at present. I wonder how many more weeks this will go on.
-
- Posts: 8
- Joined: Tue Sep 09, 2008 8:22 pm
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
Anko, where did you see an assignment weight of 9%? I'm looking at the Weight column in the server stats page. Last time I looked I thought vsp05's weight was '10000', but today it is '5000' (i.e. less). One of my jobs actually was accepted in the last day or so. I have just one (completed Nov 1st) left now. Not that I'm any judge, but it looks like these work packets are a bit on the small side and are overloading the servers by coming back too quickly?
-
- Posts: 438
- Joined: Mon Dec 03, 2007 1:31 am
- Location: SF Peninsula
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
I "misspoke." I was referring to the % ASSigned column, which upon closer reading doesn't actually mean what I thought it did. <blush> I ended up losing another two units: the 4419 that expired and the unit it killed because autosend ran into an expired unit. I went ahead and deleted the last two I had, which were close to expiring, rather than miss the deadline and lose two more. I suspect that you're right: the units are so small that the servers get overloaded with the returns. They go back [or try to] almost as fast as they get sent.
-
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
- Contact:
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
122.78 is currently in Reject mode
-
- Posts: 2
- Joined: Wed Aug 13, 2008 10:52 pm
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
I've got 14 computers folding and I'm considering shutting them down as far as folding is concerned. Every one of them has completed work units that won't upload. When the time limit expires, the current work unit is lost with a "corrupted core". The listing below is from a computer with 6 completed work units that it cannot upload plus it can't get any work.
Is there anyone left at Stanford that gives a damn?
[15:11:09] + Attempting to send results
[15:11:09] - Couldn't send HTTP request to server
[15:11:09] + Could not connect to Work Server (results)
[15:11:09] (171.64.122.72:8080)
[15:11:09] - Error: Could not transmit unit 00 (completed November 10) to work server.
[15:11:09] + Attempting to send results
[15:11:10] - Couldn't send HTTP request to server
[15:11:10] + Could not connect to Work Server (results)
[15:11:10] (171.67.108.17:8080)
[15:11:10] Could not transmit unit 00 to Collection server; keeping in queue.
[15:11:10] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.64.122.72:8080)
[15:11:11] - Error: Could not transmit unit 02 (completed November 11) to work server.
[15:11:11] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.67.108.17:8080)
[15:11:11] Could not transmit unit 02 to Collection server; keeping in queue.
[15:11:11] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.64.65.65:8080)
[15:11:11] - Error: Could not transmit unit 03 (completed November 13) to work server.
[15:11:11] + Attempting to send results
[15:11:12] - Couldn't send HTTP request to server
[15:11:12] + Could not connect to Work Server (results)
[15:11:12] (171.67.108.25:8080)
[15:11:12] Could not transmit unit 03 to Collection server; keeping in queue.
[15:11:12] + Attempting to send results
[15:11:12] - Couldn't send HTTP request to server
[15:11:12] + Could not connect to Work Server (results)
[15:11:12] (:8080)
[15:11:12] - Error: Could not transmit unit 04 (completed November 13) to work server.
[15:11:12] + Attempting to send results
[15:11:13] - Couldn't send HTTP request to server
[15:11:13] + Could not connect to Work Server (results)
[15:11:13] (171.67.108.17:8080)
[15:11:13] Could not transmit unit 04 to Collection server; keeping in queue.
[15:11:13] + Attempting to send results
[15:11:14] - Couldn't send HTTP request to server
[15:11:14] + Could not connect to Work Server (results)
[15:11:14] (171.64.65.111:8080)
[15:11:14] - Error: Could not transmit unit 08 (completed November 10) to work server.
[15:11:14] + Attempting to send results
[15:11:14] - Couldn't send HTTP request to server
[15:11:14] + Could not connect to Work Server (results)
[15:11:14] (171.67.108.17:8080)
[15:11:14] Could not transmit unit 08 to Collection server; keeping in queue.
[15:11:14] + Attempting to send results
[15:11:15] - Couldn't send HTTP request to server
[15:11:15] + Could not connect to Work Server (results)
[15:11:15] (171.64.122.72:8080)
[15:11:15] - Error: Could not transmit unit 09 (completed November 10) to work server.
[15:11:15] + Attempting to send results
[15:11:15] - Couldn't send HTTP request to server
[15:11:15] + Could not connect to Work Server (results)
[15:11:15] (171.67.108.17:8080)
[15:11:15] Could not transmit unit 09 to Collection server; keeping in queue.
[15:28:10] + Attempting to get work packet
[15:28:10] - Connecting to assignment server
[15:28:11] - Successful: assigned to (171.64.65.65).
[15:28:11] + News From Folding@Home: Welcome to Folding@Home
[15:28:11] Loaded queue successfully.
[15:28:11] - Couldn't send HTTP request to server
[15:28:11] (Got status 503)
[15:28:11] + Could not connect to Work Server
[15:28:11] - Error: Attempt #12 to get work failed, and no other work to do.
Waiting before retry.
[16:16:18] + Attempting to get work packet
[16:16:18] - Connecting to assignment server
[16:16:18] - Successful: assigned to (171.64.65.65).
[16:16:18] + News From Folding@Home: Welcome to Folding@Home
[16:16:18] Loaded queue successfully.
[16:16:19] - Couldn't send HTTP request to server
[16:16:19] (Got status 503)
[16:16:19] + Could not connect to Work Server
[16:16:19] - Error: Attempt #13 to get work failed, and no other work to do.
Waiting before retry.
[17:04:22] + Attempting to get work packet
[17:04:22] - Connecting to assignment server
[17:04:22] - Successful: assigned to (171.64.122.72).
[17:04:22] + News From Folding@Home: Welcome to Folding@Home
[17:04:22] Loaded queue successfully.
[17:04:23] - Couldn't send HTTP request to server
[17:04:23] (Got status 503)
[17:04:23] + Could not connect to Work Server
[17:04:23] - Error: Attempt #14 to get work failed, and no other work to do.
Waiting before retry.
[17:52:36] + Attempting to get work packet
[17:52:36] - Connecting to assignment server
[17:52:36] - Successful: assigned to (171.64.65.65).
[17:52:36] + News From Folding@Home: Welcome to Folding@Home
[17:52:37] Loaded queue successfully.
[17:52:37] - Couldn't send HTTP request to server
[17:52:37] (Got status 503)
[17:52:37] + Could not connect to Work Server
[17:52:37] - Error: Attempt #15 to get work failed, and no other work to do.
Waiting before retry.
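A side note on the "Waiting before retry" lines: the gap between work-request attempts can be read straight off the timestamps. The sketch below is only an illustration, assuming the `[HH:MM:SS]` prefix and the "Attempting to get work packet" wording shown in the log above; `retry_gaps` is a hypothetical helper name, and the midnight rollover handling is my own assumption since the log carries no dates.

```python
import re
from datetime import datetime, timedelta

# Matches the start of each work-request attempt in the client log.
ATTEMPT = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] \+ Attempting to get work packet")

def retry_gaps(log_text):
    """Return the time between consecutive 'get work packet' attempts."""
    times = []
    for line in log_text.splitlines():
        match = ATTEMPT.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    gaps = []
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        if gap < timedelta(0):
            # Assumption: a negative gap means the log rolled past midnight.
            gap += timedelta(days=1)
        gaps.append(gap)
    return gaps

# Attempt timestamps copied from the log above (attempts #12 to #15).
sample = """\
[15:28:10] + Attempting to get work packet
[16:16:18] + Attempting to get work packet
[17:04:22] + Attempting to get work packet
[17:52:36] + Attempting to get work packet
"""

print([str(gap) for gap in retry_gaps(sample)])
# prints ['0:48:08', '0:48:04', '0:48:14']
```

So by attempt #12 this client is waiting roughly 48 minutes between tries, which means a server that comes back only briefly can easily be missed.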
-
- Posts: 8
- Joined: Tue Sep 09, 2008 8:22 pm
Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down
TommyHicks wrote: Is there anyone left at Stanford that gives a damn?
That has to be a fair question, Tommy. Here we are 11 days later and there has been no response to your question.
The server stats show that vsp05, vsp11, and vsp15 are very, very busy indeed.
Here is a typical PingPlotter response from vsp05. vsp11 and vsp15 are the same.
Code:
Target Name: vsp05
IP: 171.65.122.78
Date/Time: 24/11/2008 17:44:36
1 1 ms private
2 28 ms private
3 26 ms ge1-3-0-100.core1.ixn.dub.stisp.net [84.203.130.9]
4 30 ms ge1-3-0-98.core1.tcy.dub.stisp.net [84.203.130.2]
5 37 ms [195.66.224.185]
6 49 ms te2-7.ccr02.ams03.atlas.cogentco.com [130.117.1.169]
7 129 ms te7-3.mpd01.ymq02.atlas.cogentco.com [130.117.0.69]
8 137 ms te3-7.mpd01.yyz02.atlas.cogentco.com [154.54.7.213]
9 141 ms te7-8.ccr02.ord01.atlas.cogentco.com [154.54.7.73]
10 151 ms te4-3.ccr02.mci01.atlas.cogentco.com [154.54.6.201]
11 190 ms te8-4.ccr02.sfo01.atlas.cogentco.com [154.54.24.117]
12 189 ms te4-4.mpd01.sjc04.atlas.cogentco.com [154.54.7.174]
13 187 ms Stanford_University2.demarc.cogentco.com [66.250.7.138]
14 196 ms bbrb-isp.Stanford.EDU [171.64.1.155]
15 * [-]
The trace times out after bbrb-isp.Stanford.EDU, which means your work packet has no chance.
Do they care? It's a mystery.
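One caveat on ping and traceroute results: some networks drop ICMP while still accepting TCP, so a dead trace is not proof on its own. A more direct test is to try opening a TCP connection to the port the client uses (8080 in the logs in this thread). The following is a minimal sketch using Python's standard `socket` module; `can_connect` and `report` are names invented for this example, and the 5 second timeout is an arbitrary choice.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def report(hosts, port=8080):
    """Build one status line per host for the given port."""
    lines = []
    for host in hosts:
        state = "open" if can_connect(host, port) else "unreachable"
        lines.append(f"{host}:{port} {state}")
    return lines
```

For example, `print("\n".join(report(["171.64.122.72", "171.67.108.17"])))` would check the work server and collection server addresses from the logs above. An "open" result only shows that the port accepts connections, not that the server will actually take an upload.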