SO explain this one

Moderators: Site Moderators, FAHC Science Team

Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III i7 970 4.3Ghz DDR3 2000 2-500GB Seagate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

SO explain this one

Post by Grandpa_01 »

RANT
I finished a 6701 this morning, which naturally could not be sent. The client then downloaded a 6063 and, while running it, attempted to connect and return the 6701, which it could not. It then completed the 6063 and, yeah, the server accepted it no problem. The client then tried to send the 6701 again and guess what: 503 again. Since the server cannot seem to get its *hit together, I guess I am just supposed to dump these worthless 6701s.
END RANT

Code:

[16:35:51] Completed 1940000 out of 2000000 steps  (97%)
[16:42:43] Completed 1960000 out of 2000000 steps  (98%)
[16:49:36] Completed 1980000 out of 2000000 steps  (99%)
[16:56:27] Completed 2000000 out of 2000000 steps  (100%)
[16:56:27] DynamicWrapper: Finished Work Unit: sleep=10000
[16:56:37] 
[16:56:37] Finished Work Unit:
[16:56:37] - Reading up to 687408 from "work/wudata_08.trr": Read 687408
[16:56:37] trr file hash check passed.
[16:56:37] - Reading up to 42665616 from "work/wudata_08.xtc": Read 42665616
[16:56:37] xtc file hash check passed.
[16:56:37] edr file hash check passed.
[16:56:37] logfile size: 277860
[16:56:37] Leaving Run
[16:56:39] - Writing 43633220 bytes of core data to disk...
[16:56:39]   ... Done.
[16:56:40] - Shutting down core
[16:56:40] 
[16:56:40] Folding@home Core Shutdown: FINISHED_UNIT
[16:56:44] CoreStatus = 64 (100)
[16:56:44] Sending work to server
[16:56:44] Project: 6701 (Run 3, Clone 2, Gen 40)


[16:56:44] + Attempting to send results [September 1 16:56:44 UTC]
[16:57:05] - Couldn't send HTTP request to server
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:8080)
[16:57:05] + Retrying using alternative port
[16:57:05] - Couldn't send HTTP request to server
[16:57:05]   (Got status 503)
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:80)
[16:57:05] - Error: Could not transmit unit 08 (completed September 1) to work server.
[16:57:05]   Keeping unit 08 in queue.
[16:57:05] Project: 6701 (Run 3, Clone 2, Gen 40)


[16:57:05] + Attempting to send results [September 1 16:57:05 UTC]
[16:57:05] - Couldn't send HTTP request to server
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:8080)
[16:57:05] + Retrying using alternative port
[16:57:05] - Couldn't send HTTP request to server
[16:57:05]   (Got status 503)
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:80)
[16:57:05] - Error: Could not transmit unit 08 (completed September 1) to work server.


[16:57:05] + Attempting to send results [September 1 16:57:05 UTC]
[16:57:06] - Couldn't send HTTP request to server
[16:57:06] + Could not connect to Work Server (results)
[16:57:06]     (171.67.108.25:8080)
[16:57:06] + Retrying using alternative port
[16:57:06] - Couldn't send HTTP request to server
[16:57:06]   (Got status 503)
[16:57:06] + Could not connect to Work Server (results)
[16:57:06]     (171.67.108.25:80)
[16:57:06]   Could not transmit unit 08 to Collection server; keeping in queue.
[16:57:06] - Preparing to get new work unit...
[16:57:06] Cleaning up work directory
[16:57:07] + Attempting to get work packet
[16:57:07] Passkey found
[16:57:07] - Connecting to assignment server
[16:57:07] - Successful: assigned to (171.64.65.54).
[16:57:07] + News From Folding@Home: Welcome to Folding@Home
[16:57:07] Loaded queue successfully.
[16:57:09] Project: 6701 (Run 3, Clone 2, Gen 40)


[16:57:09] + Attempting to send results [September 1 16:57:09 UTC]
[16:57:09] + Could not connect to Work Server (results)
[16:57:09]     (171.64.65.56:8080)
[16:57:09] + Retrying using alternative port
[16:57:28] - Couldn't send HTTP request to server
[16:57:28] + Could not connect to Work Server (results)
[16:57:28]     (171.64.65.56:80)
[16:57:28] - Error: Could not transmit unit 08 (completed September 1) to work server.


[16:57:28] + Attempting to send results [September 1 16:57:28 UTC]
[16:57:50] - Couldn't send HTTP request to server
[16:57:50] + Could not connect to Work Server (results)
[16:57:50]     (171.67.108.25:8080)
[16:57:50] + Retrying using alternative port
[16:58:00] + Could not connect to Work Server (results)
[16:58:00]     (171.67.108.25:80)
[16:58:00]   Could not transmit unit 08 to Collection server; keeping in queue.
[16:58:00] + Closed connections
[16:58:00] 
[16:58:00] + Processing work unit
[16:58:00] Core required: FahCore_a3.exe
[16:58:00] Core found.
[16:58:00] Working on queue slot 09 [September 1 16:58:00 UTC]
[16:58:00] + Working ...
[16:58:00] 
[16:58:00] *------------------------------*
[16:58:00] Folding@Home Gromacs SMP Core
[16:58:00] Version 2.22 (Mar 12, 2010)
[16:58:00] 
[16:58:00] Preparing to commence simulation
[16:58:00] - Looking at optimizations...
[16:58:00] - Created dyn
[16:58:00] - Files status OK
[16:58:01] - Expanded 1763811 -> 2249733 (decompressed 127.5 percent)
[16:58:01] Called DecompressByteArray: compressed_data_size=1763811 data_size=2249733, decompressed_data_size=2249733 diff=0
[16:58:01] - Digital signature verified
[16:58:01] 
[16:58:01] Project: 6063 (Run 0, Clone 192, Gen 109)
[16:58:01] 
[16:58:01] Assembly optimizations on if available.
[16:58:01] Entering M.D.
[16:58:07] Completed 0 out of 500000 steps  (0%)
[17:00:59] Completed 5000 out of 500000 steps  (1%)
[17:03:51] Completed 10000 out of 500000 steps  (2%)
[17:06:44] Completed 15000 out of 500000 steps  (3%)
[17:09:36] Completed 20000 out of 500000 steps  (4%)
[17:12:28] Completed 25000 out of 500000 steps  (5%)
[17:15:22] Completed 30000 out of 500000 steps  (6%)
[17:18:14] Completed 35000 out of 500000 steps  (7%)
[17:21:06] Completed 40000 out of 500000 steps  (8%)
[17:23:58] Completed 45000 out of 500000 steps  (9%)
[17:26:49] Completed 50000 out of 500000 steps  (10%)
[17:29:42] Completed 55000 out of 500000 steps  (11%)
[17:32:34] Completed 60000 out of 500000 steps  (12%)
[17:35:26] Completed 65000 out of 500000 steps  (13%)
[17:38:17] Completed 70000 out of 500000 steps  (14%)
[17:41:09] Completed 75000 out of 500000 steps  (15%)
[17:44:01] Completed 80000 out of 500000 steps  (16%)
[17:46:52] Completed 85000 out of 500000 steps  (17%)
[17:49:43] Completed 90000 out of 500000 steps  (18%)
[17:52:34] Completed 95000 out of 500000 steps  (19%)
[17:55:25] Completed 100000 out of 500000 steps  (20%)
[17:58:16] Completed 105000 out of 500000 steps  (21%)
[18:01:07] Completed 110000 out of 500000 steps  (22%)
[18:04:00] Completed 115000 out of 500000 steps  (23%)
[18:06:53] Completed 120000 out of 500000 steps  (24%)
[18:09:47] Completed 125000 out of 500000 steps  (25%)
[18:12:40] Completed 130000 out of 500000 steps  (26%)
[18:15:33] Completed 135000 out of 500000 steps  (27%)
[18:18:24] Completed 140000 out of 500000 steps  (28%)
[18:21:16] Completed 145000 out of 500000 steps  (29%)
[18:24:08] Completed 150000 out of 500000 steps  (30%)
[18:27:00] Completed 155000 out of 500000 steps  (31%)
[18:29:52] Completed 160000 out of 500000 steps  (32%)
[18:32:44] Completed 165000 out of 500000 steps  (33%)
[18:35:36] Completed 170000 out of 500000 steps  (34%)
[18:38:28] Completed 175000 out of 500000 steps  (35%)
[18:41:21] Completed 180000 out of 500000 steps  (36%)
[18:44:14] Completed 185000 out of 500000 steps  (37%)
[18:47:06] Completed 190000 out of 500000 steps  (38%)
[18:50:00] Completed 195000 out of 500000 steps  (39%)
[18:52:53] Completed 200000 out of 500000 steps  (40%)
[18:55:46] Completed 205000 out of 500000 steps  (41%)
[18:58:40] Completed 210000 out of 500000 steps  (42%)
[19:01:33] Completed 215000 out of 500000 steps  (43%)
[19:04:26] Completed 220000 out of 500000 steps  (44%)
[19:07:18] Completed 225000 out of 500000 steps  (45%)
[19:10:09] Completed 230000 out of 500000 steps  (46%)
[19:13:02] Completed 235000 out of 500000 steps  (47%)
[19:15:54] Completed 240000 out of 500000 steps  (48%)
[19:18:47] Completed 245000 out of 500000 steps  (49%)
[19:21:40] Completed 250000 out of 500000 steps  (50%)
[19:24:32] Completed 255000 out of 500000 steps  (51%)
[19:27:25] Completed 260000 out of 500000 steps  (52%)
[19:30:17] Completed 265000 out of 500000 steps  (53%)
[19:33:10] Completed 270000 out of 500000 steps  (54%)
[19:36:02] Completed 275000 out of 500000 steps  (55%)
[19:38:55] Completed 280000 out of 500000 steps  (56%)
[19:41:48] Completed 285000 out of 500000 steps  (57%)
[19:44:42] Completed 290000 out of 500000 steps  (58%)
[19:47:36] Completed 295000 out of 500000 steps  (59%)
[19:50:29] Completed 300000 out of 500000 steps  (60%)
[19:53:20] Completed 305000 out of 500000 steps  (61%)
[19:56:11] Completed 310000 out of 500000 steps  (62%)
[19:59:04] Completed 315000 out of 500000 steps  (63%)
[20:01:57] Completed 320000 out of 500000 steps  (64%)
[20:04:49] Completed 325000 out of 500000 steps  (65%)
[20:07:42] Completed 330000 out of 500000 steps  (66%)
[20:10:34] Completed 335000 out of 500000 steps  (67%)
[20:13:26] Completed 340000 out of 500000 steps  (68%)
[20:16:19] Completed 345000 out of 500000 steps  (69%)
[20:19:12] Completed 350000 out of 500000 steps  (70%)
[20:22:05] Completed 355000 out of 500000 steps  (71%)
[20:24:57] Completed 360000 out of 500000 steps  (72%)
[20:27:49] Completed 365000 out of 500000 steps  (73%)
[20:30:43] Completed 370000 out of 500000 steps  (74%)
[20:33:35] Completed 375000 out of 500000 steps  (75%)
[20:36:26] Completed 380000 out of 500000 steps  (76%)
[20:39:18] Completed 385000 out of 500000 steps  (77%)
[20:42:10] Completed 390000 out of 500000 steps  (78%)
[20:45:02] Completed 395000 out of 500000 steps  (79%)
[20:47:55] Completed 400000 out of 500000 steps  (80%)
[20:50:47] Completed 405000 out of 500000 steps  (81%)
[20:53:39] Completed 410000 out of 500000 steps  (82%)
[20:56:31] Completed 415000 out of 500000 steps  (83%)
[20:59:24] Completed 420000 out of 500000 steps  (84%)
[21:02:17] Completed 425000 out of 500000 steps  (85%)
[21:05:08] Completed 430000 out of 500000 steps  (86%)
[21:08:00] Completed 435000 out of 500000 steps  (87%)
[21:10:53] Completed 440000 out of 500000 steps  (88%)
[21:13:46] Completed 445000 out of 500000 steps  (89%)
[21:16:39] Completed 450000 out of 500000 steps  (90%)
[21:19:32] Completed 455000 out of 500000 steps  (91%)
[21:22:24] Completed 460000 out of 500000 steps  (92%)
[21:25:17] Completed 465000 out of 500000 steps  (93%)
[21:28:10] Completed 470000 out of 500000 steps  (94%)
[21:29:37] Project: 6701 (Run 3, Clone 2, Gen 40)


[21:29:37] + Attempting to send results [September 1 21:29:37 UTC]
[21:29:37] - Couldn't send HTTP request to server
[21:29:37]   (Got status 503)
[21:29:37] + Could not connect to Work Server (results)
[21:29:37]     (171.64.65.56:8080)
[21:29:37] + Retrying using alternative port
[21:29:37] - Couldn't send HTTP request to server
[21:29:37]   (Got status 503)
[21:29:37] + Could not connect to Work Server (results)
[21:29:37]     (171.64.65.56:80)
[21:29:37] - Error: Could not transmit unit 08 (completed September 1) to work server.


[21:29:37] + Attempting to send results [September 1 21:29:37 UTC]
[21:29:37] - Couldn't send HTTP request to server
[21:29:37]   (Got status 503)
[21:29:37] + Could not connect to Work Server (results)
[21:29:37]     (171.67.108.25:8080)
[21:29:37] + Retrying using alternative port
[21:29:37] - Couldn't send HTTP request to server
[21:29:37]   (Got status 503)
[21:29:37] + Could not connect to Work Server (results)
[21:29:37]     (171.67.108.25:80)
[21:29:37]   Could not transmit unit 08 to Collection server; keeping in queue.
[21:31:03] Completed 475000 out of 500000 steps  (95%)
[21:33:55] Completed 480000 out of 500000 steps  (96%)
[21:36:47] Completed 485000 out of 500000 steps  (97%)
[21:39:39] Completed 490000 out of 500000 steps  (98%)
[21:42:31] Completed 495000 out of 500000 steps  (99%)
[21:45:23] Completed 500000 out of 500000 steps  (100%)
[21:45:24] DynamicWrapper: Finished Work Unit: sleep=10000
[21:45:35] 
[21:45:35] Finished Work Unit:
[21:45:35] - Reading up to 3697632 from "work/wudata_09.trr": Read 3697632
[21:45:35] trr file hash check passed.
[21:45:35] edr file hash check passed.
[21:45:35] logfile size: 55287
[21:45:35] Leaving Run
[21:45:37] - Writing 3788471 bytes of core data to disk...
[21:45:37]   ... Done.
[21:45:37] - Shutting down core
[21:45:37] 
[21:45:37] Folding@home Core Shutdown: FINISHED_UNIT
[21:45:41] CoreStatus = 64 (100)
[21:45:41] Sending work to server
[21:45:41] Project: 6063 (Run 0, Clone 192, Gen 109)


[21:45:41] + Attempting to send results [September 1 21:45:41 UTC]
[21:45:57] + Results successfully sent
[21:45:57] Thank you for your contribution to Folding@Home.
[21:45:57] + Number of Units Completed: 85

[21:46:01] Project: 6701 (Run 3, Clone 2, Gen 40)


[21:46:01] + Attempting to send results [September 1 21:46:01 UTC]
[21:46:01] - Couldn't send HTTP request to server
[21:46:01] + Could not connect to Work Server (results)
[21:46:01]     (171.64.65.56:8080)
[21:46:01] + Retrying using alternative port
[21:46:01] + Could not connect to Work Server (results)
[21:46:01]     (171.64.65.56:80)
[21:46:01] - Error: Could not transmit unit 08 (completed September 1) to work server.


[21:46:01] + Attempting to send results [September 1 21:46:01 UTC]
[21:46:11] + Could not connect to Work Server (results)
[21:46:11]     (171.67.108.25:8080)
[21:46:11] + Retrying using alternative port
[21:46:21] + Could not connect to Work Server (results)
[21:46:21]     (171.67.108.25:80)
[21:46:21]   Could not transmit unit 08 to Collection server; keeping in queue.
[21:46:21] - Preparing to get new work unit...
[21:46:21] Cleaning up work directory
[21:46:22] + Attempting to get work packet
[21:46:22] Passkey found
[21:46:22] - Connecting to assignment server
[21:46:23] - Successful: assigned to (171.67.108.22).
[21:46:23] + News From Folding@Home: Welcome to Folding@Home
[21:46:23] Loaded queue successfully.
[21:46:51] Project: 6701 (Run 3, Clone 2, Gen 40)


[21:46:51] + Attempting to send results [September 1 21:46:51 UTC]
[21:46:51] - Couldn't send HTTP request to server
[21:46:51]   (Got status 503)
[21:46:51] + Could not connect to Work Server (results)
[21:46:51]     (171.64.65.56:8080)
[21:46:51] + Retrying using alternative port
[21:46:51] + Could not connect to Work Server (results)
[21:46:51]     (171.64.65.56:80)
[21:46:51] - Error: Could not transmit unit 08 (completed September 1) to work server.


[21:46:51] + Attempting to send results [September 1 21:46:51 UTC]
[21:46:58] + Could not connect to Work Server (results)
[21:46:58]     (171.67.108.25:8080)
[21:46:58] + Retrying using alternative port
[21:47:19] + Could not connect to Work Server (results)
[21:47:19]     (171.67.108.25:80)
[21:47:19]   Could not transmit unit 08 to Collection server; keeping in queue.
[21:47:19] + Closed connections
[21:47:19] 
[21:47:19] + Processing work unit
[21:47:19] Core required: FahCore_a3.exe
[21:47:19] Core found.
[21:47:19] Working on queue slot 00 [September 1 21:47:19 UTC]
[21:47:19] + Working ...
[21:47:20] 
[21:47:20] *------------------------------*
[21:47:20] Folding@Home Gromacs SMP Core
[21:47:20] Version 2.22 (Mar 12, 2010)
[21:47:20] 
[21:47:20] Preparing to commence simulation
[21:47:20] - Looking at optimizations...
[21:47:20] - Created dyn
[21:47:20] - Files status OK
[21:47:24] - Expanded 26405078 -> 32704173 (decompressed 123.8 percent)
[21:47:24] Called DecompressByteArray: compressed_data_size=26405078 data_size=32704173, decompressed_data_size=32704173 diff=0
[21:47:24] - Digital signature verified
[21:47:24] 
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
parkut
Posts: 366
Joined: Tue Feb 12, 2008 7:33 am
Hardware configuration: Running exclusively Linux headless blades. All are dedicated crunching machines.
Location: SE Michigan, USA

Re: SO explain this one

Post by parkut »

According to this page: http://fah-web.stanford.edu/localinfo/contact.SMP.html

The server has a net load of 200 (very high), which might be the reason you're not able to connect.
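
To see how consistently the uploads fail, the log posted above can be tallied per server and port. The following is only a minimal sketch: it assumes the line layout shown in that log (a failed connect followed by an indented `(ip:port)` line, and an optional `(Got status NNN)` line), and the function name `tally_failures` and the `SAMPLE` excerpt are made up for illustration, not part of the client.

```python
import re
from collections import Counter

# Matches the indented "(ip:port)" line printed after a failed connect.
ENDPOINT_RE = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}\]\s+\((\d+\.\d+\.\d+\.\d+):(\d+)\)\s*$"
)
# Matches the HTTP status the client reports, e.g. "(Got status 503)".
STATUS_RE = re.compile(r"\(Got status (\d+)\)")


def tally_failures(log_text):
    """Count failed connects per (ip, port) and the HTTP statuses seen."""
    endpoints = Counter()
    statuses = Counter()
    for line in log_text.splitlines():
        endpoint = ENDPOINT_RE.match(line)
        if endpoint:
            endpoints[(endpoint.group(1), int(endpoint.group(2)))] += 1
        status = STATUS_RE.search(line)
        if status:
            statuses[int(status.group(1))] += 1
    return endpoints, statuses


# Excerpt copied from the log in the first post.
SAMPLE = """\
[16:57:05] - Couldn't send HTTP request to server
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:8080)
[16:57:05] + Retrying using alternative port
[16:57:05] - Couldn't send HTTP request to server
[16:57:05]   (Got status 503)
[16:57:05] + Could not connect to Work Server (results)
[16:57:05]     (171.64.65.56:80)
"""

if __name__ == "__main__":
    endpoints, statuses = tally_failures(SAMPLE)
    for (ip, port), count in sorted(endpoints.items()):
        print(f"{ip}:{port} failed {count} time(s)")
    print("statuses:", dict(statuses))
```

Run against a full client log, this gives a quick picture of whether every failure points at the same work server and collection server, which is what a heavily loaded server would look like from the donor's side.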
314159
Posts: 232
Joined: Sun Dec 02, 2007 2:46 am
Location: http://www.teammacosx.org/

Re: SO explain this one

Post by 314159 »

I have several queued as I write this and others due to complete shortly. (relatively large dedicated folding farm here)

This server is ONCE AGAIN pinned at ~200 Net Load. I do not understand this at all, i.e. why the condition exists or why someone is not following this server and correcting the situation when it occurs (way too frequently).

I was on the serverstat page several days ago and observed how the Net Load was reduced from ~200 down to ~57 when someone at the PG actually took some action.

Good luck on any of these being submitted until someone from the Pande Group corrects this recurring issue.

"503" - "503" - "503" (obviously) on each submission attempt.

I really do not care but this situation is irritating many of the SMP DONORS to this project. :)
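
For context, HTTP 503 means "Service Unavailable": the server answered but is refusing requests for now, so the result is still worth keeping and retrying later. The sketch below shows one generic way an uploader could treat that case. It is an assumption for illustration only, not how the Folding@home client actually schedules its retries; the names `should_keep_in_queue` and `backoff_delays` and all of the timing values are hypothetical.

```python
# 503 Service Unavailable: the server is reachable but not accepting work now.
RETRYABLE_STATUSES = {503}


def should_keep_in_queue(status):
    """Return True when a failed upload is worth retrying later.

    status is the HTTP status code, or None when no connection was made.
    """
    return status is None or status in RETRYABLE_STATUSES


def backoff_delays(base_seconds=60, factor=2, cap_seconds=3600, attempts=8):
    """Return capped exponential retry delays in seconds (hypothetical values)."""
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays


if __name__ == "__main__":
    print(should_keep_in_queue(503))
    print(backoff_delays())
```

Spacing retries out like this keeps a large farm from adding to the load on a server that is already pinned.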
John (from the central part of the Commonwealth of Virginia, U.S.A.)

A friendly visitor to what hopefully will remain a friendly Forum.
With thanks to all of the dedicated volunteers on the staff here!!
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: SO explain this one

Post by 7im »

I think the topic has been addressed already, just not resolved yet. ;)

More SMP server power coming on line in the near future
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
314159
Posts: 232
Joined: Sun Dec 02, 2007 2:46 am
Location: http://www.teammacosx.org/

Re: SO explain this one

Post by 314159 »

Hi 7im,

Long time no see but I do read the forum somewhat actively and have seen that you are still effectively supporting "folders in need".
Keep up the good work, Dude. 8-)

See the other thread with Peter's kind reply.
He mentions "stuck transfers". Weird situation! (Relatively large uploads are probably the cause, but I have no idea.)

Can't wait for the new SMP Power! (and Yeah, I did read that link the day it was posted). :)
John (from the central part of the Commonwealth of Virginia, U.S.A.)

A friendly visitor to what hopefully will remain a friendly Forum.
With thanks to all of the dedicated volunteers on the staff here!!