171.67.108.200 - Internal Server Error

Moderators: Site Moderators, FAHC Science Team

aoeu
Posts: 87
Joined: Thu Dec 31, 2009 9:07 pm

Re: 171.67.108.200 - Internal Server Error

Post by aoeu »

First the warnings and Errors Log starting just before the current problem.

Code:

******************************* Date: 2015-05-16 *******************************
******************************* Date: 2015-05-16 *******************************
******************************* Date: 2015-05-16 *******************************
16:09:51:WARNING:WU03:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
17:45:10:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-16 *******************************
21:51:44:WARNING:WU02:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
23:29:18:WARNING:WU00:FS01:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
00:54:43:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
02:01:30:WARNING:WU03:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-17 *******************************
05:52:39:WARNING:WU02:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
07:51:55:WARNING:WU03:FS01:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-17 *******************************
09:47:46:WU03:FS01:0x18:WARNING:Console control signal 1 on PID 3388
09:47:46:WU01:FS02:0x18:WARNING:Console control signal 1 on PID 5204
09:47:46:WU03:FS01:0x18:ERROR:103: Lost client lifeline
12:33:28:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
Here is the Log from just before the last error.

Code:

12:33:27:WU02:FS00:0xa4:Completed 4950000 out of 5000000 steps  (99%)
12:33:28:WU00:FS00:Connecting to 171.67.108.200:8080
12:33:28:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
12:33:28:WU00:FS00:Connecting to 171.67.108.204:80
12:33:29:WU00:FS00:Assigned to work server 155.247.166.219
12:33:29:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:6 from 155.247.166.219
12:33:29:WU00:FS00:Connecting to 155.247.166.219:8080
12:33:29:WU00:FS00:Downloading 116.65KiB
12:33:29:WU00:FS00:Download complete
12:33:29:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:6398 run:143 clone:22 gen:85 core:0xa4 unit:0x0000005f0002894b5462cbc3412244b2
12:35:17:WU01:FS02:0x18:Completed 2650000 out of 5000000 steps (53%)
12:35:51:WU03:FS01:0x18:Completed 1650000 out of 5000000 steps (33%)
12:37:18:WU02:FS00:0xa4:Completed 5000000 out of 5000000 steps  (100%)
12:37:18:WU02:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
12:37:28:WU02:FS00:0xa4:
12:37:28:WU02:FS00:0xa4:Finished Work Unit:
12:37:28:WU02:FS00:0xa4:- Reading up to 1256376 from "02/wudata_01.trr": Read 1256376
12:37:28:WU02:FS00:0xa4:trr file hash check passed.
12:37:28:WU02:FS00:0xa4:- Reading up to 111504 from "02/wudata_01.xtc": Read 111504
12:37:28:WU02:FS00:0xa4:xtc file hash check passed.
12:37:28:WU02:FS00:0xa4:edr file hash check passed.
12:37:28:WU02:FS00:0xa4:logfile size: 88690
12:37:28:WU02:FS00:0xa4:Leaving Run
12:37:31:WU02:FS00:0xa4:- Writing 1528470 bytes of core data to disk...
12:37:32:WU02:FS00:0xa4:Done: 1527958 -> 1293596 (compressed to 84.6 percent)
12:37:32:WU02:FS00:0xa4:  ... Done.
12:37:33:WU02:FS00:0xa4:- Shutting down core
12:37:33:WU02:FS00:0xa4:
12:37:33:WU02:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
12:37:33:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:37:33:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:6395 run:41 clone:38 gen:114 core:0xa4 unit:0x000000780002894b5462c743312d9110
12:37:33:WU02:FS00:Uploading 1.23MiB to 155.247.166.219
12:37:33:WU02:FS00:Connecting to 155.247.166.219:8080
12:37:33:WU00:FS00:Starting
12:37:33:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/aoeu/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 2204 -checkpoint 15 -np 6
12:37:33:WU00:FS00:Started FahCore on PID 5964
12:37:33:WU00:FS00:Core PID:1772
12:37:33:WU00:FS00:FahCore 0xa4 started
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:*------------------------------*
12:37:34:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
12:37:34:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Preparing to commence simulation
12:37:34:WU00:FS00:0xa4:- Looking at optimizations...
12:37:34:WU00:FS00:0xa4:- Created dyn
12:37:34:WU00:FS00:0xa4:- Files status OK
12:37:34:WU00:FS00:0xa4:- Expanded 118937 -> 270464 (decompressed 227.4 percent)
12:37:34:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=118937 data_size=270464, decompressed_data_size=270464 diff=0
12:37:34:WU00:FS00:0xa4:- Digital signature verified
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Project: 6398 (Run 143, Clone 22, Gen 85)
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Assembly optimizations on if available.
12:37:34:WU00:FS00:0xa4:Entering M.D.
12:37:36:WU02:FS00:Upload complete
12:37:36:WU02:FS00:Server responded WORK_ACK (400)
12:37:36:WU02:FS00:Final credit estimate, 1515.00 points
12:37:36:WU02:FS00:Cleaning up
12:37:39:WU00:FS00:0xa4:Mapping NT from 6 to 6 
12:37:39:WU00:FS00:0xa4:Completed 0 out of 5000000 steps  (0%)
12:41:10:WU00:FS00:0xa4:Completed 50000 out of 5000000 steps  (1%)
12:41:55:WU03:FS01:0x18:Completed 1700000 out of 5000000 steps (34%)
12:44:42:WU00:FS00:0xa4:Completed 100000 out of 5000000 steps  (2%)
12:47:02:WU01:FS02:0x18:Completed 2700000 out of 5000000 steps (54%)
12:47:58:WU03:FS01:0x18:Completed 1750000 out of 5000000 steps (35%)
12:48:03:WU00:FS00:0xa4:Completed 150000 out of 5000000 steps  (3%)
12:51:36:WU00:FS00:0xa4:Completed 200000 out of 5000000 steps  (4%)
I have three slots, all have WUs, and are folding.
I hope this is helpful.
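For anyone who wants to tally these warnings rather than read them by eye, here is a minimal sketch in Python. It assumes the warning lines look exactly like the ones quoted above; the pattern is inferred from these excerpts, not from any official FAHClient log specification, and the names `FAILED_ASSIGNMENT` and `count_failed_assignments` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the warning lines quoted in this thread (an assumption,
# not an official FAHClient log format definition).
FAILED_ASSIGNMENT = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):WARNING:(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"Failed to get assignment from '(?P<host>[\d.]+):(?P<port>\d+)': "
    r"(?P<code>\d+): Server responded: (?P<reason>\w+)$"
)


def count_failed_assignments(lines):
    """Count 'Failed to get assignment' warnings per (host, port) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_ASSIGNMENT.match(line.strip())
        if match:
            counts[(match["host"], int(match["port"]))] += 1
    return counts


# Four lines copied from the log above; only three are assignment failures.
sample = [
    "16:09:51:WARNING:WU03:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR",
    "17:45:10:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR",
    "09:47:46:WU03:FS01:0x18:ERROR:103: Lost client lifeline",
    "12:33:28:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR",
]

for (host, port), n in sorted(count_failed_assignments(sample).items()):
    print(f"{host}:{port} failed {n} time(s)")
```

Feeding it a whole log file (one line per list entry) should show at a glance whether the failures are tied to one port or spread across both.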
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: 171.67.108.200 - Internal Server Error

Post by VijayPande »

We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: 171.67.108.200 - Internal Server Error

Post by VijayPande »

As for the project descriptions, they are on fah-web, which is one of the key machines having problems.

I’ve taken a look at it and there’s a more serious issue with a particular server than I can take care of myself. I’ve filed a ticket with the sysadmin team. Best guess ETA on this being fixed is Monday at noon (assuming this is something simple).
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
billford
Posts: 1003
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: 171.67.108.200 - Internal Server Error

Post by billford »

VijayPande wrote:I think we have the WS’s back in shape, but we’ll keep an eye on this.
Somewhat belated response, I've been otherwise occupied.

Thanks for the info, WS assignment seems OK now.
Simplex0
Posts: 69
Joined: Sun Oct 06, 2013 10:35 am

Re: 171.67.108.200 - Internal Server Error

Post by Simplex0 »

VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Really? ........

"NO IMPACT: A CME expected to sideswipe Earth's magnetic field on May 17th either missed or its impact was undetectable. As a result, geomagnetic activity is low and likely to remain so for the next 24 to 48 hours."

http://spaceweather.com/
DemonfangArun
Posts: 95
Joined: Fri Sep 09, 2011 11:23 pm

Re: 171.67.108.200 - Internal Server Error

Post by DemonfangArun »

VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
nothing in the budget for at least some cheap consumer grade line interactive ups's (or some proper server grade online units)? i would think a place like Stanford could at least get something to keep servers online through power blips (and shut down servers properly!).
This user has stopped using F@H because it supports fake crap!
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: 171.67.108.200 - Internal Server Error

Post by 7im »

Simplex0 wrote:
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Really? ........

"NO IMPACT: A CME expected to sideswipe Earth's magnetic field on May 17th either missed or its impact was undetectable. As a result, geomagnetic activity is low and likely to remain so for the next 24 to 48 hours."

http://spaceweather.com/
The late to arrive El Nino is fueling larger and more severe storms in the Southwest.
http://www.noaa.gov
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
billford
Posts: 1003
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: 171.67.108.200 - Internal Server Error

Post by billford »

Stats (and project descriptions) would seem to be back again :)
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.200 - Internal Server Error

Post by bruce »

DemonfangArun wrote:
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
nothing in the budget for at least some cheap consumer grade line interactive ups's (or some proper server grade online units)? i would think a place like Stanford could at least get something to keep servers online through power blips (and shut down servers properly!).
Apparently the servers were protected by (I suppose: server-grade) UPSs but something else wasn't (like maybe a key router). See Update on fah-web May 2015
Simplex0
Posts: 69
Joined: Sun Oct 06, 2013 10:35 am

Re: 171.67.108.200 - Internal Server Error

Post by Simplex0 »

7im wrote:
Simplex0 wrote:
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Really? ........

"NO IMPACT: A CME expected to sideswipe Earth's magnetic field on May 17th either missed or its impact was undetectable. As a result, geomagnetic activity is low and likely to remain so for the next 24 to 48 hours."

http://spaceweather.com/
The late to arrive El Nino is fueling larger and more severe storms in the Southwest.
http://www.noaa.gov
In this case the words were "electrical storm", which the sun is able to trigger but not, as far as I know, El Nino.
DemonfangArun
Posts: 95
Joined: Fri Sep 09, 2011 11:23 pm

Re: 171.67.108.200 - Internal Server Error

Post by DemonfangArun »

bruce wrote:
DemonfangArun wrote:
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
nothing in the budget for at least some cheap consumer grade line interactive ups's (or some proper server grade online units)? i would think a place like Stanford could at least get something to keep servers online through power blips (and shut down servers properly!).
Apparently the servers were protected by (I suppose: server-grade) UPSs but something else wasn't (like maybe a key router). See Update on fah-web May 2015
that would've been my next guess. "ups all the things" is my typical strategy, because putting a computer on a ups to prevent gaming interruptions does no good if the network goes off, for example. anyway, glad it's fixed, and it should be interesting to see how the 3PM eoc update fares... chokes... jk
This user has stopped using F@H because it supports fake crap!
billford
Posts: 1003
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: 171.67.108.200 - Internal Server Error

Post by billford »

Simplex0 wrote: In this case the words were "electrical storm", which the sun is able to trigger but not, as far as I know, El Nino.
CMEs cause geomagnetic storms, not electrical storms. Electrical storms are just common-or-garden thunderstorms.
sco01
Posts: 79
Joined: Sun Mar 21, 2010 8:52 pm
Hardware configuration: Core i7 3930K(C2)@4,2 (1.30v)
Swiftech Apogee Drive II + MCR-H220 Radiator
Asus P9X79
4*4Gb SEC 1600@1866 1.5v
GeForce GTX780ti ref (nztx G10+zalman lq315)
Creative X-Fi Titanium Fatal1ty
Lite-On iHAS124-34
HDD- WD3000GLFS + 2*Hitaсhi HDS723030BLE640 + 2*HDT721010SLA360
case-Corsair Obsidian 700D
PSU - Seasonic Platinum 860W (SS-860XP2)

2) Xeon E5645@4Ghz (1,34v)
ASUS P6T
Thermalright Silver Arrow
Corsair TR3X6G1600C7 (3x2048Mb, 7-7-7-20 1,62v)
GeForce GTX780 ref (nztx G10+corsair H75)
Zotac GTX980 AMP! Edition (ZT-90204-10P)
Hitachi 320Gb HTS543232A7A384
case - Chieftec LBX-03B Dremel edition :-)
PSU - Zalman ZM-850HP+ (Enhance silver)

APC SUA1500RMI2U
APC SUA1500RMI2U

Re: 171.67.108.200 - Internal Server Error

Post by sco01 »

20:36:10:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
20:36:10:WU01:FS01:Connecting to 171.67.108.204:80
20:36:11:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.204:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
20:36:11:ERROR:WU01:FS01:Exception: Could not get an assignment
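The behaviour visible in this log (try one assignment server, fall back to the next, give up only when every one has failed) can be sketched roughly as below. This is not the FAHClient source, just an illustration of the pattern; `AssignmentError`, `get_assignment` and `fake_ask` are hypothetical names, and `fake_ask` stands in for a real network request so the example runs offline.

```python
class AssignmentError(Exception):
    """Raised when an assignment server fails or refuses a request."""


def get_assignment(servers, ask):
    """Try each (host, port) in order and return the first work server offered.

    `ask` is any callable taking (host, port) that either returns a work
    server address or raises AssignmentError.
    """
    failures = []
    for host, port in servers:
        try:
            return ask(host, port)
        except AssignmentError as err:
            # Remember the failure and move on to the next assignment server.
            failures.append(f"{host}:{port}: {err}")
    raise AssignmentError(
        "Could not get an assignment (" + "; ".join(failures) + ")"
    )


def fake_ask(host, port):
    # Hypothetical stand-in for a real request: the first server is "down",
    # the second hands out a work server, as in the earlier log in this thread.
    if host == "171.67.108.200":
        raise AssignmentError("HTTP_INTERNAL_SERVER_ERROR")
    return "155.247.166.219"


servers = [("171.67.108.200", 8080), ("171.67.108.204", 80)]
print(get_assignment(servers, fake_ask))
```

When both servers fail, as in the log above, the loop runs out of candidates and the final exception is raised, which matches the "Could not get an assignment" error line.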
billford
Posts: 1003
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: 171.67.108.200 - Internal Server Error

Post by billford »

OK here, but different port:

Code:

20:42:36:WU00:FS01:0x17:Completed 4950000 out of 5000000 steps (99%)
20:44:01:WU00:FS01:0x17:Completed 5000000 out of 5000000 steps (100%)
20:44:01:WU01:FS01:Connecting to 171.67.108.200:80
20:44:02:WU01:FS01:Assigned to work server 171.64.65.56
20:44:02:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM204 [GeForce GTX 980] from 171.64.65.56
20:44:02:WU01:FS01:Connecting to 171.64.65.56:8080
20:44:03:WU01:FS01:Downloading 889.82KiB
20:44:03:WU00:FS01:Connecting to 171.67.108.52:8080
20:44:05:WU01:FS01:Download complete
20:44:05:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9411 run:715 clone:0 gen:79 core:0x17 unit:0x00000061ab40413854d27c3131671be8
20:44:05:WU01:FS01:Starting
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: 171.67.108.200 - Internal Server Error

Post by 7im »

bruce wrote:
Apparently the servers were protected by (I suppose: server-grade) UPSs but something else wasn't (like maybe a key router). See Update on fah-web May 2015
Even server-grade UPS systems eventually shut down if the main power is out for too long and there is no generator to fall back on. That also explains the heavy load when power was restored. And I think Dr. Pande meant PERC Reset, as in a Dell PowerEdge Expandable RAID Controller. ;)
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.