******************************* Date: 2015-05-16 *******************************
******************************* Date: 2015-05-16 *******************************
******************************* Date: 2015-05-16 *******************************
16:09:51:WARNING:WU03:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
17:45:10:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-16 *******************************
21:51:44:WARNING:WU02:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
23:29:18:WARNING:WU00:FS01:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
00:54:43:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
02:01:30:WARNING:WU03:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-17 *******************************
05:52:39:WARNING:WU02:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
07:51:55:WARNING:WU03:FS01:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
******************************* Date: 2015-05-17 *******************************
09:47:46:WU03:FS01:0x18:WARNING:Console control signal 1 on PID 3388
09:47:46:WU01:FS02:0x18:WARNING:Console control signal 1 on PID 5204
09:47:46:WU03:FS01:0x18:ERROR:103: Lost client lifeline
12:33:28:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
12:33:27:WU02:FS00:0xa4:Completed 4950000 out of 5000000 steps (99%)
12:33:28:WU00:FS00:Connecting to 171.67.108.200:8080
12:33:28:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
12:33:28:WU00:FS00:Connecting to 171.67.108.204:80
12:33:29:WU00:FS00:Assigned to work server 155.247.166.219
12:33:29:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:6 from 155.247.166.219
12:33:29:WU00:FS00:Connecting to 155.247.166.219:8080
12:33:29:WU00:FS00:Downloading 116.65KiB
12:33:29:WU00:FS00:Download complete
12:33:29:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:6398 run:143 clone:22 gen:85 core:0xa4 unit:0x0000005f0002894b5462cbc3412244b2
12:35:17:WU01:FS02:0x18:Completed 2650000 out of 5000000 steps (53%)
12:35:51:WU03:FS01:0x18:Completed 1650000 out of 5000000 steps (33%)
12:37:18:WU02:FS00:0xa4:Completed 5000000 out of 5000000 steps (100%)
12:37:18:WU02:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
12:37:28:WU02:FS00:0xa4:
12:37:28:WU02:FS00:0xa4:Finished Work Unit:
12:37:28:WU02:FS00:0xa4:- Reading up to 1256376 from "02/wudata_01.trr": Read 1256376
12:37:28:WU02:FS00:0xa4:trr file hash check passed.
12:37:28:WU02:FS00:0xa4:- Reading up to 111504 from "02/wudata_01.xtc": Read 111504
12:37:28:WU02:FS00:0xa4:xtc file hash check passed.
12:37:28:WU02:FS00:0xa4:edr file hash check passed.
12:37:28:WU02:FS00:0xa4:logfile size: 88690
12:37:28:WU02:FS00:0xa4:Leaving Run
12:37:31:WU02:FS00:0xa4:- Writing 1528470 bytes of core data to disk...
12:37:32:WU02:FS00:0xa4:Done: 1527958 -> 1293596 (compressed to 84.6 percent)
12:37:32:WU02:FS00:0xa4: ... Done.
12:37:33:WU02:FS00:0xa4:- Shutting down core
12:37:33:WU02:FS00:0xa4:
12:37:33:WU02:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
12:37:33:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:37:33:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:6395 run:41 clone:38 gen:114 core:0xa4 unit:0x000000780002894b5462c743312d9110
12:37:33:WU02:FS00:Uploading 1.23MiB to 155.247.166.219
12:37:33:WU02:FS00:Connecting to 155.247.166.219:8080
12:37:33:WU00:FS00:Starting
12:37:33:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/aoeu/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 2204 -checkpoint 15 -np 6
12:37:33:WU00:FS00:Started FahCore on PID 5964
12:37:33:WU00:FS00:Core PID:1772
12:37:33:WU00:FS00:FahCore 0xa4 started
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:*------------------------------*
12:37:34:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
12:37:34:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Preparing to commence simulation
12:37:34:WU00:FS00:0xa4:- Looking at optimizations...
12:37:34:WU00:FS00:0xa4:- Created dyn
12:37:34:WU00:FS00:0xa4:- Files status OK
12:37:34:WU00:FS00:0xa4:- Expanded 118937 -> 270464 (decompressed 227.4 percent)
12:37:34:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=118937 data_size=270464, decompressed_data_size=270464 diff=0
12:37:34:WU00:FS00:0xa4:- Digital signature verified
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Project: 6398 (Run 143, Clone 22, Gen 85)
12:37:34:WU00:FS00:0xa4:
12:37:34:WU00:FS00:0xa4:Assembly optimizations on if available.
12:37:34:WU00:FS00:0xa4:Entering M.D.
12:37:36:WU02:FS00:Upload complete
12:37:36:WU02:FS00:Server responded WORK_ACK (400)
12:37:36:WU02:FS00:Final credit estimate, 1515.00 points
12:37:36:WU02:FS00:Cleaning up
12:37:39:WU00:FS00:0xa4:Mapping NT from 6 to 6
12:37:39:WU00:FS00:0xa4:Completed 0 out of 5000000 steps (0%)
12:41:10:WU00:FS00:0xa4:Completed 50000 out of 5000000 steps (1%)
12:41:55:WU03:FS01:0x18:Completed 1700000 out of 5000000 steps (34%)
12:44:42:WU00:FS00:0xa4:Completed 100000 out of 5000000 steps (2%)
12:47:02:WU01:FS02:0x18:Completed 2700000 out of 5000000 steps (54%)
12:47:58:WU03:FS01:0x18:Completed 1750000 out of 5000000 steps (35%)
12:48:03:WU00:FS00:0xa4:Completed 150000 out of 5000000 steps (3%)
12:51:36:WU00:FS00:0xa4:Completed 200000 out of 5000000 steps (4%)
I have three slots, all have WUs, and are folding.
I hope this is helpful.
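For anyone who wants to tally these warnings from their own log, here is a minimal Python sketch. It assumes the warning lines look exactly like the "Failed to get assignment" entries pasted above. The names `FAILED_ASSIGNMENT`, `count_failed_assignments`, and `sample` are made up for illustration and are not part of the FAH client.

```python
import re
from collections import Counter

# Pattern for "Failed to get assignment" warnings as they appear in the log
# excerpt above. This is an assumption based on those lines only; other log
# layouts are simply ignored.
FAILED_ASSIGNMENT = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):WARNING:(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"Failed to get assignment from '(?P<server>[^']+)': "
    r"(?P<code>\d+): Server responded: (?P<reason>\S+)"
)


def count_failed_assignments(lines):
    """Return a Counter mapping 'host:port' to the number of failed requests."""
    failures = Counter()
    for line in lines:
        match = FAILED_ASSIGNMENT.match(line.strip())
        if match:
            failures[match.group("server")] += 1
    return failures


# Three lines copied from the log above: two failures and one success.
sample = [
    "16:09:51:WARNING:WU03:FS02:Failed to get assignment from '171.67.108.200:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR",
    "17:45:10:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR",
    "12:33:29:WU00:FS00:Assigned to work server 155.247.166.219",
]

for server, count in count_failed_assignments(sample).most_common():
    print(server, count)
```

To run it against a real log, pass an open file object instead of `sample`. Lines that do not match the pattern, such as progress updates or date separators, are skipped.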
We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
As for the project descriptions, they are on fah-web, which is one of the key machines having problems.
I’ve taken a look at it and there’s a more serious issue with a particular server than I can take care of myself. I’ve filed a ticket with the sysadmin team. Best guess ETA on this being fixed is Monday at noon (assuming this is something simple).
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Really? ........
"NO IMPACT: A CME expected to sideswipe Earth's magnetic field on May 17th either missed or its impact was undetectable. As a result, geomagnetic activity is low and likely to remain so for the next 24 to 48 hours."
VijayPande wrote:We had an electrical storm (with several brief power outages) which put some of our servers and the Stanford net into a weird state. I think we have the WS’s back in shape, but we’ll keep an eye on this.
Nothing in the budget for at least some cheap consumer-grade line-interactive UPSs (or some proper server-grade online units)? I would think a place like Stanford could at least get something to keep servers online through power blips (and shut down servers properly!).
This user has stopped using F@H because it supports fake crap!
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Apparently the servers were protected by (I suppose: server-grade) UPSs but something else wasn't (like maybe a key router). See Update on fah-web May 2015
That would've been my next guess. "UPS all the things" is my typical strategy, because putting a computer on a UPS to prevent gaming interruptions does no good if the network goes down, for example. Anyway, glad it's fixed, and it should be interesting to see how the 3PM EOC update fares... chokes... jk
20:36:10:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.200:8080': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
20:36:10:WU01:FS01:Connecting to 171.67.108.204:80
20:36:11:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.204:80': 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
20:36:11:ERROR:WU01:FS01:Exception: Could not get an assignment
20:42:36:WU00:FS01:0x17:Completed 4950000 out of 5000000 steps (99%)
20:44:01:WU00:FS01:0x17:Completed 5000000 out of 5000000 steps (100%)
20:44:01:WU01:FS01:Connecting to 171.67.108.200:80
20:44:02:WU01:FS01:Assigned to work server 171.64.65.56
20:44:02:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM204 [GeForce GTX 980] from 171.64.65.56
20:44:02:WU01:FS01:Connecting to 171.64.65.56:8080
20:44:03:WU01:FS01:Downloading 889.82KiB
20:44:03:WU00:FS01:Connecting to 171.67.108.52:8080
20:44:05:WU01:FS01:Download complete
20:44:05:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9411 run:715 clone:0 gen:79 core:0x17 unit:0x00000061ab40413854d27c3131671be8
20:44:05:WU01:FS01:Starting
bruce wrote:Apparently the servers were protected by (I suppose: server-grade) UPSs but something else wasn't (like maybe a key router). See Update on fah-web May 2015
Even server-grade UPS systems eventually shut down if the main power is out for too long and there is no generator to fall back on. That also explains the heavy load when power was restored. And I think Dr. Pande meant PERC Reset, as in a Dell PowerEdge Expandable RAID Controller.