171.67.108.11 and 171.67.108.25

Moderators: Site Moderators, FAHC Science Team

yemgi
Posts: 5
Joined: Fri Jan 21, 2011 8:47 pm
Location: France

171.67.108.11 and 171.67.108.25

Post by yemgi »

I rebooted my computer this morning, and since then it cannot connect to the work server.
I am running the 7.1.24 client on Win7 x64
CPU Core i7 860
GPU GTX260
The SMP client is running fine, but the GPU keeps getting errors with the servers.

Code: Select all

11:55:09:Saving configuration to config.xml
11:55:09:<config>
11:55:09:  <!-- Folding Slot Configuration -->
11:55:09:  <gpu v='true'/>
11:55:09:
11:55:09:  <!-- Network -->
11:55:09:  <proxy v=':8080'/>
11:55:09:
11:55:09:  <!-- User Information -->
11:55:09:  <passkey v='********************************'/>
11:55:09:  <team v='45435'/>
11:55:09:  <user v='yemgi'/>
11:55:09:
11:55:09:  <!-- Folding Slots -->
11:55:09:  <slot id='0' type='GPU'>
11:55:09:    <client-type v='advanced'/>
11:55:09:    <max-packet-size v='big'/>
11:55:09:    <verbosity v='9'/>
11:55:09:  </slot>
11:55:09:  <slot id='1' type='SMP'>
11:55:09:    <client-type v='bigadv'/>
11:55:09:    <max-packet-size v='big'/>
11:55:09:  </slot>
11:55:09:</config>
11:56:45:Slot 00 unpaused
11:56:45:Sending unit results: id:00 state:SEND project:5770 run:2 clone:158 gen:1157 core:0x11 unit:0x38b81b6c4dc519020485009e0002168a
11:56:45:Unit 00: Uploading 633B
11:56:45:Connecting to 171.67.108.11:8080
11:56:45:Sending unit results: id:03 state:SEND project:5768 run:10 clone:109 gen:331 core:0x11 unit:0x467580984dc51910014b006d000a1688
11:56:45:Unit 03: Uploading 635B
11:56:45:Connecting to 171.67.108.11:8080
11:56:45:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:56:45:Trying to send results to collection server
11:56:45:Unit 00: Uploading 633B
11:56:45:Connecting to 171.67.108.25:8080
11:56:45:Sending unit results: id:04 state:SEND project:5770 run:2 clone:339 gen:1126 core:0x11 unit:0x47531bec4dc5191d046601530002168a
11:56:45:Unit 04: Uploading 635B
11:56:45:Connecting to 171.67.108.11:8080
11:56:45:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:56:45:Trying to send results to collection server
11:56:45:Unit 03: Uploading 635B
11:56:45:Connecting to 171.67.108.25:8080
11:56:45:Sending unit results: id:01 state:SEND project:5766 run:6 clone:78 gen:852 core:0x11 unit:0x396aeb124dc5192a0354004e00061686
11:56:45:Unit 01: Uploading 632B
11:56:45:Connecting to 171.67.108.11:8080
11:56:46:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:56:46:Trying to send results to collection server
11:56:46:Unit 04: Uploading 635B
11:56:46:Connecting to 171.67.108.25:8080
11:56:46:Sending unit results: id:05 state:SEND project:5769 run:9 clone:130 gen:937 core:0x11 unit:0x380c5f334dc5230203a9008200091689
11:56:46:Unit 05: Uploading 2.32KiB
11:56:46:Connecting to 171.67.108.11:8080
11:56:46:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:56:46:Trying to send results to collection server
11:56:46:Unit 01: Uploading 632B
11:56:46:Connecting to 171.67.108.25:8080
11:56:46:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:56:46:Trying to send results to collection server
11:56:46:Unit 05: Uploading 2.32KiB
11:56:46:Connecting to 171.67.108.25:8080
11:56:47:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:47:Connecting to 171.67.108.25:80
11:56:47:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:47:Connecting to 171.67.108.25:80
11:56:47:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:47:Connecting to 171.67.108.25:80
11:56:47:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:47:Connecting to 171.67.108.25:80
11:56:50:News: Welcome to Folding@Home
11:56:50:Assigned to work server 171.67.108.11
11:56:50:Requesting new work unit for slot 00: READY gpu:0:"GT200 [GeForce GTX 260]" from 171.67.108.11
11:56:50:Connecting to 171.67.108.11:8080
11:56:51:Slot 00: Downloading 46.06KiB
11:56:51:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:51:Connecting to 171.67.108.25:80
11:56:51:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:51:Connecting to 171.67.108.25:80
11:56:51:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:51:Connecting to 171.67.108.25:80
11:56:51:WARNING: WorkServer connection failed on port 8080 trying 80
11:56:51:Connecting to 171.67.108.25:80
11:56:51:Slot 00: Download complete
11:56:51:Received Unit: id:10 state:DOWNLOAD project:5768 run:9 clone:32 gen:763 core:0x11 unit:0x7130ffcf4dc5338902fb002000091688
11:56:51:Starting Unit 10
11:56:51:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 10 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:56:51:Started core on PID 1524
11:56:51:FahCore 0x11 started
11:56:51:Started thread 19 on PID 1964
11:56:58:Unit 10:Tpr hash 10/wudata_01.tpr:  2931702750 1887271039 2725142602 4017907258 3279560524
11:56:58:Unit 10:
11:56:58:Unit 10:Calling fah_main args: 14 usage=100
11:56:58:Unit 10:
11:56:58:Unit 10:mdrun_gpu returned 
11:56:58:Unit 10:Going to send back what have done -- stepsTotalG=0
11:56:58:Unit 10:Work fraction=0.0000 steps=0.
11:57:02:Unit 10:logfile size=4948 infoLength=4948 edr=0 trr=25
11:57:02:Unit 10:+ Opened results file
11:57:02:Unit 10:- Writing 5486 bytes of core data to disk...
11:57:02:Unit 10:Done: 4974 -> 1862 (compressed to 37.4 percent)
11:57:02:Unit 10:  ... Done.
11:57:02:Unit 10:DeleteFrameFiles: successfully deleted file=10/wudata_01.ckp
11:57:02:Unit 10:
11:57:02:Unit 10:Folding@home Core Shutdown: UNSTABLE_MACHINE
11:57:02:FahCore, running Unit 10, returned: UNSTABLE_MACHINE (122)
11:57:02:Starting Unit 10
11:57:02:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 10 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:02:Started core on PID 3936
11:57:02:FahCore 0x11 started
11:57:02:Started thread 20 on PID 1964
11:57:02:FahCore, running Unit 10, returned: MISSING_WORK_FILES (116)
11:57:02:WARNING: Unit 10 Fatal error, dumping
11:57:02:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763 core:0x11 unit:0x7130ffcf4dc5338902fb002000091688
11:57:02:Unit 10: Uploading 2.32KiB
11:57:02:Connecting to 171.67.108.11:8080
11:57:03:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:03:Trying to send results to collection server
11:57:03:Unit 10: Uploading 2.32KiB
11:57:03:Connecting to 171.67.108.25:8080
11:57:03:Connecting to assign-GPU.stanford.edu:80
11:57:03:News: Welcome to Folding@Home
11:57:03:Assigned to work server 171.67.108.11
11:57:03:Requesting new work unit for slot 00: READY gpu:0:"GT200 [GeForce GTX 260]" from 171.67.108.11
11:57:03:Connecting to 171.67.108.11:8080
11:57:04:Slot 00: Downloading 44.83KiB
11:57:04:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:04:Connecting to 171.67.108.25:80
11:57:05:Slot 00: Download complete
11:57:05:Received Unit: id:11 state:DOWNLOAD project:5770 run:3 clone:5 gen:1412 core:0x11 unit:0x09cd28b64dc53397058400050003168a
11:57:05:Starting Unit 11
11:57:05:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 11 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:05:Started core on PID 2580
11:57:05:FahCore 0x11 started
11:57:05:Started thread 21 on PID 1964
11:57:05:Unit 11:
11:57:05:Unit 11:*------------------------------*
11:57:05:Unit 11:Folding@Home GPU Core
11:57:05:Unit 11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
11:57:05:Unit 11:
11:57:05:Unit 11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
11:57:05:Unit 11:Build host: amoeba
11:57:05:Unit 11:Board Type: Nvidia
11:57:05:Unit 11:Core      : 
11:57:05:Unit 11:Preparing to commence simulation
11:57:05:Unit 11:- Looking at optimizations...
11:57:05:Unit 11:DeleteFrameFiles: successfully deleted file=11/wudata_01.ckp
11:57:05:Unit 11:- Created dyn
11:57:05:Unit 11:- Files status OK
11:57:05:Unit 11:- Expanded 45398 -> 251112 (decompressed 553.1 percent)
11:57:05:Unit 11:Called DecompressByteArray: compressed_data_size=45398 data_size=251112, decompressed_data_size=251112 diff=0
11:57:05:Unit 11:- Digital signature verified
11:57:05:Unit 11:
11:57:05:Unit 11:Project: 5770 (Run 3, Clone 5, Gen 1412)
11:57:05:Unit 11:
11:57:05:Unit 11:Assembly optimizations on if available.
11:57:05:Unit 11:Entering M.D.
11:57:08:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:08:Connecting to 171.67.108.25:80
11:57:11:Unit 11:Tpr hash 11/wudata_01.tpr:  859119256 3595133971 3765282212 2536307495 2914312074
11:57:11:Unit 11:
11:57:11:Unit 11:Calling fah_main args: 14 usage=100
11:57:11:Unit 11:
11:57:11:Unit 11:mdrun_gpu returned 
11:57:11:Unit 11:Going to send back what have done -- stepsTotalG=0
11:57:11:Unit 11:Work fraction=0.0000 steps=0.
11:57:15:Unit 11:logfile size=4948 infoLength=4948 edr=0 trr=25
11:57:15:Unit 11:+ Opened results file
11:57:15:Unit 11:- Writing 5486 bytes of core data to disk...
11:57:15:Unit 11:Done: 4974 -> 1863 (compressed to 37.4 percent)
11:57:15:Unit 11:  ... Done.
11:57:15:Unit 11:DeleteFrameFiles: successfully deleted file=11/wudata_01.ckp
11:57:15:Unit 11:
11:57:15:Unit 11:Folding@home Core Shutdown: UNSTABLE_MACHINE
11:57:15:FahCore, running Unit 11, returned: UNSTABLE_MACHINE (122)
11:57:15:Starting Unit 11
11:57:15:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 11 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:15:Started core on PID 740
11:57:15:FahCore 0x11 started
11:57:15:Started thread 22 on PID 1964
11:57:16:FahCore, running Unit 11, returned: MISSING_WORK_FILES (116)
11:57:16:WARNING: Unit 11 Fatal error, dumping
11:57:16:Sending unit results: id:11 state:SEND project:5770 run:3 clone:5 gen:1412 core:0x11 unit:0x09cd28b64dc53397058400050003168a
11:57:16:Unit 11: Uploading 2.32KiB
11:57:16:Connecting to 171.67.108.11:8080
11:57:16:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:16:Trying to send results to collection server
11:57:16:Unit 11: Uploading 2.32KiB
11:57:16:Connecting to 171.67.108.25:8080
11:57:16:Connecting to assign-GPU.stanford.edu:80
11:57:17:News: Welcome to Folding@Home
11:57:17:Assigned to work server 171.67.108.11
11:57:17:Requesting new work unit for slot 00: READY gpu:0:"GT200 [GeForce GTX 260]" from 171.67.108.11
11:57:17:Connecting to 171.67.108.11:8080
11:57:17:Slot 00: Downloading 44.94KiB
11:57:18:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:18:Connecting to 171.67.108.25:80
11:57:18:Slot 00: Download complete
11:57:18:Received Unit: id:12 state:DOWNLOAD project:5772 run:13 clone:164 gen:1069 core:0x11 unit:0x344a08044dc533a4042d00a4000d168c
11:57:18:Starting Unit 12
11:57:18:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 12 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:18:Started core on PID 4620
11:57:18:FahCore 0x11 started
11:57:18:Started thread 23 on PID 1964
11:57:18:Unit 12:
11:57:18:Unit 12:*------------------------------*
11:57:18:Unit 12:Folding@Home GPU Core
11:57:18:Unit 12:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
11:57:18:Unit 12:
11:57:18:Unit 12:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
11:57:18:Unit 12:Build host: amoeba
11:57:18:Unit 12:Board Type: Nvidia
11:57:18:Unit 12:Core      : 
11:57:18:Unit 12:Preparing to commence simulation
11:57:18:Unit 12:- Looking at optimizations...
11:57:18:Unit 12:DeleteFrameFiles: successfully deleted file=12/wudata_01.ckp
11:57:18:Unit 12:- Created dyn
11:57:18:Unit 12:- Files status OK
11:57:18:Unit 12:- Expanded 45503 -> 251112 (decompressed 551.8 percent)
11:57:18:Unit 12:Called DecompressByteArray: compressed_data_size=45503 data_size=251112, decompressed_data_size=251112 diff=0
11:57:18:Unit 12:- Digital signature verified
11:57:18:Unit 12:
11:57:18:Unit 12:Project: 5772 (Run 13, Clone 164, Gen 1069)
11:57:18:Unit 12:
11:57:18:Unit 12:Assembly optimizations on if available.
11:57:18:Unit 12:Entering M.D.
11:57:20:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:20:Trying to send results to collection server
11:57:20:Unit 11: Uploading 2.32KiB
11:57:20:Connecting to 171.67.108.25:8080
11:57:21:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:21:Connecting to 171.67.108.25:80
11:57:24:Unit 12:Tpr hash 12/wudata_01.tpr:  4020711331 3314669640 3680630177 202899422 2206815822
11:57:24:Unit 12:
11:57:24:Unit 12:Calling fah_main args: 14 usage=100
11:57:24:Unit 12:
11:57:24:Unit 12:mdrun_gpu returned 
11:57:24:Unit 12:Going to send back what have done -- stepsTotalG=0
11:57:24:Unit 12:Work fraction=0.0000 steps=0.
11:57:28:Unit 12:logfile size=4948 infoLength=4948 edr=0 trr=25
11:57:28:Unit 12:+ Opened results file
11:57:28:Unit 12:- Writing 5486 bytes of core data to disk...
11:57:28:Unit 12:Done: 4974 -> 1851 (compressed to 37.2 percent)
11:57:28:Unit 12:  ... Done.
11:57:28:Unit 12:DeleteFrameFiles: successfully deleted file=12/wudata_01.ckp
11:57:28:FahCore, running Unit 12, returned: UNSTABLE_MACHINE (122)
11:57:28:Starting Unit 12
11:57:28:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 12 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:28:Started core on PID 4220
11:57:28:FahCore 0x11 started
11:57:28:Started thread 24 on PID 1964
11:57:29:FahCore, running Unit 12, returned: MISSING_WORK_FILES (116)
11:57:29:WARNING: Unit 12 Fatal error, dumping
11:57:29:Sending unit results: id:12 state:SEND project:5772 run:13 clone:164 gen:1069 core:0x11 unit:0x344a08044dc533a4042d00a4000d168c
11:57:29:Unit 12: Uploading 2.31KiB
11:57:29:Connecting to 171.67.108.11:8080
11:57:29:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:29:Trying to send results to collection server
11:57:29:Unit 12: Uploading 2.31KiB
11:57:29:Connecting to 171.67.108.25:8080
11:57:29:Connecting to assign-GPU.stanford.edu:80
11:57:30:News: Welcome to Folding@Home
11:57:30:Assigned to work server 171.67.108.11
11:57:30:Requesting new work unit for slot 00: READY gpu:0:"GT200 [GeForce GTX 260]" from 171.67.108.11
11:57:30:Connecting to 171.67.108.11:8080
11:57:30:Slot 00: Downloading 46.08KiB
11:57:31:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:31:Connecting to 171.67.108.25:80
11:57:31:Slot 00: Download complete
11:57:31:Received Unit: id:13 state:DOWNLOAD project:5766 run:11 clone:211 gen:712 core:0x11 unit:0x75d7cb804dc533b102c800d3000b1686
11:57:31:Starting Unit 13
11:57:31:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 13 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:31:Started core on PID 296
11:57:31:FahCore 0x11 started
11:57:31:Started thread 25 on PID 1964
11:57:31:Unit 13:
11:57:31:Unit 13:*------------------------------*
11:57:31:Unit 13:Folding@Home GPU Core
11:57:31:Unit 13:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
11:57:31:Unit 13:
11:57:31:Unit 13:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
11:57:31:Unit 13:Build host: amoeba
11:57:31:Unit 13:Board Type: Nvidia
11:57:31:Unit 13:Core      : 
11:57:31:Unit 13:Preparing to commence simulation
11:57:31:Unit 13:- Looking at optimizations...
11:57:31:Unit 13:DeleteFrameFiles: successfully deleted file=13/wudata_01.ckp
11:57:31:Unit 13:- Created dyn
11:57:31:Unit 13:- Files status OK
11:57:31:Unit 13:- Expanded 46675 -> 252912 (decompressed 541.8 percent)
11:57:31:Unit 13:Called DecompressByteArray: compressed_data_size=46675 data_size=252912, decompressed_data_size=252912 diff=0
11:57:31:Unit 13:- Digital signature verified
11:57:31:Unit 13:
11:57:31:Unit 13:Project: 5766 (Run 11, Clone 211, Gen 712)
11:57:31:Unit 13:
11:57:31:Unit 13:Assembly optimizations on if available.
11:57:31:Unit 13:Entering M.D.
11:57:33:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:33:Trying to send results to collection server
11:57:33:Unit 12: Uploading 2.31KiB
11:57:33:Connecting to 171.67.108.25:8080
11:57:35:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:35:Connecting to 171.67.108.25:80
11:57:37:Unit 13:Tpr hash 13/wudata_01.tpr:  476119477 2551791098 2169889918 4047412003 868929951
11:57:37:Unit 13:
11:57:37:Unit 13:Calling fah_main args: 14 usage=100
11:57:37:Unit 13:
11:57:37:Unit 13:mdrun_gpu returned 
11:57:37:Unit 13:Going to send back what have done -- stepsTotalG=0
11:57:37:Unit 13:Work fraction=0.0000 steps=0.
11:57:41:Unit 13:logfile size=4947 infoLength=4947 edr=0 trr=25
11:57:41:Unit 13:+ Opened results file
11:57:41:Unit 13:- Writing 5485 bytes of core data to disk...
11:57:41:Unit 13:Done: 4973 -> 1865 (compressed to 37.5 percent)
11:57:41:Unit 13:  ... Done.
11:57:41:Unit 13:DeleteFrameFiles: successfully deleted file=13/wudata_01.ckp
11:57:41:Unit 13:
11:57:41:Unit 13:Folding@home Core Shutdown: UNSTABLE_MACHINE
11:57:41:FahCore, running Unit 13, returned: UNSTABLE_MACHINE (122)
11:57:42:Starting Unit 13
11:57:42:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 13 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:42:Started core on PID 2204
11:57:42:FahCore 0x11 started
11:57:42:Started thread 26 on PID 1964
11:57:42:FahCore, running Unit 13, returned: MISSING_WORK_FILES (116)
11:57:42:WARNING: Unit 13 Fatal error, dumping
11:57:42:Sending unit results: id:13 state:SEND project:5766 run:11 clone:211 gen:712 core:0x11 unit:0x75d7cb804dc533b102c800d3000b1686
11:57:42:Unit 13: Uploading 2.32KiB
11:57:42:Connecting to 171.67.108.11:8080
11:57:42:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:42:Trying to send results to collection server
11:57:42:Unit 13: Uploading 2.32KiB
11:57:42:Connecting to 171.67.108.25:8080
11:57:42:Connecting to assign-GPU.stanford.edu:80
11:57:43:News: Welcome to Folding@Home
11:57:43:Assigned to work server 171.67.108.11
11:57:43:Requesting new work unit for slot 00: READY gpu:0:"GT200 [GeForce GTX 260]" from 171.67.108.11
11:57:43:Connecting to 171.67.108.11:8080
11:57:43:Slot 00: Downloading 44.87KiB
11:57:44:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:44:Connecting to 171.67.108.25:80
11:57:44:Slot 00: Download complete
11:57:44:Received Unit: id:14 state:DOWNLOAD project:5771 run:12 clone:78 gen:746 core:0x11 unit:0x40a44eaa4dc533be02ea004e000c168b
11:57:44:Starting Unit 14
11:57:44:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 14 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:57:44:Started core on PID 4368
11:57:44:FahCore 0x11 started
11:57:44:Started thread 27 on PID 1964
11:57:45:Unit 14:
11:57:45:Unit 14:*------------------------------*
11:57:45:Unit 14:Folding@Home GPU Core
11:57:45:Unit 14:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
11:57:45:Unit 14:
11:57:45:Unit 14:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
11:57:45:Unit 14:Build host: amoeba
11:57:45:Unit 14:Board Type: Nvidia
11:57:45:Unit 14:Core      : 
11:57:45:Unit 14:Preparing to commence simulation
11:57:45:Unit 14:- Looking at optimizations...
11:57:45:Unit 14:DeleteFrameFiles: successfully deleted file=14/wudata_01.ckp
11:57:45:Unit 14:- Created dyn
11:57:45:Unit 14:- Files status OK
11:57:45:Unit 14:- Expanded 45436 -> 251112 (decompressed 552.6 percent)
11:57:45:Unit 14:Called DecompressByteArray: compressed_data_size=45436 data_size=251112, decompressed_data_size=251112 diff=0
11:57:45:Unit 14:- Digital signature verified
11:57:45:Unit 14:
11:57:45:Unit 14:Project: 5771 (Run 12, Clone 78, Gen 746)
11:57:45:Unit 14:
11:57:45:Unit 14:Assembly optimizations on if available.
11:57:45:Unit 14:Entering M.D.
11:57:46:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:57:46:Trying to send results to collection server
11:57:46:Unit 13: Uploading 2.32KiB
11:57:46:Connecting to 171.67.108.25:8080
11:57:48:WARNING: WorkServer connection failed on port 8080 trying 80
11:57:48:Connecting to 171.67.108.25:80
11:57:50:Unit 14:Tpr hash 14/wudata_01.tpr:  1583027463 3637511510 1647374093 572790733 1850828169
11:57:50:Unit 14:
11:57:50:Unit 14:Calling fah_main args: 14 usage=100
11:57:50:Unit 14:
11:57:50:Unit 14:mdrun_gpu returned 
11:57:50:Unit 14:Going to send back what have done -- stepsTotalG=0
11:57:50:Unit 14:Work fraction=0.0000 steps=0.
11:57:54:Unit 14:logfile size=4948 infoLength=4948 edr=0 trr=25
11:57:54:Unit 14:+ Opened results file
11:57:54:Unit 14:- Writing 5486 bytes of core data to disk...
11:57:54:Unit 14:Done: 4974 -> 1860 (compressed to 37.3 percent)
11:57:54:Unit 14:  ... Done.
11:57:54:Unit 14:DeleteFrameFiles: successfully deleted file=14/wudata_01.ckp
11:57:55:FahCore, running Unit 14, returned: UNSTABLE_MACHINE (122)
11:58:06:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763 core:0x11 unit:0x7130ffcf4dc5338902fb002000091688
11:58:06:Unit 10: Uploading 2.32KiB
11:58:06:Connecting to 171.67.108.11:8080
11:58:07:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:07:Trying to send results to collection server
11:58:07:Unit 10: Uploading 2.32KiB
11:58:07:Connecting to 171.67.108.25:8080
11:58:08:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:08:Connecting to 171.67.108.25:80
11:58:19:Sending unit results: id:11 state:SEND project:5770 run:3 clone:5 gen:1412 core:0x11 unit:0x09cd28b64dc53397058400050003168a
11:58:19:Unit 11: Uploading 2.32KiB
11:58:19:Connecting to 171.67.108.11:8080
11:58:20:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:20:Trying to send results to collection server
11:58:20:Unit 11: Uploading 2.32KiB
11:58:20:Connecting to 171.67.108.25:8080
11:58:21:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:21:Connecting to 171.67.108.25:80
11:58:22:Sending unit results: id:00 state:SEND project:5770 run:2 clone:158 gen:1157 core:0x11 unit:0x38b81b6c4dc519020485009e0002168a
11:58:22:Unit 00: Uploading 633B
11:58:22:Connecting to 171.67.108.11:8080
11:58:22:Sending unit results: id:03 state:SEND project:5768 run:10 clone:109 gen:331 core:0x11 unit:0x467580984dc51910014b006d000a1688
11:58:22:Unit 03: Uploading 635B
11:58:22:Connecting to 171.67.108.11:8080
11:58:22:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:22:Trying to send results to collection server
11:58:22:Unit 00: Uploading 633B
11:58:22:Connecting to 171.67.108.25:8080
11:58:22:Sending unit results: id:04 state:SEND project:5770 run:2 clone:339 gen:1126 core:0x11 unit:0x47531bec4dc5191d046601530002168a
11:58:22:Unit 04: Uploading 635B
11:58:22:Connecting to 171.67.108.11:8080
11:58:23:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:23:Trying to send results to collection server
11:58:23:Unit 03: Uploading 635B
11:58:23:Connecting to 171.67.108.25:8080
11:58:23:Sending unit results: id:01 state:SEND project:5766 run:6 clone:78 gen:852 core:0x11 unit:0x396aeb124dc5192a0354004e00061686
11:58:23:Unit 01: Uploading 632B
11:58:23:Connecting to 171.67.108.11:8080
11:58:23:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:23:Trying to send results to collection server
11:58:23:Unit 04: Uploading 635B
11:58:23:Connecting to 171.67.108.25:8080
11:58:24:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:24:Connecting to 171.67.108.25:80
11:58:24:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:24:Connecting to 171.67.108.25:80
11:58:24:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:24:Connecting to 171.67.108.25:80
11:58:25:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:25:Connecting to 171.67.108.25:80
11:58:28:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:28:Connecting to 171.67.108.25:80
11:58:28:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:28:Connecting to 171.67.108.25:80
11:58:28:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:28:Connecting to 171.67.108.25:80
11:58:33:Sending unit results: id:12 state:SEND project:5772 run:13 clone:164 gen:1069 core:0x11 unit:0x344a08044dc533a4042d00a4000d168c
11:58:33:Unit 12: Uploading 2.31KiB
11:58:33:Connecting to 171.67.108.11:8080
11:58:33:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:33:Trying to send results to collection server
11:58:33:Unit 12: Uploading 2.31KiB
11:58:33:Connecting to 171.67.108.25:8080
11:58:35:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:35:Connecting to 171.67.108.25:80
11:58:46:Sending unit results: id:13 state:SEND project:5766 run:11 clone:211 gen:712 core:0x11 unit:0x75d7cb804dc533b102c800d3000b1686
11:58:46:Unit 13: Uploading 2.32KiB
11:58:46:Connecting to 171.67.108.11:8080
11:58:46:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:58:46:Trying to send results to collection server
11:58:46:Unit 13: Uploading 2.32KiB
11:58:46:Connecting to 171.67.108.25:8080
11:58:48:WARNING: WorkServer connection failed on port 8080 trying 80
11:58:48:Connecting to 171.67.108.25:80
11:59:32:Downloading core from http://www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah
11:59:32:Connecting to www.stanford.edu:80
11:59:33:WARNING: FahCore type in core package seems to be in wrong byte order.
11:59:33:FahCore 11: Downloading 648.82KiB
11:59:37:FahCore 11: Download complete
11:59:37:Valid core signature
11:59:37:WARNING: FahCore has not changed since last download, aborting core update
11:59:37:Starting Unit 14
11:59:37:Running core: C:/Users/Guilhem/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 14 -suffix 01 -lifeline 1964 -version 701 -checkpoint 15 -gpu 0 -service
11:59:37:Started core on PID 2300
11:59:37:FahCore 0x11 started
11:59:37:Started thread 28 on PID 1964
11:59:37:FahCore, running Unit 14, returned: MISSING_WORK_FILES (116)
11:59:37:WARNING: Unit 14 Fatal error, dumping
11:59:37:Sending unit results: id:14 state:SEND project:5771 run:12 clone:78 gen:746 core:0x11 unit:0x40a44eaa4dc533be02ea004e000c168b
11:59:37:Unit 14: Uploading 2.32KiB
11:59:37:Connecting to 171.67.108.11:8080
11:59:38:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:59:38:Trying to send results to collection server
11:59:38:Unit 14: Uploading 2.32KiB
11:59:38:Connecting to 171.67.108.25:8080
11:59:39:WARNING: WorkServer connection failed on port 8080 trying 80
11:59:39:Connecting to 171.67.108.25:80
11:59:41:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:59:41:Trying to send results to collection server
11:59:41:Unit 14: Uploading 2.32KiB
11:59:41:Connecting to 171.67.108.25:8080
11:59:43:WARNING: WorkServer connection failed on port 8080 trying 80
11:59:43:Connecting to 171.67.108.25:80
11:59:43:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763 core:0x11 unit:0x7130ffcf4dc5338902fb002000091688
11:59:43:Unit 10: Uploading 2.32KiB
11:59:43:Connecting to 171.67.108.11:8080
11:59:44:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
11:59:44:Trying to send results to collection server
11:59:44:Unit 10: Uploading 2.32KiB
11:59:44:Connecting to 171.67.108.25:8080
11:59:45:WARNING: WorkServer connection failed on port 8080 trying 80
11:59:45:Connecting to 171.67.108.25:80
bollix47
Posts: 2953
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 171.67.108.25

Post by bollix47 »

The work server you're trying to return to is 171.67.108.11. It's got a high d/l count at the moment but looks okay otherwise.

171.67.108.25 is a collection server that's supposed to pick up the slack when the work server is unavailable. It hasn't worked in a long time. Not sure why failed uploads are still being sent to it.

Stopping and starting the gpu slot should get you another work unit and the one in question should upload eventually.

gl
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.25

Post by bruce »

I wish I could be as optimistic. I see repeated errors saying:
11:57:02:FahCore, running Unit 10, returned: UNSTABLE_MACHINE (122)
11:57:02:FahCore, running Unit 10, returned: MISSING_WORK_FILES (116)
11:57:02:WARNING: Unit 10 Fatal error, dumping

This happened on at least 5 different WUs, followed by the download of a fresh copy of the FahCore, as a last attempt to find something the client could do to fix the problem.
project:5768 run:9 clone:32 gen:763
project:5770 run:3 clone:5 gen:1412
project:5772 run:13 clone:164 gen:1069
project:5766 run:11 clone:211 gen:712
project:5771 run:12 clone:78 gen:746

You need to identify why your GPU is unable to process a series of different WUs. (Overheating? Excessive overclocking? Corrupt drivers? Failed hardware? Configured for the wrong type of GPU? etc.)
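
For anyone who wants to check their own log for the same pattern, here is a rough sketch that groups FahCore return codes by unit. It assumes the log lines look like the ones quoted above; the function name `core_returns_by_unit` and the sample text are only for illustration and are not part of FAHClient.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 11:57:02:FahCore, running Unit 10, returned: UNSTABLE_MACHINE (122)
RETURN_RE = re.compile(r"FahCore, running Unit (\d+), returned: (\w+) \((\d+)\)")


def core_returns_by_unit(log_text):
    """Map each unit id to the list of (status name, code) pairs it returned."""
    results = defaultdict(list)
    for line in log_text.splitlines():
        match = RETURN_RE.search(line)
        if match:
            unit, name, code = match.groups()
            results[unit].append((name, int(code)))
    return dict(results)


# Sample lines copied from the log posted in this thread.
sample = """\
11:57:02:FahCore, running Unit 10, returned: UNSTABLE_MACHINE (122)
11:57:02:FahCore, running Unit 10, returned: MISSING_WORK_FILES (116)
11:57:15:FahCore, running Unit 11, returned: UNSTABLE_MACHINE (122)
11:57:16:FahCore, running Unit 11, returned: MISSING_WORK_FILES (116)
"""

for unit, codes in core_returns_by_unit(sample).items():
    print(unit, codes)
```

If several different units all show UNSTABLE_MACHINE followed by MISSING_WORK_FILES, the WUs themselves are unlikely to be the cause.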

The failure to upload is a different problem. There's a bug in this version of FAHClient which causes repeated failures to upload error reports. (#615) Under the same circumstances, I expect that the same server will probably accept WUs that are completed successfully.
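
To see how often the client retries each upload, a small sketch like the following counts the "Sending unit results" lines per unit id. It assumes the log format shown in this thread; `send_attempts` is a made-up helper name, and the count says nothing about why each attempt failed.

```python
import re
from collections import Counter

# Matches lines such as:
# 11:57:02:Sending unit results: id:10 state:SEND project:5768 ...
SEND_RE = re.compile(r"Sending unit results: id:(\d+) state:SEND")


def send_attempts(log_text):
    """Count how many times the client started sending each unit's results."""
    return Counter(match.group(1) for match in SEND_RE.finditer(log_text))


# Sample lines copied (and shortened) from the log posted in this thread.
sample = """\
11:57:02:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763
11:57:16:Sending unit results: id:11 state:SEND project:5770 run:3 clone:5 gen:1412
11:58:06:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763
11:59:43:Sending unit results: id:10 state:SEND project:5768 run:9 clone:32 gen:763
"""

for unit, count in sorted(send_attempts(sample).items()):
    print(f"unit {unit}: {count} send attempt(s)")
```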

By the way, that's server 171.67.108.11, so I'm changing the title of this report.
FantasticRhino
Posts: 4
Joined: Tue May 24, 2011 2:33 pm

Re: 171.67.108.11 and 171.67.108.25

Post by FantasticRhino »

Hi, on the same subject of the availability of these servers, I have had a GPU WU sitting on my machine for a few weeks now and it hasn't uploaded to either of these 2 servers. I have an SMP client running as well and have since removed the GPU slot (Nvidia Quadro 770M, primarily because this is my work machine and running the GPU client makes the video laggy). I am using the beta client 7.1.24, which has been working fine otherwise; it's just this one WU that hasn't been able to upload.

According to the server status the primary server should be available...
171.67.108.11 GPU vsp07v vvoelz full Accepting
171.67.108.25 CS 5 vsp19v pande standby Not Accept

project:5765 run:9 clone:365 gen:479

Here is a log:

Code: Select all

14:17:09:Started core on PID 7552
14:17:09:FahCore 0xa3 started
14:17:09:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:17:09:Unit 01: Uploading 6.81KiB
14:17:09:Connecting to 171.67.108.11:8080
14:17:10:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:17:10:Trying to send results to collection server
14:17:10:Unit 01: Uploading 6.81KiB
14:17:10:Connecting to 171.67.108.25:8080
14:17:10:Unit 02:
14:17:10:Unit 02:*------------------------------*
14:17:10:Unit 02:Folding@Home Gromacs SMP Core
14:17:10:Unit 02:Version 2.27 (Dec. 15, 2010)
14:17:10:Unit 02:
14:17:10:Unit 02:Preparing to commence simulation
14:17:10:Unit 02:- Ensuring status. Please wait.
14:17:11:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1
14:17:11:WARNING: WorkServer connection failed on port 8080 trying 80
14:17:11:Connecting to 171.67.108.25:80
14:17:12:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
14:17:12:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:17:12:Unit 01: Uploading 6.81KiB
14:17:12:Connecting to 171.67.108.11:8080
14:17:13:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:17:13:Trying to send results to collection server
14:17:13:Unit 01: Uploading 6.81KiB
14:17:13:Connecting to 171.67.108.25:8080
14:17:14:WARNING: WorkServer connection failed on port 8080 trying 80
14:17:14:Connecting to 171.67.108.25:80
14:17:15:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
14:17:19:Unit 02:- Looking at optimizations...
14:17:19:Unit 02:- Working with standard loops on this execution.
14:17:19:Unit 02:- Previous termination of core was improper.
14:17:19:Unit 02:- Going to use standard loops.
14:17:19:Unit 02:- Files status OK
14:17:19:Unit 02:- Expanded 1772326 -> 1969568 (decompressed 111.1 percent)
14:17:19:Unit 02:Called DecompressByteArray: compressed_data_size=1772326 data_size=1969568, decompressed_data_size=1969568 diff=0
14:17:19:Unit 02:- Digital signature verified
14:17:19:Unit 02:
14:17:19:Unit 02:Project: 7132 (Run 0, Clone 64, Gen 82)
14:17:19:Unit 02:
14:17:19:Unit 02:Entering M.D.
14:17:25:Unit 02:Using Gromacs checkpoints
14:17:25:Unit 02:Mapping NT from 2 to 2 
14:17:26:Unit 02:Resuming from checkpoint
14:17:26:Unit 02:Verified 02/wudata_01.log
14:17:26:Unit 02:Verified 02/wudata_01.trr
14:17:26:Unit 02:Verified 02/wudata_01.edr
14:17:26:Unit 02:Completed 29166 out of 500000 steps  (5%)
14:18:13:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:18:13:Unit 01: Uploading 6.81KiB
14:18:13:Connecting to 171.67.108.11:8080
14:18:13:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:18:13:Trying to send results to collection server
14:18:13:Unit 01: Uploading 6.81KiB
14:18:13:Connecting to 171.67.108.25:8080
14:18:14:WARNING: WorkServer connection failed on port 8080 trying 80
14:18:14:Connecting to 171.67.108.25:80
14:18:16:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
14:19:50:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:19:50:Unit 01: Uploading 6.81KiB
14:19:50:Connecting to 171.67.108.11:8080
14:19:51:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:19:51:Trying to send results to collection server
14:19:51:Unit 01: Uploading 6.81KiB
14:19:51:Connecting to 171.67.108.25:8080
14:20:05:WARNING: WorkServer connection failed on port 8080 trying 80
14:20:05:Connecting to 171.67.108.25:80
14:20:06:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
14:20:11:Unit 02:Completed 30000 out of 500000 steps  (6%)
14:22:27:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:22:27:Unit 01: Uploading 6.81KiB
14:22:27:Connecting to 171.67.108.11:8080
14:22:28:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:22:28:Trying to send results to collection server
14:22:28:Unit 01: Uploading 6.81KiB
14:22:28:Connecting to 171.67.108.25:8080
14:22:30:WARNING: WorkServer connection failed on port 8080 trying 80
14:22:30:Connecting to 171.67.108.25:80
14:22:32:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
14:26:41:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:26:41:Unit 01: Uploading 6.81KiB
14:26:41:Connecting to 171.67.108.11:8080
14:26:43:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:26:43:Trying to send results to collection server
14:26:43:Unit 01: Uploading 6.81KiB
14:26:43:Connecting to 171.67.108.25:8080
14:26:44:WARNING: WorkServer connection failed on port 8080 trying 80
14:26:44:Connecting to 171.67.108.25:80
14:26:58:ERROR: Exception: Failed to connect to 171.67.108.25:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
14:33:33:Sending unit results: id:01 state:SEND project:5765 run:9 clone:365 gen:479 core:0x11 unit:0x1ba37bb64dc1acfd01df016d00091685
14:33:33:Unit 01: Uploading 6.81KiB
14:33:33:Connecting to 171.67.108.11:8080
14:33:33:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
14:33:33:Trying to send results to collection server
14:33:33:Unit 01: Uploading 6.81KiB
14:33:33:Connecting to 171.67.108.25:8080
14:33:34:WARNING: WorkServer connection failed on port 8080 trying 80
14:33:34:Connecting to 171.67.108.25:80
14:33:36:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
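For anyone comparing logs like the one above, here is a rough sketch of a script that tallies connection attempts and failures per server and port. It is not part of FAHClient. The function name `summarize` and the two regular expressions are my own assumptions, based only on the line shapes visible in this log, and each WARNING/ERROR line is simply attributed to the most recent "Connecting to" line.

```python
import re
from collections import Counter

# Assumed line shapes, copied from the log excerpt in this thread.
CONNECT = re.compile(r"^\d\d:\d\d:\d\d:Connecting to (\d+\.\d+\.\d+\.\d+):(\d+)")
FAILURE = re.compile(r"^\d\d:\d\d:\d\d:(?:WARNING|ERROR):.*[Ff]ailed")


def summarize(log_text):
    """Count connection attempts and failures per 'ip:port' endpoint.

    A failure line is attributed to the most recent 'Connecting to' line,
    and at most one failure is counted per attempt.
    """
    attempts = Counter()
    failures = Counter()
    last_endpoint = None
    for line in log_text.splitlines():
        match = CONNECT.match(line)
        if match:
            last_endpoint = f"{match.group(1)}:{match.group(2)}"
            attempts[last_endpoint] += 1
        elif last_endpoint and FAILURE.match(line):
            failures[last_endpoint] += 1
            last_endpoint = None  # do not count a second failure for the same attempt
    return attempts, failures


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_log.py log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        attempts, failures = summarize(handle.read())
    for endpoint in sorted(attempts):
        print(f"{endpoint}: {attempts[endpoint]} attempts, {failures[endpoint]} failures")
```

If every attempt to an endpoint shows up as a failure, as it does in this log, that points at the server side or the client bug rather than an intermittent network problem.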
Thanks in advance...