Ongoing problem with 171.64.65.98

Moderators: Site Moderators, FAHC Science Team

LonePalm
Posts: 98
Joined: Thu Feb 26, 2009 7:27 pm
Location: Saint Marys, Georgia

Ongoing problem with 171.64.65.98

Post by LonePalm »

My ongoing fight with 171.64.65.98 is back in full swing. Now when I finish a WU I can't get another WU unless I reboot my computer.

This is beyond annoying.

Code: Select all

12:59:00:WU01:FS00:0x17:Completed 2000000 out of 2000000 steps (100%)
12:59:01:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
12:59:01:WU02:FS00:News: Welcome to Folding@Home
12:59:01:WU02:FS00:Assigned to work server 171.64.65.98
12:59:01:WU02:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
12:59:01:WU02:FS00:Connecting to 171.64.65.98:8080
12:59:04:ERROR:WU02:FS00:Exception: Server did not assign work unit
12:59:04:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
12:59:04:WU02:FS00:News: Welcome to Folding@Home
12:59:04:WU02:FS00:Assigned to work server 171.64.65.98
12:59:04:WU02:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
12:59:04:WU02:FS00:Connecting to 171.64.65.98:8080
12:59:07:ERROR:WU02:FS00:Exception: Server did not assign work unit
12:59:12:WU01:FS00:0x17:Saving result file logfile_01.txt
12:59:12:WU01:FS00:0x17:Saving result file checkpointState.xml
12:59:13:WU01:FS00:0x17:Saving result file checkpt.crc
12:59:13:WU01:FS00:0x17:Saving result file log.txt
12:59:14:WU01:FS00:0x17:Saving result file positions.xtc
12:59:15:WU01:FS00:0x17:Folding@home Core Shutdown: FINISHED_UNIT
12:59:15:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:59:15:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:7811 run:0 clone:354 gen:328 core:0x17 unit:0x0000015b0a3b1e8651db495773f2cecc
12:59:15:WU01:FS00:Uploading 4.26MiB to 171.64.65.98
12:59:15:WU01:FS00:Connecting to 171.64.65.98:8080
12:59:21:WU01:FS00:Upload 24.96%
12:59:27:WU01:FS00:Upload 57.25%
12:59:33:WU01:FS00:Upload 88.08%
12:59:39:WU01:FS00:Upload complete
12:59:39:WU01:FS00:Server responded WORK_ACK (400)
12:59:39:WU01:FS00:Final credit estimate, 9972.00 points
12:59:39:WU01:FS00:Cleaning up
13:00:04:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
13:00:04:WU02:FS00:News: Welcome to Folding@Home
13:00:04:WU02:FS00:Assigned to work server 171.64.65.98
13:00:04:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
13:00:04:WU02:FS00:Connecting to 171.64.65.98:8080
13:00:07:ERROR:WU02:FS00:Exception: Server did not assign work unit
13:01:41:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
13:01:42:WU02:FS00:News: Welcome to Folding@Home
13:01:42:WU02:FS00:Assigned to work server 171.64.65.98
13:01:42:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
13:01:42:WU02:FS00:Connecting to 171.64.65.98:8080
13:01:44:ERROR:WU02:FS00:Exception: Server did not assign work unit
.....
15:21:58:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
15:21:58:WU02:FS00:News: Welcome to Folding@Home
15:21:58:WU02:FS00:Assigned to work server 171.64.65.98
15:22:03:WU00:FS01:0xa3:Completed 375000 out of 500000 steps  (75%)
15:22:58:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
15:22:58:WU02:FS00:Connecting to 171.64.65.98:8080
15:23:00:ERROR:WU02:FS00:Exception: Server did not assign work unit
15:26:12:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
15:26:12:WU02:FS00:News: Welcome to Folding@Home
15:26:12:WU02:FS00:Assigned to work server 171.64.65.98
15:26:54:WU00:FS01:0xa3:Completed 380000 out of 500000 steps  (76%)
15:27:12:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
15:27:12:WU02:FS00:Connecting to 171.64.65.98:8080
15:27:15:ERROR:WU02:FS00:Exception: Server did not assign work unit
At this point I rebooted the computer and immediately received a WU.

Code: Select all

15:35:42:WU01:FS00:Connecting to assign-GPU.stanford.edu:80
15:35:43:WU01:FS00:News: Welcome to Folding@Home
15:35:43:WU01:FS00:Assigned to work server 171.64.65.98
15:35:43:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
15:35:43:WU01:FS00:Connecting to 171.64.65.98:8080
15:35:45:WU01:FS00:Downloading 1.55MiB
15:35:46:WU01:FS00:Download complete
15:35:46:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:7811 run:0 clone:51 gen:485 core:0x17 unit:0x000002020a3b1e8651db46a59778b39e
15:35:47:WU01:FS00:Starting
15:35:47:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 6656 -checkpoint 15 -gpu 0 -gpu-vendor ati
15:35:47:WU01:FS00:Started FahCore on PID 2540
15:35:50:WU01:FS00:Core PID:4748
15:35:50:WU01:FS00:FahCore 0x17 started
15:35:51:WU00:FS01:0xa3:- Looking at optimizations...
15:35:51:WU00:FS01:0xa3:- Working with standard loops on this execution.
15:35:51:WU00:FS01:0xa3:- Previous termination of core was improper.
15:35:51:WU00:FS01:0xa3:- Files status OK
15:35:52:WU00:FS01:0xa3:- Expanded 2166920 -> 3127244 (decompressed 144.3 percent)
15:35:52:WU00:FS01:0xa3:Called DecompressByteArray: compressed_data_size=2166920 data_size=3127244, decompressed_data_size=3127244 diff=0
15:35:52:WU00:FS01:0xa3:- Digital signature verified
15:35:52:WU00:FS01:0xa3:
15:35:52:WU00:FS01:0xa3:Project: 7507 (Run 0, Clone 93, Gen 478)
15:35:52:WU00:FS01:0xa3:
15:35:52:WU00:FS01:0xa3:Entering M.D.
15:35:52:WU01:FS00:0x17:*********************** Log Started 2013-10-26T15:35:52Z ***********************
15:35:52:WU01:FS00:0x17:Project: 7811 (Run 0, Clone 51, Gen 485)
15:35:52:WU01:FS00:0x17:Unit: 0x000002020a3b1e8651db46a59778b39e
15:35:52:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
15:35:52:WU01:FS00:0x17:Machine: 0
15:35:52:WU01:FS00:0x17:Reading tar file state.xml
15:35:52:WU01:FS00:0x17:Reading tar file system.xml
15:35:53:WU01:FS00:0x17:Reading tar file integrator.xml
15:35:53:WU01:FS00:0x17:Reading tar file core.xml
15:35:53:WU01:FS00:0x17:Digital signatures verified
15:35:58:WU00:FS01:0xa3:Using Gromacs checkpoints
15:35:58:WU00:FS01:0xa3:Mapping NT from 7 to 7 
15:35:59:WU00:FS01:0xa3:Resuming from checkpoint
15:35:59:WU00:FS01:0xa3:Verified 00/wudata_01.log
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.trr
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.xtc
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.edr
15:36:05:WU00:FS01:0xa3:Completed 372010 out of 500000 steps  (74%)
15:36:20:4:127.0.0.1:New Web connection
15:36:30:WARNING:Exception: 8:127.0.0.1: Send error: 10053: An established connection was aborted by the software in your host machine.
15:36:32:WU01:FS00:0x17:Completed 0 out of 2000000 steps (0%)
15:38:24:WU01:FS00:0x17:Completed 20000 out of 2000000 steps (1%)

What gives guys?
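For anyone wanting to quantify a stretch like this, here is a minimal sketch that counts the "Server did not assign work unit" errors in a pasted log and reports the time between the first and last one. The sample lines are copied from the log above; the function names and the regular expression are my own illustration, not part of FAHClient, and the script assumes all lines fall on the same day.

```python
import re
from datetime import datetime

# Sample lines copied from the log above; in practice, read them from log.txt.
SAMPLE = """\
12:59:04:ERROR:WU02:FS00:Exception: Server did not assign work unit
12:59:07:ERROR:WU02:FS00:Exception: Server did not assign work unit
13:00:07:ERROR:WU02:FS00:Exception: Server did not assign work unit
13:01:44:ERROR:WU02:FS00:Exception: Server did not assign work unit
15:23:00:ERROR:WU02:FS00:Exception: Server did not assign work unit
15:27:15:ERROR:WU02:FS00:Exception: Server did not assign work unit
15:35:45:WU01:FS00:Downloading 1.55MiB
"""

# Hypothetical pattern matching the error lines as they appear in this thread.
FAIL = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):ERROR:WU\d+:FS(\d+):"
    r"Exception: Server did not assign work unit"
)


def failed_assignments(text):
    """Return (slot, time) pairs for every failed work-unit request."""
    hits = []
    for line in text.splitlines():
        m = FAIL.match(line)
        if m:
            hits.append((m.group(2), datetime.strptime(m.group(1), "%H:%M:%S")))
    return hits


def summarize(text):
    """Return (count, timedelta between first and last failure)."""
    hits = failed_assignments(text)
    if not hits:
        return 0, None
    # Assumes no midnight rollover between the first and last failure.
    return len(hits), hits[-1][1] - hits[0][1]


if __name__ == "__main__":
    count, span = summarize(SAMPLE)
    print(f"{count} failed requests over {span}")
```

A count and time span like this makes it easier for a moderator to line the failures up with server-side records.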
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Ongoing problem with 171.64.65.98

Post by bruce »

Please post the top few pages of the log showing the hardware detection and the configuration settings.

Did this actually happen some 8 hours ago, or is your clock screwed up?
P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32

I am currently folding just on the 5x GTX 460s for approx. 70K PPD
Location: Salem. OR USA

Re: Ongoing problem with 171.64.65.98

Post by P5-133XL »

Rebooting your local machine does not change anything at the server end, so this is not a server issue. This type of problem can be a bear to diagnose. When folding is failing to connect, can you reach the work server in a browser by entering its IP address in the address bar?

Things that have caused similar problems:

1. A network connection that is being saturated by another application, such as BitTorrent. A saturated connection can cause packet acknowledgements to time out. After a reboot, BitTorrent doesn't start as fast as folding, so the connection is no longer saturated.
2. A poor-quality wireless network connection.
3. DHCP address conflicts.


I'm sure there are more.
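If a browser isn't handy, the same reachability test can be sketched in a few lines of Python. This is only an illustration: the hosts and ports are the ones shown in the log above, `can_connect` is a made-up helper name, and a successful connection only shows that the port is reachable, not that the server has work to assign.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Endpoints taken from the log in this thread.
    targets = [
        ("assign-GPU.stanford.edu", 80),  # assignment server
        ("171.64.65.98", 8080),           # work server
    ]
    for host, port in targets:
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Running it while the client is failing, and again after a reboot, would show whether the local network path changes between the two states.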
LonePalm
Posts: 98
Joined: Thu Feb 26, 2009 7:27 pm
Location: Saint Marys, Georgia

Re: Ongoing problem with 171.64.65.98

Post by LonePalm »

Everything was going fine there for several weeks. I have made no changes to my system or my FAH configuration. I am scrupulous about keeping everything up to date.
FWIW - I built my first PC from parts in 1982.

FAH has no problem getting to the server, just no WU until I reboot. I don't use bittorrent. I don't use a wireless router. The CPU folding hasn't had any problems.

Here is the beginning of my most recent log file.

Code: Select all

*********************** Log Started 2013-10-26T15:35:35Z ***********************
15:35:35:************************* Folding@home Client *************************
15:35:35:      Website: http://folding.stanford.edu/
15:35:35:    Copyright: (c) 2009-2013 Stanford University
15:35:35:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:35:35:         Args: --open-web-control
15:35:35:       Config: C:/ProgramData/FAHClient/config.xml
15:35:35:******************************** Build ********************************
15:35:35:      Version: 7.3.6
15:35:35:         Date: Feb 18 2013
15:35:35:         Time: 15:25:17
15:35:35:      SVN Rev: 3923
15:35:35:       Branch: fah/trunk/client
15:35:35:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
15:35:35:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
15:35:35:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
15:35:35:     Platform: win32 XP
15:35:35:         Bits: 32
15:35:35:         Mode: Release
15:35:35:******************************* System ********************************
15:35:35:          CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
15:35:35:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
15:35:35:         CPUs: 8
15:35:35:       Memory: 7.96GiB
15:35:35:  Free Memory: 6.05GiB
15:35:35:      Threads: WINDOWS_THREADS
15:35:35:  Has Battery: false
15:35:35:   On Battery: false
15:35:35:   UTC offset: -4
15:35:35:          PID: 6656
15:35:35:          CWD: C:/ProgramData/FAHClient
15:35:35:           OS: Windows 7 Professional
15:35:35:      OS Arch: AMD64
15:35:35:         GPUs: 1
15:35:35:        GPU 0: ATI:5 Tahiti XT [Radeon HD 7970]
15:35:35:         CUDA: Not detected
15:35:35:Win32 Service: false
15:35:35:***********************************************************************
15:35:35:<config>
15:35:35:  <!-- Folding Core -->
15:35:35:  <core-priority v='low'/>
15:35:35:
15:35:35:  <!-- Folding Slot Configuration -->
15:35:35:  <client-type v='advanced'/>
15:35:35:  <power v='full'/>
15:35:35:
15:35:35:  <!-- Network -->
15:35:35:  <proxy v=':8080'/>
15:35:35:
15:35:35:  <!-- User Information -->
15:35:35:  <passkey v='********************************'/>
15:35:35:  <team v='36120'/>
15:35:35:  <user v='LonePalm'/>
15:35:35:
15:35:35:  <!-- Folding Slots -->
15:35:35:  <slot id='0' type='GPU'>
15:35:35:    <client-type v='advanced'/>
15:35:35:    <next-unit-percentage v='100'/>
15:35:35:  </slot>
15:35:35:  <slot id='1' type='CPU'>
15:35:35:    <client-type v='advanced'/>
15:35:35:    <cpus v='-1'/>
15:35:35:  </slot>
15:35:35:</config>
15:35:35:Trying to access database...
15:35:35:Successfully acquired database lock
15:35:35:Enabled folding slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970]
15:35:35:Enabled folding slot 01: READY cpu:7
15:35:36:WU00:FS01:Starting
15:35:36:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe -dir 00 -suffix 01 -version 703 -lifeline 6656 -checkpoint 15 -np 7
15:35:37:WU00:FS01:Started FahCore on PID 5004
15:35:41:WU00:FS01:Core PID:4652
15:35:41:WU00:FS01:FahCore 0xa3 started
15:35:42:WU00:FS01:0xa3:
15:35:42:WU00:FS01:0xa3:*------------------------------*
15:35:42:WU00:FS01:0xa3:Folding@Home Gromacs SMP Core
15:35:42:WU00:FS01:0xa3:Version 2.27 (Dec. 15, 2010)
15:35:42:WU00:FS01:0xa3:
15:35:42:WU00:FS01:0xa3:Preparing to commence simulation
15:35:42:WU00:FS01:0xa3:- Ensuring status. Please wait.
15:35:42:WU01:FS00:Connecting to assign-GPU.stanford.edu:80
15:35:43:WU01:FS00:News: Welcome to Folding@Home
15:35:43:WU01:FS00:Assigned to work server 171.64.65.98
15:35:43:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:Tahiti XT [Radeon HD 7970] from 171.64.65.98
15:35:43:WU01:FS00:Connecting to 171.64.65.98:8080
15:35:45:WU01:FS00:Downloading 1.55MiB
15:35:46:WU01:FS00:Download complete
15:35:46:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:7811 run:0 clone:51 gen:485 core:0x17 unit:0x000002020a3b1e8651db46a59778b39e
15:35:47:WU01:FS00:Starting
15:35:47:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 703 -lifeline 6656 -checkpoint 15 -gpu 0 -gpu-vendor ati
15:35:47:WU01:FS00:Started FahCore on PID 2540
15:35:50:WU01:FS00:Core PID:4748
15:35:50:WU01:FS00:FahCore 0x17 started
15:35:51:WU00:FS01:0xa3:- Looking at optimizations...
15:35:51:WU00:FS01:0xa3:- Working with standard loops on this execution.
15:35:51:WU00:FS01:0xa3:- Previous termination of core was improper.
15:35:51:WU00:FS01:0xa3:- Files status OK
15:35:52:WU00:FS01:0xa3:- Expanded 2166920 -> 3127244 (decompressed 144.3 percent)
15:35:52:WU00:FS01:0xa3:Called DecompressByteArray: compressed_data_size=2166920 data_size=3127244, decompressed_data_size=3127244 diff=0
15:35:52:WU00:FS01:0xa3:- Digital signature verified
15:35:52:WU00:FS01:0xa3:
15:35:52:WU00:FS01:0xa3:Project: 7507 (Run 0, Clone 93, Gen 478)
15:35:52:WU00:FS01:0xa3:
15:35:52:WU00:FS01:0xa3:Entering M.D.
15:35:52:WU01:FS00:0x17:*********************** Log Started 2013-10-26T15:35:52Z ***********************
15:35:52:WU01:FS00:0x17:Project: 7811 (Run 0, Clone 51, Gen 485)
15:35:52:WU01:FS00:0x17:Unit: 0x000002020a3b1e8651db46a59778b39e
15:35:52:WU01:FS00:0x17:CPU: 0x00000000000000000000000000000000
15:35:52:WU01:FS00:0x17:Machine: 0
15:35:52:WU01:FS00:0x17:Reading tar file state.xml
15:35:52:WU01:FS00:0x17:Reading tar file system.xml
15:35:53:WU01:FS00:0x17:Reading tar file integrator.xml
15:35:53:WU01:FS00:0x17:Reading tar file core.xml
15:35:53:WU01:FS00:0x17:Digital signatures verified
15:35:58:WU00:FS01:0xa3:Using Gromacs checkpoints
15:35:58:WU00:FS01:0xa3:Mapping NT from 7 to 7 
15:35:59:WU00:FS01:0xa3:Resuming from checkpoint
15:35:59:WU00:FS01:0xa3:Verified 00/wudata_01.log
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.trr
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.xtc
15:36:05:WU00:FS01:0xa3:Verified 00/wudata_01.edr
15:36:05:WU00:FS01:0xa3:Completed 372010 out of 500000 steps  (74%)
15:36:20:4:127.0.0.1:New Web connection
15:36:30:WARNING:Exception: 8:127.0.0.1: Send error: 10053: An established connection was aborted by the software in your host machine.
15:36:32:WU01:FS00:0x17:Completed 0 out of 2000000 steps (0%)
LonePalm
Posts: 98
Joined: Thu Feb 26, 2009 7:27 pm
Location: Saint Marys, Georgia

Re: Ongoing problem with 171.64.65.98

Post by LonePalm »

OK, this is getting old. The moment I complain about something on the forums, the problem goes away. I have made no changes to my system, but suddenly I am getting WUs first time, every time.

To whoever solved the problem, my profuse thanks.
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Ongoing problem with 171.64.65.98

Post by bruce »

LonePalm wrote: To whoever solved the problem, my profuse thanks.
Servers do run out of WUs or have other problems. Pande Group members check on the servers and can fix either type of problem if they notice it. They also read the forum, so they could have spotted this topic and discovered the problem that way.

(Forum Mods/Admins can generally distinguish between server-based problems and problems with FAHClient. When I attempted to figure out when you actually had the problem, that's what I was trying to do, but I wasn't able to complete that process, so I didn't have anything to do with solving the problem.)
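On the question of when a log event actually happened: the "Log Started" header in the log above ends in "Z" (UTC), and the System section reports "UTC offset: -4". Assuming the per-line timestamps are also UTC, a rough sketch for converting them to the poster's local time could look like this. The function name and arguments are hypothetical, and it assumes at most one midnight rollover since the log started.

```python
from datetime import datetime, timedelta, timezone


def log_time_to_local(log_started, hms, utc_offset_hours):
    """Convert an HH:MM:SS log timestamp to local time.

    log_started: the 'Log Started' header value, e.g. '2013-10-26T15:35:35Z'
    hms: the timestamp prefix of a log line, e.g. '15:35:43'
    utc_offset_hours: the 'UTC offset' value from the System section
    """
    start = datetime.strptime(log_started, "%Y-%m-%dT%H:%M:%SZ")
    start = start.replace(tzinfo=timezone.utc)
    clock = datetime.strptime(hms, "%H:%M:%S").time()
    stamp = datetime.combine(start.date(), clock, tzinfo=timezone.utc)
    if stamp < start:
        # Assumption: the log wrapped past midnight exactly once.
        stamp += timedelta(days=1)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return stamp.astimezone(local_zone)


if __name__ == "__main__":
    # Values taken from the log header and System section in this thread.
    local = log_time_to_local("2013-10-26T15:35:35Z", "15:35:43", -4)
    print(local.strftime("%Y-%m-%d %H:%M:%S"))
```

Comparing the converted time with the forum post time gives a quick check on whether the log is recent or the machine's clock is off.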