Can't send work units for weeks now


eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Can't send work units for weeks now

Post by eberlyml »

I have been having problems sending work units back since early January, and have posted about specific server issues a few times. Rather than going back through the older ones, I will paste from the log of the latest one to see if anyone sees any reason for the problems I'm having. It started with the Windows 6.20 client, and after losing about 6 of the last 7 units I decided to remove the 6.20 client and install 6.23. Well, guess what! You guessed it: same thing on the first unit that finished with it. I have been folding with this same machine on the same network for close to a year now with no problems before this, so I guess I question whether it has anything to do with the servers anymore. Can someone please take a look at the log below and see if anything indicates what is going on here? I am aware of no changes in my network that could be causing this. All the server addresses give me an "OK" when put in a browser.

TIA,
PME

Code:

[19:24:09] Completed 198000 out of 200000 steps  (99%)
[19:39:12] Timer requesting checkpoint
[19:49:39] Completed 200000 out of 200000 steps  (100%)
[19:50:43] 
[19:50:43] Finished Work Unit:
[19:50:43] - Reading up to 2269856 from "work/wudata_01.trr": Read 2269856
[19:50:44] - Reading up to 121256 from "work/wudata_01.xtc": Read 121256
[19:50:44] xvg file size: 65369
[19:50:44] logfile size: 38466
[19:50:44] Leaving Run
[19:50:49] - Writing 2581287 bytes of core data to disk...
[19:50:49]   ... Done.
[19:50:49] - Shutting down core
[19:50:49] 
[19:50:49] Folding@home Core Shutdown: FINISHED_UNIT
[19:50:52] CoreStatus = 64 (100)
[19:50:52] Sending work to server
[19:50:52] Project: 3858 (Run 6376, Clone 0, Gen 28)


[19:50:52] + Attempting to send results [February 1 19:50:52 UTC]
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:8080)
[19:50:53] + Retrying using alternative port
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:80)
[19:50:53] - Error: Could not transmit unit 01 (completed February 1) to work server.
[19:50:53]   Keeping unit 01 in queue.
[19:50:53] Project: 3858 (Run 6376, Clone 0, Gen 28)


[19:50:53] + Attempting to send results [February 1 19:50:53 UTC]
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:8080)
[19:50:53] + Retrying using alternative port
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:80)
[19:50:53] - Error: Could not transmit unit 01 (completed February 1) to work server.


[19:50:53] + Attempting to send results [February 1 19:50:53 UTC]
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (171.65.103.100:8080)
[19:50:53] + Retrying using alternative port
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (171.65.103.100:80)
[19:50:53]   Could not transmit unit 01 to Collection server; keeping in queue.
[19:50:53] - Preparing to get new work unit...
[19:50:53] + Attempting to get work packet
[19:50:53] - Connecting to assignment server
[19:50:53] - Successful: assigned to (128.59.74.4).
[19:50:53] + News From Folding@Home: Welcome to Folding@Home
[19:50:54] Loaded queue successfully.
[19:50:56] Project: 3858 (Run 6376, Clone 0, Gen 28)


[19:50:56] + Attempting to send results [February 1 19:50:56 UTC]
[19:50:56] - Couldn't send HTTP request to server
[19:50:56] + Could not connect to Work Server (results)
[19:50:56]     (128.59.74.4:8080)
[19:50:56] + Retrying using alternative port
[19:50:56] - Couldn't send HTTP request to server
[19:50:56] + Could not connect to Work Server (results)
[19:50:56]     (128.59.74.4:80)
[19:50:56] - Error: Could not transmit unit 01 (completed February 1) to work server.


[19:50:56] + Attempting to send results [February 1 19:50:56 UTC]
[19:50:56] - Couldn't send HTTP request to server
[19:50:56] + Could not connect to Work Server (results)
[19:50:56]     (171.65.103.100:8080)
[19:50:56] + Retrying using alternative port
[19:50:56] - Couldn't send HTTP request to server
[19:50:56] + Could not connect to Work Server (results)
[19:50:56]     (171.65.103.100:80)
[19:50:56]   Could not transmit unit 01 to Collection server; keeping in queue.
[19:50:56] + Closed connections
eberlyml, team centos #48721
DanGe
Posts: 118
Joined: Sat Nov 08, 2008 2:46 am
Hardware configuration: 2018 Mac Mini / MacOS Catalina
MSI Radeon RX Vega 56 (eGPU via Sonnet Breakaway Box 550)
3.2 GHz 6-Core Intel Core i7
Location: California, United States

Re: Can't send work units for weeks now

Post by DanGe »

Are you able to download WUs?
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

DanGe wrote:Are you able to download WUs?

If you look, the following lines are in the middle of the log portion that I included in my original post.

[19:50:53] Could not transmit unit 01 to Collection server; keeping in queue.
[19:50:53] - Preparing to get new work unit...
[19:50:53] + Attempting to get work packet
[19:50:53] - Connecting to assignment server
[19:50:53] - Successful: assigned to (128.59.74.4).
[19:50:53] + News From Folding@Home: Welcome to Folding@Home
[19:50:54] Loaded queue successfully.
[19:50:56] Project: 3858 (Run 6376, Clone 0, Gen 28)


Does the new unit come from same server I can't connect to to send in a work unit, or is the assignment server different and it assigns this unit to be returned here? All seems kinda strange to me! :?

Thanks for looking,
PME
eberlyml, team centos #48721
DanGe
Posts: 118
Joined: Sat Nov 08, 2008 2:46 am
Hardware configuration: 2018 Mac Mini / MacOS Catalina
MSI Radeon RX Vega 56 (eGPU via Sonnet Breakaway Box 550)
3.2 GHz 6-Core Intel Core i7
Location: California, United States

Re: Can't send work units for weeks now

Post by DanGe »

eberlyml wrote:If you notice the following lines
Oops, lol :oops: Didn't see that...

According to the server status page right now, 128.59.74.4 has a load of 97 while 171.65.103.100 has a load of 208. I can see why you can't connect to the second one (208 is rather high), but I'm not sure if 97 is high for the first server.

When you typed the server addresses in your browser, did you enter in the port numbers (e.g. 171.65.103.100:8080)?

If you did all that and if it is not the servers, then the last thing I can think of is your firewall.
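
For anyone who wants to check the ports outside a browser, here is a minimal Python sketch (not part of the FAH client) that simply tries to open a TCP connection to each address and port from the log above. The function names and the 5-second timeout are my own choices, not anything FAH defines. A successful connect only shows the port is reachable from your machine; it does not prove an upload would be accepted.

```python
import socket

# Addresses and ports copied from the log in the first post.
ENDPOINTS = [
    ("128.59.74.4", 8080),
    ("128.59.74.4", 80),
    ("171.65.103.100", 8080),
    ("171.65.103.100", 80),
]


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def report(endpoints, probe=can_connect):
    """Return one human-readable line per (host, port) pair."""
    lines = []
    for host, port in endpoints:
        status = "reachable" if probe(host, port) else "NOT reachable"
        lines.append(f"{host}:{port} -> {status}")
    return lines
```

Running `print("\n".join(report(ENDPOINTS)))` on the folding machine prints one line per address/port pair, which makes it easy to compare port 8080 against port 80.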
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

DanGe wrote:
eberlyml wrote:If you notice the following lines
Oops, lol :oops: Didn't see that...

According to the server status page right now, 128.59.74.4 has a load of 97 while 171.65.103.100 has a load of 208. I can see why you can't connect to the second one (208 is rather high), but I'm not sure if 97 is high for the first server.

When you typed the server addresses in your browser, did you enter in the port numbers (e.g. 171.65.103.100:8080)?

Yes.

If you did all that and if it is not the servers, then the last thing I can think of is your firewall.
Checked the firewall logs numerous times, and tried turning the Windows f/w off a few times and then sending the finished WUs in. No change. Don't know why it would all of a sudden become a problem, or why it does not affect the download.
Anyway, thanks for thinking through this with me. Looks like this machine will be taken out of the folding mix, and maybe retired, as it serves no other real purpose anymore (only Windows machine left here anyway).

PME
eberlyml, team centos #48721
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Can't send work units for weeks now

Post by codysluder »

eberlyml wrote:Does the new unit come from same server I can't connect to to send in a work unit, or is the assignment server different and it assigns this unit to be returned here? All seems kinda strange to me! :?
Yes, it takes some time to understand how all the parts of FAH work together.

A new WU can come from a number of different servers. It must be returned to that same server or to a collection server. (The collection servers are supposed to be used only when the primary server is down or is overloaded.)
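
As a rough illustration of that fallback, here is a small Python sketch of the order in which the log in the first post appears to try things: work server on port 8080, then port 80, then the collection server on the same two ports. This is inferred from the log only, not taken from the client's source, and `try_send` is a hypothetical callback standing in for the real upload.

```python
def upload_attempts(work_server, collection_server, ports=(8080, 80)):
    """Yield (role, host, port) in the order the log suggests they are tried."""
    for role, host in (("work", work_server), ("collection", collection_server)):
        for port in ports:
            yield role, host, port


def send_result(work_server, collection_server, try_send):
    """Return the first (role, host, port) that accepts the upload.

    try_send(host, port) is a hypothetical callable returning True on success.
    None means every attempt failed and the unit stays in the queue.
    """
    for role, host, port in upload_attempts(work_server, collection_server):
        if try_send(host, port):
            return role, host, port
    return None
```

With the addresses from the log, `send_result("128.59.74.4", "171.65.103.100", try_send)` returns `None` when all four attempts fail, which corresponds to the "keeping in queue" lines.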

FAH has been experiencing a lot of server overload recently. The new server code will help that a lot when it comes on-line.
http://folding.typepad.com/news/2008/11/index.html
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

So if that is the case, I am getting a new work unit within seconds of not being able to send a finished unit back to the same server. Something is certainly wrong with that scenario!

Any other ideas certainly welcome here,
PME
eberlyml, team centos #48721
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

Just got to a place where I could check this machine again. It finished another WU today and again is unable to send it in. I let it run with the 6.20 client until it had five units it was not able to send back. Then I took off 6.20 and installed 6.23, and I'm starting the same process all over again. It has no problem getting a new work unit to fold, but can send very few back anymore. I hate to stop folding on that machine, but it's not doing anyone any good this way either.

PME
eberlyml, team centos #48721
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Can't send work units for weeks now

Post by bruce »

Nobody has asked you about your firewall / security software. There's a reasonable probability that if you have some good security software (i.e., something more effective than what is supplied by Microsoft), it's blocking outgoing connections.

Shut down any program that might be a security risk if it connects to the internet. Temporarily disable all non-MS security software. Run fah with the -send all flag. If something uploads, you'll need to reconfigure your security software to give the FAH software more rights to connect to Stanford.

Whether it works or not, reboot to set everything back to normal and post that segment of FAHlog.
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

OK, here's what I've done so far. I have disabled/routed around everything that could be a problem, including turning off the firewall in my DSL modem. I am still getting the same report in the FAHlog as I initially reported.

Now, one thing I have found. A little while ago I enabled logging on my DSL modem to see if it had anything to say about my problems. The security logs are showing this
:FIREWALL icmp check (1 of 2): Protocol: ICMP Src ip: 128.59.74.4 Dst ip: xx.xx.my.ip Type: Destination Unreachable Code: Communication with Destination Host is Administratively Prohibited
This happens every time I restart the fah client and it tries to send the finished work units.

So it logs it on the modem firewall even when the firewall is turned off on the modem. Does this have anything to do with my problems here?
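
In case it helps anyone read that entry, here is a small Python sketch that splits the modem log line into its fields. The regular expression is written only against the one line quoted above, so other firmware versions may format things differently. As far as I know, "Destination Unreachable" with "Communication with Destination Host is Administratively Prohibited" is ICMP type 3, code 10 in the standard numbering. Note that the line reports the work server's address as the source of the message.

```python
import re

# Pattern written against the single log line quoted above (an assumption
# about the modem's format, not a documented specification).
ICMP_LINE = re.compile(
    r"Protocol:\s*(?P<protocol>\S+)\s+"
    r"Src ip:\s*(?P<src>\S+)\s+"
    r"Dst ip:\s*(?P<dst>\S+)\s+"
    r"Type:\s*(?P<type>.+?)\s+"
    r"Code:\s*(?P<code>.+)$"
)


def parse_icmp_line(line):
    """Return the fields of a modem firewall ICMP log line, or None."""
    match = ICMP_LINE.search(line.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}
```

Feeding it the line above gives a dictionary with `protocol`, `src`, `dst`, `type` and `code` keys, which is handy if you want to count how often each source address shows up over a day of logging.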

ideas welcome,
PME
eberlyml, team centos #48721
Amaruk
Posts: 254
Joined: Fri Jun 20, 2008 3:57 am
Location: Watching from the Woods

Re: Can't send work units for weeks now

Post by Amaruk »

Code:

[19:50:49] Folding@home Core Shutdown: FINISHED_UNIT
[19:50:52] CoreStatus = 64 (100)
[19:50:52] Sending work to server
[19:50:52] Project: 3858 (Run 6376, Clone 0, Gen 28)


[19:50:52] + Attempting to send results [February 1 19:50:52 UTC]
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:8080)
[19:50:53] + Retrying using alternative port
[19:50:53] - Couldn't send HTTP request to server
[19:50:53] + Could not connect to Work Server (results)
[19:50:53]     (128.59.74.4:80)
[19:50:53] - Error: Could not transmit unit 01 (completed February 1) to work server.
[19:50:53]   Keeping unit 01 in queue.
[19:50:53] Project: 3858 (Run 6376, Clone 0, Gen 28)
Project 3858 uses FahCore_7c.exe - this is the only core of the five that I've used that required internet access.


If you haven't done so already, try giving FahCore_7c.exe permission to access the net.
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

Thanks for your response, but I guess I don't follow you here. I have tried sending many units many times without f/w's in place and always get the same message. I can always d/l a new work unit, and the client keeps trying to send finished units over and over but never succeeds. Do I need to do something with permissions on the core? (Not much of a Windows guy here.)
eberlyml, team centos #48721
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Can't send work units for weeks now

Post by John_Weatherman »

eberlyml wrote:.

Now, one thing I have found. A little while ago I enabled logging on my dsl modem to see if it had anything to say about my problems. the security logs are showing this
:FIREWALL icmp check (1 of 2): Protocol: ICMP Src ip: 128.59.74.4 Dst ip: xx.xx.my.ip Type: Destination Unreachable Code: Communication with Destination Host is Administratively Prohibited
This happens every time I restart the fah client and it tries to send the finished work units.

So it logs it on the modem firewall even when the firewall is turned off on the modem. Does this have anything to do with my problems here?

ideas welcome,
PME
Just a thought, are you actually turning off the firewall? Do you need to log in, for example, as an Administrator to turn it off? What is the type of modem you have, in case anyone else has had the same problem?
eberlyml
Posts: 23
Joined: Sun Dec 02, 2007 10:17 pm
Location: Southeast Pennsylvania

Re: Can't send work units for weeks now

Post by eberlyml »

John_Weatherman wrote:Just a thought, are you actually turning off the firewall? Do you need to log in, for example, as an Administrator to turn it off? What is the type of modem you have, in case anyone else has had the same problem?

I am logging in as admin and turning off the firewall. I have multiple firewalls on my network and have at one point routed around every one of them, and wired the machine in question directly to the modem, turned off the modem firewall and the Windows firewall for a very short time and tried sending the finished units. Got the same exact report from FAHlog as I do with everything in place. If indeed it is on this end it must be some Windows setting or some setting I've yet to find on the modem, which by the way is a Speedtouch 546. Could the ICMP protocol check in the modem have anything to do with this mess?
The modem indicates that this can be turned off only by CLI, as of yet I've not found any way of doing so but am still looking into that.

As always, thanks for your ideas

PME
eberlyml, team centos #48721
Xilikon
Posts: 155
Joined: Sun Dec 02, 2007 1:34 pm

Re: Can't send work units for weeks now

Post by Xilikon »

Nobody asked if you added a rule to allow port 8080 since uploads use that port and downloads use port 80. I forgot to enable port 8080 before and I was able to download a unit but not upload.