Multiple WU's Fail downld/upld to 155.247.166.*
Moderators: Site Moderators, FAHC Science Team
Re: Multiple WU's Fail downld/upld to 155.247.166.*
I had a similar problem last weekend ... here's what worked for me:
Open FAHControl
Click on Pause
Exit FAHControl
Re-boot computer
Open FAHControl
Click on Fold
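For a headless Linux box where FAHControl isn't handy, the same pause/fold cycle can be scripted against the client's local command interface. This is only a rough sketch: it assumes the default command port (36330), that local command access isn't password-protected, and that the "pause"/"unpause" commands on your install behave like the FAHControl buttons.
Code:
import socket
import time

FAH_HOST = "127.0.0.1"  # FAHClient's local command interface
FAH_PORT = 36330        # default command port; change it if you have reconfigured the client

def fah_command(cmd):
    """Send one command (e.g. 'pause' or 'unpause') to the running FAHClient."""
    with socket.create_connection((FAH_HOST, FAH_PORT), timeout=10) as sock:
        sock.recv(4096)                       # discard the welcome banner
        sock.sendall((cmd + "\n").encode())
        time.sleep(1)                         # give the client a moment to reply
        return sock.recv(65536).decode(errors="replace")

if __name__ == "__main__":
    print(fah_command("pause"))      # same effect as clicking Pause in FAHControl
    # ... reboot or restart the machine here, then run:
    # print(fah_command("unpause"))  # same effect as clicking Fold
The reboot step still has to happen outside this script, of course.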
Re: Multiple WU's Fail downld/upld to 155.247.166.*
As a CPU-only folder, I've found one interesting thing: the A7 project has never failed to download or upload for me.
Currently most A7 WUs are assigned from the A7-only server 128.252.203.9, but occasionally I get WUs from 155.247.166.219. Sometimes I get a transfer failure from 128.252.203.9, but I haven't had a SINGLE transfer failure from the .219 server in a month.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
My failure on the weekend was an A7 project ... unfortunately it happens to all projects regardless of the core. It does not happen every time, and some folders will not experience it for weeks at a time or maybe never, but it's certainly a 'pain' when it happens. The fact is that the core has nothing to do with the download/upload sequences ... that's all done by the client.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
This worked, thanks!
bollix47 wrote: I had a similar problem last weekend ... here's what worked for me:
Open FAHControl
Click on Pause
Exit FAHControl
Re-boot computer
Open FAHControl
Click on Fold
I tried stopping the fahclient and killing any remaining processes for user fahclient and then restarting the client but that didn't work in this case. Rebooting linux is not a satisfying solution for me but it worked.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
There's a known bug in FAHCore_A7. It has been fixed in a new version of that FAHCore and that version is being beta tested so it should be ready to release soon. The bug causes extra (unnecessary) information to be added to the result, making the file too large to upload. Excessively large uploads are being rejected by the servers. (Yours is 68 MiB and it should be maybe 10 MiB)
I would probably discard that file, but it will eventually expire and delete itself from your system.
Most likely you have been processing that WU with the "on idle" setting. I recommend you discontinue using that setting until the new CPU FAHCore_a7 is released.
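If you want to check whether one of these bloated results is sitting on your machine before the new core arrives, something like this can flag unusually large files under the client's work directory. Just a sketch: the path assumes the stock Debian/Ubuntu install (adjust for Windows/macOS or a custom data directory), and the 50 MiB threshold is an arbitrary cutoff well above the ~10 MiB a normal result should be.
Code:
import os

WORK_DIR = "/var/lib/fahclient/work"   # assumed stock Linux location; other installs differ
THRESHOLD = 50 * 1024 * 1024           # flag anything over ~50 MiB (a normal result is ~10 MiB)

def find_oversized(root=WORK_DIR, threshold=THRESHOLD):
    """Yield (path, size in MiB) for any file larger than the threshold."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue                # files can vanish while the client is running
            if size > threshold:
                yield path, size / (1024.0 * 1024.0)

if __name__ == "__main__":
    for path, mib in find_oversized():
        print("%8.1f MiB  %s" % (mib, path))
As noted above, an oversized WU that can't upload will eventually expire and remove itself, so there's no need to delete anything by hand.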
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
November 6 1600 EST - Temple server .219 is failing to download GPU work units.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
I concur - looks like the download issues are back with the 155.247.166.* server.
Catalina588 wrote: November 6 1600 EST - Temple server .219 is failing to download GPU work units.
Folding Stats (HFM.NET): DocJonz Folding Farm Stats
Re: Multiple WU's Fail downld/upld to 155.247.166.*
I am down on two out of four Folding machines. I will keep them down until/unless someone can give the "all clear".
Re: Multiple WU's Fail downld/upld to 155.247.166.*
What messages are you seeing when .219 doesn't issue a WU?
=====
The bug in the CPU FAHCore_A7 has been fixed, so all CPU WUs going out now will no longer be inflated ... consuming extra bandwidth. Over the next couple of weeks, the CPU WUs that are still being processed will be completed and the congestion problem will gradually be reduced.
=====
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
Same. I've been getting hung folding slots for the past 4+ hours. Downloads fail from *.219, and just stop like this:
Code:
2019-11-06:23:46:53:WU01:FS01:0x21:Completed 25000000 out of 25000000 steps (100%)
2019-11-06:23:46:53:WU01:FS01:0x21:Saving result file logfile_01.txt
2019-11-06:23:46:53:WU01:FS01:0x21:Saving result file checkpointState.xml
2019-11-06:23:46:53:WU01:FS01:0x21:Saving result file checkpt.crc
2019-11-06:23:46:53:WU01:FS01:0x21:Saving result file log.txt
2019-11-06:23:46:53:WU01:FS01:0x21:Saving result file positions.xtc
2019-11-06:23:46:54:WU01:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
2019-11-06:23:46:54:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
2019-11-06:23:46:54:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:14191 run:17 clone:13 gen:89 core:0x21 unit:0x0000007f0002894c5d5d742b6b992b05
2019-11-06:23:46:54:WU01:FS01:Uploading 9.30MiB to 155.247.166.220
2019-11-06:23:46:54:WU01:FS01:Connecting to 155.247.166.220:8080
2019-11-06:23:46:54:WU02:FS01:Connecting to 65.254.110.245:8080
2019-11-06:23:46:54:WU02:FS01:Assigned to work server 155.247.166.219
2019-11-06:23:46:54:WU02:FS01:Requesting new work unit for slot 01: READY gpu:1:GP104 [GeForce GTX 1080] 8873 from 155.247.166.219
2019-11-06:23:46:54:WU02:FS01:Connecting to 155.247.166.219:8080
2019-11-06:23:46:55:WU02:FS01:Downloading 27.50MiB
2019-11-06:23:47:00:WU01:FS01:Upload 11.43%
2019-11-06:23:47:02:WU02:FS01:Download 1.59%
2019-11-06:23:47:06:WU01:FS01:Upload 21.51%
2019-11-06:23:47:09:WU02:FS01:Download 2.27%
2019-11-06:23:47:13:WU01:FS01:Upload 30.92%
2019-11-06:23:47:17:WU02:FS01:Download 2.73%
2019-11-06:23:47:19:WU01:FS01:Upload 43.02%
2019-11-06:23:47:24:WU02:FS01:Download 3.18%
2019-11-06:23:47:25:WU01:FS01:Upload 56.46%
2019-11-06:23:47:31:WU01:FS01:Upload 67.88%
2019-11-06:23:47:33:WU02:FS01:Download 3.86%
2019-11-06:23:47:37:WU01:FS01:Upload 84.02%
2019-11-06:23:47:42:WU02:FS01:Download 4.32%
2019-11-06:23:47:43:WU01:FS01:Upload 96.79%
2019-11-06:23:47:45:WU01:FS01:Upload complete
2019-11-06:23:47:45:WU01:FS01:Server responded WORK_ACK (400)
2019-11-06:23:47:45:WU01:FS01:Final credit estimate, 192064.00 points
2019-11-06:23:47:45:WU01:FS01:Cleaning up
2019-11-06:23:48:21:WU02:FS01:Download 4.55%
2019-11-06:23:48:29:WU02:FS01:Download 4.77%
2019-11-06:23:48:35:WU02:FS01:Download 5.23%
2019-11-06:23:51:11:WU02:FS01:Download 5.45%
2019-11-06:23:51:12:ERROR:WU02:FS01:Exception: Transfer failed
2019-11-06:23:51:13:WU02:FS01:Connecting to 65.254.110.245:8080
2019-11-06:23:51:13:WU02:FS01:Assigned to work server 155.247.166.219
2019-11-06:23:51:13:WU02:FS01:Requesting new work unit for slot 01: READY gpu:1:GP104 [GeForce GTX 1080] 8873 from 155.247.166.219
2019-11-06:23:51:13:WU02:FS01:Connecting to 155.247.166.219:8080
2019-11-06:23:51:13:WU02:FS01:Downloading 27.45MiB
2019-11-06:23:51:21:WU02:FS01:Download 0.46%
2019-11-06:23:51:28:WU02:FS01:Download 1.59%
2019-11-06:23:51:54:WU02:FS01:Download 2.28%
2019-11-06:23:52:20:WU02:FS01:Download 2.96%
2019-11-06:23:52:26:WU02:FS01:Download 3.87%
2019-11-06:23:52:33:WU02:FS01:Download 4.55%
2019-11-06:23:52:39:WU02:FS01:Download 5.46%
2019-11-06:23:52:45:WU02:FS01:Download 6.15%
2019-11-06:23:52:56:WU02:FS01:Download 7.06%
2019-11-06:23:53:03:WU02:FS01:Download 7.51%
2019-11-06:23:53:10:WU02:FS01:Download 8.42%
2019-11-06:23:53:20:WU02:FS01:Download 8.65%
2019-11-06:23:53:26:WU02:FS01:Download 8.88%
2019-11-06:23:53:35:WU02:FS01:Download 9.56%
2019-11-06:23:53:42:WU02:FS01:Download 9.79%
2019-11-06:23:53:49:WU02:FS01:Download 10.47%
2019-11-06:23:53:56:WU02:FS01:Download 10.93%
2019-11-06:23:54:02:WU02:FS01:Download 11.15%
2019-11-06:23:54:26:WU02:FS01:Download 11.38%
2019-11-06:23:54:32:WU02:FS01:Download 12.29%
2019-11-06:23:54:40:WU02:FS01:Download 12.98%
2019-11-06:23:54:46:WU02:FS01:Download 13.43%
2019-11-06:23:54:53:WU02:FS01:Download 14.34%
2019-11-06:23:55:14:WU02:FS01:Download 14.57%
2019-11-06:23:55:20:WU02:FS01:Download 15.94%
2019-11-06:23:55:26:WU02:FS01:Download 16.85%
2019-11-06:23:55:32:WU02:FS01:Download 17.76%
2019-11-06:23:55:38:WU02:FS01:Download 19.35%
2019-11-06:23:55:45:WU02:FS01:Download 20.72%
2019-11-06:23:55:52:WU02:FS01:Download 21.63%
2019-11-06:23:55:58:WU02:FS01:Download 22.31%
2019-11-06:23:56:05:WU02:FS01:Download 23.22%
2019-11-06:23:56:11:WU02:FS01:Download 24.13%
2019-11-06:23:56:17:WU02:FS01:Download 25.04%
2019-11-06:23:56:36:WU02:FS01:Download 25.95%
2019-11-06:23:56:45:WU02:FS01:Download 26.18%
2019-11-06:23:56:53:WU02:FS01:Download 26.41%
Or, this one fixed itself. The download failed from *.219, but was OK from *.220:
Code:
2019-11-06:23:47:36:WU00:FS00:Connecting to 65.254.110.245:8080
2019-11-06:23:47:36:WU00:FS00:Assigned to work server 155.247.166.219
2019-11-06:23:47:36:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 155.247.166.219
2019-11-06:23:47:36:WU00:FS00:Connecting to 155.247.166.219:8080
2019-11-06:23:47:37:ERROR:WU00:FS00:Exception: Server did not assign work unit
2019-11-06:23:47:37:WU00:FS00:Connecting to 65.254.110.245:8080
2019-11-06:23:47:38:WU00:FS00:Assigned to work server 155.247.166.219
2019-11-06:23:47:38:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 155.247.166.219
2019-11-06:23:47:38:WU00:FS00:Connecting to 155.247.166.219:8080
2019-11-06:23:47:38:WU00:FS00:Downloading 27.49MiB
2019-11-06:23:47:52:WU00:FS00:Download 0.68%
2019-11-06:23:47:59:WU00:FS00:Download 0.91%
2019-11-06:23:48:09:WU00:FS00:Download 1.36%
2019-11-06:23:48:17:WU00:FS00:Download 2.05%
2019-11-06:23:48:25:WU00:FS00:Download 2.73%
2019-11-06:23:48:47:WU00:FS00:Download 2.96%
2019-11-06:23:49:45:WU00:FS00:Download 3.08%
2019-11-06:23:49:45:ERROR:WU00:FS00:Exception: Transfer failed
2019-11-06:23:49:45:WU00:FS00:Connecting to 65.254.110.245:8080
2019-11-06:23:49:46:WU00:FS00:Assigned to work server 155.247.166.220
2019-11-06:23:49:46:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 155.247.166.220
2019-11-06:23:49:46:WU00:FS00:Connecting to 155.247.166.220:8080
2019-11-06:23:49:46:WU00:FS00:Downloading 15.58MiB
2019-11-06:23:49:52:WU00:FS00:Download 89.88%
2019-11-06:23:49:52:WU00:FS00:Download complete
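For what it's worth, a quick way to see how often each work server is involved in these failures is to scan log.txt for the "Assigned to work server" and "ERROR:...Exception:" lines shown in the excerpts above and tally them. A rough sketch only: the log path assumes a stock Linux install, and each error is attributed to the server the same WU slot was last assigned to, which is close enough for a rough count.
Code:
import re
from collections import Counter

LOG_PATH = "/var/lib/fahclient/log.txt"   # assumed Linux location; point this at your own log

assigned_re = re.compile(r"(WU\d+):FS\d+:Assigned to work server (\S+)")
error_re = re.compile(r"ERROR:(WU\d+):FS\d+:Exception: (.+)")

def tally(path=LOG_PATH):
    """Count assignments per server and errors per (server, message), matched by WU id."""
    assignments = Counter()
    errors = Counter()
    last_server = {}                       # WU id -> server it was most recently assigned to
    with open(path, errors="replace") as log:
        for line in log:
            m = assigned_re.search(line)
            if m:
                last_server[m.group(1)] = m.group(2)
                assignments[m.group(2)] += 1
                continue
            m = error_re.search(line)
            if m:
                errors[(last_server.get(m.group(1), "unknown"), m.group(2).strip())] += 1
    return assignments, errors

if __name__ == "__main__":
    assignments, errors = tally()
    print("Assignments per server:")
    for server, n in assignments.most_common():
        print("  %-18s %d" % (server, n))
    print("Errors by server and message:")
    for (server, msg), n in errors.most_common():
        print("  %-18s %-32s %d" % (server, msg, n))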
Re: Multiple WU's Fail downld/upld to 155.247.166.*
Both of my 1070s were hung on 'download' this morning. 155.247.166.219 needs to be pulled until the underlying problem can be fixed. It's been over a month intermittently now. This is not a good look for the project.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
isabsolutefunk wrote:Both of my 1070s were hung on 'download' this morning. 155.247.166.219 needs to be pulled until the underlying problem can be fixed. It's been over a month intermittently now. This is not a good look for the project.
I don't think there's any chance that the server will be pulled. vav3.ocis.temple.edu is currently supporting about 25% of FAH's activity. Taking that capacity off-line would cause a huge disruption in your ability to get an assignment when you need one. I understand it looks bad and is an inconvenience for you, but pulling the server is unrealistic and would turn a moderate problem into a big one. New hardware is being ordered to handle the recent increase in production, but provisioning for that increase takes time and money.
Besides, the first step has been completed (fixing the FAHCore_a7 software), and rolling out that fix takes time because it cannot be called "complete" until all WUs currently in the field have been replaced with new ones, no matter how slow the Donor hardware happens to be.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
Wow, 25% - I thought the load was more distributed than that. These issues don't bother me that much, but the hanging downloads require manual intervention on our part, and for folders who don't (or can't) check their systems periodically, it means lost output. I'm hoping the next client release supports a hard timeout on downloads, which would help a lot.
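Until something like that lands in the client, a crude external watchdog is possible: re-scan log.txt and, if a download is open but no new progress line has appeared for a while, flag it (or restart the client). Only a sketch - the log path, the 15-minute limit, and the commented-out restart command (including the service name) are assumptions for a stock Linux install; as written it just prints a warning.
Code:
import re
import time

LOG_PATH = "/var/lib/fahclient/log.txt"   # assumed location; adjust for your install
STALL_LIMIT = 15 * 60                     # seconds with no new progress before calling it stalled
POLL_EVERY = 60                           # seconds between checks

# Lines that mark a download starting, progressing, finishing, or erroring out
# (matches the log excerpts earlier in this thread).
event_re = re.compile(r"(Downloading \S+|Download \d[\d.]*%|Download complete|Exception: .+)")

def last_event(path):
    """Return (line number, text) of the last download-related line in the log."""
    found = (None, None)
    with open(path, errors="replace") as log:
        for num, line in enumerate(log):
            if event_re.search(line):
                found = (num, line.strip())
    return found

def watch():
    prev, since = None, time.time()
    while True:
        num, text = last_event(LOG_PATH)
        now = time.time()
        if (num, text) != prev:               # something new happened; reset the timer
            prev, since = (num, text), now
        elif text and ("Downloading" in text or ("Download " in text and "%" in text)):
            if now - since > STALL_LIMIT:     # a download is open but hasn't moved
                print("Download looks stalled; last progress line:", text)
                # e.g. import subprocess; subprocess.run(["systemctl", "restart", "FAHClient"])
                since = now                   # don't re-fire on every poll
        time.sleep(POLL_EVERY)

if __name__ == "__main__":
    watch()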
Re: Multiple WU's Fail downld/upld to 155.247.166.*
My 25% number came from https://apps.foldingathome.org/serverstats.
There are a lot of servers currently off-line, and I don't know why. (Possibly related to a recent critical upgrade of the server software.)
The essential part of my post is that the problems are understood and are being addressed --- and it takes a lot of "red tape" to get enough signatures to spend the money needed for a new server (or servers).
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Multiple WU's Fail downld/upld to 155.247.166.*
This is also probably related to servers no longer being operated out of Stanford; the last one was shut off within the last couple of months. That leaves servers at WUSTL, Temple and MSKCC.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3