Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 7:54 am
by OVV
Hi all,
I'm new to the forum, but I've been folding for about two months.
I have a PC running SMP+GPU nearly 24/7, but I have a slow connection (about 30 kB/s upload), so I can't upload big files (bigger than 18 MB) within the timeout of the CS.
I have one P7904 completed on 03/01 at 23.65 MB, two P7903 completed on 07/01 at 28 MB each, and another P7903 running, but I can't upload them to the collection server within the 10-11 minutes before receiving an upload error!
I've also set the max-packet-size small flag in the client (V7.1.38) to avoid big WUs, but sometimes I receive them anyway.
Does anyone have any suggestions?
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 4:13 pm
by Napoleon
Welcome to the forums, OVV.
The timeout problem has also been discussed in viewtopic.php?f=67&t=18461&start=0#p186430. Unfortunately, I can't find a cookie-cutter solution for slow uploads there, but could you post some logs? You mentioned you were uploading to a CS (Collection Server), and those have long-standing issues that will be worked out in the future. Under normal circumstances uploads should go to a WS (Work Server) anyway, more specifically the one you originally got the WU from, unless I've misunderstood something. It could be that you're just seeing a temporary server issue.
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 4:45 pm
by OVV
Code:
23:09:43:Trying to send results to collection server
23:09:43:Unit 02: Uploading 23.65MiB to 129.74.85.16
23:09:43:Connecting to 129.74.85.16:8080
23:09:50:Unit 02: 3.54%
23:09:57:Unit 02: 4.48%
23:10:03:Unit 02: 5.19%
23:10:10:Unit 02: 6.13%
23:10:16:Unit 02: 6.84%
23:10:23:Unit 02: 7.78%
23:10:26:Unit 03:Completed 21500000 out of 50000000 steps (43%).
23:10:28:Unit 00:Completed 140000 out of 250000 steps (56%)
23:10:29:Unit 02: 8.49%
23:10:36:Unit 02: 9.43%
23:10:42:Unit 02: 10.13%
23:10:49:Unit 02: 11.07%
23:10:55:Unit 02: 11.78%
23:11:02:Unit 02: 12.72%
23:11:08:Unit 02: 13.43%
23:11:15:Unit 02: 14.37%
23:11:21:Unit 02: 15.08%
23:11:28:Unit 02: 16.02%
23:11:34:Unit 02: 16.73%
23:11:41:Unit 02: 17.68%
23:11:43:Unit 00:Completed 142500 out of 250000 steps (57%)
23:11:47:Unit 02: 18.37%
23:11:54:Unit 02: 19.31%
23:12:00:Unit 02: 20.02%
23:12:07:Unit 02: 20.96%
23:12:13:Unit 02: 21.67%
23:12:20:Unit 02: 22.61%
23:12:26:Unit 02: 23.32%
23:12:33:Unit 02: 24.27%
23:12:39:Unit 02: 24.98%
23:12:46:Unit 02: 25.92%
23:12:48:Unit 03:Completed 22000000 out of 50000000 steps (44%).
23:12:52:Unit 02: 26.61%
23:12:59:Unit 00:Completed 145000 out of 250000 steps (58%)
23:12:59:Unit 02: 27.55%
23:13:05:Unit 02: 28.26%
23:13:12:Unit 02: 29.21%
23:13:18:Unit 02: 29.92%
23:13:25:Unit 02: 30.86%
23:13:31:Unit 02: 31.57%
23:13:38:Unit 02: 32.51%
23:13:44:Unit 02: 33.22%
23:13:51:Unit 02: 34.16%
23:13:57:Unit 02: 34.86%
23:14:04:Unit 02: 35.80%
23:14:10:Unit 02: 36.51%
23:14:14:Unit 00:Completed 147500 out of 250000 steps (59%)
23:14:17:Unit 02: 37.45%
23:14:24:Unit 02: 38.39%
23:14:30:Unit 02: 39.10%
23:14:36:Unit 02: 39.81%
23:14:43:Unit 02: 40.75%
23:14:49:Unit 02: 41.58%
23:14:56:Unit 02: 42.40%
23:15:02:Unit 02: 43.33%
23:15:09:Unit 02: 44.04%
23:15:09:Unit 03:Completed 22500000 out of 50000000 steps (45%).
23:15:16:Unit 02: 44.98%
23:15:22:Unit 02: 45.69%
23:15:28:Unit 00:Completed 150000 out of 250000 steps (60%)
23:15:29:Unit 02: 46.63%
23:15:35:Unit 02: 47.34%
23:15:42:Unit 02: 48.28%
23:15:48:Unit 02: 49.00%
23:15:55:Unit 02: 49.94%
23:16:01:Unit 02: 50.65%
23:16:08:Unit 02: 51.59%
23:16:14:Unit 02: 52.28%
23:16:21:Unit 02: 53.22%
23:16:27:Unit 02: 53.93%
23:16:34:Unit 02: 54.88%
23:16:40:Unit 02: 55.59%
23:16:43:Unit 00:Completed 152500 out of 250000 steps (61%)
23:16:47:Unit 02: 56.53%
23:16:53:Unit 02: 57.24%
23:17:00:Unit 02: 58.18%
23:17:06:Unit 02: 58.89%
23:17:13:Unit 02: 59.83%
23:17:19:Unit 02: 60.53%
23:17:26:Unit 02: 61.48%
23:17:32:Unit 03:Completed 23000000 out of 50000000 steps (46%).
23:17:32:Unit 02: 62.18%
23:17:39:Unit 02: 63.12%
23:17:45:Unit 02: 63.83%
23:17:52:Unit 02: 64.77%
23:17:58:Unit 02: 65.48%
23:18:00:Unit 00:Completed 155000 out of 250000 steps (62%)
23:18:05:Unit 02: 66.42%
23:18:11:Unit 02: 67.13%
23:18:18:Unit 02: 68.07%
23:18:24:Unit 02: 68.79%
23:18:31:Unit 02: 69.73%
23:18:37:Unit 02: 70.42%
23:18:43:Unit 02: 71.13%
23:18:50:Unit 02: 72.07%
23:18:56:Unit 02: 72.78%
23:19:03:Unit 02: 73.72%
23:19:09:Unit 02: 74.43%
23:19:15:Unit 00:Completed 157500 out of 250000 steps (63%)
23:19:16:Unit 02: 75.38%
23:19:22:Unit 02: 76.07%
23:19:29:Unit 02: 77.03%
23:19:35:Unit 02: 77.72%
23:19:42:Unit 02: 78.66%
23:19:48:Unit 02: 79.37%
23:19:53:Unit 03:Completed 23500000 out of 50000000 steps (47%).
23:19:55:Unit 02: 80.32%
23:20:01:Unit 02: 81.03%
23:20:08:Unit 02: 81.97%
23:20:10:ERROR: Exception: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
This is the portion of the log file where my connection fails to upload the result.
I already had this issue some weeks ago, and I dumped some WUs because I was unable to send them to the CS. To try to avoid this problem I set the max-packet-size small flag, but some days ago I received this kind of WU again.
What is the maximum size in MB of the WUs that I should expect with this flag?
Thanks
OVV
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 5:17 pm
by PantherX
We need the portion before this line:
23:09:43:Trying to send results to collection server
The reason is that we want to see why it chose the CS instead of the WS. Please post the portion of the log after the WU reached 100%.
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 6:05 pm
by Nathan_P
It will be interesting to see the outcome of this. I routinely spend 1.5 hours uploading to the work servers with no problem. That is with v6, however. I know there were problems with some servers and v7; perhaps this is another one of those?
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 6:59 pm
by Joe_H
The flag "max-packet-size small" refers to the size of the WU downloaded from the servers, and corresponds to 5 MB in V7. Once processed, the results file can be much larger. So the flag will help you avoid WUs that have even larger result files than the one you are seeing here. Could you list the project information for the WU that is not uploading, along with the parts of the log that have already been asked for? In any case, I believe the timeout on uploads is fairly long. It is usually long enough for fairly slow connections, but yours might be below that threshold.
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 7:13 pm
by Napoleon
I looked up 129.74.85.16; it is actually a classic WS, not a CS. The 790x projects use the A4 core, so a classic WS looks OK to me. It seems to me that the OP is trying to upload to a WS as usual, not to a CS. I don't know what the WS timeout value is set to, but the logs show a timeout after 10 minutes and 27 seconds. That doesn't sound like a nice, round value human beings normally use. Maybe this problem isn't related to timeouts after all?
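The 10 minutes and 27 seconds mentioned above can be checked directly against the posted log. The following is a minimal sketch, assuming the v7.1.38 log layout seen in OVV's excerpt (an "Uploading ...MiB" line, progress lines ending in "%", and an "ERROR" line); the function names are hypothetical, and only a few lines of the excerpt are reproduced as sample data. It does not handle a log that crosses midnight.

```python
import re
from datetime import datetime

# A few lines copied from OVV's posted excerpt (first, an early progress
# line, the last progress line, and the error).
LOG = """\
23:09:43:Unit 02: Uploading 23.65MiB to 129.74.85.16
23:09:43:Connecting to 129.74.85.16:8080
23:09:50:Unit 02: 3.54%
23:20:08:Unit 02: 81.97%
23:20:10:ERROR: Exception: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
"""

def stamp(line):
    """Parse the leading HH:MM:SS timestamp of a log line."""
    return datetime.strptime(line[:8], "%H:%M:%S")

def summarize(log):
    """Return (seconds until the error, average upload rate in KiB/s)."""
    start = size_mib = last_pct = last_pct_time = failed_at = None
    for line in log.splitlines():
        m = re.search(r"Uploading ([\d.]+)MiB", line)
        if m:
            start, size_mib = stamp(line), float(m.group(1))
            continue
        m = re.search(r"Unit \d+: ([\d.]+)%$", line)
        if m:
            last_pct, last_pct_time = float(m.group(1)), stamp(line)
            continue
        if "ERROR" in line:
            failed_at = stamp(line)
    elapsed = (failed_at - start).total_seconds()
    sent_kib = size_mib * 1024 * last_pct / 100
    rate = sent_kib / (last_pct_time - start).total_seconds()
    return elapsed, rate

elapsed, rate = summarize(LOG)
print(f"failed after {elapsed:.0f} s, average rate {rate:.1f} KiB/s")
```

On this excerpt the elapsed time comes out at 627 seconds and the average rate at a little under 32 KiB/s, which matches the roughly 30 kB/s upload speed OVV reported.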
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 7:47 pm
by OVV
ATM I don't have the log between the WU reaching 100% and the first upload attempt because I've cleared the log, but tomorrow morning (for me in Italy) I will complete another 7903, so I will paste the full log.
As for the question about CS and WS, I'm probably not clear on the difference between the two servers...
The timeout of ten minutes is probably perfect for most FAH users, but for those with slow ADSL it is a big problem.
I've managed WUs of 18 MB with some trouble, but 23 or, even worse, 28 MB is definitely too much for my connection.
Is there a way to upload the results from another site instead of the physical location of the computer that is running FAH?
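The arithmetic behind these sizes can be sketched as follows. This is only an illustration: the 30 kB/s rate is OVV's own estimate, and the 627-second limit is the interval observed in one log (10 minutes and 27 seconds), not a documented server setting. The function names are hypothetical.

```python
# Assumed figures taken from the posts above; neither is an official value.
UPLOAD_RATE_KB_S = 30               # approximate upload rate, kilobytes per second
OBSERVED_TIMEOUT_S = 10 * 60 + 27   # one upload was cut off after 10 min 27 s

def upload_seconds(size_mb, rate_kb_s=UPLOAD_RATE_KB_S):
    """Seconds needed to send size_mb megabytes (1 MB taken as 1024 kB)."""
    return size_mb * 1024 / rate_kb_s

def max_size_mb(timeout_s, rate_kb_s=UPLOAD_RATE_KB_S):
    """Largest result file, in MB, that fits inside timeout_s seconds."""
    return timeout_s * rate_kb_s / 1024

for size in (18, 23.65, 28):
    needed = upload_seconds(size)
    verdict = "fits" if needed <= OBSERVED_TIMEOUT_S else "too slow"
    print(f"{size:>6} MB needs {needed:6.0f} s -> {verdict}")

print(f"limit at this rate: {max_size_mb(OBSERVED_TIMEOUT_S):.1f} MB")
```

Under these assumptions the limit works out to a little over 18 MB, which agrees with the report that 18 MB results go through with some trouble while 23 MB and 28 MB results do not.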
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 8:08 pm
by bruce
Quoted from viewtopic.php?f=67&t=20450&p=203626#p203577
bs_texas wrote:Most of my folding these days, except for my PS3, is via GPU with the 7.1.38 version. I could probably triple my work units, but I can't seem to upload any CPU results anymore on the old, slow, ancient, crappy, bad, slow, 30 year old, unreliable, old, slow.... 56-dial-up system, which is currently all that is available here!
Ticket #94 was accepted to provide dial-up support for V7 eventually, but it's not something that I'd expect any time soon. Obviously if 56kb lines are supported, you shouldn't have any trouble with your 30kB/s connection (note the "b" and "B" for bits and Bytes).
You'll need to consider the setting of max-packet-size to small/normal/big. You may need to restrict the packet size based on your line speed. The SMP projects will probably never have small packets, but some of the uniprocessor projects and some of the GPU projects will be smaller than others and are more likely to work within the limits of your connection speed.
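The bits-versus-Bytes distinction noted in the post above is easy to get wrong, so here is a small sketch of the conversion. The helper names are hypothetical, and the figures are the nominal ones quoted in the thread (a 56 kb/s dial-up line and a 30 kB/s ADSL upload), not measured rates.

```python
BITS_PER_BYTE = 8

def kbit_to_kbyte(kbit_per_s):
    """Convert kilobits per second (kb/s) to kilobytes per second (kB/s)."""
    return kbit_per_s / BITS_PER_BYTE

def kbyte_to_kbit(kbyte_per_s):
    """Convert kilobytes per second (kB/s) to kilobits per second (kb/s)."""
    return kbyte_per_s * BITS_PER_BYTE

dialup_kbyte = kbit_to_kbyte(56)   # nominal dial-up line, in kB/s
adsl_kbit = kbyte_to_kbit(30)      # OVV's upload rate, in kb/s

print(f"56 kb/s dial-up is {dialup_kbyte:.0f} kB/s")
print(f"30 kB/s ADSL upload is {adsl_kbit:.0f} kb/s")
print(f"the ADSL upload is about {30 / dialup_kbyte:.1f} times the dial-up rate")
```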
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 8:56 pm
by OVV
With my GPU (Quadro 2000) I don't have any problems; the completed WUs are no bigger than 4 MB.
Bruce,
I've already set the small flag on max-packet-size. Many WUs are not big, so I don't have problems with them, but the 790x are too big...
The uniprocessor is an option, but my Xeon quad core is a little bit wasted running this type of WU.
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 9:27 pm
by bruce
OVV wrote: The uniprocessor is an option, but my Xeon quad core is a little bit wasted running this type of WU.
Maybe ... maybe not ... but we'll have to see what the future brings. I don't consider any donated work to be wasted unless the WU exceeds the Preferred Deadline and has to be reprocessed by someone else. For a long time, the uniprocessor work has depended mostly on FahCore_78 which has no QRB and which has long deadlines. New projects are mostly using FahCore_a4 and FahCore_b4.
There are some interesting possibilities. Consider that Core_a4 is getting a QRB and you're really working on some of the same projects that are being done with SMP. I've heard that Protomol (Core_b4) has some great potential but don't know much about it. It may also evolve into a multi-threaded application someday. Core_78 is the same very solid performer that it has been for a number of years, and I wouldn't discount the importance of the work it is doing.
I've seen some indications that there may be some realignment of the points some time in the future. The PPD for core_a4 uniprocessor projects seems to be superior to that of core_78 which might indicate that FAH is moving in that direction. (...and No, I don't have any knowledge when, if ever, this might happen.)
Back to your comment: If your Xeon quad is able to meet the Preferred Deadlines of SMP, including any communications delays that might happen, I support your choice of SMP. If it fails or it's marginal, then running one uniprocessor core per physical core is an excellent use of the equipment.
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 9:50 pm
by OVV
Ok Bruce,
tomorrow I will try the uniprocessor. If I have understood correctly, I can run 4 uniprocessor clients at the same time (one per physical core)? I will let you know how it works!
But I don't want to lose the 4 WUs (3 completed and one at 90%) that I can't upload within the timeout of the server. What can I do?
Re: Slow connection - Upload Problems
Posted: Mon Jan 09, 2012 11:00 pm
by bruce
The only way to upload the results is to use the capabilities in the client. I've asked about the timeout on that server but don't have an answer yet. I presume that your client has tried to upload each one more than once and that the timeout interval is the same each time the client tries to upload to a specific server.
YGPM.
Re: Slow connection - Upload Problems
Posted: Tue Jan 10, 2012 8:26 pm
by OVV
I'm working with 4 uniprocessor slots, and it seems okay. The WUs are very small, so I don't have any problem uploading them. Thanks for the suggestion! The PPD also seems in line with SMP, maybe a little less, but less is better than nothing!
Re: Slow connection - Upload Problems
Posted: Wed Jan 11, 2012 3:39 pm
by Zeta
I'm having a similar problem, but I'm not on a slow connection at all:
Code:
15:41:39: Config: <none>
15:41:39:******************************** Build ********************************
15:41:39: Version: 7.1.43
15:41:39: Date: Jan 2 2012
15:41:39: Time: 04:27:48
15:41:39: SVN Rev: 3223
15:41:39: Branch: fah/trunk/client
15:41:39: Compiler: GNU 4.1.2 20080704 (Red Hat 4.1.2-46)
15:41:39: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
15:41:39: -fno-unsafe-math-optimizations -msse2
15:41:39: Platform: linux2 2.6.18-164.11.1.el5
15:41:39: Bits: 64
15:41:39: Mode: Release
15:41:39:******************************* System ********************************
15:41:39: CPU: Quad-Core AMD Opteron(tm) Processor 2378
15:41:39: CPU ID: AuthenticAMD Family 16 Model 4 Stepping 2
15:41:39: CPUs: 4
15:41:39: Memory: 7.88GiB
15:41:39:Free Memory: 3.03GiB
15:41:39: Threads: POSIX_THREADS
15:41:39: On Battery: false
15:41:39: UTC offset: -5
15:41:39: PID: 14187
15:41:39: CWD: /home/xxxx/FAH/
15:41:39: OS: Linux 2.6.37.1-1.2-desktop x86_64
15:41:39: OS Arch: AMD64
15:41:39: GPUs: 1
15:41:39: GPU 0: NVIDIA:1 G92 [GeForce 9800 GTX]
15:41:39: CUDA: 1.1
15:41:39:CUDA Driver: 4000
15:41:39:***********************************************************************
15:41:39:<config>
15:41:39: <!-- Folding Slots -->
15:41:39:</config>
15:41:39:Trying to access database...
15:41:40:Successfully acquired database lock
15:41:40:Enabled folding slot 00: READY smp:4
15:41:40:WU01:FS00:Starting
15:41:40:WU01:FS00:Running FahCore: /home/xxxx/FAH/FAHCoreWrapper /home/xxxx/FAH/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a3.fah/FahCore_a3 -dir 01 -suffix 01 -version 701 -checkpoint 15 -np 4
15:41:40:WU01:FS00:Started FahCore on PID 14195
15:41:40:WU01:FS00:Core PID:14199
15:41:40:WU01:FS00:FahCore 0xa3 started
15:41:40:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:7903 run:51 clone:9 gen:18 core:0xa4 unit:0x0000001800ac9c214eca67e72622fb29
15:41:40:WU00:FS00:Uploading 28.82MiB to 128.113.12.161
15:41:40:WU00:FS00:Connecting to localhost:3333
15:41:40:WU01:FS00:0xa3:
15:41:40:WU01:FS00:0xa3:*------------------------------*
15:41:40:WU01:FS00:0xa3:Folding@Home Gromacs SMP Core
15:41:40:WU01:FS00:0xa3:Version 2.27 (Dec. 15, 2010)
15:41:40:WU01:FS00:0xa3:
15:41:40:WU01:FS00:0xa3:Preparing to commence simulation
15:41:40:WU01:FS00:0xa3:- Looking at optimizations...
15:41:40:WU01:FS00:0xa3:- Files status OK
15:41:40:WU01:FS00:0xa3:- Expanded 753721 -> 1428856 (decompressed 189.5 percent)
15:41:40:WU01:FS00:0xa3:Called DecompressByteArray: compressed_data_size=753721 data_size=1428856, decompressed_data_size=1428856 diff=0
15:41:40:WU01:FS00:0xa3:- Digital signature verified
15:41:40:WU01:FS00:0xa3:
15:41:40:WU01:FS00:0xa3:Project: 10132 (Run 92, Clone 2, Gen 46)
15:41:40:WU01:FS00:0xa3:
15:41:40:WU01:FS00:0xa3:Assembly optimizations on if available.
15:41:40:WU01:FS00:0xa3:Entering M.D.
15:41:46:WU00:FS00:Upload 33.18%
15:41:46:WU01:FS00:0xa3:Mapping NT from 4 to 4
15:41:46:WU01:FS00:0xa3:Completed 0 out of 2000000 steps (0%)
15:41:52:WU00:FS00:Upload 52.27%
15:41:58:WU00:FS00:Upload 69.84%
15:42:04:WU00:FS00:Upload 83.50%
15:42:10:WU00:FS00:Upload 93.26%
15:42:13:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
15:42:13:WU00:FS00:Trying to send results to collection server
15:42:13:WU00:FS00:Uploading 28.82MiB to 129.74.85.16
15:42:13:WU00:FS00:Connecting to localhost:3333
15:42:19:WU00:FS00:Upload 8.68%
15:42:40:WU00:FS00:Upload 9.33%
15:42:46:WU00:FS00:Upload 32.97%
15:42:52:WU00:FS00:Upload 52.49%
15:42:58:WU00:FS00:Upload 70.71%
15:43:04:WU00:FS00:Upload 89.58%
15:43:06:ERROR:WU00:FS00:Exception: 10001: Server responded: HTTP_BAD_GATEWAY
15:43:06:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:7903 run:51 clone:9 gen:18 core:0xa4 unit:0x0000001800ac9c214eca67e72622fb29
15:43:06:WU00:FS00:Uploading 28.82MiB to 128.113.12.161
15:43:06:WU00:FS00:Connecting to localhost:3333
15:43:12:WU00:FS00:Upload 14.75%
15:43:18:WU00:FS00:Upload 38.61%
15:43:24:WU00:FS00:Upload 47.07%
15:43:30:WU00:FS00:Upload 74.18%
15:43:36:WU00:FS00:Upload 96.73%
15:43:37:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
15:43:37:WU00:FS00:Trying to send results to collection server
15:43:37:WU00:FS00:Uploading 28.82MiB to 129.74.85.16
15:43:37:WU00:FS00:Connecting to localhost:3333
15:43:43:WU00:FS00:Upload 19.74%
15:43:49:WU00:FS00:Upload 43.38%
15:43:55:WU00:FS00:Upload 67.45%
15:44:01:WU00:FS00:Upload 80.90%
15:44:06:ERROR:WU00:FS00:Exception: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
15:44:06:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:7903 run:51 clone:9 gen:18 core:0xa4 unit:0x0000001800ac9c214eca67e72622fb29
15:44:06:WU00:FS00:Uploading 28.82MiB to 128.113.12.161
15:44:06:WU00:FS00:Connecting to localhost:3333
15:44:12:WU00:FS00:Upload 16.48%
15:44:18:WU00:FS00:Upload 35.35%
15:44:24:WU00:FS00:Upload 55.52%
15:44:30:WU00:FS00:Upload 81.77%
15:44:34:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
15:44:34:WU00:FS00:Trying to send results to collection server
15:44:34:WU00:FS00:Uploading 28.82MiB to 129.74.85.16
15:44:34:WU00:FS00:Connecting to localhost:3333
15:44:40:WU00:FS00:Upload 26.89%
15:44:46:WU00:FS00:Upload 54.66%
15:44:52:WU00:FS00:Upload 81.33%
15:44:56:ERROR:WU00:FS00:Exception: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
Maybe the full log will be helpful for both of our problems?
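To compare these failures with the ten-minute failure earlier in the thread, the upload attempts can be pulled out of the log one by one. The following is a minimal sketch, assuming the v7.1.43 layout shown above ("Uploading ...MiB to <address>", "Upload NN.NN%", and "Server responded: <code>"); the function name is hypothetical, and only the key lines of the first two attempts are reproduced as sample data.

```python
import re
from datetime import datetime

# Key lines of the first two attempts, copied from the log above.
LOG = """\
15:41:40:WU00:FS00:Uploading 28.82MiB to 128.113.12.161
15:42:10:WU00:FS00:Upload 93.26%
15:42:13:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_GATEWAY_TIME_OUT
15:42:13:WU00:FS00:Uploading 28.82MiB to 129.74.85.16
15:43:04:WU00:FS00:Upload 89.58%
15:43:06:ERROR:WU00:FS00:Exception: 10001: Server responded: HTTP_BAD_GATEWAY
"""

def attempts(log):
    """Return one dict per failed upload attempt found in the log text."""
    results, current = [], None
    for line in log.splitlines():
        when = datetime.strptime(line[:8], "%H:%M:%S")
        m = re.search(r"Uploading [\d.]+MiB to (\S+)", line)
        if m:
            # A new attempt starts with each "Uploading" line.
            current = {"server": m.group(1), "start": when, "last_pct": 0.0}
            continue
        if current is None:
            continue
        m = re.search(r"Upload ([\d.]+)%", line)
        if m:
            current["last_pct"] = float(m.group(1))
            continue
        m = re.search(r"Server responded: (\w+)", line)
        if m:
            current["seconds"] = (when - current["start"]).total_seconds()
            current["error"] = m.group(1)
            results.append(current)
            current = None
    return results

for a in attempts(LOG):
    print(f"{a['server']}: {a['last_pct']}% after {a['seconds']:.0f} s -> {a['error']}")
```

On the sample lines, both attempts fail in under a minute with most of the file already sent, which is consistent with the remark that this case does not involve a slow connection.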