bruce wrote:
Some are special cases. For example, 140.163.4.231. It's in ACCEPT mode so it should be accepting WUs. Also, under Collection Server, it says YES so failed connections SHOULD be redirected to another server. Please post a segment of the log showing a pair of those attempts. (Was this server called a Work Server or a Collection Server in that context?)
Here are a couple of chunks from the logs; I hope they help:
12:21:04:WU00:FS02:0x22:Completed 4950000 out of 5000000 steps (99%)
12:21:04:WU01:FS02:Connecting to assign1.foldingathome.org:80
12:21:04:WU01:FS02:Assigned to work server 129.213.157.105
12:21:04:WU01:FS02:Requesting new work unit for slot 02: RUNNING gpu:1:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 129.213.157.105
12:21:04:WU01:FS02:Connecting to 129.213.157.105:8080
12:21:05:ERROR:WU01:FS02:Exception: Server did not assign work unit
12:21:05:WU01:FS02:Connecting to assign1.foldingathome.org:80
12:21:06:WU01:FS02:Assigned to work server 206.223.170.146
12:21:06:WU01:FS02:Requesting new work unit for slot 02: RUNNING gpu:1:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 206.223.170.146
12:21:06:WU01:FS02:Connecting to 206.223.170.146:8080
12:21:30:WU00:FS02:0x22:Completed 5000000 out of 5000000 steps (100%)
12:21:30:WU00:FS02:0x22:Average performance: 327.273 ns/day
12:21:31:WU00:FS02:0x22:Checkpoint completed at step 5000000
12:21:31:WU00:FS02:0x22:Saving result file ..\logfile_01.txt
12:21:31:WU00:FS02:0x22:Saving result file checkpointIntegrator.xml
12:21:31:WU00:FS02:0x22:Saving result file checkpointState.xml
12:21:31:WU00:FS02:0x22:Saving result file positions.xtc
12:21:31:WU00:FS02:0x22:Saving result file science.log
12:21:31:WU00:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
12:21:32:WU00:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:21:32:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:16921 run:97 clone:4 gen:160 core:0x22 unit:0x000000c10002894c5f5bf34bf68ce549
12:21:32:WU00:FS02:Uploading 4.99MiB to 155.247.166.220
12:21:32:WU00:FS02:Connecting to 155.247.166.220:8080
12:21:37:WU02:FS01:0x22:Completed 750000 out of 5000000 steps (15%)
12:21:37:WU02:FS01:0x22:Checkpoint completed at step 750000
12:21:37:ERROR:WU01:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
12:21:42:WU00:FS02:Upload complete
12:21:42:WU00:FS02:Server responded WORK_ACK (400)
12:21:42:WU00:FS02:Final credit estimate, 50477.00 points
12:21:42:WU00:FS02:Cleaning up
12:22:05:WU01:FS02:Connecting to assign1.foldingathome.org:80
12:22:06:WU01:FS02:Assigned to work server 129.213.157.105
12:22:06:WU01:FS02:Requesting new work unit for slot 02: READY gpu:1:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 129.213.157.105
12:22:06:WU01:FS02:Connecting to 129.213.157.105:8080
12:22:06:ERROR:WU01:FS02:Exception: Server did not assign work unit
12:22:08:WU03:FS00:0xa8:Completed 85000 out of 500000 steps (17%)
12:22:37:WU02:FS01:0x22:Completed 800000 out of 5000000 steps (16%)
12:23:34:WU02:FS01:0x22:Completed 850000 out of 5000000 steps (17%)
12:23:42:WU01:FS02:Connecting to assign1.foldingathome.org:80
12:23:43:WU01:FS02:Assigned to work server 206.223.170.146
12:23:43:WU01:FS02:Requesting new work unit for slot 02: READY gpu:1:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 206.223.170.146
12:23:43:WU01:FS02:Connecting to 206.223.170.146:8080
12:23:59:WU03:FS00:0xa8:Completed 90000 out of 500000 steps (18%)
12:24:04:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
12:24:04:WU01:FS02:Connecting to 206.223.170.146:80
12:24:11:WU02:FS01:0x22:Completed 900000 out of 5000000 steps (18%)
12:24:13:ERROR:WU01:FS02:Exception: Server did not assign work unit
12:24:43:WU02:FS01:0x22:Completed 950000 out of 5000000 steps (19%)
12:25:14:WU02:FS01:0x22:Completed 1000000 out of 5000000 steps (20%)
12:25:14:WU02:FS01:0x22:Checkpoint completed at step 1000000
12:25:45:WU02:FS01:0x22:Completed 1050000 out of 5000000 steps (21%)
12:25:57:WU03:FS00:0xa8:Completed 95000 out of 500000 steps (19%)
12:26:15:WU02:FS01:0x22:Completed 1100000 out of 5000000 steps (22%)
12:26:20:WU01:FS02:Connecting to assign1.foldingathome.org:80
12:26:20:WU01:FS02:Assigned to work server 206.223.170.146
12:26:20:WU01:FS02:Requesting new work unit for slot 02: READY gpu:1:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 206.223.170.146
12:26:20:WU01:FS02:Connecting to 206.223.170.146:8080
12:26:27:ERROR:WU01:FS02:Exception: Server did not assign work unit
And the second chunk:
03:13:55:WU03:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
03:13:56:WU03:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
03:13:56:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17309 run:0 clone:873 gen:57 core:0x22 unit:0x0000004512bc7d9a0000000000000369
03:13:56:WU03:FS02:Uploading 14.64MiB to 18.188.125.154
03:13:56:WU03:FS02:Connecting to 18.188.125.154:8080
03:13:56:WU12:FS02:Starting
<<<redacted>>>>
03:13:57:WU12:FS02:0x22:Digital signatures verified
03:13:57:WU12:FS02:0x22:Folding@home GPU Core22 Folding@home Core
03:13:57:WU12:FS02:0x22:Version 0.0.13
03:13:57:WU12:FS02:0x22: Checkpoint write interval: 62500 steps (5%) [20 total]
03:13:57:WU12:FS02:0x22: JSON viewer frame write interval: 12500 steps (1%) [100 total]
03:13:57:WU12:FS02:0x22: XTC frame write interval: 125000 steps (10%) [10 total]
03:13:57:WU12:FS02:0x22: Global context and integrator variables write interval: disabled
03:13:58:WU12:FS02:0x22:There are 4 platforms available.
03:13:58:WU12:FS02:0x22:Platform 0: Reference
03:13:58:WU12:FS02:0x22:Platform 1: CPU
03:13:58:WU12:FS02:0x22:Platform 2: OpenCL
03:13:58:WU12:FS02:0x22: opencl-device 1 specified
03:13:58:WU12:FS02:0x22:Platform 3: CUDA
03:13:58:WU12:FS02:0x22: cuda-device 1 specified
03:14:17:WU12:FS02:0x22:Attempting to create CUDA context:
03:14:17:WU12:FS02:0x22: Configuring platform CUDA
03:14:22:WU12:FS02:0x22: Using CUDA and gpu 1
03:14:22:WU12:FS02:0x22:Completed 0 out of 1250000 steps (0%)
03:14:23:WU12:FS02:0x22:Checkpoint completed at step 0
03:14:26:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:14:26:WU03:FS02:Trying to send results to collection server
03:14:26:WU03:FS02:Uploading 14.64MiB to 140.163.4.231
03:14:26:WU03:FS02:Connecting to 140.163.4.231:8080
03:14:33:WU10:FS00:0xa7:Completed 120000 out of 250000 steps (48%)
03:14:56:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:14:57:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17309 run:0 clone:873 gen:57 core:0x22 unit:0x0000004512bc7d9a0000000000000369
03:14:57:WU03:FS02:Uploading 14.64MiB to 18.188.125.154
03:14:57:WU03:FS02:Connecting to 18.188.125.154:8080
03:15:15:WU12:FS02:0x22:Completed 12500 out of 1250000 steps (1%)
03:15:27:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:15:27:WU03:FS02:Trying to send results to collection server
03:15:27:WU03:FS02:Uploading 14.64MiB to 140.163.4.231
03:15:27:WU03:FS02:Connecting to 140.163.4.231:8080
03:15:28:WU00:FS01:0x22:Completed 4900000 out of 5000000 steps (98%)
03:15:29:WU00:FS01:0x22:Checkpoint completed at step 4900000
03:15:33:WU10:FS00:0xa7:Completed 122500 out of 250000 steps (49%)
03:15:57:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:15:57:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17309 run:0 clone:873 gen:57 core:0x22 unit:0x0000004512bc7d9a0000000000000369
03:15:57:WU03:FS02:Uploading 14.64MiB to 18.188.125.154
03:15:57:WU03:FS02:Connecting to 18.188.125.154:8080
03:16:12:WU12:FS02:0x22:Completed 25000 out of 1250000 steps (2%)
03:16:28:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:16:28:WU03:FS02:Trying to send results to collection server
03:16:28:WU03:FS02:Uploading 14.64MiB to 140.163.4.231
03:16:28:WU03:FS02:Connecting to 140.163.4.231:8080
03:16:34:WU10:FS00:0xa7:Completed 125000 out of 250000 steps (50%)
03:16:58:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:17:09:WU12:FS02:0x22:Completed 37500 out of 1250000 steps (3%)
03:17:33:WU10:FS00:0xa7:Completed 127500 out of 250000 steps (51%)
03:17:35:WU11:FS02:Sending unit results: id:11 state:SEND error:NO_ERROR project:14909 run:45 clone:8 gen:72 core:0x22 unit:0x0000006181d59d695f526024535b608c
03:17:35:WU11:FS02:Uploading 16.92MiB to 129.213.157.105
03:17:35:WU11:FS02:Connecting to 129.213.157.105:8080
03:17:35:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17309 run:0 clone:873 gen:57 core:0x22 unit:0x0000004512bc7d9a0000000000000369
03:17:35:WU03:FS02:Uploading 14.64MiB to 18.188.125.154
03:17:35:WU03:FS02:Connecting to 18.188.125.154:8080
03:18:05:WARNING:WU11:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:18:05:WU11:FS02:Trying to send results to collection server
03:18:05:WU11:FS02:Uploading 16.92MiB to 129.213.40.229
03:18:05:WU11:FS02:Connecting to 129.213.40.229:8080
03:18:05:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:18:05:WU03:FS02:Trying to send results to collection server
03:18:05:WU03:FS02:Uploading 14.64MiB to 140.163.4.231
03:18:05:WU03:FS02:Connecting to 140.163.4.231:8080
03:18:06:WU12:FS02:0x22:Completed 50000 out of 1250000 steps (4%)
03:18:33:WU10:FS00:0xa7:Completed 130000 out of 250000 steps (52%)
03:18:35:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:18:36:ERROR:WU11:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:19:04:WU12:FS02:0x22:Completed 62500 out of 1250000 steps (5%)
03:19:06:WU12:FS02:0x22:Checkpoint completed at step 62500
03:19:33:WU10:FS00:0xa7:Completed 132500 out of 250000 steps (53%)
03:20:04:WU12:FS02:0x22:Completed 75000 out of 1250000 steps (6%)
03:20:12:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17309 run:0 clone:873 gen:57 core:0x22 unit:0x0000004512bc7d9a0000000000000369
03:20:12:WU03:FS02:Uploading 14.64MiB to 18.188.125.154
03:20:12:WU03:FS02:Connecting to 18.188.125.154:8080
03:20:32:WU10:FS00:0xa7:Completed 135000 out of 250000 steps (54%)
03:20:42:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:20:42:WU03:FS02:Trying to send results to collection server
03:20:42:WU03:FS02:Uploading 14.64MiB to 140.163.4.231
03:20:42:WU03:FS02:Connecting to 140.163.4.231:8080
03:21:03:WU12:FS02:0x22:Completed 87500 out of 1250000 steps (7%)
03:21:10:WU07:FS02:Sending unit results: id:07 state:SEND error:NO_ERROR project:14908 run:136 clone:2 gen:107 core:0x22 unit:0x0000008981d59d695f5260297031acbb
03:21:10:WU07:FS02:Uploading 19.79MiB to 129.213.157.105
03:21:10:WU07:FS02:Connecting to 129.213.157.105:8080
03:21:13:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0
03:21:31:WU10:FS00:0xa7:Completed 137500 out of 250000 steps (55%)
03:21:38:WU00:FS01:0x22:Completed 4950000 out of 5000000 steps (99%)
03:21:39:WU13:FS01:Connecting to assign1.foldingathome.org:80
03:21:39:WU13:FS01:Assigned to work server 18.188.125.154
03:21:39:WU13:FS01:Requesting new work unit for slot 01: gpu:10:0 TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 18.188.125.154
03:21:39:WU13:FS01:Connecting to 18.188.125.154:8080
03:21:40:WARNING:WU07:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
03:21:40:WU07:FS02:Trying to send results to collection server
03:21:40:WU07:FS02:Uploading 19.79MiB to 150.136.14.110
03:21:40:WU07:FS02:Connecting to 150.136.14.110:8080
03:21:41:WU13:FS01:Downloading 12.01MiB
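
In case it helps with reading chunks like these, here is a rough Python sketch that tallies which servers the failed attempts were talking to. It is only a heuristic, not anything the client itself does: the names `failures_by_server`, `LINE` and `CONNECT` are made up for this example, and it assumes that an `Exception:` line belongs to whichever host the same WU/slot most recently logged a `Connecting to` line for. Lines such as the port 8080 to 80 fallback warning are not counted because they carry no `Exception:` text.

```python
import re
from collections import Counter

# Timestamp, optional WARNING/ERROR level, then the WUxx:FSxx prefix seen in the logs.
LINE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d):(?:(?P<level>WARNING|ERROR):)?"
    r"(?P<wu>WU\d+):(?P<fs>FS\d+):(?P<msg>.*)$"
)
# "Connecting to <host>:<port>" messages (IP address or hostname).
CONNECT = re.compile(r"^Connecting to (?P<host>[\w.\-]+):(?P<port>\d+)$")


def failures_by_server(lines):
    """Count Exception lines per host.

    Assumption (heuristic): each exception is attributed to the host that
    the same work unit and slot last connected to.
    """
    last_host = {}        # (wu, fs) -> host most recently connected to
    failures = Counter()
    for raw in lines:
        m = LINE.match(raw.strip())
        if not m:
            continue      # skip redacted or otherwise unrecognised lines
        key = (m["wu"], m["fs"])
        c = CONNECT.match(m["msg"])
        if c:
            last_host[key] = c["host"]
        elif m["level"] and "Exception:" in m["msg"] and key in last_host:
            failures[last_host[key]] += 1
    return failures


# A few lines copied from the second chunk above.
sample = [
    "03:14:26:WU03:FS02:Connecting to 140.163.4.231:8080",
    "03:14:33:WU10:FS00:0xa7:Completed 120000 out of 250000 steps (48%)",
    "03:14:56:ERROR:WU03:FS02:Exception: 10002: Received short response, expected 512 bytes, got 0",
    "03:14:57:WU03:FS02:Connecting to 18.188.125.154:8080",
    "03:15:27:WARNING:WU03:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0",
]

for host, count in failures_by_server(sample).most_common():
    print(host, count)
```

To run it over a whole log, pass the open file object instead of `sample`, since the function only needs something it can iterate over line by line.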