The more powerful NVIDIA cards I have don't get project 13421. Out of my 4 NVIDIA cards, the only one that receives it is a GeForce GTX 660.
Driver 384.130
Ubuntu 16.04
The drivers are the binaries shipped by Ubuntu.
Nothing special otherwise; the CPU is an older i5-2320, and it's also doing CPU folding.
On the AMD systems, the setup is as follows.
2 systems run:
Ubuntu 16.04, with a mix of RX570 and RX560 cards. A Carrizo core (APU) is also affected by the failure issue.
AMDGPU 18.30-641594 from AMD's website
1 system runs Ubuntu 18.04
AMDGPU 20.20-1089974 from AMD's website
Interesting log snippets from the GeForce GTX 660:
$ grep ERROR log.txt |grep -v NO_ERROR
05:56:48:WU01:FS01:0x22:ERROR:NaNs detected in forces. 8 0
05:56:55:WU02:FS01:0x22:ERROR:NaNs detected in forces. 8 0
05:57:01:WU01:FS01:0x22:ERROR:Force RMSE error of 415.735 with threshold of 5
05:57:05:WU02:FS01:0x22:ERROR:NaNs detected in forces. 8 0
08:33:40:ERROR:WU02:FS00:Exception: Server did not assign work unit
08:51:06:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
08:51:12:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
12:29:02:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 9 0
12:29:58:ERROR:WU00:FS00:Exception: Could not get an assignment
12:29:59:ERROR:WU00:FS00:Exception: Could not get an assignment
12:35:13:ERROR:WU00:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
14:17:18:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 6 0
14:17:25:WU01:FS01:0x22:ERROR:Force RMSE error of 445.004 with threshold of 5
14:17:31:WU02:FS01:0x22:ERROR:Force RMSE error of 272.417 with threshold of 5
14:17:35:WU01:FS01:0x22:ERROR:Force RMSE error of 381.459 with threshold of 5
14:17:41:WU02:FS01:0x22:ERROR:Force RMSE error of 267.462 with threshold of 5
14:17:47:WU01:FS01:0x22:ERROR:Force RMSE error of 524.41 with threshold of 5
16:07:45:ERROR:WU01:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
19:59:36:ERROR:WU00:FS00:Exception: Could not get an assignment
13:51:25:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
00:14:07:WU01:FS01:0x22:ERROR:Force RMSE error of 165.891 with threshold of 5
00:14:12:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 5 0
00:14:18:WU01:FS01:0x22:ERROR:Force RMSE error of 143.528 with threshold of 5
00:14:24:WU00:FS01:0x22:ERROR:Force RMSE error of 1416.69 with threshold of 5
13:29:23:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
15:05:35:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
01:14:20:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
01:14:24:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:27:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:31:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 10 2
01:14:36:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 25 0
01:14:40:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:44:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 25 0
01:23:40:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
09:12:20:ERROR:WU01:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
20:41:57:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
20:42:04:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 2
20:42:07:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 21 0
06:19:54:ERROR:WU01:FS01:Exception: Transfer failed
06:21:45:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 21 0
09:26:20:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:27:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:31:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:35:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:39:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:44:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:48:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:53:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:57:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:27:02:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
12:44:31:ERROR:WU00:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
13:57:49:WU01:FS01:0x22:ERROR:Force RMSE error of 38.2303 with threshold of 5
13:57:53:WU02:FS01:0x22:ERROR:Force RMSE error of 17.6938 with threshold of 5
13:57:57:WU01:FS01:0x22:ERROR:Force RMSE error of 15.6476 with threshold of 5
13:58:02:WU02:FS01:0x22:ERROR:Force RMSE error of 65.3317 with threshold of 5
13:58:07:WU01:FS01:0x22:ERROR:Force RMSE error of 69.6833 with threshold of 5
13:58:11:WU02:FS01:0x22:ERROR:Force RMSE error of 69.8878 with threshold of 5
13:58:19:WU01:FS01:0x22:ERROR:Force RMSE error of 21.2261 with threshold of 5
13:58:22:WU02:FS01:0x22:ERROR:Force RMSE error of 20.6909 with threshold of 5
14:06:58:ERROR:WU01:FS01:Exception: Transfer failed
14:07:01:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 0
14:07:04:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 0
14:07:37:WU01:FS01:0x22:ERROR:Force RMSE error of 596.999 with threshold of 5
14:07:41:WU02:FS01:0x22:ERROR:Force RMSE error of 639.224 with threshold of 5
00:29:32:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 6 1
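To get a feel for how the core errors above split between the three failure messages, something like the following could be run over the log. This is only a rough sketch: the regular expression and the category labels (`nan_forces`, `force_rmse`, `forces_blowing_up`) are my own assumptions based on the lines shown here, not anything defined by the client.

```python
import re
from collections import Counter

# Core-side error lines look like
# "05:56:48:WU01:FS01:0x22:ERROR:NaNs detected in forces. 8 0".
# Client-side lines ("08:33:40:ERROR:WU02:FS00:Exception: ...") put ERROR
# right after the timestamp and are deliberately not matched here.
CORE_ERROR = re.compile(
    r"^\d\d:\d\d:\d\d:WU\d+:FS\d+:0x[0-9a-fA-F]+:ERROR:(.*)$"
)

def classify(message):
    """Map a core error message to a coarse, made-up category label."""
    if message.startswith("NaNs detected"):
        return "nan_forces"
    if message.startswith("Force RMSE error"):
        return "force_rmse"
    if "Forces are blowing up" in message:
        return "forces_blowing_up"
    return "other"

def tally_core_errors(lines):
    """Count core error lines per category."""
    counts = Counter()
    for line in lines:
        match = CORE_ERROR.match(line.strip())
        if match:
            counts[classify(match.group(1))] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "05:56:48:WU01:FS01:0x22:ERROR:NaNs detected in forces. 8 0",
    "05:57:01:WU01:FS01:0x22:ERROR:Force RMSE error of 415.735 with threshold of 5",
    "08:33:40:ERROR:WU02:FS00:Exception: Server did not assign work unit",
    "08:51:06:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0",
    "15:05:35:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.",
]

if __name__ == "__main__":
    for category, count in tally_core_errors(sample).items():
        print(category, count)
```

For the full log, the same function can be fed `open("log.txt")` instead of the sample list.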
Corresponding result uploads:
05:56:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:3533 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20206f5b5d3eda
05:56:55:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:3553 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20206741780e8c
05:57:02:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:3558 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f2020680636f542
05:57:05:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:3582 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f202071a2a1087f
08:51:02:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:3589 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f202071b9531fe5
08:51:06:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:6428 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a0aad570b6b
08:51:12:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:6588 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a1169e6179a
10:40:35:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:6600 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a13be52964a
12:29:00:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:4178 clone:19 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20708f0b1912b9
12:29:02:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:2996 clone:21 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4f716980b2e6
14:17:14:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:3154 clone:21 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1fc0d2cba70df7
14:17:19:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:1842 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d0c48905fca
14:17:25:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2023 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d4188a0f371
14:17:31:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:2039 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d6f62d34cdc
14:17:35:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2071 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d759e4e3649
14:17:42:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:2104 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d859c7c6854
14:17:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2133 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d905c76a65e
04:12:19:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:3338 clone:29 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1fc0fa9dba1177
13:51:26:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:5651 clone:22 gen:0 core:0x22 unit:0x0000000212bc7d9a5f2249eb66dbcde5
10:57:44:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:8173 clone:60 gen:0 core:0x22 unit:0x0000000012bc7d9a5f284d8ef58578e8
12:46:24:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:3471 clone:64 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1fc14ca6785ba0
00:14:03:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:1485 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1e0dbf9e5b4300
00:14:07:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7435 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a38f66b7bc1
00:14:13:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:7502 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a4078515492
00:14:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7514 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a40b385f85f
00:14:25:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:7521 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a401cc8b4ba
11:42:23:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:2077 clone:73 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1f4d7f134a2e4b
13:29:19:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:129 clone:74 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1d0fff3d8580ef
13:29:24:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:6591 clone:74 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a156d3abbbd
10:35:49:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:3660 clone:81 gen:0 core:0x22 unit:0x0000000012bc7d9a5f20206c40b32827
01:14:21:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7215 clone:37 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a2e8b55170c
01:14:24:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:45 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a30e9fd7089
01:14:28:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:49 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a31dbe82c13
01:14:32:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:53 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f1734bee9
01:14:36:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:60 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f22013846
01:14:41:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:65 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f654039ef
01:14:44:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:70 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2e79f7e1ba
13:55:52:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:6763 clone:54 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a1cfcfc3add
15:44:47:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:6603 clone:66 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a10fc9e5c4c
01:23:40:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:6070 clone:50 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249ffb817cf0c
20:41:58:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:5228 clone:21 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249e2e4f3496d
20:42:04:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:5224 clone:19 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249db0e4136af
20:42:07:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:5224 clone:25 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249db0ca65d7e
06:21:46:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4845 clone:2 gen:1 core:0x22 unit:0x0000000212bc7d9a5f2249ccacd50bb7
09:26:18:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:4844 clone:42 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249cc5eb8be00
09:26:20:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4718 clone:56 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd4738e6cfd6
09:26:27:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:1 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd465c1ef795
09:26:31:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:13 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd4947b4ce16
09:26:35:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:15 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd472fb0d1de
09:26:39:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:19 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd473e69abc7
09:26:44:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:25 gen:1 core:0x22 unit:0x0000000412bc7d9a5f20bd4418671006
09:26:49:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:26 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd4a67513fe6
09:26:53:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:31 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd4b97a650e4
09:26:58:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:34 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd4785cb695b
09:27:02:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:40 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd4ad606d382
13:57:50:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:41 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20709b1066bca2
13:57:53:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:44 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709da5e6d0ec
13:57:57:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:47 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709bf85a3e3b
13:58:03:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:56 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b617a0b23
13:58:07:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:62 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b69bc982f
13:58:11:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:67 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b820e8c20
13:58:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4505 clone:2 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709bb806f7d1
13:58:22:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4505 clone:14 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b3f0af7ee
14:07:01:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4499 clone:52 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709bbce5d549
14:07:04:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4499 clone:54 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b62a529cc
14:07:38:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4498 clone:24 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709bba692b0c
14:07:41:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4498 clone:30 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20709b863d1a62
00:29:32:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4060 clone:46 gen:1 core:0x22 unit:0x0000000112bc7d9a5f207094e404bfd2
Project 16600 does run fine on it though.
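Since several of the FAULTY uploads share the same run number (7208, 4708 and 4506, for example), it may help to group the "Sending unit results" lines by run. Below is a minimal sketch of how that could be done; the regular expression is an assumption derived from the upload lines in this post, and `summarize_uploads` is just a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pulls the status and the project/run/clone/gen fields out of lines like
# "... Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:3533 clone:15 gen:0 ..."
UPLOAD = re.compile(
    r"Sending unit results: id:\d+ state:SEND error:(\w+) "
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)

def summarize_uploads(lines):
    """Return {run: {status: count}} for every upload line found."""
    summary = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = UPLOAD.search(line)
        if match:
            status, _project, run, _clone, _gen = match.groups()
            summary[int(run)][status] += 1
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {run: dict(statuses) for run, statuses in summary.items()}

# Three lines copied from the upload log above.
sample = [
    "08:51:02:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:3589 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f202071b9531fe5",
    "01:14:24:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:45 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a30e9fd7089",
    "01:14:28:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:49 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a31dbe82c13",
]

if __name__ == "__main__":
    for run, statuses in sorted(summarize_uploads(sample).items()):
        print(run, statuses)
```

Runs that only ever show FAULTY across many clones would point at the work units themselves rather than at this particular card, but that is a guess on my side until someone with server access confirms it.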