core22 0.0.10 released to full FAH!


JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

core22 0.0.10 released to full FAH!

Post by JohnChodera »

We're excited to roll out core22 0.0.10 to full FAH after extensive testing of updates since 0.0.5.

The new core has a number of improvements that will help us better support COVID-19 drug discovery collaborations, especially the COVID Moonshot (http://postera.ai/covid).

New Features:
  • Now built on the most up-to-date release of OpenMM: OpenMM 7.4.2
  • Average performance (in ns/day) is now printed out to the client log at the end of the WU.
  • JSON viewer frames are now written every 1% by default, and project managers can now flexibly control the viewer frame interval for each project.
  • Checkpoints are now written every 5% by default, and project managers can now flexibly control the checkpoint interval independent of the XTC frame write interval for each project.
  • Checkpoint, XTC, and viewer frame intervals are now written to client log
  • Adds experimental support for Intel GPUs
  • Slight performance improvement on some GPUs by enabling separate PME streams by default
Bugfixes and stability improvements:
  • Issues causing some AMD cards (AMD RX 460/470/480/570/580/590) to fail with the error
    Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
    were fixed in OpenMM (openmm/openmm#2608), and the fix is included in OpenMM 7.4.2
  • Project managers now have control over force and energy RMSE error checking thresholds, which should allow project managers to reduce RMSE energy and force errors for large projects by setting these thresholds larger.
  • Fixes bugs in resuming from projects that use CustomIntegrators (like 114xx for the COVID Moonshot) and correctly rewinds globals.csv files to ensure integrity of data when core is paused/resumed
Please post here if you have trouble with the core doing something that isn't returned to the work servers. Note that we do get all the data that gets uploaded, and regularly analyze failures for ways in which we can improve stability and performance, so there is no need to report issues that end up with uploaded WUs even if they fail. But we definitely want to hear about issues that impact your machine or cause WUs to fail without upload!

Some short projects may see slightly reduced performance due to more frequent checkpointing, in exchange for having to backtrack less when NaNs are encountered. We hope to tune this and further improve behavior, and have some ideas as to how we can keep the GPU running while the CPU is performing sanity checks and checkpoints in future core22 updates.
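
As a rough illustration of that trade-off, the sketch below turns the default percentages into step counts for a 2,000,000-step WU (the size of the WU in the client log posted later in this thread). The function name and the per-1% wall time used for the rework estimate are illustrative assumptions, not part of core22.

```python
def write_interval(total_steps, percent):
    """Return (steps between writes, number of writes) for a percentage interval."""
    steps = round(total_steps * percent / 100)
    return steps, total_steps // steps

TOTAL_STEPS = 2_000_000  # WU size taken from the client log in this thread

checkpoint = write_interval(TOTAL_STEPS, 5)  # default checkpoint interval
viewer = write_interval(TOTAL_STEPS, 1)      # default JSON viewer frame interval

# Worst case after a failure: everything since the last checkpoint is redone.
SECONDS_PER_PERCENT = 139  # assumed: roughly 2m19s per 1% on the GPU in that log
worst_case_rework_s = 5 * SECONDS_PER_PERCENT

print(checkpoint, viewer, worst_case_rework_s)
```

With these numbers a checkpoint is written every 100000 steps (20 per WU), and a failure just before a checkpoint would cost under 12 minutes of recomputation on that particular GPU.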

We're also hoping to make further performance updates in the coming weeks. Stay tuned!

~ John Chodera and the core22 team
mwroggenbuck
Posts: 127
Joined: Tue Mar 24, 2020 12:47 pm

Re: core22 0.0.10 released to full FAH!

Post by mwroggenbuck »

I believe that I have an error that was not reported back to the server.

I have other posts about an unknown enum happening after a finished unit, which then does not upload. I was hoping that the new core would fix this. Apparently it did not. I have attached the log file and the Windows event error. The only thing that I can see that is different is that the event trace does not start with ntdll.dll or an OpenCL DLL.

viewtopic.php?f=81&t=35482

Let me know if I can give you any more information or try something else. I see this problem quite often, but nobody else seems to report it. I could probably do a clean install of Windows 10 professional, but I would like to avoid that if possible. I have a lot of stuff on this computer.

Code:

17:42:01:WU00:FS00:Starting
17:42:01:WU02:FS00:Connecting to 155.247.166.220:8080
17:42:01:WU00:FS00:Running FahCore: D:\C_Alt\programs\FAHClient/FAHCoreWrapper.exe D:\C_Alt\data\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 4532 -checkpoint 5 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
17:42:01:WU00:FS00:Started FahCore on PID 13860
17:42:01:WU00:FS00:Core PID:5628
17:42:01:WU00:FS00:FahCore 0x22 started
17:42:02:WU00:FS00:0x22:*********************** Log Started 2020-06-19T17:42:01Z ***********************
17:42:02:WU00:FS00:0x22:*************************** Core22 Folding@home Core ***************************
17:42:02:WU00:FS00:0x22:       Core: Core22
17:42:02:WU00:FS00:0x22:       Type: 0x22
17:42:02:WU00:FS00:0x22:    Version: 0.0.10
17:42:02:WU00:FS00:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:42:02:WU00:FS00:0x22:  Copyright: 2020 foldingathome.org
17:42:02:WU00:FS00:0x22:   Homepage: https://foldingathome.org/
17:42:02:WU00:FS00:0x22:       Date: Jun 16 2020
17:42:02:WU00:FS00:0x22:       Time: 14:33:22
17:42:02:WU00:FS00:0x22:   Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
17:42:02:WU00:FS00:0x22:     Branch: core22-0.0.10
17:42:02:WU00:FS00:0x22:   Compiler: Visual C++ 2015
17:42:02:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:42:02:WU00:FS00:0x22:   Platform: win32 10
17:42:02:WU00:FS00:0x22:       Bits: 64
17:42:02:WU00:FS00:0x22:       Mode: Release
17:42:02:WU00:FS00:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:42:02:WU00:FS00:0x22:             <peastman@stanford.edu>
17:42:02:WU00:FS00:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 13860 -checkpoint 5
17:42:02:WU00:FS00:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
17:42:02:WU00:FS00:0x22:************************************ libFAH ************************************
17:42:02:WU00:FS00:0x22:       Date: Jun 15 2020
17:42:02:WU00:FS00:0x22:       Time: 18:05:04
17:42:02:WU00:FS00:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
17:42:02:WU00:FS00:0x22:     Branch: HEAD
17:42:02:WU00:FS00:0x22:   Compiler: Visual C++ 2015
17:42:02:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:42:02:WU00:FS00:0x22:   Platform: win32 10
17:42:02:WU00:FS00:0x22:       Bits: 64
17:42:02:WU00:FS00:0x22:       Mode: Release
17:42:02:WU00:FS00:0x22:************************************ CBang *************************************
17:42:02:WU00:FS00:0x22:       Date: Jun 16 2020
17:42:02:WU00:FS00:0x22:       Time: 14:31:33
17:42:02:WU00:FS00:0x22:   Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
17:42:02:WU00:FS00:0x22:     Branch: HEAD
17:42:02:WU00:FS00:0x22:   Compiler: Visual C++ 2015
17:42:02:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:42:02:WU00:FS00:0x22:   Platform: win32 10
17:42:02:WU00:FS00:0x22:       Bits: 64
17:42:02:WU00:FS00:0x22:       Mode: Release
17:42:02:WU00:FS00:0x22:************************************ System ************************************
17:42:02:WU00:FS00:0x22:        CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
17:42:02:WU00:FS00:0x22:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
17:42:02:WU00:FS00:0x22:       CPUs: 8
17:42:02:WU00:FS00:0x22:     Memory: 31.96GiB
17:42:02:WU00:FS00:0x22:Free Memory: 26.10GiB
17:42:02:WU00:FS00:0x22:    Threads: WINDOWS_THREADS
17:42:02:WU00:FS00:0x22: OS Version: 6.2
17:42:02:WU00:FS00:0x22:Has Battery: false
17:42:02:WU00:FS00:0x22: On Battery: false
17:42:02:WU00:FS00:0x22: UTC Offset: -4
17:42:02:WU00:FS00:0x22:        PID: 5628
17:42:02:WU00:FS00:0x22:        CWD: D:\C_Alt\data\FAHClient\work
17:42:02:WU00:FS00:0x22:********************************************************************************
17:42:02:WU00:FS00:0x22:Project: 11760 (Run 0, Clone 10093, Gen 34)
17:42:02:WU00:FS00:0x22:Unit: 0x0000003180fccb0a5e6f4428ade96bd3
17:42:02:WU00:FS00:0x22:Reading tar file core.xml
17:42:02:WU00:FS00:0x22:Reading tar file integrator.xml
17:42:02:WU00:FS00:0x22:Reading tar file state.xml
17:42:02:WU00:FS00:0x22:Reading tar file system.xml
17:42:02:WU00:FS00:0x22:Digital signatures verified
17:42:02:WU00:FS00:0x22:Folding@home GPU Core22 Folding@home Core
17:42:02:WU00:FS00:0x22:Version 0.0.10
17:42:02:WU00:FS00:0x22:  Checkpoint write interval: 100000 steps (5%) [20 total]
17:42:02:WU00:FS00:0x22:  JSON viewer frame write interval: 20000 steps (1%) [100 total]
17:42:02:WU00:FS00:0x22:  XTC frame write interval: 50000 steps (2.5%) [40 total]
17:42:02:WU00:FS00:0x22:  Global context and integrator variables write interval: disabled
17:42:12:WU00:FS00:0x22:Completed 0 out of 2000000 steps (0%)
17:42:13:WU02:FS00:Upload complete
17:42:13:WU02:FS00:Server responded WORK_ACK (400)
17:42:13:WU02:FS00:Final credit estimate, 92662.00 points
17:42:13:WU02:FS00:Cleaning up
17:44:32:WU00:FS00:0x22:Completed 20000 out of 2000000 steps (1%)
17:46:51:WU00:FS00:0x22:Completed 40000 out of 2000000 steps (2%)
17:49:10:WU00:FS00:0x22:Completed 60000 out of 2000000 steps (3%)
17:51:31:WU00:FS00:0x22:Completed 80000 out of 2000000 steps (4%)
17:53:50:WU00:FS00:0x22:Completed 100000 out of 2000000 steps (5%)
17:56:09:WU00:FS00:0x22:Completed 120000 out of 2000000 steps (6%)
17:58:28:WU00:FS00:0x22:Completed 140000 out of 2000000 steps (7%)
18:00:46:WU00:FS00:0x22:Completed 160000 out of 2000000 steps (8%)
18:03:05:WU00:FS00:0x22:Completed 180000 out of 2000000 steps (9%)
18:05:24:WU00:FS00:0x22:Completed 200000 out of 2000000 steps (10%)
18:05:59:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
18:07:43:WU00:FS00:0x22:Completed 220000 out of 2000000 steps (11%)
18:10:02:WU00:FS00:0x22:Completed 240000 out of 2000000 steps (12%)
******************************* Date: 2020-06-19 *******************************
18:12:20:WU00:FS00:0x22:Completed 260000 out of 2000000 steps (13%)
18:14:39:WU00:FS00:0x22:Completed 280000 out of 2000000 steps (14%)
18:16:59:WU00:FS00:0x22:Completed 300000 out of 2000000 steps (15%)
18:19:17:WU00:FS00:0x22:Completed 320000 out of 2000000 steps (16%)
18:21:36:WU00:FS00:0x22:Completed 340000 out of 2000000 steps (17%)
18:23:53:WU00:FS00:0x22:Completed 360000 out of 2000000 steps (18%)
18:26:11:WU00:FS00:0x22:Completed 380000 out of 2000000 steps (19%)
18:28:30:WU00:FS00:0x22:Completed 400000 out of 2000000 steps (20%)
18:30:48:WU00:FS00:0x22:Completed 420000 out of 2000000 steps (21%)
18:33:06:WU00:FS00:0x22:Completed 440000 out of 2000000 steps (22%)
18:35:24:WU00:FS00:0x22:Completed 460000 out of 2000000 steps (23%)
18:37:42:WU00:FS00:0x22:Completed 480000 out of 2000000 steps (24%)
18:40:00:WU00:FS00:0x22:Completed 500000 out of 2000000 steps (25%)
18:42:19:WU00:FS00:0x22:Completed 520000 out of 2000000 steps (26%)
18:44:37:WU00:FS00:0x22:Completed 540000 out of 2000000 steps (27%)
18:46:55:WU00:FS00:0x22:Completed 560000 out of 2000000 steps (28%)
18:49:13:WU00:FS00:0x22:Completed 580000 out of 2000000 steps (29%)
18:51:31:WU00:FS00:0x22:Completed 600000 out of 2000000 steps (30%)
18:53:49:WU00:FS00:0x22:Completed 620000 out of 2000000 steps (31%)
18:56:07:WU00:FS00:0x22:Completed 640000 out of 2000000 steps (32%)
18:58:25:WU00:FS00:0x22:Completed 660000 out of 2000000 steps (33%)
19:00:43:WU00:FS00:0x22:Completed 680000 out of 2000000 steps (34%)
19:03:01:WU00:FS00:0x22:Completed 700000 out of 2000000 steps (35%)
19:05:20:WU00:FS00:0x22:Completed 720000 out of 2000000 steps (36%)
19:07:38:WU00:FS00:0x22:Completed 740000 out of 2000000 steps (37%)
19:09:56:WU00:FS00:0x22:Completed 760000 out of 2000000 steps (38%)
19:12:14:WU00:FS00:0x22:Completed 780000 out of 2000000 steps (39%)
19:14:33:WU00:FS00:0x22:Completed 800000 out of 2000000 steps (40%)
19:16:51:WU00:FS00:0x22:Completed 820000 out of 2000000 steps (41%)
19:19:09:WU00:FS00:0x22:Completed 840000 out of 2000000 steps (42%)
19:21:27:WU00:FS00:0x22:Completed 860000 out of 2000000 steps (43%)
19:23:45:WU00:FS00:0x22:Completed 880000 out of 2000000 steps (44%)
19:26:03:WU00:FS00:0x22:Completed 900000 out of 2000000 steps (45%)
19:28:21:WU00:FS00:0x22:Completed 920000 out of 2000000 steps (46%)
19:30:39:WU00:FS00:0x22:Completed 940000 out of 2000000 steps (47%)
19:32:57:WU00:FS00:0x22:Completed 960000 out of 2000000 steps (48%)
19:35:14:WU00:FS00:0x22:Completed 980000 out of 2000000 steps (49%)
19:37:32:WU00:FS00:0x22:Completed 1000000 out of 2000000 steps (50%)
19:39:51:WU00:FS00:0x22:Completed 1020000 out of 2000000 steps (51%)
19:42:09:WU00:FS00:0x22:Completed 1040000 out of 2000000 steps (52%)
19:44:26:WU00:FS00:0x22:Completed 1060000 out of 2000000 steps (53%)
19:46:44:WU00:FS00:0x22:Completed 1080000 out of 2000000 steps (54%)
19:49:03:WU00:FS00:0x22:Completed 1100000 out of 2000000 steps (55%)
19:49:38:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
19:51:25:WU00:FS00:0x22:Completed 1120000 out of 2000000 steps (56%)
19:53:45:WU00:FS00:0x22:Completed 1140000 out of 2000000 steps (57%)
19:56:06:WU00:FS00:0x22:Completed 1160000 out of 2000000 steps (58%)
19:58:26:WU00:FS00:0x22:Completed 1180000 out of 2000000 steps (59%)
20:00:47:WU00:FS00:0x22:Completed 1200000 out of 2000000 steps (60%)
20:03:07:WU00:FS00:0x22:Completed 1220000 out of 2000000 steps (61%)
20:05:25:WU00:FS00:0x22:Completed 1240000 out of 2000000 steps (62%)
20:07:43:WU00:FS00:0x22:Completed 1260000 out of 2000000 steps (63%)
20:10:01:WU00:FS00:0x22:Completed 1280000 out of 2000000 steps (64%)
20:12:20:WU00:FS00:0x22:Completed 1300000 out of 2000000 steps (65%)
20:14:38:WU00:FS00:0x22:Completed 1320000 out of 2000000 steps (66%)
20:16:57:WU00:FS00:0x22:Completed 1340000 out of 2000000 steps (67%)
20:19:17:WU00:FS00:0x22:Completed 1360000 out of 2000000 steps (68%)
20:21:35:WU00:FS00:0x22:Completed 1380000 out of 2000000 steps (69%)
20:23:53:WU00:FS00:0x22:Completed 1400000 out of 2000000 steps (70%)
20:26:12:WU00:FS00:0x22:Completed 1420000 out of 2000000 steps (71%)
20:28:30:WU00:FS00:0x22:Completed 1440000 out of 2000000 steps (72%)
20:30:48:WU00:FS00:0x22:Completed 1460000 out of 2000000 steps (73%)
20:33:06:WU00:FS00:0x22:Completed 1480000 out of 2000000 steps (74%)
20:35:24:WU00:FS00:0x22:Completed 1500000 out of 2000000 steps (75%)
20:37:43:WU00:FS00:0x22:Completed 1520000 out of 2000000 steps (76%)
20:40:01:WU00:FS00:0x22:Completed 1540000 out of 2000000 steps (77%)
20:42:19:WU00:FS00:0x22:Completed 1560000 out of 2000000 steps (78%)
20:44:38:WU00:FS00:0x22:Completed 1580000 out of 2000000 steps (79%)
20:46:56:WU00:FS00:0x22:Completed 1600000 out of 2000000 steps (80%)
20:49:15:WU00:FS00:0x22:Completed 1620000 out of 2000000 steps (81%)
20:51:33:WU00:FS00:0x22:Completed 1640000 out of 2000000 steps (82%)
20:53:51:WU00:FS00:0x22:Completed 1660000 out of 2000000 steps (83%)
20:56:09:WU00:FS00:0x22:Completed 1680000 out of 2000000 steps (84%)
20:58:28:WU00:FS00:0x22:Completed 1700000 out of 2000000 steps (85%)
21:00:46:WU00:FS00:0x22:Completed 1720000 out of 2000000 steps (86%)
21:03:04:WU00:FS00:0x22:Completed 1740000 out of 2000000 steps (87%)
21:05:23:WU00:FS00:0x22:Completed 1760000 out of 2000000 steps (88%)
21:07:41:WU00:FS00:0x22:Completed 1780000 out of 2000000 steps (89%)
21:09:59:WU00:FS00:0x22:Completed 1800000 out of 2000000 steps (90%)
21:12:18:WU00:FS00:0x22:Completed 1820000 out of 2000000 steps (91%)
21:14:36:WU00:FS00:0x22:Completed 1840000 out of 2000000 steps (92%)
21:16:54:WU00:FS00:0x22:Completed 1860000 out of 2000000 steps (93%)
21:19:13:WU00:FS00:0x22:Completed 1880000 out of 2000000 steps (94%)
21:21:31:WU00:FS00:0x22:Completed 1900000 out of 2000000 steps (95%)
21:23:49:WU00:FS00:0x22:Completed 1920000 out of 2000000 steps (96%)
21:26:08:WU00:FS00:0x22:Completed 1940000 out of 2000000 steps (97%)
21:28:26:WU00:FS00:0x22:Completed 1960000 out of 2000000 steps (98%)
21:30:44:WU00:FS00:0x22:Completed 1980000 out of 2000000 steps (99%)
21:30:45:WU01:FS00:Connecting to assign1.foldingathome.org:80
21:30:45:WARNING:WU01:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
21:30:45:WU01:FS00:Connecting to assign2.foldingathome.org:80
21:30:45:WU01:FS00:Assigned to work server 140.163.4.241
21:30:45:WU01:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590] from 140.163.4.241
21:30:46:WU01:FS00:Connecting to 140.163.4.241:8080
21:30:58:WU01:FS00:Downloading 4.53MiB
21:30:58:WU01:FS00:Download complete
21:30:58:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11742 run:0 clone:4039 gen:188 core:0x22 unit:0x000001188ca304f15e69cb57a35e65ea
21:33:02:WU00:FS00:0x22:Completed 2000000 out of 2000000 steps (100%)
21:33:02:WU00:FS00:0x22:Average performance: 25.0072 ns/day
21:33:04:WU00:FS00:0x22:Saving result file ..\logfile_01.txt
21:33:04:WU00:FS00:0x22:Saving result file checkpointState.xml
21:33:05:WU00:FS00:0x22:Saving result file positions.xtc
21:33:05:WU00:FS00:0x22:Saving result file science.log
21:33:05:WU00:FS00:0x22:Folding@home Core Shutdown: FINISHED_UNIT
21:33:07:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
21:33:07:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
21:33:07:WU01:FS00:Starting
21:33:07:WU01:FS00:Running FahCore: D:\C_Alt\programs\FAHClient/FAHCoreWrapper.exe D:\C_Alt\data\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 4532 -checkpoint 5 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
21:33:07:WU01:FS00:Started FahCore on PID 10008
21:33:07:WU01:FS00:Core PID:13032
21:33:07:WU01:FS00:FahCore 0x22 started
21:33:08:WU01:FS00:0x22:*********************** Log Started 2020-06-19T21:33:07Z ***********************
21:33:08:WU01:FS00:0x22:*************************** Core22 Folding@home Core ***************************
21:33:08:WU01:FS00:0x22:       Core: Core22
21:33:08:WU01:FS00:0x22:       Type: 0x22
21:33:08:WU01:FS00:0x22:    Version: 0.0.10
21:33:08:WU01:FS00:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
21:33:08:WU01:FS00:0x22:  Copyright: 2020 foldingathome.org
21:33:08:WU01:FS00:0x22:   Homepage: https://foldingathome.org/
21:33:08:WU01:FS00:0x22:       Date: Jun 16 2020
21:33:08:WU01:FS00:0x22:       Time: 14:33:22
21:33:08:WU01:FS00:0x22:   Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
21:33:08:WU01:FS00:0x22:     Branch: core22-0.0.10
21:33:08:WU01:FS00:0x22:   Compiler: Visual C++ 2015
21:33:08:WU01:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
21:33:08:WU01:FS00:0x22:   Platform: win32 10
21:33:08:WU01:FS00:0x22:       Bits: 64
21:33:08:WU01:FS00:0x22:       Mode: Release
21:33:08:WU01:FS00:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
21:33:08:WU01:FS00:0x22:             <peastman@stanford.edu>
21:33:08:WU01:FS00:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10008 -checkpoint 5
21:33:08:WU01:FS00:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
21:33:08:WU01:FS00:0x22:************************************ libFAH ************************************
21:33:08:WU01:FS00:0x22:       Date: Jun 15 2020
21:33:08:WU01:FS00:0x22:       Time: 18:05:04
21:33:08:WU01:FS00:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
21:33:08:WU01:FS00:0x22:     Branch: HEAD
21:33:08:WU01:FS00:0x22:   Compiler: Visual C++ 2015
21:33:08:WU01:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
21:33:08:WU01:FS00:0x22:   Platform: win32 10
21:33:08:WU01:FS00:0x22:       Bits: 64
21:33:08:WU01:FS00:0x22:       Mode: Release
21:33:08:WU01:FS00:0x22:************************************ CBang *************************************
21:33:08:WU01:FS00:0x22:       Date: Jun 16 2020
21:33:08:WU01:FS00:0x22:       Time: 14:31:33
21:33:08:WU01:FS00:0x22:   Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
21:33:08:WU01:FS00:0x22:     Branch: HEAD
21:33:08:WU01:FS00:0x22:   Compiler: Visual C++ 2015
21:33:08:WU01:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
21:33:08:WU01:FS00:0x22:   Platform: win32 10
21:33:08:WU01:FS00:0x22:       Bits: 64
21:33:08:WU01:FS00:0x22:       Mode: Release
21:33:08:WU01:FS00:0x22:************************************ System ************************************
21:33:08:WU01:FS00:0x22:        CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
21:33:08:WU01:FS00:0x22:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
21:33:08:WU01:FS00:0x22:       CPUs: 8
21:33:08:WU01:FS00:0x22:     Memory: 31.96GiB
21:33:08:WU01:FS00:0x22:Free Memory: 26.09GiB
21:33:08:WU01:FS00:0x22:    Threads: WINDOWS_THREADS
21:33:08:WU01:FS00:0x22: OS Version: 6.2
21:33:08:WU01:FS00:0x22:Has Battery: false
21:33:08:WU01:FS00:0x22: On Battery: false
21:33:08:WU01:FS00:0x22: UTC Offset: -4
21:33:08:WU01:FS00:0x22:        PID: 13032
21:33:08:WU01:FS00:0x22:        CWD: D:\C_Alt\data\FAHClient\work
21:33:08:WU01:FS00:0x22:********************************************************************************
21:33:08:WU01:FS00:0x22:Project: 11742 (Run 0, Clone 4039, Gen 188)
21:33:08:WU01:FS00:0x22:Unit: 0x000001188ca304f15e69cb57a35e65ea
21:33:08:WU01:FS00:0x22:Reading tar file core.xml
21:33:08:WU01:FS00:0x22:Reading tar file integrator.xml
21:33:08:WU01:FS00:0x22:Reading tar file state.xml
21:33:08:WU01:FS00:0x22:Reading tar file system.xml
21:33:08:WU01:FS00:0x22:Digital signatures verified
21:33:08:WU01:FS00:0x22:Folding@home GPU Core22 Folding@home Core
21:33:08:WU01:FS00:0x22:Version 0.0.10
21:33:09:WU01:FS00:0x22:  Checkpoint write interval: 100000 steps (5%) [20 total]
21:33:09:WU01:FS00:0x22:  JSON viewer frame write interval: 20000 steps (1%) [100 total]
21:33:09:WU01:FS00:0x22:  XTC frame write interval: 50000 steps (2.5%) [40 total]
21:33:09:WU01:FS00:0x22:  Global context and integrator variables write interval: disabled
21:33:18:WU01:FS00:0x22:Completed 0 out of 2000000 steps (0%)
21:35:38:WU01:FS00:0x22:Completed 20000 out of 2000000 steps (1%)

Here is the Windows event log:

Code:

Faulting application name: FahCore_22.exe, version: 0.0.0.0, time stamp: 0x5ee8d83c
Faulting module name: FahCore_22.exe, version: 0.0.0.0, time stamp: 0x5ee8d83c
Exception code: 0xc0000409
Fault offset: 0x00000000008c4674
Faulting process id: 0x15fc
Faulting application start time: 0x01d64660f04bdffb
Faulting application path: D:\C_Alt\data\FAHClient\cores\cores.foldingathome.org\v7\win\64bit\Core_22.fah\FahCore_22.exe
Faulting module path: D:\C_Alt\data\FAHClient\cores\cores.foldingathome.org\v7\win\64bit\Core_22.fah\FahCore_22.exe
Report Id: cd6f529a-f46c-42ab-90e9-83c2dfbe177e
Faulting package full name: 
Faulting package-relative application ID: 
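
For anyone cross-checking the two logs, here is a minimal sketch that decodes the numeric fields above. It assumes only the standard Windows conventions (exit codes are unsigned 32-bit NTSTATUS values, and the start time is a FILETIME counting 100-nanosecond ticks since 1601-01-01 UTC). The variable names and the one-entry lookup table are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the client log and the Windows event log above.
exit_code_signed = -1073740791             # "FahCore returned: UNKNOWN_ENUM (...)"
faulting_pid_hex = "0x15fc"                # "Faulting process id"
start_filetime_hex = "0x01d64660f04bdffb"  # "Faulting application start time"

# The client prints the exit code as a signed 32-bit integer;
# masking recovers the unsigned NTSTATUS value.
ntstatus = exit_code_signed & 0xFFFFFFFF

# Small lookup table; only the code seen in this thread is listed.
KNOWN_NTSTATUS = {0xC0000409: "STATUS_STACK_BUFFER_OVERRUN"}
status_name = KNOWN_NTSTATUS.get(ntstatus, "unknown")

faulting_pid = int(faulting_pid_hex, 16)

# FILETIME: 100-nanosecond ticks since 1601-01-01 UTC.
ticks = int(start_filetime_hex, 16)
start_time = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)

print(hex(ntstatus), status_name)
print(faulting_pid)
print(start_time.isoformat())
```

The decoded process id is 5628, which matches "Core PID:5628" in the client log, and the start time falls at 17:42:01 UTC on 2020-06-19, when WU00 was started. So the event log entry does refer to the core that reported FINISHED_UNIT and then returned 0xc0000409.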
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: core22 0.0.10 released to full FAH!

Post by JohnChodera »

Thanks for the report! I'm investigating now.

Can you check if you're using the latest client version?

~ John Chodera // MSKCC
mwroggenbuck
Posts: 127
Joined: Tue Mar 24, 2020 12:47 pm

Re: core22 0.0.10 released to full FAH!

Post by mwroggenbuck »

According to help/about I am using 7.6.13

I have zipped up the work subdirectory of the FAH data directory. I think that I can use the PM system to send it to you if you want (or maybe you have another idea). It has all the data and log files from the failed run.
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: core22 0.0.10 released to full FAH!

Post by JohnChodera »

Please do! If that doesn't work, you can use the FAH issue tracker:
https://github.com/foldingathome/fah-issues/issues
mwroggenbuck
Posts: 127
Joined: Tue Mar 24, 2020 12:47 pm

Re: core22 0.0.10 released to full FAH!

Post by mwroggenbuck »

The zip file is ~37 MB. GitHub will only allow 10 MB.

Is there some other way to send this? Maybe anonymous FTP? You can give me instructions in a private message (I could not find a way to send a file via PM).
mwroggenbuck
Posts: 127
Joined: Tue Mar 24, 2020 12:47 pm

Re: core22 0.0.10 released to full FAH!

Post by mwroggenbuck »

John,

I sent you a private message with a link to the zipped up work directory. Please send me a PM once you have downloaded it so I can clean up.

Mark
Crunchtimer
Posts: 50
Joined: Tue May 05, 2020 5:34 am

Re: core22 0.0.10 released to full FAH!

Post by Crunchtimer »

Thanks John, looks like an interesting update with the progress and all and I really appreciate the high tempo at which things are happening and the integration with the Moonshot!

W.r.t. the COVID (GPU, core22 0.0.10) projects 13412-13415 on FAH, what is considered a new GPU? I'm using GTX 1070s.

All the best!

PS. My rig is working home alone during the weekend, so I'm keen on coming home to check this one out. I won't tell this to my family though :)
TPL
Posts: 103
Joined: Sun Apr 19, 2020 11:37 am

Re: core22 0.0.10 released to full FAH!

Post by TPL »

I have a Radeon R5 M330 in my older laptop and stopped using the GPU long ago as it is too slow. Now I gave it a new try and project 13413 was assigned to me right away. It was OK, less than 6 hours. The next one was 13415, but the ETA was 1.7 days after 3% of folding, so I dumped it manually. For some reason GPU load was only 30%. After that a couple of other WUs were assigned and I dumped them as well, as they were too long. Then 13415 was assigned again and this time it is going OK, less than 6 hours again. Now GPU load is again about 80%.

I realised afterwards that I should have tried a reboot before dumping the first 13415, but it is gone now.

I could use this slow GPU if only those small WUs were assigned to it. The problem is that bigger WUs will also be assigned, and they take too long to finish before the Timeout.
Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: core22 0.0.10 released to full FAH!

Post by Ichbin3 »

JohnChodera wrote: Average performance (in ns/day) is now printed out to the client log at the end of the WU.
What does that stand for?
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: core22 0.0.10 released to full FAH!

Post by Neil-B »

SI unit would indicate nanosecond per day … so I was thinking possibly the duration of molecular folding time the slot is estimated to be able to compute per day folding on that project? … but yes, would be good to have the meaning of this confirmed :)
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
Joe_H
Site Admin
Posts: 7936
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: core22 0.0.10 released to full FAH!

Post by Joe_H »

Yes, that would be nanoseconds of simulated time computed per day on that project.
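
As a sanity check, the figure can be roughly reproduced from the client log posted earlier in this thread. This is only a sketch: the log does not state the integration timestep, so the 2 fs value below is an assumption, and the helper name is made up for illustration.

```python
from datetime import datetime

FEMTOSECONDS_PER_NANOSECOND = 1_000_000
SECONDS_PER_DAY = 86_400

def ns_per_day(steps, timestep_fs, wall_seconds):
    """Simulated nanoseconds produced per day of wall-clock time."""
    simulated_ns = steps * timestep_fs / FEMTOSECONDS_PER_NANOSECOND
    return simulated_ns * SECONDS_PER_DAY / wall_seconds

# Timestamps of the 0% and 100% lines in the log earlier in this thread.
start = datetime.strptime("17:42:12", "%H:%M:%S")
end = datetime.strptime("21:33:02", "%H:%M:%S")
wall_seconds = (end - start).total_seconds()

TIMESTEP_FS = 2.0  # assumed; the log does not state the timestep
estimate = ns_per_day(2_000_000, TIMESTEP_FS, wall_seconds)
print(f"{estimate:.2f} ns/day")
```

Under that assumption the estimate comes out at about 24.95 ns/day, close to the 25.0072 ns/day the core logged. The small difference may come from how the core itself measures elapsed time.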

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: core22 0.0.10 released to full FAH!

Post by JohnChodera »

@TPL: Weird! 13415 should be almost identical to 13413. It could be that this particular RUN encountered some unusual issues that are atypical of most other RUNs. These simulations should run well on your GPU!
ajm
Posts: 750
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: core22 0.0.10 released to full FAH!

Post by ajm »

I now have a 1080 ti (hybrid) and a 2070S in the same kit (CPU water-cooled). For the same project (13412), with the new core 0.0.10, I get

PCIe 16x: 1080 ti 13412 (1503, 0, 1) -> ca. 620-650K PPD (about 40-50% of normal), GPU load around 50-60%
PCIe 16x: 2070S 13412 (1170, 0, 1) -> 1.8M PPD (usual), GPU load also around 50-60%

Is my 1080 ti dying or is the project 13412 especially tricky?

EDIT: Screenshots:
1080ti: http://ajm.ch/prv/13412-1080ti.png
2070S: http://ajm.ch/prv/13412-2070S.png
(the GPU load is now higher on the 2070S but no change otherwise)

EDIT2: That 13412 (1503, 0, 1) was already returned thrice as faulty: https://apps.foldingathome.org/wu#proje ... ne=0&gen=1

EDIT3: Similar problem on another kit where a 5700XT is struggling with 13415 (983, 8, 1) at around 25% of its capacity (329k PPD, instead of some 1.35M).
Here too, that WU has already been returned four times as faulty: https://apps.foldingathome.org/wu#proje ... ne=8&gen=1

EDIT4: The 5700XT is done with the 13415 (983, 8, 1) and got a new similar WU 13415 (393, 12, 0) at some 356K PPD.
Now, are those special low-PPD WUs useful science-wise, in which case it's OK of course, or are they somehow inherently faulty, that is, worthless for science, and am I just an idiot not to dump them?
Last edited by ajm on Sat Jun 20, 2020 4:20 pm, edited 5 times in total.
TPL
Posts: 103
Joined: Sun Apr 19, 2020 11:37 am

Re: core22 0.0.10 released to full FAH!

Post by TPL »

JohnChodera wrote: These simulations should run well on your GPU!
And they do. The last one was OK and I'm folding a new one again, 13415. The ETA was 5h21min when it started. I don't know what was wrong with that problematic one.

I'll let it run and see how it goes.

Edit: The slow WU was this one: project:13415 run:823 clone:0 gen:0, in case this information is of any value.