13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 2:55 pm
by Jandska
Hello again,

I got a WU from project 13405 again, and this time it crashed repeatedly at 25%. I'm probably not the first one this has happened to with this WU: https://apps.foldingathome.org/wu#proje ... e=45&gen=0.

Here is my log:

Code:

14:46:49:WU01:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
14:47:15:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:47:15:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:47:38:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:47:38:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:48:01:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:48:01:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:48:01:WU01:FS01:0x22:ERROR:114: Max Retries Reached
14:48:01:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
14:48:01:WU01:FS01:0x22:Saving result file badstate-0.xml
14:48:01:WU01:FS01:0x22:Saving result file badstate-1.xml
14:48:01:WU01:FS01:0x22:Saving result file badstate-2.xml
14:48:01:WU01:FS01:0x22:Saving result file checkpointState.xml
14:48:02:WU01:FS01:0x22:Saving result file checkpt.crc
14:48:02:WU01:FS01:0x22:Saving result file globals.csv
14:48:02:WU01:FS01:0x22:Saving result file positions.xtc
14:48:02:WU01:FS01:0x22:Saving result file science.log
14:48:02:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
14:48:03:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
14:48:03:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13405 run:576 clone:45 gen:0 core:0x22 unit:0x0000000312bc7d9a5eb97d3da00e8261
14:48:03:WU01:FS01:Uploading 4.91MiB to 18.188.125.154
14:48:03:WU01:FS01:Connecting to 18.188.125.154:8080
14:48:04:WU02:FS01:Connecting to 65.254.110.245:80
14:48:04:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:80': No WUs available for this configuration
14:48:04:WU02:FS01:Connecting to 18.218.241.186:80
14:48:05:WU02:FS01:Assigned to work server 13.82.98.119
14:48:05:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050 LP] 1862 from 13.82.98.119
14:48:05:WU02:FS01:Connecting to 13.82.98.119:8080
14:48:32:WU01:FS01:Upload complete
14:48:32:WU01:FS01:Server responded WORK_ACK (400)
14:48:32:WU01:FS01:Cleaning up
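
For anyone reporting a failed WU, the project/run/clone/gen can be pulled out of a log like the one above with a few lines of Python. This is only a minimal sketch: it assumes the "Sending unit results" line has the format shown in this log, and the names `FAULTY_RE` and `faulty_units` are hypothetical, not part of the Folding@home client.

```python
import re

# Hypothetical helper, not part of the Folding@home client.
# Assumes the "Sending unit results" line format seen in the log above.
FAULTY_RE = re.compile(
    r"error:FAULTY project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)


def faulty_units(log_text):
    """Return (project, run, clone, gen) tuples for units reported as FAULTY."""
    return [
        tuple(int(group) for group in match.groups())
        for match in FAULTY_RE.finditer(log_text)
    ]


# Sample line copied from the log above.
sample = (
    "14:48:03:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY "
    "project:13405 run:576 clone:45 gen:0 core:0x22 "
    "unit:0x0000000312bc7d9a5eb97d3da00e8261"
)
print(faulty_units(sample))  # prints [(13405, 576, 45, 0)]
```

Passing the whole log text instead of one line returns one tuple per faulty unit, which is handy for the thread title.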

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 4:09 pm
by TPL
Mine crashed at 60% I think. Project 13405 (Run 470, Clone 57, Gen 0).

Code:

14:58:04:WU00:FS01:0x22:Completed 580000 out of 1000000 steps (58%)
15:03:36:WU00:FS01:0x22:Completed 590000 out of 1000000 steps (59%)
15:07:54:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:07:54:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:09:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:08:09:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:25:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:08:25:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:25:WU00:FS01:0x22:ERROR:114: Max Retries Reached
15:08:25:WU00:FS01:0x22:Saving result file ../logfile_01.txt
15:08:25:WU00:FS01:0x22:Saving result file badstate-0.xml
15:08:25:WU00:FS01:0x22:Saving result file badstate-1.xml
15:08:25:WU00:FS01:0x22:Saving result file badstate-2.xml
15:08:25:WU00:FS01:0x22:Saving result file checkpointState.xml
15:08:26:WU00:FS01:0x22:Saving result file checkpt.crc
15:08:26:WU00:FS01:0x22:Saving result file globals.csv
15:08:26:WU00:FS01:0x22:Saving result file positions.xtc
15:08:26:WU00:FS01:0x22:Saving result file science.log
15:08:26:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
15:08:27:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:08:27:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13405 run:470 clone:57 gen:0 core:0x22 unit:0x0000000112bc7d9a5eb5846c3c4843c6
15:08:27:WU00:FS01:Uploading 4.98MiB to 18.188.125.154
15:08:27:WU00:FS01:Connecting to 18.188.125.154:8080
15:08:27:WU01:FS01:Connecting to 65.254.110.245:80
15:08:27:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:80': No WUs available for this configuration
15:08:27:WU01:FS01:Connecting to 18.218.241.186:80
15:08:28:WU01:FS01:Assigned to work server 18.188.125.154
15:08:28:WU01:FS01:Requesting new work unit for slot 01: READY gpu:1:TU117M [GeForce GTX 1650 Mobile / Max-Q] from 18.188.125.154
15:08:28:WU01:FS01:Connecting to 18.188.125.154:8080
15:08:30:WU01:FS01:Downloading 5.98MiB
15:08:33:WU00:FS01:Upload 58.98%
15:08:36:WU01:FS01:Download 37.60%
15:08:39:WU00:FS01:Upload complete
15:08:39:WU00:FS01:Server responded WORK_ACK (400)
With response WORK_ACK and 29 854 points. Well, thank you.

These 1340X projects have been slow for me, whether they succeeded or failed. PPD drops 20-25% compared with an "average" GPU WU. Is my system overclocked? At some 9-11 hours/WU, naah, I don't think so. It's stock.

Just my experience. :)

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 4:36 pm
by TPL
And BTW, I don't have the beta label on, nor do I even have the advanced label on. These are continuously directed to me anyway.

With that computer I haven't faced any problems with any other WUs so far.

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 4:41 pm
by Jandska
They are not advanced anymore: viewtopic.php?f=24&t=35063&p=332176&hilit=13405#p332176 but as I understood from previous discussions, these WUs are something new and have more issues than other projects.

EDIT: And I don't really mind about the points... I do this to help the science and to find solutions against COVID-19 and other issues.

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 4:50 pm
by Neil-B
I believe p13404 and p13405 are part of a fairly new series of Projects and as such are more likely to have issues … they were showing on the Active Projects webpage (they have now been removed) so I'd guess (and your and others' experience confirms) that for a while these Projects have been on general release, although that may no longer be the case.

… and from another thread viewtopic.php?f=101&t=35190&p=333655#p333655 ... There are times when failed WUs are actually worth it for the bigger picture science and it would appear that these projects fall into that type of scenario.

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 5:53 pm
by TPL
I don't see that they have been removed, but I wasn't claiming anything in that direction either. I meant my previous posts as a report.

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 10:25 pm
by Joe_H
These projects were announced as being released to all folding clients, not just Beta and Advanced - viewtopic.php?f=24&t=35063&p=332176.

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Posted: Tue May 12, 2020 10:33 pm
by Neil-B