13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Moderators: Site Moderators, FAHC Science Team

Jandska
Posts: 18
Joined: Thu May 07, 2020 11:07 am

13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by Jandska »

Hello again,

I got a WU from project 13405 again, and this time it crashed at 25%. I'm probably not the first one this has happened to with this WU: https://apps.foldingathome.org/wu#proje ... e=45&gen=0.

Here is my log:

Code:

14:46:49:WU01:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
14:47:15:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:47:15:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:47:38:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:47:38:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:48:01:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
14:48:01:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
14:48:01:WU01:FS01:0x22:ERROR:114: Max Retries Reached
14:48:01:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
14:48:01:WU01:FS01:0x22:Saving result file badstate-0.xml
14:48:01:WU01:FS01:0x22:Saving result file badstate-1.xml
14:48:01:WU01:FS01:0x22:Saving result file badstate-2.xml
14:48:01:WU01:FS01:0x22:Saving result file checkpointState.xml
14:48:02:WU01:FS01:0x22:Saving result file checkpt.crc
14:48:02:WU01:FS01:0x22:Saving result file globals.csv
14:48:02:WU01:FS01:0x22:Saving result file positions.xtc
14:48:02:WU01:FS01:0x22:Saving result file science.log
14:48:02:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
14:48:03:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
14:48:03:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13405 run:576 clone:45 gen:0 core:0x22 unit:0x0000000312bc7d9a5eb97d3da00e8261
14:48:03:WU01:FS01:Uploading 4.91MiB to 18.188.125.154
14:48:03:WU01:FS01:Connecting to 18.188.125.154:8080
14:48:04:WU02:FS01:Connecting to 65.254.110.245:80
14:48:04:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:80': No WUs available for this configuration
14:48:04:WU02:FS01:Connecting to 18.218.241.186:80
14:48:05:WU02:FS01:Assigned to work server 13.82.98.119
14:48:05:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP107 [GeForce GTX 1050 LP] 1862 from 13.82.98.119
14:48:05:WU02:FS01:Connecting to 13.82.98.119:8080
14:48:32:WU01:FS01:Upload complete
14:48:32:WU01:FS01:Server responded WORK_ACK (400)
14:48:32:WU01:FS01:Cleaning up
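
For anyone who wants to pull the same details out of their own log, here is a minimal sketch, assuming log lines in the format quoted above. The function name `summarize_log` and the regular expressions are illustrative only. They are not part of the FAH client and are based solely on the lines shown in this thread.

```python
import re

# Patterns are assumptions derived only from the log lines quoted in this thread.
BAD_STATE = re.compile(r"Bad State detected")
PROGRESS = re.compile(r"Completed \d+ out of \d+ steps \((\d+)%\)")
PRCG = re.compile(r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)")


def summarize_log(lines):
    """Count bad-state retries, track last reported progress, and find the PRCG."""
    bad_states = 0
    last_percent = None
    prcg = None
    for line in lines:
        if BAD_STATE.search(line):
            bad_states += 1
        match = PROGRESS.search(line)
        if match:
            last_percent = int(match.group(1))
        match = PRCG.search(line)
        if match:
            # (project, run, clone, gen) as integers
            prcg = tuple(int(group) for group in match.groups())
    return {"bad_states": bad_states, "last_percent": last_percent, "prcg": prcg}


if __name__ == "__main__":
    sample = [
        "14:46:49:WU01:FS01:0x22:Completed 250000 out of 1000000 steps (25%)",
        "14:47:15:WU01:FS01:0x22:Bad State detected... attempting to resume "
        "from last good checkpoint. Is your system overclocked?",
        "14:48:03:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY "
        "project:13405 run:576 clone:45 gen:0 core:0x22 "
        "unit:0x0000000312bc7d9a5eb97d3da00e8261",
    ]
    print(summarize_log(sample))
```

Feeding it the lines of a saved log file (for example `summarize_log(open(path))`, with `path` being wherever your client writes its log) gives the retry count and the project/run/clone/gen to quote when reporting a failed WU.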
TPL
Posts: 103
Joined: Sun Apr 19, 2020 11:37 am

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by TPL »

Mine crashed at 60% I think. Project 13405 (Run 470, Clone 57, Gen 0).

Code:

14:58:04:WU00:FS01:0x22:Completed 580000 out of 1000000 steps (58%)
15:03:36:WU00:FS01:0x22:Completed 590000 out of 1000000 steps (59%)
15:07:54:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:07:54:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:09:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:08:09:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:25:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
15:08:25:WU00:FS01:0x22:Following exception occured: Particle coordinate is nan
15:08:25:WU00:FS01:0x22:ERROR:114: Max Retries Reached
15:08:25:WU00:FS01:0x22:Saving result file ../logfile_01.txt
15:08:25:WU00:FS01:0x22:Saving result file badstate-0.xml
15:08:25:WU00:FS01:0x22:Saving result file badstate-1.xml
15:08:25:WU00:FS01:0x22:Saving result file badstate-2.xml
15:08:25:WU00:FS01:0x22:Saving result file checkpointState.xml
15:08:26:WU00:FS01:0x22:Saving result file checkpt.crc
15:08:26:WU00:FS01:0x22:Saving result file globals.csv
15:08:26:WU00:FS01:0x22:Saving result file positions.xtc
15:08:26:WU00:FS01:0x22:Saving result file science.log
15:08:26:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
15:08:27:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:08:27:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13405 run:470 clone:57 gen:0 core:0x22 unit:0x0000000112bc7d9a5eb5846c3c4843c6
15:08:27:WU00:FS01:Uploading 4.98MiB to 18.188.125.154
15:08:27:WU00:FS01:Connecting to 18.188.125.154:8080
15:08:27:WU01:FS01:Connecting to 65.254.110.245:80
15:08:27:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:80': No WUs available for this configuration
15:08:27:WU01:FS01:Connecting to 18.218.241.186:80
15:08:28:WU01:FS01:Assigned to work server 18.188.125.154
15:08:28:WU01:FS01:Requesting new work unit for slot 01: READY gpu:1:TU117M [GeForce GTX 1650 Mobile / Max-Q] from 18.188.125.154
15:08:28:WU01:FS01:Connecting to 18.188.125.154:8080
15:08:30:WU01:FS01:Downloading 5.98MiB
15:08:33:WU00:FS01:Upload 58.98%
15:08:36:WU01:FS01:Download 37.60%
15:08:39:WU00:FS01:Upload complete
15:08:39:WU00:FS01:Server responded WORK_ACK (400)
With response WORK_ACK and 29,854 points. Well, thank you.

These 1340X projects have either completed slowly or failed for me. PPD drops 20-25% compared with an "average" GPU WU. Is my system overclocked? At some 9-11 hours per WU, nah, I don't think so. It's running at stock settings.

Just my experience. :)
TPL
Posts: 103
Joined: Sun Apr 19, 2020 11:37 am

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by TPL »

And BTW, I don't have the beta label on, nor even the advanced label. These WUs keep being assigned to me anyway.

With that computer I haven't faced any problems with any other WUs so far.
Jandska
Posts: 18
Joined: Thu May 07, 2020 11:07 am

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by Jandska »

They are not advanced anymore: viewtopic.php?f=24&t=35063&p=332176&hilit=13405#p332176. But as I understand from previous discussions, these WUs are something new and have more issues than other projects.

EDIT: And I don't really mind about the points. I do this to help the science and to find solutions for COVID-19 and other issues.
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by Neil-B »

I believe p13404 and p13405 are part of a fairly new series of projects and as such are more likely to have issues. They were showing on the Active Projects webpage (they have now been removed), so I'd guess (and your and others' experience confirms) that for a while these projects have been on general release, although that may no longer be the case.

Also, from another thread (viewtopic.php?f=101&t=35190&p=333655#p333655): there are times when failed WUs are actually worth it for the bigger-picture science, and it would appear that these projects fall into that type of scenario.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
TPL
Posts: 103
Joined: Sun Apr 19, 2020 11:37 am

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by TPL »

I don't see that they have been removed, but I wasn't claiming anything either way. I meant my previous posts simply as a report.
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by Joe_H »

These projects were announced as being released to all folding clients, not just Beta and Advanced - viewtopic.php?f=24&t=35063&p=332176.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 13405 (Run 576, Clone 45, Gen 0) crashed at 25%

Post by Neil-B »
