BAD_WORK_UNIT project:19600

Post by **tvdsluis** » Tue Sep 05, 2023 4:29 pm

9 out of 10 work units from this project end as a bad work unit.
And I keep getting this project, which is a total waste of resources.
Other projects run fine on this RX 580 8 GB.

Can I somehow opt out of this project?
Or do I need to stop folding for a few weeks or months until this project is finished?
I don't want to keep wasting electricity on this project.

10:16:37:WU02:FS01:0x22:Completed 2160000 out of 3000000 steps (72%)
10:18:09:WU02:FS01:0x22:Completed 2190000 out of 3000000 steps (73%)
10:19:01:WU02:FS01:0x22:An exception occurred at step 2206289: Particle coordinate is nan
10:19:01:WU02:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
10:19:01:WU02:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
10:19:01:WU02:FS01:0x22:Saving result file ..\logfile_01.txt
10:19:01:WU02:FS01:0x22:Saving result file positions.xtc
10:19:01:WU02:FS01:0x22:Saving result file science.log
10:19:01:WU02:FS01:0x22:Saving result file state.xml.bz2
10:19:01:WU02:FS01:0x22:Saving result file xtcAtoms.csv.bz2
10:19:01:WU02:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
10:19:01:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
10:19:01:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:19600 run:514 clone:1 gen:0 core:0x22 unit:0x000000010000000000004c9000000202
10:19:01:WU02:FS01:Uploading 2.04MiB to 160.45.68.201
10:19:01:WU02:FS01:Connecting to 160.45.68.201:8080
10:19:03:WU01:FS01:Connecting to assign1.foldingathome.org:80
10:19:04:WU01:FS01:Assigned to work server 160.45.68.201
10:19:04:WU01:FS01:Requesting new work unit for slot 01: gpu:38:0 Ellesmere XT [RX 470/480/570/580/590] from 160.45.68.201
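To check how often a given project fails on a particular machine, the "Sending unit results" lines in the client log can be tallied per project. The following is only a minimal sketch, assuming log lines in the same format as the excerpt above; the function name `tally_results` and the regular expression are illustrative and not part of the Folding@home client.

```python
import re
from collections import Counter

# Assumption: result lines look like the one in the log excerpt above, i.e.
# "... Sending unit results: id:02 state:SEND error:FAULTY project:19600 ..."
RESULT_RE = re.compile(r"Sending unit results:.*?\berror:(\S+)\s+project:(\d+)")


def tally_results(lines):
    """Count (project, error status) pairs across client log lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            error, project = match.group(1), int(match.group(2))
            counts[(project, error)] += 1
    return counts


# Two lines copied from the log above; only the second is a result line.
sample = [
    "10:19:01:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
    "10:19:01:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY "
    "project:19600 run:514 clone:1 gen:0 core:0x22 "
    "unit:0x000000010000000000004c9000000202",
]

for (project, error), n in tally_results(sample).items():
    print(f"project {project}: {n} x {error}")
```

Passing an open log file instead of `sample` would tally a whole session, which makes it easier to report a failure rate per project when asking for help.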

Post by **Joe_H** » Tue Sep 05, 2023 5:18 pm

The WU shown in your log was folded successfully by the next person assigned it. If you are only getting failures on WUs from this project, it is possible there is an incompatibility between the version of the folding core used and the AMD driver and OpenCL support.

Post by **tvdsluis** » Tue Sep 05, 2023 5:37 pm

Joe_H wrote: ↑Tue Sep 05, 2023 5:18 pm The WU shown in your log was folded successfully by the next person assigned it. If you are only getting failures on WUs from this project, it is possible there is an incompatibility between the version of the folding core used and the AMD driver and OpenCL support.

I think that is probably the case, since other projects run fine.
But what is my best solution?
I keep getting this project, and maybe once a day I get another project that runs fine.
The rest of the day my system spends resources on bad WUs from this project, and I can't opt out.

Is my only option to stop folding until this project is finished, and how can I see that this project is finished?

Post by **muziqaz** » Tue Sep 05, 2023 6:27 pm

What driver is this on?

Post by **yycyyc** » Tue Sep 05, 2023 9:02 pm

tvdsluis wrote: ↑Tue Sep 05, 2023 5:37 pm But what is my best solution?
I keep getting this project, and maybe once a day I get another project that runs fine.
The rest of the day my system spends resources on bad WUs from this project, and I can't opt out.

Is my only option to stop folding until this project is finished, and how can I see that this project is finished?

Thanks for your direct feedback! I am really sorry for the constant failures you have experienced with this project. While we look into the matter, I have temporarily opted you out of the project until the problem is solved.

The filter rule might take some time to propagate to the assign server and take effect, and in the meantime jobs already assigned to you might still fail. I ask for your understanding and patience. If the problem is not resolved for you after the current jobs are gone, please don't hesitate to report it here again!

Thank you again for your valuable contribution to the folding community. We will try our best to reduce this type of compatibility issue.

Post by **tvdsluis** » Wed Sep 06, 2023 8:52 am

yycyyc wrote: ↑Tue Sep 05, 2023 9:02 pm Thanks for your direct feedback! I am really sorry for the constant failures you have experienced with this project. While we look into the matter, I have temporarily opted you out of the project until the problem is solved.

The filter rule might take some time to propagate to the assign server and take effect, and in the meantime jobs already assigned to you might still fail. I ask for your understanding and patience. If the problem is not resolved for you after the current jobs are gone, please don't hesitate to report it here again!

Thank you again for your valuable contribution to the folding community. We will try our best to reduce this type of compatibility issue.

Thank you for your fast response. It helps to see this is taken so seriously.
I hope this gets resolved for you and for me.
I will continue to fold!

Post by **muziqaz** » Wed Sep 06, 2023 10:33 am

This project has been disabled for all AMD species except one (RDNA3). The AMD drivers for GCN have a bug that blows up the simulations on particular projects. AMD is not willing to spend any time fixing it, as OpenCL is on its last legs and is being replaced by ROCm.
These crashes slipped through testing because it was holiday season for the AMD tester, and because of bad timing in general.

We hope this won't happen in the future... until next time.
