Users with many failed WUs
Posted: Wed May 06, 2020 4:39 pm
I had a few WUs fail on a rig that was getting its knickers in a twist over the various GPU indices.
Out of curiosity I put one into the WU checker to see if someone else had picked it up and finished it.
https://apps.foldingathome.org/wu#proje ... =635&gen=0
Turns out someone else also failed it before someone successfully completed it.
Again, out of curiosity (and boredom - self-isolation does that...) I clicked on the other failed user.
Interesting - a list of 228 slots having returned WUs, so putting in some significant effort.
However, most show a very low credit, which suggests failures. I picked every 3rd slot to sample the low-point WUs (<1000); those are in the first list below.
The second list below picks some of the last WUs from the other failed users in the first.
This rather rudimentary and ad-hoc delve into a few users and WUs shows an enormous number of failures. All but 4 were subsequently completed successfully; those 4 are still waiting, so they are probably not bad WUs either.
Code:
https://apps.foldingathome.org/wu#project=14158&run=2&clone=8992&gen=1 failed - 2 failures before completion
https://apps.foldingathome.org/wu#project=14443&run=0&clone=17&gen=3 failed - 4 failures before completion
https://apps.foldingathome.org/wu#project=11763&run=0&clone=6892&gen=47 failed - for some reason issued and completed successfully by 3 others
https://apps.foldingathome.org/wu#project=13879&run=0&clone=1568&gen=81 failed - completed by next person
https://apps.foldingathome.org/wu#project=14437&run=227&clone=2&gen=3 failed - 3 failures before completion
https://apps.foldingathome.org/wu#project=11743&run=0&clone=2872&gen=13 failed - completed by next person
https://apps.foldingathome.org/wu#project=14432&run=0&clone=268&gen=38 failed - completed by next person
https://apps.foldingathome.org/wu#project=11741&run=0&clone=5579&gen=23 failed - for some reason issued and completed successfully by 2 others
https://apps.foldingathome.org/wu#project=16434&run=548&clone=2&gen=4 failed - completed by next person
https://apps.foldingathome.org/wu#project=16433&run=1879&clone=0&gen=7 failed - completed by next person
https://apps.foldingathome.org/wu#project=14538&run=0&clone=1965&gen=90 failed - 2 failures before completion
https://apps.foldingathome.org/wu#project=14436&run=2444&clone=0&gen=16 failed - completed by next person
https://apps.foldingathome.org/wu#project=13878&run=0&clone=1390&gen=56 failed - completed by next person
https://apps.foldingathome.org/wu#project=11746&run=0&clone=324&gen=65 failed - for some reason issued and completed successfully by 2 others
https://apps.foldingathome.org/wu#project=11743&run=0&clone=4735&gen=73 failed - completed by next person
https://apps.foldingathome.org/wu#project=16433&run=53&clone=1&gen=11 failed - for some reason issued and completed successfully by 2 others
https://apps.foldingathome.org/wu#project=11743&run=0&clone=9178&gen=12 failed - completed by next person
https://apps.foldingathome.org/wu#project=11761&run=0&clone=13123&gen=1 failed - 2 failures before completion by 2 others
https://apps.foldingathome.org/wu#project=14251&run=401&clone=0&gen=6 failed - no other return
https://apps.foldingathome.org/wu#project=14436&run=877&clone=0&gen=17 failed - completed by next person
https://apps.foldingathome.org/wu#project=14549&run=0&clone=603&gen=46 failed - completed by next person
https://apps.foldingathome.org/wu#project=14549&run=0&clone=514&gen=54 failed - completed by next person
https://apps.foldingathome.org/wu#project=11746&run=0&clone=6303&gen=43 failed - completed by next person
https://apps.foldingathome.org/wu#project=16434&run=388&clone=2&gen=12 failed - completed by next person
https://apps.foldingathome.org/wu#project=14444&run=0&clone=170&gen=6 failed - completed by next person
https://apps.foldingathome.org/wu#project=14438&run=0&clone=808&gen=3 failed - completed by next person
https://apps.foldingathome.org/wu#project=14436&run=2137&clone=3&gen=8 failed - 2 failures before completion
https://apps.foldingathome.org/wu#project=14549&run=0&clone=1461&gen=39 failed - completed by next person
https://apps.foldingathome.org/wu#project=16444&run=0&clone=526&gen=5 failed - no other return
https://apps.foldingathome.org/wu#project=13876&run=0&clone=1840&gen=60 failed - completed by next person
https://apps.foldingathome.org/wu#project=14436&run=2526&clone=0&gen=26 failed - completed by next person
https://apps.foldingathome.org/wu#project=14561&run=0&clone=1307&gen=18 failed - 4 failures before completion
https://apps.foldingathome.org/wu#project=16434&run=328&clone=0&gen=22 failed - 2 failures before completion
https://apps.foldingathome.org/wu#project=14436&run=772&clone=0&gen=31 failed - completed by next person
Code:
https://apps.foldingathome.org/wu#project=11762&run=0&clone=6151&gen=31 failed - completed by next person
https://apps.foldingathome.org/wu#project=14549&run=0&clone=1217&gen=0 failed - completed by next person
https://apps.foldingathome.org/wu#project=14543&run=0&clone=359&gen=38 failed - completed by next person (me!)
https://apps.foldingathome.org/wu#project=16421&run=0&clone=2459&gen=9 failed - completed by next person
https://apps.foldingathome.org/wu#project=14251&run=102&clone=2&gen=6 failed - no other return
https://apps.foldingathome.org/wu#project=16443&run=0&clone=276&gen=7 failed - no other return
https://apps.foldingathome.org/wu#project=11779&run=0&clone=8875&gen=43 failed - 4 failures before completion
https://apps.foldingathome.org/wu#project=11761&run=0&clone=5160&gen=16 failed - 2 failures before completion by 3 others
https://apps.foldingathome.org/wu#project=16804&run=13&clone=144&gen=0 failed - completed by next person
https://apps.foldingathome.org/wu#project=16804&run=53&clone=408&gen=9 failed - completed by next person
https://apps.foldingathome.org/wu#project=11761&run=0&clone=3118&gen=48 failed - 2 failures before completion by 2 others and 1 dumped
https://apps.foldingathome.org/wu#project=11759&run=0&clone=227&gen=61 failed - 3 failures before completion by 6 others
Just following the tree from WU to user to WUs to users, you quickly come across hundreds of WUs failed by a few users. I've seen similar patterns in BOINC projects, where the WU results are somewhat easier to browse and you're not limited to just the last WU for a client.
Obviously, anyone can have a bad spell and produce a few failures, myself included, but the above is shameful.
What can be done to stop this having a huge detrimental effect on the project?
Isn't there a risk that WUs get marked as bad if there are too many bad clients out there? Also, a high proportion get completed by more than 1 person - so there's also a duplication of work and effort.
When WUs are in short supply and assignment servers are under such heavy load, with a surplus of good clients sitting idle, shouldn't some blacklisting be implemented?