How to make the FAH client less fussy about which host OS or kernel is running
Posted: Sun Feb 09, 2025 9:59 pm
It appears that every time I reboot my Debian 12.9 (bookworm) rig into a new kernel, the "Machine ID" changes, which invalidates or suspends any WU in progress. Probably these WUs get mothballed indefinitely. The log shows errors for a WU in progress, saying that it "does not belong" to a certain Client ID hash. Several of these orphans are accumulating, 100% finished but blocked from uploading results. On top of that, other distros running on the same rig have distinct Machine IDs and so are unable to share WUs. This quickly becomes unmanageable.
A previous version of the FAH client, which had fewer features, was also much less picky about such things. Using a file system mount shared among different Linux distros, all Debian-related, all running on the same host, often with near-identical kernels, it was able to load and make progress on the same WU regardless of which distro or kernel variant was running at the time. I considered that a feature, and its absence is hard to put up with!
MySQL database files appear to track the details that keep these orphans unavailable for upload. Is it really worth it for FAH science integrity to toss out so much folding computation? I'd say older versions weren't broken in this respect, so what is the rationale for the change?
Thanks for any insights and advice.