Windows updates causing WU's to fail, and strange PPD to be recorded.

Moderators: Site Moderators, FAHC Science Team

D.Record
Posts: 12
Joined: Tue Nov 04, 2025 3:53 am

Post by D.Record »

Hi Everyone,

Windows updates causing WU's to fail, and strange PPD to be recorded

I've noticed on rare occasions that when I've let Windows restart the computers while I'm folding, the 0xa8 WUs have failed, and after the restart the 0x27 WU shows me running at something silly like 300,000,000 PPD. It then drops as the WU completes, depending on the amount of time left, but can end at 12,000,000 PPD, which is not possible.

I'm just wondering if FAHClient is being forced to end by Windows after the client fails to respond to a request to close?

In the early hours of this morning Windows decided to install a waiting update that I had ignored for a few days, so it restarted and killed my WU, and FAHClient didn't restart until I logged in.

Note to self: don't leave updates waiting more than a few days, and always pause folding before I restart a machine.

Many thanks,

David
muziqaz
Posts: 2234
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3d, 7950x3d, 5950x, 5800x3d
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX550, Intel B580
Location: London

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by muziqaz »

The PPD issue is known.
Regarding Windows updates, you will have to control them better. When you see them lined up, pause folding, then update Windows. Windows updates are annoying, I know, but they are also lethal for Windows apps, as updates ignore running apps and just restart the PC anyway.
FAH Omega tester
Joe_H
Site Admin
Posts: 8272
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by Joe_H »

Yes, Windows Update is a known problem, and not just for F@h. There is code in F@h that is supposed to be observed by the Windows shutdown code, but frequently is not. The v8.5 public beta has improvements to that code; I haven't seen wide reports on how much better that works.

Essentially this has been a problem with Windows Update and shutdown for decades. It ignores running apps' signals to wait, even those using MS-documented methods. Never mind auto-updating drivers in the middle of operations. Best practice has been to turn off auto-updates and run them manually from time to time, when apps can be shut down cleanly first. Then check afterwards to make certain MS has not, in their "we know best" manner, turned auto-updating back on with one of the updates.
D.Record
Posts: 12
Joined: Tue Nov 04, 2025 3:53 am

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by D.Record »

Hi Everyone,

Would it be better to accept that Windows is going to force FAHClient to close, and have the software save its current WUs, or keep some kind of memory file in place so it could potentially recover?

I appreciate that there is probably a reason why FAHClient is built as it is, but I'm just interested.

Having a WU fail in the middle after 6 hours of processing is a loss, but from what I understand corrupted WUs are not helpful either.

I've seen it mentioned that FAHClient is good at testing GPU cards after repair, because faulty GPUs won't cope with folding.

Are the WUs failed by FAHClient because it senses instability or corruption?

Many thanks.

David
Joe_H
Site Admin
Posts: 8272
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by Joe_H »

D.Record wrote: Tue Nov 25, 2025 10:36 pm Would it be better to accept that Windows is going to force FAHClient to close, and have the software save its current WUs, or keep some kind of memory file in place so it could potentially recover?
The client is in the process of doing just that when Windows comes along and just terminates the processes. That leaves files open and partly written, or corrupted by the time the restart is done. In the case of the GPU core processing, it writes out checkpoint files every so many steps. Those are usually usable for restarting, but not if the Windows process termination happens to coincide with that checkpoint. The CPU folding cores write a checkpoint every 15 minutes by default, and also start a checkpoint when paused. This works fine on Linux and macOS, but on Windows the code paths that MS documents as being checked during a shutdown, and that should make the shutdown wait for them to finish, are not being observed. Instead the shutdown just terminates the processes without waiting.
muziqaz
Posts: 2234
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3d, 7950x3d, 5950x, 5800x3d
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX550, Intel B580
Location: London

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by muziqaz »

D.Record wrote: Tue Nov 25, 2025 10:36 pm Hi Everyone,

Would it be better to accept that Windows is going to force FAHClient to close, and have the software save its current WUs, or keep some kind of memory file in place so it could potentially recover?

I appreciate that there is probably a reason why FAHClient is built as it is, but I'm just interested.

Having a WU fail in the middle after 6 hours of processing is a loss, but from what I understand corrupted WUs are not helpful either.

I've seen it mentioned that FAHClient is good at testing GPU cards after repair, because faulty GPUs won't cope with folding.

Are the WUs failed by FAHClient because it senses instability or corruption?

Many thanks.

David
Look at FAH as a video/3D rendering workload. What happens to your render/video if Windows decides to just restart in the middle of rendering? You have to start over from the beginning, because the partly rendered file is unreadable due to the untimely termination of the workload. I know FAH has checkpoints, but as Joe mentioned, Windows takes a sledgehammer to running processes when it shuts down for an update.

FAHClient itself does not do any heavy lifting. It just controls the FAHCores, which actually simulate the protein folding. FAHCores are very sensitive to hardware instability, and they are very good at loading GPUs to their limits. However, please note that FAH should not be used as a benchmarking or stress-testing tool for OC attempts, or to find stable PC settings. It should be used after the user has made sure their hardware is 100% stable. I mean, FAH should not be the first line of a benchmarking suite, only the final one, once everything else shows the hardware is stable.
FAH Omega tester
FaaR
Posts: 76
Joined: Tue Aug 19, 2008 1:32 am

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by FaaR »

Joe_H wrote: Tue Nov 25, 2025 11:11 pm The client is in the process of doing just that when Windows comes along and just terminates the processes. That leaves files open and partly written, or corrupted by the time the restart is done.
There's an easy fix to that. Ideally you'd keep two sets of checkpoints and write to them in an alternating fashion (disk space should not be a concern, really), this way there's always at least one working checkpoint (after the first one, of course... :))
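
As a rough illustration of the alternating scheme (not how the FAH cores or GROMACS actually store checkpoints), here is a minimal Python sketch. The file names `checkpoint.0` and `checkpoint.1`, the header layout, and every function name are made up for the example. Each slot carries a step counter and a SHA-256 digest, so a torn write can be detected on restart and the other slot used instead.

```python
import hashlib
import os
import struct

# Hypothetical slot layout: 8-byte step counter + 32-byte SHA-256 of the payload.
HEADER = struct.Struct("<Q32s")


def write_slot(path, step, payload):
    """Write one checkpoint slot: header (step, digest) followed by the payload."""
    digest = hashlib.sha256(payload).digest()
    with open(path, "wb") as f:
        f.write(HEADER.pack(step, digest))
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk


def read_slot(path):
    """Return (step, payload) if the slot is intact, otherwise None."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None  # slot missing or unreadable
    if len(raw) < HEADER.size:
        return None  # write was cut off inside the header
    step, digest = HEADER.unpack(raw[:HEADER.size])
    payload = raw[HEADER.size:]
    if hashlib.sha256(payload).digest() != digest:
        return None  # torn or corrupted payload
    return step, payload


def save_checkpoint(directory, step, payload, count):
    """Alternate between slot 0 and slot 1; count is the number of checkpoints written so far."""
    slot = os.path.join(directory, f"checkpoint.{count % 2}")
    write_slot(slot, step, payload)


def load_latest(directory):
    """Pick the intact slot with the highest step, or None if neither is usable."""
    slots = [read_slot(os.path.join(directory, f"checkpoint.{i}")) for i in (0, 1)]
    intact = [s for s in slots if s is not None]
    return max(intact, key=lambda s: s[0]) if intact else None
```

If the process is killed while slot 1 is being written, `load_latest` falls back to slot 0, so the cost is one checkpoint interval rather than the whole WU. This only covers the file side; working out why a core died is a separate problem.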


One could also write new checkpoints to temp checkpoint file(s), then delete the old checkpoint and rename the temp file(s) to the proper name. This is not as good a method, since there's still a chance Windows could sneak in a restart between finishing the file write(s) and the delete/rename step. It is less secure, and the only "advantage" is keeping just one set of checkpoint file(s) permanently on disk (you still need twice the space initially). Nobody should miss the storage space for having two sets of checkpoints at all times anymore; it's not the 1980s any longer. :)
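
A minimal Python sketch of the temp-file variant, for illustration only. `atomic_write` and the `.ckpt-` prefix are hypothetical names. One difference from the description: `os.replace` overwrites the destination in a single rename call, so there is no separate delete step. Python documents that rename as atomic on POSIX; how it behaves under a forced Windows shutdown is not something this sketch can promise.

```python
import os
import tempfile


def atomic_write(path, payload):
    """Write payload to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target for the rename to work.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # make sure the new data is on disk before the swap
        os.replace(tmp_path, path)  # single rename step; replaces any existing file
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap completed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A kill before the `os.replace` call leaves the old checkpoint untouched plus a stray temp file; a kill after it leaves the new checkpoint in place.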
calxalot
Site Moderator
Posts: 1720
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by calxalot »

It’s not just a matter of atomic checkpoints.
The client cannot distinguish a core being brutally killed from a crash of unknown cause.
Joe_H
Site Admin
Posts: 8272
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by Joe_H »

The GROMACS code the CPU folding cores are based on used to support double checkpoints, but no one used the feature. No idea if that feature is still available. However, process recovery from an issue requiring use of a second checkpoint is more complicated and would increase the complexity of the client and core code. The decision was made years ago not to expend the single full-time paid developer's effort in that direction for a rarely used feature.

Disk space was never a consideration.

The multi-step method you propose is also subject to being interrupted and leaving things in an inconsistent, i.e. for all intents corrupted, state.
D.Record
Posts: 12
Joined: Tue Nov 04, 2025 3:53 am

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by D.Record »

Hi everyone,

Thank you for your replies, it does sound like a challenging problem.

I personally don't use Folding@home for testing system stability; I did see it mentioned on YouTube, where a creator said it was a way to know if a GPU card was good.

From what I understand, the GPUs are loaded up and run a WU, and then at the end the results download back to the client and are saved to the hard disk, before being uploaded from the computer to the WU server. I appreciate that's a very simple block-diagram breakdown, and it's likely far more complex than that.

However, another concern made me think of this discussion today, so I came back to check on it.

As I was looking at multiple-PSU configurations to potentially operate 6x GPUs (I only have 1, and a second waiting to be installed), the risk of brownouts or power failure on the electricity grid that supplies my home came to mind.

It's very rare, but it does happen. When I have been playing CSGO in the past I have had a complete power failure last a minute, or on another occasion only a few seconds, but it can take me 5 to 10 minutes to get back to roughly where I was. In the case of Folding@home I'll lose at least 2 WUs at varying completion percentages and take time to get back online, but on a multi-GPU system it would be more disappointing / tragic.

Naturally a UPS (uninterruptible power supply) is the way to go, but there is a limit to how long a UPS will operate and to the number of batteries that, given the fire risk, could be placed in your home, along with budget. However, it is bad to run your batteries down to 0%, I believe, so ideally it's designed to give your system time to save and shut down the computer, and possibly restart when mains power is restored, though I would need to research that to be sure.

Forgive me if I'm mistaken on some of the UPS details; I've not actually used one, but I've read up on them on multiple occasions, and keep thinking of purchasing one.

I understand this is a tricky hypothetical discussion, but I think it's worth having.

It's not whether it can happen; it's what will happen if it does, and how we can cope with it while it's happening. It comes to mind with the tragic Hong Kong multiple apartment complex fires in the last 2 days.

So if I'm not mistaken, I believe Windows, combined with the UPS software for some models connected by USB, will shut down the machine, or I surmise it hibernates it.

How would that work with FAHClient and the WUs loaded into the CPU and GPU card or cards?

Is it even possible to hibernate a machine?

I assume that might work for a system only folding on the CPU, because Windows would save the active memory and the WU loaded into the CPU, if I'm not mistaken. However, would I be correct in assuming the GPU memory would not be saved? While the CPU and the client in RAM would be reloaded, the GPU would be blank and confused by the client's requests.

It sounds like the WUs are less aware of or bothered by external details. They just do their thing and maybe update the client on their progress, but ultimately if power is lost, or the system forces a restart because the user or the system is installing updates, the WU is cut off mid-process and lost.

I appreciate all your insight, and it's not my intention to cause any frustration by bringing up something that might have been discussed in depth in a past discussion. I apologise in advance if that is the case.

Many thanks,

David.
Last edited by D.Record on Fri Nov 28, 2025 4:59 pm, edited 1 time in total.
muziqaz
Posts: 2234
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3d, 7950x3d, 5950x, 5800x3d
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX550, Intel B580
Location: London

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by muziqaz »

FAH does not like hibernation or sleep mode.
Shutting down would be the only solution.
FAH Omega tester
Joe_H
Site Admin
Posts: 8272
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Windows updates causing WU's to fail, and strange PPD to be recorded.

Post by Joe_H »

D.Record wrote: Fri Nov 28, 2025 4:39 pm From what I understand, the GPUs are loaded up and run a WU, and then at the end the results download back to the client and are saved to the hard disk, before being uploaded from the computer to the WU server. I appreciate that's a very simple block-diagram breakdown, and it's likely far more complex than that.
That is a very simplified version of GPU processing. In practice the folding core running on the CPU prepares a block of data with processing instructions and hands that off to the GPU. When completed that block of data is returned and another sent to be processed. This is repeated until the WU is done.
D.Record wrote: Fri Nov 28, 2025 4:39 pm Naturally a UPS (uninterruptible power supply) is the way to go, but there is a limit to how long a UPS will operate and to the number of batteries that, given the fire risk, could be placed in your home, along with budget. However, it is bad to run your batteries down to 0%, I believe, so ideally it's designed to give your system time to save and shut down the computer, and possibly restart when mains power is restored, though I would need to research that to be sure.
The size of UPS required to power a multi-GPU system and keep processing is probably a budget breaker for most individual folders. Here in the US that would also probably require a 220 V AC circuit. Instead, the best approach I can think of is to use the built-in capability of the F@h client to pause when running on battery. It is mainly there for laptops, but will also work with desktop and server machines, as the OS can pass on the UPS status when connected by a USB cable. There will be a brief high-draw period while the folding is being paused, but after that the batteries will not be drained as rapidly. OS settings can be used to do a clean shutdown or put the system into a deep sleep mode if the power is out for more than some set period or the battery level drops below some percentage. This is available in Windows and macOS, and I believe there are options to include support for connected UPSs in Linux.
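
To make that combined policy concrete, here is a small Python sketch of the decision logic only. It is neither the F@h client's code nor any OS API: `PowerStatus`, `decide_action`, and the threshold defaults (300 seconds, 40%) are hypothetical values picked for illustration, and in practice the pausing is done by the client while the shutdown thresholds come from the OS or UPS settings.

```python
from dataclasses import dataclass


@dataclass
class PowerStatus:
    """Snapshot of what the OS might report for a USB-connected UPS (assumed fields)."""
    on_battery: bool
    battery_percent: float
    seconds_on_battery: float


def decide_action(status, max_outage_seconds=300.0, min_battery_percent=40.0):
    """Return "fold", "pause", or "shutdown" for the given power status."""
    if not status.on_battery:
        return "fold"  # mains power is present: keep folding
    if (status.seconds_on_battery >= max_outage_seconds
            or status.battery_percent <= min_battery_percent):
        return "shutdown"  # outage too long or battery too low: shut down cleanly
    return "pause"  # short outage: pause folding to cut the power draw
```

For example, `decide_action(PowerStatus(True, 95.0, 10.0))` gives `"pause"`, while the same outage lasting 600 seconds gives `"shutdown"`.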

As a practical matter, CPU folding recovers from sleep just fine. Its data is in system RAM, and deep sleep modes that power that down will copy memory contents to the drive first and restore them on waking. However, under most circumstances powering down the GPU does not save the data in video RAM or shader cores, so GPU folding loses that "state" information when slept. There are some recovery modes that will just result in restarting at the last checkpoint, but that is not always certain.