I am currently testing process suspend and resume for FahCore17, and it is working so far.
While the core is suspended, the FahCore process stays in memory.
FahClient thinks the core is running the whole time and keeps increasing the percent done even while the core is suspended.
When I resume the core, it just continues folding without loading a checkpoint.
The FahClient percent value is resynced the next time a checkpoint is saved.
The behavior of FahCore17 when the process is suspended and resumed is similar
to its behavior when the whole PC hibernates and resumes.
This way there is no checkpoint loading and therefore no time delay.
This solves my 2.5 min delay issue on pause and resume.
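For readers who want to see the idea in code: the post does not say which tool was used to suspend the process on Windows, so the following is only a minimal POSIX sketch (Linux/macOS) of suspending and resuming a process by PID with `SIGSTOP` and `SIGCONT`. The function names `suspend_process`, `resume_process` and `demo` are made up for illustration, and the worker is a stand-in sleep loop, not a real FahCore. It says nothing about GPU state, only that the process stays in memory while stopped.

```python
import os
import signal
import subprocess
import sys
import time


def suspend_process(pid: int) -> None:
    """Stop the process: it stays in memory but gets no CPU time."""
    os.kill(pid, signal.SIGSTOP)  # POSIX only; not available on Windows


def resume_process(pid: int) -> None:
    """Let a stopped process continue where it left off."""
    os.kill(pid, signal.SIGCONT)


def demo() -> bool:
    """Suspend and resume a stand-in worker; return True if it survived."""
    # Hypothetical worker: an endless sleep loop standing in for a core.
    worker = subprocess.Popen(
        [sys.executable, "-c", "import time\nwhile True: time.sleep(0.1)"]
    )
    try:
        suspend_process(worker.pid)
        time.sleep(0.5)
        # poll() returns None while the process still exists (stopped or not).
        alive_while_suspended = worker.poll() is None
        resume_process(worker.pid)
        return alive_while_suspended
    finally:
        worker.kill()
        worker.wait()
```

Calling `demo()` should return `True` on a POSIX system, since a stopped process is not terminated. On Windows a different mechanism would be needed, which this sketch does not cover.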
Pause/Resume gpu core 17 takes 2.5min with high CPU usage
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
This procedure can work, and it can also fail. I suspect that you didn't test it with "sleep."
Suspend/Resume does not capture the state of the GPU or the data that it's working on. Suspend does capture all the data that's in main RAM and the state of the CPU processes (including FAHCore.exe). Any data that's currently being processed by the GPU may or may not be lost. As long as the GPU is not powered off, the active warps in the GPU will finish, the results will be enqueued to be sent back to main RAM, and suspend/resume will work. If the GPU is powered off, the data that is actively being processed by the GPU is often lost.
Sleep MUST NOT remove power from the GPU memory.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
I think that when the PC sleeps, RAM and GPU memory still have power.
But when it hibernates, the PC is suspended to disk (= save RAM to disk and power off).
I think GPU memory power is no longer a problem with FahCore 17.
It was a problem a year ago, but now it works; 7im was also surprised about this in another thread.
When the PC is resumed after hibernation, FahCore 17 is reloaded from disk to RAM and reallocates the GPU memory.
This is done on the fly; the FAH log file only shows a "clock skew" and then continues with the next %.
At most it may lose the 1% that the GPU was calculating at the time of hibernation.
And now I can also suspend and resume the FahCore 17 process whenever the GPU is needed for something else for a while.
I may also lose the 1% that the GPU was calculating at that moment, but it will resume immediately
without the 2.5 min delay with high CPU usage, which was my problem => solved.
This is what it looks like on PC suspend and resume:
21:10:41:WU01:FS00:0x17:Completed 2100000 out of 5000000 steps (42%)
******************************* Date: 2015-02-23 *******************************
16:55:31:WARNING:WU01:FS00:Detected clock skew (19 hours 43 mins), adjusting time estimates
16:55:31:WARNING:WU00:FS01:Detected clock skew (19 hours 43 mins), adjusting time estimates
17:03:26:WU01:FS00:0x17:Completed 2150000 out of 5000000 steps (43%)
This is what it looks like on FahCore17 process suspend and resume:
20:02:34:WU01:FS00:0x17:Completed 1750000 out of 5000000 steps (35%)
20:15:28:WARNING:WU01:FS00:Detected clock skew (6 mins 03 secs), adjusting time estimates
20:15:28:WARNING:WU00:FS01:Detected clock skew (6 mins 03 secs), adjusting time estimates
20:17:49:WU01:FS00:0x17:Completed 1800000 out of 5000000 steps (36%)
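To make the "at most 1%" reading of these logs concrete, here is a small sketch that parses the progress and clock-skew lines quoted above. The regular expressions and the function name `summarize` are my own assumptions based only on the lines shown in this post, not on any documented FAH log format.

```python
import re
from datetime import timedelta

# Patterns inferred from the log lines quoted in this thread (assumption).
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps \((\d+)%\)")
SKEW = re.compile(
    r"Detected clock skew \((?:(\d+) hours?)?\s*(?:(\d+) mins?)?\s*(?:(\d+) secs?)?\)"
)

SAMPLE = [
    "20:02:34:WU01:FS00:0x17:Completed 1750000 out of 5000000 steps (35%)",
    "20:15:28:WARNING:WU01:FS00:Detected clock skew (6 mins 03 secs), adjusting time estimates",
    "20:15:28:WARNING:WU00:FS01:Detected clock skew (6 mins 03 secs), adjusting time estimates",
    "20:17:49:WU01:FS00:0x17:Completed 1800000 out of 5000000 steps (36%)",
]


def summarize(lines):
    """Return (progress, skews): (done, total) step pairs and skew durations."""
    progress, skews = [], []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            progress.append((int(match.group(1)), int(match.group(2))))
            continue
        match = SKEW.search(line)
        if match:
            hours, mins, secs = (int(g) if g else 0 for g in match.groups())
            skews.append(timedelta(hours=hours, minutes=mins, seconds=secs))
    return progress, skews


progress, skews = summarize(SAMPLE)
steps_between_reports = progress[1][0] - progress[0][0]
percent_between_reports = steps_between_reports * 100 // progress[0][1]
```

For the sample above, the two progress reports are 50000 steps apart, which is 1% of the 5000000-step work unit. That is the granularity at which progress is logged here, which matches the estimate that at most about 1% is in flight when the process is frozen.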
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
foldy wrote: I think GPU memory power is no problem with FahCore 17 anymore.
It was a problem a year ago but since it works, 7im was also surprised in another thread.
I have been lobbying Development to fix that problem for nearly a year but I'm still not sure if that fix has been distributed to everyone. I would like to assume that the same fix has been incorporated into FahCore_18 but would like a definitive answer. Until the fix is distributed to everyone, I'm really not going to recommend it, but you're welcome to do whatever works for you.
I'm certain that FahCore_15 has not been fixed.
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
I will check FahCore_18 for that when it becomes available for AMD GPUs on Windows and report back.
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
foldy wrote: I will check FahCore_18 for that when it will be available for AMD GPU on Windows and report back.
... and FahCore_15 (or somebody with an nVidia GPU).
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
I tested suspending and resuming the FAHCore_18 (beta 0.0.4) process,
and like FAHCore_17 it works without a problem
and saves the high CPU time that occurs
when pausing and unpausing the FahClient.
Maybe this could be a feature option for the next FahClient:
Keep FahCore in memory when paused => instant resume possible
Re: Pause/Resume gpu core 17 takes 2.5min with high CPU usag
foldy wrote: Maybe this could be a feature option for the next FahClient:
Keep FahCore in memory when paused => instant resume possible
The subreddit would be the right place to suggest this. It's not a question for support.