Project 7809, TPF increases until restart

Moderators: Site Moderators, FAHC Science Team

Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Project 7809, TPF increases until restart

Post by Xavier Zepherious »

project 7809(5.284.143)
TPF 48.37s
est time to complete 3.1 days
I noticed this was wrong
Rig has been rock steady - hasn't had a single work failure

FAH 7.2.9, Win7, 3930K @ 4.1GHz

Solution: pause all work, then select Fold again.
This seems to fix the long TPF issue. It has occurred before on these work units, and I'm not the only one noticing it.

TPF is now 11m45s

Somehow the project gets hung up, and just pausing and resuming fixes it

no other project does this

Mod Edit: Edited Thread Subject, Added PRCG To Post - PantherX
iancook221188
Posts: 16
Joined: Tue Dec 15, 2009 5:29 pm

Project: 7809 (10, 443, 5) 4 hour TPF

Post by iancook221188 »

Project: 7809 (10, 443, 5)

On a 2600K at 4.5GHz it is using 99% CPU.

17 days to complete.

I tried restarting the client. Can I delete the WU? It had been going for 12 hours before I noticed the problem. The old log also has all the info about the other clients in it. Do you want me to post the large log here? It would be quite long, as it goes back 12 hours with two other GPU clients working.

Mod Edit: Post Merged, PRCG Added - PantherX
Qinsp
Posts: 216
Joined: Sun Oct 17, 2010 2:34 pm

Re: project 7809(5.284.143) longTPF

Post by Qinsp »

I've seen a P8101 do this too. Restart drops the TPF a bunch.

Unless you are talking about the STATUS tpf and ETA estimates. They don't work on big jobs as far as I can tell. If I'm rock-steady at 21:00 tpf ±0:05 sec for 30 frames, the estimates will vary between 15min and 45min. I have to hand calc the ETA if I need to service the machine.
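
For anyone who wants to do that hand calculation, here is a rough sketch in Python. It assumes a WU is split into 100 frames (one per 1% of progress), which is an assumption here, and the names eta_seconds and format_duration are made up for illustration; nothing in it comes from the client itself.

```python
def eta_seconds(tpf_seconds: float, frames_done: int, total_frames: int = 100) -> float:
    """Remaining time = frames still to do * time per frame.

    total_frames=100 is an assumption (one frame per 1% of progress).
    """
    remaining_frames = total_frames - frames_done
    return remaining_frames * tpf_seconds


def format_duration(seconds: float) -> str:
    """Render a number of seconds as 'Hh MMm'."""
    hours, rem = divmod(int(seconds), 3600)
    minutes = rem // 60
    return f"{hours}h {minutes:02d}m"


# Example: steady 21:00 TPF with 70 frames already finished.
remaining = eta_seconds(tpf_seconds=21 * 60, frames_done=70)
print(format_duration(remaining))
```

With a steady TPF this gives the same number every time, which is the point of doing it by hand instead of trusting the STATUS estimate.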
Quality Inspection - Corona, CA, USA
Dimensional Inspection Laboratory
Pat McSwain, President
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Project 7809, TPF increases until restart

Post by PantherX »

Before I report it to the PG Member, it would be nice to have the log files (filtered) along with the PRCGs which might help in troubleshooting this issue. Please ensure that this is indeed caused only by the WU, i.e. other applications aren't using the CPU which is causing the increase in the TPF.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

I would check which GPU WU (project) is running alongside that WU.

I'm hearing reports on our forums of similar issues with other SMP WUs.

Did an 8083 WU flawlessly after it.

A 7808 WU (2, 362, 157) is also doing it now (basically it did not do much all night while I slept, until I restarted it).
Never had a problem until the 8070 WUs were running; I'm also running a 7660 (3-way SLI 670s).


Pause and resume restarts it


The log file doesn't say much - I'm running verbosity 5.

TPF ranges from 30 min to 1 hour.

Once paused and resumed, TPF is around 10m.

Unusual behavior.
What I'm assuming is that the GPU WU is interfering with the SMP somehow.

the other person was running a 7647 SMP WU and a 8070 GPU WU


Update after reviewing the log files around when the TPF change occurs:

Seems the issue coincides with a new GPU WU starting.
TPF is at 10min until it sends a result and receives/starts a new 8070;
once that happens, TPF goes to 40mins.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

And confirmed once again with the next upload.
It doesn't happen right away, but when the new GPU WU starts the SMP starts lagging, and within 3 frames it is at 40m TPF.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Project 7809, TPF increases until restart

Post by PantherX »

Could it be possible that some WUs from this Project series are extra sensitive to any interruptions? When you get the TPF increase, if you just pause the GPU Slot (thus there will not be any more interruptions), will the TPF drop back to normal? If it doesn't then it might be an issue.

BTW, if the issue occurs with only the GPU processing 8070 WUs then it would seem that it is using enough CPU cycles to cause a slowdown in the SMP processing.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

The next two WUs were a 7625 and a 7660, and no slowdown occurred; I had an 8m TPF on the 7808.

The slowdown is not minor with 8070s either - up to 1hr or more TPF.
It's more like a bunch of threads get stuck on the SMP.

Pausing and resuming all of them fixes the SMP and brings TPF back to around 10m,
so it's not like it's just stealing CPU cycles (maybe some, because as noted above there is about 2m extra, a 25% increase in time to complete a frame).

It doesn't occur every time either (although there is some TPF increase with each new 8070/8071).

I'm going to watch these units and update as often as I can, to figure out whether the GPU WUs are causing it or it's an SMP issue.

Next time I'm going to try just pausing the GPUs and wait to see what that does.
Then I'll restart the GPUs and leave the SMP alone. If that doesn't fix it,
then I'll try restarting the SMP alone.

You want debugging - as a programmer myself, this is what I do.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 7809, TPF increases until restart

Post by bruce »

Do note that the debugging will take time. To establish a reasonable TPF estimate, the client must collect enough recent data to establish a pattern that may be different than the previous pattern it has established, so you'll have to wait for several frames to be completed after any change you make.
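
To illustrate why several frames are needed, here is a minimal sketch of a rolling-average TPF estimator. It is not the client's actual algorithm (that is not described in this thread); the class name RollingTpf and the window size are assumptions chosen only to show how an average over recent frames lags behind a sudden change.

```python
from collections import deque


class RollingTpf:
    """Average of the last `window` frame times (hypothetical estimator)."""

    def __init__(self, window: int = 3):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def add(self, frame_seconds: float) -> float:
        """Record one finished frame and return the current estimate."""
        self.samples.append(frame_seconds)
        return sum(self.samples) / len(self.samples)


est = RollingTpf(window=3)
# Three normal 10-minute frames, then the slowdown to 40-minute frames.
for frame_time in [600, 600, 600, 2400, 2400, 2400]:
    print(f"frame took {frame_time:>4}s -> estimated TPF {est.add(frame_time):.0f}s")
```

In this sketch the estimate only settles on the new frame time once the whole window has been filled with frames completed after the change, so a reading taken right after pausing or resuming a slot mixes old and new behavior.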
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Project 7809, TPF increases until restart

Post by PantherX »

Thanks for taking the time to debug and document your findings! This would be really helpful when troubleshooting.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

Happened again with an 8071 WU.
Exactly when it starts the new WU, TPF on the 7809 SMP unit went up to 1hr within 2 frames (and may stay like that for hours).

Sounds like certain GPU units are hogging CPU resources when they start, and won't release them.

Pausing the GPU solves it:
the TPF on the SMP improves immediately, down to 5m.

Restarting just the GPUs would probably fix the issue, rather than restarting all of them.

So this is primarily a GPU WU issue.
Now if only there was a way for me to figure out how much CPU power each GPU uses, to see where the issue is.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 7809, TPF increases until restart

Post by bruce »

What CPU percentages does Task Manager report are being used by the various active programs?
Napoleon
Posts: 887
Joined: Wed May 26, 2010 2:31 pm
Hardware configuration: Atom330 (overclocked):
Windows 7 Ultimate 64bit
Intel Atom330 dualcore (4 HyperThreads)
NVidia GT430, core_15 work
2x2GB Kingston KVR1333D3N9K2/4G 1333MHz memory kit
Asus AT3IONT-I Deluxe motherboard
Location: Finland

Re: Project 7809, TPF increases until restart

Post by Napoleon »

I've noticed myself that P807x GPU WUs require quite a lot of CPU time in practice - a lot more than other GPU WUs. It doesn't necessarily show up in Task Manager. I'd recommend dropping to SMP:10 on the 3930K CPU, since SMP:11 is known to cause CPU WU failures.
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

I'll have to keep this in mind and check Task Manager next time.
At the moment the A4 core is at 94-97%.

A baseline TPF (on my rig) for 7808/7809 is 8-12m with 3 GPUs running, and 5m20s when not running GPUs
(so almost twice as long when my GPUs are running).

And that's fine. I'm not getting 45K like my SMP rig running the same WUs, nor do I expect to.
10K-16K is fine; 7808/7809 always gave me poor numbers with my GPUs.


It's just that a 30min-1hr TPF is not something I want to waste power or time on... 2-4 days est. time for practically no points.

Working on an 8082 with a TPF of 2m35s (4.1K credit) with 3 GPUs - 23K PPD,
vs the 7808 for 10K (which I just uploaded).
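
For reference, the 23K figure can be checked from the TPF and credit with a quick Python sketch. It assumes 100 frames per WU and treats the 4.1K as the full credit for the WU; the function name ppd is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def ppd(credit: float, tpf_seconds: float, total_frames: int = 100) -> float:
    """Points per day = credit per WU * WUs completed per day.

    total_frames=100 is an assumption (one frame per 1% of progress).
    """
    wu_seconds = tpf_seconds * total_frames
    return credit * SECONDS_PER_DAY / wu_seconds


# 8082 WU: TPF 2m35s = 155 s, about 4.1K credit
print(round(ppd(credit=4100, tpf_seconds=155)))
```

That comes out at roughly 22.9K, in line with the 23K PPD quoted. The same formula shows why a 30min-1hr TPF is so costly: PPD falls in proportion to TPF even before any bonus is lost.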

The issue also occurs on 8082/8083,
just to a lesser extent:
TPF on the 8082 went up to 15 mins after fetching a new WU (8070).

Found core A4 at 92%,
core 15 at 3%, 1%, 1%,
CPU idle 3%.
A4 is not picking up the extra cycles.

Two hrs at increased TPF.

After a restart of all WUs,
A4 is up to 95% now and
TPF is back to 3 mins.

Hmmm, looks like I might have to restrict the CPU slot to SMP 10.

Edit: update

SMP 10 is no worse for TPF than SMP 12.

In fact it might be better, because the SMP may not be CPU starved now:
SMP at 83% CPU use has improved TPF over SMP 12 with 95% CPU use on the A4 core.

TPF 7m, est. PPD on 7809 30K.

Looks good so far.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: Project 7809, TPF increases until restart

Post by Xavier Zepherious »

Ya! I'll stay at SMP 10.

CPU usage on the GPUs dropped too, down to 1% combined rather than 5% before. Both my GPU WUs and the SMP were starved before.
SMP has improved remarkably.