Issue with 18224

Moderators: Site Moderators, FAHC Science Team

azhad
Posts: 16
Joined: Tue Jul 27, 2021 9:40 pm

Issue with 18224

Post by azhad »

319,9,56 is stuck in a Shutting Down state.
muziqaz
Posts: 1705
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Issue with 18224

Post by muziqaz »

Log files or it didn't happen
FAH Omega tester
belloq
Posts: 51
Joined: Thu Sep 24, 2020 12:58 pm

Re: Issue with 18224

Post by belloq »

I've noticed that WU 18244 221,0,65 has a wildly variable frame completion time on my system. The project overview shows four previous users' failed attempts at the same WU. On my system, some frames have completed in ~5 min while others take ~30 min.

Code:

15:11:38:I1:WU488: Global context and integrator variables write interval: disabled
15:11:38:I1:WU488:There are 4 platforms available.
15:11:38:I1:WU488:Platform 0: Reference
15:11:38:I1:WU488:Platform 1: CPU
15:11:38:I1:WU488:Platform 2: OpenCL
15:11:38:I1:WU488: opencl-device 0 specified
15:11:38:I1:WU488:Platform 3: CUDA
15:11:38:I1:WU488: cuda-device 0 specified
15:12:20:I1:WU488:Attempting to create CUDA context:
15:12:20:I1:WU488: Configuring platform CUDA
15:12:29:I1:WU488: Using CUDA on CUDA Platform and gpu 0
15:12:29:I1:WU488: GPU info: Platform: CUDA
15:12:29:I1:WU488: GPU info: PlatformIndex: 0
15:12:29:I1:WU488: GPU info: Device: NVIDIA RTX A1000 Laptop GPU
15:12:29:I1:WU488: GPU info: DeviceIndex: 0
15:12:29:I1:WU488: GPU info: Vendor: 0x10de
15:12:29:I1:WU488: GPU info: PCI: 01:00:00
15:12:29:I1:WU488: GPU info: Compute: 8.6
15:12:29:I1:WU488: GPU info: Driver: 12.7
15:12:29:I1:WU488: GPU info: GPU: true
15:12:30:I1:WU488:Completed 1850000 out of 2500000 steps (74%)
15:19:29:I1:WU488:Completed 1875000 out of 2500000 steps (75%)
15:42:07:I1:WU488:Completed 1900000 out of 2500000 steps (76%)
15:42:12:I1:WU488:Checkpoint completed at step 1900000
16:05:49:I1:WU488:Completed 1925000 out of 2500000 steps (77%)
16:28:34:I1:WU488:Completed 1950000 out of 2500000 steps (78%)
16:28:38:I1:WU488:Checkpoint completed at step 1950000
16:51:47:I1:WU488:Completed 1975000 out of 2500000 steps (79%)
17:19:05:I1:WU488:Completed 2000000 out of 2500000 steps (80%)
17:19:08:I1:WU488:Checkpoint completed at step 2000000
Generally, I see fairly consistent frame times. Is this abnormal, or somewhat to be expected for some cases/projects? (I suppose it could be something on my system causing this too... but I can't think of anything that's dramatically changed recently.)
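For anyone who wants to put numbers on the variation, here is a minimal Python sketch that computes the time between consecutive "Completed ... steps" lines in a log excerpt like the one above. The function name `frame_times` and the regular expression are illustrative assumptions based only on the line format shown in this post, not part of the FAH client.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 15:19:29:I1:WU488:Completed 1875000 out of 2500000 steps (75%)
# The pattern is an assumption based on the log excerpt in this thread.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):.*Completed (\d+) out of (\d+) steps \((\d+)%\)"
)

def frame_times(log_text):
    """Return (percent, seconds since the previous frame line) pairs."""
    frames = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            frames.append((int(m.group(4)), stamp))
    result = []
    for (_, prev), (pct, stamp) in zip(frames, frames[1:]):
        delta = stamp - prev
        if delta < timedelta(0):
            # The log has no date, so assume a single midnight rollover.
            delta += timedelta(days=1)
        result.append((pct, int(delta.total_seconds())))
    return result

# A few lines copied from the log above.
SAMPLE_LOG = """\
15:12:30:I1:WU488:Completed 1850000 out of 2500000 steps (74%)
15:19:29:I1:WU488:Completed 1875000 out of 2500000 steps (75%)
15:42:07:I1:WU488:Completed 1900000 out of 2500000 steps (76%)
15:42:12:I1:WU488:Checkpoint completed at step 1900000
16:05:49:I1:WU488:Completed 1925000 out of 2500000 steps (77%)
"""

for pct, secs in frame_times(SAMPLE_LOG):
    print(f"{pct}%: {secs // 60} min {secs % 60} s")
```

Checkpoint lines are skipped because they do not match the pattern, so only frame-to-frame intervals are reported.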
Joe_H
Site Admin
Posts: 8115
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Issue with 18224

Post by Joe_H »

I have seen some WUs have variable frame times, though not quite by this much. But my experience has mostly been with CPU projects. There may be changes in the conformation of the system that require longer computation time to complete a frame.

Otherwise the only thing I can think of is something reducing power to the GPU for some reason or something using enough CPU that it slows down the CPU side of the folding.
belloq
Posts: 51
Joined: Thu Sep 24, 2020 12:58 pm

Re: Issue with 18224

Post by belloq »

Joe_H wrote: Wed May 14, 2025 5:43 pm Otherwise the only thing I can think of is something reducing power to the GPU for some reason or something using enough CPU that it slows down the CPU side of the folding.
aaahhhhh! I did not think of that. I am in fact in a different location than I was earlier, when the TPF was closer to 5-6 min. And I believe the power adapter I am on is not as powerful as the previous one. I will check into this!
arisu
Posts: 438
Joined: Mon Feb 24, 2025 11:11 pm

Re: Issue with 18224

Post by arisu »

Power adapters shouldn't have any effect because the device always gets as much power as it wants, or it shuts down. The GPU isn't aware of how much power is available. (Edit: I am wrong about that part)

It could be a lot of things. If you're curious and you're on Linux, install nvidia-smi and run this command for a few frames. It prints GPU statistics in real time and also saves them to a log, which will help in figuring out why some frames are taking so long:

Code:

nvidia-smi dmon -s pucmt -o T | tee gpu.log
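
Once gpu.log exists, a short script can help spot samples where power draw or clocks dropped. The following is a rough Python sketch, assuming the usual dmon layout: a `#`-prefixed header line of column names, a `#`-prefixed units line, then whitespace-separated values. The exact set of columns varies by driver version and GPU, so the parser reads the names from the header instead of hard-coding them. The helper names (`parse_dmon`, `numeric_column`, `samples_below`) and the sample data are made up for illustration and are not real measurements.

```python
def parse_dmon(text):
    """Parse dmon output into a list of dicts keyed by lower-cased column name."""
    columns = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            names = [n.lower() for n in line.lstrip("#").split()]
            # Assumption: the names header contains a "gpu" column,
            # while the units header (Idx, W, C, ...) does not.
            if "gpu" in names:
                columns = names
            continue
        values = line.split()
        if columns is None or len(values) != len(columns):
            continue  # skip anything that does not fit the header
        rows.append(dict(zip(columns, values)))
    return rows

def numeric_column(rows, name):
    """Return the numeric values of one column, skipping '-' placeholders."""
    out = []
    for row in rows:
        try:
            out.append(float(row[name]))
        except (KeyError, ValueError):
            continue
    return out

def samples_below(rows, name, threshold):
    """Return rows whose numeric value in `name` is below `threshold`."""
    hits = []
    for row in rows:
        try:
            if float(row[name]) < threshold:
                hits.append(row)
        except (KeyError, ValueError):
            continue
    return hits

# Synthetic sample (NOT real data), with only a subset of typical columns.
SAMPLE = """\
#Time        gpu   pwr  gtemp  mtemp     sm    mem   mclk   pclk
#HH:MM:SS    Idx     W      C      C      %      %    MHz    MHz
 15:12:30      0    34     71      -     97     41   5500   1650
 15:12:31      0    35     72      -     98     40   5500   1650
 15:20:02      0    18     64      -     99     22   5500    780
"""

rows = parse_dmon(SAMPLE)
power = numeric_column(rows, "pwr")
print("power min/max (W):", min(power), max(power))
for row in samples_below(rows, "pclk", 1000):
    print("low clock at", row["time"], "pclk =", row["pclk"])
```

To use it on a real capture, read the file with `open("gpu.log").read()` and pass the text to `parse_dmon`. The 1000 MHz threshold is arbitrary and would need adjusting to the GPU's normal clocks.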
Last edited by arisu on Thu May 15, 2025 3:00 am, edited 1 time in total.
Joe_H
Site Admin
Posts: 8115
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Issue with 18224

Post by Joe_H »

arisu wrote: Thu May 15, 2025 1:26 am Power adapters shouldn't have any effect because the device always gets as much power as it wants, or it shuts down. The GPU isn't aware of how much power is available.
Sorry, but that does not hold for all systems. Some laptops will run in a power limited mode when using a smaller power adapter than usual, or only run but not charge. At best you are extrapolating from a limited set of hardware; I have seen all kinds of variations in power management depending on available power.
arisu
Posts: 438
Joined: Mon Feb 24, 2025 11:11 pm

Re: Issue with 18224

Post by arisu »

Joe_H wrote: Thu May 15, 2025 1:35 am
arisu wrote: Thu May 15, 2025 1:26 am Power adapters shouldn't have any effect because the device always gets as much power as it wants, or it shuts down. The GPU isn't aware of how much power is available.
Sorry, but that does not hold for all systems. Some laptops will run in a power limited mode when using a smaller power adapter than usual, or only run but not charge. At best you are extrapolating from a limited set of hardware; I have seen all kinds of variations in power management depending on available power.
I didn't know that. I know that laptops will go into a lower power mode if they are unplugged, but I did not know that any are able to detect anything other than the current DC voltage being fed to them. I assume this would mostly be laptops that charge over USB?
Joe_H
Site Admin
Posts: 8115
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Issue with 18224

Post by Joe_H »

Now it is mostly those using USB-C, but before that several makers had options for power adapter wattage, and their power management section would operate based on which was connected. To my direct knowledge this included some Dell models and many of the MacBooks, and I heard rumors of this in some Lenovo models as well.
arisu
Posts: 438
Joined: Mon Feb 24, 2025 11:11 pm

Re: Issue with 18224

Post by arisu »

Thanks for the correction. I've never used a laptop like that. That makes sense and could certainly explain OP's variable TPF.
muziqaz
Posts: 1705
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Issue with 18224

Post by muziqaz »

My work workstation requires dual USB-C plugs for full performance. If only a single USB-C is plugged in, I get a notification that I lack power, and I will be running the laptop equivalent of one from 1999.
FAH Omega tester