I have 10 Pi 4Bs doing F@H work.
All errors are "The total potential energy is nan", resulting in WU_STALLED (127 = 0x7f).
Pi 1:
12409 (Run 83, Clone 3, Gen 22) failed with too many errors
Pi 2:
12417 (Run 113, Clone 7, Gen 26) completed with 2 errors
12411 (Run 120, Clone 5, Gen 24) completed without errors
Pi 3:
completed multiple 124xx projects without any errors
Pi 4:
12403 (Run 29, Clone 0, Gen 28) completed with 1 error
12400 (Run 41, Clone 6, Gen 30) completed with 2 errors
Pi 5:
completed multiple 124xx projects without errors before encountering:
12410 (Run 5, Clone 9, Gen 9) completed with 1 error that appears to have stalled the Pi, requiring a reboot.
Pi 7:
completed multiple 124xx projects without any errors
Pi 8:
12419 (Run 121, Clone 8, Gen 27) failed with too many errors
12416 (Run 154, Clone 3, Gen 20) failed with too many errors
12401 (Run 25, Clone 7, Gen 21) completed despite experiencing 9 errors! A reboot mid-project might have helped.
12419 (Run 15, Clone 7, Gen 22) completed with 1 error
12419 (Run 18, Clone 1, Gen 25) failed with too many errors
12400 (Run 80, Clone 1, Gen 23) failed after 10 errors! Reboots mid-project might have delayed the WU failing.
12400 (Run 25, Clone 5, Gen 20) started with 2 errors before completing 1%
While Pi 8's problem might be due to the machine, I don't understand why a machine that keeps failing 124xx projects is assigned only more of them!
Potentially issues for work units for project 124xx
Moderators: Site Moderators, FAHC Science Team
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Potentially issues for work units for project 124xx
How much RAM do you have on these systems ? Are they dedicated to FAH ?
Can you check in system logs if the errors happen at the same time as other events ?
-
- Posts: 73
- Joined: Mon Jan 30, 2023 10:43 am
- Hardware configuration: NVIDIA RTX 4070
10 x Raspberry Pi 5 Model B 2GB RAM
10 x Raspberry Pi 4 Model B 2GB RAM
- Location: VIC, Australia
Re: Potentially issues for work units for project 124xx
2GB, and they are headless Pi's, doing nothing but FAH.
"Can you check in system logs if the errors happen at the same time as other events ?"
I could if I knew what to check, and where to find it.
These Pi's all run 24/7, and normally don't all have problems at the same time. Occasionally one does, but so many having problems points to the WUs as the cause. And they are only having problems whilst processing 124xx WUs.
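One way to approach the log question is to pull the times of the NaN errors out of the client log, so they can be lined up against whatever the system log recorded at those moments (for example the output of `journalctl` or `dmesg`). The following is only a rough sketch: it assumes each client log line starts with an `HH:MM:SS:` timestamp and contains the error text quoted in the first post, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Error text as quoted in the first post of this thread.
ERROR_TEXT = "the total potential energy is nan"

# ASSUMPTION: every client log line begins with "HH:MM:SS:".
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):")


def nan_error_times(lines):
    """Return the HH:MM:SS timestamps of lines that report the NaN error."""
    times = []
    for line in lines:
        if ERROR_TEXT not in line.lower():
            continue
        match = TIMESTAMP.match(line)
        if match:  # skip matching lines that carry no leading timestamp
            times.append(":".join(match.groups()))
    return times


def errors_per_hour(times):
    """Count errors by hour of day, to spot clustering around other events."""
    return Counter(t[:2] for t in times)
```

Passing an open log file (or any list of lines) to `nan_error_times` gives the list of error times; if `errors_per_hour` shows them bunched into particular hours on several Pis, that would be worth comparing with the system logs for the same period.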
-
- Posts: 1
- Joined: Fri Jun 09, 2023 9:03 am
- Hardware configuration: 1 x Raspberry Pi 4 Model B 4GB RAM
Re: Potentially issues for work units for project 124xx
My 4GB Raspberry Pi4 has just been given WU 12447 (5,9,10). The computer is currently expecting to finish in 6 days but has been given a limit of 3.6 days to complete the task.
There seems to be little point in wasting this time and energy waiting for the inevitable failure of this task, so I would like to dump this WU and get a new assignment.
However, the only two options given in the Web Control interface through the "Stop Folding" button are to "Finish up, then stop" or "Stop now". The first option doesn't address the problem and the second option only pauses the WU processing.
I have searched for instructions on how to "dump" a WU, but have only found references to people having done this, without any of the details.
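The mismatch between the estimate and the limit can be put in numbers. This is a small sketch using only the two figures quoted above (6 days estimated, 3.6 days allowed); the function names are hypothetical, and it assumes both figures are measured from the same starting point.

```python
def completion_margin_days(estimated_days, deadline_days):
    """Days to spare before the limit; negative means the WU finishes late."""
    return deadline_days - estimated_days


def required_speedup(estimated_days, deadline_days):
    """How many times faster the machine would need to be to just make it."""
    return estimated_days / deadline_days


estimate = 6.0  # days, as reported by the client in the post above
limit = 3.6     # days, the limit given for the WU

print(f"Margin: {completion_margin_days(estimate, limit):.1f} days")   # -2.4
print(f"Required speed-up: {required_speedup(estimate, limit):.2f}x")  # 1.67
```

In other words, the Pi would have to run roughly two thirds faster than its current estimate to finish inside the limit.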
-
- Site Moderator
- Posts: 1115
- Joined: Sat Dec 08, 2007 1:33 am
- Location: San Francisco, CA
Re: Potentially issues for work units for project 124xx
Easiest way is to use FAHControl. You may need to run FAHControl from a Windows machine.
Your Pi4 is not fast enough for most work.
Little point in getting another WU.
You can also dump from the command line. You would need to tell us where your client data directory is.