17925, setPositions() and unknown property, BAD_WORK_UNIT


arisu
Posts: 466
Joined: Mon Feb 24, 2025 11:11 pm


Post by arisu »

I'm folding on a GTX 770M using OpenCL. Failure is always immediate, and most of the failed work units end in this error:

ERROR:exception: Called setPositions() on a Context with the wrong number of positions
However, one work unit ended in this error:

ERROR:exception: Unknown property 'z' in node '
ERROR:          '
Full log showing the first error:

15:56:11:I1:TailFileToLog:WU214:*********************** Log Started 2025-05-06T15:56:10Z ***********************
15:56:11:I1:TailFileToLog:WU214:*************************** Core22 Folding@home Core ***************************
15:56:11:I1:TailFileToLog:WU214:       Core: Core22
15:56:11:I1:TailFileToLog:WU214:       Type: 0x22
15:56:11:I1:TailFileToLog:WU214:    Version: 0.0.20
15:56:11:I1:TailFileToLog:WU214:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:56:11:I1:TailFileToLog:WU214:  Copyright: 2020 foldingathome.org
15:56:11:I1:TailFileToLog:WU214:   Homepage: https://foldingathome.org/
15:56:11:I1:TailFileToLog:WU214:       Date: Jan 20 2022
15:56:11:I1:TailFileToLog:WU214:       Time: 00:57:52
15:56:11:I1:TailFileToLog:WU214:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
15:56:11:I1:TailFileToLog:WU214:     Branch: HEAD
15:56:11:I1:TailFileToLog:WU214:   Compiler: GNU 9.4.0
15:56:11:I1:TailFileToLog:WU214:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:56:11:I1:TailFileToLog:WU214:             -fdata-sections -O3 -funroll-loops -fno-pie
15:56:11:I1:TailFileToLog:WU214:             -DOPENMM_VERSION="\"7.7.0\""
15:56:11:I1:TailFileToLog:WU214:   Platform: linux 5.11.0-1025-azure
15:56:11:I1:TailFileToLog:WU214:       Bits: 64
15:56:11:I1:TailFileToLog:WU214:       Mode: Release
15:56:11:I1:TailFileToLog:WU214:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:56:11:I1:TailFileToLog:WU214:             <peastman@stanford.edu>
15:56:11:I1:TailFileToLog:WU214:       Args: -dir krpOOfbD4NvIBxtC3ZMETR6uxgn5iltzrANdbUwz43Q -suffix 01
15:56:11:I1:TailFileToLog:WU214:             -version 8.4.9 -lifeline 229382 -gpu-uuid
15:56:11:I1:TailFileToLog:WU214:             509ae1d8-ced6-4ea5-a088-ddbb07bb44a1 -gpu-platform opencl
15:56:11:I1:TailFileToLog:WU214:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -gpu 0
15:56:11:I1:TailFileToLog:WU214:************************************ libFAH ************************************
15:56:11:I1:TailFileToLog:WU214:       Date: Jan 20 2022
15:56:11:I1:TailFileToLog:WU214:       Time: 00:57:22
15:56:11:I1:TailFileToLog:WU214:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
15:56:11:I1:TailFileToLog:WU214:     Branch: HEAD
15:56:11:I1:TailFileToLog:WU214:   Compiler: GNU 9.4.0
15:56:11:I1:TailFileToLog:WU214:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:56:11:I1:TailFileToLog:WU214:             -fdata-sections -O3 -funroll-loops -fno-pie
15:56:11:I1:TailFileToLog:WU214:   Platform: linux 5.11.0-1025-azure
15:56:11:I1:TailFileToLog:WU214:       Bits: 64
15:56:11:I1:TailFileToLog:WU214:       Mode: Release
15:56:11:I1:TailFileToLog:WU214:************************************ CBang *************************************
15:56:11:I1:TailFileToLog:WU214:       Date: Jan 20 2022
15:56:11:I1:TailFileToLog:WU214:       Time: 00:57:00
15:56:11:I1:TailFileToLog:WU214:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
15:56:11:I1:TailFileToLog:WU214:     Branch: HEAD
15:56:11:I1:TailFileToLog:WU214:   Compiler: GNU 9.4.0
15:56:11:I1:TailFileToLog:WU214:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:56:11:I1:TailFileToLog:WU214:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
15:56:11:I1:TailFileToLog:WU214:   Platform: linux 5.11.0-1025-azure
15:56:11:I1:TailFileToLog:WU214:       Bits: 64
15:56:11:I1:TailFileToLog:WU214:       Mode: Release
15:56:11:I1:TailFileToLog:WU214:************************************ System ************************************
15:56:11:I1:TailFileToLog:WU214:        CPU: Intel(R) Core(TM) i7-4900MQ CPU @ 2.80GHz
15:56:11:I1:TailFileToLog:WU214:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
15:56:11:I1:TailFileToLog:WU214:       CPUs: 8
15:56:11:I1:TailFileToLog:WU214:     Memory: 7.65GiB
15:56:11:I1:TailFileToLog:WU214:Free Memory: 1.82GiB
15:56:11:I1:TailFileToLog:WU214:    Threads: POSIX_THREADS
15:56:11:I1:TailFileToLog:WU214: OS Version: 6.1
15:56:11:I1:TailFileToLog:WU214:Has Battery: true
15:56:11:I1:TailFileToLog:WU214: On Battery: false
15:56:11:I1:TailFileToLog:WU214: UTC Offset: 5
15:56:11:I1:TailFileToLog:WU214:        PID: 236613
15:56:11:I1:TailFileToLog:WU214:        CWD: /tmp/fah/work
15:56:11:I1:TailFileToLog:WU214:************************************ OpenMM ************************************
15:56:11:I1:TailFileToLog:WU214:    Version: 7.7.0
15:56:11:I1:TailFileToLog:WU214:********************************************************************************
15:56:11:I1:TailFileToLog:WU214:Project: 17925 (Run 10, Clone 143, Gen 18)
15:56:11:I1:TailFileToLog:WU214:Digital signatures verified
15:56:11:I1:TailFileToLog:WU214:Folding@home GPU Core22 Folding@home Core
15:56:11:I1:TailFileToLog:WU214:Version 0.0.20
15:56:11:I1:TailFileToLog:WU214:  Checkpoint write interval: 62500 steps (5%) [20 total]
15:56:11:I1:TailFileToLog:WU214:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
15:56:11:I1:TailFileToLog:WU214:  XTC frame write interval: 25000 steps (2%) [50 total]
15:56:11:I1:TailFileToLog:WU214:  Global context and integrator variables write interval: disabled
15:56:11:I1:TailFileToLog:WU214:There are 3 platforms available.
15:56:11:I1:TailFileToLog:WU214:Platform 0: Reference
15:56:11:I1:TailFileToLog:WU214:Platform 1: CPU
15:56:11:I1:TailFileToLog:WU214:Platform 2: OpenCL
15:56:11:I1:TailFileToLog:WU214:  opencl-device 0 specified
15:56:20:I1:TailFileToLog:WU214:Attempting to create OpenCL context:
15:56:20:I1:TailFileToLog:WU214:  Configuring platform OpenCL
15:56:23:I1:TailFileToLog:WU214:ERROR:exception: Called setPositions() on a Context with the wrong number of positions
15:56:23:I1:TailFileToLog:WU214:Saving result file ../logfile_01.txt
15:56:23:I1:TailFileToLog:WU214:Saving result file science.log
15:56:23:I1:TailFileToLog:WU214:Folding@home Core Shutdown: BAD_WORK_UNIT
15:56:23:E :Unit:WU214:Core returned BAD_WORK_UNIT (114)
Full log of the second kind of error:

15:57:01:I1:TailFileToLog:WU215:*********************** Log Started 2025-05-06T15:57:01Z ***********************
15:57:01:I1:TailFileToLog:WU215:*************************** Core22 Folding@home Core ***************************
15:57:01:I1:TailFileToLog:WU215:       Core: Core22
15:57:01:I1:TailFileToLog:WU215:       Type: 0x22
15:57:01:I1:TailFileToLog:WU215:    Version: 0.0.20
15:57:01:I1:TailFileToLog:WU215:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:57:01:I1:TailFileToLog:WU215:  Copyright: 2020 foldingathome.org
15:57:01:I1:TailFileToLog:WU215:   Homepage: https://foldingathome.org/
15:57:01:I1:TailFileToLog:WU215:       Date: Jan 20 2022
15:57:01:I1:TailFileToLog:WU215:       Time: 00:57:52
15:57:01:I1:TailFileToLog:WU215:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
15:57:01:I1:TailFileToLog:WU215:     Branch: HEAD
15:57:01:I1:TailFileToLog:WU215:   Compiler: GNU 9.4.0
15:57:01:I1:TailFileToLog:WU215:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:57:01:I1:TailFileToLog:WU215:             -fdata-sections -O3 -funroll-loops -fno-pie
15:57:01:I1:TailFileToLog:WU215:             -DOPENMM_VERSION="\"7.7.0\""
15:57:01:I1:TailFileToLog:WU215:   Platform: linux 5.11.0-1025-azure
15:57:01:I1:TailFileToLog:WU215:       Bits: 64
15:57:01:I1:TailFileToLog:WU215:       Mode: Release
15:57:01:I1:TailFileToLog:WU215:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:57:01:I1:TailFileToLog:WU215:             <peastman@stanford.edu>
15:57:01:I1:TailFileToLog:WU215:       Args: -dir O6gFuBTMRJ32ca_wWkCISQX6KQVzgee0cP41gPIDW9k -suffix 01
15:57:01:I1:TailFileToLog:WU215:             -version 8.4.9 -lifeline 229382 -gpu-uuid
15:57:01:I1:TailFileToLog:WU215:             509ae1d8-ced6-4ea5-a088-ddbb07bb44a1 -gpu-platform opencl
15:57:01:I1:TailFileToLog:WU215:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -gpu 0
15:57:01:I1:TailFileToLog:WU215:************************************ libFAH ************************************
15:57:01:I1:TailFileToLog:WU215:       Date: Jan 20 2022
15:57:01:I1:TailFileToLog:WU215:       Time: 00:57:22
15:57:01:I1:TailFileToLog:WU215:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
15:57:01:I1:TailFileToLog:WU215:     Branch: HEAD
15:57:01:I1:TailFileToLog:WU215:   Compiler: GNU 9.4.0
15:57:01:I1:TailFileToLog:WU215:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:57:01:I1:TailFileToLog:WU215:             -fdata-sections -O3 -funroll-loops -fno-pie
15:57:01:I1:TailFileToLog:WU215:   Platform: linux 5.11.0-1025-azure
15:57:01:I1:TailFileToLog:WU215:       Bits: 64
15:57:01:I1:TailFileToLog:WU215:       Mode: Release
15:57:01:I1:TailFileToLog:WU215:************************************ CBang *************************************
15:57:01:I1:TailFileToLog:WU215:       Date: Jan 20 2022
15:57:01:I1:TailFileToLog:WU215:       Time: 00:57:00
15:57:01:I1:TailFileToLog:WU215:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
15:57:01:I1:TailFileToLog:WU215:     Branch: HEAD
15:57:01:I1:TailFileToLog:WU215:   Compiler: GNU 9.4.0
15:57:01:I1:TailFileToLog:WU215:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
15:57:01:I1:TailFileToLog:WU215:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
15:57:01:I1:TailFileToLog:WU215:   Platform: linux 5.11.0-1025-azure
15:57:01:I1:TailFileToLog:WU215:       Bits: 64
15:57:01:I1:TailFileToLog:WU215:       Mode: Release
15:57:01:I1:TailFileToLog:WU215:************************************ System ************************************
15:57:01:I1:TailFileToLog:WU215:        CPU: Intel(R) Core(TM) i7-4900MQ CPU @ 2.80GHz
15:57:01:I1:TailFileToLog:WU215:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
15:57:01:I1:TailFileToLog:WU215:       CPUs: 8
15:57:01:I1:TailFileToLog:WU215:     Memory: 7.65GiB
15:57:01:I1:TailFileToLog:WU215:Free Memory: 1.85GiB
15:57:01:I1:TailFileToLog:WU215:    Threads: POSIX_THREADS
15:57:01:I1:TailFileToLog:WU215: OS Version: 6.1
15:57:01:I1:TailFileToLog:WU215:Has Battery: true
15:57:01:I1:TailFileToLog:WU215: On Battery: false
15:57:01:I1:TailFileToLog:WU215: UTC Offset: 5
15:57:01:I1:TailFileToLog:WU215:        PID: 236708
15:57:01:I1:TailFileToLog:WU215:        CWD: /tmp/fah/work
15:57:01:I1:TailFileToLog:WU215:************************************ OpenMM ************************************
15:57:01:I1:TailFileToLog:WU215:    Version: 7.7.0
15:57:01:I1:TailFileToLog:WU215:********************************************************************************
15:57:01:I1:TailFileToLog:WU215:Project: 17925 (Run 0, Clone 117, Gen 9)
15:57:01:I1:TailFileToLog:WU215:Reading tar file core.xml
15:57:01:I1:TailFileToLog:WU215:Reading tar file integrator.xml
15:57:01:I1:TailFileToLog:WU215:Reading tar file state.xml
15:57:01:I1:TailFileToLog:WU215:Reading tar file system.xml
15:57:02:I1:TailFileToLog:WU215:Digital signatures verified
15:57:02:I1:TailFileToLog:WU215:Folding@home GPU Core22 Folding@home Core
15:57:02:I1:TailFileToLog:WU215:Version 0.0.20
15:57:02:I1:TailFileToLog:WU215:  Checkpoint write interval: 62500 steps (5%) [20 total]
15:57:02:I1:TailFileToLog:WU215:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
15:57:02:I1:TailFileToLog:WU215:  XTC frame write interval: 25000 steps (2%) [50 total]
15:57:02:I1:TailFileToLog:WU215:  Global context and integrator variables write interval: disabled
15:57:02:I1:TailFileToLog:WU215:There are 3 platforms available.
15:57:02:I1:TailFileToLog:WU215:Platform 0: Reference
15:57:02:I1:TailFileToLog:WU215:Platform 1: CPU
15:57:02:I1:TailFileToLog:WU215:Platform 2: OpenCL
15:57:02:I1:TailFileToLog:WU215:  opencl-device 0 specified
15:57:13:I1:TailFileToLog:WU215:ERROR:exception: Unknown property 'z' in node '
15:57:13:I1:TailFileToLog:WU215:ERROR:          '
15:57:13:I1:TailFileToLog:WU215:Saving result file ../logfile_01.txt
15:57:13:I1:TailFileToLog:WU215:Saving result file science.log
15:57:13:I1:TailFileToLog:WU215:Folding@home Core Shutdown: BAD_WORK_UNIT
15:57:14:E :Unit:WU215:Core returned BAD_WORK_UNIT (114)
The only WU from project 17925 that survived was R30 C58 G14. R10 C143 G18, R0 C117 G9, and R29 C57 G7 all failed with one of the above errors. Notably, the three failures occurred in a row (WU214, WU215, WU216). The WU that survived was WU113, back in April.

I have a copy of the work directories (wudata_01.dat, wuresults_01.dat, and logfile_01.txt) for each of these failures if it would be helpful.
arisu
Posts: 466
Joined: Mon Feb 24, 2025 11:11 pm

Re: 17925, setPositions() and unknown property, BAD_WORK_UNIT

Post by arisu »

The "unknown property z in node" error in R0 C117 G9 looks like it was caused by a truncated state.xml file. It was NOT truncated by my computer. The wudata_01.dat file is a tar file with a special header, and the state.xml file simply... ends. Here is a hexdump of the relevant portion of the (decompressed but not extracted) wudata_01.dat:

00f5f700  36 37 38 31 35 32 38 37  36 22 20 79 3d 22 36 31  |678152876" y="61|
00f5f710  38 2e 37 31 34 39 31 38  36 34 39 32 38 39 38 22  |8.7149186492898"|
00f5f720  20 7a 3d 22 2d 33 36 39  2e 34 33 35 38 30 30 37  | z="-369.4358007|
00f5f730  38 34 32 36 37 35 22 2f  3e 0a 09 09 3c 46 6f 72  |842675"/>...<For|
00f5f740  63 65 20 78 3d 22 31 39  39 2e 33 35 34 30 34 38  |ce x="199.354048|
00f5f750  37 36 34 31 30 30 33 22  20 79 3d 22 38 39 2e 30  |7641003" y="89.0|
00f5f760  35 30 31 38 36 30 37 32  34 37 36 32 00 00 00 00  |501860724762....|
00f5f770  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00f5f800  73 79 73 74 65 6d 2e 78  6d 6c 00 00 00 00 00 00  |system.xml......|
00f5f810  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00f5f860  00 00 00 00 30 30 30 30  36 36 36 00 30 30 30 30  |....0000666.0000|
00f5f870  30 30 30 00 30 30 30 30  30 30 30 00 30 30 34 32  |000.0000000.0042|
00f5f880  37 37 31 32 33 32 30 00  31 35 30 30 36 34 33 30  |7712320.15006430|
00f5f890  31 34 36 00 30 31 30 30  31 37 00 20 30 00 00 00  |146.010017. 0...|
00f5f8a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00f5f900  00 75 73 74 61 72 00 30  00 00 00 00 00 00 00 00  |.ustar.0........|
00f5f910  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00f5fa00  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version="1|
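The state.xml data stops mid-attribute (the zero bytes start at offset 0x00f5f76c), is zero-padded to the next 512-byte block boundary, and is followed at 0x00f5f800 by the tar header for system.xml. The archive itself is intact, so state.xml was truncated on the server (or before being sent to the server), before it was packaged into wudata_01.dat and signed/checksummed.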

Here are the last 5 lines of state.xml after extraction, where the truncation is easier to see than in the hexdump above:

                <Force x="322.6443655434996" y="-231.0596212821547" z="-516.6305848516058"/>
                <Force x="1538.4133703273255" y="1453.6799492444843" z="723.2296320230234"/>
                <Force x="-202.92045989260077" y="-374.2721789064817" z="-269.31309781596065"/>
                <Force x="-858.556678152876" y="618.7149186492898" z="-369.4358007842675"/>
                <Force x="199.3540487641003" y="89.0501860724762
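
For anyone who wants to run the same check without reading a hexdump, here is a minimal Python sketch. It assumes the special header has already been stripped from wudata_01.dat and the payload decompressed, so that what remains is a plain tar archive. The function name `find_malformed_xml` and the demo archive are hypothetical and only illustrate the idea; this is not how the core itself validates its input.

```python
import io
import tarfile
import xml.etree.ElementTree as ET


def find_malformed_xml(tar_bytes):
    """Return {member_name: parser_error} for each .xml member that fails to parse."""
    problems = {}
    # mode "r:" means a plain, uncompressed tar archive.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".xml")):
                continue
            data = archive.extractfile(member).read()
            try:
                ET.fromstring(data)
            except ET.ParseError as err:
                problems[member.name] = str(err)
    return problems


def _make_tar(files):
    """Build a small in-memory tar archive for the demo (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Demo: one well-formed member and one that ends mid-attribute like state.xml did.
demo = _make_tar({
    "system.xml": b'<?xml version="1.0" ?><System/>',
    "state.xml": b'<State><Forces><Force x="199.3540487641003" y="89.0501860724762',
})
print(find_malformed_xml(demo))
```

A member that is listed here while the tar structure around it is still readable points to truncation before archiving, which is the same conclusion as the hexdump.
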
I haven't found out what the setPositions() error is caused by yet. These three work units have failed on different people's computers about 70 times each.
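
OpenMM raises the setPositions() message when the number of positions handed to a Context differs from the number of particles in its System. One possible way to look for such a mismatch is to compare the counts in the extracted system.xml and state.xml. The sketch below is only that: it assumes the serialized files keep particles in a top-level `<Particles>` element and positions in a top-level `<Positions>` element, which should be checked against the real files, and it only works when both files parse. The helper names and the tiny demo documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_children(xml_text, container, child):
    """Count <child> elements directly inside the top-level <container> element."""
    root = ET.fromstring(xml_text)
    block = root.find(container)  # matches direct children of the root only
    return 0 if block is None else len(block.findall(child))


def compare_counts(system_xml, state_xml):
    """Return (particle_count, position_count) for a System/State pair."""
    particles = count_children(system_xml, "Particles", "Particle")
    positions = count_children(state_xml, "Positions", "Position")
    return particles, positions


# Demo documents (made up): three particles in the System, two positions in the State.
# The nested <Particles> inside a <Force> is deliberately not counted.
system_demo = """<System>
  <Particles><Particle mass="1.0"/><Particle mass="16.0"/><Particle mass="1.0"/></Particles>
  <Forces><Force><Particles><Particle q="0"/></Particles></Force></Forces>
</System>"""
state_demo = """<State>
  <Positions><Position x="0" y="0" z="0"/><Position x="1" y="0" z="0"/></Positions>
</State>"""

particles, positions = compare_counts(system_demo, state_demo)
print(f"particles={particles} positions={positions} match={particles == positions}")
```

If the two counts differ for a failing work unit, that would be consistent with the setPositions() error, though it would not by itself say which file is the damaged one.
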

I'm sure he's already aware that failures are occurring, but I messaged paul30 (the project manager) with this particular finding just in case it helps speed up the investigation into the root cause.
toTOW
Site Moderator
Posts: 6437
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 17925, setPositions() and unknown property, BAD_WORK_UNIT

Post by toTOW »

Yes, these are bad WUs (corrupted files on the server). I reported them to the researcher.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
arisu
Posts: 466
Joined: Mon Feb 24, 2025 11:11 pm

Re: 17925, setPositions() and unknown property, BAD_WORK_UNIT

Post by arisu »

How long ago did you report them? Maybe I did all this investigation for nothing. :lol:
muziqaz
Posts: 1722
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: 17925, setPositions() and unknown property, BAD_WORK_UNIT

Post by muziqaz »

The report was made at about the same time toTOW posted his comment here.
FAH Omega tester