-
Describe the enhancement requested

Currently, when a struct slot is itself null, the validity of its child fields at that slot cannot be read from the child bitmaps alone, so a client must compute the bitwise AND of the parent and child validity bitmaps to learn a child's validity. I think inlining the parent's validity into each child's validity would be more effective for query engines.

Component(s)

Format
Replies: 2 comments 1 reply
-
Accessing the children of an Arrow struct requires checking both parent and child validity (p[i] && c[i]), which can feel cumbersome. For example, a child obtained through Java's StructVector.getChild() still needs an explicit parent.isNull(i) check. ref:
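As a rough illustration of the p[i] && c[i] check, here is a minimal pure-Python sketch. It does not use any Arrow library; the helper names `is_valid` and `logical_child_validity` are hypothetical, and it assumes both bitmaps are LSB-first bit-packed, cover the same slots, and have zero offset.

```python
def is_valid(bitmap: bytes, i: int) -> bool:
    """Return True if bit i is set (assumes LSB-first bit packing)."""
    return ((bitmap[i // 8] >> (i % 8)) & 1) == 1


def logical_child_validity(parent: bytes, child: bytes) -> bytes:
    """AND the bitmaps byte by byte: c[i] only counts where p[i] is set."""
    return bytes(p & c for p, c in zip(parent, child))


# Four struct slots. Parent: slot 2 is null. Child: slot 1 is null.
parent_validity = bytes([0b1011])
child_validity = bytes([0b1101])

combined = logical_child_validity(parent_validity, child_validity)
valid_slots = [i for i in range(4) if is_valid(combined, i)]
print(valid_slots)  # [0, 3]
```

Because the AND works on whole bytes, the combined bitmap can be computed once per batch rather than once per value, which may soften the per-row cost that makes the check feel cumbersome.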
-
That is a feature, not a bug. The validity bitmap (aka "nulls") is a mask on top of the child arrays. Setting a value to null only requires updating the mask, not recursively traversing the child arrays to set them to null as well.

A similar pattern exists in other nested arrays like List: offsets[i + 1] - offsets[i] is not necessarily zero when the i-th list is null.

Even though it's a bit cumbersome when reading (a bitwise AND is necessary), preserving the invariant you're proposing on every operation that builds new arrays from scratch or from other arrays would be almost impossible.
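To make the "mask on top of the children" idea concrete, here is a small pure-Python sketch of a list array. It is an assumption-laden model, not Arrow's implementation: `set_null`, `is_valid`, and `read_list` are hypothetical helpers, and the buffers are plain Python objects with LSB-first validity bits.

```python
def set_null(validity: bytearray, i: int) -> None:
    """Mark slot i as null by clearing one bit; child data is untouched."""
    validity[i // 8] &= ~(1 << (i % 8)) & 0xFF


def is_valid(validity: bytes, i: int) -> bool:
    """Return True if bit i is set (assumes LSB-first bit packing)."""
    return ((validity[i // 8] >> (i % 8)) & 1) == 1


# A list array with three slots over a flat child values buffer.
values = [10, 11, 20, 21, 22, 30]
offsets = [0, 2, 5, 6]
validity = bytearray([0b111])

set_null(validity, 1)  # only the mask changes; values and offsets stay as-is

# The null slot still spans three entries of the child buffer.
null_slot_length = offsets[2] - offsets[1]


def read_list(i: int):
    """Return the i-th list, or None when the mask marks the slot as null."""
    if not is_valid(validity, i):
        return None
    return values[offsets[i]:offsets[i + 1]]


lists = [read_list(i) for i in range(3)]
print(null_slot_length, lists)  # 3 [[10, 11], None, [30]]
```

Nulling a slot costs a single bit flip here, whereas the proposed invariant would also require rewriting the offsets or the child buffers on every such operation.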