feat: Change SQL-Explode/UNNEST to DataFrame.explode method #22546
Closes: #22545
Summary
Currently, UNNEST in SQLContext (the SQL counterpart of explode in Polars) relies on the Expr.explode() method, which explodes the list column on its own and does not preserve the row-wise mapping between the exploded values and the other columns in the DataFrame. As a result, applying UNNEST to a list column while also selecting a non-list column (e.g., sort_key) does not yield the expected exploded shape: the non-list column keeps its original length, and the query fails with a shape mismatch error.
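The difference between the two explode semantics can be sketched in plain Python. This is only an illustration, not Polars' implementation: the helper names `explode_column` and `explode_rows` and the sample data are hypothetical, and the sketch ignores empty lists and nulls.

```python
def explode_column(values):
    """Flatten a single list column on its own (column-level explode)."""
    return [item for row in values for item in row]


def explode_rows(rows, column):
    """Explode `column` row by row, repeating the other fields per element."""
    out = []
    for row in rows:
        for item in row[column]:
            # Copy the row and replace the list with one of its elements.
            out.append({**row, column: item})
    return out


# Hypothetical sample data: two rows, each holding a list of three values.
rows = [
    {"sort_key": 1, "values": [10, 20, 30]},
    {"sort_key": 2, "values": [40, 50, 60]},
]

# Column-level explode: the list column grows, sort_key does not.
flat_values = explode_column([r["values"] for r in rows])
sort_keys = [r["sort_key"] for r in rows]

# Row-level explode: every column ends up with the same length.
exploded = explode_rows(rows, "values")
```

Here `flat_values` has six entries while `sort_keys` still has two, which is the kind of length mismatch described above; `exploded` has six rows with `sort_key` repeated for each element.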
Example
Old behaviour
polars.exceptions.ShapeError: Series length 2 doesn't match the DataFrame height of 6
New behaviour
Solution
Changed the SQL UNNEST implementation to use DataFrame.explode() instead of the Expr.explode()/List.explode() method.