Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different unnests on plan_to_sql are merged #15128

Open
blaginin opened this issue Mar 10, 2025 · 5 comments · May be fixed by #15159
Open

Different unnests on plan_to_sql are merged #15128

blaginin opened this issue Mar 10, 2025 · 5 comments · May be fixed by #15159
Labels
bug Something isn't working

Comments

@blaginin
Copy link
Contributor

blaginin commented Mar 10, 2025

Describe the bug

While working on #14781 I noticed that UNNEST is being transformed to sql incorrectly. For example,

SELECT unnest([1, 2, 3, 4]) from unnest([1, 2, 3]);
SELECT unnest([1, 2, 3, 4]) from (select 1);

are transformed into

SELECT * FROM UNNEST([1, 2, 3]);
SELECT 1 FROM UNNEST([1, 2, 3, 4]);

which are both wrong.

To Reproduce

let ctx = SessionContext::new();
let df = ctx
    .sql("SELECT unnest([1, 2, 3, 4]) from unnest([1, 2, 3])")
    .await?;

let dialect = CustomDialectBuilder::default()
    .with_unnest_as_table_factor(true)
    .build();
let unparser = Unparser::new(&dialect);

let roundtrip_statement = unparser.plan_to_sql(&df.logical_plan())?;

assert_eq!(
    format!("{}", roundtrip_statement),
    "SELECT * FROM UNNEST([1, 2, 3])"
);

Expected behavior

Unnests should be preserved

Additional context

I think this is related to #13660

While fixing this, it would be good to not rely on having double alias, e.g.Alias(Alias(Column(... as I plan to remove it in #14781

@blaginin blaginin added the bug Something isn't working label Mar 10, 2025
@blaginin
Copy link
Contributor Author

fyi @goldmedal

@blaginin
Copy link
Contributor Author

The easiest fix I can think of is to add info about the location of the unnest in the __unnest_placeholder, similarly to how we currently add depth. With that, we'll try to put it into relation only if it was originally in the relation.

@goldmedal
Copy link
Contributor

The easiest fix I can think of is to add info about the location of the unnest in the __unnest_placeholder, similarly to how we currently add depth. With that, we'll try to put it into relation only if it was originally in the relation.

I'm not sure about that.

The problem is here:

relation.unnest(unnest_relation);

If with_unnest_as_table_factor is enabled, we will try to keep unnest in the from clause, but I never consider the case you propose 🤔. It seems to be another pattern we should handle in the unparser.

If we disable with_unnest_as_table_factor, it works well:

SELECT UNNEST([1, 2, 3, 4]) AS "UNNEST(make_array(Int64(1),Int64(2),Int64(3),Int64(4)))" FROM (SELECT UNNEST([1, 2, 3]) AS "UNNEST(make_array(Int64(1),Int64(2),Int64(3)))")

So, another option, with_unnest_as_table_factor is designed (#13601) for dialects that only allow the use of unnest as a table factor (e.g., BigQuery). Perhaps we can simply throw an error when encountering this type of case if with_unnest_as_table_factor is enabled?

@blaginin blaginin linked a pull request Mar 11, 2025 that will close this issue
@blaginin
Copy link
Contributor Author

blaginin commented Mar 11, 2025

Perhaps we can simply throw an error

I've been thinking about #15159 - arguably a dirty fix, but I can't come up with anything better. Do you need that feature for anything other than EmptyRelation?

@goldmedal
Copy link
Contributor

I've been thinking about #15159 - arguably a dirty fix, but I can't come up with anything better. Do you need that feature for anything other than EmptyRelation?

Thanks. I think it may be the only way we can do it in the unparser side. 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants