Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Attach Diagnostic to more than one column errors in scalar_subquery and in_subquery #15143

Merged
merged 13 commits into from
Mar 14, 2025

Conversation

changsun20
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

This pull request enhances diagnostic information by attaching the Diagnostic to scalar subqueries. Additionally, as suggested by @eliaperantoni, the same improvement is applied to "IN" subqueries for consistency and better error handling.

What changes are included in this PR?

The primary changes are implemented in datafusion/sql/src/expr/subquery.rs, where the Diagnostic is properly attached right at the point of DatafusionError occurrence. This improves debugging and error traceability.

Are these changes tested?

Yes, additional test cases have been added in datafusion/sql/tests/cases/diagnostic.rs to validate the changes.

Are there any user-facing changes?

Yes, the Subquery struct has been modified by introducing a new data field for Spans. This update also affects other functions that utilize the Subquery struct in their processes.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules substrait Changes to the substrait crate labels Mar 11, 2025
Copy link
Contributor

@eliaperantoni eliaperantoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great awesome work! And you were so quick! ❤️

I have a bunch of comments but they should be fairly simple to implement :)

Regarding how to highlight just the inner body of a SELECT, you could use Span::iter_union to do something like:

let select_span = Span::iter_union(
    select_items.iter().map(|si| si.span())
);

I'm pretty sure this doesn't compile haha, but it's more like pseudocode.

@changsun20
Copy link
Contributor Author

Thank you so much for your thoughtful review! I'll address these problems soon (♡ ὅ ◡ ὅ )ʃ♡

@changsun20
Copy link
Contributor Author

Thanks again @eliaperantoni for the clear review of my code! I've implemented all the suggested changes and I'd love to hear your comments.

Copy link
Contributor

@eliaperantoni eliaperantoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stellar, stellar work @changsun20! This is awesome ❤️. Thank you so much for your contribution.

I think this looks very good for merging @alamb :)

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 12, 2025
@changsun20
Copy link
Contributor Author

Thank you @eliaperantoni for your feedback!

Also I've just fixed the CI issue on sqllogictests. Could you please help me trigger the CI workflow again @alamb? Thanks!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @changsun20 and @eliaperantoni

@alamb
Copy link
Contributor

alamb commented Mar 13, 2025

I took the liberty of merging this PR up from main and fixing the tests as I had it checked out already

@changsun20
Copy link
Contributor Author

changsun20 commented Mar 13, 2025

I took the liberty of merging this PR up from main and fixing the tests as I had it checked out already

Many thanks to @alamb for taking the time to review my testing issues. I'll try to become more familiar with the test pyramid in the future and test it thoroughly before pushing out changes.

@alamb
Copy link
Contributor

alamb commented Mar 13, 2025

I took the liberty of merging this PR up from main and fixing the tests as I had it checked out already

Many thanks to @alamb for taking the time to review my testing issues. I'll try to become more familiar with the test pyramid in the future and test it thoroughly before pushing out changes.

No worries! Your next PRs will also automatically run CI checks which will help

As a helpful hint, locally i have found running

cargo test --test sqllogictests 
cargo nextest run

Will catch almost all test related issues

Catches almost all

@changsun20
Copy link
Contributor Author

As a helpful hint, locally i have found running

cargo test --test sqllogictests 
cargo nextest run

Will catch almost all test related issues

Thanks for the helpful testing tips! I'll be sure to incorporate these commands into my workflow to improve test coverage for future contributions.

I wanted to share that I'm particularly excited about the Apache DataFusion community's mission and collaborative environment. With this in mind, I'm planning to apply for the GSoC "Improving DataFusion DX" project.

My current roadmap includes:

  1. Exploring additional issues outlined in [EPIC] Attach Diagnostic to more errors #14429
  2. Diving deeper into the Diagnostic infrastructure implementation
  3. Developing a comprehensive proposal based on these investigations

@eliaperantoni Would this approach align well with the project's priorities? I'd greatly appreciate any feedback or suggested adjustments to better align with the project's goals. Thank you for your guidance!

@eliaperantoni
Copy link
Contributor

@changsun20 That's great to hear, I'm very very excited that you're signing up for GSoC! I'm gonna a mentor there :) And I'm also delighted to hear that you'd like to work on Diagnostic, your plan for it sounds great :)

@Weijun-H Weijun-H merged commit b337fbc into apache:main Mar 14, 2025
26 checks passed
@Weijun-H
Copy link
Member

Thanks @changsun20 , @alamb and @eliaperantoni . I am very excited to see these informative features 😮

@eliaperantoni
Copy link
Contributor

Congratulations @changsun20 on merging your first DataFusion PR! Really superb work ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner sqllogictest SQL Logic Tests (.slt) substrait Changes to the substrait crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attach Diagnostic to "more than one column in subquery" error
4 participants