Skip to content

feat: Allow casting List<UInt8> to Binary #22611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
May 8, 2025

Conversation

itamarst
Copy link
Contributor

@itamarst itamarst commented May 5, 2025

Fixes #21549.

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars labels May 5, 2025
@itamarst
Copy link
Contributor Author

itamarst commented May 5, 2025

pyodide seems to be failing on main too?

@itamarst itamarst marked this pull request as ready for review May 5, 2025 15:30
Copy link

codecov bot commented May 5, 2025

Codecov Report

Attention: Patch coverage is 94.05405% with 11 lines in your changes missing coverage. Please review.

Project coverage is 81.00%. Comparing base (5cedf53) to head (674c82a).
Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-compute/src/cast/mod.rs 94.85% 9 Missing ⚠️
crates/polars-core/src/chunked_array/cast.rs 80.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #22611    +/-   ##
========================================
  Coverage   80.99%   81.00%            
========================================
  Files        1661     1664     +3     
  Lines      234867   235162   +295     
  Branches     2772     2773     +1     
========================================
+ Hits       190234   190490   +256     
- Misses      43965    44004    +39     
  Partials      668      668            

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coastalwhite
Copy link
Collaborator

pyodide seems to be failing on main too?

Should be fixed by #22613.

@nameexhaustion
Copy link
Collaborator

nameexhaustion commented May 5, 2025

It should be possible to construct the BinaryViewArray with Views that point directly to the existing Buffer<u8>, instead of copying the memory using an iterator.

There is also a simple optimization we should do where if at the end of construction we see that all Views are in-lined, we can drop the ref to the original buffer.

@itamarst
Copy link
Contributor Author

itamarst commented May 6, 2025

@nameexhaustion I did this, and then realized it's a problem insofar as a View assumes a u32 offset but the Buffer<u8> might be bigger than that. Would you happen to know of any prewritten code that splits buffers if they're too large, for this use case?

@itamarst
Copy link
Contributor Author

itamarst commented May 6, 2025

@nameexhaustion OK implemented both ideas.

@itamarst itamarst requested a review from nameexhaustion May 7, 2025 14:54
@itamarst
Copy link
Contributor Author

itamarst commented May 7, 2025

Thank you for the review, I've addressed the comments.

@itamarst itamarst requested a review from nameexhaustion May 7, 2025 16:00
@itamarst
Copy link
Contributor Author

itamarst commented May 7, 2025

Will figure out failing tests tomorrow morning.

@nameexhaustion
Copy link
Collaborator

nameexhaustion commented May 8, 2025

The test is failing because the buffer is now dropped as all the views are in-lined - we should add specific test to assert that there are no data buffers in the result array for the case when all views are inlined.

For the test to ensure proper buffer splitting, the max buffer size limit needs to be increased.

@itamarst itamarst requested a review from nameexhaustion May 8, 2025 15:46
@nameexhaustion nameexhaustion merged commit 6ee3df0 into pola-rs:main May 8, 2025
26 checks passed
@nameexhaustion nameexhaustion changed the title feat(python): Allow casting List<UInt8> to Binary feat: Allow casting List<UInt8> to Binary May 9, 2025
@nameexhaustion nameexhaustion added the rust Related to Rust Polars label May 9, 2025
coastalwhite pushed a commit to coastalwhite/polars that referenced this pull request May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add cast from pl.List(pl.UInt8) to pl.Binary
4 participants