-
Notifications
You must be signed in to change notification settings - Fork 3.7k
GH-46222: [Python] Allow to specify footer when opening IPC file for writing #46354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
GH-46222: [Python] Allow to specify footer when opening IPC file for writing #46354
Conversation
|
17c2586
to
af34416
Compare
Hey @chaubold ! Thanks for doing this, PR looks pretty straightforward, I can take a look at it tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for an issue with the docstring (see comment) this looks good to me. It would be nice to get another quick look @raulcd ?
The gdb issue (AMD64 Conda Python 3.10 Without Pandas
) seems unrelated.
af34416
to
e139f66
Compare
b23205b
to
990ab60
Compare
990ab60
to
0a62797
Compare
Sorry for all the iterations here. But now the docstring complains are resolved and only the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
56c834c
to
bd7c065
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, minor linting issue
buffer = sink.getvalue() | ||
with pa.ipc.open_file(buffer) as r: | ||
assert r.metadata == meta | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add two empty lines to fix the linter failure?
Rationale for this change
The Arrow APIs for other languages offer to set the footer metadata of an IPC file, but
pyarrow
was lacking this. I opened #46222 because we need this feature at KNIME, and took a shot at suggesting an implementation. Happy to get your feedback on the change!What changes are included in this PR?
A new keyword argument
metadata
inRecordBatchFileWriter.__init__
as well as inipc.new_file
. The value isNone
by default to not break backwards compatibility.Also added a
metadata
property to theRecordBatchFileReader
to be able to extract the metadata easily.Are these changes tested?
Yes, by a unit test.
Are there any user-facing changes?
Yes, see above :)