Skip to content

Conversation

@vladborovtsov
Copy link

@vladborovtsov vladborovtsov commented Sep 27, 2025

Rationale for this change

Add an optional default_column_type parameter to the CSV reading API (C++ and Python) to provide a fallback type when per-column types aren’t specified, improving schema consistency and complementing the existing column_types logic.

What changes are included in this PR?

Are these changes tested?

Yes. Existing and new tests are passing.

C++:

> [==========] Running 3 tests from 1 test suite.
> [----------] Global test environment set-up.
> [----------] 3 tests from ReaderTests
> [ RUN      ] ReaderTests.DefaultColumnTypePartialDefault
> [       OK ] ReaderTests.DefaultColumnTypePartialDefault (3 ms)
> [ RUN      ] ReaderTests.DefaultColumnTypeAllStringsWithHeader
> [       OK ] ReaderTests.DefaultColumnTypeAllStringsWithHeader (0 ms)
> [ RUN      ] ReaderTests.DefaultColumnTypeAllStringsNoHeader
> [       OK ] ReaderTests.DefaultColumnTypeAllStringsNoHeader (0 ms)
> [----------] 3 tests from ReaderTests (4 ms total)
> 
> [----------] Global test environment tear-down
> [==========] 3 tests from 1 test suite ran. (4 ms total)
> [  PASSED  ] 3 tests.

All:

> [==========] 264 tests from 46 test suites ran. (452 ms total)
> [  PASSED  ] 264 tests.

pyarrow:
New tests are passing.

Are there any user-facing changes?

I believe this change is backward compatible. Parameter is optional and its default value doesn't change the existing behavior; All the existing rests are passing.

Maybe relevant: #22232

Relates to #47502

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

1 similar comment
@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@vladborovtsov
Copy link
Author

@github-actions crossbow submit preview-docs

@github-actions
Copy link

Only contributors can submit requests to this bot. Please ask someone from the community for help with getting the first commit in.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/18062577036

@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@vladborovtsov vladborovtsov marked this pull request as ready for review September 27, 2025 19:26
@kou kou changed the title GH-47502: Introduce optional default_column_type parameter GH-47502: [C++] Introduce optional default_column_type parameter Oct 13, 2025
@github-actions
Copy link

⚠️ GitHub issue #47502 has been automatically assigned in GitHub to PR creator.

@AlenkaF
Copy link
Member

AlenkaF commented Oct 24, 2025

Thank you @vladborovtsov for the contribution.
I will add info about the proposed solution in the original issue (#22232) so I can see opinions from C++ devs on the proposed solution.

@AlenkaF AlenkaF changed the title GH-47502: [C++] Introduce optional default_column_type parameter GH-22232: [C++][Python] Introduce optional default_column_type parameter Oct 24, 2025
@github-actions
Copy link

⚠️ GitHub issue #22232 has been automatically assigned in GitHub to PR creator.

@vladborovtsov
Copy link
Author

Hi @AlenkaF
I'm happy to continue the labour and discussion to get that merged.
As for AI, it wasn't used much, although I tried :) With such huge codebase the generation quality is quite low.

@AlenkaF
Copy link
Member

AlenkaF commented Oct 24, 2025

Happy to see a response!
All good, it is totally OK to use gen AI wisely ;)

I will wait for an opinion from a C++ dev and in the meantime try to look at the Python part.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants