BUG: read_csv with pyarrow engine cannot handle single-line CSV files #62635

@jorisvandenbossche

As long as the file contains a newline, it works fine:

>>> import io
>>> import pandas as pd
>>> pd.read_csv(
...     io.StringIO("1,2,3\n"),
...     names=["col1", "col2", "col3"],
...     engine="pyarrow",
... )
   col1  col2  col3
0     1     2     3

But reading an actual one-line file (no trailing newline) raises an error inside pyarrow:

>>> pd.read_csv(
...     io.StringIO("1,2,3"),
...     names=["col1", "col2", "col3"],
...     engine="pyarrow",
... )
---------------------------------------------------------------------------
ArrowInvalid     
...
ParserError: CSV parse error: Empty CSV file or block: cannot infer number of columns

The default C and Python engines both handle this input fine.
It also works fine with the pyarrow engine if the header is in the file.
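
Until this is fixed, a possible workaround is to make sure the input ends with a newline before handing it to the pyarrow engine, since the first example above shows that this case parses correctly. The sketch below is only an illustration: `ensure_trailing_newline` and `read_single_line_csv` are hypothetical helper names, not part of pandas or pyarrow, and it assumes the whole input fits in memory as a string.

```python
import io


def ensure_trailing_newline(text: str) -> str:
    """Return `text` with a final newline appended if it lacks one."""
    # Hypothetical helper, not part of pandas or pyarrow.
    # Empty input is returned unchanged; input already ending in a
    # line terminator ("\n" or "\r") is left alone.
    if text and not text.endswith(("\n", "\r")):
        return text + "\n"
    return text


def read_single_line_csv(text: str, names: list[str]):
    """Read header-less CSV text with engine="pyarrow" after normalizing the final newline."""
    # Imported lazily so that the helper above has no third-party dependencies.
    import pandas as pd

    return pd.read_csv(
        io.StringIO(ensure_trailing_newline(text)),
        names=names,
        engine="pyarrow",
    )
```

With this, `read_single_line_csv("1,2,3", ["col1", "col2", "col3"])` passes `"1,2,3\n"` to `read_csv`, which is the input that already works in the first example. This only sidesteps the problem on the caller's side and does not change how pandas or pyarrow treat a file without a trailing newline.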

Tested with current latest versions on Ubuntu:

>>> pd.__version__
'3.0.0.dev0+2236.g3c4586fde9'
>>> import pyarrow as pa
>>> pa.__version__
'21.0.0'

Metadata

    Labels

    Arrow (pyarrow functionality), Bug, IO CSV (read_csv, to_csv)
