@Alvaro-Kothe
Member


I adapted the test from this PR to Python 3.7 in commit d378852.

@pytest.mark.parametrize("chunksize", [1, 1.0])
@pytest.mark.parametrize("buffer", [BytesIO, StringIO])
def test_readjson_chunks(request, lines_json_df, chunksize, buffer):
    # Basic test that read_json(chunks=True) gives the same result as
    # read_json(chunks=False)
    # GH17048: memory usage when lines=True
    # GH#28906: read binary json lines in chunks

    if buffer == BytesIO:
        lines_json_df = lines_json_df.encode()

    unchunked = read_json(StringIO(lines_json_df), lines=True)
    with buffer(lines_json_df) as buf:
        reader = read_json(buf, lines=True, chunksize=chunksize)
        chunked = pd.concat(reader)

    tm.assert_frame_equal(chunked, unchunked)

Here is the test summary:

$ pytest pandas/tests/io/json/test_readlines.py::test_readjson_chunks -v
...
FAILED pandas/tests/io/json/test_readlines.py::test_readjson_chunks[BytesIO-1] - TypeError: ...
FAILED pandas/tests/io/json/test_readlines.py::test_readjson_chunks[BytesIO-1.0] - TypeError...
=========================== 2 failed, 2 passed, 8 warnings in 0.30s ===========================

The error only occurred when using a context manager.
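For readers unfamiliar with what the test checks, here is a minimal stdlib-only sketch of chunked JSON-lines reading that accepts both text and binary buffers. It is not pandas' implementation; `iter_json_line_chunks` and the sample `data` are hypothetical names made up for illustration, and the sketch assumes UTF-8 input with one JSON object per line.

```python
import json
from io import BytesIO, StringIO
from itertools import islice


def iter_json_line_chunks(buf, chunksize):
    """Yield lists of parsed records, reading `chunksize` lines at a time.

    Hypothetical helper for illustration only, not part of pandas.
    """
    while True:
        lines = list(islice(buf, chunksize))
        if not lines:
            return
        # A binary buffer yields bytes, so decode before parsing (assumes UTF-8).
        decoded = [ln.decode("utf-8") if isinstance(ln, bytes) else ln for ln in lines]
        # Skip blank lines, parse the rest as one JSON object per line.
        yield [json.loads(ln) for ln in decoded if ln.strip()]


data = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n'

# Both buffers are opened with a context manager, as in the test above.
with StringIO(data) as buf:
    from_text = [rec for chunk in iter_json_line_chunks(buf, 1) for rec in chunk]

with BytesIO(data.encode()) as buf:
    from_bytes = [rec for chunk in iter_json_line_chunks(buf, 1) for rec in chunk]
```

The property the test asserts is the same one this sketch is meant to satisfy: the records read from the binary buffer should match those read from the text buffer.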

Member

@rhshadrach left a comment

lgtm

@rhshadrach added the "IO JSON" (read_json, to_json, json_normalize) and "Needs Tests" (Unit test(s) needed to prevent regressions) labels Nov 16, 2025
@rhshadrach added this to the 3.0 milestone Nov 16, 2025
@rhshadrach merged commit 3508aae into pandas-dev:main Nov 16, 2025
50 of 51 checks passed
rustamali9183 pushed a commit to rustamali9183/pandas that referenced this pull request Nov 17, 2025
mittal-aakriti pushed a commit to mittal-aakriti/pandas that referenced this pull request Nov 19, 2025
mittal-aakriti pushed a commit to mittal-aakriti/pandas that referenced this pull request Nov 19, 2025
@Alvaro-Kothe deleted the test/chunklines branch November 20, 2025 13:37

Development

Successfully merging this pull request may close these issues.

read_json doesn't work on binary files with lines=True and chunksize

2 participants