BUG: Fixed DataFrame.combine with non-unique columns #62760

vijmeister · 2025-10-19T20:37:51Z

closes BUG: DataFrame.combine with non-unique columns #51340
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

vijmeister · 2025-10-19T21:12:18Z

I realized that I need to further think about the logic of my proposed fix. The .iloc usage is not capturing the intended columns to merge ...

Re-do logic of fix


jbrockmendel · 2025-10-30T22:33:39Z

pandas/core/frame.py

+        new_columns_out = self.columns.union(other_columns, sort=False)
+        # Deduplicate column names if necessary
+        self_columns = Index(
+            dedup_names(list(self_columns), False), dtype=self_columns.dtype


why would dedup_names be necessary here?

Making the column names unique would preserve the original logic, which looks columns up by name. Switching to positional indices would require tracking more than one set of indices.
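For illustration only (this snippet is not part of the PR), a minimal sketch of why name-based lookup is ambiguous when labels repeat. The frame `df` and its values are made up for the example.

```python
import pandas as pd

# Two columns share the label "A" (made-up data for illustration).
df = pd.DataFrame([[1, 2, 3]], columns=["A", "A", "B"])

# Label lookup returns every column that matches the label.
by_label = df["A"]

# Positional lookup returns exactly one column.
by_position = df.iloc[:, 0]

print(type(by_label).__name__)     # DataFrame
print(type(by_position).__name__)  # Series
```

Code written to receive a Series from `this[col]` gets a two-column DataFrame instead once a label is duplicated, which is the ambiguity being discussed here.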

jbrockmendel · 2025-10-30T22:34:06Z

pandas/core/frame.py

+        other_columns = Index(
+            dedup_names(list(other_columns), False), dtype=other_columns.dtype
+        )
+        this.columns = Index(


why alter this.columns?

jbrockmendel · 2025-10-30T22:35:17Z

pandas/core/frame.py

        result = {}
-        for col in new_columns:
+        for col in new_columns_unique:
            series = this[col]


get rid of all the dedup_names stuff above and just iterate over range(this.shape[1]) and use series = this.iloc[:, i], other_series = other.iloc[:, i]

There have to be other fixes, because the logic relies heavily on column names instead of indices. I think this, other and new_columns (which result uses) would each need their own indices.

> the logic heavily relied on column names instead of indices

how so?

The function's docstring has an example in which one DataFrame has columns A, B while the other has B, C. In that case it would be tricky to use a positional index instead of the column name.

On L9100 we call self.align(other). After that, the columns are aligned.

I understand what align is doing now after seeing the docstring example with different column names.
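As a small sketch of the point about align (using unique labels and made-up values, not code from the PR): after `DataFrame.align`, both frames carry the same column layout, so column `i` of one corresponds to column `i` of the other.

```python
import pandas as pd

# Frames with partially overlapping columns, as in the A,B / B,C case.
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"B": [5, 6], "C": [7, 8]})

# Default outer join on both axes: each result gets the union of columns.
this, other = df1.align(df2)

print(list(this.columns))   # ['A', 'B', 'C']
print(list(other.columns))  # ['A', 'B', 'C']

# Columns missing from the original frame are filled with NaN.
print(this["C"].isna().all())   # True
print(other["A"].isna().all())  # True
```

Because the two aligned frames share one column order, a single positional index is enough to pair up their columns.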


vijmeister · 2025-10-31T17:31:41Z

@jbrockmendel, I was able to use .iloc instead of column names.
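A rough sketch of the positional approach, for readers following along. `combine_by_position` is a hypothetical helper written for this illustration; it is not the code in this PR or the pandas implementation. It assumes both frames are already aligned to the same shape and that `func` returns a Series.

```python
import pandas as pd


def combine_by_position(this, other, func):
    """Hypothetical helper: pair columns by position, not by label."""
    if this.shape != other.shape:
        raise ValueError("frames must already be aligned to the same shape")
    # Column i of `this` is combined with column i of `other`,
    # so duplicate labels never cause an ambiguous lookup.
    pieces = [
        func(this.iloc[:, i], other.iloc[:, i])
        for i in range(this.shape[1])
    ]
    result = pd.concat(pieces, axis=1)
    result.columns = this.columns  # restore the original (duplicate) labels
    return result


# Made-up data with a repeated "A" label.
left = pd.DataFrame([[1, 10, 100]], columns=["A", "A", "B"])
right = pd.DataFrame([[5, 2, 300]], columns=["A", "A", "B"])

# Element-wise maximum of each column pair.
out = combine_by_position(left, right, lambda s1, s2: s1.where(s1 > s2, s2))
print(out)
```

Each of the two "A" columns is combined with its positional counterpart, and the duplicate labels are kept in the result.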

Fixed DataFrame.combine with non-unique columns

5e4a066

vijmeister marked this pull request as draft October 19, 2025 21:34

vijmeister added 5 commits October 24, 2025 17:51

Re-doing fix for Dataframe combine that works with duplicate column n…

dcac1a3

…ames.

Refactored to use dedup_names from pandas.io.common.

6507206

Merge remote-tracking branch 'upstream/main' into dataframe-combine-n…

74fd8b5

…on-unique-columns

Minor fixes to pass a couple build tests.

36ed5ec

Merge remote-tracking branch 'upstream/main' into dataframe-combine-n…

3c23ac7

…on-unique-columns

vijmeister marked this pull request as ready for review October 28, 2025 02:33

jbrockmendel reviewed Oct 30, 2025


vijmeister added 3 commits October 31, 2025 10:44

Merge remote-tracking branch 'upstream/main' into dataframe-combine-n…

7b3045b

…on-unique-columns

Remove dedup_names and use column indices instead of names in DataFra…

b553d69

…me.combine.

Fix logic for column access in result dictionary in DataFrame.combine.

51a6455


2 participants
