
Add overwrite support to writeData for clean rewrite workflows #536

Open
Rong-Zh wants to merge 5 commits into ycphs:master from Rong-Zh:feat-writeData-overwrite


@Rong-Zh Rong-Zh commented Apr 5, 2026


I introduced overwrite = TRUE in writeData() to fix a common rewrite issue.

In a typical workflow, users read an existing worksheet, filter rows, and write the filtered result back to the same sheet.
When the filtered result has fewer rows than the original data, the previous behavior could leave stale trailing rows in the worksheet.

Before this change:

  1. writeData() only overwrote the current target write range.
  2. Cells outside that new range were not cleared automatically.
  3. Users often had to manually call deleteData() and calculate row/column ranges.
  4. Replacing data by removeWorksheet() + addWorksheet() could also change the original sheet order.

This was error-prone and easy to misuse.
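
For context, the manual workaround from item 3 might look roughly like the sketch below. It assumes the old data block starts at cell A1 of a sheet named "iris" (both are assumptions for illustration) and uses only deleteData() and writeData() from the released package.

```r
library(openxlsx)

# Build a small workbook with 6 data rows, standing in for an existing file.
path <- tempfile(fileext = ".xlsx")
write.xlsx(list(iris = head(iris, 6)), path)

wb <- loadWorkbook(path)
old <- readWorkbook(wb, sheet = "iris")

# Smaller replacement data (4 rows).
new <- head(old, 4)

# Clear the old block by hand: the header row plus every old data row,
# across every old column. The caller has to work these ranges out.
deleteData(
  wb,
  sheet = "iris",
  rows = seq_len(nrow(old) + 1),
  cols = seq_len(ncol(old)),
  gridExpand = TRUE
)

# Write the replacement data into the now-empty range.
writeData(wb, sheet = "iris", x = new)

out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)
result <- readWorkbook(out, sheet = "iris")
```

The range arithmetic (the `+ 1` for the header, the column count of the old data rather than the new) is the part that is easy to get wrong.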

After this change:

  1. writeData(..., overwrite = TRUE) clears existing worksheet cell data before writing.
  2. Rewrite workflows now produce clean outputs without stale rows.
  3. Default behavior remains unchanged (overwrite = FALSE), so existing code stays backward compatible.

In short, this change improves correctness for filtered rewrite scenarios while keeping the original API behavior intact for all existing callers.

Reproducible Setup: Generate Test Excel File

library(tidyverse)
library(openxlsx)
packageVersion(pkg = "openxlsx")
#4.2.8.1

result <- iris |>
  group_by(Species) |>
  slice_head(n = 2)

dim(result)
result
# A tibble: 6 × 5
# Groups:   Species [3]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
#          <dbl>       <dbl>        <dbl>       <dbl> <fct>     
# 1          5.1         3.5          1.4         0.2 setosa    
# 2          4.9         3            1.4         0.2 setosa    
# 3          7           3.2          4.7         1.4 versicolor
# 4          6.4         3.2          4.5         1.5 versicolor
# 5          6.3         3.3          6           2.5 virginica 
# 6          5.8         2.7          5.1         1.9 virginica 

filename <- "test.xlsx"
write.xlsx(x = list(iris=result),  filename)

wb <- loadWorkbook(filename)
name <- getSheetNames(filename)

tbl <- readWorkbook(wb, name)
tbl
# Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1          5.1         3.5          1.4         0.2     setosa
# 2          4.9         3.0          1.4         0.2     setosa
# 3          7.0         3.2          4.7         1.4 versicolor
# 4          6.4         3.2          4.5         1.5 versicolor
# 5          6.3         3.3          6.0         2.5  virginica
# 6          5.8         2.7          5.1         1.9  virginica

df <- tbl |>
  dplyr::mutate(isFiltered = dplyr::if_else(Sepal.Width > 3.5 | Sepal.Width <= 3.0, "drop", "keep")) |>
  dplyr::filter(isFiltered == "keep") |>
  dplyr::relocate(isFiltered)

df
# isFiltered Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1       keep          5.1         3.5          1.4         0.2     setosa
# 2       keep          7.0         3.2          4.7         1.4 versicolor
# 3       keep          6.4         3.2          4.5         1.5 versicolor
# 4       keep          6.3         3.3          6.0         2.5  virginica

# writeData
writeData(wb, sheet = name, df)
outxlsx <- "testoutput.xlsx"
saveWorkbook(wb, outxlsx, overwrite = TRUE)


tmp <- readWorkbook(outxlsx, name)
tmp
# isFiltered Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1       keep          5.1         3.5          1.4         0.2     setosa
# 2       keep          7.0         3.2          4.7         1.4 versicolor
# 3       keep          6.4         3.2          4.5         1.5 versicolor
# 4       keep          6.3         3.3          6.0         2.5  virginica
# 5        6.3          3.3         6.0          2.5   virginica       <NA>
# 6        5.8          2.7         5.1          1.9   virginica       <NA>

Explanation (without overwrite):

  1. Input: original worksheet has 6 rows, filtered df has 4 rows.
  2. Action: writeData(wb, sheet = name, df) writes only the new target range.
  3. Output: rows 5-6 remain from previous data, so stale trailing rows are visible.
  4. Drawback: users must manually clear old ranges (for example with deleteData()) before writing.

wb <- loadWorkbook(filename)
name <- getSheetNames(filename)

tbl <- readWorkbook(wb, name)
tbl
# Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1          5.1         3.5          1.4         0.2     setosa
# 2          4.9         3.0          1.4         0.2     setosa
# 3          7.0         3.2          4.7         1.4 versicolor
# 4          6.4         3.2          4.5         1.5 versicolor
# 5          6.3         3.3          6.0         2.5  virginica
# 6          5.8         2.7          5.1         1.9  virginica

df <- tbl |>
  dplyr::mutate(isFiltered = dplyr::if_else(Sepal.Width > 3.5 | Sepal.Width <= 3.0, "drop", "keep")) |>
  dplyr::filter(isFiltered == "keep") |>
  dplyr::relocate(isFiltered)

df
# isFiltered Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1       keep          5.1         3.5          1.4         0.2     setosa
# 2       keep          7.0         3.2          4.7         1.4 versicolor
# 3       keep          6.4         3.2          4.5         1.5 versicolor
# 4       keep          6.3         3.3          6.0         2.5  virginica

# writeData
writeData(wb, sheet = name, df, overwrite = TRUE)
outxlsx <- "testoutput1.xlsx"
saveWorkbook(wb, outxlsx, overwrite = TRUE)


tmp <- readWorkbook(outxlsx, name)
tmp
# isFiltered Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1       keep          5.1         3.5          1.4         0.2     setosa
# 2       keep          7.0         3.2          4.7         1.4 versicolor
# 3       keep          6.4         3.2          4.5         1.5 versicolor
# 4       keep          6.3         3.3          6.0         2.5  virginica

Explanation (with overwrite):

  1. Input: same as above (6 original rows, 4 filtered rows).
  2. Action: writeData(wb, sheet = name, df, overwrite=TRUE) clears existing worksheet cell data before writing.
  3. Output: only the 4 filtered rows remain; no stale trailing rows.
  4. Benefit: no manual deleteData() range calculation, and no need to remove/recreate the worksheet.

Copilot AI review requested due to automatic review settings April 5, 2026 07:45

Copilot AI left a comment


Pull request overview

Adds an opt-in overwrite flag to writeData() to support “clean rewrite” workflows by clearing existing worksheet cell data before writing new data, preventing stale trailing rows/cells when the new dataset is smaller.

Changes:

  • Add overwrite = FALSE parameter to writeData() and validate it.
  • When overwrite = TRUE, delete existing worksheet sheet_data before writing.
  • Update generated Rd documentation to include the new parameter.
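
As a rough illustration of the validation mentioned in the first bullet, a scalar logical flag can be checked along the lines below. `check_flag()` is a hypothetical helper name used only for this sketch; it is not a function from openxlsx, and the PR's actual check may differ.

```r
# Hypothetical helper: accept only a single, non-missing TRUE/FALSE value.
check_flag <- function(x, name = "overwrite") {
  if (!is.logical(x) || length(x) != 1L || is.na(x)) {
    stop(sprintf("`%s` must be TRUE or FALSE", name), call. = FALSE)
  }
  invisible(x)
}

check_flag(TRUE)                           # passes silently
bad <- try(check_flag("yes"), silent = TRUE) # character input is rejected
```

Rejecting NA and length-2 vectors up front avoids an ambiguous `if (overwrite)` later in the function body.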

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
R/writeData.R | Adds overwrite parameter and implements pre-write clearing of existing worksheet cell data.
man/writeData.Rd | Documents the new overwrite parameter in the public API docs.


Rong-Zh and others added 3 commits April 5, 2026 16:00
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

@Rong-Zh Rong-Zh left a comment

Author


This change fixes a potential data-loss issue when overwrite is TRUE.
The overwrite deletion now happens after table-header validation, so if validation fails or the function exits early, existing worksheet content is preserved.
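
The ordering described here (validate first, clear second, write last) can be sketched with the public API as follows. `write_data_clean()` and its `old_rows`/`old_cols` arguments are hypothetical names for illustration only; the PR itself clears the sheet inside writeData() and validates table headers, whereas this sketch uses a simpler data-frame check as a stand-in.

```r
library(openxlsx)

# Hypothetical wrapper (not part of openxlsx): check inputs first, clear
# second, write last, so a failed check leaves the sheet untouched.
write_data_clean <- function(wb, sheet, x, old_rows, old_cols) {
  # 1. Validation: an error here exits before any data is removed.
  if (!is.data.frame(x)) {
    stop("`x` must be a data frame", call. = FALSE)
  }
  # 2. Destructive step, reached only when validation passed.
  deleteData(wb, sheet = sheet, rows = old_rows, cols = old_cols,
             gridExpand = TRUE)
  # 3. Write the replacement data.
  writeData(wb, sheet = sheet, x = x)
  invisible(wb)
}

# Save to a temporary file and read the sheet back.
read_back <- function(wb, sheet) {
  f <- tempfile(fileext = ".xlsx")
  saveWorkbook(wb, f, overwrite = TRUE)
  readWorkbook(f, sheet = sheet)
}

wb <- createWorkbook()
addWorksheet(wb, "iris")
writeData(wb, "iris", head(iris, 6))

# A failing call: the error is raised before deleteData() runs,
# so the original 6 rows are still present afterwards.
res <- try(
  write_data_clean(wb, "iris", x = "not a data frame",
                   old_rows = 1:7, old_cols = 1:5),
  silent = TRUE
)
still_there <- read_back(wb, "iris")

# A valid call clears the old block and writes the 4 replacement rows.
write_data_clean(wb, "iris", head(iris, 4), old_rows = 1:7, old_cols = 1:5)
after <- read_back(wb, "iris")
```

If the destructive step ran before the check, the failing call would have emptied the sheet and written nothing.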

@JanMarvin

Collaborator

Hi @Rong-Zh, thanks for the pull request! To be honest, I'm a bit indifferent about this. I guess everyone who needed this should have some workaround by now? But the change is minimal enough for me to not bother. Just asking: does delete() also remove the styles, formulas, etc., or does it only remove the values from the sheet?

Also please note that openxlsx is no longer actively developed, so bringing in new features carries the danger that it will be hard to detect whether the feature works as expected (similar to the partially broken delete column/row feature).

@JanMarvin

Collaborator

No need to bother with the lintr warnings. Probably the return_lintr should be disabled, it is a bit of a nuisance.

@Rong-Zh

Rong-Zh commented Apr 5, 2026

Author

> Hi @Rong-Zh, thanks for the pull request! To be honest, I'm a bit indifferent about this. I guess everyone who needed this should have some workaround by now? But the change is minimal enough for me to not bother. Just asking: does delete() also remove the styles, formulas, etc., or does it only remove the values from the sheet?
>
> Also please note that openxlsx is no longer actively developed, so bringing in new features carries the danger that it will be hard to detect whether the feature works as expected (similar to the partially broken delete column/row feature).

Thanks very much for the feedback.

We currently depend on openxlsx in our production workflows. While implementing a local workaround, I made a very small adjustment and thought it might be helpful to contribute it back as a minor improvement.

I fully understand your point that the package is no longer actively maintained, so I tried to keep the change as minimal and non-intrusive as possible. If you feel it is better not to include it, I completely understand and am happy to follow your preference.

@JanMarvin

Collaborator

Ah well, you already did the work. Could you add a test for this and add a note in the NEWS file?


3 participants