Closed
Description
What happens?
With SET old_implicit_casting = true; in effect, a list value is unexpectedly cast to a string, so list_extract returns incorrect results. This appears to be a regression in 1.4.x, since DuckDB 1.3.x works correctly.
Reproduce:
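import duckdb
import pyarrow as pa

# One string column; each non-null value splits into two parts on "-".
data = {
    "str_col": ["1-2", "3-4", "a-z", None]
}
arrow_table = pa.table(data)

conn = duckdb.connect(":memory:")
# Enable the legacy implicit-casting setting that triggers the issue.
conn.execute("SET old_implicit_casting = true;")
conn.register("input", arrow_table)

# Expected: the first element of each split list ("1", "3", "a", NULL).
result1 = conn.execute("SELECT list_extract(string_split(str_col, '-'), 1) FROM input").fetch_arrow_table()
print(result1)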
In DuckDB 1.4.0 and 1.4.1, the output is:
pyarrow.Table
list_extract(string_split(str_col, '-'), 1): string
----
list_extract(string_split(str_col, '-'), 1): [["[","[","[",null]] # the first character "[" is returned instead of the first list element
In DuckDB 1.3.2, the output is the expected one:
pyarrow.Table
list_extract(string_split(str_col, '-'), 1): string
----
list_extract(string_split(str_col, '-'), 1): [["1","3","a",null]]
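The difference between the two outputs is consistent with the list being turned into its text form before the 1-based extraction is applied. The pure-Python sketch below only illustrates that reading and does not call DuckDB; the helper names render_list_as_text and extract_1_based are hypothetical, and the "[1, 2]" text rendering is an assumption about how the list would look as a string.

```python
def render_list_as_text(items):
    """Render a list as bracketed text such as "[1, 2]" (assumed rendering)."""
    return "[" + ", ".join(items) + "]"


def extract_1_based(value, index):
    """Return the element (list) or character (string) at a 1-based index."""
    if value is None:
        return None
    return value[index - 1]


# Same input values as the reproduction script.
rows = ["1-2", "3-4", "a-z", None]
split_rows = [None if s is None else s.split("-") for s in rows]

# Extraction applied to the list itself: first element of each list.
expected = [extract_1_based(parts, 1) for parts in split_rows]

# Extraction applied after the list has become text: first character.
as_text = [None if parts is None else render_list_as_text(parts) for parts in split_rows]
observed = [extract_1_based(text, 1) for text in as_text]

print(expected)  # ['1', '3', 'a', None]
print(observed)  # ['[', '[', '[', None]
```

The first list matches the 1.3.2 output and the second matches the 1.4.0 and 1.4.1 output, which is why an unintended list-to-string cast is the suspected cause.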
OS: linux
DuckDB Version: 1.4.1
DuckDB Client: Python
Hardware: No response
Full Name: Hongyu Shi
Affiliation: Benchling
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
- Yes, I have
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant data sets for reproducing the issue?
Yes