Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
>>> import pandas as pd
>>> import pyarrow as pa
>>> pd.options.mode.dtype_backend = 'pyarrow'
>>> ser = pd.Series([1, 2, 3], dtype="int64[pyarrow]")
>>> df2 = pd.DataFrame({"A": ser})
>>> scalar = pa.scalar(23)
>>> scalar
<pyarrow.Int64Scalar: 23>
>>> df2["B"] = scalar
>>> df2["B"].dtype
dtype('O')
Issue Description
Since "B" is a brand new column, the dtype should be "int64[pyarrow]", not object.
Expected Behavior
With numpy scalars, the type inferred is a numpy type, so with a pyarrow scalar, I expect the type to be an arrow type, in this case, "int64[pyarrow]", not object.
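For comparison, here is a minimal sketch of the NumPy behavior that the expectation above refers to. The frame and column names are illustrative only; the point is that broadcasting a NumPy scalar into a new column keeps the scalar's dtype instead of falling back to object.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Broadcasting a NumPy scalar into a brand-new column: pandas infers the
# column dtype from the scalar's own type rather than using object.
df["B"] = np.int32(23)
numpy_dtype = df["B"].dtype
print(numpy_dtype)
```

The report asks for the analogous inference when the scalar is a pyarrow scalar.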
Installed Versions
Comment From: mroeschke
Thanks for the report. I'd venture that if/when pandas takes on pyarrow as a required dependency, we can always check for pyarrow objects without first checking whether the user has pyarrow installed: https://github.com/pandas-dev/pandas/issues/50285
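To illustrate why an optional dependency complicates this kind of check, here is a rough sketch of a guard that avoids importing pyarrow. The helper name `is_pyarrow_scalar` is hypothetical and is not a pandas internal; it relies on the assumption that an object can only be a pyarrow scalar if the `pyarrow` module has already been imported somewhere in the process.

```python
import sys


def is_pyarrow_scalar(obj) -> bool:
    """Hypothetical helper: detect a pyarrow scalar without importing pyarrow."""
    # If pyarrow was never imported, no live object can be a pyarrow scalar,
    # so we can answer without triggering (or requiring) the import.
    pa = sys.modules.get("pyarrow")
    if pa is None:
        return False
    # pa.Scalar is the common base class of pyarrow scalar types.
    return isinstance(obj, pa.Scalar)


print(is_pyarrow_scalar(23))
```

With pyarrow as a required dependency, the `sys.modules` lookup would be unnecessary and a plain `isinstance` check would suffice.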
Comment From: mroeschke
Noting that assigning a pa.scalar into existing Arrow-backed data works:
In [2]: import pyarrow as pa
In [3]: arr = pd.arrays.ArrowExtensionArray(pa.array([1, 2], type=pa.int8()))
In [4]: arr[1] = pa.scalar(23)
In [5]: arr
Out[5]:
<ArrowExtensionArray>
[1, 23]
Length: 2, dtype: int8[pyarrow]
In [6]: ser = pd.Series(arr)
In [7]: ser[1] = pa.scalar(24)
In [8]: ser
Out[8]:
0     1
1    24
dtype: int8[pyarrow]