Code Sample, a copy-pastable example if possible
In [1]: import pandas as pd
In [2]: s = pd.Series(dtype='object')
In [3]: s.loc[0] = 1
In [4]: s
Out[4]:
0 1
dtype: int64
In [5]: s = pd.Series(dtype='object')
In [6]: s.loc[0] = 1.
In [7]: s
Out[7]:
0 1.0
dtype: float64
Compare with:
In [8]: s = pd.Series({1 : 2}, dtype='object')
In [9]: s.loc[0] = 1
In [10]: s
Out[10]:
1 2
0 1
dtype: object
In [11]: s = pd.Series({1 : 2.}, dtype='object')
In [12]: s.loc[0] = 1.
In [13]: s
Out[13]:
1 2
0 1
dtype: object
Problem description
The dtype of a Series should change, on addition of new elements, only when it has to: as a corollary, it should never change when it is object.
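A minimal sketch of the same check as a plain script rather than an IPython session. The helper name `dtype_after_enlargement` is hypothetical, and the dtype reported for the empty case may depend on the pandas version, so no output is claimed for it here.

```python
import pandas as pd


def dtype_after_enlargement(initial, key, value):
    """Build an object-dtype Series, add one element via .loc, return the new dtype."""
    s = pd.Series(initial, dtype="object")
    s.loc[key] = value  # setting with enlargement: `key` is not in the index yet
    return s.dtype


# Empty Series: the case reported above (result may differ between versions).
print("empty, int:      ", dtype_after_enlargement(None, 0, 1))
print("empty, float:    ", dtype_after_enlargement(None, 0, 1.0))

# Non-empty Series: mirrors the "Compare with" session, where object is kept.
print("non-empty, int:  ", dtype_after_enlargement({1: 2}, 0, 1))
print("non-empty, float:", dtype_after_enlargement({1: 2.0}, 0, 1.0))
```

If the expectation stated here held, all four lines would report `object`.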
Vaguely related to #17261, maybe to #8902.
Expected Output
Always object dtype.
Output of pd.show_versions()
Comment From: jreback
see https://github.com/pandas-dev/pandas/issues/6485 and #6942
this is actually a feature
Comment From: toobaz
I don't think it is related: #6485 is an unwanted upcasting (while I'm seeing an unwanted downcasting), and the example in #6942 is not complete but seems to be specific to datetimelike stuff.
Most importantly, this one specifically affects empty Series only. Are you referring to this when you say it is a feature? (Might be, but I think it goes against the principle of least surprise, because in the examples above I specifically passed dtype=object, which is not being respected.)
Comment From: jreback
Yes. The point is that a common use case is to create an empty object and iteratively fill it. Not a great pattern, but one which is supported. Inference on first expansion is ingrained.
Comment From: toobaz
"is to create an empty object and iteratively fill it"
Ideally, we would then have the possibility of dtype-less empty objects. So that if you do pass a dtype, it's not overridden.
Comment From: jreback
I agree; in theory we should have an Any type which does exactly this.
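For the "create an empty object and iteratively fill it" pattern discussed above, one way to sidestep inference on first expansion is to collect the values first and construct the Series once with an explicit dtype. This is only a sketch of a possible workaround, not something proposed in the thread; the variable names and sample values are illustrative.

```python
import pandas as pd

# Collect items in a plain dict instead of enlarging an empty Series.
collected = {}
for key, value in [(0, 1), (1, 2.5), (2, "three")]:
    collected[key] = value

# Construct the Series once, passing the dtype together with the data.
s = pd.Series(collected, dtype="object")
print(s.dtype)
```

Because the dtype is supplied together with the data, no setting with enlargement takes place and the first value does not drive dtype inference.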