Series.sum returns a numpy type, except when the Series is empty, in which case it returns a Python int of value 0:
In [2]: type(pd.Series([0]).sum())
Out[2]: numpy.int64
In [3]: type(pd.Series().sum())
Out[3]: int
This poses a problem when I do 1 / myserie.sum(), because I expect to obtain np.inf rather than a division-by-zero exception.
I think the return type of Series.sum() for an empty Series should be inferred from the Series's dtype. This way, Series([], dtype='str').sum() would return an empty string, and Series([]).sum() would return np.float64(0), since an empty Series's default dtype seems to be float64.
Tested with Pandas 0.16.0.
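The distinction matters because numpy scalars follow IEEE 754 division semantics while Python ints raise on division by zero. A minimal sketch of the desired behavior (using an explicit float64 dtype, since modern pandas no longer silently infers a dtype for an empty Series):

```python
import numpy as np
import pandas as pd

# An explicitly-typed empty Series; its sum is a numpy float scalar.
s = pd.Series([], dtype="float64")
total = s.sum()
print(type(total))

# Dividing a numpy float by a numpy zero follows IEEE 754 semantics
# and yields inf, instead of raising ZeroDivisionError as a Python
# int of value 0 would. The divide warning is suppressed here.
with np.errstate(divide="ignore"):
    result = np.float64(1.0) / total
print(result)  # inf
```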
Comment From: shoyer
We don't support numeric methods on strings, but otherwise this seems reasonable to me.
Want to give putting together a patch a try?
Comment From: jreback
@Remiremi yeah I don't think we coerce scalars to anything. This could be made more consistent. We should in general NOT be returning python scalars.
Comment From: remiremi
@shoyer @jreback I created a pull request that adds a fix and a relevant test: https://github.com/pydata/pandas/pull/9829. Let me know if I'm missing anything.
Comment From: jorisvandenbossche
This now correctly returns a numpy type:
In [22]: type(pd.Series().sum())
Out[22]: float
So let's repurpose this issue to add a test confirming this.
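Such a regression test could look like the sketch below (pytest-style; the test name is hypothetical and not necessarily the one that landed in pandas):

```python
import numpy as np
import pandas as pd

def test_empty_series_sum_is_numpy_scalar():
    # The sum of an explicitly float64 empty Series should be a numpy
    # scalar equal to 0.0, so that 1 / result yields inf rather than
    # raising ZeroDivisionError.
    result = pd.Series([], dtype="float64").sum()
    assert isinstance(result, np.floating)
    assert result == 0.0

test_empty_series_sum_is_numpy_scalar()
```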
Comment From: jorisvandenbossche
Closing this, as we have an issue for converting the empty-Series default dtype to object, and https://github.com/pandas-dev/pandas/issues/19813 tracks the strange return value.