Sometimes the pandas round function does not return the expected result and gives no indication why. Consider the following example:
import pandas as pd
import numpy as np
output = pd.DataFrame(index = ['foo'], columns = ['bar'])
output.loc['foo', 'bar'] = 1.2345
output.round(2)
Out[1]:
bar
foo 1.2345
I think the reason for the unexpected result is the type of output (or its elements?), which is:
In [2]: output.loc[:,'bar']
Out[2]:
foo 1.2345
Name: bar, dtype: object
To obtain the expected result, I write:
output = output.astype(np.double)
output.round(2)
Out[3]:
bar
foo 1.23
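As an alternative to astype, the conversion step could also be sketched with pd.to_numeric, which fails loudly if a value cannot be parsed as a number. The frame below is built with an explicit dtype=object so that it matches the situation above; that construction is an assumption made for illustration, not part of the original report.

```python
import pandas as pd

# Build a frame whose single column is explicitly object dtype,
# mirroring the dtype shown in Out[2].
output = pd.DataFrame({"bar": [1.2345]}, index=["foo"], dtype=object)

# Convert every column to a numeric dtype. With the default
# errors="raise", a non-numeric value raises instead of passing silently.
converted = output.apply(pd.to_numeric)

# Rounding now acts on a float64 column.
rounded = converted.round(2)
print(converted.dtypes)
print(rounded)
```

Checking converted.dtypes before rounding is a cheap way to confirm that no column is still object.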
Can somebody say whether the first result is indeed intended? I wonder whether pandas.round could warn the user that it does not operate on dtype: object.
Comment From: gfyoung
@FabianSchuetze: Thanks for reporting this! Rounding with object elements is generally not a good idea because object is very ambiguous Python-wise. In fact, numpy (a pandas dependency) can get very mad if you try to do that. Thus, the result that you see should be expected IMO.
We could maybe issue a warning for a dtype that is not numeric, but we can't just check for object in this case because there are many other types that would fail (think str plus non-numeric subclasses). Thus, it might not be as straightforward to do (or maybe I'm overthinking it 😄).
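For illustration only, a user-side version of such a warning might look like the sketch below. round_with_warning is a hypothetical helper, not a pandas function, and it assumes that "not numeric" can be approximated with pandas.api.types.is_numeric_dtype; it does not try to detect object columns that happen to hold numbers.

```python
import warnings

import pandas as pd
from pandas.api.types import is_numeric_dtype


def round_with_warning(df, decimals=0):
    """Hypothetical helper: round df and warn about non-numeric columns."""
    # Collect the columns whose dtype is not numeric (object, str, ...).
    skipped = [col for col, dtype in df.dtypes.items()
               if not is_numeric_dtype(dtype)]
    if skipped:
        warnings.warn(
            f"round() does not operate on non-numeric columns: {skipped}",
            UserWarning,
            stacklevel=2,
        )
    return df.round(decimals)


# Usage: the object column triggers the warning.
frame = pd.DataFrame({"bar": [1.2345]}, index=["foo"], dtype=object)
result = round_with_warning(frame, 2)
```

Checking is_numeric_dtype rather than testing for object directly sidesteps the concern about other non-numeric types, at the cost of also warning for columns such as datetimes.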
Comment From: jreback
.round() only operates on int/float columns by definition. I suppose the docstring could be slightly improved.
But this is the expected behavior. If you have object dtypes that are in fact numeric, you are responsible for the conversions. It is not normal to have this, except through explicit user action.