Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
# fails in version 0.24.2 but works in 0.23.4
df = pd.DataFrame.from_dict({'Test': ['0.5', True, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].replace([True], [np.nan])
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
# fails in version 0.24.2 but works in 0.23.4
df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].replace([None], [np.nan])
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
# works in both mentioned versions
df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].fillna(np.nan)
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
Problem description
I encountered this change in behavior between pandas 0.23.4 and 0.24.2, but could not find anything in the release notes or the latest documentation that informs users about it. I am not sure whether the change is an improvement. However, I think it should be documented. Since it does not appear to be mentioned anywhere, it may also hint at a deeper problem in the replace method and its to_replace argument.
Expected Output
No assertion errors; the dtype of the column stays object in all three cases.
Comment From: WillAyd
I don't quite follow what the third example is showing, but speaking to the first two, it looks like the problem is a coercion to float. Is there any particular reason you wouldn't want that?
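To make the distinction concrete, here is a minimal sketch of the explicit alternative to an implicit coercion. It assumes the strings in the column are meant to stay strings unless the user opts in to a conversion. The variable names are illustrative only, and `pd.to_numeric` is used here as one possible explicit route, not as what `replace` does internally.

```python
import numpy as np
import pandas as pd

# Object column holding strings plus a missing value, as in the report.
# dtype=object is set explicitly so the starting point is unambiguous.
ser = pd.Series(["0.5", np.nan, "0.6"], dtype=object)

# Explicit, opt-in conversion of the numeric-looking strings to floats.
as_float = pd.to_numeric(ser)

print(ser.dtype)
print(as_float.dtype)
```

With an explicit call like this, the original column keeps its strings and the float version exists only because it was requested.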
Comment From: krassowski
I have a different use case, but it seems to arise from the same change in 0.24: https://stackoverflow.com/q/55383587.
Comment From: mroeschke
Looks like everything remains object on master. Could use a test.
In [2]: df = pd.DataFrame.from_dict({'Test': ['0.5', True, '0.6']})
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
...: df['Test'] = df['Test'].replace([True], [np.nan])
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
object
object
In [3]: df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
...: df['Test'] = df['Test'].replace([None], [np.nan])
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
object
object
In [4]: df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
...: df['Test'] = df['Test'].fillna(np.nan)
...: print(df['Test'].dtype)
...: assert df['Test'].dtype == 'object'
object
object
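A regression test for the two replace cases might look roughly like the sketch below. The helper name `check_replace_keeps_object_dtype` is hypothetical and not part of the pandas test suite. The sketch also assumes the column is built with an explicit `dtype=object`, so that the starting dtype does not depend on the inference rules of the installed version.

```python
import numpy as np
import pandas as pd


def check_replace_keeps_object_dtype(values, to_replace):
    """Return the dtype of an object Series after replacing one value with NaN.

    Hypothetical helper for illustration; not an existing pandas test.
    """
    # Build the column as object explicitly rather than relying on inference.
    ser = pd.Series(values, dtype=object)
    result = ser.replace([to_replace], [np.nan])
    return result.dtype


# The two replace cases from the report: a bool and a None among strings.
cases = [
    (["0.5", True, "0.6"], True),
    (["0.5", None, "0.6"], None),
]

for values, to_replace in cases:
    print(check_replace_keeps_object_dtype(values, to_replace))
```

In an actual test module, the loop would more likely be a parametrized test that asserts each returned dtype is object.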
Comment From: pbhoopala
take