I experience these strange behavior of unstack()
when its level
parameter gets a list instead of string.
This changes dtype
of some columns from numeric to object, which is obviously a worrying change.
I am wondering if this a bug or there is something I am missing about dealing with this.
I am on pandas 0.16.0, numpy 1.9.2.
Here is a code to illustrate the problem:
data = """a,b,c,d
2,101,A,11
3,101,A,12
1,201,B,21
2,201,B,22
3,201,B,23
1,301,C,33
2,301,C,32"""
df = pd.read_csv(pd.core.common.StringIO(data))
print df.dtypes
df_str = df.set_index(['a','b']).unstack(level='a') # Note that level has a str
print df_str.dtypes
df_list = df.set_index(['a','b']).unstack(level=['a']) # Note that level has a list
print df_list.dtypes
Which produces:
a int64
b int64
c object
d int64 <-- Note original dtype
dtype: object
a
c 1 object
2 object
3 object
d 1 float64 <-- expected change of dtype due to NaN addition
2 float64 <-- expected change of dtype due to NaN addition
3 float64 <-- expected change of dtype due to NaN addition
dtype: object
a
c 2 object
3 object
1 object
d 2 object <-- UNexpected change of dtype, when level got list
3 object <-- UNexpected change of dtype, when level got list
1 object <-- UNexpected change of dtype, when level got list
dtype: object
Comment From: jreback
this is fixed in master. if someone can put the appropriate reference, let's close.
Comment From: gfyoung
@jreback : I believe this is a duplicate #11847 and patched in 0.19.2
. Closing as a duplicate.
@wikiped : Could you try upgrading your pandas
version? We are currently at 0.20.3
.