Much simpler example
In [1]: df = pandas.DataFrame({ 'x' : [], 'y' : [], 'z' : []})
In [2]: df
Out[2]:
Empty DataFrame
Columns: [x, y, z]
Index: []
In [3]: df.applymap(lambda x: x)
Out[3]:
x   NaN
y   NaN
z   NaN
dtype: float64
In [4]: df = pandas.DataFrame({ 'x' : [1], 'y' : [1], 'z' : [1]})
In [5]: df.applymap(lambda x: x)
Out[5]:
   x  y  z
0  1  1  1
The docs for DataFrame.applymap clearly say it returns a DataFrame. However, that does not happen if you call it on a DataFrame with no rows: it returns a Series instead. Not only that, the Series has values in it (one NaN per column), which is not what anyone would expect from an empty DataFrame. This is with pandas 0.18.0.
import pandas

# For our use case, a 0-row DataFrame is 100% legitimate. This experiment really did return 0 rows.
df = pandas.DataFrame({'x': [], 'y': [], 'z': []})  # empty DataFrame
len(df)  # Returns 0, just what you'd expect from an empty DataFrame.
# Strip whitespace from any strings in the DataFrame. Obviously not needed for this empty
# DataFrame, but it would be needed on a DataFrame that had rows. (Yes, I know there are alternative
# ways to do this; I'm not looking for workarounds. I'm focused on how applymap works.)
df2 = df.applymap(lambda x: x.strip() if isinstance(x, str) and len(x) > 0 else x)
len(df2)  # WHOA! This is 3, not 0. How'd that happen?
type(df2)  # Oh, it returned a Series instead of a DataFrame. The docs do NOT say it does that.
# The docs say it always returns a DataFrame, which is what most developers would expect.
Almost anyone would expect applymap() on an empty DataFrame to return an empty DataFrame, not convert it to a Series with NaNs. We should change it to return an empty DataFrame.
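To make the expected behavior concrete, here is a minimal sketch of the contract described above, written as a small wrapper. The names `applymap_keep_shape` and `strip_if_str` are hypothetical and exist only for this illustration; they are not part of pandas, and this is not meant as the actual fix inside the library. It assumes that returning a copy of the empty frame is an acceptable result when there are no elements to map over.

```python
import pandas


def applymap_keep_shape(df, func):
    """Apply func to every element, always returning a DataFrame.

    Hypothetical helper for illustration only; not a pandas API.
    """
    if df.empty:
        # Nothing to map over: hand back an empty DataFrame with the
        # same columns and index instead of calling into pandas.
        return df.copy()
    # Newer pandas releases name the element-wise method DataFrame.map;
    # older ones (including 0.18.0) only have applymap.
    elementwise = getattr(df, "map", None) or df.applymap
    return elementwise(func)


def strip_if_str(x):
    # Same logic as the lambda in the report above.
    return x.strip() if isinstance(x, str) and len(x) > 0 else x


empty = pandas.DataFrame({'x': [], 'y': [], 'z': []})
result = applymap_keep_shape(empty, strip_if_str)
print(type(result).__name__, len(result), list(result.columns))
```

With the guard in place, the result for the empty frame is a DataFrame with 0 rows and the original three columns, which is what I would expect applymap itself to return.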
Comment From: jreback
duplicate of #8222