Pandas resample drops dtype('O') columns from DataFrame (but not Series)

Code Sample, a copy-pastable example if possible

import numpy
import pandas as pd

idx = pd.date_range('1/1/2016', periods=100, freq='d')

z = pd.DataFrame({"z": numpy.zeros(len(idx))}, index=idx)
obj = pd.DataFrame({"obj": numpy.zeros(len(idx), dtype=numpy.dtype('O'))}, index=idx)
nan = pd.DataFrame({"nan": numpy.zeros(len(idx)) + float('nan')}, index=idx)
objnan = pd.DataFrame([], columns=['objnan'], index=idx[0:0])  # dtype('O') nan after join

df_list = [z, obj, nan, objnan]

def r1w(x):
    return x.resample('1w').sum()

resample_join_result = r1w(pd.DataFrame().join(df_list, how='outer'))
print(resample_join_result.shape)  # (15, 2) --- I thought this should be (15, 4)

join_resample_result = pd.DataFrame().join([r1w(x) for x in df_list], how='outer')
print(join_resample_result.shape)  # (15, 4)

for columnname in join_resample_result.columns:
    if columnname not in resample_join_result.columns:
        print("DataFrame.resample missing column: " + str(columnname) + " (" + str(join_resample_result[columnname].dtype) + ")")

Expected Output

I expected resample_join_result to have all four columns and to equal join_resample_result, but it does not. It seems that pandas.DataFrame.resample drops dtype('O') (object) columns, while pandas.Series.resample converts those columns into numeric dtypes.

output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.22.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 23.0.0
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.0

Comment From: jreback

xref #12537

for upsampling this would work, but how would you do downsampling? you could maybe do max/min/first/last, but other functions like sum/mean/var won't work at all because it's impossible to select a correct row for the non-numerics. So it's better to simply exclude them. groupby does the same (except for sum, which is weird because it works on strings).
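
To illustrate the distinction drawn above, here is a minimal sketch. The column names `value` and `label` and the sample data are made up for illustration. It downsamples a numeric column with `sum` and a string column with `last`, so each column gets a reduction that is meaningful for its dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 14 daily rows starting on a Monday, i.e. two full weeks.
idx = pd.date_range("2016-01-04", periods=14, freq="D")
df = pd.DataFrame(
    {"value": np.arange(14), "label": ["a"] * 7 + ["b"] * 7},
    index=idx,
)

# Choose the reduction per column: summing works for numbers,
# while "last" simply picks an existing row for the strings.
weekly = df.resample("W").agg({"value": "sum", "label": "last"})
print(weekly)
```

Passing a dict to `agg` makes the choice explicit, so the result does not depend on how a given pandas version treats non-numeric columns by default.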

Comment From: rcyeh

Wow, thanks for your quick reply. I'm not sure I understand it all, but I think I will after I read it a few more times.

As a newbie, I am very surprised that a DataFrame method could ever return a different result than an index-based DataFrame.join of [df[x].method() for x in df], where the identically-named Series method has been applied to each column of the DataFrame. I haven't used pandas enough to understand why this should be. Based on the cross-referenced issue #12537 and comment about groupby, I am guessing that resample is not the only case that will exhibit this. Is there a way I can guess when that might occur?

I'd rather have an explicit NaN or a ValueError exception due to an unsupported data type than a silent drop. This may be related to issues of control over the treatment of NaN. The pandas default (exclude/ignore) is usually pleasantly convenient, but there are times when I really do want NaN to propagate (and I am fine with being told to go away and use numpy). I can always explicitly drop non-numeric columns.
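
A sketch of handling object columns explicitly before resampling, rather than relying on the default. It assumes the object column really holds numbers, as `obj` does in the code sample above. The variable names `coerced` and `numeric_only` are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=100, freq="D")
df = pd.DataFrame(
    {
        "z": np.zeros(len(idx)),
        "obj": np.zeros(len(idx), dtype=object),  # numbers stored as objects
    },
    index=idx,
)

# Option 1: convert every column to a numeric dtype up front.
# Values that cannot be parsed become NaN instead of being dropped.
coerced = df.apply(pd.to_numeric, errors="coerce")
weekly = coerced.resample("W").sum()
print(weekly.shape)

# Option 2: drop non-numeric columns on purpose, so the drop is visible in the code.
numeric_only = df.select_dtypes(include=[np.number])
print(list(numeric_only.columns))
```

Either way the decision about non-numeric columns is made in user code, which sidesteps the question of what `resample` does with them.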

Re-reading the documentation, I see that you're careful not to encourage users to think of DataFrame as just an aligned list of Series. I need to think more about that.

Comment From: jreback

http://pandas.pydata.org/pandas-docs/stable/groupby.html#automatic-exclusion-of-nuisance-columns

this is standard practice on groupby as well (resample is just a time based grouper)

your view is incorrect and actually non obvious

join of op on Series only makes sense if the shape of the result is consistent

think about what a resample / groupby is actually doing when reducing and it will become clear
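
The point that resample is a time-based grouper can be sketched as follows. This is an illustration with made-up numeric data (columns `a` and `b`), not a statement about pandas internals: it compares a weekly `resample` with a `groupby` on `pd.Grouper(freq="W")`.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=100, freq="D")
df = pd.DataFrame({"a": np.arange(100.0), "b": np.ones(100)}, index=idx)

# Weekly downsampling expressed two ways.
via_resample = df.resample("W").sum()
via_groupby = df.groupby(pd.Grouper(freq="W")).sum()

# Both reduce 100 daily rows to one row per weekly bin.
print(via_resample.shape)
print(via_resample.index.equals(via_groupby.index))
print(np.allclose(via_resample.values, via_groupby.values))
```

Each bin collapses several rows into one, which is why a reduction needs a rule that is well defined for every column it keeps.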
