Code Sample, a copy-pastable example if possible
import pandas

list_of_dicts = [{
    'time': pandas.to_datetime('2016-12-03 18:00:00'),
    'building': 'tall',
    'type': 'steel'
}, {
    'time': pandas.to_datetime('2016-12-03 18:00:00'),
    'building': 'tall',
    'type': 'brick'
}]
df = pandas.DataFrame(list_of_dicts)
building time type
0 tall 2016-12-03 18:00:00 steel
1 tall 2016-12-03 18:00:00 brick
list(df.groupby(['building', pandas.Grouper(key='time', freq='1D')]))
[(<pandas.tseries.resample.TimeGrouper at 0x2b1a5978aed0>,
building time type
1 tall 2016-12-03 18:00:00 brick),
('building',
building time type
0 tall 2016-12-03 18:00:00 steel)]
Problem description
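This was apparently fixed in https://github.com/pandas-dev/pandas/issues/3794, but I am still seeing it: in the output above, the group keys are the Grouper object and the string 'building' instead of (building, day) tuples, and each group holds only one row. Strangely, the problem goes away if I add one more grouping column:
list(df.groupby(['building', pandas.Grouper(key='time', freq='1D'), 'type']))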
[(('tall', Timestamp('2016-12-03 00:00:00', offset='D'), 'brick'),
building time type
1 tall 2016-12-03 18:00:00 brick),
(('tall', Timestamp('2016-12-03 00:00:00', offset='D'), 'steel'),
building time type
0 tall 2016-12-03 18:00:00 steel)]
Expected Output
If I pass the column explicitly as a Series (df['building']) instead of its label, it works correctly.
list(df.groupby([df['building'], pandas.Grouper(key='time', freq='1D')]))
[(('tall', Timestamp('2016-12-03 00:00:00', offset='D')),
building time type
0 tall 2016-12-03 18:00:00 steel
1 tall 2016-12-03 18:00:00 brick)]
Passing just 'building', or even pandas.Grouper(key='building'), should produce this same output.
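For anyone who wants to check their own install, here is a minimal sketch that compares the three spellings mentioned above on the same two-row frame. The names `daily`, `spellings` and `group_keys` are made up for this example, and the expectation that all three agree is what this report asks for, not something every pandas release is guaranteed to do.

```python
import pandas as pd

# Same two rows as in the report.
df = pd.DataFrame({
    "time": pd.to_datetime(["2016-12-03 18:00:00", "2016-12-03 18:00:00"]),
    "building": ["tall", "tall"],
    "type": ["steel", "brick"],
})


def daily():
    # Hypothetical helper: build a fresh Grouper per call so no state is
    # shared between the groupby calls below.
    return pd.Grouper(key="time", freq="1D")


# Three ways of naming the 'building' key alongside the daily time Grouper.
spellings = {
    "column label": ["building", daily()],
    "Series": [df["building"], daily()],
    "Grouper(key=...)": [pd.Grouper(key="building"), daily()],
}

# Collect only the group keys produced by each spelling.
group_keys = {
    name: [key for key, _ in df.groupby(by)]
    for name, by in spellings.items()
}

for name, keys in group_keys.items():
    print(name, keys)
```

If the three printed key lists differ, the install still shows the behaviour described in this report.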
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-164.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.15.2
nose: None
Cython: None
numpy: 1.9.0.dev-Unknown
scipy: 0.15.1
statsmodels: None
IPython: 0.13.2
sphinx: 1.3.1
patsy: None
dateutil: 2.4.2
pytz: 2013d
bottleneck: 0.8.0
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None
Comment From: jreback
You are using a pretty old version. IIRC there were some additional fixes. Try upgrading.
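A quick way to confirm the upgrade took effect before re-running the example is to compare the installed version with the 0.15.2 from the report. This is only a sketch: `REPORTED_VERSION` and `major_minor` are names invented here, and the comment above does not say which release contains the fixes, so the check only tells you that you are past the reported version.

```python
import pandas as pd

# The reporter's version (0.15.2), taken from the show_versions() dump above.
REPORTED_VERSION = (0, 15)


def major_minor(version):
    # Hypothetical helper: keep only the first two numeric components,
    # e.g. "0.15.2" -> (0, 15).
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


installed = major_minor(pd.__version__)
is_newer = installed > REPORTED_VERSION
print("pandas", pd.__version__, "newer than the reported version:", is_newer)
```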