Code Sample, a copy-pastable example if possible
import pandas as pd
import datetime
import numpy as np
base = datetime.datetime.today()
date_list = [base - datetime.timedelta(days=x) for x in range(0, 365)]
score_list = list(np.random.randint(low=1, high=1000, size=365))
df = pd.DataFrame()
df['datetime'] = date_list
df['datetime'] = pd.to_datetime(df['datetime'])
df['datetime'] = df['datetime'].astype('datetime64[ns, Asia/Kuala_Lumpur]')
df['score'] = score_list
print(df.groupby([ lambda x: df.loc[x]['datetime'].year ]).count())
Problem description
pandas crashes for no obvious reason. The code works as expected when the following line is removed:
df['datetime'] = df['datetime'].astype('datetime64[ns, Asia/Kuala_Lumpur]')
Expected Output
The aggregated total count, grouped by year.
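For reference, here is a minimal sketch of the per-year row count that the report expects, computed without going through `groupby(...).count()`. The fixed start date `2012-03-01` is an assumption chosen to make the result reproducible; the original sample uses `datetime.datetime.today()`, so its counts depend on when it is run.

```python
import pandas as pd

# Assumption: a fixed, tz-aware daily range instead of "today minus x days",
# so the counts are deterministic.
dates = pd.Series(
    pd.date_range("2012-03-01", periods=365, tz="Asia/Kuala_Lumpur")
)

# Count rows per local calendar year, sorted by year.
rows_per_year = dates.dt.year.value_counts().sort_index()
print(rows_per_year)
```

This only counts rows per year; it does not reproduce the per-column counts that `DataFrame.groupby(...).count()` would return.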
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-64-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.25.1
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.4.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.1.4
pymysql: 0.7.9.None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
pandas_datareader: None
Comment From: jorisvandenbossche
Simplified example:
In [68]: df = pd.DataFrame({'datetime': pd.date_range('2012-03-01', periods=365, tz='Asia/Kuala_Lumpur'),
    ...:                    'score': np.arange(365)})
In [69]: df.groupby(df['datetime'].dt.year)[['score']].count()
Out[69]:
          score
datetime
2012        306
2013         59
In [70]: df.groupby(df['datetime'].dt.year).count()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-70-90b6e370b9fa> in <module>()
----> 1 df.groupby(df['datetime'].dt.year).count()
/home/joris/scipy/pandas/pandas/core/groupby.py in count(self)
4009 blk = map(make_block, map(counter, val), loc)
4010
-> 4011 return self._wrap_agged_blocks(data.items, list(blk))
4012
4013 def nunique(self, dropna=True):
/home/joris/scipy/pandas/pandas/lib.pyx in pandas.lib.count_level_2d (pandas/lib.c:23708)()
ValueError: Buffer has wrong number of dimensions (expected 2, got 1)
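Since `In [69]` succeeds while `In [70]` fails, a possible workaround on affected versions is to keep the tz-aware column out of the counted block. The sketch below shows two options under that assumption: counting only the selected non-tz column, or dropping the timezone before counting. The names `years`, `counts_selected`, `naive` and `counts_naive` are illustrative, and this has not been verified against pandas 0.19.2 itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.date_range("2012-03-01", periods=365, tz="Asia/Kuala_Lumpur"),
    "score": np.arange(365),
})

# Group key: the local calendar year, renamed so it does not clash
# with the existing "datetime" column.
years = df["datetime"].dt.year.rename("year")

# Option 1: count only the non-tz-aware column(s), as in In [69].
counts_selected = df.groupby(years)[["score"]].count()

# Option 2: strip the timezone (keeping local wall-clock times),
# then count every column.
naive = df.assign(datetime=df["datetime"].dt.tz_localize(None))
counts_naive = naive.groupby(years).count()

print(counts_selected)
print(counts_naive)
```

Option 2 discards the timezone information in the copy, so it is only appropriate when the counted frame does not need to stay tz-aware.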
Comment From: jreback
This is a duplicate of #13393.