Code Sample, a copy-pastable example if possible
import pandas as pd
from numpy import nan
d = {
    (0, 'NNPDF31_nlo_as_0130_uncorr__combined', 'chi2'): {
        ('NMC', 'Total', 325, 394): nan,
        ('NMC', 'Total', 325, 400): 906.5202301172686,
        ('SLAC', 'Total', 67, 394): nan,
        ('SLAC', 'Total', 67, 400): 110.50940462648715,
    },
    (1, 'NNPDF31_nlo_as_0130_uncorr__combined', 'chi2'): {
        ('NMC', 'Total', 325, 394): 801.561757289261,
        ('NMC', 'Total', 325, 400): 875.9830814675471,
        ('SLAC', 'Total', 67, 394): 126.25943979466223,
        ('SLAC', 'Total', 67, 400): 101.69776217874313,
    },
}
df = pd.DataFrame.from_dict(d)
df.groupby(level=3).sum()
Problem description
The code above behaves differently in pandas 0.20 than in 0.22 (and master), which silently produces incorrect results in code that relies on the old behaviour: a group whose values are all NaN now sums to 0 instead of NaN. In pandas master I get:
                                       0                                    1
    NNPDF31_nlo_as_0130_uncorr__combined NNPDF31_nlo_as_0130_uncorr__combined
                                    chi2                                 chi2
394                             0.000000                           927.821197
400                          1017.029635                           977.680844
Expected Output
                                       0                                    1
    NNPDF31_nlo_as_0130_uncorr__combined NNPDF31_nlo_as_0130_uncorr__combined
                                    chi2                                 chi2
394                                  NaN                           927.821197
400                          1017.029635                           977.680844
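The discrepancy seems to come down to how a group containing only NaN values is summed. A minimal sketch that isolates this, assuming pandas 0.22 or later and using a made-up three-element Series rather than the data above:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the group labelled 394 contains only NaN values,
# the group labelled 400 contains one real value.
s = pd.Series([np.nan, np.nan, 1.0], index=[394, 394, 400])

# With the defaults in pandas >= 0.22, the all-NaN group sums to 0.0
# rather than NaN.
default_sum = s.groupby(level=0).sum()
print(default_sum)
```

In pandas 0.20 the same call is reported to give NaN for the 394 group, which matches the expected output above.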
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.0.dev0+261.g6485a36
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.8.0
xarray: 0.10.0
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.10
s3fs: 0.1.2
fastparquet: 0.1.4
pandas_gbq: None
pandas_datareader: None
I understand that the details of how this works have changed several times since 0.20, but I think it should not break my code if possible.
Comment From: jreback
Please read the whatsnew note: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
The defaults of .sum() have changed.
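For readers who need the old result, the `min_count` argument added to `sum()` in pandas 0.22 appears to be the intended way to get it back. The sketch below assumes pandas 0.22 or later and uses rounded, illustrative values in place of the original chi2 numbers; the column labels are also simplified to plain integers.

```python
import numpy as np
import pandas as pd

# Same index structure as in the report; the values are rounded stand-ins.
index = pd.MultiIndex.from_tuples([
    ('NMC', 'Total', 325, 394),
    ('NMC', 'Total', 325, 400),
    ('SLAC', 'Total', 67, 394),
    ('SLAC', 'Total', 67, 400),
])
df = pd.DataFrame(
    {0: [np.nan, 906.5, np.nan, 110.5],
     1: [801.5, 876.0, 126.25, 101.75]},
    index=index,
)

# min_count=1 requires at least one non-NaN value per group;
# groups with none yield NaN instead of 0.
restored = df.groupby(level=3).sum(min_count=1)
print(restored)
```

With `min_count=1`, the all-NaN group (column 0, level value 394) comes out as NaN again, while the other sums are unchanged.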